[Bounty] The author, he1wen2zhi, will award 5 gold coins for an answer to this post.

he1wen2zhi

Specialty: Natural language understanding and machine translation

[Help] Major revision due in 20 days, 3 reviewers: how should I revise?

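Which aspects should I focus on in the revision?
What experiments should I add?
Reviewer #3 says I ran no experiments on the dataset I proposed, but the paper does state that I used my dataset for training. How can I respond to this reasonably?
I would really appreciate any advice.

The three reviews are as follows: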
Reviewer #1: This paper presents an audio-visual cross-modality generation method for talking face videos with rhythmic head.
The studied topic is meaningful.
The authors are suggested to further improve the paper from the following aspects.

The quality evaluation of the generated audio-visual talking heads is very important for the method design. The authors have used some criteria for evaluation. The authors may give some discussions on whether it is possible to use some quality assessment methods for evaluation. For example using the audio-visual quality assessment methods proposed in 'Study of subjective and objective quality assessment of audio-visual signals', 'Attention-Guided Neural Networks for Full-Reference and No-Reference Audio-Visual Quality Assessment' for evaluation.
The authors are suggested to give some discussions on this aspect and the above works.

'The proposed method demonstrates improved performance in terms of video quality compared to traditional approaches'
Some discussions about visual quality assessment are suggested to be given here, considering that there are many visual quality assessment studies in the literatures, for example, 'Blind quality assessment based on pseudo-reference image', 'Blind image quality estimation via distortion aggravation', 'Unified blind quality assessment of compressed natural, graphic, and screen content images', 'Objective quality evaluation of dehazed images', 'Quality evaluation of image dehazing methods using synthetic hazy images'.

Following the above comments, the quality assessment of multimedia signals is also highly relevant to this work, thus some surveys for quality assessment are suggested to be given in the introduction section of the paper, for example, 'Perceptual image quality assessment: a survey', 'Screen content quality assessment: overview, benchmark, and beyond'.

Audio-visual attention is critical for various audio-visual applications. Many audio-visual attention prediction methods have been proposed, for example, 'A multimodal saliency model for videos with high audio-visual correspondence', 'Fixation prediction through multimodal analysis'. The authors may give some discussions on the possibility of using audio-visual attention prediction methods to improve the proposed method.
The authors are suggested to give some discussions on this aspect and the above works.

Reviewer #2: This paper addresses the generation of realistic talking facial videos by incorporating audio and head pose information. Existing methods lack natural head pose generation and audio synchronization, impacting video realism. The authors propose Flow2Flow, an autoregressive method that encodes audio and historical head poses using a multimodal transformer block with cross-attention. They introduce AVVS, a large-scale dataset for investigating rhythmic head movement patterns. The proposed method generates identity-independent facial motion representations, enabling photo-realistic videos with natural head poses and accurate lip-syncing, as demonstrated through experiments and comparisons with state-of-the-art approaches on public datasets. However, some concerns should be addressed.

The organization of the paper could benefit from improvements, e.g., some video synthesis part is introduced in the feature encoding part.

The authors pointed out that the full attention structure in the model excessively focuses on a single source during integration, leading to the neglect of crucial information from other modalities. As a result, accurately generating movements for the facial generation task becomes challenging. It would be helpful to provide supporting evidence or examples to further illustrate this issue.

Instead of delving into the intricacies of flow theory, it would be more beneficial to focus on incorporating references in the facial attribute generation process.

The model utilizes 15 neutral keypoints as facial attributes. It would be valuable for the authors to explore the impact of varying the number of keypoints and investigate whether incorporating certain 3DMM parameters and other types of audio features would enhance the results.

The authors have primarily focused on discussing the applications of common loss functions. However, IQA models also have the wide-ranging applications in evaluating generative image, video, audio, and multimedia models, e.g., "Blind image quality assessment via cross-view consistency" and "Comparative perceptual assessment of visual signals using free energy features." The authors are suggested to give some discussions on this aspect and the above works. Additionally, considering the significance of attention mechanism, the authors are encouraged to provide discussions on related works like "Toward visual behavior and attention understanding for augmented 360-degree videos," "Viewing behavior supported visual saliency predictor for 360-degree videos," and "Learning a deep agent to predict head movement in 360-degree images."

Reviewer #3: introductions:
This paper proposes a normalizing flow based network to generate realistic talking face videos, by using audio and past head poses as inputs.
Besides, they also contributes a solo-singing-themed audio-visual dataset called AVVS for research.

Strength:
1. Experimental results do show that their methods can generate photo realistic videos with natural head poses and lip-syncing. And the performance looks good.
2. Utilizing normalizing flow model is novel and convincing.

Weakness:
1. It is kind of strange that I do not see any experiments on AVVS dataset. Since you are proposing a dataset, I think some experiments should be conducted on it.


Floor 1, 2023-08-10 14:57:53


半生梦君


For Reviewer #3, first thank them for the comment, then point out where in your paper your own dataset is used.


Floor 2, 2023-08-10 15:01:44


nono2009


Revise everything you can, and for what you cannot change, explain why sincerely.


Floor 3, 2023-08-11 07:24:56


Mr_jianye


Just revise what you can.


Floor 4, 2024-01-04 12:11:30
