Unsupervised and Pseudo-Supervised Vision-Language Alignment in Visual Dialog


Authors	Feilong Chen 1,2; Duzhen Zhang 2; Xiuyi Chen 2; Jing Shi 2; Shang Xu 2; Bo Xu 2
Year	2022
Conference date	October 10–14, 2022
Conference location	Lisboa, Portugal
Abstract	Visual dialog requires models to give reasonable answers according to a series of coherent questions and related visual concepts in images. However, most current work either focuses on attention-based fusion or pre-training on large-scale image-text pairs, ignoring the critical role of explicit vision-language alignment in visual dialog. To remedy this defect, we propose a novel unsupervised and pseudo-supervised vision-language alignment approach for visual dialog (AlignVD). Firstly, AlignVD utilizes the visual and dialog encoder to represent images and dialogs. Then, it explicitly aligns visual concepts with textual semantics via unsupervised and pseudo-supervised vision-language alignment (UVLA and PVLA). Specifically, UVLA utilizes a graph autoencoder, while PVLA uses dialog-guided visual grounding to conduct alignment. Finally, based on the aligned visual and textual representations, AlignVD gives a reasonable answer to the question via the cross-modal decoder. Extensive experiments on two large-scale visual dialog datasets have demonstrated the effectiveness of vision-language alignment and our proposed AlignVD achieves new state-of-the-art results. In addition, our single model has won first place on the visual dialog challenge leaderboard with an NDCG metric of 78.70, surpassing the previous best ensemble model by about 1 point.
Content type	Conference paper
Source URL	[http://ir.ia.ac.cn/handle/173211/51892]
Collection	Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing
Corresponding author	Xiuyi Chen
Author affiliations	1.School of Future Technology, University of CAS 2.Institute of Automation, Chinese Academy of Sciences (CAS)
Recommended citation (GB/T 7714)	Feilong Chen, Duzhen Zhang, Xiuyi Chen, et al. Unsupervised and Pseudo-Supervised Vision-Language Alignment in Visual Dialog[C]. In:. Lisboa, Portugal. October 10–14, 2022.
