Associative Multichannel Autoencoder for Multimodal Word Representation
Shaonan Wang 1,3; Jiajun Zhang 1,3; Chengqing Zong 1,2,3
2018-10
Conference Date: 2018.10
Conference Location: Brussels
Abstract

In this paper we address the problem of learning multimodal word representations by integrating textual, visual and auditory inputs. Inspired by the re-constructive and associative nature of human memory, we propose a novel associative multichannel autoencoder (AMA). Our model first learns the associations between textual and perceptual modalities, so as to predict the missing perceptual information of concepts. Then the textual and predicted perceptual representations are fused through reconstructing their original and associated embeddings. Using a gating mechanism, our model assigns different weights to each modality according to the different concepts. Results on six benchmark concept similarity tests show that the proposed method significantly outperforms strong unimodal baselines and state-of-the-art multimodal models.
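The abstract describes three components: association networks that predict missing perceptual vectors from text, per-modality encoders combined through a concept-dependent gate, and decoders that reconstruct the original and associated embeddings. The following is a minimal sketch, assuming a standard PyTorch setup, of how such pieces could fit together; all layer sizes, names, and the specific gating and fusion choices are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the associative multichannel autoencoder idea (hypothetical
# dimensions and layer choices; not the paper's official code).
import torch
import torch.nn as nn


class AssociativeMultichannelAE(nn.Module):
    def __init__(self, d_text=300, d_vis=128, d_aud=128, d_hid=256, d_fused=300):
        super().__init__()
        # Association networks: predict missing perceptual vectors from text.
        self.text2vis = nn.Sequential(nn.Linear(d_text, d_hid), nn.Tanh(), nn.Linear(d_hid, d_vis))
        self.text2aud = nn.Sequential(nn.Linear(d_text, d_hid), nn.Tanh(), nn.Linear(d_hid, d_aud))
        # Per-modality encoders.
        self.enc_text = nn.Linear(d_text, d_hid)
        self.enc_vis = nn.Linear(d_vis, d_hid)
        self.enc_aud = nn.Linear(d_aud, d_hid)
        # Gate: concept-dependent weights over the three modalities.
        self.gate = nn.Linear(3 * d_hid, 3)
        # Fusion and decoders (reconstruct original / associated embeddings).
        self.fuse = nn.Linear(3 * d_hid, d_fused)
        self.dec_text = nn.Linear(d_fused, d_text)
        self.dec_vis = nn.Linear(d_fused, d_vis)
        self.dec_aud = nn.Linear(d_fused, d_aud)

    def forward(self, text, vis=None, aud=None):
        # Replace missing perceptual inputs with associated (predicted) ones.
        vis = vis if vis is not None else self.text2vis(text)
        aud = aud if aud is not None else self.text2aud(text)
        h = [torch.tanh(self.enc_text(text)),
             torch.tanh(self.enc_vis(vis)),
             torch.tanh(self.enc_aud(aud))]
        g = torch.softmax(self.gate(torch.cat(h, dim=-1)), dim=-1)  # modality weights
        h_gated = torch.cat([g[..., i:i + 1] * h[i] for i in range(3)], dim=-1)
        fused = torch.tanh(self.fuse(h_gated))  # multimodal word representation
        recon = (self.dec_text(fused), self.dec_vis(fused), self.dec_aud(fused))
        return fused, recon


# Usage: reconstruct the available channels; concepts lacking audio fall back
# on the associated (text-predicted) audio vector.
model = AssociativeMultichannelAE()
text, vis = torch.randn(4, 300), torch.randn(4, 128)
fused, (r_text, r_vis, r_aud) = model(text, vis=vis, aud=None)
loss = nn.functional.mse_loss(r_text, text) + nn.functional.mse_loss(r_vis, vis)
```

In this sketch the softmax gate yields per-concept weights for the textual, visual and auditory channels before fusion, mirroring the abstract's statement that the model assigns different weights to each modality depending on the concept.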

Proceedings Publisher: Conference on Empirical Methods in Natural Language Processing (EMNLP)
Language: English
Content Type: Conference Paper
Source URL: [http://ir.ia.ac.cn/handle/173211/40575]
Collection: National Laboratory of Pattern Recognition_Natural Language Processing
Corresponding Author: Shaonan Wang
Author Affiliations:
1. National Laboratory of Pattern Recognition, CASIA, Beijing, China
2. CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing, China
3. University of Chinese Academy of Sciences, Beijing, China
Recommended Citation (GB/T 7714):
Shaonan Wang, Jiajun Zhang, Chengqing Zong. Associative Multichannel Autoencoder for Multimodal Word Representation[C]. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). Brussels, 2018.10.