Empirical Exploring Word-Character Relationship for Chinese Sentence Representation | |
Shaonan Wang1,2; Jiajun Zhang1,2; Chengqing Zong1,2,3 | |
Journal | ACM Transactions on Asian and Low-Resource Language Information Processing
2018 | |
Volume | 3; Issue: 17; Pages: 1-15 |
Keywords | Sentence representation composition model |
Document subtype | full-length article |
Abstract | This article addresses the problem of learning compositional Chinese sentence representations, which represent the meaning of a sentence by composing the meanings of its constituent words. In contrast to English, a Chinese word is composed of characters, which contain rich semantic information. However, this information has not been fully exploited by existing methods. In this work, we introduce a novel mixed character-word architecture to improve Chinese sentence representations by utilizing the rich semantic information of inner-word characters. We propose two novel strategies for this purpose. The first is to use a mask gate on characters, learning the relation among characters in a word. The second is to use a max-pooling operation on words to adaptively find the optimal mixture of the atomic and compositional word representations. Finally, the proposed architecture is applied to various sentence composition models, achieving substantial performance gains over baseline models on the sentence similarity task. To further verify the generalization ability of our model, we employ the learned sentence representations as features in sentence classification, question classification, and sentence entailment tasks. Results show that the proposed mixed character-word sentence representation models outperform both the character-based and word-based models. |
Language | English |
Content type | Journal article |
Source URL | [http://ir.ia.ac.cn/handle/173211/40576] |
Collection | National Laboratory of Pattern Recognition_Natural Language Processing |
Author affiliations | 1. National Laboratory of Pattern Recognition, CASIA 2. University of Chinese Academy of Sciences 3. Institute of Automation, CAS Center for Excellence in Brain Science and Intelligence Technology |
Recommended citation GB/T 7714 | Shaonan Wang, Jiajun Zhang, Chengqing Zong. Empirical Exploring Word-Character Relationship for Chinese Sentence Representation[J]. ACM Transactions on Asian and Low-Resource Language Information Processing, 2018, 3(17): 1-15. |
APA | Shaonan Wang, Jiajun Zhang, & Chengqing Zong. (2018). Empirical Exploring Word-Character Relationship for Chinese Sentence Representation. ACM Transactions on Asian and Low-Resource Language Information Processing, 3(17), 1-15. |
MLA | Shaonan Wang, et al. "Empirical Exploring Word-Character Relationship for Chinese Sentence Representation". ACM Transactions on Asian and Low-Resource Language Information Processing 3.17 (2018): 1-15. |
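
The abstract names two mechanisms: a mask gate over the characters of a word, and a max-pooling step that mixes the atomic (word-level) and compositional (character-derived) word representations. The NumPy sketch below is only one plausible reading of that description, not the authors' implementation. The sigmoid gate form, the parameters `gate_W` and `gate_b`, the summation over gated characters, and the averaging of word vectors into a sentence vector are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def compose_word(char_vecs, gate_W, gate_b):
    """Build a compositional word vector from character vectors.

    char_vecs: array of shape (n_chars, d).
    gate_W, gate_b: hypothetical gate parameters of shape (d, d) and (d,).
    Each character gets an elementwise mask in (0, 1); the masked
    character vectors are then summed (an assumed composition choice).
    """
    gates = sigmoid(char_vecs @ gate_W + gate_b)  # shape (n_chars, d)
    return (gates * char_vecs).sum(axis=0)        # shape (d,)


def mix_word(atomic_vec, char_vecs, gate_W, gate_b):
    """Elementwise max over the atomic and compositional word vectors."""
    compositional = compose_word(char_vecs, gate_W, gate_b)
    return np.maximum(atomic_vec, compositional)


def sentence_vector(words, gate_W, gate_b):
    """Average the mixed word vectors (assumed sentence composition).

    words: list of (atomic_vec, char_vecs) pairs, one per word.
    """
    mixed = [mix_word(a, c, gate_W, gate_b) for a, c in words]
    return np.mean(mixed, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 4
    gate_W = rng.normal(size=(d, d))
    gate_b = np.zeros(d)
    # Two toy words: one with two characters, one with three.
    words = [
        (rng.normal(size=d), rng.normal(size=(2, d))),
        (rng.normal(size=d), rng.normal(size=(3, d))),
    ]
    sent = sentence_vector(words, gate_W, gate_b)
    print(sent.shape)
```

In this sketch the elementwise maximum lets each dimension of the word vector come from whichever of the two views is larger, which is one simple way to realize an adaptive mixture. The averaging step could be swapped for any other sentence composition model.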