A Unified Framework and Models for Integrating Translation Memory into Phrase-based Statistical Machine Translation
Yang Liu¹; Kun Wang¹; Chengqing Zong¹; Keh-Yih Su²
Journal: Computer Speech & Language (CSL)
Publication year: 2019
Volume: 54; Issue: 1; Pages: 176-206
Keywords: Phrase-based Machine Translation; Translation Memory
Document subtype: Technical improvement
English abstract

Since statistical machine translation (SMT) and translation memory (TM) complement each other in TM-matched and unmatched regions, this paper proposes a unified framework for integrating TM into phrase-based SMT. Unlike previous two-stage pipeline approaches, which directly merge TM results into the input sentences and then let the SMT system translate only the unmatched regions, the proposed framework refers to the corresponding TM information associated with each phrase during SMT decoding. Under this unified framework, several integrated models are proposed to incorporate different types of information extracted from TM to guide the SMT decoding. SMT can thus implicitly and indirectly utilize global context with a local dependency model. Furthermore, the SMT phrase table is dynamically enhanced with TM phrase pairs when the TM database and the SMT training set are different.
On a Chinese-English TM database, our experiments show that the proposed Model-I significantly improves over both SMT and TM when the SMT training set is also adopted as the TM database and when the fuzzy match score is over 0.4 (an overall improvement of 3.5 BLEU points and a reduction of 2.6 TER points). In addition, the proposed Model-II is significantly better than the TM and the SMT systems when the SMT training set and the TM database are different. Furthermore, the proposed Model-III outperforms both the TM and the SMT systems even when the SMT training set and the TM database are from different domains. Additionally, the proposed Model-IV achieves further significant improvements with the help of Top-N TM sentence pairs. Lastly, all our models significantly outperform state-of-the-art approaches under all test conditions.
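The 0.4 threshold mentioned in the abstract refers to the fuzzy match score between an input sentence and its closest TM entry. The abstract does not define this score, so the following is only a minimal sketch, assuming the word-level edit-distance formulation commonly used in TM work (one minus the Levenshtein distance divided by the longer sentence length). The function names `edit_distance`, `fuzzy_match_score`, and `best_tm_match` are hypothetical and are not taken from the paper.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, tok_a in enumerate(a, start=1):
        curr = [i]  # deleting the first i tokens of a
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def fuzzy_match_score(source_tokens, tm_tokens):
    """Assumed definition: 1 - edit_distance / max(len(source), len(tm))."""
    longest = max(len(source_tokens), len(tm_tokens))
    if longest == 0:
        return 1.0  # two empty sentences are treated as identical
    return 1.0 - edit_distance(source_tokens, tm_tokens) / longest


def best_tm_match(source_tokens, tm_sources):
    """Return (score, index) of the TM source sentence with the highest score."""
    scored = [(fuzzy_match_score(source_tokens, tm), idx)
              for idx, tm in enumerate(tm_sources)]
    return max(scored, key=lambda pair: pair[0])
```

Under this assumed definition, an input sentence would fall in the "over 0.4" condition when the score returned by `best_tm_match` exceeds 0.4.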
 

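Language: English
Content type: Journal article
Source URL: [http://ir.ia.ac.cn/handle/173211/23994]
Collection: Institute of Automation_National Laboratory of Pattern Recognition_Natural Language Processing Group
Corresponding author: Yang Liu
Author affiliations: 1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, University of Chinese Academy of Sciences
2. Institute of Information Science, Academia Sinica, Taipei, Taiwan
Recommended citation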
GB/T 7714
Yang Liu,Kun Wang,Chengqing Zong,et al. A Unified Framework and Models for Integrating Translation Memory into Phrase-based Statistical Machine Translation[J]. Computer Speech & Language (CSL),2019,54(1):176-206.
APA Yang Liu,Kun Wang,Chengqing Zong,&Keh-Yih Su.(2019).A Unified Framework and Models for Integrating Translation Memory into Phrase-based Statistical Machine Translation.Computer Speech & Language (CSL),54(1),176-206.
MLA Yang Liu,et al."A Unified Framework and Models for Integrating Translation Memory into Phrase-based Statistical Machine Translation".Computer Speech & Language (CSL) 54.1(2019):176-206.
 
