Joint learning of Chinese words, terms and keywords
Cao, Ziqiang ; Li, Sujian ; Ji, Heng
2014
Abstract: Previous work often used a pipelined framework in which Chinese word segmentation is followed by term extraction and keyword extraction. Such a framework suffers from error propagation and cannot use information from later modules to improve earlier components. In this paper, we propose a four-level Dirichlet Process based model (DP-4) that jointly learns word distributions at the corpus, domain and document levels. Based on the DP-4 model, a sentence-wise Gibbs sampler is adopted to obtain segmentation results, and terms and keywords are acquired during the same sampling process. Experimental results show the effectiveness of our method. © 2014 Association for Computational Linguistics.
Indexed by: EI
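Language: English
Content type: Other
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/330024
Collection: School of Electronics Engineering and Computer Science (信息科学技术学院)
Recommended citation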
GB/T 7714
Cao, Ziqiang,Li, Sujian,Ji, Heng. Joint learning of Chinese words, terms and keywords. 2014-01-01.