Text categorization algorithm based on feature order pair quantization
Ren Jisheng ; Wang Zuoying
2010-05-06
Keywords: Theoretical or Mathematical; Experimental / feature extraction; pattern clustering; singular value decomposition; text analysis; vector quantisation / text categorization algorithm; feature order pair quantization; document representation scheme; vector space centroid algorithm; word sense pairs; MicroF1; feature abstraction; feature transformation; singular value decomposition / C6130D Document processing techniques; C1250B Character recognition; C1110 Algebra
Abstract: Text categorization algorithms should capture the various constraints present in language, but most neglect the order information of language features in the text. This paper presents a document representation scheme based on feature pair quantization, which uses clustering to identify feature order information in the text and is then combined with the vector space centroid algorithm. Documents were represented by word pairs and by word sense pairs, and both representations were tested on three different data sets. The results show that the proposed method outperforms traditional representations based on words or word senses. The average improvement in MicroF1 is 3% to 4% for word pairs and 5% to 7% for word sense pairs. Feature order information therefore plays an important role in improving text categorization performance.
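The idea of pairing order-aware features with a centroid classifier can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not describe the clustering-based quantization step, so that step is omitted, and raw ordered word pairs are used directly as features. The function names (`ordered_pairs`, `train_centroids`, `classify`) and the `window` parameter are hypothetical, introduced only for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def ordered_pairs(tokens, window=2):
    """Return ordered (earlier, later) token pairs at most `window` positions apart."""
    pairs = []
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            pairs.append((left, right))
    return pairs


def normalize(vec):
    """Scale a sparse vector (dict) to unit Euclidean length."""
    norm = sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else dict(vec)


def train_centroids(docs, labels, window=2):
    """Sum the unit-length pair vectors of each class, then renormalize the sum."""
    sums = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        sums[label].update(normalize(Counter(ordered_pairs(tokens, window))))
    return {label: normalize(total) for label, total in sums.items()}


def classify(tokens, centroids, window=2):
    """Return the label whose centroid has the highest cosine similarity."""
    doc = normalize(Counter(ordered_pairs(tokens, window)))

    def cosine(centroid):
        # Both vectors are unit length, so the dot product is the cosine.
        return sum(w * centroid.get(k, 0.0) for k, w in doc.items())

    return max(centroids, key=lambda label: cosine(centroids[label]))
```

In this sketch, the token sequences "dog bites man" and "man bites dog" contain the same words but share no ordered pairs, so their class centroids are distinguishable, which a plain bag-of-words representation could not achieve.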
Language: Chinese
Publisher: Tsinghua Univ. Press ; China
Content type: Journal article
Source URL: http://hdl.handle.net/123456789/11825
Collection: Tsinghua University
Recommended citation
GB/T 7714
Ren Jisheng, Wang Zuoying. Text categorization algorithm based on feature order pair quantization[J], 2010, 2010.
APA Ren Jisheng, & Wang Zuoying. (2010). Text categorization algorithm based on feature order pair quantization.
MLA Ren Jisheng, et al. "Text categorization algorithm based on feature order pair quantization". (2010).