A fast and effective method for clustering large-scale Chinese question dataset
Zhang, Xiaodong ; Wang, Houfeng
Journal: Communications in Computer and Information Science
Year: 2014
Abstract: Question clustering plays an important role in QA systems. Because of data sparseness and the lexical gap between questions, there is often insufficient information to guarantee good clustering results. In addition, previous work pays little attention to algorithmic complexity, which makes it infeasible on large-scale datasets. In this paper, we propose a novel similarity measure that employs word relatedness as additional information to help calculate the similarity between questions. Based on this similarity measure and the k-means algorithm, a semantic k-means algorithm and an extended version of it are proposed. Experimental results show that the proposed methods achieve performance comparable to state-of-the-art methods while taking less time. © Springer-Verlag Berlin Heidelberg 2014.
Indexed in: EI
Pages: 345-356
Volume: 496
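The abstract does not spell out the similarity measure, so the following is only an illustrative sketch of the general idea of letting word relatedness contribute to question similarity. It uses a soft-cosine style score, which is a swapped-in technique and not necessarily the authors' formula. The names `soft_cosine`, `assign`, and the `relatedness` lookup table are hypothetical, and the relatedness scores are made up for illustration.

```python
import math


def soft_cosine(q1, q2, relatedness):
    """Soft-cosine similarity between two tokenized questions.

    `relatedness` is a hypothetical lookup: it maps an unordered word pair
    (a frozenset of two words) to a score in [0, 1]. Identical words score
    1.0 and unlisted pairs score 0.0.
    """
    def rel(a, b):
        if a == b:
            return 1.0
        return relatedness.get(frozenset((a, b)), 0.0)

    def dot(x, y):
        # Sum relatedness over every cross pair of tokens.
        return sum(rel(a, b) for a in x for b in y)

    denom = math.sqrt(dot(q1, q1) * dot(q2, q2))
    return dot(q1, q2) / denom if denom else 0.0


def assign(questions, centers, relatedness):
    """One k-means-style assignment step: index of the most similar center."""
    return [
        max(range(len(centers)),
            key=lambda k: soft_cosine(q, centers[k], relatedness))
        for q in questions
    ]


# Illustrative relatedness score (assumed, not taken from the paper).
relatedness = {frozenset(("buy", "purchase")): 0.8}
q1 = ["buy", "phone"]
q2 = ["purchase", "phone"]
print(soft_cosine(q1, q2, relatedness))  # related words raise the score
print(soft_cosine(q1, q2, {}))           # plain word overlap only
```

With an empty relatedness table the score reduces to ordinary word overlap, so the two questions above look only half similar. Adding the "buy"/"purchase" relation narrows that lexical gap, which is the effect the abstract attributes to using word relatedness.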
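Language: English
Content type: Journal article
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/327890
Collection: 信息科学技术学院 (School of Electronics Engineering and Computer Science)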
Recommended citation
GB/T 7714: Zhang, Xiaodong, Wang, Houfeng. A fast and effective method for clustering large-scale Chinese question dataset[J]. Communications in Computer and Information Science, 2014.
APA: Zhang, Xiaodong, & Wang, Houfeng. (2014). A fast and effective method for clustering large-scale Chinese question dataset. Communications in Computer and Information Science.
MLA: Zhang, Xiaodong, et al. "A fast and effective method for clustering large-scale Chinese question dataset." Communications in Computer and Information Science (2014).