XDist: an effective XML keyword search system with re-ranking model based on keyword distribution
Gao Ning; Deng ZhiHong; Lu ShengLong
Journal: Science China Information Sciences
Year: 2014
Keywords: XML; keyword search; information retrieval; ranking model; keyword distribution; evaluation; term proximity
DOI: 10.1007/s11432-012-4781-6
Abstract: Keyword search lets web users access XML data without understanding complex data schemas. However, the inherent ambiguity of keyword search makes it hard to select the relevant results that match the keywords. To address this, researchers have put much effort into ranking models that distinguish relevant from irrelevant passages, such as the highly cited TF*IDF and BM25. These statistics-based ranking methods mostly consider term frequency, inverse document frequency, and length as ranking factors, and ignore the distribution of, and connections between, different keywords. As a result, these widely used methods fail to recognize irrelevant results that have high term frequency, which limits their performance. This paper proposes a new search system, XDist, to address these problems. XDist first uses the semantic query model maximal lowest common ancestor (MAXLCA) to identify the results to return for a given query, and then ranks these candidate results by BM25. XDist then re-ranks the top several results by a combined distribution measurement (CDM), which considers four criteria: term proximity, intersection of keyword classes, degree of integration among keywords, and quantity variance of keywords. The weights of the four measures in CDM are trained by a listwise learning to optimize method. Experimental results on the INEX evaluation platform show that re-ranking with CDM improves on the BM25 baseline by 22% under iP[0.01] and 18% under MAiP.
The semantic model MAXLCA and the search engine XDist also perform best in their respective fields.
Web of Science record: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000334860600001&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=8e1609b174ce4e31116a60747a720701
Computer Science, Information Systems; SCI(E); 2; ARTICLE; zhdeng@cis.pku.edu.cn; 5; 57
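The two-stage ranking described in the abstract (BM25 ranking of candidates, then re-ranking of only the top few results by a weighted combination of four measures) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the BM25 variant and its parameters k1 and b are the commonly used defaults, the function names bm25_scores and rerank_top_k are hypothetical, the four feature values per result are made-up numbers standing in for the CDM measures (whose formulas the abstract does not give), and the weights are fixed by hand rather than learned. Candidate selection by MAXLCA is not modeled.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    k1 and b are conventional defaults, not values from the paper.
    """
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    terms = set(query)
    # Document frequency: number of documents containing each query term.
    df = {t: sum(1 for d in docs if t in d) for t in terms}
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / (tf[t] + length_norm)
        scores.append(score)
    return scores


def rerank_top_k(ranked_ids, features, weights, k):
    """Re-order only the first k ids by a weighted sum of their features.

    Results below position k keep their first-stage order.
    """
    def combined(doc_id):
        return sum(w * f for w, f in zip(weights, features[doc_id]))

    head = sorted(ranked_ids[:k], key=combined, reverse=True)
    return head + ranked_ids[k:]


# Toy candidate results, already tokenized (hypothetical data).
docs = [
    ["xml", "xml", "xml", "search", "index"],
    ["xml", "keyword", "search"],
    ["database", "index"],
]
query = ["xml", "keyword"]

# Stage 1: rank candidates by BM25.
scores = bm25_scores(query, docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)

# Stage 2: re-rank the top 2 with four placeholder measures per result.
features = {
    0: [0.9, 0.8, 0.7, 0.6],
    1: [0.2, 0.1, 0.3, 0.4],
    2: [1.0, 1.0, 1.0, 1.0],
}
weights = [0.4, 0.3, 0.2, 0.1]  # hand-picked; the paper learns these
reranked = rerank_top_k(ranking, features, weights, k=2)
```

In this toy run the first stage orders the results as 1, 0, 2 and the second stage swaps the top two, while result 2 stays last because it lies outside the re-ranked prefix even though its placeholder features are the highest. Restricting the second stage to the top few results keeps the cost of the extra measures bounded.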
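Language: English
Content type: Journal article
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/152121
Collection: School of Electronics Engineering and Computer Science (信息科学技术学院)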
Recommended citation
GB/T 7714: Gao Ning, Deng ZhiHong, Lu ShengLong. XDist: an effective XML keyword search system with re-ranking model based on keyword distribution[J]. Science China Information Sciences, 2014.
APA: Gao Ning, Deng ZhiHong, & Lu ShengLong. (2014). XDist: an effective XML keyword search system with re-ranking model based on keyword distribution. Science China Information Sciences.
MLA: Gao Ning, et al. "XDist: an effective XML keyword search system with re-ranking model based on keyword distribution." Science China Information Sciences (2014).