Large-scale similarity join with edit-distance constraints
Lin, Che; Yu, Haiyang; Weng, Wei; He, Xianmang; Lin C (林琛)
2014
Keywords: Algorithms
Abstract: Conference Name: 19th International Conference on Database Systems for Advanced Applications, DASFAA 2014. Conference Address: Bali, Indonesia. Time: April 21, 2014 - April 24, 2014. In the age of big data, the data quality problem is more severe than ever. As an essential step in data cleaning, similarity join has attracted considerable attention from the database community. In this work, to address the similarity join problem with edit-distance constraints, we first improve the partition-based join algorithm for small-scale data. We then extend the algorithm to the MapReduce framework for large-scale data. Extensive experiments on both real and simulated datasets demonstrate the efficiency of our algorithms. © 2014 Springer International Publishing Switzerland.
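Language: English
Source: http://dx.doi.org/10.1007/978-3-319-05813-9-22
Publisher: Springer Verlag
Content type: Other
Source URL: [http://dspace.xmu.edu.cn/handle/2288/86950]
Collection: Information Technology - Conference Papers
Recommended citation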
GB/T 7714
Lin, Che,Yu, Haiyang,Weng, Wei,et al. Large-scale similarity join with edit-distance constraints. 2014-01-01.