CORC  > 北京大学  > 信息科学技术学院
A new algorithm for fast mining frequent itemsets using N-lists
Deng ZhiHong ; Wang ZhongHui ; Jiang JiaJian
刊名science china information sciences
2012
关键词data mining frequent itemset mining data structure N-lists algorithm PATTERNS DATABASES UTILITIES SUPPORT
DOI10.1007/s11432-012-4638-z
英文摘要Mining frequent itemsets has emerged as a fundamental problem in data mining and plays an essential role in many important data mining tasks. In this paper, we propose a novel vertical data representation called N-list, which originates from an FP-tree-like coding prefix tree called PPC-tree that stores crucial information about frequent itemsets. Based on the N-list data structure, we develop an efficient mining algorithm, PrePost, for mining all frequent itemsets. Efficiency of PrePost is achieved by the following three reasons. First, N-list is compact since transactions with common prefixes share the same nodes of the PPC-tree. Second, the counting of itemsets' supports is transformed into the intersection of N-lists and the complexity of intersecting two N-lists can be reduced to O(m + n) by an efficient strategy, where m and n are the cardinalities of the two N-lists respectively. Third, PrePost can directly find frequent itemsets without generating candidate itemsets in some cases by making use of the single path property of N-list. We have experimentally evaluated PrePost against four state-of-the-art algorithms for mining frequent itemsets on a variety of real and synthetic datasets. The experimental results show that the PrePost algorithm is the fastest in most cases. Even though the algorithm consumes more memory when the datasets are sparse, it is still the fastest one.; http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000307720000007&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=8e1609b174ce4e31116a60747a720701 ; Computer Science, Information Systems; SCI(E); 28; ARTICLE; 9; 2008-2030; 55
语种英语
内容类型期刊论文
源URL[http://ir.pku.edu.cn/handle/20.500.11897/152449]  
专题信息科学技术学院
推荐引用方式
GB/T 7714
Deng ZhiHong,Wang ZhongHui,Jiang JiaJian. A new algorithm for fast mining frequent itemsets using N-lists[J]. science china information sciences,2012.
APA Deng ZhiHong,Wang ZhongHui,&Jiang JiaJian.(2012).A new algorithm for fast mining frequent itemsets using N-lists.science china information sciences.
MLA Deng ZhiHong,et al."A new algorithm for fast mining frequent itemsets using N-lists".science china information sciences (2012).
个性服务
查看访问统计
相关权益政策
暂无数据
收藏/分享
所有评论 (0)
暂无评论
 

除非特别说明,本系统中所有内容都受版权保护,并保留所有权利。


©版权所有 ©2017 CSpace - Powered by CSpace