CORC  > 北京大学  > 信息科学技术学院
一种基于 Sketch 的 Top-k 紧密中心性快速搜索算法; A Fast Sketch-Based Approach of Top-k Closeness Centrality Search on Large Networks
邵蓥侠 ; 崔斌 ; 马林 ; 阴红志
刊名计算机学报
2016
关键词紧密中心性 图算法 近似算法 图分析 社交网络 closeness centrality graph algorithm approximation algorithm graph analysis social network
DOI10.11897/SP.J.1016.2016.01965
英文摘要在大数据的时代背景下,由于网络数据(network data)能有效简洁地描述社交网络、电子商务、医疗记录、在线教育等多种应用中各类复杂关系,越来越受到工业界和学术界的关注。在社交网络分析任务中,一个基本操作是从网络中发现重要程度前 k 大的节点。紧密中心性(closeness centrality)是一种常见的节点重要性刻画指标,它用节点在网络中心的程度来反映节点的重要性。用紧密中心性衡量节点重要性进行节点搜索的问题称为 top-k 紧密中心性搜索问题。然而,传统的精确算法由于其多项式级别的复杂度无法高效地扩展到大规模的网络数据上。近来,研究人员提出了近似算法,通过牺牲结果精度来获得性能提升。通过分析发现,目前存在的近似算法虽然性能得到了有效提升,但是结果精度牺牲过大。为了解决这个问题,该文设计了一种新颖的近似算法,叫做基于 Sketch的紧密中心性搜索算法。此近似算法应用了一个全新的计算方式,利用 Sketch 估计同一距离的邻居数目,然后得到近似的最短距离之和,最终得到各个节点的紧密中心性的估计值。此算法的时间复杂度为 O(m tD max ),其中 t 是常数,D max 是网络直径,m 是网络边数。根据实际社交网络的小世界现象的特性,此近似算法基本是个线性算法。最后,相比于目前存在的精确算法和近似算法,该文通过全面的实验验证了基于 Sketch 的紧密中心性搜索算法在时间性能和结果精度等两方面的优势。; The advanced development of various technologies on social network,e-commerce and online education has contributed to an increasing amount of large-scale network data.Among all sorts of network analysis tasks,one basic task is to search important nodes in a network. Closeness centrality is one of the popular metrics which measure the importance of a node in a network.Based on the closeness centrality,the basic task is called top-k closeness centrality search.However,the existing exact approaches cannot process large-scale networks because of their polynomial time complexity.Recently,some approximation algorithms are proposed,which achieve high performance by sacrificing the precision of results.But according to our study,we find that the loss of the precision of results is too much.To improve the precision of results while maintaining the high performance,in this paper,we propose a Sketch-based approximation algorithm for fast searching top-k closeness centrality in a large-scale network. The new algorithm is developed based on a new computation method which calculates the centrality by estimating the number of nodes within a certain distance by a data structure called FM-Sketch. The new algorithm has time complexity O(mtD max ),where t is a constant,D max is the diameter of a network and m is the number of edges in a network.With the small-world phenomenon assumption, the Sketch-based algorithm is a linear algorithm.Finally,we compare our Sketch-based algorithm with the state-of-the-art exact and approximation algorithms through extensive experiments. The results demonstrate the advantages of the new solution.; 国家自然科学基金; 国家“九七三”重点基础研究发展规划项目基金; 深圳政府研究项目(JCYJ20151014093505032)资助.; 中文核心期刊要目总览(PKU); 中国科技核心期刊(ISTIC); 中国科学引文数据库(CSCD); 10; 1965-1978; 39
语种中文
内容类型期刊论文
源URL[http://ir.pku.edu.cn/handle/20.500.11897/457650]  
专题信息科学技术学院
推荐引用方式
GB/T 7714
邵蓥侠,崔斌,马林,等. 一种基于 Sketch 的 Top-k 紧密中心性快速搜索算法, A Fast Sketch-Based Approach of Top-k Closeness Centrality Search on Large Networks[J]. 计算机学报,2016.
APA 邵蓥侠,崔斌,马林,&阴红志.(2016).一种基于 Sketch 的 Top-k 紧密中心性快速搜索算法.计算机学报.
MLA 邵蓥侠,et al."一种基于 Sketch 的 Top-k 紧密中心性快速搜索算法".计算机学报 (2016).
个性服务
查看访问统计
相关权益政策
暂无数据
收藏/分享
所有评论 (0)
暂无评论
 

除非特别说明,本系统中所有内容都受版权保护,并保留所有权利。


©版权所有 ©2017 CSpace - Powered by CSpace