面向大规模网络的快速重叠社团挖掘算法
详细信息    查看全文 | 推荐本文 |
  • 英文篇名:Fast Overlapping Communities Detection Algorithms for Large-Scale Social Networks
  • 作者:李政廉 ; 吉立新 ; 黄瑞阳 ; 兰巨龙
  • 英文作者:LI Zheng-lian;JI Li-xin;HUANG Rui-yang;LAN Ju-long;National Digital Switching System Engineering & Technological R&D Center;
  • 关键词:重叠社团挖掘 ; 局部信息 ; 社团连接度 ; 邻居连接度
  • 英文关键词:overlapping communities detection;;Local information;;community connectivity;;neighborhood connectivit
  • 中文刊名:DZXU
  • 英文刊名:Acta Electronica Sinica
  • 机构:国家数字交换系统工程技术研究中心;
  • 出版日期:2019-02-15
  • 出版单位:电子学报
  • 年:2019
  • 期:v.47;No.432
  • 基金:国家自然科学基金(No.61521003)
  • 语种:中文;
  • 页:DZXU201902001
  • 页数:9
  • CN:02
  • ISSN:11-2087/TN
  • 分类号:3-11
摘要
重叠社团在社交网络大数据中普遍存在.针对现有重叠社团挖掘算法易将重叠区域错误地划分为独立的社团且计算复杂的问题,提出了一种基于局部信息度量的快速重叠社团挖掘算法(Local information based Fast Over-lapped Communities Detection,Li-FOCD).首先,为节点定义局部信息度量指标——社团连接度和邻居连接度,建模节点与社团的关系,缩小了计算范围;然后,每次并行地迭代执行缩减、扩展、去重等操作,并更新局部度量指标,通过松弛每次迭代的终止条件,发现近似最优社团集合而不是最优社团,最终算法复杂度为O(m+n).基于真实的大规模社交网络数据的试验分析表明:与当前流行的重叠社团挖掘算法相比,Li-FOCD在不损失检测质量的前提下,大幅提升了计算效率.
        Overlap between community pairs is commonplace in large-scale social networks. The most existing overlapped community detection algorithms may falsely identify overlaps as communities because the overlap area is denser than others, and those algorithms are computationally demanding and cannot scale well with the size of networks. In this paper,we propose a fast overlapping community discovery algorithm based on some locally computed information—Local information based Fast Overlapped Communities Detection( Li-FOCD). Firstly,we introduce two local information metrics for each network node—community connectivity score and neighborhood connectivity score, to model the relationship between nodes and communities; secondly,based on local metrics,we can concurrently execute the iterations of reduction,expansion,and duplication removal to find the approximately optimal communities instead of the optimal community,and achieve a low complexity O( m + n). Experimental analysis based on real large-scale social networking datasets shows that our algorithm outperforms some popular overlapped community finding algorithms in terms of computational time while not compromising with quality.
引文
[1]DASGUPTA K,SINGH R,VISWANATHAN B,et al. So-cial ties and their relevance to churn in mobile telecom net-w orks[A]. Proceedings of 11th International Conferenceon Extending Database Technology:Advances in DatabaseTechnology[C]. New York:ACM,2008. 668-677.
    [2]CAI Q,MA L,GONG M,et al. A survey on network com-munity detection based on evolutionary computation[J].International Journal of Bio-Inspired Computation,2016,8(2):84-98.
    [3]SCHAEFFER S E. Graph clustering[J]. Computer ScienceReview,2007,1(1):27-64.
    [4]邓小龙,翟佳羽,尹栾玉.基于矢量影响力聚类系数的高效有向网络社团划分算法[J].电子与信息学报,2017,39(9):2071-2080.Deng Xiaolong,Zhai Jiayu,Yin Luanyu. Vector influenceclustering coefficient based efficient directed communitydetection algorithm[J]. Journal of Electronics&Informa-tion Technology,2017,39(9):2071-2080.(in Chinese)
    [5] BLONDEL V D,GUILLAUME J L,LAMBIOTTE R,etal. Fast unfolding of communities in large netw orks[J].Journal of Statistical M echanics:Theory and Experiment,2008,2008(10):10008.
    [6] LANCICHINETTI A,FORTUNATO S and KERTESZ J.Detecting the overlapping and hierarchical communitystructure in complex netw orks[J]. New Journal of Phys-ics,2009,11(3):033015.
    [7]LEE C,REID F,MCDAID A,et al. Detecting highly over-lapping community structure by greedy clique expansion[J]. ArXiv,2010,1002:1827.
    [8] MISHRA N,SCHREIBER R,STANTON I,et al. Findingstrongly knit clusters in social netw orks[J]. Internet M ath-ematics,2008,5(1-2):155-174.
    [9]YANG J,LESKOVEC J. Structure and overlaps of commu-nities in netw orks[J]. ArXiv,2012,1205:6228.
    [10]乔少杰,韩楠,张凯峰,等.复杂网络大数据中重叠社区检测算法[J].软件学报,2017,28(3):631-647.QIAO Shaojie,HAN Nan,ZHANG Kaifeng,et al. Algo-rithm for detecting overlapping communities from com-plex netw ork big data[J]. Journal of Softw are,2017,28(3):631-647.(in Chinese)
    [11] CHAKRABORTY T,KUMAR S,GANGULY N,et al.GenPerm:a unified method for detecting non-overlappingand overlapping communities[J]. IEEE Transactions onKnow ledge and Data Engineering,2016,28(8):2101-2114.
    [12]HUANG Faliang,LI Xuelong,et al. Overlapping commu-nity detection for multimedia social netw orks[J]. IEEETransactions on M ultimedia,2017,19(8):1881-1893.
    [13]AHN Y Y,BAGROW J P,LEHMANN S. Link communi-ties reveal multiscale complexity in netw orks[J]. Nature,2010,466(7307):761-764.
    [14] MCDAID A,HURLEY N. Detecting highly overlappingcommunities w ith model-based overlapping seed expan-sion[A]. Proceedings of International Conference on Ad-vances in Social Netw orks Analysis and M ining[C]. Pis-cataw ay,NJ:IEEE,2010. 112-119.
    [15] LIU Xiao,WEI Yiming,WANG Jian,et al. Communitydetection enhancement using no-negative matrix factoriza-tion w ith graph regularization[J]. International Journal ofM odern Physics B,2016,30(20):1650130.
    [16]Yang J,LESKOVEC J. Overlapping community detectionat scale:a nonnegative matrix factorization approach[A].Proceedings of the Sixth ACM International Conferenceon Web Search and Data M ining[C]. New York:ACM,2013. 587-596.
    [17] NARAYANAM R,NARAHARI Y. A game theory in-spired,decentralized,local information based algorithmfor community detection in social graphs[A]. Proceed-ings of 21st International Conference on Pattern Recogni-tion[C]. Piscataw ay,NJ:IEEE,2012,1072-1075.
    [18] GREGORY S. Finding overlapping communities in net-w orks by label propagation[J]. New Journal of Physics,2010,12(10):103018.
    [19]刘功申,孟魁,郭弘毅,苏波,李建华.基于贡献函数的重叠社区划分算法[J].电子与信息学报,2017,39(8):1964-1971.LIU Gongshen,M ENG Kui,GUO Hongyi,SU Bo,LIJianhua. Overlapping-communities recognition algorithmbased on contribution function[J]. Journal of Electronics&Information Technology,2017,39(8):1964-1971.(inChinese)
    [20]CHEN W,LIU Z,SUN X,et al. A game-theoretic frame-w ork to identify overlapping communities in social net-w orks[J]. Data M ining and Know ledge Discovery,2010,21(2):224-240.
    [21]ZHANG Lei,PAN Hebin,et al. A mixed representation-based multi-objective evolutionary algorithm for overlap-ping community detection[J]. IEEE Transactions on Cy-bernetics,2017,47(9):2703-2716.
    [22] WEN X,CHEN W N,LIN Y,et al. A maximal cliquebased multi-objective evolutionary algorithm for overlap-ping community detection[J]. IEEE Transactions on Evo-lutionary Computation,2017,21(3):363-377.
    [23]BAUMES J,GOLDBERG M,MAGDOM-ISMAIL M. Ef-ficient identification of overlapping communities[A].Proceedings of International Conference on Intelligenceand Security Informatics[C]. Heidelberg,Berlin:Spring-er,2005. 27-36.
    [24]XIE J and SZYMANSKI B K. Community detection usinga neighborhood strength driven label propagation algo-rithm[A]. Proceedings of Netw ork Science Workshop[C]. Piscataway,NJ:IEEE,2011. 188-195.
    [25]XIE J,SZYMANSKI B K. Towards linear time overlap-ping community detection in social netw orks[A]. Pro-ceedings of Pacific-Asia Conference on Know ledge Dis-covery and Data M ining[C]. Heidelberg,Berlin:Spring-er,2012. 25-36.
    [26]刘世超,朱福喜,甘琳.基于标签传播概率的重叠社区发现算法[J].计算机学报,2016,29(4):717-729.LIU Shichao,ZHU Fuxi,GAN Lin. A label-propagation-probability-based algorithm for overlapping communitydetection[J]. Chinese Journal of Computers,2016,29(4):717-729.(in Chinese)
    [27]RAGHAVAN U N,ALBERT R and KUMARA S. Nearlinear time algorithm to detect community structures inlarge-scale netw orks[J]. Physical review E,2007,76(3):036106.
    [28]马慧芳,谢蒙,何廷年,等.基于核心标签的可重叠微博网络社区划分方法[J].电子学报,2017,45(4):769-776.M A Xiefang,XIE M eng,HE Tingnian,et al. An overlap-ping microblog community detection algorithm via coretags[J]. Acta Electronica Sinica,2017,45(4):769-776.(in Chinese)
    [29] COSCIA M,ROSSETTI G,GIANNOTTI F,et al. De-mon:a local-first discovery method for overlapping com-munities[A]. Proceedings of the 18th ACM SIGKDD In-ternational Conference on Know ledge Discovery and DataM ining[C]. New York:ACM,2012. 615-623.
    [30] YANG B,LIU J,LIU D. An autonomy-oriented compu-ting approach to community mining in distributed and dy-namic netw orks[J]. Autonomous Agents and M ulti-AgentSystems,2010,20(2):123-157.
    [31]BU Z,WU Z,CAO J,et al. Local community mining ondistributed and dynamic netw orks from a multiagent per-spective[J]. IEEE Transactions on cybernetics,2016,46(4):986-999.
    [32]BU Z,CAO J,LI H J,et al. GLEAM:a graph clusteringframew ork based on potential game optimization for large-scale social netw orks[J]. Know ledge and InformationSystems,2017:1-30.
    [33]国琳,左万利,彭涛.基于隶属度的社会化网络重叠社区发现及动态集群演化分析[J].电子学报,2016,44(3):587-594.GUO L,ZUO W L,PENG T. Overlapping community de-tection and dynamic group evolution analysis based on thedegree of membership in social netw ork[J]. Acta Elec-tronica Sinica,2016,44(3):587-594.(in Chinese)
    [34]YANG J,LESKOVEC J. Defining and evaluating networkcommunities based on ground-truth[J]. Know ledge andInformation Systems,2015,42(1):181-213.
    [35] LANCICHINETTI A,RADICCHI F,RAMASCO J J,etal. Finding statistically significant communities in net-w orks[J]. PloS One,2011,6(4):e18961.