Abstract
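【Purpose/Significance】Timely detection of hot topics in online public opinion on emergencies, together with multi-angle, comprehensive, high-precision monitoring of such events, depends on accurately constructing a knowledge-graph monitoring model of online public opinion emergencies within a specific time window. Such a model has an important bearing on monitoring opinion content and discovering burst topics.【Method/Process】Based on knowledge graph theory, this paper proposes a new method for monitoring online public opinion. Taking the temporal characteristics of online public opinion on emergencies as its starting point, the method proceeds in three steps: burst term identification, burst topic graph construction, and semantic supplementation and refinement. It filters out irrelevant online content while preserving the characteristics of the emergency, builds a burst topic graph that contains semantic relations, and thereby achieves comprehensive, high-precision, low-noise monitoring of hot topics in online public opinion on emergencies. Finally, experiments are carried out on a fully annotated Weibo dataset and an online Weibo data stream.【Results/Conclusions】The experimental results show that the knowledge-graph-based method effectively improves the accuracy and comprehensiveness of monitoring online public opinion on emergencies. Compared with traditional online public opinion monitoring algorithms, its precision and recall for emergency detection improve by more than 6%, and its F1 score improves by more than 12%. In other words, the three steps of burst term selection, burst topic graph construction, and semantic supplementation and refinement account for the improvement at the theoretical level. The method offers useful guidance for discovering online public opinion topics promptly, capturing development trends precisely, and preventing and handling online public opinion crises in a targeted way.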
【Purpose/significance】To detect emerging topics in online public opinion and to monitor user-generated content on social media platforms, an online public opinion monitoring and early-warning module detects social emergencies on the basis of knowledge graph theory.【Method/process】This paper proposes a novel approach based on the knowledge graph. The general idea is to combine diverse features of terms and their multiple correlations in a topic graph. The method relies on a three-step process: Emerging Terms Identification, Topic Graph Construction, and Semantic Enrichment. Furthermore, a novel mechanism of term selection and semantic enrichment for graph-based topic detection is proposed to help eliminate noise and extract more comprehensive information from data streams. Extensive experiments on the dataset verify the effectiveness of our algorithm over several benchmarks.【Result/conclusion】Both topic precision and recall increase by at least 6% with our approach, reflecting the accuracy of topic detection during the emerging period of real-life events. The F1 score for term extraction increases by at least 12%, which means that the quality of the terms extracted to describe an emerging topic has improved.
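The three-step process can be illustrated with a small Python sketch. It is not the paper's algorithm: the abstract gives no formulas, so the burst score (a smoothed ratio of document frequencies between a past and a current window), the co-occurrence edge weights, and every function name, threshold, and sample post below are assumptions chosen for illustration. In particular, the `enrich` step is a co-occurrence stand-in for the knowledge-graph-based semantic enrichment, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def burst_terms(current_docs, past_docs, min_ratio=2.0, min_count=3):
    """Step 1 (assumed scoring): keep terms whose document frequency rate in
    the current window is at least min_ratio times the smoothed past rate."""
    cur = Counter(t for doc in current_docs for t in set(doc))
    past = Counter(t for doc in past_docs for t in set(doc))
    n_cur = max(len(current_docs), 1)
    n_past = len(past_docs)
    bursty = set()
    for term, count in cur.items():
        if count < min_count:
            continue  # too rare in the current window to be reliable
        cur_rate = count / n_cur
        past_rate = (past[term] + 1) / (n_past + 1)  # add-one smoothing
        if cur_rate / past_rate >= min_ratio:
            bursty.add(term)
    return bursty


def topic_graph(current_docs, bursty, min_cooccur=2):
    """Step 2: weighted co-occurrence graph over burst terms only.
    Edges are sorted term pairs mapped to the number of shared documents."""
    edges = Counter()
    for doc in current_docs:
        terms = sorted(set(doc) & bursty)
        for pair in combinations(terms, 2):
            edges[pair] += 1
    return {pair: w for pair, w in edges.items() if w >= min_cooccur}


def enrich(graph, current_docs, min_support=2):
    """Step 3 (stand-in): attach non-burst context terms that co-occur with
    a graph node in at least min_support documents."""
    nodes = {t for pair in graph for t in pair}
    extra = Counter()
    for doc in current_docs:
        terms = set(doc)
        for anchor in terms & nodes:
            for context in terms - nodes:
                extra[tuple(sorted((anchor, context)))] += 1
    enriched = dict(graph)
    enriched.update({p: w for p, w in extra.items() if w >= min_support})
    return enriched


# Hypothetical tokenized posts: an earlier window and the current window.
past = [["weather", "sunny"], ["weather", "rain"],
        ["lunch", "noodles"], ["weather", "cold"]]
current = [["earthquake", "sichuan", "rescue"],
           ["earthquake", "sichuan", "weather"],
           ["earthquake", "rescue", "team"],
           ["lunch", "weather"],
           ["sichuan", "rescue", "team"]]

bursty = burst_terms(current, past)
graph = topic_graph(current, bursty)
enriched = enrich(graph, current)
```

On this toy data, the frequent but long-standing term `weather` is filtered out in step 1, the three burst terms form a fully connected topic graph in step 2, and step 3 attaches `team` to `rescue` as supporting context. A real system would replace the whitespace-level tokens with proper Chinese word segmentation and the stand-in enrichment with relations drawn from a knowledge graph.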