Automatic Discovery and Analysis of Hot Issues in Mobile Complaint Information
Abstract
With the rapid development of communication and network technology, mobile communication services have become a convenient means of communication and play an indispensable role in people's daily life and work. In recent years, the range of services offered by operators and the size of their customer base have grown steadily, and the number of customer complaints has risen sharply as a result. Companies have therefore begun to pay closer attention to customer satisfaction and to treat its continuous improvement as an important strategic goal.
     Researchers in China and abroad have already achieved some results in hot issue discovery, but their work mainly targets massive text collections crawled from the Internet. For the specialized domain of customer complaints in communication services, no mature technology is yet available to draw on. Analyzing existing complaint information to discover the hot issues hidden in it, so that measures can be taken in time, is therefore both important and urgent for improving service quality and brand value.
     Against this background, this paper applies data mining to mobile complaint information. The goal is to build a system model for the automatic discovery and analysis of hot issues in complaint information, and thereby resolve the current conflict between the rapidly growing volume of complaints and the low efficiency of their analysis and handling. The main contents of this paper are as follows:
     First, we introduce the research background and current state of hot issue discovery, analyze the problems of existing hot issue discovery systems, and explain the design principles and workflow of the proposed system.
     Second, by analyzing the characteristics of complaint texts and of hot issues, we define the system requirements and design the basic system architecture. We propose an optimized K-means algorithm and, by combining it with online analytical processing (OLAP) from data warehousing and with association rules, mine and analyze hot issues across different topic categories.
     Finally, we apply the system model to a mobile complaint management project and demonstrate the system in operation.
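The abstract does not spell out how the optimized K-means algorithm works, so the following is only a minimal sketch of the general idea of clustering complaint texts and ranking clusters by size as candidate hot issues. It uses plain K-means with a deterministic farthest-point seeding, which is an assumption made for illustration and not the optimization proposed in the paper. The function names (`vectorize`, `kmeans`, `hot_issues`), the whitespace tokenizer, and the toy English complaints are all hypothetical; real Chinese complaint text would first need word segmentation.

```python
import math
from collections import Counter


def vectorize(docs):
    """Turn each document into a unit-length bag-of-words vector."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for w in d.lower().split():
            v[index[w]] += 1.0
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=50):
    """Plain K-means; seeds are chosen by farthest-point selection (assumed)."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(far)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def hot_issues(docs, k, top_terms=3):
    """Rank clusters by size and describe each by its heaviest centroid terms."""
    vocab, vectors = vectorize(docs)
    labels, centroids = kmeans(vectors, k)
    ranked = []
    for cluster, size in Counter(labels).most_common():
        weights = sorted(zip(vocab, centroids[cluster]), key=lambda t: (-t[1], t[0]))
        ranked.append((size, [term for term, _ in weights[:top_terms]]))
    return ranked


# Toy complaints, invented for illustration only.
complaints = [
    "sms fee charged twice",
    "sms fee wrong charged",
    "sms fee charged without notice",
    "network signal weak",
    "network signal lost",
]
for size, terms in hot_issues(complaints, k=2):
    print(size, terms)
```

In this sketch a larger cluster is read as a more frequent, and therefore hotter, complaint topic. In a fuller pipeline the cluster labels could then be joined with attributes such as region or service type for the OLAP and association rule analysis the abstract mentions.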