An Enhanced Online Outlier Detection Algorithm for Abnormal Probe Vehicle Data
  • English title: An Enhanced Online Algorithm for Detecting Abnormal Probe Vehicle Data
  • Authors: CAO Peng; MA Jie; SHI Zhan-hua
  • Affiliations: School of Transportation and Logistics, Southwest Jiaotong University; School of Transportation Engineering, Tongji University
  • Keywords: intelligent transportation systems; outlier detection; clustering; probe vehicle data
  • Journal code: JTGC
  • Journal: Journal of Transportation Engineering and Information
  • Publication date: 2019-06-15
  • Publisher: Journal of Transportation Engineering and Information
  • Year: 2019
  • Issue: v.17; No.64
  • Funding: Fundamental Research Funds for the Central Universities
  • Language: Chinese
  • Page: JTGC201902004
  • Page count: 13
  • CN: 02
  • ISSN: 51-1652/U
  • Classification number: 27-39
Abstract
In intelligent transportation systems (ITS), abnormal records must be identified and removed in real time to obtain accurate, dynamic traffic information from probe vehicle data. This paper proposes an Enhanced Online SmartSifter Algorithm (EOSA), an online unsupervised outlier detection algorithm, for detecting abnormal probe vehicle data. EOSA consists of two components: the SmartSifter (SS) algorithm and the K-means clustering algorithm. The SS algorithm uses a probabilistic model over categorical and continuous variables to compute an anomaly score, and combining K-means clustering with the SS algorithm improves detection accuracy. EOSA was validated on probe vehicle data from Shenzhen, China. The results indicate that EOSA successfully detects abnormal data, including abnormal GPS records produced while vehicles are parked or stopped and waiting. In a comparative experiment, EOSA achieved higher detection accuracy than six commonly used existing algorithms. EOSA therefore has potential for application in ITS based on probe vehicle data.
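The two-stage idea in the abstract (an online probabilistic anomaly score, followed by clustering) can be illustrated with a small sketch. This is not the authors' EOSA implementation. It assumes one categorical variable (a hypothetical passenger status) and one continuous variable (speed), replaces SmartSifter's per-cell Gaussian mixture with a single discounted Gaussian per category, and assumes that K-means is applied to the one-dimensional scores with k = 2 to separate normal from abnormal records, since the abstract does not say how the two components are coupled. The class name, parameters (`r`, `beta`, `mean0`, `var0`), and the synthetic data are all illustrative assumptions.

```python
import math


class DiscountingScorer:
    """Simplified SmartSifter-style scorer (assumption: one Gaussian per category)."""

    def __init__(self, categories, mean0, var0, r=0.05, beta=1.0):
        self.categories = list(categories)
        self.r = r        # discounting rate: larger values forget old data faster
        self.beta = beta  # Laplace smoothing constant for the categorical model
        self.counts = {c: 0.0 for c in self.categories}
        self.mean = {c: float(mean0) for c in self.categories}
        self.var = {c: float(var0) for c in self.categories}

    def score_and_update(self, cat, x):
        # Score: negative log-likelihood under the model learned so far.
        k = len(self.categories)
        total = sum(self.counts.values())
        p_cat = (self.counts[cat] + self.beta) / (total + k * self.beta)
        var = self.var[cat]
        log_pdf = (-0.5 * math.log(2 * math.pi * var)
                   - (x - self.mean[cat]) ** 2 / (2 * var))
        score = -(math.log(p_cat) + log_pdf)

        # Update: discount old statistics, then absorb the new record.
        for c in self.categories:
            hit = 1.0 if c == cat else 0.0
            self.counts[c] = (1 - self.r) * self.counts[c] + hit
        delta = x - self.mean[cat]
        self.mean[cat] += self.r * delta
        self.var[cat] = max((1 - self.r) * (var + self.r * delta * delta), 1e-6)
        return score


def two_means_threshold(scores, iters=50):
    """1-D K-means with k=2; returns the midpoint between the two centroids."""
    lo, hi = min(scores), max(scores)
    for _ in range(iters):
        mid = (lo + hi) / 2
        low = [s for s in scores if s <= mid]
        high = [s for s in scores if s > mid]
        if not high:  # all scores identical: nothing to separate
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


# Synthetic stream (illustrative only): speeds near 40 km/h, one injected outlier.
records = [("vacant" if i % 3 == 0 else "occupied", 40.0 + (i % 5) - 2.0)
           for i in range(60)]
records[50] = ("occupied", 150.0)

scorer = DiscountingScorer(["occupied", "vacant"], mean0=40.0, var0=25.0)
scores = [scorer.score_and_update(cat, speed) for cat, speed in records]
threshold = two_means_threshold(scores)
flagged = [i for i, s in enumerate(scores) if s > threshold]
print(flagged)
```

Each record is scored before the model is updated with it, so a surprising record receives a high score even though it then shifts the estimates. Clustering the scores avoids choosing a fixed threshold by hand; in this synthetic stream the injected record at index 50 is the one that ends up in the high-score cluster.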