Compound classification based on one-class graph classification
  • Title (English, as published): Compound classification based on one-class graph classification
  • Authors: WANG Xiao-dong (王晓东); ZHANG Yang (张阳); WANG Mei-li (王美丽)
  • Affiliation: College of Information Engineering, Northwest A&F University (西北农林科技大学信息工程学院)
  • Keywords: graph classification; frequent subgraph mining; feature selection; one-class classification; AdaBoost
  • Journal: Computer Engineering and Design (计算机工程与设计; journal code SJSJ)
  • Publication date: 2019-03-16
  • Publisher: Computer Engineering and Design (计算机工程与设计)
  • Year: 2019
  • Volume/issue: v.40; No.387
  • Funding: National Natural Science Foundation of China (61402374); Shaanxi Province Science and Technology Coordinated Innovation Project (2014KTDZ03-02-2)
  • Language: Chinese
  • Article ID: SJSJ201903026
  • Page count: 7
  • Issue: 03
  • CN: 11-1775/TP
  • Pages: 155-161
Abstract
To reduce the cost of compound screening and structure optimization in new drug design, and to improve the accuracy of both, a one-class graph classification method for compound classification is proposed. Because compounds have a strict topological structure, an optimized frequent subgraph mining method is used to select feature subgraphs: closed frequent subgraphs, which carry more complete information, are mined adaptively at each stage of the support threshold. The extracted frequent subgraphs are then used as input features, and a classification model is trained by using AdaBoost to combine one-class support vector machine classifiers into an ensemble. Experimental results show that the method extracts highly relevant frequent substructures, significantly reduces the feature space obtained from frequent subgraph mining, and effectively improves classification accuracy and generalization.
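The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so everything here is an assumption. The frequent subgraphs are assumed to have been mined already by some external tool, each compound is represented as a set of subgraph indices, and the boosting loop is a simplified AdaBoost-style reweighting around scikit-learn's `OneClassSVM`. The helper names `to_indicator_matrix`, `fit_boosted_ocsvm`, and `predict_boosted_ocsvm` are hypothetical.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def to_indicator_matrix(compound_patterns, n_patterns):
    """Turn per-compound sets of subgraph indices into a 0/1 feature matrix.

    compound_patterns[i] is assumed to hold the indices of the mined
    frequent subgraphs that occur in compound i (mining is not shown).
    """
    X = np.zeros((len(compound_patterns), n_patterns))
    for row, patterns in enumerate(compound_patterns):
        X[row, sorted(patterns)] = 1.0
    return X


def fit_boosted_ocsvm(X, n_rounds=5, nu=0.2, gamma="scale"):
    """Boosting-style ensemble of one-class SVMs (illustrative assumption).

    All rows of X belong to the single known class. Samples that a round
    rejects get a larger weight in the next round.
    """
    n = X.shape[0]
    weights = np.full(n, 1.0 / n)
    models, alphas = [], []
    for _ in range(n_rounds):
        model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma)
        # Scale weights so that their mean is 1, as with unweighted fitting.
        model.fit(X, sample_weight=weights * n)
        rejected = model.predict(X) == -1
        # Weighted rejection rate plays the role of the AdaBoost error;
        # clipping keeps the model weight alpha finite and positive.
        err = float(np.clip(weights[rejected].sum(), 1e-6, 0.499))
        alpha = 0.5 * np.log((1.0 - err) / err)
        models.append(model)
        alphas.append(alpha)
        weights = weights * np.exp(np.where(rejected, alpha, -alpha))
        weights /= weights.sum()
    return models, np.array(alphas)


def predict_boosted_ocsvm(models, alphas, X):
    """Weighted vote: +1 means 'belongs to the target class', -1 means outlier."""
    votes = sum(a * m.predict(X) for a, m in zip(alphas, models))
    return np.where(votes >= 0, 1, -1)
```

To use the sketch, build `X` with `to_indicator_matrix` from the compounds of the known class, call `fit_boosted_ocsvm(X)`, and pass feature vectors of new compounds to `predict_boosted_ocsvm`. The values of `nu`, `gamma`, and `n_rounds` are placeholders, not settings reported in the paper.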