Research on Protein Structure Prediction Based on Multiple Classifier Combination
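Abstract
With the smooth progress of the Human Genome Project, more and more protein sequences are being determined. Using theoretical and computational methods to study protein structure and function, and thereby guide experiments, is highly worthwhile. Starting from the primary sequence of a protein, this thesis uses several multiple classifier combination algorithms to classify protein structures. The main work is as follows:
     1. Background knowledge on proteins is briefly introduced, and, in the course of studying the protein structure prediction problem, the use of multiple classifier combination algorithms to classify protein structures is proposed. Building on a study of classifiers such as the support vector machine and of multiple classifier combination, the types and algorithms of such combinations are analyzed.
     2. The state of the art in protein fold prediction is reviewed, and a multiple classifier cascade algorithm based on support vector machines is proposed for the fold classification problem. The experimental accuracy is nearly four percentage points higher than that of direct classification, which demonstrates the effectiveness of the approach.
     3. The theoretical framework of multiple classifier fusion is analyzed, and the decision template algorithm is applied to the prediction of protein structural classes. Three improvements to the algorithm are then made, and different experiments are designed to validate them. The results improve to varying degrees in every experiment, which shows that the improvements are effective and that applying fusion algorithms to protein structural class classification is a feasible approach.
     4. The theoretical framework of multiple classifier selection is analyzed, the dynamic selection algorithm based on local class accuracy and the clustering-and-selection algorithm are described, and both are applied to the classification of protein homo-oligomers. Compared with results obtained using a support vector machine, the selection algorithms predict protein homo-oligomers with better accuracy and reliability.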
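For readers unfamiliar with the decision template fusion named in item 3, the following is a minimal sketch of the basic (unimproved) idea, not the thesis's own implementation. It assumes each sample is described by a decision profile (one row of class scores per base classifier), that a class template is the mean profile of that class's training samples, and that similarity is measured by squared Euclidean distance. The function names and the toy scores are hypothetical.

```python
from typing import Dict, List, Sequence

# A decision profile: one row per base classifier, one column per class.
Profile = List[List[float]]


def build_templates(profiles: Sequence[Profile], labels: Sequence[int]) -> Dict[int, Profile]:
    """Average the decision profiles of the training samples of each class."""
    sums: Dict[int, Profile] = {}
    counts: Dict[int, int] = {}
    for dp, y in zip(profiles, labels):
        if y not in sums:
            sums[y] = [[0.0] * len(row) for row in dp]
            counts[y] = 0
        for i, row in enumerate(dp):
            for j, value in enumerate(row):
                sums[y][i][j] += value
        counts[y] += 1
    return {y: [[v / counts[y] for v in row] for row in s] for y, s in sums.items()}


def squared_distance(a: Profile, b: Profile) -> float:
    """Squared Euclidean distance between two profiles of the same shape."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def classify(dp: Profile, templates: Dict[int, Profile]) -> int:
    """Return the class whose template is closest to the sample's profile."""
    return min(templates, key=lambda y: squared_distance(dp, templates[y]))


# Toy data (assumed): two base classifiers, two classes.
train_profiles = [
    [[0.9, 0.1], [0.8, 0.2]],
    [[0.7, 0.3], [0.6, 0.4]],
    [[0.2, 0.8], [0.3, 0.7]],
    [[0.1, 0.9], [0.4, 0.6]],
]
train_labels = [0, 0, 1, 1]
templates = build_templates(train_profiles, train_labels)
predicted = classify([[0.6, 0.4], [0.55, 0.45]], templates)
```

Other similarity measures can be substituted for `squared_distance` without changing the rest of the sketch.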
The success of the Human Genome Project is rapidly increasing the number of protein sequences entering the data banks. Using theoretical and computational methods to predict the structure and function of proteins, and thereby guide experiments, is highly worthwhile. In this thesis, we use several multiple classifier combination methods to classify protein structures from their primary sequences. The main work is summarized as follows:
    1. Background knowledge on proteins is first introduced. In studying protein structure prediction, we propose applying multiple classifier combination methods to classify protein structures. Building on the existing literature, we also summarize classifiers such as the support vector machine and the methods of multiple classifier combination.
    2. For multi-class protein fold recognition, we use a cascade algorithm based on support vector machines to classify the folds. The overall accuracy is nearly four percentage points higher than that of direct classification, which suggests that the approach is feasible.
    3. We investigate the theoretical framework of multiple classifier fusion and apply the decision template algorithm to classify protein structural classes. We then propose three improvements to the algorithm and design different experiments to validate them. The results improve to varying degrees in every experiment, which shows that the improvements are effective and that using multiple classifier fusion to classify protein structural classes is feasible.
    4. We investigate the theoretical framework of classifier selection, describe dynamic classifier selection using local class accuracy estimates and the clustering-and-selection algorithm, and apply both to the classification of protein homo-oligomers. Compared with prediction using a support vector machine, the experimental results suggest that the selection algorithms achieve better accuracy and reliability.
    This thesis was supported by the Graduate Starting Seed Fund of Northwestern Polytechnical University, No. Z20030048.
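The dynamic classifier selection by local class accuracy mentioned in item 4 can be illustrated with a small sketch. It is written under stated assumptions rather than taken from the thesis: base classifiers are plain callables, the local region is the k nearest validation samples by squared Euclidean distance, and a classifier's local class accuracy is the fraction of those neighbors it assigned to its predicted class that truly belong to that class. All names and the toy data are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Classifier = Callable[[Vector], int]


def nearest_neighbors(x: Vector, samples: Sequence[Vector], k: int) -> List[int]:
    """Indices of the k validation samples closest to x."""
    order = sorted(
        range(len(samples)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(x, samples[i])),
    )
    return order[:k]


def local_class_accuracy(clf: Classifier, x: Vector, val_x: Sequence[Vector],
                         val_y: Sequence[int], k: int) -> float:
    """Among neighbors that clf assigns to its prediction for x, the fraction that is correct."""
    predicted = clf(x)
    assigned = [i for i in nearest_neighbors(x, val_x, k) if clf(val_x[i]) == predicted]
    if not assigned:
        return 0.0
    return sum(1 for i in assigned if val_y[i] == predicted) / len(assigned)


def dcs_la_predict(classifiers: Sequence[Classifier], x: Vector, val_x: Sequence[Vector],
                   val_y: Sequence[int], k: int = 3) -> int:
    """Use the prediction of the classifier with the highest local class accuracy."""
    best = max(classifiers, key=lambda clf: local_class_accuracy(clf, x, val_x, val_y, k))
    return best(x)


# Toy setup (assumed): each threshold classifier is reliable in a different region.
clf_low = lambda v: 0 if v[0] < 1.5 else 1
clf_high = lambda v: 0 if v[0] < 8.5 else 1
val_x = [[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]]
val_y = [0, 0, 1, 0, 1, 1]

near_low = dcs_la_predict([clf_low, clf_high], [2.2], val_x, val_y, k=3)
near_high = dcs_la_predict([clf_low, clf_high], [8.2], val_x, val_y, k=3)
```

In this toy setup the selector follows `clf_low` near the low threshold and `clf_high` near the high one, which is the behavior a selection scheme is meant to provide when base classifiers have complementary regions of competence.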