A Study of the Properties of Structurally Symmetric Proteins
Abstract
The relationship between protein sequence and structure, often called the second genetic code, has long been a central and difficult problem in life science. Recently, researchers studying certain special classes of proteins have uncovered specific correlations between sequence and structure, which has accelerated progress on this problem. This thesis presents a systematic study of proteins with symmetric structures, covering the characteristics of their sequences, the correlation between sequence and structure, and the mechanism by which symmetric structures form. This work contributes to a deeper understanding of the relationship between protein sequence and structure. The thesis covers four lines of work:
     1) We selected symmetric protein domains from the α, β, and αβ classes and applied a nonlinear recurrence plot method to analyze their sequences. The results show that the sequences carry hidden symmetries that are consistent with the structural symmetries of the domains. We also selected structurally symmetric folds from the PROPEAT database and analyzed their sequences with similarity matrix and correlation matrix methods; here too, the hidden sequence symmetries match the structural symmetries. Both studies indicate that symmetric structures are strongly correlated with hidden sequence symmetries.
     2) Proteins of the Plant Cytotoxin B family have two symmetric domains that share the beta-trefoil fold. Using a modified recurrence plot method, we found that these symmetric domains differ in their degree of sequence symmetry. Residue contact density calculations suggest that the domains evolve at different rates, which may account for the different degrees of sequence symmetry. In addition, multiple sequence alignment of the trefoil units revealed four motifs that each repeat three times. These motifs have large numbers of residue interactions and small B-factors, so we infer that they are key structural residues. The motifs are distributed symmetrically in the beta-trefoil structure, and both the motifs and their residue interactions show three-fold symmetry, consistent with the three-fold symmetry of the trefoil units. Taken together, this evidence suggests that symmetric key structural residues play a dominant role in forming the symmetric structures.
     3) We selected proteins with the four-blade beta-propeller fold and applied the correlation matrix method to study the symmetries of structure, sequence, and inter-residue interactions, as well as the correlations among them. Both the sequence symmetry and the inter-residue interaction symmetry are strongly correlated with the structural symmetry. However, three observations indicate that the interaction symmetry is more strongly correlated with the structural symmetry than the sequence symmetry is: the sequence symmetry is relatively weak, the interaction symmetry is relatively strong, and the correlation index between sequence symmetry and structure is smaller than that between interaction symmetry and structure. We therefore propose that both symmetric and asymmetric residues contribute to the symmetry of the inter-residue interactions, because both can take part in symmetric interactions. This in turn suggests that asymmetric residues help encode symmetric structures through symmetric inter-residue interactions.
     4) We selected proteins with the beta-trefoil fold and applied the modified recurrence plot method to study their sequence symmetries. Proteins with similar degrees of structural symmetry show different degrees of sequence symmetry. We propose two hypotheses: first, that the degree of inter-residue interaction symmetry is similar even though the degree of sequence symmetry differs; second, that external interactions of the protein take part in encoding the symmetric structure and thereby lower the degree of sequence symmetry. Further study of the second hypothesis shows that the degree of sequence symmetry is negatively correlated with the external interactions. The distribution of residue interaction types shows that external interactions contribute to the symmetric structures mainly through polar and half-charged contacts. How exactly they encode the symmetric structures remains unclear; they may influence the protein folding process.
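The abstract does not spell out how the recurrence plot method is applied to a sequence, so the following is only a minimal illustrative sketch of a generic sequence recurrence plot, not the thesis's actual procedure. It assumes residues are encoded with the Kyte-Doolittle hydropathy scale, embedded as sliding windows of length `m`, and marked recurrent when two windows lie within a Euclidean distance `eps`. The function names, the parameters `m`, `eps`, and `min_len`, and the choice of scale are all hypothetical. A hidden repeat of period k would show up as a densely filled diagonal at offset k.

```python
import math

# Kyte-Doolittle hydropathy values (the choice of this scale is an assumption).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def embed(seq, m):
    """Turn a sequence into overlapping windows of m hydropathy values."""
    vals = [KD[a] for a in seq]
    return [vals[i:i + m] for i in range(len(vals) - m + 1)]

def recurrence_matrix(seq, m=3, eps=1.0):
    """R[i][j] = 1 when windows i and j are within distance eps, else 0."""
    vecs = embed(seq, m)
    n = len(vecs)
    return [[1 if math.dist(vecs[i], vecs[j]) <= eps else 0
             for j in range(n)] for i in range(n)]

def diagonal_density(R):
    """Fraction of recurrent points on each diagonal at offset k >= 1."""
    n = len(R)
    return {k: sum(R[i][i + k] for i in range(n - k)) / (n - k)
            for k in range(1, n)}

def dominant_offset(seq, m=3, eps=1.0, min_len=5):
    """Offset of the densest diagonal; ties go to the smaller offset.

    Diagonals with fewer than min_len points are skipped because they
    are too short to give a meaningful density.
    """
    R = recurrence_matrix(seq, m, eps)
    n = len(R)
    dens = diagonal_density(R)
    candidates = {k: d for k, d in dens.items() if n - k >= min_len}
    return max(candidates, key=lambda k: (candidates[k], -k))
```

On a toy sequence built from three exact copies of a 10-residue unit, for example `dominant_offset("ACDEFGHIKL" * 3)`, the densest diagonal sits at offset 10, the repeat length. Real symmetric domains have diverged repeats, so the diagonals would be only partially filled and the thresholds would need tuning.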