The characterization of amino acid sequences in proteins by statistical methods
Authors: J. M. Zimmerman, N. Eliezer, R. Simha
Affiliation:1. Panskura Banamali College, Panskura RS, WB 721152, India;2. Departament de Química, Universitat de les Illes Balears, Crta. de Valldemossa km 7.5, 07122 Palma de Mallorca (Baleares), Spain;3. Department of Chemistry, CICECO-Aveiro Institute of Materials, University of Aveiro, 3810-193 Aveiro, Portugal;1. Institute for Advanced Studies in Basic Sciences (IASBS), Gava Zang, Zanjan 45195-159, Iran;2. Department of Chemistry, University of Sistan and Baluchestan, Zahedan, 98135-674, Iran;3. Trace Analysis Research Centre, Department of Chemistry, Dalhousie University, PO Box 15000, Halifax, NS B3H 4R2, Canada
Abstract: Three different but related comprehensive statistical analyses of amino acid sequences in proteins are described. The goal in each case is to search for evidence of significant sequence structure in individual proteins relative to a purely random arrangement of the amino acid residues, and to attempt to relate any significant structure uncovered to the secondary and/or tertiary configuration of the protein.

In the first of these analyses, which is reviewed briefly in an appendix, amino acids are divided into subgroups according to a variety of side chain physical properties (e.g. polarity, hydrophobicity). Deviations from randomness are expressed in terms of correlation indices ϱ_ij(c), which are composition-normalized doublet frequencies. Here i and j denote membership in a particular group for the physical property chosen, and c denotes the "lag", that is, the number of residues along the chain separating the doublet.

The other, more refined analyses are described in some detail. For both of these, each amino acid in a given protein is replaced by its appropriate value on a continuous physical property scale. Six such scales are employed: bulkiness, polarity, R_F, pI, pK_1 and hydrophobicity. The resulting amino acid index sequences are treated as discrete series and are analyzed first by means of serial correlation methods and subsequently by spectral analysis techniques. Periodicities exhibited in these series are evaluated statistically, and speculations are made concerning the connection between such structure and protein configuration.

Although more than forty individual proteins whose primary sequences are known have been analyzed by these methods, results for the cytochrome c series, the hemoglobins and lysozyme are emphasized in the present paper. In the case of the cytochrome c family of proteins, several relationships between primary sequence structure and "evolutionary order" are discussed. In addition, the results of several homogeneity studies are described in which the sequence structures of various portions of a given protein chain are compared.
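The index-sequence approach described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions, none of which come from the paper itself. The Kyte-Doolittle hydropathy values stand in for the authors' hydrophobicity scale, which the abstract does not reproduce. The serial correlation uses the common mean-centred estimator, and the spectrum is a plain periodogram. The function names and the toy sequence are hypothetical.

```python
import cmath
from statistics import fmean

# Kyte-Doolittle hydropathy values: a stand-in scale (assumption), NOT one
# of the six scales used in the paper.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def index_sequence(protein, scale):
    """Replace each residue letter by its value on a property scale."""
    return [scale[residue] for residue in protein]


def serial_correlation(series, lag):
    """Mean-centred serial correlation coefficient at the given lag."""
    mean = fmean(series)
    dev = [value - mean for value in series]
    denom = sum(d * d for d in dev)
    num = sum(dev[k] * dev[k + lag] for k in range(len(series) - lag))
    return num / denom


def periodogram(series):
    """Return (frequency, power) pairs at frequencies k/N for 0 < k <= N/2.

    Frequencies are in cycles per residue; a direct O(N^2) discrete
    Fourier transform is enough for protein-length series.
    """
    n = len(series)
    mean = fmean(series)
    dev = [value - mean for value in series]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            d * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, d in enumerate(dev)
        )
        spectrum.append((k / n, abs(coeff) ** 2 / n))
    return spectrum


# Hypothetical toy sequence with an exact period of four residues.
toy_protein = "LLKK" * 10
series = index_sequence(toy_protein, KYTE_DOOLITTLE)

r2 = serial_correlation(series, 2)
r4 = serial_correlation(series, 4)
peak_freq, peak_power = max(periodogram(series), key=lambda fp: fp[1])

print(f"r(2) = {r2:.3f}, r(4) = {r4:.3f}")
print(f"strongest periodicity: {1 / peak_freq:.1f} residues")
```

For this artificial sequence the lag-4 correlation is strongly positive, the lag-2 correlation is strongly negative, and the periodogram peaks at 0.25 cycles per residue. A real protein would give much weaker signals. Judging their significance requires a comparison against the random-arrangement baseline the abstract refers to, which this sketch does not attempt.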
This article is indexed in ScienceDirect and other databases.