Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
A method for analyzing differences in the folding mechanisms of proteins in the same family is presented. Using only information from the amino acid sequences, contact maps derived from the interresidue average distances are employed. These maps, referred to as average distance maps (ADM), are applied to the folding of c-type lysozymes. The results reveal that the ADMs of these lysozymes reflect the differences in the detailed folding mechanisms. Further possible applications of the present method are also discussed.

2.
It has been shown that the portions of a protein likely to form contacts, as well as regular structures (α-helices and β-turns) defined as short-range compact regions, can be predicted by means of an average distance map (ADM) (Kikuchi et al., 1988a,c). In this paper, we analyze the occurrence of those portions and short-range compact regions on ADMs for various proteins with respect to their folding types. We found that each protein folding type shows a characteristic distribution of such parts on ADMs. We also discuss the possibility of predicting the folding types of proteins by ADMs.

3.
Understanding the folding mechanism of a protein is one of the goals of bioinformatics. At present, it is difficult to extract folding information from an amino acid sequence using standard bioinformatics techniques or even experimental protocols, which can be time consuming. To overcome these problems, we aim to extract the initial folding unit of the titin protein (Ig and fnIII domains) by means of inter-residue average distance statistics, the Average Distance Map (ADM), and contact frequency analysis (F-value). The TI I27 and TNfn3 domains are used to represent the Ig domain and fnIII domain, respectively. Beta-strands 2, 3, 5, and 6 are significant for the initial folding processes of TI I27. The central strands of TNfn3 were predicted to be a primary folding segment. Domains of known and unknown 3D structure were investigated by structure-based and non-structure-based multiple sequence alignment, respectively, to identify the conserved hydrophobic residues and the predicted compact regions relevant to evolution. Our results show good correspondence to experimental data, namely phi-values and protection factors from H-D exchange experiments. The significance of conserved hydrophobic residues near F-value peaks for structural stability through hydrophobic packing is confirmed. Our prediction methods could once again extract a folding mechanism knowing only the amino acid sequence.

4.
Predicting protein folding rate from amino acid sequence is an important challenge in computational and molecular biology. Over the past few years, many methods have been developed to reflect the correlation between the folding rates and protein structures and sequences. In this paper, we present an effective method, a combined neural network-genetic algorithm approach, to predict protein folding rates only from amino acid sequences, without any explicit structural information. The originality of this paper is that, for the first time, it tackles the effect of sequence order. The proposed method provides a good correlation between the predicted and experimental folding rates. The correlation coefficient is 0.80 and the standard error is 2.65 for 93 proteins, the largest such database of proteins yet studied, when evaluated with a leave-one-out jackknife test. The comparative results demonstrate that this correlation is better than that of most other methods, and suggest the important contribution of sequence order information to the determination of protein folding rates.

5.
Describing the whole process of protein folding is currently a major unsolved problem in molecular bioinformatics. Protein folding mechanisms have been intensively investigated with experimental as well as simulation techniques. Since a protein folds into its specific 3D structure from a unique amino acid sequence, it is interesting to extract as much information as possible from the amino acid sequence of a protein. Analyses based on inter-residue average distance statistics and a coarse-grained Gō-model simulation were conducted on Ig and FN3 domains of a titin protein to decode the folding mechanisms from their sequence data and native structure data, respectively. The central region of all domains was predicted to be an initial folding unit, that is, stable in an early state of folding. This common feature coincides well with the experimental results and underscores the significance of the β-sandwich proteins' common structure, namely, the key strands for folding and the Greek-key motif, which is located in the central region. We confirmed that our sequence-based techniques were able to predict the initial folding event immediately following the denatured state and that a 3D-based Gō-model simulation can be used to investigate the whole process of protein folding.

6.
Ichimaru T, Kikuchi T. Proteins 2003, 51(4): 515-530
It is a general notion that proteins with very similar three-dimensional structures would show very similar folding kinetics. However, recent studies reveal that the folding kinetic properties of some proteins contradict this notion (i.e., members of the same protein family fold through different pathways). For example, it has been reported that some beta-proteins in the intracellular lipid-binding protein family fold through quite different pathways (Burns et al., Proteins 1998;33:107-118). Similar differences in folding kinetics are also observed in the members of the globin family (Nishimura et al., Nat Struct Biol 2000;7:679-686). In our study, we examine the possibility of predicting qualitative differences in folding kinetics of the intracellular lipid-binding proteins and two globin proteins (i.e., myoglobin and leghemoglobin). The problem is tackled by means of a contact map based on the average distance statistics between residues, the Average Distance Map (ADM), as constructed from sequence. The ADMs for the three proteins show overall similarity, but some local differences among maps are also observed. Our results demonstrate that some properties of the protein folding kinetics are consistent with local differences in the ADMs. We also discuss the general possibility of predicting folding kinetics from sequence information.

7.
Huang JT, Tian J. Proteins 2006, 63(3): 551-554
The significant correlation between protein folding rates and the sequence-predicted secondary structure suggests that folding rates are largely determined by the amino acid sequence. Here, we present a method for predicting the folding rates of proteins from sequences using the intrinsic properties of amino acids, which does not require any information on secondary structure prediction or structural topology. The contribution of a residue to the folding rate is expressed by the residue's Omega value. For a given residue, its Omega depends on the amino acid properties (amino acid rigidity and dislike of the amino acid for secondary structures). Our investigation achieves 82% correlation with folding rates determined experimentally for the simple, two-state proteins studied to date, suggesting that the amino acid sequence of a protein is an important determinant of the protein-folding rate and mechanism.

8.
9.
It has been shown for 20 proteins that amino acid residues included in the experimentally determined protein folding nucleus are often involved in the theoretically determined amyloidogenic fragments. For 18 proteins, Φ-values indicative of the extent of residue involvement in the folding nucleus are on average higher for amino acid residues within amyloidogenic regions. Amyloidogenic fragments were predicted for 20 proteins by two methods chosen from four on the basis of comparison of prediction of amyloidogenic regions known from experimental data. Since theoretical folding nuclei are detected by the protein three-dimensional structure and amyloidogenic regions by the protein chain primary structure, the detected regularity makes possible predictions of folding nucleation sites on the basis of amino acid sequence.

10.
Mishra P, Pandey PN. Bioinformation 2011, 6(10): 372-374
The number of amino acid sequences is increasing very rapidly in protein databases such as Swiss-Prot, UniProt, PIR and others, but the structures of only some amino acid sequences are found in the Protein Data Bank. Thus, an important problem in genomics is automatically clustering homologous protein sequences when only sequence information is available. Here, we use graph theoretic techniques for clustering amino acid sequences. A similarity graph is defined and clusters in that graph correspond to connected subgraphs. Cluster analysis seeks a grouping of amino acid sequences into subsets based on the distance or similarity score between pairs of sequences. Our goal is to find disjoint subsets, called clusters, such that two criteria are satisfied: homogeneity (sequences in the same cluster are highly similar to each other) and separation (sequences in different clusters have low similarity to each other). We tested our method on several subsets of the SCOP (Structural Classification of Proteins) database, a gold standard for protein structure classification. The results show that for a given set of proteins the number of clusters we obtained is close to the number of superfamilies in that set; there are fewer singletons; and the method correctly groups most remote homologs.
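
The idea of clusters as connected subgraphs of a similarity graph can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the function name `cluster_by_similarity`, the hard score threshold, and the toy pairwise scores are all assumptions made for the example. The abstract does not say how similarity is scored or how edges are chosen.

```python
from collections import defaultdict


def cluster_by_similarity(names, similarity, threshold):
    """Return clusters as connected components of a thresholded similarity graph.

    `similarity` maps (name_a, name_b) pairs to a score; an edge is drawn
    whenever the score is at least `threshold` (an assumed edge rule).
    """
    # Union-find structure: every sequence starts in its own cluster.
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b  # merge the two components

    groups = defaultdict(list)
    for name in names:
        groups[find(name)].append(name)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Toy, made-up similarity scores purely for illustration.
names = ["seqA", "seqB", "seqC", "seqD", "seqE"]
scores = {
    ("seqA", "seqB"): 0.9,
    ("seqB", "seqC"): 0.8,
    ("seqA", "seqC"): 0.3,
    ("seqC", "seqD"): 0.2,
    ("seqD", "seqE"): 0.1,
}
clusters = cluster_by_similarity(names, scores, threshold=0.5)
print(clusters)
```

Note that seqA and seqC land in the same cluster even though their direct score is low, because both are linked through seqB. Connected components guarantee separation between clusters but only transitive, not pairwise, homogeneity within them.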

11.
Internal homologies in an amino acid sequence of a protein and in amino acid sequences of two different proteins are examined, using correlation coefficients calculated from the sequences when residues are replaced by various quantitative properties of the amino acids such as hydrophobicity. To improve the signal-to-noise ratio, the average correlation coefficient is used to detect homology because the correlation depends on the property considered. In this way, any sequence repetition in a protein and the extent of the similarity and difference among proteins can be estimated quantitatively. The procedure was applied first to the sequences of proteins which have been assumed on other grounds to contain some internal sequence repetitions (α-tropomyosin from rabbit skeletal muscle, calmodulin from bovine brain, and troponin C from skeletal and cardiac muscle), and then to the sequences of the calcium binding proteins calmodulin, troponin C, and L2 light chain of myosin. The results show that α-tropomyosin has a markedly periodic sequence at intervals of multiples of seven residues throughout the whole sequence, and calmodulin and skeletal troponin C contain two homologous sequences, the homology of troponin C being weaker than that of calmodulin. Candidates for the calcium binding regions of both troponin C, calmodulin, and L2 light chain are the homologous parts having a high average correlation coefficient (about 0.5) with respect to the sequences of the CD and EF hand regions of carp parvalbumin. The procedure may be a useful method for searching for homologous segments in amino acid sequences.
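
A rough sketch of the averaged property correlation is given below. The abstract does not list the property scales used, so the two scales here are assumptions chosen for illustration: the Kyte-Doolittle hydropathy scale, and the number of non-hydrogen side-chain atoms as a crude size proxy. The function names and the synthetic heptad repeat are likewise hypothetical.

```python
from statistics import mean

# Kyte-Doolittle hydropathy values (assumed scale; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Non-hydrogen side-chain atom counts, used as an assumed size proxy.
SIDE_CHAIN_HEAVY_ATOMS = {
    "G": 0, "A": 1, "S": 2, "C": 2, "P": 3, "V": 3, "T": 3,
    "I": 4, "L": 4, "N": 4, "D": 4, "M": 4, "Q": 5, "E": 5,
    "K": 5, "H": 6, "R": 7, "F": 7, "Y": 8, "W": 10,
}

PROPERTY_SCALES = [KYTE_DOOLITTLE, SIDE_CHAIN_HEAVY_ATOMS]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def average_correlation(seg_a, seg_b, scales=PROPERTY_SCALES):
    """Average the per-property correlation of two equal-length segments."""
    if len(seg_a) != len(seg_b):
        raise ValueError("segments must have equal length")
    return mean(
        pearson([scale[r] for r in seg_a], [scale[r] for r in seg_b])
        for scale in scales
    )


def lagged_self_correlation(sequence, lag, scales=PROPERTY_SCALES):
    """Correlate a sequence with itself shifted by `lag` residues."""
    if not 0 < lag < len(sequence):
        raise ValueError("lag must be between 1 and len(sequence) - 1")
    return average_correlation(sequence[:-lag], sequence[lag:], scales)


# Synthetic, perfectly periodic heptad repeat (not a real protein sequence).
heptad = "LKELEDK" * 4
lag7 = lagged_self_correlation(heptad, 7)
lag3 = lagged_self_correlation(heptad, 3)
print(f"lag 7: {lag7:.3f}, lag 3: {lag3:.3f}")
```

On this synthetic repeat the lag-7 value is 1 by construction, while other lags score lower. Real sequences give noisier profiles, which is why the abstract averages over several properties.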

12.
Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset were 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. Availability: The datasets and software, together with supplementary materials, are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm.

13.
14.
15.
16.
17.
Hydrogen exchange experiments provide detailed information about the local stability and the solvent accessibility of different regions of the structures of folded proteins, protein complexes, and amyloid fibrils. We introduce an approach to predict protection factors from hydrogen exchange in proteins based on the knowledge of their amino acid sequences without the inclusion of any additional structural information. These results suggest that the propensity of different regions of the structures of globular proteins to undergo local unfolding events can be predicted from their amino acid sequences with an accuracy of 80% or better.

18.
Patterns of hydrophobic and hydrophilic residues play a major role in protein folding and function. Long, predominantly hydrophobic strings of 20-22 amino acids each are associated with transmembrane helices and have been used to identify such sequences. Much less attention has been paid to hydrophobic sequences within globular proteins. In prior work on computer simulations of the competition between on-pathway folding and off-pathway aggregate formation, we found that long sequences of consecutive hydrophobic residues promoted aggregation within the model, even controlling for overall hydrophobic content. We report here on an analysis of the frequencies of different lengths of contiguous blocks of hydrophobic residues in a database of amino acid sequences of proteins of known structure. Sequences of three or more consecutive hydrophobic residues are found to be significantly less common in actual globular proteins than would be predicted if residues were selected independently. The result may reflect selection against long blocks of hydrophobic residues within globular proteins relative to what would be expected if residue hydrophobicities were independent of those of nearby residues in the sequence.
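
The comparison between observed block lengths and the independence baseline can be sketched as below. The abstract does not state which residues were classed as hydrophobic, so the set `HYDROPHOBIC` is an assumption, as are the function names and the made-up example sequence. The baseline is the expected number of maximal runs of exactly length k in a sequence of n independent residues, each hydrophobic with probability p.

```python
from collections import Counter
from itertools import groupby

# Assumed hydrophobic class; the abstract does not specify the one used.
HYDROPHOBIC = frozenset("AVILMFWC")


def run_length_counts(sequence, hydrophobic=HYDROPHOBIC):
    """Count maximal contiguous hydrophobic blocks by their length."""
    counts = Counter()
    for is_hydrophobic, block in groupby(sequence, key=lambda r: r in hydrophobic):
        if is_hydrophobic:
            counts[len(list(block))] += 1
    return counts


def expected_runs(n, p, k):
    """Expected number of maximal runs of exactly length k under independence.

    A run touching either end needs one flanking non-hydrophobic residue;
    an interior run needs two, and there are (n - k - 1) interior positions.
    """
    if k > n:
        return 0.0
    if k == n:
        return p ** n
    return 2 * p ** k * (1 - p) + (n - k - 1) * p ** k * (1 - p) ** 2


# Made-up sequence for illustration only.
seq = "MKVLLAGSDEAIVKTWFFGNP"
observed = run_length_counts(seq)
p_hat = sum(r in HYDROPHOBIC for r in seq) / len(seq)

for k in sorted(observed):
    print(k, observed[k], round(expected_runs(len(seq), p_hat, k), 3))
```

A single short sequence says nothing statistically. The analysis described in the abstract pools counts over a database of proteins of known structure.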

19.
Huang JT, Xing DJ, Huang W. Amino Acids 2012, 43(2): 567-572
The successful prediction of protein-folding rates based on the sequence-predicted secondary structure suggests that the folding rates might be predicted from sequence alone. To pursue this question, we directly predict the folding rates from amino acid sequences, an approach that does not require any information on secondary or tertiary structure. Our work achieves 88% correlation with folding rates determined experimentally for proteins of all folding types and peptides, suggesting that almost all of the information needed to specify a protein's folding kinetics and mechanism is contained within its amino acid sequence. The influence of a residue on the folding rate is related to amino acid properties. The hydrophobic character of amino acids may be an important determinant of folding kinetics, whereas other amino acid properties (size, flexibility, polarity and isoelectric point) contribute little to the folding rate constant.

20.
Using a maximum-likelihood formalism, we have developed a method with which to reconstruct the sequences of ancestral proteins. Our approach allows the calculation of not only the most probable ancestral sequence but also of the probability of any amino acid at any given node in the evolutionary tree. Because we consider evolution on the amino acid level, we are better able to include effects of evolutionary pressure and take advantage of structural information about the protein through the use of mutation matrices that depend on secondary structure and surface accessibility. The computational complexity of this method scales linearly with the number of homologous proteins used to reconstruct the ancestral sequence.
