Similar literature
 20 similar documents found
1.
The maintenance of protein function and structure constrains the evolution of amino acid sequences. This fact can be exploited to interpret correlated mutations observed in a sequence family as an indication of probable physical contact in three dimensions. Here we present a simple and general method to analyze correlations in mutational behavior between different positions in a multiple sequence alignment. We then use these correlations to predict contact maps for each of 11 protein families and compare the result with the contacts determined by crystallography. For the most strongly correlated residue pairs predicted to be in contact, the prediction accuracy ranges from 37 to 68% and the improvement ratio relative to a random prediction from 1.4 to 5.1. Predicted contact maps can be used as input for the calculation of protein tertiary structure, either from sequence information alone or in combination with experimental information. © 1994 John Wiley & Sons, Inc.
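
The idea of scoring correlated mutations between alignment columns can be sketched in a few lines. The example below uses mutual information, a commonly used stand-in, and not the correlation measure of the abstract above; the toy alignment and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    pairs = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


def rank_column_pairs(alignment, min_separation=1):
    """Return column pairs sorted by decreasing mutual information."""
    length = len(alignment[0])
    scored = [
        ((i, j), column_mutual_information(alignment, i, j))
        for i, j in combinations(range(length), 2)
        if j - i >= min_separation
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy alignment: columns 0 and 1 always change together.
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]
top_pair, top_score = rank_column_pairs(toy_alignment)[0]
print(top_pair, top_score)
```

In a real application the highest-ranked pairs would be taken as candidate contacts; sequence weighting, gap handling and corrections for phylogenetic bias are deliberately left out of this sketch.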

2.
Prediction of contact maps with neural networks and correlated mutations.   Cited by: 1 (self-citations: 0, citations by others: 1)
Contact maps of proteins are predicted with neural network-based methods, using as input codings of increasing complexity including evolutionary information, sequence conservation, correlated mutations and predicted secondary structures. Neural networks are trained on a data set comprising the contact maps of 173 non-homologous proteins as computed from their well resolved three-dimensional structures. Proteins are selected from the Protein Data Bank database provided that they align with at least 15 similar sequences in the corresponding families. The predictors are trained to learn the association rules between the covalent structure of each protein and its contact map with a standard back propagation algorithm and tested on the same protein set with a cross-validation procedure. Our results indicate that the method can assign protein contacts with an average accuracy of 0.21 and with an improvement over a random predictor of a factor >6, which is higher than that previously obtained with methods only based either on neural networks or on correlated mutations. Furthermore, filtering the network outputs with a procedure based on the residue coordination numbers, the accuracy of predictions increases up to 0.25 for all the proteins, with an 8-fold deviation from a random predictor. These scores are the highest reported so far for predicting protein contact maps.
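
The two scores quoted above, accuracy and improvement over a random predictor, can be illustrated with a small sketch. It assumes that accuracy is the fraction of predicted contacts that are native, and that a random predictor's accuracy equals the fraction of all considered residue pairs that are native contacts; the abstract does not spell out these definitions, and the function names and data are hypothetical.

```python
def _normalise(pairs):
    """Store each residue pair as an ordered tuple so (i, j) == (j, i)."""
    return {tuple(sorted(pair)) for pair in pairs}


def contact_accuracy(predicted, native):
    """Fraction of predicted contacts that are present in the native map."""
    predicted, native = _normalise(predicted), _normalise(native)
    return len(predicted & native) / len(predicted)


def improvement_over_random(predicted, native, n_pairs):
    """Accuracy divided by the accuracy expected from random guessing.

    n_pairs is the number of residue pairs considered (an assumption of
    this sketch; real evaluations usually apply a sequence-separation cutoff).
    """
    random_accuracy = len(_normalise(native)) / n_pairs
    return contact_accuracy(predicted, native) / random_accuracy


# Hypothetical data: 4 native contacts among 40 candidate residue pairs.
native_contacts = [(0, 5), (1, 6), (2, 9), (3, 8)]
predicted_contacts = [(5, 0), (1, 6), (4, 9), (2, 7)]
print(contact_accuracy(predicted_contacts, native_contacts))
print(improvement_over_random(predicted_contacts, native_contacts, 40))
```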

3.
We investigate the conservation of amino acid residue sequences in 21 DNA-binding protein families and study the effects that mutations have on DNA-sequence recognition. The observations are best understood by assigning each protein family to one of three classes: (i) non-specific, where binding is independent of DNA sequence; (ii) highly specific, where binding is specific and all members of the family target the same DNA sequence; and (iii) multi-specific, where binding is also specific, but individual family members target different DNA sequences. Overall, protein residues in contact with the DNA are better conserved than the rest of the protein surface, but there is a complex underlying trend of conservation for individual residue positions. Amino acid residues that interact with the DNA backbone are well conserved across all protein families and provide a core of stabilising contacts for homologous protein-DNA complexes. In contrast, amino acid residues that interact with DNA bases have variable levels of conservation depending on the family classification. In non-specific families, base-contacting residues are well conserved and interactions are always found in the minor groove where there is little discrimination between base types. In highly specific families, base-contacting residues are highly conserved and allow member proteins to recognise the same target sequence. In multi-specific families, base-contacting residues undergo frequent mutations and enable different proteins to recognise distinct target sequences. Finally, we report that interactions with bases in the target sequence often follow (though not always) a universal code of amino acid-base recognition and the effects of amino acid mutations can be most easily understood for these interactions.

4.
Gupta N, Mangal N, Biswas S. Proteins 2005, 59(2): 196-204
Prediction of fold from the amino acid sequence of a protein has been an active area of research in the past few years, but the limited accuracy of existing techniques emphasizes the need to develop newer approaches to tackle this task. In this study, we use contact map prediction as an intermediate step in fold prediction from sequence. A contact map is a reduced graph-theoretic representation of a protein that models the local and global inter-residue contacts in the structure. We start with a population of random contact maps for the protein sequence and "evolve" the population to a "high-feasibility" configuration using a genetic algorithm. A neural network is employed to assess the feasibility of contact maps based on 4 physically relevant properties. We also introduce 5 parameters, based on algebraic graph theory and physical considerations, that can be used to judge the structural similarity between proteins through contact maps. To predict the fold of a given amino acid sequence, we predict a contact map that will sufficiently approximate the structure of the corresponding protein. Then we assess the similarity of this contact map with the representative contact map of each fold; the fold that corresponds to the closest match is our predicted fold for the input sequence. We have found that our feasibility measure is able to differentiate between feasible and infeasible contact maps. Further, this novel approach is able to predict the folds from sequences significantly better than a random predictor.

5.
6.
MOTIVATION: Some amino acids clearly show preferences over others in protein-protein interfaces. These preferences, or so-called interface propensities, can be used for a priori interface prediction. We investigated whether the prediction accuracy could be improved by considering not single residues but pairs of residues in an interface. Here we present the first systematic analysis of intramolecular surface contacts in interface prediction. RESULTS: We show that preferences do exist for contacts within and around an interface region within one molecule: specific pairs of amino acids occur more often than others. Using intramolecular contact propensities in a blind test, higher average scores were assigned to interface residues than to non-interface residues. This effect persisted as small but significant when the contact propensities were corrected to eliminate the influence of single amino acid interface propensity. This indicates that intramolecular contact propensities may replace interface propensities in protein-protein interface prediction. AVAILABILITY: The source code is available on request from the authors.

7.
One of the main barriers to accurate computational protein structure prediction is searching the vast space of protein conformations. Distance restraints or inter-residue contacts have been used to reduce this search space, easing the discovery of the correct folded state. It has been suggested that about 1 contact for every 12 residues may be sufficient to predict structure at fold level accuracy. Here, we use coarse-grained structure-based models in conjunction with molecular dynamics simulations to examine this empirical prediction. We generate sparse contact maps for 15 proteins of varying sequence lengths and topologies and find that given perfect secondary-structural information, a small fraction of the native contact map (5%-10%) suffices to fold proteins to their correct native states. We also find that different sparse maps are not equivalent and we make several observations about the type of maps that are successful at such structure prediction. Long range contacts are found to encode more information than shorter range ones, especially for α and αβ-proteins. However, this distinction reduces for β-proteins. Choosing contacts that are a consensus from successful maps gives predictive sparse maps as does choosing contacts that are well spread out over the protein structure. Additionally, the folding of proteins can also be used to choose predictive sparse maps. Overall, we conclude that structure-based models can be used to understand the efficacy of structure-prediction restraints and could, in future, be tuned to include specific force-field interactions, secondary structure errors and noise in the sparse maps.

8.
9.
Co-evolving residues in membrane proteins   Cited by: 2 (self-citations: 0, citations by others: 2)
MOTIVATION: The analysis of co-evolving residues has been exhaustively evaluated for the prediction of intramolecular amino acid contacts in soluble proteins. Although a variety of different methods for the detection of these co-evolving residues have been developed, the fraction of correctly predicted contacts remained insufficient for their reliable application in the construction of structural models. Membrane proteins, which constitute between one-fourth and one-third of all proteins in an organism, were only considered in a few individual case studies. RESULTS: We present the first general study of correlated mutations in alpha-helical membrane proteins. Using seven different prediction algorithms, we extracted co-evolving residues for 14 membrane proteins having a solved 3D structure. On average, distances between correlated pairs of residues lying on different transmembrane segments were found to be significantly smaller compared to a random prediction. Covariation of residues was frequently found in direct sequence neighborhood to helix-helix contacts. Based on the results obtained from individual prediction methods, we constructed a consensus prediction for every protein in the dataset that combines obtained correlations from different prediction algorithms and simultaneously removes likely false positives. Using this consensus prediction, 53% of all predicted residue pairs were found within one helix turn of an observed helix-helix contact. Based on the combination of co-evolving residues detected with the four best prediction algorithms, interacting helices could be predicted with a specificity of 83% and sensitivity of 42%. AVAILABILITY: http://webclu.bio.wzw.tum.de/helixcorr/

10.
iSPOT (http://cbm.bio.uniroma2.it/ispot) is a web tool developed to infer the recognition specificity of protein module families; it is based on the SPOT procedure that utilizes information from position-specific contacts, derived from the available domain/ligand complexes of known structure, and experimental interaction data to build a database of residue-residue contact frequencies. iSPOT is available to infer the interaction specificity of PDZ, SH3 and WW domains. For each family of protein domains, iSPOT evaluates the probability of interaction between a query domain of the specified families and an input protein/peptide sequence and makes it possible to search for potential binding partners of a given domain within the SWISS-PROT database. The experimentally derived interaction data utilized to build the PDZ, SH3 and WW databases of residue-residue contact frequencies are also accessible. Here we describe the application to the WW family of protein modules.

11.
We present a new structurally derived pair-to-pair substitution matrix (P2PMAT). This matrix is constructed from a very large number of integrated high-quality multiple sequence alignments (Blocks) and protein structures. It evaluates the likelihoods of all 160,000 pair-to-pair substitutions. The P2PMAT matrix implicitly accounts for evolutionary conservation, correlated mutations, and residue-residue contact potentials. The usefulness of the matrix for structural predictions is shown in this article. Prediction of protein residue-residue contacts from sequence information alone by our method (P2PConPred) is particularly accurate in the protein cores, where it performs better than other basic contact prediction methods (increasing accuracy by 25-60%). The method's mean accuracy for protein cores is 24% for 59 diverse families and 34% for a subset of proteins shorter than 100 residues. This is above the level that was recently shown to be sufficient to significantly improve ab initio protein structure prediction. We also demonstrate the ability of our approach to identify native structures within large sets (300-2000) of protein decoys. On the basis of evolutionary information alone, our method ranks the native structure in the top 0.3% of the decoys in 4/10 of the sets, and in 8/10 of the sets the native structure is ranked in the top 10% of the decoys. The method can thus be used to assist in filtering out wrong models, complementing traditional scoring functions.

12.
Given sufficiently large protein families, and using a global statistical inference approach, it is possible to obtain sufficient accuracy in protein residue contact predictions to predict the structure of many proteins. However, these approaches do not consider the fact that the contacts in a protein are neither randomly nor independently distributed, but actually follow precise rules governed by the structure of the protein and thus are interdependent. Here, we present PconsC2, a novel method that uses a deep learning approach to identify protein-like contact patterns to improve contact predictions. A substantial enhancement can be seen for all contacts independently of the number of aligned sequences, residue separation or secondary structure type, but is largest for β-sheet containing proteins. In addition to being superior to earlier methods based on statistical inferences, in comparison to state-of-the-art methods using machine learning, PconsC2 is superior for families with more than 100 effective sequence homologs. The improved contact prediction enables improved structure prediction.

13.
We present a new algorithm for characterizing protein spatial structure based on the molecular hydrophobicity potential approach. The method is illustrated by analysis of the three-dimensional structures of barnase and the barnase-barstar complex. The approach enables identification of amino acid residues situated in an unfavorable environment (these residues may be “active” for binding) and quantitative mapping of hydrophobic, hydrophilic and unfavorable hydrophobic-hydrophilic intra- and inter-molecular contacts involving backbone and side-chain segments of amino acid residues. Calculation of the individual contributions of amino acid residues to such contacts permits identification of structurally important residues. The contact plots obtained with molecular hydrophobicity potential calculations provide easy rules for choosing mutation sites that can increase the strength of intra- or inter-molecular hydrophobic interactions. An unfavorable hydrophobic-hydrophilic contact can be mutated to a favorable hydrophobic one, and an existing weak hydrophobic contact can be strengthened by increasing the hydrophobicity of the residues in contact. Based on analysis of the contact plots, we suggest several mutations of barnase that are expected to increase intramolecular hydrophobic interactions and thus might lead to increased stability of the protein. Some of these mutations were previously studied experimentally and did indeed stabilize barnase. The other predicted mutations have not yet been studied experimentally. Several new mutations of barnase and barstar are also proposed to enhance the hydrophobic interactions at their binding interface.

14.
The globin family of proteins has a characteristic structural pattern of helix interactions that nonetheless exhibits some variation. A simplified model for globin structural evolution was developed in which protein shape evolved by random change of contacts between helices. A conserved globin domain of 15 bacterial proteins representing four structural families was studied. Using a parsimony approach, ancestral structural states could be reconstructed. The distribution of the number of contact changes per site for a fixed topology tree fit a gamma distribution. Homoplasy was high, with multiple changes per site and no support for an invariant class of residue-residue contacts. Contacts changed more slowly than sequence. A phylogenetic reconstruction using a distance measure based on the proportion of shared contacts was generally consistent with a sequence-based phylogeny but not highly resolved. Contact pattern convergence between members of different globin family proteins could not be detected. Simulation studies indicated the convergence test was sensitive enough to have detected convergence involving only 10% of the contacts, suggesting a limit on the extent of selection for a specific contact pattern. Contact site methods may provide additional approaches to study the relationship between protein structure and sequence evolution. [Reviewing Editor: Dr. Lauren Ancel Meyers]

15.
Fuchs A, Kirschner A, Frishman D. Proteins 2009, 74(4): 857-871
Despite rapidly increasing numbers of available 3D structures, membrane proteins still account for less than 1% of all structures in the Protein Data Bank. Recent high-resolution structures indicate a clearly broader structural diversity of membrane proteins than initially anticipated, motivating the development of reliable structure prediction methods specifically tailored for this class of molecules. One important prediction target capturing all major aspects of a protein's 3D structure is its contact map. Our analysis shows that computational methods trained to predict residue contacts in globular proteins perform poorly when applied to membrane proteins. We have recently published a method to identify interacting alpha-helices in membrane proteins based on the analysis of coevolving residues in predicted transmembrane regions. Here, we present a substantially improved algorithm for the same problem, which uses a newly developed neural network approach to predict helix-helix contacts. In addition to the input features commonly used for contact prediction of soluble proteins, such as windowed residue profiles and residue distance in the sequence, our network also incorporates features that apply to membrane proteins only, such as residue position within the transmembrane segment and its orientation toward the lipophilic environment. The obtained neural network can predict contacts between residues in transmembrane segments with nearly 26% accuracy. It is therefore the first published contact predictor developed specifically for membrane proteins performing with equal accuracy to state-of-the-art contact predictors available for soluble proteins. The predicted helix-helix contacts were employed in a second step to identify interacting helices. For our dataset consisting of 62 membrane proteins of solved structure, we achieved an accuracy of 78.1%. Because the reliable prediction of helix interaction patterns is an important step in the classification and prediction of membrane protein folds, our method will be a helpful tool in compiling a structural census of membrane proteins.

16.
Interleukin-2 tyrosine kinase (Itk) is a T-cell specific tyrosine kinase of the Tec family. We have examined a novel intermolecular interaction between the SH3 and SH2 domains of Itk. In addition to the interaction between the isolated domains, we have found that the dual SH3/SH2 domain-containing fragment of Itk self-associates in a specific manner in solution. Tec family members contain the SH3, SH2 and catalytic domains common to many kinase families but are distinguished by a unique amino-terminal sequence, which contains a proline-rich stretch. Previous work has identified an intramolecular regulatory association between the proline-rich region and the adjacent SH3 domain of Itk. The intermolecular interaction between the SH3 and SH2 domains of Itk that we describe provides a possible mechanism for displacement of this intramolecular regulatory sequence, a step that may be required for full Tec kinase activation. Additionally, localization of the interacting surfaces on both the SH3 and SH2 domains by chemical shift mapping has provided information about the molecular details of this recognition event. The interaction involves the conserved aromatic binding pocket of the SH3 domain and a newly defined binding surface on the SH2 domain. The interacting residues on the SH2 domain do not conform to the consensus motif for an SH3 proline-rich ligand. Interestingly, we note a striking correlation between the SH2 residues that mediate this interaction and those residues that, when mutated in the Tec family member Btk, cause the hereditary immune disorder X-linked agammaglobulinemia.

17.
18.
The simplest approximation of the interaction potential between amino acid residues in proteins is the contact potential, which defines the effective free energy of a protein conformation by the set of amino acid contacts formed in this conformation. Finding a contact potential capable of predicting free energies of protein states across a variety of protein families will aid protein folding and engineering in silico on a computationally tractable time-scale. We test the ability of contact potentials to accurately and transferably (across various protein families) predict stability changes of proteins upon mutation. We develop a new methodology to determine the contact potentials in proteins from experimental measurements of changes in proteins' thermodynamic stabilities (ΔΔG) upon mutation. We apply our methodology to derive sets of contact interaction parameters for a hierarchy of interaction models including solvation and multi-body contact parameters. We test how well our models reproduce experimental measurements by statistical tests. We evaluate the maximum accuracy of predictions obtained by using contact potentials and the correlation between parameters derived from different datasets of experimental ΔΔG values. We argue that it is impossible to reach experimental accuracy and derive fully transferable contact parameters using contact models of potentials. However, contact parameters may yield reliable predictions of ΔΔG for datasets of mutations confined to the same amino acid positions in the sequence of a single protein.

19.
Meissner M, Koch O, Klebe G, Schneider G. Proteins 2009, 74(2): 344-352
We present machine learning approaches for turn prediction from the amino acid sequence. Different turn classes and types were considered based on a novel turn classification scheme. We trained an unsupervised classifier (self-organizing map) and two kernel-based classifiers, namely the support vector machine and a probabilistic neural network. Turn versus non-turn classification was carried out for turn families containing intramolecular hydrogen bonds and three to six residues. Support vector machine classifiers yielded a Matthews correlation coefficient (MCC) of approximately 0.6 and a prediction accuracy of 80%. Probabilistic neural networks were developed for beta-turn type prediction. The method was able to distinguish between five types of beta-turns, yielding MCC > 0.5 and at least 80% overall accuracy. We conclude that the proposed new turn classification is distinct and well-defined, and machine learning classifiers are suited for sequence-based turn prediction. Their potential for sequence-based prediction of turn structures is discussed.
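
For readers unfamiliar with the two scores reported above, the following minimal sketch computes the Matthews correlation coefficient and the overall accuracy from a binary confusion matrix. The counts are hypothetical and are not taken from the study; returning 0.0 when the denominator vanishes is a common convention adopted here as an assumption.

```python
from math import sqrt


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention used in this sketch for degenerate matrices
    return (tp * tn - fp * fn) / denominator


def accuracy(tp, tn, fp, fn):
    """Fraction of all examples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical turn / non-turn counts chosen only for illustration.
counts = dict(tp=40, tn=40, fp=10, fn=10)
print(matthews_corrcoef(**counts), accuracy(**counts))
```

Unlike accuracy, the coefficient stays informative when the two classes are unbalanced, which may be why both values are reported.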

20.
The influence of long-range residue interactions on defining secondary structure in a protein has long been discussed and is often cited as the current limitation to accurate secondary structure prediction. There are several experimental examples where a local sequence alone is not sufficient to determine its secondary structure, but a comprehensive survey on a large data set has not yet been done. Interestingly, some earlier studies denied the negative effect of long-range interactions on secondary structure prediction accuracy. Here, we have introduced the residue contact order (RCO), which directly indicates the separation of contacting residues in terms of the position in the sequence, and examined the relationship between the RCO and the prediction accuracy. A large data set of 2777 nonhomologous proteins was used in our analysis. Unlike previous studies, we do find that prediction accuracy drops as residues have contacts with more distant residues. Moreover, this negative correlation between the RCO and the prediction accuracy was found not only for beta-strands, but also for alpha-helices. The prediction accuracy of beta-strands is lower if residues have a high RCO or a low RCO, which corresponds to the situation that a beta-sheet is formed by beta-strands from different chains in a protein complex. The reason why the current study draws the opposite conclusion from the previous studies is examined. The implication for protein folding is also discussed.
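
A per-residue contact order of the kind described above can be sketched as follows. The abstract does not give the exact formula, so this example assumes that the RCO of a residue is the mean sequence separation |i - j| over its contacts and that residues without contacts score 0.0; the function name and the contact list are hypothetical.

```python
def residue_contact_order(contacts, n_residues):
    """Mean sequence separation of each residue's contacts.

    contacts is an iterable of (i, j) index pairs (0-based); the averaging
    used here is an assumption of this sketch, not the published definition.
    """
    separations = [[] for _ in range(n_residues)]
    for i, j in contacts:
        distance = abs(i - j)
        separations[i].append(distance)
        separations[j].append(distance)
    return [sum(s) / len(s) if s else 0.0 for s in separations]


# Hypothetical contacts in a 10-residue chain.
toy_contacts = [(0, 3), (0, 9), (3, 4)]
rco = residue_contact_order(toy_contacts, 10)
print(rco)
```

Residues whose contacts are mostly with sequence-distant partners receive high values, which is the quantity the study relates to secondary structure prediction accuracy.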
