Similar articles
20 similar articles found (search time: 30 ms)
1.
Recognition of protein fold from amino acid sequence is a challenging task. The structure and stability of proteins from different folds are mainly dictated by inter-residue interactions. In our earlier work, we successfully used medium- and long-range contacts for predicting protein folding rates, discriminating globular and membrane proteins, and distinguishing protein structural classes. In this work, we analyze the role of inter-residue interactions in commonly occurring folds of globular proteins in order to understand their folding mechanisms. In the medium-range contacts, the globin fold and four-helical bundle proteins have more contacts than the DNA-RNA fold, although they all belong to the all-alpha class. In long-range contacts, only the ribonuclease fold prefers the 4-10 range, whereas the other folding types prefer the 21-30 range in alpha/beta class proteins. Further, the preferred residues and residue pairs influenced by these different folds are discussed. The information about the preference of medium- and long-range contacts exhibited by the 20 amino acid residues can be effectively used to predict the folding type of each protein.

2.
CORA is a suite of programs for multiply aligning and analyzing protein structural families to identify the consensus positions and capture their most conserved structural characteristics (e.g., residue accessibility, torsional angles, and global geometry as described by inter-residue vectors/contacts). Knowledge of these structurally conserved positions, which are mostly in the core of the fold, and of their properties significantly improves the identification and classification of newly determined relatives. Information is encoded in a consensus three-dimensional (3D) template and relatives found by a sensitive alignment method, which employs a new scoring scheme based on conserved residue contacts. By encapsulating these critical "core" features, templates perform more reliably in recognizing distant structural relatives than searches with representative structures. Parameters for 3D-template generation and alignment were optimized for each structural class (mainly-alpha, mainly-beta, alpha-beta), using representative superfold families. For all families selected, the templates gave significant improvements in sensitivity and selectivity in recognizing distant structural relatives. Furthermore, since templates contain less than 70% of fold positions and compare fewer positions when aligning structures, scans are at least an order of magnitude faster than scans using selected structures. CORA was subsequently tested on eight other broad structural families from the CATH database. Diagnostic plots are generated automatically and provide qualitative assistance for classifying newly determined relatives. They are demonstrated here by application to the large globin-like fold family. CORA templates for both homologous superfamilies and fold families will be stored in CATH and used to improve the classification and analysis of newly determined structures.

3.
Protein contacts, inter-residue interactions and side-chain modelling   Total citations: 1, self-citations: 0, citations by others: 1
Faure G  Bornot A  de Brevern AG 《Biochimie》2008,90(4):626-639
Three-dimensional structures of proteins are the support of their biological functions. Their folds are stabilized by contacts between residues. Inner protein contacts are generally described through direct atomic contacts, i.e. interactions between side-chain atoms, while contact prediction methods mainly use inter-Cα distances. In this paper, we have analyzed the protein contacts on a recent high-quality non-redundant databank using different criteria. First, we have studied the average number of contacts depending on the distance threshold used to define a contact. Preferential contacts between types of amino acids have been highlighted. Detailed analyses have been done concerning the proximity of contacts in the sequence, the size of the proteins and fold classes. The strongest differences have been extracted, highlighting important residues. Then, we studied the influence of five different side-chain conformation prediction methods (SCWRL, IRECS, SCAP, SCATD and SCCOMP) on the distribution of contacts. The prediction rates of these different methods are quite similar. However, using a distance criterion between side chains, the results are quite different, e.g. SCAP predicts 50% more contacts than observed, unlike other methods that predict fewer contacts than observed. Contacts deduced are quite distinct from one method to another, with at most 75% of contacts in common. Moreover, distributions of amino acid preferential contacts present unexpected behaviours distinct from those previously observed in the X-ray structures, especially at the surface of proteins. For instance, the interactions involving tryptophan greatly decrease.
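
The dependence of the number of contacts on the distance threshold, as studied in this abstract, can be illustrated with a minimal sketch. It assumes one representative point per residue (for example a Cα position); the function name `count_contacts`, the cutoffs and the toy coordinates are hypothetical and are not taken from the paper.

```python
import math

def count_contacts(coords, threshold, min_separation=1):
    """Count residue pairs (i, j) whose distance is below `threshold` (in angstroms).

    coords: list of (x, y, z) tuples, one representative point per residue.
    min_separation: smallest |i - j| for which a pair is considered.
    """
    n = len(coords)
    count = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < threshold:
                count += 1
    return count

# Toy chain (illustrative only): five points spaced 3.8 angstroms apart on a line.
toy = [(3.8 * k, 0.0, 0.0) for k in range(5)]
for cutoff in (4.0, 8.0, 12.0):
    print(cutoff, count_contacts(toy, cutoff))
```

Raising `min_separation` excludes pairs that are close in sequence, which is one simple way to look at the proximity of contacts along the chain.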

4.
E Ferrada  A Wagner 《Biophysical journal》2012,102(8):1916-1925
The relationship between the genotype (sequence) and the phenotype (structure) of macromolecules affects their ability to evolve new structures and functions. We here compare the genotype space organization of proteins and RNA molecules to identify differences that may affect this ability. To this end, we computationally study the genotype-phenotype relationship for short RNA and lattice proteins of a reduced monomer alphabet size, to make exhaustive analysis and direct comparison of their genotype spaces feasible. We find that many fewer protein molecules than RNA molecules fold, but they fold into many more structures than RNA. In consequence, protein phenotypes have smaller genotype networks whose member genotypes tend to be more similar than for RNA phenotypes. Neighborhoods in sequence space of a given radius around an RNA molecule contain more novel structures than for protein molecules. We compare this property to evidence from natural RNA and protein molecules, and conclude that RNA genotype space may be more conducive to the evolution of new structure phenotypes.

5.
The analysis of inter-residue interactions in protein structures provides considerable insight into their folding and stability. We have previously analyzed the role of medium- and long-range interactions in the folding of globular proteins. In this work, we study the distinct role of such interactions in the three-dimensional structures of membrane proteins. We observed a higher number of long-range contacts in the termini of transmembrane helical (TMH) segments, implying their role in the stabilization of helix-helix interactions. The transmembrane strand (TMS) proteins have appreciably more long-range contacts than the all-beta class of globular proteins, indicating closer packing of the strands in TMS proteins. The residues in membrane-spanning segments of TMH proteins have 1.3 times more medium-range contacts than long-range contacts, whereas those of TMS proteins have 14 times more long-range contacts than medium-range contacts. Residue-wise analysis indicates that in TMH proteins, the residues Cys, Glu, Gly, Pro, Gln, Ser and Tyr have more long-range contacts than medium-range contacts, in contrast with the all-alpha class of globular proteins. The charged residue pairs have more medium-range contacts in all-alpha proteins, whereas hydrophobic residue pairs are dominant in TMH proteins. The information on the preference of residue pairs to form medium-range contacts has been successfully used to discriminate the TMH proteins from all-alpha proteins. The statistical significance of the results obtained from the present study has been verified using randomized structures of TMH and TMS protein templates.

6.
Kamat AP  Lesk AM 《Proteins》2007,66(4):869-876
Comparing and classifying protein folding patterns allows organizing the known structures and enumerating possible protein structural patterns including those not yet observed. We capture the essence of protein folding patterns in a concise tableau representation based on the order and contact patterns of secondary structures: helices and strands of sheet. The tableaux are intelligible to both humans and computers. They provide a database, derived from the Protein Data Bank, mineable in studies of protein architecture. Using this database, we have: (i) determined statistical properties of secondary structure contacts in an unbiased set of protein domains from ASTRAL, (ii) observed that in 98% of cases, the tableau is a faithful representation of the folding pattern as classified in SCOP, (iii) demonstrated that to a large extent the local structure of proteins indicates their complete folding topology, and (iv) studied the use of the representation for fold identification.

7.
Protein sequences have evolved to fold into functional structures, resulting in families of diverse protein sequences that all share the same overall fold. One can harness protein family sequence data to infer likely contacts between pairs of residues. In the current study, we combine this kind of inference from coevolutionary information with a coarse‐grained protein force field ordinarily used with single sequence input, the Associative memory, Water mediated, Structure and Energy Model (AWSEM), to achieve improved structure prediction. The resulting Associative memory, Water mediated, Structure and Energy Model with Evolutionary Restraints (AWSEM‐ER) yields a significant improvement in the quality of protein structure prediction over the single sequence prediction from AWSEM when a sufficiently large number of homologous sequences are available. Free energy landscape analysis shows that the addition of the evolutionary term shifts the free energy minimum to more native‐like structures, which explains the improvement in the quality of structures when performing predictions using simulated annealing. Simulations using AWSEM without coevolutionary information have proved useful in elucidating not only protein folding behavior, but also mechanisms of protein function. The success of AWSEM‐ER in de novo structure prediction suggests that the enhanced model opens the door to functional studies of proteins even when no experimentally solved structures are available.

8.
During evolution, the effective interactions between residues in a protein can be adjusted through mutations to allow the protein to fold to its native structure on an adequate time scale. We seek to address the question: Are there some structures that can be better optimized than others? Using exhaustive enumeration of the compact conformations of short proteins confined to simple lattices, we find that the best structures are those that contain contacts rare in random structures, indicating the importance of nonlocal contacts for assisting the folding process. Certain structural motifs such as long β-hairpins, Greek-key motifs, and jelly rolls, commonly found in proteins of known structure, have a high degree of optimizability. Contrary to what might be expected, positive correlations between the various interactions reduce optimizability. The optimization procedure produces a correlated energy landscape, which might assist folding. © 1995 John Wiley & Sons, Inc.
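
Exhaustive enumeration of compact lattice conformations, as mentioned in this abstract, can be sketched on a very small example. The abstract does not state the lattice type or chain length, so the 3 × 3 square lattice, the function names and the definition of a contact used here are assumptions chosen only for illustration.

```python
def compact_conformations(width, height):
    """Enumerate self-avoiding walks that visit every site of a
    width x height square lattice exactly once (maximally compact chains)."""
    n_sites = width * height
    results = []

    def extend(path, visited):
        if len(path) == n_sites:
            results.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if 0 <= nxt[0] < width and 0 <= nxt[1] < height and nxt not in visited:
                path.append(nxt)
                visited.add(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    # Try every lattice site as the position of the first monomer.
    for sx in range(width):
        for sy in range(height):
            extend([(sx, sy)], {(sx, sy)})
    return results

def nonbonded_contacts(path):
    """Return chain-index pairs that are lattice neighbours but not bonded."""
    index = {site: k for k, site in enumerate(path)}
    pairs = []
    for (x, y), i in index.items():
        for nxt in ((x + 1, y), (x, y + 1)):
            j = index.get(nxt)
            if j is not None and abs(i - j) > 1:
                pairs.append((min(i, j), max(i, j)))
    return sorted(pairs)

walks = compact_conformations(3, 3)
print(len(walks))                     # each chain is counted in both directions
print(nonbonded_contacts(walks[0]))   # contact list of one conformation
```

The enumeration counts walks related by rotation, reflection or chain reversal separately; a real study would normally remove such symmetric duplicates before comparing how often each contact occurs.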

9.
Intron boundaries were extracted from genomic data and mapped onto single-domain human and murine protein structures taken from the Protein Data Bank. A first analysis of this set of proteins shows that intron boundaries prefer to be in non-regular secondary structure elements, while avoiding alpha-helices and beta-strands. This fact alone suggests an evolutionary model in which introns are constrained by protein structure, particularly by tertiary structure contacts. In addition, in silico recombination experiments of a subset of these proteins together with their homologues, including those in different species, show that introns have a tendency to occur away from artificial crossover hot spots. Altogether, these findings support a model in which genes can preferentially harbour introns in less constrained regions of the protein fold they code for. In the light of these findings, we discuss some implications for protein modelling and design.

10.
11.
Increasing efforts are being invested in the construction of nanostructures with desired shapes and physical and chemical properties. Our strategy involves nanostructure design using naturally occurring protein building blocks. Inspection of the protein structural database (PDB) reveals the richness of the conformations, shapes, and chemistries of proteins and their building blocks. To increase the population of the native fold in the selected building block, we replace natural residues with engineered, constrained residues that restrict the conformational freedom at the targeted site and have favorable interactions, geometry, and size. Here, as a model system, we construct nanotubes using building blocks from left-handed beta-helices, which are commonly occurring repeat protein architectures. We pick two-turn beta-helical segments, duplicate and stack them, and, using all-atom molecular dynamics (MD) simulations with explicit solvent, probe the structural stability of these nanotubular structures as indicated by their capacity to retain the initial organization and by their conformational dynamics. Comparison of the results for the wild-type and mutated sequences shows that the introduction of the conformationally restricted 1-aminocyclopropanecarboxylic acid (Ac3c) residue in loop regions greatly enhances the stability of beta-helix nanotubes. The Ac3c geometrical confinement effect is sequence-specific and position-specific. The high stability of the nanotubular structures originates not only from the reduction of mobility at the mutation site induced by Ac3c but also from stabilizing association forces between building blocks, such as hydrogen bonds and hydrophobic contacts. For a selected synthetic residue, size, hydrophobicity, and backbone conformational tendencies similar to those of Ac3c are desirable.

12.
The question of how best to compare and classify the (three‐dimensional) structures of proteins is one of the most important unsolved problems in computational biology. To help tackle this problem, we have developed a novel shape‐density superposition algorithm called 3D‐Blast which represents and superposes the shapes of protein backbone folds using the spherical polar Fourier correlation technique originally developed by us for protein docking. The utility of this approach is compared with several well‐known protein structure alignment algorithms using receiver‐operator‐characteristic plots of queries against the “gold standard” CATH database. Despite being completely independent of protein sequences and using no information about the internal geometry of proteins, our results from searching the CATH database show that 3D‐Blast is highly competitive compared to current state‐of‐the‐art protein structure alignment algorithms. A novel and potentially very useful feature of our approach is that it allows an average or “consensus” fold to be calculated easily for a given group of protein structures. We find that using consensus shapes to represent entire fold families also gives very good database query performance. We propose that using the notion of consensus fold shapes could provide a powerful new way to index existing protein structure databases, and that it offers an objective way to cluster and classify all of the currently known folds in the protein universe. Proteins 2012. © 2011 Wiley Periodicals, Inc.

13.
In this work, we have analyzed the relative importance of secondary versus tertiary interactions in stabilizing and guiding protein folding. For this purpose, we have designed four different mutants to replace the alpha-helix of the GB1 domain by a sequence with strong beta-hairpin propensity in isolation. In particular, we have chosen the sequence of the second beta-hairpin of the GB1 domain, which populates the native conformation in aqueous solution to a significant extent. The resulting protein has roughly 30% of its sequence duplicated and maintains the 3D structure of the wild-type protein, but with lower stability (up to -5 kcal/mol). The loss of intrinsic helix stability accounts for about 80% of the decrease in free energy, illustrating the importance of local interactions in protein stability. Interestingly, all the mutant proteins, including the one with the duplicated beta-hairpin sequence, fold at rates similar to that of the GB1 domain. Essentially, it is the nature of the rate-limiting step in the folding reaction that determines whether or not a particular interaction will speed up folding. While local contacts are important in determining protein stability, residues involved in tertiary contacts, in combination with the topology of the native fold, seem to be responsible for the specificity of protein structures. Proteins with non-native secondary structure tendencies can adopt stable folds and be as efficient in folding as proteins with native-like propensities.

14.
A quantitative structure-property relationship (QSPR) was used to design model protein sequences that fold repeatedly and relatively rapidly to stable target structures. The specific model was a 125-residue heteropolymer chain subject to Monte Carlo dynamics on a simple cubic lattice. The QSPR was derived from an analysis of a database of 200 sequences by a statistical method that uses a genetic algorithm to select the sequence attributes that are most important for folding and a neural network to determine the corresponding functional dependence of folding ability on the chosen attributes. The QSPR depends on the number of anti-parallel sheet contacts, the energy gap between the native state and quasi-continuous part of the spectrum and the total energy of the contacts between surface residues. Two Monte Carlo procedures were used in series to optimize both the target structures and the sequences. We generated 20 fully optimized sequences and 60 partially optimized control sequences and tested each for its ability to fold in dynamic MC simulations. Although sequences in which either the number of anti-parallel sheet contacts or the energy of the surface residues is non-optimal are capable of folding almost as well as fully optimized ones, sequences in which only the energy gap is optimized fold markedly more slowly. Implications of the results for the design of proteins are discussed.

15.
One of the main barriers to accurate computational protein structure prediction is searching the vast space of protein conformations. Distance restraints or inter‐residue contacts have been used to reduce this search space, easing the discovery of the correct folded state. It has been suggested that about 1 contact for every 12 residues may be sufficient to predict structure at fold‐level accuracy. Here, we use coarse‐grained structure‐based models in conjunction with molecular dynamics simulations to examine this empirical prediction. We generate sparse contact maps for 15 proteins of varying sequence lengths and topologies and find that, given perfect secondary‐structural information, a small fraction of the native contact map (5%‐10%) suffices to fold proteins to their correct native states. We also find that different sparse maps are not equivalent, and we make several observations about the type of maps that are successful at such structure prediction. Long‐range contacts are found to encode more information than shorter‐range ones, especially for α and αβ‐proteins. However, this distinction is smaller for β‐proteins. Choosing contacts that are a consensus from successful maps gives predictive sparse maps, as does choosing contacts that are well spread out over the protein structure. Additionally, the folding of proteins can also be used to choose predictive sparse maps. Overall, we conclude that structure‐based models can be used to understand the efficacy of structure‐prediction restraints and could, in future, be tuned to include specific force‐field interactions, secondary structure errors and noise in the sparse maps.
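
The rule of thumb quoted in this abstract, about one contact for every 12 residues, can be turned into a small sketch that derives a contact budget and draws a sparse map from a native contact list. Uniform random selection is an assumption made here for simplicity (the abstract reports that differently chosen maps are not equivalent), and the names `contact_budget` and `sparse_map` and the toy contact list are hypothetical.

```python
import random

def contact_budget(n_residues, residues_per_contact=12):
    """Number of contacts implied by the 'one contact per 12 residues' rule."""
    return max(1, n_residues // residues_per_contact)

def sparse_map(native_contacts, n_keep, seed=0):
    """Pick `n_keep` contacts uniformly at random from a native contact list."""
    rng = random.Random(seed)
    n_keep = min(n_keep, len(native_contacts))
    return sorted(rng.sample(native_contacts, n_keep))

# Toy native contact list for a hypothetical 60-residue protein (illustrative only).
native = [(i, i + 10) for i in range(50)]
budget = contact_budget(60)
subset = sparse_map(native, budget)
print(budget, subset)
```

Replacing the random draw with a rule that favours long sequence separations or spatially well-spread contacts would be the natural next step when comparing selection strategies.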

16.
Although most proteins conform to the classical one‐structure/one‐function paradigm, an increasing number of proteins with dual structures and functions have been discovered. In response to cellular stimuli, such proteins undergo structural changes sufficiently dramatic to remodel even their secondary structures and domain organization. This “fold‐switching” capability fosters protein multi‐functionality, enabling cells to establish tight control over various biochemical processes. Accurate predictions of fold‐switching proteins could both suggest underlying mechanisms for uncharacterized biological processes and reveal potential drug targets. Recently, we developed a prediction method for fold‐switching proteins using structure‐based thermodynamic calculations and discrepancies between predicted and experimentally determined protein secondary structure (Porter and Looger, Proc Natl Acad Sci U S A 2018; 115:5968–5973). Here we seek to leverage the negative information found in these secondary structure prediction discrepancies. To do this, we quantified secondary structure prediction accuracies of 192 known fold‐switching regions (FSRs) within solved protein structures found in the Protein Data Bank (PDB). We find that the secondary structure prediction accuracies for these FSRs vary widely. Inaccurate secondary structure predictions are strongly associated with fold‐switching proteins compared to equally long segments of non‐fold‐switching proteins selected at random. These inaccurate predictions are enriched in helix‐to‐strand and strand‐to‐coil discrepancies. Finally, we find that most proteins with inaccurate secondary structure predictions are underrepresented in the PDB compared with their alternatively folded cognates, suggesting that unequal representation of fold‐switching conformers within the PDB could be an important cause of inaccurate secondary structure predictions. These results demonstrate that inconsistent secondary structure predictions can serve as a useful preliminary marker of fold switching.

17.
We show that loops of close contacts involving hydrophobic residues are important in protein folding. Contrary to Berezovsky and Trifonov (J Biomol Struct Dyn 20, 5-6, 2002), the loops important in protein folding are usually much larger than 23-31 residues, being instead comparable to the size of the protein for single-domain proteins. Additionally, what is important is not single loop contacts but a highly interconnected network of such loop contacts, which provides extra stability to a protein fold and leads to their conservation in evolution.

18.
We have developed a new combined approach for ab initio protein structure prediction. The protein conformation is described as a lattice chain connecting Cα atoms, with attached Cβ atoms and side-chain centers of mass. The model force field includes various short-range and long-range knowledge-based potentials derived from a statistical analysis of the regularities of protein structures. The combination of these energy terms is optimized by maximizing, for 30 × 60,000 decoys, the correlation between the root mean square deviation (RMSD) to native and the energies, as well as the energy gap between the native structure and the decoy ensemble. To accelerate the conformational search, a newly developed parallel hyperbolic sampling algorithm with a composite movement set is used in the Monte Carlo simulation processes. We exploit this strategy to successfully fold 41/100 small proteins (36 to 120 residues), with predicted structures having an RMSD from native below 6.5 Å in the top five cluster centroids. To fold larger proteins as well as to improve the folding yield of small proteins, we incorporate into the basic force field side-chain contact predictions from our threading program PROSPECTOR, with homologous proteins excluded from the database. With these threading-based restraints, the program can fold 83/125 test proteins (36 to 174 residues), with structures having an RMSD to native below 6.5 Å in the top five cluster centroids. This shows the significant improvement of folding by using predicted tertiary restraints, especially when the accuracy of side-chain contact prediction is >20%. For native fold selection, we introduce quantities dependent on the cluster density and the combination of energy and free energy, which show a higher discriminative power to select the native structure than the previously used cluster energy or cluster size, and which can be used in native structure identification in blind simulations. These procedures are readily automated and are being implemented on a genomic scale.

19.
The first application of a novel technique for the identification of common folding motifs in proteins is presented. Using techniques derived from graph theory, developed in order to compare secondary structure motifs in proteins, we have established that there is a striking resemblance in the tertiary fold of the Salmonella typhimurium CheY chemotaxis protein and that of the GDP-binding domain of Escherichia coli elongation factor Tu (EF Tu). These two protein structures are representatives of two major macromolecular classes: CheY is a signal-transduction protein with sequence homologies to a wide range of bacterial proteins involved in regulation of chemotaxis, membrane synthesis and sporulation; whilst EF Tu is one of a family of guanosine-nucleotide-binding proteins which include the ras oncogene proteins and signal-transducing G proteins. The similarity we have found extends far beyond the previously recognized resemblances of each protein's fold to that of a generic nucleotide-binding domain. The lack of significant sequence homology between the two classes of proteins may mean that the common fold of the two proteins constitutes a particularly stable folding motif. However, an alternative possibility is that the strong three-dimensional structural resemblance may be indicative of a remote shared common ancestry between the bacterial signal-transduction proteins and the GDP-binding proteins.

20.
Inter-residue interactions in protein folding and stability   Total citations: 6, self-citations: 0, citations by others: 6
During the process of protein folding, the amino acid residues along the polypeptide chain interact with each other in a cooperative manner to form the stable native structure. Knowledge of inter-residue interactions in protein structures is very helpful for understanding the mechanism of protein folding and stability. In this review, we introduce the classification of inter-residue interactions into short, medium and long range based on a simple geometric approach. The features of these interactions in different structural classes of globular and membrane proteins, and in various folds, have been delineated. The development of contact potentials and the application of inter-residue contacts for predicting the structural class and secondary structures of globular proteins, solvent accessibility, fold recognition and ab initio tertiary structure prediction have been evaluated. Further, the relationship between inter-residue contacts and protein-folding rates has been highlighted. Moreover, the importance of inter-residue interactions in protein-folding kinetics and for understanding the stability of proteins has been discussed. In essence, the information gained from the studies on inter-residue interactions provides valuable insights for understanding protein folding and de novo protein design.
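
The classification of contacts into short, medium and long range mentioned in this review can be sketched by binning residue pairs on their sequence separation. The abstract does not give the boundaries, so the values used here (separation up to 2 for short, 3 to 4 for medium, above 4 for long) are an assumed convention, and the function names and example contact list are hypothetical.

```python
from collections import Counter

def contact_range(i, j, short_max=2, medium_max=4):
    """Label a contact between residues i and j by sequence separation.

    The default boundaries are assumptions, not values stated in the abstract.
    """
    sep = abs(i - j)
    if sep == 0:
        raise ValueError("a residue cannot form a contact with itself")
    if sep <= short_max:
        return "short"
    if sep <= medium_max:
        return "medium"
    return "long"

def range_profile(contacts):
    """Count how many contacts fall into each range class."""
    return Counter(contact_range(i, j) for i, j in contacts)

# Hypothetical contact list given as (residue index, residue index) pairs.
contacts = [(1, 2), (1, 3), (5, 8), (5, 9), (2, 30), (10, 45)]
print(range_profile(contacts))
```

Keeping the boundaries as parameters makes it easy to repeat the count with a different convention, for example finer bins within the long-range class.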
