Similar literature
 20 similar documents found (search time: 31 ms)
1.
Interresidue pair contacts were analyzed in detail for four pairs of protein structures solved using X-ray analysis (X-ray) and nuclear magnetic resonance (NMR). In the four NMR structures, at distances of ≤4.0 Å, the total number of pair contacts was 4–9% lower and, in general, the pair contacts were 0.02–0.16 Å shorter compared to the X-ray structures. Each of the four structural pairs contained 83–94% common pair contacts (CPCs), which were formed by identical residues in both structures; the other 6–17% were longer intrinsic pair contacts (IPCs) formed by different residues in the NMR and X-ray structures, with the X-ray structures containing more IPCs. Every NMR structure contained three types of CPC that were shorter, longer, or equal to the identical contact pairs in the X-ray structure of this protein. Methodologically different short CPCs prevailed at a known distance dependence of the interresidue contact density in 60–61 pairs of NMR/X-ray structures. Among the four structural pairs analyzed, contact shortening appeared upon the energy minimization of the crambin NMR structure and upon solving the ubiquitin, hen lysozyme, and monomeric hemoglobin NMR structures using X-PLOR software with decreased van der Waals atomic radii. The degree of contact shortening in the NMR structures diminished with an increase in the NMR data used to solve these structures. Among the 60 pairs of NMR/X-ray structures, the major difference between α-helical and β-structural proteins in the dependence of average contact density on interresidue distance appeared due to strong α/β differences in the backbone local geometry.
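
The contact bookkeeping described in this abstract can be illustrated with a minimal Python sketch. It assumes that two residues are "in contact" when any pair of their atoms lies within the cutoff (4.0 Å here), which is an assumption and not necessarily the definition used in the paper. The function names and the toy coordinates are hypothetical, and sequence-adjacent residues are not treated specially.

```python
import math
from itertools import combinations


def count_pair_contacts(residues, cutoff=4.0):
    """Return the set of residue-id pairs with at least one atom pair within `cutoff` (in Å).

    `residues` maps a residue id to a list of (x, y, z) atom coordinates.
    The any-atom-within-cutoff criterion is an assumption made for illustration.
    """
    contacts = set()
    for (id_a, atoms_a), (id_b, atoms_b) in combinations(sorted(residues.items()), 2):
        if any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
            contacts.add((id_a, id_b))
    return contacts


def split_common_intrinsic(contacts_nmr, contacts_xray):
    """Split contacts into common pairs (CPCs) and pairs found in only one structure (IPCs)."""
    common = contacts_nmr & contacts_xray
    return common, contacts_nmr - common, contacts_xray - common


# Toy one-atom-per-residue coordinates (hypothetical, not taken from any real structure).
toy_xray = {1: [(0.0, 0.0, 0.0)], 2: [(3.5, 0.0, 0.0)], 3: [(7.0, 0.0, 0.0)]}
toy_nmr = {1: [(0.0, 0.0, 0.0)], 2: [(3.4, 0.0, 0.0)], 3: [(7.8, 0.0, 0.0)]}

cpc, ipc_nmr, ipc_xray = split_common_intrinsic(
    count_pair_contacts(toy_nmr), count_pair_contacts(toy_xray)
)
```

In this toy case the pair (1, 2) is common to both structures, while (2, 3) appears only in the X-ray coordinates; comparing the shortest distance within each common pair would be the natural next step for a shortening analysis.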

2.
Understanding and predicting protein structures depend on the complexity and the accuracy of the models used to represent them. We have recently set up a Hidden Markov Model to optimally compress protein three-dimensional conformations into a one-dimensional series of letters of a structural alphabet. Such a model learns simultaneously the shape of representative structural letters describing the local conformation and the logic of their connections, i.e. the transition matrix between the letters. Here, we move one step further and report some evidence that such a model of protein local architecture also captures some accurate amino acid features. All the letters have specific and distinct amino acid distributions. Moreover, we show that words of amino acids can have significant propensities for some letters. Perspectives point towards the prediction of the series of letters describing the structure of a protein from its amino acid sequence.

3.
Measurements of protein sequence-structure correlations   Cited by: 1 (self-citations: 0, citations by others: 1)
Crooks GE, Wolfe J, Brenner SE. Proteins, 2004, 57(4):804-810
Correlations between protein structures and amino acid sequences are widely used for protein structure prediction. For example, secondary structure predictors generally use correlations between a secondary structure sequence and the corresponding primary structure sequence, whereas threading algorithms and similar tertiary structure predictors typically incorporate interresidue contact potentials. To investigate the relative importance of these sequence-structure interactions, we measured the mutual information among the primary structure, secondary structure and side-chain surface exposure, both for adjacent residues along the amino acid sequence and for tertiary structure contacts between residues distantly separated along the backbone. We found that local interactions along the amino acid chain are far more important than non-local contacts and that correlations between proximate amino acids are essentially uninformative. This suggests that knowledge-based contact potentials may be less important for structure prediction than is generally believed.
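
As a rough illustration of the quantity measured here, the following Python sketch computes a plug-in (maximum-likelihood) estimate of the mutual information between two paired label sequences, for example residue types and secondary structure states. The estimator choice and the function name are assumptions; the plug-in estimate is biased for small samples, and the abstract does not say which estimator or corrections the authors used.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired observations xs[i], ys[i]."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y).
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy data (hypothetical): residue type fully determines the structural state.
mi_dependent = mutual_information("AAAALLLL", "HHHHEEEE")
# Toy data (hypothetical): residue type and state are independent.
mi_independent = mutual_information("AALL", "HEHE")
```

With these toy inputs the fully dependent pairing carries one bit of information and the independent pairing carries none, which is the contrast the abstract draws between informative and uninformative correlations.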

4.
We calculated interchain contacts at the atomic level for a nonredundant set of 4602 protein-protein interfaces using an unbiased Voronoi-Delaunay tessellation method, and built 20x20 residue contact matrices for both homodimers and heterocomplexes. The area of contacts and the distance distribution for these contacts were calculated at both the residue and the atomic levels. We analyzed the residue area distribution and showed the existence of two types of interresidue contacts: stochastic and specific. We also derived formulas describing the distribution of contact area for stochastic and specific interactions in parametric form. The maximum pairing preference index was found for Cys-Cys contacts and for oppositely charged interactions. A significant difference in residue contacts was observed between homodimers and heterocomplexes. Interfaces in homodimers were enriched with contacts between residues of the same type due to the effects of structural symmetry.

5.
Protein structures are classically described in terms of secondary structures. Even if the regular secondary structures have relevant physical meaning, their recognition from atomic coordinates has some important limitations, such as uncertainties in the assignment of boundaries of helical and β-strand regions. Further, on average about 50% of all residues are assigned to an irregular state, i.e., the coil. Thus, different research teams have focused on abstracting the conformation of the protein backbone in localized short stretches. Using different geometric measures, local stretches in protein structures are clustered into a chosen number of states. A prototype representative of the local structures in each cluster is generally defined. These libraries of local structure prototypes are called "structural alphabets". We have developed a structural alphabet, named Protein Blocks, not only to approximate protein structures but also to predict them from sequence. Since its development, we and other teams have explored numerous new research fields using this structural alphabet. We review here some of the most interesting applications.

6.
The detection of Outer Membrane Proteins (OMP) in whole genomes is a topical question, and their sequence characteristics have thus been intensively studied. This class of protein displays a common beta-barrel architecture, formed by adjacent antiparallel strands. However, due to the lack of available structures, few structural studies have been made on this class of proteins. Here we propose a novel OMP local structure investigation, based on a structural alphabet approach, i.e., the decomposition of 3D structures using a library of four-residue protein fragments. The optimal decomposition of structures using a hidden Markov model results in a specific structural alphabet of 20 fragments, six of them dedicated to the decomposition of beta-strands. This optimal alphabet, called SA20-OMP, is analyzed in detail, in terms of local structures and transitions between fragments. It highlights a particular and strong organization of beta-strands as series of regular canonical structural fragments. The comparison with alphabets learned on globular structures indicates that the internal organization of OMP structures is more constrained than in globular structures. The analysis of OMP structures using SA20-OMP reveals some recurrent structural patterns. The preferred location of fragments in the distinct regions of the membrane is investigated. The study of pairwise specificity of fragments reveals that some contacts between structural fragments in beta-sheets are clearly favored whereas others are avoided. This contact specificity is stronger in OMP than in globular structures. Moreover, SA20-OMP also captures sequence information. This can be integrated into a scoring function for structural model ranking, with very promising results.

7.
8.
The environment of amino acid residues in protein tertiary structures and three types of interfaces formed by protein-protein association--in complexes, homodimers, and crystal lattices of monomeric proteins--has been analyzed in terms of the propensity values of the 20 amino acid residues to be in contact with a given residue. On the basis of the similarity of the environment, twenty residues can be divided into nine classes, which may correspond to a set of reduced amino acid alphabet. There is no appreciable change in the environment in going from the tertiary structure to the interface, those participating in the crystal contacts showing the maximum deviation. Contacts between identical residues are very prominent in homodimers and crystal dimers and arise due to 2-fold related association of residues lining the axis of rotation. These two types of interfaces, representing specific and nonspecific associations, are characterized by the types of residues that partake in "self-contacts"--most notably Leu in the former and Glu in the latter. The relative preference of residues to be involved in "self-contacts" can be used to develop a scoring function to identify homodimeric proteins from crystal structures. Thirty-four percent of such residues are fully conserved among homologous proteins in the homodimer dataset, as opposed to only 20% in crystal dimers. Results point to Leu being the stickiest of all amino acid residues, hence its widespread use in motifs, such as leucine zippers.

9.
Protein design experiments have shown that the use of specific subsets of amino acids can produce foldable proteins. This prompts the question of whether there is a minimal amino acid alphabet which could be used to fold all proteins. In this work we make an analogy between sequence patterns which produce foldable sequences and those which make it possible to detect structural homologs by aligning sequences, and use it to suggest the possible size of such a reduced alphabet. We estimate that reduced alphabets containing 10-12 letters can be used to design foldable sequences for a large number of protein families. This estimate is based on the observation that there is little loss of the information necessary to pick out structural homologs in a clustered protein sequence database when a suitable reduction of the amino acid alphabet from 20 to 10 letters is made, but that this information is rapidly degraded when further reductions in the alphabet are made.
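
To make the idea of a reduced alphabet concrete, here is a minimal Python sketch that rewrites a sequence using one representative letter per residue group. The 10-group partition below is an illustrative assumption chosen for this example; it is not stated in the abstract and should not be read as the grouping the authors derived. All names are hypothetical.

```python
# Illustrative 10-group partition of the 20 amino acids (an assumption, not the paper's result).
ILLUSTRATIVE_GROUPS = ["LVIM", "C", "A", "G", "ST", "P", "FYW", "EDNQ", "KR", "H"]


def build_mapping(groups):
    """Map every residue to the first letter of its group, rejecting overlapping groups."""
    mapping = {}
    for group in groups:
        representative = group[0]
        for aa in group:
            if aa in mapping:
                raise ValueError(f"residue {aa!r} appears in more than one group")
            mapping[aa] = representative
    return mapping


def reduce_sequence(sequence, mapping, unknown="X"):
    """Rewrite a sequence in the reduced alphabet; unrecognized symbols become `unknown`."""
    return "".join(mapping.get(aa, unknown) for aa in sequence.upper())


mapping = build_mapping(ILLUSTRATIVE_GROUPS)
reduced = reduce_sequence("MKVLDEFW", mapping)
```

Sequences reduced this way could then be aligned or clustered as usual, which is the kind of comparison the abstract uses to judge how much homolog-detection information survives a given reduction.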

10.
Protein-protein interactions form the proteinaceous network, which plays a central role in numerous processes in the cell. This review highlights the main structures, properties of contact surfaces, and forces involved in protein-protein interactions. The properties of protein contact surfaces depend on their functions. The characteristics of contact surfaces of short-lived protein complexes share some similarities with the active sites of enzymes. The contact surfaces of permanent complexes resemble domain contacts or the protein core. It is reasonable to consider protein-protein complex formation as a continuation of protein folding. The contact surfaces of the protein complexes have unique structure and properties, so they represent prospective targets for a new generation of drugs. During the last decade, numerous investigations have been undertaken to find or design small molecules that block protein dimerization or protein(peptide)-receptor interaction, or on the other hand, induce protein dimerization.

11.
The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit known protein 3D structures using a structural alphabet, known as "Protein Blocks" (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. This library is used to encode 3D protein structures into 1D PB sequences and to capture sequence-to-structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods, or explicit incorporation of the hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales up well and offers promising perspectives for structural annotations at the genomic level. It has been implemented in the form of a web server that is freely available at http://www.bo-protscience.fr/forsa .
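
The scoring idea in this abstract (one amino acid occurrence table per PB, used to score a query sequence against a PB-encoded fold) can be sketched in Python as follows. This is a simplified illustration under several assumptions that are not in the abstract: a single-position log-odds score with pseudocounts, an ungapped position-by-position alignment, and toy training data. The function names are hypothetical and this is not the implementation behind the web server.

```python
import math
from collections import Counter, defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_log_odds(training_pairs, pseudocount=1.0):
    """Build one log-odds table per PB from (amino acid sequence, PB sequence) pairs.

    Each table holds log(P(aa | PB) / P(aa)); the pseudocount is an assumption
    added here so that unseen residues do not give log(0).
    """
    per_pb = defaultdict(Counter)
    background = Counter()
    for aa_seq, pb_seq in training_pairs:
        for aa, pb in zip(aa_seq, pb_seq):
            per_pb[pb][aa] += 1
            background[aa] += 1
    total_bg = sum(background.values()) + pseudocount * len(AMINO_ACIDS)
    log_odds = {}
    for pb, counts in per_pb.items():
        total_pb = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        log_odds[pb] = {
            aa: math.log(
                ((counts[aa] + pseudocount) / total_pb)
                / ((background[aa] + pseudocount) / total_bg)
            )
            for aa in AMINO_ACIDS
        }
    return log_odds


def global_threading_score(query, pb_template, log_odds):
    """Sum per-position log-odds for an ungapped, equal-length alignment (a simplification)."""
    if len(query) != len(pb_template):
        raise ValueError("query and template must have the same length")
    return sum(log_odds[pb][aa] for aa, pb in zip(query, pb_template))


# Toy training data (hypothetical): Ala seen only in PB 'm', Gly only in PB 'd'.
tables = build_log_odds([("AAAAGGGG", "mmmmdddd")])
score_match = global_threading_score("AAAA", "mmmm", tables)
score_mismatch = global_threading_score("GGGG", "mmmm", tables)
```

A query whose residues match the preferences of the template's PBs scores above zero and a mismatched query scores below zero; turning raw scores into Z-scores, as the abstract describes, would additionally require a distribution of scores over many templates.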

12.
Armando D. Solis. Proteins, 2015, 83(12):2198-2216
To reduce complexity, understand generalized rules of protein folding, and facilitate de novo protein design, the 20-letter amino acid alphabet is commonly reduced to a smaller alphabet by clustering amino acids based on some measure of similarity. In this work, we seek the optimal alphabet that preserves as much as possible of the structural information found in long-range (contact) interactions among amino acids in natively folded proteins. We employ the Information Maximization Device, based on information theory, to partition the amino acids into well-defined clusters. Numbering from 2 to 19 groups, these optimal clusters of amino acids, while generated automatically, embody well-known properties of amino acids such as hydrophobicity/polarity, charge, size, and aromaticity, and are demonstrated to maintain the discriminative power of long-range interactions with minimal loss of mutual information. Our measurements suggest that reduced alphabets (of fewer than 10 letters) are able to capture virtually all of the information residing in native contacts and may be sufficient for fold recognition, as demonstrated by extensive threading tests. In an expansive survey of the literature, we observe that alphabets derived from various approaches (including those derived from physicochemical intuition, local structure considerations, and sequence alignments of remote homologs) fare consistently well in preserving contact interaction information, highlighting a convergence in the various factors thought to be relevant to the folding code. Moreover, we find that alphabets commonly used in experimental protein design are nearly optimal and are largely coherent with observations that have arisen in this work. Proteins 2015; 83:2198–2216. © 2015 Wiley Periodicals, Inc.

13.
What are the key building blocks that would have been needed to construct complex protein folds? This is an important issue for understanding protein folding mechanisms and guiding de novo protein design. Twenty naturally occurring amino acids and eight secondary structures constitute a 28-letter alphabet that determines folding kinetics and mechanism. Here we predict folding kinetic rates of proteins from many reduced alphabets. We find that a reduced alphabet of 10 letters achieves good correlation with folding rates, close to that achieved by the full 28-letter alphabet. Many other reduced alphabets are not significantly correlated with folding rates. The finding suggests that not all amino acids and secondary structures are equally important for protein folding. The foldable sequence of a protein could be designed using at least 10 folding units, which can either promote or inhibit protein folding. Reducing alphabet cardinality without losing key folding kinetic information opens the door to potentially faster machine learning and data mining applications in protein structure prediction, sequence alignment and protein design. Proteins 2015; 83:631–639. © 2015 Wiley Periodicals, Inc.

14.
In many biological systems, proteins interact with other organic molecules to produce indispensable functions, in which molecular recognition phenomena are essential. Proteins have kept or gained their functions during molecular evolution. Their functions seem to be flexible, and a few amino acid substitutions sometimes cause drastic changes in function. In order to monitor and predict such drastic changes in the early stages in target populations, we need to identify patterns of structural changes during molecular evolution causing decreases or increases in the binding affinity of protein complexes. In previous work, we developed a likelihood-based index to quantify the degree to which a sequence fits a given structure. This index was named the sequence-structure fitness (SSF) and is calculated empirically based on amino acid preferences and pairwise interactions in the structural environment present in template structures. In the present work, we used the SSF to develop an index to measure the binding affinity of protein-protein complexes defined as the log likelihood ratio, contrasting the fitness of the sequences to the structure of the complex and that of the uncomplexed proteins. We applied the developed index to the complexes formed between influenza A hemagglutinin (HA) and four antibodies. The antibody-antigen binding region of HA is under strong selection pressure by the host immune system. Hence, examination of the long-term adaptation of HA to the four antibodies could reveal the strategy of the molecular evolution of HA. Two antibodies cover the HA receptor-binding region, while the other two bind away from the receptor-binding region. By focusing on branches with a significant decline in binding ability, we could detect key amino acid replacements and investigate the mechanism via conditional probabilities. The contrast between the adaptations to the two types of antibodies suggests that the virus adapts to the immune system at the cost of structural change.

15.
Martin O, Schomburg D. Proteins, 2008, 70(4):1367-1378
Biological systems and processes rely on a complex network of molecular interactions. While the association of biological macromolecules is a fundamental biochemical phenomenon crucial for the understanding of complex living systems, protein-protein docking methods aim for the computational prediction of protein complexes from individual subunits. Docking algorithms generally produce large numbers of putative protein complexes, with only a few of these conformations resembling the native complex structure within an acceptable degree of structural similarity. A major challenge in the field of docking is to extract near-native structure(s) out of the large pool of solutions, the so-called scoring or ranking problem. A series of structural, chemical, biological and physical properties are used in this work to classify docked protein-protein complexes. These properties include specialized energy functions, evolutionary relationship, class-specific residue interface propensities, gap volume, buried surface area, empirical pair potentials at the residue and atom levels, as well as measures for the tightness of fit. Efficient comprehensive scoring functions have been developed using probabilistic Support Vector Machines in combination with this array of properties on the largest currently available protein-protein docking benchmark. The established classifiers are shown to be specific for certain types of protein-protein complexes and are able to detect near-native complex conformations from large sets of decoys with high sensitivity. Using classification probabilities, the ranking of near-native structures was drastically improved, leading to a significant enrichment of near-native complex conformations within the top ranks. It could be shown that the developed schemes outperform five other previously published scoring functions.

16.
The protein sequence world is considerably larger than the structure world. As a consequence, numerous unrelated sequences may adopt similar 3D folds, and different kinds of amino acids may thus be found in similar 3D structures. By grouping the 20 amino acids into a smaller number of representative residues with similar features, the sequence world may be simplified. This clustering hence defines a reduced amino acid alphabet (reduced AAA). Numerous works have shown that protein 3D structures are composed of a limited number of building blocks, defining a structural alphabet. We previously identified such an alphabet composed of 16 representative structural motifs (five residues in length) called Protein Blocks (PBs). This alphabet makes it possible to translate the 3D structure into a 1D sequence of PBs. Based on these two concepts, reduced AAA and PBs, we analyzed the distributions of the different kinds of amino acids and their equivalences in the structural context. Different reduced sets were considered. Recurrent amino acid associations were found in all the local structures, while others were specific to some local structures (PBs) (e.g., Cysteine, Histidine, Threonine and Serine for the alpha-helix Ncap). Some similar associations are found in other reduced AAAs, e.g., Ile with Val, or hydrophobic aromatic residues Trp with Phe and Tyr. We highlight interesting alternative associations. This underlines the dependence on the information considered (sequence or structure). This approach, equivalent to a substitution matrix, could be useful for designing protein sequences with different features (for instance, adaptation to environment) while largely preserving the 3D fold.

17.
Protein-protein interactions play a central role in numerous processes in the cell and are one of the main fields of functional proteomics. This review highlights the bioinformatics and functional proteomics methods used to investigate protein-protein interactions. The structures and properties of contact surfaces, the forces involved in protein-protein interactions, and the kinetic and thermodynamic parameters of these reactions are considered. The properties of protein contact surfaces depend on their functions. The contact surfaces of permanent complexes resemble domain contacts or the protein core, and it is reasonable to consider such complex formation as a continuation of protein folding. Characteristics of contact surfaces of temporary protein complexes share some similarities with active sites of enzymes. The contact surfaces of the temporary protein complexes have unique structure and properties, and they are more conserved than the active sites of enzymes. They therefore represent prospective targets for a new generation of drugs. During the last decade, numerous investigations were undertaken to find or design small molecules that block protein dimerization or protein(peptide)-receptor interaction, or, on the contrary, induce protein dimerization.

18.
We propose a classification of amino acid residues based on the events of contact formation between particular residues and DNA nucleotides, i.e., using the most integral properties that characterize interactions organizing DNA-protein complexes. We apply the Voronoi-Delaunay tessellation to draw statistics of contacts and of contact areas for a set of 1937 DNA-protein complexes. Similarity of amino acid residues is defined upon comparison of corresponding rows and matrices of contacts and areas of contacts. Nine measures of distance have been used to estimate the closeness of rows. Residues have been grouped by three hierarchical and two nonhierarchical clustering methods. In a total tree built using nine metrics with three hierarchical methods, we show that clustering centers (pairs of amino acids) in the main groups are always constant while other relationships between objects vary. Major classes of up to six amino acids correspond to certain local structures of the polypeptide chain. These data can be taken into account when designing DNA-protein ligands.

19.
We describe the derivation and testing of a knowledge-based atomic environment potential for the modeling of protein structural energetics. An analysis of the probabilities of atomic interactions in a dataset of high-resolution protein structures shows that the probabilities of non-bonded inter-atomic contacts are not statistically independent events, and that the multi-body contact frequencies are poorly predicted from pairwise contact potentials. A pseudo-energy function is defined that measures the preferences for protein atoms to be in a given microenvironment defined by the number of contacting atoms in the environment and its atomic composition. This functional form is tested for its ability to recognize native protein structures amongst an ensemble of decoy structures and a detailed relative performance comparison is made with a number of common functions used in protein structure prediction.

20.
Ehrlich LP, Nilges M, Wade RC. Proteins, 2005, 58(1):126-133
Accounting for protein flexibility in protein-protein docking algorithms is challenging, and most algorithms therefore treat proteins as rigid bodies or permit side-chain motion only. While the consequences are obvious when there are large conformational changes upon binding, the situation is less clear for the modest conformational changes that occur upon formation of most protein-protein complexes. We have therefore studied the impact of local protein flexibility on protein-protein association by means of rigid body and torsion angle dynamics simulation. The binding of barnase and barstar was chosen as a model system for this study, because the complexation of these two proteins is well-characterized experimentally, and the conformational changes accompanying binding are modest. On the side-chain level, we show that the orientation of particular residues at the interface (so-called hotspot residues) has a crucial influence on the way contacts are established during docking from short protein separations of approximately 5 Å. However, side-chain torsion angle dynamics simulations did not result in satisfactory docking of the proteins when using the unbound protein structures. This can be explained by our observations that, on the backbone level, even small (2 Å) local loop deformations affect the dynamics of contact formation upon docking. Complementary shape-based docking calculations confirm this result, which indicates that both side-chain and backbone levels of flexibility influence short-range protein-protein association and should be treated simultaneously for atomic-detail computational docking of proteins.
