Similar articles (20 results)
1.
The predictive limits of the amino acid composition for the secondary structural content (percentage of residues in the secondary structural states helix, sheet, and coil) in proteins are assessed quantitatively. For the first time, techniques for prediction of secondary structural content are presented which rely on the amino acid composition as the only information on the query protein. In our first method, the amino acid composition of an unknown protein is represented by the best (in a least square sense) linear combination of the characteristic amino acid compositions of the three secondary structural types computed from a learning set of tertiary structures. The second technique is a generalization of the first one and takes into account also possible compositional couplings between any two sorts of amino acids. Its mathematical formulation results in an eigenvalue/eigenvector problem of the second moment matrix describing the amino acid compositional fluctuations of secondary structural types in various proteins of a learning set. Possible correlations of the principal directions of the eigenspaces with physical properties of the amino acids were also checked. For example, the first two eigenvectors of the helical eigenspace correlate with the size and hydrophobicity of the residue types respectively. As learning and test sets of tertiary structures, we utilized representative, automatically generated subsets of Protein Data Bank (PDB) consisting of non-homologous protein structures at the resolution thresholds ≤1.8Å, ≤2.0Å, ≤2.5Å, and ≤3.0Å. We show that the consideration of compositional couplings improves prediction accuracy, albeit not dramatically. 
Whereas in the self-consistency test (learning with the protein to be predicted), a clear decrease of prediction accuracy with worsening resolution is observed, the jackknife test (leave the predicted protein out) yielded the best results for the largest dataset (≤3.0 Å, almost no difference to the self-consistency test!), i.e., only this set, with more than 400 proteins, is sufficient for stable computation of the parameters in the prediction function of the second method. The average absolute errors in predicting the fractions of helix, sheet, and coil from the amino acid composition of the query protein are 13.7, 12.6, and 11.4%, respectively, with r.m.s. deviations in the range of 8.6–11.8% for the 3.0 Å dataset in a jackknife test. The absolute precision of the average absolute errors is in the range of 1–3% as measured for other representative subsets of the PDB. Secondary structural content prediction methods found in the literature have been clustered in accordance with their prediction accuracies. To our surprise, much more complex secondary structure prediction methods utilized for the same purpose of secondary structural content prediction achieve prediction accuracies very similar to those of the present analytic techniques, implying that all the information beyond the amino acid composition is, in fact, mainly utilized for positioning the secondary structural state in the sequence but not for determination of the overall number of residues in a secondary structural type. This result implies that higher prediction accuracies cannot be achieved relying solely on the amino acid composition of an unknown query protein as prediction input. Our prediction program SSCP has been made available as a World Wide Web and E-mail service. © 1996 Wiley-Liss, Inc.
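
The first method in this abstract (a least-squares linear combination of characteristic compositions) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the toy 4-letter alphabet, the numbers in `HELIX`, `SHEET` and `COIL`, and the function names. The fit is unconstrained, since the abstract does not say whether the weights are forced to be non-negative or to sum to one.

```python
def solve_linear(a, b):
    """Solve a small dense system a*x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_content(query, basis):
    """Weights w minimising |query - sum_k w[k] * basis[k]|^2 (normal equations)."""
    k = len(basis)
    gram = [[sum(p * q for p, q in zip(basis[i], basis[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(p * q for p, q in zip(basis[i], query)) for i in range(k)]
    return solve_linear(gram, rhs)


# Hypothetical characteristic compositions over a toy 4-letter alphabet;
# a real application would use 20 amino acid fractions per structural state.
HELIX = [0.50, 0.20, 0.20, 0.10]
SHEET = [0.10, 0.50, 0.20, 0.20]
COIL = [0.20, 0.10, 0.30, 0.40]

# A query composition mixed as 40% helix, 35% sheet, 25% coil.
query = [0.40 * h + 0.35 * s + 0.25 * c for h, s, c in zip(HELIX, SHEET, COIL)]
helix, sheet, coil = fit_content(query, [HELIX, SHEET, COIL])
print(round(helix, 3), round(sheet, 3), round(coil, 3))
```

Because the query is an exact mixture of the three basis vectors, the fit recovers the mixing fractions. Real compositions would leave a residual, which is where the reported prediction errors come from.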

2.
The distance geometry approach for computing the tertiary structure of globular proteins emphasized in this series of papers (Goel et al., J. theor. Biol. 99, 705–757, 1982) is developed further. This development includes incorporation of some secondary structure information (the location of alpha helices in the primary sequence) in the algorithm to compute the tertiary structure of alpha helical globular proteins. An algorithm is developed which estimates the interresidue distances between chain-proximate helices. These distances, in conjunction with the global statistical average distances obtainable from a database of real proteins and determined by the primary sequence of the protein under study, are used to determine the tertiary structure. Five proteins, parvalbumin, hemerythrin, human hemoglobin, lamprey hemoglobin, and sperm whale myoglobin, are investigated. The root mean square (RMS) errors between the calculated structures and those determined by X-ray diffraction range from 4.78 to 7.56 Å. These RMS errors are 0.21–2.76 Å lower than those estimated without the secondary structure information. Contact maps and three-dimensional backbone representations also show considerable improvements with the introduction of secondary structure information.

3.
An analysis of geometrical models for computing the tertiary structure of globular proteins from the primary structure is presented. The roles of initial configuration, input information on inter-residue distances and the errors in this information are delineated. It is shown that for local information like that on secondary structure, the calculated structure is very sensitive to errors and to the initial configuration. Thus, such information is far from adequate for predicting the tertiary structure. On the other hand, global information on all the inter-residue distances is quite insensitive to errors. A semi-empirical method is presented to estimate these distances and the calculated structures are given for two proteins: pancreatic trypsin inhibitor and parvalbumin. These structures closely resemble those determined by X-ray diffraction. A strategy for further refinement of the method is indicated.

4.
R. Rajgaria, Y. Wei, C. A. Floudas. Proteins, 2010, 78(8): 1825–1846
An integer linear optimization model is presented to predict residue contacts in β, α + β, and α/β proteins. The total energy of a protein is expressed as the sum of a Cα–Cα distance-dependent contact energy contribution and a hydrophobic contribution. The model selects contacts that assign the lowest energy to the protein structure while satisfying a set of constraints that are included to enforce certain physically observed topological information. A new method based on hydrophobicity is proposed to find the β‐sheet alignments. These β‐sheet alignments are used as constraints for contacts between residues of β‐sheets. This model was tested on three independent protein test sets and CASP8 test proteins consisting of β, α + β, and α/β proteins, and it was found to perform very well. The average accuracy of the predictions (separated by at least six residues) was ~61%. The average true positive and false positive distances were also calculated for each of the test sets and they are 7.58 Å and 15.88 Å, respectively. Residue contact prediction can be directly used to facilitate protein tertiary structure prediction. The proposed residue contact prediction model is incorporated into the first principles protein tertiary structure prediction approach, ASTRO‐FOLD. The effectiveness of the contact prediction model was further demonstrated by the improvement in the quality of the protein structure ensemble generated using the predicted residue contacts for a test set of 10 proteins. Proteins 2010. © 2010 Wiley‐Liss, Inc.

5.
We have developed a new methodology that determines protein structures using small-angle X-ray scattering (SAXS) data. The current bottlenecks in determining the protein structures require a new strategy using the simple design of an experiment, and SAXS is suitable for this purpose in spite of its low information content. First we demonstrated that SAXS constraints work additively to NMR-derived information in calculating structures. Next, structure calculations for nine proteins taking different folds were performed using the SAXS constraints combined with the NMR-derived distance restraints for local geometry such as secondary structures or those for tertiary structure. The results show that the SAXS constraints complemented the tertiary-structural information for all the proteins, and that accuracy of the structures thus obtained with SAXS constraints and local geometrical restraints ranged from 1.85 to 4.33 Å. Based on these results, we were able to construct a coarse-grained protein model at amino acid residue resolution.

6.
MOTIVATION: Until ab initio structure prediction methods are perfected, the estimation of structure for protein molecules will depend on combining multiple sources of experimental and theoretical data. Secondary structure predictions are a particularly useful source of structural information, but are currently only approximately 70% correct, on average. Structure computation algorithms which incorporate secondary structure information must therefore have methods for dealing with predictions that are imperfect. EXPERIMENTS PERFORMED: We have modified our algorithm for probabilistic least squares structural computations to accept 'disjunctive' constraints, in which a constraint is provided as a set of possible values, each weighted with a probability. Thus, when a helix is predicted, the distances associated with a helix are given most of the weight, but some weights can be allocated to the other possibilities (strand and coil). We have tested a variety of strategies for this weighting scheme in conjunction with a baseline synthetic set of sparse distance data, and compared it with strategies which do not use disjunctive constraints. RESULTS: Naive interpretations in which predictions were taken as 100% correct led to poor-quality structures. Interpretations that allow disjunctive constraints are quite robust, and even relatively poor predictions (58% correct) can significantly increase the quality of computed structures (almost halving the RMS error from the known structure). CONCLUSIONS: Secondary structure predictions can be used to improve the quality of three-dimensional structural computations. In fact, when interpreted appropriately, imperfect predictions can provide almost as much improvement as perfect predictions in three-dimensional structure calculations.
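
To make the idea of a disjunctive constraint concrete, here is a minimal sketch that treats one constraint as a weighted set of alternative distance estimates and collapses it to a single mean and variance. That collapse is an assumption chosen for illustration; the abstract does not describe how the algorithm actually consumes the alternatives. The probabilities and distances below are hypothetical.

```python
def mixture_moments(alternatives):
    """Mean and variance of a weighted set of Gaussian distance estimates.

    alternatives: list of (probability, mean_distance, variance) tuples.
    """
    total = sum(p for p, _, _ in alternatives)
    mean = sum(p * d for p, d, _ in alternatives) / total
    # Law of total variance: spread inside each alternative plus spread between them.
    var = sum(p * (v + (d - mean) ** 2) for p, d, v in alternatives) / total
    return mean, var


# Naive reading: a predicted helix is taken as 100% correct.
naive = [(1.0, 5.0, 0.25)]
# Disjunctive reading: helix gets most of the weight, strand and coil the rest.
# All probabilities, distances (Å) and variances are illustrative values.
disjunctive = [(0.7, 5.0, 0.25), (0.2, 10.0, 0.25), (0.1, 7.5, 4.0)]

print(mixture_moments(naive))
print(mixture_moments(disjunctive))
```

The disjunctive reading gives a much larger variance, so a wrong prediction pulls the structure less strongly than it would under the naive reading.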

7.
Solid-state nuclear magnetic resonance (NMR) measurements have made major contributions to our understanding of the molecular structures of amyloid fibrils, including fibrils formed by the beta-amyloid peptide associated with Alzheimer's disease, by proteins associated with fungal prions, and by a variety of other polypeptides. Because solid-state NMR techniques can be used to determine interatomic distances (both intramolecular and intermolecular), place constraints on backbone and side-chain torsion angles, and identify tertiary and quaternary contacts, full molecular models for amyloid fibrils can be developed from solid-state NMR data, especially when supplemented by lower-resolution structural constraints from electron microscopy and other sources. In addition, solid-state NMR data can be used as experimental tests of various proposals and hypotheses regarding the mechanisms of amyloid formation, the nature of intermediate structures, and the common structural features within amyloid fibrils. This review introduces the basic experimental and conceptual principles behind solid-state NMR methods that are applicable to amyloid fibrils, reviews the information about amyloid structures that has been obtained to date with these methods, and discusses how solid-state NMR data provide insights into the molecular interactions that stabilize amyloid structures, the generic propensity of polypeptide chains to form amyloid fibrils, and a number of related issues that are of current interest in the amyloid field.

8.
H. Chen, D. Kihara. Proteins, 2008, 71(3): 1255–1274
The error in protein tertiary structure prediction is unavoidable, but it is not explicitly shown in most of the current prediction algorithms. Estimated error of a predicted structure is crucial information for experimental biologists to use the prediction model for design and interpretation of experiments. Here, we propose a method to estimate errors in predicted structures based on the stability of the optimal target-template alignment when compared with a set of suboptimal alignments. The stability of the optimal alignment is quantified by an index named the SuboPtimal Alignment Diversity (SPAD). We implemented SPAD in a profile-based threading algorithm and investigated how well SPAD can indicate errors in threading models using a large benchmark dataset of 5232 alignments. SPAD shows a very good correlation not only to alignment shift errors but also structure-level errors, the root mean square deviation (RMSD) of predicted structure models to the native structures (i.e. global errors), and local errors at each residue position. We have further compared SPAD with seven other quality measures, six from sequence alignment-based measures and one atomic statistical potential, discrete optimized protein energy (DOPE), in terms of the correlation coefficient to the global and local structure-level errors. In terms of the correlation to the RMSD of structure models, when a target and a template are in the same SCOP family, the sequence identity showed a best correlation to the RMSD; in the superfamily level, SPAD was the best; and in the fold level, DOPE was best. However, in a head-to-head comparison, SPAD wins over the other measures. Next, SPAD is compared with three other measures of local errors. In this comparison, SPAD was best in all of the family, the superfamily and the fold levels. Using the discovered correlation, we have also predicted the global and local error of our predicted structures of CASP7 targets by the SPAD. 
Finally, we proposed a sausage representation of predicted tertiary structures which intuitively indicates the predicted structure and the estimated error range of the structure simultaneously.

9.
Published X‐ray crystallographic structures for glycoside hydrolases (GHs) from 39 different families are surveyed according to some rigorous selection criteria, and the distances separating 208 pairs of catalytic carboxyl groups (20 α‐retaining, 87 β‐retaining, 38 α‐inverting, and 63 β‐inverting) are analyzed. First, the average of all four inter‐carboxyl O···O distances for each pair is determined; second, the mean of all the pair‐averages within each GH family is determined; third, means are determined for groups of GH families. No significant differences are found for free structures compared with those complexed with a ligand in the active site of the enzyme, nor for α‐GHs as compared with β‐GHs. The mean and standard deviation (1σ) of the unimodal distribution of average O···O distances for all families of inverting GHs is 8 ± 2 Å, with a very wide range from 5 Å (GH82) to nearly 13 Å (GH46). The distribution of average O···O distances for all families of retaining GHs appears to be bimodal: the means and standard deviations of the two groups are 4.8 ± 0.3 Å and 6.4 ± 0.6 Å. These average values are more representative, and more likely to be meaningful, than the often‐quoted literature values, which are based on a very small sample of structures. The newly‐updated average values proposed here may alter perceptions about what separations between catalytic residues are “normal” or “abnormal” for GHs. Proteins 2014; 82:1747–1755. © 2014 Wiley Periodicals, Inc.
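
The three-level averaging described in this abstract (four oxygen-oxygen distances per carboxyl pair, then a mean per family) might be sketched as follows. The coordinates and family labels are invented for illustration, and the function names are hypothetical.

```python
from itertools import product
from math import dist
from statistics import mean


def pair_average(carboxyl_a, carboxyl_b):
    """Average of the four O-O distances between two carboxyl groups.

    Each carboxyl group is given as the (x, y, z) positions of its two oxygens.
    """
    return mean(dist(p, q) for p, q in product(carboxyl_a, carboxyl_b))


def family_means(pair_averages_by_family):
    """Mean of the pair averages within each family."""
    return {fam: mean(vals) for fam, vals in pair_averages_by_family.items()}


# Hypothetical oxygen coordinates in Å for one pair of catalytic carboxyl groups.
acid = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
base = [(0.0, 5.0, 0.0), (2.0, 5.0, 0.0)]
print(pair_average(acid, base))

# Hypothetical pair averages grouped by made-up family labels.
print(family_means({"family_A": [4.7, 4.9], "family_B": [6.2, 6.6, 6.4]}))
```

Averaging per family before averaging across families keeps a heavily sampled family from dominating the group mean.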

10.
In recent years, it has been repeatedly demonstrated that the coordinates of the main-chain atoms alone are sufficient to determine the side-chain conformations of buried residues of compact proteins. Given a perfect backbone, the side-chain packing method can predict the side-chain conformations to an accuracy as high as 1.2 Å RMS deviation (RMSD) with greater than 80% of the χ angles correct. However, similarly rigorous studies have not been conducted to determine how well these apply, if at all, to the more important problem of homology modeling per se. Specifically, if the available backbone is imperfect, as expected for practical application of homology modeling, can packing constraints alone achieve sufficiently accurate predictions to be useful? Here, by systematically applying such methods to the pairwise modeling of two repressor and two cro proteins from the closely related bacteriophages 434 and P22, we find that when the backbone RMSD is 0.8 Å, the prediction on buried side chain is accurate with an RMS error of 1.8 Å and approximately 70% of the χ angles correctly predicted. When the backbone RMSD is larger, in the range of 1.6–1.8 Å, the prediction quality is still significantly better than random, with RMS error at 2.2 Å on the buried side chains and 60% accuracy on χ angles. Together these results suggest the following rules-of-thumb for homology modeling of buried side chains. When the sequence identity between the modeled sequence and the template sequence is >50% (or, equivalently, the expected backbone RMSD is <1 Å), side-chain packing methods work well. When sequence identity is between 30–50%, reflecting a backbone RMS error of 1–2 Å, it is still valid to use side-chain packing methods to predict the buried residues, albeit with care. When sequence identity is below 30% (or backbone RMS error greater than 2 Å), the backbone constraint alone is unlikely to produce useful models. 
Other methods, such as those involving the use of database fragments to reconstruct a template backbone, may be necessary as a complementary guide for modeling.
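
The rules of thumb stated in this abstract can be written as a small lookup. The thresholds come from the abstract; the function name and the wording of the returned advice are illustrative, and the handling of the exact boundaries (30% and 50%) is an assumption.

```python
def side_chain_strategy(sequence_identity_pct):
    """Map template sequence identity (%) to the abstract's modeling advice."""
    if sequence_identity_pct > 50:
        return "side-chain packing works well (expected backbone RMSD < 1 Å)"
    if sequence_identity_pct >= 30:
        return "side-chain packing usable with care (backbone RMS error 1-2 Å)"
    return "backbone constraint alone unlikely to produce useful models"


for identity in (65, 40, 20):
    print(identity, "->", side_chain_strategy(identity))
```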

11.
We have developed a method for predicting the structure of small RNA loops that can be used to augment already existing RNA modeling techniques. The method requires no input constraints on loop configuration other than end-to-end distance. Initial loop structures are generated by randomizing the torsion angles, beginning at one end of the polynucleotide chain and correlating each successive angle with the previous. The bond lengths of these structures are then scaled to fit within the known end constraints and the equilibrium bond lengths of the potential energy function are scaled accordingly. Through a series of rescaling and minimization steps the structures are allowed to relax to lower energy configurations with standard bond lengths and reduced van der Waals clashes. This algorithm has been tested on the variable loops of yeast tRNA-Asp and yeast tRNA-Phe, as well as the sarcin-ricin tetraloop and the anticodon loop of yeast tRNA-Phe. The results indicate good correlation between potential energy and the loop structure predictions that are closest to the variable loop crystal structures, but poorer correlation for the more isolated stem loops. The number of stacking interactions has proven to be a good objective measure of the best loop predictions. Selecting on the basis of energy and stacking, we obtain two structures with 0.65 and 0.75 Å all-atom RMS deviations (RMSD) from the crystal structure for the tRNA-Asp variable loop. The best structure prediction for the tRNA-Phe variable loop has an all-atom RMSD of 2.2 Å and a backbone RMSD of 1.6 Å, with a single base responsible for most of the deviation. For the sarcin-ricin loop from 28S ribosomal RNA, the predicted structure's all-atom RMSD from the NMR structure is 1.0 Å. We obtain a 1.8 Å RMSD structure for the tRNA-Phe anticodon loop. © 1996 John Wiley & Sons, Inc.

12.
As part of our on-going development of a method, based upon distance geometry calculations, for predicting the structures of proteins from the known structures of their homologues, we have predicted the structure of the 176 residue Flavodoxin from Escherichia coli. This prediction was based upon the crystal structures of the homologous Flavodoxins from Anacystis nidulans, Chondrus crispus, Desulfovibrio vulgaris and Clostridium beijerinckii, whose sequence identities with Escherichia coli were 44%, 33%, 23% and 16%, respectively. A total of 13,043 distance constraints among the alpha-carbons of the Escherichia coli structure were derived from the sequence alignments with the known structures, together with 8,893 distance constraints among backbone and sidechain atoms of adjacent residues, 978 between the alpha-carbons and selected atoms of the flavin mononucleotide cofactor, 116 constraints to enforce conserved hydrogen bonds, and 452 constraints on the torsion angles in conserved residues. An ensemble of ten random Escherichia coli structures was computed from these constraints, with an average root mean square coordinate deviation (RMSD) among the alpha carbons of 0.85 Ångstroms (excluding the first 1 and last 6 residues, which have no corresponding residues in any of the homologues and hence were unconstrained); the corresponding average heavy-atom RMSD was 1.60 Å.

Since the distance geometry calculations were performed without hydrogen atoms, protons were added to the resulting structures and these structures embedded in a 50 × 50 × 40 Å solvent box with periodic boundary conditions. They were then subjected to a 20 picosecond dynamical simulated annealing procedure, starting at 300 K and gradually reduced to 10 K, in which all the distance and torsion angle constraints were maintained by means of harmonic restraint functions. This was followed by 1000 iterations of unrestrained conjugate gradients minimization. The goal of this energy refinement procedure was not to drastically modify the structures in an attempt at a priori prediction, but merely to improve upon the predictions obtained from the geometric constraints, particularly with regard to their local backbone and sidechain conformations and their hydrogen bonds. The resulting structures differed from the respective starting structures by an average of 1.52 Å in their heavy-atom RMSDs, while the average RMSD among the heavy atoms of residues 2-170 increased slightly to 1.66 Å. We hope these structures will be good enough to enable the phase problem to be solved for the crystallographic data now being collected on this protein.
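
As a rough illustration of the refinement ingredients named above, the sketch below shows a harmonic restraint penalty and a cooling ramp from 300 K to 10 K. The functional form `k * (d - d0)**2`, the force constant, and the linear shape of the ramp are all assumptions; the abstract only says the temperature was "gradually reduced" and that harmonic restraint functions were used.

```python
def harmonic_restraint(distance, target, force_constant=1.0):
    """Penalty k * (d - d0)^2 for deviating from a target distance."""
    return force_constant * (distance - target) ** 2


def linear_cooling(step, n_steps, t_start=300.0, t_end=10.0):
    """Temperature (K) at a given step of an assumed linear ramp."""
    fraction = step / (n_steps - 1)
    return t_start + fraction * (t_end - t_start)


# Penalty for an alpha-carbon pair sitting 1 Å away from its constrained distance.
print(harmonic_restraint(6.0, 5.0, force_constant=2.0))

# Temperatures at the start, middle and end of a hypothetical 101-step schedule.
print([linear_cooling(s, 101) for s in (0, 50, 100)])
```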

13.
A comprehensive statistical analysis of residue-residue contacts and residue environment in protein 3-D structures is presented. In the present work the range of interresidue interactions (effective radius of influence) in tertiary structures of proteins is examined and found to be 10 Å. This result is obtained by correlating the average number of residues within a spherical volume of different radii (contact numbers) with hydrophobicity. Best correlations are obtained with a radius of 10 Å. The same result is obtained when (i) only long-range interactions are considered and (ii) representative side chain atoms are used to indicate the tertiary structure instead of the usual representation of Cα atoms. Residue environment has been investigated using similar methods. Environmental hydrophobicity varies within only a small range of all residue types. Other physicochemical properties also exhibit similar trends of variation, and only five hydrophobic residues (Leu, Val, Met, Phe and Ile) produce a decrement of around 10% from the expected mean of the physicochemical distance between a residue type and its average environment. An information theory approach is proposed to compare domains, which takes into account the effective radius of influence of residues and sequence similarity.
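
The contact number used in this analysis (how many other residues fall inside a sphere of a given radius) is straightforward to compute from coordinates. The sketch below assumes one representative point per residue and uses a made-up straight chain with 3.8 Å spacing; only the 10 Å radius comes from the abstract.

```python
from math import dist


def contact_numbers(coords, radius=10.0):
    """For each residue, count the other residues within `radius` Å."""
    return [
        sum(1 for j, q in enumerate(coords) if j != i and dist(p, q) <= radius)
        for i, p in enumerate(coords)
    ]


# Hypothetical fully extended chain: six residues spaced 3.8 Å apart on a line.
chain = [(3.8 * i, 0.0, 0.0) for i in range(6)]
print(contact_numbers(chain))
```

In a folded protein, buried residues would show higher counts than exposed ones, which is the quantity the authors correlate with hydrophobicity.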

14.

Background

Protein inter-residue contact maps provide a translation and rotation invariant topological representation of a protein. They can be used as an intermediary step in protein structure predictions. However, the prediction of contact maps represents an unbalanced problem as far fewer examples of contacts than non-contacts exist in a protein structure. In this study we explore the possibility of completely eliminating the unbalanced nature of the contact map prediction problem by predicting real-value distances between residues. Predicting full inter-residue distance maps and applying them in protein structure predictions has been relatively unexplored in the past.

Results

We initially demonstrate that the use of native-like distance maps is able to reproduce 3D structures almost identical to the targets, giving an average RMSD of 0.5 Å. In addition, corrupted physical maps with an introduced random error of ±6 Å are able to reconstruct the targets within an average RMSD of 2 Å. After demonstrating the reconstruction potential of distance maps, we develop two classes of predictors using two-dimensional recursive neural networks: an ab initio predictor that relies only on the protein sequence and evolutionary information, and a template-based predictor in which additional structural homology information is provided. We find that the ab initio predictor is able to reproduce distances with an RMSD of 6 Å, regardless of the evolutionary content provided. Furthermore, we show that the template-based predictor exploits both sequence and structure information even in cases of dubious homology and outperforms the best template hit by a clear margin of up to 3.7 Å. Lastly, we demonstrate the ability of the two predictors to reconstruct the CASP9 targets shorter than 200 residues, producing results similar to those of the state-of-the-art machine learning approach implemented in the Distill server.

Conclusions

The methodology presented here, if complemented by more complex reconstruction protocols, can represent a possible path to improve machine learning algorithms for 3D protein structure prediction. Moreover, it can be used as an intermediary step in protein structure predictions either on its own or complemented by NMR restraints.
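
The contrast between binary contact maps and real-valued distance maps can be sketched as follows. The 8 Å contact threshold and the minimum sequence separation of 6 are common conventions assumed here, not values taken from the abstract, and the coordinates are invented.

```python
from math import dist, sqrt


def distance_map(coords):
    """Full matrix of pairwise distances between residue positions."""
    return [[dist(p, q) for q in coords] for p in coords]


def contact_fraction(dmap, threshold=8.0, min_separation=6):
    """Fraction of sequence-distant residue pairs that count as contacts."""
    n = len(dmap)
    pairs = [(i, j) for i in range(n) for j in range(i + min_separation, n)]
    contacts = sum(1 for i, j in pairs if dmap[i][j] < threshold)
    return contacts / len(pairs)


def distance_rmsd(map_a, map_b):
    """RMS difference between two distance maps over all pairs i < j."""
    n = len(map_a)
    sq = [(map_a[i][j] - map_b[i][j]) ** 2 for i in range(n) for j in range(i + 1, n)]
    return sqrt(sum(sq) / len(sq))


# Hypothetical 8-residue layout in which only residues 0 and 7 are close.
coords = [(0.0, 0.0, 0.0)] + [(100.0 * i, 0.0, 0.0) for i in range(1, 7)] + [(1.0, 0.0, 0.0)]
native = distance_map(coords)
print(contact_fraction(native))

# A "prediction" that overestimates every distance by 2 Å.
predicted = [[d + 2.0 for d in row] for row in native]
print(distance_rmsd(native, predicted))
```

The low contact fraction shows the class imbalance that the abstract mentions. The distance RMSD needs no superposition, since distance maps are invariant to rotation and translation.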

15.
The present study addresses the effect of structural distortion, caused by protein modeling errors, on calculated binding affinities toward small molecules. The binding affinities to a total of 300 distorted structures based on five different protein–ligand complexes were evaluated to establish a broadly applicable relationship between errors in protein structure and errors in calculated binding affinities. Relatively accurate protein models (less than 2 Å RMSD within the binding site) demonstrate a 14.78 (±7.5)% deviation in binding affinity from that calculated by using the corresponding crystal structure. For structures of 2–3 Å, 3–4 Å, and >4 Å RMSD within the binding site, the error in calculated binding affinity increases to 20.8 (±5.98), 22.79 (±11.3), and 29.43 (±11.47)%, respectively. The results described here may be used in combination with other tools to evaluate the utility of modeled protein structures for drug development or other ligand‐binding studies. Proteins 2010. © 2010 Wiley‐Liss, Inc.

16.
DISGEO is a new implementation of a distance geometry algorithm which has been specialized for the calculation of macromolecular conformation from distance measurements obtained by two-dimensional nuclear Overhauser enhancement spectroscopy. The improvements include (1) a decomposition of the complete embedding process into two successive, more tractable calculations by the use of “substructures”, (2) a compact data structure for storing incomplete distance information on a structure, (3) a more efficient shortest-path algorithm for computing the triangle inequality limits on all distances from this information, (4) a new algorithm for selecting random metric spaces from within these limits, and (5) the use of chirality constraints to obtain good covalent geometry without the use of ad hoc weights or excessive optimization. The utility of the resultant program with nuclear magnetic resonance data is demonstrated by embedding complete spatial structures for the protein basic pancreatic trypsin inhibitor vs all 508 intramolecular, interresidue proton-proton contacts shorter than 4.0 Å that were present in its crystal structure. The crystal structure could be reproduced from this data set to within 1.3 Å minimum root mean square coordinate difference between the backbone atoms. We conclude that the information potentially available from nuclear magnetic resonance experiments in solution is sufficient to define the spatial structure of small proteins.
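
Item (3) refers to computing triangle inequality limits with a shortest-path algorithm. The sketch below shows the idea for upper bounds only, using plain Floyd-Warshall. This is a textbook stand-in, not the more efficient algorithm the abstract claims for DISGEO, and lower-bound smoothing is omitted. The bounds in the example are invented.

```python
INF = float("inf")


def smooth_upper_bounds(upper):
    """Tighten upper distance bounds so that u[i][j] <= u[i][k] + u[k][j] for all k."""
    n = len(upper)
    u = [row[:] for row in upper]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if u[i][k] + u[k][j] < u[i][j]:
                    u[i][j] = u[i][k] + u[k][j]
    return u


# Three atoms: 0-1 and 1-2 are bonded (upper bound 1.5 Å), 0-2 is unknown.
bounds = [
    [0.0, 1.5, INF],
    [1.5, 0.0, 1.5],
    [INF, 1.5, 0.0],
]
print(smooth_upper_bounds(bounds))
```

After smoothing, the unknown 0-2 distance is bounded by the sum of the two known bounds, which is how sparse distance data constrain every pair of atoms.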

17.
We describe the development of a scoring function based on the decomposition P(structure | sequence) ∝ P(sequence | structure) · P(structure), which outperforms previous scoring functions in correctly identifying native-like protein structures in large ensembles of compact decoys. The first term captures sequence-dependent features of protein structures, such as the burial of hydrophobic residues in the core; the second term captures universal sequence-independent features, such as the assembly of beta-strands into beta-sheets. The efficacies of a wide variety of sequence-dependent and sequence-independent features of protein structures for recognizing native-like structures were systematically evaluated using ensembles of approximately 30,000 compact conformations with fixed secondary structure for each of 17 small protein domains. The best results were obtained using a core scoring function with P(sequence | structure) parameterized similarly to our previous work (Simons et al., J Mol Biol 1997;268:209-225) and P(structure) focused on secondary structure packing preferences; while several additional features had some discriminatory power on their own, they did not provide any additional discriminatory power when combined with the core scoring function. Our results, on both the training set and the independent decoy set of Park and Levitt (J Mol Biol 1996;258:367-392), suggest that this scoring function should contribute to the prediction of tertiary structure from knowledge of sequence and secondary structure.
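
The Bayesian decomposition in this abstract turns into a sum once logarithms are taken, so ranking decoys reduces to adding two log-probability terms. The sketch below assumes those two terms are already available for each decoy; the decoy names and values are hypothetical.

```python
def log_posterior_score(log_p_seq_given_struct, log_p_struct):
    """log P(structure | sequence) up to an additive constant."""
    return log_p_seq_given_struct + log_p_struct


# Hypothetical decoys: (log P(sequence | structure), log P(structure)).
decoys = {
    "decoy_a": (-120.0, -35.0),
    "decoy_b": (-118.0, -30.0),
    "decoy_c": (-110.0, -60.0),  # best burial term, but poor strand packing
}

ranked = sorted(decoys, key=lambda name: log_posterior_score(*decoys[name]), reverse=True)
print(ranked)
```

Here `decoy_c` has the best sequence-dependent term yet ranks last, because its sequence-independent term is poor. That is the kind of case in which the second term of the decomposition changes the ranking.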

18.
This is the second of two papers describing a method for the joint refinement of the structure of fluid bilayers using x-ray and neutron diffraction data. We showed in the first paper (Wiener, M. C., and S. H. White. 1990. Biophys. J. 59:162-173) that fluid bilayers generally consist of a nearly perfect lattice of thermally disordered unit cells and that the canonical resolution d/hmax is a measure of the widths of quasimolecular components represented by simple Gaussian functions. The thermal disorder makes possible a "composition space" representation in which the quasimolecular Gaussian distributions describe the number or probability of occupancy per unit length across the width of the bilayer of each component. This representation permits the joint refinement of neutron and x-ray lamellar diffraction data by means of a single quasimolecular structure that is fit simultaneously to both diffraction data sets. Scaling of each component by the appropriate neutron or x-ray scattering length maps the composition space profile to the appropriate scattering length space for comparison to experimental data. Other extensive properties, such as mass, can also be obtained by an appropriate scaling of the refined composition space structure. Based upon simple bilayer models involving crystal and liquid crystal structural information, we estimate that a fluid bilayer with hmax observed diffraction orders will be accurately represented by a structure with approximately hmax quasimolecular components. Strategies for assignment of quasimolecular components are demonstrated through detailed parsing of a phospholipid molecule based upon the one-dimensional projection of the crystal structure of dimyristoylphosphatidylcholine. Finally, we discuss in detail the number of experimental variables required for the composition space joint refinement. We find fluid bilayer structures to be marginally determined by the experimental data. 
The analysis of errors, which takes on particular importance under these circumstances, is also discussed.
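
The composition space idea (one set of Gaussian quasimolecular distributions, scaled by neutron or x-ray scattering lengths to give two profiles) can be sketched as below. The Gaussian is parameterized here by a standard deviation `sigma`, which is an assumption about convention. The component positions, widths, copy numbers and scattering lengths are placeholders in arbitrary units, not measured values.

```python
from math import exp, pi, sqrt


def component_density(z, center, sigma, count=1.0):
    """Occupancy per unit length at depth z for one Gaussian component."""
    return count / (sigma * sqrt(2.0 * pi)) * exp(-0.5 * ((z - center) / sigma) ** 2)


def scattering_profile(z, components, lengths):
    """Scale each component by its scattering length and sum the contributions."""
    return sum(
        lengths[name] * component_density(z, *params)
        for name, params in components.items()
    )


# Hypothetical components: name -> (center in Å, sigma in Å, copies per lipid).
COMPONENTS = {"methyl": (0.0, 3.0, 2.0), "phosphate": (20.0, 3.0, 1.0)}
# Placeholder scattering lengths in arbitrary units.
NEUTRON = {"methyl": -0.5, "phosphate": 2.7}
XRAY = {"methyl": 2.5, "phosphate": 13.0}

for z in (0.0, 10.0, 20.0):
    print(z, scattering_profile(z, COMPONENTS, NEUTRON), scattering_profile(z, COMPONENTS, XRAY))
```

Both profiles come from the same `COMPONENTS` dictionary, which is what lets a single structure be fitted to the two data sets simultaneously.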

19.
The analysis of amino acid coevolution has emerged as a practical method for protein structural modeling by providing structural contact information from alignments of amino acid sequences. In parallel, chemical cross-linking/mass spectrometry (XLMS) has gained attention as a universally applicable method for obtaining low-resolution distance constraints to model the quaternary arrangements of proteins, and more recently even protein tertiary structures. Here, we show that the structural information obtained by XLMS and coevolutionary analysis are effectively complementary: the distance constraints obtained by each method are almost exclusively associated with non-coincident pairs of residues, and modeling results obtained by the combination of both sets are improved relative to considering the same total number of constraints of a single type. The structural rationale behind the complementarity of the distance constraints is discussed and illustrated for a representative set of proteins with different sizes and folds.
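
One simple way to quantify the claim that the two constraint sets involve non-coincident residue pairs is to count shared pairs and compute a Jaccard index. The choice of the Jaccard index is an assumption; the abstract does not state which overlap measure the authors used. The residue numbers are invented.

```python
def pair_overlap(pairs_a, pairs_b):
    """Return (number of shared residue pairs, Jaccard index) for two constraint sets."""
    a = {frozenset(p) for p in pairs_a}  # frozenset makes (i, j) equal to (j, i)
    b = {frozenset(p) for p in pairs_b}
    shared = a & b
    return len(shared), len(shared) / len(a | b)


# Hypothetical residue pairs from cross-linking and from coevolution analysis.
xlms_pairs = [(10, 45), (12, 80), (33, 70)]
coevolution_pairs = [(45, 10), (5, 20), (60, 90)]
print(pair_overlap(xlms_pairs, coevolution_pairs))
```

A Jaccard index near zero would correspond to the "almost exclusively non-coincident" situation described in the abstract.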

20.
A new, efficient method for the assembly of protein tertiary structure from known, loosely encoded secondary structure restraints and sparse information about exact side chain contacts is proposed and evaluated. The method is based on a new, very simple method for the reduced modeling of protein structure and dynamics, where the protein is described as a lattice chain connecting side chain centers of mass rather than Cαs. The model has implicit built-in multibody correlations that simulate short- and long-range packing preferences, hydrogen bonding cooperativity and a mean force potential describing hydrophobic interactions. Due to the simplicity of the protein representation and definition of the model force field, the Monte Carlo algorithm is at least an order of magnitude faster than previously published Monte Carlo algorithms for structure assembly. In contrast to existing algorithms, the new method requires a smaller number of tertiary restraints for successful fold assembly; on average, one for every seven residues as compared to one for every four residues. For example, for smaller proteins such as the B domain of protein G, the resulting structures have a coordinate root mean square deviation (cRMSD), which is about 3 Å from the experimental structure; for myoglobin, structures whose backbone cRMSD is 4.3 Å are produced, and for a 247-residue TIM barrel, the cRMSD of the resulting folds is about 6 Å. As would be expected, increasing the number of tertiary restraints improves the accuracy of the assembled structures. The reliability and robustness of the new method should enable its routine application in model building protocols based on various (very sparse) experimentally derived structural restraints. Proteins 32:475–494, 1998. © 1998 Wiley-Liss, Inc.
