Similar literature
 Found 20 similar articles; search time 968 ms
1.
The problem of protein tertiary structure prediction from primary sequence can be separated into two subproblems: generation of a library of possible folds and specification of a best fold given the library. A distance geometry procedure based on random pairwise metrization with good sampling properties was used to generate a library of 500 possible structures for each of 11 small helical proteins. The input to distance geometry consisted of sets of restraints to enforce predicted helical secondary structure and a generic range of 5 to 11 Å between predicted contact residues on all pairs of helices. For each of the 11 targets, the resulting library contained structures with low RMSD versus the native structure. Near-native sampling was enhanced by at least three orders of magnitude compared to a random sampling of compact folds. All library members were scored with a combination of an all-atom distance-dependent function, a residue pair-potential, and a hydrophobicity function. In six of the 11 cases, the best-ranking fold was considered to be near native. Each library was also reduced to a final ab initio prediction via consensus distance geometry performed over the 50 best-ranking structures from the full set of 500. The consensus results were of generally higher quality, yielding six predictions within 6.5 Å of the native fold. These favorable predictions corresponded to those for which the correlation between the RMSD and the scoring function was highest. The advantage of the reported methodology is its extreme simplicity and potential for including other types of structural restraints.

2.
Errors and imprecisions in distance restraints derived from NOESY peak volumes are usually accounted for by generous lower and upper bounds on the distances. In this paper, we propose a new form of distance restraints, replacing the subjective bounds by a potential function obtained from the error distribution of the distances. We derived the shape of the potential from molecular dynamics calculations and by comparison of NMR data with X-ray crystal structures. We used complete cross-validation to derive the optimal weight for the data in the calculation. In a model system with synthetic restraints, the accuracy of the structures improved significantly compared to calculations with the usual form of restraints. For experimental data sets, the structures systematically approach the X-ray crystal structures of the same protein. Standard quality indicators also improve compared to standard calculations. The results did not depend critically on the exact shape of the potential. The new approach is less subjective and uses fewer assumptions in the interpretation of NOESY peak volumes as distance restraints than the usual approach. Figures of merit for the structures, such as the RMS difference from the average structure or the RMS difference from the data, are therefore less biased and more meaningful measures of structure quality than with the usual form of restraints.

3.
A low-resolution scoring function for the selection of native and near-native structures from a set of predicted structures for a given protein sequence has been developed. The scoring function, ProVal (Protein Validate), used several variables, each of which describes an aspect of protein structure for which the proximity to the native structure can be assessed quantitatively. Among the parameters included are a packing estimate, surface areas, and the contact order. A partial least squares for latent variables (PLS) model was built for each candidate set of the 28 decoy sets of structures generated for 22 different proteins using the described parameters as independent variables. The Cα RMS of the candidate structures versus the experimental structure was used as the dependent variable. The final generalized scoring function was an average of all models derived, ensuring that the function was not optimized for specific fold classes or method of structure generation of the candidate folds. The results show that the crystal structure was scored best in 64% of the 28 test sets and was clearly separated from the decoys in many examples. In all the other cases in which the crystal structure did not rank first, it ranked within the top 10%. Thus, although ProVal could not distinguish between predicted structures that were similar overall in fold quality due to its inherently low resolution, it can clearly be used as a primary filter to eliminate approximately 90% of fold candidates generated by current prediction methods from all-atom modeling and further evaluation. The correlation between the predicted and actual Cα RMS values varies considerably between the candidate fold sets.

4.
We present an efficient new algorithm that enumerates all possible conformations of a protein that satisfy a given set of distance restraints. Rapid growth of all possible self-avoiding conformations on the diamond lattice provides construction of alpha-carbon representations of a protein fold. We investigated the dependence of the number of conformations on pairwise distance restraints for the proteins crambin, pancreatic trypsin inhibitor, and ubiquitin. Knowledge of between one and two contacts per monomer is shown to be sufficient to restrict the number of candidate structures to approximately 1,000 conformations. Pairwise comparisons of RMS deviations in atomic positions among these 1,000 structures revealed that these conformations can be grouped into about 25 families of structures. These results suggest a new approach to assessing alternative protein folds given a very limited number of distance restraints. Such restraints are available from several experimental techniques such as NMR, NOESY, energy transfer fluorescence spectroscopy, and crosslinking experiments. This work focuses on exhaustive enumeration of protein structures with emphasis on the possible use of NOESY-determined distance restraints.
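The enumeration described in this abstract can be sketched as a depth-first growth of self-avoiding walks on the diamond lattice, pruning any partial chain as soon as it violates a distance restraint. This is only an illustration and not the authors' algorithm: the function name `enumerate_walks`, the restraint format (bounds on the squared distance in lattice units), and the pruning order are assumptions made for the example.

```python
# Bond vectors of the diamond lattice. Sites on the "even" sublattice step
# along EVEN_STEPS; sites on the "odd" sublattice step along the negatives.
EVEN_STEPS = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
ODD_STEPS = [(-x, -y, -z) for x, y, z in EVEN_STEPS]


def enumerate_walks(n_sites, restraints=None):
    """Yield every self-avoiding walk with n_sites sites, starting at the origin.

    restraints (hypothetical format) maps a pair of site indices (i, j) to
    (lower, upper) bounds on the squared distance between those sites.
    """
    restraints = restraints or {}
    # Index each restraint by its later site, so it is checked on placement.
    by_last = {}
    for (i, j), bounds in restraints.items():
        by_last.setdefault(max(i, j), []).append((min(i, j), bounds))

    def satisfied(walk):
        k = len(walk) - 1
        for other, (lower, upper) in by_last.get(k, []):
            d2 = sum((a - b) ** 2 for a, b in zip(walk[k], walk[other]))
            if not lower <= d2 <= upper:
                return False
        return True

    def grow(walk, occupied):
        if len(walk) == n_sites:
            yield tuple(walk)
            return
        # The sublattice alternates with every step along the chain.
        steps = EVEN_STEPS if (len(walk) - 1) % 2 == 0 else ODD_STEPS
        x, y, z = walk[-1]
        for dx, dy, dz in steps:
            site = (x + dx, y + dy, z + dz)
            if site in occupied:
                continue  # self-avoidance
            walk.append(site)
            occupied.add(site)
            if satisfied(walk):
                yield from grow(walk, occupied)
            occupied.remove(site)
            walk.pop()

    start = (0, 0, 0)
    yield from grow([start], {start})


unrestrained = sum(1 for _ in enumerate_walks(4))
restrained = sum(1 for _ in enumerate_walks(4, {(0, 3): (0, 11)}))
print(unrestrained, restrained)
```

In this toy case a single restraint on the end-to-end distance of a four-site chain already removes a third of the walks, which illustrates how a few contacts can shrink the candidate set.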

5.
A restrained least squares refinement of the solution structure of the double-stranded DNA undecamer 5'd(AAGTGTGACAT).5'd(ATGTCACACTT) comprising a portion of the specific target site of the cAMP receptor protein in the gal operon is presented. The structure is refined on the basis of both distance and planarity restraints, 2331 in all. The distance restraints comprise 150 interproton distances determined from pre-steady state nuclear Overhauser enhancement measurements and 2159 other interatomic distances derived from idealized geometry (i.e., distances between covalently bonded atoms, between atoms defining fixed bond angles, and between atoms defining hydrogen bonding in AT and GC base pairs). Two refinements were carried out and in both cases the final RMS difference between the experimental and calculated interproton distances was 0.2 Å. The difference between the two refined structures is small (overall RMS difference of 0.23 Å) and represents the error in the refined coordinates. Although the refined structures have an overall B-type conformation, there are large variations in many of the local conformational parameters, including backbone and glycosidic bond torsion angles, helical twist and propeller twist, and base roll and base tilt angles.

6.
There are several knowledge-based energy functions that can distinguish the native fold from a pool of grossly misfolded decoys for a given sequence of amino acids. These decoys, which are typically generated by mounting, or “threading”, the sequence onto the backbones of unrelated protein structures, tend to be non-compact and quite different from the native structure: the root-mean-squared (RMS) deviations from the native are commonly in the range of 15 to 20 Å. Effective energy functions should also demonstrate a similar recognition capability when presented with compact decoys that depart only slightly in conformation from the correct structure (i.e. those with RMS deviations of ∼5 Å or less). Recently, we developed a simple yet powerful method for native fold recognition based on the tendency for native folds to form hydrophobic cores. Our energy measure, which we call the hydrophobic fitness score, is challenged to recognize the native fold from 2000 near-native structures generated for each of five small monomeric proteins. First, 1000 conformations for each protein were generated by molecular dynamics simulation at room temperature. The average RMS deviation of this set of 5000 was 1.5 Å. A total of 323 decoys had energies lower than native; however, none of these had RMS deviations greater than 2 Å. Another 1000 structures were generated for each at high temperature, in which a greater range of conformational space was explored (4.3 Å average RMS deviation). Out of this set, only seven decoys were misrecognized. The hydrophobic fitness energy of a conformation is strongly dependent upon the RMS deviation. On average our potential yields energy values which are lowest for the population of structures generated at room temperature, intermediate for those produced at high temperature and highest for those constructed by threading methods. In general, the lowest energy decoy conformations have backbones very close to native structure. The possible utility of our method for screening backbone candidates for the purpose of modelling by side-chain packing optimization is discussed.

7.
One-dimensional (1D) structures of proteins such as secondary structure and contact number provide intuitive pictures to understand how the native three-dimensional (3D) structure of a protein is encoded in the amino acid sequence. However, it is still not clear whether a given set of 1D structures contains sufficient information for recovering the underlying 3D structure. Here we show that the 3D structure of a protein can be recovered from a set of three types of 1D structures, namely, secondary structure, contact number, and residue-wise contact order, which is introduced here for the first time. Using simulated annealing molecular dynamics simulations, the structures satisfying the given native 1D structural restraints were sought for 16 proteins of various structural classes and of sizes ranging from 56 to 146 residues. By selecting the structures best satisfying the restraints, all the proteins showed a coordinate RMS deviation of <4 Å from the native structure, and, for most of them, the deviation was even <2 Å. The present result opens new possibilities for protein structure prediction and for our understanding of the sequence-structure relationship.

8.
We describe a novel method to generate ensembles of conformations of the main-chain atoms [N, Cα, C, O, Cβ] for a sequence of amino acids within the context of a fixed protein framework. Each conformation satisfies fundamental stereo-chemical restraints such as idealized geometry, favorable phi/psi angles, and excluded volume. The ensembles include conformations both near and far from the native structure. Algorithms for effective conformational sampling and constant time overlap detection permit the generation of thousands of distinct conformations in minutes. Unlike previous approaches, our method samples dihedral angles from fine-grained phi/psi state sets, which we demonstrate is superior to exhaustive enumeration from coarse phi/psi sets. Applied to a large set of loop structures, our method consistently samples near-native conformations, averaging 0.4, 1.1, and 2.2 Å main-chain root-mean-square deviations for four-, eight-, and twelve-residue loops, respectively. The ensembles make ideal decoy sets to assess the discriminatory power of a selection method. Using these decoy sets, we conclude that quality of anchor geometry cannot reliably identify near-native conformations, though the selection results are comparable to previous loop prediction methods. In a subsequent study (de Bakker et al.: Proteins 2003;51:21-40), we demonstrate that the AMBER forcefield with the Generalized Born solvation model identifies near-native conformations significantly better than previous methods.

9.
The effect of internal dynamics on the accuracy of nuclear magnetic resonance (NMR) structures was studied in detail using model distance restraint sets (DRS) generated from a 6.6 nanosecond molecular dynamics trajectory of bovine pancreatic trypsin inhibitor. The model data included the effects of internal dynamics in a very realistic way. Structure calculations using different error estimates were performed with iterative removal of systematically violated restraints. The accuracy of each calculated structure was measured as the atomic root mean square (RMS) difference to the optimized average structure derived from the trajectory by structure factor refinement. Many of the distance restraints were derived from NOEs that were significantly affected by internal dynamics. Depending on the error bounds used, these distance restraints seriously distorted the structure, leading to deviations from the coordinate average of the dynamics trajectory even in rigid regions. Increasing error bounds uniformly for all distance restraints relieved the strain on the structures. However, the accuracy did not improve. Significant improvement of accuracy was obtained by identifying inconsistent restraints with violation analysis, and excluding them from the calculation. The highest accuracy was obtained by setting bounds rather tightly, and removing about a third of the restraints. The limiting accuracy for all backbone atoms was between 0.6 and 0.7 Å. Also, the precision of the structures increased with removal of inconsistent restraints, indicating that a high precision is not simply the consequence of tight error bounds but of the consistency of the DRS. The precision consistently overestimated the accuracy.
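One step of the violation analysis mentioned in this abstract might look like the sketch below: count how often each restraint is violated across an ensemble and drop those violated in a large fraction of structures. The abstract does not give the actual criterion, so the data layout, the function names, and the `min_fraction` and `tolerance` parameters are all hypothetical.

```python
import math


def violation_fraction(structures, restraints, tolerance=0.0):
    """Return, for each restraint, the fraction of structures violating it.

    structures: list of dicts mapping an atom name to (x, y, z) coordinates.
    restraints: list of (atom_a, atom_b, lower, upper) tuples.
    """
    fractions = []
    for atom_a, atom_b, lower, upper in restraints:
        violated = 0
        for coords in structures:
            d = math.dist(coords[atom_a], coords[atom_b])
            if d < lower - tolerance or d > upper + tolerance:
                violated += 1
        fractions.append(violated / len(structures))
    return fractions


def drop_systematic_violations(structures, restraints,
                               min_fraction=0.5, tolerance=0.0):
    """Keep only restraints violated in fewer than min_fraction of structures."""
    fractions = violation_fraction(structures, restraints, tolerance)
    return [r for r, f in zip(restraints, fractions) if f < min_fraction]
```

In an iterative protocol, the structures would be recalculated with the reduced restraint list and the analysis repeated until no restraint is systematically violated.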

10.
Feig M, Brooks CL. Proteins 2002, 49(2):232-245
Physical energy scoring functions based on implicit solvation models are tested by evaluating predictions from the most recent CASP4 competition. The best performing scoring functions are identified along with the best protocol for preparing structures before energies are evaluated. Ranking of structures with the best scoring functions is compared across CASP4 targets to establish when physical scoring functions can be expected to reliably distinguish structures that are most similar to the native fold in a set of misfolded or unfolded protein conformations. The results are used to interpret previous studies where scoring functions were tested on the standard decoy sets by Park, Levitt, and Baker. We show that the best physical scoring functions can be applied successfully in automated consensus scoring applications where a single best conformation has to be selected from a set of structures from different sources. Finally, the potential for better protein structure scoring functions is discussed with a suggestion for an empirically parameterized linear combination of energy components.

11.
A set of conformational restraints derived from nuclear magnetic resonance (n.m.r.) measurements on solutions of the basic pancreatic trypsin inhibitor (BPTI) was used as input for distance geometry calculations with the programs DISGEO and DISMAN. Five structures obtained with each of these algorithms were systematically compared among themselves and with the crystal structure of BPTI. It is clear that the protein architecture observed in single crystals of BPTI is largely preserved in aqueous solution, with local structural differences mainly confined to the protein surface. The results confirm that protein conformations determined in solution by combined use of n.m.r. and distance geometry are a consequence of the experimental data and do not depend significantly on the algorithm used for the structure determination. The data obtained further provide an illustration that long intramolecular distances in proteins, which are comparable with the radius of gyration, are defined with high precision by relatively imprecise nuclear Overhauser enhancement measurements of a large number of much shorter distances.

12.
A general approach to the problem of molecular conformation is advanced. We describe a formalism that permits experimental and theoretical information to be incorporated into a set of upper and lower bounds on intramolecular distances. Structures (conformations) meeting these bounds can be readily generated and compared with each other. To illustrate the use of the method, we have employed a simple “firehose” model for protein folding to predict the long-range hydrophobic interactions in a small protein: pancreatic trypsin inhibitor. Models of this type lead to the proper hairpin turns and a reasonable set of long-range contacts for this protein. Application of the distance geometry method then yields backbone conformations with errors of 4–8 Å compared to the native structure. We discuss both the merits and shortcomings of the firehose model and the relation between distance geometry and energy minimization techniques.
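A set of upper and lower distance bounds like the one described in this abstract is commonly tightened with the triangle inequality before structures are generated. The sketch below shows that bound-smoothing step in plain Python; it illustrates the general distance geometry technique and is not taken from the paper. The function name `smooth_bounds` and the nested-list matrix format are assumptions.

```python
def smooth_bounds(lower, upper):
    """Tighten symmetric distance-bound matrices using triangle inequalities.

    lower, upper: n x n nested lists with zeros on the diagonal.
    Returns new (lower, upper) matrices; raises ValueError if the bounds
    turn out to be inconsistent.
    """
    n = len(upper)
    lo = [list(row) for row in lower]
    up = [list(row) for row in upper]

    # Upper bounds: d_ij <= d_ik + d_kj (all-pairs shortest paths).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if up[i][k] + up[k][j] < up[i][j]:
                    up[i][j] = up[i][k] + up[k][j]

    # Lower bounds: d_ij >= d_ik - d_kj, using the smoothed upper bounds.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                candidate = max(lo[i][k] - up[k][j], lo[k][j] - up[i][k])
                if candidate > lo[i][j]:
                    lo[i][j] = candidate

    for i in range(n):
        for j in range(n):
            if lo[i][j] > up[i][j]:
                raise ValueError(f"inconsistent bounds for pair ({i}, {j})")
    return lo, up
```

For example, if points 0 and 1 are at most 2 apart and points 1 and 2 are at most 3 apart, an initial upper bound of 10 between points 0 and 2 is reduced to 5.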

13.
A distance-dependent atom-pair potential that treats long range and local interactions separately has been developed and optimized to distinguish native protein structures from sets of incorrect or decoy structures. Atoms are divided into 30 types based on chemical properties and relative position in the amino acid side-chains. Several parameters affecting the calculation and evaluation of this statistical potential, such as the reference state, the bin width, cutoff distances between pairs, and the number of residues separating the atom pairs, are adjusted to achieve the best discrimination. The native structure has the lowest energy for 39 of the 40 sets of original ROSETTA decoys (1000 structures per set) and 23 of the 25 improved decoys (approximately 1900 structures per set). Combined with the orientation-dependent backbone hydrogen bonding potential used by ROSETTA and a statistical solvation potential based on the solvent exclusion model of Lazaridis & Karplus, this potential is used as a scoring function for conformational search based on a genetic algorithm method. After unfolding the native structure by changing every phi and psi angle by either ±3, ±5, or ±7 degrees, five small proteins can be efficiently refolded, in some cases to within 0.5 Å Cα distance matrix error (DME) to the native state. Although no significant correlation is found between the total energy and structural similarity to the native state, a surprisingly strong correlation exists between the radius of gyration and the DME for low energy structures.
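The distance matrix error (DME) quoted in this abstract compares two conformations through their internal distances, so no superposition is needed. The sketch below uses a common definition, the root mean square difference over all pairs of corresponding inter-atomic distances; the abstract does not spell out its exact formula, so this definition and the function name are assumptions.

```python
import math


def distance_matrix_error(coords_a, coords_b):
    """RMS difference between corresponding pairwise distances.

    coords_a, coords_b: equal-length sequences of (x, y, z) tuples,
    for example the C-alpha positions of a model and of the native state.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same length")
    n = len(coords_a)
    if n < 2:
        raise ValueError("at least two points are required")
    total = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            d_a = math.dist(coords_a[i], coords_a[j])
            d_b = math.dist(coords_b[i], coords_b[j])
            total += (d_a - d_b) ** 2
    n_pairs = n * (n - 1) / 2
    return math.sqrt(total / n_pairs)
```

Because only internal distances enter, the value is unchanged by rotating or translating either structure. It is also unchanged by mirroring, so a DME of zero does not by itself distinguish a fold from its mirror image.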

14.
The solution structure of insectotoxin 15A (35 residues) from the scorpion Buthus eupeus was determined on the basis of 386 interproton distance restraints, 12 hydrogen-bonding restraints, and 113 dihedral angle restraints derived from 1H NMR experiments. A group of 20 structures was calculated with the distance geometry program DIANA followed by restrained energy minimization with the program CHARMM. The atomic RMS distribution about the mean coordinate position is 0.64 ± 0.11 Å for the backbone atoms and 1.35 ± 0.20 Å for all atoms. The structure contains an alpha-helix (residues 10-20) and a three-stranded antiparallel beta-sheet (residues 2-5, 24-28 and 29-33). A pairing of the eight cysteine residues of insectotoxin 15A was established on the basis of NMR data. Three disulfide bridges (residues 2-19, 16-31 and 20-33) connect the alpha-helix with the beta-sheet, and the fourth one (5-26) joins beta-strands together. The spatial fold of secondary structure elements (the alpha-helix and the beta-sheet) of insectotoxin 15A is very similar to those of the other short and long scorpion toxins in spite of a low (about 20%) sequence homology.

15.
Crippen GM. Proteins 2005, 60(1):82-89
Cluster distance geometry is a recent generalization of distance geometry whereby protein structures can be described at even lower levels of detail than one point per residue. With improvements in the clustering technique, protein conformations can be summarized in terms of alternative contact patterns between clusters, where each cluster contains four sequentially adjacent amino acid residues. A very simple potential function involving 210 adjustable parameters can be determined that favors the native contacts of 31 small, monomeric proteins over their respective sets of nonnative contacts. This potential then favors the native contacts for 174 small, monomeric proteins that have low sequence identity with any of the training set. A broader search finds 698 small protein chains from the Protein Data Bank where the native contacts are preferred over all alternatives, even though they have low sequence identity with the training set. This amounts to a highly predictive method for ab initio protein folding at low spatial resolution.

16.
Protein structure prediction from sequence alone by "brute force" random methods is a computationally expensive problem. Estimates have suggested that it could take all the computers in the world longer than the age of the universe to compute the structure of a single 200-residue protein. Here we investigate the use of a faster version of our FOLDTRAJ probabilistic all-atom protein-structure-sampling algorithm. We have improved the method so that it is now over twenty times faster than originally reported, and capable of rapidly sampling conformational space without lattices. It uses geometrical constraints and a Lennard-Jones type potential for self-avoidance. We have also implemented a novel method to add secondary structure-prediction information to make protein-like amounts of secondary structure in sampled structures. In a set of 100,000 probabilistic conformers of 1VII, 1ENH, and 1PMC generated, the structures with smallest Cα RMSD from native are 3.95, 5.12, and 5.95 Å, respectively. Expanding this test to a set of 17 distinct protein folds, we find that all-helical structures are "hit" by brute force more frequently than beta or mixed structures. For small helical proteins or very small non-helical ones, this approach should have a "hit" close enough to detect with a good scoring function in a pool of several million conformers. By fitting the distribution of RMSDs from the native state of each of the 17 sets of conformers to the extreme value distribution, we are able to estimate the size of conformational space for each. With a 0.5 Å RMSD cutoff, the number of conformers is roughly 2^N where N is the number of residues in the protein. This is smaller than previous estimates, indicating an average of only two possible conformations per residue when sterics are accounted for. Our method reduces the effective number of conformations available at each residue by probabilistic bias, without requiring any particular discretization of residue conformational space, and is the fastest method of its kind. With computer speeds doubling every 18 months and parallel and distributed computing becoming more practical, the brute force approach to protein structure prediction may yet have some hope in the near future.

17.
Modeling mutations in protein structures   Total citations: 2, self-citations: 0, citations by others: 2
We describe an automated method for the modeling of point mutations in protein structures. The protein is represented by all non-hydrogen atoms. The scoring function consists of several types of physical potential energy terms and homology-derived restraints. The optimization method implements a combination of conjugate gradient minimization and molecular dynamics with simulated annealing. The testing set consists of 717 pairs of known protein structures differing by a single mutation. Twelve variations of the scoring function were tested in three different environments of the mutated residue. The best-performing protocol optimizes all the atoms of the mutated residue, with respect to a scoring function that includes molecular mechanics energy terms for bond distances, angles, dihedral angles, peptide bond planarity, and non-bonded atomic contacts represented by a Lennard-Jones potential, dihedral angle restraints derived from the aligned homologous structure, and a statistical potential for non-bonded atomic interactions extracted from a large set of known protein structures. The current method compares favorably with other tested approaches, especially when predicting long and flexible side-chains. In addition to the thoroughness of the conformational search, sampled degrees of freedom, and the scoring function type, the accuracy of the method was also evaluated as a function of the flexibility of the mutated side-chain, the relative volume change of the mutated residue, and its residue type. The results suggest that further improvement is likely to be achieved by concentrating on the improvement of the scoring function, in addition to or instead of increasing the variety of sampled conformations.

18.
Alexandrescu AT. Proteins 2004, 56(1):117-129
Introductory biochemistry texts often note that the fold of a protein is completely defined when the dihedral angles phi and psi are known for each amino acid. This assertion was examined with torsion angle dynamics and simulated annealing (TAD/SA) calculations of protein G using only dihedral angle restraints. When all dihedral angles were restrained to within 1 degree of the values of the X-ray structure, the TAD/SA structures gave a backbone root mean square deviation to the target of 4 Å. Factors that contributed to divergence from the correct solution include deviations of peptide bonds from planarity, internal conflicts resulting from the nonuniform energies of different phi, psi combinations, and relaxation to extended conformations in the absence of long-range constraints. Simulations including hydrogen-bond restraints showed that even a few long-range contacts constrain the fold better than a complete set of accurate dihedral restraints. A procedure is described for TAD/SA calculations using hydrogen-bond restraints, idealized dihedral restraints for residues in regular secondary structures, and "hydrophobic distance restraints" derived from the positions of hydrophobic residues in the amino acid sequence. The hydrogen-bond restraints are treated as inviolable, whereas violated hydrophobic restraints are removed following reduction of restraint upper bounds from 2 to 1 times the predicted radius of gyration. The strategy was tested with simulated restraints from X-ray structures of proteins from different fold classes and NMR data for cold shock protein A that included only backbone chemical shifts and hydrogen bonds obtained from a long-range HNCO experiment.
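The dihedral angle restraints discussed in this abstract depend on computing a torsion angle from four atom positions and comparing it with a target value on a periodic scale. The sketch below shows one standard way to do both in plain Python; it is not the TAD/SA implementation, and the function names and the tolerance handling are assumptions.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond (cis = 0)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the two outer bonds onto the plane perpendicular to b1.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def within_restraint(angle, target, tolerance):
    """True if angle lies within tolerance degrees of target, modulo 360."""
    diff = (angle - target + 180.0) % 360.0 - 180.0
    return abs(diff) <= tolerance
```

The wrap-around in `within_restraint` matters near ±180 degrees: an angle of 179.5 degrees is within 1 degree of a target of -179.8 degrees even though the raw values differ by almost 360.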

19.
W H Graham, E S Carter, R P Hicks. Biopolymers 1992, 32(12):1755-1764
Proton and 13C chemical shift assignments are reported for the neuropeptide Met-enkephalin (ME) both in aqueous solution and in the presence of 50 mM sodium dodecyl sulfate (SDS). Rotating frame nuclear Overhauser enhancement spectroscopy was used to qualitatively describe interproton distances. These distances were then used as restraints in the distance geometry based molecular modeling program Dspace, developed by Hare Research, to generate sets of conformations of ME. The resulting aqueous solution conformations of ME were determined to exhibit characteristics of an extended random-coil polypeptide with no distinguishable secondary structure. The resulting set of solution conformations of ME in the presence of 50 mM SDS exhibited characteristics of an amphiphilic type IV beta turn that is stabilized by hydrophobic aromatic-aromatic interactions between the side chains of Tyr1 and Phe4.

20.
M J Sutcliffe, C M Dobson. Proteins 1991, 10(2):117-129
The effect of including paramagnetic relaxation data as additional restraints in the determination of protein tertiary structures from NMR data has been explored by a systematic series of model calculations. The system used for testing the method was the 2.0 Å resolution tetragonal crystal structure of hen egg white lysozyme (129 amino acid residues) and structures were generated using a version of the hybrid "distance geometry-dynamic simulated annealing" procedure. A limited set of 769 NOEs was used as restraints in all the calculations; the strengths of these were categorized into three classes on the basis of distances observed in the crystal structure. The values of 50 phi angles were also restrained on the basis of amide-alpha coupling constants calculated from the X-ray structure. Five sets of 12 structures were determined using differing sets of paramagnetic relaxation data as restraints additional to those involving the NOE and coupling constant data. The paramagnetic relaxation data were modeled on the basis of the distances of defined protons from the crystallographic binding site of Gd3+ in lysozyme. Analysis of the results showed that the relaxation data significantly improved the correspondence between the set of generated structures and the crystal structure, and that the better defined the relaxation data, the more significant the improvement in the quality of the structures. The results suggest that the inclusion of paramagnetic relaxation restraints could be of significant value for the experimental determination of protein structures from NMR data.
