Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
The use of generous distance bounds has been the hallmark of NMR structure determination. However, bounds necessitate the estimation of data quality before the calculation, reduce the information content, introduce human bias, and allow for major errors in the structures. Here, we propose a new rapid structure calculation scheme based on Bayesian analysis. The minimization of an extended energy function, including a new type of distance restraint and a term depending on the data quality, results in an estimation of the data quality in addition to coordinates. This allows for the determination of the optimal weight on the experimental information. The resulting structures are of better quality and closer to the X-ray crystal structure of the same molecule. With the new calculation approach, the analysis of discrepancies from the target distances becomes meaningful. The strategy may be useful in other applications, for example in homology modeling.

2.
The existence of a large number of proteins for which both nuclear magnetic resonance (NMR) and X-ray crystallographic coordinates have been deposited into the Protein Data Bank (PDB) makes the statistical comparison of the corresponding crystal and NMR structural models over a large data set possible, and facilitates the study of the effect of the crystal environment and other factors on structure. We present an approach for detecting statistically significant structural differences between crystal and NMR structural models which is based on structural superposition and the analysis of the distributions of atomic positions relative to a mean structure. We apply this to a set of 148 protein structure pairs (crystal vs NMR), and analyze the results in terms of methodological and physical sources of structural difference. For every one of the 148 structure pairs, the backbone root-mean-square distance (RMSD) over core atoms of the crystal structure to the mean NMR structure is larger than the average RMSD of the members of the NMR ensemble to the mean, with 76% of the structure pairs having an RMSD of the crystal structure to the mean more than a factor of two larger than the average RMSD of the NMR ensemble. On average, the backbone RMSD over core atoms of the crystal structure to the mean NMR structure is approximately 1 Å. If non-core atoms are included, this increases to 1.4 Å due to the presence of variability in loops and similar regions of the protein. The observed structural differences are only weakly correlated with the age and quality of the structural model and differences in conditions under which the models were determined. We examine steric clashes when a putative crystalline lattice is constructed using a representative NMR structure, and find that repulsive crystal packing plays a minor role in the observed differences between crystal and NMR structures. The observed structural differences likely have a combination of physical and methodological causes. Stabilizing attractive interactions arising from intermolecular crystal contacts, which shift the equilibrium of the crystal structure relative to the NMR structure, are a likely physical source that can account for some of the observed differences. Methodological sources of apparent structural difference include insufficient sampling or other issues which could give rise to errors in the estimates of the precision and/or accuracy.
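The comparison described in this abstract (RMSD of the crystal structure to the mean NMR structure versus the average RMSD of the ensemble members to that mean) can be illustrated with a minimal sketch. It assumes all models are already superposed and list the same atoms in the same order; the function names are hypothetical and are not taken from the paper.

```python
import math

def mean_structure(ensemble):
    """Average each atom's coordinates over all models (assumed pre-superposed)."""
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(model[i][k] for model in ensemble) / n_models for k in range(3))
        for i in range(n_atoms)
    ]

def rmsd(a, b):
    """Root-mean-square distance between two equally sized coordinate lists."""
    sq = sum((p[k] - q[k]) ** 2 for p, q in zip(a, b) for k in range(3))
    return math.sqrt(sq / len(a))

def crystal_vs_ensemble(crystal, ensemble):
    """Return (RMSD of crystal to mean, average RMSD of ensemble members to mean)."""
    mean = mean_structure(ensemble)
    spread = sum(rmsd(model, mean) for model in ensemble) / len(ensemble)
    return rmsd(crystal, mean), spread
```

A ratio of the first value to the second above two corresponds to the "factor of two" criterion quoted in the abstract.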

3.
F K Brown  J C Hempel  P W Jeffs 《Proteins》1992,13(4):306-326
Structures of the protein, transforming growth factor alpha (TGF-alpha), have been derived from NMR data using distance geometry and subsequent energy refinement. Analysis of the sequential NOE distance bounds using a template algorithm provides a check for consistency in the calculation of bounds, stereospecific assignment of prochiral centers, and secondary structure assignment. Application of the template algorithm to the long range NOEs found within the NMR data sets collected at pH 6.3 and pH 3.4 is used to assess the confidence levels for the accuracy of the structures obtained from modeling. The method also provides critical insight in differentiating regions of the structure that are well defined from those that are not. Use of the restraint analysis protocol is shown to be a powerful adjunct to currently used methods for the assignment of protein structures from NMR data.

4.
Errors and imprecisions in distance restraints derived from NOESY peak volumes are usually accounted for by generous lower and upper bounds on the distances. In this paper, we propose a new form of distance restraints, replacing the subjective bounds by a potential function obtained from the error distribution of the distances. We derived the shape of the potential from molecular dynamics calculations and by comparison of NMR data with X-ray crystal structures. We used complete cross-validation to derive the optimal weight for the data in the calculation. In a model system with synthetic restraints, the accuracy of the structures improved significantly compared to calculations with the usual form of restraints. For experimental data sets, the structures systematically approach the X-ray crystal structures of the same protein. Standard quality indicators also improve compared to standard calculations. The results did not depend critically on the exact shape of the potential. The new approach is less subjective and uses fewer assumptions in the interpretation of NOESY peak volumes as distance restraints than the usual approach. Figures of merit for the structures, such as the RMS difference from the average structure or the RMS difference from the data, are therefore less biased and more meaningful measures of structure quality than with the usual form of restraints.
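To make the contrast in this abstract concrete, the sketch below compares a conventional flat-bottom restraint (zero penalty anywhere between the bounds) with a single-well potential centered on a target distance. The abstract does not give the functional form of the new potential, so the log-harmonic shape used here is purely an illustrative assumption, as are the function names and the force constant `k`.

```python
import math

def flat_bottom_energy(d, lower, upper, k=1.0):
    """Conventional bounds: no penalty inside [lower, upper], harmonic outside."""
    if d < lower:
        return k * (lower - d) ** 2
    if d > upper:
        return k * (d - upper) ** 2
    return 0.0

def log_harmonic_energy(d, target, k=1.0):
    """Assumed single-well form: penalizes any deviation from the target distance."""
    return k * math.log(d / target) ** 2
```

With the flat-bottom form, every distance inside the bounds scores identically, which is why deviations from the data carry little information; a single-well form makes the discrepancy from the target distance a meaningful quantity.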

5.
We present an automated method incorporated into a software package, FOLDER, to fold a protein sequence on a given three-dimensional (3D) template. Starting with the sequence alignment of a family of homologous proteins, tertiary structures are modeled using the known 3D structure of one member of the family as a template. Homologous interatomic distances from the template are used as constraints. For nonhomologous regions in the model protein, the lower and the upper bounds for the interatomic distances are imposed by steric constraints and the globular dimensions of the template, respectively. Distance geometry is used to embed an ensemble of structures consistent with these distance bounds. Structures are selected from this ensemble based on minimal distance error criteria, after a penalty function optimization step. These structures are then refined using energy optimization methods. The method is tested by simulating the alpha-chain of horse hemoglobin using the alpha-chain of human hemoglobin as the template and by comparing the generated models with the crystal structure of the alpha-chain of horse hemoglobin. We also test the packing efficiency of this method by reconstructing the atomic positions of the interior side chains beyond C beta atoms of a protein domain from a known 3D structure. In both test cases, models retain the template constraints and any additionally imposed constraints while the packing of the interior residues is optimized with no short contacts or bond deformations. To demonstrate the use of this method in simulating structures of proteins with nonhomologous disulfides, we construct a model of murine interleukin (IL)-4 using the NMR structure of human IL-4 as the template. The resulting geometry of the nonhomologous disulfide in the model structure for murine IL-4 is consistent with standard disulfide geometry.

6.
We propose a new approach for calculating the three-dimensional (3D) structure of a protein from distance and dihedral angle constraints derived from experimental data. We suggest that such constraints can be obtained from experiments such as tritium planigraphy, chemical or enzymatic cleavage of the polypeptide chain, paramagnetic perturbation of nuclear magnetic resonance (NMR) spectra, measurement of hydrogen-exchange rates, mutational studies, mass spectrometry, and electron paramagnetic resonance. These can be supplemented with constraints from theoretical prediction of secondary structures and of buried/exposed residues. We report here distance geometry calculations to generate the structures of a test protein Staphylococcal nuclease (STN), and the HIV-1 rev protein (REV) of unknown structure. From the available 3D atomic coordinates of STN, we set up simulated data sets consisting of varying number and quality of constraints, and used our group's Self Correcting Distance Geometry (SECODG) program DIAMOD to generate structures. We could generate the correct tertiary fold from qualitative (approximate) as well as precise distance constraints. The root mean square deviations of backbone atoms from the native structure were in the range of 2.0 Å to 8.3 Å, depending on the number of constraints used. We could also generate the correct fold starting from a subset of atoms that are on the surface and those that are buried. When we used data sets containing a small fraction of incorrect distance constraints, the SECODG technique was able to detect and correct them. In the case of REV, we used a combination of constraints obtained from mutagenic data and structure predictions. DIAMOD generated helix-loop-helix models, which, after four self-correcting cycles, populated one family exclusively. The features of the energy-minimized model are consistent with the available data on REV-RNA interaction. Our method could thus be an attractive alternative for calculating protein 3D structures, especially in cases where the traditional methods of X-ray crystallography and multidimensional NMR spectroscopy have been unsuccessful.

7.
We present an efficient new algorithm that enumerates all possible conformations of a protein that satisfy a given set of distance restraints. Rapid growth of all possible self-avoiding conformations on the diamond lattice provides construction of alpha-carbon representations of a protein fold. We investigated the dependence of the number of conformations on pairwise distance restraints for the proteins crambin, pancreatic trypsin inhibitor, and ubiquitin. Knowledge of between one and two contacts per monomer is shown to be sufficient to restrict the number of candidate structures to approximately 1,000 conformations. Pairwise RMS deviations of atomic position comparisons between pairs of these 1,000 structures revealed that these conformations can be grouped into about 25 families of structures. These results suggest a new approach to assessing alternative protein folds given a very limited number of distance restraints. Such restraints are available from several experimental techniques such as NMR, NOESY, energy transfer fluorescence spectroscopy, and crosslinking experiments. This work focuses on exhaustive enumeration of protein structures with emphasis on the possible use of NOESY-determined distance restraints.

8.
Summary. A distance measure that reflects the dissimilarity among structures has been developed on the basis of the three-dimensional structures of similar proteins, this being totally independent of sequence in the sense that only the relative spatial positions of mainchain alpha-carbon atoms need be known. This procedure leads to phyletic relationships that are in general correlated with the sequence phylogenies based on residue type. Such relationships among known protein three-dimensional structures are also a useful aid to their classification and selection in knowledge-based modeling using homologous structures. We have applied this approach to six homologous sets of proteins: immunoglobulin fragments, globins, cytochromes c, serine proteinases, eye-lens gamma crystallins, and dinucleotide-binding domains.

9.
We propose a new geometric buildup algorithm for the solution of the distance geometry problem in protein modeling, which successfully prevents the accumulation of rounding errors in the buildup calculations and also tolerates small errors in the given distances. In this algorithm, we use all available distances, rather than a subset, for the determination of each unknown atom and obtain the position of the atom by using a least-squares approximation instead of an exact solution to the system of distance equations. We show that the least-squares approximation can be obtained by using a special singular value decomposition method, which not only tolerates and minimizes small distance errors, but also effectively prevents rounding errors from propagating, especially when the distance data is sparse. We describe the least-squares formulations and their solution methods, and present the test results from applying the new algorithm for the determination of a set of protein structures with varying degrees of availability and accuracy of the distances. We show that the new development of the algorithm increases the modeling ability, and improves stability and robustness of the geometric buildup approach significantly from both theoretical and practical points of view.
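The single buildup step described here, placing one unknown atom from its distances to atoms whose positions are already known, can be sketched as a linear least-squares problem. This is a standard linearization (subtracting the distance equation of a reference atom from the others) solved with NumPy's SVD-based `lstsq`; it is an assumption for illustration and not necessarily the "special singular value decomposition method" of the paper. The function name `place_atom` is hypothetical.

```python
import numpy as np

def place_atom(known_xyz, distances):
    """Least-squares position of an atom from distances to >= 4 known atoms.

    Subtracting the equation |x - x_0|^2 = d_0^2 from each |x - x_i|^2 = d_i^2
    gives the linear system 2 (x_0 - x_i) . x = d_i^2 - d_0^2 - |x_i|^2 + |x_0|^2.
    """
    known = np.asarray(known_xyz, dtype=float)
    d = np.asarray(distances, dtype=float)
    ref, rest = known[0], known[1:]
    A = 2.0 * (ref - rest)
    b = d[1:] ** 2 - d[0] ** 2 - np.sum(rest ** 2, axis=1) + np.sum(ref ** 2)
    # lstsq uses all available equations and tolerates small distance errors.
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```

Using every available distance makes the system overdetermined, so small errors in individual distances are averaged out rather than propagated to later atoms.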

10.
A fundamental problem in molecular biology is the determination of the conformation of macromolecules from NMR data. Several successful distance geometry programs have been developed for this purpose, for example DISGEO. A particularly difficult facet of these programs is the embedding problem, that is, the problem of determining those conformations whose distances between atoms are nearest those measured by the NMR techniques. The embedding problem is the distance geometry equivalent of the multiple minima problem, which arises in energy minimization approaches to conformation determination. We show that the distance geometry approach has some nice geometry not associated with other methods that allows one to prove detailed results with regard to the location of local minima. We exploit this geometry to develop some algorithms which are faster and find more minima than the algorithms presently used. The authors were partially supported by National Science Foundation Grant CHE-8802341.
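For readers unfamiliar with embedding, the sketch below shows the textbook idealized case: recovering 3D coordinates from a complete, exact distance matrix by double-centering the squared distances and taking the three largest eigenpairs. This is only a sketch under those assumptions; it is not the algorithm of this paper or of DISGEO, which must cope with sparse and noisy distance bounds. The name `embed_3d` is hypothetical.

```python
import numpy as np

def embed_3d(dist):
    """Coordinates (up to rotation, reflection, translation) from a full distance matrix."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering      # inner products about the centroid
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:3]           # three largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale
```

With inexact distances the Gram matrix is no longer exactly rank three, and the choice among nearby conformations becomes the multiple-minima-like problem the abstract refers to.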

11.
A general approach to the problem of molecular conformation is advanced. We describe a formalism that permits experimental and theoretical information to be incorporated into a set of upper and lower bounds on intramolecular distances. Structures (conformations) meeting these bounds can be readily generated and compared with each other. To illustrate the use of the method, we have employed a simple “firehose” model for protein folding to predict the long-range hydrophobic interactions in a small protein: pancreatic trypsin inhibitor. Models of this type lead to the proper hairpin turns and a reasonable set of long-range contacts for this protein. Application of the distance geometry method then yields backbone conformations with errors of 4–8 Å compared to the native structure. We discuss both the merits and shortcomings of the firehose model and the relation between distance geometry and energy minimization techniques.

12.
We consider the problem of identifying common three-dimensional substructures between proteins. Our method is based on comparing the shape of the alpha-carbon backbone structures of the proteins in order to find three-dimensional (3D) rigid motions that bring portions of the geometric structures into correspondence. We propose a geometric representation of protein backbone chains that is compact yet allows for similarity measures that are robust against noise and outliers. This representation encodes the structure of the backbone as a sequence of unit vectors, defined by each adjacent pair of alpha-carbons. We then define a measure of the similarity of two protein structures based on the root mean squared (RMS) distance between corresponding orientation vectors of the two proteins. Our measure has several advantages over measures that are commonly used for comparing protein shapes, such as the minimum RMS distance between the 3D positions of corresponding atoms in two proteins. A key advantage is that this new measure behaves well for identifying common substructures, in contrast with position-based measures where the nonmatching portions of the structure dominate the measure. At the same time, it avoids the quadratic space and computational difficulties associated with methods based on distance matrices and contact maps. We show applications of our approach to detecting common contiguous substructures in pairs of proteins, as well as the more difficult problem of identifying common protein domains (i.e., larger substructures that are not necessarily contiguous along the protein chain).
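The representation described above (one unit vector per adjacent pair of alpha-carbons, compared by RMS distance) can be sketched as follows. The sketch compares two chains of equal length in their given orientation; the search for the rigid motion that best aligns them, which the paper's method performs, is omitted. Function names are hypothetical.

```python
import math

def unit_vectors(ca_coords):
    """Unit vector from each alpha-carbon to the next one along the chain."""
    vectors = []
    for (x1, y1, z1), (x2, y2, z2) in zip(ca_coords, ca_coords[1:]):
        dx, dy, dz = x2 - x1, y2 - y1, z2 - z1
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        vectors.append((dx / norm, dy / norm, dz / norm))
    return vectors

def orientation_rms(ca_a, ca_b):
    """RMS distance between corresponding orientation vectors of two chains."""
    ua, ub = unit_vectors(ca_a), unit_vectors(ca_b)
    if len(ua) != len(ub):
        raise ValueError("chains must have the same number of alpha-carbons")
    sq = sum((p[k] - q[k]) ** 2 for p, q in zip(ua, ub) for k in range(3))
    return math.sqrt(sq / len(ua))
```

Because each vector depends only on a local pair of atoms, the measure is unaffected by translation, and a mismatch in one region does not inflate the score of a well-matched region elsewhere.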

13.

Background

The number of available structures of large multi-protein assemblies is quite small. Such structures provide phenomenal insights on the organization, mechanism of formation and functional properties of the assembly. Hence detailed analysis of such structures is highly rewarding. However, the common problem in such analyses is the low resolution of these structures. In recent times, a number of attempts that combine low resolution cryo-EM data with higher resolution structures determined using X-ray analysis or NMR or generated using comparative modeling have been reported. Even in such attempts, the best result one arrives at is a very coarse idea of the assembly structure in terms of a trace of the Cα atoms, which are modeled with modest accuracy.

Methodology/Principal Findings

In this paper, we first present an objective approach to identify potentially solvent exposed and buried residues solely from the position of Cα atoms and amino acid sequence using residue type-dependent thresholds for accessible surface areas of Cα. We extend the method further to recognize potential protein-protein interface residues.

Conclusion/ Significance

Our approach to identify buried and exposed residues solely from the positions of Cα atoms resulted in an accuracy of 84%, sensitivity of 83–89% and specificity of 67–94% while recognition of interfacial residues corresponded to an accuracy of 94%, sensitivity of 70–96% and specificity of 58–94%. Interestingly, detailed analysis of cases of mismatch between recognition of interface residues from Cα positions and all-atom models suggested that recognition of interfacial residues using Cα atoms only corresponds better with the intuitive notion of what an interfacial residue is. Our method should be useful in the objective analysis of structures of protein assemblies when only Cα positions are available as, for example, in the cases of integration of cryo-EM data and high resolution structures of the components of the assembly.
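The accuracy, sensitivity and specificity figures quoted above follow the usual confusion-matrix definitions. As a minimal sketch, assuming per-residue boolean labels (for example `True` for buried) from a Cα-only prediction and from an all-atom reference, they can be computed as below; the function name is hypothetical and no data from the paper is reproduced.

```python
def classification_metrics(predicted, actual):
    """Accuracy, sensitivity and specificity from paired boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # true positives
    tn = sum(1 for p, a in pairs if not p and not a)  # true negatives
    fp = sum(1 for p, a in pairs if p and not a)      # false positives
    fn = sum(1 for p, a in pairs if not p and a)      # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```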

14.
Several hydration models for peptides and proteins based on solvent accessible surface area have been proposed previously. We have evaluated some of these models as well as four new ones in the context of near-native conformations of a protein. In addition, we propose an empirical site-site distance-dependent correction that can be used in conjunction with any of these models. The set of near-native structures consisted of 39 conformations of bovine pancreatic trypsin inhibitor (BPTI) each of which was a local minimum of an empirical energy function (ECEPP) in the absence of solvent. Root-mean-square (rms) deviations from the crystallographically determined structure were in the following ranges: 1.06-1.94 Å for all heavy atoms, 0.77-1.36 Å for all backbone heavy atoms, 0.68-1.33 Å for all alpha-carbon atoms, and 1.41-2.72 Å for all side-chain heavy atoms. We have found that there is considerable variation among the solvent models when evaluated in terms of concordance between the solvation free energy and the rms deviations from the crystallographically determined conformation. The solvation model for which the best concordance (0.939) with the rms deviations of the C alpha atoms was found was derived from NMR coupling constants of peptides in water combined with an exponential site-site distance dependence of the potential of mean force. Our results indicate that solvation free energy parameters derived from nonpeptide free energies of hydration may not be transferable to peptides. Parameters derived from peptide and protein data may be more applicable to conformational analysis of proteins. A general approach to derive parameters for free energy of hydration from ensemble-averaged properties of peptides in solution is described.

15.
Charest G  Lavigne P 《Biopolymers》2006,81(3):202-214
We present a minimalist approach for the modeling of the three-dimensional structure of multistranded alpha-helical coiled coils. The approach is based on empirical principles introduced by F. H. C. Crick (F. H. C. Crick, Acta Crystallogr, 1953, Vol. 6, pp. 689-697). Crick hypothesized that keeping the distance between the residues at the interacting interface of alpha-helices constant would lead to supercoiling or the formation of a coiled coil through the knobs-into-holes mode of packing. We have implemented the latter hypothesis in a simulated annealing protocol in the simple form of interhelical distance restraints (two per heptad) between Calpha at the interfacial positions and. To demonstrate the authenticity of Crick's hypothesis and the precision and accuracy of our approach, we have modeled the crystal structures of six synthetic coiled coils in dimeric, trimeric, and tetrameric states. The mean root mean square deviations (RMSDs) between the backbone atoms of the ensemble of structures calculated and those of the corresponding geometric averages are always below 0.76 Å, indicating that our protocol has an excellent degree of convergence and precision. The RMSDs between the backbone atoms of each of the six geometric average structures and the backbone of the corresponding crystal structures all range between 0.43 and 0.95 Å, indicative of excellent accuracy and proving the authenticity of Crick's hypothesis. Moreover, without specifying any dihedral angles, we found that in 81% of the occurrences, the most populated conformer of the side chains at positions and in the ensembles calculated were identical to those observed in the crystal structures. This shows that our simple approach, which is the simplest reported so far, can generate accurate results for the backbone and side chains. Finally, as a test case for a wider application of our approach in the field of structural proteomics, we describe the successful modeling of the overall structure of SNARE and the organization of its interfacial ionic layer known to play an important functional role.

16.
We present two new databases of NMR-derived distance and dihedral angle restraints: the Database Of Converted Restraints (DOCR) and the Filtered Restraints Database (FRED). These databases currently correspond to 545 proteins with NMR structures deposited in the Protein Databank (PDB). The criteria for inclusion were that these should be unique, monomeric proteins with author-provided experimental NMR data and coordinates available from the PDB capable of being parsed and prepared in a consistent manner. The Wattos program was used to parse the files, and the CcpNmr FormatConverter program was used to prepare them semi-automatically. New modules, including a new implementation of Aqua in the BioMagResBank (BMRB) software Wattos were used to analyze the sets of distance restraints (DRs) for inconsistencies, redundancies, NOE completeness, classification and violations with respect to the original coordinates. Restraints that could not be associated with a known nomenclature were flagged. The coordinates of hydrogen atoms were recalculated from the positions of heavy atoms to allow for a full restraint analysis. The DOCR database contains restraint and coordinate data that is made consistent with each other and with IUPAC conventions. The FRED database is based on the DOCR data but is filtered for use by test calculation protocols and longitudinal analyses and validations. These two databases are available from websites of the BMRB and the Macromolecular Structure Database (MSD) in various formats: NMR-STAR, CCPN XML, and in formats suitable for direct use in the software packages CNS and CYANA. Supplementary material to this paper is available in electronic form at http://dx.doi.org/10.1007/s10858-005-2195-0. These authors contributed equally to this work.

17.
18.
Phylogenetic codon models are routinely used to characterize selective regimes in coding sequences. Their parametric design, however, is still a matter of debate, in particular concerning the question of how to account for differing nucleotide frequencies and substitution rates. This problem relates to the fact that nucleotide composition in protein-coding sequences is the result of the interactions between mutation and selection. In particular, because of the structure of the genetic code, the nucleotide composition differs between the three coding positions, with the third position showing a more extreme composition. Yet, phylogenetic codon models do not correctly capture this phenomenon and instead predict that the nucleotide composition should be the same for all three positions. Alternatively, some models allow for different nucleotide rates at the three positions, an approach conflating the effects of mutation and selection on nucleotide composition. In practice, it results in inaccurate estimation of the strength of selection. Conceptually, the problem comes from the fact that phylogenetic codon models do not correctly capture the fixation bias acting against the mutational pressure at the mutation–selection equilibrium. To address this problem and to more accurately identify mutation rates and selection strength, we present an improved codon modeling approach where the fixation rate is not seen as a scalar, but as a tensor. This approach gives an accurate representation of how mutation and selection oppose each other at equilibrium and yields a reliable estimate of the mutational process, while disentangling the mean fixation probabilities prevailing in different mutational directions.

19.
Background

A commonly recurring problem in structural protein studies is the determination of all heavy atom positions from the knowledge of the central α-carbon coordinates.

Results

We employ advances in virtual reality to address the problem. The outcome is a 3D visualisation based technique where all the heavy backbone and side chain atoms are treated on equal footing, in terms of the Cα coordinates. Each heavy atom is visualised on the surface of a two-sphere that is centered at another heavy backbone or side chain atom. In particular, the rotamers are visible as clusters that display a clear and strong dependence on the underlying backbone secondary structure.

Conclusions

We demonstrate that there is a clear interdependence between rotameric states and secondary structure. Our method easily detects those atoms in a crystallographic protein structure which are either outliers or have likely been misplaced, possibly due to radiation damage. Our approach forms a basis for the development of a new generation of visualization-based side chain construction, validation and refinement tools. The heavy atom positions are identified in a manner which accounts for the secondary structure environment, leading to improved accuracy.


20.
High-resolution solid-state NMR spectroscopy can provide structural information of proteins that cannot be studied by X-ray crystallography or solution NMR spectroscopy. Here we demonstrate that it is possible to determine a protein structure by solid-state NMR to a resolution comparable to that by solution NMR. Using an iterative assignment and structure calculation protocol, a large number of distance restraints was extracted from ¹H/¹H mixing experiments recorded on a single uniformly labeled sample under magic angle spinning conditions. The calculated structure has a coordinate precision of 0.6 Å and 1.3 Å for the backbone and side chain heavy atoms, respectively, and deviates from the structure observed in solution. The approach is expected to be applicable to larger systems enabling the determination of high-resolution structures of amyloid or membrane proteins.
