Similar articles
Found 20 similar articles (search time: 15 ms)
1.
PsiCSI is a highly accurate and automated method of assigning secondary structure from NMR data, which is a useful intermediate step in the determination of tertiary structures. The method combines information from chemical shifts and protein sequence using three layers of neural networks. Training and testing were performed on a suite of 92 proteins (9437 residues) with known secondary and tertiary structure. Using a stringent cross-validation procedure in which the target and homologous proteins were removed from the databases used for training the neural networks, an average 89% Q3 accuracy (per residue) was observed. This is an increase of 6.2% and 5.5% (representing 36% and 33% fewer errors) over methods that use chemical shifts (CSI) or sequence information (Psipred) alone. In addition, PsiCSI improves upon the translation of chemical shift information to secondary structure (Q3 = 87.4%) and is able to use sequence information as an effective substitute for sparse NMR data (Q3 = 86.9% without (13)C shifts and Q3 = 86.8% with only H(alpha) shifts available). Finally, errors made by PsiCSI almost exclusively involve the interchange of helix or strand with coil and not helix with strand (<2.5 occurrences per 10000 residues). The automation, increased accuracy, absence of gross errors, and robustness with regard to sparse data make PsiCSI ideal for high-throughput applications, and should improve the effectiveness of hybrid NMR/de novo structure determination methods. A Web server is available for users to submit data and have the assignment returned.
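The relationship between the quoted accuracy gains and the "fewer errors" figures can be checked with a short sketch. It assumes the 6.2% and 5.5% increases are percentage points, which puts the baselines at 82.8% and 83.5%; these baselines are derived here by subtraction and are not stated in the abstract. The function names are illustrative only.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Per-residue three-state accuracy: fraction of positions whose
    label (H = helix, E = strand, C = coil) matches the reference."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have equal length")
    if not observed:
        raise ValueError("label strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def relative_error_reduction(new_acc: float, old_acc: float) -> float:
    """Fraction of the old method's errors removed by the new method."""
    old_err = 1.0 - old_acc
    new_err = 1.0 - new_acc
    return (old_err - new_err) / old_err


# Assumed baselines: 89% minus 6.2 and 5.5 percentage points.
vs_csi = relative_error_reduction(0.89, 0.828)
vs_psipred = relative_error_reduction(0.89, 0.835)
print(f"{vs_csi:.0%} and {vs_psipred:.0%} fewer errors")
```

Under that assumption the error rate falls from 17.2% to 11% in one case and from 16.5% to 11% in the other, which is consistent with the 36% and 33% quoted above.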

2.
Protein structure prediction methods such as Rosetta search for the lowest energy conformation of the polypeptide chain. However, the experimentally observed native state is at a minimum of the free energy, rather than the energy. The neglect of the missing configurational entropy contribution to the free energy can be partially justified by the assumption that the entropies of alternative folded states, while very much less than unfolded states, are not too different from one another, and hence can be to a first approximation neglected when searching for the lowest free energy state. The shortcomings of current structure prediction methods may be due in part to the breakdown of this assumption. Particularly problematic are proteins with significant disordered regions which do not populate single low energy conformations even in the native state. We describe two approaches within the Rosetta structure modeling methodology for treating such regions. The first does not require advance knowledge of the regions likely to be disordered; instead these are identified by minimizing a simple free energy function used previously to model protein folding landscapes and transition states. In this model, residues can be either completely ordered or completely disordered; they are considered disordered if the gain in entropy outweighs the loss of favorable energetic interactions with the rest of the protein chain. The second approach requires identification in advance of the disordered regions either from sequence alone using for example the DISOPRED server or from experimental data such as NMR chemical shifts. During Rosetta structure prediction calculations the disordered regions make only unfavorable repulsive contributions to the total energy. We find that the second approach has greater practical utility and illustrate this with examples from de novo structure prediction, NMR structure calculation, and comparative modeling.

3.
Protein chemical shifts encode detailed structural information that is difficult and computationally costly to describe at a fundamental level. Statistical and machine learning approaches have been used to infer correlations between chemical shifts and secondary structure from experimental chemical shifts. These methods range from simple statistics such as the chemical shift index to complex methods using neural networks. Notwithstanding their higher accuracy, more complex approaches tend to obscure the relationship between secondary structure and chemical shift and often involve many parameters that need to be trained. We present hidden Markov models (HMMs) with Gaussian emission probabilities to model the dependence between protein chemical shifts and secondary structure. The continuous emission probabilities are modeled as conditional probabilities for a given amino acid and secondary structure type. Using these distributions as outputs of first- and second-order HMMs, we achieve a prediction accuracy of 82.3%, which is competitive with existing methods for predicting secondary structure from protein chemical shifts. Incorporation of sequence-based secondary structure prediction into our HMM improves the prediction accuracy to 84.0%. Our findings suggest that an HMM with correlated Gaussian distributions conditioned on the secondary structure provides an adequate generative model of chemical shifts. Proteins 2013; © 2012 Wiley Periodicals, Inc.
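As a rough illustration of the idea, the sketch below decodes a three-state, first-order HMM with one-dimensional Gaussian emissions using the Viterbi algorithm. It is a toy under stated assumptions, not the authors' model: the means, standard deviations and transition probability are hypothetical values chosen for illustration, a single shift per residue is used, and the emissions are conditioned on the state only (the abstract describes conditioning on amino acid type as well, with correlated Gaussians).

```python
import math

STATES = ("H", "E", "C")  # helix, strand, coil

# Hypothetical parameters for illustration only (not taken from the paper).
MEANS = {"H": 3.0, "E": -1.5, "C": 0.0}   # mean secondary shift, ppm
SIGMAS = {"H": 1.0, "E": 1.0, "C": 1.0}   # standard deviation, ppm
STAY = 0.8                                # probability of keeping the state


def log_gauss(x: float, mu: float, sigma: float) -> float:
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def log_trans(prev: str, cur: str) -> float:
    """Log transition probability; switches share the remaining mass equally."""
    return math.log(STAY if prev == cur else (1 - STAY) / (len(STATES) - 1))


def viterbi(shifts: list[float]) -> str:
    """Most probable state string for a sequence of per-residue shifts."""
    if not shifts:
        return ""
    # Uniform prior over the initial state.
    score = {s: -math.log(len(STATES)) + log_gauss(shifts[0], MEANS[s], SIGMAS[s])
             for s in STATES}
    backpointers = []
    for x in shifts[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_trans(p, s))
            new_score[s] = (score[best_prev] + log_trans(best_prev, s)
                            + log_gauss(x, MEANS[s], SIGMAS[s]))
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


print(viterbi([3.0] * 4 + [0.0] * 4 + [-1.5] * 4))
```

The transition term is what discourages isolated state changes; a second-order model, as mentioned in the abstract, would condition each transition on the two preceding states instead of one.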

4.
The peptide backbones of disordered proteins are routinely characterized by NMR with respect to transient structure and dynamics. Little experimental information is, however, available about the side chain conformations and how structure in the backbone affects the side chains. Methyl chemical shifts can in principle report the conformations of aliphatic side chains in disordered proteins and, to examine this, two model systems were chosen: the acid denatured state of acyl-CoA binding protein (ACBP) and the intrinsically disordered activation domain of the activator for thyroid hormone and retinoid receptors (ACTR). We find that small differences in the methyl carbon chemical shifts due to the γ-gauche effect may provide information about the side chain rotamer distributions. However, the effects of neighboring residues on the methyl group chemical shifts obscure the direct observation of the γ-gauche effect. To overcome this, we reference the chemical shifts to those in a more disordered state resulting in residue specific random coil chemical shifts. The (13)C secondary chemical shifts of the methyl groups of valine, leucine, and isoleucine show sequence specific effects, which allow a quantitative analysis of the ensemble of χ(2)-angles of especially leucine residues in disordered proteins. The changes in the rotamer distributions upon denaturation correlate to the changes upon helix induction by the co-solvent trifluoroethanol, suggesting that the side chain conformers are directly or indirectly related to formation of transient α-helices.

5.
A simple alternative method for obtaining "random coil" chemical shifts by intrinsic referencing using the protein's own peptide sequence is presented. These intrinsic random coil backbone shifts were then used to calculate secondary chemical shifts, which provide important information on the residual secondary structure elements in the acid-denatured state of an acyl-coenzyme A binding protein. This method reveals a clear correlation between the carbon secondary chemical shifts and the amide secondary chemical shifts 3-5 residues away in the primary sequence. These findings strongly suggest transient formation of short helix-like segments, and identify unique sequence segments important for protein folding.

6.
Accurate determination of protein secondary structure from chemical shift information is a key step in NMR tertiary structure determination. Relatively little work has been done on this subject. There needs to be a systematic investigation of algorithms that are (a) robust for large datasets; (b) easily extendable to (the dynamic) new databases; and (c) approaching the limit of accuracy. We introduce new approaches that use the k-nearest neighbor algorithm for the basic prediction and the BCJR algorithm to smooth the predictions and to combine predictions based on chemical shifts with those based on sequence information only. Our new system, SUCCES, improves on the accuracy of all existing methods on a large dataset of 805 proteins (at 86% Q(3) accuracy and at 92.6% accuracy when the boundary residues are ignored), and it is easily extendable to any new dataset without requiring any new training. The software is publicly available at http://monod.uwaterloo.ca/nmr/succes.
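The basic k-nearest neighbor step can be sketched as a majority vote over the closest reference residues in chemical shift space. This is a generic illustration, not the SUCCES implementation: the feature choice (two secondary shifts per residue), the Euclidean distance, the value of k and the tiny reference set are all assumptions made for the example, and the BCJR smoothing stage is not shown.

```python
import math
from collections import Counter

# Hypothetical reference set for illustration only:
# ((C-alpha secondary shift, H-alpha secondary shift) in ppm, label).
REFERENCE = [
    ((3.1, -0.40), "H"), ((2.8, -0.30), "H"), ((2.5, -0.35), "H"),
    ((-1.4, 0.50), "E"), ((-1.8, 0.40), "E"), ((-1.2, 0.45), "E"),
    ((0.1, 0.00), "C"), ((-0.2, 0.05), "C"), ((0.3, -0.05), "C"),
]


def knn_predict(query, reference=REFERENCE, k=3):
    """Label a residue by majority vote among its k nearest reference
    residues, using Euclidean distance between shift vectors."""
    if not 1 <= k <= len(reference):
        raise ValueError("k must be between 1 and the reference set size")
    nearest = sorted(reference, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict((2.9, -0.3)))
```

Because the predictor only looks up neighbors, adding new proteins to the reference set requires no retraining, which matches the extendability claim in the abstract.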

7.
1H NMR spectroscopy and UV absorption spectroscopy were used to study the ionization of the tyrosine phenol rings and the effect of ionizable groups on the chemical shifts of signals from protons in the side chains of several amino acid residues. The microenvironment of these residues was established by analysing the titration curves. The mutual orientation of two functionally important adjacent alpha-helical protein regions was determined in solution. The signals from methionine residues belonging to different regions of the secondary structure were assigned in the NMR spectrum. The results indicate that the spatial structure of the repressor is similar in solution and in the crystal. They confirm the model of cro repressor interaction with DNA that was proposed on the basis of X-ray diffraction data.

8.
Estimation of secondary structure in polypeptides is important for studying their structure, folding and dynamics. In NMR spectroscopy, such information is generally obtained after sequence specific resonance assignments are completed. We present here a new methodology for assignment of secondary structure type to spin systems in proteins directly from NMR spectra, without prior knowledge of resonance assignments. The methodology, named Combination of Shifts for Secondary Structure Identification in Proteins (CSSI-PRO), involves detection of specific linear combinations of backbone 1Hα and 13C′ chemical shifts in a two-dimensional (2D) NMR experiment based on G-matrix Fourier transform (GFT) NMR spectroscopy. Such linear combinations of shifts facilitate editing of residues belonging to α-helical/β-strand regions into distinct spectral regions nearly independent of the amino acid type, thereby allowing the estimation of overall secondary structure content of the protein. Comparison of the predicted secondary structure content with those estimated based on their respective 3D structures and/or the method of Chemical Shift Index for 237 proteins gives a correlation of more than 90% and an overall rmsd of 7.0%, which is comparable to other biophysical techniques used for structural characterization of proteins. Taken together, this methodology has a wide range of applications in NMR spectroscopy such as rapid protein structure determination, monitoring conformational changes in protein-folding/ligand-binding studies and automated resonance assignment. Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users.

9.
Protein structure determination using nuclear magnetic resonance (NMR) spectroscopy can be both time-consuming and labor intensive. Here we demonstrate how chemical shift threading can permit rapid, robust, and accurate protein structure determination using only chemical shift data. Threading is a relatively old bioinformatics technique that uses a combination of sequence information and predicted (or experimentally acquired) low-resolution structural data to generate high-resolution 3D protein structures. The key motivations behind using NMR chemical shifts for protein threading lie in the fact that they are easy to measure, they are available prior to 3D structure determination, and they contain vital structural information. The method we have developed uses not only sequence and chemical shift similarity but also chemical shift-derived secondary structure, shift-derived super-secondary structure, and shift-derived accessible surface area to generate a high quality protein structure regardless of the sequence similarity (or lack thereof) to a known structure already in the PDB. The method (called E-Thrifty) was found to be very fast (often < 10 min/structure) and to significantly outperform other shift-based or threading-based structure determination methods (in terms of top template model accuracy), with an average TM-score performance of 0.68 (vs. 0.50–0.62 for other methods). Coupled with recent developments in chemical shift refinement, these results suggest that protein structure determination, using only NMR chemical shifts, is becoming increasingly practical and reliable. E-Thrifty is available as a web server at http://ethrifty.ca.

10.
We present a method for analyzing the chemical shift database to yield information on nearest-neighbor effects on carbon-13 chemical shift values for alpha and beta carbons of amino acids in proteins. For each amino acid sequence XYZ, we define two correction factors, Delta(XY)s and Delta(YZ)s, representing the effects on (delta13Calpha - delta13Cbeta) for residue Y from the preceding residue (X) and the following residue (Z), where X, Y, and Z represent one of the 20 naturally occurring amino acids, Delta designates the change in value or the correction factor (in ppm), and s is an index standing for one of three "pseudo secondary structure states" derived from chemical shift dispersions, which we show represent residues in primarily alpha-helix, beta-strand, and non-alpha/beta (coil). The correction factors were obtained from maximum likelihood fitting of (delta13Calpha - delta13Cbeta) values from the chemical shifts of 651 proteins to a mixture of three Gaussians. These correction factors were derived strictly from the analysis of assigned chemical shifts, without regard to the three-dimensional structures of these proteins. The correction factors were found to differ according to the secondary structural environment of the central residue (deduced from the chemical shift distribution) as well as by different identities of the nearest neighboring residues in the sequence. The areas subsumed by the sequence-dependent chemical shift distributions report on the relative energies of the sequences in different pseudo secondary structural environments, and the positions of the peaks indicate the chemical shifts of lowest energy conformations. As such, these results have potential applications to the determination of dihedral angle restraints from chemical shifts for structure determination and to more accurate predictions of chemical shifts in proteins of known structure.
From a database of chemical shifts associated with well-defined three-dimensional structures, comparisons were made between DSSP designations derived from three-dimensional structure and pseudo secondary structure designations derived from nearest-neighbor corrected chemical shift analysis. The high level of agreement between the two approaches to classifying secondary structure provides a measure of confidence in this chemical shift-based approach to the analysis of protein structure.
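For orientation, a generic three-component Gaussian mixture of the kind referred to above can be written as follows. The symbols w_s, \mu_s and \sigma_s are notation introduced here, not taken from the abstract, and the way the nearest-neighbor correction factors enter the fitted model is not specified in the abstract, so they are left out of this sketch.

```latex
% x is the shift difference for the central residue Y
x = \delta\,{}^{13}\mathrm{C}^{\alpha} - \delta\,{}^{13}\mathrm{C}^{\beta},
\qquad
p(x) = \sum_{s=1}^{3} w_s \,
       \frac{1}{\sqrt{2\pi\sigma_s^{2}}}
       \exp\!\left( -\frac{(x-\mu_s)^{2}}{2\sigma_s^{2}} \right),
\qquad
\sum_{s=1}^{3} w_s = 1 .
```

In a maximum likelihood fit of this form, the weights w_s correspond to the areas under the three components and the means \mu_s to the peak positions, which are the two quantities the abstract interprets in terms of relative energies and lowest energy conformations.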

11.
We describe a computational method for the prediction of RNA secondary structure that uses a combination of free energy and comparative sequence analysis strategies. Using a homology-based sequence alignment as a starting point, all favorable pairings with respect to the Turner energy function are identified. Each potentially paired region within a multiple sequence alignment is scored using a function that combines both predicted free energy and sequence covariation with optimized weightings. High scoring regions are ranked and sequentially incorporated to define a growing secondary structure. Using a single set of optimized parameters, it is possible to accurately predict the foldings of several test RNAs defined previously by extensive phylogenetic and experimental data (including tRNA, 5S rRNA, SRP RNA, tmRNA, and 16S rRNA). The algorithm correctly predicts approximately 80% of the secondary structure. A range of parameters have been tested to define the minimal sequence information content required to accurately predict secondary structure and to assess the importance of individual terms in the prediction scheme. This analysis indicates that prediction accuracy most strongly depends upon covariational information and only weakly on the energetic terms. However, relatively few sequences prove sufficient to provide the covariational information required for an accurate prediction. Secondary structures can be accurately defined by alignments with as few as five sequences and predictions improve only moderately with the inclusion of additional sequences.

12.
The heat stable inhibitor of cAMP-dependent protein kinase (PKIalpha) contains both a nuclear export signal (NES) and a high affinity inhibitory region that is essential for inhibition of the catalytic subunit of the kinase. These functions are sequentially independent. Two-dimensional NMR spectroscopy was performed on uniformly [15N]-labeled PKIalpha to examine its structure free in solution. Seventy out of 75 residues were identified, and examination of the CαH chemical shifts revealed two regions of upfield chemical shifts characteristic of alpha-helices. When PKIalpha was fragmented into two functionally distinct peptides for study at higher concentrations, no significant alterations in chemical shifts or secondary structure were observed. The first ordered region, identified in PKIalpha (1-25), contains an alpha-helix from residues 1-13. This helix extends by one turn the helix observed in the crystal structure of a PKIalpha (5-24) peptide bound to the catalytic subunit. The second region of well-defined secondary structure, residues 35-47, overlaps with the nuclear export signal in the PKIalpha (26-75) fragment. This secondary structure consists of a helix with a hydrophobic face composed of Leu37, Leu41, and Leu44, followed by a flexible turn containing Ile46. These four residues are critical for nuclear export function. The remainder of the protein in solution appears relatively unstructured, and this lack of structure surrounding a few essential and well-defined signaling elements may be characteristic of a growing family of small regulatory proteins that interact with protein kinases.

13.
Eliezer D, Chung J, Dyson HJ, Wright PE. Biochemistry 2000, 39(11): 2894-2901
The partly folded state of apomyoglobin at pH 4 represents an excellent model for an obligatory kinetic folding intermediate. The structure and dynamics of this intermediate state have been extensively examined using NMR spectroscopy. Secondary chemical shifts, (1)H-(1)H NOEs, and amide proton temperature coefficients have been used to probe residual structure in the intermediate state, and NMR relaxation parameters T(1) and T(2) and {(1)H}-(15)N NOE have been analyzed using spectral densities to correlate motion of the polypeptide chain with these structural observations. A significant amount of helical structure remains in the pH 4 state, indicated by the secondary chemical shifts of the (13)C(alpha), (13)CO, (1)H(alpha), and (13)C(beta) nuclei, and the boundaries of this helical structure are confirmed by the locations of (1)H-(1)H NOEs. Hydrogen bonding in the structured regions is predominantly native-like according to the amide proton chemical shifts and their temperature dependence. The locations of the A, G, and H helix segments and the C-terminal part of the B helix are similar to those in native apomyoglobin, consistent with the early, complete protection of the amides of residues in these helices in quench-flow experiments. These results confirm the similarity of the equilibrium form of apoMb at pH 4 and the kinetic intermediate observed at short times in the quench-flow experiment. Flexibility in this structured core is severely curtailed compared with the remainder of the protein, as indicated by the analysis of the NMR relaxation parameters. Regions with relatively high values of J(0) and low values of J(750) correspond well with the A, B, G, and H helices, an indication that nanosecond time scale backbone fluctuations in these regions of the sequence are restricted. Other parts of the protein show much greater flexibility and much reduced secondary chemical shifts.
Nevertheless, several regions show evidence of the beginnings of helical structure, including stretches encompassing the C helix-CD loop, the boundary of the D and E helices, and the C-terminal half of the E helix. These regions are clearly not well-structured in the pH 4 state, unlike the A, B, G, and H helices, which form a native-like structured core. However, the proximity of this structured core most likely influences the region between the B and F helices, inducing at least transient helical structure.

14.
NMR chemical shifts in proteins depend strongly on local structure. The program TALOS establishes an empirical relation between 13C, 15N and 1H chemical shifts and backbone torsion angles ϕ and ψ (Cornilescu et al. J Biomol NMR 13 289–302, 1999). Extension of the original 20-protein database to 200 proteins increased the fraction of residues for which backbone angles could be predicted from 65 to 74%, while reducing the error rate from 3 to 2.5%. Addition of a two-layer neural network filter to the database fragment selection process forms the basis for a new program, TALOS+, which further enhances the prediction rate to 88.5%, without increasing the error rate. Excluding the 2.5% of residues for which TALOS+ makes predictions that strongly differ from those observed in the crystalline state, the accuracy of predicted ϕ and ψ angles equals ±13°. Large discrepancies between predictions and crystal structures are primarily limited to loop regions, and for the few cases where multiple X-ray structures are available such residues are often found in different states in the different structures. The TALOS+ output includes predictions for individual residues with missing chemical shifts, and the neural network component of the program also predicts secondary structure with good accuracy. Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users.

15.
16.
Dowd TL, Rosen JF, Li L, Gundberg CM. Biochemistry 2003, 42(25): 7769-7779
Structural information on osteocalcin or other noncollagenous bone proteins is very limited. We have solved the three-dimensional structure of calcium bound osteocalcin using (1)H 2D NMR techniques and proposed a mechanism for mineral binding. The protons in the 49 amino acid sequence were assigned using standard two-dimensional homonuclear NMR experiments. Distance constraints, dihedral angle constraints, hydrogen bonds, and (1)H and (13)C chemical shifts were all used to calculate a family of 13 structures. The tertiary structure of the protein consisted of an unstructured N terminus and a C-terminal loop (residues 16-49) formed by long-range hydrophobic interactions. Elements of secondary structure within residues 16-49 include type III turns (residues 20-25) and two alpha-helical regions (residues 27-35 and 41-44). The three Gla residues project from the same face of the helical turns and are surface exposed. The genetic algorithm-molecular dynamics simulation approach was used to place three calcium atoms on the NMR-derived structure. One calcium atom was coordinated by three side chain oxygen atoms, two from Asp30, and one from Gla24. The second calcium atom was coordinated to four oxygen atoms, two from the side chain of Gla24, and two from the side chain of Gla21. The third calcium atom was coordinated to two oxygen atoms of the side chain of Gla17. The best correlation of the distances between the uncoordinated Gla oxygen atoms is with the intercalcium distance of 9.43 Å in hydroxyapatite. The structure may provide further insight into the function of osteocalcin.

17.
The polymorphic structures of silk fibroins in the solid state were examined on the basis of a quantitative relationship between the 13C chemical shift and local structure in proteins. To determine this relationship, 13C chemical shift contour plots for C alpha and C beta carbons of Ala and Ser residues, and the C alpha chemical shift plot for Gly residues were prepared using atomic co-ordinates from the Protein Data Bank and 13C NMR chemical shift data in aqueous solution reported for 40 proteins. The 13C CP/MAS NMR chemical shifts of Ala, Ser and Gly residues of Bombyx mori silk fibroin in silk I and silk II forms were used along with 13C CP/MAS NMR chemical shifts of Ala residues of Samia cynthia ricini silk fibroin in beta-sheet and alpha-helix forms for the structure analyses of silk fibroins. The allowed regions in the 13C chemical shift contour plots for C alpha and C beta carbons of Ala and Ser residues for the structures in silk fibroins, i.e. Silk II, Silk I and alpha-helix, were determined using their 13C isotropic NMR chemical shifts in the solid state. There are two areas of the phi, psi map which satisfy the observed Silk I chemical shift data for both the C alpha and C beta carbons of Ala and Ser residues in the 13C chemical shift contour plots.

18.
Nuclear magnetic resonance (NMR) spectroscopy is a powerful technique for the study of the structure, dynamics, and folding of proteins in solution. It is particularly powerful when applied to dynamic or flexible systems, such as partially folded molten globule states of proteins, which are not usually amenable to X-ray crystallography. In this article, NMR methods suitable for the detailed characterisation of molten globule states are described. The specific method used to study the molten globule is determined by the quality of the NMR spectrum obtained. Molten globules are characterised by significant levels of secondary structure. Site-specific hydrogen-deuterium exchange experiments can be used to identify residues located in regions of secondary structure in the molten globule. If spectra characterised by sharp peaks are observed for the molten globule, then information about secondary structure can be obtained by analysis of (1)H(alpha), (13)C(alpha), (13)C(beta), and (13)CO chemical shifts; this can be supplemented by (15)N relaxation studies. For molten globules characterised by extremely broad peaks, (15)N-edited NMR experiments carried out in increasing concentrations of denaturants can be used to study the relative stabilities of different regions of structure. Examples of the application of these methods to the study of the low pH molten globule states of alpha-lactalbumin and apomyoglobin are presented.

19.
Fang Q, Shortle D. Proteins 2005, 60(1): 97-102
In the preceding article in this issue of Proteins, an empirical energy function consisting of 4 statistical potentials that quantify local side-chain-backbone and side-chain-side-chain interactions has been demonstrated to successfully identify the native conformations of short sequence fragments and the native structure within large sets of high-quality decoys. Because this energy function consists entirely of interactions between residues separated by fewer than 5 positions, it can be used at the earliest stage of ab initio structure prediction to enhance the efficiency of conformational search. In this article, protein fragments are generated de novo by recombining very short segments of protein structures (2, 4, or 6 residues), either selected at random or optimized with respect to this local energy function. When local energy is optimized in selected fragments, more efficient sampling of conformational space near the native conformation is consistently observed for 450 randomly selected single turn fragments, with turn lengths varying from 3 to 12 residues and all 4 combinations of flanking secondary structure. These results further demonstrate the energetic significance of local interactions in protein conformations. When used in combination with longer range energy functions, application of these potentials should lead to more accurate prediction of protein structure.

20.
A new program, TALOS-N, is introduced for predicting protein backbone torsion angles from NMR chemical shifts. The program relies far more extensively on the use of trained artificial neural networks than its predecessor, TALOS+. Validation on an independent set of proteins indicates that backbone torsion angles can be predicted for a larger, ≥90 % fraction of the residues, with an error rate smaller than ca 3.5 %, using an acceptance criterion that is nearly two-fold tighter than that used previously, and a root mean square difference between predicted and crystallographically observed (ϕ, ψ) torsion angles of ca 12°. TALOS-N also reports sidechain χ1 rotameric states for about 50 % of the residues, and a consistency with reference structures of 89 %. The program includes a neural network trained to identify secondary structure from residue sequence and chemical shifts.
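A root mean square difference between predicted and observed torsion angles has to account for the 360° periodicity of angles, otherwise 179° and -179° would appear to differ by 358° instead of 2°. The sketch below shows one common way to do this; it is a generic illustration with hypothetical function names and example values, and the abstract does not say how TALOS-N computes its figure.

```python
import math


def angle_diff(a: float, b: float) -> float:
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def torsion_rmsd(predicted: list[float], observed: list[float]) -> float:
    """Root mean square of the wrapped angular differences, in degrees."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("angle lists must be non-empty and of equal length")
    squares = [angle_diff(p, o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squares) / len(squares))


# Hypothetical angles for illustration only.
print(torsion_rmsd([-60.0, 179.0], [-63.0, -179.0]))
```

In practice ϕ and ψ would either be pooled into one list or evaluated separately, depending on which statistic is being reported.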
