Summary The parameters for HN chemical shift calculations of proteins have been determined using data from high-resolution crystal structures of 15 proteins. Employing these chemical shift calculations for HN protons, the observed secondary structure chemical shift trends of HN protons, i.e., upfield shifts on helix formation and downfield shifts on β-sheet formation, are discussed. Our calculations suggest that the main reason for the difference in NH chemical shifts in helices and sheets is not an effect from the directly hydrogen-bonded carbonyl, which gives rise to downfield shifts in both cases, but arises from an additional upfield shift predicted in helices and originating in residues i-2 and i-3. The calculations also explain the well-known relationship between amide proton shifts and hydrogen-bond lengths. In addition, the HN chemical shifts of the distorted amphipathic helices of the GCN4 leucine zipper are calculated and used to characterise the solution structure of the helices. By comparing the calculated and experimental shifts, it is shown that in general the agreement is good between residues 15 and 28. The most interesting observation is that in the N-terminal half of the zipper, although both calculated and experimental shifts show clear periodicity, they are no longer in phase. This suggests that for the N-terminal half, in the true average solution structure the period of the helix coil is longer by roughly one residue compared to the NMR structures.

Summary Computation of the 13C chemical shifts (or shieldings) of glycine, alanine and valine residues in bovine and Drosophila calmodulins and Staphylococcal nuclease, and comparison with experimental values, is reported using a gauge-including atomic orbital quantum-chemical approach. The full 24 ppm shielding range is reproduced (overall r.m.s.d.=1.4 ppm) using optimized protein structures, corrected for bond-length/bond-angle errors, and rovibrational effects.

Summary An empirical correlation between the peptide 15N chemical shift of residue i and the backbone torsion angles φ(i) and ψ(i–1) is reported. By using two-dimensional shielding surfaces over (φ(i), ψ(i–1)), it is possible in many cases to make reasonably accurate predictions of 15N chemical shifts for a given structure. On average, the rms error between experiment and prediction is about 3.5 ppm. Results for threonine, valine and isoleucine are worse (4.8 ppm), due presumably to χ1-distribution/γ-gauche effects. The rms errors for the other amino acids are 3 ppm, for a typical maximal chemical shift range of 15–20 ppm. Thus, there is a significant correlation between 15N chemical shift and secondary structure.

Random coil chemical shifts are commonly used to detect protein secondary structural elements in chemical shift index (CSI) calculations. Though this technique is widely used and seems reliable for folded proteins, the choice of reference random coil chemical shift values can significantly alter the outcome of secondary structure estimation. In order to evaluate these effects, we present a comparison of secondary structure content calculated using CSI, based on five different reference random coil chemical shift value sets, to that derived from three-dimensional structures. Our results show that none of the reference random coil data sets chosen for evaluation fully reproduces the actual secondary structures. Among the reference values generally available to date, most tend to be good estimators only of helices. Based on our evaluation, we recommend the experimental values measured by Schwarzinger et al. (2000), and statistical values obtained by Lukin et al. (1997), as good estimators of both helical and sheet content.

Chemical shifts of amino acids in proteins are the most sensitive and easily obtainable NMR parameters that reflect the primary, secondary, and tertiary structures of the protein. In recent years, chemical shifts have been used to identify secondary structure in peptides and proteins, and it has been confirmed that 1Hα, 13Cα, 13Cβ, and 13C′ NMR chemical shifts for all 20 amino acids are sensitive to their secondary structure. Currently, most methods identify protein secondary structure purely from one-dimensional statistical analyses of various chemical shifts for each residue. However, increased accuracy can be achieved from two-dimensional analyses of these chemical shifts. The 2DCSi approach performs two-dimensional cluster analyses of 1Hα, 1HN, 13Cα, 13Cβ, 13C′, and 15NH chemical shifts to identify protein secondary structure and the redox state of cysteine residues. For the analysis of paired chemical shifts from the 6 data sets, each of the 20 amino acids has its own 15 two-dimensional cluster scatter diagrams. Accordingly, the probabilities for identifying helix and extended structure were calculated by using our scoring matrix. Compared with existing chemical shift-based methods, it appears to improve the prediction accuracy of secondary structure identification, particularly for extended structure. In addition, the probability of a given residue being in a helix or extended structure is displayed, allowing users to make their own decisions. Grant sponsor: National Science Council of ROC; Grant numbers: NSC-94-2323-B006-001, NSC-93-2212-E-006.

The algorithm PLATON is able to assign sets of chemical shifts derived from a single residue to amino acid types with their secondary structure (amino acid species). A subsequent ranking procedure, optionally using two different penalty functions, yields predictions for possible amino acid species for the given set of chemical shifts. This was demonstrated in the case of the α-spectrin SH3 domain and applied to 9 further protein data sets taken from the BioMagRes database. A database consisting of reference chemical shift patterns (reference CSPs) was generated from assigned chemical shifts of proteins with known 3D-structure. This reference CSP database is used in our approach for extracting distributions of amino acid types with their most likely secondary structure elements (namely α-helix, β-sheet, and coil) for single amino acids by comparison with query CSPs. Results obtained for the 10 investigated proteins indicate that the percentage of correct amino acid species in the first three positions in the ranking list ranges from 71.4% to 93.2% for the more favorable penalty function. Where only the top result of the ranking list for these 10 proteins is considered, 36.5% to 83.1% of the amino acid species are correctly predicted. The main advantage of our approach over other methods that rely on average chemical shift values is the ability to increase database content by incorporating newly derived CSPs, and therefore to improve PLATON's performance over time.

The recognition of protein folds is an important step in the prediction of protein structure and function. Recently, an increasing number of researchers have sought to improve the methods for protein fold recognition. Following the construction of a dataset consisting of 27 protein fold classes by Ding and Dubchak in 2001, prediction algorithms, parameters and the construction of new datasets have improved for the prediction of protein folds. In this study, we reorganized a dataset consisting of 76 fold classes constructed by Liu et al. and used the values of the increment of diversity, average chemical shifts of secondary structure elements and secondary structure motifs as feature parameters in the recognition of multi-class protein folds. With the combined feature vector as the input parameter for the Random Forests algorithm and ensemble classification strategy, we propose a novel method to identify the 76 protein fold classes. The overall accuracy of the test dataset using an independent test was 66.69%; when the training and test sets were combined, with 5-fold cross-validation, the overall accuracy was 73.43%. This method was further used to predict the test dataset and the corresponding structural classification of the first 27-protein fold class dataset, resulting in overall accuracies of 79.66% and 93.40%, respectively. Moreover, when the training set and test sets were combined, the accuracy using 5-fold cross-validation was 81.21%. Additionally, this approach resulted in improved prediction results using the 27-protein fold class dataset constructed by Ding and Dubchak.

The reliability of 1H chemical shift calculations for DNA is assessed by comparing the experimental and calculated chemical shifts of a reasonably large number of independently determined DNA structures. The calculated chemical shifts are based on semiempirical relations derived by Giessner-Prettre and Pullman [(1987) Q. Rev. Biophys., 20, 113–172]. The standard deviation between calculated and observed chemical shifts is found to be quite small, i.e. 0.17 ppm. This high accuracy, which is achieved without parameter adjustment, makes it possible to analyze the structural dependencies of chemical shifts in a reliable fashion. The conformation-dependent 1H chemical shift is mainly determined by the ring current effect and the local magnetic anisotropy, while the third possible effect, that of the electric field, is surprisingly small. It was further found that for a double helical environment, the chemical shift of the sugar protons, H2′ to H5″, is mainly affected by the ring current and magnetic anisotropy of their own base. Consequently, the chemical shift of these sugar protons is determined by two factors, namely the type of base to which the sugar ring is attached, C, T, A, or G, and secondly by the χ-angle. In particular, the H2′ shift varies strongly with the χ-angle, and strong upfield H2′ shifts directly indicate that the χ-angle is in the syn domain. The H1′ shift is not only strongly affected by its own base, but also by its 3′-neighboring base. On the other hand, base protons, in particular H5 of cytosine and methyl protons of thymine, are affected mainly by the 5′-neighboring bases, although some effect (0.2 ppm) stems from the 3′-neighboring base. The H2 protons are mainly affected by the 3′-neighboring base. As a result of these findings a simple scheme is proposed for sequential assignment of resonances from B-helices based on chemical shifts.

Staphylokinase (Sak) is a 15.5 kDa protein secreted by several strains of Staphylococcus aureus. Due to its ability to convert plasminogen, the inactive proenzyme of the fibrinolytic system, into plasmin, Sak is presently undergoing clinical trials for blood clot lysis in the treatment of thrombovascular disorders. With a view to developing a better understanding of the mode of action of Sak, we have initiated a structural investigation of Sak via multidimensional heteronuclear NMR spectroscopy employing uniformly 15N- and 15N,13C-labelled Sak. Sequence-specific resonance assignments have been made employing 15N-edited TOCSY and NOE experiments and from HNCACB, CBCA(CO)NH, HBHA(CBCACO)NH and CC(CO)NH sets of experiments. From an analysis of the chemical shifts, 3J(HN,Hα) scalar coupling constants, NOEs and HN exchange data, the secondary structural elements of Sak have been characterized.

Temperature coefficients have been measured by 2D NMR methods for the amide and CαH proton chemical shifts in two globular proteins, bovine pancreatic trypsin inhibitor and hen egg-white lysozyme. The temperature-dependent changes in chemical shift are generally linear up to about 15° below the global denaturation temperature, and the derived coefficients span a range of roughly –16 to +2 ppb/K for amide protons and –4 to +3 ppb/K for CαH. The temperature coefficients can be rationalized by the assumption that heating causes increases in thermal motion in the protein. Precise calculations of temperature coefficients derived from protein coordinates are not possible, since chemical shifts are sensitive to small changes in atomic coordinates. Amide temperature coefficients correlate well with the location of hydrogen bonds as determined by crystallography. It is concluded that a combined use of both temperature coefficients and exchange rates produces a far more reliable indicator of hydrogen bonding than either alone. If an amide proton exchanges slowly and has a temperature coefficient more positive than –4.5 ppb/K, it is hydrogen bonded, while if it exchanges rapidly and has a temperature coefficient more negative than –4.5 ppb/K, it is not hydrogen bonded. The previously observed unreliability of temperature coefficients as measures of hydrogen bonding in peptides may arise from losses of peptide secondary structure on heating.
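
The decision rule stated in this abstract can be sketched in a few lines of Python. This is only an illustration: the –4.5 ppb/K cutoff and the two conditions come from the abstract, while the function names, the least-squares fit used to obtain the coefficient, and the "undetermined" label for the remaining cases are assumptions made for the sketch.

```python
THRESHOLD_PPB_PER_K = -4.5  # cutoff quoted in the abstract


def temperature_coefficient(temps_k, shifts_ppm):
    """Least-squares slope of chemical shift vs. temperature, in ppb/K.

    Assumes the shift is linear in temperature over the range supplied.
    """
    n = len(temps_k)
    if n < 2 or n != len(shifts_ppm):
        raise ValueError("need at least two matched (temperature, shift) points")
    mean_t = sum(temps_k) / n
    mean_s = sum(shifts_ppm) / n
    num = sum((t - mean_t) * (s - mean_s) for t, s in zip(temps_k, shifts_ppm))
    den = sum((t - mean_t) ** 2 for t in temps_k)
    if den == 0:
        raise ValueError("temperatures must not all be identical")
    return 1000.0 * num / den  # ppm/K -> ppb/K


def classify_amide(coeff_ppb_per_k, exchanges_slowly):
    """Apply the combined rule: temperature coefficient plus exchange rate."""
    if exchanges_slowly and coeff_ppb_per_k > THRESHOLD_PPB_PER_K:
        return "hydrogen bonded"
    if not exchanges_slowly and coeff_ppb_per_k < THRESHOLD_PPB_PER_K:
        return "not hydrogen bonded"
    # The abstract makes no statement about the mixed cases.
    return "undetermined"
```

For example, `classify_amide(temperature_coefficient(temps, shifts), exchanges_slowly=True)` would label a slowly exchanging amide whose fitted slope lies above the cutoff as hydrogen bonded.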

Random coil proton chemical shifts of deoxyribonucleic acids
Sixteen 17-nucleotide DNA sequences have been used to determine the sequence effect on random coil DNA proton chemical shifts. Based on the proton chemical shifts measured for the central nucleotides in 64 triplets and the correction factors determined for the next nearest neighbor effects, a parameter set has been derived for predicting random coil DNA proton chemical shifts. The root-mean-square deviation (RMSD) between the predicted and the observed aromatic H6/H8 proton chemical shifts of 200 data from 22 random coil DNA sequences was determined to be 0.02 ppm with a correlation coefficient of 0.998. For the H1′, H2′, H2″ and H3′ sugar protons, the RMSD values between the predicted and the experimental shifts were found to be 0.02, 0.03, 0.03 and 0.02 ppm, respectively.
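
The two agreement measures quoted here, RMSD and the correlation coefficient between predicted and observed shifts, can be computed as in the following sketch. The function names are illustrative, and the abstract does not say which correlation coefficient was used; Pearson's r is assumed.

```python
import math


def rmsd(predicted, observed):
    """Root-mean-square deviation between two equal-length lists of shifts (ppm)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(predicted, observed):
    """Pearson correlation coefficient (assumed to be the one reported)."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need at least two matched values")
    mean_p = sum(predicted) / len(predicted)
    mean_o = sum(observed) / len(observed)
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_o)
```

A low RMSD together with a correlation coefficient close to 1 is what the abstract reports for the H6/H8 protons.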

Wang CC, Chen JH, Yin SH, Chuang WJ. Proteins 2006, 64(1):219-226
Different programs and methods were employed to superimpose protein structures, using members of four very different protein families as test subjects, and the results of these efforts were compared. Algorithms based on human identification of key amino acid residues on which to base the superpositions were nearly always more successful than programs that used automated techniques to identify key residues. Among those programs automatically identifying key residues, MASS could not superimpose all members of some families, but was very efficient with other families. MODELLER, MultiProt, and STAMP had varying levels of success. A genetic algorithm program written for this project did not improve superpositions when results from neighbor-joining and pseudostar algorithms were used as its starting cases, but it always improved superpositions obtained by MODELLER and STAMP. A program entitled PyMSS is presented that includes three superposition algorithms featuring human interaction.

Summary Essentially complete assignments have been obtained for the 1H and protonated 13C NMR spectra of the zinc finger peptide Xfin-31 in the presence and absence of zinc. The patterns observed for the 1H and 13C chemical shifts of the peptide in the presence of zinc, relative to the shifts in the absence of zinc, reflect the zinc-mediated folding of the unstructured peptide into a compact globular structure with distinct elements of secondary structure. Chemical shifts calculated from the 3D solution structure of the peptide in the presence of zinc and the observed shifts agree to within ca. 0.2 and 0.6 ppm for the backbone CαH and NH protons, respectively. In addition, homologous zinc finger proteins exhibit similar correlations between their 1H chemical shifts and secondary structure.

Proteins with high‐sequence identity but very different folds present a special challenge to sequence‐based protein structure prediction methods. In particular, a 56‐residue three‐helical bundle protein (GA95) and an α/β‐fold protein (GB95), which share 95% sequence identity, were targets in the CASP‐8 structure prediction contest. With only 12 out of 300 submitted server‐CASP8 models for GA95 exhibiting the correct fold, this protein proved particularly challenging despite its small size. Here, we demonstrate that the information contained in NMR chemical shifts can readily be exploited by the CS‐Rosetta structure prediction program and yields adequate convergence, even when input chemical shifts are limited to just amide 1HN and 15N or 1HN and 1Hα values.



Protein sequence alignment is essential for a variety of tasks such as homology modeling and active site prediction. Alignment errors remain the main cause of low-quality structure models. A bioinformatics tool to refine alignments is needed to make protein alignments more accurate.


We developed the SFESA web server to refine pairwise protein sequence alignments. Compared to the previous version of SFESA, which required a set of 3D coordinates for a protein, the new server will search a sequence database for the closest homolog with an available 3D structure to be used as a template. For each alignment block defined by secondary structure elements in the template, SFESA evaluates alignment variants generated by local shifts and selects the best-scoring alignment variant. A scoring function that combines the sequence score of profile-profile comparison and the structure score of template-derived contact energy is used for evaluation of alignments. PROMALS pairwise alignments refined by SFESA are more accurate than those produced by current advanced alignment methods such as HHpred and CNFpred. In addition, SFESA also improves alignments generated by other software.


SFESA is a web-based tool for alignment refinement, designed for researchers to compute, refine, and evaluate pairwise alignments with a combined sequence and structure scoring of alignment blocks. To our knowledge, the SFESA web server is the only tool that refines alignments by evaluating local shifts of secondary structure elements. The SFESA web server is available at http://prodata.swmed.edu/sfesa.

Frank A, Onila I, Möller HM, Exner TE. Proteins 2011, 79(7):2189-2202
Despite the many protein structures solved successfully by nuclear magnetic resonance (NMR) spectroscopy, quality control of NMR structures is still far less well established and standardized than in crystallography. Therefore, there is still a need for new, independent, and unbiased evaluation tools to identify problematic parts and, in the best case, to give guidelines on how to fix them. We present here quantum chemical calculations of NMR chemical shifts for many proteins based on our fragment-based quantum chemical method: the adjustable density matrix assembler (ADMA). These results show that 13C chemical shifts of reasonable accuracy can be obtained that can already provide a powerful measure for structure validation. 1H and, even more so, 15N chemical shifts deviate more strongly from experiment due to the insufficient treatment of solvent effects and conformational averaging.

For a long time, NMR chemical shifts have been used to identify protein secondary structures. Currently, this is accomplished through comparing the observed 1Hα, 13Cα, 13Cβ, or 13C′ chemical shifts with the random coil values. Here, we present a new protocol, which is based on the joint probability of each of the three secondary structural types (β-strand, α-helix, and random coil) derived from chemical-shift data, to identify the secondary structure. In combination with empirical smooth filters/functions, this protocol shows significant improvements in the accuracy and the confidence of identification. Updated chemical-shift statistics are reported, on the basis of which the reliability of using chemical shift to identify protein secondary structure is evaluated for each nucleus. The reliability varies greatly among the 20 amino acids, but, on average, is in the order of: 13Cα > 13C′ > 1Hα > 13Cβ > 15N > 1HN to distinguish an α-helix from a random coil; and 1Hα > 13Cβ > 1HN ≈ 13Cα ≈ 13C′ ≈ 15N for a β-strand from a random coil. Amide 15N and 1HN chemical shifts, which are generally excluded from the application, in fact, were found to be helpful in distinguishing a β-strand from a random coil. In addition, the chemical-shift statistical data are compared with those reported previously, and the results are discussed. A JAVA User Interface program has been developed to make the entire procedure fully automated and is available via http://ccsr3150-p3.stanford.edu.

Protein chemical shifts encode detailed structural information that is difficult and computationally costly to describe at a fundamental level. Statistical and machine learning approaches have been used to infer correlations between chemical shifts and secondary structure from experimental chemical shifts. These methods range from simple statistics such as the chemical shift index to complex methods using neural networks. Notwithstanding their higher accuracy, more complex approaches tend to obscure the relationship between secondary structure and chemical shift and often involve many parameters that need to be trained. We present hidden Markov models (HMMs) with Gaussian emission probabilities to model the dependence between protein chemical shifts and secondary structure. The continuous emission probabilities are modeled as conditional probabilities for a given amino acid and secondary structure type. Using these distributions as outputs of first‐ and second‐order HMMs, we achieve a prediction accuracy of 82.3%, which is competitive with existing methods for predicting secondary structure from protein chemical shifts. Incorporation of sequence‐based secondary structure prediction into our HMM improves the prediction accuracy to 84.0%. Our findings suggest that an HMM with correlated Gaussian distributions conditioned on the secondary structure provides an adequate generative model of chemical shifts. Proteins 2013; © 2012 Wiley Periodicals, Inc.
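
To make the idea of an HMM with Gaussian emissions concrete, the sketch below decodes a secondary structure string from one chemical shift value per residue using the Viterbi algorithm. It is a deliberately reduced illustration, not the published model: it is first-order only, uses a single one-dimensional Gaussian per state rather than correlated Gaussians conditioned on amino acid type, and every number in `EMISSION`, `START` and `TRANS` is a hypothetical placeholder rather than a trained parameter.

```python
import math

STATES = ("H", "E", "C")  # helix, strand, coil

# Hypothetical (mean, standard deviation) of a 13C-alpha secondary shift in ppm.
EMISSION = {"H": (2.5, 1.0), "E": (-1.5, 1.0), "C": (0.0, 1.0)}
# Hypothetical start and transition probabilities (each row sums to 1).
START = {"H": 1 / 3, "E": 1 / 3, "C": 1 / 3}
TRANS = {
    "H": {"H": 0.90, "E": 0.01, "C": 0.09},
    "E": {"H": 0.01, "E": 0.85, "C": 0.14},
    "C": {"H": 0.10, "E": 0.10, "C": 0.80},
}


def log_gauss(x, mean, sd):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(observations):
    """Return the most probable state string for a list of shift values."""
    if not observations:
        return ""
    score = {
        s: math.log(START[s]) + log_gauss(observations[0], *EMISSION[s])
        for s in STATES
    }
    backpointers = []
    for x in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s at this residue.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + log_gauss(x, *EMISSION[s])
            )
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))
```

Working in log space avoids numerical underflow for long sequences; a fuller model along the lines of the abstract would replace `log_gauss` with a multivariate Gaussian over several nuclei and index the emission parameters by amino acid type.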

Supersecondary structures of proteins have been systematically searched and classified, but not enough attention has been devoted to such large edifices beyond the basic identification of secondary structures. The objective of the present study is to show that the association of secondary structures that share some of their backbone residues is commonplace in globular proteins, and that such deeper fusion of secondary structures, namely extended secondary structures (ESSs), helps stabilize the original secondary structures and the resulting tertiary structures. For statistical purposes, a set of 163 proteins from the protein databank was randomly selected, and a few specific cases are structurally analyzed and characterized in more detail. The results indicate that about 30% of the residues from each protein, on average, participate in ESSs. For the specific cases considered, our results were instead based on the secondary structures produced after extensive Molecular Dynamics simulation of a protein–aqueous solvent system. Based on the very small width of the time distribution of the root mean squared deviations between the ESS taken along the simulation and the ESS from the mean structure of the protein, for each ESS, we conclude that the ESSs significantly increase the conformational stability by forming very stable aggregates. The ubiquity and specificity of the ESSs suggest that the role they play in the structure of proteins, including domain formation, deserves to be thoroughly investigated.

Summary A simple technique for identifying protein secondary structures through the analysis of backbone 13C chemical shifts is described. It is based on the Chemical-Shift Index [Wishart et al. (1992) Biochemistry, 31, 1647–1651] which was originally developed for the analysis of 1H chemical shifts. By extending the Chemical-Shift Index to include 13Cα, 13Cβ and carbonyl 13C chemical shifts, it is now possible to use four independent chemical-shift measurements to identify and locate protein secondary structures. It is shown that by combining both 1H and 13C chemical-shift indices to produce a consensus estimate of secondary structure, it is possible to achieve a predictive accuracy in excess of 92%. This suggests that the secondary structure of peptides and proteins can be accurately obtained from 1H and 13C chemical shifts, without recourse to NOE measurements. Supplementary material is available in the form of a 10-page table (Table S1) describing the exact location of secondary structures in all 20 proteins as determined using the methods described in this paper. Requests for Table S1 should be directed to the authors.
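
A per-residue version of the consensus idea can be sketched as follows. This is a simplified illustration and not the published procedure: the thresholds in `THRESHOLDS` are assumed values, the random coil references must be supplied by the caller, the consensus is a plain majority vote, and the run-length filtering over neighbouring residues that a full chemical-shift index applies is omitted. The sign conventions reflect the general trends (1Hα and 13Cβ shift upfield in helices, 13Cα and carbonyl 13C shift downfield, with the opposite pattern in strands).

```python
# Assumed deviation thresholds in ppm (illustrative, not taken from the abstract).
THRESHOLDS = {"HA": 0.1, "CA": 0.7, "CB": 0.7, "CO": 0.5}
# Sign of (observed - random coil) that points towards helix for each nucleus.
HELIX_SIGN = {"HA": -1, "CA": +1, "CB": -1, "CO": +1}


def nucleus_index(nucleus, observed_ppm, random_coil_ppm):
    """Return 'H', 'E' or 'C' for one nucleus of one residue."""
    delta = observed_ppm - random_coil_ppm
    if abs(delta) <= THRESHOLDS[nucleus]:
        return "C"  # within the tolerance band: no structural signal
    return "H" if delta * HELIX_SIGN[nucleus] > 0 else "E"


def residue_consensus(shifts):
    """Majority vote over the nuclei available for one residue.

    `shifts` maps a nucleus name to an (observed, random coil) pair in ppm.
    """
    votes = [nucleus_index(n, obs, ref) for n, (obs, ref) in shifts.items()]
    for state in ("H", "E"):
        if votes.count(state) * 2 > len(votes):
            return state
    return "C"
```

Called with, for instance, `{"HA": (4.00, 4.35), "CA": (54.5, 52.5), "CO": (179.5, 177.8)}` (hypothetical values), all three nuclei point towards helix and the consensus is `"H"`.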

