Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Secondary structure prediction from amino acid sequence is a key component of protein structure prediction, with current accuracy at approximately 75%. We analysed two state-of-the-art secondary structure prediction methods, PHD and JPRED, comparing their predictions with the secondary structure assigned by the algorithms DSSP and STRIDE. The specific focus of our study was alpha-helix N-termini, as empirical free energy scales are available for residue preferences at N-terminal positions. Although these prediction methods perform well in general at predicting alpha-helical locations and length distributions in proteins, they perform less well at predicting the correct helical termini. For example, although most predicted alpha-helices overlap a real alpha-helix (with relatively few completely missed or extra predicted helices), only one-third of JPRED and PHD predictions correctly identify the N-terminus. Analysis of the sequences neighbouring predicted helical N-termini shows that the correct N-terminus is often within one or two residues. More importantly, the true N-terminal motif is, on average, more favourable as judged by our experimentally measured free energies. This suggests a simple but powerful strategy to improve secondary structure prediction: using empirically derived energies to adjust the predicted output to a more favourable N-terminal sequence.

2.
Shestopalov BV 《Tsitologiia》2007,49(7):594-600
One possible route to a complete and final solution of the problem of determining the three-dimensional structure of proteins from amino acid sequence is to simulate the formation of that structure. The code physics method developed by the author is proposed for this task. As a first step in realizing this plan, the simulation of alpha-helix and beta-hairpin formation in water-soluble proteins is described here. The simulation results were compared with experimental data for 14 proteins of no more than 50 amino acids, which therefore contain few alpha-helices and beta-strands (to stay within the limits of the simulation process), and with secondary structure predictions by the best methods to date, PSIpred, PORTER and PROFsec. The secondary structure obtained by simulating alpha-helix and beta-hairpin formation with the code physics method corresponded completely to the experimental data, whereas the secondary structure predicted by PSIpred, PORTER and PROFsec differed significantly from these data.

3.
To improve secondary structure predictions in protein sequences, the information residing in multiple sequence alignments of substituted but structurally related proteins is exploited. A database comprising 70 protein families and a total of 2,500 sequences, some of which were aligned by tertiary structural superpositions, was used to calculate residue exchange weight matrices within alpha-helical, beta-strand, and coil substructures, respectively. Secondary structure predictions were made based on the observed residue substitutions in local regions of the multiple alignments and the largest possible associated exchange weights in each of the three matrix types. Comparison of the observed and predicted secondary structure on a per-residue basis yielded a mean accuracy of 72.2%. Individual alpha-helix, beta-strand, and coil states were respectively predicted at 66.7, and 75.8% correctness, representing a well-balanced three-state prediction. The accuracy level, verified by cross-validation through jack-knife tests on all protein families, dropped, on average, to only 70.9%, indicating the rigor of the prediction procedure. On the basis of robustness, conceptual clarity, accuracy, and execution efficiency, the method has considerable advantages, especially its sole reliance on amino acid substitutions within structurally related proteins.
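
The per-residue and per-state accuracies quoted in this abstract can be illustrated with a minimal sketch. It assumes that observed and predicted structures are given as equal-length strings over the three states H (helix), E (strand) and C (coil); the function names and the toy strings are hypothetical and are not taken from the paper.

```python
def q3_accuracy(observed, predicted):
    """Fraction of residues whose predicted state equals the observed state."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("empty input")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)


def per_state_accuracy(observed, predicted, states="HEC"):
    """For each state, the fraction of observed residues in that state
    that were also predicted in that state (None if the state is absent)."""
    result = {}
    for state in states:
        total = sum(1 for o in observed if o == state)
        hits = sum(1 for o, p in zip(observed, predicted)
                   if o == state and p == state)
        result[state] = hits / total if total else None
    return result


if __name__ == "__main__":
    obs = "HHHEEECCC"   # toy observed assignment (hypothetical)
    pred = "HHHEECCCC"  # toy prediction (hypothetical)
    print(q3_accuracy(obs, pred))
    print(per_state_accuracy(obs, pred))
```

Reporting the per-state values alongside the overall figure shows whether a three-state prediction is balanced or is carried by one dominant state.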

4.
The reliability of the hydropathy method for predicting the formation of membrane-spanning alpha-helices is analysed for integral membrane proteins and peptides whose structure is known from X-ray crystallography. It is shown that Kyte-Doolittle hydropathy plots do not accurately predict the 22 transmembrane alpha-helices in the reaction centres (RC) of the photosynthetic bacteria Rhodopseudomonas viridis and Rhodobacter sphaeroides (R-26). The accuracy of prediction for these proteins was improved using an optimised Kyte-Doolittle hydrophobicity scale. However, this hydrophobicity scale did not improve the predictions for the alpha/beta-peptides of the B800-850 (LH2) complexes of the photosynthetic bacteria Rhodopseudomonas acidophila and Rhodospirillum molischianum, which were excluded from the optimisation procedure. The best and worst predictions of membrane-spanning alpha-helices for the RC proteins and LH2 peptides, respectively, were obtained with a propensity scale (PRC) calculated from the amino acid sequences and X-ray data for the RC proteins. A propensity scale (PLH) obtained using the amino acid sequences and X-ray data for the alpha/beta-peptides of the LH2 complexes did not give an acceptable prediction of the transmembrane segments in the LH2 peptides; moreover, it markedly contradicted the PRC scale. It is concluded that amino acids have no significant preference for localisation in transmembrane segments. Therefore, the predictive ability of the hydropathy methodology appears to be limited: the number of transmembrane segments can be correctly calculated in the best case only, and the lengths and positions of membrane-spanning alpha-helices in a protein amino acid sequence cannot be predicted exactly.
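
As an illustration of the kind of hydropathy plot evaluated in this abstract, the following is a minimal sketch of a sliding-window profile using the standard Kyte-Doolittle scale. The 19-residue window and the 1.6 cutoff are the conventional choices for transmembrane scanning and are assumptions here, not parameters reported in the abstract; the function names are hypothetical.

```python
# Standard Kyte-Doolittle hydropathy values (one-letter amino acid codes).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window; entry i covers seq[i:i+window]."""
    scores = [KD[residue] for residue in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_windows(seq, window=19, threshold=1.6):
    """Start indices (0-based) of windows whose mean reaches the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i for i, value in enumerate(profile) if value >= threshold]


if __name__ == "__main__":
    # Toy sequence (hypothetical): a leucine stretch flanked by aspartates.
    toy = "D" * 10 + "L" * 25 + "D" * 10
    print(candidate_windows(toy))
```

A profile of this kind only flags hydrophobic stretches; as the abstract stresses, it does not fix the exact ends of a membrane-spanning helix.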

5.
The secondary structure content of the N-terminal extracellular domain of beta-dystroglycan (a recombinant fragment extending from positions 654 to 750) has been quantitatively determined by means of CD and FTIR spectroscopies. The elements of secondary structure, namely an 8-10 residue long alpha-helix (10%) and two beta-strands (24%), have been assigned to specific amino acid sequences by means of a GOR constrained prediction method. The remaining 66% of the sequence is classified as turns or unordered. The temperature dependence of the CD and FTIR spectra has been investigated in detail. A reversible, non-cooperative thermal transition is observed with both CD and FTIR spectroscopies up to 95 °C. The profile of the transition is typical of the unfolding of isolated peptides and corresponds to the progressive loss of the secondary structure elements of the protein upon heating, with no evidence for the collapsing phenomena typical of globular proteins.

6.
Accurately predicted protein secondary structure provides useful information for target selection, for analyzing protein function and for predicting higher-dimensional structure. Existing research shows that more data plus a refined search gives better prediction. We analyze the relation between prediction accuracy and another crucial factor, protein size. Empirical tests performed with two secondary structure predictors on a large set of high-resolution, non-redundant proteins show that the average accuracies for small proteins (<100 residues) equal 73% and 54% for alpha-helices and beta-strands, respectively. The alpha-helix/beta-strand accuracies for very large proteins (>300 residues) equal 77%/68%, respectively. Similarly, tests with three secondary structure content predictors show that the prediction errors for small/very large proteins equal 0.13/0.09 and 0.09/0.06 for alpha-helix and beta-strand content, respectively. Our tests confirm that secondary structure and content predictions for very large proteins are of statistically significantly better quality than predictions for small proteins. This is in contrast with tertiary structure prediction, in which higher accuracy is obtained for smaller proteins.

7.
S Hayward  J F Collins 《Proteins》1992,14(3):372-381
Using a backpropagation neural network model we have found a limit for secondary structure prediction from local sequence. By including only sequences from whole alpha-helix and non-alpha-helix structures in our training and test sets (sequences spanning boundaries between these two structures were excluded), it was possible to investigate directly the relationship between sequence and structure for alpha-helix. A group of non-alpha-helix sequences that was disrupting overall prediction success was indistinguishable to the network from alpha-helix sequences. These sequences were found, with statistical significance, to occur in regions adjacent to the termini of alpha-helices, suggesting that potentially longer alpha-helices are disrupted by global constraints. Some of these regions spanned more than 20 residues. On these whole-structure sequences, 10 residues in length, a comparatively high prediction success of 78% with a correlation coefficient of 0.52 was achieved. In addition, the structure of the input space, the distribution of beta-sheet in this space, and the effect of segment length were investigated.

8.
9.
An alpha-helix terminates when the virtual extension of its most hydrophobic longitudinal strip, containing Leu, Ile, Val, Phe, and Met, lacks those residues. In each of 247 helices a template was fitted to maximize the mean hydrophobicity of positions forming a longitudinal strip-of-helix. The template was then extended into the sequences beyond the ends of the helices. Leu, Ile, Val, Phe, and Met occurred in positions in the longitudinal strip-of-helix at an increased frequency (p < 0.001), but in the first and second positions beyond either end of each true helix they occurred at the same frequency as in their empirical distribution over all the proteins. Excesses of Asp and Glu were found in the N-terminal loop, and of Arg, His, and Lys in specific positions about the C terminus of helices. The longitudinal hydrophobic strip, the smallest amino acid in that strip, and the charged amino acids in that strip were related to the rotational and longitudinal orientation of alpha-helices in 15 proteins. Adjacent helices generally crossed through their longitudinal hydrophobic strips, usually through the smallest residue in the strip. Charged residues, when they occurred in the strips, were excluded from the crossing regions.

10.
Jia M  Luo L  Liu C 《Biopolymers》2004,73(1):16-26
A new integrated sequence-structure database, called IADE (Integrated ASTRAL-DSSP-EMBL), incorporating matching mRNA sequence, amino acid sequence, and protein secondary structure data, has been constructed. It includes 648 protein domains. Based on the IADE database, we studied the relation between RNA stem-loop frequencies and protein secondary structure. It was found that the alpha-helices and beta-strands of proteins tend to be preferentially "coded" by mRNA stem regions, while the coils of proteins tend to be preferentially "coded" by mRNA loop regions. These tendencies are more obvious if we observe the structural words (SWs). An SW is defined as a four-amino-acid fragment that shows a pronounced secondary structural (alpha-helix or beta-strand) propensity. It is demonstrated that the deduced correlation between protein and mRNA structure can hardly be explained as an effect of stochastic fluctuation.

11.
Protein secondary structure predictions and amino acid long-range contact map predictions from the primary sequence of proteins have been explored to aid in modelling protein tertiary structures. To evaluate the usefulness of secondary structure and 3D residue contact prediction methods for modelling protein structures, we used the known Q3 (alpha-helix, beta-strands and irregular turns/loops) secondary structure information, along with residue-residue contact information, as restraints for MODELLER. We present here the results of our modelling studies on the 30 best-resolved single-domain protein structures of varied lengths. The results show that it is very difficult to obtain useful models even with 100% accurate secondary structure predictions and accurate residue contact predictions for up to 30% of the residues in a sequence. The best models that we obtained for proteins of lengths 37, 70, 118, 136 and 193 amino acid residues have RMSDs of 4.17, 5.27, 9.12, 7.89 and 9.69, respectively. The results show that better models can be obtained for proteins with a high percentage of alpha-helix content. This analysis further shows that the MODELLER restraint optimization program can be useful only if truly homologous structure(s) are available as a template, from which it derives numerous restraints almost identical to those of the templates used. This analysis also clearly indicates that even if we satisfy several true residue-residue contact distances, for up to 30% of the sequence length, with fully known secondary structural information, we end up predicting model structures far from their corresponding native structures.

13.
Wang J  Feng JA 《Protein engineering》2003,16(11):799-807
This paper reports an extensive sequence analysis of the alpha-helices of proteins. Alpha-helices were extracted from the Protein Data Bank (PDB) and divided into groups according to their sizes. It was found that some amino acids had different propensity values for adopting helical conformation in short, medium and long alpha-helices. Pro and Trp had a significantly higher propensity for helical conformation in short helices than in medium and long helices. Trp was the strongest helix conformer in short helices. Sequence patterns favoring helical conformation were derived from a neighbor-dependent sequence analysis of proteins, which calculated the effect of neighboring amino acid type on the propensity of residues for adopting a particular secondary structure in proteins. This method produced an enhanced statistical significance scale that allowed us to explore the positional preference of amino acids for alpha-helical conformations. It was shown that the amino acid pair preference for alpha-helix had a unique pattern and that this pattern was not always predictable by assuming proportional contributions from the individual propensity values of the amino acids. Our analysis also yielded a series of amino acid dyads that showed a preference for alpha-helix conformation. The data presented in this study, along with our previous study on loop sequences of proteins, should prove useful for developing potential 'codes' for recognizing sequence patterns that are favorable for specific secondary structural elements in proteins.

14.
Ribonuclease HII from the hyperthermophile Thermococcus kodakaraensis (Tk-RNase HII) is a robust monomeric protein under kinetic control, which possesses some proline residues at the N-termini of alpha-helices. A proline residue at the N-terminus of an alpha-helix is thought to stabilize a protein. In this work, the thermostability and folding kinetics of Tk-RNase HII were measured for mutant proteins in which a proline residue is introduced (Xaa to Pro) or removed (Pro to Ala) at the N-terminus of an alpha-helix. In the folding experiments, the mutations examined have little influence on the remarkably slow unfolding of Tk-RNase HII. In contrast, E111P and K199P exhibit some thermostabilization, whereas P46A, P70A and P174A exhibit some thermodestabilization. The E111P/K199P and P46A/P70A double mutations cause cumulative changes in stability. We conclude that the proline effect on protein thermostability is observed in a hyperthermophilic protein, but that each proline residue at the N-terminus of an alpha-helix contributes only slightly to the thermostability. The present results also mean that even a natural hyperthermophilic protein can acquire improved thermostability.

15.
We present a method for analyzing the chemical shift database to yield information on nearest-neighbor effects on carbon-13 chemical shift values for alpha and beta carbons of amino acids in proteins. For each amino acid sequence XYZ, we define two correction factors, Delta(XY) s and Delta(YZ) s, representing the effects on (delta13 Calpha-delta13 Cbeta) for residue Y from the preceding residue (X) and the following residue (Z), where X, Y, and Z represent one of the 20 naturally occurring amino acids, Delta designates the change in value or the correction factor (in ppm), and s is an index standing for one of three "pseudo secondary structure states" derived from chemical shift dispersions, which we show represent residues in primarily alpha-helix, beta-strand, and non-alphabeta (coil). The correction factors were obtained from maximum likelihood fitting of (delta13 Calpha-delta13 Cbeta) values from the chemical shifts of 651 proteins to a mixture of three Gaussians. These correction factors were derived strictly from the analysis of assigned chemical shifts, without regard to the three-dimensional structures of these proteins. The correction factors were found to differ according to the secondary structural environment of the central residue (deduced from the chemical shift distribution) as well as according to the identities of the nearest neighboring residues in the sequence. The areas subsumed by the sequence-dependent chemical shift distributions report on the relative energies of the sequences in different pseudo secondary structural environments, and the positions of the peaks indicate the chemical shifts of the lowest-energy conformations. As such, these results have potential applications to the determination of dihedral angle restraints from chemical shifts for structure determination and to more accurate predictions of chemical shifts in proteins of known structure.
From a database of chemical shifts associated with well-defined three-dimensional structures, comparisons were made between DSSP designations derived from three-dimensional structure and pseudo secondary structure designations derived from nearest-neighbor corrected chemical shift analysis. The high level of agreement between the two approaches to classifying secondary structure provides a measure of confidence in this chemical shift-based approach to the analysis of protein structure.

16.
Wang JY  Ahmad S  Gromiha MM  Sarai A 《Biopolymers》2004,75(3):209-216
We developed dictionaries of two-, three-, and five-residue patterns in proteins and computed the average solvent accessibility of the central residues in their native proteins. These dictionaries serve as a look-up table for making subsequent predictions of solvent accessibility of amino acid residues. We find that predictions made in this way are very close to those made using more sophisticated methods of solvent accessibility prediction. We also analyzed the effect of immediate neighbors on the solvent accessibility of residues. This helps us in understanding how the same residue type may have different accessible surface areas in different proteins and in different positions of the same protein. We observe that certain residues have a tendency to increase or decrease the solvent accessibility of their neighboring residues in C- or N-terminal positions. Interestingly, the C-terminal and N-terminal neighbor residues are found to have asymmetric roles in modifying solvent accessibility of residues. As expected, similar neighbors enhance the hydrophobic or hydrophilic character of residues. Detailed look-up tables are provided on the web at www.netasa.org/look-up/.
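
The look-up idea in this abstract can be sketched as follows. This is a minimal illustration restricted to odd pattern lengths (so that the central residue is unambiguous); the function names, the input format of (sequence, per-residue accessibility) pairs, and the toy values are all hypothetical and are not the authors' implementation or data.

```python
from collections import defaultdict


def build_lookup(records, k=3):
    """Map each k-residue pattern to the mean accessibility of its central
    residue. `records` is an iterable of (sequence, accessibility_list)."""
    if k % 2 == 0:
        raise ValueError("this sketch handles odd pattern lengths only")
    half = k // 2
    sums = defaultdict(lambda: [0.0, 0])  # pattern -> [running total, count]
    for seq, acc in records:
        for i in range(half, len(seq) - half):
            entry = sums[seq[i - half:i + half + 1]]
            entry[0] += acc[i]
            entry[1] += 1
    return {pattern: total / count for pattern, (total, count) in sums.items()}


def predict(seq, table, k=3, default=None):
    """Predict accessibility per residue by dictionary look-up; chain ends
    and unseen patterns receive `default`."""
    half = k // 2
    predictions = []
    for i in range(len(seq)):
        if i < half or i >= len(seq) - half:
            predictions.append(default)
        else:
            predictions.append(table.get(seq[i - half:i + half + 1], default))
    return predictions


if __name__ == "__main__":
    # Toy training data (hypothetical accessibility values).
    training = [("ACD", [10, 20, 30]), ("ACD", [0, 40, 0])]
    table = build_lookup(training)
    print(table)
    print(predict("ACDE", table))
```

Longer patterns capture more context but are seen less often, so in practice a fallback from five-residue to three-residue entries would be a natural extension of this sketch.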

17.
A study was made of the physical, chemical, energetic, conformational, geometric, and dynamic property potentials of amino acid residues in protein secondary structures: alpha-helix and beta-strand. Property patterns were obtained by computing the average property values for specified residue units partitioned longitudinally and transversely about the chain. It was found that in alpha-helices with not more than 15 residues, there exist longitudinally opposing portions, one characteristically higher in average property potentials than the other. The helical chain, in general, acquires either an increasing or a decreasing average potential in the N-terminal to C-terminal direction. The sequence-wise and surface-wise variations of property potentials in the elements of beta-structure also revealed such general patterns. Possible wrong predictions of one secondary structural class over the other by statistical methods are pointed out.

18.
19.
20.
Improving the prediction of secondary structure of 'TIM-barrel' enzymes.   Cited by: 1 (self-citations: 0, citations by others: 1)
The information contained in aligned sets of homologous protein sequences should improve the score of secondary structure prediction. Seven different enzymes having the (beta/alpha)8 or TIM-barrel fold were used to optimize the prediction with regard to this class of enzymes. The alpha-helix, beta-strand and loop propensities of the Garnier-Osguthorpe-Robson method were averaged at aligned residue positions, leading to a significant improvement over the average score obtained from single sequences. The increased accuracy correlates with the average sequence variability of the aligned set. Further improvements were obtained by using the following averaged properties as weights for the averaged state propensities: amphipathic moment for alpha-helix; hydropathy for beta-strand; chain flexibility for loop. The clustering of conserved residues at the C-terminal ends of the beta-strands was used as an additional positive weight for beta-strand propensity and decisively increased the prediction of otherwise unpredicted beta-strands. The automatic weighted prediction method identifies more than 95% of the secondary structure elements of the set of seven TIM-barrel enzymes.
