Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Wu S  Zhang Y 《PloS one》2008,3(10):e3400
We developed a composite machine-learning-based algorithm, called ANGLOR, to predict real-value protein backbone torsion angles from amino acid sequences. The input features of ANGLOR include sequence profiles, predicted secondary structure and solvent accessibility. In a large-scale benchmarking test, the mean absolute error (MAE) of the phi/psi prediction is 28 degrees/46 degrees, which is approximately 10% lower than that generated by software in the literature. The prediction is statistically different from a random predictor (or a purely secondary-structure-based predictor) with p-value < 1.0 × 10^-300 (or < 1.0 × 10^-148) by the Wilcoxon signed rank test. For some residues (ILE, LEU, PRO and VAL), and especially for residues in helix and buried regions, the MAE of phi angles is much smaller (10-20 degrees) than that in other environments. Thus, although the average accuracy of the ANGLOR prediction is still low, the portion of accurately predicted dihedral angles may be useful in assisting protein fold recognition and ab initio 3D structure modeling.

2.
Xue B  Dor O  Faraggi E  Zhou Y 《Proteins》2008,72(1):427-433
The backbone structure of a protein is largely determined by the phi and psi torsion angles. Thus, knowing these angles, even if approximately, will be very useful for protein-structure prediction. However, in a previous work, a sequence-based, real-value prediction of the psi angle could only achieve a mean absolute error of 54 degrees (83 degrees, 35 degrees, 33 degrees for coil, strand, and helix residues, respectively) between predicted and actual angles. Moreover, a real-value prediction of the phi angle is not yet available. This article employs a neural-network-based approach to improve psi prediction by taking advantage of angle periodicity and applies the new method to the prediction of phi angles. The 10-fold-cross-validated mean absolute error for the new method is 38 degrees (58 degrees, 33 degrees, 22 degrees for coil, strand, and helix, respectively) for psi and 25 degrees (35 degrees, 22 degrees, 16 degrees for coil, strand, and helix, respectively) for phi. The real-value prediction is comparable to or more accurate than the predictions based on multistate classification of the phi-psi map. More accurate prediction of real-value angles will likely be useful for improving the accuracy of fold recognition and ab initio protein-structure prediction. The Real-SPINE 2.0 server is available on the website http://sparks.informatics.iupui.edu.
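
The abstract above reports mean absolute errors for angles and mentions angle periodicity. As a rough illustration only (the abstract does not say how Real-SPINE 2.0 computes its errors), the following Python sketch shows one common way to measure the error between two angles so that, for example, 175 degrees and -175 degrees count as 10 degrees apart rather than 350. The function names and the sample values are hypothetical.

```python
def angular_difference(predicted_deg, actual_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(predicted_deg - actual_deg) % 360.0
    return min(diff, 360.0 - diff)


def mean_absolute_error(predicted, actual):
    """MAE over paired angle lists, treating the angles as periodic."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(angular_difference(p, a) for p, a in zip(predicted, actual))
    return total / len(predicted)


# Hypothetical psi angles in degrees, made up for illustration.
predicted_psi = [175.0, -40.0, 130.0]
actual_psi = [-175.0, -45.0, 150.0]
print(mean_absolute_error(predicted_psi, actual_psi))
```

Without the wrap-around step, the first pair alone would contribute an error of 350 degrees, which is why periodicity matters when errors near ±180 degrees are averaged.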

3.
A new program, TALOS-N, is introduced for predicting protein backbone torsion angles from NMR chemical shifts. The program relies far more extensively on the use of trained artificial neural networks than its predecessor, TALOS+. Validation on an independent set of proteins indicates that backbone torsion angles can be predicted for a larger fraction of the residues (≥90%), with an error rate smaller than ca. 3.5%, using an acceptance criterion that is nearly two-fold tighter than that used previously, and a root mean square difference between predicted and crystallographically observed (φ, ψ) torsion angles of ca. 12°. TALOS-N also reports sidechain χ1 rotameric states for about 50% of the residues, with a consistency with reference structures of 89%. The program includes a neural network trained to identify secondary structure from residue sequence and chemical shifts.

4.
Song J  Tan H  Wang M  Webb GI  Akutsu T 《PloS one》2012,7(2):e30361
Protein backbone torsion angles (Phi) and (Psi) are the two rotation angles around the Cα-N bond (Phi) and the Cα-C bond (Psi). Due to the planarity of the linked rigid peptide bonds, these two angles can essentially determine the backbone geometry of proteins. Accordingly, the accurate prediction of protein backbone torsion angles from sequence information can assist the prediction of protein structures. In this study, we develop a new approach called TANGLE (Torsion ANGLE predictor) to predict the protein backbone torsion angles from amino acid sequences. TANGLE uses a two-level support vector regression approach to perform real-value torsion angle prediction using a variety of features derived from amino acid sequences, including the evolutionary profiles in the form of position-specific scoring matrices, predicted secondary structure, solvent accessibility and natively disordered region as well as other global sequence features. When evaluated based on a large benchmark dataset of 1,526 non-homologous proteins, the mean absolute errors (MAEs) of the Phi and Psi angle prediction are 27.8° and 44.6°, respectively, which are 1% and 3% lower, respectively, than those obtained using one of the state-of-the-art prediction tools, ANGLOR. Moreover, the prediction of TANGLE is significantly better than a random predictor that was built on an amino acid-specific basis, with p-values < 1.46e-147 and 7.97e-150, respectively, by the Wilcoxon signed rank test. As a complementary approach to the current torsion angle prediction algorithms, TANGLE should prove useful in predicting protein structural properties and assisting protein fold recognition by applying the predicted torsion angles as useful restraints. TANGLE is freely accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/TANGLE/.

5.
Dor O  Zhou Y 《Proteins》2007,68(1):76-81
Proteins can move freely in three-dimensional space. As a result, their structural properties, such as solvent accessible surface area, backbone dihedral angles, and atomic distances, are continuous variables. However, these properties are often arbitrarily divided into a few classes to facilitate prediction by statistical learning techniques. In this work, we establish an integrated system of neural networks (called Real-SPINE) for real-value prediction and apply the method to predict residue-solvent accessibility and backbone psi dihedral angles of proteins based on information derived from sequences only. Real-SPINE is trained with a large data set of 2640 protein chains, sequence profiles generated from multiple sequence alignment, representative amino-acid properties, a slow learning rate, overfitting protection, and predicted secondary structures. The method optimizes more than 200,000 weights and yields a 10-fold cross-validated Pearson's correlation coefficient (PCC) of 0.74 between predicted and actual solvent accessible surface areas and 0.62 between predicted and actual psi angles. In particular, 90% of 2640 proteins have a PCC value greater than 0.6 between predicted and actual solvent-accessible surface areas. The results of Real-SPINE can be compared with the best reported correlation coefficients of 0.64-0.67 for solvent-accessible surface areas and 0.47 for psi angles. The Real-SPINE server, executable programs, and datasets are freely available on http://sparks.informatics.iupui.edu.
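
The abstract above evaluates real-value predictions with Pearson's correlation coefficient (PCC). For readers who want to see what that number measures, here is a minimal Python sketch of the standard PCC formula applied to paired predicted and actual values. It is not taken from the Real-SPINE code; the function name and the toy solvent-accessibility values are assumptions made for illustration, and the sketch treats its inputs as plain numbers (it does nothing special for periodic quantities such as angles).

```python
import math


def pearson_correlation(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical predicted vs. actual solvent-accessible surface areas.
predicted_asa = [12.0, 55.0, 80.0, 30.0]
actual_asa = [10.0, 60.0, 95.0, 20.0]
print(pearson_correlation(predicted_asa, actual_asa))
```

A value near 1 means the predictions rise and fall with the actual values; it does not by itself say how large the absolute errors are.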

6.
The RCNPRED server implements a neural network-based method to predict the co-ordination numbers of residues starting from the protein sequence. Using evolutionary information as input, RCNPRED predicts the residue states of the proteins in the database with 69% accuracy and scores 12 percentage points higher than a simple statistical method. Moreover, the server implements a neural network to predict the relative solvent accessibility of each residue. A protein sequence can be directly submitted to RCNPRED: residue co-ordination numbers and solvent accessibility for each chain are returned via e-mail. AVAILABILITY: Freely available to non-commercial users at http://prion.biocomp.unibo.it/rcnpred.html.

7.
The Ramachandran steric map and energy diagrams of the glycyl residue are symmetric. A plot of (phi,psi) angles of glycyl residues in 250 nonhomologous and high-resolution protein structures is also largely symmetric. However, there is a clear aberration in the symmetry. Although there is a cluster of points corresponding to the right-handed alpha-helical region, the "equivalent" cluster is clearly shifted to in and around the (phi,psi) values of (90 degrees, 0 degrees) instead of being centered at the left-handed alpha-helical region of (60 degrees, 40 degrees). This lack of symmetry exists even in the (phi,psi) distribution of residues from non-alpha-helical regions in proteins. Here we provide an explanation for this observation. An analysis of glycyl conformations in small peptide structures and in "coil" proteins, which are largely devoid of helical and sheet regions, shows that glycyl residues prefer to adopt conformations around (±90 degrees, 0 degrees) instead of right- and left-handed alpha-helical regions. By using theoretical calculations, such conformations are shown to have highest solvent accessibility in a system of two-linked peptide units with glycyl residue at the central C(alpha) atom. This finding is consistent with the observations from 250 nonhomologous protein structures where glycyl residues with conformations close to (±90 degrees, 0 degrees) are seen to have high solvent accessibility. Analysis of a subset of nonhomologous structures with very high resolution (1.5 Å or better) shows that water molecules are indeed present at distances suitable for hydrogen bond interaction with glycyl residues possessing conformations close to (±90 degrees, 0 degrees). It is suggested that water molecules play a key role in determining and stabilizing these conformations of glycyl residues and explain the aberration in the symmetry of glycyl conformations in proteins.

8.
The ability to predict local structural features of a protein from the primary sequence is of paramount importance for unraveling its function in the absence of experimental structural information. Two main factors affect the utility of potential prediction tools: their accuracy must enable extraction of reliable structural information on the proteins of interest, and their runtime must be low to keep pace with sequencing data being generated at a constantly increasing speed. Here, we present NetSurfP-2.0, a novel tool that can predict the most important local structural features with unprecedented accuracy and runtime. NetSurfP-2.0 is sequence-based and uses an architecture composed of convolutional and long short-term memory neural networks trained on solved protein structures. Using a single integrated model, NetSurfP-2.0 predicts solvent accessibility, secondary structure, structural disorder, and backbone dihedral angles for each residue of the input sequences. We assessed the accuracy of NetSurfP-2.0 on several independent test datasets and found it to consistently produce state-of-the-art predictions for each of its output features. We observe a correlation of 80% between predictions and experimental data for solvent accessibility, and a precision of 85% on secondary structure 3-class predictions. In addition to improved accuracy, the processing time has been optimized to allow predicting more than 1000 proteins in less than 2 hours, and complete proteomes in less than 1 day.

9.

Background

Protein sequence profile-profile alignment is an important approach to recognizing remote homologs and generating accurate pairwise alignments. It plays an important role in protein sequence database search, protein structure prediction, protein function prediction, and phylogenetic analysis.

Results

In this work, we integrate predicted solvent accessibility, torsion angles and evolutionary residue coupling information with the pairwise Hidden Markov Model (HMM) based profile alignment method to improve profile-profile alignments. The evaluation results demonstrate that adding predicted relative solvent accessibility and torsion angle information improves the accuracy of profile-profile alignments. The evolutionary residue coupling information is helpful in some cases, but its contribution to the improvement is not consistent.

Conclusion

Incorporating the new structural information such as predicted solvent accessibility and torsion angles into the profile-profile alignment is a useful way to improve pairwise profile-profile alignment methods.

10.
Beta-breakers: an aperiodic secondary structure
We have studied the architecture of parallel beta-sheets in proteins and focused on the residues that initiate and terminate the beta-strands. These beta-breaker residues are at the origin of the kink between the beta-strand and the turn that precedes or follows it. beta-Breakers can be located automatically using a consensus approach based on algorithmic secondary structure assignment, solvent accessibility and backbone dihedral angles. These beta-breakers are conformationally homogeneous with respect to side-chain solvent accessibility and backbone dihedral angle profile. A sequence-structure correlation is noted: a restricted subset of amino acids is observed at these positions. Analysis of homologous protein sequences shows that these residues are more highly conserved than other residues in the loop. We conclude that beta-breakers are the structural analogs of the N and C-terminal caps of alpha-helices. The identification of this aperiodic substructure suggests a strategy for improving secondary structure prediction and may guide site-directed mutagenesis experiments.

11.
The conformational preferences of azaphenylalanine-containing peptides were investigated using a model compound, Ac-azaPhe-NHMe, with the ab initio method at the HF/3-21G and HF/6-31G* levels. Seven minimum energy conformations with trans orientation of the acetyl group and four minimum energy conformations with cis orientation of the acetyl group were found at the HF/6-31G* level if their mirror images were not considered. The average backbone dihedral angles of the 11 minimum energy conformations are phi = ±91 degrees ± 24 degrees, psi = ±18 degrees ± 10 degrees (or ±169 degrees ± 8 degrees), corresponding to the i+2 position of a beta-turn (delta(R)) or polyproline II (beta(P)) structure, respectively. The chi(1) angle in the aromatic side chain of the azaPhe residue preferentially adopts values between ±60 degrees and ±130 degrees, which reflects a steric hindrance between the N-terminal carbonyl group or the C-terminal amide group and the aromatic side chain with respect to the configuration of the acetyl group. These theoretically predicted conformational preferences of Ac-azaPhe-NHMe were compared with those of For-Phe-NHMe to characterize the structural role of the azaPhe residue. Four tripeptides containing the azaPhe residue, Boc-Xaa-azaPhe-Ala-OMe [Xaa=Gly(1), Ala(2), Phe(3), Asn(4)], were designed and synthesized to verify whether the backbone torsion angles of the azaPhe residue remain the same as in the theoretical conformations and how the amino acid preceding the azaPhe residue perturbs the beta-turn skeleton in solution. The solution conformations of these tripeptide models containing the azaPhe residue were determined in CDCl(3) and DMSO solvents using NMR and molecular modeling techniques. The characteristic NOE patterns, the temperature coefficients of amide protons and the small solvent accessibility reveal that azapeptides 1-4 adopt the beta-turn structure. The structures of azapeptides containing the azaPhe residue from a restrained molecular dynamics simulation indicated that the average dihedral angles [(phi(1), psi(1)), (phi(2), psi(2))] of the Xaa-azaPhe fragment in the azapeptide Boc-Xaa-azaPhe-Ala-OMe were [(-68 degrees, 135 degrees), (116 degrees, -1 degrees)]. This implies that the intercalation of an azaPhe residue in a tripeptide induces the betaII-turn conformation, and that a change in the volume of the amino acid preceding the azaPhe residue would not seriously perturb the backbone dihedral angles of the beta-turn conformation. We believe such information could be critical in designing useful molecules containing the azaPhe residue for drug discovery and peptide engineering.

12.
We present an in silico method to estimate the contribution of each residue in a protein to its overall stability using three database-derived statistical potentials that are based on inter-residue distances, backbone torsion angles and solvent accessibility, respectively. Residues that contribute very unfavorably to the folding free energy are defined as stability weaknesses, whereas residues that show a highly stabilizing contribution are called stability strengths. Strengths and/or weaknesses on residues that are in spatial contact are clustered into 3-dimensional (3D) stability patches. The identification and analysis of strength- and weakness-containing regions in a protein may reveal structural or functional characteristics, and/or interesting spots to introduce mutations. To illustrate the power of our method, we apply it to bovine seminal ribonuclease. This enzyme catalyzes the degradation of RNA strands, and has the peculiarity of undergoing 3D domain swapping in physiological conditions. The weaknesses and strengths were compared among the monomeric, dimeric and swapped dimeric forms. We identified weaknesses among the catalytic residues and a mixture of weaknesses and strengths among the substrate-binding residues in the three forms. In the regions involved in 3D swapping, we observed an accumulation of weaknesses in the monomer, which disappear in the dimer and especially in the swapped dimer. Moreover, monomeric homologous proteins were found to exhibit fewer weaknesses in these regions, whereas mutants known to favor unswapped dimerization appear stabilized in this form. Our method has several perspectives for functional annotation, rational prediction of targeted mutations, and mapping of stability changes upon conformational rearrangements. Proteins 2016; 84:143-158. © 2015 Wiley Periodicals, Inc.

13.
This paper describes the chemical synthesis and crystal molecular conformation of a non-chiral beta-Ala containing model peptide, Boc-beta-Ala-Acc5-OCH3. The analysis revealed the existence of two crystallographically independent molecules, A and B, in the asymmetric unit. Unexpectedly, while the magnitudes of the backbone torsion angles in both molecules are remarkably similar, the signs of the corresponding torsion angles are reversed, inclining us to suggest the existence of non-superimposable stereogeometrical features in a non-chiral one-component beta-Ala model system. The critical mu torsion angle around the CbetaH2-CalphaH2 bond of the beta-Ala residue represents a typical gauche orientation, i.e., mu = 67.7 degrees in A and mu = -61.2 degrees in B, providing the molecule an overall crescent-shaped topology. The observed conformation contrasts markedly with those determined for the correlated non-chiral model peptides Boc-beta-Ala-Acc6-OCH3 and Boc-beta-Ala-Aib-OCH3, signifying the role of stereocontrolling elements, since the stereochemically constrained Calpha,alpha-disubstituted glycyl residues (e.g., Acc5, Acc6, and the prototype Aib) are known to strongly restrict the peptide backbone conformations to the 3(10)/alpha-helical regions (phi approximately ±60 ± 20 degrees, psi approximately ±30 ± 20 degrees) of the Ramachandran map. Unpredictably, the preferred phi, psi torsion angles of the Acc5 residue fall outside the helical regions of the Ramachandran map and exhibit opposite-handed twists for A and B. The implications of the semi-extended conformation of the Acc5 residue in the construction of backbone-modified novel scaffolds and peptides of biological relevance are highlighted. Taken together, the results indicate that in short linear beta-Ala containing peptides specific structural changes can be induced by selective substitution of non-coded linear or cyclic symmetrically Calpha,alpha-disubstituted glycines, reinstating the hypothesis that, in addition to conformational restrictions, the chemical nature of the neighboring side-chain substituents and local environments collectively influences the stabilization of folding-unfolding behavior of the two methylene units of a beta-Ala residue.

14.
Recently, we demonstrated that yeast protein evolutionary rate at the level of individual amino acid residues scales linearly with degree of solvent accessibility. This residue-level structure-evolution relationship is sensitive to protein core size: surface residues from large-core proteins evolve much faster than those from small-core proteins, while buried residues are equally constrained independent of protein core size. In this work, we investigate the joint effects of protein core size and expression on the residue-level structure-evolution relationship. At the whole-protein level, protein expression is a much more dominant determinant of protein evolutionary rate than protein core size. In contrast, at the residue level, protein core size and expression both have major impacts on protein structure-evolution relationships. In addition, protein core size and expression influence residue-level structure-evolution relationships in qualitatively different ways. Protein core size preferentially affects the non-synonymous substitution rates of surface residues compared to buried residues, and has little influence on synonymous substitution rates. In comparison, protein expression uniformly affects all residues independent of degree of solvent accessibility, and affects both non-synonymous and synonymous substitution rates. Protein core size and expression exert largely independent effects on protein evolution at the residue level, and can combine to produce dramatic changes in the slope of the linear relationship between residue evolutionary rate and solvent accessibility. Our residue-level findings demonstrate that protein core size and expression are both important, yet qualitatively different, determinants of protein evolution. These results underscore the complementary nature of residue-level and whole-protein analysis of protein evolution.

15.
Xu Z  Zhang C  Liu S  Zhou Y 《Proteins》2006,63(4):961-966
Solvent accessibility, one of the key properties of amino acid residues in proteins, can be used to assist protein structure prediction. Various approaches such as neural networks, support vector machines, probability profiles, information theory, Bayesian theory, logistic functions, and multiple linear regression have been developed for solvent accessibility prediction. In this article, a much simpler quadratic programming method based on the buriability parameter set of amino acid residues is developed. The new method, called QBES (Quadratic programming and Buriability Energy function for Solvent accessibility prediction), is reasonably accurate for predicting the real value of solvent accessibility. By using a dataset of 30 proteins to optimize three parameters, the average correlation coefficients between the predicted and actual solvent accessibility are about 0.5 for all four independent test sets ranging from 126 to 513 proteins. The method is efficient. It takes only 20 min for a regular PC to obtain results for 30 proteins with an average length of 263 amino acids. Although the proposed method is less accurate than a few more sophisticated methods based on neural networks or support vector machines, this is the first attempt to predict solvent accessibility by energy optimization with constraints. Possible improvements and other applications of the method are discussed.

16.
The crystal structures of two oligopeptides containing di-n-propylglycine (Dpg) residues, Boc-Gly-Dpg-Gly-Leu-OMe (1) and Boc-Val-Ala-Leu-Dpg-Val-Ala-Leu-Val-Ala-Leu-Dpg-Val-Ala-Leu-OMe (2), are presented. Peptide 1 adopts a type I' beta-turn conformation with Dpg(2)-Gly(3) at the corner positions. The 14-residue peptide 2 crystallizes with two molecules in the asymmetric unit, both of which adopt alpha-helical conformations stabilized by 11 successive 5 --> 1 hydrogen bonds. In addition, a single 4 --> 1 hydrogen bond is also observed at the N-terminus. All five Dpg residues adopt backbone torsion angles (phi, psi) in the helical region of conformational space. Evaluation of the available structural data on Dpg peptides confirms the correlation between the backbone bond angle N-C(alpha)-C' (tau) and the observed backbone phi,psi values. For tau > 106 degrees, helices are observed, while fully extended structures are characterized by tau < 106 degrees. The mean tau values for extended and folded conformations for the Dpg residue are 103.6 degrees ± 1.7 degrees and 109.9 degrees ± 2.6 degrees, respectively.

17.

Background  

We describe Distill, a suite of servers for the prediction of protein structural features: secondary structure; relative solvent accessibility; contact density; backbone structural motifs; residue contact maps at 6, 8 and 12 Angstrom; coarse protein topology. The servers are based on large-scale ensembles of recursive neural networks and trained on large, up-to-date, non-redundant subsets of the Protein Data Bank. Together with structural feature predictions, Distill includes a server for prediction of Cα traces for short proteins (up to 200 amino acids).

18.
Wang JY  Ahmad S  Gromiha MM  Sarai A 《Biopolymers》2004,75(3):209-216
We developed dictionaries of two-, three-, and five-residue patterns in proteins and computed the average solvent accessibility of the central residues in their native proteins. These dictionaries serve as a look-up table for making subsequent predictions of solvent accessibility of amino acid residues. We find that predictions made in this way are very close to those made using more sophisticated methods of solvent accessibility prediction. We also analyzed the effect of immediate neighbors on the solvent accessibility of residues. This helps us in understanding how the same residue type may have different accessible surface areas in different proteins and in different positions of the same protein. We observe that certain residues have a tendency to increase or decrease the solvent accessibility of their neighboring residues in C- or N-terminal positions. Interestingly, the C-terminal and N-terminal neighbor residues are found to have asymmetric roles in modifying solvent accessibility of residues. As expected, similar neighbors enhance the hydrophobic or hydrophilic character of residues. Detailed look-up tables are provided on the web at www.netasa.org/look-up/.
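
The look-up idea described in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it handles odd-length windows (such as the three- and five-residue patterns), averages the accessibility of the central residue for each pattern seen in a training set, and returns `None` for patterns it has never seen. The function names, the data layout and the toy numbers are all hypothetical.

```python
from collections import defaultdict


def build_lookup(chains, window=3):
    """Average accessibility of the central residue for each residue pattern.

    chains: iterable of (sequence, accessibilities) pairs of equal length.
    window: odd pattern length (an assumption of this sketch).
    """
    if window % 2 == 0:
        raise ValueError("this sketch only handles odd window sizes")
    half = window // 2
    sums = defaultdict(float)
    counts = defaultdict(int)
    for sequence, accessibilities in chains:
        for i in range(half, len(sequence) - half):
            pattern = sequence[i - half:i + half + 1]
            sums[pattern] += accessibilities[i]
            counts[pattern] += 1
    return {pattern: sums[pattern] / counts[pattern] for pattern in sums}


def predict(sequence, lookup, window=3, default=None):
    """Look up each central residue's pattern; unseen patterns get `default`."""
    half = window // 2
    predictions = [default] * len(sequence)
    for i in range(half, len(sequence) - half):
        pattern = sequence[i - half:i + half + 1]
        predictions[i] = lookup.get(pattern, default)
    return predictions


# Toy training data: (sequence, per-residue accessibility), made up here.
training = [("AGAK", [10.0, 50.0, 20.0, 80.0]), ("AGA", [30.0, 70.0, 40.0])]
table = build_lookup(training)
print(predict("AGAK", table))
```

In this toy run the pattern `AGA` occurs twice, so its central residue gets the mean of the two observed values; terminal residues have no full window and are left at the default.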

19.
20.
Left-handed polyproline II helices (PPII) are contiguous elements of protein secondary structure in which the phi and psi angles of constituent residues are restricted to around -75 degrees and 145 degrees, respectively. They are important in structural proteins, in unfolded states and as ligands for signaling proteins. Here, we present a survey of 274 nonhomologous polypeptide chains from proteins of known structure for regions that form these structures. Such regions are rare, but the majority of proteins contain at least one PPII helix. Most PPII helices are shorter than five residues, although the longest found contained 12 amino acids. Proline predominates in PPII, but Gln and positively charged residues are also favored. The basis of Gln's prevalence is its ability to form an i, i + 1 side-chain to main-chain hydrogen bond with the backbone carbonyl oxygen of the proceeding residue; this helps to fix the psi angle of the Gln and the phi and psi of the proceeding residue in PPII conformations and explains why Gln is favored at the first position in a PPII helix. PPII helices are highly solvent exposed, which explains why apolar amino acids are disfavored despite preferring this region of phi/psi space when in isolation. PPII helices have perfect threefold rotational symmetry and within these structures we find significant correlation between the hydrophobicity of residues at i and i + 3; thus, PPII helices in globular proteins can be considered to be amphipathic.
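
The abstract above defines PPII helices as contiguous residues with phi near -75 degrees and psi near 145 degrees. A simple way to scan a chain for such stretches is sketched below in Python. The abstract does not state the tolerance or the minimum segment length used in the survey, so the 30-degree tolerance and the three-residue minimum here are assumptions chosen only for illustration, as are the function names and the sample angles.

```python
PPII_PHI = -75.0  # degrees, from the abstract
PPII_PSI = 145.0  # degrees, from the abstract


def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def is_ppii(phi, psi, tolerance=30.0):
    """True if (phi, psi) lies within `tolerance` of the PPII values.

    The tolerance is an assumed value, not one given in the abstract.
    """
    return (angular_difference(phi, PPII_PHI) <= tolerance
            and angular_difference(psi, PPII_PSI) <= tolerance)


def find_ppii_segments(angles, min_length=3, tolerance=30.0):
    """Return (start, end) index pairs of contiguous PPII-like residues."""
    segments = []
    start = None
    for i, (phi, psi) in enumerate(angles):
        if is_ppii(phi, psi, tolerance):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    # Close a segment that runs to the end of the chain.
    if start is not None and len(angles) - start >= min_length:
        segments.append((start, len(angles) - 1))
    return segments


# Hypothetical (phi, psi) pairs in degrees for a six-residue stretch.
chain = [(-60.0, -45.0), (-75.0, 145.0), (-70.0, 150.0),
         (-80.0, 140.0), (-120.0, 130.0), (-75.0, 145.0)]
print(find_ppii_segments(chain))
```

Tightening the tolerance or raising the minimum length makes the scan stricter; a real survey would also need a rule for residues with undefined angles at chain termini, which this sketch does not handle.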
