Similar literature
Found 20 similar articles; search time: 31 ms
1.
Support vector machine for predicting alpha-turn types   Total citations: 3, self-citations: 0, citations by others: 3
Cai YD, Feng KY, Li YX, Chou KC. Peptides 2003, 24(4): 629-630
Tight turns play an important role in globular proteins from both the structural and functional points of view. Of the tight turns, beta-turns and gamma-turns have been extensively studied, but alpha-turns have received little attention. Recently, a systematic search for alpha-turns classified them into nine different types according to their backbone trajectory features. In this paper, Support Vector Machines (SVMs), a new machine learning method, are proposed for predicting the alpha-turn types in proteins. The high rates of correct prediction imply that the formation of different alpha-turn types is evidently correlated with the sequence of a pentapeptide, and hence can be approximately predicted based on the sequence information of the pentapeptide alone, although incorporating its interaction with the other parts of a protein, the so-called "long distance interaction", will further improve the prediction quality.

2.
The number of gamma-turns in a representative protein dataset selected from the current Protein Data Bank has increased almost seven times during the past decade. Eighty percent of classic gamma-turns and 57% of inverse gamma-turns are associated as multiple turns with either another gamma-turn or a beta-turn. We refer to these as multiple turns of the (gammabeta)1,2,3 or (betagamma)1,2,3 type, depending upon whether the gamma-turn is before or after the beta-turn along the protein chain, respectively. However, for multiple turns involving only gamma-turns, we follow a nomenclature analogous to that proposed earlier for the multiple (or double) beta-turns. Fifty-eight percent of beta-turns are associated as multiple turns with another beta-turn. We extracted multiple turns from the protein dataset and classified them on the basis of individual gamma- or beta-turn types and the number of overlapping residues. Furthermore, we evaluated the amino acid positional potentials and determined the statistically significant amino acid preferences, hydrogen bond/side-chain interaction preferences in the multiple turns and secondary structure preferences for residues immediately flanking these turns. The results of our analysis should be useful in the modeling, prediction or design of multiple turns in proteins. The amino acid sequence corresponding to each multiple turn, its position in the protein chain, the PDB code/chain in which the multiple turn is present and the individual turn types constituting the multiple turn are available from our website, and this information will also be integrated in our Database of Structural Motifs in Proteins (http://www.cdfd.org.in/dsmp.html).

3.
Chou KC. Biopolymers 1997, 42(7): 837-853
Tight turns play an important role in globular proteins from both the structural and functional points of view. Of the tight turns, beta-turns and gamma-turns have been extensively studied, but alpha-turns have received little attention. Recently, a systematic search for alpha-turns was conducted by V. Pavone et al. [(1996) Biopolymers, Vol. 38, pp. 705-721] on 190 proteins (221 protein chains). They found 356 alpha-turns that were classified into nine different types according to their backbone trajectory features. In view of this new discovery, a sequence-coupled model based on Markov chain theory is proposed for predicting the alpha-turn types in proteins. The high rates of correct prediction by the resubstitution test and the jackknife test imply that the formation of different alpha-turn types is evidently correlated with the sequence of a pentapeptide, and hence can be approximately predicted based on the sequence information of the pentapeptide alone, although the role of its interaction with the other parts of a protein cannot be completely ignored. The algorithm presented here can also be used for predictions in which a distinction between alpha-turns and non-alpha-turns is required.

4.
The number of beta-turns in a representative set of 426 protein three-dimensional crystal structures selected from the recent Protein Data Bank has nearly doubled and the number of gamma-turns in a representative set of 320 proteins has increased over seven times since the previous analysis. Beta-turns (7153) and gamma-turns (911) extracted from these proteins were used to derive a revised set of type-dependent amino acid positional preferences and potentials. Compared with previous results, the preference for proline, methionine and tryptophan has increased and the preference for glutamine, valine, glutamic acid and alanine has decreased for beta-turns. Certain new amino acid preferences were observed for both turn types, and individual amino acids showed turn-type dependent positional preferences. The rationale for the new amino acid preferences is discussed in the light of hydrogen bonds and other interactions involving the turns. Where main-chain hydrogen bonds of the type NH(i + 3) --> CO(i) were not observed for some beta-turns, other main-chain hydrogen bonds or solvent interactions were observed that possibly stabilize such beta-turns. A number of unexpected isolated beta-turns with proline at the i + 2 position were also observed. The NH(i + 2) --> CO(i) hydrogen bond was observed for almost all gamma-turns. Nearly 20% of classic gamma-turns and 43% of inverse gamma-turns are isolated turns.
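A positional preference of the kind described above is often computed as the frequency of an amino acid at a given turn position divided by its background frequency. The sketch below assumes that definition; the function name, the toy turns and the toy background sequence are all hypothetical and are not taken from the paper.

```python
from collections import Counter

def positional_preference(turns, background):
    """For each turn position, map amino acid -> (frequency at that position
    among the turns) / (frequency in the background sequence).
    This ratio is an assumed definition, used here for illustration only."""
    bg_counts = Counter(background)
    bg_total = len(background)
    prefs = []
    for pos in range(len(turns[0])):
        pos_counts = Counter(turn[pos] for turn in turns)
        prefs.append({
            aa: (n / len(turns)) / (bg_counts[aa] / bg_total)
            for aa, n in pos_counts.items()
            if bg_counts[aa] > 0  # skip residues absent from the background
        })
    return prefs

# Toy data (hypothetical): three 4-residue turns and a 24-residue background.
toy_turns = ["DPDG", "NPSG", "DPNG"]
toy_background = "DPDGNPSGDPNG" + "A" * 12
prefs = positional_preference(toy_turns, toy_background)
```

A value above 1 means the residue is over-represented at that position relative to the background; in the toy data, proline at the second position is strongly favoured.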

5.
Due to the structural and functional importance of tight turns, some methods have been proposed to predict gamma-turns, beta-turns, and alpha-turns in proteins. In the past, studies of pi-turns were made, but not a single prediction approach has been developed so far. It will be useful to develop a method for identifying pi-turns in a protein sequence. In this paper, the support vector machine (SVM) method has been introduced to predict pi-turns from the amino acid sequence. The training and testing of this approach are performed with a newly collected data set of 640 non-homologous protein chains containing 1931 pi-turns. Different sequence encoding schemes have been explored in order to investigate their effects on the prediction performance. With multiple sequence alignment and predicted secondary structure, the final SVM model yields a Matthews correlation coefficient (MCC) of 0.556 by a 7-fold cross-validation. A web server implementing the prediction method is available at the following URL: http://210.42.106.80/piturn/.

6.
We observed that beta- and gamma-turns in protein structure may be associated as peptides representing combinations of turns that span between nine and 26 amino acid residues along the polypeptide backbone chain and often correspond to loops in the protein structure. Around 475 peptides resulted from the analysis of a non-redundant data set corresponding to 248 protein crystal structures selected from the Protein Data Bank. Nearly 40% of protein chains are associated with two or more peptides, and the peptides with nine and 10 amino acid residues are the most frequent. A maximum of four distinct peptides varying in number of amino acid residues were observed in at least 10 proteins along the same protein chain. Nearly 80% of the peptides comprise type IV beta-turns that are associated with irregular dihedral angle values, suggesting this may be important for the conformational diversity associated with the loops in proteins. In general, the predominant interactions that possibly stabilize these peptides involve main-chain and side-chain interactions with solvent, in addition to hydrogen bond, salt-bridge and non-bonded interactions. The majority of the peptides were observed in hydrolase, oxidoreductase, transferase, serine proteinase/inhibitor complex, electron transport/electron transfer and lyase proteins.

7.
In the present study, an attempt has been made to develop a method for predicting gamma-turns in proteins. First, we implemented the statistical and machine-learning techniques commonly used in the field of protein structure prediction for the prediction of gamma-turns. All the methods were trained and tested on a set of 320 nonhomologous protein chains by a fivefold cross-validation technique. It was observed that the performance of all methods is very poor, having a Matthews correlation coefficient (MCC) …

8.
Recently, two different models have been developed for predicting gamma-turns in proteins by Kaur and Raghava [2002. An evaluation of beta-turn prediction methods. Bioinformatics 18, 1508-1514; 2003. A neural-network based method for prediction of gamma-turns in proteins from multiple sequence alignment. Protein Sci. 12, 923-929]. However, the major limitation of the previous methods is their inability to predict gamma-turn types. Thus, there is a need to predict gamma-turn types using an approach that will be useful in overall tertiary structure prediction. In this work, the support vector machine (SVM), a powerful model, is proposed for predicting gamma-turn types in proteins. The high rates of prediction accuracy showed that the formation of gamma-turn types is evidently correlated with the sequence of tripeptides, and hence can be approximately predicted based on the sequence information of the tripeptides alone.

9.
We report the observation of continuous turns in proteins, which comprise individual gamma-turns or beta-turns or both that are situated immediately one after the other along the polypeptide chain. The continuous turns were identified from a representative data set of three-dimensional protein crystal structures. The gammabeta/betagamma, gammagamma and betabeta continuous turns represent peptides of varying amino acid residue lengths and conformations. The continuous turns frequently observed in proteins were: gammabeta, between a coil and a strand; betagamma, between a helix and a strand; gammagamma, between coils; and betabeta, either between a strand and a coil or between strands or coils. We determined the statistically significant amino acid residue preferences at individual positions in the turn, calculated amino acid positional potentials and analyzed main-chain hydrogen bonds and side-chain interactions likely to stabilize the continuous turns. The data on continuous turns have been integrated in the database of structural motifs in proteins (DSMP) on our web server (http://www.cdfd.org.in/dsmp.html). This is useful for making queries on sequences compatible with different continuous turns.

10.
MOTIVATION: Beta-turns play an important role from a structural and functional point of view. Beta-turns are the most common type of non-repetitive structure in proteins and comprise, on average, 25% of the residues. In the past, numerous methods have been developed to predict beta-turns in a protein. Most of these prediction methods are based on statistical approaches. In order to utilize the full potential of these methods, there is a need to develop a web server. RESULTS: This paper describes a web server called BetaTPred, developed for predicting beta-turns in a protein from its amino acid sequence. BetaTPred allows the user to predict turns in a protein using existing statistical algorithms. It also allows the prediction of different types of beta-turns, e.g. type I, I', II, II', VI, VIII and non-specific. This server assists users in predicting the consensus beta-turns in a protein. AVAILABILITY: The server is accessible from http://imtech.res.in/raghava/betatpred/

11.
The structure of the central repetitive domain of high molecular weight (HMW) wheat gluten proteins was characterized in solution and in the dry state using HMW proteins Bx6 and Bx7 and a subcloned, bacterially expressed part of the repetitive domain of HMW Dx5. Model studies of the HMW consensus peptides PGQGQQ and GYYPTSPQQ formed the basis for the data analysis (van Dijk AA et al., 1997, Protein Sci 6:637-648). In solution, the repetitive domain contained a continuous nonoverlapping series of both type I and type II beta-turns at positions predicted from the model studies; type II beta-turns occurred at QPGQ and QQGY sequences and type I beta-turns at YPTS and SPQQ. The subcloned part of the HMW Dx5 repetitive domain sometimes migrated as two bands on SDS-PAGE; we present evidence that this may be caused by a single amino acid insertion that disturbs the regular structure of beta-turns. The type I beta-turns are lost when the protein is dried on a solid surface, probably by conversion to type II beta-turns. The homogeneous type II beta-turn distribution is compatible with the formation of a beta-spiral structure, which provides the protein with elastic properties. The beta-turns and thus the beta-spiral are stabilized by hydrogen bonds within and between turns. Reformation of this hydrogen bonding network after, e.g., mechanical disruption may be important for the elastic properties of gluten proteins.

12.
Improved method for predicting beta-turn using support vector machine   Total citations: 2, self-citations: 0, citations by others: 2
MOTIVATION: Numerous methods for predicting beta-turns in proteins have been developed based on various computational schemes. Here, we introduce a new method of beta-turn prediction that uses the support vector machine (SVM) algorithm together with predicted secondary structure information. Various parameters from the SVM have been adjusted to achieve optimal prediction performance. RESULTS: The SVM method achieved excellent performance as measured by the Matthews correlation coefficient (MCC = 0.45) using a 7-fold cross validation on a database of 426 non-homologous protein chains. To the best of our knowledge, this MCC value is the highest achieved so far for predicting beta-turns. The overall prediction accuracy Qtotal was 77.3%, which is the best among the existing prediction methods. Among its attractive features, the present SVM method avoids overtraining, compresses information and provides a predicted reliability index.
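Both measures quoted above can be computed from the four counts of a binary confusion matrix (turn versus non-turn residues). The sketch below uses the standard MCC formula and takes Qtotal to be the percentage of correctly predicted residues; the function names and the example counts are illustrative and do not come from the paper.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

def q_total(tp, tn, fp, fn):
    """Percentage of residues whose state was predicted correctly."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts: 40 true positives, 40 true negatives, 10 of each error.
example_mcc = mcc(40, 40, 10, 10)
example_q = q_total(40, 40, 10, 10)
```

Because turn residues are a minority class, Qtotal can look high even for a predictor that rarely predicts turns, which is why MCC is usually reported alongside it.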

13.
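γ-turns are the second most abundant type of turn, accounting for about 3.4% of all protein structure. γ-turns contribute to the formation of globular structures and help the peptide chain change its folding direction, so it is all the more necessary to study methods for predicting γ-turns and thereby improve the accuracy of protein secondary structure prediction. Over recent decades, γ-turn prediction methods have become increasingly mature and their accuracy has steadily improved. This article reviews recent progress in γ-turn research, including the research methods used and the prediction accuracy achieved.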

14.
SUMMARY: The database of structural motifs in proteins (DSMP) contains data relevant to helices, beta-turns, gamma-turns, beta-hairpins, psi-loops, beta-alpha-beta motifs, beta-sheets, beta-strands and disulphide bridges extracted from all proteins in the Protein Data Bank, primarily using the PROMOTIF program, and is implemented as a web-based network service using the SRS. The data corresponding to the structural motifs include: sequence, position in the polypeptide chain, geometry, type, unique code, keywords and resolution of the crystal structure. These data are available for a representative data set of 1028 protein chains and also for all 10 213 proteins in the Protein Data Bank. The three-dimensional coordinates for all structural motifs (except sheet and disulphide bridge) are also available for the representative data set. Using features in SRS, DSMP can be queried to extract information from one or more structural motifs that may be useful for sequence-structure analysis, prediction, modelling or design. AVAILABILITY: http://www.cdfd.org.in/dsmp.html

15.
Gamma-turns occur as one of two possible enantiomers with regard to their main-chain structure, called classic and inverse. Of these, inverse ones are more common. Unlike other hydrogen bonds, those in inverse gamma-turns include a large proportion that are weak. If such hydrogen bonds are included, these turns may be said to be abundant in proteins. A significant number of inverse gamma-turns, usually weak ones, exist as consecutive turns in a structural feature, now called the 2.2(7)-helix, proposed for polypeptides as long ago as 1943 by Huggins. Most of these features occur within strands of beta-sheet. The less-weak inverse gamma-turns fall into several structural subgroups. They are frequently situated directly at either end of alpha-helices or of strands of beta-sheet, or adjacent to certain loop motifs. In general, they are well conserved during evolution and some are found at key positions in proteins. One occurs in the first hypervariable loop in the heavy chain of immunoglobulins.

16.
Amino acid composition, Fourier transform analysis and secondary structure prediction methods strongly support a tripartite structure for Drosophila chorion proteins s36 and s38. Each protein consists of a central domain and two flanking 'arms'. The central domain contains tandemly repetitive peptides, which apparently generate a secondary structure of beta-sheet strands alternating with beta-turns, most probably forming a twisted beta-pleated sheet or beta-barrel. The central domains of s36 and s38 share similarities, but they are recognizably different. The flanking 'arms', with different primary and secondary structure features, presumably serve protein-specific functions. The possible roles of the protein domains in the establishment of higher order structure in Drosophila chorion and the possible function of the molecules are discussed. The predicted secondary structure of Drosophila chorion proteins s36 and s38 is supported by experimental information obtained from Fourier transform infrared spectroscopic studies of Drosophila chorions.

17.
Analysis and prediction of the different types of beta-turn in proteins   Total citations: 30, self-citations: 0, citations by others: 30
Beta-turns have been extracted from 59 non-identical proteins (resolution 2 Å) using the standard criterion that the distance between C-alpha(i) and C-alpha(i + 3) is less than 7 Å (1 Å = 0.1 nm). The beta-turns have been classified, using phi, psi angles, into seven conventional turn types (I, I', II, II', IV, VIa, VIb) and a new class of beta-turn, designated type VIII, in which the central residues (i + 1, i + 2) adopt an alpha R beta conformation. Most beta-turn types are found in various topological environments, with the exception of I' and II' beta-turns, where 83% and 50%, respectively, are found in beta-hairpins. Sufficient data have been gathered to enable, for the first time, the separate statistical analysis of type I and II beta-turns. The two turn types have been shown to be strikingly different in their sequence preferences. Type I turns favour Asp, Asn, Ser and Cys at i; Asp, Ser, Thr and Pro at i + 1; Asp, Ser, Asn and Arg at i + 2; Gly, Trp and Met at i + 3, whilst type II turns prefer Pro at i + 1; Gly and Asn at i + 2; Gln and Arg at i + 3. These preferences have been explained by the specific side-chain interactions observed within the X-ray structures. The positional trends for type I and II beta-turns have been incorporated into the simple empirical predictive algorithm originally developed by P.N. Lewis et al. The program has improved the positional prediction of beta-turns, and has enhanced and extended the method by predicting the type of beta-turn. Since the observed preferences reflect local interactions, these predictions are applicable not only to proteins, but also to peptides, many of which are thought to contain beta-turns.
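The distance criterion quoted above can be checked directly on a list of Cα coordinates. The following sketch applies only that distance test; any further conditions used in practice are omitted, and the function name and the toy coordinates are hypothetical.

```python
import math

def find_turn_candidates(ca_coords, cutoff=7.0):
    """Return every start index i for which the distance between
    Cα(i) and Cα(i + 3) is below `cutoff` (in Å)."""
    hits = []
    for i in range(len(ca_coords) - 3):
        if math.dist(ca_coords[i], ca_coords[i + 3]) < cutoff:
            hits.append(i)
    return hits

# Toy coordinates in Å (hypothetical): the chain folds back on itself at residue 0.
folded = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0),
          (0.0, 3.8, 0.0), (-3.8, 3.8, 0.0)]
# A fully extended toy chain with 3.8 Å spacing along the x axis.
extended = [(3.8 * k, 0.0, 0.0) for k in range(6)]

folded_hits = find_turn_candidates(folded)
extended_hits = find_turn_candidates(extended)
```

Each returned index marks a four-residue window (i to i + 3) that would then be assigned a turn type from the phi, psi angles of its central residues.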

18.
Numerous studies have been performed on the analysis and prediction of β-turns in proteins. This study focuses on analyzing, predicting, and designing β-turns in order to understand the preference of amino acids in β-turn formation. We analyzed around 20,000 PDB chains to understand the preference of residues or pairs of residues at different positions in β-turns. Based on the results, a propensity-based method has been developed for predicting β-turns with an accuracy of 82%. We introduced a new approach entitled "Turn level prediction method," which predicts the complete β-turn rather than focusing on the residues in a β-turn. Finally, we developed BetaTPred3, a random forest-based method for predicting β-turns that utilizes various features of the four residues present in β-turns. BetaTPred3 achieved an accuracy of 79% with an MCC of 0.51, which is comparable to or better than existing methods on the BT426 dataset. Additionally, models were developed to predict β-turn types with better performance than other methods available in the literature. In order to improve the quality of turn prediction, we developed prediction models on a large, up-to-date dataset of 6376 nonredundant protein chains. Based on this study, a web server has been developed for the prediction of β-turns and their types in proteins. This web server also predicts the minimum number of mutations required to initiate or break a β-turn at a specified location in a protein. Proteins 2015; 83:910–921. © 2015 Wiley Periodicals, Inc.

19.
Knowledge of structural class plays an important role in understanding protein folding patterns. In this study, a simple and powerful computational method, which combines a support vector machine with PSI-BLAST profiles, is proposed to predict protein structural class for low-similarity sequences. The evolutionary information encoded in the PSI-BLAST profiles is converted into a series of fixed-length feature vectors by extracting amino acid composition and dipeptide composition from the profiles. The resulting vectors are then fed to a support vector machine classifier for the prediction of protein structural class. To evaluate the performance of the proposed method, jackknife cross-validation tests are performed on two widely used benchmark datasets, 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins), with sequence similarity lower than 40% and 25%, respectively. The overall accuracies reach 70.7% and 72.9% for the 1189 and 25PDB datasets, respectively. Comparison of our results with other methods shows that our method is very promising for predicting protein structural class, particularly for low-similarity datasets, and may at least play an important complementary role to existing methods.
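Amino acid composition and dipeptide composition turn a sequence of any length into a fixed-length vector of 20 + 400 = 420 values. The sketch below is a simplification: it computes both compositions from the raw sequence rather than from a PSI-BLAST profile as in the abstract, and the function name and the feature ordering are assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, alphabetical by code

def composition_features(seq):
    """Return 20 amino acid frequencies followed by 400 dipeptide frequencies.
    Assumes `seq` has at least two residues."""
    aac = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]
    # Overlapping adjacent pairs: residues (0,1), (1,2), ...
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    dpc = [pairs.count(a + b) / len(pairs)
           for a, b in product(AMINO_ACIDS, repeat=2)]
    return aac + dpc

# Toy sequence (hypothetical); its pairs are AC, CA, AC.
features = composition_features("ACAC")
```

Because the vector length does not depend on the sequence length, vectors of this kind can be passed directly to a classifier such as a support vector machine.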

20.
We predicted gamma-turns from amino acid sequences using first-order Markov chain theory and enlarged representative data sets corresponding to protein chains selected from the Protein Data Bank (PDB). The following data sets were used for training and deriving the probability values: (1) an initial data set containing 315 protein chains comprising 904 gamma-turns and (2) a later data set, compiled in order to include new entries in the PDB, containing 434 protein chains and comprising 1053 gamma-turns. By excluding 93 protein chains that were common to these two training data sets, we generated two mutually exclusive data sets containing 222 and 341 protein chains for testing our predictions. Applying amino acid probability values derived from the training data sets to the testing data sets yielded overall prediction accuracies in the range 54-57%. We recommend the use of probability values derived from the data set comprising 315 protein chains, which represents more gamma-turns and also provides better predictions.
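A first-order Markov chain scores a peptide as the probability of its first residue multiplied by the probability of each residue given the one before it. The sketch below trains one such chain on turn peptides and one on non-turn peptides and predicts whichever gives the higher likelihood. The add-one smoothing, the likelihood comparison rule, the function names and the toy tripeptides are all assumptions for illustration, not the procedure of the paper.

```python
import math
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # peptides must use only these 20 letters

def train_markov(peptides):
    """Estimate initial and first-order transition probabilities,
    with add-one smoothing so that unseen events keep a non-zero probability."""
    init = defaultdict(int)
    trans = defaultdict(int)
    for p in peptides:
        init[p[0]] += 1
        for a, b in zip(p, p[1:]):
            trans[(a, b)] += 1
    n = len(AMINO_ACIDS)
    init_p = {a: (init[a] + 1) / (len(peptides) + n) for a in AMINO_ACIDS}
    trans_p = {}
    for a in AMINO_ACIDS:
        row_total = sum(trans[(a, b)] for b in AMINO_ACIDS)
        for b in AMINO_ACIDS:
            trans_p[(a, b)] = (trans[(a, b)] + 1) / (row_total + n)
    return init_p, trans_p

def log_prob(peptide, model):
    """Log-probability of a peptide under a trained chain."""
    init_p, trans_p = model
    lp = math.log(init_p[peptide[0]])
    for a, b in zip(peptide, peptide[1:]):
        lp += math.log(trans_p[(a, b)])
    return lp

def predict_turn(peptide, turn_model, nonturn_model):
    """True if the turn chain explains the peptide better than the non-turn chain."""
    return log_prob(peptide, turn_model) > log_prob(peptide, nonturn_model)

# Toy training tripeptides (hypothetical).
turn_model = train_markov(["GDG", "GNG", "GDG"])
nonturn_model = train_markov(["AVL", "LVA", "AVL"])
```

With these toy models, `predict_turn("GDG", turn_model, nonturn_model)` favours the turn chain and `predict_turn("AVL", turn_model, nonturn_model)` favours the non-turn chain.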
