Found 20 similar articles (search time: 15 ms)
1.
Chen J Chaudhari N 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2007,4(4):572-582
Protein secondary structure (PSS) prediction is an important topic in bioinformatics. Our study on a large set of non-homologous proteins shows that long-range interactions commonly exist and negatively affect PSS prediction. We also reveal strong correlations between secondary structure (SS) elements. To take into account the long-range interactions and SS-SS correlations, we propose a novel prediction system based on a cascaded bidirectional recurrent neural network (BRNN). We compare the cascaded BRNN against two other BRNN architectures, namely the original BRNN architecture used for speech recognition and Pollastri's BRNN, which was proposed for PSS prediction. Our cascaded BRNN achieves an overall three-state accuracy Q3 of 74.38% and reaches a high Segment OVerlap (SOV) of 66.0455. It outperforms the original BRNN and Pollastri's BRNN in both Q3 and SOV. Specifically, it improves the SOV score by 4-6%.
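The Q3 measure quoted in several of these abstracts is the fraction of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. A minimal sketch of that calculation follows; the function name `q3_accuracy` and the example strings are illustrative assumptions, not taken from any of the listed papers.

```python
def q3_accuracy(observed: str, predicted: str) -> float:
    """Fraction of residues whose three-state label (H, E, C) matches.

    Both arguments are strings with one label per residue.
    The name and interface are hypothetical, chosen for illustration.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted state equals the observed state.
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


# Made-up example: one of ten residues is mispredicted.
example_q3 = q3_accuracy("CCHHHHEEEC", "CCHHHCEEEC")
```

Q3 is a per-residue score, which is why the abstracts report the segment-based SOV score alongside it.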
2.
A simple and fast secondary structure prediction method using hidden neural networks (cited 5 times: 0 self-citations, 5 by others)
MOTIVATION: In this paper, we present a secondary structure prediction method, YASPIN, that, unlike the current state-of-the-art methods, utilizes a single neural network for predicting the secondary structure elements in a 7-state local structure scheme and then optimizes the output using a hidden Markov model, which provides more information for the prediction. RESULTS: YASPIN was compared with the current top-performing secondary structure prediction methods, such as PHDpsi, PROFsec, SSPro2, JNET and PSIPRED. The overall prediction accuracy on the independent EVA5 sequence set is comparable with that of the top performers, according to the Q3, SOV and Matthews correlation accuracy measures. YASPIN shows the highest accuracy in terms of Q3 and SOV scores for strand prediction. AVAILABILITY: YASPIN is available on-line at the Centre for Integrative Bioinformatics website (http://ibivu.cs.vu.nl/programs/yaspinwww/) at the Vrije University in Amsterdam and will soon be mirrored on the Mathematical Biology website (http://www.mathbio.nimr.mrc.ac.uk) at the NIMR in London. CONTACT: kxlin@nimr.mrc.ac.uk
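The Matthews correlation mentioned above is usually computed separately for each secondary structure state, treating that state as the positive class. The sketch below uses the standard definition; the function name is hypothetical, and the abstract does not describe the exact evaluation protocol used for YASPIN.

```python
import math


def matthews_correlation(observed: str, predicted: str, state: str) -> float:
    """Matthews correlation coefficient for one state (for example "H").

    Hypothetical helper for illustration; returns 0.0 when the
    coefficient is undefined (zero denominator), which is a convention
    assumed here rather than one taken from the abstract.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    tp = fp = tn = fn = 0
    for o, p in zip(observed, predicted):
        if p == state and o == state:
            tp += 1  # state predicted and observed
        elif p == state:
            fp += 1  # state predicted but not observed
        elif o == state:
            fn += 1  # state observed but not predicted
        else:
            tn += 1  # state neither predicted nor observed
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

Unlike Q3, this coefficient penalizes both over- and under-prediction of a state, so it stays informative when one state (typically coil) dominates the data.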
3.
The importance of larger data sets for protein secondary structure prediction with neural networks.
J. M. Chandonia M. Karplus 《Protein science : a publication of the Protein Society》1996,5(4):768-774
A neural network algorithm is applied to secondary structure and structural class prediction for a database of 318 nonhomologous protein chains. Significant improvement in accuracy is obtained as compared with performance on smaller databases. A systematic study of the effects of network topology shows that, for the larger database, better results are obtained with more units in the hidden layer. In a 32-fold cross validated test, secondary structure prediction accuracy is 67.0%, relative to 62.6% obtained previously, without any evolutionary information on the sequence. Introduction of sequence profiles increases this value to 72.9%, suggesting that the two types of information are essentially independent. Tertiary structural class is predicted with 80.2% accuracy, relative to 73.9% obtained previously. The use of a larger database is facilitated by the introduction of a scaled conjugate gradient algorithm for optimizing the neural network. This algorithm is about 10-20 times as fast as the standard steepest descent algorithm.
4.
Protein secondary structure prediction using three neural networks and a segmental semi Markov model
Prediction of protein secondary structure is an important step towards elucidating its three dimensional structure and its function. This is a challenging problem in bioinformatics. Segmental semi Markov models (SSMMs) are one of the best studied methods in this field. However, incorporating evolutionary information into these methods is somewhat difficult. On the other hand, systems of multiple neural networks (NNs) are powerful tools for multi-class pattern classification which can easily be applied to take these sorts of information into account. To overcome the weakness of SSMMs in prediction, in this work we consider an SSMM as a decision function on outputs of three NNs that use multiple sequence alignment profiles. We consider four types of observations for outputs of a neural network. The profile table related to each sequence is then reduced to a sequence of four observations. In order to predict the secondary structure of each amino acid we need to consider a decision function. We use an SSMM on outputs of three neural networks. The proposed SSMM has discriminative power and weights over different dependency models for outputs of neural networks. The results show that the accuracy of our model in predictions, particularly for strands, is considerably increased.
5.
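Application of neural networks in protein secondary structure prediction (cited 3 times: 0 self-citations, 3 by others)
This paper introduces the significance of research on protein secondary structure prediction, discusses neural network design issues for protein secondary structure prediction, reviews in some detail the main progress made in recent years in applying neural network methods to protein secondary structure prediction, and looks ahead to the prospects for protein structure prediction.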
6.
The back-propagation neural network algorithm is a commonly used method for predicting the secondary structure of proteins. Whilst popular, this method can be slow to learn and here we compare it with an alternative: the cascade-correlation architecture. Using a constructive algorithm, cascade-correlation achieves predictive accuracies comparable to those obtained by back-propagation, in shorter time.
7.
Background
An artificial neural network approach was chosen to model the outcome of the complex signaling pathways in the gastro-intestinal tract and other peripheral organs that eventually produce the satiety feeling in the brain upon feeding.
Methods
A multilayer feed-forward neural network was trained with sets of experimental data relating concentration-time courses of plasma satiety hormones to Visual Analog Scales (VAS) scores. The network successfully predicted VAS responses from sets of satiety hormone data obtained in experiments using different food compositions.
Results
The correlation coefficients for the predicted VAS responses for test sets having i) a full set of three satiety hormones, ii) a set of only two satiety hormones, and iii) a set of only one satiety hormone were 0.96, 0.96, and 0.89, respectively. The predicted VAS responses discriminated the satiety effects of high satiating food types from less satiating food types both in orally fed and ileal infused forms.
Conclusions
From this application of artificial neural networks, one may conclude that neural network models are very suitable to describe situations where behavior is complex and incompletely understood. However, training data sets that fit the experimental conditions need to be available.
8.
Compared with the protein 3-class secondary structure (SS) prediction, the 8-class prediction gains less attention and is also much more challenging, especially for proteins with few sequence homologs. This paper presents a new probabilistic method for 8-class SS prediction using conditional neural fields (CNFs), a recently invented probabilistic graphical model. This CNF method not only models the complex relationship between sequence features and SS, but also exploits the interdependency among SS types of adjacent residues. In addition to sequence profiles, our method also makes use of non-evolutionary information for SS prediction. Tested on the CB513 and RS126 data sets, our method achieves Q8 accuracy of 64.9 and 64.7%, respectively, which are much better than the SSpro8 web server (51.0 and 48.0%, respectively). Our method can also be used to predict other structure properties (e.g. solvent accessibility) of a protein or the SS of RNA.
9.
Secondary structure predictions are increasingly becoming the workhorse for several methods aiming at predicting protein structure and function. Here we use ensembles of bidirectional recurrent neural network architectures, PSI-BLAST-derived profiles, and a large nonredundant training set to derive two new predictors: (a) the second version of the SSpro program for secondary structure classification into three categories and (b) the first version of the SSpro8 program for secondary structure classification into the eight classes produced by the DSSP program. We describe the results of three different test sets on which SSpro achieved a sustained performance of about 78% correct prediction. We report confusion matrices, compare PSI-BLAST to BLAST-derived profiles, and assess the corresponding performance improvements. SSpro and SSpro8 are implemented as web servers, available together with other structural feature predictors at: http://promoter.ics.uci.edu/BRNN-PRED/.
10.
Improvements in protein secondary structure prediction by an enhanced neural network (cited 47 times: 0 self-citations, 47 by others)
Computational neural networks have recently been used to predict the mapping between protein sequence and secondary structure. They have proven adequate for determining the first-order dependence between these two sets, but have, until now, been unable to garner higher-order information that helps determine secondary structure. By adding neural network units that detect periodicities in the input sequence, we have modestly increased the secondary structure prediction accuracy. The use of tertiary structural class causes a marked increase in accuracy. The best case prediction was 79% for the class of all-alpha proteins. A scheme for employing neural networks to validate and refine structural hypotheses is proposed. The operational difficulties of applying a learning algorithm to a dataset where sequence heterogeneity is under-represented and where local and global effects are inadequately partitioned are discussed.
11.
Combining evolutionary information and neural networks to predict protein secondary structure (cited 1 time: 0 self-citations, 1 by others)
Using evolutionary information contained in multiple sequence alignments as input to neural networks, secondary structure can be predicted at significantly increased accuracy. Here, we extend our previous three-level system of neural networks by using additional input information derived from multiple alignments. Using a position-specific conservation weight as part of the input increases performance. Using the number of insertions and deletions reduces the tendency for overprediction and increases overall accuracy. Addition of the global amino acid content yields a further improvement, mainly in predicting structural class. The final network system has a sustained overall accuracy of 71.6% in a multiple cross-validation test on 126 unique protein chains. A test on a new set of 124 recently solved protein structures that have no significant sequence similarity to the learning set confirms the high level of accuracy. The average cross-validated accuracy for all 250 sequence-unique chains is above 72%. Using various data sets, the method is compared to alternative prediction methods, some of which also use multiple alignments: the performance advantage of the network system is at least 6 percentage points in three-state accuracy. In addition, the network estimates secondary structure content from multiple sequence alignments about as well as circular dichroism spectroscopy on a single protein and classifies 75% of the 250 proteins correctly into one of four protein structural classes. Of particular practical importance is the definition of a position-specific reliability index. For 40% of all residues the method has a sustained three-state accuracy of 88%, as high as the overall average for homology modelling. A further strength of the method is greatly increased accuracy in predicting the placement of secondary structure segments. © 1994 Wiley-Liss, Inc.
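To illustrate what "evolutionary information contained in multiple sequence alignments" can look like as network input, the sketch below turns an alignment into per-column residue frequencies. It is only a simplified illustration: the function name is hypothetical, ignoring gaps is an assumption, and the conservation weights, insertion/deletion counts and amino acid content described in the abstract are not modelled.

```python
from collections import Counter


def column_profile(alignment: list[str]) -> list[dict[str, float]]:
    """Per-column residue frequencies for an aligned set of sequences.

    Hypothetical helper: gaps ("-") are ignored when computing the
    frequencies, which is an assumption made for this sketch.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        # Collect the non-gap residues observed in this column.
        residues = [row[col] for row in alignment if row[col] != "-"]
        counts = Counter(residues)
        total = len(residues)
        # An all-gap column yields an empty frequency table.
        profile.append({aa: n / total for aa, n in counts.items()} if total else {})
    return profile


# Made-up three-sequence alignment.
example_profile = column_profile(["ACD", "ACE", "A-E"])
```

A fully conserved column gives a single residue with frequency 1.0, while a variable column spreads its frequencies over several residues, which is the signal a single sequence cannot provide.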
12.
Background
We present a novel method of protein fold decoy discrimination using machine learning, more specifically using neural networks. Here, decoy discrimination is represented as a machine learning problem, where neural networks are used to learn the native-like features of protein structures using a set of positive and negative training examples. A set of native protein structures provides the positive training examples, while negative training examples are simulated decoy structures obtained by reversing the sequences of native structures. Various features are extracted from the training dataset of positive and negative examples and used as inputs to the neural networks.
13.
14.
We describe a new classifier for protein secondary structure prediction that is formed by cascading together different types of classifiers using neural networks and linear discrimination. The new classifier achieves an accuracy of 76.7% (assessed by a rigorous full Jack-knife procedure) on a new nonredundant dataset of 496 nonhomologous sequences (obtained from G.J. Barton and J.A. Cuff). This database was especially designed to train and test protein secondary structure prediction methods, and it uses a more stringent definition of homologous sequence than in previous studies. We show that it is possible to design classifiers that can highly discriminate the three classes (H, E, C) with an accuracy of up to 78% for beta-strands, using only a local window and resampling techniques. This indicates that the importance of long-range interactions for the prediction of beta-strands has probably been overestimated previously.
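A "local window" classifier predicts the state of a residue from that residue and a fixed number of neighbours on each side. The abstract does not say how the window is encoded, so the sketch below assumes a plain one-hot encoding with a padding symbol for positions beyond the chain ends; all names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"                              # assumed symbol for off-chain positions
ALPHABET = AMINO_ACIDS + PAD           # 21 symbols in total


def window_features(sequence: str, center: int, half_width: int) -> list[int]:
    """One-hot encode the window of 2 * half_width + 1 residues around center.

    Hypothetical encoding for illustration. Non-standard residue codes
    (for example "X") raise ValueError from ALPHABET.index.
    """
    features: list[int] = []
    for pos in range(center - half_width, center + half_width + 1):
        # Positions outside the chain are represented by the padding symbol.
        symbol = sequence[pos] if 0 <= pos < len(sequence) else PAD
        one_hot = [0] * len(ALPHABET)
        one_hot[ALPHABET.index(symbol)] = 1
        features.extend(one_hot)
    return features


# Window of three positions centred on the first residue of a made-up chain.
example_features = window_features("ACD", 0, 1)
```

Each residue then yields a vector of (2 * half_width + 1) * 21 inputs, so the classifier sees only local sequence context and no long-range interactions.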
15.
If secondary structure predictions are to be incorporated into fold recognition methods, an assessment of the effect of specific types of errors in predicted secondary structures on the sensitivity of fold recognition should be carried out. Here, we present a systematic comparison of different secondary structure prediction methods by measuring frequencies of specific types of error. We carry out an evaluation of the effect of specific types of error on secondary structure element alignment (SSEA), a baseline fold recognition method. The results of this evaluation indicate that missing out whole helix or strand elements, or predicting the wrong type of element, is more detrimental than predicting the wrong lengths of elements or overpredicting helix or strand. We also suggest that SSEA scoring is an effective method for assessing accuracy of secondary structure prediction and perhaps may also provide a more appropriate assessment of the "usefulness" and quality of predicted secondary structure, if secondary structure alignments are to be used in fold recognition.
16.
17.
Protein secondary structure and homology by neural networks. The alpha-helices in rhodopsin (cited 7 times: 0 self-citations, 7 by others)
H Bohr J Bohr S Brunak R M Cotterill B Lautrup L Nørskov O H Olsen S B Petersen 《FEBS letters》1988,241(1-2):223-228
Neural networks provide a basis for semiempirical studies of pattern matching between the primary and secondary structures of proteins. Networks of the perceptron class have been trained to classify the amino-acid residues into two categories for each of three types of secondary feature: alpha-helix or not, beta-sheet or not, and random coil or not. The explicit prediction for the helices in rhodopsin is compared with both electron microscopy results and those of the Chou-Fasman method. A new measure of homology between proteins is provided by the network approach, which thereby leads to quantification of the differences between the primary structures of proteins.
18.
Kay C Wiese Alain A Deschenes Andrew G Hendriks 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2008,5(1):25-41
This paper presents two in-depth studies on RnaPredict, an evolutionary algorithm for RNA secondary structure prediction. The first study is an analysis of the performance of two thermodynamic models, Individual Nearest Neighbor (INN) and Individual Nearest Neighbor Hydrogen Bond (INN-HB). The correlation between the free energy of predicted structures and the sensitivity is analyzed for 19 RNA sequences. Although some variance is shown, there is a clear trend between a lower free energy and an increase in true positive base pairs. With increasing sequence length, this correlation generally decreases. In the second experiment, the accuracy of the predicted structures for these 19 sequences is compared against the accuracy of the structures generated by the mfold dynamic programming algorithm (DPA) and also to known structures. RnaPredict is shown to outperform the minimum free energy structures produced by mfold and has comparable performance when compared to sub-optimal structures produced by mfold.
19.
Pan XM 《Proteins》2001,43(3):256-259
In the present work, a novel method was proposed for prediction of secondary structure. Over a database of 396 proteins (CB396) with a three-state-defining secondary structure, this method with jackknife procedure achieved an accuracy of 68.8% and SOV score of 71.4% using single sequence and an accuracy of 73.7% and SOV score of 77.3% using multiple sequence alignments. Combination of this method with DSC, PHD, PREDATOR, and NNSSP gives Q3 = 76.2% and SOV = 79.8%.