Similar articles
Found 20 similar articles (search time: 15 ms)
1.
A computational neural network model is used to predict the secondary structure of globular proteins of known sequence. In contrast to earlier work, some information about expected tertiary interactions was built into the neural network. As a result, prediction accuracy improved by 3% to 5%. Possible applications of this new approach are briefly discussed.

2.
Protein secondary structure (PSS) prediction is an important topic in bioinformatics. Our study on a large set of non-homologous proteins shows that long-range interactions commonly exist and negatively affect PSS prediction. We also reveal strong correlations between secondary structure (SS) elements. To take into account the long-range interactions and SS-SS correlations, we propose a novel prediction system based on a cascaded bidirectional recurrent neural network (BRNN). We compare the cascaded BRNN against two other BRNN architectures, namely the original BRNN architecture used for speech recognition and Pollastri's BRNN, which was proposed for PSS prediction. Our cascaded BRNN achieves an overall three-state accuracy Q3 of 74.38% and reaches a high Segment OVerlap (SOV) of 66.0455. It outperforms the original BRNN and Pollastri's BRNN in both Q3 and SOV. Specifically, it improves the SOV score by 4-6%.
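For readers unfamiliar with the three-state accuracy Q3 quoted in this abstract, the following is a minimal sketch of how such a per-residue score is commonly computed. It is not taken from the paper: the function name `q3_accuracy`, the H/E/C state letters, and the example strings are illustrative assumptions.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose predicted state matches the observed one.

    Both arguments are strings over a three-letter alphabet (assumed here to be
    H = helix, E = strand, C = coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the two assignments agree.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical 16-residue example (not data from any of the listed papers).
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHCCCCEEEEEC"
print(q3_accuracy(predicted, observed))
```

Q3 counts every residue independently, which is why segment-level measures such as SOV are usually reported alongside it.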

3.

Background  

Protein secondary structure prediction methods based on probabilistic models such as the hidden Markov model (HMM) appeal to many because they provide meaningful information relevant to the sequence-structure relationship. However, at present, the prediction accuracy of pure HMM-type methods is much lower than that of machine learning-based methods such as neural networks (NN) or support vector machines (SVM).

4.
In this paper we propose constructing an improved two-level neural network to predict protein secondary structure. First, in addition to the evolutionary information, we encode the composition of the whole protein as input to the first-level network. Second, we calculate a reliability score for each residue position from the output of the first-level network; the role of the second-level network is to let residues with a higher reliability score influence neighboring residues with a lower one, improving overall prediction accuracy. Third, because the target protein can be lost in the multiple sequence alignment, we propose to encode the single sequence as input to the second-level network. The experimental results show that the proposed method can effectively improve prediction accuracy.
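The abstract does not say how the per-residue reliability score is derived from the first-level output. As a rough illustration only, the sketch below assumes one common convention: the margin between the two largest output activations for each residue. The function name `reliability_scores` and the three-value (helix, strand, coil) output layout are assumptions, not the authors' definition.

```python
from typing import List, Sequence


def reliability_scores(outputs: Sequence[Sequence[float]]) -> List[float]:
    """Return one reliability score per residue.

    `outputs` holds, for each residue, the three first-level network
    activations (assumed order: helix, strand, coil). The score used here,
    the gap between the largest and second-largest activation, is an
    assumed convention and not the definition from the paper.
    """
    scores = []
    for activations in outputs:
        top, second = sorted(activations, reverse=True)[:2]
        scores.append(top - second)
    return scores


# Hypothetical activations for two residues: a confident call and an ambiguous one.
example = [(0.75, 0.25, 0.0), (0.5, 0.25, 0.25)]
print(reliability_scores(example))
```

Under this assumed convention, a second-level network could weight the contribution of each neighboring residue by its score, so that confident positions dominate ambiguous ones.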

5.
The back-propagation neural network algorithm is a commonly used method for predicting the secondary structure of proteins. Whilst popular, this method can be slow to learn and here we compare it with an alternative: the cascade-correlation architecture. Using a constructive algorithm, cascade-correlation achieves predictive accuracies comparable to those obtained by back-propagation, in shorter time.

6.
A feed-forward neural network has been employed for protein secondary structure prediction. Attempts were made to improve on previous prediction accuracies using a hierarchical mixture of experts (HME). In this method input data are clustered and used to train a series of different networks. Application of an HME to the prediction of protein secondary structure is shown to provide no advantages over a single network. We have also tried various new input representations, chosen to incorporate the effect of residues a long distance away in the one-dimensional amino acid chain. Prediction accuracy using these methods is comparable to that achieved by other neural networks (refs. 1–4).

7.
Matsuo K, Watanabe H, Gekko K. Proteins 2008, 73(1): 104-112
Synchrotron-radiation vacuum-ultraviolet circular dichroism (VUVCD) spectroscopy can significantly improve the predictive accuracy of the contents and segment numbers of protein secondary structures by extending the short-wavelength limit of the spectra. In the present study, we combined VUVCD spectra down to 160 nm with a neural-network (NN) method to improve the sequence-based prediction of protein secondary structures. The secondary structures of 30 target proteins (test set) were assigned into alpha-helices, beta-strands, and others by the DSSP program based on their X-ray crystal structures. Combining the alpha-helix and beta-strand contents estimated from the VUVCD spectra of the target proteins improved the overall sequence-based predictive accuracy Q(3) for three secondary-structure components from 59.5 to 60.7%. Incorporating the position-specific scoring matrix in the NN method improved the predictive accuracy from 70.9 to 72.1% when combining the secondary-structure contents, to 72.5% when combining the numbers of segments, and finally to 74.9% when filtering the VUVCD data. Improvement in the sequence-based prediction of secondary structures was also apparent in two other indices of the overall performance: the correlation coefficient (C) and the segment overlap value (SOV). These results suggest that VUVCD data could enhance the predictive accuracy to over 80% when combined with the currently best sequence-prediction algorithms, greatly expanding the applicability of VUVCD spectroscopy to protein structural biology.

8.
A priori knowledge of secondary structure content can be of great use in theoretical and experimental determination of protein structure. We present a method that uses two computer-simulated neural networks placed in "tandem" to predict the secondary structure content of water-soluble, globular proteins. The first of the two networks, NET1, predicts a protein's helix and strand content given information about the protein's amino acid composition, molecular weight and heme presence. Because NET1 contained more adjustable parameters (network weights) than learning examples, this network experienced problems with memorization, which is the inability to generalize onto new, never-seen-before examples. To overcome this problem, we designed a second network, NET2, which learned to determine when NET1 was in a state of generalization. Together, these two networks produce prediction errors as low as 5.0% and 5.6% for helix and strand content, respectively, on a set of protein crystal structures bearing little homology to those used in network training. A comparison between three other methods including a multiple linear regression analysis, a non-hidden-node network analysis and a secondary structure assignment analysis reveals that our tandem neural network scheme is, indeed, the best method for predicting secondary structure content. The results of our analysis suggest that the knowledge of sequence information is not necessary for highly accurate predictions of protein secondary structure content.

9.
In this study we present an accurate secondary structure prediction procedure by using a query and related sequences. The most novel aspect of our approach is its reliance on local pairwise alignment of the sequence to be predicted with each related sequence rather than utilization of a multiple alignment. The residue-by-residue accuracy of the method is 75% in three structural states after jack-knife tests. The gain in prediction accuracy compared with the existing techniques, which are at best 72%, is achieved by secondary structure propensities based on both local and long-range effects, utilization of similar sequence information in the form of carefully selected pairwise alignment fragments, and reliance on a large collection of known protein primary structures. The method is especially appropriate for large-scale sequence analysis efforts such as genome characterization, where precise and significant multiple sequence alignments are not available or achievable. Proteins 27:329–335, 1997. © 1997 Wiley-Liss, Inc.

10.
In this paper we present a novel framework for protein secondary structure prediction. First, we propose a novel parameterized semi-probability profile, which effectively combines single-sequence and evolutionary information. Second, different semi-probability profiles are each applied as network input to predict protein secondary structure, and the resulting predictions are compared. Finally, naïve Bayes approaches are used to combine these predictions in order to obtain better prediction performance than any individual prediction. The experimental results show that the proposed framework can indeed improve prediction accuracy.

11.
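Application of neural networks in protein secondary structure prediction   Cited by: 3 (self-citations: 0, citations by others: 3)
This review introduces the significance of research on protein secondary structure prediction, discusses neural network design issues for protein secondary structure prediction, reviews in some detail the main progress made in recent years in applying neural network methods to protein secondary structure prediction, and looks ahead to the prospects for protein structure prediction.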

12.
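At present, the prediction accuracy of protein secondary structure hovers around 75% and is difficult to improve further. In this paper, a redundant protein database is analyzed by statistical methods. The analysis shows that the real factor limiting further improvement of prediction accuracy is the systematic error of the protein database itself, which is approximately 25%. This error arises from objective limitations of the experimental conditions.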

13.
MOTIVATION: In many fields of pattern recognition, combination has proved efficient to increase the generalization performance of individual prediction methods. Numerous systems have been developed for protein secondary structure prediction, based on different principles. Finding better ensemble methods for this task may thus become crucial. Furthermore, efforts need to be made to help the biologist in the post-processing of the outputs. RESULTS: An ensemble method has been designed to post-process the outputs of discriminant models, in order to obtain an improvement in prediction accuracy while generating class posterior probability estimates. Experimental results establish that it can increase the recognition rate of protein secondary structure prediction methods that provide inhomogeneous scores, even though their individual prediction successes are largely different. This combination thus constitutes a help for the biologist, who can use it confidently on top of any set of prediction methods. Moreover, the resulting estimates can be used in various ways, for instance to determine which areas in the sequence are predicted with a given level of reliability. AVAILABILITY: The prediction is freely available over the Internet on the Network Protein Sequence Analysis (NPS@) WWW server at http://pbil.ibcp.fr/NPSA/npsa_server.html. The source code of the combiner can be obtained on request for academic use.

14.
A neural network algorithm is applied to secondary structure and structural class prediction for a database of 318 nonhomologous protein chains. Significant improvement in accuracy is obtained as compared with performance on smaller databases. A systematic study of the effects of network topology shows that, for the larger database, better results are obtained with more units in the hidden layer. In a 32-fold cross validated test, secondary structure prediction accuracy is 67.0%, relative to 62.6% obtained previously, without any evolutionary information on the sequence. Introduction of sequence profiles increases this value to 72.9%, suggesting that the two types of information are essentially independent. Tertiary structural class is predicted with 80.2% accuracy, relative to 73.9% obtained previously. The use of a larger database is facilitated by the introduction of a scaled conjugate gradient algorithm for optimizing the neural network. This algorithm is about 10-20 times as fast as the standard steepest descent algorithm.

15.
This paper proposes an efficient ensemble system to tackle the protein secondary structure prediction problem with neural networks as base classifiers. The experimental results show that the multi-layer system can lead to better results. When more accurate base classifiers are deployed, the ensemble system achieves higher accuracy.

16.
Hybrid system for protein secondary structure prediction.   Cited by: 13 (self-citations: 0, citations by others: 13)
We have developed a hybrid system to predict the secondary structures (alpha-helix, beta-sheet and coil) of proteins and achieved 66.4% accuracy, with correlation coefficients of C(coil) = 0.429, C alpha = 0.470 and C beta = 0.387. This system contains three subsystems ("experts"): a neural network module, a statistical module and a memory-based reasoning module. First, the three experts independently learn the mapping between amino acid sequences and secondary structures from the known protein structures, then a Combiner learns to combine automatically the outputs of the experts to make final predictions. The hybrid system was tested with 107 protein structures through k-way cross-validation. Its performance was better than each expert and all previously reported methods with greater than 0.99 statistical significance. It was observed that for 20% of the residues, all three experts produced the same but wrong predictions. This may suggest an upper bound on the accuracy of secondary structure predictions based on local information from the currently available protein structures, and indicate places where non-local interactions may play a dominant role in conformation. For 64% of the residues, at least two experts were the same and correct, which shows that the Combiner performed better than majority vote. For 77% of the residues, at least one expert was correct, thus there may still be room for improvement in this hybrid approach. Rigorous evaluation procedures were used in testing the hybrid system, and statistical significance measures were developed in analyzing the differences among different methods. When measured in terms of the number of secondary structures (rather than the number of residues) that were predicted correctly, the prediction produced by the hybrid system was also better than those of individual experts.
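The agreement statistics in this abstract (at least two experts correct, at least one expert correct) and the majority-vote baseline can be illustrated with a short sketch. This is not the authors' Combiner, which is a learned model; the function names, the tie-breaking rule, and the example strings below are assumptions made for illustration.

```python
from collections import Counter
from typing import Sequence


def majority_vote(expert_predictions: Sequence[str]) -> str:
    """Per-residue majority vote over equally long prediction strings.

    When no state wins a strict majority, the first expert's call is used
    (an arbitrary tie-breaking rule assumed for this sketch).
    """
    consensus = []
    for states in zip(*expert_predictions):
        state, count = Counter(states).most_common(1)[0]
        consensus.append(state if 2 * count > len(states) else states[0])
    return "".join(consensus)


def fraction_at_least_k_correct(
    expert_predictions: Sequence[str], observed: str, k: int
) -> float:
    """Fraction of residues for which at least k experts predict the observed state."""
    hits = 0
    for i, true_state in enumerate(observed):
        n_correct = sum(pred[i] == true_state for pred in expert_predictions)
        if n_correct >= k:
            hits += 1
    return hits / len(observed)


# Hypothetical four-residue example with three experts.
observed = "HHEC"
experts = ["HHEC", "HCEE", "CCEH"]
print(majority_vote(experts))
print(fraction_at_least_k_correct(experts, observed, 2))
print(fraction_at_least_k_correct(experts, observed, 1))
```

The fraction with at least two correct experts is the ceiling for a three-expert majority vote, while the fraction with at least one correct expert bounds what any combiner that only selects among the experts' outputs could reach.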

17.
Pan XM. Proteins 2001, 43(3): 256-259
In the present work, a novel method is proposed for the prediction of secondary structure. Over a database of 396 proteins (CB396) with three-state secondary structure assignments, this method with a jackknife procedure achieved an accuracy of 68.8% and an SOV score of 71.4% using a single sequence, and an accuracy of 73.7% and an SOV score of 77.3% using multiple sequence alignments. Combination of this method with DSC, PHD, PREDATOR, and NNSSP gives Q3 = 76.2% and SOV = 79.8%.

18.
19.
A pentapeptide-based method for protein secondary structure prediction   Cited by: 7 (self-citations: 0, citations by others: 7)
We present a new method for protein secondary structure prediction, based on the recognition of well-defined pentapeptides, in a large databank. Using a databank of 635 protein chains, we obtained a success rate of 68.6%. We show that progress is achieved when the databank is enlarged, when the 20 amino acids are adequately grouped in 10 sets and when more pentapeptides are attributed one of the defined conformations, alpha-helices or beta-strands. The analysis of the model indicates that the essential variable is the number of pentapeptides of well-defined structure in the database. Our model is simple, does not rely on arbitrary parameters and allows the analysis in detail of the results of each chosen hypothesis.

20.
GOR V server for protein secondary structure prediction   Cited by: 3 (self-citations: 0, citations by others: 3)
SUMMARY: We have created the GOR V web server for protein secondary structure prediction. The GOR V algorithm combines information theory, Bayesian statistics and evolutionary information. In its fifth version, the GOR method reached (with the full jack-knife procedure) an accuracy of prediction Q3 of 73.5%. Although GOR V has been among the most successful methods, its online unavailability has been a deterrent to its popularity. Here, we remedy this situation by creating the GOR V server.
