Found 20 similar articles (search time: 0 ms)
1.
The back-propagation neural network algorithm is a commonly used method for predicting the secondary structure of proteins. Whilst popular, this method can be slow to learn and here we compare it with an alternative: the cascade-correlation architecture. Using a constructive algorithm, cascade-correlation achieves predictive accuracies comparable to those obtained by back-propagation, in shorter time.
2.
Combining evolutionary information and neural networks to predict protein secondary structure (total citations: 1; self-citations: 0; citations by others: 1)
Using evolutionary information contained in multiple sequence alignments as input to neural networks, secondary structure can be predicted at significantly increased accuracy. Here, we extend our previous three-level system of neural networks by using additional input information derived from multiple alignments. Using a position-specific conservation weight as part of the input increases performance. Using the number of insertions and deletions reduces the tendency for overprediction and increases overall accuracy. Addition of the global amino acid content yields a further improvement, mainly in predicting structural class. The final network system has a sustained overall accuracy of 71.6% in a multiple cross-validation test on 126 unique protein chains. A test on a new set of 124 recently solved protein structures that have no significant sequence similarity to the learning set confirms the high level of accuracy. The average cross-validated accuracy for all 250 sequence-unique chains is above 72%. Using various data sets, the method is compared to alternative prediction methods, some of which also use multiple alignments: the performance advantage of the network system is at least 6 percentage points in three-state accuracy. In addition, the network estimates secondary structure content from multiple sequence alignments about as well as circular dichroism spectroscopy on a single protein and classifies 75% of the 250 proteins correctly into one of four protein structural classes. Of particular practical importance is the definition of a position-specific reliability index. For 40% of all residues the method has a sustained three-state accuracy of 88%, as high as the overall average for homology modelling. A further strength of the method is greatly increased accuracy in predicting the placement of secondary structure segments. © 1994 Wiley-Liss, Inc.
3.
Compared with the protein 3-class secondary structure (SS) prediction, the 8-class prediction gains less attention and is also much more challenging, especially for proteins with few sequence homologs. This paper presents a new probabilistic method for 8-class SS prediction using conditional neural fields (CNFs), a recently invented probabilistic graphical model. This CNF method not only models the complex relationship between sequence features and SS, but also exploits the interdependency among SS types of adjacent residues. In addition to sequence profiles, our method also makes use of non-evolutionary information for SS prediction. Tested on the CB513 and RS126 data sets, our method achieves Q8 accuracy of 64.9 and 64.7%, respectively, which are much better than the SSpro8 web server (51.0 and 48.0%, respectively). Our method can also be used to predict other structure properties (e.g. solvent accessibility) of a protein or the SS of RNA.
4.
A simple and fast secondary structure prediction method using hidden neural networks (total citations: 5; self-citations: 0; citations by others: 5)
MOTIVATION: In this paper, we present a secondary structure prediction method, YASPIN, that, unlike the current state-of-the-art methods, utilizes a single neural network for predicting the secondary structure elements in a 7-state local structure scheme and then optimizes the output using a hidden Markov model, which results in providing more information for the prediction. RESULTS: YASPIN was compared with the current top-performing secondary structure prediction methods, such as PHDpsi, PROFsec, SSPro2, JNET and PSIPRED. The overall prediction accuracy on the independent EVA5 sequence set is comparable with that of the top performers, according to the Q3, SOV and Matthews correlation accuracy measures. YASPIN shows the highest accuracy in terms of Q3 and SOV scores for strand prediction. AVAILABILITY: YASPIN is available on-line at the Centre for Integrative Bioinformatics website (http://ibivu.cs.vu.nl/programs/yaspinwww/) at the Vrije University in Amsterdam and will soon be mirrored on the Mathematical Biology website (http://www.mathbio.nimr.mrc.ac.uk) at the NIMR in London. CONTACT: kxlin@nimr.mrc.ac.uk
5.
It was recently found that some short peptides (including C- and S-peptide fragments of RNase A) can have considerable helicity in solution [1–12], which was considered surprising. Does the observed helicity require a new explanation, or is it consistent with previous understanding? In this work we show that this helicity is consistent with the physical theory of secondary structure [12–19] based on an extension of the conventional Zimm-Bragg model [20]. Without any special modifications, this theory explains reasonably well almost all the experimentally observed dependencies of helicity on pH, temperature, and amino acid replacements. We conclude that the observed “general level” of helicity of C- and S-peptides (5–30% at room temperature and 10–50% near 0°C) is “normal” for short peptides consisting mainly of helix-forming and helix-indifferent residues. The helicity is modified by a multitude of weak specific side chain interactions, many of which are taken into account by the present theory [13–19]; some discrepancies between the theory and experiment can be explained by weak side chain–side chain interactions that were neglected. The reasonable agreement of the theory with experiment suggests that it can be used to investigate the role of local interactions in the formation of α-helical “embryos” in unfolded protein chains.
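As a rough illustration of the kind of calculation behind the abstract above, the following sketch computes the mean helicity of a short homopolymer in the conventional nearest-neighbor Zimm-Bragg picture only, not the extended theory the authors use. The function name, the single propagation parameter `s`, the nucleation parameter `sigma`, and the numerical values are assumptions chosen for illustration; side chain interactions, pH and temperature dependence are not modeled.

```python
def zimm_bragg_helicity(n_residues, s, sigma):
    """Mean fraction of helical residues in a simplified Zimm-Bragg chain.

    Weights (an assumed, simplified convention): a coil residue contributes 1,
    a helical residue following a helical one contributes s, and a helical
    residue following a coil one contributes sigma * s.
    """
    # z_h, z_c: summed weights of all chains ending in helix / coil.
    # n_h, n_c: the same sums, each term multiplied by its helical residue count.
    z_h, z_c = 0.0, 1.0  # a virtual coil residue precedes the chain
    n_h, n_c = 0.0, 0.0
    for _ in range(n_residues):
        new_z_h = s * z_h + sigma * s * z_c
        new_n_h = s * (n_h + z_h) + sigma * s * (n_c + z_c)
        new_z_c = z_h + z_c
        new_n_c = n_h + n_c
        z_h, z_c, n_h, n_c = new_z_h, new_z_c, new_n_h, new_n_c
    partition_function = z_h + z_c
    return (n_h + n_c) / (n_residues * partition_function)


# Hypothetical parameters for a 13-residue peptide.
helicity = zimm_bragg_helicity(13, s=1.3, sigma=1e-3)
```

With `sigma = 1` the residues become independent and the helicity reduces to `s / (1 + s)`, which gives a quick sanity check on the recursion; a small `sigma` lowers the helicity of short chains because every helical segment pays the nucleation penalty.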
6.
Secondary structure prediction from amino acid sequence is a key component of protein structure prediction, with current accuracy at approximately 75%. We analysed two state-of-the-art secondary structure prediction methods, PHD and JPRED, comparing predictions with secondary structure assigned by the algorithms DSSP and STRIDE. The specific focus of our study was alpha-helix N-termini, as empirical free energy scales are available for residue preferences at N-terminal positions. Although these prediction methods perform well in general at predicting the alpha-helical locations and length distributions in proteins, they perform less well at predicting the correct helical termini. For example, although most predicted alpha-helices overlap a real alpha-helix (with relatively few completely missed or extra predicted helices), only one-third of JPRED and PHD predictions correctly identify the N-terminus. Analysis of neighbouring N-terminal sequences to predicted helical N-termini shows that the correct N-terminus is often within one or two residues. More importantly, the true N-terminal motif is, on average, more favourable as judged by our experimentally measured free energies. This suggests a simple, but powerful, strategy to improve secondary structure prediction using empirically derived energies to adjust the predicted output to a more favourable N-terminal sequence.
7.
Structural parameters of rhodopsin in disc membrane preparations from frog and cattle were studied by hydrogen exchange methods. The method measures the exchange of protein amide hydrogens with water and can distinguish protons which are internally bonded from those which are hydrogen-bonded to water. The results show that about 70% of rhodopsin's peptide group protons are exposed to water. The identification of these groups as free peptides was made initially on the usual basis of the identity of their exchange rate with the well characterized free peptide rate; other experiments specifically excluded contributions from lipids, protein side chains, adventitious mucopolysaccharides, and intradisc water. In contrast to rhodopsin, other proteins generally have only 20 to 40% free peptide groups. Apparently rhodopsin has some unusual structural feature. Our results together with available information on rhodopsin suggest that a considerable length of its polypeptide chain is arranged at the surface of a channel of water penetrating into the membrane. Physicochemical considerations indicate that such a channel would have to be quite wide, 10 to 12 Å or more, to explain the hydrogen exchange results.
8.
M B Tsendina, D I Frishman, V F Levchenko, A L Berman 《Zhurnal evoliutsionnoi biokhimii i fiziologii》1988,24(6):797-807
A computer analysis was made of the primary structure of six different types of receptor proteins: rhodopsin, adrenoreceptor, muscarinic acetylcholine receptor, insulin receptor, nicotinic cholinoreceptor, and bacteriorhodopsin. The aim of the present investigation was to elucidate, at least partially, to what extent the insignificant similarity in the primary structure of rhodopsin, muscarinic cholinoreceptor and adrenoreceptor is due to divergent rather than convergent evolution. Nicotinic cholinoreceptor, bacteriorhodopsin and insulin receptor were chosen for comparison with rhodopsin, adrenoreceptor and muscarinic cholinoreceptor because each of these proteins shares some structural or functional property with rhodopsin, adrenoreceptor or muscarinic cholinoreceptor; on the other hand, nicotinic cholinoreceptor, bacteriorhodopsin and insulin receptor differ from the other receptor proteins in their molecular mechanisms. Comparison of the primary structure of rhodopsin, adrenoreceptor and muscarinic cholinoreceptor on the one hand, and insulin receptor, nicotinic cholinoreceptor and bacteriorhodopsin on the other, indicates that only the former exhibit similar primary structure, whereas insulin receptor, nicotinic cholinoreceptor and bacteriorhodopsin show no similarity either to one another in primary structure or to the primary structure of rhodopsin and the other receptor proteins that resemble it in their mode of action. The data obtained indicate that the similarity in primary structure between rhodopsin, muscarinic cholinoreceptor and adrenoreceptor is a consequence of divergent, not convergent, evolution; in other words, these receptor proteins are homologous.
9.
Noncoding RNAs play important roles in the cell, and their secondary structures are vital for understanding their tertiary structures and functions. Many prediction m...
10.
Ian Walsh, Alberto JM Martin, Catherine Mooney, Enrico Rubagotti, Alessandro Vullo, Gianluca Pollastri 《BMC bioinformatics》2009,10(1):195-19
Background
Proteins, especially larger ones, are often composed of individual evolutionary units, domains, which have their own function and structural fold. Predicting domains is an important intermediate step in protein analyses, including the prediction of protein structures.
11.
A complex, cascaded neural network designed to predict the secondary structure of globular proteins has been developed. Information about the local buried-unburied pattern and the average tendency of the particular types of amino acids to be buried inside the globule were used. Nonspecific information about long distance contact maps was also employed. These modifications result in a noticeable improvement (3-9%) of prediction accuracy. The best result for the average success ratio for the testing set of nonhomologous proteins was 68.3% (with corresponding Matthews' coefficients, C alpha,beta,coil equal to 0.60, 0.47, 0.43, respectively).
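The success ratio and Matthews coefficients quoted in several of these abstracts can be computed from per-residue labels roughly as sketched below. This is a minimal illustration, assuming three-state labels written as the characters H (helix), E (strand) and C (coil); the function names and the toy sequences are hypothetical and do not come from any of the listed papers.

```python
import math


def q3_accuracy(observed, predicted):
    """Fraction of residues whose three-state label is predicted correctly."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)


def matthews_coefficient(observed, predicted, state):
    """Matthews correlation coefficient for one state against all others."""
    tp = tn = fp = fn = 0
    for o, p in zip(observed, predicted):
        if p == state and o == state:
            tp += 1
        elif p == state:
            fp += 1
        elif o == state:
            fn += 1
        else:
            tn += 1
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Toy example: hypothetical observed and predicted strings.
observed = "CHHHHCCEEEC"
predicted = "CHHHCCCEEEE"
q3 = q3_accuracy(observed, predicted)
c_alpha = matthews_coefficient(observed, predicted, "H")
```

Reporting one coefficient per state alongside the overall accuracy, as the abstracts do, guards against a predictor that scores well simply by over-predicting the most common state.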
12.
13.
A computational neural network model is used to predict the secondary structure of globular proteins of known sequence. In contrast to earlier work, some information about expected tertiary interactions was built into the neural network. As a result, the prediction accuracy improved by 3% to 5%. Possible applications of this new approach are briefly discussed.
14.
MOTIVATION: Protein structure comparison (PSC) has been used widely in studies of structural and functional genomics. However, PSC is computationally expensive and as a result almost all of the PSC methods currently in use look only for the optimal alignment and ignore many alternative alignments that are statistically significant and that may provide insight into protein evolution or folding. RESULTS: We have developed a new PSC method with efficiency to detect potentially viable alternative alignments in all-against-all database comparisons. The efficiency of the new PSC method derives from the ability to directly home in on a limited number of viable and ranked alignment solutions based on intuitively derived SSE (secondary structure element)-matching probabilities.
15.
16.
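Application of neural networks in protein secondary structure prediction (total citations: 3; self-citations: 0; citations by others: 3)
This paper introduces the significance of research on protein secondary structure prediction, discusses the design of neural networks for protein secondary structure prediction, reviews in some detail the main progress made in recent years in applying neural network methods to this problem, and looks ahead to the prospects for protein structure prediction.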
17.
Identifying common local segments, also called motifs, in multiple protein sequences plays an important role in establishing homology between proteins. Homology is easy to establish when sequences are similar (sharing an identity > 25%). However, for distant proteins, it is much more difficult to align motifs that are not similar in sequence but still share common structures or functions. This paper is a first attempt to align multiple protein sequences using both primary and secondary structure information. A new sequence model is proposed so that the model assigns high probabilities not only to motifs that contain conserved amino acids but also to motifs that present common secondary structures. The proposed method is tested on the structural alignment database BAliBASE. We show that information brought by the predicted secondary structures greatly improves motif identification. A website of this program is available at www.stat.purdue.edu/~junxie/2ndmodel/sov.html.
18.
Here we perform a systematic exploration of the use of distance constraints derived from small angle X-ray scattering (SAXS) measurements to filter candidate protein structures for the purpose of protein structure prediction. This is an intrinsically more complex task than that of applying distance constraints derived from NMR data where the identity of the pair of amino acid residues subject to a given distance constraint is known. SAXS, on the other hand, yields a histogram of pair distances (pair distribution function), but the identities of the pairs contributing to a given bin of the histogram are not known. Our study is based on an extension of the Levitt-Hinds coarse grained approach to ab initio protein structure prediction to generate a candidate set of C(alpha) backbones. In spite of the lack of specific residue information inherent in the SAXS data, our study shows that the implementation of a SAXS filter is capable of effectively purifying the set of native structure candidates and thus provides a substantial improvement in the reliability of protein structure prediction. We test the quality of our predicted C(alpha) backbones by doing structural homology searches against the Dali domain library, and find that the results are very encouraging. In spite of the lack of local structural details and limited modeling accuracy at the C(alpha) backbone level, we find that useful information about fold classification can be extracted from this procedure. This approach thus provides a way to use a SAXS data based structure prediction algorithm to generate potential structural homologies in cases where lack of sequence homology prevents identification of candidate folds for a given protein. Thus our approach has the potential to help in determination of the biological function of a protein based on structural homology instead of sequence homology.
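The idea of a histogram of pair distances with anonymous pairs, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: each residue is reduced to one C(alpha) point with equal weight, the bin width and squared-difference score are arbitrary choices, and the function names are hypothetical. It is not the filter used in the study.

```python
import math
from itertools import combinations


def pair_distance_histogram(coords, bin_width, n_bins):
    """Normalized histogram of all pairwise distances between points.

    The identity of the pair behind each count is discarded, mirroring the
    information content of a pair distribution function.
    """
    counts = [0] * n_bins
    for a, b in combinations(coords, 2):
        index = int(math.dist(a, b) // bin_width)
        if index < n_bins:  # distances beyond the last bin are ignored
            counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def histogram_mismatch(candidate, reference):
    """Sum of squared differences between two histograms (assumed score)."""
    return sum((x - y) ** 2 for x, y in zip(candidate, reference))


# Toy example: three collinear points with pair distances 1, 2 and 3.
backbone = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
histogram = pair_distance_histogram(backbone, bin_width=1.0, n_bins=4)
```

A filter in this spirit would compute `histogram_mismatch` between each candidate backbone and the experimentally derived histogram and keep only candidates below some threshold; the threshold itself would have to be chosen empirically.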
19.
Improvements in protein secondary structure prediction by an enhanced neural network (total citations: 47; self-citations: 0; citations by others: 47)
Computational neural networks have recently been used to predict the mapping between protein sequence and secondary structure. They have proven adequate for determining the first-order dependence between these two sets, but have, until now, been unable to garner higher-order information that helps determine secondary structure. By adding neural network units that detect periodicities in the input sequence, we have modestly increased the secondary structure prediction accuracy. The use of tertiary structural class causes a marked increase in accuracy. The best case prediction was 79% for the class of all-alpha proteins. A scheme for employing neural networks to validate and refine structural hypotheses is proposed. The operational difficulties of applying a learning algorithm to a dataset where sequence heterogeneity is under-represented and where local and global effects are inadequately partitioned are discussed.
20.
A comparison of neural network methods and Bayesian statistical methods is presented for prediction of the secondary structure of proteins given their primary sequence. The Bayesian method makes the unphysical assumption that the probability of an amino acid occurring in each position in the protein is independent of the amino acids occurring elsewhere. However, we find the predictive accuracy of the Bayesian method to be only minimally less than the accuracy of the most sophisticated methods used to date. We present the relationship of neural network methods to Bayesian statistical methods and show that, in principle, neural methods offer considerable power, although apparently they are not particularly useful for this problem. In the process, we derive a neural formalism in which the output neurons directly represent the conditional probabilities of structure class. The probabilistic formalism allows introduction of a new objective function, the mutual information, which translates the notion of correlation as a measure of predictive accuracy into a useful training measure. Although a similar accuracy to other approaches (utilizing a mean-square error) is achieved using this new measure, the accuracy on the training set is significantly and tantalizingly higher, even though the number of adjustable parameters remains the same. The mutual information measure predicts a greater fraction of helix and sheet structures correctly than the mean-square error measure, at the expense of coil accuracy, precisely as it was designed to do. By combining the two objective functions, we obtain a marginally improved accuracy of 64.4%, with Matthews coefficients C alpha, C beta and C coil of 0.40, 0.32 and 0.42, respectively. However, since all methods to date perform only slightly better than the Bayes algorithm, which entails the drastic assumption of independence of amino acids, one is forced to conclude that little progress has been made on this problem, despite the application of a variety of sophisticated algorithms such as neural networks, and that further advances will require a better understanding of the relevant biophysics.
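To make the mutual information measure mentioned above concrete, the sketch below estimates it from paired observed and predicted class labels. This is an assumption-laden simplification: the abstract describes an objective defined on the network's output probabilities during training, whereas this hypothetical function works on hard labels after prediction and only illustrates what the quantity rewards.

```python
import math
from collections import Counter


def mutual_information(observed, predicted):
    """Empirical mutual information, in bits, between two label sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    n = len(observed)
    joint_counts = Counter(zip(observed, predicted))
    observed_counts = Counter(observed)
    predicted_counts = Counter(predicted)
    total = 0.0
    for (o, p), count in joint_counts.items():
        p_joint = count / n
        p_independent = (observed_counts[o] / n) * (predicted_counts[p] / n)
        total += p_joint * math.log2(p_joint / p_independent)
    return total


# A predictor that always answers "coil" carries no information about the
# observed classes, whatever its raw accuracy happens to be.
always_coil = mutual_information("HHEECC", "CCCCCC")
perfect = mutual_information("HHEECC", "HHEECC")
```

Unlike a raw success ratio, this quantity is zero for any constant prediction, which is consistent with the abstract's remark that the measure favors helix and sheet at the expense of coil accuracy.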