19 similar articles found (search time: 46 ms)
1.
2.
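Application of neural networks to protein secondary structure prediction  Cited by: 3 (self-citations: 0, citations by others: 3)
This paper introduces the research significance of protein secondary structure prediction, discusses neural network design issues for this task, reviews in some detail the main progress made in recent years in applying neural network methods to protein secondary structure prediction, and looks ahead to the prospects for protein structure prediction.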
3.
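Protein secondary structure prediction is an important step in protein structure research. As many new prediction methods are proposed, new protein secondary structure prediction servers also keep appearing. This study selected seven currently popular protein secondary structure prediction servers (PSRSM, SPOT-1D, MUFOLD, Spider3, RaptorX, Psipred, and Jpred4), described how to use them, and evaluated their prediction performance. A test set of 180 proteins released by the PDB between August and November 2018 was selected at random, and seven evaluation measures were used: Q3, Sov, boundary recognition rate, internal recognition rate, turn (C) recognition rate, strand (E) recognition rate, and helix (H) recognition rate. The Q3 results of these servers on the 180 test proteins were 89.96%, 88.18%, 86.74%, 85.77%, 83.61%, 79.72%, and 78.29%, respectively. The results show that PSRSM gave the best predictions. When the 180 test proteins were grouped by homology of 30%, 40%, and 70%, the Q3 results of PSRSM were 89.49%, 90.53%, and 89.87%, respectively, all better than those of the other servers. The experimental results suggest that further research on protein secondary structure prediction could combine multiple deep learning methods and train models on large data sets.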
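The Q3 and per-state recognition rates mentioned in this abstract can be illustrated with a minimal sketch. It assumes that predictions and reference assignments are given as equal-length strings over the three states H, E, and C; the function names and the example strings are hypothetical and are not taken from the cited study, whose exact definitions of the per-state rates may differ.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E, C) is correct."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def state_recognition_rate(predicted: str, observed: str, state: str) -> float:
    """Percentage of residues observed in `state` that are predicted as `state`."""
    total = sum(o == state for o in observed)
    if total == 0:
        return float("nan")  # the state does not occur in the reference
    hits = sum(p == o == state for p, o in zip(predicted, observed))
    return 100.0 * hits / total


# Hypothetical ten-residue example (not real data).
observed = "CCHHHHEEEC"
predicted = "CCHHHCEEEE"
print(q3_accuracy(predicted, observed))                  # 80.0
print(state_recognition_rate(predicted, observed, "H"))  # 75.0
```

Q3 weights every residue equally, so a method can score well while doing poorly on a rare state; reporting the per-state rates alongside Q3, as the study does, exposes that.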
4.
5.
6.
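Protein structure prediction is one of the most important problems in modern computational biology, and protein secondary structure prediction is the foundation for predicting higher-order protein structure. Many methods currently exist for predicting protein secondary structure, among which SVM methods have achieved relatively high prediction accuracy. This paper focuses on the steps for applying SVMs to protein secondary structure prediction and on the points to watch when comparing them with other methods, offering reference and inspiration for further research.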
7.
8.
9.
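A new method for protein secondary structure prediction is proposed. The method extracts species-related protein secondary structure "words" from amino acid sequences, analogous to words in natural language. These entries form a protein secondary structure dictionary that describes the relationship between amino acid sequences and protein secondary structure. Predicting protein secondary structure then resembles joint word segmentation and part-of-speech tagging in natural language. The method treats the sequence of entries as a Markov chain and uses the Viterbi algorithm to search for the maximum probability with which each entry is tagged with a given secondary structure type; a word lattice describes the segmentation results, and a maximum entropy Markov model computes the secondary structure probabilities of the entries. The predicted secondary structure is the sequence of secondary structure types corresponding to the optimal segmentation. The method was tested on protein sequences from four species and compared with the PHD method. The results show that its Q3 accuracy is 3.9% higher and its SOV accuracy 4.6% higher than those of PHD. Incorporating locally similar sequences found by BLAST search further improves prediction accuracy. On 50 CASP5 target protein sequences, the Q3 accuracy was 78.9% and the SOV accuracy 77.1%. A protein secondary structure prediction server based on this method has been built and can be accessed at http://www.insun.hit.edu.cn:81/demos/biology/index.html.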
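The Viterbi search named in this abstract can be sketched as follows. This is a deliberately simplified illustration, not the authors' method: it tags single residues rather than dictionary entries in a word lattice, uses a plain first-order Markov chain instead of a maximum entropy Markov model, and all probabilities, the two-state alphabet (H, C), and the two-letter observation alphabet (A, G) are made-up assumptions.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_path, best_log_score) for a first-order Markov chain."""
    if not observations:
        raise ValueError("observations must be non-empty")
    # best[t][s]: highest log score of any path that ends in state s at step t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path (unused for t = 0).
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] + log_trans[p][s])
            best[t][s] = (best[t - 1][prev] + log_trans[prev][s]
                          + log_emit[s][observations[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


def logs(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical toy parameters: "A" favours helix (H), "G" favours coil (C).
STATES = ["H", "C"]
LOG_START = {"H": math.log(0.5), "C": math.log(0.5)}
LOG_TRANS = logs({"H": {"H": 0.8, "C": 0.2}, "C": {"H": 0.2, "C": 0.8}})
LOG_EMIT = logs({"H": {"A": 0.9, "G": 0.1}, "C": {"A": 0.1, "G": 0.9}})

path, score = viterbi("AAGAA", STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print("".join(path))  # HHHHH: the lone "G" is absorbed into the helix
```

With these toy numbers the single coil-favouring residue does not break the helix, because two state switches cost more than one unlikely emission. That smoothing effect is the reason to search over whole paths instead of tagging each position independently.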
10.
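Using a BP (backpropagation) network with an added competitive layer, the prediction of domain structural class from protein secondary structure content was studied. Embedding a competitive layer in the BP network significantly improved its predictive performance. With only a small training set and a simple network architecture, high prediction accuracy was obtained: a self-consistency accuracy of 97.62%, a jackknife test accuracy of 97.62%, and an average extrapolation accuracy of 90.74%. Given more complete feature vectors for domain structural classes and a more representative training set, the method would provide a new classification benchmark for the field of protein domain structure classification.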
11.
Macdonald JR, Johnson WC. Protein Science: A Publication of the Protein Society, 2001, 10(6): 1172-1177
We have investigated amino acid features that determine secondary structure: (1) the solvent accessibility of each side chain, and (2) the interaction of each side chain with others one to four residues apart. Solvent accessibility is a simple model that distinguishes residue environment. The pairwise interactions represent a simple model of local side chain to side chain interactions. To test the importance of these features we developed an algorithm to separate alpha-helices, beta-strands, and "other" structure. Single residue and pairwise probabilities were determined for 25,141 samples from proteins with <30% homology. Combining the features of solvent accessibility with pairwise probabilities allows us to distinguish the three structures after cross validation at the 82.0% level. We gain 1.4% to 2.0% accuracy by optimizing the propensities, demonstrating that probabilities do not necessarily reflect propensities. Optimization of residue exposures, weights of all probabilities, and propensities increased accuracy to 84.0%.
12.
Gerald D. Fasman. Journal of Biosciences, 1985, 8(1-2): 15-23
The Chou-Fasman predictive algorithm for determining the secondary structure of proteins from the primary sequence is reviewed. Many examples of its use are presented which illustrate its wide applicability, such as predicting (a) regions with the potential for conformational change, (b) sequences which are capable of assuming several conformations in different environments, (c) effects of single amino acid mutations, (d) amino acid replacements in synthesis of peptides to bring about a change in conformation, (e) guide to the synthesis of polypeptides with definitive secondary structure, e.g. signal sequences, (f) conformational homologues from varying sequences and (g) the amino acid requirements for amphiphilic α-helical peptides.
13.
We have used a statistical approach for protein secondary structure prediction based on information theory, simultaneously taking into consideration pairwise residue types and conformational states. Because predicting residue secondary structure with a window that slides one residue at a time leaves the state prediction ambiguous, we used a dynamic programming algorithm to find the path with the maximum score. A score system for residue pairs in particular conformations is derived for neighbors up to ten residues apart in sequence. The three-state overall per-residue accuracy, Q3, of this method in a jackknife test with a dataset created from PDBSELECT is more than 70%.
14.
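Protein secondary structure refers to the regularly repeating conformations in the protein backbone. Correctly assigning protein secondary structure from atomic coordinates is the basis for analyzing protein structure and function, and secondary structure assignment plays an important role in protein classification, the discovery of protein functional motifs, and the understanding of protein folding mechanisms. Secondary structure information is also widely used in protein molecular visualization, protein alignment, and protein structure prediction. More than 20 protein secondary structure assignment methods currently exist. They fall broadly into two classes, hydrogen-bond-based and geometry-based, and the assignments produced by different methods differ considerably. Because no review of protein secondary structure assignment methods exists yet, this paper introduces and summarizes the existing methods.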
15.
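Protein secondary structure prediction is a crucial step toward understanding the three-dimensional structure of proteins. This paper proposes a simple secondary structure prediction method that uses majority voting to combine the predictions of three existing, well-performing secondary structure prediction methods into a consensus prediction. The method was tested on 57 proteins with less than 30% similarity, randomly selected from structures newly determined in the PDB over the preceding two years. Its Q3 accuracy was 1.12% to 2.29% higher than that of the three individual methods, and the correlation coefficients and SOV accuracy improved correspondingly. All accuracy measures were also higher than those of Jpred, a secondary structure prediction program that likewise uses a consensus approach. Although its principle is simple, the method requires no additional parameters, is computationally cheap, and is easy to implement; the most important prerequisite is that the component methods chosen must be among the most accurate protein secondary structure prediction methods currently available.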
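The majority-voting consensus described in this abstract can be sketched in a few lines. The abstract does not say how a three-way disagreement (one vote each for H, E, and C) is resolved, so the fallback used here, taking the label from one designated predictor, is an assumption, as are the function name and the example strings.

```python
from collections import Counter


def consensus(predictions, fallback_index=0):
    """Per-residue majority vote over equal-length three-state predictions.

    When no single state has the most votes at a position, the label from
    predictions[fallback_index] is used (an assumed tie-breaking rule).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    result = []
    for column in zip(*predictions):
        counts = Counter(column)
        top = max(counts.values())
        winners = [state for state, n in counts.items() if n == top]
        if len(winners) == 1:
            result.append(winners[0])
        else:
            result.append(column[fallback_index])
    return "".join(result)


# Hypothetical outputs of three predictors for one eight-residue protein.
votes = ["HHHCCEEC", "HHCCCEEC", "HHHCEEEE"]
print(consensus(votes))  # HHHCCEEC
```

A reasonable choice for `fallback_index` would be the predictor with the best stand-alone accuracy, in line with the abstract's remark that the quality of the component methods is the key prerequisite.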
16.
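Based on the secondary structure of the mRNA encoding the N-terminal 120 residues of P53, combined with the Chou-Fasman rules for protein secondary structure prediction, the N-terminal 93 residues of the P53 protein were predicted to contain four α-helical segments (14-26; 38-46; 51-56; 68-70), with no β-sheet found. The results agree closely with those of four protein secondary structure prediction methods based on multiple sequence alignment (each with an accuracy of about 73.20%). A three-dimensional conformation built on an SGI workstation from this initial structure suggests that the first 80 amino acids of the P53 N-terminus form an arc-shaped plate, whose transcriptional activation region consists of two main helices in an upper-lower arrangement occupying the outer edges of the top and bottom of the plate. The proline-rich segment formed by the 13 C-terminal residues is bent and loosely structured. These conformations are consistent with the biological function of the P53 N-terminus.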
17.
O. B. Ptitsyn. Journal of Biosciences, 1985, 8(1-2): 1-13
Physical principles determining the protein structure and protein folding are reviewed: (i) the molecular theory of protein secondary structure and the method of its prediction based on this theory; (ii) the existence of a limited set of thermodynamically favourable folding patterns of α- and β-regions in a compact globule which does not depend on the details of the amino acid sequence; (iii) the modern approaches to the prediction of the folding patterns of α- and β-regions in concrete proteins; (iv) experimental approaches to the mechanism of protein folding. The review reflects theoretical and experimental works of the author and his collaborators as well as those of other groups.
18.
Combining evolutionary information and neural networks to predict protein secondary structure  Cited by: 1 (self-citations: 0, citations by others: 1)
Using evolutionary information contained in multiple sequence alignments as input to neural networks, secondary structure can be predicted at significantly increased accuracy. Here, we extend our previous three-level system of neural networks by using additional input information derived from multiple alignments. Using a position-specific conservation weight as part of the input increases performance. Using the number of insertions and deletions reduces the tendency for overprediction and increases overall accuracy. Addition of the global amino acid content yields a further improvement, mainly in predicting structural class. The final network system has a sustained overall accuracy of 71.6% in a multiple cross-validation test on 126 unique protein chains. A test on a new set of 124 recently solved protein structures that have no significant sequence similarity to the learning set confirms the high level of accuracy. The average cross-validated accuracy for all 250 sequence-unique chains is above 72%. Using various data sets, the method is compared to alternative prediction methods, some of which also use multiple alignments: the performance advantage of the network system is at least 6 percentage points in three-state accuracy. In addition, the network estimates secondary structure content from multiple sequence alignments about as well as circular dichroism spectroscopy on a single protein and classifies 75% of the 250 proteins correctly into one of four protein structural classes. Of particular practical importance is the definition of a position-specific reliability index. For 40% of all residues the method has a sustained three-state accuracy of 88%, as high as the overall average for homology modelling. A further strength of the method is greatly increased accuracy in predicting the placement of secondary structure segments. © 1994 Wiley-Liss, Inc.
19.
The physicochemical mechanism of protein folding has been elucidated by the island model, describing a growth type of folding. The folding pathway is closely related to nucleation on the polypeptide chain and thus the formation of small local structures or secondary structures at the earliest stage of folding is essential to all following steps. The island model is applicable to any protein, but a high precision of secondary structure prediction is indispensable to folding simulation. The secondary structures formed at the earliest stage of folding are supposed to be of standard form, but they are usually deformed during the folding process, especially at the last stage, although the degree of deformation is different for each protein. Ferredoxin is an example of a protein having this property. According to X-ray investigation (1FDX), ferredoxin is not supposed to have secondary structures. However, if we assumed that in ferredoxin all the residues are in a coil state, we could not attain the correct structure similar to the native one. Further, we found that some parts of the chain are not flexible, suggesting the presence of secondary structures, in agreement with the recent PDB data (1DUR). Assuming standard secondary structures (α-helices and β-strands) at the nonflexible parts at the early stage of folding, and deforming these at the final stage, a structure similar to the native one was obtained. Another peculiarity of ferredoxin is the absence of disulfide bonds, in spite of its having eight cysteines. The reason cysteines do not form disulfide bonds became clear by applying the lampshade criterion, but more importantly, the two groups of cysteines are each ready to form iron complexes at a rather late stage of folding. The reason for the poor prediction accuracy of secondary structure with conventional methods is discussed.