Similar literature
20 similar articles found.
1.
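A method for recognizing protein structural types   Total citations: 2 (self-citations: 0, citations by others: 2)
The distribution of hexads in the main secondary-structure sequences of α, β, α/β, and multi-domain proteins is given. A method is proposed for recognizing (classifying) protein structural types from the main secondary-structure sequence. With triplets of the main secondary-structure sequence as parameters, the Mahalanobis distance method recognizes these four structural types with an overall accuracy of 81%. With the frequencies of hexads in the main secondary-structure sequence forming the diversity source of the protein structure, minimizing the increment of diversity recognizes the same four types with an overall accuracy of 83%. An approach to recognizing compact domains is also given.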

2.
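Definition and recognition of protein structural types   Total citations: 5 (self-citations: 1, citations by others: 4)
The concept of a compact domain is proposed: the largest spatially close-packed folding unit formed by one or several contiguous segments of α-helices and β-strands in the secondary-structure sequence is called a compact domain. Three kinds of compact domain (α domain, β domain, and α/β domain) are used to define five structural types of globular proteins: α proteins, β proteins, α/β proteins, multi-domain proteins, and ζ proteins. A set of 1,261 representative proteins (1,022 families) was classified and the result compared with the SCOP classification. An analysis with sequence redundancy removed was also carried out. On this basis a prediction scheme for structural type is proposed, with a success rate of 82% to 85%.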

4.
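Predicting protein structural type with the measure of diversity   Total citations: 14 (self-citations: 2, citations by others: 12)
Based on the idea that the structural class of a protein determines its secondary-structure sequence, the secondary-structure sequence parameters Nα, Nβ, Nβaβ, and N(βαβ) are used to form a diversity source, and the measures of diversity D(Xα), D(Xβ), D(Xα+β) are calculated. The structural class of a protein is then predicted with the increment of diversity: it is determined by the minimum increment of diversity between the protein's own measure of diversity D(Xn) and the four standard measures D(Xα), D(Xβ), D(Xα/β), D(Xα+β). The prediction accuracy reached 84.8% (standard set) and 83.3% (test set).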
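
The minimum-increment rule in this abstract can be sketched as follows. This is only an illustration: it assumes the commonly used definitions D(X) = N log N - Σ n_i log n_i and ID(X, Y) = D(X + Y) - D(X) - D(Y), which the abstract itself does not spell out. The function names and the count vectors in `STANDARDS` are hypothetical, not data from the paper.

```python
import math


def diversity(counts):
    """Measure of diversity D(X) = N*log(N) - sum(n_i*log(n_i)), taking 0*log(0) = 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); a smaller value means more similar sources."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def predict_class(query, standards):
    """Return the class whose standard source gives the smallest increment of diversity."""
    return min(standards, key=lambda name: increment_of_diversity(query, standards[name]))


# Hypothetical count vectors (four secondary-structure parameters per class).
STANDARDS = {
    "alpha": [90, 5, 1, 4],
    "beta": [5, 90, 1, 4],
    "alpha/beta": [40, 30, 25, 5],
    "alpha+beta": [45, 45, 2, 8],
}

if __name__ == "__main__":
    print(predict_class([9, 1, 0, 0], STANDARDS))
```

Under these definitions the increment is zero for two identical sources and grows as their count distributions diverge, which is why the minimum is taken.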

5.
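Predicting protein supersecondary structure with artificial neural networks   Total citations: 10 (self-citations: 0, citations by others: 10)
Protein supersecondary structure, which is composed of secondary-structure units such as α-helices and β-strands together with their connecting short peptides, is an important level in the study of protein structure. Prediction of supersecondary structure is still at an exploratory stage, and no mature method exists. Artificial neural network prediction is a method developed in recent years for secondary-structure prediction. This paper successfully introduces artificial neural networks into supersecondary-structure prediction. The results show that the occurrence of supersecondary structure is closely related to the local amino acid sequence pattern, and that the primary-structure sequence of a protein can be used to predict the…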

6.
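Application of neural networks in protein secondary structure prediction   Total citations: 3 (self-citations: 0, citations by others: 3)
This paper introduces the research significance of protein secondary-structure prediction, discusses neural network design issues for this task, reviews in some detail the main progress made in recent years with neural network methods, and looks ahead to the prospects for protein structure prediction.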

7.
曹晨  马堃 《生物信息学》2016,14(3):181-187
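Protein secondary structure refers to the regularly repeating conformations in the protein backbone. Correctly assigning secondary structure from atomic coordinates is the basis for analyzing protein structure and function, and secondary-structure assignment plays an important role in protein classification, the discovery of functional motifs, and the understanding of folding mechanisms. Secondary-structure information is also widely used in molecular visualization, protein alignment, and protein structure prediction. More than 20 secondary-structure assignment methods currently exist. They fall broadly into two categories, hydrogen-bond-based and geometry-based, and the assignments produced by different methods differ considerably. Because no review of secondary-structure assignment methods exists yet, this paper introduces and summarizes the existing methods.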

8.
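Progress in protein folding rate prediction   Total citations: 2 (self-citations: 0, citations by others: 2)
Predicting protein folding rates is one of the most challenging problems in biophysics today. The field has made great progress in recent years, and many empirical parameters have been proposed, for example contact order, long-range order, total contact distance, chain topology parameter, secondary-structure content, effective length, helix parameter, and n-order contact distance. All of these parameters correlate well with protein folding rates, and the predictions of the various methods based on them agree well with experimental data.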
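
As an illustration of one of the listed parameters, the sketch below computes a relative contact order. It assumes the widely used definition, the mean sequence separation of contacting residue pairs divided by chain length; the abstract does not give a formula, so this is an assumption. The function name and the contact list format are hypothetical, and how contacts are detected (for example a distance cutoff) is left out.

```python
def relative_contact_order(contacts, length):
    """Mean sequence separation |i - j| over all contacts, divided by chain length.

    contacts: iterable of (i, j) residue index pairs that are in contact
    length:   number of residues in the chain
    """
    contacts = list(contacts)
    if not contacts or length <= 0:
        return 0.0
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (length * len(contacts))


if __name__ == "__main__":
    # Hypothetical contact list for a 10-residue chain.
    print(relative_contact_order([(1, 5), (2, 10), (3, 4)], 10))
```

A chain dominated by local contacts gets a low value, and one dominated by long-range contacts gets a high value.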

9.
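Prediction of protein supersecondary structure is a very important intermediate step in tertiary-structure prediction. Starting from the primary sequence, this paper predicts four classes of simple supersecondary structure in 5,793 proteins. With position-specific amino acids as parameters and three ways of extracting fragments, prediction by the increment-of-diversity algorithm alone gave unsatisfactory results. Feeding the combined increment-of-diversity values into a support vector machine as feature parameters gave better results: the mean overall accuracy under 5-fold cross-validation reached 83.0%, with Matthews correlation coefficients above 0.71.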
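
The Matthews correlation coefficient reported here can be computed from a binary confusion matrix, as in the sketch below. The standard formula is used; the function name is a hypothetical choice, and returning 0.0 when the denominator vanishes is a common convention rather than something stated in the abstract.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: undefined cases (an empty row or column) are reported as 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    print(matthews_corrcoef(tp=50, tn=50, fp=0, fn=0))
```

For a multi-class problem such as four supersecondary-structure classes, one coefficient is typically computed per class by treating that class as positive and all others as negative.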

10.
Research progress in protein secondary structure prediction   Total citations: 1 (self-citations: 0, citations by others: 1)
唐媛  李春花  张瑗  尚进  邹凌云  李立奇 《生物磁学》2013,(26):5180-5182
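Knowledge of protein secondary structure is the basis for understanding protein folding patterns and tertiary structure. It provides a structural basis for studying protein function and protein interaction modes, and it can also support the development of new drugs, so the study of secondary structure is of great significance. With the arrival of the post-genomic era, more and more protein sequences are being discovered, which brings both great challenges and room for research. Traditional experimental methods can hardly supply secondary-structure information for proteins at large scale, and bioinformatics remains the route by which the secondary structure of most proteins is obtained. In recent years the field has developed rapidly as many researchers have built protein datasets for secondary-structure prediction, computed and extracted various kinds of feature information, and applied different prediction algorithms. This paper reviews progress in the field from three angles: extraction and selection of protein features, prediction algorithms, and methods for testing prediction performance. With the continuing development of genomics, proteomics, and bioinformatics, protein secondary-structure prediction is expected to keep achieving new breakthroughs.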

11.
12.
We present an approach to predicting protein structural class that uses amino acid composition and hydrophobic pattern frequency information as input to two types of neural networks: (1) a three-layer back-propagation network and (2) a learning vector quantization network. The results of these methods are compared to those obtained from a modified Euclidean statistical clustering algorithm. The protein sequence data used to drive these algorithms consist of the normalized frequency of up to 20 amino acid types and six hydrophobic amino acid patterns. From these frequency values the structural class predictions for each protein (all-alpha, all-beta, or alpha-beta classes) are derived. Examples consisting of 64 previously classified proteins were randomly divided into multiple training (56 proteins) and test (8 proteins) sets. The best performing algorithm on the test sets was the learning vector quantization network using 17 inputs, obtaining a prediction accuracy of 80.2%. The Matthews correlation coefficients are statistically significant for all algorithms and all structural classes. The differences between algorithms are in general not statistically significant. These results show that information exists in protein primary sequences that is easily obtainable and useful for the prediction of protein structural class by neural networks as well as by standard statistical clustering algorithms.
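
The normalized amino acid frequencies used as network inputs can be sketched as below. This covers only the 20-component composition vector; the six hydrophobic patterns are not defined in the abstract and are omitted. The function name, the alphabetical ordering of residues, and the choice to ignore non-standard characters are assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids in alphabetical one-letter order (an assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the normalized frequency of each standard amino acid in the sequence.

    Characters outside the 20 standard one-letter codes are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    vector = amino_acid_composition("AACD")
    print(dict(zip(AMINO_ACIDS, vector)))
```

A vector like this, or a subset of its components, would then be passed to whichever classifier is being compared.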

13.
One major problem with existing algorithms for the prediction of protein structural classes is low accuracy for proteins from the α/β and α+β classes. In this study, three novel features were rationally designed to model the differences between proteins from these two classes. In combination with other rationally designed features, an 11-dimensional vector prediction method was proposed. With this method, the overall prediction accuracy on the 25PDB dataset was 1.5% higher than that of the previous best-performing method, MODAS. Furthermore, the prediction accuracy for proteins from the α+β class on the 25PDB dataset was 5% higher than that of the previous best-performing method, SCPRED. The prediction accuracies obtained with the D675 and FC699 datasets were also improved.

14.
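Taking 639 protein polypeptide chains from the April 2002 release of the Culled Protein Data Bank database as the study set, the formation characteristics of the 584 disulfide bonds they contain were analyzed statistically. The redox state of cysteines shows clear cooperativity: in proteins that contain disulfide bonds, almost all cysteines are in the oxidized form. This cooperativity is well explained by the percentage composition of the 20 amino acids at the whole-protein level, from which the redox state of cysteines can be predicted with an accuracy of up to 84.5%. The results indicate that whether a cysteine forms a disulfide bond depends mainly on global rather than local structural information of the protein.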

15.
Proteins are generally classified into four structural classes: all-alpha proteins, all-beta proteins, alpha + beta proteins, and alpha/beta proteins. In this article, a protein is expressed as a vector of 20-dimensional space, in which its 20 components are defined by the composition of its 20 amino acids. Based on this, a new method, the so-called maximum component coefficient method, is proposed for predicting the structural class of a protein according to its amino acid composition. In comparison with the existing methods, the new method yields a higher general accuracy of prediction. Especially for the all-alpha proteins, the rate of correct prediction obtained by the new method is much higher than that by any of the existing methods. For instance, for the 19 all-alpha proteins investigated previously by P.Y. Chou, the rate of correct prediction by means of his method was 84.2%, but the correct rate when predicted with the new method would be 100%! Furthermore, the new method is characterized by an explicable physical picture. This is reflected by the process in which the vector representing a protein to be predicted is decomposed into four component vectors, each of which corresponds to one of the norms of the four protein structural classes.

16.
J Boberg  T Salakoski  M Vihinen 《Proteins》1992,14(2):265-276
Reliable structural and statistical analyses of three-dimensional protein structures should be based on unbiased data. The Protein Data Bank is highly redundant, containing several entries for identical or very similar sequences. A technique was developed for clustering the known structures based on their sequences and contents of alpha- and beta-structures. First, sequences were aligned pairwise. A representative sample of sequences was then obtained by grouping similar sequences together and selecting a typical representative from each group. The similarity significance threshold needed in the clustering method was found by analyzing similarities of random sequences. Because three-dimensional structures for proteins of the same structural class are generally more conserved than their sequences, the proteins were also clustered according to their contents of secondary structural elements. The results of these clusterings indicate conservation of alpha- and beta-structures even when sequence similarity is relatively low. An unbiased sample of 103 high-resolution structures, representing a wide variety of proteins, was chosen based on the suggestions made by the clustering algorithm. The proteins were divided into structural classes according to their contents and ratios of secondary structural elements. Previous classifications have suffered from a subjective view of secondary structures, whereas here the classification was based on backbone geometry. The concise view led to reclassification of some structures. The representative set of structures facilitates unbiased analyses of relationships between protein sequence, function, and structure as well as of structural characteristics.

17.
MOTIVATION: Clustering protein structures is an important task in structural bioinformatics. De novo structure prediction, for example, often involves a clustering step for finding the best prediction. Other applications include assigning proteins to fold families and analyzing molecular dynamics trajectories. RESULTS: We present Pleiades, a novel approach to clustering protein structures with a rigorous mathematical underpinning. The method approximates clustering based on the root mean square deviation by first mapping structures to Gauss integral vectors, which were introduced by Røgen and co-workers, and subsequently performing K-means clustering. CONCLUSIONS: Compared to current methods, Pleiades dramatically improves on the time needed to perform clustering, and can cluster a significantly larger number of structures, while providing state-of-the-art results. The number of low energy structures generated in a typical folding study, which is in the order of 50,000 structures, can be clustered within seconds to minutes.

18.
A protein is usually classified into one of the following four structural classes: all alpha, all beta, (alpha + beta) and alpha/beta. In this paper, based on the maximum correlation-coefficient principle, a new formulation is proposed for predicting the structural class of a protein according to its amino acid composition. Calculations have been made for a development set of proteins from which the amino acid compositions for the standard structural classes were derived, and an independent set of proteins which are outside the development set. The former can test the self consistency of a method and the latter can test its extrapolating effectiveness. In both cases, the results showed that the new method gave a considerably higher rate of correct prediction than any of the previous methods, implying that a significant improvement has been achieved by implementing the maximum-correlation-coefficient principle in the new method.
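
The idea of assigning a protein to the class whose standard composition it correlates with best can be sketched as below. The sketch assumes the ordinary Pearson correlation coefficient, which may differ from the exact formulation in the paper. The function names and the four-component toy vectors are hypothetical; real inputs would be 20-component amino acid compositions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return covariance / (spread_x * spread_y)


def predict_by_max_correlation(composition, class_norms):
    """Return the class whose standard composition correlates most strongly with the query."""
    return max(class_norms, key=lambda name: pearson(composition, class_norms[name]))


if __name__ == "__main__":
    # Hypothetical standard compositions, shortened to four components.
    norms = {
        "all-alpha": [0.4, 0.3, 0.2, 0.1],
        "all-beta": [0.1, 0.2, 0.3, 0.4],
    }
    print(predict_by_max_correlation([0.35, 0.3, 0.2, 0.15], norms))
```

Testing such a rule on both the development set and an independent set, as the abstract describes, separates self-consistency from the ability to generalize.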

19.
《Genomics》2020,112(2):1941-1946
In this paper, a step-by-step classification algorithm based on a double-layer SVM model is constructed to predict the secondary structure of proteins. The most important feature of this algorithm is that it improves the prediction accuracy for the α+β and α/β classes, which have been predicted with low accuracy in the past, by transforming their prediction into the prediction of the all-α and all-β classes, which can be predicted with high accuracy. A widely used dataset, the 25PDB dataset with sequence similarity lower than 40%, is used to evaluate this method. The results show that this method performs well and that, while the accuracy for the other three structural classes is maintained, the accuracy for α+β class proteins is improved significantly.

20.
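A protein design method based on the hydrophobic-hydrophilic model is presented. The method is grounded in physical principles and uses relative entropy as the objective function for optimization. It was tested on real proteins with native structures of four different structural types, and the main factors affecting the success rate of the test were analyzed. The results show that the method is general and can be used to design sequences for proteins of different structural types.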
