Similar articles
 17 similar articles found
1.
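By studying an algorithm based on neural network weight matrices, this work mines the intrinsic relationships between protein secondary structure and amino acid sequence in order to improve the accuracy of predicting secondary structure from primary sequence. Neural networks perform well at feature classification, and the connection weight matrix obtained after training captures the intrinsic features and regularities of the samples. The study scores predictions with the neural network weight matrix, uses an offset alignment method to find sensitive amino acid neighborhoods, and analyzes the common behavior of the test set under different window lengths. Experiments show that prediction performance changes markedly at a sliding-window length of L=7, and that the amino acid residue at neighborhood position P=4 strengthens prediction performance. The approach offers a new algorithm design for protein secondary structure prediction based on local sequence features.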
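The sliding-window step mentioned in the abstract above can be illustrated with a minimal sketch. It only shows how fixed-length windows (here L=7) might be cut from a sequence; the function name `sliding_windows`, the padding symbol `X`, and the choice to centre each window on a residue are assumptions for illustration, not details taken from the paper.

```python
def sliding_windows(sequence, length=7, pad="X"):
    """Return one window of `length` residues centred on each position.

    Assumption: windows are centred and the sequence ends are padded
    with a placeholder symbol; the paper may handle ends differently.
    """
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd number")
    half = length // 2
    padded = pad * half + sequence + pad * half
    # Window i covers residues i-half .. i+half of the original sequence.
    return [padded[i:i + length] for i in range(len(sequence))]


# Hypothetical example sequence, not from the paper.
windows = sliding_windows("MKTAYIAKQR")
print(windows[0])  # window centred on the first residue
print(windows[4])  # window centred on the fifth residue
```

Each window would then be encoded numerically and scored, for example against a trained weight matrix.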

2.
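Predicting protein structure is important for understanding the structural composition and biological function of proteins, and secondary structure prediction is a key step in protein structure prediction. Now that the position-specific scoring matrix (PSSM) is widely used to encode protein primary sequences as input samples, each residue can be represented as a two-dimensional data plane, so this paper trains a convolutional neural network (CNN) on that representation. The paper also designs a second CNN in which a long short-term memory (LSTM) network reads the horizontal and vertical features of the CNN's last convolutional feature map and, together with the CNN's fully connected layer, performs the classification. Finally, an ensemble method combines the two types of CNN model; the final ensemble contains six models of the two types and achieves a Q3 of 77.2 on the CB513 protein dataset.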
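For readers unfamiliar with the Q3 score quoted above, it is the percentage of residues whose three-state label (conventionally H for helix, E for strand, C for coil) is predicted correctly. A minimal sketch follows; the function name `q3` and the example strings are illustrative assumptions.

```python
def q3(predicted, observed):
    """Percentage of residues whose three-state label matches."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have the same length")
    if not observed:
        raise ValueError("label strings must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical labels: 7 of 8 residues agree.
print(q3("HHHEECCC", "HHHEECCH"))
```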

3.
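Application of neural networks to protein secondary structure prediction   Cited by: 3 (self-citations: 0, by others: 3)
Introduces the significance of research on protein secondary structure prediction, discusses neural network design issues for that task, reviews in some detail the main progress made in recent years with neural network methods, and looks ahead to the prospects for protein structure prediction.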

4.
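Clustering of amino acid composition, protein structural classes, and prediction of structural classes   Cited by: 11 (self-citations: 0, by others: 11)
The amino acid compositions of proteins were clustered with an information clustering method, revealing hierarchical grouping (large clusters splitting into smaller ones). The 645 proteins can be divided into 15 small clusters. Each small cluster shows some correlation with the structural class determined by secondary structure content, but no clear correlation with the five major protein structural classes. Problems in schemes that predict structural class from amino acid composition and secondary structure content are pointed out. A new method for predicting protein structural class from the secondary structure sequence is proposed, together with concise prediction rules.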

5.
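Methods for recognizing protein structural classes   Cited by: 2 (self-citations: 0, by others: 2)
The distribution of hexamers in the secondary-structure main sequences of α, β, α/β, and multidomain proteins is given. A method is proposed for recognizing (classifying) protein structural classes from the secondary-structure main sequence. Using main-sequence triplets as parameters, the Mahalanobis distance method classifies proteins of these four structural classes with an overall accuracy of 81%. Using hexamer frequencies in the main sequence as the source of diversity for protein structure, minimizing the increment of diversity classifies the four structural classes with an overall accuracy of 83%. An approach to recognizing compact domains is also given.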

6.
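Based on amino acid properties, and using amino acid composition and the biased autocovariance function as the feature vector, a BP neural network method is proposed for predicting the α-helix and β-sheet content of non-homologous proteins. The method was tested on mutually independent non-homologous protein databases. With Ponnuswamy values, the results for α-helix and β-sheet content were as follows: in the self-consistency test, mean absolute errors of 0.069 and 0.065 with standard deviations of 0.044 and 0.047; in the independent test, mean absolute errors of 0.077 and 0.070 with standard deviations of 0.051 and 0.049. Compared with a BP neural network that uses amino acid composition alone as the feature vector, the independent-test mean absolute errors decreased by 0.024 and 0.016 and the standard deviations by 0.031 and 0.018. Compared with an improved multiple linear regression method, the independent-test mean absolute errors decreased by 0.018 and 0.011 and the standard deviations by 0.020 and 0.012. These results show that a BP neural network using amino acid composition and the biased autocovariance function as the feature vector can effectively improve the accuracy of secondary structure content prediction.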
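The feature vector described above combines amino acid composition with a biased autocovariance function of a per-residue property. The sketch below shows one common form of such a feature, assuming the biased estimator divides by the full sequence length N; the exact lags, normalization, and property scale used in the paper are not given in the abstract. The `scale` argument stands in for a real property scale (such as Ponnuswamy values), and the numbers in the example are placeholders, not real data.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    n = len(sequence)
    return [sequence.count(aa) / n for aa in AMINO_ACIDS]


def biased_autocovariance(values, max_lag):
    """Autocovariance at lags 0..max_lag, dividing by N (biased estimator)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / n
        for k in range(max_lag + 1)
    ]


def feature_vector(sequence, scale, max_lag):
    """Concatenate composition with the autocovariance of a property profile."""
    values = [scale[aa] for aa in sequence]
    return composition(sequence) + biased_autocovariance(values, max_lag)


# Placeholder scale with made-up values, for illustration only.
toy_scale = {"A": 1.0, "C": 2.0}
print(len(feature_vector("ACCA", toy_scale, max_lag=1)))
```

Such a vector would serve as the input to the network, with the helix and sheet fractions as targets.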

7.
吴琳琳  徐硕 《生物信息学》2010,8(3):187-190
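Protein structure prediction is one of the most important problems in modern computational biology, and secondary structure prediction is the foundation for predicting higher-order protein structure. Many methods for secondary structure prediction exist, among which SVM methods achieve relatively high accuracy. This paper focuses on the steps for applying SVMs to protein secondary structure prediction and on the points to watch when comparing with other methods, providing reference and inspiration for further research.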

8.
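Measurement and prediction of protein secondary structure   Cited by: 4 (self-citations: 1, by others: 3)
Both measuring and predicting protein secondary structure are important in theory and in practice. The common experimental methods are circular dichroism, infrared spectroscopy, and two-dimensional NMR. The strengths and weaknesses of various circular dichroism calculation methods are discussed, the main two-dimensional NMR methods for determining secondary structure are introduced, and the merits and drawbacks of the Chou and Fasman prediction method are discussed. The authors recommend measuring with several methods and analyzing the measurements together with prediction results to reach sound conclusions.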

9.
Research progress in protein secondary structure prediction   Cited by: 1 (self-citations: 0, by others: 1)
唐媛  李春花  张瑗  尚进  邹凌云  李立奇 《生物磁学》2013,(26):5180-5182
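Knowing a protein's secondary structure is the basis for understanding its folding pattern and tertiary structure, provides a structural foundation for studying protein function and interaction modes, and can aid new drug development, so research on protein secondary structure is of great significance. With the arrival of the post-genomic era, more and more protein sequences are being discovered, which brings great challenges and research opportunities for the study of secondary structure. Traditional experimental methods can hardly deliver secondary structure information for proteins at large scale, so bioinformatics remains the way to obtain the secondary structure of most proteins. In recent years the field has developed rapidly as researchers have built protein datasets for secondary structure prediction, computed and extracted various protein features, and applied different prediction algorithms. This review covers the extraction and selection of protein features, prediction algorithms, and methods for evaluating prediction performance, and introduces research progress in the field. The authors expect protein secondary structure prediction to keep achieving breakthroughs as genomics, proteomics, and bioinformatics develop.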

10.
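A non-redundant, non-ribosomal database (694 structures) was built from protein-RNA complex structures in the PDB (Protein Data Bank), and the interface preferences of protein and RNA sequences and secondary structures were computed on this database. Protein β-sheets, 3₁₀-helices, and unpaired RNA nucleotides, especially unpaired nucleotides with an irregular spatial arrangement, show significant interface preference. Accordingly, secondary structures were grouped, and a 60×12 amino acid-nucleotide pairwise preference potential that accounts for both sequence and secondary structure information was built and used as a scoring function to select near-native structures in protein-RNA docking. The 60×12 statistical potential achieves a scoring success rate of 65.77%, better than statistical potentials that consider secondary structure information for the protein or the RNA alone, and better than the 60×8* statistical potential our group previously built on 251 structures. This work deepens the understanding of specific protein-RNA recognition and can advance complex structure prediction.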

11.
Chao Fang  Yi Shang  Dong Xu 《Proteins》2018,86(5):592-598
Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool, MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acids as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physico-chemical properties of amino acids, the PSI-BLAST profile, and the HHBlits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structure. The architecture of MUFOLD-SS enables effective processing of local and global interactions between amino acids to make accurate predictions. In extensive experiments on multiple datasets, MUFOLD-SS significantly outperformed the best existing methods and other deep neural networks. MUFOLD-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html .

12.
A feed-forward neural network has been employed for protein secondary structure prediction. Attempts were made to improve on previous prediction accuracies using a hierarchical mixture of experts (HME). In this method input data are clustered and used to train a series of different networks. Application of an HME to the prediction of protein secondary structure is shown to provide no advantages over a single network. We have also tried various new input representations, chosen to incorporate the effect of residues a long distance away in the one-dimensional amino acid chain. Prediction accuracy using these methods is comparable to that achieved by other neural networks [1–4].

13.
14.
Computational neural networks have recently been used to predict the mapping between protein sequence and secondary structure. They have proven adequate for determining the first-order dependence between these two sets, but have, until now, been unable to garner higher-order information that helps determine secondary structure. By adding neural network units that detect periodicities in the input sequence, we have modestly increased the secondary structure prediction accuracy. The use of tertiary structural class causes a marked increase in accuracy. The best case prediction was 79% for the class of all-alpha proteins. A scheme for employing neural networks to validate and refine structural hypotheses is proposed. The operational difficulties of applying a learning algorithm to a dataset where sequence heterogeneity is under-represented and where local and global effects are inadequately partitioned are discussed.

15.
To process protein data, a numerical representation of each amino acid is often necessary. Many suitable parameters can be derived from experiments or from statistical analysis of databases. To use these sources of information quickly and efficiently, the relevant information must be extracted from these parameters and their dimensionality reduced. In this approach, established methods such as principal component analysis (PCA) are supplemented by a method based on symmetric neural networks. Two different parameter representations of amino acids are reduced from five and seven dimensions, respectively, to one, two, three, or four dimensions using a symmetric neural network with either one or three hidden layers. It is possible to create general reduced parameter representations for amino acids. To demonstrate the ability of this approach, these reduced sets of parameters are applied to the ab initio prediction of protein secondary structure from primary structure only. Artificial neural networks are implemented and trained with a diverse set of 430 proteins from the PDB. Compared with the complete set of parameters, the reduced parameter representations give substantially faster training and prediction without a decrease in accuracy. The method is transferable to other amino acids or even other molecular building blocks, such as nucleic acids, and therefore represents a general approach. Electronic Supplementary Material available.

16.
Sim J  Kim SY  Lee J 《Proteins》2005,59(3):627-632
Successful prediction of protein domain boundaries provides valuable information not only for the computational structure prediction of multidomain proteins but also for experimental structure determination. Since protein sequences of multiple domains may contain much information regarding evolutionary processes such as gene-exon shuffling, this information can be detected by analyzing the position-specific scoring matrix (PSSM) generated by PSI-BLAST. We present a method, PPRODO (Prediction of PROtein DOmain boundaries), that predicts the domain boundaries of proteins from sequence information using a neural network. The network is trained and tested using the values obtained from the PSSM generated by PSI-BLAST. A 10-fold cross-validation technique is used to obtain the parameters of the neural networks on a nonredundant set of 522 proteins containing 2 contiguous domains. PPRODO provides good and consistent results for the prediction of domain boundaries, with an accuracy of about 66% using the ±20 residue criterion. The PPRODO source code, as well as all data sets used in this work, are available from http://gene.kias.re.kr/~jlee/pprodo/.
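The ±20 residue criterion quoted above can be read as: a predicted boundary counts as correct when it lies within 20 residues of the true boundary. The sketch below illustrates that reading for two-domain proteins with a single boundary each; the function names and example positions are hypothetical, and the paper's exact scoring procedure may differ.

```python
def boundary_hit(predicted, true, tolerance=20):
    """True if the predicted boundary is within `tolerance` residues."""
    return abs(predicted - true) <= tolerance


def boundary_accuracy(pairs, tolerance=20):
    """Fraction of (predicted, true) boundary pairs that count as hits."""
    hits = sum(boundary_hit(p, t, tolerance) for p, t in pairs)
    return hits / len(pairs)


# Hypothetical (predicted, true) boundary positions for three proteins.
pairs = [(100, 115), (200, 230), (50, 70)]
print(boundary_accuracy(pairs))
```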

17.
A new method based on neural network theory is presented to analyze and quantify the information content of far UV circular dichroism spectra. Using a backpropagation network model with a single hidden layer between input and output, it was possible to deduce five different secondary structure fractions (helix, parallel and antiparallel beta-sheet, beta-turn and random coil) with satisfactory correlations between calculated and measured secondary structure data. We demonstrate that for each wavelength interval a specific network is suitable. The remaining discrepancy between the secondary structure data from neural network prediction and crystallography may be attributed to errors in the determination of protein concentration and random noise in the CD signal, as indicated by simulations.
