Similar articles
 18 similar articles found (search time: 187 ms)
1.
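Clustering of protein networks is an important means of identifying functional modules; it not only helps in understanding the organization of biological systems but is also important for predicting protein function. Because clustering algorithms for protein networks currently lack effective analysis software, this paper designs and implements ClusterE, a new analysis platform for protein network clustering algorithms. The platform implements clustering evaluation methods including recall, precision, sensitivity, specificity, and functional enrichment analysis, and integrates clustering algorithms including FAG-EC, Dpclus, Monet, IPC-MCE, and IPCA. It can visualize the results of protein network clustering and can also visually compare and analyze multiple clustering algorithms under different evaluation metrics. The platform is highly extensible: both the clustering algorithms and the evaluation methods are integrated into the system as plug-ins.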
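As an illustration of the precision and recall metrics mentioned above, here is a minimal sketch of how predicted clusters might be scored against reference complexes. The overlap score and the 0.2 matching threshold are assumptions borrowed from common practice in the complex-detection literature; the abstract does not say how ClusterE defines a match, and the function names are hypothetical.

```python
def overlap_score(a, b):
    """Overlap between two protein sets: |A & B|^2 / (|A| * |B|)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def precision_recall(predicted, reference, threshold=0.2):
    """Cluster-level precision and recall.

    A predicted cluster counts as correct if it overlaps at least one
    reference complex with a score >= threshold (assumed value), and a
    reference complex counts as recovered under the same rule.
    """
    matched_pred = sum(
        1 for p in predicted
        if any(overlap_score(p, r) >= threshold for r in reference)
    )
    matched_ref = sum(
        1 for r in reference
        if any(overlap_score(p, r) >= threshold for p in predicted)
    )
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_ref / len(reference) if reference else 0.0
    return precision, recall
```

Calling `precision_recall(clusters, complexes)` with two lists of protein-ID sets returns a `(precision, recall)` pair, which makes it straightforward to compare several clustering algorithms on the same reference set.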

2.
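To improve the accuracy of protein-protein interaction (PPI) prediction and to explore in depth the biological mechanisms of cellular signal transduction and disease development, this paper proposes a prediction algorithm called CBSG-PPI. The algorithm first uses a three-layer feedforward network to process the k-mer features of proteins, uses the CT method and the Bert method to extract the amino acid sequences of proteins and a convolutional neural network to extract protein sequence features, and then combines a graph neural network with a multilayer perceptron to predict PPIs accurately. Compared with existing prediction techniques, CBSG-PPI shows a clear advantage on several key performance metrics, reaching 0.855, 0.853, 0.840, and 0.866 on public datasets for accuracy, F1 score, recall, and precision, respectively. In addition, the algorithm adopts an improved parameter-tuning method that markedly increases computational efficiency, making prediction about 140 times faster than traditional algorithms. This performance gain not only confirms the research value of CBSG-PPI for PPI prediction but also provides a useful computational tool for the future construction and analysis of protein interaction networks.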
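To make the notion of k-mer features concrete, the sketch below turns an amino acid sequence into a normalized k-mer frequency vector. It is only an illustration of the general technique: the abstract does not specify the value of k, the alphabet, or the normalization used by CBSG-PPI, so `k=2`, the 20-letter alphabet, and the function name are assumptions.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_features(sequence, k=2):
    """Return the relative frequency of every possible k-mer, in a fixed order.

    k-mers containing letters outside AMINO_ACIDS are ignored.
    """
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[kmer] for kmer in vocab)
    return [counts[kmer] / total if total else 0.0 for kmer in vocab]
```

With `k=2` the vector has 400 entries, a fixed length that a feedforward network can take as input regardless of how long the protein is.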

3.
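Protein function annotation is one of the core topics of research in the post-genomic era, and methods that predict protein function from protein interaction networks are attracting growing attention from researchers. This paper proposes a protein function prediction method based on Bayesian networks and the reliability of protein interactions. During function prediction, the method builds a Bayesian network prediction model for each protein to be annotated and takes full account of the reliability of protein interactions. Three-fold cross-validation on a constructed budding yeast dataset shows that taking reliability into account during function prediction effectively improves prediction performance. Compared with several existing algorithms, the method gives satisfactory prediction results.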

4.
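Cai Juan (蔡娟), Wang Jianxin (王建新), Li Min (李敏), Chen Gang (陈钢). 《生物信息学》 2011, 9(3): 185-188
Clustering analysis of biological networks is an important method for identifying functional modules and predicting protein function, and visualizing the clustering results also plays an important role in analyzing the structure of biological networks quickly and effectively. By analyzing the architecture of Cytoscape, a platform for displaying and analyzing biological networks, the authors designed ClusterViz, an easy-to-use plug-in for clustering analysis and display. It is an extensible integration platform for clustering algorithms: new algorithms can be added continually, and the results of different algorithms can be compared and analyzed. Three typical algorithm instances have been implemented so far. The plug-in can serve as an effective tool for studying the mechanisms of protein interaction networks.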

5.
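By studying algorithms based on neural network weight matrices, this work mines the intrinsic relationships between protein secondary structure and amino acid sequence in order to improve the accuracy of predicting secondary structure from primary sequence. Neural networks perform well at feature classification, and the matrix of neuron connection weights obtained after training encodes the intrinsic features and regularities of the samples. The study uses the neural network weight matrix for score-based prediction, uses an offset-alignment method to find sensitive amino acid neighborhoods, and analyzes the common behavior of the test set under different window lengths. Experiments show that prediction performance changes markedly at a sliding-window length of L=7, and that the amino acid residue at neighborhood position P=4 strengthens prediction performance. The method offers a new algorithm design for protein secondary structure prediction based on local sequence features.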
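The sliding-window idea can be sketched as follows: each residue is represented by a fixed-length window of its sequence neighbors, which then serves as the input for predicting that residue's secondary structure. This is a generic illustration rather than the paper's procedure; the padding character, the centered window, and the function name are assumptions, and only the window length L=7 comes from the abstract.

```python
def sliding_windows(sequence, length=7, pad="-"):
    """Return one window of `length` residues centered on each position.

    Positions near the ends are padded with `pad` so that every residue
    gets a window of the same size.
    """
    if length < 1 or length % 2 == 0:
        raise ValueError("window length must be a positive odd number")
    half = length // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + length] for i in range(len(sequence))]
```

Varying `length` and re-running the predictor is one simple way to reproduce the kind of window-length comparison the abstract describes.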

6.
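Current evaluations of protein secondary structure prediction methods mainly consider prediction accuracy and do not fully consider how a method's own parameters affect it. This paper proposes a new evaluation approach that combines intrinsic and extrinsic evaluation to judge the quality of a prediction method. Taking a protein secondary structure prediction method based on a hybrid parallel genetic algorithm as an example, intrinsic evaluation is used to choose the internal parameters (slice length and number of classes within a group) appropriately, which effectively improves prediction accuracy. Extrinsic evaluation, through comparison with other protein secondary structure prediction algorithms based on stochastic algorithms and with the conclusions provided by CASP, demonstrates the effectiveness and correctness of the method. Together these results verify that intrinsic and extrinsic evaluation are objective, fair, and comprehensive.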

7.
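Protein secondary structure prediction is an important part of protein structure research. As large numbers of new prediction methods are proposed, new protein secondary structure prediction servers also keep appearing. This study selected seven commonly used protein secondary structure prediction servers (PSRSM, SPOT-1D, MUFOLD, Spider3, RaptorX, Psipred, and Jpred4), described how to use them, and evaluated their prediction performance. A test set of 180 proteins released by the PDB between August and November 2018 was selected at random, and the servers were evaluated from seven angles: Q3, Sov, boundary recognition rate, internal recognition rate, coil/turn (C) recognition rate, strand (E) recognition rate, and helix (H) recognition rate. On the 180 test proteins the servers achieved Q3 values of 89.96%, 88.18%, 86.74%, 85.77%, 83.61%, 79.72%, and 78.29%, respectively. The results show that PSRSM gives the best predictions. When the 180 test proteins were grouped by homology of 30%, 40%, and 70%, PSRSM achieved Q3 values of 89.49%, 90.53%, and 89.87%, respectively, better than all the other servers. The results suggest that protein secondary structure prediction can be studied further by combining multiple deep learning methods and by training models on large data.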
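For readers unfamiliar with Q3, it is the percentage of residues whose predicted three-state label (H, E, or C) matches the observed label. A minimal sketch, assuming both structures are given as equal-length strings over those three letters (the function name is hypothetical):

```python
def q3_accuracy(predicted, observed):
    """Percentage of positions where the predicted state equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

For example, `q3_accuracy("HHHEECCC", "HHHEECCH")` gives 87.5, since seven of the eight residues agree.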

8.
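With the development of genome-scale high-throughput experimental identification techniques and computational prediction methods, large amounts of protein interaction data have become available, but the relatively high proportion of false positives in large-scale protein interaction data degrades data quality. Bioinformatics methods can start from existing data and knowledge and use computation to systematically assess the reliability of large-scale protein interactions. This paper reviews the characteristics of and progress in bioinformatics methods for assessing protein interaction reliability, covering process model design, dataset construction, feature selection and extraction of integrated attributes, the algorithms used, and an overview of examples.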

9.
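Research progress in protein secondary structure prediction   Cited by: 1 (self-citations: 0, citations by others: 1)
Tang Yuan (唐媛), Li Chunhua (李春花), Zhang Yuan (张瑗), Shang Jin (尚进), Zou Lingyun (邹凌云), Li Liqi (李立奇). 《生物磁学》 2013, (26): 5180-5182
Understanding the secondary structure of a protein is the basis for understanding its folding pattern and tertiary structure; it provides a structural foundation for studying protein function and modes of interaction between proteins, and it can also assist new drug development. Studying protein secondary structure is therefore important. With the arrival of the post-genomic era, more and more protein sequences are being discovered, which brings great challenges and research opportunities for the study of protein secondary structure. Traditional experimental methods can hardly provide secondary structure information for proteins on a large scale, so bioinformatics approaches remain the route to obtaining the secondary structure of most proteins. In recent years the field has developed rapidly as many researchers have constructed protein datasets for secondary structure prediction, computed and extracted various kinds of protein feature information, and applied different prediction algorithms. This article reviews the extraction and selection of protein feature information, prediction algorithms, and methods for validating prediction performance, and introduces research progress in protein secondary structure prediction. With the continued development of genomics, proteomics, and bioinformatics, protein secondary structure prediction is expected to keep achieving new breakthroughs.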

10.
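Tang Yu (唐羽), Li Min (李敏). 《生物信息学》 2014, 12(1): 38-45
Clustering of protein networks is an important means of identifying functional modules; it not only helps in understanding the organization of biological systems but is also important for predicting protein function. Visual analysis of clustering results is an effective way to carry out protein network clustering. Based on the open-source Cytoscape platform, this paper designs and implements CytoCluster, a plug-in for protein network clustering analysis and visualization. The plug-in integrates six typical clustering algorithms: MCODE, FAG-EC, HC-PIN, OH-PIN, IPCA, and EAGLE. It visualizes clustering results by displaying the resulting clusters as a list of thumbnails; for an individual cluster it can show the cluster's position in the original network and generate the corresponding subgraph for separate display. Clustering results can be exported together with the algorithm name, parameters, and clusters. The plug-in is highly extensible and provides a unified algorithm interface through which new clustering algorithms can be added.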

11.
Genomics, 2020, 112(1): 174-183
Protein complexes are one of the most important functional units for driving biological processes within the cell. Experimental methods have provided valuable data to infer protein complexes. However, these methods have inherent limitations. Considering these limitations, many computational methods have been proposed to predict protein complexes in the last decade. Almost all of these in-silico methods predict protein complexes from the ever-increasing protein–protein interaction (PPI) data. These computational approaches usually use the PPI data in the format of a huge protein–protein interaction network (PPIN) as input and output various sub-networks of the given PPIN as the predicted protein complexes. Some of these methods have already reached a promising efficiency in protein complex detection. Nonetheless, there are challenges in the prediction of other types of protein complexes, especially sparse and small ones. New methods should further incorporate the knowledge of biological properties of proteins to improve the performance. Additionally, there are several challenges that should be considered more effectively in designing new complex prediction algorithms in the future. This article not only reviews the history of computational protein complex prediction but also provides new insight for the improvement of new methodologies. In this article, the most important computational methods for protein complex prediction are evaluated and compared. In addition, some of the challenges in the reconstruction of protein complexes are discussed. Finally, various tools for protein complex prediction and PPIN analysis as well as the current high-throughput databases are reviewed.

12.
Cytoprophet is a software tool that allows prediction and visualization of protein and domain interaction networks. It is implemented as a plug-in of Cytoscape, an open source software framework for analysis and visualization of molecular networks. Cytoprophet implements three algorithms that predict new potential physical interactions using the domain composition of proteins and experimental assays. The algorithms for protein and domain interaction inference include maximum likelihood estimation (MLE) using expectation maximization (EM), the set cover approach maximum specificity set cover (MSSC), and the sum-product algorithm (SPA). After accepting an input set of proteins with Uniprot ID/Accession numbers and a selected prediction algorithm, Cytoprophet draws a network of potential interactions with probability scores and GO distances as edge attributes. A network of domain interactions between the domains of the initial protein list can also be generated. Cytoprophet was designed to take advantage of the visual capabilities of Cytoscape and be simple to use. An example of inference in a signaling network of myxobacterium Myxococcus xanthus is presented and available at Cytoprophet's website. AVAILABILITY: http://cytoprophet.cse.nd.edu.

13.
Inference of protein functions is one of the most important aims of modern biology. To fully exploit the large volumes of genomic data typically produced in modern-day genomic experiments, automated computational methods for protein function prediction are urgently needed. Established methods use sequence or structure similarity to infer functions but those types of data do not suffice to determine the biological context in which proteins act. Current high-throughput biological experiments produce large amounts of data on the interactions between proteins. Such data can be used to infer interaction networks and to predict the biological process that the protein is involved in. Here, we develop a probabilistic approach for protein function prediction using network data, such as protein-protein interaction measurements. We take a Bayesian approach to an existing Markov Random Field method by performing simultaneous estimation of the model parameters and prediction of protein functions. We use an adaptive Markov Chain Monte Carlo algorithm that leads to more accurate parameter estimates and consequently to improved prediction performance compared to the standard Markov Random Fields method. We tested our method using a high-quality S. cerevisiae validation network with 1622 proteins against 90 Gene Ontology terms of different levels of abstraction. Compared to three other protein function prediction methods, our approach shows very good prediction performance. Our method can be directly applied to protein-protein interaction or coexpression networks, but also can be extended to use multiple data sources. We apply our method to physical protein interaction data from S. cerevisiae and provide novel predictions, using 340 Gene Ontology terms, for 1170 unannotated proteins and we evaluate the predictions using the available literature.

14.
Proteins play key roles in a wide range of biological functions. The functional roles of a protein are somewhat enhanced when it interacts with other proteins, compared with when it acts alone. Identifying the essential proteins of an organism by wet-lab observation is time-consuming and costly, although wet-lab results offer high biological reliability and accuracy. Computational prediction of essential proteins is an alternative that is rapidly proving its value and effectively reduces wet-lab experimental cost. Existing computational methods have used protein–protein interaction networks (PPIN), sequence data, gene expression datasets (GED), Gene Ontology (GO), orthologous groups, and subcellular localization datasets. Machine learning can draw on diverse categories of features to model and predict the essential macromolecules of understudied organisms. A novel methodology, MEM-FET (membership feature), makes predictions based on the following features: edge clustering coefficient, average clustering coefficient, subcellular localization, and Gene Ontology within a compartment of common neighbors. The accuracy (ACC) values of the predicted true positive (TP) essential proteins are 0.79, 0.74, 0.78, and 0.71 for the YHQ, YMIPS, YDIP, and YMBD datasets, respectively. An enriched set of essential proteins is also predicted using the MEM-FET algorithm. Ensemble ML also validated the proposed model with an accuracy of 60%. The results indicate that the MEM-FET algorithm outperforms other existing algorithms, with an ACC value of 80% for the yeast dataset.
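One of the features named above, the edge clustering coefficient, can be sketched as follows. The formula used here, the number of triangles containing the edge divided by min(deg(u) - 1, deg(v) - 1), is one common definition from the essential-protein literature; the abstract does not state which variant MEM-FET uses, so treat the formula and the function names as assumptions.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}, skipping self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """Triangles through edge (u, v) divided by the maximum possible number.

    Returns 0.0 when either endpoint has degree 1, since no triangle
    can contain the edge in that case.
    """
    if v not in adj.get(u, set()):
        raise ValueError("(u, v) is not an edge of the network")
    triangles = len(adj[u] & adj[v])
    max_triangles = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return triangles / max_triangles if max_triangles > 0 else 0.0
```

Edges with a high coefficient sit inside densely connected neighborhoods, which is why the measure is often summed over a protein's edges to score its likely essentiality.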

15.
In recent years, significant effort has been given to predicting protein functions from protein interaction data generated from high throughput techniques. However, predicting protein functions correctly and reliably still remains a challenge. Recently, many computational methods have been proposed for predicting protein functions. Among these methods, clustering based methods are the most promising. The existing methods, however, mainly focus on protein relationship modeling and the prediction algorithms that statically predict functions from the clusters that are related to the unannotated proteins. In fact, the clustering itself is a dynamic process and the function prediction should take this dynamic feature of clustering into consideration. Unfortunately, this dynamic feature of clustering is ignored in the existing prediction methods. In this paper, we propose an innovative progressive clustering based prediction method to trace the functions of relevant annotated proteins across all clusters that are generated through the progressive clustering of proteins. A set of prediction criteria is proposed to predict functions of unannotated proteins from all relevant clusters and traced functions. The method was evaluated on real protein interaction datasets and the results demonstrated the effectiveness of the proposed method compared with representative existing methods.

16.
Background: Similarity-based computational methods are a useful tool for predicting protein functions from protein–protein interaction (PPI) datasets. Although various similarity-based prediction algorithms have been proposed, unsatisfactory prediction results have occurred on many occasions. The purpose of this type of algorithm is to predict functions of an unannotated protein from the functions of those proteins that are similar to the unannotated protein. Therefore, the prediction quality largely depends on how to select a set of proper proteins (i.e., a prediction domain) from which the functions of an unannotated protein are predicted, and how to measure the similarity between proteins. Another issue with existing algorithms is that they treat function prediction as a one-off procedure, ignoring the fact that interactions amongst proteins are mutual and dynamic in terms of similarity when predicting functions. How to resolve these major issues to increase prediction quality remains a challenge in computational biology.
Results: In this paper, we propose an innovative approach to predict protein functions of unannotated proteins iteratively from a PPI dataset. The iterative approach takes into account the mutual and dynamic features of protein interactions when predicting functions, and addresses the issues of protein similarity measurement and prediction domain selection by introducing into the prediction algorithm a new semantic protein similarity and a method of selecting the multi-layer prediction domain. The new protein similarity is based on the multi-layered information carried by protein functions. The evaluations conducted on real protein interaction datasets demonstrated that the proposed iterative function prediction method outperformed other similar or non-iterative methods, and provided better prediction results.
Conclusions: The new protein similarity derived from multi-layered information of protein functions more reasonably reflects the intrinsic relationships among proteins, and significant improvement to the prediction quality can occur through incorporation of mutual and dynamic features of protein interactions into the prediction algorithm.

17.
Prediction of protein function is one of the most challenging problems in the post-genomic era. In this paper, we propose a novel algorithm, Improved ProteinRank (IPR), for protein function prediction, which is based on search engine technology and the preferential attachment criteria. In addition, an improved algorithm, IPRW, is developed from IPR for use in weighted protein-protein interaction (PPI) networks. The proposed algorithms IPR and IPRW are applied to the PPI network of S. cerevisiae. The experimental results show that both IPR and IPRW outperform previous methods for the prediction of protein functions.
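The abstract does not describe how IPR works beyond its search-engine inspiration. As background only, the sketch below shows plain PageRank by power iteration on an undirected PPI network, the kind of ranking scheme that search-engine-based methods typically build on. It is not the IPR or IPRW algorithm; the damping factor, iteration count, and function name are conventional or hypothetical choices, and the network is assumed to have no isolated proteins.

```python
def pagerank(adj, damping=0.85, iterations=50):
    """Plain PageRank on an undirected graph given as {node: set(neighbors)}.

    Assumes adjacency is symmetric and every node has at least one neighbor.
    """
    nodes = sorted(adj)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            # Each neighbor shares its current rank equally among its own neighbors.
            incoming = sum(rank[nb] / len(adj[nb]) for nb in adj[node])
            updated[node] = (1.0 - damping) / n + damping * incoming
        rank = updated
    return rank
```

In a function-prediction setting, scores of this kind would usually be propagated from annotated proteins toward unannotated ones; how IPR does this, and how preferential attachment enters, is left to the original paper.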

18.
Recently a number of computational approaches have been developed for the prediction of protein–protein interactions. Complete genome sequencing projects have provided the vast amount of information needed for these analyses. These methods utilize the structural, genomic, and biological context of proteins and genes in complete genomes to predict protein interaction networks and functional linkages between proteins. Given that experimental techniques remain expensive, time-consuming, and labor-intensive, these methods represent an important advance in proteomics. Some of these approaches utilize sequence data alone to predict interactions, while others combine multiple computational and experimental datasets to accurately build protein interaction maps for complete genomes. These methods represent a complementary approach to current high-throughput projects whose aim is to delineate protein interaction maps in complete genomes. We will describe a number of computational protocols for protein interaction prediction based on the structural, genomic, and biological context of proteins in complete genomes, and detail methods for protein interaction network visualization and analysis.
