Similar literature
 20 similar documents found (search time: 109 ms)
1.
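With the development of genome-scale high-throughput experimental identification techniques and computational prediction methods, large amounts of protein interaction data have emerged, but the relatively high proportion of false positives in large-scale protein interaction data compromises its quality. Bioinformatics methods can start from existing data and knowledge and systematically assess the reliability of large-scale protein interactions by computational means. This article reviews the characteristics of, and progress in, bioinformatics-based assessment of protein interaction reliability, covering process model design, dataset construction, feature selection and extraction of integrated attributes, the algorithms used, and illustrative examples.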

2.
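Advances in bioinformatics research on protein interactions   Total citations: 2; self-citations: 0; citations by others: 2
The molecular basis of life processes lies in interactions between biomolecules, among which protein-protein interactions are of paramount importance. Studying protein interactions is important for understanding the essence of life, for elucidating the pathogenic mechanisms of disease-causing microorganisms, and for developing new drugs that improve human health. Bioinformatics approaches to protein interactions have yielded many important results, but many problems remain unsolved. This article surveys the results and open problems of bioinformatics research on protein interactions from several angles: protein interaction databases, prediction methods, web services for predicting protein interactions, and protein interaction networks.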

3.
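Bioinformatics methods for predicting functional modules in protein interaction networks   Total citations: 1; self-citations: 0; citations by others: 1
Protein interactions underlie most life processes. With the development of high-throughput experimental techniques and computational prediction methods, an enormous amount of protein interaction data has been obtained for a variety of organisms, and extracting biologically meaningful information from it is a formidable challenge. Building interaction networks from protein interaction data and then predicting the functional modules within them greatly aids protein function prediction and helps reveal the molecular mechanisms of various biochemical processes. We classify and summarize bioinformatics methods for predicting functional modules in protein interaction networks, discuss how these methods are evaluated, and introduce some methods for comparing protein interaction networks.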

4.
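Bioinformatics methods for predicting protein-protein interactions   Total citations: 8; self-citations: 0; citations by others: 8
In the post-genomic era, the research paradigm has shifted from sequence-structure-function to gene expression-systems dynamics-physiological function. Establishing the complete network of protein-protein interactions, the interactome, will deepen our understanding of cell structure and function from a systems perspective and provide a theoretical basis for discovering new drug targets and for drug design. A series of experimental methods for systematically analyzing protein interactions has been established, and in recent years a variety of bioinformatics methods for predicting protein interactions have emerged. These methods are not only a valuable complement to traditional experimental methods but can also extend the predictive range of experimental approaches; in addition, some important concepts in molecular evolution and molecular biology were established in the course of developing them. This article reviews the principles, evaluation and outstanding problems of nine bioinformatics methods and analyzes the prospects of the field.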

5.
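Predicting protein interactions is one of the important problems in bioinformatics. This paper proposes a protein interaction prediction method based on optimized physicochemical properties. In marked contrast to existing methods, it does not use known physicochemical properties of amino acid residues; instead, it uses particle swarm optimization to obtain physicochemical property values that benefit interaction prediction. Tests on real datasets show that the optimized property values improve prediction performance more than existing physicochemical properties do, and that the method has certain advantages over other methods, indicating that it is an effective approach to protein interaction prediction.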

6.
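[Objective] To analyze the structure and characteristics of the SGTA gene and its protein by bioinformatics, providing a theoretical basis for studying the relationship between SGTA and tumor formation and progression. [Methods] Bioinformatics databases and software were used to analyze the structure of the SGTA gene, its single nucleotide polymorphism (SNP) sites, the interaction network between SGTA and other genes, and the physicochemical properties, secondary structure, protein domains, post-translational modifications and protein-protein interaction network of the SGTA protein. [Results] The human SGTA gene has five alternative splicing products, and its coding region contains 78 SNP sites, including 31 missense mutations and one nonsense mutation. The human SGTA protein consists of 313 amino acids and is a hydrophilic protein of low stability; the α-helix is its main secondary structure element, it belongs to the TPR superfamily, and it is predicted to have three kinase phosphorylation sites and several potential ubiquitination sites. Most of the genes and proteins that interact with SGTA are related to molecular chaperone functions that maintain protein stability in vivo. [Conclusion] The bioinformatics analysis of the SGTA gene and its protein lays a foundation for further experimental study of its role and regulatory mechanisms in tumor formation and progression.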

7.
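Application of computational methods in protein interaction research   Total citations: 3; self-citations: 1; citations by others: 2
Computational methods play an important role at every stage of protein interaction research. The authors give an overview of the application of computational methods to the study of protein interactions and interaction networks, covering the following aspects: protein interaction databases and their development; the application of data mining methods to collecting and integrating protein interaction data; validation of results from high-throughput experiments; prediction and inference of the functions of unknown proteins from protein interaction networks; and the prediction of protein interactions.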

8.
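Advances in large-scale protein function prediction methods   Total citations: 2; self-citations: 0; citations by others: 2
The rapid development of whole-genome sequencing has produced a wealth of sequence information and, with it, an urgent need for functional information; large-scale protein function prediction by bioinformatics methods has developed in response to this need. These prediction methods have progressed from those based on sequence homology to those that use genomic context to identify functionally related protein pairs. Genomic-context methods include gene fusion, chromosomal proximity, and similar phylogenetic profiles. Because each method has its own biases, the latest trend is to integrate data from multiple methods into a protein interaction network and to predict protein function by analyzing the network's structure.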
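
As a rough illustration of the "similar phylogenetic profiles" idea mentioned in entry 8, the sketch below scores two proteins by the Jaccard similarity of their presence/absence profiles across a set of genomes. The function name, the example profiles and the choice of Jaccard similarity are all assumptions made for illustration; the abstract does not specify a similarity measure.

```python
def profile_similarity(profile_a, profile_b):
    """Jaccard similarity between two presence/absence profiles.

    Each profile is a sequence of 0/1 flags, one per genome, in the
    same genome order (an assumed encoding).
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    # Two proteins absent from every genome share no evidence: score 0.
    return both / either if either else 0.0


# Hypothetical profiles over five genomes.
profiles = {
    "protA": [1, 1, 0, 1, 0],
    "protB": [1, 1, 0, 1, 0],
    "protC": [0, 0, 1, 0, 1],
}

if __name__ == "__main__":
    for x, y in [("protA", "protB"), ("protA", "protC")]:
        print(x, y, profile_similarity(profiles[x], profiles[y]))
```

Pairs with a high score would be candidates for a functional link; any cutoff would have to be chosen empirically.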

9.
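Domains are evolutionarily conserved sequence units and the standard building blocks of protein structure and function. A typical interaction between two proteins involves binding between specific domains, and identifying interacting domains is very important for understanding protein function and evolution thoroughly at the domain level, constructing protein interaction networks, and analyzing biological pathways. At present, through further mining of experimental data and computational prediction from various kinds of input data, a number of interacting or functionally linked domain pairs have been identified, and rich, continually updated domain interaction databases have been built from them. This article reviews eight computational methods for predicting domain interactions; introduces information on, and the latest developments in, five public domain interaction databases (3DID, iPfam, InterDom, DIMA and DOMINE); and illustrates with examples the application of domain interactions to the computational prediction and reliability assessment of protein interactions, protein domain annotation, and biological pathway analysis.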

10.
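Proteomics is an emerging discipline devoted to the study of the proteome. It can reveal dynamic changes in the composition and modification state of proteins within the cell, investigate interactions between proteins, and address complex problems such as metabolism, development and regulation from a global perspective. Because proteomics covers a very wide range of research, it requires a variety of techniques, such as two-dimensional gel electrophoresis (2-DE), chromatography, protein chips, mass spectrometry and bioinformatics. Separating proteins by 2-DE and then identifying them by biological mass spectrometry and bioinformatics is currently the most commonly used research strategy. Because this technology platform involves many influencing factors, many beginners find it hard to master quickly. This lecture series on experimental methods will therefore describe in detail a number of classic protocols on this routine platform (sample preparation, 2-DE, staining, mass spectrometric identification, and so on) and discuss in depth some common problems and difficulties. In addition, the series will cover liquid chromatography-mass spectrometry, a large-scale automated proteome analysis system that has emerged in recent years, with the aim of broadening researchers' view of experimental methods.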

11.
Li Z, Zhou X, Dai Z, Zou X. Amino Acids, 2012, 43(2): 793-804
The coupling between G protein-coupled receptors (GPCRs) and guanine nucleotide-binding proteins (G proteins) regulates various signal transductions from extracellular space into the cell. However, the coupling mechanism between GPCRs and G proteins is still unknown, and experimental determination of their coupling specificity and function is both expensive and time consuming. Therefore, it is important to develop a theoretical method to predict the coupling specificity between GPCRs and G proteins, as well as their function, from their primary sequences. In this study, a novel four-layer predictor (GPCRsG_CWTIT) based on support vector machine (SVM), continuous wavelet transform (CWT) and information theory (IT) is developed to classify G proteins and predict the coupling specificity between GPCRs and G proteins. SVM is used to construct the models. CWT and IT are used to characterize the primary structure of the proteins. The performance of GPCRsG_CWTIT is evaluated with cross-validation tests on various working datasets. The overall accuracy for G proteins at the levels of class and family is 98.23% and 85.42%, respectively. The accuracy of the coupling specificity prediction varies from 74.60% to 94.30%. These results indicate that the proposed predictor is an effective and feasible tool to predict the coupling specificity between GPCRs and G proteins, as well as their functions, using only the full protein sequence. The establishment of such an accurate prediction method will facilitate drug discovery by improving the ability to identify and predict protein-protein interactions. GPCRsG_CWTIT and the dataset can be acquired freely on request from the authors.

12.
13.
To predict biologically significant attributes such as function from protein sequences, searching large databases for homologous proteins is common practice. In particular, BLAST and HMMER are widely used in a variety of biological fields. However, sequence-homologous proteins determined by BLAST and proteins having the same domains predicted by HMMER are not always functionally equivalent, even though their sequences align with high similarity. Thus, accurate assignment of functionally equivalent proteins from aligned sequences remains a challenge in bioinformatics. We have developed the FEP-BH algorithm to predict functionally equivalent proteins from protein-protein pairs identified by BLAST and from protein-domain pairs predicted by HMMER. When examined against domain classes of the Pfam-A seed database, FEP-BH showed 71.53% accuracy, whereas BLAST and HMMER showed 57.72% and 36.62%, respectively. We expect that the FEP-BH algorithm will be effective in predicting functionally equivalent proteins from BLAST and HMMER outputs and will also suit biologists who want to find functionally equivalent proteins among sequence-homologous proteins.

14.
A simplified protein surface cartography approach has been developed to assist in the analysis of surface features in homologous families, and thus to predict conservation or divergence of protein functions and protein-protein interaction patterns. A spherical approximation of the protein surface was used, with a focus on charged and hydrophobic residues. The resulting surface map allows for qualitative analysis and comparison of protein surfaces, but can also be used to define a simple numerical measure of map similarity between two or more proteins. The latter was shown to be useful for function-based classifications within large protein families. Surface map analysis was tested on several cases: haemoglobins, death domains and TRAF domains. It was shown that surface map comparison allows better function prediction than general sequence analysis methods and can reproduce known examples of functional variation within a divergent group of proteins. In another example, we predict novel, unexpected sets of common functional properties for seemingly distant members of a large group of divergent proteins. The method was also shown to be robust enough to allow the use of protein models from comparative modelling instead of experimental structures.

15.
Mitochondria are essential organelles, not only to the human cell, but to all eukaryotic life. This essentiality is reflected in the large number of mutations in genes encoding mitochondrial proteins that lead to disease. Aside from their relevance to disease, mitochondria are, given their endosymbiotic origin, very interesting from an evolutionary point of view. Here, in the year that marks the bicentenary of Darwin's birth and the 150th anniversary of the publication of "On the origin of species", we review approaches that implicitly or explicitly use evolutionary analyses to find new genes involved in mitochondrial disease and to predict their function and involvement in pathways. We show how the phenotypic spectrum of mitochondrial disease is linked to the evolutionary origin of mitochondrial proteins, how combinations of evolutionary data and genomics data have been used to predict the mitochondrial proteome and functional links between the mitochondrial proteins, and how the evolution of the mitochondrial proteome has been used to predict new mitochondrial disease genes. For the latter we review and reanalyze the eukaryotic evolution of the NADH:ubiquinone oxidoreductase (complex I) and the proteins involved in its assembly.

16.
Genomics projects have elucidated several genes that encode protein sequences. Subsequently, the advent of the proteomics age has enabled the synthesis and 3D structure determination of these protein sequences. Some of these proteins incorporate metal atoms, but it is often not known whether they are metal-binding proteins, and the nature of their biological activity is not understood. Consequently, the development of methods to predict the metal-mediated biological activity of proteins from the 3D structure of metal-unbound proteins is a goal of major importance. More specifically, the amino terminal Cu(II)- and Ni(II)-binding (ATCUN) motif is a small metal-binding site found at the N-terminus of many naturally occurring proteins. The ATCUN motif participates in DNA cleavage and has anti-tumor activity. In this study, we calculated average 3D electrostatic potentials (xi(k)) for 265 different proteins, including 133 potential ATCUN anti-tumor proteins. We also calculated xi(k) values for the total protein or for the following specific protein regions: the core, inner, middle, and outer orbits. A linear discriminant analysis model was subsequently developed to assign proteins to two groups, ATCUN DNA-cleavage proteins and non-active proteins. The best model found was: ATCUN = 1.15·xi(1)(inner) + 2.18·xi(5)(middle) + 27.57·xi(0)(outer) - 27.57·xi(0)(total) + 0.09. The model correctly classified 182 out of 197 (91.4%) and 61 out of 66 (92.4%) proteins in the training and external prediction series, respectively. Finally, desirability analysis was used to predict the values of the electrostatic potential in one single region, and the combined values in two regions, that are desirable for ATCUN-like proteins. To the best of our knowledge, the present work is the first study in which desirability analysis has been used in protein quantitative structure-activity relationship (QSAR) modelling.
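
The linear discriminant function quoted in entry 16 can be evaluated directly once the four electrostatic potential descriptors are known. The sketch below only computes the score from the published coefficients; the function and argument names are hypothetical, the input values are placeholders, and the abstract does not state the decision threshold, so none is applied here.

```python
def atcun_score(xi1_inner, xi5_middle, xi0_outer, xi0_total):
    """Evaluate the discriminant function quoted in the abstract.

    Arguments are the average electrostatic potentials xi(k) for the
    named regions; the argument names are hypothetical labels.
    """
    return (1.15 * xi1_inner
            + 2.18 * xi5_middle
            + 27.57 * xi0_outer
            - 27.57 * xi0_total
            + 0.09)


if __name__ == "__main__":
    # Placeholder descriptor values, not taken from the paper.
    print(atcun_score(0.2, 0.1, 0.05, 0.04))
```

Because the outer and total xi(0) terms carry equal and opposite coefficients, only their difference contributes to the score.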

17.
An integrated family of amino acid sequence analysis programs   Total citations: 12; self-citations: 0; citations by others: 12
In recent years abundant sequence data have become available due to the rapid progress in protein and DNA sequencing techniques. The exact three-dimensional structures, however, are available only for a fraction of proteins with known sequences. For many purposes the primary amino acid sequence of a protein can be directly used to predict important structural parameters. However, mathematical presentation of the calculated values often makes interpretation difficult, especially if many proteins must be analysed and compared. Here we introduce a broad-based, user-defined analysis of amino acid sequence information. The program package is based on published algorithms and is designed to access standard protein data bases, calculate hydropathy, surface probability and flexibility values and perform secondary structure predictions. The data output is in an 'easy-to-read' graphic format and several parameters can be superimposed within a single plot in order to simplify data interpretations. Additionally, this package includes a novel algorithm for the prediction of potential antigenic sites. Thus the software package presented here offers a powerful means of analysing an amino acid sequence for the purpose of structure/function studies as well as antigenic site analyses. These algorithms were written to function in context with the UWGCG (University of Wisconsin Genetics Computer Group) program collection, and are now distributed within that package. Received on March 20, 1987; accepted on September 4, 1987
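
To make the hydropathy calculation mentioned in entry 17 concrete, here is a minimal sliding-window sketch. The abstract says only that the package relies on published algorithms, so the use of the Kyte-Doolittle scale, the window size of 7 and the function name are assumptions for illustration, not a description of the UWGCG programs.

```python
# Kyte-Doolittle hydropathy index per residue (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy over each full window along the sequence.

    Returns len(sequence) - window + 1 values. Raises KeyError for
    residues outside the 20 standard amino acids.
    """
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical peptide: hydrophobic stretch followed by a charged one.
    for position, value in enumerate(hydropathy_profile("ILVLLAVKDEKRDE")):
        print(position, round(value, 2))
```

Plotting these values against position gives the kind of profile that the abstract describes superimposing with other parameters in a single plot.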

18.
MOTIVATION: Structural genomics projects are beginning to produce protein structures with unknown function; therefore, accurate, automated predictors of protein function are required if all these structures are to be properly annotated in reasonable time. Identifying the interface between two interacting proteins provides important clues to the function of a protein and can reduce the search space required by docking algorithms to predict the structures of complexes. RESULTS: We have combined a support vector machine (SVM) approach with surface patch analysis to predict protein-protein binding sites. Using a leave-one-out cross-validation procedure, we were able to successfully predict the location of the binding site on 76% of our dataset, which is made up of proteins with both transient and obligate interfaces. With heterogeneous cross-validation, where we trained the SVM on transient complexes to predict on obligate complexes (and vice versa), we still achieved success rates comparable to the leave-one-out cross-validation, suggesting that sufficient properties are shared between transient and obligate interfaces. AVAILABILITY: A web application based on the method can be found at http://www.bioinformatics.leeds.ac.uk/ppi_pred. The dataset of 180 proteins used in this study is also available via the same web site. CONTACT: westhead@bmb.leeds.ac.uk SUPPLEMENTARY INFORMATION: http://www.bioinformatics.leeds.ac.uk/ppi-pred/supp-material.

19.

Background  

Bioinformatics can be used to predict protein function, leading to an understanding of cellular activities, and equally-weighted protein-protein interactions (PPI) are normally used to predict such protein functions. The present study provides a weighting strategy for PPI to improve the prediction of protein functions. The weights are dependent on the local and global network topologies and the number of experimental verification methods. The proposed methods were applied to the yeast proteome and integrated with the neighbour counting method to predict the functions of unknown proteins.
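
The combination of edge weights with neighbour counting described in entry 19 can be sketched as follows: each annotated neighbour votes for its functions, and a vote counts for the weight of the connecting edge rather than for 1. The abstract does not give the weighting formula, so the weights here are simply supplied as inputs; the protein names, functions and weights are hypothetical.

```python
from collections import defaultdict


def predict_functions(protein, edges, annotations, top_k=1):
    """Weighted neighbour counting for one unannotated protein.

    edges: dict mapping (protein_a, protein_b) -> weight (undirected).
    annotations: dict mapping protein -> set of known functions.
    Returns the top_k (function, score) pairs, highest score first.
    """
    scores = defaultdict(float)
    for (a, b), weight in edges.items():
        if protein == a:
            neighbour = b
        elif protein == b:
            neighbour = a
        else:
            continue
        for function in annotations.get(neighbour, ()):
            scores[function] += weight
    # Sort by descending score, then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical network: one strongly supported edge, two weak ones.
edges = {
    ("YFG1", "P1"): 0.9,
    ("YFG1", "P2"): 0.3,
    ("YFG1", "P3"): 0.3,
    ("P1", "P2"): 0.5,
}
annotations = {"P1": {"transport"}, "P2": {"kinase"}, "P3": {"kinase"}}

if __name__ == "__main__":
    print(predict_functions("YFG1", edges, annotations)[0][0])
```

With equal weights the two "kinase" neighbours would outvote the single "transport" neighbour; with the assumed weights the better supported interaction wins, which is the effect a weighting strategy aims for.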

20.
Liang S, Grishin NV. Proteins, 2004, 54(2): 271-281
We have developed an effective scoring function for protein design. The atomic solvation parameters, together with the weights of energy terms, were optimized so that residues corresponding to the native sequence were predicted with low energy in the training set of 28 protein structures. The solvation energy of non-hydrogen-bonded hydrophilic atoms was considered separately and expressed in a nonlinear way. As a result, our scoring function predicted native residues as the most favorable in 59% of the total positions in 28 proteins. We then tested the scoring function by comparing the predicted stability changes for 103 T4 lysozyme mutants with the experimental values. The correlation coefficients were 0.77 for surface mutations and 0.71 for all mutations. Finally, the scoring function combined with Monte Carlo simulation was used to predict favorable sequences on a fixed backbone. The designed sequences were similar to the natural sequences of the family to which the template structure belonged. The profile of the designed sequences was helpful for identification of remote homologues of the native sequence.
