Similar literature
 Found 20 similar documents
1.
DIP: the database of interacting proteins   Total citations: 24, self-citations: 3, citations by others: 21
The Database of Interacting Proteins (DIP; http://dip.doe-mbi.ucla.edu) is a database that documents experimentally determined protein-protein interactions. This database is intended to provide the scientific community with a comprehensive and integrated tool for browsing and efficiently extracting information about protein interactions and interaction networks in biological processes. Beyond cataloging details of protein-protein interactions, the DIP is useful for understanding protein function and protein-protein relationships, studying the properties of networks of interacting proteins, benchmarking predictions of protein-protein interactions, and studying the evolution of protein-protein interactions.

2.
One possible path towards understanding the biological function of a target protein is through the discovery of how it interfaces within protein-protein interaction networks. The goal of this study was to create a virtual protein-protein interaction model using the concepts of orthologous conservation (or interologs) to elucidate the interacting networks of a particular target protein. POINT (the prediction of interactome database) is a functional database for the prediction of the human protein-protein interactome based on available orthologous interactome datasets. POINT integrates several publicly accessible databases, with emphasis placed on the extraction of a large quantity of mouse, fruit fly, worm and yeast protein-protein interaction datasets from the Database of Interacting Proteins (DIP), followed by their conversion into a predicted human interactome. In addition, protein-protein interactions require both temporal synchronicity and precise spatial proximity. POINT therefore also incorporates correlated mRNA expression clusters obtained from cell cycle microarray databases and subcellular localization from Gene Ontology to further pinpoint the likelihood of biological relevance of each predicted set of interacting protein partners.
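The interolog idea in this abstract can be illustrated with a minimal Python sketch. It is not POINT's actual pipeline: the function name `predict_interologs`, the data layout, and the toy identifiers are all assumptions made for illustration. An interaction observed in a model organism is transferred to human only when both partners have at least one human ortholog.

```python
from itertools import product


def predict_interologs(model_interactions, ortholog_map):
    """Transfer model-organism interactions to human through orthology.

    model_interactions: iterable of (protein_a, protein_b) pairs.
    ortholog_map: dict mapping a model-organism protein ID to a set of
        human protein IDs (hypothetical layout, chosen for this sketch).
    Returns a set of unordered human pairs stored as sorted tuples.
    """
    predicted = set()
    for a, b in model_interactions:
        human_a = ortholog_map.get(a, ())
        human_b = ortholog_map.get(b, ())
        # Every combination of orthologs yields one predicted human pair.
        for ha, hb in product(human_a, human_b):
            predicted.add(tuple(sorted((ha, hb))))
    return predicted


# Hypothetical toy data: YC has no human ortholog, so YB-YC is not transferred.
yeast_ppi = [("YA", "YB"), ("YB", "YC")]
orthologs = {"YA": {"H1"}, "YB": {"H2", "H3"}}
human_pred = predict_interologs(yeast_ppi, orthologs)
```

A filter on co-expression or shared subcellular localization, as the abstract describes, would then be applied to the returned pairs; that step is omitted here.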

3.
High-throughput methods for detecting protein interactions require assessment of their accuracy. We present two forms of computational assessment. The first method is the expression profile reliability (EPR) index. The EPR index estimates the biologically relevant fraction of protein interactions detected in a high-throughput screen. It does so by comparing the RNA expression profiles for the proteins whose interactions are found in the screen with expression profiles for known interacting and non-interacting pairs of proteins. The second form of assessment is the paralogous verification method (PVM). This method judges an interaction likely if the putatively interacting pair has paralogs that also interact. In contrast to the EPR index, which evaluates datasets of interactions, PVM scores individual interactions. On a test set, PVM correctly identifies 40% of true interactions with a false positive rate of approximately 1%. EPR and PVM were applied to the Database of Interacting Proteins (DIP), a large and diverse collection of protein-protein interactions that contains over 8000 Saccharomyces cerevisiae pairwise protein interactions. Using these two methods, we estimate that approximately 50% of them are reliable, and with the aid of PVM we confidently identify 3003 of them. Web servers for both the PVM and EPR methods are available on the DIP website (dip.doe-mbi.ucla.edu/Services.cgi).
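The core test behind paralogous verification, as stated in the abstract, is whether paralogs of the two proteins are themselves known to interact. The Python sketch below shows only that counting step under simplifying assumptions. The function `paralog_support`, the dictionary layout, and the toy identifiers are hypothetical, and the published method's scoring details are not reproduced here.

```python
def paralog_support(pair, paralogs, known_interactions):
    """Count known interactions between paralogs of the two query proteins.

    pair: (protein_a, protein_b), the putative interaction being scored.
    paralogs: dict mapping a protein ID to a set of its paralogs
        (hypothetical layout, chosen for this sketch).
    known_interactions: iterable of (protein, protein) pairs.
    """
    a, b = pair
    known = {frozenset(p) for p in known_interactions}
    query = frozenset(pair)
    support = 0
    for pa in paralogs.get(a, ()):
        for pb in paralogs.get(b, ()):
            candidate = frozenset((pa, pb))
            # The query pair itself must not count as its own evidence.
            if candidate != query and candidate in known:
                support += 1
    return support


# Hypothetical toy data.
paralogs = {"A": {"A2"}, "B": {"B2", "B3"}}
known = [("A2", "B2"), ("B3", "A2"), ("A", "B")]
support_ab = paralog_support(("A", "B"), paralogs, known)
support_ac = paralog_support(("A", "C"), paralogs, known)
```

A pair with a support count of zero is simply unverified by this criterion, not shown to be false, which is consistent with the abstract's modest recall (40%) at a low false positive rate.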

4.
Han DS  Kim HS  Jang WH  Lee SD  Suh JK 《Nucleic acids research》2004,32(21):6312-6320
With the accumulation of protein data and related data on the Internet, many domain-based computational techniques to predict protein interactions have been developed. However, most techniques still have many limitations in practical use. They usually suffer from low accuracy in prediction and do not provide any interaction possibility ranking method for multiple protein pairs. In this paper, we propose a probabilistic framework to predict the interaction probability of proteins and develop an interaction possibility ranking method for multiple protein pairs. Using the ranking method, one can discern which of several protein pairs are more likely to interact with each other. The validity of the prediction model was evaluated using an interacting set of protein pairs in yeast and an artificially generated non-interacting set of protein pairs. When 80% of the set of interacting protein pairs in the DIP (Database of Interacting Proteins) was used as a learning set of interacting protein pairs, high sensitivity (77%) and specificity (95%) were achieved for the test groups containing common domains with the learning set of proteins within our framework. The stability of the prediction model was also evident when tested over DIP CORE, HMS-PCI and TAP data. In the validation of the ranking method, we reveal that some correlations exist between the interacting probability and the accuracy of the prediction.

5.
As protein-protein interaction is intrinsic to most cellular processes, the ability to predict which proteins in the cell interact can aid significantly in identifying the function of newly discovered proteins, and in understanding the molecular networks they participate in. Here we demonstrate that characteristic pairs of sequence-signatures can be learned from a database of experimentally determined interacting proteins, where one protein contains one sequence-signature and its interacting partner contains the other sequence-signature. The sequence-signatures that recur in concert in various pairs of interacting proteins are termed correlated sequence-signatures, and it is proposed that they can be used for predicting putative pairs of interacting partners in the cell. We demonstrate the potential of this approach on a comprehensive database of experimentally determined pairs of interacting proteins in the yeast Saccharomyces cerevisiae. The proteins in this database have been characterized by their sequence-signatures, as defined by the InterPro classification. A statistical analysis performed on all possible combinations of sequence-signature pairs has identified those pairs that are over-represented in the database of yeast interacting proteins. It is demonstrated how the use of the correlated sequence-signatures as identifiers of interacting proteins can significantly reduce the search space, and enable directed experimental interaction screens.
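One simple way to express "over-represented signature pairs" is a log-odds score that compares how often a signature pair occurs among interacting protein pairs with how often it occurs among all possible protein pairs. The Python sketch below uses that score as a stand-in. The abstract does not specify the statistic used, so the scoring formula, the function names, and the toy signature labels are assumptions.

```python
import math
from collections import Counter
from itertools import combinations, product


def carried_pairs(sigs_a, sigs_b):
    """Unordered signature pairs spanned by two proteins' signature sets."""
    return {tuple(sorted((s, t))) for s, t in product(sigs_a, sigs_b)}


def signature_pair_log_odds(interactions, signatures):
    """log2(observed frequency / background frequency) per signature pair.

    interactions: pairs of distinct proteins, all present in `signatures`.
    signatures: dict mapping protein ID -> set of signature labels.
    The background is every unordered pair of distinct proteins.
    """
    interactions = list(interactions)
    observed = Counter()
    for a, b in interactions:
        observed.update(carried_pairs(signatures[a], signatures[b]))

    background = Counter()
    n_background = 0
    for a, b in combinations(sorted(signatures), 2):
        n_background += 1
        background.update(carried_pairs(signatures[a], signatures[b]))

    n_observed = len(interactions)
    return {
        pair: math.log2((count / n_observed) / (background[pair] / n_background))
        for pair, count in observed.items()
    }


# Hypothetical toy data: two SH3-containing proteins each bind a PRM protein.
signatures = {"P1": {"SH3"}, "P2": {"PRM"}, "P3": {"SH3"}, "P4": {"PRM"}}
scores = signature_pair_log_odds([("P1", "P2"), ("P3", "P4")], signatures)
```

A positive score marks a pair seen more often among interactions than chance pairing would suggest. A real analysis would also need a significance test and a correction for the many signature pairs examined.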

6.
Molecular understanding of disease processes can be accelerated if all interactions between the host and pathogen are known. The unavailability of experimental methods for large-scale detection of interactions across host and pathogen organisms hinders this process. Here we apply a simple method to predict protein-protein interactions across host and pathogen organisms. We use homology detection approaches against the protein-protein interaction databases DIP and iPfam in order to predict interacting proteins in a host-pathogen pair. In the present work, we first applied this approach to test cases involving the pairs phage T4-Escherichia coli and phage lambda-E. coli and show that previously known interactions could be recognized using our approach. We further apply this approach to predict interactions between human and three pathogens: E. coli, Salmonella enterica typhimurium and Yersinia pestis. We identified several novel interactions involving proteins of host or pathogen that could be thought of as highly relevant to the disease process. Serendipitously, many interactions involve hypothetical proteins of as yet unknown function. Hypothetical proteins are predicted from computational analysis of genome sequences, with no laboratory analysis of their functions yet available. The predicted interactions involving such proteins could provide hints to their functions.

7.
The Database of Interacting Proteins (DIP; http://dip.doe-mbi.ucla.edu) is a database that documents experimentally determined protein-protein interactions. Since January 2000 the number of protein-protein interactions in DIP has nearly tripled to 3472 and the number of proteins to 2659. New interactive tools have been developed to aid in the visualization, navigation and study of networks of protein interactions.

8.
The Human Protein Reference Database (HPRD) is a rich resource of experimentally proven features of human proteins. Protein information in HPRD includes protein-protein interactions, post-translational modifications, enzyme/substrate relationships, disease associations, tissue expression, and subcellular localization of human proteins. Although protein-protein interaction data from HPRD have been widely used by the scientific community, its phosphoproteome data have not been exploited to their full potential. HPRD is one of the largest documentations of human phosphoproteins in the public domain. Currently, phosphorylation data in HPRD comprise 95,016 phosphosites mapped onto 13,041 proteins. Additionally, enzyme-substrate reactions responsible for 5930 phosphorylation events were also documented. Significant improvements in technologies and high-throughput platforms in biomedical investigations led to an exponential increase of biological data and phosphoproteomic data in recent years. Human Proteinpedia, a community annotation portal developed by us, has also contributed to the significant increase in phosphoproteomic data in HPRD. A large number of phosphorylation events have been mapped onto reference sequences available in HPRD and Human Proteinpedia along with associated protein features. This will provide a platform for systems biology approaches to determine the role of protein phosphorylation in protein function, cell signaling, biological processes and their implication in human diseases. This review aims to provide a composite view of phosphoproteomic data pertaining to human proteins in HPRD and Human Proteinpedia.

9.
10.
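Biological pathways are widely used in gene function research, but current pathway knowledge is incomplete and needs further expansion. Bioinformatic prediction offers an effective and economical route to pathway expansion. This article proposes a new method for gene pathway prediction that integrates protein-protein interaction knowledge with information from the Gene Ontology (GO) database. First, the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways that contain the target gene's neighbors at the protein-protein interaction level are selected as candidate pathways; then the pathway membership of the target gene is decided by testing whether the genes in a candidate pathway are enriched at the GO nodes associated with the target gene. Predictions were made using protein-protein interaction information from the Human Protein Reference Database (HPRD) and from the Biological General Repository for Interaction Datasets (BioGRID). The results show that, in both data sets, the average accuracy (the proportion of all annotated pathways of the target genes that are successfully predicted) and the relative accuracy (among genes with at least one annotated pathway successfully predicted, the proportion of genes for which all annotated pathways are predicted correctly) both rise as the number of interaction neighbors increases. When the number of interaction neighbors reaches 22, the average accuracy reaches 96.2% (HPRD) and 96.3% (BioGRID), and the relative accuracy is 93.3% (HPRD) and 84.1% (BioGRID). The method was further validated with a newer database release on 89 genes whose annotations had been updated relative to the older release: at least one updated pathway was predicted correctly for 50 genes, and for 43 of these the updated pathways were predicted entirely correctly, a relative accuracy of 86.0%. These results indicate that the method is a reliable and effective approach to pathway expansion.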
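The two steps described in this abstract (collect candidate pathways from interaction neighbors, then test GO enrichment) can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation: the abstract does not name the enrichment test, so the one-sided hypergeometric test, the `alpha` threshold, the function names, and the toy identifiers are all assumptions, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric draw without replacement."""
    total = comb(population, draws)
    upper = min(successes, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def predict_pathways(target, ppi_neighbors, gene_pathways, go_terms, alpha=0.05):
    """Assign `target` to candidate pathways whose genes are enriched
    for at least one GO term annotated to the target."""
    # Background: every GO-annotated gene except the target itself.
    background = set(go_terms) - {target}

    # Step 1: candidate pathways are those containing a PPI neighbor.
    candidates = set()
    for neighbor in ppi_neighbors.get(target, ()):
        candidates |= gene_pathways.get(neighbor, set())

    # Step 2: enrichment of pathway genes at the target's GO terms.
    predicted = set()
    for pathway in candidates:
        members = {g for g in background if pathway in gene_pathways.get(g, set())}
        for term in go_terms.get(target, ()):
            annotated = {g for g in background if term in go_terms[g]}
            overlap = len(members & annotated)
            p = hypergeom_tail(overlap, len(background), len(annotated), len(members))
            if p < alpha:
                predicted.add(pathway)
                break
    return predicted


# Hypothetical toy data: ten background genes plus the target gene "T".
go_terms = {f"g{i}": {"GO:A"} for i in range(1, 4)}
go_terms.update({f"g{i}": {"GO:B"} for i in range(4, 11)})
go_terms["T"] = {"GO:A"}
gene_pathways = {"g1": {"pw1"}, "g2": {"pw1"}, "g3": {"pw1"},
                 "g4": {"pw2"}, "g5": {"pw2"}, "g6": {"pw2"}}
ppi_neighbors = {"T": {"g1", "g4"}}
result = predict_pathways("T", ppi_neighbors, gene_pathways, go_terms)
```

In the toy data both pathways are candidates because each contains a neighbor of the target, but only the pathway whose genes share the target's GO term passes the enrichment test. The second step is what prunes candidates introduced by unrelated neighbors.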

11.
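Protein interaction databases and their applications   Total citations: 3, self-citations: 0, citations by others: 3
Knowledge of protein interactions and their networks not only deepens understanding of the nature of life processes and the mechanisms of disease, but can also provide targets for drug development. Large volumes of protein interaction data have now been obtained through high-throughput screening, computational prediction and literature mining, and many content-rich, continually updated protein interaction databases have been built from them. This article first briefly describes the three methods by which large-scale protein interaction data are generated, then focuses on several human-related public protein interaction databases, including HPRD, BIND, IntAct, MINT, DIP and MIPS, and outlines the integration of protein interaction databases and their application in constructing protein interaction networks.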

12.
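With the rapid development of proteomics and the accumulation of knowledge about the functional mechanisms of biological macromolecules, massive amounts of protein interaction data have emerged. In response, researchers have developed more than 300 protein interaction databases for storing, presenting and reusing these data. Protein interaction databases are a valuable resource for systems biology, molecular biology and clinical drug research. This article divides the databases into three categories: (1) comprehensive protein interaction databases; (2) species-specific protein interaction databases; (3) biological pathway databases. It focuses on commonly used protein interaction databases, including BioGRID, STRING, IntAct, MINT, DIP, IMEx, HPRD, Reactome and KEGG.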

13.
Protein domains are conserved and functionally independent structures that play an important role in interactions among related proteins. Domain-domain interactions have been recently used to predict protein-protein interactions (PPI). In general, the interaction probability of a pair of domains is scored using a trained scoring function. If the score satisfies a threshold, the protein pairs carrying those domains are regarded as "interacting". In this study, the signature contents of proteins were utilized to predict PPI pairs in Saccharomyces cerevisiae, Caenorhabditis elegans, and Homo sapiens. Similarity between protein signature patterns was scored and PPI predictions were drawn based on the binary similarity scoring function. Results show that the true positive rate of prediction by the proposed approach is approximately 32% higher than that using the maximum likelihood estimation method when compared with a test set, resulting in a 22% increase in the area under the receiver operating characteristic (ROC) curve. When proteins containing one or two signatures were removed, the sensitivity of the predicted PPI pairs increased significantly. The predicted PPI pairs are on average 11 times more likely to interact than a random selection at a confidence level of 0.95, and on average 4 times better than those predicted by either phylogenetic profiling or gene expression profiling.

14.
Recent advances in functional genomics have helped generate large-scale high-throughput protein interaction data. Such networks, though extremely valuable towards molecular level understanding of cells, do not provide any direct information about the regions (domains) in the proteins that mediate the interaction. Here, we performed co-evolutionary analysis of domains in interacting proteins in order to understand the degree of co-evolution of interacting and non-interacting domains. Using a combination of sequence and structural analysis, we analyzed protein-protein interactions in F1-ATPase, Sec23p/Sec24p, DNA-directed RNA polymerase and nuclear pore complexes, and found that the interacting domain pair(s) for a given interaction exhibit a higher level of co-evolution than the non-interacting domain pairs. Motivated by this finding, we developed a computational method to test the generality of the observed trend, and to predict large-scale domain-domain interactions. Given a protein-protein interaction, the proposed method predicts the domain pair(s) most likely to mediate the protein interaction. We applied this method to the yeast interactome to predict domain-domain interactions, and used known domain-domain interactions found in PDB crystal structures to validate our predictions. Our results show that the prediction accuracy of the proposed method is statistically significant. Comparison of our prediction results with those from two other methods reveals that only a fraction of predictions are shared by all three methods, indicating that the proposed method can detect known interactions missed by other methods. We believe that the proposed method can be used with other methods to help identify previously unrecognized domain-domain interactions on a genome scale, and could potentially help reduce the search space for identifying interaction sites.

15.
Lu L  Lu H  Skolnick J 《Proteins》2002,49(3):350-364
In this postgenomic era, the ability to identify protein-protein interactions on a genomic scale is very important to assist in the assignment of physiological function. Because of the increasing number of solved structures involving protein complexes, the time is ripe to extend threading to the prediction of quaternary structure. In this spirit, a multimeric threading approach has been developed. The approach comprises two phases. In the first phase, traditional threading on a single chain is applied to generate a set of potential structures for the query sequences. In particular, we use our recently developed threading algorithm, PROSPECTOR. Then, for those proteins whose template structures are part of a known complex, we rethread on both partners in the complex and now include a protein-protein interfacial energy. To perform this analysis, a database of multimeric protein structures has been constructed, the necessary interfacial pairwise potentials have been derived, and a set of empirical indicators to identify true multimers based on the threading Z-score and the magnitude of the interfacial energy has been established. The algorithm has been tested on a benchmark set comprising 40 homodimers, 15 heterodimers, and 69 monomers that were scanned against a protein library of 2478 structures that comprise a representative set of structures in the Protein Data Bank. Of these, the method correctly recognized and assigned 36 homodimers, 15 heterodimers, and 65 monomers. This protocol was applied to identify partners and assign quaternary structures of proteins found in the yeast database of interacting proteins. Our multimeric threading algorithm correctly predicts 144 interacting proteins, compared to the 56 (26) cases assigned by PSI-BLAST using a (less) permissive E-value of 1 (0.01). Next, all possible pairs of yeast proteins have been examined. Predictions (n = 2865) of protein-protein interactions are made; 1138 of these 2865 interactions have counterparts in the Database of Interacting Proteins. In contrast, PSI-BLAST made 1781 predictions, and 1215 have counterparts in DIP. An estimation of the false-negative rate for yeast-predicted interactions has also been provided. Thus, a promising approach to help assist in the assignment of protein-protein interactions on a genomic scale has been developed.

16.
Online predicted human interaction database   Total citations: 8, self-citations: 0, citations by others: 8
MOTIVATION: High-throughput experiments are being performed at an ever-increasing rate to systematically elucidate protein-protein interaction (PPI) networks for model organisms, while the complexities of higher eukaryotes have prevented these experiments for humans. RESULTS: The Online Predicted Human Interaction Database (OPHID) is a web-based database of predicted interactions between human proteins. It combines the literature-derived human PPI from BIND, HPRD and MINT, with predictions made from Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster and Mus musculus. The 23,889 predicted interactions currently listed in OPHID are evaluated using protein domains, gene co-expression and Gene Ontology terms. OPHID can be queried using single or multiple IDs and results can be visualized using our custom graph visualization program. AVAILABILITY: Freely available to academic users at http://ophid.utoronto.ca, both in tab-delimited and PSI-MI formats. Commercial users, please contact I.J. CONTACT: juris@ai.utoronto.ca SUPPLEMENTARY INFORMATION: http://ophid.utoronto.ca/supplInfo.pdf.

17.
18.
A long-standing question in molecular biology is whether interfaces of protein-protein complexes are more conserved than the rest of the protein surfaces. Although it has been reported that conservation can be used as an indicator for predicting interaction sites on proteins, there are recent reports stating that the interface regions are only slightly more conserved than the rest of the protein surfaces, with conservation signals not being statistically significant enough for predicting protein-protein binding sites. In order to properly address these controversial reports we have studied a set of 28 well resolved hetero complex structures of proteins that consist of transient and non-transient complexes. The surface positions were classified into four conservation classes and the conservation index of the surface positions was quantitatively analyzed. The results indicate that the surface density of highly conserved positions is significantly higher in the protein-protein interface regions compared with the other regions of the protein surface. However, the average conservation index of the patches in the interface region is not significantly higher compared with other surface regions of the protein structures. This finding demonstrates that the number of conserved residue positions is a more appropriate indicator for predicting protein-protein binding sites than the average conservation index in the interacting region. We have further validated our findings on a set of 59 benchmark complex structures. Furthermore, an analysis of 19 complexes of antigen-antibody interactions shows that there is no conservation of amino acid positions in the interacting regions of these complexes, as expected, with the variable region of the immunoglobulins interacting mostly with the antigens. Interestingly, antigen interacting regions also have a higher number of non-conserved residue positions in the interacting region than the rest of the protein surface.

19.
The Database of Interacting Proteins (DIP: http://dip.doe-mbi.ucla.edu) is a database that documents experimentally determined protein–protein interactions. It provides the scientific community with an integrated set of tools for browsing and extracting information about protein interaction networks. As of September 2001, the DIP catalogs ~11 000 unique interactions among 5900 proteins from >80 organisms; the vast majority from yeast, Helicobacter pylori and human. Tools have been developed that allow users to analyze, visualize and integrate their own experimental data with the information about protein–protein interactions available in the DIP database.

20.
Statistical analysis of domains in interacting protein pairs   Total citations: 10, self-citations: 0, citations by others: 10
MOTIVATION: Several methods have recently been developed to analyse large-scale sets of physical interactions between proteins in terms of physical contacts between the constituent domains, often with a view to predicting new pairwise interactions. Our aim is to combine genomic interaction data, in which domain-domain contacts are not explicitly reported, with the domain-level structure of individual proteins, in order to learn about the structure of interacting protein pairs. Our approach is driven by the need to assess the evidence for physical contacts between domains in a statistically rigorous way. RESULTS: We develop a statistical approach that assigns p-values to pairs of domain superfamilies, measuring the strength of evidence within a set of protein interactions that domains from these superfamilies form contacts. A set of p-values is calculated for SCOP superfamily pairs, based on a pooled data set of interactions from yeast. These p-values can be used to predict which domains come into contact in an interacting protein pair. This predictive scheme is tested against protein complexes in the Protein Quaternary Structure (PQS) database, and is used to predict domain-domain contacts within 705 interacting protein pairs taken from our pooled data set.
