Similar literature
20 similar articles found.
1.
Knowledge of protein and domain interactions provides crucial insights into their function within a cell. Several computational methods have been proposed to detect interactions between proteins and their constitutive domains. In this work, we focus on approaches based on correlated evolution (coevolution) of sequences of interacting proteins. In this type of approach, often referred to as the mirrortree method, a high correlation of evolutionary histories of two proteins is used as an indicator to predict protein interactions. Recently, it has been observed that subtracting the underlying speciation process by separating coevolution due to common speciation divergence from that due to common function of interacting pairs greatly improves the predictive power of the mirrortree approach. In this article, we investigate possible improvements and limitations of this method. In particular, we demonstrate that the performance of the mirrortree method can be further improved by restricting the coevolution analysis to the relatively conserved regions in the protein domain sequences (disregarding highly divergent regions). We provide a theoretical validation of our results leading to new insights into the interplay between coevolution and speciation of interacting proteins.
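The core mirrortree idea in the abstract above, scoring a protein pair by how well their evolutionary histories correlate, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes each protein family is already summarised as a symmetric matrix of pairwise evolutionary distances over the same ordered set of species, and the function names (`upper_triangle`, `pearson`, `mirrortree_score`) are hypothetical. The speciation-subtraction step and the restriction to conserved regions are not modelled here.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(dist):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(dist)
    return [dist[i][j] for i, j in combinations(range(n), 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / sqrt(var_x * var_y)


def mirrortree_score(dist_a, dist_b):
    """Correlate the pairwise distances of two families over shared species.

    Assumption: both matrices use the same species order.
    """
    if len(dist_a) != len(dist_b):
        raise ValueError("matrices must cover the same species")
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Toy example: family B evolves exactly twice as fast as family A.
dist_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
dist_b = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
score = mirrortree_score(dist_a, dist_b)
```

In the toy example the two families have proportional distances, so the score is at its maximum; real families would give intermediate values that are then compared against a threshold or a background distribution.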

2.
Recent advances in functional genomics have helped generate large-scale high-throughput protein interaction data. Such networks, though extremely valuable towards molecular level understanding of cells, do not provide any direct information about the regions (domains) in the proteins that mediate the interaction. Here, we performed co-evolutionary analysis of domains in interacting proteins in order to understand the degree of co-evolution of interacting and non-interacting domains. Using a combination of sequence and structural analysis, we analyzed protein-protein interactions in F1-ATPase, Sec23p/Sec24p, DNA-directed RNA polymerase and nuclear pore complexes, and found that the interacting domain pair(s) for a given interaction exhibit a higher level of co-evolution than the non-interacting domain pairs. Motivated by this finding, we developed a computational method to test the generality of the observed trend, and to predict large-scale domain-domain interactions. Given a protein-protein interaction, the proposed method predicts the domain pair(s) most likely to mediate the protein interaction. We applied this method to the yeast interactome to predict domain-domain interactions, and used known domain-domain interactions found in PDB crystal structures to validate our predictions. Our results show that the prediction accuracy of the proposed method is statistically significant. Comparison of our prediction results with those from two other methods reveals that only a fraction of predictions are shared by all three methods, indicating that the proposed method can detect known interactions missed by other methods. We believe that the proposed method can be used with other methods to help identify previously unrecognized domain-domain interactions on a genome scale, and could potentially help reduce the search space for identifying interaction sites.

3.
It has been observed that the evolutionary distances of interacting proteins often display a higher level of similarity than those of noninteracting proteins. This finding indicates that interacting proteins are subject to common evolutionary constraints and constitutes the basis of a method to predict protein interactions known as mirrortree. It has been difficult, however, to identify the direct cause of the observed similarities between evolutionary trees. One possible explanation is the existence of compensatory mutations between partners' binding sites to maintain proper binding. This explanation, though, has been recently challenged, and it has been suggested that the signal of correlated evolution uncovered by the mirrortree method is unrelated to any correlated evolution between binding sites. We examine the contribution of binding sites to the correlation between evolutionary trees of interacting domains. We show that binding neighborhoods of interacting proteins have, on average, higher coevolutionary signal compared with the regions outside binding sites; however, when the binding neighborhood is removed, the remaining domain sequence still contains some coevolutionary signal. In conclusion, the correlation between evolutionary trees of interacting domains cannot exclusively be attributed to the correlated evolution of the binding sites or to common evolutionary pressure exerted on the whole protein domain sequence, each of which contributes to the signal measured by the mirrortree approach.

4.
Polyketides, a diverse group of heteropolymers with antibiotic and antitumor properties, are assembled in bacteria by multiprotein chains of modular polyketide synthase (PKS) proteins. Specific protein-protein interactions determine the order of proteins within a multiprotein chain, and thereby the order in which chemically distinct monomers are added to the growing polyketide product. Here we investigate the evolutionary and molecular origins of protein interaction specificity. We focus on the short, conserved N- and C-terminal docking domains that mediate interactions between modular PKS proteins. Our computational analysis, which combines protein sequence data with experimental protein interaction data, reveals a hierarchical interaction specificity code. PKS docking domains are descended from a single ancestral interacting pair, but have split into three phylogenetic classes that are mutually noninteracting. Specificity within one such compatibility class is determined by a few key residues, which can be used to define compatibility subclasses. We identify these residues using a novel, highly sensitive co-evolution detection algorithm called CRoSS (correlated residues of statistical significance). The residue pairs selected by CRoSS are involved in direct physical interactions in a docked-domain NMR structure. A single PKS system can use docking domain pairs from multiple classes, as well as domain pairs from multiple subclasses of any given class. The termini of individual proteins are frequently shuffled, but docking domain pairs straddling two interacting proteins are linked as an evolutionary module. The hierarchical and modular organization of the specificity code is intimately related to the processes by which bacteria generate new PKS pathways.

5.
Physical interactions between proteins mediate a variety of biological functions, including signal transduction, physical structuring of the cell and regulation. While extensive catalogs of such interactions are known from model organisms, their evolutionary histories are difficult to study given the lack of interaction data from phylogenetic outgroups. Using phylogenomic approaches, we infer an upper bound on the time of origin for a large set of human protein-protein interactions, showing that most such interactions appear relatively ancient, dating no later than the radiation of placental mammals. By analyzing paired alignments of orthologous and putatively interacting protein-coding genes from eight mammals, we find evidence for weak but significant co-evolution, as measured by relative selective constraint, between pairs of genes with interacting proteins. However, we find no strong evidence for shared instances of directional selection within an interacting pair. Finally, we use a network approach to show that the distribution of selective constraint across the protein interaction network is non-random, with a clear tendency for interacting proteins to share similar selective constraints. Collectively, the results suggest that, on the whole, protein interactions in mammals are under selective constraint, presumably due to their functional roles.

6.
VY Muley  A Ranjan 《PloS one》2012,7(7):e42057

Background

Recent progress in computational methods for predicting physical and functional protein-protein interactions has provided new insights into the complexity of biological processes. Most of these methods assume that functionally interacting proteins are likely to have a shared evolutionary history. This history can be traced for the protein pairs of a query genome by correlating different evolutionary aspects of their homologs in multiple genomes, known as the reference genomes. These methods include phylogenetic profiling, gene neighborhood and co-occurrence of the orthologous protein-coding genes in the same cluster or operon. These are collectively known as genomic context methods. On the other hand, a method called mirrortree is based on the similarity of phylogenetic trees between two interacting proteins. Comprehensive performance analyses of these methods have been frequently reported in the literature. However, very few studies provide insight into the effect of reference genome selection on the detection of meaningful protein interactions.

Methods

We analyzed the performance of four methods and their variants to understand the effect of reference genome selection on prediction efficacy. We used six sets of reference genomes, sampled in accordance with phylogenetic diversity and relationship between organisms from 565 bacteria. We used Escherichia coli as a model organism and the gold standard datasets of interacting proteins reported in DIP, EcoCyc and KEGG databases to compare the performance of the prediction methods.

Conclusions

Higher performance for predicting protein-protein interactions was achievable even with 100–150 bacterial genomes out of 565 genomes. Inclusion of archaeal genomes in the reference genome set improves performance. We find that in order to obtain good performance, it is better to sample a few genomes of related genera of prokaryotes from the large number of available genomes. Moreover, such sampling allows for selecting 50–100 genomes for comparable accuracy of predictions when computational resources are limited.

7.
Protein-protein interactions play crucial roles in biological processes. Experimental methods have been developed to survey the proteome for interacting partners and some computational approaches have been developed to extend the impact of these experimental methods. Computational methods are routinely applied to newly discovered genes to infer protein function and plausible protein-protein interactions. Here, we develop and extend a quantitative method that identifies interacting proteins based upon the correlated behavior of the evolutionary histories of protein ligands and their receptors. We have studied six families of ligand-receptor pairs including: the syntaxin/Unc-18 family, the GPCR/G-alpha's, the TGF-beta/TGF-beta receptor system, the immunity/colicin domain collection from bacteria, the chemokine/chemokine receptors, and the VEGF/VEGF receptor family. For correlation scores above a defined threshold, we were able to find an average of 79% of all known binding partners. We then applied this method to find plausible binding partners for proteins with uncharacterized binding specificities in the syntaxin/Unc-18 protein and TGF-beta/TGF-beta receptor families. Analysis of the results shows that co-evolutionary analysis of interacting protein families can reduce the search space for identifying binding partners by not only finding binding partners for uncharacterized proteins but also recognizing potentially new binding partners for previously characterized proteins. We believe that correlated evolutionary histories provide a route to exploit the wealth of whole genome sequences and recent systematic proteomic results to extend the impact of these studies and focus experimental efforts to categorize physiologically or pathologically relevant protein-protein interactions.

8.
MOTIVATION: Interacting pairs of proteins should co-evolve to maintain functional and structural complementarity. Consequently, such a pair of protein families shows similarity between their phylogenetic trees. Although the tendency of co-evolution has been known for various ligand-receptor pairs, it has not been studied systematically in the widest possible scope. We investigated the degree of co-evolution for more than 900 family pairs in a global protein structural interactome map (PSIMAP--a map of all the structural domain-domain interactions in the PDB). RESULTS: There was significant correlation in 45% of the total SCOP family-level pairs, rising to 78% in 454 reliable family interactions. As expected, the intra-molecular interactions between protein families showed stronger co-evolution than inter-molecular interactions. However, both types of interaction have a fundamentally similar pattern of co-evolution except for cases where different interfaces are involved. These results validate the use of co-evolution analysis with predictive methods such as PSIMAP to improve the accuracy of prediction based on "homologous interaction". The tendency of co-evolution enabled a nearly 5-fold enrichment in the identification of true interactions among the potential interologs in PSIMAP. The estimated sensitivity was 79.2%, and the specificity was 78.6%. AVAILABILITY: The results of co-evolution analysis are available online at http://www.biointeraction.org
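Sensitivity and specificity figures such as those quoted above follow from thresholding a co-evolution score and comparing the calls with known labels. The sketch below shows that arithmetic on made-up numbers; the scores, labels, threshold and function names are illustrative assumptions and are not taken from the PSIMAP study.

```python
def confusion_counts(scores, labels, threshold):
    """Count TP, FP, TN, FN when pairs scoring >= threshold are called interacting."""
    tp = fp = tn = fn = 0
    for score, is_interacting in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_interacting:
            tp += 1
        elif predicted and not is_interacting:
            fp += 1
        elif not predicted and is_interacting:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical correlation scores for five family pairs and their true status.
scores = [0.9, 0.8, 0.3, 0.6, 0.2]
labels = [True, True, True, False, False]
counts = confusion_counts(scores, labels, threshold=0.5)
sens, spec = sensitivity_specificity(*counts)
```

Raising the threshold generally trades sensitivity for specificity, which is why a single pair of figures is only meaningful together with the cutoff used.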

9.
10.
Experimental high-throughput studies of protein-protein interactions are beginning to provide enough data for comprehensive computational studies. Today, about ten large data sets, each with thousands of interacting pairs, coarsely sample the interactions in fly, human, worm, and yeast. Another about 55,000 pairs of interacting proteins have been identified by more careful, detailed biochemical experiments. Most interactions are experimentally observed in prokaryotes and simple eukaryotes; very few interactions are observed in higher eukaryotes such as mammals. It is commonly assumed that pathways in mammals can be inferred through homology to model organisms, e.g. the experimental observation that two yeast proteins interact is transferred to infer that the two corresponding proteins in human also interact. Two pairs for which the interaction is conserved are often described as interologs. The goal of this investigation was a large-scale comprehensive analysis of such inferences, i.e. of the evolutionary conservation of interologs. Here, we introduced a novel score for measuring the overlap between protein-protein interaction data sets. This measure appeared to reflect the overall quality of the data and was the basis for our two surprising results from our large-scale analysis. Firstly, homology-based inferences of physical protein-protein interactions appeared far less successful than expected. In fact, such inferences were accurate only for extremely high levels of sequence similarity. Secondly, and most surprisingly, the identification of interacting partners through sequence similarity was significantly more reliable for protein pairs within the same organism than for pairs between species. Our analysis underlined that the discrepancies between different datasets are large, even when using the same type of experiment on the same organism. This reality considerably constrains the power of homology-based transfer of interactions. 
In particular, the experimental probing of interactions in distant model organisms has to be undertaken with some caution. More comprehensive images of protein-protein networks will require the combination of many high-throughput methods, including in silico inferences and predictions. http://www.rostlab.org/results/2006/ppi_homology/

11.
MOTIVATION: Given that association and dissociation of protein molecules is crucial in most biological processes, several in silico methods have recently been developed to predict protein-protein interactions. Structural evidence has shown that interacting pairs of close homologs (interologs) usually interact physically in the same way. Moreover, conservation of an interaction depends on the conservation of the interface between interacting partners. In this article we make use of both structural similarities among domains of known interacting proteins found in the Database of Interacting Proteins (DIP) and conservation of pairs of sequence patches involved in protein-protein interfaces to predict putative protein interaction pairs. RESULTS: We have obtained a large number of putative protein-protein interactions (approximately 130,000). The list is independent of other techniques, both experimental and theoretical. We separated the list of predictions into three sets according to their relationship with known interacting proteins found in DIP. For each set, only a small fraction of the predicted protein pairs could be independently validated by cross-checking with the Human Protein Reference Database (HPRD). The fraction of validated protein pairs was always larger than that expected by using random protein pairs. Furthermore, a correlation map of interacting protein pairs was calculated with respect to molecular function, as defined in the Gene Ontology database. It shows good consistency of the predicted interactions with data in the HPRD database. The intersection between the lists of interactions of other methods and ours produces a network of potentially high-confidence interactions.

12.
13.
MOTIVATION: Large-scale experiments reveal pairs of interacting proteins but leave the residues involved in the interactions unknown. These interface residues are essential for understanding the mechanism of interaction and are often desired drug targets. Reliable identification of residues that reside in protein-protein interfaces typically requires analysis of protein structure. Therefore, for the vast majority of proteins, for which there is no high-resolution structure, there is no effective way of identifying interface residues. RESULTS: Here we present a machine learning-based method that identifies interacting residues from sequence alone. Although the method is developed using transient protein-protein interfaces from complexes of experimentally known 3D structures, it never explicitly uses 3D information. Instead, we combine predicted structural features with evolutionary information. The strongest predictions of the method reached over 90% accuracy in a cross-validation experiment. Our results suggest that despite the significant diversity in the nature of protein-protein interactions, they all share common basic principles and that these principles are identifiable from sequence alone.

14.
Predicting the interactions between all the possible pairs of proteins in a given organism (making a protein-protein interaction map) is a crucial subject in bioinformatics. Most of the previous methods based on supervised machine learning use datasets containing approximately the same number of interacting pairs of proteins (positives) and non-interacting pairs of proteins (negatives) for training a classifier and are estimated to yield a large number of false positives. Thinking that the negatives used in previous studies cannot adequately represent all the negatives that need to be taken into account, we have developed a method based on multiple Support Vector Machines (SVMs) that uses more negatives than positives for predicting interactions between pairs of yeast proteins and pairs of human proteins. We show that the performance of a single SVM improved as we increased the number of negatives used for training and that, if more than one CPU is available, an approach using multiple SVMs is useful not only for improving the performance of classifiers but also for reducing the time required for training them. Our approach can also be applied to assessing the reliability of high-throughput interactions.

15.
Protein disorder has been frequently associated with protein-protein interaction. However, our knowledge of how protein disorder evolves within a network is limited. It is expected that physically interacting proteins evolve in a coordinated manner. This has so far been shown in their evolutionary rate, and in their gene expression levels. Here we examine the percentage of predicted disordered residues within binary and complex interacting proteins (physical and functional interactions respectively) to investigate how the disorder of a protein relates to that of its interacting partners. We show that the levels of disorder of interacting proteins are correlated, with a greater correlation seen among proteins that are co-members of the same complex, and a lesser correlation between proteins that are documented as binary interactors of each other. There is a striking variation among complexes not only in their disorder, but in the extent to which the proteins within the complex differ in their levels of disorder, with RNA processes and protein binding complexes showing more variation in the disorder of their proteins, whilst other complexes show very little variation in the overall disorder of their constituent proteins. There is likely to be a stronger selection for complex subunits to have similar disorder than is seen for proteins involved in binary interactions. Thus, binary interactions may be more resilient to changes in disorder than are complex interactions. These results add a new dimension to the role of disorder in protein networks, and highlight the potential importance of maintaining similar disorder in the members of a complex.

16.

Background  

Elucidating protein-protein interactions (PPIs) is essential to constructing protein interaction networks and facilitating our understanding of the general principles of biological systems. Previous studies have revealed that interacting protein pairs can be predicted from their primary structure. Most of these approaches have achieved satisfactory performance on datasets comprising equal numbers of interacting and non-interacting protein pairs. However, this ratio is highly unbalanced in nature, and these techniques have not been comprehensively evaluated with respect to the effect of the large number of non-interacting pairs in realistic datasets. Moreover, since highly unbalanced distributions usually lead to large datasets, more efficient predictors are desired when handling such challenging tasks.

17.
Protein co-evolution, co-adaptation and interactions (cited 2 times: 0 self-citations, 2 citations by others)
Pazos F  Valencia A 《The EMBO journal》2008,27(20):2648-2655
Co-evolution has an important function in the evolution of species and it is clearly manifested in certain scenarios such as host–parasite and predator–prey interactions, symbiosis and mutualism. The extrapolation of the concepts and methodologies developed for the study of species co-evolution at the molecular level has prompted the development of a variety of computational methods able to predict protein interactions through the characteristics of co-evolution. Particularly successful have been those methods that predict interactions at the genomic level based on the detection of pairs of protein families with similar evolutionary histories (similarity of phylogenetic trees: mirrortree). Future advances in this field will require a better understanding of the molecular basis of the co-evolution of protein families. Thus, it will be important to decipher the molecular mechanisms underlying the similarity observed in phylogenetic trees of interacting proteins, distinguishing direct specific molecular interactions from other general functional constraints. In particular, it will be important to separate the effects of physical interactions within protein complexes (‘co-adaptation’) from other forces that, in a less specific way, can also create general patterns of co-evolution.

18.
Han DS  Kim HS  Jang WH  Lee SD  Suh JK 《Nucleic acids research》2004,32(21):6312-6320
With the accumulation of protein and related data on the Internet, many domain-based computational techniques to predict protein interactions have been developed. However, most techniques still have many limitations when used in practice. They usually suffer from low prediction accuracy and do not provide any method for ranking the interaction possibility of multiple protein pairs. In this paper, we propose a probabilistic framework to predict the interaction probability of proteins and develop an interaction possibility ranking method for multiple protein pairs. Using the ranking method, one can discern which of multiple protein pairs are more likely to interact with each other. The validity of the prediction model was evaluated using an interacting set of protein pairs in yeast and an artificially generated non-interacting set of protein pairs. When 80% of the set of interacting protein pairs in the DIP (Database of Interacting Proteins) was used as a learning set of interacting protein pairs, high sensitivity (77%) and specificity (95%) were achieved for the test groups containing common domains with the learning set of proteins within our framework. The stability of the prediction model was also evident when tested over DIP CORE, HMS-PCI and TAP data. In the validation of the ranking method, we reveal that some correlations exist between the interacting probability and the accuracy of the prediction.
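One common way to turn domain-level evidence into a protein-level interaction probability, and then into a ranking, is a noisy-OR combination: two proteins interact if at least one of their domain pairs does. The sketch below uses that rule purely as an assumption for illustration; the abstract does not state the exact form of the authors' probabilistic framework, and the names `interaction_probability`, `rank_pairs`, and the example probabilities are hypothetical.

```python
def interaction_probability(domains_a, domains_b, domain_pair_prob):
    """Noisy-OR estimate: 1 minus the probability that no domain pair interacts.

    domain_pair_prob maps a sorted (domain, domain) tuple to a probability;
    domain pairs that are absent from the map contribute probability 0.
    """
    p_none = 1.0
    for da in domains_a:
        for db in domains_b:
            key = tuple(sorted((da, db)))
            p_none *= 1.0 - domain_pair_prob.get(key, 0.0)
    return 1.0 - p_none


def rank_pairs(protein_domains, candidate_pairs, domain_pair_prob):
    """Return (probability, protein_a, protein_b) tuples, most likely first."""
    scored = [
        (interaction_probability(protein_domains[a], protein_domains[b],
                                 domain_pair_prob), a, b)
        for a, b in candidate_pairs
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Hypothetical domain-pair probabilities, e.g. estimated from a learning set.
domain_pair_prob = {("D1", "D2"): 0.5, ("D1", "D3"): 0.2}
protein_domains = {"P1": ["D1"], "P2": ["D2", "D3"], "P3": ["D3"]}
ranking = rank_pairs(
    protein_domains,
    [("P1", "P3"), ("P2", "P3"), ("P1", "P2")],
    domain_pair_prob,
)
```

Here P1 and P2 rank first because two of their domain pairs carry evidence, while P2 and P3 share no known interacting domain pair and score zero.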

19.
Proteins are the building blocks, effectors and signal mediators of cellular processes. A protein’s function, regulation and localization often depend on its interactions with other proteins. Here, we describe a protocol for the yeast protein-fragment complementation assay (PCA), a powerful method to detect direct and proximal associations between proteins in living cells. The interaction between two proteins, each fused to a dihydrofolate reductase (DHFR) protein fragment, translates into growth of yeast strains in the presence of the drug methotrexate (MTX). Differential fitness, resulting from different amounts of reconstituted DHFR enzyme, can be quantified on high-density colony arrays, allowing interacting bait-prey pairs to be differentiated from non-interacting ones. The high-throughput protocol presented here is performed using a robotic platform that parallelizes mating of bait and prey strains carrying complementary DHFR-fragment fusion proteins and the survival assay on MTX. This protocol allows systematic testing of thousands of protein-protein interactions (PPIs) involving bait proteins of interest and offers several advantages over other PPI detection assays, including the study of proteins expressed from their endogenous promoters without the need for modifying protein localization and for the assembly of complex reporter constructs.

20.

Background  

Information about protein interaction networks is fundamental to understanding protein function and cellular processes. Interaction patterns among proteins can suggest new drug targets and aid in the design of new therapeutic interventions. Efforts have been made to map interactions on a proteome-wide scale using both experimental and computational techniques. Reference datasets that contain known interacting proteins (positive cases) and non-interacting proteins (negative cases) are essential to support computational prediction and validation of protein-protein interactions. Information on known interacting and non-interacting proteins is usually stored within databases. Extraction of these data can be both complex and time-consuming. Although the automatic construction of reference datasets for classification is a useful resource for researchers, no public resource currently exists to perform this task.
