Similar articles
 Found 20 similar articles (search time: 62 ms)
1.
2.
Conserved network motifs allow protein-protein interaction prediction   Total citations: 5, self-citations: 0, citations by others: 5
MOTIVATION: High-throughput protein interaction detection methods are strongly affected by false positive and false negative results. Focused experiments are needed to complement the large-scale methods by validating previously detected interactions but it is often difficult to decide which proteins to probe as interaction partners. Developing reliable computational methods assisting this decision process is a pressing need in bioinformatics. RESULTS: We show that we can use the conserved properties of the protein network to identify and validate interaction candidates. We apply a number of machine learning algorithms to the protein connectivity information and achieve a surprisingly good overall performance in predicting interacting proteins. Using a 'leave-one-out' approach we find average success rates between 20 and 40% for predicting the correct interaction partner of a protein. We demonstrate that the success of these methods is based on the presence of conserved interaction motifs within the network. AVAILABILITY: A reference implementation and a table with candidate interacting partners for each yeast protein are available at http://www.protsuggest.org.

3.
4.
The characterization of protein interactions is essential for understanding biological systems. While genome-scale methods are available for identifying interacting proteins, they do not pinpoint the interacting motifs (e.g., a domain, sequence segments, a binding site, or a set of residues). Here, we develop and apply a method for delineating the interacting motifs of hub proteins (i.e., highly connected proteins). The method relies on the observation that proteins with common interaction partners tend to interact with these partners through a common interacting motif. The only inputs to the method are binary protein interactions; neither sequence nor structure information is needed. The approach is evaluated by comparing the inferred interacting motifs with domain families defined for 368 proteins in the Structural Classification of Proteins (SCOP). The positive predictive value of the method for detecting proteins with common SCOP families is 75% at a sensitivity of 10%. Most of the inferred interacting motifs were significantly associated with sequence patterns, which could be responsible for the common interactions. We find that yeast hubs with multiple interacting motifs are more likely to be essential than hubs with one or two interacting motifs, thus rationalizing the previously observed correlation between essentiality and the number of interacting partners of a protein. We also find that yeast hubs with multiple interacting motifs evolve more slowly than the average protein, in contrast to hubs with one or two interacting motifs. The proposed method will help us discover unknown interacting motifs and provide biological insights about protein hubs and their roles in interaction networks.

5.
Global viewing of protein–protein interactions (PPIs) is a useful way to assign biological roles to large numbers of proteins predicted by complete genome sequence. Here, we systematically analyzed PPIs in the nitrogen-fixing soil bacterium Mesorhizobium loti using a modified high-throughput yeast two-hybrid system. The primary aim of this study is to provide functional clues to M. loti proteins that are relevant to symbiotic nitrogen fixation and conserved in other rhizobium species, especially proteins with regulatory functions and unannotated proteins. By screening 1542 genes as bait, 3121 independent interactions involving 1804 proteins (24% of the total protein-coding genes) were identified, and each interaction was evaluated using an interaction generality (IG) measure and the general features of the interacting partners. Most PPIs detected in this study are novel interactions revealing potential functional relationships between genes for symbiotic nitrogen fixation and signal transduction. Furthermore, we have predicted the putative functions of unannotated proteins through their interactions with known proteins. The results described here represent new insight into the protein network of M. loti and provide useful experimental clues to elucidate the biological function of rhizobial genes that cannot be assigned directly from their genomic sequence.

6.
Unraveling the "code" of genome structure is an important goal of genomics research. Colocalization of genes in eukaryotic genomes may facilitate preservation of favorable allele combinations between epistatic loci or coregulation of functionally related genes. However, the presence of interacting gene clusters in the human genome has remained unclear. We systematically searched the human genome for evidence of closely linked genes whose protein products interact. We find 83 pairs of interacting genes that are located within 1 Mbp in the human genome or 37 if we exclude hub proteins. This number of interacting gene clusters is significantly more than expected by chance and is not the result of tandem duplications. Furthermore, we find that these clusters are significantly more conserved across vertebrate (but not chordate) genomes than other pairs of genes located within 1 Mbp in the human genome. In many cases, the genes are both present but not clustered in older vertebrate lineages. These results suggest gene cluster creation along the human lineage. These clusters are not enriched for housekeeping genes, but we find a significant contribution from genes involved in "response to stimulus." Many of these genes are involved in the immune response, including, but not limited to, known clusters such as the major histocompatibility complex. That these clusters were formed contemporaneously with the origin of adaptive immunity within the vertebrate lineage suggests that novel evolutionary and regulatory constraints were associated with the operation of the immune system.
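The core search in the abstract above (interacting gene pairs located within 1 Mbp of each other) can be sketched as follows. This is a minimal illustration only: the gene names, coordinates, and the function `clustered_interacting_pairs` are hypothetical, and the abstract does not say how gene positions were measured or how significance was assessed.

```python
def clustered_interacting_pairs(gene_positions, interactions, max_distance=1_000_000):
    """Return interacting gene pairs on the same chromosome within max_distance bp.

    gene_positions: dict mapping gene -> (chromosome, position in bp)
    interactions:   iterable of (gene_a, gene_b) pairs whose products interact
    """
    pairs = set()
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        chrom_a, pos_a = gene_positions[a]
        chrom_b, pos_b = gene_positions[b]
        if chrom_a == chrom_b and abs(pos_a - pos_b) <= max_distance:
            pairs.add(tuple(sorted((a, b))))  # store each pair once, order-independent
    return sorted(pairs)


# Hypothetical toy data, not taken from the study.
gene_positions = {
    "G1": ("chr6", 31_000_000),
    "G2": ("chr6", 31_400_000),
    "G3": ("chr6", 45_000_000),
    "G4": ("chr1", 31_200_000),
}
interactions = [("G1", "G2"), ("G1", "G3"), ("G2", "G4"), ("G2", "G1")]
close_pairs = clustered_interacting_pairs(gene_positions, interactions)
print(close_pairs)
```

Testing whether the count is "more than expected by chance" would additionally need a null model (for example, shuffled interaction partners), which is not sketched here.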

7.
Several species of yeast, including the baker's yeast Saccharomyces cerevisiae, underwent a genome duplication roughly 100 million years ago. We analyze genetic networks whose members were involved in this duplication. Many networks show detectable redundancy and strong asymmetry in their interactions. For networks of co-expressed genes, we find evidence for network partitioning whereby the paralogs appear to have formed two relatively independent subnetworks from the ancestral network. We simulate the degeneration of networks after duplication and find that a model wherein the rate of interaction loss depends on the “neighborliness” of the interacting genes produces networks with parameters similar to those seen in the real partitioned networks. We propose that the rationalization of network structure through the loss of pair-wise gene interactions after genome duplication provides a mechanism for the creation of semi-independent daughter networks through the division of ancestral functions between these daughter networks.

8.
Biological regulatory systems require the specific organization of proteins into multicomponent complexes. Two-hybrid systems have been used to identify novel components of signaling networks based on interactions with defined partner proteins. An important issue in the use of two-hybrid systems has been the degree to which interacting proteins distinguish their biological partner from evolutionarily conserved related proteins and the degree to which observed interactions are specific. We adapted the basic two-hybrid strategy to create a novel dual bait system designed to allow single-step screening of libraries for proteins that interact with protein 1 of interest, fused to DNA binding domain A (LexA), but do not interact with protein 2, fused to DNA binding domain B (lambda cI). Using the selective interactions of Ras and Krev-1 (Rap1A) with Raf, RalGDS, and Krit1 as a model, we systematically compared LexA- and cI-fused baits and reporters. The LexA and cI bait/reporter systems are well matched for level of bait expression and sensitivity range for interaction detection and allow effective isolation of specifically interacting protein pairs against a nonspecific background. These reagents should prove useful to refine the selectivity of library screens, to reduce the isolation of false positives in such screens, and to perform directed analyses of sequence elements governing the interaction of a single protein with multiple partners.

9.
10.
11.
JH Oh, HP Wong, X Wang, JO Deasy. PLoS ONE, 2012, 7(6): e38870
The number of biomarker candidates is often much larger than the number of clinical patient data points available, which motivates the use of a rational candidate variable filtering methodology. The goal of this paper is to apply such a bioinformatics filtering process to isolate a modest number (<10) of key interacting genes and their associated single nucleotide polymorphisms involved in radiation response, and to ultimately serve as a basis for using clinical datasets to identify new biomarkers. In step 1, we surveyed the literature on genetic and protein correlates to radiation response, in vivo or in vitro, across cellular, animal, and human studies. In step 2, we analyzed two publicly available microarray datasets and identified genes in which mRNA expression changed in response to radiation. Combining results from steps 1 and 2, we identified 20 genes that were common to all three sources. As a final step, a curated database of protein interactions was used to generate the most statistically reliable protein interaction network among any subset of the 20 genes resulting from steps 1 and 2, resulting in identification of a small, tightly interacting network with 7 out of 20 input genes. We further ranked the genes in terms of likely importance, based on their location within the network using a graph-based scoring function. The resulting core interacting network provides an attractive set of genes likely to be important to radiation response.

12.
Based on the hypothesis that the neighbors of disease genes tend to cause similar diseases, network-based methods for disease prediction have received increasing attention. Because they take full advantage of network structure, global distance measurements generally outperform local distance measurements. However, global distance measurements have some problems. For example, they may mistake non-disease hub proteins that have dense interactions with known disease proteins for potential disease proteins. To find a new method that avoids this problem, we analyzed the differences between disease proteins and other proteins by using essential proteins (proteins encoded by essential genes) as references. We found that disease proteins are not well connected with essential proteins in protein interaction networks. Based on this new finding, we proposed a novel strategy for gene prioritization based on protein interaction networks. We allocated positive flow to disease genes and negative flow to essential genes, and adopted network propagation for gene prioritization. Experimental results on 110 diseases verified the effectiveness and potential of the proposed method.
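The idea of assigning positive flow to disease genes and negative flow to essential genes can be illustrated with a small propagation sketch. The abstract does not give the update rule, so the one below (a restart-style iteration in which each node splits its score evenly among its neighbors) is an assumption, as are the function name `propagate`, the parameter `alpha`, and the toy network.

```python
def propagate(adjacency, disease_seeds, essential_seeds, alpha=0.5, iterations=50):
    """Iteratively spread seed scores over an undirected network.

    adjacency:       dict mapping node -> list of neighbors (symmetric)
    disease_seeds:   nodes given initial flow +1 (assumed weighting)
    essential_seeds: nodes given initial flow -1 (assumed weighting)
    alpha:           fraction of a score taken from neighbors at each step
    """
    nodes = sorted(adjacency)
    initial = {n: 0.0 for n in nodes}
    for n in disease_seeds:
        initial[n] = 1.0
    for n in essential_seeds:
        initial[n] = -1.0

    scores = dict(initial)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbor m passes on an equal share of its current score.
            incoming = sum(scores[m] / len(adjacency[m]) for m in adjacency[n])
            updated[n] = alpha * incoming + (1 - alpha) * initial[n]
        scores = updated
    return scores


# Hypothetical path network: D (disease) - X - Y - E (essential).
adjacency = {"D": ["X"], "X": ["D", "Y"], "Y": ["X", "E"], "E": ["Y"]}
scores = propagate(adjacency, disease_seeds=["D"], essential_seeds=["E"])
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy network the candidate next to the disease seed (X) ends up with a positive score and the candidate next to the essential seed (Y) with a negative one, which is the intended effect of the negative flow.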

13.
We have developed an integrative analysis method combining genetic interactions, identified using type 1 diabetes genome scan data, and a high-confidence human protein interaction network. Resulting networks were ranked by the significance of the enrichment of proteins from interacting regions. We identified a number of new protein network modules and novel candidate genes/proteins for type 1 diabetes. We propose this type of integrative analysis as a general method for the elucidation of genes and networks involved in diabetes and other complex diseases.

14.
Experiments to probe for protein-protein interactions are the focus of functional proteomic studies, thus proteomic data repositories are increasingly likely to contain a large cross-section of such information. Here, we use the Global Proteome Machine database (GPMDB), which is the largest curated and publicly available proteomic data repository derived from tandem mass spectrometry, to develop an in silico protein interaction analysis tool. Using a human histone protein for method development, we positively identified an interaction partner from each histone protein family that forms the histone octameric complex. Moreover, this method, applied to the α subunits of the human proteasome, identified all of the subunits in the 20S core particle. Furthermore, we applied this approach to human integrin αIIb and integrin β3, a major receptor involved in the activation of platelets. We identified 28 proteins, including a protein network for integrin and platelet activation. In addition, proteins interacting with integrin β1 obtained using this method were validated by comparing them to those identified in a formaldehyde-supported coimmunoprecipitation experiment, protein-protein interaction databases and the literature. Our results demonstrate that in silico protein interaction analysis is a novel tool for identifying known/candidate protein-protein interactions and proteins with shared functions in a protein network.

15.
We characterized and evaluated the functional attributes of three yeast high-confidence protein-protein interaction data sets derived from affinity purification/mass spectrometry, protein-fragment complementation assay, and yeast two-hybrid experiments. The interacting proteins retrieved from these data sets formed distinct, partially overlapping sets with different protein-protein interaction characteristics. These differences were primarily a function of the deployed experimental technologies used to recover these interactions. This affected the total coverage of interactions and was especially evident in the recovery of interactions among different functional classes of proteins. We found that the interaction data obtained by the yeast two-hybrid method was the least biased toward any particular functional characterization. In contrast, interacting proteins in the affinity purification/mass spectrometry and protein-fragment complementation assay data sets were over- and under-represented among distinct and different functional categories. We delineated how these differences affected protein complex organization in the network of interactions, in particular for strongly interacting complexes (e.g. RNA and protein synthesis) versus weak and transient interacting complexes (e.g. protein transport). We quantified methodological differences in detecting protein interactions from larger protein complexes, in the correlation of protein abundance among interacting proteins, and in their connectivity of essential proteins. In the latter case, we showed that minimizing inherent methodology biases removed many of the ambiguous conclusions about protein essentiality and protein connectivity. We used these findings to rationalize how biological insights obtained by analyzing data sets originating from different sources sometimes do not agree or may even contradict each other. 
An important corollary of this work was that discrepancies in biological insights did not necessarily imply that one detection methodology was better or worse, but rather that, to a large extent, the insights reflected the methodological biases themselves. Consequently, interpreting the protein interaction data within their experimental or cellular context provided the best avenue for overcoming biases and inferring biological knowledge.

16.
Genes carry out their biological functions through pathways in complex networks consisting of many interacting molecules. Studies on the effect of network architecture on the evolution of individual proteins will provide valuable information for understanding the origin and evolution as well as the functional conservation of signaling pathways. However, the relationship between network architecture and the sequence evolution of individual proteins is still poorly understood. In the current study, we carried out a network-level molecular evolution analysis of the TLR (Toll-like receptor) signaling pathway, which plays an important role in innate immunity in insects and mammals, and we found that: 1) the selective constraint on a gene was negatively correlated with its position along the TLR signaling pathway; 2) all genes in the TLR signaling pathway were highly conserved and underwent strong purifying selection; 3) the distribution of selective pressure along the pathway was driven by differential nonsynonymous substitution levels; 4) the TLR signaling pathway may have been present in a common ancestor of sponges and eumetazoa and may have evolved through duplication of the TLR, IKK, IκB and NF-κB genes as well as expansion of the adaptor molecules, while the gene structure and conserved motifs of the NF-κB genes shifted during their evolutionary history. Our results will improve our understanding of the evolutionary history of the animal TLR signaling pathway as well as the relationship between network architecture and the sequence evolution of individual proteins.

17.
In this paper, the structure and evolution of the protein interaction network of the yeast Saccharomyces cerevisiae is analyzed. The network is viewed as a graph whose nodes correspond to proteins. Two proteins are connected by an edge if they interact. The network resembles a random graph in that it consists of many small subnets (groups of proteins that interact with each other but do not interact with any other protein) and one large connected subnet comprising more than half of all interacting proteins. The number of interactions per protein appears to follow a power law distribution. Within approximately 200 Myr after a duplication, the products of duplicate genes become almost equally likely to (1) have common protein interaction partners and (2) be part of the same subnetwork as two proteins chosen at random from within the network. This indicates that the persistence of redundant interaction partners is the exception rather than the rule. After gene duplication, the likelihood that an interaction gets lost exceeds 2.2 × 10⁻³ per Myr. New interactions are estimated to evolve at a rate that is approximately three orders of magnitude smaller. Every 300 Myr, as many as half of all interactions may be replaced by new interactions.
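The graph view described above (proteins as nodes, interactions as edges, connected subnets, interactions per protein) can be sketched with the standard library alone. The function names and the toy interaction list are hypothetical and only illustrate the bookkeeping, not the paper's actual analysis.

```python
from collections import Counter, defaultdict, deque


def build_graph(interactions):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_subnets(graph):
    """Return the connected subnets (as sets of proteins), largest first."""
    seen = set()
    subnets = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:  # breadth-first search from the start protein
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        subnets.append(component)
    return sorted(subnets, key=len, reverse=True)


def degree_distribution(graph):
    """Map each degree k to the number of proteins with k interaction partners."""
    return Counter(len(neighbors) for neighbors in graph.values())


# Hypothetical toy network: one larger subnet and one isolated pair.
interactions = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("E", "F")]
graph = build_graph(interactions)
subnets = connected_subnets(graph)
degrees = degree_distribution(graph)
print([sorted(s) for s in subnets], dict(degrees))
```

On real data, plotting the degree counts on log-log axes is the usual first check for the power-law-like behavior the abstract mentions.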

18.
Information Flow Analysis of Interactome Networks   Total citations: 1, self-citations: 0, citations by others: 1
Recent studies of cellular networks have revealed modular organizations of genes and proteins. For example, in interactome networks, a module refers to a group of interacting proteins that form molecular complexes and/or biochemical pathways and together mediate a biological process. However, it is still poorly understood how biological information is transmitted between different modules. We have developed information flow analysis, a new computational approach that identifies proteins central to the transmission of biological information throughout the network. In the information flow analysis, we represent an interactome network as an electrical circuit, where interactions are modeled as resistors and proteins as interconnecting junctions. Construing the propagation of biological signals as flow of electrical current, our method calculates an information flow score for every protein. Unlike previous metrics of network centrality such as degree or betweenness that only consider topological features, our approach incorporates confidence scores of protein–protein interactions and automatically considers all possible paths in a network when evaluating the importance of each protein. We apply our method to the interactome networks of Saccharomyces cerevisiae and Caenorhabditis elegans. We find that the likelihood of observing lethality and pleiotropy when a protein is eliminated is positively correlated with the protein's information flow score. Even among proteins of low degree or low betweenness, high information scores serve as a strong predictor of loss-of-function lethality or pleiotropy. The correlation between information flow scores and phenotypes supports our hypothesis that the proteins of high information flow reside in central positions in interactome networks. We also show that the ranks of information flow scores are more consistent than that of betweenness when a large amount of noisy data is added to an interactome. 
Finally, we combine gene expression data with interaction data in C. elegans and construct an interactome network for muscle-specific genes. We find that genes that rank high in terms of information flow in the muscle interactome network but not in the entire network tend to play important roles in muscle function. This framework for studying tissue-specific networks by the information flow model can be applied to other tissues and other organisms as well.
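The circuit analogy in this abstract (interactions as resistors, proteins as junctions) can be made concrete with a small sketch. Everything specific here is an assumption rather than the authors' published procedure: confidence scores are used directly as conductances, a unit current is injected for every pair of proteins, and a protein's score is the total current passing through it. The network must be connected, otherwise the linear system is singular.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def information_flow(edges):
    """edges: dict mapping (protein_a, protein_b) -> confidence (conductance)."""
    nodes = sorted({p for pair in edges for p in pair})
    scores = {p: 0.0 for p in nodes}
    for source, sink in combinations(nodes, 2):
        # Ground the sink (potential 0) and solve Kirchhoff's equations
        # for the remaining potentials with unit current entering at source.
        keep = [p for p in nodes if p != sink]
        pos = {p: i for i, p in enumerate(keep)}
        lap = [[0.0] * len(keep) for _ in keep]
        for (a, b), g in edges.items():
            for u, v in ((a, b), (b, a)):
                if u != sink:
                    lap[pos[u]][pos[u]] += g
                    if v != sink:
                        lap[pos[u]][pos[v]] -= g
        rhs = [1.0 if p == source else 0.0 for p in keep]
        volt = dict(zip(keep, solve_linear(lap, rhs)))
        volt[sink] = 0.0
        through = {p: 0.0 for p in nodes}
        for (a, b), g in edges.items():
            current = abs(g * (volt[a] - volt[b]))  # Ohm's law on each edge
            through[a] += current
            through[b] += current
        for p in nodes:
            # Current in plus current out is twice the throughput; the source
            # and sink also carry the externally injected unit current.
            external = 1.0 if p in (source, sink) else 0.0
            scores[p] += 0.5 * (through[p] + external)
    return scores


# Hypothetical three-protein chain A - B - C with equal confidence.
flow = information_flow({("A", "B"): 1.0, ("B", "C"): 1.0})
print(flow)
```

In the chain, the middle protein B receives the highest score because all current between A and C must pass through it, even though every protein has low degree.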

19.
Systems biology approaches can reveal intermediary levels of organization between genotype and phenotype that often underlie biological phenomena such as polygenic effects and protein dispensability. An important conceptualization is the module, which is loosely defined as a cohort of proteins that perform a dedicated cellular task. Based on a computational analysis of limited interaction datasets in the budding yeast Saccharomyces cerevisiae, it has been suggested that the global protein interaction network is segregated such that highly connected proteins, called hubs, tend not to link to each other. Moreover, it has been suggested that hubs fall into two distinct classes: "party" hubs are co-expressed and co-localized with their partners, whereas "date" hubs interact with incoherently expressed and diversely localized partners, and thereby cohere disparate parts of the global network. This structure may be compared with altocumulus clouds, i.e., cotton ball-like structures sparsely connected by thin wisps. However, this organization might reflect a small and/or biased sample set of interactions. In a multi-validated high-confidence (HC) interaction network, assembled from all extant S. cerevisiae interaction data, including recently available proteome-wide interaction data and a large set of reliable literature-derived interactions, we find that hub-hub interactions are not suppressed. In fact, the number of interactions a hub has with other hubs is a good predictor of whether a hub protein is essential or not. We find that date hubs are neither required for network tolerance to node deletion, nor do date hubs have distinct biological attributes compared to other hubs. Date and party hubs do not, for example, evolve at different rates. 
Our analysis suggests that the organization of the global protein interaction network is highly interconnected and hence interdependent, more like the continuous dense aggregations of stratus clouds than the segregated configuration of altocumulus clouds. If the network is configured in a stratus format, cross-talk between proteins is potentially a major source of noise. In turn, control of the activity of the most highly connected proteins may be vital. Indeed, we find that fluctuation in the steady-state levels of the most connected proteins is minimized.

20.
MOTIVATION: Given that association and dissociation of protein molecules is crucial in most biological processes, several in silico methods have been recently developed to predict protein-protein interactions. Structural evidence has shown that interacting pairs of close homologs (interologs) usually interact physically in the same way. Moreover, conservation of an interaction depends on the conservation of the interface between interacting partners. In this article we make use of both structural similarities among domains of known interacting proteins found in the Database of Interacting Proteins (DIP) and conservation of pairs of sequence patches involved in protein-protein interfaces to predict putative protein interaction pairs. RESULTS: We have obtained a large number of putative protein-protein interactions (approximately 130,000). The list is independent of other techniques, both experimental and theoretical. We separated the list of predictions into three sets according to their relationship with known interacting proteins found in DIP. For each set, only a small fraction of the predicted protein pairs could be independently validated by cross-checking with the Human Protein Reference Database (HPRD). The fraction of validated protein pairs was always larger than that expected by using random protein pairs. Furthermore, a correlation map of interacting protein pairs was calculated with respect to molecular function, as defined in the Gene Ontology database. It shows good consistency of the predicted interactions with data in the HPRD database. The intersection between the lists of interactions of other methods and ours produces a network of potentially high-confidence interactions.
