Similar literature
 20 similar documents found (search time: 15 ms)
1.
Many biological processes are mediated by protein-protein interactions (PPIs). Because protein domains are the building blocks of proteins, PPIs likely rely on domain-domain interactions (DDIs). Several attempts exist to infer DDIs from PPI networks, but the produced datasets are heterogeneous and sometimes not accessible, while the PPI interactome data keeps growing. We describe a new computational approach called “PPIDM” (Protein-Protein Interactions Domain Miner) for inferring DDIs using multiple sources of PPIs. The approach is an extension of our previously described “CODAC” (Computational Discovery of Direct Associations using Common neighbors) method for inferring new edges in a tripartite graph. The PPIDM method has been applied to seven widely used PPI resources, using as “Gold-Standard” a set of DDIs extracted from 3D structural databases. Overall, PPIDM has produced a dataset of 84,552 non-redundant DDIs. Statistical significance (p-value) is calculated for each source of PPI and used to classify the PPIDM DDIs into Gold (9,175 DDIs), Silver (24,934 DDIs) and Bronze (50,443 DDIs) categories. Dataset comparison reveals that PPIDM has inferred from the 2017 releases of PPI sources about 46% of the DDIs present in the 2020 release of the 3did database, not counting the DDIs present in the Gold-Standard. The PPIDM dataset contains 10,229 DDIs that are consistent with more than 13,300 PPIs extracted from the IMEx database, and nearly 23,300 DDIs (27.5%) that are consistent with more than 214,000 human PPIs extracted from the STRING database. Examples of newly inferred DDIs covering more than 10 PPIs in the IMEx database are provided. Further exploitation of the PPIDM DDI reservoir includes the inventory of possible partners of a protein of interest and characterization of protein interactions at the domain level in combination with other methods. The result is publicly available at http://ppidm.loria.fr/.

2.
3.
MOTIVATION: Identifying protein-protein interactions is critical for understanding cellular processes. Because protein domains represent binding modules and are responsible for the interactions between proteins, computational approaches have been proposed to predict protein interactions at the domain level. The fact that protein domains are likely evolutionarily conserved allows us to pool information from data across multiple organisms for the inference of domain-domain and protein-protein interaction probabilities. RESULTS: We use a likelihood approach to estimate domain-domain interaction probabilities by integrating large-scale protein interaction data from three organisms, Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster. The estimated domain-domain interaction probabilities are then used to predict protein-protein interactions in S. cerevisiae. Based on a thorough comparison of sensitivity and specificity, Gene Ontology term enrichment and gene expression profiles, we have demonstrated that it may be far more informative to predict protein-protein interactions from diverse organisms than from a single organism. AVAILABILITY: The program for computing the protein-protein interaction probabilities and supplementary material are available at http://bioinformatics.med.yale.edu/interaction.

4.
5.
6.
Recent advances in functional genomics have helped generate large-scale high-throughput protein interaction data. Such networks, though extremely valuable towards molecular level understanding of cells, do not provide any direct information about the regions (domains) in the proteins that mediate the interaction. Here, we performed co-evolutionary analysis of domains in interacting proteins in order to understand the degree of co-evolution of interacting and non-interacting domains. Using a combination of sequence and structural analysis, we analyzed protein-protein interactions in F1-ATPase, Sec23p/Sec24p, DNA-directed RNA polymerase and nuclear pore complexes, and found that the interacting domain pair(s) for a given interaction exhibit a higher level of co-evolution than the non-interacting domain pairs. Motivated by this finding, we developed a computational method to test the generality of the observed trend, and to predict large-scale domain-domain interactions. Given a protein-protein interaction, the proposed method predicts the domain pair(s) most likely to mediate the protein interaction. We applied this method to the yeast interactome to predict domain-domain interactions, and used known domain-domain interactions found in PDB crystal structures to validate our predictions. Our results show that the prediction accuracy of the proposed method is statistically significant. Comparison of our prediction results with those from two other methods reveals that only a fraction of predictions are shared by all three methods, indicating that the proposed method can detect known interactions missed by other methods. We believe that the proposed method can be used with other methods to help identify previously unrecognized domain-domain interactions on a genome scale, and could potentially help reduce the search space for identifying interaction sites.

7.
The continuing growth in high-throughput data acquisition has led to a proliferation of network models to represent and analyse biological systems. These networks involve distinct interaction types detected by a combination of methods, ranging from directly observed physical interactions based in biochemistry to interactions inferred from phenotype measurements, genomic expression and comparative genomics. The discovery of interactions increasingly requires a blend of experimental and computational methods. Considering yeast as a model system, recent analytical methods are reviewed here and specific aims are proposed to improve network interaction inference and facilitate predictive biological modelling.

8.
A network of protein-protein interactions in yeast   Cited by: 29 in total (self-citations: 0; citations by others: 29)
A global analysis of 2,709 published interactions between proteins of the yeast Saccharomyces cerevisiae has been performed, enabling the establishment of a single large network of 2,358 interactions among 1,548 proteins. Proteins of known function and cellular location tend to cluster together, with 63% of the interactions occurring between proteins with a common functional assignment and 76% occurring between proteins found in the same subcellular compartment. Possible functions can be assigned to a protein based on the known functions of its interacting partners. This approach correctly predicts a functional category for 72% of the 1,393 characterized proteins with at least one partner of known function, and has been applied to predict functions for 364 previously uncharacterized proteins.
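The idea of assigning a function from the known functions of interacting partners can be illustrated with a minimal sketch. It assumes a simple majority vote among annotated partners, which may differ from the scoring actually used in the study; the function name `predict_function`, the protein identifiers and the categories are all hypothetical.

```python
from collections import Counter


def predict_function(protein, interactions, known_functions):
    """Return the most common functional category among the annotated
    partners of `protein`, or None if no partner is annotated.

    Majority voting is an assumption made for illustration only.
    """
    partners = set()
    for a, b in interactions:
        if a == protein and b != protein:
            partners.add(b)
        elif b == protein and a != protein:
            partners.add(a)

    votes = Counter(known_functions[p] for p in partners if p in known_functions)
    if not votes:
        return None

    # Break ties alphabetically so the result is deterministic.
    top = max(votes.values())
    return min(category for category, count in votes.items() if count == top)


# Toy, made-up network: P1 has three annotated partners, P5 has none.
interactions = [("P1", "P2"), ("P1", "P3"), ("P4", "P1"), ("P5", "P6")]
known_functions = {"P2": "transcription", "P3": "transcription", "P4": "transport"}

print(predict_function("P1", interactions, known_functions))
```

A protein whose partners are all unannotated, such as `P5` here, gets no prediction rather than a guess.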

9.
Protein-protein interactions (PPI) control most of the biological processes in a living cell. In order to fully understand protein functions, knowledge of protein-protein interactions is necessary. Prediction of PPI is challenging, especially when the three-dimensional structure of the interacting partners is not known. Recently, a novel prediction method was proposed by exploiting physical interactions of constituent domains. We propose here a novel knowledge-based prediction method, namely PPI_SVM, which predicts interactions between two protein sequences by exploiting their domain information. We trained a two-class support vector machine on the benchmarking set of pairs of interacting proteins extracted from the Database of Interacting Proteins (DIP). The method considers all possible combinations of constituent domains between two protein sequences, unlike most of the existing approaches. Moreover, it deals with both single-domain proteins and multi-domain proteins; therefore it can be applied to the whole proteome in high-throughput studies. Our machine learning classifier, following a brainstorming approach, achieves accuracy of 86%, with specificity of 95%, and sensitivity of 75%, which are better results than most previous methods that sacrifice recall values in order to boost the overall precision. Our method has on average better sensitivity combined with good selectivity on the benchmarking dataset. The PPI_SVM source code, train/test datasets and supplementary files are available freely in the public domain at: .

10.

Background  

Although protein-protein interaction networks determined with high-throughput methods are incomplete, they are commonly used to infer the topology of the complete interactome. These partial networks often show a scale-free behavior with only a few proteins having many and the majority having only a few connections. Recently, the possibility was suggested that this scale-free nature may not actually reflect the topology of the complete interactome but could also be due to the error proneness and incompleteness of large-scale experiments.

11.
Braun P  Gingras AC 《Proteomics》2012,12(10):1478-1498
Today, it is widely appreciated that protein-protein interactions play a fundamental role in biological processes. This was not always the case. The study of protein interactions started slowly and evolved considerably, together with conceptual and technological progress in different areas of research through the late 19th and the 20th centuries. In this review, we present some of the key experiments that have introduced major conceptual advances in biochemistry and molecular biology, and review technological breakthroughs that have paved the way for today's systems-wide approaches to protein-protein interaction analysis.

12.
Using indirect protein-protein interactions for protein complex prediction   Cited by: 1 in total (self-citations: 0; citations by others: 1)
Protein complexes are fundamental for understanding principles of cellular organization. As the sizes of protein-protein interaction (PPI) networks are increasing, accurate and fast protein complex prediction from these PPI networks can serve as a guide for biological experiments to discover novel protein complexes. However, it is not easy to predict protein complexes from PPI networks, especially in situations where the PPI network is noisy and still incomplete. Here, we study the use of indirect interactions between level-2 neighbors (level-2 interactions) for protein complex prediction. We know from previous work that proteins which do not interact but share interaction partners (level-2 neighbors) often share biological functions. We have proposed a method in which all direct and indirect interactions are first weighted using topological weight (FS-Weight), which estimates the strength of functional association. Interactions with low weight are removed from the network, while level-2 interactions with high weight are introduced into the interaction network. Existing clustering algorithms can then be applied to this modified network. We have also proposed a novel algorithm that searches for cliques in the modified network, and merges cliques to form clusters using a "partial clique merging" method. Experiments show that (1) the use of indirect interactions and topological weight to augment protein-protein interactions can be used to improve the precision of clusters predicted by various existing clustering algorithms; and (2) our complex-finding algorithm performs very well on interaction networks modified in this way. Since no other information except the original PPI network is used, our approach would be very useful for protein complex prediction, especially for prediction of novel protein complexes.
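The notion of level-2 neighbors (proteins that do not interact directly but share interaction partners) can be sketched as follows. Note that the score used here is a plain Jaccard overlap of partner sets, swapped in purely for illustration; it is not the FS-Weight measure named in the abstract, whose formula is not given there. The helper names and the toy network are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of protein pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def level2_pairs(adj):
    """Score every pair that shares a partner but has no direct edge.

    The score is the Jaccard overlap of the two partner sets, a stand-in
    for a topological weight (NOT FS-Weight).
    """
    scores = {}
    for hub in list(adj):
        for u, v in combinations(sorted(adj[hub]), 2):
            if v in adj[u]:
                continue  # directly interacting, so not a level-2 pair
            shared = adj[u] & adj[v]
            union = adj[u] | adj[v]
            scores[(u, v)] = len(shared) / len(union)
    return scores


# Toy, made-up network.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
adj = build_adjacency(edges)
scores = level2_pairs(adj)

# Keep only high-scoring indirect pairs as candidate edges to add
# (the 0.6 threshold is arbitrary).
candidates = sorted(pair for pair, s in scores.items() if s >= 0.6)
print(candidates)
```

In a pipeline like the one described, such candidate edges would be added to the network before clustering, while weakly supported direct edges would be dropped by the same kind of threshold.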

13.

Background:

Deciphering physical protein-protein interactions is fundamental to elucidating both the functions of proteins and biological processes. The development of high-throughput experimental technologies such as the yeast two-hybrid screening has produced an explosion in data relating to interactions. Since manual curation is intensive in terms of time and cost, there is an urgent need for text-mining tools to facilitate the extraction of such information. The BioCreative (Critical Assessment of Information Extraction systems in Biology) challenge evaluation provided common standards and shared evaluation criteria to enable comparisons among different approaches.

Results:

During the benchmark evaluation of BioCreative 2006, all of our results ranked in the top three places. In the task of filtering articles irrelevant to physical protein interactions, our method achieves a precision of 75.07%, a recall of 81.07%, and an AUC (area under the receiver operating characteristic curve) of 0.847. In the task of identifying protein mentions and normalizing mentions to molecule identifiers, our method is competitive among runs submitted, with a precision of 34.83%, a recall of 24.10%, and an F1 score of 28.5%. In extracting protein interaction pairs, our profile-based method was competitive on the SwissProt-only subset (precision = 36.95%, recall = 32.68%, and F1 score = 30.40%) and on the entire dataset (30.96%, 29.35%, and 26.20%, respectively). From the biologist's point of view, however, these findings are far from satisfactory. The error analysis presented in this report provides insight into how performance could be improved: three-quarters of false negatives were due to protein normalization problems (532/698), and about one-quarter were due to problems with correctly extracting interactions for this system.

Conclusion:

We present a text-mining framework to extract physical protein-protein interactions from the literature. Three key issues are addressed, namely filtering irrelevant articles, identifying protein names and normalizing them to molecule identifiers, and extracting protein-protein interactions. Our system is among the top three performers in the benchmark evaluation of BioCreative 2006. The tool will be helpful for manual interaction curation and can greatly facilitate the process of extracting protein-protein interactions.
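As a worked check of the figures reported for the protein normalization task, the F1 score can be recomputed as the harmonic mean of precision and recall. This is a minimal sketch; `f1_score` is a hypothetical helper, and the assumption is that the reported F1 for this task was computed on the aggregate precision and recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Protein mention normalization: precision 34.83%, recall 24.10%.
normalization_f1 = f1_score(34.83, 24.10)
print(round(normalization_f1, 1))  # 28.5

# Share of false negatives attributed to protein normalization problems.
fn_share = 532 / 698
print(round(fn_share, 2))  # 0.76
```

The same formula does not reproduce the pair-extraction F1 scores, which lie below both the reported precision and recall; presumably those were averaged in a different way, although the abstract does not say.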

14.
Removal of lipid from detergent-solubilized succinate cytochrome c reductase by a mild method leads to a series of changes in the optical and EPR spectra of the b cytochromes. This culminates in a state that resembles purified b cytochromes from the same source and bisimidazole ferriheme model complexes. Reconstitution of the lipid-depleted complex with phospholipid restores the native spectra in a significant fraction of the complexes in the early stages of lipid depletion. Once the final state has been reached, however, reconstitution has so far been incapable of restoring the native spectra. The changes described in this communication can be related to a model for integral membrane cytochromes.

15.

Background  

With the accumulation of omics data, a key goal of systems biology is to construct networks at different cellular levels to investigate the cellular machinery. However, there is currently no satisfactory method to construct an integrated cellular network that combines the gene regulatory network and the signaling regulatory pathway.

16.
17.
In mammalian cells, the Ku and DNA-dependent protein kinase catalytic subunit (DNA-PKcs) proteins are required for the correct and efficient repair of DNA double-strand breaks. Ku comprises two tightly-associated subunits of approximately 69 and approximately 83 kDa, which are termed Ku70 and Ku80 (or Ku86), respectively. Previously, a number of regions of both Ku subunits have been demonstrated to be involved in their interaction, but the molecular mechanism of this interaction remains unknown. We have identified a region in Ku70 (amino acid residues 449-578) and a region in Ku80 (residues 439-592) that participate in Ku subunit interaction. Sequence analysis reveals that these interaction regions share sequence homology and suggests that the Ku subunits are structurally related. On binding to a DNA double-strand break, Ku is able to interact with DNA-PKcs, but how this interaction is mediated has not been defined. We show that the extreme C-terminus of Ku80, specifically the final 12 amino acid residues, mediates a highly specific interaction with DNA-PKcs. Strikingly, these residues appear to be conserved only in Ku80 sequences from vertebrate organisms. These data suggest that Ku has evolved to become part of the DNA-PK holo-enzyme by acquisition of a protein-protein interaction motif at the C-terminus of Ku80.

18.
Yang P  Li X  Wu M  Kwoh CK  Ng SK 《PloS one》2011,6(7):e21502

Background

Phenotypically similar diseases have been found to be caused by functionally related genes, suggesting a modular organization of the genetic landscape of human diseases that mirrors the modularity observed in biological interaction networks. Protein complexes, as molecular machines that integrate multiple gene products to perform biological functions, express the underlying modular organization of protein-protein interaction networks. As such, protein complexes can be useful for interrogating the networks of phenome and interactome to elucidate gene-phenotype associations of diseases.

Methodology/Principal Findings

We proposed a technique called RWPCN (Random Walker on Protein Complex Network) for predicting and prioritizing disease genes. The basis of RWPCN is a protein complex network constructed using existing human protein complexes and protein interaction network. To prioritize candidate disease genes for the query disease phenotypes, we compute the associations between the protein complexes and the query phenotypes in their respective protein complex and phenotype networks. We tested RWPCN on predicting gene-phenotype associations using leave-one-out cross-validation; our method was observed to outperform existing approaches. We also applied RWPCN to predict novel disease genes for two representative diseases, namely, Breast Cancer and Diabetes.

Conclusions/Significance

Guilt-by-association prediction and prioritization of disease genes can be enhanced by fully exploiting the underlying modular organizations of both the disease phenome and the protein interactome. Our RWPCN uses a novel protein complex network as a basis for interrogating the human phenome-interactome network. As the protein complex network can capture the underlying modularity in the biological interaction networks better than simple protein interaction networks, RWPCN was found to be able to detect and prioritize disease genes better than traditional approaches that used only protein-phenotype associations.

19.
During intense network activity in vivo, cortical neurons are in a high-conductance state, in which the membrane potential (V(m)) is subject to a tremendous fluctuating activity. Clearly, this "synaptic noise" contains information about the activity of the network, but there are presently no methods available to extract this information. We focus here on this problem from a computational neuroscience perspective, with the aim of drawing methods to analyze experimental data. We start from models of cortical neurons, in which high-conductance states stem from the random release of thousands of excitatory and inhibitory synapses. This highly complex system can be simplified by using global synaptic conductances described by effective stochastic processes. The advantage of this approach is that one can derive analytically a number of properties from the statistics of resulting V(m) fluctuations. For example, the global excitatory and inhibitory conductances can be extracted from synaptic noise, and can be related to the mean activity of presynaptic neurons. We show here that extracting the variances of excitatory and inhibitory synaptic conductances can provide estimates of the mean temporal correlation (or level of synchrony) among thousands of neurons in the network. Thus, "probing the network" through intracellular V(m) activity is possible and constitutes a promising approach, but it will require a continuous effort combining theory, computational models and intracellular physiology.

20.
Inferring protein interactions from phylogenetic distance matrices   Cited by: 2 in total (self-citations: 0; citations by others: 2)
Finding the interacting pairs of proteins between two different protein families whose members are known to interact is an important problem in molecular biology. We developed and tested an algorithm that finds optimal matches between two families of proteins by comparing their distance matrices. A distance matrix provides a measure of the sequence similarity of proteins within a family. Since the protein sets of interest may have dozens of proteins each, the use of an efficient approximate solution is necessary. Therefore the approach we have developed consists of a Metropolis Monte Carlo optimization algorithm which explores the search space of possible matches between two distance matrices. We demonstrate that by using this algorithm we are able to accurately match chemokines and chemokine-receptors as well as the TGF-beta family of ligands and their receptors.
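A Metropolis Monte Carlo search over matchings between two distance matrices can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation: the squared-difference cost, the pair-swap move, the fixed temperature and all names (`mismatch`, `metropolis_match`) are choices made here, and the matrices are made-up toy data.

```python
import math
import random


def mismatch(d1, d2, perm):
    """Sum of squared differences between d1 and d2 relabelled by perm."""
    n = len(perm)
    return sum(
        (d1[i][j] - d2[perm[i]][perm[j]]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def metropolis_match(d1, d2, steps=5000, temperature=2.0, seed=0):
    """Search for a matching perm (protein i in family 1 pairs with
    perm[i] in family 2) that makes the two matrices agree."""
    rng = random.Random(seed)
    n = len(d1)
    perm = list(range(n))
    rng.shuffle(perm)
    cost = mismatch(d1, d2, perm)
    best_perm, best_cost = perm[:], cost

    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]  # propose swapping two matches
        new_cost = mismatch(d1, d2, perm)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept
        # worse moves so the search can leave local minima.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            cost = new_cost
            if cost < best_cost:
                best_perm, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]  # reject: undo the swap

    return best_perm, best_cost


# Toy data: d2 is d1 with its rows and columns relabelled by true_perm.
points = [0, 1, 3, 7]
d1 = [[abs(a - b) for b in points] for a in points]
true_perm = [2, 0, 3, 1]
d2 = [[0] * 4 for _ in range(4)]
for i in range(4):
    for j in range(4):
        d2[true_perm[i]][true_perm[j]] = d1[i][j]

best_perm, best_cost = metropolis_match(d1, d2)
print(best_perm, best_cost)
```

Because the search is stochastic and approximate, the returned matching is the best one seen during the run and is not guaranteed to be the global optimum; in practice one would typically repeat the run with several seeds or lower the temperature gradually.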
