Similar articles
 20 similar articles found (search time: 312 ms)
1.
Cell signaling networks propagate information from extracellular cues via dynamic modulation of protein-protein interactions in a context-dependent manner. Networks based on receptor tyrosine kinases (RTKs), for example, phosphorylate intracellular proteins in response to extracellular ligands, resulting in dynamic protein-protein interactions that drive phenotypic changes. Most commonly used methods for discovering these protein-protein interactions, however, are optimized for detecting stable, longer-lived complexes, rather than the type of transient interactions that are essential components of dynamic signaling networks such as those mediated by RTKs. Substrate phosphorylation downstream of RTK activation modifies substrate activity and induces phospho-specific binding interactions, resulting in the formation of large transient macromolecular signaling complexes. Since protein complex formation should follow the trajectory of events that drive it, we reasoned that mining phosphoproteomic datasets for highly similar dynamic behavior of measured phosphorylation sites on different proteins could be used to predict novel, transient protein-protein interactions that had not been previously identified. We applied this method to explore signaling events downstream of EGFR stimulation. Our computational analysis of robustly co-regulated phosphorylation sites, based on multiple clustering analysis of quantitative time-resolved mass-spectrometry phosphoproteomic data, not only identified known sitewise-specific recruitment of proteins to EGFR, but also predicted novel, a priori interactions. A particularly intriguing prediction of EGFR interaction with the cytoskeleton-associated protein PDLIM1 was verified within cells using co-immunoprecipitation and in situ proximity ligation assays. Our approach thus offers a new way to discover protein-protein interactions in a dynamic context- and phosphorylation site-specific manner.
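The idea of mining time-resolved phosphoproteomic data for co-regulated sites can be illustrated with a minimal sketch. It is an assumption-laden simplification, not the authors' pipeline: the abstract describes multiple clustering analyses, whereas this sketch only ranks pairs of sites by Pearson correlation of their time courses. The site labels, the protein names `PROT_A` and `PROT_B`, the intensity values, and the `threshold` parameter are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length time courses."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-regulation signal
    return cov / sqrt(var_x * var_y)


def co_regulated_pairs(profiles, threshold=0.95):
    """Return (site_a, site_b, r) for sites on different proteins with r >= threshold.

    `profiles` maps (protein, site) -> list of intensities over time.
    The threshold is an illustrative assumption, not a published value.
    """
    pairs = []
    for (site_a, course_a), (site_b, course_b) in combinations(sorted(profiles.items()), 2):
        if site_a[0] == site_b[0]:
            continue  # skip two sites on the same protein
        r = pearson(course_a, course_b)
        if r >= threshold:
            pairs.append((site_a, site_b, r))
    return pairs


# Hypothetical time courses (arbitrary units, four time points).
profiles = {
    ("EGFR", "site1"): [1.0, 4.0, 6.0, 3.0],
    ("PROT_A", "site2"): [0.5, 2.0, 3.0, 1.5],
    ("PROT_B", "site3"): [5.0, 3.0, 1.0, 2.0],
}
candidate_pairs = co_regulated_pairs(profiles)
```

Pairs returned this way would only be candidates; the abstract stresses that predictions were confirmed experimentally by co-immunoprecipitation and proximity ligation.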

2.
Mass spectrometry-based approaches are commonly used to identify proteins from multiprotein complexes, typically with the goal of identifying new complex members or identifying post-translational modifications. However, with the recent demonstration that spectral counting is a powerful quantitative proteomic approach, the analysis of multiprotein complexes by mass spectrometry can be reconsidered in certain cases. Using the chromatography-based approach named multidimensional protein identification technology, multiprotein complexes may be analyzed quantitatively using the normalized spectral abundance factor that allows comparison of multiple independent analyses of samples. This study describes an approach to visualize multiprotein complex datasets that provides structure-function information that is superior to tabular lists of data. In this method review, we describe a reanalysis of the Rpd3/Sin3 small and large histone deacetylase complexes previously described in a tabular form to demonstrate the normalized spectral abundance factor approach.
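As a concrete illustration of the normalized spectral abundance factor (NSAF) mentioned above: each protein's spectral count is divided by its length, and the result is normalized by the sum of these length-corrected counts over all proteins in the run. The sketch below assumes that standard definition; the protein names, counts, and lengths are made up for the example.

```python
def nsaf(spectral_counts, lengths):
    """Compute NSAF values for one run.

    spectral_counts: protein -> number of spectra assigned to it
    lengths:         protein -> protein length in residues
    NSAF_k = (SpC_k / L_k) / sum_i (SpC_i / L_i)
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra were assigned to any protein")
    return {p: value / total for p, value in saf.items()}


# Hypothetical run with three proteins.
counts = {"A": 10, "B": 20, "C": 30}
lengths = {"A": 100, "B": 400, "C": 300}
values = nsaf(counts, lengths)
```

Because the values within a run sum to 1, they can be compared across independent analyses, which is the property the abstract relies on.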

3.
4.
5.

Background

High-throughput techniques are becoming widely used to study protein-protein interactions and protein complexes on a proteome-wide scale. Here we have explored the potential of these techniques to accurately determine the constituent proteins of complexes and their architecture within the complex.

Results

Two-dimensional representations of the 19S and 20S proteasome, mediator, and SAGA complexes were generated and overlaid with high quality pairwise interaction data, core-module-attachment classifications from affinity purifications of complexes and predicted domain-domain interactions. Pairwise interaction data could accurately determine the members of each complex, but were unexpectedly poor at deciphering the topology of proteins in complexes. Core and module data from affinity purification studies were less useful for accurately defining the member proteins of these complexes. However, these data gave strong information on the spatial proximity of many proteins. Predicted domain-domain interactions provided some insight into the topology of proteins within complexes, but were affected by a lack of available structural data for the co-activator complexes and the presence of shared domains in paralogous proteins.

Conclusion

The constituent proteins of complexes are likely to be determined with accuracy by combining data from high-throughput techniques. The topology of some proteins in the complexes can be clearly inferred. Finally, we suggest strategies for using high-throughput interaction data to define the membership and understand the architecture of proteins in novel complexes.

6.
Histone deacetylase Rpd3 is part of two distinct complexes: the large (Rpd3L) and small (Rpd3S) complexes. While Rpd3L targets specific promoters for gene repression, Rpd3S is recruited to ORFs to deacetylate histones in the wake of RNA polymerase II, to prevent cryptic initiation within genes. Methylation of histone H3 at lysine 36 by the Set2 methyltransferase is thought to mediate the recruitment of Rpd3S. Here, we confirm by ChIP-Chip that Rpd3S binds active ORFs. Surprisingly, however, Rpd3S is not recruited to all active genes, and its recruitment is Set2-independent. However, Rpd3S complexes recruited in the absence of H3K36 methylation appear to be inactive. Finally, we present evidence implicating the yeast DSIF complex (Spt4/5) and RNA polymerase II phosphorylation by Kin28 and Ctk1 in the recruitment of Rpd3S to active genes. Taken together, our data support a model where Set2-dependent histone H3 methylation is required for the activation of Rpd3S following its recruitment to the RNA polymerase II C-terminal domain.

7.
8.

Background

Effectively predicting protein complexes not only helps to understand the structures and functions of proteins and their complexes, but also is useful for diagnosing disease and developing new drugs. Up to now, many methods have been developed to detect complexes by mining dense subgraphs from static protein-protein interaction (PPI) networks, while ignoring the value of other biological information and the dynamic properties of cellular systems.

Results

In this paper, based on our previous works CPredictor and CPredictor2.0, we present a new method for predicting complexes from PPI networks with both gene expression data and protein functional annotations, which is called CPredictor3.0. This new method follows the viewpoint that proteins in the same complex should roughly have similar functions and are active at the same time and place in cellular systems. We first detect active proteins by using gene expression data of different time points and cluster proteins by using gene ontology (GO) functional annotations, respectively. Then, for each time point, we do set intersections with one set corresponding to active proteins generated from expression data and the other set corresponding to a protein cluster generated from functional annotations. Each resulting unique set indicates a cluster of proteins that have similar function(s) and are active at that time point. Following that, we map each cluster of active proteins of similar function onto a static PPI network, and get a series of induced connected subgraphs. We treat these subgraphs as candidate complexes. Finally, by expanding and merging these candidate complexes, the predicted complexes are obtained. We evaluate CPredictor3.0 and compare it with a number of existing methods on several PPI networks and benchmarking complex datasets. The experimental results show that CPredictor3.0 achieves the highest F1-measure, which indicates that CPredictor3.0 outperforms these existing methods overall.

Conclusion

CPredictor3.0 can serve as a promising tool for protein complex prediction.
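The intersection and subgraph steps described in the Results of this abstract can be sketched roughly as follows. This is a hedged illustration and not the CPredictor3.0 implementation: how active proteins and GO clusters are derived, and how candidates are later expanded and merged, is not specified in the abstract and is left out. The function name, the `min_size` filter, and the toy data are assumptions.

```python
def candidate_complexes(active_by_time, functional_clusters, ppi_edges, min_size=2):
    """Intersect active sets with functional clusters, then split each
    resulting set into connected components of the induced PPI subgraph."""
    adjacency = {}
    for u, v in ppi_edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Step 1: one intersection per (time point, functional cluster); keep unique sets.
    unique_sets = set()
    for active in active_by_time.values():
        for cluster in functional_clusters:
            common = frozenset(active & cluster)
            if common:
                unique_sets.add(common)

    # Step 2: connected components of the subgraph induced by each set.
    candidates = set()
    for group in unique_sets:
        unseen = set(group)
        while unseen:
            start = unseen.pop()
            component, stack = {start}, [start]
            while stack:
                node = stack.pop()
                for neighbor in adjacency.get(node, ()):
                    if neighbor in unseen:  # only nodes inside the group
                        unseen.remove(neighbor)
                        component.add(neighbor)
                        stack.append(neighbor)
            if len(component) >= min_size:  # assumed filter: drop singletons
                candidates.add(frozenset(component))
    return candidates


# Toy data: two time points, two functional clusters, a path-shaped PPI network.
active_by_time = {"t1": {"A", "B", "C", "D"}, "t2": {"C", "D", "E"}}
functional_clusters = [{"A", "B", "E"}, {"C", "D"}]
ppi_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
candidates = candidate_complexes(active_by_time, functional_clusters, ppi_edges)
```

In the toy data, the set {C, D} arises at both time points but is kept once, mirroring the "each resulting unique set" wording of the abstract.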

9.
Recognition of histone post-translational modifications is pivotal for directing chromatin-modifying enzymes to specific genomic regions and regulating their activities. Emerging evidence suggests that other structural features of nucleosomes also contribute to precise targeting of downstream chromatin complexes, such as linker DNA, the histone globular domain, and nucleosome spacing. However, how chromatin complexes coordinate individual interactions to achieve high affinity and specificity remains unclear. The Rpd3S histone deacetylase utilizes the chromodomain-containing Eaf3 subunit and the PHD domain-containing Rco1 subunit to recognize nucleosomes that are methylated at lysine 36 of histone H3 (H3K36me). We showed previously that the binding of Eaf3 to H3K36me can be allosterically activated by Rco1. To investigate how this chromatin recognition module is regulated in the context of the Rpd3S complex, we first determined the subunit interaction network of Rpd3S. Interestingly, we found that Rpd3S contains two copies of the essential subunit Rco1, and both copies of Rco1 are required for full functionality of Rpd3S. Our functional dissection of Rco1 revealed that besides its known chromatin-recognition interfaces, other regions of Rco1 are also critical for Rpd3S to recognize its nucleosomal substrates and function in vivo. This unexpected result uncovered an important and understudied aspect of chromatin recognition. It suggests that precisely reading modified chromatin may not only need the combined actions of reader domains but also require an internal signaling circuit that coordinates the individual actions in a productive way.

10.
11.
12.

Background

Protein-protein interactions play a crucial role in enabling a pathogen to survive within a host. In many cases the interactions involve a complex of proteins rather than just two given proteins. This is especially true for pathogens like M. tuberculosis that are able to successfully survive the inhospitable environment of the macrophage. Studying such interactions in detail may help in developing small molecules that either disrupt or augment the interactions. Here, we describe the development of an E. coli based bacterial three-hybrid system that can be used effectively to study ternary protein complexes.

Methodology/Principal Findings

The protein-protein interactions involved in M. tuberculosis pathogenesis have been used as a model for the validation of the three-hybrid system. Using the M. tuberculosis RD1 encoded proteins CFP10, ESAT6 and Rv3871 for our proof-of-concept studies, we show that the interaction between the proteins CFP10 and Rv3871 is strengthened and stabilized in the presence of ESAT6, the known heterodimeric partner of CFP10. Isolating peptide candidates that can disrupt crucial protein-protein interactions is another application that the system offers. We demonstrate this by using CFP10 protein as a disruptor of a previously established interaction between ESAT6 and a small peptide HCL1; at the same time we also show that CFP10 is not able to disrupt the strong interaction between ESAT6 and another peptide SL3.

Conclusions/Significance

The validation of the three-hybrid system paves the way for finding new peptides that are stronger binders of ESAT6 compared even to its natural partner CFP10. Additionally, we believe that the system offers an opportunity to study tri-protein complexes and also perform a screening of protein/peptide binders to known interacting proteins so as to elucidate novel tri-protein complexes.

13.
Chromatin immunoprecipitation (ChrIP or ChIP) has commonly been used to map protein-DNA interaction sites at specific genomic loci through use of formaldehyde-induced crosslinking. However, formaldehyde alone has proved inadequate for crosslinking of certain proteins such as the yeast histone deacetylase Rpd3. We report here a modified crosslinking procedure that includes a protein-protein crosslinking agent in addition to formaldehyde. Using this double crosslinking method, we have successfully mapped Rpd3 binding sites in vivo. We also describe the use of ChrIP in combination with DNA microarrays (ChrIP-array) to determine the pattern of Rpd3 binding genomewide. This approach couples the versatility of ChrIP with that of microarrays to identify binding patterns that would otherwise be hidden in a gene-by-gene survey.

14.
Most cellular processes are performed by proteomic units that interact with each other. These units are often stoichiometrically stable complexes comprised of several proteins. To obtain a faithful view of the protein interactome we must view it in terms of these basic units (complexes and proteins) and the interactions between them. This study makes two contributions toward this goal. First, it provides a new algorithm for reconstruction of stable complexes from a variety of heterogeneous biological assays; our approach combines state-of-the-art machine learning methods with a novel hierarchical clustering algorithm that allows clusters to overlap. We demonstrate that our approach constructs over 40% more known complexes than other recent methods and that the complexes it produces are more biologically coherent even compared with the reference set. We provide experimental support for some of our novel predictions, identifying both a new complex involved in nutrient starvation and a new component of the eisosome complex. Second, we provide a high accuracy algorithm for the novel problem of predicting transient interactions involving complexes. We show that our complex level network, which we call ComplexNet, provides novel insights regarding the protein-protein interaction network. In particular, we reinterpret the finding that “hubs” in the network are enriched for being essential, showing instead that essential proteins tend to be clustered together in essential complexes and that these essential complexes tend to be large.

Biological processes exhibit a hierarchical structure in which the basic working units, proteins, physically associate to form stoichiometrically stable complexes. Complexes interact with individual proteins or other complexes to form functional modules and pathways that carry out most cellular processes. Such higher level interactions are more transient than those within complexes and are highly dependent on temporal and spatial context. The function of each protein or complex depends on its interaction partners. Therefore, a faithful reconstruction of the entire set of complexes in the cell is essential to identifying the function of individual proteins and complexes as well as serving as a building block for understanding the higher level organization of the cell, such as the interactions of complexes and proteins within cellular pathways. Here we describe a novel method for reconstruction of complexes from a variety of biological assays and a method for predicting the network of interactions relating these core cellular units (complexes and proteins).

Our reconstruction effort focuses on the yeast Saccharomyces cerevisiae. Yeast serves as the prototypical case study for the reconstruction of protein-protein interaction networks. Moreover the yeast complexes often have conserved orthologs in other organisms, including human, and are of interest in their own right. Several studies (1–4) using a variety of assays have generated high throughput data that directly measure protein-protein interactions. Most notably, two high quality data sets (3, 4) used tandem affinity purification (TAP) followed by MS to provide a proteome-wide measurement of protein complexes. These data provide the basis for attempting a comprehensive reconstruction of a large fraction of the protein complexes in this organism. Indeed a number of works (5, 6) have attempted such a reconstruction. Generally speaking, all use the same general procedure: one or more data sources are used to estimate a set of affinities between pairs of proteins, essentially measuring the likelihood of that pair to participate together in a complex. These affinities induce a weighted graph whose nodes are proteins and whose edges encode the affinities. A clustering algorithm is then used to construct complexes, sets of proteins that have high affinity in the graph. Although similar at a high level, the different methods differ significantly on the design choices made for the key steps in the process.

Recent works (since 2006) all focus on processing the proteome-wide TAP-MS data and using the results to define complexes. Gavin et al. (3), Collins et al. (7), and Hart et al. (5) all use probabilistic models that compare the number of interactions observed between proteins in the data versus the number expected in some null model. Collins et al. (7) and Hart et al. (5) both used all three of the available high throughput data sets (2–4) in an attempt to provide a unified interaction network. The two unified networks resulting from these studies were shown to have large overlap and to achieve comparable agreement with the set of co-complex interactions in the MIPS data set (8) that are collated from previous small scale studies. The interaction graphs resulting from the computed affinity scores are then clustered to produce a set of identified complexes. Gavin et al. (3), Hart et al. (5), and Pu et al. (6) all use a Markov clustering (MCL) (9) procedure; Collins et al. (7) use a hierarchical agglomerative clustering (HAC) procedure but do not suggest a computational procedure for using the resulting dendrogram to produce specific complex predictions.

Despite the fairly high quality of these networks and the agreement between them, they still contain many false positives and negatives. False negatives can arise, for example, from the difficulty in detecting interactions involving low abundance proteins or membrane proteins or from cases where the tag added to the bait protein during TAP-MS prevents binding of the bait to its interacting partners. False positives can arise, for example, from complexes that share components or from the contaminants that bind to the bait nonspecifically after cell lysis. Therefore, the set of complexes derived from the protein-protein interaction network alone has limited accuracy. Less than 20% of the MIPS complexes (8), which are derived from reliable small scale experiments, are exactly captured by the predictions of Pu et al. (6) or by those of Hart et al. (5).

In this study, we constructed a method that generates a set of complexes with higher sensitivity and coverage by integrating multiple sources of data, including mRNA gene expression data, cellular localization, and yeast two-hybrid data. The data integration approach was used in some early works on predicting protein-protein interactions (10, 11) and more recently by Qiu and Noble (12), but these studies focus only on predicting pairs of proteins in the same complex and not on reconstructing entire complexes. Many recent studies (13–21) have successfully integrated multiple types of data to predict functional linkage between proteins, constructing a graph whose pairwise affinity score summarizes the information from different sources of data. However, because the data integration is not trained toward predicting complexes, the high affinity pairs contain transient binding partners and even protein pairs that never interact directly but merely function in the same pathways. When these graphs are clustered, the clusters correspond to a variety of cellular entities, including pathways, functional modules, or co-expression clusters. We developed a data integration approach that is aimed directly at the problem of predicting stoichiometrically stable complexes.

We used a two-phase automated procedure that we trained on a new high quality reference set that we generated from annotations in MIPS and SGD and from manual curation of the literature. In the first phase, we used boosting (22), a state-of-the-art machine learning method, to train an affinity function that is specifically aimed at predicting whether two proteins are co-complexed. Unlike most other learning methods, boosting is capable of inducing useful features by combining different aspects of the raw data, making it particularly well suited to a data integration setting. Once we generated the learned affinity graph over pairs of proteins, we predicted complexes by using a novel clustering algorithm called hierarchical agglomerative clustering with overlap (HACO). The HACO algorithm is a simple and elegant extension of HAC that addresses many of its limitations, such as the irreversible commitment to a possibly incorrect clustering decision. HACO can be applied to any setting where HAC is applied; given the enormous usefulness of HAC for the analysis of biological data sets of many different types (e.g. Refs. 7, 23, and 24), we believe that HACO may be applicable in a broad range of other tasks.

To validate our approach, we tested the ability of our methods and other methods to predict reference complexes that were not used in training. By integrating multiple sources of data, we recovered more reference complexes than other state-of-the-art methods (5, 6) when applied to the same set of yeast proteins. We also validated our predicted set of complexes against external data sources that are not used in the training. In all cases, our predictions were shown to be more coherent than other methods and, in many cases, more coherent even than the set of reference complexes.

A detailed examination of our predicted complexes suggests that many of them were previously known but not included in our (comprehensive) reference set, suggesting that our complexes form a valuable new set of reference complexes. In several cases, our predicted complexes were not previously characterized. We experimentally validated two of these predictions: a new component in the recently characterized eisosome complex (25), which marks the site of endocytosis in eukaryotes, and a newly characterized six-protein complex, including four phosphatases, that appears to be involved in the response to nutrient starvation and that we named the nutrient starvation complex (NSC).

The complex-based view provides a new perspective on the analysis and reconstruction of the protein interaction network. In the past, Jeong et al. (26) have suggested that the degree of a protein in an interaction network is positively correlated with its essentiality and have argued that “hubs” in the network are more likely to be essential because they are involved in more interactions. Our analysis presents a complex-based alternative view: essential proteins tend to cluster together in essential complexes (5), and essential complexes tend to be large; thus, the essential hubs in the network are often members in large complexes comprised mostly of essential proteins. We also reformulate the task of reconstructing the protein interaction network. Rather than considering interactions between individual proteins (27–29), a somewhat confusing network that confounds interactions within complexes and interactions between complexes, we tackle the novel task of predicting a comprehensive protein interaction network that involves both individual proteins and larger complexes. We argue that these entities are the right building blocks in reconstructing cellular processes, providing a view of cellular interaction networks that is both easier to interpret than the complex network of interactions between individual proteins and more faithful to biological reality. Moreover a complex, which is a stable collection of many proteins that act together, provides a more robust basis for predicting interactions as we can combine signals for all its constituent proteins, reducing sensitivity to noise.

To accomplish this goal, we constructed a reference set of complex-complex interactions, considering two complexes to interact if they are significantly enriched for reliable interactions between their components. We further augmented this set with a hand-curated list of established complex-complex interactions. We then used a machine learning approach to detect the “signature” of such interactions from a large set of assays that are likely to be indicative. We explored different machine learning methods and showed that a partially supervised naïve Bayes model, where we learned the model from both labeled and unlabeled interactions, provides the best performance. This model was applied both to our predicted complexes and to individual proteins, providing a new, comprehensive reconstruction of the S. cerevisiae interaction network, which can be downloaded from our project Web page. We showed that entities that are predicted to interact are more likely to share the same functional categories. A detailed investigation of our new predicted interactions presents many that are established in the literature as well as some that are novel but consistent, presenting plausible hypotheses for further investigation.

15.
16.
17.
The Rpd3 histone deacetylase (HDAC) functions in a large complex containing many proteins including Sin3 and Sap30. Previous evidence indicates that the pho23, rpd3, sin3, and sap30 mutants exhibit similar defects in PHO5 regulation. We report that pho23 mutants like rpd3, sin3, and sap30 are hypersensitive to cycloheximide and heat shock and exhibit enhanced silencing of rDNA, telomeric, and HMR loci, suggesting that these genes are functionally related. Based on these observations, we explored whether Pho23 is a component of the Rpd3 HDAC complex. Our results demonstrate that Myc-Pho23 co-immunoprecipitates with HA-Rpd3 and HA-Sap30. Furthermore, similar levels of HDAC activity were detected in immunoprecipitates of HA-Pho23, HA-Rpd3, or HA-Sap30. In contrast, HDAC activity was not detected in immunoprecipitates of HA-Pho23 or HA-Sap30 from strains lacking Rpd3, suggesting that Rpd3 is the HDAC associated with these proteins. However, HDAC activity was detected in immunoprecipitates of HA-Sap30 or HA-Rpd3 from cells lacking Pho23, although levels were significantly lower than those detected in wild-type cells, indicating that Rpd3 activity is compromised in the absence of Pho23. Together, our genetic and biochemical studies provide strong evidence that Pho23 is a component of the Rpd3 HDAC complex, and is required for the normal function of this complex.

18.
Proteins perform essential cellular functions as part of protein complexes, often in conjunction with RNA, DNA, metabolites and other small molecules. The genome encodes thousands of proteins, but not all of them are expressed in every cell type, and expressed proteins are not active at all times. Such diversity of protein expression and function accounts for the level of biological intricacy seen in nature. Defining protein-protein interactions in protein complexes, and establishing the when, what and where of potential interactions, is therefore crucial to understanding the cellular function of any protein, especially those that have not been well studied by traditional molecular genetic approaches. We generated a large-scale resource of affinity-tagged expression-ready clones and used co-affinity purification combined with tandem mass-spectrometry to identify protein partners of nearly 5,000 Drosophila melanogaster proteins. The resulting protein complex “map” provided a blueprint of metazoan protein complex organization. Here we describe how the map has provided valuable insights into protein function in addition to generating hundreds of testable hypotheses. We also discuss recent technological advancements that will be critical in addressing the next generation of questions arising from the map.

19.
To understand the function of protein complexes and their association with biological processes, many studies have analyzed protein-protein interaction (PPI) networks. However, advances in high-throughput technology have produced a very large amount of data for analysis. Moreover, the high level of noise, sparseness, and skewness in the degree distribution of PPI networks limits the performance of many clustering algorithms and further analysis of their interactions. To address these problems, we present a novel random walk based algorithm that converts the incomplete and binary PPI network into a protein-protein topological similarity matrix (PP-TS matrix). We believe that if two proteins share some high-order topological similarities they are likely to be interacting with each other. Using the obtained PP-TS matrix, we constructed and used weighted networks to further study and analyze the interaction among proteins. Specifically, we applied a fully automated community structure finding algorithm (Auto-HQcut) on the obtained weighted network to cluster protein complexes. We then analyzed the protein complexes for significance in biological processes. To help visualize and analyze these protein complexes we also developed an interface that displays the resulting complexes as well as the characteristics associated with each complex. Applying our approach to a yeast protein-protein interaction network, we found that the predicted protein-protein interaction pairs with high topological similarities have more significant biological relevance than the original protein-protein interaction pairs. When we compared our PPI network reconstruction algorithm with other existing algorithms using gene ontology and gene co-expression, our algorithm produced the highest similarity scores. Also, our predicted protein complexes showed a higher accuracy measure compared to the other protein complex predictions.
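To make the idea of turning a binary PPI network into a weighted similarity matrix more tangible, here is a minimal sketch using a generic random walk with restart. The abstract does not give the details of the authors' random walk algorithm, so this is a stand-in technique, not their method; the function name, the `restart` probability, the iteration count, and the toy path network are assumptions.

```python
def random_walk_similarity(edges, restart=0.5, iterations=50):
    """Return similarity[seed][node]: the approximate stationary probability
    of a random walk with restart that starts (and restarts) at `seed`."""
    nodes = sorted({n for edge in edges for n in edge})
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)

    similarity = {}
    for seed in nodes:
        prob = {n: 1.0 if n == seed else 0.0 for n in nodes}
        for _ in range(iterations):
            # Restart mass goes back to the seed; the rest spreads to neighbors.
            nxt = {n: restart if n == seed else 0.0 for n in nodes}
            for n in nodes:
                if prob[n] == 0.0:
                    continue
                if not neighbors[n]:
                    nxt[seed] += (1.0 - restart) * prob[n]  # dead end: return to seed
                    continue
                share = (1.0 - restart) * prob[n] / len(neighbors[n])
                for neighbor in neighbors[n]:
                    nxt[neighbor] += share
            prob = nxt
        similarity[seed] = prob
    return similarity


# Toy binary network: a path A - B - C - D.
sim = random_walk_similarity([("A", "B"), ("B", "C"), ("C", "D")])
# Protein pairs can then be weighted, e.g. by the symmetric average
# (sim[u][v] + sim[v][u]) / 2, before clustering the weighted network.
```

On the toy path, proteins closer to the seed receive higher scores than distant ones, which is the kind of graded weight a community-finding step could then use.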

20.
We present a computational procedure for modeling protein-protein association and predicting the structures of protein-protein complexes. The initial sampling stage is based on an efficient Brownian dynamics algorithm that mimics the physical process of diffusional association. Relevant biochemical data can be directly incorporated as distance constraints at this stage. The docked configurations are then grouped with a hierarchical clustering algorithm into ensembles that represent potential protein-protein encounter complexes. Flexible refinement of selected representative structures is done by molecular dynamics simulation. The protein-protein docking procedure was thoroughly tested on 10 structurally and functionally diverse protein-protein complexes. Starting from X-ray crystal structures of the unbound proteins, in 9 out of 10 cases it yields structures of protein-protein complexes close to those determined experimentally, with the percentage of correct contacts >30% and interface backbone RMSD <4 Å. Detailed examination of all the docking cases gives insights into important determinants of the performance of the computational approach in modeling protein-protein association and predicting protein-protein complex structures.
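The two quality criteria quoted in this abstract (fraction of correct contacts above 30% and interface backbone RMSD below 4 Å) can be evaluated with a short sketch like the one below. It assumes the predicted and experimental interface atoms are already matched one-to-one and superimposed; the superposition step, the contact definition, and all function names and coordinates are assumptions for illustration only.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched, pre-superimposed
    lists of (x, y, z) coordinates, in the same units as the input."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


def fraction_correct_contacts(predicted, native):
    """Share of native residue-residue contacts reproduced by the prediction."""
    native = set(native)
    if not native:
        raise ValueError("native contact set is empty")
    return len(native & set(predicted)) / len(native)


def is_close_to_experiment(contact_fraction, interface_rmsd):
    """Apply the thresholds quoted in the abstract: >30% contacts, RMSD < 4 Å."""
    return contact_fraction > 0.30 and interface_rmsd < 4.0


# Hypothetical two-atom interface and contact lists.
deviation = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 3), (1, 0, 4)])
fraction = fraction_correct_contacts({(1, 2), (5, 6), (7, 8)}, {(1, 2), (3, 4), (5, 6)})
accepted = is_close_to_experiment(fraction, deviation)
```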


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号