Similar literature
1.
Traditional protein annotation methods describe known domains with probabilistic models representing consensus among homologous domain sequences. However, when relevant signals become too weak to be identified by a global consensus, attempts for annotation fail. Here we address the fundamental question of domain identification for highly divergent proteins. By using high performance computing, we demonstrate that the limits of state-of-the-art annotation methods can be bypassed. We design a new strategy based on the observation that many structural and functional protein constraints are not globally conserved through all species but might be locally conserved in separate clades. We propose a novel exploitation of the large amount of data available: 1. for each known protein domain, several probabilistic clade-centered models are constructed from a large and differentiated panel of homologous sequences, 2. a decision-making protocol combines outcomes obtained from multiple models, 3. a multi-criteria optimization algorithm finds the most likely protein architecture. The method is evaluated for domain and architecture prediction over several datasets and statistical testing hypotheses. Its performance is compared against HMMScan and HHblits, two widely used search methods based on sequence-profile and profile-profile comparison. Due to their closeness to actual protein sequences, clade-centered models are shown to be more specific and functionally predictive than the broadly used consensus models. Based on them, we improved annotation of Plasmodium falciparum protein sequences on a scale not previously possible. We successfully predict at least one domain for 72% of P. falciparum proteins against 63% achieved previously, corresponding to 30% of improvement over the total number of Pfam domain predictions on the whole genome. 
The method is applicable to any genome and opens new avenues to tackle evolutionary questions such as the reconstruction of ancient domain duplications, the reconstruction of the history of protein architectures, and the estimation of protein domain age. Website and software: http://www.lcqb.upmc.fr/CLADE.

2.
Large-scale analyses of protein-protein interactions based on coarse-grain molecular docking simulations and binding site predictions resulting from evolutionary sequence analysis, are possible and realizable on hundreds of proteins with variate structures and interfaces. We demonstrated this on the 168 proteins of the Mintseris Benchmark 2.0. On the one hand, we evaluated the quality of the interaction signal and the contribution of docking information compared to evolutionary information showing that the combination of the two improves partner identification. On the other hand, since protein interactions usually occur in crowded environments with several competing partners, we realized a thorough analysis of the interactions of proteins with true partners but also with non-partners to evaluate whether proteins in the environment, competing with the true partner, affect its identification. We found three populations of proteins: strongly competing, never competing, and interacting with different levels of strength. Populations and levels of strength are numerically characterized and provide a signature for the behavior of a protein in the crowded environment. We showed that partner identification, to some extent, does not depend on the competing partners present in the environment, that certain biochemical classes of proteins are intrinsically easier to analyze than others, and that small proteins are not more promiscuous than large ones. Our approach brings to light that the knowledge of the binding site can be used to reduce the high computational cost of docking simulations with no consequence in the quality of the results, demonstrating the possibility to apply coarse-grain docking to datasets made of thousands of proteins. Comparison with all available large-scale analyses aimed to partner predictions is realized. 
We release the complete decoy set produced by coarse-grain docking simulations of both true and false interacting partners, together with the evolutionary sequence analysis leading to binding site predictions. Download site: http://www.lgm.upmc.fr/CCDMintseris/

3.
4.
Transient receptor potential (TRP) channels are a family of Ca²⁺-permeable cation channels that play a crucial role in biological and disease processes. To advance TRP channel research, we previously created the TRIP (TRansient receptor potential channel-Interacting Protein) Database, a manually curated database that compiles scattered information on TRP channel protein-protein interactions (PPIs). However, the database needs to be improved for information accessibility and data utilization. Here, we present the TRIP Database 2.0 (http://www.trpchannel.org) in which many helpful, user-friendly web interfaces have been developed to facilitate knowledge acquisition and inspire new approaches to studying TRP channel functions: 1) the PPI information found in the supplementary data of referred articles was curated; 2) the PPI summary matrix enables users to intuitively grasp overall PPI information; 3) the search capability has been expanded to retrieve information from ‘PubMed’ and ‘PIE the search’ (a specialized search engine for PPI-related articles); and 4) the PPI data are available as sif files for network visualization and analysis using ‘Cytoscape’. Therefore, our TRIP Database 2.0 is an information hub that works toward advancing data-driven TRP channel research.
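The sif export mentioned in this abstract refers to Cytoscape's simple interaction format, in which each line names a source node, an interaction type, and a target node. The following is a minimal sketch of writing such a file; the protein names and the interaction label `pp` are hypothetical placeholders and are not taken from the TRIP Database.

```python
import io


def write_sif(interactions, handle, interaction_type="pp"):
    """Write (source, target) pairs as tab-separated SIF lines."""
    for source, target in interactions:
        handle.write(f"{source}\t{interaction_type}\t{target}\n")


# Hypothetical pairs, for illustration only (not real database content).
example_pairs = [("TRPC1", "PARTNER_A"), ("TRPC1", "PARTNER_B")]

buffer = io.StringIO()
write_sif(example_pairs, buffer)
sif_text = buffer.getvalue()
print(sif_text, end="")
```

Writing to any file-like handle keeps the sketch testable in memory; passing an opened `.sif` file instead would produce a file that can be imported as a network.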

5.
The perturbations of protein-protein interactions (PPIs) were found to be the main cause of cancer. Previous PPI prediction methods, which were trained on general, non-disease PPI data, were not suited to mapping the PPI network in cancer. Therefore, we established a novel cancer-specific PPI prediction method dubbed NECARE, based on a relational graph convolutional network (R-GCN) with knowledge-based features. It achieved the best performance among the compared methods, with a Matthews correlation coefficient (MCC) of 0.84±0.03 and an F1 of 91±2%. With NECARE, we mapped the cancer interactome atlas and revealed that the perturbations of PPIs were enriched on 1362 genes, which were named cancer hub genes. Those genes were found to be enriched in mutations occurring at protein-macromolecule binding interfaces. Furthermore, over 56% of cancer treatment-related genes belonged to hub genes, and they were significantly related to the prognosis of 32 types of cancer. Finally, by coimmunoprecipitation, we confirmed that the NECARE prediction method was highly reliable, with 90% accuracy. Overall, we provide a novel network-based cancer protein-protein interaction prediction method and map the perturbation of the cancer interactome. NECARE is available at: https://github.com/JiajunQiu/NECARE.
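The two scores quoted in this abstract, the Matthews correlation coefficient and F1, are both computed from a binary confusion matrix. A minimal sketch of the standard definitions follows; the counts used in the example are hypothetical and are not NECARE's results.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def f1(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


# Hypothetical counts, for illustration only.
print(mcc(tp=90, fp=10, tn=90, fn=10))
print(f1(tp=90, fp=10, fn=10))
```

MCC uses all four cells of the confusion matrix, which makes it less sensitive than F1 to class imbalance between interacting and non-interacting pairs.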

6.

Background

It is well established that only a portion of residues that mediate protein-protein interactions (PPIs), the so-called hot spot, contributes the most to the total binding energy, and thus its identification is an important and relevant question that has clear applications in drug discovery and protein design. The experimental identification of hot spots is however a lengthy and costly process, and thus there is an interest in computational tools that can complement and guide experimental efforts.

Principal Findings

Here, we present Presaging Critical Residues in Protein interfaces-Web server (http://www.bioinsilico.org/PCRPi), a web server that implements a recently described and highly accurate computational tool designed to predict critical residues in protein interfaces: PCRPi. PCRPi depends on the integration of structural, energetic, and evolutionary-based measures by using Bayesian Networks (BNs).

Conclusions

PCRPi-W has been designed to provide easy and convenient access for the broad scientific community. Predictions are readily available for download or presented in a web page that includes, among other information, links to relevant files, sequence information, and a Jmol applet to visualize and analyze the predictions in the context of the protein structure.

7.
The growing body of experimental and computational data describing how proteins interact with each other has emphasized the multiplicity of protein interactions and the complexity underlying protein surface usage and deformability. In this work, we propose new concepts and methods toward deciphering such complexity. We introduce the notion of interacting region to account for the multiple usage of a protein's surface residues by several partners and for the variability of protein interfaces coming from molecular flexibility. We predict interacting patches by crossing evolutionary, physicochemical and geometrical properties of the protein surface with information coming from complete cross-docking (CC-D) simulations. We show that our predictions match interacting regions well and that the different sources of information are complementary. We further propose an indicator of whether a protein has a few or many partners. Our prediction strategies are implemented in the dynJET2 algorithm and assessed on a new dataset of 262 proteins on which we performed CC-D. The code and the data are available at: http://www.lcqb.upmc.fr/dynJET2/.

8.
9.
The specificity of protein-protein interactions is encoded in those parts of the sequence that compose the binding interface. Therefore, understanding how changes in protein sequence influence interaction specificity, and possibly the phenotype, requires knowing the location of binding sites in those sequences. However, large-scale detection of protein interfaces remains a challenge. Here, we present a sequence- and interactome-based approach to mine interaction motifs from the recently published Arabidopsis thaliana interactome. The resultant proteome-wide predictions are available via www.ab.wur.nl/sliderbio and set the stage for further investigations of protein-protein binding sites. To assess our method, we first show that, by using a priori information calculated from protein sequences, such as evolutionary conservation and residue surface accessibility, we improve the performance of interface prediction compared to using only interactome data. Next, we present evidence for the functional importance of the predicted sites, which are under stronger selective pressure than the rest of protein sequence. We also observe a tendency for compensatory mutations in the binding sites of interacting proteins. Subsequently, we interrogated the interactome data to formulate testable hypotheses for the molecular mechanisms underlying effects of protein sequence mutations. Examples include proteins relevant for various developmental processes. Finally, we observed, by analysing pairs of paralogs, a correlation between functional divergence and sequence divergence in interaction sites. This analysis suggests that large-scale prediction of binding sites can cast light on evolutionary processes that shape protein-protein interaction networks.

10.

Motivation

Computational simulation of protein-protein docking can expedite the process of molecular modeling and drug discovery. This paper reports on our new F2 Dock protocol which improves the state of the art in initial stage rigid body exhaustive docking search, scoring and ranking by introducing improvements in the shape-complementarity and electrostatics affinity functions, a new knowledge-based interface propensity term with FFT formulation, a set of novel knowledge-based filters and finally a solvation energy (GBSA) based reranking technique. Our algorithms are based on highly efficient data structures including the dynamic packing grids and octrees which significantly speed up the computations and also provide guaranteed bounds on approximation error.

Results

The improved affinity functions show superior performance compared to their traditional counterparts in finding correct docking poses at higher ranks. We found that the new filters and the GBSA-based reranking, individually and in combination, significantly improve the accuracy of docking predictions with only a minor increase in computation time. We compared F2 Dock 2.0 with ZDock 3.0.2 and found improvements over it. Specifically, among 176 complexes in ZLab Benchmark 4.0, F2 Dock 2.0 finds a near-native solution as the top prediction for 22 complexes, whereas ZDock 3.0.2 does so for 13 complexes. F2 Dock 2.0 finds a near-native solution within the top 1000 predictions for 106 complexes, as opposed to 104 complexes for ZDock 3.0.2. However, there are 17 complexes where F2 Dock 2.0 finds a solution but ZDock 3.0.2 does not, and 15 complexes where the reverse holds, which indicates that the two docking protocols can also complement each other.

Availability

The docking protocol has been implemented as a server with a graphical client (TexMol) which allows the user to manage multiple docking jobs, and visualize the docked poses and interfaces. Both the server and client are available for download. Server: http://www.cs.utexas.edu/~bajaj/cvc/software/f2dock.shtml. Client: http://www.cs.utexas.edu/~bajaj/cvc/software/f2dockclient.shtml.

11.
Many biological processes are mediated by protein-protein interactions (PPIs). Because protein domains are the building blocks of proteins, PPIs likely rely on domain-domain interactions (DDIs). Several attempts exist to infer DDIs from PPI networks but the produced datasets are heterogeneous and sometimes not accessible, while the PPI interactome data keeps growing. We describe a new computational approach called “PPIDM” (Protein-Protein Interactions Domain Miner) for inferring DDIs using multiple sources of PPIs. The approach is an extension of our previously described “CODAC” (Computational Discovery of Direct Associations using Common neighbors) method for inferring new edges in a tripartite graph. The PPIDM method has been applied to seven widely used PPI resources, using as “Gold-Standard” a set of DDIs extracted from 3D structural databases. Overall, PPIDM has produced a dataset of 84,552 non-redundant DDIs. Statistical significance (p-value) is calculated for each source of PPI and used to classify the PPIDM DDIs in Gold (9,175 DDIs), Silver (24,934 DDIs) and Bronze (50,443 DDIs) categories. Dataset comparison reveals that PPIDM has inferred from the 2017 releases of PPI sources about 46% of the DDIs present in the 2020 release of the 3did database, not counting the DDIs present in the Gold-Standard. The PPIDM dataset contains 10,229 DDIs that are consistent with more than 13,300 PPIs extracted from the IMEx database, and nearly 23,300 DDIs (27.5%) that are consistent with more than 214,000 human PPIs extracted from the STRING database. Examples of newly inferred DDIs covering more than 10 PPIs in the IMEx database are provided. Further exploitation of the PPIDM DDI reservoir includes the inventory of possible partners of a protein of interest and characterization of protein interactions at the domain level in combination with other methods. The result is publicly available at http://ppidm.loria.fr/.

12.
Septin proteins bind GTP and heterooligomerize into filaments with conserved functions across a wide range of eukaryotes. Most septins hydrolyze GTP, altering the oligomerization interfaces; yet mutations designed to abolish nucleotide binding or hydrolysis by yeast septins perturb function only at high temperatures. Here, we apply an unbiased mutational approach to this problem. Mutations causing defects at high temperature mapped exclusively to the oligomerization interface encompassing the GTP-binding pocket, or to the pocket itself. Strikingly, cold-sensitive defects arise when certain of these same mutations are coexpressed with a wild-type allele, suggestive of a novel mode of dominance involving incompatibility between mutant and wild-type molecules at the septin–septin interfaces that mediate filament polymerization. A different cold-sensitive mutant harbors a substitution in an unstudied but highly conserved region of the septin Cdc12. A homologous domain in the small GTPase Ran allosterically regulates GTP-binding domain conformations, pointing to a possible new functional domain in some septins. Finally, we identify a mutation in septin Cdc3 that restores the high-temperature assembly competence of a mutant allele of septin Cdc10, likely by adopting a conformation more compatible with nucleotide-free Cdc10. Taken together, our findings demonstrate that GTP binding and hydrolysis promote, but are not required for, one-time events (presumably oligomerization-associated conformational changes) during assembly of the building blocks of septin filaments. Restrictive temperatures impose conformational constraints on mutant septin proteins, preventing new assembly and in certain cases destabilizing existing assemblies. These insights from yeast relate directly to disease-causing mutations in human septins.

13.
Rapidly increasing amounts of (physical and genetic) protein-protein interaction (PPI) data are produced by various high-throughput techniques, and interpretation of these data remains a major challenge. To gain insight into the organization and structure of the resulting large, complex networks formed by interacting molecules, we developed ModuleRole, a user-friendly web server tool that uses simulated annealing, a method based on node connectivity, to find modules in a PPI network and define the role of every node, and that produces files for visualization in Cytoscape and Pajek. For given proteins, it analyzes the PPI network from the BioGRID database, finds and visualizes the modules these proteins form, and then defines the role every node plays in this network, based on two topological parameters, the Participation Coefficient and the Z-score. This is the first program that provides an interactive and very friendly interface for biologists to find and visualize modules and roles of proteins in a PPI network. It can be tested online at http://www.bioinfo.org/modulerole/index.php, which is free and open to all users with no login requirement; demo data are provided in the “User Guide” under the Help menu. A non-server application of this program is considered for high-throughput data with more than 200 nodes or for the user’s own interaction datasets. Users are able to bookmark the web link to the result page and access it at a later time. As an interactive and highly customizable application, ModuleRole requires no expert knowledge of graph theory on the user side and can be used on both Linux and Windows systems, making it a very useful tool for biologists to analyze and visualize PPI networks from databases such as BioGRID.

Availability

ModuleRole is implemented in Java and C, and is freely available at http://www.bioinfo.org/modulerole/index.php. Supplementary information (user guide, demo data) is also available at this website. API for ModuleRole used for this program can be obtained upon request.
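The two topological parameters named in this abstract can be illustrated with a short sketch. It assumes the commonly used definitions (participation coefficient P = 1 − Σ_s (k_s/k)², and a z-score of a node's within-module degree relative to the other nodes of its module); whether ModuleRole computes them in exactly this way is an assumption, and the function name `node_roles` and the toy network are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def node_roles(edges, module_of):
    """Return {node: (participation_coefficient, within_module_z)}."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Number of links each node has to nodes of its own module.
    within = {
        n: sum(1 for m in neighbors[n] if module_of[m] == module_of[n])
        for n in module_of
    }
    members = defaultdict(list)
    for n, mod in module_of.items():
        members[mod].append(n)

    roles = {}
    for n in module_of:
        degree = len(neighbors[n])
        links_per_module = defaultdict(int)
        for m in neighbors[n]:
            links_per_module[module_of[m]] += 1
        participation = 0.0 if degree == 0 else 1.0 - sum(
            (c / degree) ** 2 for c in links_per_module.values()
        )
        peers = [within[m] for m in members[module_of[n]]]
        spread = pstdev(peers)
        z = 0.0 if spread == 0 else (within[n] - mean(peers)) / spread
        roles[n] = (participation, z)
    return roles


# Hypothetical toy network: a triangle (module 1) linked to a pair (module 2).
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
toy_modules = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2}
roles = node_roles(toy_edges, toy_modules)
print(roles["C"])
```

A node whose links all stay inside its own module gets a participation coefficient of 0, while a node whose links are spread evenly over several modules approaches 1.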

14.

Background

The exponential increase of published biomedical literature prompts the use of text mining tools to manage the information overload automatically. One of the most common applications is to mine protein-protein interactions (PPIs) from PubMed abstracts. Currently, most tools in mining PPIs from literature are using co-occurrence-based approaches or rule-based approaches. Hybrid methods (frame-based approaches) by combining these two methods may have better performance in predicting PPIs. However, the predicted PPIs from these methods are rarely evaluated by known PPI databases and co-occurred terms in Gene Ontology (GO) database.

Methodology/Principal Findings

Here we developed a web-based tool, PPI Finder, to mine human PPIs from PubMed abstracts based on their co-occurrences and interaction words, followed by evidence in human PPI databases and shared terms in the GO database. Only 28% of the co-occurring pairs in PubMed abstracts appeared in any of the commonly used human PPI databases (HPRD, BioGRID and BIND). On the other hand, of the known PPIs in HPRD, 69% showed co-occurrences in the literature, and 65% shared GO terms.

Conclusions

PPI Finder provides a useful tool for biologists to uncover potential novel PPIs. It is freely accessible at http://liweilab.genetics.ac.cn/tm/.
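The co-occurrence step described in this entry can be sketched as counting, for every pair of names from a dictionary, how many abstracts mention both. This is only an illustration of the general idea, not PPI Finder's implementation; the function name, the whole-token matching rule, the gene names, and the example texts are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(abstracts, protein_names):
    """Count abstracts in which each pair of names appears together."""
    names = {name.upper() for name in protein_names}
    counts = Counter()
    for text in abstracts:
        # Whole-token matching; real tools also handle synonyms and aliases.
        tokens = set(re.findall(r"[A-Za-z0-9]+", text.upper()))
        present = sorted(names & tokens)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Made-up texts and names, for illustration only.
example_abstracts = [
    "GENEA binds GENEB in vitro.",
    "GENEB and GENEA interact; GENEC was not detected.",
    "GENEC alone was studied.",
]
pair_counts = cooccurrence_counts(example_abstracts, ["GENEA", "GENEB", "GENEC"])
print(pair_counts.most_common())
```

Pairs found this way would still need the later filtering steps the abstract mentions, such as interaction words, database evidence, and shared GO terms.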

15.
Improvements in experimental techniques increasingly provide structural data relating to protein-protein interactions. Classification of structural details of protein-protein interactions can provide valuable insights for modeling and abstracting design principles. Here, we aim to cluster protein-protein interactions by their interface structures, and to exploit these clusters to obtain and study shared and distinct protein binding sites. We find that there are 22604 unique interface structures in the PDB. These unique interfaces, which provide a rich resource of structural data of protein-protein interactions, can be used for template-based docking. We test the specificity of these non-redundant unique interface structures by finding protein pairs which have multiple binding sites. We suggest that residues with more than 40% relative accessible surface area should be considered as surface residues in template-based docking studies. This comprehensive study of protein interface structures can serve as a resource for the community. The dataset can be accessed at http://prism.ccbb.ku.edu.tr/piface.

16.
Understanding complex networks of protein-protein interactions (PPIs) is one of the foremost challenges of the post-genomic era. Due to the recent advances in experimental bio-technology, including yeast-2-hybrid (Y2H), tandem affinity purification (TAP) and other high-throughput methods for protein-protein interaction (PPI) detection, huge amounts of PPI network data are becoming available. Of major concern, however, are the levels of noise and incompleteness. For example, for Y2H screens, it is thought that the false positive rate could be as high as 64%, and the false negative rate may range from 43% to 71%. TAP experiments are believed to have comparable levels of noise. We present a novel technique to assess the confidence levels of interactions in PPI networks obtained from experimental studies. We use it for predicting new interactions and thus for guiding future biological experiments. This technique is the first to use what is currently the best-fitting network model for PPI networks, geometric graphs. Our approach achieves specificity of 85% and sensitivity of 90%. We use it to assign confidence scores to physical protein-protein interactions in the human PPI network downloaded from BioGRID. Using our approach, we predict 251 interactions in the human PPI network, a statistically significant fraction of which correspond to protein pairs sharing common GO terms. Moreover, we validate a statistically significant portion of our predicted interactions in the HPRD database and the newer release of BioGRID. The data and Matlab code implementing the methods are freely available from the web site: http://www.kuchaev.com/Denoising.

17.
18.
Saccharomyces cerevisiae Spt6 protein is a conserved chromatin factor with several distinct functional domains, including a natively unstructured 30-residue N-terminal region that binds competitively with Spn1 or nucleosomes. To uncover physiological roles of these interactions, we isolated histone mutations that suppress defects caused by weakening Spt6:Spn1 binding with the spt6-F249K mutation. The strongest suppressor was H2A-N39K, which perturbs the point of contact between the two H2A-H2B dimers in an assembled nucleosome. Substantial suppression also was observed when the H2A-H2B interface with H3-H4 was altered, and many members of this class of mutations also suppressed a defect in another essential histone chaperone, FACT. Spt6 is best known as an H3-H4 chaperone, but we found that it binds with similar affinity to H2A-H2B or H3-H4. Like FACT, Spt6 is therefore capable of binding each of the individual components of a nucleosome, but unlike FACT, Spt6 did not produce endonuclease-sensitive reorganized nucleosomes and did not displace H2A-H2B dimers from nucleosomes. Spt6 and FACT therefore have distinct activities, but defects can be suppressed by overlapping histone mutations. We also found that Spt6 and FACT together are nearly as abundant as nucleosomes, with ∼24,000 Spt6 molecules, ∼42,000 FACT molecules, and ∼75,000 nucleosomes per cell. Histone mutations that destabilize interfaces within nucleosomes therefore reveal multiple spatial regions that have both common and distinct roles in the functions of these two essential and abundant histone chaperones. We discuss these observations in terms of different potential roles for chaperones in both promoting the assembly of nucleosomes and monitoring their quality.

19.
Telomere length is tightly regulated in cells that express telomerase. The Saccharomyces cerevisiae Ku heterodimer, a DNA end-binding complex, positively regulates telomere length in a telomerase-dependent manner. Ku associates with the telomerase RNA subunit TLC1, and this association is required for TLC1 nuclear retention. Ku–TLC1 interaction also impacts the cell-cycle-regulated association of the telomerase catalytic subunit Est2 to telomeres. The promotion of TLC1 nuclear localization and Est2 recruitment have been proposed to be the principal role of Ku in telomere length maintenance, but neither model has been directly tested. Here we study the impact of forced recruitment of Est2 to telomeres on telomere length in the absence of Ku’s ability to bind TLC1 or DNA ends. We show that tethering Est2 to telomeres does not promote efficient telomere elongation in the absence of Ku–TLC1 interaction or DNA end binding. Moreover, restoration of TLC1 nuclear localization, even when combined with Est2 recruitment, does not bypass the role of Ku. In contrast, forced recruitment of Est1, which has roles in telomerase recruitment and activation, to telomeres promotes efficient and progressive telomere elongation in the absence of Ku–TLC1 interaction, Ku DNA end binding, or Ku altogether. Ku associates with Est1 and Est2 in a TLC1-dependent manner and enhances Est1 recruitment to telomeres independently of Est2. Together, our results unexpectedly demonstrate that the principal role of Ku in telomere length maintenance is to promote the association of Est1 with telomeres, which may in turn allow for efficient recruitment and activation of the telomerase holoenzyme.

20.
A database providing information on mosquito specimens (Arthropoda: Diptera: Culicidae) collected in French Guiana is presented. Field collections were initiated in 2013 under the auspices of the CEnter for the study of Biodiversity in Amazonia (CEBA: http://www.labexceba.fr/en/). This study is part of an ongoing process aiming to understand the distribution of mosquitoes, including vector species, across French Guiana. Occurrences are recorded after each collecting trip in a database managed by the laboratory Evolution et Diversité Biologique (EDB), Toulouse, France. The dataset is updated monthly and is available online. Voucher specimens and their associated DNA are stored at the laboratory Ecologie des Forêts de Guyane (Ecofog), Kourou, French Guiana. The latest version of the dataset is accessible through EDB’s Integrated Publication Toolkit at http://130.120.204.55:8080/ipt/resource.do?r=mosquitoes_of_french_guiana or through the Global Biodiversity Information Facility data portal at http://www.gbif.org/dataset/5a8aa2ad-261c-4f61-a98e-26dd752fe1c5. It can also be viewed through the Guyanensis platform at http://guyanensis.ups-tlse.fr
