Similar articles
 Found 20 similar articles (search time: 15 ms)
1.

Background

High-throughput techniques are becoming widely used to study protein-protein interactions and protein complexes on a proteome-wide scale. Here we have explored the potential of these techniques to accurately determine the constituent proteins of complexes and their architecture within the complex.

Results

Two-dimensional representations of the 19S and 20S proteasome, mediator, and SAGA complexes were generated and overlaid with high-quality pairwise interaction data, core-module-attachment classifications from affinity purifications of complexes, and predicted domain-domain interactions. Pairwise interaction data could accurately determine the members of each complex, but were unexpectedly poor at deciphering the topology of proteins in complexes. Core and module data from affinity purification studies were less useful for accurately defining the member proteins of these complexes. However, these data gave strong information on the spatial proximity of many proteins. Predicted domain-domain interactions provided some insight into the topology of proteins within complexes, but were affected by a lack of available structural data for the co-activator complexes and the presence of shared domains in paralogous proteins.

Conclusion

The constituent proteins of complexes are likely to be determined with accuracy by combining data from high-throughput techniques. The topology of some proteins in the complexes can also be clearly inferred. Finally, we suggest strategies that can be employed to use high-throughput interaction data to define the membership and understand the architecture of proteins in novel complexes.

2.

Background  

Several protein-protein interaction studies have been performed for the yeast Saccharomyces cerevisiae using different high-throughput experimental techniques. All these results are collected in the BioGRID database, and the SGD database provides detailed annotation of the different proteins. Despite the value of BioGRID for studying protein-protein interactions, there is a need for manual curation of these interactions in order to remove false positives.

3.

Background

The problems of correlation and classification are long-standing in the fields of statistics and machine learning, and techniques have been developed to address these problems. We are now in the era of high-dimensional data, that is, data that can involve billions of variables. These data present new challenges. In particular, it is difficult to discover predictive variables when each variable has little marginal effect. An example concerns Genome-wide Association Studies (GWAS) datasets, which involve millions of single nucleotide polymorphisms (SNPs), where some of the SNPs interact epistatically to affect disease status. Towards determining these interacting SNPs, researchers developed techniques that addressed this specific problem. However, the problem is more general, and so these techniques are applicable to other problems concerning interactions. A difficulty with many of these techniques is that they do not distinguish whether a learned interaction is actually an interaction or whether it involves several variables with strong marginal effects.

Methodology/Findings

We address this problem using information gain and Bayesian network scoring. First, we identify candidate interactions by determining whether together variables provide more information than they do separately. Then we use Bayesian network scoring to see if a candidate interaction really is a likely model. Our strategy is called MBS-IGain. Using 100 simulated datasets and a real GWAS Alzheimer’s dataset, we investigated the performance of MBS-IGain.
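The first step described above, checking whether variables are more informative together than separately, can be illustrated with a minimal sketch. This is not the MBS-IGain implementation: the function names and the score "joint gain minus the sum of individual gains" are assumptions made for illustration, and the Bayesian network scoring step is not shown.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(features, labels):
    """Information gain of the labels given the joint value of the features.

    `features` is a list of tuples, one tuple per sample.
    """
    n = len(labels)
    groups = {}
    for x, y in zip(features, labels):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def interaction_gain(a, b, labels):
    """Hypothetical score: joint gain minus the sum of the individual gains."""
    joint = info_gain(list(zip(a, b)), labels)
    alone_a = info_gain([(v,) for v in a], labels)
    alone_b = info_gain([(v,) for v in b], labels)
    return joint - alone_a - alone_b


# Toy XOR data: neither variable is informative alone,
# but together they determine the label exactly.
a = [0, 0, 1, 1] * 5
b = [0, 1, 0, 1] * 5
y_xor = [u ^ v for u, v in zip(a, b)]
y_marginal = list(a)  # label driven by a single strong marginal effect

print(interaction_gain(a, b, y_xor))
print(interaction_gain(a, b, y_marginal))
```

On the XOR labels the score is one bit, while on the labels driven by a single variable it is zero, which is the distinction between a genuine interaction and a strong marginal effect that the abstract raises.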

Conclusions/Significance

When analyzing the simulated datasets, MBS-IGain substantially out-performed nine previous methods at locating interacting predictors, and at identifying interactions exactly. When analyzing the real Alzheimer’s dataset, we obtained new results and results that substantiated previous findings. We conclude that MBS-IGain is highly effective at finding interactions in high-dimensional datasets. This result is significant because we have increasingly abundant high-dimensional data in many domains, and to learn causes and perform prediction/classification using these data, we often must first identify interactions.

4.
5.

Background

Whether or not a protein's number of physical interactions with other proteins plays a role in determining its rate of evolution has been a contentious issue. A recent analysis suggested that the observed correlation between number of interactions and evolutionary rate may be due to experimental biases in high-throughput protein interaction data sets.

Discussion

The number of interactions per protein, as measured by some protein interaction data sets, shows no correlation with evolutionary rate. Other data sets, however, do reveal a relationship. Furthermore, even when experimental biases of these data sets are taken into account, a real correlation between number of interactions and evolutionary rate appears to exist.

Summary

A strong and significant correlation between a protein's number of interactions and evolutionary rate is apparent for interaction data from some studies. The extremely low agreement between different protein interaction data sets indicates that interaction data are still of low coverage and/or quality. These limitations may explain why some data sets reveal no correlation with evolutionary rates.

6.

Background

Multiple computational methods for predicting drug-target interactions have been developed to facilitate the drug discovery process. These methods use available data on known drug-target interactions to train classifiers with the purpose of predicting new undiscovered interactions. However, a key challenge regarding this data that has not yet been addressed by these methods, namely class imbalance, is potentially degrading the prediction performance. Class imbalance can be divided into two sub-problems. Firstly, the number of known interacting drug-target pairs is much smaller than that of non-interacting drug-target pairs. This imbalance ratio between interacting and non-interacting drug-target pairs is referred to as the between-class imbalance. Between-class imbalance degrades prediction performance due to the bias in prediction results towards the majority class (i.e. the non-interacting pairs), leading to more prediction errors in the minority class (i.e. the interacting pairs). Secondly, there are multiple types of drug-target interactions in the data with some types having relatively fewer members (or are less represented) than others. This variation in representation of the different interaction types leads to another kind of imbalance referred to as the within-class imbalance. In within-class imbalance, prediction results are biased towards the better represented interaction types, leading to more prediction errors in the less represented interaction types.
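To make the between-class imbalance concrete, the sketch below balances a toy set of drug-target pairs by random undersampling of the majority class. This is a generic baseline chosen for illustration, not the ensemble method proposed in the abstract; the function name `undersample` and the toy pair identifiers are hypothetical.

```python
import random


def undersample(pairs, labels, seed=0):
    """Randomly drop majority-class samples until both classes have equal size.

    Labels are 1 (interacting) or 0 (non-interacting).
    Returns a shuffled list of (pair, label) tuples.
    """
    rng = random.Random(seed)
    pos = [p for p, y in zip(pairs, labels) if y == 1]
    neg = [p for p, y in zip(pairs, labels) if y == 0]
    if len(pos) <= len(neg):
        minority, majority, minority_label = pos, neg, 1
    else:
        minority, majority, minority_label = neg, pos, 0
    kept = rng.sample(majority, len(minority))
    balanced = [(p, minority_label) for p in minority]
    balanced += [(p, 1 - minority_label) for p in kept]
    rng.shuffle(balanced)
    return balanced


# Toy data: 5 known interacting pairs among 100 candidate pairs.
pairs = [("drug%d" % i, "target%d" % (i % 7)) for i in range(100)]
labels = [1] * 5 + [0] * 95
balanced = undersample(pairs, labels)
print(len(balanced), sum(y for _, y in balanced))
```

Undersampling addresses only the between-class ratio. It does nothing for the within-class imbalance between interaction types, which would need a separate treatment.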

Results

We propose an ensemble learning method that incorporates techniques to address the issues of between-class imbalance and within-class imbalance. Experiments show that the proposed method improves results over 4 state-of-the-art methods. In addition, we simulated cases for new drugs and targets to see how our method would perform in predicting their interactions. New drugs and targets are those for which no prior interactions are known. Our method displayed satisfactory prediction performance and was able to predict many of the interactions successfully.

Conclusions

Our proposed method has improved the prediction performance over the existing work, thus proving the importance of addressing problems pertaining to class imbalance in the data.

7.

Background

The advent of various high-throughput experimental techniques for measuring molecular interactions has enabled the systematic study of biological interactions on a global scale. Since biological processes are carried out by elaborate collaborations of numerous molecules that give rise to a complex network of molecular interactions, comparative analysis of these biological networks can bring important insights into the functional organization and regulatory mechanisms of biological systems.

Methodology/Principal Findings

In this paper, we present an effective framework for identifying common interaction patterns in the biological networks of different organisms based on hidden Markov models (HMMs). Given two or more networks, our method efficiently finds the top matching paths in the respective networks, where the matching paths may contain a flexible number of consecutive insertions and deletions.

Conclusions/Significance

Based on several protein-protein interaction (PPI) networks obtained from the Database of Interacting Proteins (DIP) and other public databases, we demonstrate that our method is able to detect biologically significant pathways that are conserved across different organisms. Our algorithm has a polynomial complexity that grows linearly with the size of the aligned paths. This enables the search for very long paths with more than 10 nodes within a few minutes on a desktop computer. The software program that implements this algorithm is available upon request from the authors.

8.

Background  

In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP), generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes.

9.

Background

microRNAs act as regulators of gene expression by interacting with their gene targets. Current bioinformatics services, such as databases of validated miRNA-target interactions and prediction tools, usually provide interactions without any information about the tissue in which an interaction is more likely to appear or about the type of interaction, that is, whether it causes mRNA degradation or translation inhibition.

Results

In this work, we introduce miRTissue, a web application that combines validated miRNA-target interactions with statistical correlation among expression profiles of miRNAs, genes and proteins in 15 different human tissues. Validated interactions are taken from the miRTarBase database, while expression profiles are downloaded from The Cancer Genome Atlas repository. As a result, the service provides a tissue-specific characterisation of each couple of miRNA and gene together with its statistical significance (p-value). The inclusion of protein data also allows providing the type of interaction. Moreover, miRTissue offers several views for analysing interactions, focusing for example on the comparison between different cancer types or different tissue conditions. All the results are freely downloadable in the most common formats.

Conclusions

miRTissue fills a gap concerning current bioinformatics services related to miRNA-target interactions because it provides a tissue-specific context to each validated interaction and the type of interaction itself. miRTissue is easily browsable allowing the user to select miRNAs, genes, cancer types and tissue conditions. The results can be sorted according to p-values to immediately identify those interactions that are more likely to occur in a given tissue. miRTissue is available at http://tblab.pa.icar.cnr.it/mirtissue.html.

10.

Introduction

Statistical interactions are a common component of data analysis across a broad range of scientific disciplines. However, the statistical power to detect interactions is often undesirably low. One solution is to elevate the Type 1 error rate so that important interactions are not missed in a low power situation. To date, no study has quantified the effects of this practice on power in a linear regression model.

Methods

A Monte Carlo simulation study was performed. A continuous dependent variable was specified, along with three types of interactions: continuous variable by continuous variable; continuous by dichotomous; and dichotomous by dichotomous. For each of the three scenarios, the interaction effect sizes, sample sizes, and Type 1 error rate were varied, resulting in a total of 240 unique simulations.
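A minimal sketch of one such simulation, for the dichotomous-by-dichotomous scenario only, is given below. It is an illustration under stated assumptions, not the study's code: cells are balanced, main effects are zero, errors are standard normal, and the test uses a normal approximation rather than the exact t distribution. The function name `interaction_power` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def interaction_power(effect, n_per_cell, alpha=0.05, n_sims=1000, seed=1):
    """Estimate the power to detect a 2x2 interaction by Monte Carlo simulation.

    In a saturated 2x2 design the regression interaction coefficient equals
    the difference-in-differences of the four cell means.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(n_sims):
        cells = {}
        for x1 in (0, 1):
            for x2 in (0, 1):
                mean = effect * x1 * x2  # only the interaction term is non-zero
                cells[(x1, x2)] = [rng.gauss(mean, 1.0) for _ in range(n_per_cell)]
        means = {k: fmean(v) for k, v in cells.items()}
        estimate = (means[1, 1] - means[1, 0]) - (means[0, 1] - means[0, 0])
        # Pooled residual variance across the four cells.
        ss = sum((y - means[k]) ** 2 for k, v in cells.items() for y in v)
        pooled_var = ss / (4 * n_per_cell - 4)
        se = (4 * pooled_var / n_per_cell) ** 0.5
        if abs(estimate / se) > z_crit:
            rejections += 1
    return rejections / n_sims


# With no true interaction, the rejection rate tracks the chosen Type 1 error rate.
print(interaction_power(0.0, 50, alpha=0.05))
print(interaction_power(0.0, 50, alpha=0.20))
# With a large true interaction, power is already high at alpha = 0.05.
print(interaction_power(1.0, 50, alpha=0.05))
```

Calling the function over a grid of effect sizes, sample sizes, and alpha values would give a small-scale analogue of the simulation grid described above.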

Results

In general, power to detect the interaction effect was either so low or so high at α = 0.05 that raising the Type 1 error rate only served to increase the probability of including a spurious interaction in the model. A small number of scenarios were identified in which an elevated Type 1 error rate may be justified.

Conclusions

Routinely elevating Type 1 error rate when testing interaction effects is not an advisable practice. Researchers are best served by positing interaction effects a priori and accounting for them when conducting sample size calculations.

11.

Aims

To elucidate the mechanisms of the beneficial effects of below-ground root interactions in a maize plus legume intercropping system.

Methods

A pot experiment was conducted using root separation techniques.

Results

Root interaction and nitrogen fertilization increased chlorophyll content and improved plant characteristics of maize, and the effect of root interaction was significant (p<0.05). Compared to a full root separation treatment, no root separation increased the leaf and grain nitrogen contents, and economic and biological yields per maize plant by 9.3%, 6.0%, 14.0%, and 6.5%, respectively. Root interaction and nitrogen fertilization enhanced the numbers of bacteria, fungi, actinomycetes and Azotobacteria and the activities of urease, invertase, acid-phosphatase and protease in soil. Correlation analyses revealed that the quantity of microorganisms and the activity of the aforementioned enzymes were all positively correlated, in some cases significantly (p<0.05), with chlorophyll content, plant height and economic and biological yields per maize plant.

Conclusions

The findings demonstrate that root interactions are important in improving the soil micro-ecological environment, increasing microbial quantity and enzyme activity in soil, and enhancing crop yield.

12.

Background

Currently a huge amount of protein-protein interaction data is available from high throughput experimental methods. In a large network of protein-protein interactions, groups of proteins can be identified as functional clusters having related functions where a single protein can occur in multiple clusters. However experimental methods are error-prone and thus the interactions in a functional cluster may include false positives or there may be unreported interactions. Therefore correctly identifying a functional cluster of proteins requires the knowledge of whether any two proteins in a cluster interact, whether an interaction can exclude other interactions, or how strong the affinity between two interacting proteins is.

Methods

In the present work the yeast protein-protein interaction network is clustered using a spectral clustering method proposed by us in 2006 and the individual clusters are investigated for functional relationships among the member proteins. 3D structural models of the proteins in one cluster have been built – the protein structures are retrieved from the Protein Data Bank or predicted using a comparative modeling approach. A rigid body protein docking method (Cluspro) is used to predict the protein-protein interaction complexes. Binding sites of the docked complexes are characterized by their buried surface areas in the docked complexes, as a measure of the strength of an interaction.

Results

The clustering method yields functionally coherent clusters. Some of the interactions in a cluster exclude other interactions because of shared binding sites. New interactions among the interacting proteins are uncovered, and thus higher order protein complexes in the cluster are proposed. Also the relative stability of each of the protein complexes in the cluster is reported.

Conclusions

Although the methods used are computationally expensive and require human intervention and judgment, they can identify the interactions that could occur together or ones that are mutually exclusive. In addition indirect interactions through another intermediate protein can be identified. These theoretical predictions might be useful for crystallographers to select targets for the X-ray crystallographic determination of protein complexes.

13.

Background

While the analysis of unweighted biological webs as diverse as genetic, protein and metabolic networks has allowed spectacular insights into the inner workings of a cell, biological networks are not only determined by their static grid of links. In fact, we expect that the heterogeneity in the utilization of connections has a major impact on the organization of cellular activities as well.

Results

We consider a web of interactions between protein domains of the Protein Family database (PFAM), which are weighted by a probability score. We apply metrics that combine the static layout and the weights of the underlying interactions. We observe that unweighted measures as well as their weighted counterparts largely share the same trends in the underlying domain interaction network. However, we only find weak signals that weights and the static grid of interactions are connected entities. Therefore, assuming that a protein interaction is governed by a single domain interaction, we observe strong and significant correlations of the highest scoring domain interaction and the confidence of protein interactions in the underlying interactions of yeast and fly. Modeling an interaction between proteins whenever we find a high-scoring protein domain interaction, we obtain 1,428 protein interactions among 361 proteins in the human malaria parasite Plasmodium falciparum. Assessing their quality by a logistic regression method, we observe that increasing confidence of predicted interactions is accompanied by high-scoring domain interactions and elevated levels of functional similarity and evolutionary conservation.

Conclusion

Our results indicate that probability scores are randomly distributed, allowing us to treat the static grid and the weights of domain interactions as separate entities. In particular, this finding confirms earlier observations that a protein interaction is a matter of a single interaction event at the domain level. As an immediate application, we show a simple way to predict potential protein interactions by utilizing expectation scores of single domain interactions.

14.

Aim

Lichens are often regarded as paradigms of mutualistic relationships. However, it is still poorly known how lichen-forming fungi and their photosynthetic partners interact at a community scale. We explored the structure of fungus-alga networks of interactions in lichen communities along a latitudinal transect in continental Antarctica. We expect these interactions to be highly specialized and, consequently, networks with low nestedness degree and high modularity.

Location

Transantarctic Mountains from 76° S to 85° S (continental Antarctica).

Time Period

Present.

Major Taxa Studied

Seventy-seven species of lichen-forming fungi and their photobionts.

Methods

DNA barcoding of photobionts using nrITS data was conducted in 756 lichen specimens from five regions along the Transantarctic Mountains. We built interaction networks for each of the five studied regions and a metaweb for the whole area. We explored the specialization of both partners using the number of partners a species interacts with and the specialization parameter d'. Network architecture parameters such as nestedness, modularity and network specialization parameter H2' were studied in all networks and contrasted through null models. Finally, we measured interaction turnover along the latitudinal transect.

Results

We recovered a total of 842 interactions. Differences in specialization between partners were not statistically significant. Fungus-alga interaction networks showed high specialization and modularity, as well as low connectance and nestedness. Despite the large turnover in interactions occurring among regions, network parameters were not correlated with latitude.

Main Conclusions

The interaction networks established between fungi and algae in saxicolous lichen communities in continental Antarctica showed invariant properties along the latitudinal transect. Rewiring is an important driver of interaction turnover along the transect studied. Future work should answer whether the patterns observed in our study are prevalent in other regions with milder climates and in lichen communities on different substrates.

15.

Background  

Protein-protein interactions are a pivotal component of many biological processes and mediate a variety of functions. Knowing the tertiary structure of a protein complex is therefore essential for understanding the interaction mechanism. However, experimental techniques to solve the structure of the complex are often found to be difficult. To this end, computational protein-protein docking approaches can provide a useful alternative to address this issue. Prediction of docking conformations relies on methods that effectively capture shape features of the participating proteins while giving due consideration to conformational changes that may occur.

16.
Ma X, Tarone AM, Li W. PLoS ONE 2008, 3(4): e1922

Background

Synthetic lethal genetic interaction analysis has been successfully applied to predicting the functions of genes and their pathway identities. In the context of synthetic lethal interaction data alone, the global similarity of synthetic lethal interaction patterns between two genes is used to predict gene function. With physical interaction data, such as protein-protein interactions, the enrichment of physical interactions within subsets of genes and the enrichment of synthetic lethal interactions between those subsets of genes are used as an indication of compensatory pathways.

Result

In this paper, we propose a method of mapping genetically compensatory pathways from synthetic lethal interactions. Our method is designed to discover pairs of gene-sets in which synthetic lethal interactions are depleted among the genes in an individual set and where such gene-set pairs are connected by many synthetic lethal interactions. By its nature, our method could select compensatory pathway pairs that buffer the deleterious effect of the failure of either one, without the need of physical interaction data. By focusing on compensatory pathway pairs where genes in each individual pathway have a highly homogenous cellular function, we show that many cellular functions have genetically compensatory properties.

Conclusion

We conclude that synthetic lethal interaction data are a powerful source to map genetically compensatory pathways, especially in systems lacking physical interaction information, and that the cellular function network contains abundant compensatory properties.

17.

Background:

The biomedical literature is the primary information source for manual protein-protein interaction annotations. Text-mining systems have been implemented to extract binary protein interactions from articles, but a comprehensive comparison between the different techniques as well as with manual curation was missing.

Results:

We designed a community challenge, the BioCreative II protein-protein interaction (PPI) task, based on the main steps of a manual protein interaction annotation workflow. It was structured into four distinct subtasks related to: (a) detection of protein interaction-relevant articles; (b) extraction and normalization of protein interaction pairs; (c) retrieval of the interaction detection methods used; and (d) retrieval of actual text passages that provide evidence for protein interactions. A total of 26 teams submitted runs for at least one of the proposed subtasks. In the interaction article detection subtask, the top scoring team reached an F-score of 0.78. In the interaction pair extraction and mapping to SwissProt, a precision of 0.37 (with recall of 0.33) was obtained. For associating articles with an experimental interaction detection method, an F-score of 0.65 was achieved. As for the retrieval of the PPI passages best summarizing a given protein interaction in full-text articles, 19% of the submissions returned by one of the runs corresponded to curator-selected sentences. Curators extracted only the passages that best summarized a given interaction, implying that many of the automatically extracted ones could contain interaction information but did not correspond to the most informative sentences.
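For comparison with the F-scores reported for the other subtasks, the precision and recall quoted for interaction pair extraction can be combined into a balanced F-score (the harmonic mean of the two). The value computed below is derived here for illustration and is not a figure reported in the abstract; the function name `f_score` is hypothetical.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Interaction pair extraction subtask: precision 0.37, recall 0.33.
print(round(f_score(0.37, 0.33), 2))  # 0.35
```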

Conclusion:

The BioCreative II PPI task is the first attempt to compare the performance of text-mining tools specific for each of the basic steps of the PPI extraction pipeline. The challenges identified range from problems in full-text format conversion of articles to difficulties in detecting interactor protein pairs and then linking them to their database records. Some limitations were also encountered when using a single (and possibly incomplete) reference database for protein normalization or when limiting search for interactor proteins to co-occurrence within a single sentence, when a mention might span neighboring sentences. Finally, distinguishing between novel, experimentally verified interactions (annotation relevant) and previously known interactions adds additional complexity to these tasks.

18.

Background

Shape complementarity and non-covalent interactions are believed to drive protein-ligand interaction. To date, protein-protein, protein-DNA, and protein-RNA interactions have been systematically investigated, in contrast to interactions with small ligands. We investigate the role of covalent and non-covalent bonds in protein-small ligand interactions using a comprehensive dataset of 2,320 complexes.

Methodology and Principal Findings

We show that protein-ligand interactions are governed by different forces for different ligand types, i.e., protein-organic compound interactions are governed by hydrogen bonds, van der Waals contacts, and covalent bonds; protein-metal ion interactions are dominated by electrostatic force and coordination bonds; protein-anion interactions are established with electrostatic force, hydrogen bonds, and van der Waals contacts; and protein-inorganic cluster interactions are driven by coordination bonds. We extracted several frequently occurring atomic-level patterns concerning these interactions. For instance, 73% of investigated covalent bonds were summarized with just three patterns in which bonds are formed between thiol of Cys and carbon or sulfur atoms of ligands, and nitrogen of Lys and carbon of ligands. Similar patterns were found for the coordination bonds. Hydrogen bonds occur in 67% of protein-organic compound complexes and 66% of them are formed between NH- group of protein residues and oxygen atom of ligands. We quantify relative abundance of specific interaction types and discuss their characteristic features. The extracted protein-organic compound patterns are shown to complement and improve a geometric approach for prediction of binding sites.

Conclusions and Significance

We show that for a given type (group) of ligands and type of interaction force, the majority of protein-ligand interactions are repetitive and can be summarized with several simple atomic-level patterns. We summarize and analyze 10 frequently occurring interaction patterns that cover 56% of all considered complexes, and we show a practical application of the patterns that concerns interactions with organic compounds.

19.

Background

Identification of protein interaction networks has received considerable attention in the post-genomic era. The currently available biochemical approaches used to detect protein-protein interactions are all time and labour intensive. Consequently there is a growing need for the development of computational tools that are capable of effectively identifying such interactions.

Results

Here we explain the development and implementation of a novel Protein-Protein Interaction Prediction Engine termed PIPE. This tool is capable of predicting protein-protein interactions for any target pair of the yeast Saccharomyces cerevisiae proteins from their primary structure and without the need for any additional information or predictions about the proteins. PIPE showed a sensitivity of 61% for detecting any yeast protein interaction with 89% specificity and an overall accuracy of 75%. This rate of success is comparable to those associated with the most commonly used biochemical techniques. Using PIPE, we identified a novel interaction between YGL227W (vid30) and YMR135C (gid8) yeast proteins. This led us to the identification of a novel yeast complex that here we term vid30 complex (vid30c). The observed interaction was confirmed by tandem affinity purification (TAP tag), verifying the ability of PIPE to predict novel protein-protein interactions. We then used PIPE analysis to investigate the internal architecture of vid30c. It appeared from PIPE analysis that vid30c may consist of a core and a secondary component. Generation of yeast gene deletion strains combined with TAP tagging analysis indicated that the deletion of a member of the core component interfered with the formation of vid30c; however, deletion of a member of the secondary component had little effect (if any) on the formation of vid30c. Also, PIPE can be used to analyse yeast proteins for which TAP tagging fails, thereby allowing us to predict protein interactions that are not included in genome-wide yeast TAP tagging projects.
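The three performance figures quoted above are linked by a simple identity: overall accuracy is the average of sensitivity and specificity, weighted by the fraction of positive cases in the test set. The sketch below shows that the reported 75% is what a test set with equal numbers of interacting and non-interacting pairs would give. The balanced composition is an assumption made here, since the abstract does not state it, and the function name `accuracy` is hypothetical.

```python
def accuracy(sensitivity, specificity, positive_fraction=0.5):
    """Overall accuracy as the prevalence-weighted mean of sensitivity and specificity."""
    return sensitivity * positive_fraction + specificity * (1 - positive_fraction)


# Sensitivity 61%, specificity 89%, assumed balanced test set.
print(round(accuracy(0.61, 0.89), 2))  # 0.75
```

With a different assumed share of interacting pairs, the same sensitivity and specificity would give a different overall accuracy, for example about 0.86 when only 10% of the tested pairs interact.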

Conclusion

PIPE analysis can predict yeast protein-protein interactions. Also, PIPE analysis can be used to study the internal architecture of yeast protein complexes. The data also suggest that a finite set of short polypeptide signals seems to be responsible for the majority of the yeast protein-protein interactions.

20.

Background

The majority of experimentally verified molecular interaction and biological pathway data are present in the unstructured text of biomedical journal articles where they are inaccessible to computational methods. The Biomolecular interaction network database (BIND) seeks to capture these data in a machine-readable format. We hypothesized that the formidable task-size of backfilling the database could be reduced by using Support Vector Machine technology to first locate interaction information in the literature. We present an information extraction system that was designed to locate protein-protein interaction data in the literature and present these data to curators and the public for review and entry into BIND.

Results

Cross-validation estimated the support vector machine's test-set precision, accuracy and recall for classifying abstracts describing interaction information at 92%, 90% and 92%, respectively. We estimated that the system would be able to recall up to 60% of all non-high-throughput interactions present in another yeast-protein interaction database. Finally, this system was applied to a real-world curation problem and its use was found to reduce the task duration by 70%, thus saving 176 days.

Conclusions

Machine learning methods are useful as tools to direct interaction and pathway database back-filling; however, this potential can only be realized if these techniques are coupled with human review and entry into a factual database such as BIND. The PreBIND system described here is available to the public at http://bind.ca. Current capabilities allow searching for human, mouse and yeast protein-interaction information.
