Similar literature
 20 similar documents found (search time: 31 ms)
1.
Pathogen-host protein-protein interaction (PPI) plays an important role in revealing the underlying pathogenesis of viruses and bacteria. The need to rapidly map proteome-wide pathogen-host interactomes opens avenues for, and imposes burdens on, computational modeling. For Salmonella typhimurium, only 62 interactions with human proteins have been reported to date, and computational modeling based on such a small training set is prone to overfitting. In this work, we propose a multi-instance transfer learning method to reconstruct the proteome-wide Salmonella-human PPI networks, wherein the training data is augmented by homolog knowledge transfer in the form of independent homolog instances. We use AdaBoost instance reweighting to counteract the noise from homolog instances, and design three experimental settings to validate the assumption that the homolog instances are effective in addressing the problems of data scarcity and data unavailability. The experimental results show that the proposed method outperforms the existing models, and some predictions are validated by findings from recent literature. Lastly, we conduct gene ontology based clustering analysis of the predicted networks to provide insights into the pathogenesis of Salmonella.
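The AdaBoost instance reweighting mentioned in this abstract can be illustrated with one round of the standard AdaBoost weight update. This is a minimal sketch, not the authors' implementation: the function name `adaboost_reweight`, the +1/-1 label encoding and the toy data are assumptions, and the abstract does not say how target and homolog instances are treated differently.

```python
import math

def adaboost_reweight(weights, y_true, y_pred):
    """One round of the standard AdaBoost update (labels are +1 or -1).

    Returns (alpha, new_weights), where alpha is the vote of the weak
    classifier and new_weights are normalised to sum to 1.
    """
    total = sum(weights)
    # Weighted error of the weak classifier on the current distribution.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified instances (t * p == -1) gain weight, correct ones lose it.
    raw = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    z = sum(raw)
    return alpha, [w / z for w in raw]

# Toy example (hypothetical data): four instances, the last one misclassified.
alpha, new_w = adaboost_reweight([0.25] * 4, [1, 1, -1, -1], [1, 1, -1, 1])
```

After the update, the misclassified instances together hold half of the total weight, which is the usual property of this update rule.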

2.
Reconstruction of host-pathogen protein interaction networks is of great significance for revealing the underlying microbial pathogenesis. However, the current experimentally derived networks are generally small and should be augmented by computational methods for less-biased biological inference. From the point of view of computational modelling, data scarcity, data unavailability and negative data sampling are the three major problems in reconstructing host-pathogen protein interaction networks. In this work, we address these three concerns and propose a probability weighted ensemble transfer learning model for HIV-human protein interaction prediction (PWEN-TLM), in which a support vector machine (SVM) is adopted as the individual classifier of the ensemble model. In the model, data scarcity and data unavailability are tackled by homolog knowledge transfer. The importance of homolog knowledge is measured by the ROC-AUC metric of the individual classifiers, whose outputs are probability weighted to yield the final decision. In addition, we validate the assumption that homolog knowledge alone is sufficient to train a satisfactory model for host-pathogen protein interaction prediction. The model is thus more robust against data unavailability and imposes less demanding data constraints. As regards negative data construction, experiments show that exclusiveness of subcellular co-localized proteins is unbiased and more reliable than random sampling. Lastly, we analyze the overlap between the predictions of our model and those of existing models, and apply the model to the recognition of novel host-pathogen PPIs for further biological research.
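The idea of weighting each classifier's probability output by its ROC-AUC can be sketched as follows. The abstract does not give the exact weighting formula, so the normalised AUC-weighted average used here is an assumption, as are the function names `roc_auc` and `weighted_vote` and all numbers in the example.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 1 (interacting) or 0 (non-interacting); ties count as half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def weighted_vote(probabilities, aucs):
    """Combine per-classifier probabilities, weighting each by its AUC.

    Assumed form: sum(auc_i * p_i) / sum(auc_i).
    """
    return sum(a * p for a, p in zip(aucs, probabilities)) / sum(aucs)

# Hypothetical validation scores for one classifier.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])  # 3 of 4 pairs ordered correctly
# Hypothetical ensemble of two classifiers scoring one candidate protein pair.
final_probability = weighted_vote([0.8, 0.4], [0.9, 0.6])
```

A classifier trained on less informative homolog data would receive a lower AUC on validation data and therefore contribute less to the final decision.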

3.

Background

Our knowledge of global protein-protein interaction (PPI) networks in complex organisms such as humans is hindered by technical limitations of current methods.

Results

On the basis of short co-occurring polypeptide regions, we developed a tool called MP-PIPE capable of predicting a global human PPI network within 3 months. With a recall of 23% at a precision of 82.1%, we predicted 172,132 putative PPIs. We demonstrate the usefulness of these predictions through a range of experiments.

Conclusions

The speed and accuracy associated with MP-PIPE can make this a potential tool to study individual human PPI networks (from genomic sequences alone) for personalized medicine.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0383-1) contains supplementary material, which is available to authorized users.

4.

Background

Understanding protein complexes is important for understanding the science of cellular organization and function. Many computational methods have been developed to identify protein complexes from experimentally obtained protein-protein interaction (PPI) networks. However, interaction information obtained experimentally can be unreliable and incomplete. Reconstructing these PPI networks with PPI evidences from other sources can improve protein complex identification.

Results

We combined PPI information from 6 different sources and obtained a reconstructed PPI network for yeast through machine learning. Some popular protein complex identification methods were then applied to detect yeast protein complexes using the new PPI networks. Our evaluation indicates that protein complex identification algorithms using the reconstructed PPI network significantly outperform ones on experimentally verified PPI networks.

Conclusions

We conclude that incorporating PPI information from other sources can improve the effectiveness of protein complex identification.

5.

Background

Proteins dynamically interact with each other to perform their biological functions. The dynamic operations of protein interaction networks (PPI) are also reflected in the dynamic formations of protein complexes. Existing protein complex detection algorithms usually overlook the inherent temporal nature of protein interactions within PPI networks. Systematically analyzing the temporal protein complexes can not only improve the accuracy of protein complex detection, but also strengthen our biological knowledge on the dynamic protein assembly processes for cellular organization.

Results

In this study, we propose a novel computational method to predict temporal protein complexes. Particularly, we first construct a series of dynamic PPI networks by joint analysis of time-course gene expression data and protein interaction data. Then a Time Smooth Overlapping Complex Detection model (TS-OCD) has been proposed to detect temporal protein complexes from these dynamic PPI networks. TS-OCD can naturally capture the smoothness of networks between consecutive time points and detect overlapping protein complexes at each time point. Finally, a nonnegative matrix factorization based algorithm is introduced to merge those very similar temporal complexes across different time points.

Conclusions

Extensive experimental results demonstrate that the proposed method is more effective in detecting temporal protein complexes than state-of-the-art complex detection techniques.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-335) contains supplementary material, which is available to authorized users.

6.

Background

A goal of systems biology is to analyze large-scale molecular networks including gene expressions and protein-protein interactions, revealing the relationships between network structures and their biological functions. Dividing a protein-protein interaction (PPI) network into naturally grouped parts is an essential way to investigate the relationship between topology of networks and their functions. However, clear modular decomposition is often hard due to the heterogeneous or scale-free properties of PPI networks.

Methodology/Principal Findings

To address this problem, we propose a diffusion model-based spectral clustering algorithm, which analytically solves the cluster structure of PPI networks as a problem of random walks in the diffusion process in them. To cope with the heterogeneity of the networks, the power factor is introduced to adjust the diffusion matrix by weighting the transition (adjacency) matrix according to a node degree matrix. This algorithm is named adjustable diffusion matrix-based spectral clustering (ADMSC). To demonstrate the feasibility of ADMSC, we apply it to decomposition of a yeast PPI network, identifying biologically significant clusters with approximately equal size. Compared with other established algorithms, ADMSC facilitates clear and fast decomposition of PPI networks.

Conclusions/Significance

ADMSC is proposed by introducing the power factor that adjusts the diffusion matrix to the heterogeneity of the PPI networks. ADMSC effectively partitions PPI networks into biologically significant clusters with almost equal sizes, while being very fast, robust and appealingly simple.
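One way to picture the power-factor adjustment described in this abstract is to divide each row of the adjacency matrix by the node degree raised to an adjustable power. The exact ADMSC formula is not given in the abstract, so the form D^(-power) A below is an assumption, as are the function name `adjusted_diffusion_matrix` and the toy network. The sketch covers only the matrix construction, not the spectral clustering step.

```python
def adjusted_diffusion_matrix(adjacency, power):
    """Scale each row of the adjacency matrix by degree ** (-power).

    Assumed form, not taken from the paper:
      power = 1 gives the ordinary random-walk transition matrix,
      power = 0 leaves the adjacency matrix unchanged.
    Rows of isolated nodes (degree 0) are left as zeros.
    """
    degrees = [sum(row) for row in adjacency]
    return [
        [(a / d ** power) if d > 0 else 0.0 for a in row]
        for row, d in zip(adjacency, degrees)
    ]

# Toy path network of three proteins: 0 - 1 - 2 (hypothetical data).
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
P = adjusted_diffusion_matrix(A, 1.0)     # each row sums to 1
half = adjusted_diffusion_matrix(A, 0.5)  # intermediate degree correction
```

Intermediate powers reduce the influence of high-degree hub nodes without fully normalising them, which is one plausible reading of how the method copes with network heterogeneity.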

7.
8.

Background

Experimental methods for the identification of essential proteins are always costly, time-consuming, and laborious. It is a challenging task to find protein essentiality only through experiments. With the development of high-throughput technologies, a vast amount of protein-protein interactions are available, which enables the identification of essential proteins at the network level. Many computational methods for this task have been proposed based on the topological properties of protein-protein interaction (PPI) networks. However, the currently available PPI networks for each species are incomplete (false negatives) and very noisy (many false positives), and network topology-based centrality measures are often very sensitive to such noise. Therefore, exploring robust methods for identifying essential proteins would be of great value.

Method

In this paper, a new essential protein discovery method, named CoEWC (Co-Expression Weighted by Clustering coefficient), has been proposed. CoEWC is based on the integration of the topological properties of PPI network and the co-expression of interacting proteins. The aim of CoEWC is to capture the common features of essential proteins in both date hubs and party hubs. The performance of CoEWC is validated based on the PPI network of Saccharomyces cerevisiae. Experimental results show that CoEWC significantly outperforms the classical centrality measures, and that it also outperforms PeC, a newly proposed essential protein discovery method which outperforms 15 other centrality measures on the PPI network of Saccharomyces cerevisiae. Especially, when predicting no more than 500 proteins, even more than 50% improvements are obtained by CoEWC over degree centrality (DC), a better centrality measure for identifying protein essentiality.

Conclusions

We demonstrate that a more robust essential protein discovery method can be developed by integrating the topological properties of the PPI network and the co-expression of interacting proteins. The proposed centrality measure, CoEWC, is effective for the discovery of essential proteins.

9.

Background

HTLV-1 is a retrovirus that causes lymphoproliferative disorders and inflammatory and degenerative diseases of the central nervous system in humans. The prevalence of this infection is high in parts of Brazil and there is a general lack of public health care programs. As a consequence, official data on the transmission routes of this virus are scarce.

Objective

To demonstrate familial aggregation of HTLV infections in the metropolitan region of Belém, Pará, Brazil.

Method

A cross-sectional study involving 85 HTLV carriers treated at an outpatient clinic and other family members. The subjects were tested by ELISA and molecular methods between February 2007 and December 2010.

Results

The prevalence of HTLV was 43.5% (37/85) for families and 25.6% (58/227) for the family members tested (95% CI: 1.33 to 3.79, P = 0.0033). Sexual and vertical transmission was likely in 38.3% (23/60) and 20.4% (29/142) of pairs, respectively (95% CI: 1.25 to 4.69, P = 0.0130). Positivity was 51.3% (20/39) and 14.3% (3/21) in wives and husbands, respectively (95% CI: 0.04 to 0.63, P = 0.0057). By age group, seropositivity was 8.0% (7/88) in subjects <30 years of age and 36.7% (51/139) in those of over 30 years (95% CI: 0.06 to 0.34, P<0.0001). Positivity was 24.1% (7/29) in the children of patients infected with HTLV-2, as against only 5.8% (4/69) of those infected with HTLV-1 (95% CI: 0.05 to 0.72, P = 0.0143).

Conclusion

The results of this study indicate the existence of familial aggregations of HTLV characterized by a higher prevalence of infection among wives and subjects older than 30 years. Horizontal transmission between spouses was more frequent than vertical transmission. The higher rate of infection in children of HTLV-2 carriers suggests an increase in the prevalence of this virus type in the metropolitan region of Belém.
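The abstract does not state how its 95% confidence intervals were computed. As a hedged illustration, the sketch below computes an odds ratio with a Woolf (logit) interval from the counts given in the Results. The function name `odds_ratio_ci` is an assumption, and reading the reported intervals as odds-ratio intervals is an inference from the fact that this calculation reproduces the first two of them, not something the abstract confirms.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (logit) confidence interval for a 2x2 table.

    a, b: positives and negatives in group 1
    c, d: positives and negatives in group 2
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    low = math.exp(math.log(odds_ratio) - z * se)
    high = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, low, high

# Families: 37 positive of 85; family members: 58 positive of 227.
or_fam, low_fam, high_fam = odds_ratio_ci(37, 85 - 37, 58, 227 - 58)
# Sexual transmission: 23 of 60 pairs; vertical transmission: 29 of 142 pairs.
or_tr, low_tr, high_tr = odds_ratio_ci(23, 60 - 23, 29, 142 - 29)
```

Rounded to two decimals, these intervals come out as 1.33 to 3.79 and 1.25 to 4.69, matching the values reported for those two comparisons.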

10.

Background

Network inference deals with the reconstruction of molecular networks from experimental data. Given N molecular species, the challenge is to find the underlying network. Due to data limitations, this typically is an ill-posed problem, and requires the integration of prior biological knowledge or strong regularization. We here focus on the situation when time-resolved measurements of a system’s response after systematic perturbations are available.

Results

We present a novel method to infer signaling networks from time-course perturbation data. We utilize dynamic Bayesian networks with probabilistic Boolean threshold functions to describe protein activation. The model posterior distribution is analyzed using evolutionary MCMC sampling and subsequent clustering, resulting in probability distributions over alternative networks. We evaluate our method on simulated data, and study its performance with respect to data set size and levels of noise. We then use our method to study EGF-mediated signaling in the ERBB pathway.

Conclusions

Dynamic Probabilistic Threshold Networks is a new method to infer signaling networks from time-series perturbation data. It exploits the dynamic response of a system after external perturbation for network reconstruction. On simulated data, we show that the approach outperforms current state of the art methods. On the ERBB data, our approach recovers a significant fraction of the known interactions, and predicts novel mechanisms in the ERBB pathway.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-250) contains supplementary material, which is available to authorized users.

11.
12.
13.

Background

Parkinson's Disease (PD) is one of the most prevailing neurodegenerative diseases. Improving diagnoses and treatments of this disease is essential, as currently there exists no cure for this disease. Microarray and proteomics data have revealed abnormal expression of several genes and proteins responsible for PD. Nevertheless, few studies have been reported involving PD-specific protein-protein interactions.

Results

Microarray based gene expression data and protein-protein interaction (PPI) databases were combined to construct the PPI networks of differentially expressed (DE) genes in post mortem brain tissue samples of patients with Parkinson's disease. Samples were collected from the substantia nigra and the frontal cerebral cortex. From the microarray data, two sets of DE genes were selected by 2-tailed t-tests and Significance Analysis of Microarrays (SAM), run separately to construct two Query-Query PPI (QQPPI) networks. Several topological properties of these networks were studied. Nodes with High Connectivity (hubs) and High Betweenness Low Connectivity (bottlenecks) were identified as the most significant nodes of the networks. Three- and four-cliques were identified in the QQPPI networks. These cliques contain most of the topologically significant nodes of the networks, which form core functional modules consisting of tightly knit sub-networks. Thirty-seven hitherto unreported PD markers were identified based on their topological significance in the networks. Of these 37 markers, eight were significantly involved in the core functional modules and showed significant change in co-expression levels. Four (ARRB2, STX1A, TFRC and MARCKS) out of the 37 markers were found to be associated with several neurotransmitters including dopamine.

Conclusion

This study represents a novel investigation of the PPI networks for PD, a complex disease. The 37 proteins identified in our study can be considered PD network biomarkers. These network biomarkers may serve as potential therapeutic targets for the development of PD applications.
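The hub and three-clique analysis described in this abstract can be sketched on a toy network. This is only an illustration: the function names `node_degrees` and `three_cliques`, the node labels and the edges are hypothetical, and the authors' actual tools and thresholds are not stated in the abstract.

```python
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) interaction pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def node_degrees(edges):
    """Number of interaction partners per node; high values indicate hubs."""
    return {node: len(nbrs) for node, nbrs in build_adjacency(edges).items()}

def three_cliques(edges):
    """All triples of nodes in which every pair interacts."""
    adj = build_adjacency(edges)
    return [
        trio
        for trio in combinations(sorted(adj), 3)
        if all(b in adj[a] for a, b in combinations(trio, 2))
    ]

# Hypothetical toy network: a triangle A-B-C with a pendant node D on C.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
degrees = node_degrees(edges)    # C has the most partners
cliques = three_cliques(edges)   # the single triangle A, B, C
```

The brute-force enumeration over all triples is adequate for small query networks; larger networks would call for a dedicated clique-finding algorithm.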

14.

Background

Bacillus anthracis, Francisella tularensis, and Yersinia pestis are bacterial pathogens that can cause anthrax, lethal acute pneumonic disease, and bubonic plague, respectively, and are listed as NIAID Category A priority pathogens for possible use as biological weapons. However, the interactions between human proteins and proteins in these bacteria remain poorly characterized leading to an incomplete understanding of their pathogenesis and mechanisms of immune evasion.

Methodology

In this study, we used a high-throughput yeast two-hybrid assay to identify physical interactions between human proteins and proteins from each of these three pathogens. From more than 250,000 screens performed, we identified 3,073 human-B. anthracis, 1,383 human-F. tularensis, and 4,059 human-Y. pestis protein-protein interactions including interactions involving 304 B. anthracis, 52 F. tularensis, and 330 Y. pestis proteins that are uncharacterized. Computational analysis revealed that pathogen proteins preferentially interact with human proteins that are hubs and bottlenecks in the human PPI network. In addition, we computed modules of human-pathogen PPIs that are conserved amongst the three networks. Functionally, such conserved modules reveal commonalities between how the different pathogens interact with crucial host pathways involved in inflammation and immunity.

Significance

These data constitute the first extensive protein interaction networks constructed for bacterial pathogens and their human hosts. This study provides novel insights into host-pathogen interactions.

15.

Background

Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of ‘omics’ data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis.

Results

We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions.

Conclusions

Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0386-y) contains supplementary material, which is available to authorized users.

16.
17.

Background

Post-translational modifications (PTMs) impact on the stability, cellular location, and function of a protein thereby achieving a greater functional diversity of the proteome. To fully appreciate how PTMs modulate signaling networks, proteome-wide studies are necessary. However, the evaluation of PTMs on a proteome-wide scale has proven to be technically difficult. To facilitate these analyses we have developed a protein microarray-based assay that is capable of profiling PTM activities in complex biological mixtures such as whole-cell extracts and pathological specimens.

Methodology/Principal Findings

In our assay, protein microarrays serve as a substrate platform for in vitro enzymatic reactions in which a recombinant ligase, or extracts prepared from whole cells or a pathological specimen is overlaid. The reactions include labeled modifiers (e.g., ubiquitin, SUMO1, or NEDD8), ATP regenerating system, and other required components (depending on the assay) that support the conjugation of the modifier. In this report, we apply this methodology to profile three molecularly complex PTMs (ubiquitylation, SUMOylation, and NEDDylation) using purified ligase enzymes and extracts prepared from cultured cell lines and pathological specimens. We further validate this approach by confirming the in vivo modification of several novel PTM substrates identified by our assay.

Conclusions/Significance

This methodology offers several advantages over currently used PTM detection methods including ease of use, rapidity, scale, and sample source diversity. Furthermore, by allowing for the intrinsic enzymatic activities of cell populations or pathological states to be directly compared, this methodology could have widespread applications for the study of PTMs in human diseases and has the potential to be directly applied to most, if not all, basic PTM research.

18.

Introduction

Emerging epidemiological evidence suggests that proton pump inhibitor (PPI) acid-suppression therapy is associated with an increased risk of Clostridium difficile infection (CDI).

Methods

Ovid MEDLINE, EMBASE, ISI Web of Science, and Scopus were searched from 1990 to January 2012 for analytical studies that reported an adjusted effect estimate of the association between PPI use and CDI. We performed random-effect meta-analyses. We used the GRADE framework to interpret the findings.

Results

We identified 47 eligible citations (37 case-control and 14 cohort studies) with corresponding 51 effect estimates. The pooled OR was 1.65, 95% CI (1.47, 1.85), I² = 89.9%, with evidence of publication bias suggested by a contour funnel plot. A novel regression based method was used to adjust for publication bias and resulted in an adjusted pooled OR of 1.51 (95% CI, 1.26–1.83). In a speculative analysis that assumes that this association is based on causality, and based on published baseline CDI incidence, the risk of CDI would be very low in the general population taking PPIs with an estimated NNH of 3925 at 1 year.

Conclusions

In this rigorously conducted systematic review and meta-analysis, we found very low quality evidence (GRADE class) for an association between PPI use and CDI that does not support a cause-effect relationship.
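The number needed to harm (NNH) quoted in the Results follows from an odds ratio and a baseline risk. The sketch below shows that general calculation only. The baseline incidence used by the authors is not given in the abstract, so the value 0.001 is purely hypothetical and the result is not expected to reproduce the reported NNH of 3925. The function names are also assumptions.

```python
def risk_from_odds_ratio(baseline_risk, odds_ratio):
    """Risk in the exposed group implied by an odds ratio and a baseline risk."""
    exposed_odds = odds_ratio * baseline_risk / (1.0 - baseline_risk)
    return exposed_odds / (1.0 + exposed_odds)

def number_needed_to_harm(baseline_risk, odds_ratio):
    """NNH = 1 / absolute risk increase (meaningful for odds_ratio > 1)."""
    exposed_risk = risk_from_odds_ratio(baseline_risk, odds_ratio)
    return 1.0 / (exposed_risk - baseline_risk)

# Pooled OR from the abstract with a HYPOTHETICAL baseline risk of 0.1% per year.
nnh = number_needed_to_harm(0.001, 1.65)
```

Because the absolute risk increase scales with the baseline risk, a rarer outcome gives a larger NNH for the same odds ratio, which is why a clearly elevated odds ratio can still correspond to a very low absolute risk in the general population.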

19.

Objective

Besides reducing gastric acid secretion, proton pump inhibitors (PPIs) suppress Th2-cytokine-stimulated expression of an eosinophil chemoattractant (eotaxin-3) by esophageal epithelial cells through acid-independent, anti-inflammatory mechanisms. To explore acid-inhibitory and acid-independent, anti-inflammatory PPI effects in reducing esophageal eosinophilia, we studied eotaxin-3 expression by the proximal and distal esophagus of children with esophageal eosinophilia before and after PPI therapy. In vitro, we studied acid and bile salt effects on IL-13-stimulated eotaxin-3 expression by esophageal epithelial cells.

Design

Among 264 children with esophageal eosinophilia seen at a tertiary pediatric hospital from 2008 through 2012, we identified 10 with esophageal biopsies before and after PPI treatment alone. We correlated epithelial cell eotaxin-3 immunostaining with eosinophil numbers in those biopsies. In vitro, we measured eotaxin-3 protein secretion by esophageal squamous cells stimulated with IL-13 and exposed to acid and/or bile salt media, with or without omeprazole.

Results

There was strong correlation between peak eosinophil numbers and peak eotaxin-3-positive epithelial cell numbers in esophageal biopsies. Eotaxin-3 expression decreased significantly with PPIs only in the proximal esophagus. In esophageal cells, exposure to acid-bile salt medium significantly suppressed IL-13-induced eotaxin-3 secretion; omeprazole added to the acid-bile salt medium further suppressed that eotaxin-3 secretion, but not as profoundly as at pH-neutral conditions.

Conclusion

In children with esophageal eosinophilia, PPIs significantly decrease eotaxin-3 expression in the proximal but not the distal esophagus. In esophageal squamous cells, acid and bile salts decrease Th2 cytokine-stimulated eotaxin-3 secretion profoundly, possibly explaining the disparate PPI effects on the proximal and distal esophagus. In the distal esophagus, where acid reflux is greatest, a PPI-induced reduction in acid reflux (an effect that could increase eotaxin-3 secretion induced by Th2 cytokines) might mask the acid-independent, anti-inflammatory PPI effect of decreasing cytokine-stimulated eotaxin-3 secretion.

20.

Background

Studies of functional modules in a Protein-Protein Interaction (PPI) network contribute greatly to the understanding of biological mechanisms. With the development of computing science, computational approaches have played an important role in detecting functional modules.

Results

We present a new approach using multi-agent evolution for detection of functional modules in PPI networks. The proposed approach consists of two stages: the solution construction for agents in a population and the evolutionary process of computational agents in a lattice environment, where each agent corresponds to a candidate solution to the detection problem of functional modules in a PPI network. First, the approach utilizes a connection-based encoding scheme to model an agent, and employs a random-walk behavior merged topological characteristics with functional information to construct a solution. Next, it applies several evolutionary operators, i.e., competition, crossover, and mutation, to realize information exchange among agents as well as solution evolution. Systematic experiments have been conducted on three benchmark testing sets of yeast networks. Experimental results show that the approach is more effective compared to several other existing algorithms.

Conclusions

The algorithm achieves outstanding recall, F-measure, sensitivity and accuracy while remaining competitive on other measures, so it can be applied to biological studies that require high accuracy.
