首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.

Background

Human T-cell leukemia viruses (HTLV) tend to induce some fatal human diseases like Adult T-cell Leukemia (ATL) by targeting human T lymphocytes. To indentify the protein-protein interactions (PPI) between HTLV viruses and Homo sapiens is one of the significant approaches to reveal the underlying mechanism of HTLV infection and host defence. At present, as biological experiments are labor-intensive and expensive, the identified part of the HTLV-human PPI networks is rather small. Although recent years have witnessed much progress in computational modeling for reconstructing pathogen-host PPI networks, data scarcity and data unavailability are two major challenges to be effectively addressed. To our knowledge, no computational method for proteome-wide HTLV-human PPI networks reconstruction has been reported.

Results

In this work we develop Multi-instance Adaboost method to conduct homolog knowledge transfer for computationally reconstructing proteome-wide HTLV-human PPI networks. In this method, the homolog knowledge in the form of gene ontology (GO) is treated as auxiliary homolog instance to address the problems of data scarcity and data unavailability, while the potential negative knowledge transfer is automatically attenuated by AdaBoost instance reweighting. The cross validation experiments show that the homolog knowledge transfer in the form of independent homolog instances can effectively enrich the feature information and substitute for the missing GO information. Moreover, the independent tests show that the method can validate 70.3% of the recently curated interactions, significantly exceeding the 2.1% recognition rate by the HT-Y2H experiment. We have used the method to reconstruct the proteome-wide HTLV-human PPI networks and further conducted gene ontology based clustering of the predicted networks for further biomedical research. The gene ontology based clustering analysis of the predictions provides much biological insight into the pathogenesis of HTLV retroviruses.

Conclusions

The Multi-instance AdaBoost method can effectively address the problems of data scarcity and data unavailability for the proteome-wide HTLV-human PPI interaction networks reconstruction. The gene ontology based clustering analysis of the predictions reveals some important signaling pathways and biological modules that HTLV retroviruses are likely to target.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-245) contains supplementary material, which is available to authorized users.  相似文献   

2.
Pathogen access to host nutrients in infected tissues is fundamental for pathogen growth and virulence, disease progression, and infection control. However, our understanding of this crucial process is still rather limited because of experimental and conceptual challenges. Here, we used proteomics, microbial genetics, competitive infections, and computational approaches to obtain a comprehensive overview of Salmonella nutrition and growth in a mouse typhoid fever model. The data revealed that Salmonella accessed an unexpectedly diverse set of at least 31 different host nutrients in infected tissues but the individual nutrients were available in only scarce amounts. Salmonella adapted to this situation by expressing versatile catabolic pathways to simultaneously exploit multiple host nutrients. A genome-scale computational model of Salmonella in vivo metabolism based on these data was fully consistent with independent large-scale experimental data on Salmonella enzyme quantities, and correctly predicted 92% of 738 reported experimental mutant virulence phenotypes, suggesting that our analysis provided a comprehensive overview of host nutrient supply, Salmonella metabolism, and Salmonella growth during infection. Comparison of metabolic networks of other pathogens suggested that complex host/pathogen nutritional interfaces are a common feature underlying many infectious diseases.  相似文献   

3.
A proteome-wide protein-protein interaction (PPI) network of Methanobrevibacter ruminantium M1 (MRU), a predominant rumen methanogen, was constructed from its metabolic genes using a gene neighborhood algorithm and then compared with closely related rumen methanogens Using proteome-wide PPI approach, we constructed network encompassed 2194 edges and 637 nodes interacting with 634 genes. Network quality and robustness of functional modules were assessed with gene ontology terms. A structure-function-metabolism mapping for each protein has been carried out with efforts to extract experimental PPI concomitant information from the literature. The results of our study revealed that some topological properties of its network were robust for sharing homologous protein interactions across heterotrophic and hydrogenotrophic methanogens. MRU proteome has shown to establish many PPI sub-networks for associated metabolic subsystems required to survive in the rumen environment. MRU genome found to share interacting proteins from its PPI network involved in specific metabolic subsystems distinct to heterotrophic and hydrogenotrophic methanogens. Across these proteomes, the interacting proteins from differential PPI networks were shared in common for the biosynthesis of amino acids, nucleosides, and nucleotides and energy metabolism in which more fractions of protein pairs shared with Methanosarcina acetivorans. Our comparative study expedites our knowledge to understand a complex proteome network associated with typical metabolic subsystems of MRU and to improve its genome-scale reconstruction in the future.  相似文献   

4.
Reconstruction of host-pathogen protein interaction networks is of great significance to reveal the underlying microbic pathogenesis. However, the current experimentally-derived networks are generally small and should be augmented by computational methods for less-biased biological inference. From the point of view of computational modelling, data scarcity, data unavailability and negative data sampling are the three major problems for host-pathogen protein interaction networks reconstruction. In this work, we are motivated to address the three concerns and propose a probability weighted ensemble transfer learning model for HIV-human protein interaction prediction (PWEN-TLM), where support vector machine (SVM) is adopted as the individual classifier of the ensemble model. In the model, data scarcity and data unavailability are tackled by homolog knowledge transfer. The importance of homolog knowledge is measured by the ROC-AUC metric of the individual classifiers, whose outputs are probability weighted to yield the final decision. In addition, we further validate the assumption that only the homolog knowledge is sufficient to train a satisfactory model for host-pathogen protein interaction prediction. Thus the model is more robust against data unavailability with less demanding data constraint. As regards with negative data construction, experiments show that exclusiveness of subcellular co-localized proteins is unbiased and more reliable than random sampling. Last, we conduct analysis of overlapped predictions between our model and the existing models, and apply the model to novel host-pathogen PPIs recognition for further biological research.  相似文献   

5.
Ou-Yang  Le  Yan  Hong  Zhang  Xiao-Fei 《BMC bioinformatics》2017,18(13):463-34

Background

The accurate identification of protein complexes is important for the understanding of cellular organization. Up to now, computational methods for protein complex detection are mostly focus on mining clusters from protein-protein interaction (PPI) networks. However, PPI data collected by high-throughput experimental techniques are known to be quite noisy. It is hard to achieve reliable prediction results by simply applying computational methods on PPI data. Behind protein interactions, there are protein domains that interact with each other. Therefore, based on domain-protein associations, the joint analysis of PPIs and domain-domain interactions (DDI) has the potential to obtain better performance in protein complex detection. As traditional computational methods are designed to detect protein complexes from a single PPI network, it is necessary to design a new algorithm that could effectively utilize the information inherent in multiple heterogeneous networks.

Results

In this paper, we introduce a novel multi-network clustering algorithm to detect protein complexes from multiple heterogeneous networks. Unlike existing protein complex identification algorithms that focus on the analysis of a single PPI network, our model can jointly exploit the information inherent in PPI and DDI data to achieve more reliable prediction results. Extensive experiment results on real-world data sets demonstrate that our method can predict protein complexes more accurately than other state-of-the-art protein complex identification algorithms.

Conclusions

In this work, we demonstrate that the joint analysis of PPI network and DDI network can help to improve the accuracy of protein complex detection.
  相似文献   

6.

Background  

Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI) data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype.  相似文献   

7.
Essential proteins are those that are indispensable to cellular survival and development. Existing methods for essential protein identification generally rely on knock-out experiments and/or the relative density of their interactions (edges) with other proteins in a Protein-Protein Interaction (PPI) network. Here, we present a computational method, called EW, to first rank protein-protein interactions in terms of their Edge Weights, and then identify sub-PPI-networks consisting of only the highly-ranked edges and predict their proteins as essential proteins. We have applied this method to publicly-available PPI data on Saccharomyces cerevisiae (Yeast) and Escherichia coli (E. coli) for essential protein identification, and demonstrated that EW achieves better performance than the state-of-the-art methods in terms of the precision-recall and Jackknife measures. The highly-ranked protein-protein interactions by our prediction tend to be biologically significant in both the Yeast and E. coli PPI networks. Further analyses on systematically perturbed Yeast and E. coli PPI networks through randomly deleting edges demonstrate that the proposed method is robust and the top-ranked edges tend to be more associated with known essential proteins than the lowly-ranked edges.  相似文献   

8.

Background

Essential proteins play an indispensable role in the cellular survival and development. There have been a series of biological experimental methods for finding essential proteins; however they are time-consuming, expensive and inefficient. In order to overcome the shortcomings of biological experimental methods, many computational methods have been proposed to predict essential proteins. The computational methods can be roughly divided into two categories, the topology-based methods and the sequence-based ones. The former use the topological features of protein-protein interaction (PPI) networks while the latter use the sequence features of proteins to predict essential proteins. Nevertheless, it is still challenging to improve the prediction accuracy of the computational methods.

Results

Comparing with nonessential proteins, essential proteins appear more frequently in certain subcellular locations and their evolution more conservative. By integrating the information of subcellular localization, orthologous proteins and PPI networks, we propose a novel essential protein prediction method, named SON, in this study. The experimental results on S.cerevisiae data show that the prediction accuracy of SON clearly exceeds that of nine competing methods: DC, BC, IC, CC, SC, EC, NC, PeC and ION.

Conclusions

We demonstrate that, by integrating the information of subcellular localization, orthologous proteins with PPI networks, the accuracy of predicting essential proteins can be improved. Our proposed method SON is effective for predicting essential proteins.
  相似文献   

9.
Previous research conducted in our laboratory found a significant prevalence of multi-drug resistant (MDR) Salmonella and MDR Escherichia coli (MDR EC) in dairy calves and suggests that the MDR EC population may be an important reservoir for resistance elements that could potentially transfer to Salmonella. Therefore, the objective of the current research was to determine if resistance transfers from MDR EC to susceptible strains of inoculated Salmonella. The experiment utilized Holstein calves (approximately 3 weeks old) naturally colonized with MDR EC and fecal culture negative for Salmonella. Fecal samples were collected for culture of Salmonella and MDR EC throughout the experiment following experimental inoculation with the susceptible Salmonella strains. Results initially suggested that resistance did transfer from the MDR E. coli to the inoculated strains of Salmonella, with these stains demonstrating resistance to multiple antibiotics following in vivo exposure to MDR EC. However, serogrouping and serotyping results from a portion of the Salmonella isolates recovered from the calves post-challenge, identified two new strains of Salmonella; therefore transfer of resistance was not demonstrated under these experimental conditions.  相似文献   

10.

Background

With ever increasing amount of available data on biological networks, modeling and understanding the structure of these large networks is an important problem with profound biological implications. Cellular functions and biochemical events are coordinately carried out by groups of proteins interacting each other in biological modules. Identifying of such modules in protein interaction networks is very important for understanding the structure and function of these fundamental cellular networks. Therefore, developing an effective computational method to uncover biological modules should be highly challenging and indispensable.

Results

The purpose of this study is to introduce a new quantitative measure modularity density into the field of biomolecular networks and develop new algorithms for detecting functional modules in protein-protein interaction (PPI) networks. Specifically, we adopt the simulated annealing (SA) to maximize the modularity density and evaluate its efficiency on simulated networks. In order to address the computational complexity of SA procedure, we devise a spectral method for optimizing the index and apply it to a yeast PPI network.

Conclusions

Our analysis of detected modules by the present method suggests that most of these modules have well biological significance in context of protein complexes. Comparison with the MCL and the modularity based methods shows the efficiency of our method.
  相似文献   

11.
Genome-wide protein–protein interaction (PPI) determination remains a significant unsolved problem in structural biology. The difficulty is twofold since high-throughput experiments (HTEs) have often a relatively high false-positive rate in assigning PPIs, and PPI quaternary structures are more difficult to solve than tertiary structures using traditional structural biology techniques. We proposed a uniform pipeline, Threpp, to address both problems. Starting from a pair of monomer sequences, Threpp first threads both sequences through a complex structure library, where the alignment score is combined with HTE data using a naïve Bayesian classifier model to predict the likelihood of two chains to interact with each other. Next, quaternary complex structures of the identified PPIs are constructed by reassembling monomeric alignments with dimeric threading frameworks through interface-specific structural alignments. The pipeline was applied to the Escherichia coli genome and created 35,125 confident PPIs which is 4.5-fold higher than HTE alone. Graphic analyses of the PPI networks show a scale-free cluster size distribution, consistent with previous studies, which was found critical to the robustness of genome evolution and the centrality of functionally important proteins that are essential to E. coli survival. Furthermore, complex structure models were constructed for all predicted E. coli PPIs based on the quaternary threading alignments, where 6771 of them were found to have a high confidence score that corresponds to the correct fold of the complexes with a TM-score >0.5, and 39 showed a close consistency with the later released experimental structures with an average TM-score = 0.73. These results demonstrated the significant usefulness of threading-based homologous modeling in both genome-wide PPI network detection and complex structural construction.  相似文献   

12.

Background

Proteins dynamically interact with each other to perform their biological functions. The dynamic operations of protein interaction networks (PPI) are also reflected in the dynamic formations of protein complexes. Existing protein complex detection algorithms usually overlook the inherent temporal nature of protein interactions within PPI networks. Systematically analyzing the temporal protein complexes can not only improve the accuracy of protein complex detection, but also strengthen our biological knowledge on the dynamic protein assembly processes for cellular organization.

Results

In this study, we propose a novel computational method to predict temporal protein complexes. Particularly, we first construct a series of dynamic PPI networks by joint analysis of time-course gene expression data and protein interaction data. Then a Time Smooth Overlapping Complex Detection model (TS-OCD) has been proposed to detect temporal protein complexes from these dynamic PPI networks. TS-OCD can naturally capture the smoothness of networks between consecutive time points and detect overlapping protein complexes at each time point. Finally, a nonnegative matrix factorization based algorithm is introduced to merge those very similar temporal complexes across different time points.

Conclusions

Extensive experimental results demonstrate the proposed method is very effective in detecting temporal protein complexes than the state-of-the-art complex detection techniques.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-335) contains supplementary material, which is available to authorized users.  相似文献   

13.
SdiA of E. coli and Salmonella is a LuxR homolog that detects N-acyl homoserine lactones (AHLs). Most LuxR homologs function together with a cognate AHL synthase (a LuxI homolog), but SdiA does not. Instead, SdiA detects AHLs produced by other bacterial species. In this report, we performed a phylogenetic analysis of SdiA. The results suggest that one branch of the Enterobacteriaceae obtained a rhlR/rhlI pair by horizontal transfer. The Erwinia and Pantoea branches still contain the complete pair where it is known as expR/expI and phzR/phzI, respectively. A deletion event removed the luxI homolog from the remainder of the group, leaving just the luxR homolog known as sdiA. Thus ExpR and PhzR are SdiA orthologs and ExpI and PhzI are descendants of the long lost cognate signal synthase of SdiA.  相似文献   

14.
Protein-protein interaction (PPI) networks provide insights into understanding of biological processes, function and the underlying complex evolutionary mechanisms of the cell. Modeling PPI network is an important and fundamental problem in system biology, where it is still of major concern to find a better fitting model that requires less structural assumptions and is more robust against the large fraction of noisy PPIs. In this paper, we propose a new approach called t-logistic semantic embedding (t-LSE) to model PPI networks. t-LSE tries to adaptively learn a metric embedding under the simple geometric assumption of PPI networks, and a non-convex cost function was adopted to deal with the noise in PPI networks. The experimental results show the superiority of the fit of t-LSE over other network models to PPI data. Furthermore, the robust loss function adopted here leads to big improvements for dealing with the noise in PPI network. The proposed model could thus facilitate further graph-based studies of PPIs and may help infer the hidden underlying biological knowledge. The Matlab code implementing the proposed method is freely available from the web site: http://home.ustc.edu.cn/~yzh33108/PPIModel.htm.  相似文献   

15.

Background

Identifying complexes from PPI networks has become a key problem to elucidate protein functions and identify signal and biological processes in a cell. Proteins binding as complexes are important roles of life activity. Accurate determination of complexes in PPI networks is crucial for understanding principles of cellular organization.

Results

We propose a novel method to identify complexes on PPI networks, based on different co-expression information. First, we use Markov Cluster Algorithm with an edge-weighting scheme to calculate complexes on PPI networks. Then, we propose some significant features, such as graph information and gene expression analysis, to filter and modify complexes predicted by Markov Cluster Algorithm. To evaluate our method, we test on two experimental yeast PPI networks.

Conclusions

On DIP network, our method has Precision and F-Measure values of 0.6004 and 0.5528. On MIPS network, our method has F-Measure and S n values of 0.3774 and 0.3453. Comparing to existing methods, our method improves Precision value by at least 0.1752, F-Measure value by at least 0.0448, S n value by at least 0.0771. Experiments show that our method achieves better results than some state-of-the-art methods for identifying complexes on PPI networks, with the prediction quality improved in terms of evaluation criteria.
  相似文献   

16.

Background

Recently, large data sets of protein-protein interactions (PPI) which can be modeled as PPI networks are generated through high-throughput methods. And locally dense regions in PPI networks are very likely to be protein complexes. Since protein complexes play a key role in many biological processes, detecting protein complexes in PPI networks is one of important tasks in post-genomic era. However, PPI networks are often incomplete and noisy, which builds barriers to mining protein complexes.

Results

We propose a new and effective algorithm based on robustness to detect overlapping clusters as protein complexes in PPI networks. And in order to improve the accuracy of resulting clusters, our algorithm tries to reduce bad effects brought by noise in PPI networks. And in our algorithm, each new cluster begins from a seed and is expanded through adding qualified nodes from the cluster's neighbourhood nodes. Besides, in our algorithm, a new distance measurement method between a cluster K and a node in the neighbours of K is proposed as well. The performance of our algorithm is evaluated by applying it on two PPI networks which are Gavin network and Database of Interacting Proteins (DIP). The results show that our algorithm is better than Markov clustering algorithm (MCL), Clique Percolation method (CPM) and core-attachment based method (CoAch) in terms of F-measure, co-localization and Gene Ontology (GO) semantic similarity.

Conclusions

Our algorithm detects locally dense regions or clusters as protein complexes. The results show that protein complexes generated by our algorithm have better quality than those generated by some previous classic methods. Therefore, our algorithm is effective and useful.
  相似文献   

17.

Background

Experimental methods for the identification of essential proteins are always costly, time-consuming, and laborious. It is a challenging task to find protein essentiality only through experiments. With the development of high throughput technologies, a vast amount of protein-protein interactions are available, which enable the identification of essential proteins from the network level. Many computational methods for such task have been proposed based on the topological properties of protein-protein interaction (PPI) networks. However, the currently available PPI networks for each species are not complete, i.e. false negatives, and very noisy, i.e. high false positives, network topology-based centrality measures are often very sensitive to such noise. Therefore, exploring robust methods for identifying essential proteins would be of great value.

Method

In this paper, a new essential protein discovery method, named CoEWC (Co-Expression Weighted by Clustering coefficient), has been proposed. CoEWC is based on the integration of the topological properties of PPI network and the co-expression of interacting proteins. The aim of CoEWC is to capture the common features of essential proteins in both date hubs and party hubs. The performance of CoEWC is validated based on the PPI network of Saccharomyces cerevisiae. Experimental results show that CoEWC significantly outperforms the classical centrality measures, and that it also outperforms PeC, a newly proposed essential protein discovery method which outperforms 15 other centrality measures on the PPI network of Saccharomyces cerevisiae. Especially, when predicting no more than 500 proteins, even more than 50% improvements are obtained by CoEWC over degree centrality (DC), a better centrality measure for identifying protein essentiality.

Conclusions

We demonstrate that more robust essential protein discovery method can be developed by integrating the topological properties of PPI network and the co-expression of interacting proteins. The proposed centrality measure, CoEWC, is effective for the discovery of essential proteins.  相似文献   

18.
Experimental protein-protein interaction (PPI) networks are increasingly being exploited in diverse ways for biological discovery. Accordingly, it is vital to discern their underlying natures by identifying and classifying the various types of deterministic (specific) and probabilistic (nonspecific) interactions detected. To this end, we have analyzed PPI networks determined using a range of high-throughput experimental techniques with the aim of systematically quantifying any biases that arise from the varying cellular abundances of the proteins. We confirm that PPI networks determined using affinity purification methods for yeast and Eschericia coli incorporate a correlation between protein degree, or number of interactions, and cellular abundance. The observed correlations are small but statistically significant and occur in both unprocessed (raw) and processed (high-confidence) data sets. In contrast, the yeast two-hybrid system yields networks that contain no such relationship. While previously commented based on mRNA abundance, our more extensive analysis based on protein abundance confirms a systematic difference between PPI networks determined from the two technologies. We additionally demonstrate that the centrality-lethality rule, which implies that higher-degree proteins are more likely to be essential, may be misleading, as protein abundance measurements identify essential proteins to be more prevalent than nonessential proteins. In fact, we generally find that when there is a degree/abundance correlation, the degree distributions of nonessential and essential proteins are also disparate. Conversely, when there is no degree/abundance correlation, the degree distributions of nonessential and essential proteins are not different. However, we show that essentiality manifests itself as a biological property in all of the yeast PPI networks investigated here via enrichments of interactions between essential proteins. These findings provide valuable insights into the underlying natures of the various high-throughput technologies utilized to detect PPIs and should lead to more effective strategies for the inference and analysis of high-quality PPI data sets.  相似文献   

19.
Within a microbial risk assessment framework, modeling the maximum population density (MPD) of a pathogenic microorganism is important but often not considered. This paper describes a model predicting the MPD of Salmonella on alfalfa as a function of the initial contamination level, the total count of the indigenous microbial population, the maximum pathogen growth rate and the maximum population density of the indigenous microbial population. The model is parameterized by experimental data describing growth of Salmonella on sprouting alfalfa seeds at inoculum size, native microbial load and Pseudomonas fluorescens 2–79. The obtained model fits well to the experimental data, with standard errors less than ten percent of the fitted average values. The results show that the MPD of Salmonella is not only dictated by performance characteristics of Salmonella but depends on the characteristics of the indigenous microbial population like total number of cells and its growth rate. The model can improve the predictions of microbiological growth in quantitative microbial risk assessments. Using this model, the effects of preventive measures to reduce pathogenic load and a concurrent effect on the background population can be better evaluated. If competing microorganisms are more sensitive to a particular decontamination method, a pathogenic microorganism may grow faster and reach a higher level. More knowledge regarding the effect of the indigenous microbial population (size, diversity, composition) of food products on pathogen dynamics is needed in order to make adequate predictions of pathogen dynamics on various food products.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号