Found 20 similar articles (search time: 0 ms)
1.
Predicting potential drug-target interactions from heterogeneous biological data is critical not only for better understanding of the various interactions and biological processes, but also for the development of novel drugs and the improvement of human medicines. In this paper, the method of Network-based Random Walk with Restart on the Heterogeneous network (NRWRH) is developed to predict potential drug-target interactions on a large scale, under the hypothesis that similar drugs often target similar proteins and within the framework of Random Walk. Compared with traditional supervised or semi-supervised methods, NRWRH makes full use of the network as a tool for data integration to predict drug-target associations. It integrates three different networks (the protein-protein similarity network, the drug-drug similarity network, and the known drug-target interaction network) into a heterogeneous network linked by known drug-target interactions and implements the random walk on this heterogeneous network. When applied to four classes of important drug-target interactions, including enzymes, ion channels, GPCRs and nuclear receptors, NRWRH significantly improves on previous methods in terms of cross-validation and potential drug-target interaction prediction. Excellent performance enables us to suggest a number of new potential drug-target interactions for drug development.
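The random walk with restart described in this abstract can be illustrated with a minimal sketch. It is not the NRWRH implementation: the function name `rwr_heterogeneous`, the restart probability of 0.7, the plain column normalisation of the block adjacency matrix, and the toy similarity values are all assumptions made for illustration, and the published method may weight the jumps between the drug and target networks differently.

```python
def rwr_heterogeneous(drug_sim, target_sim, interactions, seed_drug,
                      restart=0.7, tol=1e-12, max_iter=10_000):
    """Return steady-state visiting probabilities of the target nodes
    for a walker that restarts at `seed_drug` (hypothetical helper)."""
    n_d, n_t = len(drug_sim), len(target_sim)
    n = n_d + n_t
    # Block adjacency: [[drug-drug, drug-target], [target-drug, target-target]].
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n_d):
        for j in range(n_d):
            adj[i][j] = float(drug_sim[i][j])
        for k in range(n_t):
            adj[i][n_d + k] = adj[n_d + k][i] = float(interactions[i][k])
    for k in range(n_t):
        for m in range(n_t):
            adj[n_d + k][n_d + m] = float(target_sim[k][m])
    # Column-normalise so that every column is a probability distribution.
    col_sums = [sum(adj[i][j] for i in range(n)) or 1.0 for j in range(n)]
    W = [[adj[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    # Restart vector: all probability mass on the seed drug.
    p0 = [0.0] * n
    p0[seed_drug] = 1.0
    p = p0[:]
    for _ in range(max_iter):
        p_next = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n))
                  + restart * p0[i] for i in range(n)]
        converged = sum(abs(a - b) for a, b in zip(p_next, p)) < tol
        p = p_next
        if converged:
            break
    return p[n_d:]  # scores of the target nodes only


# Toy data (assumed): two drugs, two targets, one known interaction each.
drug_sim = [[1.0, 0.1], [0.1, 1.0]]
target_sim = [[1.0, 0.2], [0.2, 1.0]]
interactions = [[1, 0], [0, 1]]
scores = rwr_heterogeneous(drug_sim, target_sim, interactions, seed_drug=0)
```

Ranking the targets by `scores` gives candidate interactions for the seed drug; in this toy network the known partner of drug 0 receives the higher score.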
2.
Dramatic improvements in high-throughput sequencing technologies have led to a staggering growth in the number of predicted genes. However, a large fraction of these newly discovered genes do not have a functional assignment. Fortunately, a variety of novel high-throughput genome-wide functional screening technologies provide important clues that shed light on gene function. The integration of heterogeneous data to predict protein function has been shown to improve the accuracy of automated gene annotation systems. In this paper, we propose and evaluate a probabilistic approach for protein function prediction that integrates protein-protein interaction (PPI) data, gene expression data, protein motif information, mutant phenotype data, and protein localization data. First, functional linkage graphs are constructed from PPI data and gene expression data, in which an edge between nodes (proteins) represents evidence for functional similarity. The assumption here is that graph neighbors are more likely to share protein function than proteins that are not neighbors. The functional linkage graph model is then used in concert with protein domain, mutant phenotype and protein localization data to produce a functional prediction. Our method is applied to the functional prediction of Saccharomyces cerevisiae genes, using Gene Ontology (GO) terms as the basis of our annotation. In a cross-validation study we show that the integrated model increases recall by 18% at 50% precision, compared to using PPI data alone. We also show that the integrated predictor is significantly better than each individual predictor. However, the observed improvement over PPI data alone depends on both the new source of data and the functional category to be predicted. Surprisingly, in some contexts integration hurts overall prediction accuracy. Lastly, we provide a comprehensive assignment of putative GO terms to 463 proteins that currently have no assigned function.
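The neighbor assumption behind the functional linkage graph can be sketched as a simple weighted vote. This is only an illustration of that assumption, not the probabilistic model of the paper: the function `predict_by_neighbors`, the protein names, the edge weights and the term labels are all hypothetical.

```python
from collections import Counter, defaultdict


def predict_by_neighbors(edges, annotations, protein):
    """Score each term by the fraction of annotated-neighbour edge weight
    that supports it (hypothetical helper, not the paper's model)."""
    neighbors = defaultdict(dict)
    for a, b, weight in edges:
        neighbors[a][b] = weight
        neighbors[b][a] = weight
    scores = Counter()
    total_weight = 0.0
    for nb, weight in neighbors[protein].items():
        terms = annotations.get(nb)
        if not terms:
            continue  # unannotated neighbours carry no evidence
        total_weight += weight
        for term in terms:
            scores[term] += weight
    if total_weight == 0:
        return {}
    return {term: s / total_weight for term, s in scores.items()}


# Toy functional linkage graph (assumed weights and labels).
edges = [("query", "A", 1.0), ("query", "B", 1.0), ("query", "C", 0.5)]
annotations = {"A": {"translation"}, "B": {"translation"}, "C": {"transport"}}
prediction = predict_by_neighbors(edges, annotations, "query")
```

Here `prediction` assigns 0.8 to "translation" and 0.2 to "transport", reflecting how much of the neighboring evidence supports each label.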
3.
Background: Researchers have discovered that lncRNA-miRNA regulatory paradigms modulate gene expression patterns and drive major cellular processes. Identification of lncRNA-miRNA interactions (LMIs) is critical to reveal the mechanisms of biological processes and complicated diseases. Because conventional wet experiments are time-consuming, labor-intensive and costly, a few computational methods have been proposed to expedite the identification of lncRNA-miRNA interactions. However, little attention has been paid to fully exploiting the structural and topological information of the lncRNA-miRNA interaction network. Results: In this paper, we propose novel lncRNA-miRNA prediction methods using graph embedding and ensemble learning. First, we calculate lncRNA-lncRNA sequence similarity and miRNA-miRNA sequence similarity, and then we combine them with the known lncRNA-miRNA interactions to construct a heterogeneous network. Second, we adopt several graph embedding methods to learn embedded representations of lncRNAs and miRNAs from the heterogeneous network, and construct the ensemble models using two ensemble strategies. For the former, we consider individual graph embedding based models as base predictors and integrate their predictions, and develop a method named GEEL-PI. For the latter, we construct a deep attention neural network (DANN) to integrate various graph embeddings, and present an ensemble method named GEEL-FI. The experimental results demonstrate that both GEEL-PI and GEEL-FI outperform other state-of-the-art methods. The effectiveness of the two ensemble strategies is validated by further experiments. Moreover, the case studies show that GEEL-PI and GEEL-FI can find novel lncRNA-miRNA associations. Conclusion: The study reveals that the graph embedding and ensemble learning based method is efficient for integrating heterogeneous information derived from the lncRNA-miRNA interaction network and can achieve better performance on the LMI prediction task.
In conclusion, GEEL-PI and GEEL-FI are promising for lncRNA-miRNA interaction prediction.
6.
Protein-protein interaction (PPI) prediction is a central task in achieving a better understanding of cellular and intracellular processes. Because high-throughput experimental methods are both expensive and time-consuming, and are also known to suffer from incompleteness and noise, many computational methods have been developed, with varied degrees of success. However, the inference of PPI networks from multiple heterogeneous data sources remains a great challenge. In this work, we developed a novel method based on approximate Bayesian computation and modified differential evolution sampling (ABC-DEP) and the regularized Laplacian (RL) kernel. The method enables inference of PPI networks from topological properties and multiple heterogeneous features, including gene expression and Pfam domain profiles, in the form of weighted kernels. The optimal weights are obtained by ABC-DEP, and the kernel fusion built from the optimal weights serves as input to RL to infer missing or new edges in the PPI network. Detailed comparisons with control methods have been made, and the results show that the accuracy of PPI prediction measured by AUC is increased by up to 23%, compared to a baseline without optimal weights. The method can provide insights into the relations between PPIs and various feature kernels and demonstrates a strong capability for predicting faraway interactions that cannot be well detected by the traditional RL method.
7.
The majority of small molecule drugs act on protein targets to exert a therapeutic function. It has become apparent in recent years that many small molecule drugs act on more than one particular target and consequently, approaches which profile drugs to uncover their target binding spectrum have become increasingly important. Classical yeast two-hybrid systems have mainly been used to discover and characterize protein-protein interactions, but recent modifications and improvements have opened up new routes towards screening for small molecule-protein interactions. Such yeast "n"-hybrid systems hold great promise for the development of drugs which interfere with protein-protein interactions and for the discovery of drug-target interactions. In this review, we discuss several yeast two-hybrid based approaches with applications in drug discovery and describe a protocol for yeast three-hybrid screening of small molecules to identify their direct targets.
8.
Determining protein function is one of the most challenging problems of the post-genomic era. The availability of entire genome sequences and of high-throughput capabilities to determine gene coexpression patterns has shifted the research focus from the study of single proteins or small complexes to that of the entire proteome. In this context, the search for reliable methods for assigning protein function is of primary importance. There are various approaches available for deducing the function of proteins of unknown function using information derived from sequence similarity or clustering patterns of co-regulated genes, phylogenetic profiles, protein-protein interactions (refs. 5-8 and Samanta, M.P. and Liang, S., unpublished data), and protein complexes. Here we propose the assignment of proteins to functional classes on the basis of their network of physical interactions as determined by minimizing the number of protein interactions among different functional categories. Function assignment is proteome-wide and is determined by the global connectivity pattern of the protein network. The approach results in multiple functional assignments, a consequence of the existence of multiple equivalent solutions. We apply the method to analyze the yeast Saccharomyces cerevisiae protein-protein interaction network. The robustness of the approach is tested in a system containing a high percentage of unclassified proteins and also in cases of deletion and insertion of specific protein interactions.
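The objective stated in this abstract, minimizing the number of interactions that join proteins of different functional categories, can be sketched as a cost function plus a simple optimizer. The abstract does not say how the minimization is carried out, so the greedy relabeling below is a swapped-in heuristic, and the names `cross_category_edges` and `greedy_assign`, the protein identifiers and the category labels are all hypothetical.

```python
from collections import Counter


def cross_category_edges(edges, labels):
    """Count interactions whose two labelled endpoints differ in function."""
    return sum(1 for a, b in edges
               if a in labels and b in labels and labels[a] != labels[b])


def greedy_assign(edges, known, unknown, max_sweeps=100):
    """Give each unclassified protein the majority label of its labelled
    neighbours, sweeping until nothing changes (assumed greedy heuristic)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    labels = dict(known)
    for _ in range(max_sweeps):
        changed = False
        for protein in unknown:
            counts = Counter(labels[q] for q in neighbors.get(protein, [])
                             if q in labels)
            if not counts:
                continue
            # Ties are broken alphabetically to keep the result deterministic.
            best = max(sorted(counts), key=counts.get)
            if labels.get(protein) != best:
                labels[protein] = best
                changed = True
        if not changed:
            break
    return labels


# Toy interaction network (assumed): P1 and P5 are classified, the rest are not.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5"), ("P1", "P3")]
known = {"P1": "metabolism", "P5": "transport"}
labels = greedy_assign(edges, known, unknown=["P2", "P3", "P4"])
cost = cross_category_edges(edges, labels)
```

In this toy network P4 sits between one "metabolism" and one "transport" neighbor, so either label yields the same cost of one cross-category edge. That tie is a small instance of the multiple equivalent solutions the abstract mentions.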
9.
Biology can be regarded as a science of networks: interactions between various biological entities (e.g. genes, proteins, metabolites) on different levels (e.g. gene regulation, cell signalling) can be represented as graphs and, thus, analysis of such networks might shed new light on the function of biological systems. Such biological networks can be obtained from different sources. The extraction of networks from text is an important technique that requires the integration of several different computational disciplines. This paper summarises the most important steps in network extraction and reviews common approaches and solutions for the extraction of biological networks from scientific literature.
13.
Background: Regulatory antisense RNAs are a class of ncRNAs that regulate gene expression by prohibiting the translation of an mRNA by
establishing stable interactions with a target sequence. There is great demand for efficient computational methods to predict
the specific interaction between an ncRNA and its target mRNA(s). There are a number of algorithms in the literature which
can predict a variety of such interactions - unfortunately at a very high computational cost. Although some existing target
prediction approaches are much faster, they are specialized for interactions with a single binding site.
14.
Background: Machine-learning tools have gained considerable attention during the last few years for analyzing biological networks for
protein function prediction. Kernel methods are suitable for learning from graph-based data such as biological networks, as
they only require the abstraction of the similarities between objects into the kernel matrix. One key issue in kernel methods
is the selection of a good kernel function. Diffusion kernels, the discretization of the familiar Gaussian kernel of Euclidean
space, are commonly used for graph-based data.
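A diffusion kernel on a graph is commonly defined as the matrix exponential exp(-βL), where L = D - A is the graph Laplacian. The sketch below assumes that definition and approximates the exponential with a truncated Taylor series, which is adequate only for small graphs and moderate β. The function name `diffusion_kernel`, the value of `beta` and the three-node path graph are illustrative assumptions.

```python
def diffusion_kernel(adjacency, beta=0.5, terms=30):
    """Approximate K = exp(beta * H) with H = A - D (the negative graph
    Laplacian) by a truncated Taylor series (hypothetical helper)."""
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    H = [[adjacency[i][j] - (degree[i] if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    K = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        # term_k = term_{k-1} * (beta * H) / k, i.e. (beta * H)^k / k!
        term = [[sum(term[i][m] * H[m][j] for m in range(n)) * beta / k
                 for j in range(n)] for i in range(n)]
        K = [[K[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return K


# Path graph 0 - 1 - 2 (assumed toy network).
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
K = diffusion_kernel(adjacency, beta=0.5)
```

The entry `K[i][j]` can be read as the similarity between nodes i and j. On the path graph, node 0 is more similar to its direct neighbor 1 than to the two-step neighbor 2, and each row sums to 1 because diffusion conserves mass.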
15.
Background: Developing reliable and efficient strategies for inferring the function of as yet uncharacterized proteins based on interaction
networks is of crucial interest in the current context of high-throughput data generation. In this paper, we develop a new
algorithm for clustering vertices of a protein-protein interaction network using a density function, providing disjoint classes.
17.
We are beginning to uncover common mechanisms leading to the evolution of biological networks. The driving force behind these
advances is the increasing availability of comparative data in several species.
18.
Shotgun proteomics uses liquid chromatography-tandem mass spectrometry to identify proteins in complex biological samples. We describe an algorithm, called Percolator, for improving the rate of confident peptide identifications from a collection of tandem mass spectra. Percolator uses semi-supervised machine learning to discriminate between correct and decoy spectrum identifications, correctly assigning peptides to 17% more spectra from a tryptic Saccharomyces cerevisiae dataset, and up to 77% more spectra from non-tryptic digests, relative to a fully supervised approach.
19.
Using a bioenergetic model we show that the pattern of foraging preferences largely determines the complexity of the resulting food webs. By complexity we refer to the degree of richness of food-web architecture, measured in terms of some topological indicators (number of persistent species and links, connectance, link density, number of trophic levels, and frequency of weak links). The poorest food-web architecture is found for a mean-field scenario where all foraging preferences are assumed to be the same. Richer food webs appear when foraging preferences depend on the trophic position of species. Food-web complexity increases with the number of basal species. We also find a strong correlation between the complexity of a trophic module and the complexity of entire food webs with the same pattern of foraging preferences.