Similar literature
 20 similar documents found (search time: 15 ms)
1.

Background

Biclustering has been widely used in biological data analysis, enabling the discovery of putative functional modules from omic and network data. Despite the recognized importance of incorporating domain knowledge to guide biclustering and guarantee a focus on relevant and non-trivial biclusters, this possibility has not yet been comprehensively addressed. This results from the fact that the majority of existing algorithms are only able to deliver sub-optimal solutions with restrictive assumptions on the structure, coherency and quality of biclustering solutions, thus preventing the up-front satisfaction of knowledge-driven constraints. Interestingly, in recent years, a clearer understanding of the synergies between pattern mining and biclustering gave rise to a new class of algorithms, termed pattern-based biclustering algorithms. These algorithms, able to efficiently discover flexible biclustering solutions with optimality guarantees, are thus positioned as good candidates for knowledge incorporation. In this context, this work aims to bridge the current lack of solid views on the use of background knowledge to guide (pattern-based) biclustering tasks.

Methods

This work extends (pattern-based) biclustering algorithms to guarantee the satisfiability of constraints derived from background knowledge and to effectively explore efficiency gains from their incorporation. In this context, we first show the relevance of constraints with succinct, (anti-)monotone and convertible properties for the analysis of expression data and biological networks. We further show how pattern-based biclustering algorithms can be adapted to effectively prune the search space in the presence of such constraints, as well as be guided in the presence of biological annotations. Relying on these contributions, we propose BiClustering with Constraints using PAttern Mining (BiC2PAM), an extension of the BicPAM and BicNET biclustering algorithms.

Results

Experimental results on biological data demonstrate the importance of incorporating knowledge within biclustering to foster efficiency and enable the discovery of non-trivial biclusters with heightened biological relevance.

Conclusions

This work provides the first comprehensive view and sound algorithm for biclustering biological data with constraints derived from user expectations, knowledge repositories and/or literature.
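The pruning role of an anti-monotone constraint mentioned in the Methods can be illustrated with a minimal sketch. It is not BiC2PAM itself: the function name, the item weights and the budget are hypothetical, and plain itemsets stand in for the patterns a pattern-based biclustering algorithm would mine. The constraint "total weight at most `max_weight`" is anti-monotone when weights are non-negative, so once an itemset violates it, none of its supersets need to be explored.

```python
def mine_with_antimonotone_constraint(weight, max_weight):
    """Enumerate every itemset whose total weight is <= max_weight.

    Assumes non-negative weights, which makes the constraint anti-monotone:
    a violating itemset can never become valid by adding more items.
    """
    items = sorted(weight)
    # Level 1: single items that satisfy the constraint.
    frontier = [(item,) for item in items if weight[item] <= max_weight]
    result = list(frontier)
    while frontier:
        next_frontier = []
        for itemset in frontier:
            total = sum(weight[i] for i in itemset)
            for item in items:
                if item <= itemset[-1]:
                    continue  # extend in sorted order to avoid duplicates
                if total + weight[item] <= max_weight:
                    next_frontier.append(itemset + (item,))
                # Otherwise the extension is pruned together with all its supersets.
        result.extend(next_frontier)
        frontier = next_frontier
    return result


# Hypothetical weights for three items and a budget of 3.
example = mine_with_antimonotone_constraint({"a": 1, "b": 2, "c": 3}, 3)
print(example)  # [('a',), ('b',), ('c',), ('a', 'b')]
```

Only four of the seven non-empty itemsets are ever generated here; the remaining three are never constructed because one of their subsets already exceeds the budget.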

2.

Background

With the ever-increasing amount of available data on biological networks, modeling and understanding the structure of these large networks is an important problem with profound biological implications. Cellular functions and biochemical events are coordinately carried out by groups of proteins interacting with each other in biological modules. Identifying such modules in protein interaction networks is very important for understanding the structure and function of these fundamental cellular networks. Therefore, developing an effective computational method to uncover biological modules is both highly challenging and indispensable.

Results

The purpose of this study is to introduce a new quantitative measure, modularity density, into the field of biomolecular networks and to develop new algorithms for detecting functional modules in protein-protein interaction (PPI) networks. Specifically, we adopt simulated annealing (SA) to maximize the modularity density and evaluate its efficiency on simulated networks. In order to address the computational complexity of the SA procedure, we devise a spectral method for optimizing the index and apply it to a yeast PPI network.

Conclusions

Our analysis of the modules detected by the present method suggests that most of them are biologically significant in the context of protein complexes. Comparison with MCL and modularity-based methods shows the efficiency of our method.
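The abstract does not state the formula for modularity density, so the sketch below assumes one commonly cited formulation: for each module, internal links (each internal edge counted from both endpoints) minus links leaving the module, divided by the module size, summed over all modules. The function name and the toy network are illustrative only.

```python
def modularity_density(edges, communities):
    """Assumed definition: D = sum_i (2 * internal_i - external_i) / |V_i|.

    edges: iterable of undirected (u, v) pairs.
    communities: list of disjoint node sets covering every node in edges.
    """
    membership = {node: idx for idx, nodes in enumerate(communities) for node in nodes}
    internal = [0] * len(communities)  # edges with both endpoints inside module i
    external = [0] * len(communities)  # edges with exactly one endpoint in module i
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        if cu == cv:
            internal[cu] += 1
        else:
            external[cu] += 1
            external[cv] += 1
    return sum(
        (2 * internal[i] - external[i]) / len(nodes)
        for i, nodes in enumerate(communities)
    )


# Toy network: two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = modularity_density(toy_edges, [{0, 1, 2}, {3, 4, 5}])
merged = modularity_density(toy_edges, [{0, 1, 2, 3, 4, 5}])
print(split > merged)  # True: splitting at the bridge scores higher
```

Under this definition the two-triangle partition scores 10/3 and the single merged module scores 7/3, so an optimizer such as simulated annealing would prefer the split.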

3.
4.

Background

Despite the progress in neuroblastoma therapies the mortality of high-risk patients is still high (40–50%) and the molecular basis of the disease remains poorly known. Recently, a mathematical model was used to demonstrate that the network regulating stress signaling by the c-Jun N-terminal kinase pathway played a crucial role in survival of patients with neuroblastoma irrespective of their MYCN amplification status. This demonstrates the enormous potential of computational models of biological modules for the discovery of underlying molecular mechanisms of diseases.

Results

Since signaling is known to be highly relevant in cancer, we have used a computational model of the whole cell signaling network to understand the molecular determinants of poor prognosis in neuroblastoma. Our model produced a comprehensive view of the molecular mechanisms of neuroblastoma tumorigenesis and progression.

Conclusion

We have also shown how the activity of signaling circuits can be considered a reliable model-based prognostic biomarker.

Reviewers

This article was reviewed by Tim Beissbarth, Wenzhong Xiao and Joanna Polanska. For the full reviews, please go to the Reviewers’ comments section.

5.

Background

Breast cancer and ovarian cancer are hormone driven and are known to have some predisposition genes in common such as the two well known cancer genes BRCA1 and BRCA2. The objective of this study is to compare the coexpression network modules of both cancers, so as to infer the potential cancer-related modules.

Methods

We applied the eigen-decomposition to the matrix that integrates the gene coexpression networks of both breast cancer and ovarian cancer. With hierarchical clustering of the related eigenvectors, we obtained the network modules of both cancers simultaneously. Enrichment analysis on Gene Ontology (GO), KEGG pathway, Disease Ontology (DO), and Gene Set Enrichment Analysis (GSEA) in the identified modules was performed.

Results

We identified 43 modules that are enriched by at least one of the four types of enrichments. 31, 25, and 18 modules are enriched by GO terms, KEGG pathways, and DO terms, respectively. The structure of 29 modules in both cancers is significantly different with p-values less than 0.05, of which 25 modules have larger densities in ovarian cancer. One module was found to be significantly enriched by the terms related to breast cancer from GO, KEGG and DO enrichment. One module was found to be significantly enriched by ovarian cancer related terms.

Conclusion

Breast cancer and ovarian cancer share some common properties at the module level. Integrating both cancers helps identify potential cancer-associated modules.

6.

Introduction

High-dose busulfan (busulfan) is an integral part of the majority of hematopoietic cell transplantation conditioning regimens. Intravenous (IV) busulfan doses are personalized using pharmacokinetics (PK)-guided dosing where the patient’s IV busulfan clearance is calculated after the first dose and is used to personalize subsequent doses to a target plasma exposure. PK-guided dosing has improved patient outcomes and is clinically accepted but highly resource-intensive.

Objective

We sought to discover endogenous plasma biomarkers predictive of IV busulfan clearance using a global pharmacometabolomics-based approach.

Methods

Using LC-QTOF, we analyzed 59 (discovery) and 88 (validation) plasma samples obtained before IV busulfan administration.

Results

In the discovery dataset, we evaluated the association of the relative abundance of 1885 ions with IV busulfan clearance and found 21 ions that were associated with IV busulfan clearance tertiles (r2 ≥ 0.3). Identified compounds were deoxycholic acid and/or chenodeoxycholic acid, and linoleic acid. We used these 21 ions to develop a parsimonious seven-ion linear predictive model that accurately predicted IV busulfan clearance in 93 % (discovery) and 78 % (validation) of samples.

Conclusion

IV busulfan clearance was significantly correlated with the relative abundance of 21 ions, seven of which were included in a predictive model that accurately predicted IV busulfan clearance in the majority of the validation samples. These results reinforce the potential of pharmacometabolomics as a critical tool in personalized medicine, with the potential to improve the personalized dosing of drugs with a narrow therapeutic index such as busulfan.

7.

Introduction

Metabolomics is a well-established tool in systems biology, especially in the top–down approach. Metabolomics experiments often result in discovery studies that provide intriguing biological hypotheses but rarely offer mechanistic explanations of such findings. In this light, the interpretation of metabolomics data can be boosted by deploying systems biology approaches.

Objectives

This review aims to provide an overview of systems biology approaches that are relevant to metabolomics and to discuss some successful applications of these methods.

Methods

We review the most recent applications of systems biology tools in the field of metabolomics, such as network inference and analysis, metabolic modelling and pathways analysis.

Results

We offer an ample overview of systems biology tools that can be applied to address metabolomics problems. The characteristics and application results of these tools are discussed also in a comparative manner.

Conclusions

Systems biology-enhanced analysis of metabolomics data can provide insights into the molecular mechanisms originating the observed metabolic profiles and enhance the scientific impact of metabolomics studies.

8.

Background

Until recently, plant metabolomics has provided a deep understanding of metabolic regulation in individual plants as experimental units. The application of these techniques to agricultural systems subjected to more complex interactions is a step towards the implementation of translational metabolomics in crop breeding.

Aim of Review

This review discusses recent advances in knowledge derived from the application of metabolomic techniques, which have evolved from biomarker discovery toward improving crop yield and quality.

Key Scientific Concepts of Review

Translational metabolomics applied to crop breeding programs.

9.

Introduction

Natural products from culture collections have enormous impact in advancing discovery programs for metabolites of biotechnological importance. These discovery efforts rely on the metabolomic characterization of strain collections.

Objective

Many emerging approaches compare metabolomic profiles of such collections, but few enable the analysis and prioritization of thousands of samples from diverse organisms while delivering chemistry specific read outs.

Method

In this work we utilize untargeted LC–MS/MS based metabolomics together with molecular networking to inventory the chemistries associated with 1000 marine microorganisms.

Result

This approach annotated 76 molecular families (a spectral match rate of 28 %), including clinically and biotechnologically important molecules such as valinomycin, actinomycin D, and desferrioxamine E. Targeting a molecular family produced primarily by one microorganism led to the isolation and structure elucidation of two new molecules designated maridric acids A and B.

Conclusion

Molecular networking-guided exploration of large culture collections allows for rapid dereplication of known molecules and can highlight producers of unique metabolites. These methods, together with large culture collections and growing databases, allow for data-driven strain prioritization with a focus on novel chemistries.

10.

Background

Non-negative matrix factorization (NMF) has been introduced as an important method for mining biological data. Although packages implemented in R and other programming languages currently exist, they either provide only a few optimization algorithms or focus on a specific application field. No complete NMF package exists for the bioinformatics community to perform various data mining tasks on biological data.

Results

We provide a convenient MATLAB toolbox containing both the implementations of various NMF techniques and a variety of NMF-based data mining approaches for analyzing biological data. Data mining approaches implemented within the toolbox include data clustering and bi-clustering, feature extraction and selection, sample classification, missing values imputation, data visualization, and statistical comparison.

Conclusions

A series of analyses such as molecular pattern discovery, biological process identification, dimension reduction, disease prediction, visualization, and statistical comparison can be performed using this toolbox.
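For readers unfamiliar with NMF, the following minimal sketch shows the basic factorization V ≈ WH with non-negative factors, using the classic multiplicative update rules for the squared error. It is written in plain Python rather than MATLAB and is not taken from the toolbox; the function names, the initialization and the toy matrix are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        wt = transpose(w)
        numer, denom = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * numer[k][j] / (denom[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        ht = transpose(h)
        numer, denom = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * numer[i][k] / (denom[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy non-negative matrix with two obvious blocks.
V = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [0.0, 0.0, 3.0]]
W0, H0 = nmf(V, rank=2, iterations=0)  # random start only
W, H = nmf(V, rank=2)                  # after 200 update rounds
print(squared_error(V, W, H) < squared_error(V, W0, H0))
```

Because every update multiplies a non-negative entry by a non-negative ratio, the factors stay non-negative throughout, which is what makes the rows of H and columns of W interpretable as additive parts in clustering and feature extraction.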

11.

Background

Comparison of various kinds of biological data is one of the main problems in bioinformatics and systems biology. Data compression methods have been applied to comparison of large sequence data and protein structure data. Since it is still difficult to compare global structures of large biological networks, it is reasonable to try to apply data compression methods to comparison of biological networks. In existing compression methods, the uniqueness of compression results is not guaranteed because there is some ambiguity in selection of overlapping edges.

Results

This paper proposes novel efficient methods, CompressEdge and CompressVertices, for comparing large biological networks. In the proposed methods, an original network structure is compressed by iteratively contracting identical edges and sets of connected edges. Then, the similarity of two networks is measured by a compression ratio of the concatenated networks. The proposed methods are applied to comparison of metabolic networks of several organisms, H. sapiens, M. musculus, A. thaliana, D. melanogaster, C. elegans, E. coli, S. cerevisiae, and B. subtilis, and are compared with an existing method. These results suggest that our methods can efficiently measure the similarities between metabolic networks.

Conclusions

Our proposed algorithms, which compress node-labeled networks, are useful for measuring the similarity of large biological networks.

12.

Introduction

Collecting feces is easy. It offers direct access to endogenous and microbial metabolites.

Objectives

Given the lack of consensus about fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.

13.
14.

Introduction

In systems biology, where a main goal is acquiring knowledge of biological systems, one of the challenges is inferring biochemical interactions from different molecular entities such as metabolites. In this area, the metabolome possesses a unique place for reflecting “true exposure” by being sensitive to variation coming from genetics, time, and environmental stimuli. While influenced by many different reactions, often the research interest needs to be focused on variation coming from a certain source, i.e. a certain covariable \(\mathbf {X}_m\).

Objective

Here, we use network analysis methods to recover a set of metabolite relationships, by finding metabolites sharing a similar relation to \(\mathbf {X}_m\). Metabolite values are based on information coming from individuals’ \(\mathbf {X}_m\) status which might interact with other covariables.

Methods

Alternative to using the original metabolite values, the total information is decomposed by utilizing a linear regression model and the part relevant to \(\mathbf {X}_m\) is further used. For two datasets, two different network estimation methods are considered. The first is weighted gene co-expression network analysis based on correlation coefficients. The second method is graphical LASSO based on partial correlations.

Results

We observed that when using the parts related to the specific covariable of interest, resulting estimated networks display higher interconnectedness. Additionally, several groups of biologically associated metabolites (very large density lipoproteins, lipoproteins, etc.) were identified in the human data example.

Conclusions

This work demonstrates how information on the study design can be incorporated to estimate metabolite networks. As a result, sets of interconnected metabolites can be clustered together with respect to their relation to a covariable of interest.

15.

Background

Fragment-based approaches have now become an important component of the drug discovery process. At the same time, pharmaceutical chemists are more often turning to the natural world and its extremely large and diverse collection of natural compounds to discover new leads that can potentially be turned into drugs. In this study we introduce and discuss a computational pipeline to automatically extract statistically overrepresented chemical fragments in therapeutic classes, and search for similar fragments in a large database of natural products. By systematically identifying enriched fragments in therapeutic groups, we are able to extract and focus on few fragments that are likely to be active or structurally important.

Results

We show that several therapeutic classes (including antibacterial, antineoplastic, and drugs active on the cardiovascular system, among others) have enriched fragments that are also found in many natural compounds. Further, our method is able to detect fragments shared by a drug and a natural product even when the global similarity between the two molecules is generally low.

Conclusions

A further development of this computational pipeline is to help predict putative therapeutic activities of natural compounds, and to help identify novel leads for drug discovery.

16.

Background

Protein complexes are important for understanding principles of cellular organization and functions. With the availability of large amounts of high-throughput protein-protein interactions (PPI), many algorithms have been proposed to discover protein complexes from PPI networks. However, existing algorithms generally do not take into consideration the fact that not all the interactions in a PPI network take place at the same time. As a result, predicted complexes often contain many spuriously included proteins, precluding them from matching true complexes.

Results

We propose two methods to tackle this problem: (1) The localization GO term decomposition method: We utilize cellular component Gene Ontology (GO) terms to decompose PPI networks into several smaller networks such that the proteins in each decomposed network are annotated with the same cellular component GO term. (2) The hub removal method: This method is based on the observation that hub proteins are more likely to fuse clusters that correspond to different complexes. To avoid this, we remove hub proteins from PPI networks, and then apply a complex discovery algorithm on the remaining PPI network. The removed hub proteins are added back to the generated clusters afterwards. We tested the two methods on the yeast PPI network downloaded from BioGRID. Our results show that these methods can improve the performance of several complex discovery algorithms significantly. Further improvement in performance is achieved when we apply them in tandem.

Conclusions

The performance of complex discovery algorithms is hindered by the fact that not all the interactions in a PPI network take place at the same time. We tackle this problem by using localization GO terms or hubs to decompose a PPI network before complex discovery, which achieves considerable improvement.
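The hub removal steps described in the Results (remove hubs, cluster the remaining network, add the hubs back) can be sketched as follows. Several parts are assumptions not given in the abstract: hubs are defined by a degree threshold `hub_degree`, connected components stand in for the complex discovery algorithm, and a hub is added back to every cluster to which it has at least `min_links` interactions.

```python
from collections import defaultdict


def build_adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def connected_components(adj, nodes):
    """Stand-in clustering step: components of the subgraph induced by nodes."""
    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(n for n in adj[node] if n in nodes and n not in seen)
        components.append(component)
    return components


def cluster_with_hub_removal(edges, hub_degree, min_links=2):
    adj = build_adjacency(edges)
    # 1. Remove hub proteins (assumed definition: degree >= hub_degree).
    hubs = {node for node, nbrs in adj.items() if len(nbrs) >= hub_degree}
    remaining = set(adj) - hubs
    # 2. Cluster the remaining network.
    clusters = connected_components(adj, remaining)
    # 3. Add each hub back to the clusters it is well connected to (assumed rule).
    return [
        cluster | {hub for hub in hubs if len(adj[hub] & cluster) >= min_links}
        for cluster in clusters
    ]


# Two triangles that are fused only through a shared hub protein "h".
ppi = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
       ("b1", "b2"), ("b2", "b3"), ("b1", "b3")]
ppi += [("h", p) for p in ("a1", "a2", "a3", "b1", "b2", "b3")]
clusters = cluster_with_hub_removal(ppi, hub_degree=5)
print([sorted(c) for c in clusters])
# [['a1', 'a2', 'a3', 'h'], ['b1', 'b2', 'b3', 'h']]
```

Without the removal step the stand-in clustering would return a single seven-protein cluster, because the hub connects the two triangles; with it, the two groups are recovered separately and the hub is reported as a member of both.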

17.

Background and motivations

Module identification has been studied extensively in order to gain deeper understanding of complex systems, such as social networks as well as biological networks. Modules are often defined as groups of vertices in these networks that are topologically cohesive with similar interaction patterns with the rest of the vertices. Most of the existing module identification algorithms assume that the given networks are faithfully measured without errors. However, in many real-world applications, for example, when analyzing protein-protein interaction networks from high-throughput profiling techniques, there is significant noise with both false positive and missing links between vertices. In this paper, we propose a new model for more robust module identification by taking advantage of multiple observed networks with significant noise so that signals in multiple networks can be strengthened and help improve the solution quality by combining information from various sources.

Methods

We adopt a hierarchical Bayesian model to integrate multiple noisy snapshots that capture the underlying modular structure of the networks under study. By introducing a latent root assignment matrix and its relations to instantaneous module assignments in all the observed networks to capture the underlying modular structure and combine information across multiple networks, an efficient variational Bayes algorithm can be derived to accurately and robustly identify the underlying modules from multiple noisy networks.

Results

Experiments on synthetic and protein-protein interaction data sets show that our proposed model enhances both the accuracy and resolution in detecting cohesive modules, and it is less vulnerable to noise in the observed data. In addition, it shows higher power in predicting missing edges compared to individual-network methods.

18.

Introduction

Data sharing is being increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.

Objectives

(i) Review data sharing policies of journals publishing the most metabolomics papers associated with open data and (ii) compare these journals’ policies to those that publish the most metabolomics papers.

Methods

A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.

Results

Journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.

Conclusion

Further efforts are required to improve data sharing in metabolomics.

19.

Background

Genome-scale metabolic models provide an opportunity for rational approaches to studies of the different reactions taking place inside the cell. The integration of these models with gene regulatory networks is a hot topic in systems biology. The methods developed to date focus mostly on resolving the metabolic elements and use fairly straightforward approaches to assess the impact of genome expression on the metabolic phenotype.

Results

We present here a method for integrating the reverse engineering of gene regulatory networks into these metabolic models. We applied our method to a high-dimensional gene expression data set to infer a background gene regulatory network. We then compared the resulting phenotype simulations with those obtained by other relevant methods.

Conclusions

Our method outperformed the other approaches tested and was more robust to noise. We also illustrate the utility of this method for studies of a complex biological phenomenon, the diauxic shift in yeast.

20.