Similar articles
1.
For decades, molecular biologists have been uncovering the mechanics of biological systems. Efforts to bring their findings together have led to the development of multiple databases and information systems that capture and present pathway information in a computable network format. Concurrently, the advent of modern omics technologies has empowered researchers to systematically profile cellular processes across different modalities. Numerous algorithms, methodologies, and tools have been developed to use prior knowledge networks (PKNs) in the analysis of omics datasets. Interestingly, it has been repeatedly demonstrated that the source of prior knowledge can greatly impact the results of a given analysis. For these methods to be successful, it is paramount that their selection of PKNs is amenable to the data type and the computational task they aim to accomplish. Here we present a five-level framework that broadly describes network models in terms of their scope, level of detail, and ability to inform causal predictions. To contextualize this framework, we review a handful of network-based omics analysis methods at each level, while also describing the computational tasks they aim to accomplish.

2.
Chinese hamster ovary (CHO) cell lines are widely used in industry for biological drug production. During cell culture development, considerable effort is invested to understand the factors that greatly impact cell growth, specific productivity, and product qualities of the biotherapeutics. While high-throughput omics approaches have been increasingly utilized to reveal cellular mechanisms associated with cell line phenotypes and to guide process optimization, comprehensive omics data analysis and management have been a challenge. Here we developed CHOmics, a web-based tool for integrative analysis of CHO cell line omics data that provides an interactive visualization of omics analysis outputs and efficient data management. CHOmics has a built-in comprehensive pipeline for RNA sequencing data processing and multi-layer statistical modules to explore relevant genes or pathways. Moreover, advanced functionalities are provided to enable users to customize their analysis and visualize the output systematically and interactively. The tool was also designed with the flexibility to accommodate other types of omics data and thereby enable multi-omics comparison and visualization at both gene and pathway levels. Collectively, CHOmics is an integrative platform for data analysis, visualization, and management that is expected to promote the broader use of omics in CHO cell research.

3.
4.
5.
Experimental biologists are often left alone with the task of downloading, processing, and analyzing big datasets in order to perform correlation or other simpler analyses. To address these issues, we introduce EviCor, a handy toolbox for exploration of data from large public resources such as The Cancer Genome Atlas and The Cancer Cell Line Encyclopedia, complemented with follow-up information on the same samples, which couples omics datasets with drug response profiles (https://www.evicor.org/). The data was processed for easy retrieval from the server-side database and includes pre-computed drug-feature correlation tables. Using information from multiple independent sources, the task-oriented web interface presents relations between phenotype, single-molecule, and pathway variables with graphical, statistical, and network analysis tools. Custom multivariate models can be built through a user-friendly web interface, and programmatic access is available through a REST interface. Project code is available at https://github.com/aveviort/HyperSet.

6.
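The development of high-throughput experimental methods has produced large volumes of genomic, transcriptomic, metabolomic, and other omics data, and integrating these data makes a comprehensive understanding of biological systems possible. However, owing to the limitations of current experimental techniques, most high-throughput omics data carry systematic biases and differ in data type and reliability, which makes integration difficult. Focusing on the transcriptome, proteome, and metabolome, this article reviews recent progress in omics data integration, including new integration methods and analysis platforms. Although existing statistical and network-analysis methods help reveal associations between different omics datasets, deeper, biologically meaningful data integration still awaits broad advances across biology, mathematics, computer science, and other fields.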

7.
Computational integrative analysis has become a significant approach in the data-driven exploration of biological problems. Many integration methods for cancer subtyping have been proposed, but evaluating these methods has become a complicated problem due to the lack of gold standards. Moreover, questions of practical importance remain to be addressed regarding the impact of selecting appropriate data types and combinations on the performance of integrative studies. Here, we constructed three classes of benchmarking datasets of nine cancers in TCGA by considering all eleven combinations of four multi-omics data types. Using these datasets, we conducted a comprehensive evaluation of ten representative integration methods for cancer subtyping in terms of accuracy (measured by combining clustering accuracy and clinical significance), robustness, and computational efficiency. We subsequently investigated the influence of different omics data on cancer subtyping and the effectiveness of their combinations. Refuting the widely held intuition that incorporating more types of omics data always produces better results, our analyses showed that there are situations where integrating more omics data negatively impacts the performance of integration methods. Our analyses also suggested several effective combinations for most cancers in our study, which may be of particular interest to researchers in omics data analysis.

8.

Background

The integration of high-quality, genome-wide analyses offers a robust approach to elucidating genetic factors involved in complex human diseases. Several methods exist to integrate heterogeneous omics data, yet most biologists still select candidate genes manually by intersecting lists of candidates from analyses of different types of omics data. These lists are generated by imposing hard (strict) thresholds on quantitative variables, such as P-values and fold changes, which increases the chance of missing potentially important candidates.

Methods

To better facilitate the unbiased integration of heterogeneous omics data collected from diverse platforms and samples, we propose a desirability function framework for identifying candidate genes with strong evidence across data types as targets for follow-up functional analysis. Our approach is targeted towards disease systems with sparse, heterogeneous omics data, so we tested it on one such pathology: spontaneous preterm birth (sPTB).

Results

We developed the software integRATE, which uses desirability functions to rank genes both within and across studies, identifying well-supported candidate genes according to the cumulative weight of biological evidence rather than based on imposition of hard thresholds of key variables. Integrating 10 sPTB omics studies identified both genes in pathways previously suspected to be involved in sPTB as well as novel genes never before linked to this syndrome. integRATE is available as an R package on GitHub (https://github.com/haleyeidem/integRATE).

Conclusions

Desirability-based data integration is a solution most applicable in biological research areas where omics data is especially heterogeneous and sparse, allowing for the prioritization of candidate genes that can be used to inform more targeted downstream functional analyses.
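
The ranking idea in this abstract can be illustrated with a minimal Python sketch. It is not the integRATE R package and does not reproduce its API: the function names, the linear desirability shapes, the cut points (P-values on [0, 1], absolute log2 fold change capped at 3), and the arithmetic-mean combination are all assumptions chosen for illustration.

```python
def desirability_low(x, low, high, scale=1.0):
    """Desirability in [0, 1] when smaller values are better (e.g. P-values)."""
    if x <= low:
        return 1.0
    if x >= high:
        return 0.0
    return ((high - x) / (high - low)) ** scale


def desirability_high(x, low, high, scale=1.0):
    """Desirability in [0, 1] when larger values are better (e.g. |log2 FC|)."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return ((x - low) / (high - low)) ** scale


def gene_score(p_value, abs_log2_fc):
    """Combine the two variables within one study (assumed: plain mean)."""
    d_p = desirability_low(p_value, 0.0, 1.0)      # assumed cut points
    d_fc = desirability_high(abs_log2_fc, 0.0, 3.0)  # assumed cut points
    return (d_p + d_fc) / 2


def rank_genes(studies):
    """Rank genes by mean score across studies.

    studies: list of dicts mapping gene -> (p_value, abs_log2_fc).
    A gene missing from a study contributes a score of 0 for that study.
    """
    genes = set().union(*studies)
    totals = {
        g: sum(gene_score(*s[g]) for s in studies if g in s) / len(studies)
        for g in genes
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical values, invented for this example only.
    example = [
        {"A": (0.04, 2.0), "B": (0.06, 2.9)},
        {"A": (0.20, 1.5), "B": (0.90, 0.1)},
    ]
    for gene, score in rank_genes(example):
        print(gene, round(score, 3))
```

In this toy input, gene B misses a hard P < 0.05 cutoff in both studies, so an intersection of thresholded lists would drop it, whereas the continuous score keeps it in the ranking with a lower weight.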

9.
10.
A study of pathway enrichment methods based on gene expression variability
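Current pathway enrichment methods rely mainly on differential gene expression; few analyze enrichment from the perspective of pathway variability (variance). We observed that, when pathway variability is described with a suitable statistic, the variability of some pathways rises or falls markedly under a disease phenotype. This study therefore hypothesizes that the degree of pathway variability differs between phenotypes. We designed 14 statistics and tests that describe pathway variability, used them to detect pathways whose variability differs between phenotypes (the enriched pathways), and compared the enrichment results with results from a literature search; we also analyzed how different microarray preprocessing methods affect the data and the results. The results show that, among the five preprocessing methods, robust multi-array average (RMA) was the best for data preprocessing; that the degree of pathway variability differs between phenotypes; and that, judged against the pathways found in the literature search, among the 14 variability-based enrichment methods, a permutation test using the variance of the Euclidean distances between the genes in a pathway as the statistic (method 11) effectively identified significant pathways, with enrichment results better than those of gene set enrichment analysis (GSEA). In summary, a pathway enrichment strategy based on pathway variability is feasible; it offers theoretical guidance for pathway enrichment analysis and provides a new perspective for research on human disease.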
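
One possible reading of method 11 is sketched below in Python. The abstract does not specify how the distances are computed, so the details are assumptions: gene-to-gene Euclidean distances are taken over the samples of each phenotype group (normalized by group size), the statistic is the absolute difference between the two groups' distance variances, and significance comes from shuffling the sample labels. All names are hypothetical.

```python
import itertools
import math
import random
import statistics


def distance_variance(expr, samples):
    """Variance of pairwise gene-gene distances within one sample group.

    expr: dict mapping gene -> list of expression values (one per sample).
    samples: indices of the samples that belong to the group.
    Distances are divided by sqrt(group size) so groups of different
    sizes stay comparable (an assumption of this sketch).
    """
    dists = []
    for g1, g2 in itertools.combinations(sorted(expr), 2):
        sq = sum((expr[g1][j] - expr[g2][j]) ** 2 for j in samples)
        dists.append(math.sqrt(sq / len(samples)))
    return statistics.pvariance(dists)


def permutation_test(expr, labels, n_perm=1000, seed=0):
    """Permutation p-value for a difference in pathway variability.

    labels: list of 0 (control) / 1 (case), one entry per sample.
    Returns (observed statistic, p-value).
    """
    rng = random.Random(seed)

    def stat(lab):
        case = [j for j, value in enumerate(lab) if value == 1]
        ctrl = [j for j, value in enumerate(lab) if value == 0]
        return abs(distance_variance(expr, case) - distance_variance(expr, ctrl))

    observed = stat(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between samples and phenotype
        if stat(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Shuffling labels rather than genes preserves the correlation structure among the genes of the pathway, which is why label permutation is a common choice for gene-set tests; the sketch needs at least two genes and at least one sample per group.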

11.
Metabolism is recognized as an important driver of cancer progression and other complex diseases, but global metabolite profiling remains a challenge. Protein expression profiling is often a poor proxy since existing pathway enrichment models provide an incomplete mapping between the proteome and metabolism. To overcome these gaps, we introduce multiomic metabolic enrichment network analysis (MOMENTA), an integrative multiomic data analysis framework for more accurately deducing metabolic pathway changes from proteomics data alone in a gene set analysis context by leveraging protein interaction networks to extend annotated metabolic models. We apply MOMENTA to proteomic data from diverse cancer cell lines and human tumors to demonstrate its utility in revealing variation in metabolic pathway activity across cancer types, which we verify using independent metabolomics measurements. The novel metabolic networks we uncover in breast cancer and other tumors are linked to clinical outcomes, underscoring the pathophysiological relevance of the findings.

12.

13.
Xie Bingbing (谢兵兵), Yang Yadong (杨亚东), Ding Nan (丁楠), Fang Xiangdong (方向东). 《遗传》 (Hereditas (Beijing)), 2015, 37(7): 655-663
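As high-throughput sequencing technologies continue to develop and improve, methods for acquiring and analyzing omics data at different levels and of different types have also matured. Disease studies based on single-omics data have already uncovered many new disease-associated factors, while work that integrates multi-omics data to study disease targets is still on the rise. A living organism is a complex regulatory system, and the onset and progression of disease involve regulatory mechanisms at many levels, including genetic variation, epigenetic changes, aberrant gene expression, and disrupted signaling pathways; the limitations of analyzing pathogenic factors with a single type of omics data are increasingly evident. Integrative analysis of high-throughput omics data from multiple levels and sources, used to study clinical pathogenesis systematically and to identify the best disease targets, has become an important direction in precision medicine research. It will offer new ideas for disease research and a new theoretical basis for early diagnosis, individualized treatment, and medication guidance. This article describes in detail the new techniques and research advances in genomics, transcriptomics, epigenomics, and other systems-level omics for screening disease targets, and discusses new strategies for, and the advantages of, their integrative analysis.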

14.
Errors in sample annotation or labeling often occur in large-scale genetic or genomic studies and are difficult to avoid completely during data generation and management. For integrative genomic studies, it is critical to identify and correct these errors. Different types of genetic and genomic data are inter-connected by cis-regulations. On that basis, we developed a computational approach, Multi-Omics Data Matcher (MODMatcher), to identify and correct sample labeling errors in multiple types of molecular data, which can then be used in further integrative analysis. Our results indicate that inspection of sample annotation and labeling errors is an indispensable data quality assurance step. Applied to a large lung genomic study, MODMatcher increased statistically significant genetic associations and genomic correlations by more than two-fold. In a simulation study, MODMatcher provided more robust results when using three types of omics data than when using two. We further demonstrate that MODMatcher can be broadly applied to large genomic data sets containing multiple types of omics data, such as The Cancer Genome Atlas (TCGA) data sets.
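
The general idea of checking sample labels across data types can be illustrated with a small Python sketch. This is not MODMatcher's algorithm, which the abstract does not describe in detail. The sketch assumes that each sample has a profile over the same ordered set of cis-linked features in two omics layers, scores every cross-layer pair of samples by Pearson correlation, and flags samples whose best-matching partner carries a different label. All function names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_matches(omics_a, omics_b):
    """For each sample in layer A, find the most similar sample in layer B.

    omics_a, omics_b: dicts mapping sample_id -> list of values over the
    same ordered cis-linked features (an assumption of this sketch).
    """
    return {
        sid: max(omics_b, key=lambda other: pearson(profile, omics_b[other]))
        for sid, profile in omics_a.items()
    }


def flag_mismatches(omics_a, omics_b):
    """Return {labeled_id: best_matching_id} for samples that disagree."""
    return {
        sid: match
        for sid, match in best_matches(omics_a, omics_b).items()
        if match != sid
    }


if __name__ == "__main__":
    # Hypothetical profiles; s1 and s2 are swapped in layer B on purpose.
    layer_a = {"s1": [1, 2, 3, 4], "s2": [4, 3, 2, 1], "s3": [1, 3, 2, 4]}
    layer_b = {
        "s1": [4.1, 2.9, 2.2, 1.0],
        "s2": [1.1, 2.0, 3.2, 3.9],
        "s3": [1.0, 3.1, 2.1, 4.2],
    }
    print(flag_mismatches(layer_a, layer_b))
```

A reciprocal pair of flags (s1 pointing to s2 and s2 pointing to s1) suggests a label swap that can be corrected, while a sample with no convincing partner is more likely a sample that should be excluded.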

15.
16.
17.
18.
Molecular and functional profiling of cancer cell lines is subject to laboratory‐specific experimental practices and data analysis protocols. The current challenge therefore is how to make integrated use of the omics profiles of cancer cell lines for reliable biological discoveries. Here, we carried out a systematic analysis of nine types of data modalities using meta‐analysis of 53 omics studies across 12 research laboratories for 2,018 cell lines. To account for a relatively low consistency observed for certain data modalities, we developed a robust data integration approach that identifies reproducible signals shared among multiple data modalities and studies. We demonstrated the power of the integrative analyses by identifying a novel driver gene, ECHDC1, with a tumor‐suppressive role validated both in breast cancer cells and patient tumors. The multi‐modal meta‐analysis approach also identified synthetic lethal partners of cancer drivers, including a co‐dependency of PTEN‐deficient endometrial cancer cells on RNA helicases.

19.
20.
Genomics, 2019, 111(6): 1387-1394
To decipher the genetic architecture of human disease, various types of omics data are generated. Two common types of omics data are genotypes and gene expression. Often, genotype data for a large number of individuals and gene expression data for only a few individuals are generated due to biological and technical reasons, leading to unequal sample sizes for different omics data. The lack of a standard statistical procedure for integrating such datasets motivated us to propose a two-step multi-locus association method using latent variables. Our method is more powerful than single or separate omics data analysis, and it comprehensively unravels deep-seated signals through a single statistical model. Extensive simulation confirms that it is robust to various genetic models and that its power increases with sample size and the number of associated loci. It computes p-values quickly. Application to a real dataset on psoriasis identified 17 novel SNPs, functionally related to psoriasis-associated genes, at a much smaller sample size than standard GWAS.
