Similar articles
 20 similar articles found
1.

Background  

Gene set enrichment analysis (GSEA) is a microarray data analysis method that uses predefined gene sets and ranks of genes to identify significant biological changes in microarray data sets. GSEA is especially useful when the gene expression changes in a given microarray data set are minimal or moderate.
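
The rank-based idea described above can be illustrated with a minimal sketch. It assumes an unweighted running-sum (Kolmogorov-Smirnov-like) statistic; the function name `enrichment_score` and the toy gene names are hypothetical, and the published GSEA statistic typically weights each hit by the gene's correlation with the phenotype, which is not done here.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (illustrative sketch only).

    ranked_genes: list of gene names, ordered from most to least up-regulated.
    gene_set: a predefined set of gene names.
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("gene_set must contain some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up on a gene in the set, step down otherwise.
        running += 1.0 / n_hits if is_hit else -1.0 / n_misses
        if abs(running) > abs(best):
            best = running
    return best


# Toy data (assumed, not from any study): six ranked genes.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
top_score = enrichment_score(ranked, {"g1", "g2"})     # set clustered at the top
bottom_score = enrichment_score(ranked, {"g5", "g6"})  # set clustered at the bottom
```

A set concentrated at the top of the ranking yields a positive score and a set concentrated at the bottom yields a negative one; assessing significance (for example by permutation) is outside this sketch.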

2.

Background  

The biomedical community is developing new methods of data analysis to more efficiently process the massive data sets produced by microarray experiments. Systematic and global mathematical approaches that can be readily applied to a large number of experimental designs become fundamental to correctly handle the otherwise overwhelming data sets.

3.

Background  

The analysis of high-throughput screening data sets is an expanding field in bioinformatics. High-throughput screens by RNAi generate large primary data sets which need to be analyzed and annotated to identify relevant phenotypic hits. Large-scale RNAi screens are frequently used to identify novel factors that influence a broad range of cellular processes, including signaling pathway activity, cell proliferation, and host cell infection. Here, we present a web-based application utility for the end-to-end analysis of large cell-based screening experiments by cellHTS2.

4.

Background

The identification of prognostic biomarkers for cancer patients is essential for cancer research. Recently, DNA methylation has been shown to be associated with cancer prognosis. However, few methods systematically identify prognostic markers from DNA methylation data, especially while considering the interactions among DNA methylation sites.

Methods

In this paper, we first evaluated the stability of microRNA, mRNA, and DNA methylation data for cancer prognosis. A rank-based method was then applied to construct a DNA methylation interaction network. In this network, the nodes with the largest degrees (10% of all nodes) were selected as hubs. Cox regression was applied to select a prognostic signature from these hubs. In this prognostic signature, the methylation level of each DNA methylation site is correlated with the outcomes of cancer patients. After obtaining these prognostic genes, we performed survival analysis in the training group and the test group to verify their reliability.
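
The hub-selection step described above can be sketched as follows. This is only an illustration under stated assumptions: the network is given as an undirected edge list, the function name `select_hubs` is hypothetical, the 10% cut-off is rounded down (the abstract does not say how it is rounded), and ties are broken by node name. Network construction and Cox regression are not shown.

```python
from collections import defaultdict


def select_hubs(edges, percent=10):
    """Return the nodes with the largest degrees (top `percent` % of all nodes).

    edges: iterable of (node_a, node_b) pairs of an undirected network.
    percent: integer percentage of nodes to keep as hubs (assumed rounding: floor,
             but at least one hub when the network is non-empty).
    """
    degree = defaultdict(int)
    for node_a, node_b in edges:
        degree[node_a] += 1
        degree[node_b] += 1

    n_hubs = max(1, len(degree) * percent // 100)
    # Sort by descending degree; break ties by node name for reproducibility.
    ranked = sorted(degree, key=lambda node: (-degree[node], node))
    return ranked[:n_hubs]


# Toy star network (assumed data): one centre connected to nine leaves.
toy_edges = [("c", f"s{i}") for i in range(9)]
hubs = select_hubs(toy_edges)
```

With ten nodes and a 10% cut-off, a single hub is returned, here the centre of the star.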

Results

We applied our method to three cancers (ovarian cancer, breast cancer, and glioblastoma multiforme). In all three cancers, the prognostic genes selected from different samples overlapped more in DNA methylation data than in gene expression data or miRNA expression data, which indicates that DNA methylation data may be more stable for cancer prognosis. Power-law distribution fitting suggests that the DNA methylation interaction networks are scale-free. The hubs selected from the three networks are all enriched in cancer-related pathways. Gene signatures were obtained for each of the three cancers, and survival analysis shows that they can distinguish the outcomes of tumor patients in both the training and test data sets, outperforming the control signatures.

Conclusions

We proposed a computational method to construct a DNA methylation interaction network, and this network can be used to select prognostic signatures in cancer.

5.

Background  

Information extraction from microarrays has not yet been widely used in diagnostic or prognostic decision-support systems, due to the diversity of results produced by the available techniques, their instability on different data sets, and the inability to relate statistical significance to biological relevance. Thus, there is an urgent need to address the statistical framework of microarray analysis and identify its drawbacks and limitations, which will enable us to thoroughly compare methodologies under the same experimental set-up and associate results with confidence intervals meaningful to clinicians. In this study we consider gene-selection algorithms with the aim of revealing inefficiencies in performance evaluation and addressing aspects that can reduce uncertainty in algorithmic validation.

6.

Background  

One challenge facing biologists is to tease out useful information from massive data sets for further analysis. A pathway-based analysis may shed light by projecting candidate genes onto protein functional relationship networks. We are building such a pathway-based analysis system.

7.

Background

Bioinformatics tools have been developed to interpret gene expression data at the gene set level, and these gene set based analyses improve biologists’ ability to discover the functional relevance of their experimental design. However, gene sets are usually examined individually, and associations between gene sets are rarely taken into consideration. Deep learning, an emerging machine learning technique in computational biology, can be used to generate an unbiased combination of gene sets, and to determine the biological relevance and analysis consistency of these combined gene sets by leveraging large genomic data sets.

Results

In this study, we proposed a gene superset autoencoder (GSAE), a multi-layer autoencoder model that incorporates a priori defined gene sets to retain the crucial biological features in the latent layer. We introduced the concept of the gene superset, an unbiased combination of gene sets with weights trained by the autoencoder, where each node in the latent layer is a superset. Having trained the model with genomic data from TCGA and evaluated it with the accompanying clinical parameters, we showed the ability of gene supersets to discriminate tumor subtypes and their prognostic capability. We further demonstrated the biological relevance of the top component gene sets in the significant supersets.

Conclusions

Using an autoencoder model with gene supersets at its latent layer, we demonstrated that gene supersets retain sufficient biological information with respect to tumor subtypes and clinical prognostic significance. Supersets also provide high reproducibility in survival analysis and accurate prediction of cancer subtypes.

8.

Background  

The extraction of biological knowledge from genome-scale data sets requires analyzing them in the context of additional biological information. The importance of integrating experimental data sets with molecular interaction networks has been recognized and applied to the study of model organisms, but its systematic application to the study of human disease has lagged behind due to the lack of tools for performing such integration.

9.

Background  

A major goal of the analysis of high-dimensional RNA expression data from tumor tissue is to identify prognostic signatures for discriminating patient subgroups. For this purpose, genome-wide identification of bimodally expressed genes from gene array data is relevant, because high and low expression groups are easier to distinguish for such genes than for genes with unimodal expression distributions.

10.

Background  

General protein evolution models help determine the baseline expectations for the evolution of sequences, and they have proved highly useful in sequence analysis and for the computer simulation of artificial sequence data sets.

11.

Background  

Multiple sequence alignments are a fundamental tool for the comparative analysis of proteins and nucleic acids. However, large data sets are no longer manageable for visualization and investigation using the traditional stacked sequence alignment representation.

12.

Background  

Previous differential coexpression analyses focused on identification of differentially coexpressed gene pairs, revealing many insightful biological hypotheses. However, this method could not detect coexpression relationships between pairs of gene sets. Considering the success of many set-wise analysis methods for microarray data, a coexpression analysis based on gene sets may elucidate underlying biological processes provoked by the conditional changes. Here, we propose a differentially coexpressed gene sets (dCoxS) algorithm that identifies the differentially coexpressed gene set pairs between conditions.

13.

Background  

Testing for selection is becoming one of the most important steps in the analysis of multilocus population genetics data sets. Existing applications are difficult to use, leaving many non-trivial, error-prone tasks to the user.

14.

Background  

The assessment of data reproducibility is essential for applying microarray technology to the exploration of biological pathways and disease states. Technical variability in data analysis largely depends on signal intensity. Within that context, the reproducibility of individual probe sets has not hitherto been addressed.

15.

Background  

Large biological data sets, such as expression profiles, benefit from reduction of random noise. Principal component (PC) analysis has been used for this purpose, but it tends to remove small features as well as random noise.

16.
17.

Background  

Recent advances in automation technologies have enabled the use of flow cytometry for high throughput screening, generating large complex data sets often in clinical trials or drug discovery settings. However, data management and data analysis methods have not advanced sufficiently far from the initial small-scale studies to support modeling in the presence of multiple covariates.

18.

Background  

Understanding the evolutionary relationships among species based on their genetic information is one of the primary objectives in phylogenetic analysis. Reconstructing phylogenies for large data sets is still a challenging task in bioinformatics.

19.
20.

Background  

With the advent of systems biology, biological knowledge is often represented today by networks. These include regulatory and metabolic networks, protein-protein interaction networks, and many others. At the same time, high-throughput genomics and proteomics techniques generate very large data sets, which require sophisticated computational analysis. Usually, separate and different analysis methodologies are applied to each of the two data types. An integrated investigation of network and high-throughput information together can improve the quality of the analysis by accounting simultaneously for topological network properties alongside intrinsic features of the high-throughput data.
