1.

Data mining of gene expression profiling microarrays  Total citations: 3, self-citations: 1, citations by others: 3

You Yuanhai Zhang Jianzhong 《中国生物工程杂志》 (China Biotechnology) 2009,29(10):87-91

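With the rapid development of gene chip technology, expression profiling microarray analysis and methods such as aCGH have been widely applied across the life sciences, and the resulting data have grown exponentially. Extracting biologically meaningful results from this mass of data has become a difficult problem for biologists. This article reviews data mining methods for expression profiling microarrays. It introduces the basic analysis workflow and the current main directions of analysis, such as GO analysis, pathway and regulatory network analysis, and clustering algorithms, along with several easy-to-use analysis software packages. It also introduces the use of several free scientific computing packages in the bioinformatic analysis of expression profiles, as a reference for researchers working on microarray analysis.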

2.

Assessment method for a power analysis to identify differentially expressed pathways

Tripathi S Emmert-Streib F 《PloS one》2012,7(5):e37510

Gene expression data can provide a very rich source of information for elucidating the biological function on the pathway level if the experimental design considers the needs of the statistical analysis methods. The purpose of this paper is to provide a comparative analysis of statistical methods for detecting the differential expression of pathways (DEP). In contrast to many other studies conducted so far, we use three novel simulation types, producing a more realistic correlation structure than previous simulation methods. This also includes the generation of surrogate data from two large-scale microarray experiments from prostate cancer and ALL. As a result of our comprehensive analysis of 41,004 parameter configurations, we find that each method should only be applied if certain conditions of the data from a pathway are met. Further, we provide method-specific estimates for the optimal sample size for microarray experiments aiming to identify DEP in order to avoid an underpowered design. Our study highlights the sensitivity of the studied methods to the parameters of the system.

3.

M@IA: a modular open-source application for microarray workflow and integrative datamining

Le Béchec A Zindy P Sierocinski T Petritis D Bihouée A Le Meur N Léger J Théret N 《In silico biology》2008,8(1):63-69

4.

A statistical method to incorporate biological knowledge for generating testable novel gene regulatory interactions from microarray experiments

Peter Larsen Eyad Almasri Guanrao Chen Yang Dai 《BMC bioinformatics》2007,8(1):317

5.

Pathway and gene-set activation measurement from mRNA expression data: the tissue distribution of human pathways

Levine DM Haynor DR Castle JC Stepaniants SB Pellegrini M Mao M Johnson JM 《Genome biology》2006,7(10):R93-17

Background

Interpretation of lists of genes or proteins with altered expression is a critical and time-consuming part of microarray and proteomics research, but relatively little attention has been paid to methods for extracting biological meaning from these output lists. One powerful approach is to examine the expression of predefined biological pathways and gene sets, such as metabolic and signaling pathways and macromolecular complexes. Although many methods for measuring pathway expression have been proposed, a systematic analysis of the performance of multiple methods over multiple independent data sets has not previously been reported.

6.

Microarray data analysis and mining approaches.  Total citations: 1, self-citations: 0, citations by others: 1

Francesca Cordero Marco Botta Raffaele A Calogero 《Briefings in Functional Genomics and Prot》2007,6(4):265-281

7.

STEM: a tool for the analysis of short time series gene expression data  Total citations: 2, self-citations: 0, citations by others: 2

Jason Ernst Ziv Bar-Joseph 《BMC bioinformatics》2006,7(1):191

Background

Time series microarray experiments are widely used to study dynamical biological processes. Due to the cost of microarray experiments, and also in some cases the limited availability of biological material, about 80% of microarray time series experiments are short (3–8 time points). Previously, short time series gene expression data have mainly been analyzed using more general gene expression analysis tools not designed for the unique challenges and opportunities inherent in short time series gene expression data.

8.

Pathway analysis using random forests classification and regression  Total citations: 3, self-citations: 0, citations by others: 3

Pang H Lin A Holford M Enerson BE Lu B Lawton MP Floyd E Zhao H 《Bioinformatics (Oxford, England)》2006,22(16):2028-2036

MOTIVATION: Although numerous methods have been developed to better capture biological information from microarray data, commonly used single gene-based methods neglect interactions among genes and leave room for other novel approaches. For example, most classification and regression methods for microarray data are based on the whole set of genes and have not made use of pathway information. Pathway-based analysis in microarray studies may lead to more informative and relevant knowledge for biological researchers. RESULTS: In this paper, we describe a pathway-based classification and regression method using Random Forests to analyze gene expression data. The proposed methods allow researchers to rank important pathways from externally available databases, discover important genes, find pathway-based outlying cases and make full use of a continuous outcome variable in the regression setting. We also compared Random Forests with other machine learning methods using several datasets and found that Random Forests classification error rates were either the lowest or the second-lowest. By combining pathway information and novel statistical methods, this procedure represents a promising computational strategy in dissecting pathways and can provide biological insight into the study of microarray data. AVAILABILITY: Source code written in R is available from http://bioinformatics.med.yale.edu/pathway-analysis/rf.htm.

9.

Missing value imputation for microRNA expression data by using a GO-based similarity measure

Yang Yang Xu Zhuangdi Song Dandan 《BMC bioinformatics》2016,17(1):109-116

Missing values are commonly present in microarray data profiles. Instead of discarding genes or samples with incomplete expression level, missing values need to be properly imputed for accurate data analysis. The imputation methods can be roughly categorized as expression level-based and domain knowledge-based. The first type of methods only rely on expression data without the help of external data sources, while the second type incorporates available domain knowledge into expression data to improve imputation accuracy. In recent years, microRNA (miRNA) microarray has been largely developed and used for identifying miRNA biomarkers in complex human disease studies. Similar to mRNA profiles, miRNA expression profiles with missing values can be treated with the existing imputation methods. However, the domain knowledge-based methods are hard to be applied due to the lack of direct functional annotation for miRNAs. With the rapid accumulation of miRNA microarray data, it is increasingly needed to develop domain knowledge-based imputation algorithms specific to miRNA expression profiles to improve the quality of miRNA data analysis. We connect miRNAs with domain knowledge of Gene Ontology (GO) via their target genes, and define miRNA functional similarity based on the semantic similarity of GO terms in GO graphs. A new measure combining miRNA functional similarity and expression similarity is used in the imputation of missing values. The new measure is tested on two miRNA microarray datasets from breast cancer research and achieves improved performance compared with the expression-based method on both datasets. The experimental results demonstrate that the biological domain knowledge can benefit the estimation of missing values in miRNA profiles as well as mRNA profiles. 
In particular, functional similarity defined by GO terms annotated for the target genes of miRNAs can be useful complementary information for the expression-based method to improve the imputation accuracy of miRNA array data. Our method and data are available to the public upon request.
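The idea of combining functional similarity with expression similarity can be sketched as a weighted nearest-neighbour imputation. This is only an illustration under stated assumptions, not the authors' algorithm: the abstract does not give the formula, so the linear mixing weight `alpha`, the function names, the clipping of negative correlations to zero, and the toy values are all hypothetical.

```python
import math


def expression_similarity(a, b):
    """Pearson correlation over co-observed positions, clipped at zero (assumption)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return 0.0
    return max(0.0, cov / math.sqrt(var_x * var_y))


def impute(profiles, functional_sim, target, sample, alpha=0.5, k=2):
    """Estimate profiles[target][sample] from the k most similar miRNAs.

    functional_sim is assumed to be precomputed (for example from GO terms
    of target genes) and keyed by frozenset({name1, name2}).
    alpha is a hypothetical mixing weight between the two similarities.
    """
    scored = []
    for name, values in profiles.items():
        if name == target or values[sample] is None:
            continue  # skip the target itself and neighbours missing this value
        expr = expression_similarity(profiles[target], values)
        func = functional_sim.get(frozenset((target, name)), 0.0)
        scored.append((alpha * func + (1 - alpha) * expr, values[sample]))
    neighbours = sorted(scored, key=lambda item: item[0], reverse=True)[:k]
    total = sum(weight for weight, _ in neighbours)
    if total == 0:
        return None  # no informative neighbour available
    return sum(weight * value for weight, value in neighbours) / total


# Toy data: None marks a missing value; names and numbers are made up.
profiles = {
    "miR-a": [1.0, 2.0, None],
    "miR-b": [1.0, 2.0, 3.0],
    "miR-c": [2.0, 1.0, 9.0],
}
functional_sim = {frozenset(("miR-a", "miR-b")): 1.0}
estimate = impute(profiles, functional_sim, "miR-a", sample=2)
```

Setting `alpha=0` reduces the sketch to a purely expression-based neighbour average, which is the baseline the abstract compares against.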

10.

Using prior knowledge to improve genetic network reconstruction from microarray data

Le Phillip P Bahl A Ungar LH 《In silico biology》2004,4(3):335-353

11.

Genomic Portraits of the Nervous System in Health and Disease  Total citations: 1, self-citations: 0, citations by others: 1

D'Agata V Cavallaro S 《Neurochemical research》2004,29(6):1201-1212

As the human genome project moves toward its goal of sequencing the entire human genome, gene expression profiling by DNA microarray technology is being employed to rapidly screen genes for biological information. In this review, we will introduce DNA microarray technology, outline the basic experimental paradigms and data analysis methods, and then show with some examples how gene expression profiling can be applied to the study of the central nervous system in health and disease.

12.

eXPatGen: generating dynamic expression patterns for the systematic evaluation of analytical methods  Total citations: 2, self-citations: 0, citations by others: 2

Michaud DJ Marsh AG Dhurjati PS 《Bioinformatics (Oxford, England)》2003,19(9):1140-1146

MOTIVATION: Experimental gene expression data sets, such as those generated by microarray or gene chip experiments, typically have significant noise and complicated interconnectivities that make understanding even simple regulatory patterns difficult. Given these complications, characterizing the effectiveness of different analysis techniques to uncover network groups and structures remains a challenge. Generating simulated expression patterns with known biological features of expression complexity, diversity and interconnectivities provides a more controlled means of investigating the appropriateness of different analysis methods. A simulation-based approach can systematically evaluate different gene expression analysis techniques and provide a basis for improved methods in dynamic metabolic network reconstruction. RESULTS: We have developed an on-line simulator, called eXPatGen, to generate dynamic gene expression patterns typical of microarray experiments. eXPatGen provides a quantitative network structure to represent key biological features, including the induction, repression, and cascade regulation of messenger RNA (mRNA). The simulation is modular such that the expression model can be replaced with other representations, depending on the level of biological detail required by the user. Two example gene networks, of 25 and 100 genes respectively, were simulated. Two standard analysis techniques, clustering and PCA analysis, were performed on the resulting expression patterns in order to demonstrate how the simulator might be used to evaluate different analysis methods and provide experimental guidance for biological studies of gene expression. AVAILABILITY: http://www.che.udel.edu/eXPatGen/

13.

Using formal concept analysis for microarray data comparison

Choi V Huang Y Lam V Potter D Laubenbacher R Duca K 《Journal of bioinformatics and computational biology》2008,6(1):65-75

Microarray technologies, which can measure tens of thousands of gene expression values simultaneously in a single experiment, have become a common research method for biomedical researchers. Computational tools to analyze microarray data for biological discovery are needed. In this paper, we investigate the feasibility of using formal concept analysis (FCA) as a tool for microarray data analysis. The method of FCA builds a (concept) lattice from the experimental data together with additional biological information. For microarray data, each vertex of the lattice corresponds to a subset of genes that are grouped together according to their expression values and some biological information related to gene function. The lattice structure of these gene sets might reflect biological relationships in the dataset. Similarities and differences between experiments can then be investigated by comparing their corresponding lattices according to various graph measures. We apply our method to microarray data derived from influenza-infected mouse lung tissue and healthy controls. Our preliminary results show the promise of our method as a tool for microarray data analysis.

14.

Methods for assessing reproducibility of clustering patterns observed in analyses of microarray data  Total citations: 5, self-citations: 0, citations by others: 5

McShane LM Radmacher MD Freidlin B Yu R Li MC Simon R 《Bioinformatics (Oxford, England)》2002,18(11):1462-1469

MOTIVATION: Recent technological advances such as cDNA microarray technology have made it possible to simultaneously interrogate thousands of genes in a biological specimen. A cDNA microarray experiment produces a gene expression 'profile'. Often interest lies in discovering novel subgroupings, or 'clusters', of specimens based on their profiles, for example identification of new tumor taxonomies. Cluster analysis techniques such as hierarchical clustering and self-organizing maps have frequently been used for investigating structure in microarray data. However, clustering algorithms always detect clusters, even on random data, and it is easy to misinterpret the results without some objective measure of the reproducibility of the clusters. RESULTS: We present statistical methods for testing for overall clustering of gene expression profiles, and we define easily interpretable measures of cluster-specific reproducibility that facilitate understanding of the clustering structure. We apply these methods to elucidate structure in cDNA microarray gene expression profiles obtained on melanoma tumors and on prostate specimens.

15.

A Bayesian Approach to Pathway Analysis by Integrating Gene–Gene Functional Directions and Microarray Data

Yifang Zhao Ming-Hui Chen Baikang Pei David Rowe Dong-Guk Shin Wangang Xie Fang Yu Lynn Kuo 《Statistics in biosciences》2012,4(1):105-131

Many statistical methods have been developed to screen for differentially expressed genes associated with specific phenotypes in the microarray data. However, it remains a major challenge to synthesize the observed expression patterns with abundant biological knowledge for more complete understanding of the biological functions among genes. Various methods including clustering analysis on genes, neural network, Bayesian network and pathway analysis have been developed toward this goal. In most of these procedures, the activation and inhibition relationships among genes have hardly been utilized in the modeling steps. We propose two novel Bayesian models to integrate the microarray data with the putative pathway structures obtained from the KEGG database and the directional gene–gene interactions in the medical literature. We define the symmetric Kullback–Leibler divergence of a pathway, and use it to identify the pathway(s) most supported by the microarray data. A Markov chain Monte Carlo sampling algorithm is given for posterior computation in the hierarchical model. The proposed method is shown to select the most supported pathway in an illustrative example. Finally, we apply the methodology to a real microarray data set to understand the gene expression profile of osteoblast lineage at defined stages of differentiation. We observe that our method correctly identifies the pathways that are reported to play essential roles in modulating bone mass.
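For orientation, the generic symmetric Kullback–Leibler divergence between two densities is the sum of the two directed divergences. The abstract does not state how the authors specialize this quantity to a pathway, so the form below is the textbook definition only, with $p$ and $q$ as placeholder symbols; some texts also include a factor of $\tfrac{1}{2}$.

```latex
D_{\mathrm{sym}}(p, q)
  = \int p(x)\,\log\frac{p(x)}{q(x)}\,dx
  + \int q(x)\,\log\frac{q(x)}{p(x)}\,dx
```

Unlike the directed divergence, this quantity is unchanged when $p$ and $q$ are swapped, which makes it usable as a symmetric measure of discrepancy between two models.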

16.

A Powerful Statistical Approach for Large-Scale Differential Transcription Analysis

Yuan-De Tan Anita M. Chandler Arindam Chaudhury Joel R. Neilson 《PloS one》2015,10(4)

17.

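Application of the shortest path method to the study of gene response networks in rice under ABA and environmental stress conditions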

Zhang Xiaopeng Huang Zhixing Zhou Jie Xiu Naihua Zhong Yang Chen Fan 《植物学报》 (Chinese Bulletin of Botany) 2009,44(2):159-166

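Current microarray data analysis methods are all based on the assumption that genes with similar expression patterns are likely to have similar biological functions. In practice, however, genes that participate in the same biological function are related in the timing and location of their expression rather than showing similar patterns. Using rice cDNA microarrays, we studied gene expression in rice under ABA treatment and under drought, cold, and high-salinity stress. Genes related to environmental stress and ABA response were selected, and the shortest path method was applied, using software written in-house, to construct shortest paths between genes whose expression patterns are not directly correlated. The study shows that analyzing the expression data of these genes can reveal their functional associations. It also explores function prediction for unknown genes, laying a foundation for constructing the molecular response network of rice under ABA and environmental stress conditions.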
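The shortest path idea can be sketched with Dijkstra's algorithm over a graph whose edges connect strongly correlated genes. This is a minimal illustration, not the authors' in-house software: the edge rule (an absolute correlation threshold), the distance `1 - |r|`, and the gene names and values are all assumptions made for the example.

```python
import heapq


def correlation_graph(correlations, threshold=0.8):
    """Build an undirected weighted graph from pairwise correlations.

    The threshold and the distance 1 - |r| are illustrative assumptions.
    """
    graph = {}
    for (a, b), r in correlations.items():
        if abs(r) >= threshold:
            graph.setdefault(a, {})[b] = 1 - abs(r)
            graph.setdefault(b, {})[a] = 1 - abs(r)
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (distance, path) or (inf, []) if unreachable."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break  # the first time the target is popped its distance is final
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


# Hypothetical genes: geneA and geneC are not directly correlated,
# but both correlate strongly with geneB.
correlations = {
    ("geneA", "geneB"): 0.9,
    ("geneB", "geneC"): -0.85,
    ("geneA", "geneC"): 0.1,
}
graph = correlation_graph(correlations)
distance, path = shortest_path(graph, "geneA", "geneC")
```

Here the weak direct correlation between `geneA` and `geneC` produces no edge, so the two genes are linked only through `geneB`, which mirrors the abstract's point that functionally related genes need not share a similar expression pattern.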

18.

Variable selection and pattern recognition with gene expression data generated by the microarray technology

Szabo A Boucher K Carroll WL Klebanov LB Tsodikov AD Yakovlev AY 《Mathematical biosciences》2002,176(1):71-98

Lack of adequate statistical methods for the analysis of microarray data remains the most critical deterrent to uncovering the true potential of these promising techniques in basic and translational biological studies. The popular practice of drawing important biological conclusions from just one replicate (slide) should be discouraged. In this paper, we discuss some modern trends in statistical analysis of microarray data with a special focus on statistical classification (pattern recognition) and variable selection. In addressing these issues we consider the utility of some distances between random vectors and their nonparametric estimates obtained from gene expression data. Performance of the proposed distances is tested by computer simulations and analysis of gene expression data on two different types of human leukemia. In experimental settings, the error rate is estimated by cross-validation, while a control sample is generated in computer simulation experiments aimed at testing the proposed gene selection procedures and associated classification rules.

19.

Methods for evaluating gene expression from Affymetrix microarray datasets

Ning Jiang Lindsey J Leach Xiaohua Hu Elena Potokina Tianye Jia Arnis Druka Robbie Waugh Michael J Kearsey Zewei W Luo 《BMC bioinformatics》2008,9(1):284

20.

Unfolding of microarray data.

A B Goryachev P F Macgregor A M Edwards 《Journal of computational biology》2001,8(4):443-461