Similar articles
 Found 20 similar articles (search time: 156 ms)
1.
MOTIVATION: Microarray technology allows the monitoring of expression levels for thousands of genes simultaneously. In time-course experiments in which gene expression is monitored over time, we are interested in testing gene expression profiles for different experimental groups. However, no sophisticated analytic methods have yet been proposed to handle time-course experiment data. RESULTS: We propose a statistical test procedure based on the ANOVA model to identify genes that have different gene expression profiles among experimental groups in time-course experiments. In particular, we propose a permutation test that does not require the normality assumption. For this test, we use residuals from the ANOVA model with time effects only. Using this test, we detect genes that have different gene expression profiles among experimental groups. The proposed model is illustrated using cDNA microarrays of 3840 genes obtained in an experiment to search for changes in gene expression profiles during neuronal differentiation of cortical stem cells.
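
The abstract above gives only the outline of the permutation test, so the following is a minimal sketch under stated assumptions rather than the authors' procedure. It assumes a single gene stored as `y[g][t]` (a list of replicates per group and time point), takes residuals from a time-effects-only fit by subtracting each time point's overall mean, and shuffles group labels within each time point. The function name `permutation_pvalue` and the test statistic (a sum of squared cell means) are hypothetical choices made for illustration.

```python
import random

def permutation_pvalue(y, n_perm=999, seed=0):
    """y[g][t] lists the replicate values of one gene for group g at time point t."""
    rng = random.Random(seed)
    n_groups, n_times = len(y), len(y[0])

    # Residuals from a time-effects-only model: subtract each time point's overall mean.
    blocks = []  # blocks[t] = list of (group label, residual) pairs at time t
    for t in range(n_times):
        pairs = [(g, v) for g in range(n_groups) for v in y[g][t]]
        mean_t = sum(v for _, v in pairs) / len(pairs)
        blocks.append([(g, v - mean_t) for g, v in pairs])

    def statistic(labelled_blocks):
        # Assumed statistic: sum over (group, time) cells of n_cell * (cell mean)^2.
        total = 0.0
        for block in labelled_blocks:
            for g in range(n_groups):
                vals = [r for label, r in block if label == g]
                total += len(vals) * (sum(vals) / len(vals)) ** 2
        return total

    observed = statistic(blocks)
    exceed = 0
    for _ in range(n_perm):
        permuted = []
        for block in blocks:
            labels = [g for g, _ in block]
            rng.shuffle(labels)  # reassign group labels within this time point
            permuted.append(list(zip(labels, [r for _, r in block])))
        if statistic(permuted) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)
```

Because the null distribution comes from relabelling the residuals, no normality assumption enters the p-value. In a real analysis the function would be run once per gene and followed by a multiple-testing adjustment.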

2.
We propose an algorithm for selecting and clustering genes according to their time-course or dose-response profiles using gene expression data. The proposed algorithm is based on the order-restricted inference methodology developed in statistics. We describe the methodology for time-course experiments although it is applicable to any ordered set of treatments. Candidate temporal profiles are defined in terms of inequalities among mean expression levels at the time points. The proposed algorithm selects genes when they meet a bootstrap-based criterion for statistical significance and assigns each selected gene to the best fitting candidate profile. We illustrate the methodology using data from a cDNA microarray experiment in which a breast cancer cell line was stimulated with estrogen for different time intervals. In this example, our method was able to identify several biologically interesting genes that previous analyses failed to reveal.
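
One candidate profile of the kind described above is a monotone increase, i.e. mean expression that never decreases from one time point to the next. As an illustration only, fitting the observed means under that inequality constraint can be done with the standard pool-adjacent-violators algorithm. The function name `pava_increasing` is hypothetical, and the sketch covers neither the other candidate profiles nor the bootstrap-based selection step.

```python
def pava_increasing(means, weights=None):
    """Weighted least-squares fit of a non-decreasing sequence (pool-adjacent-violators)."""
    if weights is None:
        weights = [1.0] * len(means)
    blocks = []  # each block holds [fitted value, total weight, number of time points]
    for m, w in zip(means, weights):
        blocks.append([m, w, 1])
        # Merge adjacent blocks while they violate the ordering constraint.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, c2 = blocks.pop()
            v1, w1, c1 = blocks.pop()
            blocks.append([(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    fitted = []
    for value, _, count in blocks:
        fitted.extend([value] * count)
    return fitted
```

For example, the means `[1.0, 3.0, 2.0, 4.0]` violate the ordering at the second and third time points, so those two are pooled and the fit becomes `[1.0, 2.5, 2.5, 4.0]`.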

3.
MOTIVATION: Classifying genes into clusters depending on their expression profiles is one of the most important analysis techniques for microarray data. Because temporal gene expression profiles are indicative of the dynamic functional properties of genes, the application of clustering analysis to time-course data allows a more precise division of genes into functional classes. Conventional clustering methods treat the sampling data at each time point as data obtained under different experimental conditions without considering the continuity of time-course data between time periods t and t+1. Here, we propose a method designated mathematical model-based clustering (MMBC). RESULTS: MMBC was applied to artificial data and to time-course data obtained using Saccharomyces cerevisiae. Our method is able to divide data into clusters more accurately and coherently than conventional clustering methods. Furthermore, MMBC is more tolerant to noise than conventional clustering methods. AVAILABILITY: Software is available upon request. CONTACT: taizo@brs.kyushu-u.ac.jp.

4.
The detection of genes that show similar profiles under different experimental conditions is often an initial step in inferring the biological significance of such genes. Visualization tools are used to identify genes with similar profiles in microarray studies. Given the large number of genes recorded in microarray experiments, gene expression data are generally displayed on a low dimensional plot, based on linear methods. However, microarray data show nonlinearity, due to high-order terms of interaction between genes, so alternative approaches, such as kernel methods, may be more appropriate. We introduce a technique that combines kernel principal component analysis (KPCA) and Biplot to visualize gene expression profiles. Our approach relies on the singular value decomposition of the input matrix and incorporates an additional step that involves KPCA. The main properties of our method are the extraction of nonlinear features and the preservation of the input variables (genes) in the output display. We apply this algorithm to colon tumor, leukemia and lymphoma datasets. Our approach reveals the underlying structure of the gene expression profiles and provides a more intuitive understanding of the gene and sample association.

5.
MOTIVATION: The clustering of gene profiles across some experimental conditions of interest contributes significantly to the elucidation of unknown gene function, the validation of gene discoveries and the interpretation of biological processes. However, this clustering problem is not straightforward as the profiles of the genes are not all independently distributed and the expression levels may have been obtained from an experimental design involving replicated arrays. Ignoring the dependence between the gene profiles and the structure of the replicated data can result in important sources of variability in the experiments being overlooked in the analysis, with the consequent possibility of misleading inferences being made. We propose a random-effects model that provides a unified approach to the clustering of genes with correlated expression levels measured in a wide variety of experimental situations. Our model is an extension of the normal mixture model to account for the correlations between the gene profiles and to enable covariate information to be incorporated into the clustering process. Hence the model is applicable to longitudinal studies with or without replication, for example, time-course experiments by using time as a covariate, and to cross-sectional experiments by using categorical covariates to represent the different experimental classes. RESULTS: We show that our random-effects model can be fitted by maximum likelihood via the EM algorithm for which the E (expectation) and M (maximization) steps can be implemented in closed form. Hence our model can be fitted deterministically without the need for time-consuming Monte Carlo approximations. The effectiveness of our model-based procedure for the clustering of correlated gene profiles is demonstrated on three real datasets, representing typical microarray experimental designs, covering time-course, repeated-measurement and cross-sectional data. In these examples, relevant clusters of the genes are obtained, which are supported by existing gene-function annotation. A synthetic dataset is also considered. AVAILABILITY: A Fortran program called EMMIX-WIRE (EM-based MIXture analysis WIth Random Effects) is available on request from the corresponding author.
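
To make the phrase "E and M steps in closed form" concrete, here is a minimal sketch of EM for the plain normal mixture model that the abstract says its method extends. It is a deliberate simplification: two components, one-dimensional data, and no random effects or covariates, so it does not reproduce EMMIX-WIRE. The function name `em_two_normals`, the initialisation at the data extremes and the variance floor of `1e-6` are all assumptions made for illustration.

```python
import math

def em_two_normals(x, n_iter=100):
    """Fit a two-component 1-D normal mixture by EM; returns (weights, means, variances)."""
    n = len(x)
    mean_x = sum(x) / n
    overall_var = sum((v - mean_x) ** 2 for v in x) / n
    pi = [0.5, 0.5]
    mu = [min(x), max(x)]              # crude initialisation at the data extremes
    var = [overall_var, overall_var]
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for v in x:
            dens = [
                pi[k] * math.exp(-(v - mu[k]) ** 2 / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: closed-form weighted updates of weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / n
            mu[k] = sum(r[k] * v for r, v in zip(resp, x)) / nk
            # Variance floor (an assumption) guards against degenerate components.
            var[k] = max(sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, x)) / nk, 1e-6)
    return pi, mu, var
```

Both steps are explicit formulas, so each iteration is deterministic and needs no Monte Carlo approximation, which is the property the abstract highlights for its richer model.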

6.

Background  

Time-course microarray experiments are widely used to study the temporal profiles of gene expression. Storey et al. (2005) developed a method for analyzing time-course microarray studies that can be applied to discovering genes whose expression trajectories change over time within a single biological group, or those that follow different time trajectories among multiple groups. They estimated the expression trajectories of each gene using natural cubic splines under the null (no time-course) and alternative (time-course) hypotheses, and used a goodness of fit test statistic to quantify the discrepancy. The null distribution of the statistic was approximated through a bootstrap method. Gene expression levels in microarray data are often correlated in complex ways. Accurate type I error control that adjusts for multiple testing requires the joint null distribution of test statistics for a large number of genes. For this purpose, permutation methods have been widely used because of computational ease and their intuitive interpretation.

7.
Time course microarray experiments designed to characterize the dynamic regulation of gene expression in biological systems are becoming increasingly important. One critical issue that arises when examining time course microarray data is the identification of genes that show different temporal expression patterns among biological conditions. Here we propose a Bayesian hierarchical model to incorporate important experimental factors and to account for correlated gene expression measurements over time and over different genes. A new gene selection algorithm is also presented with the model to simultaneously identify genes that show changes in expression among biological conditions, in response to time and other experimental factors of interest. The algorithm performs well in terms of the false positive and false negative rates in simulation studies. The methodology is applied to a mouse model time course experiment to correlate temporal changes in azoxymethane-induced gene expression profiles with colorectal cancer susceptibility.

8.

Background  

In a time-course microarray experiment, the expression level for each gene is observed across a number of time-points in order to characterize the temporal trajectories of the gene-expression profiles. For many of these experiments, the scientific aim is the identification of genes for which the trajectories depend on an experimental or phenotypic factor. There is an extensive recent body of literature on statistical methodology for addressing this analytical problem. Most of the existing methods are based on estimating the time-course trajectories using parametric or non-parametric mean regression methods. The sensitivity of these regression methods to outliers, an issue that is well documented in the statistical literature, should be of concern when analyzing microarray data.

9.
10.

Background  

Time-course microarray experiments are increasingly used to characterize dynamic biological processes. In these experiments, the goal is to identify genes that are differentially expressed over time between different biological conditions. These differentially expressed genes can reveal how biological processes change with the condition, which is essential for understanding differences in dynamics.

11.
12.

Background  

Time-course microarray experiments produce vector gene expression profiles across a series of time points. Clustering genes based on these profiles is important in discovering functionally related and co-regulated genes. Early clustering algorithms do not take advantage of the ordering in a time-course study, explicit use of which should allow more sensitive detection of genes that display a consistent pattern over time. Peddada et al. [1] proposed a clustering algorithm that can incorporate the temporal ordering using order-restricted statistical inference. This algorithm is, however, very time-consuming and hence inapplicable to most microarray experiments that contain a large number of genes. Its computational burden also makes it difficult to assess clustering reliability, which is a very important measure when clustering noisy microarray data.

13.
MOTIVATION: Time series experiments of cDNA microarrays have been commonly used in various biological studies and conducted under many experimental factors. A popular approach to time series microarray analysis is to compare one gene with another in their expression profiles, and clustering expression sequences is a typical example. On the other hand, a practically important issue in gene expression is to identify the general timing difference that is caused by experimental factors. This type of difference can be extracted by comparing a set of time series expression profiles under one factor with those under another factor, so it would be difficult to tackle this issue using only current approaches to time series microarray analysis. RESULTS: We have developed a systematic method to capture the timing difference in gene expression under different experimental factors, based on hidden Markov models. Our model outputs a real-valued vector at each state and has a unique state transition diagram. The parameters of our model are trained from a given set of pairwise (generally multiplewise) expression sequences. We evaluated our model using synthetic as well as real microarray datasets. The results of our experiment indicate that our method performed well in identifying the timing order under different experimental factors; for example, gene expression under heat shock tended to start earlier than that under oxidative stress. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

14.
DNA microarray technology allows researchers to monitor the expression of thousands of genes under different conditions, and to measure the levels of thousands of different DNA molecules at a given point in the life of an organism, tissue or cell. A wide variety of different diseases that are characterised by unregulated gene expression, DNA replication, cell division and cell death can be detected early using microarrays. One of the major objectives of microarray experiments is to identify differentially expressed genes under various conditions. The detection of differential gene expression under two different conditions is very important in biological studies, and allows us to identify experimental variables that affect different biological processes. Most of the tests available in the literature are based on the assumption of normal distribution. However, the assumption of normality may not be true in real-life data, particularly with respect to microarray data. A test is proposed for the identification of differentially expressed genes in replicated microarray experiments conducted under two different conditions. The proposed test makes no assumption about the distribution of the parent population; thus, it is strictly nonparametric in nature. We calculate the p-value and the asymptotic power function of the proposed test statistic. The proposed test statistic is compared with some of its competitors under normal, gamma and exponential population setups using the Monte Carlo simulation technique. The application of the proposed test statistic is presented using microarray data. The proposed test is robust and highly efficient when populations are non-normal.

15.
MOTIVATION: Association pattern discovery (APD) methods have been successfully applied to gene expression data. They find groups of co-regulated genes in which the genes are either up- or down-regulated throughout the identified conditions. These methods, however, fail to identify similarly expressed genes whose expressions change between up- and down-regulation from one condition to another. In order to discover these hidden patterns, we propose the concept of mining co-regulated gene profiles. Co-regulated gene profiles contain two gene sets such that genes within the same set behave identically (up or down) while genes from different sets display contrary behavior. To reduce and group the large number of similar resulting patterns, we propose a new similarity measure that can be applied together with hierarchical clustering methods. RESULTS: We tested our proposed method on two well-known yeast microarray data sets. Our implementation mined the data effectively and discovered patterns of co-regulated genes that are hidden from traditional APD methods. The high content of biologically relevant information in these patterns is demonstrated by the significant enrichment of co-regulated genes with similar functions. Our experimental results show that the Mining Attribute Profile (MAP) method is an efficient tool for the analysis of gene expression data and competitive with bi-clustering techniques.

16.
17.
The major goal of two-color cDNA microarray experiments is to measure the relative gene expression level (i.e., relative amount of mRNA) of each gene between samples in studies of gene expression. More specifically, given an N-sample experiment, we need all N(N - 1)/2 relative expression levels of all sample pairs of each gene for identification of the differentially expressed genes and for clustering of gene expression patterns. However, the intensities observed from two-color cDNA microarray experiments do not simply represent the relative gene expression level. They are composed of signal (gene expression level), noise, and other factors. In discussions on the experimental design of two-color cDNA microarray experiments, little attention has been given to the fact that different combinations of test and control samples will produce microarray intensity data with varying intrinsic composition of factors. As a consequence, not all experimental designs for two-color cDNA microarray experiments are able to provide all possible relative gene expression levels. This phenomenon has never been addressed. To obtain all possible relative gene expression levels, a novel method for evaluating two-color cDNA microarray experimental designs is necessary, one that allows an accurate choice to be made. In this study, we propose a model-based approach to illustrate how the factor composition of microarray intensities changes with different experimental designs in two-color cDNA microarray experiments. By analyzing 12 experimental designs (including 5 general forms), we demonstrate that not all experimental designs are able to provide all possible relative gene expression levels due to the differences in factor composition. Our results indicate that whether an experimental design can provide all possible relative expression levels of all sample pairs for each gene should be the first criterion to be considered in an evaluation of experimental designs for two-color cDNA microarray experiments.

18.
Hong F, Li H. Biometrics, 2006, 62(2): 534-544
Time-course studies of gene expression are essential in biomedical research to understand biological phenomena that evolve in a temporal fashion. We introduce a functional hierarchical model for detecting temporally differentially expressed (TDE) genes between two experimental conditions for cross-sectional designs, where the gene expression profiles are treated as functional data and modeled by basis function expansions. A Monte Carlo EM algorithm was developed for estimating both the gene-specific parameters and the hyperparameters in the second level of modeling. We use a direct posterior probability approach to bound the rate of false discovery at a pre-specified level and evaluate the methods by simulations and application to microarray time-course gene expression data on Caenorhabditis elegans developmental processes. Simulation results suggested that the procedure performs better than the two-way ANOVA in identifying TDE genes, resulting in both higher sensitivity and specificity. Genes identified from the C. elegans developmental data set show clear patterns of changes between the two experimental conditions.
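
The abstract does not spell out its selection rule, so the following sketch shows one common form of a direct posterior probability approach and should be read as an assumption. Genes are ranked by their posterior probability of not being differentially expressed, and the list is extended for as long as the average of those probabilities over the selected genes stays at or below the target level. The function name `select_by_posterior_fdr` and the input `null_probs` are hypothetical; in practice the probabilities would come from the fitted hierarchical model.

```python
def select_by_posterior_fdr(null_probs, alpha=0.05):
    """Return indices of selected genes, keeping the mean posterior null probability <= alpha."""
    # Rank genes from most to least likely to be differentially expressed.
    order = sorted(range(len(null_probs)), key=lambda i: null_probs[i])
    selected, running = [], 0.0
    for i in order:
        # The running mean estimates the false discovery rate of the current list.
        if (running + null_probs[i]) / (len(selected) + 1) > alpha:
            break
        running += null_probs[i]
        selected.append(i)
    return selected
```

Because the genes are visited in increasing order of null probability, the running mean never decreases, so stopping at the first violation yields the largest list that satisfies the bound.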

19.
20.