首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
《Genomics》2019,111(4):636-641
High-throughput time-series data have a special value for studying the dynamism of biological systems. However, the interpretation of such complex data can be challenging. The aim of this study was to compare common algorithms recently developed for the detection of differentially expressed genes in time-course microarray data. Using different measures such as sensitivity, specificity, predictive values, and related signaling pathways, we found that limma, timecourse, and gprege have reasonably good performance for the analysis of datasets in which only test group is followed over time. However, limma has the additional advantage of being able to report significance cut off, making it a more practical tool. In addition, limma and TTCA can be satisfactorily used for datasets with time-series data for all experimental groups. These findings may assist investigators to select appropriate tools for the detection of differentially expressed genes as an initial step in the interpretation of time-course big data.  相似文献   

2.
Hong F  Li H 《Biometrics》2006,62(2):534-544
Time-course studies of gene expression are essential in biomedical research to understand biological phenomena that evolve in a temporal fashion. We introduce a functional hierarchical model for detecting temporally differentially expressed (TDE) genes between two experimental conditions for cross-sectional designs, where the gene expression profiles are treated as functional data and modeled by basis function expansions. A Monte Carlo EM algorithm was developed for estimating both the gene-specific parameters and the hyperparameters in the second level of modeling. We use a direct posterior probability approach to bound the rate of false discovery at a pre-specified level and evaluate the methods by simulations and application to microarray time-course gene expression data on Caenorhabditis elegans developmental processes. Simulation results suggested that the procedure performs better than the two-way ANOVA in identifying TDE genes, resulting in both higher sensitivity and specificity. Genes identified from the C. elegans developmental data set show clear patterns of changes between the two experimental conditions.  相似文献   

3.
Experiments that longitudinally collect RNA sequencing (RNA-seq) data can provide transformative insights in biology research by revealing the dynamic patterns of genes. Such experiments create a great demand for new analytic approaches to identify differentially expressed (DE) genes based on large-scale time-course count data. Existing methods, however, are suboptimal with respect to power and may lack theoretical justification. Furthermore, most existing tests are designed to distinguish among conditions based on overall differential patterns across time, though in practice, a variety of composite hypotheses are of more scientific interest. Finally, some current methods may fail to control the false discovery rate. In this paper, we propose a new model and testing procedure to address the above issues simultaneously. Specifically, conditional on a latent Gaussian mixture with evolving means, we model the data by negative binomial distributions. Motivated by Storey (2007) and Hwang and Liu (2010), we introduce a general testing framework based on the proposed model and show that the proposed test enjoys the optimality property of maximum average power. The test allows not only identification of traditional DE genes but also testing of a variety of composite hypotheses of biological interest. We establish the identifiability of the proposed model, implement the proposed method via efficient algorithms, and demonstrate its good performance via simulation studies. The procedure reveals interesting biological insights, when applied to data from an experiment that examines the effect of varying light environments on the fundamental physiology of the marine diatom Phaeodactylum tricornutum.  相似文献   

4.
Identification of differentially expressed (DE) genes across two conditions is a common task with microarray. Most existing approaches accomplish this goal by examining each gene separately based on a model and then control the false discovery rate over all genes. We took a different approach that employs a uniform platform to simultaneously depict the dynamics of the gene trajectories for all genes and select differently expressed genes. A new Functional Principal Component (FPC) approach is developed for time-course microarray data to borrow strength across genes. The approach is flexible as the temporal trajectory of the gene expressions is modeled nonparametrically through a set of orthogonal basis functions, and often fewer basis functions are needed to capture the shape of the gene expression trajectory than existing nonparametric methods. These basis functions are estimated from the data reflecting major modes of variation in the data. The correlation structure of the gene expressions over time is also incorporated without any parametric assumptions and estimated from all genes such that the information across other genes can be shared to infer one individual gene. Estimation of the parameters is carried out by an efficient hybrid EM algorithm. The performance of the proposed method across different scenarios was compared favorably in simulation to two-way mixed-effects ANOVA and the EDGE method using B-spline basis function. Application to the real data on C. elegans developmental stages also suggested that FPC analysis combined with hybrid EM algorithm provides a computationally fast and efficient method for identifying DE genes based on time-course microarray data.  相似文献   

5.
DNA microarray experiments have generated large amount of gene expression measurements across different conditions. One crucial step in the analysis of these data is to detect differentially expressed genes. Some parametric methods, including the two-sample t-test (T-test) and variations of it, have been used. Alternatively, a class of non-parametric algorithms, such as the Wilcoxon rank sum test (WRST), significance analysis of microarrays (SAM) of Tusher et al. (2001), the empirical Bayesian (EB) method of Efron et al. (2001), etc., have been proposed. Most available popular methods are based on t-statistic. Due to the quality of the statistic that they used to describe the difference between groups of data, there are situations when these methods are inefficient, especially when the data follows multi-modal distributions. For example, some genes may display different expression patterns in the same cell type, say, tumor or normal, to form some subtypes. Most available methods are likely to miss these genes. We developed a new non-parametric method for selecting differentially expressed genes by relative entropy, called SDEGRE, to detect differentially expressed genes by combining relative entropy and kernel density estimation, which can detect all types of differences between two groups of samples. The significance of whether a gene is differentially expressed or not can be estimated by resampling-based permutations. We illustrate our method on two data sets from Golub et al. (1999) and Alon et al. (1999). Comparing the results with those of the T-test, the WRST and the SAM, we identified novel differentially expressed genes which are of biological significance through previous biological studies while they were not detected by the other three methods. The results also show that the genes selected by SDEGRE have a better capability to distinguish the two cell types.  相似文献   

6.
Cross-species research in drug development is novel and challenging. A bivariate mixture model utilizing information across two species was proposed to solve the fundamental problem of identifying differentially expressed genes in microarray experiments in order to potentially improve the understanding of translation between preclinical and clinical studies for drug development. The proposed approach models the joint distribution of treatment effects estimated from independent linear models. The mixture model posits up to nine components, four of which include groups in which genes are differentially expressed in both species. A comprehensive simulation to evaluate the model performance and one application on a real world data set, a mouse and human type II diabetes experiment, suggest that the proposed model, though highly structured, can handle various configurations of differential gene expression and is practically useful on identifying differentially expressed genes, especially when the magnitude of differential expression due to different treatment intervention is weak. In the mouse and human application, the proposed mixture model was able to eliminate unimportant genes and identify a list of genes that were differentially expressed in both species and could be potential gene targets for drug development.  相似文献   

7.
Commonly accepted intensity-dependent normalization in spotted microarray studies takes account of measurement errors in the differential expression ratio but ignores measurement errors in the total intensity, although the definitions imply the same measurement error components are involved in both statistics. Furthermore, identification of differentially expressed genes is usually considered separately following normalization, which is statistically problematic. By incorporating the measurement errors in both total intensities and differential expression ratios, we propose a measurement-error model for intensity-dependent normalization and identification of differentially expressed genes. This model is also flexible enough to incorporate intra-array and inter-array effects. A Bayesian framework is proposed for the analysis of the proposed measurement-error model to avoid the potential risk of using the common two-step procedure. We also propose a Bayesian identification of differentially expressed genes to control the false discovery rate instead of the ad hoc thresholding of the posterior odds ratio. The simulation study and an application to real microarray data demonstrate promising results.  相似文献   

8.
Low developmental competence of bovine somatic cell nuclear transfer (SCNT) embryos is a universal problem. Abnormal placentation has been commonly reported in SCNT pregnancies from a number of species. The present study employed Affymetrix bovine expression microarrays to examine global gene expression patterns of SCNT and in vivo produced (AI) blastocysts as well as cotyledons from day‐70 SCNT and AI pregnancies. SCNT and AI embryos and cotyledons were analyzed for differential expression. Also in an attempt to establish a link between abnormal gene expression patterns in early embryos and cotyledons, differentially expressed genes were compared between the two studies. Microarray analysis yielded a list of 28 genes differentially expressed between SCNT and AI blastocysts and 19 differentially expressed cotyledon genes. None of the differentially expressed genes were common to both groups, although major histocompatibility complex I (MHCI) was significant in the embryo data and approached significance in the cotyledon data. This is the first study to report global gene expression patterns in bovine AI and SCNT cotyledons. The embryonic gene expression data reported here adds to a growing body of data that indicates the common occurrence of aberrant gene expression in early SCNT embryos. Mol. Reprod. Dev. 76: 471–482, 2009. © 2008 Wiley‐Liss, Inc.  相似文献   

9.
Park T  Yi SG  Lee S  Lee JK 《BioTechniques》2005,38(3):463-471
Different sources of systematic and random error variations are often observed in cDNA microarray experiments. A simple scatter plot is commonly used to examine outlying slides that have unusual expression patterns or larger variability than other slides. These outlying slides tend to have large impacts on the subsequent analyses, such as identification of differentially expressed genes and clustering analysis. However, it is difficult to select outlying slides rigorously and consistently based on subjective human pattern recognition on their scatter plots. A graphical method and a rigorous diagnostic measure are proposed to detect outlying slides. The proposed graphical method is easy to implement and shown to be quite effective in detecting outlying slides in real microarray data sets. This diagnostic measure is also informative to compare variability among slides. Two cDNA microarray data sets are carefully examined to illustrate the proposed approach. A 3840-gene microarray experiment for neuronal differentiation of cortical stem cells and a 2076-gene microarray experiment for anticancer compound time-course expression of the NCI-60 cancer cell lines.  相似文献   

10.
MOTIVATION: Multi-series time-course microarray experiments are useful approaches for exploring biological processes. In this type of experiments, the researcher is frequently interested in studying gene expression changes along time and in evaluating trend differences between the various experimental groups. The large amount of data, multiplicity of experimental conditions and the dynamic nature of the experiments poses great challenges to data analysis. RESULTS: In this work, we propose a statistical procedure to identify genes that show different gene expression profiles across analytical groups in time-course experiments. The method is a two-regression step approach where the experimental groups are identified by dummy variables. The procedure first adjusts a global regression model with all the defined variables to identify differentially expressed genes, and in second a variable selection strategy is applied to study differences between groups and to find statistically significant different profiles. The methodology is illustrated on both a real and a simulated microarray dataset.  相似文献   

11.

Background  

Time-course microarray experiments are being increasingly used to characterize dynamic biological processes. In these experiments, the goal is to identify genes differentially expressed in time-course data, measured between different biological conditions. These differentially expressed genes can reveal the changes in biological process due to the change in condition which is essential to understand differences in dynamics.  相似文献   

12.
Qin LX  Self SG 《Biometrics》2006,62(2):526-533
Identification of differentially expressed genes and clustering of genes are two important and complementary objectives addressed with gene expression data. For the differential expression question, many "per-gene" analytic methods have been proposed. These methods can generally be characterized as using a regression function to independently model the observations for each gene; various adjustments for multiplicity are then used to interpret the statistical significance of these per-gene regression models over the collection of genes analyzed. Motivated by this common structure of per-gene models, we proposed a new model-based clustering method--the clustering of regression models method, which groups genes that share a similar relationship to the covariate(s). This method provides a unified approach for a family of clustering procedures and can be applied for data collected with various experimental designs. In addition, when combined with per-gene methods for assessing differential expression that employ the same regression modeling structure, an integrated framework for the analysis of microarray data is obtained. The proposed methodology was applied to two microarray data sets, one from a breast cancer study and the other from a yeast cell cycle study.  相似文献   

13.
Bochkina N  Richardson S 《Biometrics》2007,63(4):1117-1125
We consider the problem of identifying differentially expressed genes in microarray data in a Bayesian framework with a noninformative prior distribution on the parameter quantifying differential expression. We introduce a new rule, tail posterior probability, based on the posterior distribution of the standardized difference, to identify genes differentially expressed between two conditions, and we derive a frequentist estimator of the false discovery rate associated with this rule. We compare it to other Bayesian rules in the considered settings. We show how the tail posterior probability can be extended to testing a compound null hypothesis against a class of specific alternatives in multiclass data.  相似文献   

14.
MOTIVATION: Statistical methods based on controlling the false discovery rate (FDR) or positive false discovery rate (pFDR) are now well established in identifying differentially expressed genes in DNA microarray. Several authors have recently raised the important issue that FDR or pFDR may give misleading inference when specific genes are of interest because they average the genes under consideration with genes that show stronger evidence for differential expression. The paper proposes a flexible and robust mixture model for estimating the local FDR which quantifies how plausible each specific gene expresses differentially. RESULTS: We develop a special mixture model tailored to multiple testing by requiring the P-value distribution for the differentially expressed genes to be stochastically smaller than the P-value distribution for the non-differentially expressed genes. A smoothing mechanism is built in. The proposed model gives robust estimation of local FDR for any reasonable underlying P-value distributions. It also provides a single framework for estimating the proportion of differentially expressed genes, pFDR, negative predictive values, sensitivity and specificity. A cervical cancer study shows that the local FDR gives more specific and relevant quantification of the evidence for differential expression that can be substantially different from pFDR. AVAILABILITY: An R function implementing the proposed model is available at http://www.geocities.com/jg_liao/software  相似文献   

15.
16.
Yuan M  Kendziorski C 《Biometrics》2006,62(4):1089-1098
Although both clustering and identification of differentially expressed genes are equally essential in most microarray studies, the two tasks are often conducted without regard to each other. This is clearly not the most efficient way of extracting information. The main aim of this article is to develop a coherent statistical method that can simultaneously cluster and detect differentially expressed genes. Through information sharing between the two tasks, the proposed approach gives more sensible clustering among genes and is more sensitive in identifying differentially expressed genes. The improvement over existing methods is illustrated in both our simulation results and a case study.  相似文献   

17.

Background  

This paper presents a unified framework for finding differentially expressed genes (DEGs) from the microarray data. The proposed framework has three interrelated modules: (i) gene ranking, ii) significance analysis of genes and (iii) validation. The first module uses two gene selection algorithms, namely, a) two-way clustering and b) combined adaptive ranking to rank the genes. The second module converts the gene ranks into p-values using an R-test and fuses the two sets of p-values using the Fisher's omnibus criterion. The DEGs are selected using the FDR analysis. The third module performs three fold validations of the obtained DEGs. The robustness of the proposed unified framework in gene selection is first illustrated using false discovery rate analysis. In addition, the clustering-based validation of the DEGs is performed by employing an adaptive subspace-based clustering algorithm on the training and the test datasets. Finally, a projection-based visualization is performed to validate the DEGs obtained using the unified framework.  相似文献   

18.
19.
This study describes the use of a 15 000 gene microarray developed for the toxicological model species, Pimephales promelas , in investigating the impact of acute and chronic methylmercury exposures in male gonad and liver tissues. The results show significant differences in the individual genes that were differentially expressed in response to each treatment. In liver, a total of 650 genes exhibited significantly ( P < 0·05) altered expression with greater than two-fold differences from the controls in response to acute exposure and a total of 267 genes were differentially expressed in response to chronic exposure. A majority of these genes were downregulated rather than upregulated. Fewer genes were altered in gonad than in liver at both timepoints. A total of 212 genes were differentially expressed in response to acute exposure and 155 genes were altered in response to chronic exposure. Despite the differences in individual genes expressed across treatments, the functional categories that altered genes were associated with showed some similarities. Of interest in light of other studies involving the effects of methylmercury on fish, several genes associated with apoptosis were upregulated in response to both acute and chronic exposures. Induction of apoptosis has been associated with effects on reproduction seen in the previous studies. This study demonstrates the utility of microarray analysis for investigations of the physiological effects of toxicants as well as the time-course of effects that may take place. In addition, it is the first publication to demonstrate the use of this new 15 000 gene microarray for fish biology and toxicology.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号