Similar literature
20 similar documents found
1.
We present a Bayesian hierarchical model for detecting differentially expressed genes that includes simultaneous estimation of array effects, and show how to use the output for choosing lists of genes for further investigation. We give empirical evidence that expression-level dependent array effects are needed, and explore different nonlinear functions as part of our model-based approach to normalization. The model includes gene-specific variances but imposes some necessary shrinkage through a hierarchical structure. Model criticism via posterior predictive checks is discussed. Modeling the array effects (normalization) simultaneously with differential expression gives fewer false positive results. To choose a list of genes, we propose to combine various criteria (for instance, fold change and overall expression) into a single indicator variable for each gene. The posterior distribution of these variables is used to pick the list of genes, thereby taking into account uncertainty in parameter estimates. In an application to mouse knockout data, Gene Ontology annotations over- and underrepresented among the genes on the chosen list are consistent with biological expectations.
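The list-selection idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' model: it assumes posterior draws of each gene's log fold change and overall expression are already available (here they are simulated), and the thresholds `fc_cut`, `expr_cut`, and `min_prob` are hypothetical.

```python
import random


def posterior_indicator_probs(draws, fc_cut=1.0, expr_cut=4.0):
    """Posterior probability that each gene meets both criteria.

    draws[s][g] is a (log_fold_change, overall_expression) pair for
    posterior draw s and gene g. The indicator for a gene is 1 in a draw
    when |log fold change| > fc_cut and overall expression > expr_cut.
    """
    n_draws = len(draws)
    n_genes = len(draws[0])
    probs = []
    for g in range(n_genes):
        hits = sum(
            1
            for s in range(n_draws)
            if abs(draws[s][g][0]) > fc_cut and draws[s][g][1] > expr_cut
        )
        probs.append(hits / n_draws)
    return probs


def choose_genes(probs, min_prob=0.8):
    """Keep genes whose indicator has posterior probability >= min_prob."""
    return [g for g, p in enumerate(probs) if p >= min_prob]


# Simulated stand-in for MCMC output (assumption: three genes, 2000 draws).
rng = random.Random(0)
centres = [(2.0, 8.0), (0.1, 8.0), (2.0, 2.0)]
draws = [
    [(rng.gauss(d, 0.3), rng.gauss(a, 0.3)) for d, a in centres]
    for _ in range(2000)
]
probs = posterior_indicator_probs(draws)
selected = choose_genes(probs)
```

Because the indicator is evaluated draw by draw, a gene with a large but uncertain fold change receives a lower probability than one with a moderately large, well-determined fold change.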

2.
Commonly accepted intensity-dependent normalization in spotted microarray studies takes account of measurement errors in the differential expression ratio but ignores measurement errors in the total intensity, although the definitions imply that the same measurement-error components are involved in both statistics. Furthermore, identification of differentially expressed genes is usually considered separately following normalization, which is statistically problematic. By incorporating the measurement errors in both total intensities and differential expression ratios, we propose a measurement-error model for intensity-dependent normalization and identification of differentially expressed genes. This model is also flexible enough to incorporate intra-array and inter-array effects. A Bayesian framework is proposed for the analysis of the proposed measurement-error model to avoid the potential risk of using the common two-step procedure. We also propose a Bayesian identification of differentially expressed genes to control the false discovery rate instead of the ad hoc thresholding of the posterior odds ratio. The simulation study and an application to real microarray data demonstrate promising results.
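One common way to control a Bayesian false discovery rate from posterior probabilities is sketched below. It is a generic rule, not necessarily the procedure proposed in this paper: genes are ranked by their posterior probability of differential expression, and the largest list whose average posterior probability of being a false discovery stays at or below `alpha` is kept. The function name and the example probabilities are hypothetical.

```python
def bayesian_fdr_list(post_prob_de, alpha=0.05):
    """Return gene indices forming the largest list with expected FDR <= alpha.

    post_prob_de[g] is the posterior probability that gene g is
    differentially expressed, so 1 - post_prob_de[g] is its posterior
    probability of being a false discovery if it is put on the list.
    """
    order = sorted(
        range(len(post_prob_de)), key=lambda g: post_prob_de[g], reverse=True
    )
    chosen = []
    error_sum = 0.0
    for k, g in enumerate(order, start=1):
        error_sum += 1.0 - post_prob_de[g]
        # The running mean is non-decreasing, so stop at the first violation.
        if error_sum / k > alpha:
            break
        chosen = order[:k]
    return chosen


# Hypothetical posterior probabilities for five genes.
example_probs = [0.99, 0.20, 0.97, 0.90, 0.50]
fdr_list = bayesian_fdr_list(example_probs, alpha=0.05)
```

Unlike a fixed cut-off on the posterior odds, the list length here adapts to how strong the evidence is across all genes.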

3.
We consider the problem of identifying differentially expressed genes under different conditions using gene expression microarrays. Because of the many steps involved in the experimental process, from hybridization to image analysis, cDNA microarray data often contain outliers. For example, an outlying data value could occur because of scratches or dust on the surface, imperfections in the glass, or imperfections in the array production. We develop a robust Bayesian hierarchical model for testing for differential expression. Errors are modeled explicitly using a t-distribution, which accounts for outliers. The model includes an exchangeable prior for the variances, which allows different variances for the genes but still shrinks extreme empirical variances. Our model can be used for testing for differentially expressed genes among multiple samples, and it can distinguish between the different possible patterns of differential expression when there are three or more samples. Parameter estimation is carried out using a novel version of Markov chain Monte Carlo that is appropriate when the model puts mass on subspaces of the full parameter space. The method is illustrated using two publicly available gene expression data sets. We compare our method to six other baseline and commonly used techniques, namely the t-test, the Bonferroni-adjusted t-test, significance analysis of microarrays (SAM), Efron's empirical Bayes, and EBarrays in both its lognormal-normal and gamma-gamma forms. In an experiment with HIV data, our method performed better than these alternatives, on the basis of between-replicate agreement and disagreement.

5.

Background  

In many laboratory-based high-throughput microarray experiments, there are very few replicates of gene expression levels. Thus, estimates of gene variances are inaccurate. Visual inspection of graphical summaries of these data usually reveals that heteroscedasticity is present, and the standard approach to address this is to take a log2 transformation. In such circumstances, it is then common to assume that gene variability is constant when an analysis of these data is undertaken. However, this is perhaps too stringent an assumption. More careful inspection reveals that the simple log2 transformation does not remove the problem of heteroscedasticity. An alternative strategy is to assume independent gene-specific variances, although this is again problematic, as variance estimates based on few replications are highly unstable. More meaningful and reliable comparisons of gene expression might be achieved, for different conditions or different tissue samples, where the test statistics are based on accurate estimates of gene variability, a crucial step in the identification of differentially expressed genes.

6.
Logit-t employs a logit transformation for normalization followed by statistical testing at the probe level. Four publicly available datasets, together providing 2,710 known positive incidences of differential expression and 2,913,813 known negative incidences, were used to compare the performance of the statistical tests at the p < 0.01 threshold. Logit-t provided 75% positive predictive value, compared with 5% for Affymetrix Microarray Suite 5, 6% for dChip perfect match (PM)-only, and 9% for Robust Multi-array Analysis. Logit-t provided 70% sensitivity, Microarray Suite 5 provided 46%, dChip provided 53% and Robust Multi-array Analysis provided 63%.
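The two performance measures quoted in this abstract have standard definitions, sketched below together with a short calculation showing why positive predictive value can be low when negatives vastly outnumber positives. The counts of positives and negatives are taken from the abstract; the sensitivity of 0.5 and false positive rate of 0.01 in the last example are hypothetical inputs, not results from the paper.

```python
def positive_predictive_value(true_pos, false_pos):
    """Fraction of genes called positive that are truly positive."""
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos, false_neg):
    """Fraction of truly positive genes that are called positive."""
    return true_pos / (true_pos + false_neg)


def expected_ppv(n_pos, n_neg, sens, false_pos_rate):
    """Expected PPV for a test with the given sensitivity and false positive rate."""
    true_pos = sens * n_pos
    false_pos = false_pos_rate * n_neg
    return positive_predictive_value(true_pos, false_pos)


# Illustrative counts only (assumption, not data from the paper).
example_ppv = positive_predictive_value(true_pos=75, false_pos=25)
example_sens = sensitivity(true_pos=70, false_neg=30)

# Class sizes from the abstract, hypothetical error rates.
imbalanced_ppv = expected_ppv(
    n_pos=2710, n_neg=2913813, sens=0.5, false_pos_rate=0.01
)
```

With roughly a thousand negatives per positive, even a 1% false positive rate produces far more false calls than there are true positives, which is why the two measures are reported separately.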

7.
A number of statistical methods now exist for detecting differential gene expression in experiments with microarray data. In trials under two conditions, a version of the two-sample t statistic is usually used. However, the problem of estimating the power of these tests has so far been insufficiently studied. In this paper, we propose a method to calculate the power of the robust t test for detecting differential gene expression in experiments with twins. We also discuss the results of applying this method to simulated data.
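Power for a twin (paired) design can also be approximated by simulation, as in the minimal sketch below. It uses the ordinary paired t test with normally distributed within-pair differences, not the robust t test or the analytical method of the paper, and it fixes the design at 10 pairs so that a tabulated critical value can be used without external libraries. The effect size and standard deviation are hypothetical.

```python
import math
import random
import statistics

N_PAIRS = 10
# Two-sided 5% critical value of Student's t with N_PAIRS - 1 = 9 degrees of freedom.
T_CRIT = 2.262


def paired_t_statistic(diffs):
    """One-sample t statistic computed on within-pair differences."""
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def simulated_power(effect, sd_diff, n_sims=4000, seed=1):
    """Fraction of simulated experiments in which the paired t test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        diffs = [rng.gauss(effect, sd_diff) for _ in range(N_PAIRS)]
        if abs(paired_t_statistic(diffs)) > T_CRIT:
            rejections += 1
    return rejections / n_sims


power_alt = simulated_power(effect=1.0, sd_diff=1.0)   # true difference present
power_null = simulated_power(effect=0.0, sd_diff=1.0)  # no difference: type I error
```

Running the same function with `effect=0.0` is a quick check that the rejection rate under the null stays near the nominal 5% level.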

8.
During the past five years, several methods have been described that allow the isolation and cloning of stage-specific or cell-specific genes. The characterization of genes expressed at different stages of parasite development is of the utmost importance for the understanding of the mechanisms involved in the regulation of gene expression. Here, Samuel Goldenberg and Marco Aurelio Krieger describe a method for the amplification and cloning of Trypanosoma cruzi genes expressed specifically at different times of the metacyclogenesis process. This method, representation of differential expression (RDE), should be useful for the isolation and cloning of any trypanosomatid gene transcribing differentially expressed messenger RNA.

10.
Gene clustering is a useful exploratory technique to group together genes with similar expression levels under distinct cell cycle phases or distinct conditions. It helps the biologist to identify potentially meaningful relationships between genes. In this study, we propose a clustering method based on multivariate normal mixture models, where the number of clusters is predicted via sequential hypothesis tests: at each step, the method considers a mixture model of m components (m = 2 in the first step) and tests if in fact it should be m - 1. If the hypothesis is rejected, m is increased and a new test is carried out. The method continues (increasing m) until the hypothesis is accepted. The theoretical core of the method is the full Bayesian significance test, an intuitive Bayesian approach, which needs neither model complexity penalization nor positive probabilities for sharp hypotheses. Numerical experiments were based on a cDNA microarray dataset consisting of expression levels of 205 genes belonging to four functional categories, for 10 distinct strains of Saccharomyces cerevisiae. To analyze the method's sensitivity to data dimension, we performed principal components analysis on the original dataset and predicted the number of classes using 2 to 10 principal components. Compared to Mclust (model-based clustering), our method shows more consistent results.
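The sequential search described in this abstract can be sketched as a simple loop. The sketch below shows only the control flow: the test itself is passed in as a callback, and `toy_reject_fewer` is a hypothetical stand-in, not the full Bayesian significance test. Returning m - 1 once the hypothesis is accepted is an interpretation of the abstract, and `max_components` is an added safety cap.

```python
def select_num_clusters(data, reject_fewer, max_components=20):
    """Sequentially test 'm - 1 components suffice' for m = 2, 3, ...

    reject_fewer(data, m) must return True when the hypothesis that the
    m-component mixture is really an (m - 1)-component mixture is rejected.
    The search stops at the first m for which the hypothesis is accepted.
    """
    m = 2
    while m <= max_components:
        if not reject_fewer(data, m):
            return m - 1
        m += 1
    return max_components


def toy_reject_fewer(data, m):
    """Hypothetical stand-in test: reject while m - 1 is below the
    number of distinct labels in the toy data."""
    return (m - 1) < len(set(data))


toy_data = ["a", "b", "c", "a", "b"]
n_clusters = select_num_clusters(toy_data, toy_reject_fewer)
```

In a real analysis the callback would fit the m-component mixture and compute the evidence against the reduced model; only that function needs to change.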

11.
Analyses of gene expression in single cells are important for understanding detailed biological phenomena. Here, a highly sensitive and accurate sequencing-based method (called "bead-seq") to obtain a whole gene expression profile for a single cell is proposed. A key feature of the method is the use of a complementary DNA (cDNA) library on magnetic beads, which makes it possible to add washing steps that remove residual reagents during sample preparation. With the washing steps added, the subsequent steps can be carried out under optimal conditions without losing cDNAs. Error sources were carefully evaluated, leading to the conclusion that the first several steps are the key steps. It is demonstrated that bead-seq is superior to the conventional methods for single-cell gene expression analyses in terms of reproducibility, quantitative accuracy, and biases caused during sample preparation and sequencing processes.

12.
Successful pharmaceutical drug development requires finding correct doses. The issues that conventional dose-response analyses consider, namely whether responses are related to doses, which doses have responses differing from a control dose response, the functional form of a dose-response relationship, and the dose(s) to carry forward, do not need to be addressed simultaneously. Determining if a dose-response relationship exists, regardless of its functional form, and then identifying a range of doses to study further may be a more efficient strategy. This article describes a novel estimation-focused Bayesian approach (BMA-Mod) for carrying out the analyses when the actual dose-response function is unknown. Realizations from Bayesian analyses of linear, generalized linear, and nonlinear regression models that may include random effects and covariates other than dose are optimally combined to produce distributions of important secondary quantities, including test-control differences, predictive distributions of possible outcomes from future trials, and ranges of doses corresponding to target outcomes. The objective is similar to the objective of the hypothesis-testing based MCP-Mod approach, but provides more model and distributional flexibility and does not require testing hypotheses or adjusting for multiple comparisons. A number of examples illustrate the application of the method.
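Combining realizations across candidate models can be illustrated with generic Bayesian model averaging, sketched below. This is not the BMA-Mod procedure itself, whose combination rule the abstract does not specify: it assumes each model already has posterior draws of a test-control difference and a posterior model probability, and it resamples draws in proportion to those probabilities. The model names, probabilities, and simulated draws are hypothetical.

```python
import random
import statistics


def bma_draws(model_draws, model_probs, n_draws, seed=0):
    """Sample from the model-averaged posterior of a quantity of interest.

    model_draws maps a model name to its posterior draws of the quantity;
    model_probs maps the same names to posterior model probabilities.
    Each combined draw picks a model by its probability, then one of that
    model's draws uniformly at random.
    """
    rng = random.Random(seed)
    names = list(model_draws)
    weights = [model_probs[name] for name in names]
    combined = []
    for _ in range(n_draws):
        name = rng.choices(names, weights=weights, k=1)[0]
        combined.append(rng.choice(model_draws[name]))
    return combined


# Simulated stand-ins for per-model posterior draws of a test-control difference.
sim = random.Random(42)
model_draws = {
    "linear": [sim.gauss(1.0, 0.2) for _ in range(1000)],
    "emax": [sim.gauss(3.0, 0.2) for _ in range(1000)],
}
model_probs = {"linear": 0.25, "emax": 0.75}

combined = bma_draws(model_draws, model_probs, n_draws=4000)
bma_mean = statistics.mean(combined)
prob_positive = sum(d > 0 for d in combined) / len(combined)
```

Summaries such as `prob_positive` are read directly from the combined draws, so no hypothesis test or multiplicity adjustment is involved in this sketch.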

17.

Background and Aims

Low soil fertility limits growth and productivity in many natural and agricultural systems, where the ability to sense and respond to nutrient limitation is important for success. Helianthus anomalus is an annual sunflower of hybrid origin that is adapted to desert sand-dune substrates with lower fertility than its parental species, H. annuus and H. petiolaris. Previous studies have shown that H. anomalus has traits generally associated with adaptation to low-fertility habitats, including a lower inherent relative growth rate and longer leaf lifetime.

Methods

Here, a cDNA microarray is used to identify gene expression differences that potentially contribute to increased tolerance of low fertility of the hybrid species by comparing the nitrogen stress response of all three species with high- and low-nutrient treatments.

Key Results

Relative to the set of genes on the microarray, the genes showing differential expression in the hybrid species compared with its parents are enriched in stress-response genes, developmental genes, and genes involved in responses to biotic or abiotic stimuli. After a correction for multiple comparisons, five unique genes show a significantly different response to nitrogen limitation in H. anomalus compared with H. petiolaris and H. annuus. The Arabidopsis thaliana homologue of one of the five genes, catalase 1, has been shown to affect the timing of leaf senescence, and thus leaf lifespan.

Conclusions

The five genes identified in this analysis will be examined further as candidate genes for the adaptive stress response in H. anomalus. Genes that improve growth and productivity under nutrient stress could be used to improve crops for low soil fertility, which is common in marginal agricultural settings.
