Similar articles
3.

Background

With the growing abundance of microarray data, statistical methods are increasingly needed to integrate results across studies. Two common approaches to meta-analysis of microarrays are combining gene expression measures across studies and combining summaries such as p-values, probabilities or ranks. Here, we compare two Bayesian meta-analysis models that are analogous to these methods.

Results

Two Bayesian meta-analysis models for microarray data have recently been introduced. The first model combines standardized gene expression measures across studies into an overall mean, accounting for inter-study variability, while the second combines probabilities of differential expression without combining expression values. Both models produce the gene-specific posterior probability of differential expression, which is the basis for inference. Since the standardized expression integration model includes inter-study variability, it may improve the accuracy of results versus the probability integration model. However, due to the small number of studies typical in microarray meta-analyses, the variability between studies is challenging to estimate. The probability integration model eliminates the need to model variability between studies, and thus its implementation is more straightforward. We found in simulations of two and five studies that combining probabilities outperformed combining standardized gene expression measures for three comparison values: the percent of true discovered genes in meta-analysis versus individual studies, the percent of true genes omitted in meta-analysis versus separate studies, and the number of true discovered genes for fixed levels of Bayesian false discovery. We identified similar results when pooling two independent studies of Bacillus subtilis. We assumed that each study was produced from the same microarray platform with only two conditions, a treatment and a control, and that the data sets were pre-scaled.

Conclusion

The Bayesian meta-analysis model that combines probabilities across studies does not aggregate gene expression measures, so an inter-study variability parameter is not included in the model. This results in a simpler modeling approach than aggregating expression measures, which must account for variability across studies. For our data sets, the probability integration model identified more true discovered genes and fewer true omitted genes than combining expression measures.
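
The abstract above compares models by the number of discoveries at fixed levels of Bayesian false discovery. As a minimal sketch of how such a list can be built from gene-specific posterior probabilities, the Python below uses a common definition in which the expected false discovery rate of a list is the average of (1 - posterior probability) over the listed genes. That definition, the function name `bayesian_fdr_selection`, and the example numbers are assumptions for illustration; the abstract does not state the exact rule the authors used.

```python
def bayesian_fdr_selection(post_probs, alpha):
    """Return indices of genes declared differentially expressed.

    post_probs: posterior probability of differential expression, one per gene.
    alpha: target expected (Bayesian) false discovery rate for the list.
    """
    # Rank genes from most to least probable to be differentially expressed.
    order = sorted(range(len(post_probs)), key=lambda g: post_probs[g], reverse=True)
    selected = []
    error_sum = 0.0  # running sum of (1 - p_g): expected number of false discoveries
    for g in order:
        new_sum = error_sum + (1.0 - post_probs[g])
        # The running average can only grow as weaker genes are added,
        # so the first violation marks the largest admissible list.
        if new_sum / (len(selected) + 1) > alpha:
            break
        error_sum = new_sum
        selected.append(g)
    return selected


# Hypothetical posterior probabilities for five genes.
probs = [0.99, 0.95, 0.60, 0.20, 0.90]
print(bayesian_fdr_selection(probs, alpha=0.10))
```

Because both models in the abstract output posterior probabilities, the same selection rule could in principle be applied to either one, which is what makes the "discoveries at a fixed false discovery level" comparison possible.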

4.
Time course microarray experiments designed to characterize the dynamic regulation of gene expression in biological systems are becoming increasingly important. One critical issue that arises when examining time course microarray data is the identification of genes that show different temporal expression patterns among biological conditions. Here we propose a Bayesian hierarchical model to incorporate important experimental factors and to account for correlated gene expression measurements over time and over different genes. A new gene selection algorithm is also presented with the model to simultaneously identify genes that show changes in expression among biological conditions, in response to time and other experimental factors of interest. The algorithm performs well in terms of the false positive and false negative rates in simulation studies. The methodology is applied to a mouse model time course experiment to correlate temporal changes in azoxymethane-induced gene expression profiles with colorectal cancer susceptibility.

5.
MOTIVATION: Identifying groups of co-regulated genes by monitoring their expression over various experimental conditions is complicated by the fact that such co-regulation is condition-specific. Ignoring the context-specific nature of co-regulation significantly reduces the ability of clustering procedures to detect co-expressed genes due to additional 'noise' introduced by non-informative measurements. RESULTS: We have developed a novel Bayesian hierarchical model and corresponding computational algorithms for clustering gene expression profiles across diverse experimental conditions and studies that accounts for context-specificity of gene expression patterns. The model is based on the Bayesian infinite mixtures framework and does not require a priori specification of the number of clusters. We demonstrate that explicit modeling of context-specificity results in increased accuracy of the cluster analysis by examining the specificity and sensitivity of clusters in microarray data. We also demonstrate that probabilities of co-expression derived from the posterior distribution of clusterings are valid estimates of statistical significance of created clusters. AVAILABILITY: The open-source package gimm is available at http://eh3.uc.edu/gimm.

6.
Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model), genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters.
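
The second model in the abstract above, a Moran process with selection for clustering and rearrangement by inversion, can be illustrated with a toy sketch. Everything specific here is an assumption rather than the authors' implementation: the linear genome represented as a Python list, the fitness rule (1 + s when the pathway genes are contiguous, 1 otherwise), the per-birth inversion probability `mu`, and all function names.

```python
import random


def span(genome, pathway):
    """Length of the stretch from the first to the last pathway gene."""
    positions = [i for i, gene in enumerate(genome) if gene in pathway]
    return positions[-1] - positions[0] + 1


def is_clustered(genome, pathway):
    """True when the pathway genes sit next to each other with no gaps."""
    return span(genome, pathway) == len(pathway)


def invert(genome, i, j):
    """Return a copy of the genome with the segment i..j (inclusive) reversed."""
    return genome[:i] + genome[i:j + 1][::-1] + genome[j + 1:]


def moran_step(population, pathway, s, mu, rng):
    """One birth-death event: fitness-weighted birth, uniform random death."""
    # Assumed fitness rule: clustered genomes get a selective advantage s.
    weights = [1.0 + s if is_clustered(g, pathway) else 1.0 for g in population]
    parent = rng.choices(population, weights=weights, k=1)[0]
    child = list(parent)
    # With probability mu the offspring carries one random inversion.
    if rng.random() < mu:
        i, j = sorted(rng.sample(range(len(child)), 2))
        child = invert(child, i, j)
    # The offspring replaces a uniformly chosen individual, so size stays fixed.
    population[rng.randrange(len(population))] = child


# Hypothetical run: 20 genomes of 8 genes, pathway genes 1 and 5 start apart.
rng = random.Random(0)
population = [list(range(8)) for _ in range(20)]
pathway = {1, 5}
for _ in range(2000):
    moran_step(population, pathway, s=0.1, mu=0.05, rng=rng)
clustered = sum(is_clustered(g, pathway) for g in population)
print(clustered, "of", len(population), "genomes have the pathway clustered")
```

Varying the pathway size, the population size, `s`, and `mu` in a sketch like this is one way to probe the kind of parameter dependence the abstract reports, though the toy model omits horizontal transfer and deletions.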

7.
A Bayesian network classification methodology for gene expression data.   Total citations: 5 (self-citations: 0; citations by others: 5)
We present new techniques for the application of a Bayesian network learning framework to the problem of classifying gene expression data. The focus on classification permits us to develop techniques that address in several ways the complexities of learning Bayesian nets. Our classification model reduces the Bayesian network learning problem to the problem of learning multiple subnetworks, each consisting of a class label node and its set of parent genes. We argue that this classification model is more appropriate for the gene expression domain than are other structurally similar Bayesian network classification models, such as Naive Bayes and Tree Augmented Naive Bayes (TAN), because our model is consistent with prior domain experience suggesting that a relatively small number of genes, taken in different combinations, is required to predict most clinical classes of interest. Within this framework, we consider two different approaches to identifying parent sets which are supported by the gene expression observations and any other currently available evidence. One approach employs a simple greedy algorithm to search the universe of all genes; the second approach develops and applies a gene selection algorithm whose results are incorporated as a prior to enable an exhaustive search for parent sets over a restricted universe of genes. Two other significant contributions are the construction of classifiers from multiple, competing Bayesian network hypotheses and algorithmic methods for normalizing and binning gene expression data in the absence of prior expert knowledge. Our classifiers are developed under a cross validation regimen and then validated on corresponding out-of-sample test sets. The classifiers attain a classification rate in excess of 90% on out-of-sample test sets for two publicly available datasets. We present an extensive compilation of results reported in the literature for other classification methods run against these same two datasets. 
Our results are comparable to, or better than, any we have found reported for these two sets, when a train-test protocol as stringent as ours is followed.

8.
A gene cluster which includes genes required for the expression of nitric oxide reductase in Rhodobacter sphaeroides 2.4.3 has been isolated and characterized. Sequence analysis indicates that the two proximal genes in the cluster are the Nor structural genes. These two genes and four distal genes apparently constitute an operon. Mutational analysis indicates that the two structural genes, norC and norB, and the genes immediately downstream, norQ and norD, are required for expression of an active Nor complex. The remaining two genes, nnrT and nnrU, are required for expression of both Nir and Nor. The products of norCBQD have significant identity with products from other denitrifiers, whereas the predicted nnrT and nnrU gene products have no similarity with products corresponding to other sequences in the database. Mutational analysis and functional complementation studies indicate that the nnrT and nnrU genes can be expressed from an internal promoter. Deletion analysis of the regulatory region upstream of norC indicated that a sequence motif which has identity to a motif in the gene encoding nitrite reductase in strain 2.4.3 is critical for nor operon expression. Regulatory studies demonstrated that the first four genes, norCBQD, are expressed only when the oxygen concentration is low and nitrate is present but that the two distal genes, nnrTU, are expressed constitutively.

12.
Identity gene expression in Proteus mirabilis   Total citations: 1 (self-citations: 0; citations by others: 1)
Swarming colonies of independent Proteus mirabilis isolates recognize each other as foreign and do not merge together, whereas apposing swarms of clonal isolates merge with each other. Swarms of mutants with deletions in the ids gene cluster do not merge with their parent. Thus, ids genes are involved in the ability of P. mirabilis to distinguish self from nonself. Here we have characterized expression of the ids genes. We show that idsABCDEF genes are transcribed as an operon, and we define the promoter region upstream of idsA by deletion analysis. Expression of the ids operon increased in late logarithmic and early stationary phases and appeared to be bistable. Approaching swarms of nonself populations led to increased ids expression and increased the abundance of ids-expressing cells in the bimodal population. This information on ids gene expression provides a foundation for further understanding the molecular details of self-nonself discrimination in P. mirabilis.

13.
Kauermann G, Eilers P. Biometrics, 2004, 60(2): 376-387
An important goal of microarray studies is the detection of genes that show significant changes in expression when two classes of biological samples are being compared. We present an ANOVA-style mixed model with parameters for array normalization, overall level of gene expression, and change of expression between the classes. For the latter we assume a mixing distribution with a probability mass concentrated at zero, representing genes with no changes, and a normal distribution representing the level of change for the other genes. We estimate the parameters by optimizing the marginal likelihood. To make this practical, Laplace approximations and a backfitting algorithm are used. The performance of the model is studied by simulation and by application to publicly available data sets.
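
The mixing distribution described above (a point mass at zero plus a normal component) leads to a simple posterior probability that a gene is changed, once the mixture parameters are treated as known. The sketch below assumes a zero-mean normal component with standard deviation `tau`, a normally distributed observed change `d` with known standard error `se`, and a null proportion `p0`. These simplifications and all names are illustrative assumptions; the paper itself estimates its parameters by marginal likelihood with Laplace approximations and backfitting, which this sketch does not attempt.

```python
import math


def normal_pdf(x, var):
    """Density of a zero-mean normal with variance var, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def prob_changed(d, se, p0, tau):
    """Posterior probability that the true change is non-zero.

    d:   observed change in expression between the two classes
    se:  standard error of d (assumed known)
    p0:  prior probability of no change (the point mass at zero)
    tau: standard deviation of true changes among changed genes (assumed zero-mean)
    """
    # Unchanged genes: d is pure noise with variance se^2.
    null = p0 * normal_pdf(d, se ** 2)
    # Changed genes: the true change adds variance tau^2 on top of the noise.
    alt = (1.0 - p0) * normal_pdf(d, se ** 2 + tau ** 2)
    return alt / (null + alt)


# Hypothetical values: a small and a large observed change.
for d in (0.5, 5.0):
    print(d, prob_changed(d, se=1.0, p0=0.9, tau=2.0))
```

For very extreme values of `d` both densities can underflow to zero, so a production version would work on the log scale.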

14.
Temporal gene expression data are of particular interest to researchers because they contain rich information for characterizing gene function, and they have been widely used in biomedical studies and early cancer detection. However, current temporal gene expression data sets usually have few measurement time points; extracting information and identifying efficient treatment effects without temporal information remain a problem. A dense temporal gene expression data set in bacteria shows that gene expression has various patterns under different biological conditions. Instead of analyzing gene expression levels, in this paper we consider the relative change-rates of genes over the observation period. We propose a non-linear regression model to characterize the relative change-rates of genes, in which each individual expression trajectory is modeled as longitudinal data with a changeable variance and covariance structure. Then, based on the parameter estimates, a chi-square test is proposed to test the equality of gene expression change-rates. Furthermore, the Mahalanobis distance is used for the classification of genes. The proposed methods are applied to a data set of 18 genes in P. aeruginosa expressed under 24 biological conditions. The simulation studies show that our methods perform well for the analysis of temporal gene expression data.
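
The abstract above mentions classifying genes by Mahalanobis distance. A minimal sketch of that idea follows, restricted to two-dimensional feature vectors so that the covariance inverse can be written out by hand. The nearest-group rule, the two-parameter representation of each gene, and the group definitions are hypothetical; the abstract does not say how the groups or their covariances were formed.

```python
def mahalanobis_sq_2d(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point x from a group.

    mean: 2-D group centre; cov: 2x2 covariance matrix as nested sequences.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def classify(x, groups):
    """Assign x to the group with the smallest Mahalanobis distance.

    groups: dict mapping a group name to a (mean, cov) pair.
    """
    return min(groups, key=lambda name: mahalanobis_sq_2d(x, *groups[name]))


# Hypothetical groups described by two estimated parameters per gene.
groups = {
    "slow": ((0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))),
    "fast": ((3.0, 3.0), ((1.0, 0.5), (0.5, 1.0))),
}
print(classify((0.5, 0.2), groups))
print(classify((2.5, 2.9), groups))
```

Unlike Euclidean distance, this measure scales each direction by the group's variance and accounts for correlation between the parameters, which matters when parameter estimates have unequal precision.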

