1.
Background
One of the primary tasks in analysing gene expression data is finding genes that are differentially expressed in different samples. Multiple testing issues due to the thousands of tests run make some of the more popular methods for doing this problematic.
2.
Array-based gene expression studies frequently serve to identify genes that are expressed differently under two or more conditions. The actual analysis of the data, however, may be hampered by a number of technical and statistical problems. Possible remedies on the level of computational analysis lie in appropriate preprocessing steps, proper normalization of the data and application of statistical testing procedures in the derivation of differentially expressed genes. This review summarizes methods that are available for these purposes and provides a brief overview of the available software tools.
3.
Microarrays have been used in a wide variety of experimental systems, but realizing their full potential is contingent on sophisticated and rigorous experimental design and data analysis. This article highlights what is needed to get the most out of microarrays in terms of accurately and effectively revealing differential gene expression and regulation in the nervous system.
4.
During the past five years, several methods have been described that allow the isolation and cloning of stage-specific or cell-specific genes. The characterization of genes expressed at different stages of parasite development is of the utmost importance for the understanding of the mechanisms involved in the regulation of gene expression. Here, Samuel Goldenberg and Marco Aurelio Krieger describe a method for the amplification and cloning of Trypanosoma cruzi genes expressed specifically at different times of the metacyclogenesis process. This method, representation of differential expression (RDE), should be useful for the isolation and cloning of any trypanosomatid gene transcribing differentially expressed messenger RNA.
5.
Belbin TJ, Gaspar J, Haigentz M, Perez-Soler R, Keller SM, Prystowsky MB, Childs G, Socci ND. BioTechniques 2004, 36(2):310-314
The use of universal RNA reference sets is an increasingly common approach to molecular classification studies with cDNA microarrays. Here we evaluated the reliability of indirect measurements of fluorescence ratios with a common RNA reference as a means of identifying differentially expressed genes. Comparisons of direct and indirect measures of differential gene expression showed a strong overall correlation in fluorescence ratio measurements but also a high degree of false positives in our indirect measurements. These results indicated that the application of more stringent ratio filters may be required when assessing differential gene expression utilizing a common RNA reference in classification studies.
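The indirect measurement described in this abstract can be illustrated with a minimal sketch. It assumes that each sample is hybridized against the same reference, so the sample-to-sample ratio is inferred by dividing the two sample-to-reference ratios (a subtraction on the log2 scale). The function names, the example ratios, and the fold thresholds are hypothetical and are not taken from the paper.

```python
import math

def indirect_log2_ratio(a_vs_ref, b_vs_ref):
    """Infer log2(A/B) from two ratios measured against a shared reference."""
    return math.log2(a_vs_ref) - math.log2(b_vs_ref)

def passes_ratio_filter(log2_ratio, min_fold=2.0):
    """Return True when the absolute fold change reaches min_fold."""
    return abs(log2_ratio) >= math.log2(min_fold)

# Hypothetical ratios for one gene: sample A vs reference, sample B vs reference.
indirect = indirect_log2_ratio(a_vs_ref=6.0, b_vs_ref=1.5)

# A looser and a more stringent filter applied to the same indirect estimate.
keeps_at_2_fold = passes_ratio_filter(indirect, min_fold=2.0)
keeps_at_8_fold = passes_ratio_filter(indirect, min_fold=8.0)
```

Raising `min_fold` is one simple way to make the filter more stringent; the abstract does not state which thresholds the authors used.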
6.
Meng T, Chen H, Sun M, Wang H, Zhao G, Wang X. Omics: a journal of integrative biology 2012, 16(6):301-311
The purpose of this study was to perform a comprehensive analysis of gene expression profiles in placentas from preeclamptic pregnancies versus normal placentas. Placental tissues were obtained immediately after delivery from women with normal pregnancies (n=6) and patients with preeclampsia (n=6). The gene expression profile was assessed by oligonucleotide-based DNA microarrays and validated by quantitative real-time RT-PCR. Functional relationships and canonical pathways/networks of differentially-expressed genes were evaluated by GeneSpring GX 11.0 software, and Ingenuity Pathways Analysis (IPA). A total of 939 genes were identified that differed significantly in expression: 483 genes were upregulated and 456 genes were downregulated in preeclamptic placentas compared with normal placentas (fold change ≥ 2 and p<0.05 by unpaired t-test corrected with Bonferroni multiple testing). The IPA revealed that the primary molecular functions of these genes are involved in cellular function and maintenance, cellular development, cell signaling, and lipid metabolism. Pathway analysis provided evidence that a number of biological pathways, including Notch, Wnt, NF-κB, and transforming growth factor-β (TGF-β) signaling pathways, were aberrantly regulated in preeclampsia. In conclusion, our microarray analysis represents a comprehensive list of placental gene expression profiles and various dysregulated signaling pathways that are altered in preeclampsia. These observations may provide the basis for developing novel predictive, diagnostic, and prognostic biomarkers of preeclampsia to improve reproductive outcomes and reduce the risk for subsequent cardiovascular disease.
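The selection rule quoted in this abstract (fold change ≥ 2 and Bonferroni-corrected p<0.05) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the p-values are taken as already computed by a t-test, the fold change is treated symmetrically so that both up- and downregulated genes qualify, and the gene names and numbers are invented.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def select_genes(genes, p_values, fold_changes, alpha=0.05, min_fold=2.0):
    """Keep genes with adjusted p < alpha and at least a min_fold change in either direction."""
    adjusted = bonferroni(p_values)
    selected = []
    for gene, p_adj, fc in zip(genes, adjusted, fold_changes):
        big_change = fc >= min_fold or fc <= 1.0 / min_fold
        if p_adj < alpha and big_change:
            selected.append(gene)
    return selected

# Hypothetical input: four genes with raw p-values and linear-scale fold changes.
genes = ["g1", "g2", "g3", "g4"]
p_values = [0.001, 0.02, 0.0005, 0.004]
fold_changes = [3.0, 2.5, 0.4, 1.2]
hits = select_genes(genes, p_values, fold_changes)
```

In this toy input, `g2` is dropped because its adjusted p-value (0.02 × 4 = 0.08) exceeds 0.05, and `g4` is dropped because its fold change is too small even though its adjusted p-value passes.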
7.
Detecting differential gene expression with a semiparametric hierarchical mixture method (total citations: 11, self-citations: 0, citations by others: 11)
Mixture modeling provides an effective approach to the differential expression problem in microarray data analysis. Methods based on fully parametric mixture models are available, but lack of fit in some examples indicates that more flexible models may be beneficial. Existing, more flexible, mixture models work at the level of one-dimensional gene-specific summary statistics, and so when there are relatively few measurements per gene these methods may not provide sensitive detectors of differential expression. We propose a hierarchical mixture model to provide methodology that is both sensitive in detecting differential expression and sufficiently flexible to account for the complex variability of normalized microarray data. EM-based algorithms are used to fit both parametric and semiparametric versions of the model. We restrict attention to the two-sample comparison problem; an experiment involving Affymetrix microarrays and yeast translation provides the motivating case study. Gene-specific posterior probabilities of differential expression form the basis of statistical inference; they define short gene lists and false discovery rates. Compared to several competing methodologies, the proposed methodology exhibits good operating characteristics in a simulation study, on the analysis of spike-in data, and in a cross-validation calculation.
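One common way in which posterior probabilities can define a gene list and its false discovery rate is sketched below. The assumption, which the abstract does not spell out, is that the estimated FDR of a list is the average of one minus the posterior probability over the genes on the list, and that the list is grown in order of decreasing posterior probability until a target FDR would be exceeded. The function name and the example probabilities are hypothetical.

```python
def gene_list_at_fdr(posteriors, target_fdr=0.05):
    """Return (genes, estimated_fdr) for the largest top-ranked list within target_fdr.

    posteriors maps each gene to its posterior probability of differential expression.
    """
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen = []
    error_sum = 0.0  # running sum of (1 - posterior) over the chosen genes
    for gene, prob in ranked:
        new_sum = error_sum + (1.0 - prob)
        # The running mean only grows as weaker genes are added, so stop at the first breach.
        if new_sum / (len(chosen) + 1) > target_fdr:
            break
        chosen.append(gene)
        error_sum = new_sum
    estimated_fdr = error_sum / len(chosen) if chosen else 0.0
    return chosen, estimated_fdr

# Hypothetical posterior probabilities for four genes.
posteriors = {"a": 0.99, "b": 0.97, "c": 0.90, "d": 0.60}
gene_list, fdr = gene_list_at_fdr(posteriors, target_fdr=0.05)
```

Here gene `d` is excluded because adding it would raise the estimated FDR of the list from roughly 0.047 to 0.135.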
8.
9.
We consider the problem of identifying differentially expressed genes under different conditions using gene expression microarrays. Because of the many steps involved in the experimental process, from hybridization to image analysis, cDNA microarray data often contain outliers. For example, an outlying data value could occur because of scratches or dust on the surface, imperfections in the glass, or imperfections in the array production. We develop a robust Bayesian hierarchical model for testing for differential expression. Errors are modeled explicitly using a t-distribution, which accounts for outliers. The model includes an exchangeable prior for the variances, which allows different variances for the genes but still shrinks extreme empirical variances. Our model can be used for testing for differentially expressed genes among multiple samples, and it can distinguish between the different possible patterns of differential expression when there are three or more samples. Parameter estimation is carried out using a novel version of Markov chain Monte Carlo that is appropriate when the model puts mass on subspaces of the full parameter space. The method is illustrated using two publicly available gene expression data sets. We compare our method to six other baseline and commonly used techniques, namely the t-test, the Bonferroni-adjusted t-test, significance analysis of microarrays (SAM), Efron's empirical Bayes, and EBarrays in both its lognormal-normal and gamma-gamma forms. In an experiment with HIV data, our method performed better than these alternatives, on the basis of between-replicate agreement and disagreement.
10.
Background
In many laboratory-based high throughput microarray experiments, there are very few replicates of gene expression levels. Thus, estimates of gene variances are inaccurate. Visual inspection of graphical summaries of these data usually reveals that heteroscedasticity is present, and the standard approach to address this is to take a log2 transformation. In such circumstances, it is then common to assume that gene variability is constant when an analysis of these data is undertaken. However, this is perhaps too stringent an assumption. More careful inspection reveals that the simple log2 transformation does not remove the problem of heteroscedasticity. An alternative strategy is to assume independent gene-specific variances; although again this is problematic as variance estimates based on few replications are highly unstable. More meaningful and reliable comparisons of gene expression might be achieved, for different conditions or different tissue samples, where the test statistics are based on accurate estimates of gene variability; a crucial step in the identification of differentially expressed genes.
11.
12.
Ek S, Andréasson U, Hober S, Kampf C, Pontén F, Uhlén M, Merz H, Borrebaeck CA. Molecular & cellular proteomics: MCP 2006, 5(6):1072-1081
Mantle cell lymphoma (MCL) is an aggressive lymphoid malignancy for which better treatment strategies are needed. To identify potential diagnostic and therapeutic targets, a signature consisting of MCL-associated genes was selected based on a comprehensive gene expression analysis of malignant and normal B cells. The corresponding protein epitope signature tags were identified and used to raise monospecific, polyclonal antibodies, which were subsequently analyzed on paraffin-embedded sections of malignant and normal tissue. In this study, we demonstrate that the initial selection strategy of MCL-associated genes successfully allows identification of protein antigens either uniquely expressed or overexpressed in MCL compared with normal lymphoid tissues. We propose that genome-based, affinity proteomics, using protein epitope signature tag-induced antibodies, is an efficient way to rapidly identify a number of disease-associated protein candidates of both previously known and unknown identities.
13.
Although both clustering and identification of differentially expressed genes are equally essential in most microarray studies, the two tasks are often conducted without regard to each other. This is clearly not the most efficient way of extracting information. The main aim of this article is to develop a coherent statistical method that can simultaneously cluster and detect differentially expressed genes. Through information sharing between the two tasks, the proposed approach gives more sensible clustering among genes and is more sensitive in identifying differentially expressed genes. The improvement over existing methods is illustrated in both our simulation results and a case study.
14.
Background
Highly parallel analysis of gene expression has recently been used to identify gene sets or ‘signatures’ to improve patient diagnosis and risk stratification. Once a signature is generated, traditional statistical testing is used to evaluate its prognostic performance. However, due to the dimensionality of microarrays, this can lead to false interpretation of these signatures.
Principal Findings
A method was developed to test batches of a user-specified number of randomly chosen signatures in patient microarray datasets. The percentage of randomly generated signatures yielding prognostic value was assessed using ROC analysis by calculating the area under the curve (AUC) in six publicly available cancer patient microarray datasets. We found that a signature consisting of randomly selected genes has an average 10% chance of reaching significance when assessed in a single dataset, but can range from 1% to ∼40% depending on the dataset in question. Increasing the number of validation datasets markedly reduces this number.
Conclusions
We have shown that the use of an arbitrary cut-off value for evaluation of signature significance is not suitable for this type of research; the cut-off should instead be defined for each dataset separately. Our method can be used to establish and evaluate signature performance of any derived gene signature in a dataset by comparing its performance to thousands of randomly generated signatures. It will be of most interest for cases where few data are available and testing in multiple datasets is limited.
15.
Background
In microarray experiments the numbers of replicates are often limited due to factors such as cost, availability of sample or poor hybridization. There are currently few choices for the analysis of a pair of microarrays where N = 1 in each condition. In this paper, we demonstrate the effectiveness of a new algorithm called PINC (PINC is Not Cyber-T) that can analyze Affymetrix microarray experiments.
16.
MOTIVATION: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. RESULTS: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets. AVAILABILITY: EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/
17.
18.
Background
DNA microarrays provide data for genome wide patterns of expression between observation classes. Microarray studies often have small sample sizes, however, due to cost constraints or specimen availability. This can lead to poor random error estimates and inaccurate statistical tests of differential expression. We compare the performance of the standard t-test, fold change, and four small n statistical test methods designed to circumvent these problems. We report results of various normalization methods for empirical microarray data and of various random error models for simulated data.
19.
In this contribution, we have examined the patterns of gene expression in normal and cataractous lenses as presented in five different papers using microarrays and expressed sequence tags. The purpose was to evaluate unique and common patterns of gene expression during development, aging and cataracts.
20.
Evaluation of differential gene expression during behavioral development in the honeybee using microarrays and northern blots (total citations: 3, self-citations: 0, citations by others: 3)