Similar articles
 20 similar articles found.
1.
Time course expression analysis constitutes a large portion of applications of microarray experiments. One primary goal of such experiments is to detect genes with temporal changes over a period of time or at particular time points of interest. Difficulties arising from data with a small number of replicates over only a few unaligned time points in multiple groups pose challenges for efficient statistical analysis. Some known methods are limited by unverifiable assumptions or by a scope of application restricted to two groups. We present a new method for detecting differentially expressed genes under nonhomogeneous time course experiments in multiple groups. The new method first models the time course curve of one gene by a Gaussian process to align the nonhomogeneous time course data and to compute the gradient of the time course curve as well, the latter of which is used as directional information to enhance the sensitivity of detection for temporal changes. Second, we adopt a nonparametric method to test a surrogate hypothesis based on the augmented data from the Gaussian process model. The proposed method is robust in terms of model fitting and testing. It does not require any distributional assumption for the observations or the test statistic, and the method works for the case with as few as triplicate samples over four or five time points under multiple groups. We show the effectiveness and superiority of the new method in comparison with some existing methods using simulated models and two real data sets.

2.
Microarray-based expression profiling experiments typically use either a one-color or a two-color design to measure mRNA abundance. The validity of each approach has been amply demonstrated. Here we provide a simultaneous comparison of results from one- and two-color labeling designs, using two independent RNA samples from the Microarray Quality Control (MAQC) project, tested on each of three different microarray platforms. The data were evaluated in terms of reproducibility, specificity, sensitivity and accuracy to determine if the two approaches provide comparable results. For each of the three microarray platforms tested, the results show good agreement with high correlation coefficients and high concordance of differentially expressed gene lists within each platform. Cumulatively, these comparisons indicate that data quality is essentially equivalent between the one- and two-color approaches and strongly suggest that this variable need not be a primary factor in decisions regarding experimental microarray design.

3.
Conventional statistical methods for interpreting microarray data require large numbers of replicates in order to provide sufficient levels of sensitivity. We recently described a method for identifying differentially-expressed genes in one-channel microarray data [1]. Based on the idea that the variance structure of microarray data can itself be a reliable measure of noise, this method allows statistically sound interpretation of as few as two replicates per treatment condition. Unlike the one-channel array, the two-channel platform simultaneously compares gene expression in two RNA samples. This leads to covariation of the measured signals. Hence, by accounting for covariation in the variance model, we can significantly increase the power of the statistical test. We believe that this approach has the potential to overcome limitations of existing methods. We present here a novel approach for the analysis of microarray data that involves modeling the variance structure of paired expression data in the context of a Bayesian framework. We also describe a novel statistical test that can be used to identify differentially-expressed genes. This method, bivariate microarray analysis (BMA), demonstrates dramatically improved sensitivity over existing approaches. We show that with only two array replicates, it is possible to detect gene expression changes that are at best detected with six array replicates by other methods. Further, we show that combining results from BMA with Gene Ontology annotation yields biologically significant results in a ligand-treated macrophage cell system.

4.
Microarray technology is currently one of the most widely-used technologies in biology. Many studies focus on inferring the function of an unknown gene from its co-expressed genes. Here, we show that there are two types of positional artifacts in microarray data that introduce spurious correlations between genes. First, we find that genes that are close on the microarray chips tend to have higher correlations between their expression profiles. We call this the 'chip artifact'. Our calculations suggest that carry-over during the printing process is one of the major sources of this type of artifact, which is later confirmed by our experiments. Based on our experiments, the measured intensity of a microarray spot contains 0.1% (for fully-hybridized spots) to 93% (for un-hybridized ones) of noise resulting from this artifact. Second, we show for the first time that genes that are close on the microtiter plates in microarray experiments also tend to have higher correlations. We call this the 'plate artifact'. Both types of artifacts exist with different severity in all cDNA microarray experiments that we analyzed. Therefore, we develop an automated web tool, COP (COrrelations by Positional artifacts), to detect these artifacts in microarray experiments. COP has been integrated with the microarray data normalization tool, ExpressYourself, which is available at http://bioinfo.mbb.yale.edu/ExpressYourself/. Together, the two can eliminate most of the common noise in microarray data.

5.
High‐throughput microarray experiments often generate far more biological information than is required to test the experimental hypotheses. Many microarray analyses are considered finished after differential expression, and additional analyses are typically not performed, leaving biological information untapped. This is especially true if the microarray experiment is from an ecological study of multiple populations. Comparisons across populations may also contain important genomic polymorphisms, and a subset of these polymorphisms may be identified with microarrays using techniques for the detection of single feature polymorphisms (SFP). SFPs are differences in microarray probe level intensities caused by genetic polymorphisms such as single‐nucleotide polymorphisms and small insertions/deletions and not expression differences. In this study, we provide a new algorithm for the detection of SFPs, evaluate the algorithm using existing data from two publicly available Affymetrix Barley (Hordeum vulgare) microarray data sets and compare it to two previously published SFP detection algorithms. Results show that our algorithm provides more consistent and sensitive calling of SFPs with a lower false discovery rate. Simultaneous analysis of SFPs and differential expression is a low‐cost method for the enhanced analysis of microarray data, enabling additional biological inferences to be made.

6.
Microarrays are a powerful tool for comparison and understanding of gene expression levels in healthy and diseased states. The method relies upon the assumption that signals from microarray features are a reflection of relative gene expression levels of the cell types under investigation. It has previously been reported that the classical fluorescent dyes used for microarray technology, Cy3 and Cy5, are not ideal due to the decreased stability and fluorescence intensity of the Cy5 dye relative to the Cy3, such that dye bias is an accepted phenomenon necessitating dye swap experimental protocols and analysis of differential dye effects. The incentive to find new fluorophores is based on alleviating the problem of dye bias through synonymous performance between counterpart dyes. Alexa Fluor 555 and Alexa Fluor 647 are increasingly promoted as replacements for CyDye in microarray experiments. Performance relates to the molecular and steric similarities, which will vary for each new pair of dyes, as well as the spectral integrity for the specific application required. Comparative analysis of the performance of these two competitive dye pairs in practical microarray applications is warranted towards this end. The findings of our study showed that both dye pairs were comparable but that conventional CyDye resulted in significantly higher signal intensities (P < 0.05) and signal minus background levels (P < 0.05) with no significant difference in background values (P > 0.05). This translated to greater levels of differential gene expression with CyDye than with the Alexa Fluor counterparts. However, CyDye fluorophores, and in particular Cy5, were found to be less photostable over time and following repeated scans in microarray experiments. These results suggest that precautions against potential dye effects will continue to be necessary and that no one dye pair negates this need.

7.
8.
In this paper we discuss some of the statistical issues that should be considered when conducting experiments involving microarray gene expression data. We discuss statistical issues related to preprocessing the data as well as the analysis of the data. Analysis of the data is discussed in three contexts: class comparison, class prediction and class discovery. We also review the methods used in two studies that are using microarray gene expression to assess the effect of exposure to radiofrequency (RF) fields on gene expression. Our intent is to provide a guide for radiation researchers when conducting studies involving microarray gene expression data.

9.
Optimized T7 amplification system for microarray analysis.   Total citations: 8 (self-citations: 0; citations by others: 8)

10.
11.
Statistical design of reverse dye microarrays   Total citations: 7 (self-citations: 0; citations by others: 7)
MOTIVATION: In cDNA microarray experiments all samples are labelled with either Cy3 dye or Cy5 dye. Certain genes exhibit dye bias, a tendency to bind more efficiently to one of the dyes. The common reference design avoids the problem of dye bias by running all arrays 'forward', so that the samples being compared are always labelled with the same dye. But comparison of samples labelled with different dyes is sometimes of interest. In these situations, it is necessary to run some arrays 'reverse', with the dye labelling reversed, in order to correct for the dye bias. The design of these experiments will impact one's ability to identify genes that are differentially expressed in different tissues or conditions. We address the design issue of how many specimens are needed, how many forward and reverse labelled arrays to perform, and how to optimally assign Cy3 and Cy5 labels to the specimens. RESULTS: We consider three types of experiments for which some reverse labelling is needed: paired samples, samples from two predefined groups, and reference design data when comparison with the reference is of interest. We present simple probability models for the data, derive optimal estimators for relative gene expression, and compare the efficiency of the estimators for a range of designs. In each case, we present the optimal design and sample size formulas. We show that reverse labelling of individual arrays is generally not required.
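To illustrate why a reverse-labelled array helps correct dye bias, here is a minimal sketch under a simple additive model on the log-ratio scale. The model, the function name `dye_swap_estimates` and the example numbers are assumptions made for illustration. They are not the probability models or optimal estimators derived in the paper.

```python
def dye_swap_estimates(forward_log_ratio, reverse_log_ratio):
    """Split a forward/reverse array pair into a sample effect and a dye bias.

    Assumed additive model on the log scale (hypothetical, for illustration):
        forward = effect + bias
        reverse = -effect + bias
    """
    # Subtracting the two arrays cancels the dye bias.
    effect = (forward_log_ratio - reverse_log_ratio) / 2
    # Adding the two arrays cancels the sample effect.
    bias = (forward_log_ratio + reverse_log_ratio) / 2
    return effect, bias


# Made-up log ratios for one gene on a forward and a reverse array.
effect, bias = dye_swap_estimates(1.3, -0.7)
print(f"estimated effect: {effect:.3f}, estimated dye bias: {bias:.3f}")
```

Under this assumed model a gene with no true expression difference but a nonzero dye bias shows the same log ratio on both arrays, so its estimated effect is zero.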

12.
Z-score transformation has been successfully used as a normalisation procedure for microarray data generated using radioactively labelled probes with spotted cDNA arrays. One of the advantages of the z-score transformation method is that it provides a way of standardising data across a wide range of experiments and allows the comparison of microarray data independent of the original hybridisation intensities. The feasibility of applying z-score transformation to other types of linear microarray data, specifically that generated using fluorescently labelled probes with Affymetrix chips, was tested in three separate scenarios and is discussed here. In the first scenario, Affymetrix data from the NCBI (National Center for Biotechnology Information) GEO (Gene Expression Omnibus) database was used to demonstrate that z-score transformation preserved the essential phylogenetic grouping between primate species' fibroblast gene expression baseline measurements. The second scenario employed z-score transformation on data consisting of a series of genes spiked-in at known concentrations and arrayed in a Latin square format. We were able to reconstruct the entire set of spike-in concentration curves without prior knowledge of their format by using z-score transformation as the normalisation process. Finally, we show that z-score transformed data maintains the integrity of separate samples from different experiments and laboratories, as demonstrated by accurate grouping of clustered data according to sample identity. We conclude that data normalised by z-score transformation can be easily used with Affymetrix data without noticeable loss of information content. Z-score transformation provides a useful tool for comparisons between experiments and between laboratories that use the Affymetrix platform.
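A minimal sketch of a per-array z-score transformation may help show why the result is independent of the original intensity scale. The abstract does not say whether the data are log-transformed first or which standard deviation is used, so the choices below (raw values, population standard deviation) and the name `z_score_transform` are assumptions for illustration only.

```python
from statistics import mean, pstdev


def z_score_transform(intensities):
    """Return the z-score of each value: (value - array mean) / array SD."""
    mu = mean(intensities)
    sigma = pstdev(intensities)  # population SD; an assumption, see lead-in
    if sigma == 0:
        raise ValueError("all intensities are identical; z-scores are undefined")
    return [(x - mu) / sigma for x in intensities]


# Two made-up arrays measuring the same pattern on different intensity scales.
array_a = [2.0, 4.0, 6.0, 8.0]
array_b = [20.0, 40.0, 60.0, 80.0]

z_a = z_score_transform(array_a)
z_b = z_score_transform(array_b)
print(z_a)
print(z_b)
```

Because each array is centred and scaled by its own mean and standard deviation, the two arrays above map to the same z-scores up to floating-point rounding, even though their raw intensities differ tenfold.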

13.
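Reliability validation analysis of gene expression profiling microarrays   Total citations: 7 (self-citations: 0; citations by others: 7)
The cDNA microarray is an emerging technology that can assess changes in mRNA expression levels across the full range of transcripts. We examined the reproducibility of cDNA microarray experiments through self-comparison experiments on RNA from the same tissue and differential analysis experiments on RNA from different tissues. Using the correlation coefficient (R), the coefficient of variation (CV) and the false positive rate (FPR), we analyzed the reliability of cDNA microarray data and made an overall assessment of the experimental data. The results confirm that the cDNA expression profile data obtained with this microarray system generally have a correlation coefficient greater than 0.9, an average coefficient of variation of about 15%, and a false positive rate held within 3%. We also propose the concept of the consistence rate (CR) as a new parameter for measuring the reproducibility of a cDNA microarray system, and describe the features that make it superior to the commonly used correlation coefficient and coefficient of variation. In addition, we analyzed the sources of systematic error in microarray data by comparing the effects of spotting concentration during array fabrication, mRNA versus total RNA, different array batches and different labelling processes. We further propose that repeating the experiment twice can overcome the vast majority of false positives introduced by the experimental system.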
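Two of the reproducibility measures named in this abstract, the correlation coefficient (R) and the coefficient of variation (CV), can be sketched as follows. These are the standard textbook definitions (Pearson correlation, sample standard deviation divided by the mean). The abstract does not give its exact formulas or define its consistence rate (CR) and false positive rate, so those are not reproduced here, and all data values are made up.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of measurements."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


# Made-up signals for four genes on two replicate arrays.
replicate_1 = [1.0, 2.0, 3.0, 4.0]
replicate_2 = [2.0, 4.0, 6.0, 8.0]
print("R  =", pearson_r(replicate_1, replicate_2))

# Made-up repeated measurements of a single gene.
print("CV =", coefficient_of_variation([90.0, 100.0, 110.0]))
```

R summarises agreement across all genes between two arrays, while CV summarises the spread of repeated measurements of one gene relative to its mean, which is why the two are reported side by side.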

14.

Background

The methods used for sample selection and processing can have a strong influence on the expression values obtained through microarray profiling. Laser capture microdissection (LCM) provides higher specificity in the selection of target cells compared to traditional bulk tissue selection methods, but at an increased processing cost. The benefit gained from the higher tissue specificity realized through LCM sampling is evaluated in this study through a comparison of microarray expression profiles obtained from the same samples using bulk and LCM processing.

Methods

Expression data from ten lung adenocarcinoma samples and six adjacent normal samples were acquired using LCM and bulk sampling methods. Expression values were evaluated for correlation between sample processing methods, as well as for bias introduced by the additional linear amplification required for LCM sample profiling.

Results

The direct comparison of expression values obtained from the bulk and LCM sampled datasets reveals a large number of probesets with significantly varied expression. Many of these variations were shown to be related to bias arising from the process of linear amplification, which is required for LCM sample preparation. A comparison of differentially expressed genes (cancer vs. normal) selected in the bulk and LCM datasets also showed substantial differences. There were more than twice as many down-regulated probesets identified in the LCM data than identified in the bulk data. Controlling for the previously identified amplification bias did not have a substantial impact on the differences identified in the differentially expressed probesets found in the bulk and LCM samples.

Conclusion

LCM-coupled microarray expression profiling was shown to uniquely identify a large number of differentially expressed probesets not otherwise found using bulk tissue sampling. The information gain realized from the LCM sampling was limited to differential analysis, as the absolute expression values obtained for some probesets using this study's protocol were biased during the second round of amplification. Consequently, LCM may enable investigators to obtain additional information in microarray studies not easily found using bulk tissue samples, but it is of critical importance that potential amplification biases are controlled for.

15.
Gu X. Genetics, 2004, 167(1): 531-542
Microarray technology has produced massive expression data that are invaluable for investigating the genome-wide evolutionary pattern of gene expression. To this end, phylogenetic expression analysis is highly desirable. On the basis of the Brownian process, we developed a statistical framework (called the E(0) model), assuming the independent expression of evolution between lineages. Several evolutionary mechanisms are integrated to characterize the pattern of expression diversity after gene duplications, including gradual drift and dramatic shift (punctuated equilibrium). When the phylogeny of a gene family is given, we show that the likelihood function follows a multivariate normal distribution; the variance-covariance matrix is determined by the phylogenetic topology and evolutionary parameters. Maximum-likelihood methods for multiple microarray experiments are developed, and likelihood-ratio tests are designed for testing the evolutionary pattern of gene expression. To reconstruct the evolutionary trace of expression diversity after gene (or genome) duplications, we developed a Bayesian-based method and use the posterior mean as predictors. Potential applications in evolutionary genomics are discussed.

16.

Background

Although numerous investigations have compared gene expression microarray platforms, preprocessing methods and batch correction algorithms using constructed spike-in or dilution datasets, there remains a paucity of studies examining the properties of microarray data using diverse biological samples. Most microarray experiments seek to identify subtle differences between samples with variable background noise, a scenario poorly represented by constructed datasets. Thus, microarray users lack important information regarding the complexities introduced in real-world experimental settings. The recent development of a multiplexed, digital technology for nucleic acid measurement enables counting of individual RNA molecules without amplification and, for the first time, permits such a study.

Results

Using a set of human leukocyte subset RNA samples, we compared previously acquired microarray expression values with RNA molecule counts determined by the nCounter Analysis System (NanoString Technologies) in selected genes. We found that gene measurements across samples correlated well between the two platforms, particularly for high-variance genes, while genes deemed unexpressed by the nCounter generally had both low expression and low variance on the microarray. Confirming previous findings from spike-in and dilution datasets, this “gold-standard” comparison demonstrated signal compression that varied dramatically by expression level and, to a lesser extent, by dataset. Most importantly, examination of three different cell types revealed that noise levels differed across tissues.

Conclusions

Microarray measurements generally correlate with relative RNA molecule counts within optimal ranges but suffer from expression-dependent accuracy bias and precision that varies across datasets. We urge microarray users to consider expression-level effects in signal interpretation and to evaluate noise properties in each dataset independently.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-649) contains supplementary material, which is available to authorized users.

17.
Fung ES, Ng MK. Bioinformation, 2007, 2(5): 230-234
One of the applications of discriminant analysis on microarray data is to classify patient and normal samples based on gene expression values. The analysis is especially important in medical trials and diagnosis of cancer subtypes. The main contribution of this paper is to propose a simple Fisher-type discriminant method for gene selection in microarray data. In the new algorithm, we calculate a weight for each gene and use the weight values as an indicator to identify the subsets of relevant genes that categorize patient and normal samples. An l(2) - l(1) norm minimization method is incorporated into the discriminant process to automatically compute the weights of all genes in the samples. The experiments on two microarray data sets have shown that the new algorithm can generate classification results as good as other classification methods, and effectively determine relevant genes for classification purposes. In this study, we demonstrate the gene selection ability and the computational effectiveness of the proposed algorithm. Experimental results are given to illustrate the usefulness of the proposed model.
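The idea of scoring each gene by how well it separates patient from normal samples can be illustrated with the classic per-gene Fisher score. This is a swapped-in textbook criterion, not the l(2) - l(1) norm minimization the paper uses to compute its weights. The function name `fisher_score`, the gene labels and all expression values are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(patient_values, normal_values):
    """Squared difference of class means over the sum of class variances."""
    numerator = (mean(patient_values) - mean(normal_values)) ** 2
    denominator = pvariance(patient_values) + pvariance(normal_values)
    if denominator == 0:
        raise ValueError("both classes have zero variance; score is undefined")
    return numerator / denominator


# Made-up expression values: gene -> (patient samples, normal samples).
expression = {
    "gene_a": ([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]),  # classes well separated
    "gene_b": ([3.0, 4.0, 5.0], [3.0, 4.0, 5.0]),  # classes identical
}

scores = {gene: fisher_score(p, n) for gene, (p, n) in expression.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

A gene whose class means are far apart relative to the within-class spread gets a high score, so ranking by score gives a crude relevance ordering that a weight-based method would refine.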

18.
19.
We present an efficient algorithm for detecting putative regulatory elements in the upstream DNA sequences of genes, using gene expression information obtained from microarray experiments. Based on a generalized suffix tree, our algorithm looks for motif patterns whose appearance in the upstream region is most correlated with the expression levels of the genes. We are able to find the optimal pattern in time linear in the total length of the upstream sequences. We implement and apply our algorithm to publicly available microarray gene expression data, and show that our method is able to discover biologically significant motifs, including various motifs which have been reported previously using the same data set. We further discuss applications for which the efficiency of the method is essential, as well as possible extensions to our algorithm.

20.