Similar literature
 Found 20 similar articles (search time: 31 ms)
1.

Background  

High throughput gene expression data from spotted cDNA microarrays are collected by scanning the signal intensities of the corresponding spots by dedicated fluorescence scanners. The major scanner settings for increasing the spot intensities are the laser power and the voltage of the photomultiplier tube (PMT). It is required that the expression ratios are independent of these settings. We have investigated the relationships between PMT voltage, spot intensities, and expression ratios for different scanners, in order to define an optimal scanning procedure.

2.
Analysis of repeatability in spotted cDNA microarrays   Total citations: 7 (self-citations: 3, citations by others: 4)
We report a strategy for analysis of data quality in cDNA microarrays based on the repeatability of repeatedly spotted clones. We describe how repeatability can be used to control data quality by developing adaptive filtering criteria for microarray data containing clones spotted in multiple spots. We have applied the method to five publicly available cDNA microarray data sets and one previously unpublished data set from our own laboratory. The results demonstrate the feasibility of the approach as a foundation for data filtering, and indicate a high degree of variation in data quality, both across the data sets and between arrays within data sets.
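
The idea of filtering on the repeatability of replicate spots can be illustrated with a minimal Python sketch. It is not the authors' adaptive criterion: the function names, the use of the sample standard deviation as the repeatability measure, and the fixed `max_sd` threshold are all assumptions made for illustration.

```python
import statistics

def replicate_sd(log_ratios_by_clone):
    """Sample standard deviation of replicate log-ratios, per clone.

    Clones with fewer than two replicate spots are skipped because
    their repeatability cannot be assessed.
    """
    return {
        clone: statistics.stdev(values)
        for clone, values in log_ratios_by_clone.items()
        if len(values) >= 2
    }

def filter_by_repeatability(log_ratios_by_clone, max_sd):
    """Keep clones whose replicate SD is at most max_sd (hypothetical fixed cut-off).

    Returns the mean log-ratio of each retained clone.
    """
    sds = replicate_sd(log_ratios_by_clone)
    return {
        clone: statistics.mean(log_ratios_by_clone[clone])
        for clone, sd in sds.items()
        if sd <= max_sd
    }

# Made-up example data: three clones with varying replicate agreement.
example = {
    "clone_a": [1.00, 1.10, 0.90],   # replicates agree
    "clone_b": [0.20, 1.80, -1.00],  # replicates disagree
    "clone_c": [0.50],               # spotted once, cannot be assessed
}
kept = filter_by_repeatability(example, max_sd=0.5)
print(kept)
```

In this toy input only `clone_a` survives the filter; an adaptive scheme would instead derive the threshold from the data of each array.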

3.
We propose a simple approach, the multiplicative background correction, to solve a perplexing problem in spotted microarray data analysis: correcting the foreground intensities for the background noise, especially for spots whose genes are weakly expressed or not expressed at all. The conventional approach, the additive background correction, directly subtracts the background intensities from foreground intensities. When the foreground intensities marginally dominate the background intensities, the additive background correction provides unreliable estimates of the differential gene expression levels and usually presents M-A plots with fishtails or fans. Unreliable additive background correction makes it preferable to ignore the background noise, which may increase the number of false positives. Based on the more realistic multiplicative assumption instead of the conventional additive assumption, we propose to logarithmically transform the intensity readings before the background correction, with the logarithmic transformation symmetrizing the skewed intensity readings. This approach not only precludes the fishtails and fans in the M-A plots, but provides highly reproducible background-corrected intensities for both strongly and weakly expressed genes. The superiority of the multiplicative background correction to the additive one as well as the no background correction is justified by publicly available self-hybridization datasets.
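
The contrast between the two corrections can be sketched in Python as follows. This is one reading of the abstract, assuming that "multiplicative" means subtracting the log background from the log foreground in each channel; the function names and the example intensities are hypothetical.

```python
import math

def additive_log_ratio(rf, rb, gf, gb):
    """log2(R/G) after subtracting background from foreground.

    Returns None when a corrected intensity is not positive, which is
    the case where the additive correction breaks down.
    """
    r, g = rf - rb, gf - gb
    if r <= 0 or g <= 0:
        return None
    return math.log2(r / g)

def multiplicative_log_ratio(rf, rb, gf, gb):
    """log2(R/G) with the background removed on the log scale.

    Assumed form: log2(foreground) - log2(background) per channel.
    """
    red = math.log2(rf) - math.log2(rb)
    green = math.log2(gf) - math.log2(gb)
    return red - green

# A weak spot: foreground only marginally above background in both channels.
weak = dict(rf=105.0, rb=100.0, gf=101.0, gb=100.0)
m_additive = additive_log_ratio(**weak)              # log2(5 / 1), about 2.32
m_multiplicative = multiplicative_log_ratio(**weak)  # log2(1.05 / 1.01), about 0.056
print(m_additive, m_multiplicative)
```

For this weak spot the additive correction reports a more than four-fold change from a difference of a few intensity units, while the log-scale correction stays close to zero, which is the behaviour the abstract associates with fishtails in M-A plots.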

4.
MOTIVATION: Assessment of gene expression on spotted microarrays is based on measurement of fluorescence intensity emitted by hybridized spots. Unfortunately, quantifying fluorescence intensity from hybridized spots does not always correctly reflect gene expression level. Low gene expression levels produce low fluorescence intensities which tend to be confounded with the local background, while high gene expression levels produce high fluorescence intensities which rapidly reach the saturation level. Most algorithms that combine data acquired at different voltages of the photomultiplier tube (PMT) assume that a change in scanner setting transforms the intensity measurements by a multiplicative constant. METHODS AND RESULTS: In this paper we introduce a new model of spot foreground intensity which integrates a PMT voltage independent scanner optical bias. This new model is used to implement a "Combining Multiple Scan using a Two-way ANOVA" (CMS2A) method, which is based on a maximum likelihood estimation of the scanner optical bias. Once the scanner bias has been computed, the coefficients of the two-way ANOVA model are used to correct the saturated spot intensities obtained at high PMT voltage by using their counterpart values at lower PMT voltages. The method was compared to state-of-the-art multiple scan algorithms, using data generated from the MAQC study. CMS2A produced fold-changes that were highly correlated with qPCR fold-changes. Because the scanner optical bias is accurately estimated within CMS2A, the method also avoids fold-change compression biases whatever the value of this optical bias.
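
The general idea of replacing saturated high-voltage readings with values predicted from a lower-voltage scan can be sketched as below. This is not CMS2A and does not use its maximum likelihood estimate or two-way ANOVA: it is a plain least-squares line with an intercept (standing in for an additive offset), and the function names, the 16-bit saturation value, and the example intensities are assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def correct_saturated(low, high, saturation=65535.0):
    """Replace saturated high-PMT intensities with values predicted from the low-PMT scan.

    The line is fitted on unsaturated spots only; unsaturated
    high-PMT readings are returned unchanged.
    """
    pairs = [(l, h) for l, h in zip(low, high) if h < saturation]
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    return [
        h if h < saturation else slope * l + intercept
        for l, h in zip(low, high)
    ]

# Made-up intensities: the last spot is saturated in the high-PMT scan.
low = [100.0, 200.0, 400.0, 30000.0]
high = [350.0, 650.0, 1250.0, 65535.0]
corrected = correct_saturated(low, high)
print(corrected)
```

A purely multiplicative model would force the intercept to zero; allowing a non-zero intercept is the simplest way to show why an unmodelled additive offset distorts extrapolated values.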

5.
MOTIVATION: Scanning parameters are often overlooked when optimizing microarray experiments. A scanning approach that extends the dynamic data range by acquiring multiple scans of different intensities has been developed. RESULTS: Data from each of three scan intensities (low, medium, high) were analyzed separately using multiple scan and linear regression approaches to identify and compare the sets of genes that exhibit statistically significant differential expression. In the multiple scan approach only one-third of the differentially expressed genes were shared among the three intensities, and each scan intensity identified unique sets of differentially expressed genes. The set of differentially expressed genes from any one scan amounted to < 70% of the total number of genes identified in at least one scan. The average signal intensity of genes that exhibited statistically significant changes in expression was highest for the low-intensity scan and lowest for the high-intensity scan, suggesting that low-intensity scans may be best for detecting expression differences in high-signal genes, while high-intensity scans may be best for detecting expression differences in low-signal genes. Comparison of the differentially expressed genes identified in the multiple scan and linear regression approaches revealed that the multiple scan approach effectively identifies a subset of statistically significant genes that the linear regression approach is unable to identify. Quantitative RT-PCR (qRT-PCR) tests demonstrated that statistically significant differences identified at all three scan intensities can be verified. AVAILABILITY: The data presented can be viewed at http://www.ncbi.nlm.nih.gov/geo/ under GEO accession no. GSE3017.

6.
Beckman KB, Lee KY, Golden T, Melov S. Mitochondrion, 2004, 4(5-6): 453-470
Mitochondrial diseases are a heterogeneous array of disorders with a complex etiology. Use of microarrays as a tool to investigate complex human disease is increasingly common; however, a principal drawback of microarrays is their limited dynamic range, due to the poor quantification of weak signals. Although it is generally understood that low-intensity microarray 'spots' may be unreliable, there exists little documentation of their accuracy. Quantitative PCR (Q-PCR) is frequently used to validate microarray data, yet few Q-PCR validation studies have focused on the accuracy of low-intensity microarray signals. Hence, we have used Q-PCR to systematically assess microarray accuracy as a function of signal strength in a mouse model of mitochondrial disease, the superoxide dismutase 2 (SOD2) nullizygous mouse. We have focused on a unique category of data (spots with only one weak signal in a two-dye comparative hybridization) and show that such 'high-low' signal intensities are common for differentially expressed genes. This category of differential expression may be more important in mitochondrial disease, in which there are often mosaic expression patterns due to the idiosyncratic distribution of mutant mtDNA in heteroplasmic individuals. Using RNA from the SOD2 mouse, we found that when spotted cDNA microarray data are filtered for quality (low variance between many technical replicates) and spot intensity (above a negative control threshold in both channels), there is an excellent quantitative concordance with Q-PCR (R² = 0.94). The accuracy of gene expression ratios from low-intensity spots (R² = 0.27) and 'high-low' spots (R² = 0.32) is considerably lower. Our results should serve as guidelines for microarray interpretation and the selection of genes for validation in mitochondrial disorders.

7.
The first stage in the analysis of cDNA microarray data is estimation of the level of expression of each gene, from laser scans of hybridised microarrays. Typically, data are used from a single scan, although, if multiple scans are available, there is the opportunity to reduce sampling error by using all of them. Combining multiple laser scans can be formulated as multivariate functional regression through the origin. Maximum likelihood estimation fails, but many alternative estimators exist, one of which is to maximise the likelihood of a Gaussian structural regression model. We show by simulation that, surprisingly, this estimator is efficient for our problem, even though the distribution of gene expression values is far from Gaussian. Further, it performs well if errors have a heavier tailed distribution or the model includes intercept terms, but not necessarily in other regions of parameter space. Finally, we show that by combining multiple laser scans we increase the power to detect differential expression of genes. (© 2009 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

8.
MOTIVATION: To study weakly expressed genes in microarray experiments, it is useful to increase the photometric gain in the scanning. However, a large gain may cause some pixels for highly expressed genes to become saturated. Spatial statistical models that model spot shapes on the pixel level may be used to infer information about the saturated pixel intensities. Other possible applications for spot shape models include data quality control and accurate determination of spot centres and spot diameters. RESULTS: Spatial statistical models for spotted microarrays are studied including pixel level transformations and spot shape models. The models are applied to a dataset from 50mer oligonucleotide microarrays with 452 selected Arabidopsis genes. Logarithmic, Box-Cox and inverse hyperbolic sine transformations are compared in combination with four spot shape models: a cylindric plateau shape, an isotropic Gaussian distribution and a difference of two scaled Gaussian distributions suggested in the literature, as well as a proposed new polynomial-hyperbolic spot shape model. A substantial improvement is obtained for the dataset studied by the polynomial-hyperbolic spot shape model in combination with the Box-Cox transformation. The spatial statistical models are used to correct spot measurements with saturation by extrapolating the censored data. AVAILABILITY: Source code for R is available at http://www.matfys.kvl.dk/~ekstrom/spotshapes/
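
The three pixel-level transformations named above have standard textbook forms, sketched here in Python. The function names and the example intensity are illustrative assumptions, and the exact parameterisations used in the paper (for instance any offset or scale inside the inverse hyperbolic sine) are not stated in the abstract.

```python
import math

def log_transform(x):
    """Natural logarithm; defined for x > 0 only."""
    return math.log(x)

def box_cox(x, lam):
    """Box-Cox transform: (x**lam - 1) / lam, with log(x) as the lam = 0 limit."""
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

def inverse_hyperbolic_sine(x):
    """asinh(x) = log(x + sqrt(x**2 + 1)); defined at zero and for negative values."""
    return math.asinh(x)

pixel = 1500.0  # made-up pixel intensity
transformed = {
    "log": log_transform(pixel),
    "box_cox_0.25": box_cox(pixel, 0.25),
    "asinh": inverse_hyperbolic_sine(pixel),
}
print(transformed)
```

For large intensities the inverse hyperbolic sine behaves like `log(2x)`, so the practical difference between it and the logarithm lies near zero, where background-level pixels sit.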

9.
We have developed a new method, designated restriction landmark cDNA scanning (RLCS), which displays many cDNA species quantitatively and simultaneously as two-dimensional gel spots. In this method cDNA species of uniform length were prepared for each mRNA species using restriction enzymes. After the restriction enzyme sites were radiolabeled as landmarks, the labeled fragments were subjected to high resolution two-dimensional gel electrophoresis. In analyses of cDNA samples from adult mouse liver and brain (cerebral cortex, cerebellum and brain stem) we detected approximately 500 and >1000 discrete gel spots respectively of various intensities at a time. The spot patterns of the three brain regions were very similar, although not identical, but were quite different from the pattern for the liver. RNA blot hybridization analysis using several cloned spot DNAs as probes showed that differences in intensity of the spots among RLCS profiles correlated well with expression levels of the corresponding mRNA species in the brain regions. Because the spots and their intensities reflect distinct mRNA species and their expression level respectively, the RLCS is a novel cDNA display system which provides a great deal of information and should be useful for systematic documentation of differentially expressed genes.

10.
limmaGUI: a graphical user interface for linear modeling of microarray data   Total citations: 15 (self-citations: 0, citations by others: 15)
SUMMARY: limmaGUI is a graphical user interface (GUI) based on R-Tcl/Tk for the exploration and linear modeling of data from two-color spotted microarray experiments, especially the assessment of differential expression in complex experiments. limmaGUI provides an interface to the statistical methods of the limma package for R, and is itself implemented as an R package. The software provides point and click access to a range of methods for background correction, graphical display, normalization, and analysis of microarray data. Arbitrarily complex microarray experiments involving multiple RNA sources can be accommodated using linear models and contrasts. Empirical Bayes shrinkage of the gene-wise residual variances is provided to ensure stable results even when the number of arrays is small. Integrated support is provided for quantitative spot quality weights, control spots, within-array replicate spots and multiple testing. limmaGUI is available for most platforms on which R runs, including Windows, Mac and most flavors of Unix. AVAILABILITY: http://bioinf.wehi.edu.au/limmaGUI.

11.
12.
cDNA arrays allow quantitative measurement of expression levels for thousands of genes simultaneously. The measurements are affected by many sources of variation, and substantial improvements in the precision of estimated effects accompany adjustments for these effects. Two generic nuisance variations, one associated with the magnitude of expression and the other associated with array location, are common in data from filter arrays. Procedures, like normalization using lowess regression, are effective at reducing variation associated with magnitude, and they have been widely adopted. However, variation associated with location has received less attention. Here, a simple, but effective method based on localized median is expounded for dealing with these nuisance effects, and its properties are discussed. The proposed methodology handles location-dependent variation ("splotches") and magnitude-dependent variation (background and/or saturation) effectively. The procedure is related to lowess when implemented to adjust magnitude-dependent variation, and it performs similarly. The proposed methodology is illustrated with data from the National Center for Toxicological Research (NCTR), where treatment differences in levels of mRNA from rat hepatocytes were assessed using 33P-labeled samples hybridized to cDNA spotted arrays. Normalizing intensities by the median-of-subsets removes systematic variation associated with the location of a gene on the array and/or the level of its expression. This procedure is easy to implement using iteratively reweighted least-squares algorithms. Although less sophisticated than lowess, this procedure works nearly as well for normalizing intensities based upon their magnitude. Unlike lowess, it can adjust for location-dependent effects.
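
A minimal Python sketch of location-based median normalization follows. It only centres each array region on its own median and does not reproduce the iteratively reweighted least-squares implementation mentioned in the abstract; the function name, the `(block_id, log_intensity)` input layout, and the example values are assumptions.

```python
import statistics
from collections import defaultdict

def normalize_by_block_median(spots):
    """Subtract from each spot the median log-intensity of its array block.

    spots: list of (block_id, log_intensity) pairs, where block_id
    identifies a region of the array (a hypothetical grouping).
    Returns a list of (block_id, centred_log_intensity) in input order.
    """
    by_block = defaultdict(list)
    for block, value in spots:
        by_block[block].append(value)
    medians = {block: statistics.median(values) for block, values in by_block.items()}
    return [(block, value - medians[block]) for block, value in spots]

# Made-up data: block 2 is uniformly brighter (a "splotch") than block 1.
spots = [
    (1, 10.0), (1, 11.0), (1, 12.0),
    (2, 13.0), (2, 14.0), (2, 18.0),
]
normalized = normalize_by_block_median(spots)
print(normalized)
```

After centring, the uniform brightness offset of block 2 is gone while the one spot that stands out within that block (18.0, now 4.0 above its block median) remains visible. The same function can normalize by magnitude if the subsets are intensity bins rather than array regions.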

13.
Protocols for the assurance of microarray data quality and process control   Total citations: 3 (self-citations: 0, citations by others: 3)
Microarrays represent a powerful technology that provides the ability to simultaneously measure the expression of thousands of genes. However, it is a multi-step process with numerous potential sources of variation that can compromise data analysis and interpretation if left uncontrolled, necessitating the development of quality control protocols to ensure assay consistency and high-quality data. In response to emerging standards, such as the minimum information about a microarray experiment standard, tools are required to ascertain the quality and reproducibility of results within and across studies. To this end, an intralaboratory quality control protocol for two-color spotted microarrays was developed using cDNA microarrays from in vivo and in vitro dose-response and time-course studies. The protocol combines: (i) diagnostic plots monitoring the degree of feature saturation, global feature and background intensities, and feature misalignments with (ii) plots monitoring the intensity distributions within arrays with (iii) a support vector machine (SVM) model. The protocol is applicable to any laboratory with sufficient datasets to establish historical high- and low-quality data.

14.
15.

Background  

Maximizing the utility of DNA microarray data requires optimization of data acquisition through selection of an appropriate scanner setting. To increase the amount of useable data, several approaches have been proposed that incorporate multiple scans at different sensitivities to reduce the quantification error and to minimize effects of saturation. However, no direct comparison of their efficacy has been made. In the present study we compared individual scans at low, medium and high sensitivity with three methods for combining data from multiple scans (either 2-scan or 3-scan cases) using an actual dataset comprising 40 technical replicates of a reference RNA standard.

16.
MOTIVATION: Microarray images challenge existing analytical methods in many ways, given that gene spots often exhibit characteristic imperfections. Irregular contours, donut shapes, artifacts, and low or heterogeneous expression impair corresponding values for red and green intensities as well as their ratio R/G. New approaches are needed to ensure accurate data extraction from these images. RESULTS: Herein we introduce a novel method for intensity assessment of gene spots. The technique is based on clustering pixels of a target area into foreground and background. For this purpose we implemented two clustering algorithms derived from k-means and Partitioning Around Medoids (PAM), respectively. Results from the analysis of real gene spots indicate that our approach outperforms other existing analytical methods. This is particularly true for spots generally considered problematic due to imperfections or almost absent expression. Both PX(PAM) and PX(KMEANS) prove to be highly robust against various types of artifacts through adaptive partitioning, which more correctly assesses expression intensity values. AVAILABILITY: The implementation of this method is a combination of two complementary tools, Extractiff (Java) and Pixclust (free statistical language R), which are available upon request from the authors.
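
Clustering the pixels of a target area into foreground and background can be sketched with a one-dimensional two-cluster k-means in Python. This is a generic illustration, not the Pixclust implementation: the function name, the min/max initialisation, and the example pixel values are assumptions.

```python
def two_means(pixels, iterations=100):
    """Split pixel intensities into (background, foreground) with 1-D 2-means.

    Centres start at the minimum and maximum intensity; each pass assigns
    every pixel to the nearer centre and recomputes the centres as cluster
    means, stopping when they no longer change. Expects a non-empty list.
    """
    low_centre, high_centre = min(pixels), max(pixels)
    for _ in range(iterations):
        background = [p for p in pixels if abs(p - low_centre) <= abs(p - high_centre)]
        foreground = [p for p in pixels if abs(p - low_centre) > abs(p - high_centre)]
        new_low = sum(background) / len(background)
        # If every pixel is identical the foreground is empty; reuse the low centre.
        new_high = sum(foreground) / len(foreground) if foreground else new_low
        if (new_low, new_high) == (low_centre, high_centre):
            break
        low_centre, high_centre = new_low, new_high
    return background, foreground

# Made-up target area: a few bright spot pixels among dim background pixels.
pixels = [10, 12, 11, 200, 210, 13, 190]
background, foreground = two_means(pixels)
spot_intensity = sum(foreground) / len(foreground)
print(background, foreground, spot_intensity)
```

Because the partition adapts to the pixel values rather than to a fixed circle, an irregular or donut-shaped spot is handled the same way as a round one; a medoid-based variant would use cluster medians-like representatives and be less sensitive to outlying pixels.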

17.
The shark genus Mustelus is speciose, commercially important and systematically troublesome. We use a molecular approach combining inter and intra-specific data to investigate Mustelus species in the central Indo-Pacific and Australasia. Our analysis supports two Mustelus clades, one comprising species with no white spots and a placental reproductive mode and a second clade of white spotted, aplacental species. Levels of genetic divergence are low, especially among species in the white spotted, aplacental clade and this should be taken into account when employing molecular data to delineate species. Our data support the hypothesis of a radiation following dispersal from a northern hemisphere ancestor. Molecular dating suggests that localised speciation in Australasia may have occurred during the Pleistocene. We propose that some of the difficulties associated with Mustelus systematics relate to a recent radiation, particularly in the Australasian region.

18.
Management, presentation and interpretation of genome scans using GSCANDB   Total citations: 1 (self-citations: 1, citations by others: 0)
MOTIVATION: Advances in high-throughput genotyping have made it possible to carry out genome-wide association studies using very high densities of genetic markers. This has led to the problem of the storage, management, quality control, presentation and interpretation of results. In order to achieve a successful outcome, it may be necessary to analyse the data in different ways and compare the results with genome annotations and other genome scans. RESULTS: We created GSCANDB, a database for genome scan data, using a MySQL backend and Perl-CGI web interface. It displays genome scans of multiple phenotypes analysed in different ways and projected onto genome annotations derived from EnsMart. The current version is optimized for analysis of mouse data, but is customizable to other species. AVAILABILITY: Source code and example data are available under the GPL, in versions tailored to either human or mouse association studies, from http://gscan.well.ox.ac.uk/software.

19.
MOTIVATION: The numerical values of gene expression measured using microarrays are usually presented to the biological end-user as summary statistics of spot pixel data, such as the spot mean, median and mode. Much of the subsequent data analysis reported in the literature, however, uses only one of these spot statistics. This results in sub-optimal estimates of gene expression levels and a need for improvement in quantitative spot variation surveillance. RESULTS: This paper develops a maximum-likelihood method for estimating gene expression using spot mean, variance and pixel number values available from typical microarray scanners. It employs a hierarchical model of variation between and within microarray spots. The hierarchical maximum-likelihood estimate (MLE) is shown to be a more efficient estimator of the mean than the 'conventional' estimate using solely the spot mean values (i.e. without spot variance data). Furthermore, under the assumptions of our model, the spot mean and spot variance are shown to be sufficient statistics that do not require the use of all pixel data. The hierarchical MLE method is applied to data from both Monte Carlo (MC) simulations and a two-channel dye-swapped spotted microarray experiment. The MC simulations show that the hierarchical MLE method leads to improved detection of differential gene expression, particularly when 'outlier' spots are present on the arrays. Compared with the conventional method, the MLE method applied to data from the microarray experiment leads to an increase in the number of differentially expressed genes detected for low cut-off P-values of interest.
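
Why spot variance and pixel number carry information beyond the spot mean can be illustrated with a precision-weighted average in Python. This is a simplified sketch, not the paper's hierarchical MLE: it assumes each spot mean has variance `between_var + within_var / n_pixels` and treats the between-spot variance as known, whereas a full maximum-likelihood fit would estimate it. All names and numbers are hypothetical.

```python
def precision_weighted_mean(spots, between_var):
    """Weighted mean of replicate spot means.

    spots: list of (spot_mean, spot_variance, n_pixels) tuples.
    between_var: assumed known spot-to-spot variance.
    Each spot is weighted by 1 / (between_var + spot_variance / n_pixels),
    so noisy spots contribute less.
    """
    weights = [1.0 / (between_var + var / n) for _, var, n in spots]
    weighted_sum = sum(w * mean for w, (mean, _, _) in zip(weights, spots))
    return weighted_sum / sum(weights)

# Made-up replicate spots: the third is an outlier with very noisy pixels.
spots = [
    (100.0, 400.0, 100),
    (102.0, 400.0, 100),
    (160.0, 90000.0, 100),
]
conventional = sum(mean for mean, _, _ in spots) / len(spots)
weighted = precision_weighted_mean(spots, between_var=4.0)
print(conventional, weighted)
```

The conventional average is pulled towards the outlier (about 120.7), while the weighted estimate stays near the two consistent spots (about 101.3), which mirrors the abstract's remark about improved behaviour when outlier spots are present.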

20.
Microarray analysis is a critically important technology for genome-enabled biology; therefore, it is essential that the data obtained be reliable. Current software and normalization techniques for microarray analysis rely on the assumption that fluorescent background within spots is essentially the same throughout the glass slide and can be measured by fluorescence surrounding the spots. This assumption is not valid if background fluorescence is spot-localized. Inaccurate estimates of background fluorescence under the spot create a source of error, especially for weakly expressed genes. We have identified spot-localized, contaminating fluorescence in the Cy3 channel on several commercial and in-house printed microarray slides. We determined through mock hybridizations (without labeled target) that pre-hybridization scans could not be used to predict the contribution of this contaminating fluorescence after hybridization because the change in spot-to-spot fluorescence after hybridization was too variable. Two solutions to this problem were identified. First, allowing 4 h of exposure to air prior to printing onto Corning UltraGAPS slides significantly reduced contaminating fluorescence intensities to approximately the value of the surrounding glass. Alternatively, application of a novel hyperspectral imaging scanner and multivariate curve resolution algorithms allowed the spectral contributions of Cy3 signal, glass, and contaminating fluorescence to be distinguished and quantified after hybridization.
