The analysis of two-colour cDNA microarray data usually involves subtracting background values from foreground values prior to normalization and further analysis. This approach has the advantage of reducing bias and the disadvantage of inflating the variance of low-abundance spots. Simple background subtraction implicitly assumes locally constant background values. In practice, this assumption is often not met, which casts doubt on the usefulness of simple background subtraction. To improve background correction, we propose local background smoothing within the pre-processing pipeline of cDNA microarray data prior to background correction. For this purpose, we employ a geostatistical framework with ordinary kriging using both isotropic and anisotropic models of spatial correlation and 2-D locally weighted regression. We show that applying local background smoothing prior to background correction is beneficial in comparison to using raw background estimates. We demonstrate this using data from a self-versus-self experiment in Arabidopsis in which subsets of differentially expressed genes were simulated. Using locally smoothed background values in conjunction with existing background correction methods increases the power, increases the accuracy and decreases the number of false positive results.
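
As a rough illustration of the idea in the abstract above, the sketch below smooths a grid of per-spot background estimates before subtracting them from the foreground. It does not reproduce the authors' method: it uses a tricube-weighted local average as a much simpler stand-in for ordinary kriging or 2-D locally weighted regression. The function names, the grid layout (one value per spot) and the `radius` parameter are all assumptions made for this example.

```python
import math


def tricube(u):
    """Tricube kernel weight for a scaled distance u (zero for u >= 1)."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def smooth_background(bg, radius=2.5):
    """Return a locally smoothed copy of a 2-D grid of background estimates.

    bg is a list of rows with one raw background value per spot.
    radius is the (hypothetical) neighbourhood size in spot units.
    This is a kernel-weighted local average, not kriging or full loess.
    """
    n_rows, n_cols = len(bg), len(bg[0])
    reach = int(math.floor(radius))
    smoothed = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            weight_sum = 0.0
            value_sum = 0.0
            for rr in range(max(0, r - reach), min(n_rows, r + reach + 1)):
                for cc in range(max(0, c - reach), min(n_cols, c + reach + 1)):
                    w = tricube(math.hypot(rr - r, cc - c) / radius)
                    weight_sum += w
                    value_sum += w * bg[rr][cc]
            # The spot itself always has weight 1, so weight_sum > 0.
            smoothed[r][c] = value_sum / weight_sum
    return smoothed


def subtract_background(fg, bg):
    """Element-wise foreground minus background for two equal-sized grids."""
    return [[f - b for f, b in zip(f_row, b_row)] for f_row, b_row in zip(fg, bg)]
```

A typical call would be `subtract_background(fg, smooth_background(bg))`. A single outlying background estimate is pulled towards its neighbours before subtraction, so it no longer distorts the corrected intensity of one spot as strongly.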

Microarrays are part of a new class of biotechnologies that allow the monitoring of expression levels for thousands of genes simultaneously. Image analysis is an important aspect of microarray experiments, one that can have a potentially large impact on subsequent analyses, such as clustering or the identification of differentially expressed genes. This paper reviews a number of existing image analysis methods used on cDNA microarray data. In particular, it describes and discusses the different segmentation and background adjustment methods. It was found that in some cases background adjustment can substantially reduce the precision--that is, increase the variability of low-intensity spot values. In contrast, the choice of segmentation procedure seems to have a smaller impact.

Image analysis and statistical analysis are two important stages of cDNA microarray experiments. Within image analysis, gridding is necessary to accurately identify the location of each spot when extracting spot intensities from the microarray images, and automating this procedure permits high-throughput analysis. Due to the deficiencies of the equipment used to print the arrays, rotations, misalignments, high contamination with noise and artifacts, and the enormous amount of data generated, solving the gridding problem by means of an automatic system is not trivial. Existing techniques to solve the automatic grid segmentation problem cover only limited aspects of this challenging problem and require the user to specify the size of the spots, the number of rows and columns in the grid, and boundary conditions. In this paper, a hill-climbing automatic gridding and spot quantification technique is proposed which takes a microarray image (or a subgrid) as input and makes no assumptions about the size of the spots or the number of rows and columns in the grid. The proposed method is based on a hill-climbing approach that utilizes different objective functions. The method has been found to effectively detect the grids on microarray images drawn from databases from GEO and the Stanford genomic laboratories.

MOTIVATION: Data from one-channel cDNA microarray studies may exhibit poor reproducibility due to spatial heterogeneity, non-linear array-to-array variation and problems in correcting for background. Uncorrected, these phenomena can give rise to misleading conclusions. RESULTS: Spatial heterogeneity may be corrected using two-dimensional loess smoothing (Colantuoni et al., 2002). Non-linear between-array variation may be corrected using an iterative application of one-dimensional loess smoothing. A method for background correction using a smoothing function rather than simple subtraction is described. These techniques promote within-array spatial uniformity and between-array reproducibility. Their application is illustrated using data from a study of the effects of an insulin sensitizer, rosiglitazone, on gene expression in white adipose tissue in diabetic db/db mice. They may also be useful with data from two-channel cDNA microarrays and from oligonucleotide arrays. AVAILABILITY: R functions for the methods described are available on request from the author.



The quality of cDNA microarray data is crucial for expanding its application to other research areas, such as the study of gene regulatory networks. Despite the fact that a number of algorithms have been suggested to increase the accuracy of microarray gene expression data, it is necessary to obtain reliable microarray images by improving wet-lab experiments. As the first step of a cDNA microarray experiment, spotting cDNA probes is critical to determining the quality of spot images.

MOTIVATION: We present a new approach to the analysis of images for complementary DNA microarray experiments. The image segmentation and intensity estimation are performed simultaneously by adopting a two-component mixture model. One component of this mixture corresponds to the distribution of the background intensity, while the other corresponds to the distribution of the foreground intensity. The intensity measurement is a bivariate vector consisting of red and green intensities. The background intensity component is modeled by the bivariate gamma distribution, whose marginal densities for the red and green intensities are independent three-parameter gamma distributions with different parameters. The foreground intensity component is taken to be the bivariate t distribution, with the constraint that the mean of the foreground is greater than that of the background for each of the two colors. The degrees of freedom of this t distribution are inferred from the data but they could be specified in advance to reduce the computation time. Also, the covariance matrix is not restricted to being diagonal and so it allows for nonzero correlation between R and G foreground intensities. This gamma-t mixture model is fitted by maximum likelihood via the EM algorithm. A final step is executed whereby nonparametric (kernel) smoothing is undertaken of the posterior probabilities of component membership. 
The main advantages of this approach are: (1) it enjoys the well-known strengths of a mixture model, namely flexibility and adaptability to the data; (2) it considers the segmentation and intensity simultaneously and not separately as in commonly used existing software, and it also works with the red and green intensities in a bivariate framework as opposed to their separate estimation via univariate methods; (3) the use of the three-parameter gamma distribution for the background red and green intensities provides a much better fit than the normal (log normal) or t distributions; (4) the use of the bivariate t distribution for the foreground intensity provides a model that is less sensitive to extreme observations; (5) as a consequence of the aforementioned properties, it allows segmentation to be undertaken for a wide range of spot shapes, including doughnut, sickle shape and artifacts. RESULTS: We apply our method for gridding, segmentation and estimation to real cDNA microarray images and artificial data. Our method provides better segmentation of spot shapes and better intensity estimation than the Spot and spotSegmentation R packages. It detected blank spots as well as bright artifacts in the real data, and estimated spot intensities with high accuracy for the synthetic data. AVAILABILITY: The algorithms were implemented in Matlab. The Matlab code implementing both the gridding and segmentation/estimation is available upon request. SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.



Normalization is a basic step in microarray data analysis. A proper normalization procedure ensures that the intensity ratios provide meaningful measures of relative expression values.

An enormous amount of microarray data has been collected and accumulated in public repositories. Although some of the depositions include both raw and processed data, a significant part of them include processed data only. If multiple datasets are to be combined for a specific purpose, the data should be adjusted prior to use to remove bias between the datasets. We focused on the GeneChip platform and a pre-processing method, RMA, and examined simple quantile correction as a post-processing method for integration. Integration of the data pre-processed by RMA was evaluated using artificial spike-in datasets and real microarray datasets of atopic dermatitis and lung cancer. Studies using the spike-in datasets show that quantile correction for data integration reduces the data quality to some extent, but the quality should remain at an acceptable level. Studies using the real datasets show that quantile correction significantly reduces the bias. These results show that quantile correction is useful for integrating multiple datasets processed by RMA, and they encourage effective use of public microarray data.
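
The abstract above does not spell out the exact form of its quantile correction. The sketch below assumes the standard quantile normalization scheme, in which every array is forced onto a common reference distribution formed by averaging the sorted values across arrays. The function name and the list-of-lists input layout are assumptions made for this example, and all arrays are assumed to measure the same probes.

```python
def quantile_normalize(samples):
    """Map every sample onto the mean quantile distribution.

    samples is a list of equal-length lists, one list of expression
    values per array. Returns new lists and leaves the input untouched.
    Ties are broken by original position rather than averaged, which
    is a simplification.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same number of values")

    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples) for k in range(n)
    ]

    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized
```

After this step every array has exactly the same set of values, only in a different order, which is what removes distributional bias between datasets. It can also remove real global differences, consistent with the modest loss of data quality that the abstract reports.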



There are many sources of variation in dual labelled microarray experiments, including data acquisition and image processing. The final interpretation of experiments strongly relies on the accuracy of the measurement of the signal intensity. For low intensity spots in particular, accurately estimating gene expression variations remains a challenge as signal measurement is, in this case, highly subject to fluctuations.



In cancer studies, it is common that multiple microarray experiments are conducted to measure the same clinical outcome and expressions of the same set of genes. An important goal of such experiments is to identify a subset of genes that can potentially serve as predictive markers for cancer development and progression. Analyses of individual experiments may lead to unreliable gene selection results because of the small sample sizes. Meta analysis can be used to pool multiple experiments, increase statistical power, and achieve more reliable gene selection. The meta analysis of cancer microarray data is challenging because of the high dimensionality of gene expressions and the differences in experimental settings amongst different experiments.



In microarray experiments, many undesirable systematic variations are commonly observed. Normalization is the process of removing such variation that affects the measured gene expression levels. Normalization plays an important role in the early stages of microarray data analysis, and the results of subsequent analyses are highly dependent on it. One major source of variation is the background intensities. Recently, some methods have been employed for correcting the background intensities. However, all these methods focus on defining signal intensities appropriately from foreground and background intensities in the image analysis. Although a number of normalization methods have been proposed, no systematic methods have been proposed that use the background intensities in the normalization process.

Automatic analysis of DNA microarray images using mathematical morphology   (Total citations: 10; self-citations: 0; citations by others: 10)
MOTIVATION: DNA microarrays are an experimental technology consisting of arrays of thousands of discrete DNA sequences printed on glass microscope slides. Image analysis is an important aspect of microarray experiments. The aim of this step is to reduce an image of spots into a table with a measure of the intensity for each spot. Efficient, accurate and automatic analysis of DNA spot images is essential in order to use this technology in laboratory routines. RESULTS: We present an automatic non-supervised set of algorithms for fast and accurate spot data extraction from DNA microarrays using morphological operators which are robust to both intensity variation and artefacts. The approach can be summarised as follows. Initially, a gridding algorithm yields the automatic segmentation of the microarray image into spot quadrants which are later individually analysed. Then the analysis of the spot quadrant images is achieved in five steps. First, as a pre-quantification step, the spot size distribution law is calculated. Second, the background noise extraction is performed using a morphological filtering by area. Third, an orthogonal grid provides a first approximation of the spot locations. Fourth, the spot segmentation, or definition of spot boundaries, is carried out using the watershed transformation. Fifth, the outline of the detected spots allows the signal quantification, or extraction of spot intensities; in this respect, a noise model has been investigated. The performance of the algorithm has been compared with two packages, ScanAlyze and Genepix, showing its robustness and precision.

A complicated genetic pathway regulates the developmental programs of the male reproductive organ, the anther. To understand these molecular mechanisms, we performed cDNA microarray analyses and in situ hybridization to monitor gene expression patterns during anther development in rice. Microarray analysis of 4,304 cDNA clones revealed that the hybridization signal of 396 cDNA clones (271 non-redundant groups) increased more than six-fold in every stage of the anthers compared with that of leaves. Cluster analysis of the expression data showed that 259 cDNA clones (156 non-redundant groups) were specifically or predominantly expressed in anther tissues and were regulated in a developmental stage-specific manner in the anther tissues. These co-regulated genes would be important for the development of functional anther tissues. Furthermore, we selected several clones for RNA in situ hybridization analysis. From these analyses, we found several novel genes that show temporal and spatial expression patterns during anther development, in addition to the anther-specific genes reported so far. These results indicate that the genes identified in this experiment are controlled by different programs and are specialized by developmental stage and cell type.

The cDNA microarray is an important tool for generating large datasets of gene expression measurements. An efficient design is critical to ensure that the experiment will be able to address relevant biological questions. Microarray experimental design can be treated as a multicriterion optimization problem. For this class of problems evolutionary algorithms (EAs) are well suited, as they can search the solution space and evolve a design that optimizes the parameters of interest based on their relative value to the researcher under a given set of constraints. This paper introduces the use of EAs for optimization of experimental designs of spotted microarrays using a weighted objective function. The EA and the various criteria relevant to design optimization are discussed. Evolved designs are compared with designs obtained through exhaustive search with results suggesting that the EA can find just as efficient optimal or near-optimal designs within a tractable timeframe.
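
To make the notion of a weighted objective function concrete, here is a minimal sketch of a mutation-only evolutionary search over two-colour designs. The abstract above does not list the paper's actual design criteria or EA operators, so everything here is hypothetical: a design is represented as a list of (Cy3 condition, Cy5 condition) pairs, and the two criteria (dye imbalance and uneven replication) are chosen only for illustration.

```python
import random


def random_pair(rng, n_conditions):
    """Pick two distinct conditions to co-hybridize on one array."""
    a, b = rng.sample(range(n_conditions), 2)
    return (a, b)


def cost(design, n_conditions, weights=(1.0, 1.0)):
    """Weighted sum of two illustrative design criteria (lower is better)."""
    cy3 = [0] * n_conditions
    cy5 = [0] * n_conditions
    for a, b in design:
        cy3[a] += 1
        cy5[b] += 1
    # Criterion 1: each condition should be labelled equally often with each dye.
    dye_imbalance = sum(abs(x - y) for x, y in zip(cy3, cy5))
    # Criterion 2: every condition should appear on a similar number of arrays.
    totals = [x + y for x, y in zip(cy3, cy5)]
    replication_spread = max(totals) - min(totals)
    return weights[0] * dye_imbalance + weights[1] * replication_spread


def evolve(n_conditions, n_arrays, weights=(1.0, 1.0),
           generations=200, pop_size=30, seed=0):
    """Truncation selection with elitism; each child gets one mutated array."""
    rng = random.Random(seed)
    half = pop_size // 2
    pop = [[random_pair(rng, n_conditions) for _ in range(n_arrays)]
           for _ in range(2 * half)]
    for _ in range(generations):
        pop.sort(key=lambda d: cost(d, n_conditions, weights))
        survivors = pop[:half]  # the best designs are kept unchanged
        children = []
        for parent in survivors:
            child = list(parent)
            child[rng.randrange(n_arrays)] = random_pair(rng, n_conditions)
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda d: cost(d, n_conditions, weights))
```

Changing `weights` shifts the search towards whichever criterion matters more to the researcher. For example, `cost([(0, 1), (1, 2), (2, 0)], 3)` evaluates a three-array loop design, which is perfectly balanced under both criteria and therefore has cost 0.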

Segmentation of cDNA microarray spots using Markov random field modeling   (Total citations: 3; self-citations: 3; citations by others: 0)
Motivation: Spot segmentation is a critical step in microarray gene expression data analysis. Therefore, the performance of segmentation may substantially affect the results of subsequent stages of the analysis, such as the detection of differentially expressed genes. Several methods have been developed to segment microarray spots from the surrounding background. In this study, we have proposed a new approach based on Markov random field (MRF) modeling and tested its performance on simulated and real microarray images against a widely used segmentation method based on Mann–Whitney test adopted by QuantArray software (Boston, MA). Spot addressing was performed using QuantArray. We have also devised a simulation method to generate microarray images with realistic features. Such images can be used as gold standards for the purposes of testing and comparing different segmentation methods, and optimizing segmentation parameters. Results: Experiments on simulated and 14 actual microarray image sets show that the proposed MRF-based segmentation method can detect spot areas and estimate spot intensities with higher accuracy. Availability: The algorithms were implemented in Matlab™ (The Mathworks, Inc., Natick, MA) environment. The codes for MRF-based segmentation and image simulation methods are available upon request. Contact: demirkaya{at}ieee.org

RNA amplification strategies for cDNA microarray experiments   (Total citations: 5; self-citations: 0; citations by others: 5)

The treatment of most patients with head and neck cancer includes ionizing radiation (IR). Salivary glands in the IR field suffer significant and irreversible damage, leading to considerable morbidity. Previously, we reported that adenoviral (Ad)-mediated transfer of the human aquaporin-1 (hAQP1) cDNA to rat [C. Delporte, B.C. O'Connell, X. He, H.E. Lancaster, A.C. O'Connell, P. Agre, B.J. Baum, Increased fluid secretion after adenoviral-mediated transfer of the aquaporin-1 cDNA to irradiated rat salivary glands. Proc. Natl. Acad. Sci. U.S.A. 94 (1997) 3268-3273] and miniature pig [Z. Shan, J. Li, C. Zheng, X. Liu, Z. Fan, C. Zhang, C.M. Goldsmith, R.B. Wellner, B.J. Baum, S. Wang, Increased fluid secretion after adenoviral-mediated transfer of the human aquaporin-1 cDNA to irradiated miniature pig parotid glands. Mol. Ther. 11 (2005) 444-451] salivary glands approximately 16 weeks following IR resulted in a dose-dependent increase in salivary flow to ≥80% of control levels on day 3. A control Ad vector was without any significant effect on salivary flow. Additionally, after administration of Ad vectors to salivary glands, no significant lasting effects were observed in multiple measured clinical chemistry and hematology values. Taken together, the findings show that localized delivery of AdhAQP1 to IR-damaged salivary glands is useful in transiently increasing salivary secretion in both small and large animal models, without significant general adverse events. Based on these results, we are developing a clinical trial to test if the hAQP1 cDNA transfer strategy will be clinically effective in restoring salivary flow in patients with IR-induced parotid hypofunction.

