Similar articles
20 similar articles found (search took 15 ms)
1.

Background  

Normalization of gene expression microarrays carrying thousands of genes is based on assumptions that do not hold for diagnostic microarrays carrying only a few genes. Thus, applying standard microarray normalization strategies to diagnostic microarrays causes new normalization problems.

2.
To date, microarray-based genotyping of large, complex plant genomes has been complicated by the need to perform genome complexity reduction to obtain sufficiently strong hybridization signals. Genome complexity reduction techniques are, however, tedious and can introduce unwanted variables into genotyping assays. Here, we report a microarray-based genotyping technology for complex genomes (such as the 2.3 Gb maize genome) that does not require genome complexity reduction prior to hybridization. Approximately 200,000 long oligonucleotide probes were identified as being polymorphic between the inbred parents of a mapping population and used to genotype two recombinant inbred lines. While multiple hybridization replicates provided ~97% accuracy, even a single replicate provided ~95% accuracy. Genotyping accuracy was further increased to >99% by utilizing information from adjacent probes. This microarray-based method provides a simple, high-density genotyping approach for large, complex genomes.

3.

Background  

Many different microarray experiments are publicly available today. It is natural to ask whether different experiments for the same phenotypic conditions can be combined using meta-analysis, in order to increase the overall sample size. However, some genes are not measured in all experiments, hence they cannot be included or their statistical significance cannot be appropriately estimated in traditional meta-analysis. Nonetheless, these genes, which we refer to as incomplete genes, may also be informative and useful.

4.
The measurements of coordinated patterns of protein abundance using antibody microarrays could be used to gain insight into disease biology and to probe the use of combinations of proteins for disease classification. The correct use and interpretation of antibody microarray data requires proper normalization of the data, which has not yet been systematically studied. Therefore we undertook a study to determine the optimal normalization of data from antibody microarray profiling of proteins in human serum specimens. Forty-three serum samples collected from patients with pancreatic cancer and from control subjects were probed in triplicate on microarrays containing 48 different antibodies, using a direct labeling, two-color comparative fluorescence detection format. Seven different normalization methods representing major classes of normalization for antibody microarray data were compared by their effects on reproducibility, accuracy, and trends in the data set. Normalization with ELISA-determined concentrations of IgM resulted in the most accurate, reproducible, and reliable data. The other normalization methods were deficient in at least one of the criteria. Multiparametric classification of the samples based on the combined measurement of seven of the proteins demonstrated the potential for increased classification accuracy compared with the use of individual measurements. This study establishes reliable normalization for antibody microarray data, criteria for assessing normalization performance, and the capability of antibody microarrays for serum-protein profiling and multiparametric sample classification.

5.

Background

Normalization is an important step for microarray data analysis to minimize biological and technical variations. Choosing a suitable approach can be critical. The default method in GeneChip expression microarray uses a constant factor, the scaling factor (SF), for every gene on an array. The SF is obtained from a trimmed average signal of the array after excluding the 2% of the probe sets with the highest and the lowest values.

Results

Among the 76 U34A GeneChip experiments, the total signals on each array showed 25.8% variations in terms of the coefficient of variation, although all microarrays were hybridized with the same amount of biotin-labeled cRNA. The 2% of the probe sets with the highest signals that were normally excluded from SF calculation accounted for 34% to 54% of the total signals (40.7% ± 4.4%, mean ± sd). In comparison with normalization factors obtained from the median signal or from the mean of the log transformed signal, SF showed the greatest variation. The normalization factors obtained from log transformed signals showed least variation.

Conclusions

Eliminating 40% of the signal data during SF calculation failed to show any benefit. Normalization factors obtained with log transformed signals performed the best. Thus, it is suggested to use the mean of the logarithm transformed data for normalization, rather than the arithmetic mean of signals in GeneChip gene expression microarrays.
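
The two kinds of normalization factor compared in this abstract can be sketched as follows. This is a minimal illustration and not the GeneChip software's implementation: the function names, the `target` value of 500, and the per-tail reading of the 2% trim are assumptions made here for the example.

```python
import numpy as np

def trimmed_mean_scaling_factor(signals, target=500.0, trim=0.02):
    """Scaling factor from a trimmed arithmetic mean.

    Drops the lowest and highest `trim` fraction of signals, then
    returns target / mean(remaining). `target` is a hypothetical
    target intensity chosen for this example.
    """
    s = np.sort(np.asarray(signals, dtype=float))
    k = int(len(s) * trim)              # number of values trimmed per tail
    kept = s[k:len(s) - k] if k > 0 else s
    return target / kept.mean()

def log_mean_scaling_factor(signals, target=500.0):
    """Scaling factor from the mean of log-transformed signals.

    exp(mean(log(x))) is the geometric mean, which is far less
    sensitive to a handful of very bright probe sets.
    """
    s = np.asarray(signals, dtype=float)
    return target / np.exp(np.mean(np.log(s)))
```

Multiplying every signal on an array by its factor brings the chosen summary (trimmed mean or geometric mean) to the common target, so arrays can be compared on one scale.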

6.
A two-channel microarray measures the relative expression levels of thousands of genes from a pair of biological samples. In order to reliably compare gene expression levels between and within arrays, it is necessary to remove systematic errors that distort the biological signal of interest. The standard for accomplishing this is smoothing "MA-plots" to remove intensity-dependent dye bias and array-specific effects. However, MA methods require strong assumptions, which limit their general applicability. We review these assumptions and derive several practical scenarios in which they fail. The "dye-swap" normalization method has been much less frequently used because it requires two arrays per pair of samples. We show that a dye-swap is accurate under general assumptions, even under intensity-dependent dye bias, and that a dye-swap removes dye bias from a single pair of samples in general. Based on a flexible model of the relationship between mRNA amount and single-channel fluorescence intensity, we demonstrate the general applicability of a dye-swap approach. We then propose a common array dye-swap (CADS) method for the normalization of two-channel microarrays. We show that CADS removes both dye bias and array-specific effects, and preserves the true differential expression signal for every gene under the assumptions of the model.
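
The basic dye-swap idea mentioned in this abstract can be illustrated with a short sketch. It shows only the classic paired dye-swap average, not the CADS method the authors propose, and it assumes the dye bias is additive on the log2 scale and identical on both arrays. The function and argument names are hypothetical.

```python
import numpy as np

def dye_swap_log_ratio(r1, g1, r2, g2):
    """Estimate log2(A/B) per gene from a dye-swap pair.

    Array 1: sample A in the red channel, sample B in green.
    Array 2: dyes swapped (B in red, A in green).
    With M1 = d + bias and M2 = -d + bias, the half-difference
    (M1 - M2) / 2 recovers d and cancels the shared bias term.
    """
    m1 = np.log2(np.asarray(r1, dtype=float) / np.asarray(g1, dtype=float))
    m2 = np.log2(np.asarray(r2, dtype=float) / np.asarray(g2, dtype=float))
    return (m1 - m2) / 2.0
```

Because the bias term may differ from gene to gene (for example, with intensity) and still cancel, this estimate needs no smoothing step, at the cost of two arrays per sample pair.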

7.
MOTIVATION: Normalization of microarray data is essential for multiple-array analyses. Several normalization protocols have been proposed based on different biological or statistical assumptions. A fundamental question is whether they have effectively normalized the arrays. In addition, for a given array, the question arises of how to choose a method to most effectively normalize the microarray data. RESULTS: We propose several techniques to compare the effectiveness of different normalization methods. We approach the problem by constructing statistics to test whether there are any systematic biases in the expression profiles among duplicated spots within an array. The test statistics involve estimating the genewise variances. This is accomplished by using several novel methods, including empirical Bayes methods for moderating the genewise variances and the smoothing methods for aggregating variance information. P-values are estimated based on a normal or chi approximation. With estimated P-values, we can choose the most appropriate method to normalize a specific array and assess the extent to which the systematic biases due to the variations of experimental conditions have been removed. The effectiveness and validity of the proposed methods are convincingly illustrated by a carefully designed simulation study. The method is further illustrated by an application to human placenta cDNAs comprising a large number of clones with replications, a customized microarray experiment carrying just a few hundred genes on the study of the molecular roles of Interferons on tumor, and the Agilent microarrays carrying tens of thousands of total RNA samples in the MAQC project on the study of reproducibility, sensitivity and specificity of the data. AVAILABILITY: Code to implement the method in the statistical package R is available from the authors.

8.

Background  

Several aspects of microarray data analysis are dependent on identification of genes expressed at or near the limits of detection. For example, regression-based normalization methods rely on the premise that most genes in compared samples are expressed at similar levels and therefore require accurate identification of nonexpressed genes (additive noise) so that they can be excluded from the normalization procedure. Moreover, key regulatory genes can maintain stringent control of a given response at low expression levels. If arbitrary cutoffs are used for distinguishing expressed from nonexpressed genes, some of these key regulatory genes may be unnecessarily excluded from the analysis. Unfortunately, no accurate method for differentiating additive noise from genes expressed at low levels is currently available.

9.
10.

Background  

Extracting biological information from high-density Affymetrix arrays is a multi-step process that begins with the accurate annotation of microarray probes. Shortfalls in the original Affymetrix probe annotation have been described; however, few studies have provided rigorous solutions for routine data analysis.

11.
The ability to measure gene expression on a genome-wide scale is one of the most promising accomplishments in molecular biology. Microarrays, the technology that first permitted this, were riddled with problems due to unwanted sources of variability. Many of these problems are now mitigated, after a decade's worth of statistical methodology development. The recently developed RNA sequencing (RNA-seq) technology has generated much excitement in part due to claims of reduced variability in comparison to microarrays. However, we show that RNA-seq data demonstrate unwanted and obscuring variability similar to what was first observed in microarrays. In particular, we find guanine-cytosine content (GC-content) has a strong sample-specific effect on gene expression measurements that, if left uncorrected, leads to false positives in downstream results. We also report on commonly observed data distortions that demonstrate the need for data normalization. Here, we describe a statistical methodology that improves precision by 42% without loss of accuracy. Our resulting conditional quantile normalization algorithm combines robust generalized regression to remove systematic bias introduced by deterministic features such as GC-content and quantile normalization to correct for global distortions.
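
Plain quantile normalization, one of the two ingredients this abstract names, can be sketched as below. This is not the authors' conditional quantile normalization algorithm: the GC-content regression step is omitted, ties are broken by sort order rather than averaged, and the function name is an assumption for this example.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize a matrix (rows = features, columns = samples).

    Every column is forced onto the same reference distribution:
    the mean of the sorted columns. Each value is replaced by the
    reference value at its within-column rank.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)        # sorting permutation per column
    ranks = np.argsort(order, axis=0)    # rank of each entry within its column
    reference = np.sort(x, axis=0).mean(axis=1)
    return reference[ranks]
```

After this step all samples share an identical empirical distribution, which corrects global distortions but, by construction, cannot remove a feature-specific effect such as GC-content.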

12.
We introduce a novel experimental methodology for the reverse‐phase protein microarray platform which reduces the typical measurement CV by as much as 70%. The methodology, referred to as array microenvironment normalization, increases the statistical power of the platform. In the experiment, it enabled the detection of a 1.1‐fold shift in prostate specific antigen concentration using approximately six technical replicates rather than the 37 replicates previously required. The improved reproducibility and statistical power should facilitate clinical implementation of the platform.

13.
New monitoring programs are often designed with some form of temporal replication to deal with imperfect detection by means of occupancy models. However, classical bird census data from earlier times often lack temporal replication, precluding detection‐corrected inferences about occupancy. Historical data have a key role in many ecological studies intended to document range shifts, and so need to be made comparable with present‐day data by accounting for detection probability. We analyze a classical bird census conducted in the region of Murcia (SE Spain) in 1991 and 1992 and propose a solution to estimating detection probability for such historical data when used in a community occupancy model: the spatial replication of subplots nested within larger plots allows estimation of detection probability. In our study, the basic sample units were 1‐km transects, which were considered spatial replicates in two aggregation schemes. We fit two Bayesian multispecies occupancy models, one for each aggregation scheme, and evaluated the linear and quadratic effects of forest cover and temperature, and a linear effect of precipitation, on species occupancy probabilities. Using spatial rather than temporal replicates allowed us to obtain individual species occupancy probabilities and species richness accounting for imperfect detection. Species‐specific occupancy and community size decreased with increasing annual mean temperature. Both aggregation schemes yielded estimates of occupancy and detectability that were highly correlated for each species, so in the design of future surveys ecological reasons and cost‐effective sampling designs should be considered to select the most suitable aggregation scheme. In conclusion, the use of spatial replication may often allow historical survey data to be analyzed formally with hierarchical occupancy models and compared with modern‐day data on the species community to analyze global change processes.

14.
Hundreds of tissue samples may be assembled in a tissue microarray format for simultaneous immunostaining assessment of protein expression profiling. A DNA microarray two-color laser scanner was used for automated analysis of tissue microarray indirect immunofluorescence. On sections from both a human lung adenocarcinoma and a squamous cell carcinoma tissue microarray, fluorescence intensity for two epidermal growth factor receptors (EGFR and c-erbB2) correlates with diagnostic pathologic assessment, indicating that immunohistochemistry quantitation can be achieved. Importantly, double-label indirect immunofluorescence detection with the cDNA scanner demonstrates that one reference antigen can normalize tumor marker immunosignal for the cellular content of tissue microarray tissue cores. Therefore, DNA microarray scanners and associated image analysis software provide general and efficient analysis of tissue microarray immunostaining, including estimation of specific protein expression levels.

15.
Structural properties of articular cartilage such as proteoglycan content, collagen content and collagen alignment are known to vary over length scales as small as a few microns (Bullough and Goodfellow, 1968; Bi et al., 2006). Characterizing the resulting variation in mechanical properties is critical for understanding how the inhomogeneous architecture of this tissue gives rise to its function. Previous studies have measured the depth-dependent shear modulus of articular cartilage using methods such as particle image velocimetry (PIV) that rely on cells and cell nuclei as fiducial markers to track tissue deformation (Buckley et al., 2008; Wong et al., 2008a). However, such techniques are limited by the density of trackable markers, which may be too low to take full advantage of optical microscopy. This limitation leads to noise in the acquired data, which is often exacerbated when the data is manipulated. In this study, we report on two techniques for increasing the accuracy of tissue deformation measurements. In the first technique, deformations were tracked in a grid that was photobleached on each tissue sample (Bruehlmann et al., 2004). In the second, a numerical technique was implemented that allowed for accurate differentiation of optical displacement measurements by minimizing the propagated experimental error while ensuring that truncation error associated with local averaging of the data remained small. To test their efficacy, we employed these techniques to compare the depth-dependent shear moduli of neonatal bovine and adult human articular cartilage. Using a photobleached grid and numerical optimization to gather and analyze data led to results consistent with those reported previously (Buckley et al., 2008; Wong et al., 2008a), but with increased spatial resolution and characteristic coefficients of variation that were reduced by up to a factor of 3.
This increased resolution allowed us to determine that the shear moduli of neonatal bovine and adult human tissue both exhibit a global minimum at a depth z of around 100 μm and plateau at large depths. The consistency of the depth dependence of |G*|(z) for adult human and neonatal bovine tissue suggests a functional advantage resulting from this behavior.

16.
Understanding actual and potential selection on traits of invasive species requires an assessment of the sources of variation in demographic rates. While some of this variation is assignable to environmental, biotic or historical factors, unexplained demographic variation also may play an important role. Even when sites and populations are chosen as replicates, the residual variation in demographic rates can lead to unexplained divergence of asymptotic and transient population dynamics. This kind of divergence could be important for understanding long- and short-term differences among populations of invasive species, but little is known about it. We investigated the demography of a small invasive tree Psidium cattleianum Sabine in the rainforest of Hawaiʻi at four sites chosen for their ecological similarity. Specifically, we parameterized and analyzed integral projection models (IPM) to investigate projected variability among replicate populations in: (1) total population size and annual per capita population growth rate during the transient and asymptotic periods; (2) population structure initially and asymptotically; (3) three key parameters that characterize transient dynamics (the weighted distance of the structure at each time step from the asymptotic structure, the strength of the sub-dominant relative to the dominant dynamics, and inherent cyclicity in the subdominant); and (4) proportional sensitivity (elasticity) of population growth rates (both asymptotic and transient) to perturbations of various components of the life cycle. We found substantial variability among replicate populations in all these aspects of the dynamics. We discuss potential consequences of variability across ecologically similar sites for management and evolutionary ecology in the exotic range of invasive species.

17.
MOTIVATION: Microarray experiments are affected by numerous sources of non-biological variation that contribute systematic bias to the resulting data. In a dual-label (two-color) cDNA or long-oligonucleotide microarray, these systematic biases are often manifested as an imbalance of measured fluorescent intensities corresponding to Sample A versus those corresponding to Sample B. Systematic biases also affect between-slide comparisons. Making effective corrections for these systematic biases is a requisite for detecting the underlying biological variation between samples. Effective data normalization is therefore an essential step in the confident identification of biologically relevant differences in gene expression profiles. Several normalization methods for the correction of systematic bias have been described. While many of these methods have addressed intensity-dependent bias, few have addressed both intensity-dependent and spatiality-dependent bias. RESULTS: We present a neural network-based normalization method for correcting the intensity- and spatiality-dependent bias in cDNA microarray datasets. In this normalization method, the dependence of the log-intensity ratio (M) on the average log-intensity (A) as well as on the spatial coordinates (X,Y) of spots is approximated with a feed-forward neural network function. Resistance to outliers is provided by assigning a weight to each spot based on how distant its M value is from the median over the spots whose A values are similar, as well as by using pseudospatial coordinates instead of spot row and column indices. A comparison of the robust neural network method with other published methods demonstrates its potential in reducing both intensity-dependent bias and spatial-dependent bias, which translates to more reliable identification of truly regulated genes.
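
The M and A quantities this abstract works with can be computed as in the sketch below. The base-2 logarithm is the usual convention but is an assumption here, as are the function and argument names. The neural network fit itself is not shown.

```python
import numpy as np

def ma_values(red, green):
    """Return (M, A) for two-channel spot intensities.

    M is the log-intensity ratio, log2(red / green).
    A is the average log-intensity, (log2(red) + log2(green)) / 2.
    """
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    m = np.log2(red) - np.log2(green)
    a = 0.5 * (np.log2(red) + np.log2(green))
    return m, a
```

A normalization method of the kind described would then fit M as a function of A (and of spot position) and subtract the fitted value from each spot's M.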

18.
A photoimmobilization method has been developed for the preparation of microarray biochips. This photoimmobilization method makes it possible to easily covalently immobilize various types of organic molecules and cells on a chip. In addition, by using hydrophilic polymers as matrixes, it is possible to reduce nonspecific interactions with biological components. Various proteins, antibodies, and cells have been microarrayed using this technique, and interactions between these proteins, antibodies, and cells have been investigated. This type of microarray biochip will be important for academic applications such as genomics, proteomics, and cellomics, and clinical analyses.

19.
The production of oat (Avena sativa L.) phytoalexins, avenanthramides, occurs in response to elicitor treatment with oligo-N-acetylchitooligosaccharides. In this study, avenanthramides production was investigated by techniques that provide high spatial and temporal resolution in order to clarify the process of phytoalexin production at the cellular level. The amount of avenanthramides accumulation in a single mesophyll cell was quantified by a combination of laser micro-sampling and low-diffuse nanoflow liquid chromatography–electrospray ionization tandem mass spectrometry (LC–ESI-MS/MS) techniques. Avenanthramides, NAD(P)H and chlorophyll were also visualized in elicitor-treated mesophyll cells using line-scanning fluorescence microscopy. We found that elicitor-treated mesophyll cells could be categorized into three characteristic cell phases, which occurred serially over time. Phase 0 indicated the normal cell state before metabolic or morphological change in response to elicitor, in which the cells contained abundant NAD(P)H. In phase 1, rapid NAD(P)H oxidation and marked movement of chloroplasts occurred, and this phase was the early stage of avenanthramides biosynthesis. In phase 2, avenanthramides accumulation was maximized, and chloroplasts were degraded. Avenanthramides appear to be synthesized in the chloroplast, because a fluorescence signal originating from avenanthramides was localized to the chloroplasts. Moreover, our results indicated that avenanthramides biosynthesis and the hypersensitive response (HR) occurred in identical cells. Thus, the avenanthramides production may be one of sequential events programmed in HR leading to cell death. Furthermore, the phase of the defense response was different among mesophyll cells simultaneously treated with elicitor. These results suggest that individual cells may have different susceptibility to the elicitor. 

20.
Optimal experimental design is important for the efficient use of modern high-throughput technologies such as microarrays and proteomics. Multiple factors, including the reliability of the measurement system, which itself must be estimated from prior experimental work, could influence design decisions. In this study, we describe how the optimal number of replicate measures (technical replicates) for each biological sample (biological replicate) can be determined. Different allocations of biological and technical replicates were evaluated by minimizing the variance of the ratio of technical variance (measurement error) to the total variance (sum of sampling error and measurement error). We demonstrate that if the number of biological replicates and the number of technical replicates per biological sample are variable, while the total number of available measures is fixed, then the optimal allocation of replicates for measurement evaluation experiments requires two technical replicates for each biological replicate. Therefore, it is recommended to use two technical replicates for each biological replicate if the goal is to evaluate the reproducibility of measurements.
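
The ratio of technical variance to total variance discussed in this abstract can be estimated from a balanced design with a standard one-way random-effects ANOVA, sketched below. This illustrates the quantity only and does not reproduce the authors' optimization over replicate allocations. The function name and the clamping of negative variance estimates to zero are choices made for this example.

```python
import numpy as np

def technical_variance_ratio(data):
    """Estimate technical variance / total variance.

    `data` has shape (biological replicates, technical replicates),
    with at least two of each. Uses method-of-moments estimates:
    technical variance = MS_within,
    biological variance = (MS_between - MS_within) / t.
    """
    data = np.asarray(data, dtype=float)
    _, t = data.shape
    ms_within = data.var(axis=1, ddof=1).mean()
    ms_between = t * data.mean(axis=1).var(ddof=1)
    var_tech = ms_within
    var_bio = max((ms_between - ms_within) / t, 0.0)  # clamp at zero
    return var_tech / (var_tech + var_bio)
```

With a fixed total number of measurements, a design with two technical replicates per biological sample (the allocation the abstract recommends) gives `data` the shape `(n, 2)`.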
