Similar Articles
 20 similar articles found (search time: 15 ms)
1.

Background

Microarray technology allows the expression levels of thousands of genes to be monitored simultaneously. This technique helps us to understand gene regulation, as well as gene-gene interactions, more systematically. In microarray experiments, however, many undesirable systematic variations are observed, some of which appear even in replicated experiments. Normalization is the process of removing sources of variation that affect the measured gene expression levels. Although a number of normalization methods have been proposed, it has been difficult to decide which perform best. Normalization plays an important role in the early stage of microarray data analysis, and the results of all subsequent analyses depend heavily on it.

Results

In this paper, we use the variability among the replicated slides to compare performance of normalization methods. We also compare normalization methods with regard to bias and mean square error using simulated data.

Conclusions

Our results show that intensity-dependent normalization often performs better than global normalization methods, and that linear and nonlinear normalization methods perform similarly. These conclusions are based on analysis of 36 cDNA microarrays of 3,840 genes obtained in an experiment to search for changes in gene expression profiles during neuronal differentiation of cortical stem cells. Simulation studies confirm our findings.
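The contrast drawn above between global and intensity-dependent normalization can be sketched in a few lines. This is an illustrative Python sketch, not the authors' code; the binned-median fit is a crude stand-in for the loess smoother that intensity-dependent methods usually employ, and all function names are hypothetical.

```python
import numpy as np

def global_normalize(M):
    """Global normalization: subtract the overall median log-ratio,
    so the bulk of the genes are centred at M = 0."""
    M = np.asarray(M, float)
    return M - np.median(M)

def intensity_normalize(M, A, n_bins=10):
    """Intensity-dependent normalization: subtract a running median of the
    log-ratio M within bins of average log-intensity A (a crude stand-in
    for a loess fit of M on A)."""
    M = np.asarray(M, float)
    A = np.asarray(A, float)
    edges = np.quantile(A, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, A, side="right") - 1, 0, n_bins - 1)
    out = M.copy()
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            out[mask] -= np.median(M[mask])
    return out
```

When the dye bias depends on spot intensity, the intensity-dependent version removes variation that the single global shift cannot, which is consistent with the comparison reported above.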

2.
Comparison of normalization methods with microRNA microarray   (cited by 3: 0 self-citations, 3 by others)
Hua YJ  Tu K  Tang ZY  Li YX  Xiao HS 《Genomics》2008,92(2):122-128
MicroRNAs (miRNAs) are a group of RNAs that play important roles in regulating gene expression and protein translation. In a previous study, we established an oligonucleotide microarray platform to detect miRNA expression. Because it contained only hundreds of probes, data normalization was difficult. In this study, microarray data for eight miRNAs extracted from inflamed rat dorsal root ganglion (DRG) tissue were normalized using 15 methods and compared with results from real-time polymerase chain reaction (PCR). The miRNA microarray data normalized by the print-tip loess method were the most consistent with the real-time PCR results. Moreover, the same pattern was observed in 14 different types of rat tissue. This study compares a variety of normalization methods and will be helpful in the preprocessing of miRNA microarray data.

3.

Background  

With the development of DNA hybridization microarray technologies, it is now possible to assess the expression levels of thousands to tens of thousands of genes simultaneously. Quantitative comparison of microarrays uncovers distinct patterns of gene expression, which define different cellular phenotypes or cellular responses to drugs. Due to technical biases, normalization of the intensity levels is a prerequisite to further statistical analysis. Choosing a suitable normalization approach can therefore be critical and deserves judicious consideration.

4.
New normalization methods for cDNA microarray data   (cited by 7: 0 self-citations, 7 by others)
MOTIVATION: The focus of this paper is two new normalization methods for cDNA microarrays. After image analysis has been performed on a microarray, and before differentially expressed genes can be detected, some form of normalization must be applied. Normalization removes biases towards one or other of the fluorescent dyes used to label each mRNA sample, allowing proper evaluation of differential gene expression. RESULTS: The two normalization methods presented here build on previously described non-linear normalization techniques. We extend these techniques, first, with a normalization method that handles smooth spatial trends in intensity across microarrays and, second, with a method for a new type of cDNA microarray experiment that is becoming prevalent: the small-scale specialty or 'boutique' array, in which a large proportion of the genes on the microarray are expected to be highly differentially expressed. AVAILABILITY: The normalization methods described in this paper are available via http://www.pi.csiro.au/gena/ in a software suite called tRMA: tools for R Microarray Analysis, upon request from the authors. Images and data used in this paper are also available via the same link.

5.

Background  

The quality of microarray data can seriously affect the accuracy of downstream analyses. In order to reduce variability and enhance signal reproducibility in these data, many normalization methods have been proposed and evaluated, most of which are for data obtained from cDNA microarrays and Affymetrix GeneChips. CodeLink Bioarrays are a newly emerged, single-color oligonucleotide microarray platform. To date, there are no reported studies that evaluate normalization methods for CodeLink Bioarrays.

6.
Simple total tag count normalization is inadequate for microRNA sequencing data generated by next-generation sequencing technology, yet a systematic evaluation of normalization methods for microRNA sequencing data has so far been lacking. We comprehensively evaluate seven commonly used normalization methods: global normalization, Lowess normalization, the Trimmed Mean of M-values method (TMM), quantile normalization, scaling normalization, variance stabilization, and the invariant method. We assess these methods on two individual experimental data sets using the empirical statistical metrics of mean square error (MSE) and the Kolmogorov-Smirnov (K-S) statistic, and we additionally evaluate them against results from quantitative PCR validation. Our results consistently show that Lowess normalization and quantile normalization perform best, whereas TMM, a method developed for RNA-Seq normalization, performs worst. The poor performance of TMM is further evidenced by abnormal results in tests of differential expression (DE) on microRNA-Seq data; compared with the choice of model used for DE, the choice of normalization method is the primary factor affecting the DE results. In summary, Lowess normalization and quantile normalization are recommended for normalizing microRNA-Seq data, whereas the TMM method should be used with caution.
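Of the seven methods compared above, quantile normalization is simple enough to sketch directly. The following illustrative Python (not the authors' implementation) forces every sample (column) to share the same empirical distribution, namely the mean of the per-column sorted values:

```python
import numpy as np

def quantile_normalize(X):
    """Quantile normalization of a genes-by-samples matrix X:
    replace each value by the mean, across samples, of the values
    holding the same rank, so all columns end up with an identical
    empirical distribution (assumes no ties)."""
    X = np.asarray(X, float)
    order = np.argsort(X, axis=0)           # per-column sort order
    ranks = np.argsort(order, axis=0)       # rank of each entry in its column
    reference = np.sort(X, axis=0).mean(axis=1)  # mean of sorted columns
    return reference[ranks]                 # map ranks back to reference values
```

After this transform, sorting any two columns yields the same vector, which is exactly the property the method exploits to remove sample-to-sample distributional differences.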

7.
The purpose of this study was to determine the reliability of three normalization methods for analyzing hip abductor activation during rehabilitation exercises. Thirteen healthy subjects performed three open kinetic chain and three closed kinetic chain hip abductor exercises. Surface EMG activity for the gluteus medius was collected during each exercise and normalized based on a maximum voluntary isometric contraction (MVIC), mean dynamic activity (m-DYN), and peak dynamic activity (pk-DYN). Intraclass correlation coefficients (ICCs), intersubject coefficients of variation (CVs), and intrasubject CVs were then calculated for each normalization method. MVIC ICCs exceeded 0.93 for all exercises. M-DYN and pk-DYN ICCs exceeded 0.85 for all exercises except the sidelying abduction exercise. Intersubject CVs ranged from 55% to 77% and 19% to 61% for the MVIC and dynamic methods, respectively. Intrasubject CVs ranged from 11% to 22% for all exercises under all normalization methods. The MVIC method provided the highest measurement reliability for determining differences in activation amplitudes between hip abductor exercises in healthy subjects. Future research should determine whether these results also apply to a symptomatic patient population.
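All three EMG normalization schemes compared above amount to dividing a trial amplitude by a reference amplitude (MVIC, m-DYN, or pk-DYN). The following Python sketch is purely illustrative, with hypothetical function names; it shows the normalization step and the coefficient-of-variation metric used to judge reliability:

```python
import numpy as np

def normalize_emg(amplitude, reference):
    """Express an EMG amplitude as a percentage of a reference amplitude,
    e.g. the MVIC amplitude, the mean dynamic amplitude (m-DYN), or the
    peak dynamic amplitude (pk-DYN) from the same session."""
    return 100.0 * np.asarray(amplitude, float) / reference

def coefficient_of_variation(values):
    """CV (%): sample standard deviation relative to the mean, computed
    across subjects (intersubject CV) or across trials (intrasubject CV)."""
    values = np.asarray(values, float)
    return 100.0 * values.std(ddof=1) / values.mean()
```

A lower intersubject CV under the dynamic methods, as reported above, simply reflects that dividing by a task-specific reference removes more between-subject amplitude variation than dividing by the MVIC does.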

8.
9.
In studies designed to compare different methods of measurement where more than two methods are compared or replicate measurements by each method are available, standard statistical approaches such as computation of limits of agreement are not directly applicable. A model is presented for comparing several methods of measurement in the situation where replicate measurements by each method are available. Measurements are viewed as classified by method, subject and replicate. Models assuming exchangeable as well as non-exchangeable replicates are considered. A fitting algorithm is presented that allows the estimation of linear relationships between methods as well as relevant variance components. The algorithm only uses methods already implemented in most statistical software.
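For context, the classical two-method limits-of-agreement computation that the model above generalizes can be sketched as follows (illustrative Python, assuming a single measurement per method per subject, which is precisely the setting the abstract says breaks down with replicates or more than two methods):

```python
import numpy as np

def limits_of_agreement(x, y):
    """Classical Bland-Altman 95% limits of agreement for two methods
    measuring the same subjects once each: mean difference +/- 1.96
    sample standard deviations of the differences."""
    d = np.asarray(x, float) - np.asarray(y, float)
    bias = d.mean()
    sd = d.std(ddof=1)
    return bias - 1.96 * sd, bias + 1.96 * sd
```

With replicates, the differences are no longer independent within subject, which is why the abstract's variance-component model, rather than this simple formula, is needed.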

10.
Extracting biomedical information from large metabolomic datasets by multivariate data analysis is of considerable complexity. Common challenges include, among others, screening for differentially produced metabolites, estimation of fold changes, and sample classification. Prior to these analysis steps, it is important to minimize contributions from unwanted biases and experimental variance; this is the goal of data preprocessing. In this work, different data normalization methods were compared systematically using two different datasets generated by nuclear magnetic resonance (NMR) spectroscopy. Two types of normalization methods were used: one aims to remove unwanted sample-to-sample variation, while the other adjusts the variance of the different metabolites by variable scaling and variance stabilization methods. The impact of all tested methods on sample classification was evaluated on urinary NMR fingerprints obtained from healthy volunteers and patients suffering from autosomal dominant polycystic kidney disease (ADPKD). Performance in screening for differentially produced metabolites was investigated on a dataset following a Latin-square design, in which varied amounts of 8 different metabolites were spiked into a human urine matrix while keeping the total spike-in amount constant. In addition, specific tests were conducted to systematically investigate the influence of the different preprocessing methods on the structure of the analyzed data. In conclusion, preprocessing methods originally developed for DNA microarray analysis, in particular Quantile and Cubic-Spline Normalization, performed best in reducing bias, accurately detecting fold changes, and classifying samples.
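Two of the variance-adjustment methods in the second category above, unit-variance (auto) scaling and Pareto scaling, can be sketched as follows. This is an illustrative Python sketch, not tied to the datasets in the study:

```python
import numpy as np

def autoscale(X):
    """Unit-variance (auto) scaling: centre each metabolite (column)
    and divide by its sample standard deviation, giving every
    metabolite equal weight in subsequent multivariate analysis."""
    X = np.asarray(X, float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pareto_scale(X):
    """Pareto scaling: divide centred columns by the square root of the
    standard deviation, damping (rather than fully equalising) the
    influence of high-variance metabolites."""
    X = np.asarray(X, float)
    return (X - X.mean(axis=0)) / np.sqrt(X.std(axis=0, ddof=1))
```

The design difference is the exponent on the standard deviation: autoscaling uses 1 and makes all metabolites equally influential; Pareto uses 1/2 as a compromise that keeps some of the original variance structure.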

11.
12.
13.
14.
MOTIVATION: Normalization of microarray data is essential for multiple-array analyses. Several normalization protocols have been proposed based on different biological or statistical assumptions. A fundamental question is whether they have effectively normalized the arrays; in addition, for a given array, how should one choose the method that most effectively normalizes it? RESULTS: We propose several techniques to compare the effectiveness of different normalization methods. We approach the problem by constructing statistics to test whether there are any systematic biases in the expression profiles among duplicated spots within an array. The test statistics involve estimating the genewise variances, which is accomplished using several novel methods, including empirical Bayes methods for moderating the genewise variances and smoothing methods for aggregating variance information. P-values are estimated based on a normal or chi-squared approximation. With estimated P-values, we can choose the most appropriate method to normalize a specific array and assess the extent to which systematic biases due to variations in experimental conditions have been removed. The effectiveness and validity of the proposed methods are convincingly illustrated by a carefully designed simulation study. The method is further illustrated by applications to human placenta cDNAs comprising a large number of clones with replications, to a customized microarray experiment carrying just a few hundred genes in a study of the molecular roles of interferons in tumors, and to the Agilent microarrays carrying tens of thousands of total RNA samples in the MAQC project on the reproducibility, sensitivity and specificity of the data. AVAILABILITY: Code to implement the method in the statistical package R is available from the authors.

15.
Birch  Gavin F. 《Hydrobiologia》2003,492(1-3):5-13
Chemical analyses of sediment are used for assessing the ability of sediment to support a healthy benthos (sediment quality) and for determining contaminant source and dispersion in aquatic systems. Total sediment analysis is used for sediment quality assessment, whereas source identification and dispersion require normalized contaminant data. Normalized contaminant data are obtained by physical fractionation (size-normalization) of the sediment and analysis of a constant size fraction (usually the <62.5 μm fraction), whereas elemental normalization uses the total sediment analysis normalized to a conservative element. Elemental normalization is preferable, as it is cheaper and less time-consuming than size-normalization techniques. In addition, some contaminants associated with oxides and oxyhydroxides in the coarse fraction are excluded by fine-fraction analyses. Five techniques used to normalize sedimentary contaminant data were tested in the current study, including a new post-extraction normalization method in which total sediment data are normalized to the residue after digestion, on the assumption that this fraction acts as a diluent only. Results of the tests indicated that simple normalization to the mud fraction provides useful dispersion information, but that the post-extraction normalization method produced a superior indication of source. Limited source and dispersion information was gleaned from the elemental-normalization (Al, Fe) approach, whereas the size-normalization technique provided the clearest indication of source and dispersion. Simple mud normalization and post-extraction normalization methods should be considered because a single analysis provides sediment quality as well as source and dispersion information. However, for detailed information on source and dispersion, size normalization is recommended.
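The elemental and post-extraction normalizations described above both reduce to simple ratios. The following Python sketch is illustrative only; the function names are hypothetical, and the post-extraction formula is an assumption inferred from the abstract's statement that the undigested residue "acts as a diluent only":

```python
def element_normalize(contaminant_conc, conservative_conc):
    """Elemental normalization: express a contaminant concentration from a
    total-sediment digest relative to a conservative element (e.g. Al or
    Fe) from the same digest, to correct for grain-size dilution."""
    return contaminant_conc / conservative_conc

def post_extraction_normalize(contaminant_conc, residue_fraction):
    """Post-extraction normalization (assumed form): treat the residue
    left after digestion as pure diluent and rescale the concentration
    to the digestible fraction of the sediment."""
    return contaminant_conc / (1.0 - residue_fraction)
```

For example, a metal concentration of 50 ppm in a sediment that is half undigestible residue rescales to 100 ppm of digestible material, allowing samples with different diluent contents to be compared.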

16.
MOTIVATION: Clusters of genes encoding proteins with related functions, or in the same regulatory network, often exhibit expression patterns that are correlated over a large number of conditions. Protein associations and gene regulatory networks can be modelled from expression data. We address the question of which of several normalization methods is optimal prior to computing the correlation of the expression profiles between every pair of genes. RESULTS: We use gene expression data from five experiments with a total of 78 hybridizations and 23 diverse conditions. Nine methods of data normalization are explored, based on all possible combinations of normalization techniques for between- and within-gene and experiment variation. We compare the resulting empirical distribution of gene-by-gene correlations with expectations and apply cross-validation to test the performance of each method in predicting accurate functional annotation. We conclude that normalization methods based on mixed-model equations are optimal.

17.
18.

Background  

When DNA microarray data are used for gene clustering, genotype/phenotype correlation studies, or tissue classification, the signal intensities are usually transformed and normalized in several steps in order to improve comparability and the signal/noise ratio. These steps may include subtraction of an estimated background signal, subtraction of the reference signal, smoothing (to account for nonlinear measurement effects), and more. Different authors use different approaches, and it is generally not clear to users which method they should prefer.

19.

Background  

Genome context methods have been introduced in the last decade as automatic methods to predict functional relatedness between genes in a target genome using the patterns of existence and relative locations of the homologs of those genes in a set of reference genomes. Much work has been done in applying these methods to different bioinformatics tasks, but few papers present a systematic study of the methods and of the combinations of them necessary for their optimal use.

20.
Measuring rates of spread during biological invasions is important for predicting where and when invading organisms will spread in the future, as well as for quantifying the influence of environmental conditions on invasion speed. While several methods have been proposed in the literature to measure spread rates, a comprehensive comparison of their accuracy when applied to empirical data would be problematic because true rates of spread are never known. This study compares the performance of several spread rate measurement methods using a set of simulated invasions with known theoretical spread rates over a hypothetical region in which a set of sampling points are distributed. We vary the density and distribution (aggregative, random, and regular) of the sampling points as well as the shape of the invaded area, and then compare how the different spread rate measurement methods accommodate these varying conditions. We find that the method of regressing distance to the point of origin of the invasion as a function of time of first detection provides the most reliable method under adverse conditions (low sampling density, aggregated distribution of sampling points, irregular invaded area). The boundary displacement method appears to be a useful complementary method when sampling density is sufficiently high, as it provides an instantaneous measure of spread rate and does not require long time series of data.
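The best-performing method described above, regressing distance to the invasion origin on year of first detection, can be sketched in a few lines. This is an illustrative Python sketch, not the study's code; the slope of the fitted line gives the spread rate in distance units per year:

```python
import numpy as np

def spread_rate(distances, first_detection_years):
    """Estimate invasion spread rate as the slope of an ordinary
    least-squares fit of distance-to-origin against the year each
    sampling point first detected the invader."""
    t = np.asarray(first_detection_years, float)
    d = np.asarray(distances, float)
    slope, _intercept = np.polyfit(t, d, 1)  # degree-1 (linear) fit
    return slope
```

Because each sampling point contributes one (year, distance) pair, the method degrades gracefully under sparse or aggregated sampling, which matches the robustness reported above.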
