1.

A comparison of normalization techniques for microRNA microarray data

Rao Y Lee Y Jarjoura D Ruppert AS Liu CG Hsu JC Hagan JP 《Statistical applications in genetics and molecular biology》2008,7(1):Article22

Normalization of expression levels applied to microarray data can help in reducing measurement error. Different methods, including cyclic loess, quantile normalization and median or mean normalization, have been utilized to normalize microarray data. Although there is considerable literature regarding normalization techniques for mRNA microarray data, there are no publications comparing normalization techniques for microRNA (miRNA) microarray data, which are subject to similar sources of measurement error. In this paper, we compare the performance of cyclic loess, quantile normalization, median normalization and no normalization for a single-color microRNA microarray dataset. We show that the quantile normalization method works best in reducing differences in miRNA expression values for replicate tissue samples. By showing that the total mean squared error is lowest across almost all 36 investigated tissue samples, we are assured that the bias correction provided by quantile normalization is not outweighed by additional error variance that can arise from a more complex normalization method. Furthermore, we show that quantile normalization does not achieve these results by compression of scale.
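
For readers unfamiliar with the technique, quantile normalization in its textbook form can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name `quantile_normalize` and the toy values are assumptions, and ties are broken by order of appearance rather than averaged.

```python
from statistics import mean


def quantile_normalize(samples):
    """Force every sample (list of intensities) onto the same distribution.

    `samples` is a list of equal-length lists, one list per array.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must contain the same number of probes")

    # Reference distribution: mean across samples at each sorted rank.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [mean(s[k] for s in sorted_samples) for k in range(n)]

    normalized = []
    for s in samples:
        # order[k] is the index of the k-th smallest value in this sample.
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy data: two arrays, three probes each (hypothetical values).
result = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalization both arrays contain exactly the same set of values, so any remaining difference between replicates lies in how probes are ranked, not in the overall intensity distribution.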

2.

A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data

Ting Wang Weihua Guan Jerome Lin Nadia Boutaoui Glorisa Canino Jianhua Luo Juan Carlos Celedón Wei Chen 《Epigenetics》2015,10(7):662-669

DNA methylation plays an important role in disease etiology. The Illumina Infinium HumanMethylation450 (450K) BeadChip is a widely used platform in large-scale epidemiologic studies. This platform can efficiently and simultaneously measure methylation levels at ∼480,000 CpG sites in the human genome in multiple study samples. Due to the intrinsic chip design of 2 types of chemistry probes, data normalization or preprocessing is a critical step to consider before data analysis. To date, numerous methods and pipelines have been developed for this purpose, and some studies have been conducted to evaluate different methods. However, validation studies have often been limited to a small number of CpG sites to reduce the variability in technical replicates. In this study, we measured methylation on a set of samples using both whole-genome bisulfite sequencing (WGBS) and 450K chips. We used WGBS data as a gold standard of true methylation states in cells to compare the performances of 8 normalization methods for 450K data on a genome-wide scale. Analyses on our dataset indicate that the most effective methods are peak-based correction (PBC) and quantile normalization plus β-mixture quantile normalization (QN.BMIQ). To our knowledge, this is the first study to systematically compare existing normalization methods for Illumina 450K data using novel WGBS data. Our results provide a benchmark reference for the analysis of DNA methylation chip data, particularly in white blood cells.

3.

Systematic evaluation of three microRNA profiling platforms: microarray, beads array, and quantitative real-time PCR array

Wang B Howel P Bruheim S Ju J Owen LB Fodstad O Xi Y 《PloS one》2011,6(2):e17167

Background

A number of gene-profiling methodologies have been applied to microRNA research. The diversity of the platforms and analytical methods makes the comparison and integration of cross-platform microRNA profiling data challenging. In this study, we systematically analyze three representative microRNA profiling platforms: Locked Nucleic Acid (LNA) microarray, beads array, and TaqMan quantitative real-time PCR Low Density Array (TLDA).

Methodology/Principal Findings

The microRNA profiles of 40 human osteosarcoma xenograft samples were generated by LNA array, beads array, and TLDA. Results show that the three platforms perform similarly in intra-platform reproducibility (reproducibility of data within one platform), while LNA array and TLDA had the best inter-platform reproducibility (reproducibility of data across platforms). The endogenous controls/probes contained in each platform were assessed for their stability under different treatments/environments; those included in TLDA performed best, with minimal coefficients of variation. Importantly, we find that the proper selection of normalization methods is critical for improving inter-platform reproducibility, as evidenced by the application of two non-linear normalization methods (loess and quantile) that substantially elevated the sensitivity and specificity of the statistical data assessment.
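
The stability measure mentioned above, the coefficient of variation (CV), is the standard deviation divided by the mean. The sketch below is a minimal illustration under stated assumptions: the function name and the control readings are hypothetical, and the sample (n-1) standard deviation is used, which the abstract does not specify.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


# Hypothetical readings of one endogenous control across three conditions.
stable_control = coefficient_of_variation([10.0, 10.0, 10.0])
noisy_control = coefficient_of_variation([8.0, 10.0, 12.0])
```

A control whose CV stays close to zero across treatments is the more suitable reference for normalization.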

Conclusions

Each platform is relatively stable in terms of its own intra-platform reproducibility; however, the inter-platform reproducibility among different platforms is low. More microRNA-specific normalization methods are needed for cross-platform microRNA microarray data integration and comparison, which will improve the reproducibility and consistency between platforms.

4.

A two-stage normalization method for partially degraded mRNA microarray data

Liu LY Wang N Lupton JR Turner ND Chapkin RS Davidson LA 《Bioinformatics (Oxford, England)》2005,21(21):4000-4006

MOTIVATION: The goal of the study is to obtain genetic information from exfoliated colonocytes in the fecal stream rather than directly from mucosa cells within the colon. The latter is obtained through invasive procedures. The difficulty with the fecal approach is that certain probe information may be compromised by partially degraded mRNA. Proper normalization is essential to obtaining useful information from these fecal array data. RESULTS: We propose a new two-stage semiparametric normalization method motivated by the features observed in fecal microarray data. A location-scale transformation and a robust inclusion step were used to roughly align arrays within the same treatment. A non-parametric estimated non-linear transformation was then used to remove the potential intensity-based biases. We compared the performance of the new method in analyzing a fecal microarray dataset with those achieved by two existing normalization approaches: global median transformation and quantile normalization. The new method compared favorably with the global median and quantile normalization methods. AVAILABILITY: The R code implementing the two-stage method may be obtained from the corresponding author.
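
The abstract does not define the "global median transformation" baseline. One common form, assumed here, shifts each array of log-scale intensities so that all arrays share the same median. The function name `median_normalize`, the choice of target (the median of the per-array medians), and the toy values are all assumptions made for illustration.

```python
from statistics import median


def median_normalize(arrays):
    """Shift each array so its median equals a common target median.

    Assumes intensities are already on a log scale, so a shift is a
    multiplicative rescaling on the raw scale.
    """
    medians = [median(a) for a in arrays]
    target = median(medians)  # assumed choice of common target
    return [[x - m + target for x in a] for a, m in zip(arrays, medians)]


# Toy log-intensities for two arrays (hypothetical values).
aligned = median_normalize([[1.0, 2.0, 3.0], [4.0, 5.0, 9.0]])
```

Unlike quantile normalization, this adjusts only the location of each array and leaves the shape of its distribution untouched, which is why it cannot remove intensity-dependent bias.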

5.

Testing for differentially-expressed microRNAs with errors-in-variables nonparametric regression

Wang B Zhang SG Wang XF Tan M Xi Y 《PloS one》2012,7(5):e37537

6.

Depth normalization of small RNA sequencing: using data and biology to select a suitable method

Yannick Düren Johannes Lederer Li-Xuan Qin 《Nucleic acids research》2022,50(10):e56

7.

Quality Assessment and Data Analysis for microRNA Expression Arrays

D. Sarkar R. Parkin S. Wyman A. Bendoraite C. Sather J. Delrow A. K. Godwin C. Drescher W. Huber R. Gentleman M. Tewari 《Nucleic acids research》2009,37(2):e17

MicroRNAs are small (~22 nt) RNAs that regulate gene expression and play important roles in both normal and disease physiology. The use of microarrays for global characterization of microRNA expression is becoming increasingly popular and has the potential to be a widely used and valuable research tool. However, microarray profiling of microRNA expression raises a number of data analytic challenges that must be addressed in order to obtain reliable results. We introduce here a universal reference microRNA reagent set as well as a series of nonhuman spiked-in synthetic microRNA controls, and demonstrate their use for quality control and between-array normalization of microRNA expression data. We also introduce diagnostic plots designed to assess and compare various normalization methods. We anticipate that the reagents and analytic approach presented here will be useful for improving the reliability of microRNA microarray experiments.

8.

An analysis of normalization methods for Drosophila RNAi genomic screens and development of a robust validation scheme

Wiles AM Ravi D Bhavani S Bishop AJ 《Journal of biomolecular screening》2008,13(8):777-784

Genome-wide RNA interference (RNAi) screening allows investigation of the role of individual genes in a process of choice. Most RNAi screens identify a large number of genes with a continuous gradient in the assessed phenotype. Screeners must decide whether to examine genes with the most robust phenotype or the full gradient of genes that cause an effect and how to identify candidate genes. The authors have used RNAi in Drosophila cells to examine viability in a 384-well plate format and compare 2 screens, untreated control and treatment. They compare multiple normalization methods, which take advantage of different features within the data, including quantile normalization, background subtraction, scaling, cellHTS2 (Boutros et al. 2006), and interquartile range measurement. Considering the false-positive potential that arises from RNAi technology, a robust validation method was designed for the purpose of gene selection for future investigations. In a retrospective analysis, the authors describe the use of validation data to evaluate each normalization method. Although no method worked ideally, a combination of 2 methods, background subtraction followed by quantile normalization and cellHTS2, at different thresholds, captures the most dependable and diverse candidate genes. Thresholds are suggested depending on whether a few candidate genes are desired or a more extensive systems-level analysis is sought. The normalization approaches and experimental design to perform validation experiments are likely to apply to those high-throughput screening systems attempting to identify genes for systems-level analysis.

9.

Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments

James H Bullard Elizabeth Purdom Kasper D Hansen Sandrine Dudoit 《BMC bioinformatics》2010,11(1):94

10.

Modified least-variant set normalization for miRNA microarray

Suo C Salim A Chia KS Pawitan Y Calza S 《RNA (New York, N.Y.)》2010,16(12):2293-2303

11.

How data analysis affects power, reproducibility and biological insight of RNA-seq studies in complex datasets

Lucia Peixoto Davide Risso Shane G. Poplawski Mathieu E. Wimmer Terence P. Speed Marcelo A. Wood Ted Abel 《Nucleic acids research》2015,43(16):7664-7674

12.

Using Generalized Procrustes Analysis (GPA) for normalization of cDNA microarray data

Huiling Xiong Dapeng Zhang Christopher J Martyniuk Vance L Trudeau Xuhua Xia 《BMC bioinformatics》2008,9(1):25

Background

Normalization is essential in dual-labelled microarray data analysis to remove non-biological variations and systematic biases. Many normalization methods have been used to remove such biases within slides (Global, Lowess) and across slides (Scale, Quantile and VSN). However, all these popular approaches make critical assumptions about data distribution, which are often not valid in practice.

13.

Enhanced quantile normalization of microarray data to reduce loss of information in gene expression profiles

Hu J He X 《Biometrics》2007,63(1):50-59

In microarray experiments, removal of systematic variations resulting from array preparation or sample hybridization conditions is crucial to ensure sensible results from the ensuing data analysis. For example, quantile normalization is routinely used in the treatment of both oligonucleotide and cDNA microarray data, even though there might be some loss of information in the normalization process. We recognize that the ideal normalization, if it ever exists, would aim to keep the maximal amount of gene profile information with the lowest possible noise. With this objective in mind, we propose a valuable enhancement to quantile normalization, and demonstrate through three Affymetrix experiments that the enhanced normalization can result in better performance in detecting and ranking differentially expressed genes across experimental conditions.

14.

Differential expression analyses for single-cell RNA-Seq: old questions on new data

Zhun Miao Xuegong Zhang 《Quantitative Biology》2016,4(4):243

Background: Single-cell RNA sequencing (scRNA-seq) is an emerging technology that enables high resolution detection of heterogeneities between cells. One important application of scRNA-seq data is to detect differential expression (DE) of genes. Currently, some researchers still use DE analysis methods developed for bulk RNA-Seq data on single-cell data, and some new methods for scRNA-seq data have also been developed. Bulk and single-cell RNA-seq data have different characteristics. A systematic evaluation of the two types of methods on scRNA-seq data is needed. Results: In this study, we conducted a series of experiments on scRNA-seq data to quantitatively evaluate 14 popular DE analysis methods, including both traditional methods developed for bulk RNA-seq data and new methods specifically designed for scRNA-seq data. We obtained observations and recommendations for the methods under different situations. Conclusions: DE analysis methods for scRNA-seq data should be chosen with great caution, taking the characteristics of the data into account. Different strategies should be taken for data with different sample sizes and/or different strengths of the expected signals. Several methods for scRNA-seq data show advantages in some aspects, and DEGSeq tends to outperform other methods with respect to consistency, reproducibility and accuracy of predictions on scRNA-seq data.

15.

Removing technical variability in RNA-seq data using conditional quantile normalization

Hansen KD Irizarry RA Wu Z 《Biostatistics (Oxford, England)》2012,13(2):204-216

The ability to measure gene expression on a genome-wide scale is one of the most promising accomplishments in molecular biology. Microarrays, the technology that first permitted this, were riddled with problems due to unwanted sources of variability. Many of these problems are now mitigated, after a decade's worth of statistical methodology development. The recently developed RNA sequencing (RNA-seq) technology has generated much excitement in part due to claims of reduced variability in comparison to microarrays. However, we show that RNA-seq data demonstrate unwanted and obscuring variability similar to what was first observed in microarrays. In particular, we find guanine-cytosine content (GC-content) has a strong sample-specific effect on gene expression measurements that, if left uncorrected, leads to false positives in downstream results. We also report on commonly observed data distortions that demonstrate the need for data normalization. Here, we describe a statistical methodology that improves precision by 42% without loss of accuracy. Our resulting conditional quantile normalization algorithm combines robust generalized regression to remove systematic bias introduced by deterministic features such as GC-content and quantile normalization to correct for global distortions.

16.

Molecular effects of doxycycline treatment on pterygium as revealed by massive transcriptome sequencing

Larráyoz IM de Luis A Rúa O Velilla S Cabello J Martínez A 《PloS one》2012,7(6):e39359

Pterygium is a lesion of the eye surface which involves cell proliferation, migration, angiogenesis, fibrosis, and extracellular matrix remodelling. Surgery is the only approved method to treat this disorder, but high recurrence rates are common. Recently, it has been shown in a mouse model that treatment with doxycycline resulted in reduction of the pterygium lesions. Here we study the mechanism(s) of action by which doxycycline achieves these results, using massive sequencing techniques. Surgically removed pterygia from 10 consecutive patients were set in short term culture and exposed to 0 (control), 50, 200, and 500 μg/ml doxycycline for 24 h; their mRNA was purified, reverse transcribed and sequenced through Illumina's massive sequencing protocols. Acquired data were subjected to quantile normalization and analyzed using cytoscape plugin software to explore the pathways involved. False discovery rate (FDR) methods were used to identify 332 genes which modified their expression in a dose-dependent manner upon exposure to doxycycline. The most represented cellular pathways included all mitochondrial genes, the endoplasmic reticulum stress response, integrins and extracellular matrix components, and growth factors. A high correlation was obtained when comparing ultrasequencing data with qRT-PCR and ELISA results. Doxycycline significantly modified the expression of important cellular pathways in pterygium cells, in a way which is consistent with the observed efficacy of this antibiotic to reduce pterygium lesions in a mouse model. Clinical trials are under way to demonstrate whether there is a benefit for human patients.
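
The abstract does not state which FDR procedure was applied. As an illustration only, the widely used Benjamini-Hochberg adjustment is sketched below; the function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values.
qvalues = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q <= 0.05 for q in qvalues]
```

Genes whose adjusted value falls at or below the chosen FDR level (0.05 in this sketch) would be reported as differentially expressed.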

17.

Gaussian process regression model for normalization of LC-MS data using scan-level information

Mohammad R Nezami Ranjbar Yi Zhao Mahlet G Tadesse Yue Wang Habtom W Ressom 《Proteome science》2013,11(Z1):S13

Background

Differences in sample collection, biomolecule extraction, and instrument variability introduce bias to data generated by liquid chromatography coupled with mass spectrometry (LC-MS). Normalization is used to address these issues. In this paper, we introduce a new normalization method using the Gaussian process regression model (GPRM) that utilizes information from individual scans within an extracted ion chromatogram (EIC) of a peak. The proposed method is particularly applicable for normalization based on analysis order of LC-MS runs. Our method uses measurement variabilities estimated through LC-MS data acquired from quality control samples to correct for bias caused by instrument drift. A maximum likelihood approach is used to find the optimal parameters for the fitted GPRM. We review several normalization methods and compare their performance with GPRM.

Results

To evaluate the performance of different normalization methods, we consider LC-MS data from a study where a metabolomic approach is used to discover biomarkers for liver cancer. The LC-MS data were acquired by analysis of sera from liver cancer patients and cirrhotic controls. In addition, LC-MS runs from a quality control (QC) sample are included to assess the run-to-run variability and to evaluate the ability of various normalization methods to reduce this undesired variability. Also, ANOVA models are applied to the normalized LC-MS data to identify ions with intensity measurements that are significantly different between cases and controls.

Conclusions

One of the challenges in using label-free LC-MS for quantitation of biomolecules is systematic bias in measurements. Several normalization methods have been introduced to overcome this issue, but there is no universally applicable approach at the present time. Each data set should be carefully examined to determine the most appropriate normalization method. We review here several existing methods and introduce the GPRM for normalization of LC-MS data. Through our in-house data set, we show that the GPRM outperforms other normalization methods considered here, in terms of decreasing the variability of ion intensities among quality control runs.

18.

Unit-Free and Robust Detection of Differential Expression from RNA-Seq Data

Hui Jiang Tianyu Zhan 《Statistics in biosciences》2017,9(1):178-199

19.

Faster cyclic loess: normalizing RNA arrays via linear models

Ballman KV Grill DE Oberg AL Therneau TM 《Bioinformatics (Oxford, England)》2004,20(16):2778-2786

MOTIVATION: Our goal was to develop a normalization technique that yields results similar to cyclic loess normalization and with speed comparable to quantile normalization. RESULTS: Fastlo yields normalized values similar to cyclic loess and quantile normalization and is fast; it is at least an order of magnitude faster than cyclic loess and approaches the speed of quantile normalization. Furthermore, fastlo is more versatile than both cyclic loess and quantile normalization because it is model-based. AVAILABILITY: The Splus/R function for fastlo normalization is available from the authors.

20.

Detection of differentially expressed genes in discrete single‐cell RNA sequencing data using a hurdle model with correlated random effects

Michael Sekula Jeremy Gaskins Susmita Datta 《Biometrics》2019,75(4):1051-1062
