cn.FARMS: a latent variable model to detect copy number variations in microarray data with a low false discovery rate |
| |
Authors: | Djork-Arné Clevert, Andreas Mitterecker, Andreas Mayr, Günter Klambauer, Marianne Tuefferd, An De Bondt, Willem Talloen, Hinrich Göhlmann, Sepp Hochreiter |
| |
Affiliation: | 1 Institute of Bioinformatics, Johannes Kepler University Linz, Linz, Austria; 2 Department of Nephrology and Internal Intensive Care, Charité University Medicine, Berlin, Germany; 3 Johnson & Johnson Pharmaceutical Research & Development, a Division of Janssen Pharmaceutica, Beerse, Belgium |
| |
Abstract: | Cost-effective oligonucleotide genotyping arrays like the Affymetrix SNP 6.0 are still the predominant technique to measure DNA copy number variations (CNVs). However, CNV detection methods for microarrays overestimate both the number and the size of CNV regions and, consequently, suffer from a high false discovery rate (FDR). A high FDR means that many CNVs are wrongly detected and therefore cannot be associated with a disease in a clinical study, yet correction for multiple testing must still account for them, which decreases the study's discovery power. To control the FDR, we propose a probabilistic latent variable model, 'cn.FARMS', which is optimized by a Bayesian maximum a posteriori approach. cn.FARMS controls the FDR through the information gain of the posterior over the prior. The prior represents the null hypothesis of copy number 2 for all samples; the posterior can deviate from it only when the data contain strong and consistent signals. On HapMap data, cn.FARMS clearly outperformed the two most prevalent methods with respect to sensitivity and FDR. The software cn.FARMS is publicly available as an R package at http://www.bioinf.jku.at/software/cnfarms/cnfarms.html. |
| |
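The abstract's phrase "information gain of the posterior over the prior" can be illustrated with a small sketch. It assumes that the information gain is measured as the Kullback-Leibler divergence between a univariate Gaussian posterior and a Gaussian prior, and that the prior has zero mean and unit variance; the abstract does not state either detail, so the function name `gaussian_information_gain` and its parameters are hypothetical and this is not the authors' implementation.

```python
import math


def gaussian_information_gain(post_mean, post_var, prior_mean=0.0, prior_var=1.0):
    """Return KL(posterior || prior) in nats for two univariate Gaussians.

    Assumption: the prior N(prior_mean, prior_var) stands in for the null
    hypothesis (no deviation from copy number 2); the default N(0, 1) is
    chosen for illustration only.
    """
    if post_var <= 0 or prior_var <= 0:
        raise ValueError("variances must be positive")
    return 0.5 * (
        math.log(prior_var / post_var)
        + (post_var + (post_mean - prior_mean) ** 2) / prior_var
        - 1.0
    )


# A posterior identical to the prior carries no information gain.
no_signal = gaussian_information_gain(post_mean=0.0, post_var=1.0)

# A posterior that is shifted and sharpened by the data yields a positive gain.
strong_signal = gaussian_information_gain(post_mean=2.0, post_var=0.2)
```

Under these assumptions, a region would be reported only when the gain is large, which is one way to read how strong and consistent signals are needed before the posterior departs from the null hypothesis.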
Keywords: | |
This article is indexed in PubMed and other databases. |
|