Rare Variant Association Testing by Adaptive Combination of P-values |
| |
Authors: | Wan-Yu Lin, Xiang-Yang Lou, Guimin Gao, Nianjun Liu |
| |
Institution: | 1. Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan; 2. Department of Biostatistics, University of Alabama at Birmingham, Birmingham, Alabama, United States of America; 3. Department of Biostatistics, Virginia Commonwealth University, Richmond, Virginia, United States of America; University of North Carolina, United States of America |
| |
Abstract: | With the development of next-generation sequencing technology, there is a great demand for powerful statistical methods to detect rare variants (minor allele frequencies (MAFs) < 1%) associated with diseases. Testing each variant site individually is known to be underpowered, and therefore many methods have been proposed to test for the association of a group of variants with phenotypes by pooling the signals of the variants in a chromosomal region. However, this pooling strategy inevitably leads to the inclusion of a large proportion of neutral variants, which may compromise the power of association tests. To address this issue, we extend the -MidP method (Cheung et al., 2012, Genet Epidemiol 36: 675–685) and propose an approach (named ‘adaptive combination of P-values for rare variant association testing’, abbreviated as ‘ADA’) that adaptively combines per-site P-values with weights based on MAFs. Before combining P-values, we first impose a truncation threshold on the per-site P-values to guard against the noise caused by the inclusion of neutral variants. This ADA method is shown to outperform popular burden tests and non-burden tests under many scenarios. ADA is recommended for next-generation sequencing data analysis where many neutral variants may be included in a functional region. |
| |
Keywords: | |
|
|
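The truncation-and-weighting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The weight 1/sqrt(MAF(1 - MAF)), the score form (a weighted sum of -ln p), the threshold grid, and the function names `maf_weight`, `truncated_score`, and `scores_by_threshold` are all assumptions made for illustration, and the per-site P-values are taken as given rather than computed from genotype data.

```python
import math


def maf_weight(maf):
    """Hypothetical MAF-based weight (assumes 0 < maf < 1); rarer variants weigh more."""
    return 1.0 / math.sqrt(maf * (1.0 - maf))


def truncated_score(pvals, mafs, threshold):
    """Weighted sum of -ln(p) over sites whose P-value is below the threshold.

    Sites at or above the threshold are dropped, which is the truncation step
    meant to limit noise from neutral variants.
    """
    return sum(
        -maf_weight(maf) * math.log(p)
        for p, maf in zip(pvals, mafs)
        if 0.0 < p < threshold
    )


def scores_by_threshold(pvals, mafs, thresholds):
    """Evaluate the truncated score on a grid of candidate thresholds."""
    return {t: truncated_score(pvals, mafs, t) for t in thresholds}


# Illustrative (made-up) per-site P-values and MAFs for four rare variants.
pvals = [0.01, 0.50, 0.04, 0.15]
mafs = [0.005, 0.008, 0.002, 0.004]
scores = scores_by_threshold(pvals, mafs, [0.05, 0.10, 0.20])
for threshold, score in scores.items():
    print(threshold, round(score, 3))
```

Because the threshold is chosen after looking at the data, the resulting score would still need calibration, for example by permuting phenotype labels; that step is omitted from this sketch.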