Variant Identification in Multi-Sample Pools by Illumina Genome Analyzer Sequencing |
| |
Authors: | Rebecca L. Margraf, Jacob D. Durtschi, Shale Dames, David C. Pattison, Jack E. Stephens, Karl V. Voelkerding |
| |
Affiliation: | (1) ARUP Institute for Clinical & Experimental Pathology®, Salt Lake City, Utah; (2) Department of Pathology, University of Utah School of Medicine, Salt Lake City, Utah |
| |
Abstract: | Multi-sample pooling and Illumina Genome Analyzer (GA) sequencing allow high-throughput sequencing of multiple samples to determine population sequence variation. A preliminary experiment, using the RET proto-oncogene as a model, predicted that ≤30 samples could be pooled to reliably detect singleton variants without requiring additional confirmation testing. This report used 30- and 50-sample pools to test the hypothesized pooling limit and also to test recent protocol improvements, Illumina GAIIx upgrades, and longer read chemistry. The SequalPrep™ method was used to normalize amplicons before pooling. For comparison, a single ‘control’ sample was run in a different flow cell lane. Data were evaluated by variant read percentages and by the subtractive correction method, which utilizes the control sample. In total, 59 variants were detected within the pooled samples, which included all 47 known true variants. The 15 singleton variants known from Sanger sequencing had an average of 1.62±0.26% variant reads for the 30-sample pool (expected 1.67% for a singleton variant [unique variant within the pool]) and 1.01±0.19% for the 50-sample pool (expected 1%). The 76-base reads had higher error rates than the shorter reads (33 and 50 bases), which eliminated the distinction between true singleton variants and background error. This report demonstrated pooling limits of 30 to 50 samples (depending on error rates and coverage) for reliable singleton variant detection. The presented pooling protocols and analysis methods can be used for variant discovery in other genes, facilitating molecular diagnostic test design and interpretation. |
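The expected singleton percentages quoted in the abstract (1.67% for 30 samples, 1% for 50) can be reproduced with a short calculation. The sketch below is illustrative only and rests on assumptions the abstract does not spell out: each sample is diploid, the singleton is heterozygous in exactly one sample, and every sample contributes equally to the pool. The function name and parameters are hypothetical.

```python
def expected_variant_read_percent(pool_size, variant_alleles=1, ploidy=2):
    """Expected percentage of reads carrying a variant in an equimolar pool.

    Assumptions (not stated in the abstract): diploid samples, equal
    contribution of every sample, and no sequencing error.
    """
    if pool_size <= 0 or ploidy <= 0:
        raise ValueError("pool_size and ploidy must be positive")
    total_alleles = pool_size * ploidy
    if not 0 <= variant_alleles <= total_alleles:
        raise ValueError("variant_alleles must lie between 0 and total alleles")
    return 100.0 * variant_alleles / total_alleles


# A heterozygous singleton is 1 variant allele among 2 * pool_size alleles.
for n in (30, 50):
    print(f"{n}-sample pool: {expected_variant_read_percent(n):.2f}%")
# 30-sample pool: 1.67%
# 50-sample pool: 1.00%
```

Under these assumptions the observed averages (1.62±0.26% and 1.01±0.19%) sit close to the expected values, which is the comparison the abstract draws.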
| |
Keywords: | next generation sequencing, Illumina Genome Analyzer, RET, pooling, massively parallel sequencing |