PacBio-LITS: a large-insert targeted sequencing method for characterization of human disease-associated chromosomal structural variations |
| |
Authors: | Min Wang, Christine R Beck, Adam C English, Qingchang Meng, Christian Buhay, Yi Han, Harsha V Doddapaneni, Fuli Yu, Eric Boerwinkle, James R Lupski, Donna M Muzny, Richard A Gibbs |
| |
Affiliation: | Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030 USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030 USA; Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX 77030 USA |
| |
Abstract: | Background: Generation of long (>5 Kb) DNA sequencing reads provides an approach for interrogation of complex regions in the human genome. Currently, large-insert whole genome sequencing (WGS) technologies from Pacific Biosciences (PacBio) enable analysis of chromosomal structural variations (SVs), but the cost to achieve the required sequence coverage across the entire human genome is high. Results: We developed a method (termed PacBio-LITS) that combines oligonucleotide-based DNA target-capture enrichment technologies with PacBio large-insert library preparation to facilitate SV studies at specific chromosomal regions. PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. The efficacy of PacBio-LITS is illustrated by delineating the breakpoint junctions of low copy repeat (LCR)-associated complex structural rearrangements on chr17p11.2 in patients diagnosed with Potocki–Lupski syndrome (PTLS; MIM#610883). We successfully identified previously determined breakpoint junctions in three PTLS cases, and were also able to discover novel junctions in repetitive sequences, including LCR-mediated breakpoints. The new information has enabled us to propose mechanisms for formation of these structural variants. Conclusions: The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. It is therefore suitable for studying complex SVs, especially those involving LCRs, inversions, and the generation of chimeric Alu elements at the breakpoints. Other genomic research applications, such as haplotype phasing and validation of small insertions and deletions, could also benefit from this technology. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1370-2) contains supplementary material, which is available to authorized users. |
| |
Keywords: | Targeted sequencing, Single molecule sequencing, Complex genomic rearrangement |
|
|