首页 | 本学科首页   官方微博 | 高级检索  
   检索      


Combining markers into haplotypes can improve population structure inference
Authors:Gattepaille Lucie M  Jakobsson Mattias
Institution:Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden.
Abstract:High-throughput genotyping and sequencing technologies can generate dense sets of genetic markers for large numbers of individuals. For most species, these data will contain many markers in linkage disequilibrium (LD). To utilize such data for population structure inference, we investigate the use of haplotypes constructed by combining the alleles at single-nucleotide polymorphisms (SNPs). We introduce a statistic derived from information theory, the gain of informativeness for assignment (GIA), which quantifies the additional information for assigning individuals to populations using haplotype data compared to using individual loci separately. Using a two-loci-two-allele model, we demonstrate that combining markers in linkage equilibrium into haplotypes always leads to nonpositive GIA, suggesting that combining the two markers is not advantageous for ancestry inference. However, for loci in LD, GIA is often positive, suggesting that assignment can be improved by combining markers into haplotypes. Using GIA as a criterion for combining markers into haplotypes, we demonstrate for simulated data a significant improvement of assigning individuals to candidate populations. For the many cases that we investigate, incorrect assignment was reduced between 26% and 97% using haplotype data. For empirical data from French and German individuals, the incorrectly assigned individuals can, for example, be decreased by 73% using haplotypes. Our results can be useful for challenging population structure and assignment problems, in particular for studies where large-scale population-genomic data are available.
Keywords:
本文献已被 PubMed 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号