Next‐generation sequencing for molecular ecology: a caveat regarding pooled samples
Authors: Eric C Anderson, Hans J Skaug, Daniel J Barshis
Institution: 1. Fisheries Ecology Division, Southwest Fisheries Science Center, National Marine Fisheries Service, NOAA, Santa Cruz, CA, 95060 USA; 2. Department of Applied Math and Statistics (SOE2), University of California, Santa Cruz, CA, 95064 USA; 3. Department of Mathematics, University of Bergen, P.B. 7800, Bergen, Norway
Abstract: We develop a model based on the Dirichlet-compound multinomial distribution (CMD) and the Ewens sampling formula to predict the fraction of SNP loci that will appear fixed for alternate alleles between two pooled samples drawn from the same underlying population. We apply this model to next-generation sequencing (NGS) data from Baltic Sea herring recently published by Corander et al. (2013, Molecular Ecology, 2931–2940) and show that there are many more fixed loci than expected in the absence of genetic structure. However, we show through coalescent simulations that the degree of population structure required to explain the fraction of alternatively fixed SNPs is extraordinarily high, and that the surplus of fixed loci is more likely a consequence of limited representation of individual gene copies in the pooled samples than of population structure. Our analysis signals that the use of NGS on pooled samples to identify divergent SNPs warrants caution. With pooled samples, it is hard to diagnose when an NGS experiment has gone awry. Especially when NGS data on pooled samples have low read depth and come from a limited number of individuals, it may be worthwhile to temper claims of unexpected population differentiation, pending verification with more reliable methods or stricter adherence to recommended sampling designs for pooled sequencing (e.g. Futschik & Schlötterer 2010, Genetics, 186, 207; Gautier et al., 2013a, Molecular Ecology, 3766–3779). Analysis of the data and diagnosis of problems are easier and more reliable (and can be less costly) with individually barcoded samples. Consequently, for some scenarios, individual barcoding may be preferable to pooling of samples.
Keywords: compound multinomial distribution; outlier analysis; population divergence; SNP discovery
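
Illustration (not part of the published abstract): the abstract's central point, that uneven representation of gene copies in a pool can by itself make loci appear fixed for alternate alleles, can be explored with a small Monte Carlo sketch. This is not the authors' model, which combines the CMD with the Ewens sampling formula. It is a simplified toy that assumes a single biallelic SNP with a fixed population frequency `p`, two pools drawn from the same population, and reads allocated to gene copies by a symmetric Dirichlet-multinomial. The function names and the parameters `alpha`, `n_copies`, `depth`, and `n_loci` are hypothetical choices made for this example only.

```python
import random


def reference_reads_in_pool(p, n_copies, depth, alpha, rng):
    """Simulate one pool; return how many of `depth` reads carry the reference allele.

    Assumption for this sketch: each gene copy's share of the reads follows a
    symmetric Dirichlet(alpha) distribution. Small alpha means a few copies
    dominate the reads; large alpha approaches equal representation.
    """
    # Draw the allele (1 = reference, 0 = alternate) carried by each gene copy.
    alleles = [1 if rng.random() < p else 0 for _ in range(n_copies)]
    # Unnormalized Dirichlet weights as independent gamma draws;
    # random.choices normalizes the weights internally.
    weights = [rng.gammavariate(alpha, 1.0) for _ in range(n_copies)]
    # Each read samples one gene copy in proportion to its weight.
    picks = rng.choices(range(n_copies), weights=weights, k=depth)
    return sum(alleles[i] for i in picks)


def fraction_alternately_fixed(p, n_copies, depth, alpha, n_loci, seed=0):
    """Fraction of simulated loci where two pools from the SAME population
    show only one allele each, and the alleles differ between pools."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(n_loci):
        r1 = reference_reads_in_pool(p, n_copies, depth, alpha, rng)
        r2 = reference_reads_in_pool(p, n_copies, depth, alpha, rng)
        if (r1 == depth and r2 == 0) or (r1 == 0 and r2 == depth):
            fixed += 1
    return fixed / n_loci


if __name__ == "__main__":
    # Same population, same frequency; only the evenness of representation differs.
    even = fraction_alternately_fixed(0.5, n_copies=40, depth=8, alpha=100.0, n_loci=5000)
    uneven = fraction_alternately_fixed(0.5, n_copies=40, depth=8, alpha=0.1, n_loci=5000)
    print("near-equal representation:", even)
    print("uneven representation:   ", uneven)
```

Under these assumptions there is no population structure at all, so any alternately fixed loci arise purely from sampling. Comparing a small `alpha` with a large one gives a rough feel for how strongly uneven representation inflates that fraction at low read depth.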