Similar literature
 20 similar articles found (search time: 15 ms)
1.
Modern individual clustering methods utilising hypervariable nuclear microsatellite DNA polymorphisms are being increasingly applied in the field of population genetics. This study explores the efficiency of the clustering methods in identifying the breeds of origin of 250 domestic dog (Canis familiaris) individuals based on 10 microsatellite loci. An allele sharing distance (DAS) matrix and the corresponding neighbour-joining tree of individuals revealed monophyletic assemblages that corresponded perfectly with the breeds of origin of the dogs. Individual assignment tests using a Bayesian statistical approach, an allele frequency based method, and a DCE genetic distance based method were all extremely powerful. Most strikingly, the Bayesian method provided 100% assignment success of individuals into their correct breeds of origin and 100% exclusion success of individuals from all alternate reference populations with a high level of statistical confidence (P < 0.0001). A Bayesian Markov Chain Monte Carlo clustering approach revealed clear distinction of individuals into groups according to their breeds of origin, with a near-zero level of 'genetic admixture' among breeds. The results demonstrate that an FST of 0.18, mean expected gene diversity of 0.6 across 10 loci, and approximately 50 individuals per reference population suffice to provide maximum individual assignment success in C. familiaris. This refutes the traditional view that DNA based dog breed identification is not feasible at the individual level of resolution.
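
As a rough illustration of the allele sharing distance mentioned in this abstract, the sketch below uses one common definition, DAS = 1 - (alleles shared, summed over loci) / (2 x number of loci), for diploid genotypes. The abstract does not state which variant the authors used, so the formula, the function names, and the three-locus genotypes are all assumptions made for illustration only.

```python
from itertools import combinations


def shared_alleles(g1, g2):
    """Count alleles shared at one locus by two diploid genotypes (0, 1, or 2)."""
    remaining = list(g2)
    shared = 0
    for allele in g1:
        if allele in remaining:
            remaining.remove(allele)  # each allele copy can be matched only once
            shared += 1
    return shared


def allele_sharing_distance(ind_a, ind_b):
    """DAS = 1 - (shared alleles summed over loci) / (2 * number of loci)."""
    total = sum(shared_alleles(a, b) for a, b in zip(ind_a, ind_b))
    return 1.0 - total / (2 * len(ind_a))


def distance_matrix(individuals):
    """Pairwise DAS for every pair of individuals, keyed by (name_a, name_b)."""
    return {
        (a, b): allele_sharing_distance(individuals[a], individuals[b])
        for a, b in combinations(sorted(individuals), 2)
    }


# Hypothetical genotypes: three loci, alleles coded as fragment sizes.
dogs = {
    "dog1": [(120, 124), (200, 200), (310, 314)],
    "dog2": [(120, 124), (200, 204), (310, 310)],
    "dog3": [(118, 126), (208, 212), (300, 302)],
}

for pair, das in distance_matrix(dogs).items():
    print(pair, round(das, 3))
```

A matrix of this kind is the usual input to a neighbour-joining tree of individuals; the tree-building step itself is not shown here.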

2.
Inference of population structure under a Dirichlet process model
Huelsenbeck JP, Andolfatto P. Genetics, 2007, 175(4): 1787-1802
Inferring population structure from genetic data sampled from some number of individuals is a formidable statistical problem. One widely used approach considers the number of populations to be fixed and calculates the posterior probability of assigning individuals to each population. More recently, the assignment of individuals to populations and the number of populations have both been considered random variables that follow a Dirichlet process prior. We examined the statistical behavior of assignment of individuals to populations under a Dirichlet process prior. First, we examined a best-case scenario, in which all of the assumptions of the Dirichlet process prior were satisfied, by generating data under a Dirichlet process prior. Second, we examined the performance of the method when the genetic data were generated under a population genetics model with symmetric migration between populations. We examined the accuracy of population assignment using a distance on partitions. The method can be quite accurate with a moderate number of loci. As expected, inferences on the number of populations are more accurate when theta = 4N(e)u is large and when the migration rate (4N(e)m) is low. We also examined the sensitivity of inferences of population structure to choice of the parameter of the Dirichlet process model. Although inferences could be sensitive to the choice of the prior on the number of populations, this sensitivity occurred when the number of loci sampled was small; inferences are more robust to the prior on the number of populations when the number of sampled loci is large. Finally, we discuss several methods for summarizing the results of a Bayesian Markov chain Monte Carlo (MCMC) analysis of population structure. We develop the notion of the mean population partition, which is the partition of individuals to populations that minimizes the squared partition distance to the partitions sampled by the MCMC algorithm.

3.
Choi SC, Hey J. Genetics, 2011, 189(2): 561-577
A new approach to assigning individuals to populations using genetic data is described. Most existing methods work by maximizing Hardy-Weinberg and linkage equilibrium within populations, neither of which will apply for many demographic histories. By including a demographic model, within a likelihood framework based on coalescent theory, we can jointly study demographic history and population assignment. Genealogies and population assignments are sampled from a posterior distribution using a general isolation-with-migration model for multiple populations. A measure of partition distance between assignments facilitates not only the summary of a posterior sample of assignments, but also the estimation of the posterior density for the demographic history. It is shown that joint estimates of assignment and demographic history are possible, including estimation of population phylogeny for samples from three populations. The new method is compared to results of a widely used assignment method, using simulated and published empirical data sets.

4.
Gene flow from sugar beets to sea beets occurs in the seed propagation areas in southern Europe. Some seed propagation also takes place in Denmark, but here the crop-wild gene flow has not been investigated. Hence, we studied gene flow to sea beet populations from sugar beet lines used in Danish seed propagation areas. A set of 12 Danish, two Swedish, one French, one Italian, one Dutch, and one Irish populations of sea beets, and four lines of sugar beet were analysed. To evaluate the genetic variation and gene flow, eight microsatellite loci were screened. This analysis revealed hybridization with cultivated beet in one of the sea beet populations from the centre of the Danish seed propagation area. Triploid hybrids found in this population were verified with flow cytometry. Possible hybrids or introgressed plants were also found in the French and Italian populations. However, an individual assignment test using a Bayesian method provided 100% assignment success of diploid individuals into their correct subspecies of origin, and a Bayesian Markov chain Monte Carlo (MCMC) approach revealed clear distinction of individuals into groups according to their subspecies of origin, with a zero level of genetic admixture among subspecies. This underlines that introgression beyond the first hybridization is not extensive. The overall pattern of genetic distance and structure showed that Danish and Swedish sea beet populations were closely related to each other, and that both are more closely related to the population from Ireland than to the populations from France, the Netherlands, and Italy.

5.
We have used a new method for binning minisatellite alleles (semi-automated allele aggregation) and report the extent of population diversity detectable by eleven minisatellite loci in 2,689 individuals from 19 human populations distributed widely throughout the world. Whereas population relationships are consistent with those found in other studies, our estimate of genetic differentiation (F(st)) between populations is less than 8%, which is lower than comparative estimates of 10%-15% obtained by using other sources of polymorphism data. We infer that mutational processes are involved in reducing F(st) estimates from minisatellite data because, first, the lowest F(st) estimates are found at loci showing autocorrelated frequencies among alleles of similar size and, second, F(st) declines with heterozygosity but by more than predicted assuming simple models of mutation. These conclusions are consistent with the view that minisatellites are subject to selective or mutational constraints in addition to those expected under simple step-wise mutation models.

6.
We demonstrate the effectiveness of a genetic algorithm for discovering multi-locus combinations that provide accurate individual assignment decisions and estimates of mixture composition based on likelihood classification. Using simulated data representing different levels of inter-population differentiation (Fst ~ 0.01 and 0.10), genetic diversities (four or eight alleles per locus), and population sizes (20, 40, 100 individuals in baseline populations), we show that subsets of loci can be identified that provide comparable levels of accuracy in classification decisions relative to entire multi-locus data sets, where 5, 10, or 20 loci were considered. Microsatellite data sets from hatchery strains of lake trout, Salvelinus namaycush, representing a comparable range of inter-population levels of differentiation in allele frequencies confirmed simulation results. For both simulated and empirical data sets, assignment accuracy was achieved using fewer loci (e.g., three or four loci out of eight for empirical lake trout studies). Simulation results were used to investigate properties of the ‘leave-one-out’ (L1O) method for estimating assignment error rates. Accuracy of population assignments based on L1O methods should be viewed with caution under certain conditions, particularly when baseline population sample sizes are low (<50).
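
The likelihood classification and leave-one-out (L1O) error estimation described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: it assumes Hardy-Weinberg proportions within populations and independence between loci, and it substitutes an arbitrary floor frequency (`FLOOR`, a hypothetical choice) for alleles absent from a baseline sample. The baseline genotypes are invented.

```python
import math
from collections import Counter

FLOOR = 0.01  # assumed frequency for alleles unseen in a baseline sample


def allele_counts(genotypes, locus):
    """Count allele copies at one locus across a list of multilocus genotypes."""
    counts = Counter()
    for genotype in genotypes:
        counts.update(genotype[locus])
    return counts


def log_likelihood(genotype, baseline):
    """Log-likelihood of a diploid genotype given baseline genotypes,
    assuming Hardy-Weinberg proportions and independent loci."""
    total = 0.0
    for locus, (a, b) in enumerate(genotype):
        counts = allele_counts(baseline, locus)
        n = sum(counts.values()) or 1  # guard against an empty baseline
        p = max(counts[a] / n, FLOOR)
        q = max(counts[b] / n, FLOOR)
        total += math.log(p * p if a == b else 2 * p * q)
    return total


def assign(genotype, baselines):
    """Return the baseline population with the highest log-likelihood."""
    return max(baselines, key=lambda pop: log_likelihood(genotype, baselines[pop]))


def leave_one_out_accuracy(baselines):
    """Remove each individual from its own baseline, then try to reassign it."""
    correct = total = 0
    for pop, genotypes in baselines.items():
        for i, genotype in enumerate(genotypes):
            reduced = dict(baselines)
            reduced[pop] = genotypes[:i] + genotypes[i + 1:]
            correct += assign(genotype, reduced) == pop
            total += 1
    return correct / total


# Hypothetical baselines: two populations, two loci, integer allele codes.
baselines = {
    "A": [[(1, 1), (2, 2)], [(1, 1), (2, 3)], [(1, 2), (2, 2)]],
    "B": [[(3, 3), (4, 4)], [(3, 4), (4, 4)], [(3, 3), (4, 5)]],
}

print(leave_one_out_accuracy(baselines))
```

Because removing one individual changes its home population's allele frequencies most when that baseline is small, a sketch like this also makes it easy to see why the abstract urges caution with L1O estimates from small baseline samples.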

7.
Classification methods used in machine learning (e.g., artificial neural networks, decision trees, and k-nearest neighbor clustering) are rarely used with population genetic data. We compare different nonparametric machine learning techniques with parametric likelihood estimations commonly employed in population genetics for purposes of assigning individuals to their population of origin ("assignment tests"). Classifier accuracy was compared across simulated data sets representing different levels of population differentiation (low and high F(ST)), number of loci surveyed (5 and 10), and allelic diversity (average of three or eight alleles per locus). Empirical data for the lake trout (Salvelinus namaycush) exhibiting levels of population differentiation comparable to those used in simulations were examined to further evaluate and compare classification methods. Classification error rates associated with artificial neural networks and likelihood estimators were lower for simulated data sets compared to k-nearest neighbor and decision tree classifiers over the entire range of parameters considered. Artificial neural networks only marginally outperformed the likelihood method for simulated data (0-2.8% lower error rates). The performance of each machine learning classifier improved relative to likelihood estimators for empirical data sets, suggesting an ability to "learn" and utilize properties of empirical genotypic arrays intrinsic to each population. Likelihood-based estimation methods provide a more accessible option for reliable assignment of individuals to the population of origin due to the intricacies in development and evaluation of artificial neural networks.

8.
Although much work has been conducted on coastal populations of the American alligator (Alligator mississippiensis), less is known about the population dynamics and genetic structure of populations of alligators confined to inland habitats. DNA microsatellite loci, derived from the American alligator, were used to investigate patterns of genetic variation within and between populations of alligators distributed at coastal and inland localities in Texas. These data were used to evaluate the genetic discreteness of different alligator stocks relative to their basic ecology at these sites. Observed mean heterozygosities across seven loci for both coastal and inland populations ranged from 0.50-0.61, with both inland and coastal populations revealing similar patterns of variation. Measures of F(st) revealed significant population differentiation among all populations; however, analyses of molecular variance (AMOVAs) failed to demonstrate any apparent geographic pattern relative to the population differentiation indicated by F(st) values. Each population contained unique alleles for at least one locus. Additionally, assignment tests based on the distribution of genotypes placed 76% of individuals to their source population. These genetic data suggest considerable subdivision among alligator populations, possibly influenced by demographic and life history differences as well as barriers to dispersal. These results have clear implications for management. Rather than managing alligators in Texas as a single panmictic population, translocation programs and harvest quotas should consider the ecological and genetic distinctiveness of local alligator populations.

9.
Echinacea laevigata (Boynton and Beadle) Blake is a federally endangered flowering plant species restricted to four states in the southeastern United States. To determine the population structure and outcrossing rate across the range of the species, we conducted AFLP analysis using four primer combinations for 22 populations. The genetic diversity of this species was high based on the level of polymorphic loci (200 of 210 loci; 95.24%) and Nei’s gene diversity (ranging from 0.1398 to 0.2606; overall 0.2611). There was significant population genetic differentiation (GST = 0.294; θII = 0.218 from the Bayesian f = 0 model). Results from the AMOVA analysis suggest that a majority of the genetic variance is attributed to variation within populations (70.26%), which is also evident from the PCoA. However, 82% of individuals were assigned back to the original population based on the results of the assignment test. An isolation by distance analysis indicated that genetic differentiation among populations was a function of geographic distance, although long-distance gene dispersal between some populations was suggested from an analysis of relatedness between populations using the neighbor-joining method. An estimate of the outcrossing rate based on genotypes of progenies from six of the 22 populations using the multilocus method from the program MLTR ranged from 0.780 to 0.912, suggesting that the species is predominantly outcrossing. These results are encouraging for conservation, signifying that populations may persist due to continued genetic exchange sustained by the outcrossing mating system of the species.

10.
Nielsen R. Genetics, 2000, 154(2): 931-942
Some general likelihood and Bayesian methods for analyzing single nucleotide polymorphisms (SNPs) are presented. First, an efficient method for estimating demographic parameters from SNPs in linkage equilibrium is derived. The method is applied in the estimation of growth rates of a human population based on 37 SNP loci. It is demonstrated how ascertainment biases, due to biased sampling of loci, can be avoided, at least in some cases, by appropriate conditioning when calculating the likelihood function. Second, a Markov chain Monte Carlo (MCMC) method for analyzing linked SNPs is developed. This method can be used for Bayesian and likelihood inference on linked SNPs. The utility of the method is illustrated by estimating recombination rates in a human data set containing 17 SNPs and 60 individuals. Both methods are based on assumptions of low mutation rates.

11.
Microsatellite null alleles are found to a varying degree across all taxa. They are problematic as they may inflate measures of genetic differentiation and create false homozygotes. Although there are several methods for correcting allele frequencies for null alleles and enabling estimation of F(ST), much less is known about how null alleles affect assignment testing. Data presented here, based on simulations, show that the percentage of correctly assigned individuals in model-based clustering and Bayesian assignment methods was slightly, though significantly, reduced in the presence of null alleles (frequency range from 0.000 to 0.913). The bias in assignment tests caused by null alleles led to a slight reduction in the power to correctly assign individuals (0.2 and 1.0 percent units for STRUCTURE- and 2.4 percent units for GENECLASS-based assignment tests). Further, the presence of null alleles caused a small but significant overestimation of F(ST). Consequently, microsatellite loci affected by null alleles would probably not alter the overall outcome of assignment testing and could therefore be included in these types of studies. Nevertheless, loci prone to null alleles should be used with caution as they lower the power of assignment tests and alter the accuracy of F(ST), and loci less prone to null alleles should always be preferred.

12.
Elmer KR, Dávila JA, Lougheed SC. Heredity, 2007, 99(5): 506-515
We assess patterns of genetic diversity of a neotropical leaflitter frog, Eleutherodactylus ockendeni, in the upper Amazon of Ecuador without a priori delineation of biological populations and with sufficiently intensive sampling to assess inter-individual patterns. We mapped the location of each collected frog across a 5.4 x 1 km landscape at the Jatun Sacha Biological Station, genotyped 185 individuals using five species-specific DNA microsatellite loci, and sequenced a fragment of mitochondrial cytochrome b for a subset of 51 individuals. The microsatellites were characterized by high allelic diversity and homozygote excess across all loci, suggesting that when pooled the sample is not a panmictic population. We conclude that the lack of panmixia is not attributable to the influence of null alleles or biased sampling of consanguineous family groups. Multiple methods of population cluster analysis, using both Bayesian and maximum likelihood approaches, failed to identify discrete genetic clusters across the sampled area. Using multivariate spatial autocorrelation, kinship coefficients and relatedness coefficients, we identify a continuous isolation by distance population structure, with a first patch size of ca. 260 m and apparently large population sizes. Analysis of mtDNA corroborates the observation of high genetic diversity at fine scales: there are multiple haplotypes, they are non-randomly distributed and a binary haplotype correlogram shows significant spatial genetic autocorrelation. We demonstrate the utility of inter-individual genetic methods and caution against making a priori assumptions about population genetic structure based simply on arbitrary or convenient patterns of sampling.

13.
Large escapes of cultured salmon from net‐pens have become inevitable disasters linked to the growth of aquaculture in coastal areas. Hybridization between farmed and wild salmon has been witnessed, but the extent of eventual genetic introgression is controversial as selection against hybrids can maintain distinct gene pools. Individual assignment tests based on genetic data have been widely used in fisheries, due to the importance of accurate population assignment for a variety of purposes including distinction between individuals of native and stocked origin. However, the ability of these Bayesian programs to detect hybrids and subsequent generations between closely related populations has been little investigated. Here we present results regarding the efficiency of two new computer programs, STRUCTURE and NewHybrids, in detecting hybridization between farmed and wild salmon from the river Teno (Northern Europe) based on genetic data obtained from 17 microsatellite loci.

14.
Population genetic analyses traditionally focus on the frequencies of alleles or genotypes in 'populations' that are delimited a priori. However, there are potential drawbacks of amalgamating genetic data into such composite attributes of assemblages of specimens: genetic information on individual specimens is lost or submerged as an inherent part of the analysis. A potential also exists for circular reasoning when a population's initial identification and subsequent genetic characterization are coupled. In principle, these problems are circumvented by some newer methods of population identification and individual assignment based on statistical clustering of specimen genotypes. Here we evaluate a recent method in this genre, Bayesian clustering, using four genotypic data sets involving different types of molecular markers in non-model organisms from nature. As expected, measures of population genetic structure (F(ST) and Φ(ST)) tended to be significantly greater in Bayesian a posteriori data treatments than in analyses where populations were delimited a priori. In the four biological contexts examined, which involved both geographic population structures and hybrid zones, Bayesian clustering was able to recover differentiated populations, and Bayesian assignments were able to identify likely population sources of specific individuals.

15.
Lin FJ, Jiang PP, Ding P. Zoological Research (动物学研究), 2010, 31(5): 461-468
In this study, we report population genetic analyses of Elliot's Pheasant (Syrmaticus ellioti) using seven polymorphic microsatellite loci and 105 individuals from four geographical populations. Departures from Hardy-Weinberg equilibrium were found in the four geographical populations. The average number of alleles per locus was 8.86, with a total of 62 alleles across the seven loci; observed heterozygosity (HO) was generally low, averaging 0.504. Across the seven microsatellite loci, the polymorphism information content ranged from 0.549 to 0.860, with an average of 0.712. Population bottlenecks in the four geographical populations were tested under the infinite allele mutation model, the step-wise mutation model and the two-phase mutation model; the tests indicated that each population had experienced a recent bottleneck. Fst analysis across all geographical populations indicated that the genetic differentiation between the Guizhou and Hunan geographical populations was highly significant (P<0.001), a finding supported by the distant genetic relationship shown in the neighbor-joining tree of the four geographical populations based on Nei's unbiased genetic distances. Using hierarchical analysis of molecular variance (the Guizhou geographical population relative to all others pooled), we found a low level of genetic variation among geographical populations and between groups. However, differences among populations relative to the total sample explained most of the genetic variance (92.84%), which was significant.
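
The per-locus summary statistics quoted in this abstract (observed heterozygosity and polymorphism information content) can be computed along the following lines. This is a sketch under assumptions: it uses the uncorrected expected heterozygosity 1 - Σp² and the standard PIC formula 1 - Σp² - Σ_{i<j} 2p_i²p_j²; the abstract does not say which estimators or software the authors used, and the genotypes below are invented.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Allele frequencies at one locus from a list of diploid genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    n = sum(counts.values())
    return {allele: c / n for allele, c in counts.items()}


def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(freqs):
    """Uncorrected expected heterozygosity: 1 - sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    p = list(freqs.values())
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return expected_heterozygosity(freqs) - cross


# Hypothetical genotypes at a single locus with two alleles.
locus = [(1, 2), (1, 1), (2, 2), (1, 2)]
freqs = allele_frequencies(locus)

print(observed_heterozygosity(locus))
print(expected_heterozygosity(freqs))
print(polymorphism_information_content(freqs))
```

Averaging these values over loci gives figures comparable in kind to the means reported above, although small-sample corrections would change the numbers slightly.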

16.
Gattepaille LM, Jakobsson M. Genetics, 2012, 190(1): 159-174
High-throughput genotyping and sequencing technologies can generate dense sets of genetic markers for large numbers of individuals. For most species, these data will contain many markers in linkage disequilibrium (LD). To utilize such data for population structure inference, we investigate the use of haplotypes constructed by combining the alleles at single-nucleotide polymorphisms (SNPs). We introduce a statistic derived from information theory, the gain of informativeness for assignment (GIA), which quantifies the additional information for assigning individuals to populations using haplotype data compared to using individual loci separately. Using a two-loci-two-allele model, we demonstrate that combining markers in linkage equilibrium into haplotypes always leads to nonpositive GIA, suggesting that combining the two markers is not advantageous for ancestry inference. However, for loci in LD, GIA is often positive, suggesting that assignment can be improved by combining markers into haplotypes. Using GIA as a criterion for combining markers into haplotypes, we demonstrate for simulated data a significant improvement of assigning individuals to candidate populations. For the many cases that we investigate, incorrect assignment was reduced between 26% and 97% using haplotype data. For empirical data from French and German individuals, the incorrectly assigned individuals can, for example, be decreased by 73% using haplotypes. Our results can be useful for challenging population structure and assignment problems, in particular for studies where large-scale population-genomic data are available.

17.
Wilkinson S, Haley C, Alderson L, Wiener P. Heredity, 2011, 106(2): 261-269
Recently developed Bayesian genotypic clustering methods for analysing genetic data offer a powerful tool to evaluate the genetic structure of domestic farm animal breeds. The unit of study with these approaches is the individual instead of the population. We aimed to empirically evaluate various individual-based population genetic statistical methods for characterization of genetic diversity and structure of livestock breeds. Eighteen British pig populations, comprising 819 individuals, were genotyped at 46 microsatellite markers. Three Bayesian genotypic clustering approaches, principal component analysis (PCA) and phylogenetic reconstruction were applied to individual multilocus genotypes to infer the genetic structure and diversity of the British pig breeds. Comparisons of the three Bayesian genotypic clustering methods (, and ) revealed some broad similarities but also some notable differences. Overall, the methods agreed that the majority of the British pig breeds are independent genetic units with little evidence of admixture. The three Bayesian genotypic clustering methods provided complementary, biologically credible clustering solutions but at different levels of resolution. detected finer genetic differentiation and in some cases, populations within breeds. Consequently, it estimated a greater number of underlying genetic populations (K, in the notation of Bayesian clustering methods). Two of the Bayesian methods ( and ) and phylogenetic reconstruction provided similar success in assignment of individuals, supporting the use of these methods for breed assignment.

18.
Conservation and management of widespread species can be improved if populations exhibiting genetic differentiation are recognized as local management units. Specimens of Nile crocodile (Crocodylus niloticus) corresponding to major river drainage systems from Eastern Africa and Madagascar, and a small set of samples from Western Africa, were analyzed using multilocus genotyping to evaluate the potential to discriminate among locations and to assign individuals to population of origin. Populations from all sampled regions exhibited marked levels of genetic and genotypic differentiation as assessed by significant F(ST) values and Bayesian analysis of population structure. At the regional level, the majority (94%) of all specimens were successfully assigned to the population of origin using only four microsatellite loci. Three populations sampled within Madagascar required the use of 12 loci for successful assignment of greater than 84%. Our findings demonstrate a need for alternative management strategies that consider the biogeographic sub-structuring of Nile crocodiles associated with major river drainages in Africa and Madagascar.

19.
Introductions of biological control agents may cause bottlenecks in population size despite efforts to avoid them. We examined the population genetics of Aphidius ervi (Hymenoptera: Braconidae), a parasitoid that was introduced to North America from Western Europe in 1959 to control pea aphids. To explore the phylogeographical relationships of A. ervi we sequenced 1249 bp of mitochondrial DNA (mtDNA) from 27 individuals from the native range and 51 individuals from the introduced range. Most individuals from Western Europe, the Middle East and North America shared one of two common haplotypes, consistent with the known history of the introduction. However, some A. ervi from the Pacific Northwest have a haplotype that is most similar to haplotypes found in Japan, raising the possibility of a second accidental introduction. To examine population structure and assess whether a bottleneck occurred upon introduction to North America, we assayed variation at 5 microsatellite loci in 62 individuals from 2 native populations and 230 individuals from 6 introduced populations. Introduced samples had fewer rare alleles than native samples (F1,34 = 13.5, P = 0.0008), but heterozygosity did not differ significantly. These results suggest that a mild bottleneck occurred in spite of the introduction of over 1000 individuals. Using a hierarchical Bayesian approach, the founding population size was estimated to be 245 individuals. AMOVA showed significant genetic differentiation between the European and North American samples, and a Bayesian assignment approach clustered individuals into four groups, with most European individuals in one group and most North American individuals in the other three. These results highlight that genetic changes are associated with founder events in rapidly growing natural populations, even when the founding population size is relatively large.

20.
Although F(ST) values are widely used to elucidate population relationships, in some cases, when employing highly polymorphic loci, they should be regarded with caution, particularly when subspecies are under consideration. Tripterygion delaisi presents two subspecies that were investigated here, using 10 microsatellite loci. A Bayesian approach allowed us to clearly identify both subspecies as two different evolutionary significant units. However, low F(ST) values were found between subspecies as a consequence of the large number of alleles per locus, while homoplasy could be disregarded as indicated by the standardized genetic distance G'(ST). Heterozygosity saturation was observed in highly polymorphic loci containing more than 15 alleles, and this threshold was used to define two loci pools. The less variable loci pool revealed higher genetic variance between subspecies, while the more variable pool showed higher genetic variance between populations. Furthermore, higher differentiation was also observed between populations using G'(ST) with the more variable loci. Nonetheless, a more reliable population structure within subspecies was obtained when all loci were included in the analyses. In T. d. xanthosoma, isolation by distance was detected between the eight analysed populations, and six genetically homogeneous clusters were inferred by Bayesian analyses that are in accordance with F(ST) values. The neighbourhood-size method also indicated rather small dispersal capabilities. In conclusion, in fish with limited adult and larval dispersal capabilities, continuous rocky habitat seems to allow contact between populations and prevent genetic differentiation, while large discontinuities of sand or deep-water channels seem to reduce gene flow.
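
The point that highly polymorphic loci depress F(ST)-type measures can be illustrated with a small calculation. The sketch below assumes the commonly cited standardization G'(ST) = G(ST) / G(ST,max), with G(ST,max) = (k - 1)(1 - H_S) / (k - 1 + H_S) for k equally sized populations; the abstract does not spell out the formula used, so this form, the function names, and the heterozygosity values are assumptions for illustration.

```python
def g_st(h_s, h_t):
    """G_ST from mean within-population (H_S) and total (H_T) heterozygosity."""
    return (h_t - h_s) / h_t


def g_st_max(h_s, k):
    """Largest G_ST attainable for k populations at a given H_S
    (reached when the populations share no alleles)."""
    return (k - 1) * (1 - h_s) / (k - 1 + h_s)


def g_st_standardized(h_s, h_t, k):
    """Standardized measure G'_ST = G_ST / G_ST(max)."""
    return g_st(h_s, h_t) / g_st_max(h_s, k)


# Hypothetical case: two populations with no alleles in common, each with
# within-population heterozygosity 0.9, which gives H_T = (1 + 0.9) / 2 = 0.95.
h_s, h_t, k = 0.9, 0.95, 2

print(round(g_st(h_s, h_t), 4))               # raw G_ST stays small
print(round(g_st_standardized(h_s, h_t, k), 4))  # standardized value reaches its maximum
```

In this invented case the raw value is about 0.05 even though the two populations are completely distinct, which is the kind of saturation effect the abstract attributes to loci with many alleles.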
