Bayesian methods based on population genetics models have been proposed and widely applied to make unsupervised inferences of population structure from a sample of multilocus genotypes. Usually they provide good estimates of the ancestry (or population membership) of sampled individuals by clustering them probabilistically or proportionally into (anonymous) populations. However, they have difficulties in accurately estimating the number of populations (K) represented by the sampled individuals. This study proposed a new ad hoc estimator of K, calculable from the output of a population clustering program such as STRUCTURE or ADMIXTURE. The new criterion, called the parsimony index (PI), aims to identify the number of populations (K) that consistently yields the minimal admixture estimates of sampled individuals. Extensive simulated and empirical data were used to compare the accuracy of PI and two popular K estimators based on Pr[X|K] (i.e., the probability of genotype data X given K) and ΔK (i.e., the rate of change of the probability of data as a function of K) calculated from STRUCTURE outputs, and the accuracy of PI and the cross‐validation method calculated from ADMIXTURE outputs. It was shown that PI was consistently more accurate than the other methods in various population structure (e.g., hierarchical island model, different extents of differentiation) and sampling (e.g., unbalanced sample sizes, different marker information contents) scenarios. The ΔK method was more accurate than the Pr[X|K] method only for hierarchically structured or highly inbred populations, and the opposite was true in the other scenarios. The PI method was implemented in a computer program, KFinder, which can be run on all major computer platforms.
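As an illustration of the ΔK estimator mentioned in the abstract above, the following is a minimal sketch, not the code of STRUCTURE, KFinder or any published tool. It assumes the commonly used form ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd[L(K)], where L(K) is the mean Ln Pr(X|K) over replicate runs; the function name `delta_k` and the input layout (a dict of replicate log-likelihoods per K) are hypothetical. The parsimony index itself is not sketched here because the abstract does not give its formula.

```python
from statistics import mean, stdev


def delta_k(lnp_runs):
    """Return {K: delta-K} for every K that has neighbours K-1 and K+1.

    lnp_runs maps each K to a list of Ln Pr(X|K) values, one per
    replicate run (at least two replicates per K are needed for the sd).
    """
    means = {k: mean(v) for k, v in lnp_runs.items()}
    result = {}
    for k in sorted(lnp_runs):
        if k - 1 not in means or k + 1 not in means:
            continue  # the second difference is undefined at the ends
        sd = stdev(lnp_runs[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Made-up replicate log-likelihoods, for illustration only.
example_runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-895.0, -897.0],
    4: [-893.0, -895.0],
}
scores = delta_k(example_runs)
best_k = max(scores, key=scores.get)
```

With these invented values the largest ΔK falls at the K where the log-likelihood curve bends most sharply, which is the behaviour the abstract describes as tracking the rate of change of the probability of the data.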

Inferences of population genetic structure are of great importance to the fields of ecology and evolutionary biology. The program STRUCTURE has been widely used to infer population genetic structure. However, previous studies demonstrated that uneven sampling often leads to wrong inferences on hierarchical structure. The most widely used ΔK method tends to identify the uppermost hierarchy of population structure. Recently, four alternative statistics (MedMedK, MedMeaK, MaxMedK and MaxMeaK) were proposed, which appear to be more accurate than the previously used methods for both even and uneven sampling data. However, the lack of easy‐to‐use software limits the use of these appealing new estimators. Here, we developed a web‐based, user‐friendly software, StructureSelector, to calculate the four alternative statistics together with the commonly used Ln Pr(X|K) and ΔK statistics. StructureSelector accepts the result files of STRUCTURE, ADMIXTURE or fastSTRUCTURE as input files. It reports the “best” K for each estimator, and the results are available as HTML or tab‐separated tables. The program can also generate graphical representations for specific K, which can be easily downloaded from the server. The software is freely available at http://lmme.qdio.ac.cn/StructureSelector/.

Estimating dispersal—a key parameter for population ecology and management—is notoriously difficult. The use of pedigree assignments, aided by likelihood‐based software, has become popular to estimate dispersal rate and distance. However, the partial sampling of populations may produce false assignments. Further, it is unknown how the accuracy of assignment is affected by the genealogical relationships of individuals and is reflected by software‐derived assignment probabilities. Inspired by a project managing invasive American mink (Neovison vison), we estimated individual dispersal distances using inferred pairwise relationships of culled individuals. Additionally, we simulated scenarios to investigate the accuracy of pairwise inferences. Estimates of dispersal distance varied greatly when derived from different inferred pairwise relationships, with mother–offspring relationship being the shortest (average = 21 km) and the most accurate. Pairs assigned as maternal half‐siblings were inaccurate, with 64%–97% falsely assigned, implying that estimates for these relationships in the wild population were unreliable. The false assignment rate was unrelated to the software‐derived assignment probabilities at high dispersal rates. Assignments were more accurate when the inferred parents were older and immigrants and when dispersal rates between subpopulations were low (1% and 2%). Using 30 instead of 15 loci increased pairwise reliability, but half‐sibling assignments were still inaccurate (>59% falsely assigned). The most reliable approach when using inferred pairwise relationships in polygamous species would be not to use half‐sibling relationship types. Our simulation approach provides guidance for the application of pedigree inferences under partial sampling and is applicable to other systems where pedigree assignments are used for ecological inference.  相似文献   

Habitat fragmentation has often been implicated in the decline of many species. For habitat specialists and/or sedentary species, loss of habitat can result in population isolation and lead to negative genetic effects. However, factors other than fragmentation can often be important and also need to be considered when assessing the genetic structure of a species. We genotyped individuals from 13 populations of the cooperatively breeding Brown‐headed Nuthatch Sitta pusilla in Florida to test three alternative hypotheses regarding the effects that habitat fragmentation might have on genetic structure. A map of potential habitat developed from recent satellite imagery suggested that Brown‐headed Nuthatch populations in southern Florida occupied smaller and more isolated habitat patches (i.e. were more fragmented) than populations in northern Florida. We also genotyped individuals from a small, isolated Brown‐headed Nuthatch population on Grand Bahama Island. We found that populations associated with more fragmented habitat in southern Florida had lower allelic richness than populations in northern Florida (P = 0.02), although there were no differences in heterozygosity. Although pairwise estimates of FST were low overall, values among southern populations were generally higher than northern populations. Population assignment tests identified K = 3 clusters corresponding to a northern cluster, a southern cluster and a unique population in southeast Florida; using sampling localities as prior information revealed K = 7 clusters, with greater structure only among southern Florida populations. The Bahamas population showed moderate to high differentiation compared with Florida populations. Overall, our results suggest that fragmentation could affect gene flow in Brown‐headed Nuthatch populations and is likely to become more pronounced over time.  相似文献   

The program STRUCTURE has been used extensively to understand and visualize population genetic structure. It is one of the most commonly used clustering algorithms, cited over 11,500 times in Web of Science since its introduction in 2000. The method estimates ancestry proportions to assign individuals to clusters, and post hoc analyses of results may indicate the most likely number of clusters, or populations, on the landscape. However, as has been shown in this issue of Molecular Ecology Resources by Puechmaille (2016), when sampling is uneven across populations or across hierarchical levels of population structure, these post hoc analyses can be inaccurate and identify an incorrect number of population clusters. To solve this problem, Puechmaille (2016) presents strategies for subsampling and new analysis methods that are robust to uneven sampling to improve inferences of the number of population clusters.

Inferences about introduction histories of invasive species remain challenging because of the stochastic demographic processes involved. Approximate Bayesian computation (ABC) can help to overcome these problems, but such method requires a prior understanding of population structure over the study area, necessitating the use of alternative methods and an intense sampling design. In this study, we made inferences about the worldwide invasion history of the ladybird Harmonia axyridis by various population genetics statistical methods, using a large set of sampling sites distributed over most of the species’ native and invaded areas. We evaluated the complementarity of the statistical methods and the consequences of using different sets of site samples for ABC inferences. We found that the H. axyridis invasion has involved two bridgehead invasive populations in North America, which have served as the source populations for at least six independent introductions into other continents. We also identified several situations of genetic admixture between differentiated sources. Our results highlight the importance of coupling ABC methods with more traditional statistical approaches. We found that the choice of site samples could affect the conclusions of ABC analyses comparing possible scenarios. Approaches involving independent ABC analyses on several sample sets constitute a sensible solution, complementary to standard quality controls based on the analysis of pseudo‐observed data sets, to minimize erroneous conclusions. This study provides biologists without expertise in this area with detailed methodological and conceptual guidelines for making inferences about invasion routes when dealing with a large number of sampling sites and complex population genetic structures.  相似文献   

Bayesian clustering as implemented in STRUCTURE or GENELAND software is widely used to form genetic groups of populations or individuals. On the other hand, in order to satisfy the need for less computer-intensive approaches, multivariate analyses are specifically devoted to extracting information from large datasets. In this paper, we report the use of a dataset of AFLP markers belonging to 15 sampling sites of Acacia caven for studying the genetic structure and comparing the consistency of three methods: STRUCTURE, GENELAND and DAPC. Of these methods, DAPC was the fastest one and showed accuracy in inferring the K number of populations (K = 12 using the find.clusters option and K = 15 with a priori information of populations). GENELAND in turn, provides information on the area of membership probabilities for individuals or populations in the space, when coordinates are specified (K = 12). STRUCTURE also inferred the number of K populations and the membership probabilities of individuals based on ancestry, presenting the result K = 11 without prior information of populations and K = 15 using the LOCPRIOR option. Finally, in this work all three methods showed high consistency in estimating the population structure, inferring similar numbers of populations and the membership probabilities of individuals to each group, with a high correlation between each other.  相似文献   

Many models for inference of population genetic parameters are based on the assumption that the data set at hand consists of groups displaying within-group Hardy-Weinberg equilibrium at individual loci and linkage equilibrium between loci. This assumption is commonly violated by the presence of within-group spatial structure arising from nonrandom mating of individuals due to isolation by distance (IBD). This paper proposes a model and simulation method implemented in a computer program to flexibly simulate data displaying such patterns. The program permits displaying of smooth spatial variations of allele frequencies due to IBD and more abrupt variations due to presence of strong barriers to gene flow. It is useful in assessing performance of various statistical inference methods and in designing spatial sampling schemes. This is shown by a simulation study aimed at assessing the extent to which IBD patterns affect accuracy of cluster inferences performed in models assuming panmixia. The program is also used to study the effects of spatial sampling scheme (e.g. sampling individuals in clumps or uniformly across the spatial domain). The accuracy of such inferences is assessed in terms of number of inferred populations, assignment of individuals to populations and location of borders between populations. The effect of spatial sampling was weak while the effect of IBD may be substantial, leading to the inference of spurious populations, especially when IBD was strong with respect to the size of the sampling domain. The model and program are new and have been embedded in the R package Geneland, for user convenience and compliance with existing data formats.  相似文献   

Non‐invasive genetic sampling is an increasingly popular approach for investigating the demographics of natural populations. This has also become a useful tool for managers and conservation biologists, especially for those species for which traditional mark–recapture studies are not practical. However, the consequence of collecting DNA indirectly is that an individual may be sampled multiple times per sampling session. This requires alternative statistical approaches to those used in traditional mark–recapture studies. Here we present the R package capwire, an implementation of the population size estimators of Miller et al. (Molecular Ecology 2005; 14: 1991), which were designed to deal specifically with this type of sampling. The aim of this project is to enable users across platforms to easily manipulate their data and interact with existing R packages. We have also provided functions to simulate data under a variety of scenarios to allow for rigorous testing of the robustness of the method and to facilitate further development of this approach.

Population stratification may confound the results of genetic association studies among unrelated individuals from admixed populations. Several methods have been proposed to estimate the ancestral information in admixed populations and used to adjust for population stratification in genetic association tests. We evaluate the performance of three different methods (maximum likelihood estimation, ADMIXMAP and Structure) through various simulated data sets and real data from Latino subjects participating in a genetic study of asthma. All three methods provide ancestry estimates of similar accuracy and control the type I error rate to a similar degree. The most important factor in determining the accuracy of the ancestry estimate and in minimizing the type I error rate is the number of markers used to estimate ancestry. We demonstrate that approximately 100 ancestry informative markers (AIMs) are required to obtain ancestry estimates whose correlation with the true individual ancestral proportions exceeds 0.9. In addition, after accounting for the ancestry information in association tests, the excess type I error rate is controlled at the 5% level when 100 markers are used to estimate ancestry. However, since the effect of admixture on the type I error rate worsens with sample size, the accuracy of ancestry estimates also needs to increase to make the appropriate correction. Using data from the Latino subjects, we also apply these methods to an association study between body mass index and 44 AIMs. These simulations are meant to provide some practical guidelines for investigators conducting association studies in admixed populations.

The similarity index and DNA fingerprinting (total citations: 147; self-citations: 0; citations by others: 147)
DNA-fingerprint similarity is being used increasingly to make inferences about levels of genetic variation within and between natural populations. It is shown that the similarity index (the average fraction of shared restriction fragments) provides upwardly biased estimates of population homozygosity but nearly unbiased estimates of the average identity-in-state for random pairs of individuals. A method is suggested for partitioning the DNA-fingerprint dissimilarity into within- and between-population components. Some simple expressions are given for the sampling variances of these estimators.
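The similarity index described above can be sketched as follows. This is a minimal illustration, assuming the usual band-sharing form S_xy = 2·n_xy / (n_x + n_y), where n_x and n_y are the numbers of fragments scored in individuals x and y and n_xy is the number they share; the function names and the representation of a fingerprint as a set of band labels are hypothetical, and the bias corrections and sampling variances discussed in the abstract are not implemented.

```python
from itertools import combinations


def band_sharing(x, y):
    """Fraction of shared fragments between two fingerprints (sets of bands)."""
    x, y = set(x), set(y)
    if not x and not y:
        raise ValueError("both fingerprints are empty")
    return 2 * len(x & y) / (len(x) + len(y))


def mean_similarity(profiles):
    """Average band-sharing over all distinct pairs of individuals."""
    pairs = list(combinations(profiles, 2))
    if not pairs:
        raise ValueError("need at least two fingerprints")
    return sum(band_sharing(a, b) for a, b in pairs) / len(pairs)


# Made-up band labels (e.g. fragment size classes), for illustration only.
ind_a = {1, 2, 3, 4}
ind_b = {3, 4, 5, 6}
ind_c = {1, 2, 3, 4}
s_ab = band_sharing(ind_a, ind_b)
s_mean = mean_similarity([ind_a, ind_b, ind_c])
```

Treating each fingerprint as a set keeps the sketch short; real gels require a rule for deciding when two bands count as the same fragment, which is outside the scope of this example.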

SPAGeDi version 1.0 is software primarily designed to characterize the spatial genetic structure of mapped individuals or populations using genotype data of codominant markers. It computes various statistics describing genetic relatedness or differentiation between individuals or populations by pairwise comparisons and tests their significance by appropriate numerical resampling. SPAGeDi is useful for: (i) detecting isolation by distance within or among populations and estimating gene dispersal parameters; (ii) assessing genetic relatedness between individuals and its actual variance, a parameter of interest for marker based inferences of quantitative inheritance; (iii) assessing genetic differentiation among populations, including the case of haploids or autopolyploids.

The association between a geographical region and an mtDNA haplogroup(s) has provided the basis for using mtDNA haplogroups to infer an individual’s place of origin and genetic ancestry. Although it is well known that ancestry inferences using mtDNA haplogroups and those using genome-wide markers are frequently discrepant, little empirical information exists on the magnitude and scope of such discrepancies between multiple mtDNA haplogroups and worldwide populations. We compared genetic-ancestry inferences made by mtDNA-haplogroup membership to those made by autosomal SNPs in ∼940 samples of the Human Genome Diversity Panel and recently admixed populations from the 1000 Genomes Project. Continental-ancestry proportions often varied widely among individuals sharing the same mtDNA haplogroup. For only half of mtDNA haplogroups did the highest average continental-ancestry proportion match the highest continental-ancestry proportion of a majority of individuals with that haplogroup. Prediction of an individual’s mtDNA haplogroup from his or her continental-ancestry proportions was often incorrect. Collectively, these results indicate that for most individuals in the worldwide populations sampled, mtDNA-haplogroup membership provides limited information about either continental ancestry or continental region of origin.  相似文献   

Many long‐lived plant and animal species have nondiscrete overlapping generations. Although numerous models have been developed to predict the effective sizes (Ne) of populations with overlapping generations, they are extremely difficult to apply to natural populations because of the large array of unknown and elusive life‐table parameters involved. Unfortunately, little work has been done to estimate the Ne of populations with overlapping generations from marker data, in sharp contrast to the situation of populations with discrete generations for which quite a few estimators are available. In this study, we propose an estimator (EPA, estimator by parentage assignments) of the current Ne of populations with overlapping generations, using the sex, age, and multilocus genotype information of a single sample of individuals taken at random from the population. Simulations show that EPA provides unbiased and accurate estimates of Ne under realistic sampling and genotyping effort. Additionally, it yields estimates of other interesting parameters such as generation interval, the variances and covariances of lifetime family size, effective number of breeders of each age class, and life‐table variables. Data from wild populations of baboons and hihi (stitchbird) were analyzed by EPA to demonstrate the use of the estimator in practical sampling and genotyping situations.  相似文献   

Microsatellites (simple sequence repeats, SSRs) remain popular molecular markers for studying neutral genetic variation. Two alternative models outline how new microsatellite alleles evolve. The infinite alleles model (IAM) assumes that all possible alleles are equally likely to result from a mutation, while the stepwise mutation model (SMM) describes microsatellite evolution as the stepwise addition or subtraction of single repeat units. Genetic relationships between individuals can be analyzed with higher precision when assuming the SMM scenario, with allele size differences as a proxy of genetic distance. If population structure is not determined in advance, an empirical data analysis usually includes (a) estimating proximity between individual SSR profiles with a selected dissimilarity measure and (b) determining the putative genetic structure of a given set of individuals using methods of clustering and/or ordination for the obtained dissimilarity matrix. We developed new dissimilarity indices between SSR profiles of haploid, diploid, or polyploid organisms assuming different mutation models and compared the performance of these indices for determining genetic structure with population data and with simulations. More specifically, we compared SMM with a constant or variable mutation rate at different SSR loci to IAM using data from natural populations of a freshwater bryozoan Cristatella mucedo (diploid), wheat leaf rust Puccinia triticina (dikaryon), and wheat powdery mildew Blumeria graminis (monokaryon). We show that inferences about population genetic structure are sensitive to the assumed mutation model. With simulations, we found that Bruvo's distance generally performs poorly, while the new metrics capture the differences in the genetic structure of the populations.
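To make the contrast between the two mutation models concrete, the sketch below compares two deliberately simple dissimilarities for haploid SSR profiles. It is an illustrative assumption only: these are neither the new indices developed in the study nor Bruvo's distance. Under an IAM-style view only allele identity matters, so each locus contributes 0 or 1; under an SMM-style view the absolute difference in repeat number is used as a proxy for the number of mutational steps. The function names and the encoding of a profile as a list of repeat counts are hypothetical.

```python
def iam_dissimilarity(p, q):
    """Proportion of loci at which two haploid profiles carry different alleles."""
    if len(p) != len(q) or not p:
        raise ValueError("profiles must be non-empty and of equal length")
    return sum(a != b for a, b in zip(p, q)) / len(p)


def smm_dissimilarity(p, q):
    """Mean absolute difference in repeat number across loci."""
    if len(p) != len(q) or not p:
        raise ValueError("profiles must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)


# Made-up repeat counts at three loci, for illustration only.
ref = [10, 12, 15]
near = [10, 14, 15]   # differs by 2 repeats at one locus
far = [10, 20, 15]    # differs by 8 repeats at the same locus

iam_near, iam_far = iam_dissimilarity(ref, near), iam_dissimilarity(ref, far)
smm_near, smm_far = smm_dissimilarity(ref, near), smm_dissimilarity(ref, far)
```

In this toy case the IAM-style measure scores `near` and `far` as equally distant from `ref`, whereas the SMM-style measure separates them, which is one way the choice of mutation model can change an inferred structure.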

Most large mammals have constantly been exposed to anthropogenic influence over decades or even centuries. Because of their long generation times and lack of sampling material, inferences of past population genetic dynamics, including anthropogenic impacts, have only relied on the analysis of the structure of extant populations. Here, we investigate for the first time the change in the genetic constitution of a natural red deer population over two centuries, using up to 200‐year‐old antlers (30 generations) stored in trophy collections. To the best of our knowledge, this is the oldest DNA source ever used for microsatellite population genetic analyses. We demonstrate that government policy and hunting laws may have strong impacts on populations that can lead to unexpectedly rapid changes in the genetic constitution of a large mammal population. A high ancestral individual polymorphism seen in an outbreeding population (1813–1861) was strongly reduced in descendants (1923–1940) during the mid‐19th and early 20th century by genetic bottlenecks. Today (2011), individual polymorphism and variance among individuals is increasing in a constant‐sized (managed) population. Differentiation was high among periods (FST > ***); consequently, assignment tests assigned individuals to their own period with >85% probability. In contrast to the high variance observed at nuclear microsatellite loci, mtDNA (D‐loop) was monomorphic through time, suggesting that male immigration dominates the genetic evolution in this population.  相似文献   

Perhaps the oldest unresolved debate in conservation genetics is whether genetic variability matters – in other words, whether relatively low average genetic variation contributes to deficits in individual and population level vigor and fitness. Using a statistically powerful paired sampling design in which each of three pairs of populations consisted of one high genetic variability and one low genetic variability population from a particular subspecies of the pocket gopher, Thomomys bottae, we tested the hypothesis that individuals from populations with lower genetic variability have lower growth rates (a commonly used surrogate for fitness) than those from populations with higher variability. We measured genetic variability using average allozyme heterozygosity and two measures of DNA fingerprint band sharing (Jeffreys 33.15 and MS1 probes). The population rankings of the levels of genetic variability among the three measures were concordant. The least squares mean growth rate (controlling for sex, subspecies and initial mass) of gophers from low variability populations (0.41 ± 0.06 g/day, n = 48) was less than half that of gophers from high variability populations (1.04 ± 0.07 g/day, n = 45). This result lends credence to the premise that differences in population level genetic variability have significant fitness consequences and underscores the importance of maintaining genetic variability in managed populations.

Low genetic variation is often considered to contribute to the extinction of species when they reach small population sizes. In this study we examined the mitochondrial control region from museum specimens of the Heath Hen (Tympanuchus cupido cupido), which went extinct in 1932. Today, the closest living relatives of the Heath Hen, the Greater (T. c. pinnatus), Attwater’s (T. c. attwateri) and Lesser (T. pallidicinctus) Prairie-chicken, are declining throughout most of their range in Midwestern North America, and loss of genetic variation is a likely contributor to their decline. Here we show that 30 years prior to their extinction, Heath Hens had low levels of mitochondrial genetic variation when compared with contemporary populations of prairie-chickens. Furthermore, some current populations of Greater Prairie-chickens are isolated and losing genetic variation due to drift. We estimate that these populations will reach the low levels of genetic variation found in Heath Hens within the next 40 years. Genetic variation and fitness can be restored with translocation of individuals from other populations; however, we also show that choosing an appropriate source population for translocation can be difficult without knowledge of historic population bottlenecks and their effect on genetic structure.  相似文献   

We developed a spatially explicit model of a bioinvasion and used an approximate Bayesian computation (ABC) framework to make various inferences from a combination of genetic (microsatellite genotypes), historical (first observation dates) and geographical (spatial coordinates of introduction and sampled sites) information. Our method aims to discriminate between alternative introduction scenarios and to estimate posterior densities of demographically relevant parameters of the invasive process. The performance of our landscape-ABC method is assessed using simulated data sets differing in their information content (genetic and/or historical data). We apply our methodology to the recent introduction and spatial expansion of the cane toad, Bufo marinus, in northern Australia. We find that, at least in the context of cane toad invasion, historical data are more informative than genetic data for discriminating between introduction scenarios. However, the combination of historical and genetic data provides the most accurate estimates of demographic parameters. For the cane toad, we find some evidence for a strong bottleneck prior to introduction, a small initial number of founder individuals (about 15), a large population growth rate (about 400% per generation), a standard deviation of dispersal distance of 19 km per generation and a high invasion speed at equilibrium (50 km per year). Our approach strengthens the application of the ABC method to the field of bioinvasion by allowing statistical inferences to be made on the introduction and the spatial expansion dynamics of invasive species using a combination of various relevant sources of information.  相似文献   

Jinliang Wang. Molecular Ecology, 2014, 23(13): 3191-3213
Coupled with rapid developments of efficient genetic markers, powerful population genetic methods were proposed to estimate migration rates (m) in natural populations on much broader spatial and temporal scales than the traditional mark‐release‐recapture (MRR) methods. Highly polymorphic (e.g. microsatellites) and genome‐wide (e.g. SNPs) markers provide sufficient information to assign individuals to their populations or parents of origin and thereby to estimate m directly in a way similar to MRR. Such direct estimates of current migration rates are particularly useful in understanding the ecology and microevolution of wild populations and in managing the populations in the future. In this study, I proposed and implemented, in the software MigEst, a likelihood method to use marker‐based parentage assignments in jointly estimating m and candidate parent sampling proportions (x) in a subset of populations. I investigated its power and accuracy using data simulated in various scenarios of population properties (e.g. the actual m, number, size and differentiation of populations) and sampling properties (e.g. the numbers of sampled parent candidates, offspring and markers), compared it with the population assignment approach implemented in the software BayesAss, and demonstrated its usefulness by analysing a microsatellite data set from three natural populations of Brazilian bats. Simulations showed that MigEst provides unbiased and accurate estimates of m and performs better than BayesAss except when populations are highly differentiated with very small and ecologically insignificant migration rates. A valuable property of MigEst is that in the presence of unsampled populations, it gives good estimates of the rate of migration among sampled populations as well as of the rate of migration into each sampled population from the pooled unsampled populations.
