Similar literature
20 similar documents found
1.
distruct: a program for the graphical display of population structure   Total citations: 3, self-citations: 0, citations by others: 3
In analysis of multilocus genotypes from structured populations, individual coefficients of membership in subpopulations are often estimated using programs such as structure. distruct provides a general method for visualizing these estimated membership coefficients. Subpopulations are represented as colours, and individuals are depicted as bars partitioned into coloured segments that correspond to membership coefficients in the subgroups. distruct, available at http://www.cmb.usc.edu/~noahr/distruct.html, can also be used to display subpopulation assignment probabilities when individuals are assumed to have ancestry in only one group.
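
As a rough illustration of the bar layout described in this abstract (not distruct's own code or input format), the following Python sketch turns one individual's membership coefficients into the boundaries of stacked segments, one per subpopulation. The function name `bar_segments` and the example values are hypothetical.

```python
from itertools import accumulate


def bar_segments(coefficients):
    """Return (cluster_index, start, end) for one individual's stacked bar.

    coefficients: membership coefficients, one per cluster. They are
    rescaled to sum to 1 so that the bar always spans the interval [0, 1].
    """
    total = sum(coefficients)
    if total <= 0:
        raise ValueError("membership coefficients must sum to a positive value")
    normalized = [c / total for c in coefficients]
    ends = list(accumulate(normalized))   # running sum gives each segment's end
    starts = [0.0] + ends[:-1]            # each segment starts where the last ended
    return [(k, s, e) for k, (s, e) in enumerate(zip(starts, ends))]


# Hypothetical individual with ancestry in three clusters.
for cluster, start, end in bar_segments([0.25, 0.5, 0.25]):
    print(f"cluster {cluster}: {start:.2f} to {end:.2f}")
```

Each tuple can then be passed to any plotting library as one coloured segment. An individual assigned to a single group is simply the special case of one coefficient equal to 1.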

3.
Argentine population genetic structure was examined using a set of 78 ancestry informative markers (AIMs) to assess the contributions of European, Amerindian, and African ancestry in 94 individual members of this population. Using the Bayesian clustering algorithm STRUCTURE, the mean European contribution was 78%, the Amerindian contribution was 19.4%, and the African contribution was 2.5%. Similar results were found using a weighted least mean square method: European, 80.2%; Amerindian, 18.1%; and African, 1.7%. Consistent with previous studies, the current results showed very few individuals (four of 94) with greater than 10% African admixture. Notably, when individual admixture was examined, the Amerindian and European admixture showed a very large variance, and individual Amerindian contribution ranged from 1.5 to 84.5% in the 94 individual Argentine subjects. These results indicate that admixture must be considered when clinical epidemiology or case control genetic analyses are studied in this population. Moreover, the current study provides a set of informative SNPs that can be used to ascertain or control for this potentially hidden stratification. In addition, the large variance in admixture proportions in individual Argentine subjects shown by this study suggests that this population is appropriate for future admixture mapping studies.
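
A minimal sketch of how per-individual admixture estimates of the kind described in this abstract could be summarized: the mean contribution of one ancestry, its variance and range, and the number of individuals above a threshold. The function name, the dictionary layout, and the example values are hypothetical and are not data from the study.

```python
from statistics import mean, pvariance


def summarize_ancestry(individuals, ancestry, threshold=0.10):
    """Summarize one ancestry component across a list of individuals.

    individuals: list of dicts mapping an ancestry label to a proportion
    between 0 and 1 (hypothetical layout chosen for this example).
    """
    values = [ind[ancestry] for ind in individuals]
    if not values:
        raise ValueError("no individuals supplied")
    return {
        "mean": mean(values),
        "variance": pvariance(values),   # population variance of the proportions
        "min": min(values),
        "max": max(values),
        "n_above": sum(1 for v in values if v > threshold),
    }


# Made-up proportions for three individuals.
sample = [
    {"European": 0.75, "Amerindian": 0.25, "African": 0.0},
    {"European": 0.5, "Amerindian": 0.25, "African": 0.25},
    {"European": 0.25, "Amerindian": 0.75, "African": 0.0},
]
print(summarize_ancestry(sample, "African"))
```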

4.
The computer program Structure implements a Bayesian method, based on a population genetics model, to assign individuals to their source populations using genetic marker data. It is widely applied in the fields of ecology, evolutionary biology, human genetics and conservation biology for detecting hidden genetic structures, inferring the most likely number of populations (K), assigning individuals to source populations and estimating admixture and migration rates. Recently, several simulation studies repeatedly concluded that the program yields erroneous inferences when samples from different populations are highly unbalanced in size. Analysing both simulated and empirical data sets, this study confirms that Structure indeed yields poor individual assignments to source populations and gives frequently incorrect estimates of K when sampling is unbalanced. However, this poor performance is mainly caused by the adoption of the default ancestry prior, which assumes all source populations contribute equally to the pooled sample of individuals. When the alternative ancestry prior, which allows for unequal representations of the source populations by the sample, is adopted, accurate individual assignments could be obtained even if sampling is highly unbalanced. The alternative prior also improves the inference of K by two estimators, albeit the improvement is not as much as that in individual assignments to populations. For the difficult case of many populations and unbalanced sampling, a rarely used parameter combination of the alternative ancestry prior, an initial ALPHA value much smaller than the default and the uncorrelated allele frequency model is required for Structure to yield accurate inferences. 
I conclude that Structure is easy to use but is easier to misuse because of its complicated genetic model and many parameter (prior) options which may not be obvious to choose, and suggest using multiple plausible models (parameters) and K estimators in conducting comparative and exploratory Structure analysis.

5.
Reproducibility is the benchmark for results and conclusions drawn from scientific studies, but systematic studies on the reproducibility of scientific results are surprisingly rare. Moreover, many modern statistical methods make use of ‘random walk’ model fitting procedures, and these are inherently stochastic in their output. Does the combination of these statistical procedures and current standards of data archiving and method reporting permit the reproduction of the authors' results? To test this, we reanalysed data sets gathered from papers using the software package structure to identify genetically similar clusters of individuals. We find that reproducing structure results can be difficult despite the straightforward requirements of the program. Our results indicate that 30% of analyses were unable to reproduce the same number of population clusters. To improve this, we make recommendations for future use of the software and for reporting structure analyses and results in published works.

6.
On the basis of simulated data, this study compares the relative performances of the Bayesian clustering computer programs structure, geneland, geneclust and a new program named tess. While these four programs can detect population genetic structure from multilocus genotypes, only the last three include simultaneous analysis of geographical data. The programs are compared with respect to their abilities to infer the number of populations, to estimate membership probabilities, and to detect genetic discontinuities and clinal variation. The results suggest that combining analyses using tess and structure offers a convenient way to address inference of spatial population structure.

7.
We studied 156 individuals of Native American descent from the city of Tlapa in the state of Guerrero in western Mexico. Most individuals' ethnicity was either Nahua, Mixtec, or Tlapanec, but self-identified Mestizos and individuals of mixed ethnicities were also included in the sample. We typed 24 autosomal, one Y-chromosome, and four mitochondrial ancestry-informative markers (AIMs) to estimate group and individual admixture proportions, and determine whether the admixture process involved directional gene flow between parental groups. When genetically defined (GD) Mestizos were excluded from the analysis, Native American ancestry represented approximately 98% of the population's gene pool, while European and West African ancestry represented approximately 1% each. Maternally inherited markers also showed an exceptionally high Native American contribution (98.5%), as did the paternally inherited marker, DYS199 (90.7%). We did not detect genetic structure in this population using these AIMs, which appears consistent with the homogeneity of the sample in terms of admixture proportions. The addition of GD Mestizos to the sample did not produce a considerable change in admixture estimates, but it had a major effect on population structure. These results show that the population of Tlapa in Guerrero, Mexico, has experienced little admixture with Europeans and/or West Africans. They also show that the impact of a small number of admixed individuals on an otherwise homogeneous population might have profound implications on subsequent ancestry/phenotype analysis and mapping strategies. We suggest that heterogeneity is a major characteristic of Mexican populations and, as a consequence, should not be disregarded when designing epidemiological studies of Mexican and Mexican American populations.

9.

Motivation

Correctly modeling population structure is important for understanding recent evolution and for association studies in humans. While pre-existing knowledge of population history can be used to specify expected levels of subdivision, objective metrics to detect population structure are important and may even be preferable for identifying groups in some situations. One such metric for genomic scale data is implemented in the cross-validation procedure of the program ADMIXTURE, but it has not been evaluated on recently diverged and potentially cryptic levels of population structure. Here, I develop a new method, AdmixKJump, and test both metrics under this scenario.

Findings

I show, using both realistic simulations and 1000 Genomes Project European genomic data, that AdmixKJump is more sensitive to recent population divisions than the cross-validation metric. With two populations of 50 individuals each, AdmixKJump is able to detect two populations with 100% accuracy that split at least 10 KYA, whereas cross-validation obtains this 100% level at 14 KYA. I also show that AdmixKJump is more accurate with fewer samples per population. Furthermore, in contrast to the cross-validation approach, AdmixKJump is able to detect the population split between the Finnish and Tuscan populations of the 1000 Genomes Project.

Conclusion

AdmixKJump has more power to detect the number of populations in a cohort of samples with smaller sample sizes and shorter divergence times.

Availability

A Java implementation can be found at https://sites.google.com/site/igsevolgenomicslab/home/downloads

10.
This article reviews recent developments in Bayesian algorithms that explicitly include geographical information in the inference of population structure. Current models substantially differ in their prior distributions and background assumptions, falling into two broad categories: models with or without admixture. To aid users of this new generation of spatially explicit programs, we clarify the assumptions underlying the models, and we test these models in situations where their assumptions are not met. We show that models without admixture are not robust to the inclusion of admixed individuals in the sample, thus providing an incorrect assessment of population genetic structure in many cases. In contrast, admixture models are robust to an absence of admixture in the sample. We also give statistical and conceptual reasons why data should be explored using spatially explicit models that include admixture.

11.
Genetic clustering algorithms require a certain amount of data to produce informative results. In the common situation that individuals are sampled at several locations, we show how sample group information can be used to achieve better results when the amount of data is limited. New models are developed for the structure program, both for the cases of admixture and no admixture. These models work by modifying the prior distribution for each individual's population assignment. The new prior distributions allow the proportion of individuals assigned to a particular cluster to vary by location. The models are tested on simulated data, and illustrated using microsatellite data from the CEPH Human Genome Diversity Panel. We demonstrate that the new models allow structure to be detected at lower levels of divergence, or with less data, than the original structure models or principal components methods, and that they are not biased towards detecting structure when it is not present. These models are implemented in a new version of structure which is freely available online at http://pritch.bsd.uchicago.edu/structure.html.

12.
The use of dominant markers such as amplified fragment length polymorphism (AFLP) for population genetics analyses is often impeded by the lack of appropriate computer programs and rarely motivated by objective considerations. The point of the present note is twofold: (i) we describe how the computer program Geneland, designed to infer population structure, has been adapted to deal with dominant markers; and (ii) we use Geneland for numerical comparison of dominant and codominant markers to perform clustering. AFLP markers lead to less accurate results than bi-allelic codominant markers such as single nucleotide polymorphism (SNP) markers, but this difference becomes negligible for data sets of common size (number of individuals n≥100, number of markers L≥200). The latest Geneland version (3.2.1) handling dominant markers is freely available as an R package with a fully clickable graphical interface. Installation instructions and documentation can be found on http://www2.imm.dtu.dk/~gigu/Geneland.

13.

Objectives

Since 2010, genome-wide data from hundreds of ancient Native Americans have contributed to the understanding of the Americas' prehistory. However, these samples have never been studied as a single dataset, and distinct relationships among themselves and with present-day populations may never have come to light. Here, we reassess genomic diversity and population structure of 223 ancient Native Americans published between 2010 and 2019.

Materials and Methods

The genomic data from the ancient Americas were merged with a worldwide reference panel of 278 present-day genomes from the Simons Genome Diversity Project and then analyzed through ADMIXTURE, D-statistics, PCA, t-SNE, and UMAP.

Results

We find largely similar population structures in ancient and present-day Americas. However, the population structure of contemporary Native Americans, traced here to at least 10,000 years before present, is noticeably less diverse than their ancient counterparts, a possible outcome of the European contact. Additionally, in the past there were greater levels of population structure in North than in South America, except for ancient Brazil, which harbors comparatively high degrees of structure. Moreover, we find a component of genetic ancestry in the ancient dataset that is closely related to that of present-day Oceanic populations but does not correspond to the previously reported Australasian signal. Lastly, we report an expansion of the Ancient Beringian ancestry, previously reported for only one sample.

Discussion

Overall, our findings support a complex scenario for the settlement of the Americas, accommodating the occurrence of founder effects and the emergence of ancestral mixing events at the regional level.

14.
The program structure has been used extensively to understand and visualize population genetic structure. It is one of the most commonly used clustering algorithms, cited over 11 500 times in Web of Science since its introduction in 2000. The method estimates ancestry proportions to assign individuals to clusters, and post hoc analyses of results may indicate the most likely number of clusters, or populations, on the landscape. However, as has been shown in this issue of Molecular Ecology Resources by Puechmaille (2016), when sampling is uneven across populations or across hierarchical levels of population structure, these post hoc analyses can be inaccurate and identify an incorrect number of population clusters. To solve this problem, Puechmaille (2016) presents strategies for subsampling and new analysis methods that are robust to uneven sampling to improve inferences of the number of population clusters.

16.
Chinese mitten crab (Eriocheir sinensis) has higher commercial value as a food source than any other species of Eriocheir in China. To evaluate the germplasm resources and characterize the genetic diversity and population structure of the crabs in different water systems, two stocks and two farming populations were assessed with 25 polymorphic microsatellite loci available in public GenBank. Basic statistics showed that the average observed heterozygosity (Ho) amongst populations ranged from 0.5789 to 0.6824. However, a remarkable presence of inbreeding and heterozygote deficiencies was observed. In the analysis of population structure, pairwise FST coefficients attributed only ~10.3% of the variability to the subdivision of mitten crab populations; the remaining variability stems from subdivision within subpopulations. Although the four populations had slight differentiation, different allelic frequencies resulted in distinct population structures. Two stocks and one farming population clustered together on the phylogenetic branch of the Yangtze crab, with an approximate membership of 95%, whereas the other farming population clustered singly on the phylogenetic branch of the Liaohe crab, with a membership of 97.1%. The tests for individual admixture showed that Yangtze crab had probably been contaminated with individuals from other water systems. Genetic relationships between populations also supported the conclusion that Yangtze crab and Liaohe crab had different gene pools in spite of belonging to the same species.

17.
Coalescent simulations were used to investigate the possible role of population subdivision and history in shaping nucleotide variation in a recombining 88-kb genomic fragment of Drosophila simulans displaying an unusual large-scale haplotype structure. The multilocus analysis, based on summary statistics using specific demographic null models under recombination, indicates that the observed levels of linkage disequilibrium differed significantly from the values expected under different bottleneck and population admixture scenarios. These results indicate that demography alone may not account for the observed pattern of variation and support the previous claim that the data are better described by a model in which an adaptive mutation has not yet gone to fixation.

18.
Glossophaga longirostris and Leptonycteris curasoae are nectar-feeding bats associated with arid zones in northern South America. Despite their close phylogenetic relationship, sympatric condition and niche similarities, morphological and ecological evidence suggest that these species differ in dispersal capabilities. Using mitochondrial DNA, we tested the hypothesis that these species exhibit different levels of population structure that are congruent with their particular movement capabilities. We sequenced a section of the control region of mtDNA for 41 G. longirostris and 42 L. curasoae from 11 zones in Venezuela. Population subdivision in G. longirostris (FST = 0.725) was considerably higher than in L. curasoae (FST = 0.167). L. curasoae individuals shared haplotypes at greater distances (812 km) than G. longirostris (592 km). Our results offer preliminary evidence for one of two possible scenarios, either greater mobility in L. curasoae or a higher degree of female philopatry in G. longirostris.
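
The abstract does not state which FST estimator was used. As an assumption for illustration only, the sketch below uses the classic diversity-based form, FST = (HT - HS) / HT, applied to haplotype frequencies with equal population weights. The function names and the example frequencies are hypothetical and are not taken from the study.

```python
def haplotype_diversity(freqs):
    """Probability that two randomly drawn haplotypes differ: 1 - sum(p^2)."""
    return 1.0 - sum(p * p for p in freqs)


def fst(pop_freqs):
    """FST = (HT - HS) / HT from per-population haplotype frequencies.

    pop_freqs: one frequency list per population, all listing the same
    haplotypes in the same order. Populations are weighted equally, which
    is a simplifying assumption of this sketch.
    """
    n = len(pop_freqs)
    if n < 2:
        raise ValueError("need at least two populations")
    # HS: mean within-population diversity.
    hs = sum(haplotype_diversity(f) for f in pop_freqs) / n
    # HT: diversity of the pooled (mean) haplotype frequencies.
    pooled = [sum(column) / n for column in zip(*pop_freqs)]
    ht = haplotype_diversity(pooled)
    if ht == 0:
        raise ValueError("no variation in the pooled sample; FST is undefined")
    return (ht - hs) / ht


# Two made-up populations that share haplotypes at different frequencies.
print(round(fst([[0.75, 0.25], [0.25, 0.75]]), 3))
```

Under this definition, populations fixed for different haplotypes give FST = 1 and populations with identical frequencies give FST = 0, which is the sense in which a higher value indicates stronger subdivision.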

19.
S T Kalinowski, Heredity, 2011, 106(4): 625-632
One of the primary goals of population genetics is to succinctly describe genetic relationships among populations, and the computer program STRUCTURE is one of the most frequently used tools for doing so. The mathematical model used by STRUCTURE was designed to sort individuals into Hardy–Weinberg populations, but the program is also frequently used to group individuals from a large number of populations into a small number of clusters that are supposed to represent the main genetic divisions within species. In this study, I used computer simulations to examine how well STRUCTURE accomplishes this latter task. Simulations of populations that had a simple hierarchical history of fragmentation showed that when there were relatively long divergence times within evolutionary lineages, the clusters created by STRUCTURE were frequently not consistent with the evolutionary history of the populations. These difficulties can be attributed to forcing STRUCTURE to place individuals into too few clusters. Simulations also showed that the clusters produced by STRUCTURE can be strongly influenced by variation in sample size. In some circumstances, STRUCTURE simply put all of the individuals from the largest sample in the same cluster. A reanalysis of human population structure suggests that the problems I identified with STRUCTURE in simulations may have obscured relationships among human populations, particularly genetic similarity between Europeans and some African populations.
