Similar literature
 Found 20 similar documents (search time: 984 ms)
1.
In this paper we consider the genealogy of a random sample of n chromosomes from a panmictic population which has evolved with constant size N over many generations. We address two related problems. First we describe how genealogical information may be usefully partitioned into information on the events (mutations and coalescences) which occur in the genealogy, and the times between these events. We show that the distribution of the times given information on the events is particularly simple and describe how this can considerably reduce the computational burden when performing inference for these times. Second we investigate the effect on the genealogy of conditioning on a single mutation having occurred during the ancestry of the sample. In particular we use results from the first part of the paper to derive explicit formulae for the density of the age of a mutant allele, conditional on its frequency in either a sample or the population.

2.
Identification and study of genetic variation in recently admixed populations not only provides insight into historical population events but also is a powerful approach for mapping disease loci. We studied a population (OG-W-IP) that is of African-Indian origin and has resided in the western part of India for 500 years; members of this population are believed to be descendants of the Bantu-speaking population of Africa. We have carried out this study by using a set of 18,534 autosomal markers common between Indian, CEPH-HGDP, and HapMap populations. Principal-components analysis clearly revealed that the African-Indian population derives its ancestry from Bantu-speaking west-African as well as Indo-European-speaking north and northwest Indian population(s). STRUCTURE and ADMIXTURE analyses show that, overall, the OG-W-IPs derive 58.7% of their genomic ancestry from their African past and have very little inter-individual ancestry variation (8.4%). The extent of linkage disequilibrium also reveals that the admixture event has been recent. Functional annotation of genes encompassing the ancestry-informative markers that are closer in allele frequency to the Indian ancestral population revealed significant enrichment of biological processes, such as ion-channel activity, and cadherins. We briefly examine the implications of determining the genetic diversity of this population, which could provide opportunities for studies involving admixture mapping.

3.
The origins of the inhabitants of Madagascar have not been fully resolved. Anthropological studies and preliminary genetic data point to two main sources of ancestry of the Malagasy, namely, Indonesian and African, with additional contributions from India and Arabia. The sickle-cell (beta s) mutation is found in populations of African and Indian origin. The frequency of the beta s-globin gene, derived from 1,425 Malagasy individuals, varies from 0 in some highland populations to .25 in some coastal populations. The beta s mutation is thought to have arisen at least five times, on the basis of the presence of five distinct beta s-associated haplotypes, each found in a separate geographic area. Twenty-five of the 35 Malagasy beta s haplotypes were of the typical "Bantu" type, 1 "Senegal" haplotype was found, and 2 rare or atypical haplotypes were observed; the remaining 7 haplotypes were consistent with the Bantu haplotype. The Bantu beta s mutation is thought to have been introduced into Madagascar by Bantu-speaking immigrants (colonists or slaves) from central or east Africa. The Senegal beta s mutation may have been introduced to the island via Portuguese naval explorers. This study provides the first definitive biological evidence that a major component of Malagasy ancestry is derived from African populations, in particular, Bantu-speaking Negroids. beta A haplotypes are also consistent with the claim for a significant African contribution to Malagasy ancestry but are also suggestive of Asian/Oceanic and Caucasoid admixture within the Malagasy population.

4.
How natural selection acts to limit the proliferation of transposable elements (TEs) in genomes has been of interest to evolutionary biologists for many years. To describe TE dynamics in populations, previous studies have used models of transposition–selection equilibrium that assume a constant rate of transposition. However, since TE invasions are known to happen in bursts through time, this assumption may not be reasonable. Here we propose a test of neutrality for TE insertions that does not rely on the assumption of a constant transposition rate. We consider the case of TE insertions that have been ascertained from a single haploid reference genome sequence. By conditioning on the age of an individual TE insertion allele (inferred by the number of unique substitutions that have occurred within the particular TE sequence since insertion), we determine the probability distribution of the insertion allele frequency in a population sample under neutrality. Taking models of varying population size into account, we then evaluate predictions of our model against allele frequency data from 190 retrotransposon insertions sampled from North American and African populations of Drosophila melanogaster. Using this nonequilibrium neutral model, we are able to explain ∼80% of the variance in TE insertion allele frequencies based on age alone. Controlling for both nonequilibrium dynamics of transposition and host demography, we provide evidence for negative selection acting against most TEs as well as for positive selection acting on a small subset of TEs. Our work establishes a new framework for the analysis of the evolutionary forces governing large insertion mutations like TEs, gene duplications, or other copy number variants.

5.
In this study, we explore the geographic and temporal distribution of a unique variant of the O blood group allele called O1vG542A, which has been shown to be shared among Native Americans but is rare in other populations. O1vG542A was previously reported in Native American populations in Mesoamerica and South America, and has been proposed as an ancestry informative marker. We investigated whether this allele is also found in the Tlingit and Haida, two contemporary indigenous populations from Alaska, and a pre‐Columbian population from California. If O1vG542A is present in Na‐Dene speakers (i.e., Tlingits), it would indicate that Na‐Dene speaking groups share close ancestry with other Native American groups and support a Beringian origin of the allele, consistent with the Beringian Incubation Model. If O1vG542A is found in pre‐Columbian populations, it would further support a Beringian origin of the allele, rather than a more recent introduction of the allele into the Americas via gene flow from one or more populations which have admixed with Native Americans over the past five centuries. We identified this allele in one Na‐Dene population at a frequency of 0.11, and one ancient California population at a frequency of 0.20. Our results support a Beringian origin of O1vG542A, which is distributed today among all Native American groups that have been genotyped in appreciable numbers at this locus. This result is consistent with the hypothesis that Na‐Dene and other Native American populations primarily derive their ancestry from a single source population. Am J Phys Anthropol 151:649–657, 2013. © 2013 Wiley Periodicals, Inc.

6.
Genetic profile of cosmopolitan populations: effects of hidden subdivision   Cited by: 1 (self-citations: 0, citations by others: 1)
Natural populations of many organisms exhibit an excess of rare alleles in comparison with the predictions of the neutral mutation hypothesis. It has been shown before that either a population bottleneck or the presence of slightly deleterious mutations can explain this phenomenon. A third explanation is presented in this work, showing that hidden subdivision within a population can also lead to an excess of rare alleles in the total population when the expectations of the neutral model are based on the allele frequency profile of the entire population data. With two examples (mitochondrial DNA-morph distribution and isozyme allele frequency distributions), it is shown that most cosmopolitan human populations exhibit an excess of rare as well as total allele counts, when these are compared with the expectations of the neutral mutation hypothesis. The mitochondrial data demonstrate that such excesses can be detected from genetic variation at a single locus as well, and this is not due to stochastic error of allele frequency distributions. Contrast of the present observations with the allele frequency profiles in agglomerated tribal populations from South and Central America shows that even when the neutral expectations hold for individual subpopulations, if all subpopulations are grouped into a single population, the pooled data exhibit an excess of total number of alleles that is mainly due to the excess of rare alleles. Therefore, a primary cause of the excess number of rare alleles could be the hidden subdivision, and the magnitude of the excess indicates the extent of substructuring. The two components of hidden subdivision are: 1) the number of subpopulations, and 2) the average genetic distance among them. The implications of this observation in estimating mutation rate are discussed, indicating the difficulties of comparing mutation rates from different population surveys.

7.
Admixture between populations originating on different continents can be exploited to detect disease susceptibility loci at which risk alleles are distributed differentially between these populations. We first examine the statistical power and mapping resolution of this approach in the limiting situation in which gamete admixture and locus ancestry are measured without uncertainty. We show that, for a rare disease, the most efficient design is to study affected individuals only. In a typical African American population (two-way admixture proportions 0.8/0.2, ancestry crossover rate 2 per 100 cM), a study of 800 affected individuals has 90% power to detect at P values <10⁻⁵ a locus that generates a risk ratio of 2 between populations, with an expected mapping resolution (size of 95% confidence region for the position of the locus) of 4 cM. In practice, to infer locus ancestry from marker data requires Bayesian computationally intensive methods, as implemented in the program ADMIXMAP. Affected-only study designs require strong prior information on the frequencies of each allele given locus ancestry. We show how data from unadmixed and admixed populations can be combined to estimate these ancestry-specific allele frequencies within the admixed population under study, allowing for variation between allele frequencies in unadmixed and admixed populations. Using simulated data based on the genetic structure of the African American population, we show that 60% of information can be extracted in a test for linkage using markers with an ancestry information content of 36% at 3-cM spacing. As in classic linkage studies, the most efficient strategy is to use markers at a moderate density for an initial genome search and then to saturate regions of putative linkage with additional markers, to extract nearly all information about locus ancestry.

8.
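Using stratified cluster sampling, we took finger and palm prints from 1,183 Tibetan adolescents, analysed their dermatoglyphic parameters, and then ran a cluster analysis against the dermatoglyphic parameters of 56 other populations in order to explore the origin of the Tibetans from a dermatoglyphic perspective. Whorls were the predominant fingerprint pattern among Tibetans (52.89%), followed by loops (42.95%), with arches the least frequent (4.16%). The total finger ridge count was 139.01 (144.75 in males and 133.87 in females), and the atd angle was 42.95° in males and 43.28° in females. The cluster analysis showed that Tibetans group with northern Chinese populations such as the Han and the descendants of the Di-Qiang (Monba, Pumi, Qiang, and others). From a dermatoglyphic perspective, we therefore infer that Tibetans are closely related to the Han and the Di-Qiang and more distantly related to Indians and Bengalis.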

9.
Genetic studies have identified substantial non-African admixture in the Horn of Africa (HOA). In the most recent genomic studies, this non-African ancestry has been attributed to admixture with Middle Eastern populations during the last few thousand years. However, mitochondrial and Y chromosome data are suggestive of earlier episodes of admixture. To investigate this further, we generated new genome-wide SNP data for a Yemeni population sample and merged these new data with published genome-wide genetic data from the HOA and a broad selection of surrounding populations. We used multidimensional scaling and ADMIXTURE methods in an exploratory data analysis to develop hypotheses on admixture and population structure in HOA populations. These analyses suggested that there might be distinct, differentiated African and non-African ancestries in the HOA. After partitioning the SNP data into African and non-African origin chromosome segments, we found support for a distinct African (Ethiopic) ancestry and a distinct non-African (Ethio-Somali) ancestry in HOA populations. The African Ethiopic ancestry is tightly restricted to HOA populations and likely represents an autochthonous HOA population. The non-African ancestry in the HOA, which is primarily attributed to a novel Ethio-Somali inferred ancestry component, is significantly differentiated from all neighboring non-African ancestries in North Africa, the Levant, and Arabia. The Ethio-Somali ancestry is found in all admixed HOA ethnic groups, shows little inter-individual variance within these ethnic groups, is estimated to have diverged from all other non-African ancestries by at least 23 ka, and does not carry the unique Arabian lactase persistence allele that arose about 4 ka. Taking into account published mitochondrial, Y chromosome, paleoclimate, and archaeological data, we find that the time of the Ethio-Somali back-to-Africa migration is most likely pre-agricultural.

10.
Detailed information about the geographic distribution of genetic and genomic variation is necessary to better understand the organization and structure of biological diversity. In particular, spatial isolation within species and hybridization between them can blur species boundaries and create evolutionary relationships that are inconsistent with a strictly bifurcating tree model. Here, we analyse genome‐wide DNA sequence and genetic ancestry variation in Lycaeides butterflies to quantify the effects of admixture and spatial isolation on how biological diversity is organized in this group. We document geographically widespread and pervasive historical admixture, with more restricted recent hybridization. This includes evidence supporting previously known and unknown instances of admixture. The genome composition of admixed individuals varies much more among than within populations, and tree‐ and genetic ancestry‐based analyses indicate that multiple distinct admixed lineages or populations exist. We find that most genetic variants in Lycaeides are rare (minor allele frequency <0.5%). Because the spatial and taxonomic distributions of alleles reflect demographic and selective processes since mutation, rare alleles, which are presumably younger than common alleles, were spatially and taxonomically restricted compared with common variants. Thus, we show patterns of genetic variation in this group are multifaceted, and we argue that this complexity challenges simplistic notions concerning the organization of biological diversity into discrete, easily delineated and hierarchically structured entities.

11.
Analyses of evolution and maintenance of quantitative genetic variation depend on the mutation models assumed. Currently two polygenic mutation models have been used in theoretical analyses. One is the random walk mutation model and the other is the house-of-cards mutation model. Although in the short term the two models give similar results for the evolution of neutral genetic variation within and between populations, the predictions of the changes of the variation are qualitatively different in the long term. In this paper a more general mutation model, called the regression mutation model, is proposed to bridge the gap of the two models. The model regards the regression coefficient, γ, of the effect of an allele after mutation on the effect of the allele before mutation as a parameter. When γ = 1 or 0, the model becomes the random walk model or the house-of-cards model, respectively. The additive genetic variances within and between populations are formulated for this mutation model, and some insights are gained by looking at the changes of the genetic variances as γ changes. The effects of γ on the statistical test of selection for quantitative characters during macroevolution are also discussed. The results suggest that the random walk mutation model should not be interpreted as a null hypothesis of neutrality for testing against alternative hypotheses of selection during macroevolution because it can potentially allocate too much variation for the change of population means under neutrality.
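
As a rough illustration of how the parameter γ interpolates between the two models, the sketch below assumes the simplest linear reading of the abstract: the effect after mutation equals γ times the effect before mutation plus an independent Gaussian increment with standard deviation sigma. The function names `mutate` and `variance_after`, the Gaussian increment, and the parameter `sigma` are assumptions made for illustration and are not taken from the paper.

```python
import random


def mutate(effect, gamma, sigma, rng):
    """One mutation under the assumed linear form: gamma * effect + epsilon."""
    epsilon = rng.gauss(0.0, sigma)  # assumed Gaussian mutational increment
    return gamma * effect + epsilon


def variance_after(n_mutations, gamma, sigma):
    """Variance of an allelic effect after n mutations from a fixed start.

    Follows from Var(x') = gamma^2 * Var(x) + sigma^2 under the assumed form.
    """
    var = 0.0
    for _ in range(n_mutations):
        var = gamma ** 2 * var + sigma ** 2
    return var


if __name__ == "__main__":
    rng = random.Random(0)
    print(mutate(1.0, 0.5, 1.0, rng))
    for gamma in (0.0, 0.5, 1.0):
        print(gamma, variance_after(10, gamma, 1.0))
```

Under these assumptions, γ = 1 lets the variance grow by sigma² with every mutation (random walk), while γ = 0 keeps it fixed at sigma² because each new effect forgets the old one (house of cards); intermediate values converge to sigma² / (1 − γ²).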

12.
We report a study of genome-wide, dense SNP (∼900K) and copy number polymorphism data of indigenous southern Africans. We demonstrate the genetic contribution to southern and eastern African populations, which involved admixture between indigenous San, Niger-Congo-speaking populations, and populations of Eurasian ancestry. This finding illustrates the need to account for stratification in genome-wide association studies, and that admixture mapping would likely be a successful approach in these populations. We developed a strategy to detect the signature of selection prior to and following putative admixture events. Several genomic regions show an unusual excess of Niger-Kordofanian, and unusual deficiency of both San and Eurasian ancestry, which were considered the footprints of selection after population admixture. Several SNPs with strong allele frequency differences were observed predominantly between the admixed indigenous southern African populations and their ancestral Eurasian populations. Interestingly, many candidate genes, which were identified within the genomic regions showing signals for selection, were associated with southern African-specific high-risk, mostly communicable diseases, such as malaria, influenza, tuberculosis, and human immunodeficiency virus/AIDS. This observation suggests a potentially important role that these genes might have played in adapting to the environment. Additionally, our analyses of haplotype structure, linkage disequilibrium, recombination, copy number variation and genome-wide admixture highlight and support the unique position of San relative to both African and non-African populations. This study contributes to a better understanding of population ancestry and selection in south-eastern African populations; and the data and results obtained will support research into the genetic contributions to infectious as well as non-communicable diseases in the region.

13.
Natural selection is a significant force that shapes the architecture of the human genome and introduces diversity across global populations. The question of whether advantageous mutations have arisen in the human genome as a result of single or multiple mutation events remains unanswered except for the fact that there exist a handful of genes such as those that confer lactase persistence, affect skin pigmentation, or cause sickle cell anemia. We have developed a long-range-haplotype method for identifying genomic signatures of positive selection to complement existing methods, such as the integrated haplotype score (iHS) or cross-population extended haplotype homozygosity (XP-EHH), for locating signals across the entire allele frequency spectrum. Our method also locates the founder haplotypes that carry the advantageous variants and infers their corresponding population frequencies. This presents an opportunity to systematically interrogate the whole human genome whether a selection signal shared across different populations is the consequence of a single mutation process followed subsequently by gene flow between populations or of convergent evolution due to the occurrence of multiple independent mutation events either at the same variant or within the same gene. The application of our method to data from 14 populations across the world revealed that positive-selection events tend to cluster in populations of the same ancestry. Comparing the founder haplotypes for events that are present across different populations revealed that convergent evolution is a rare occurrence and that the majority of shared signals stem from the same evolutionary event.

14.
The genetic diversity and population structure of a population of African lions in Hwange National Park, Zimbabwe, was studied using 17 microsatellite loci. Spatial genetic analysis using Bayesian methods suggested a weak genetic structure within the population and high levels of gene flow across the study area. We were able to identify a few individuals with aberrant or admixed ancestry, which we interpreted as either immigrants or as descendants thereof. This, together with relatively high genetic diversity, suggests that immigrants from beyond the study area have influenced the genetic structure within the park. We suggest that the levels of genetic diversity and the observed weak structure are indicative of the large and viable Okavango-Hwange population of which our study population is a part. Genetic patterns can also be attributed to still existing high levels of habitat connectivity between protected areas. Given expected increases in human populations and anthropogenic impacts, efforts to identify and maintain existing movement corridors between regional lion populations will be important in retaining the high genetic diversity status of this population. Our results show that understanding existing levels of genetic diversity and genetic connectivity has implications, not only for this lion population, but also for managing large wild populations of carnivores.

15.
A highly polymorphic CAG repeat locus, ERDA1, was recently described on human chromosome 17q21.3, with alleles as large as 50-90 repeats and without any disease association in the general population. We have studied allelic distribution at this locus in five human populations and have characterized the mutational patterns by direct observation of 731 meioses. The data show that large alleles (≥40 CAG repeats) are generally most common in Asian populations, less common in populations of European ancestry, and least common among Africans. We have observed a high intergenerational instability (46.3% ± 5.1%) of the large alleles. Although the mutation rate is not dependent on parental sex, paternal transmissions have predominantly resulted in contractions, whereas maternal transmissions have yielded expansions. Within this class of large alleles, the mutation rate increases concomitantly with increasing allele size, but the magnitude of repeat size change does not depend on the size of the progenitor allele. Sequencing of specific alleles reveals that the intermediate-sized alleles (30-40 repeats) have CAT/CAC interruptions within the CAG-repeat array. These results indicate that expansion and instability of trinucleotide repeats are not exclusively disease-associated phenomena. The implications of the existence of massively expanded alleles in the general populations are not yet understood.

16.
Ala100Thr has been suggested to be a Caucasian genetic marker on the FY*B allele. As the Brazilian population has arisen from miscegenation among Portuguese, Africans, and Indians, this mutation could possibly be found in Euro- and Afro-Brazilians, or in Brazilian Indians. Fifty-three related individuals and a random sample of 100 subjects from the Brazilian population were investigated using the polymerase chain reaction and four restriction fragment length polymorphisms. Confirming the working hypothesis, among the related individuals three Afro-Brazilians (two of them a mother and daughter) and a woman of Amerindian descent had the Ala100Thr mutation on the FY*B allele. Five non-related Euro-Brazilians also carried the mutation. All nine individuals presented the Fy(a-b+) phenotype. We conclude that the Ala100Thr mutation can occur in populations other than Caucasians and that this mutation does not affect Duffy expression on red blood cells. Gene frequencies for this allele in the non-related individuals were in agreement with those of other populations. The Duffy frequencies of two Amerindian tribes were also investigated.

17.
Interest in studying the effects of inbreeding in natural populations has increased in recent years. Several microsatellite-derived metrics have recently been developed to infer inbreeding from multilocus heterozygosity data without requiring detailed pedigrees that are difficult to obtain in open populations. Internal relatedness (IR) is currently the most widely used index and its main attribute is that allele frequency is incorporated into the measure. However, IR underestimates heterozygosity of individuals carrying rare alleles. For example, descendants of immigrants paired with natives (normally more outbred) bearing novel or rare alleles would be considered more homozygous than descendants of native parents. Thus, the analogy that is generally drawn between homozygosity and inbreeding would not hold in those cases. We propose an alternative index, homozygosity by loci (HL), that avoids such problems by weighing the contribution of each locus to the homozygosity index depending on its allelic variability. Under a wide range of simulated scenarios, we found that our index (HL) correlated better than both IR and uncorrected homozygosity (H_O, measured as the proportion of homozygous loci) with genome-wide homozygosity and inbreeding coefficients in open populations. In these populations, which are likely to prevail in nature, the use of HL instead of IR considerably reduced the sample sizes required to achieve a given statistical power. This is likely to have important consequences on the ability to detect heterozygosity fitness correlations assuming the relationship between genome-wide heterozygosity and fitness traits.
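
The locus-weighting idea can be sketched as follows. The abstract says only that each locus is weighted by its allelic variability; this sketch assumes the weight is the locus's expected heterozygosity and that HL is the weighted share of loci at which the individual is homozygous. The function names, the input layout, and the example numbers are hypothetical.

```python
def proportion_homozygous(genotype):
    """Uncorrected homozygosity: share of typed loci that are homozygous."""
    return sum(a == b for a, b in genotype) / len(genotype)


def homozygosity_by_loci(genotype, expected_het):
    """Variability-weighted homozygosity (assumed weighting, see lead-in).

    genotype:     list of (allele_1, allele_2) tuples, one per locus
    expected_het: expected heterozygosity of each locus, in the same order
    """
    hom = sum(e for (a, b), e in zip(genotype, expected_het) if a == b)
    het = sum(e for (a, b), e in zip(genotype, expected_het) if a != b)
    return hom / (hom + het)  # undefined if every locus has zero variability


if __name__ == "__main__":
    # Hypothetical individual: homozygous at a highly variable locus,
    # heterozygous at a locus with little variability.
    genotype = [("A", "A"), ("B", "C")]
    expected_het = [0.9, 0.3]
    print(proportion_homozygous(genotype))
    print(homozygosity_by_loci(genotype, expected_het))
```

In this toy case the unweighted index treats both loci equally, whereas the weighted index counts homozygosity at the more variable locus more heavily, which is the behaviour the abstract describes.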

18.
In populations of northern European ancestry, hereditary hemochromatosis (HH) is tightly linked to mutations within the hemochromatosis gene (HFE gene). Over 93% of Irish HH patients are homozygous for the HFE gene C282Y mutation, providing a reliable diagnostic marker of the disease in this population. However, the prevalence of the C282Y mutation and that of the second HFE gene mutation, H63D, have yet to be determined within the Irish population. The objective of this study was to identify the true prevalence of the genetic form of HH in the Irish population. DNA was extracted from 1002 randomly selected newborn screening cards and analyzed for the C282Y and H63D mutations within the HFE gene. Complete results were obtained from 800 cards. Mutations were identified in 364 (46%) neonates. Eight (1%) neonates were homozygous for C282Y and 8 (1%) were homozygous for H63D. One hundred and fifty-five (19%) neonates were C282Y heterozygous and 226 (28%) were H63D heterozygous. Of these, 33 (4%) carried one copy of both C282Y and H63D mutations, i.e., compound heterozygous. Allele frequencies for C282Y and H63D were 11% and 15%, respectively. The high C282Y allele frequency in the Irish population together with its close linkage to HH indicate that C282Y genotyping is the preferred screening strategy for this disease in Ireland.
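
The reported allele frequencies can be checked directly from the genotype counts. The short sketch below does that arithmetic; it assumes the heterozygote counts (155 and 226) each include the 33 compound heterozygotes, which is the reading under which the carrier total comes to the stated 364. The function and variable names are illustrative only.

```python
def allele_frequency(n_homozygous, n_heterozygous, n_individuals):
    """Allele frequency by gene counting: two copies per homozygote,
    one per heterozygote, out of 2 * n_individuals chromosomes."""
    return (2 * n_homozygous + n_heterozygous) / (2 * n_individuals)


N_CARDS = 800  # cards with complete results

c282y_freq = allele_frequency(8, 155, N_CARDS)  # 171 of 1600 chromosomes
h63d_freq = allele_frequency(8, 226, N_CARDS)   # 242 of 1600 chromosomes

# Neonates with at least one mutation; compound heterozygotes are counted
# in both heterozygote groups, so they are subtracted once.
carriers = 8 + 8 + 155 + 226 - 33

if __name__ == "__main__":
    print(f"C282Y: {c282y_freq:.1%}, H63D: {h63d_freq:.1%}")
    print(f"carriers: {carriers} of {N_CARDS} ({carriers / N_CARDS:.1%})")
```

Rounded to whole percentages, these values agree with the 11%, 15%, and 46% quoted in the abstract.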

19.
The extent to which natural selection shapes diversity within populations is a key question for population genetics. Thus, there is considerable interest in quantifying the strength of selection. A full likelihood approach for inference about selection at a single site within an otherwise neutral fully linked sequence of sites is described here. A coalescent model of evolution is used to model the ancestry of a sample of DNA sequences which have the selected site segregating. The mutation model, for the selected and neutral sites, is the infinitely many-sites model where there is no back or parallel mutation at sites. A unique perfect phylogeny, a gene tree, can be constructed from the configuration of mutations on the sample sequences under this model of mutation. The approach is general and can be used for any bi-allelic selection scheme. Selection is incorporated through modelling the frequency of the selected and neutral allelic classes stochastically back in time, then using a subdivided population model considering the population frequencies through time as variable population sizes. An importance sampling algorithm is then used to explore over coalescent tree space consistent with the data. The method is applied to a simulated data set and the gene tree presented in Verrelli et al. (2002).

20.
Li WH. Genetics 1978, 90(2): 349-382
Formulae are developed for the distribution of allele frequencies (the frequency spectrum), the mean number of alleles in a sample, and the mean and variance of heterozygosity under mutation pressure and under either genic or recessive selection. Numerical computations are carried out by using these formulae and Watterson's (1977) formula for the distribution of allele frequencies under overdominant selection. The following properties are observed: (1) The effect of selection on the distribution of allele frequencies is slight when 4Ns
