Similar articles
 20 similar articles found (search time: 436 ms)
1.
A short tandem repeat-based phylogeny for the human Y chromosome   Total citations: 9, self-citations: 0, citations by others: 9
Human Y-chromosomal short tandem repeat (STR) data provide a potential model system for the understanding of autosomal STR mutations in humans and other species. Yet, the reconstruction of STR evolution is rarely attempted, because of the absence of an appropriate methodology. We here develop and validate a phylogenetic-network approach. We have typed 256 Y chromosomes of indigenous descent from Africa, Asia, Europe, Australia, and highland Papua New Guinea, for the STR loci DYS19, DXYS156Y, DYS389, DYS390, DYS392, and DYS393, as well as for five ancient biallelic mutation events: two poly (A) length variants associated with the YAP insertion, two independent SRY-1532 mutations, and the 92R7 mutation. We have used our previously published pedigree data from 11,000 paternity-tested autosomal STR-allele transfers to produce a two-class weighting system for the Y-STR loci that is based on locus lengths and motif lengths. Reduced-median-network analysis yields a phylogeny that is independently supported by the five biallelic mutations, with an error of 6%. We find the earliest branch in our African San (Bushmen) sample. Assuming an age of 20,000 years for the Native American DYS199 T mutation, we estimate a mutation rate of 2.6 × 10^-4 mutations/20 years for slowly mutating Y STRs, approximately 10-fold slower than the published average pedigree rate.

2.
We report the discovery of an African American Y chromosome that carries the ancestral state of all SNPs that defined the basal portion of the Y chromosome phylogenetic tree. We sequenced ∼240 kb of this chromosome to identify private, derived mutations on this lineage, which we named A00. We then estimated the time to the most recent common ancestor (TMRCA) for the Y tree as 338 thousand years ago (kya) (95% confidence interval = 237–581 kya). Remarkably, this exceeds current estimates of the mtDNA TMRCA, as well as those of the age of the oldest anatomically modern human fossils. The extremely ancient age combined with the rarity of the A00 lineage, which we also find at very low frequency in central Africa, point to the importance of considering more complex models for the origin of Y chromosome diversity. These models include ancient population structure and the possibility of archaic introgression of Y chromosomes into anatomically modern humans. The A00 lineage was discovered in a large database of consumer samples of African Americans and has not been identified in traditional hunter-gatherer populations from sub-Saharan Africa. This underscores how the stochastic nature of the genealogical process can affect inference from a single locus and warrants caution during the interpretation of the geographic location of divergent branches of the Y chromosome phylogenetic tree for the elucidation of human origins.

3.
Tang H  Siegmund DO  Shen P  Oefner PJ  Feldman MW 《Genetics》2002,161(1):447-459
This article proposes a method of estimating the time to the most recent common ancestor (TMRCA) of a sample of DNA sequences. The method is based on the molecular clock hypothesis, but avoids assumptions about population structure. Simulations show that in a wide range of situations, the point estimate has small bias and the confidence interval has at least the nominal coverage probability. We discuss conditions that can lead to biased estimates. Performance of this estimator is compared with existing methods based on the coalescence theory. The method is applied to sequences of Y chromosomes and mtDNAs to estimate the coalescent times of human male and female populations.

4.
We estimate an effective mutation rate at an average Y chromosome short-tandem repeat locus as 6.9 × 10^-4 per 25 years, with a standard deviation across loci of 5.7 × 10^-4, using data on microsatellite variation within Y chromosome haplogroups defined by unique-event polymorphisms in populations with documented short-term histories, as well as comparative data on worldwide populations at both the Y chromosome and various autosomal loci. This value is used to estimate the times of the African Bantu expansion, the divergence of Polynesian populations (the Maoris, Cook Islanders, and Samoans), and the origin of Gypsy populations from Bulgaria.

5.
This study examines the genetic variation in Basque Y chromosome lineages using data on 12 Y-short tandem repeat (STR) loci in a sample of 158 males from four Basque provinces of Spain (Alava, Vizcaya, Guipuzcoa, and Navarre). As reported in previous studies, the Basques are characterized by high frequencies of haplogroup R1b (83%). AMOVA analysis demonstrates genetic homogeneity, with a small but significant amount of genetic structure between provinces (Y-STR loci: 1.71%, p = 0.0369). Gene and haplotype diversity levels in the Basque population are on the low end of the European distribution (gene diversity: 0.4268; haplotype diversity: 0.9421). Post-Neolithic contribution to the paternal Basque gene pool was estimated by measuring the proportion of those haplogroups with a Time to Most Recent Common Ancestor (TMRCA) previously dated either prior (R1b, I2a2) or subsequent to (E1b1b, G2a, J2a) the Neolithic. Based on these estimates, the Basque provinces show varying degrees of post-Neolithic contribution in the paternal lineages (10.9% in the combined sample).

6.
Mitochondrial DNA data have been used extensively to study evolution and early human origins. These applications require estimates of the rate at which nucleotide substitutions occur in the DNA sequence. We consider the problem of estimating substitution rates in the presence of site-to-site rate variation. A coalescent model is presented that allows for different substitution rates for purines and pyrimidines, as well as more detailed models that allow fast and slow rates within each of the purine and pyrimidine classes. A method for estimating such rates is presented. Even for these simple models of site heterogeneity, there are, typically, insufficient data to obtain reliable estimates of site-specific substitution rates. However, estimates of the average rate across all sites appear to be relatively stable even in the presence of site heterogeneity. Simulations of models with site-to-site variation in mutation rate show that hypervariable sites can produce peaks in the pairwise difference curves that have previously been attributed to population dynamics.

7.
Inference of intraspecific population divergence patterns typically requires genetic data for molecular markers with relatively high mutation rates. Microsatellites, or short tandem repeat (STR) polymorphisms, have proven informative in many such investigations. These markers are characterized, however, by high levels of homoplasy and varying mutational properties, often leading to inaccurate inference of population divergence. A SNPSTR is a genetic system that consists of an STR polymorphism closely linked (typically < 500 bp) to one or more single-nucleotide polymorphisms (SNPs). SNPSTR systems are characterized by lower levels of homoplasy than are STR loci. Divergence time estimates based on STR variation (on the derived SNP allele background) should, therefore, be more accurate and precise. We use coalescent-based simulations in the context of several models of demographic history to compare divergence time estimates based on SNPSTR haplotype frequencies and STR allele frequencies. We demonstrate that estimates of divergence time based on STR variation on the background of a derived SNP allele are more accurate (3% to 7% bias for SNPSTR versus 11% to 20% bias for STR) and more precise than STR-based estimates, conditional on a recent SNP mutation. These results hold even for models involving complex demographic scenarios with gene flow, population expansion, and population bottlenecks. Varying the timing of the mutation event generating the SNP revealed that estimates of divergence time are sensitive to SNP age, with more recent SNPs giving more accurate and precise estimates of divergence time. However, varying both mutational properties of STR loci and SNP age demonstrated that multiple independent SNPSTR systems provide less biased estimates of divergence time. Furthermore, the combination of estimates based separately on STR and SNPSTR variation provides insight into the age of the derived SNP alleles. 
In light of our simulations, we interpret estimates from data for human populations.

8.
In 10,844 parent/child allelic transfers at nine short-tandem-repeat (STR) loci, 23 isolated STR mismatches were observed. The parenthood in each of these cases was highly validated (probability >99.97%). The event was always repeat related, owing to either a single-step mutation (n=22) or a double-step mutation (n=1). The mutation rate was between 0 and 7 × 10^-3 per locus per gamete per generation. No mutations were observed in three of the nine loci. Mutation events in the male germ line were five to six times more frequent than in the female germ line. A positive exponential correlation between the geometric mean of the number of uninterrupted repeats and the mutation rate was observed. Our data demonstrate that mutation rates of different loci can differ by several orders of magnitude and that different alleles at one locus exhibit different mutation rates.

9.

Background

Polymorphic Y chromosome short tandem repeats (STRs) have been widely used in population genetic and evolutionary studies. Compared to di-, tri-, and tetranucleotide repeats, STRs with longer repeat units occur more rarely and are far less commonly used.

Principal Findings

In order to study the evolutionary dynamics of STRs according to repeat unit size, we analysed variation at 24 Y chromosome repeat loci: 1 tri-, 14 tetra-, 7 penta-, and 2 hexanucleotide loci. According to our results, penta- and hexanucleotide repeats have approximately two times lower repeat variance and diversity than tri- and tetranucleotide repeats, indicating that their mutation rate is about half of that of tri- and tetranucleotide repeats. Thus, STR markers with longer repeat units are more robust in distinguishing Y chromosome haplogroups and, in some cases, phylogenetic splits within established haplogroups.

Conclusions

Our findings suggest that Y chromosome STRs of increased repeat unit size have a lower rate of evolution, which has significant relevance in population genetic and evolutionary studies.
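The step from "about half the repeat variance" to "about half the mutation rate" in the abstract above can be written out as a short worked relation. This is only a sketch under assumptions that the abstract does not state: a symmetric single-step stepwise mutation model, and the same genealogical depth t for both classes of loci (plausible because all loci sit on the non-recombining Y chromosome). The symbols V, mu, and t are introduced here for illustration.

```latex
% Assumptions (not from the abstract): symmetric single-step stepwise
% mutation; both locus classes share the same time depth t in generations.
\[
  \mathbb{E}[V_{\ell}] \approx \mu_{\ell}\, t
  \quad\Longrightarrow\quad
  \frac{V_{\text{penta/hexa}}}{V_{\text{tri/tetra}}}
  \approx
  \frac{\mu_{\text{penta/hexa}}}{\mu_{\text{tri/tetra}}}
  \approx \frac{1}{2}
\]
```

Because t cancels in the ratio, the comparison does not require knowing the age of the lineages, only that both locus classes were sampled from the same chromosomes.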

10.
Nonrecombinant portions of the genome, Y chromosome and mitochondrial DNA, are widely used for research on human population gene pools and reconstruction of their history. These systems allow the genetic dating of clusters of emerging haplotypes. The main method for age estimations is ρ statistics, which is an average number of mutations from founder haplotype to all modern-day haplotypes. A researcher can estimate the age of the cluster by multiplying this number by the mutation rate. The second method of estimation, ASD, is used for STR haplotypes of the Y chromosome and is based on the squared difference in the number of repeats. In addition to the methods of calculation, methods of Bayesian modeling assume a new significance. They have greater computational cost and complexity, but they allow obtaining an a posteriori distribution of the value of interest that is the most consistent with experimental data. The mutation rate must be known for both calculation methods and modeling methods. It can be determined either during the analysis of lineages or by providing calibration points based on populations with known formation time. These two approaches resulted in rate estimations for Y-chromosomal STR haplotypes with threefold difference. This contradiction was only recently refuted through the use of sequence data for the complete Y chromosome; “whole-genomic” rates of single nucleotide mutations obtained by both methods are mutually consistent and mark the area of application for different rates of STR markers. An issue even more crucial than that of the rates is correlation of the reconstructed history of the haplogroup (a cluster of haplotypes) and the history of the population. Although the need for distinguishing “lineage history” and “population history” arose in the earliest days of phylogeographic research, reconstructing the population history using genetic dating requires a number of methods and conditions. 
It is known that population history events leave distinct traces in the history of haplogroups only under certain demographic conditions. Directly equating the history of a nation with the history of the haplogroups found in it is inappropriate and is avoided in population genetic studies, although its simplicity and attractiveness make it a constant temptation for researchers. DNA genealogy, an amateur field that has gone beyond the borders of even citizen science, consistently equates haplogroup with lineage and population, which leads to absurd results (e.g., Eurasia as the origin of humankind); it can serve as a warning against a simplified approach to interpreting genetic dating results.
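The two calculation methods named in the abstract above, ρ statistics and ASD, can be sketched for Y-STR haplotypes as follows. This is a minimal illustration, not the authors' procedure: the function names, the toy haplotypes, and the convention that the rate is given in mutations per locus per generation (so the age is obtained by dividing by the rate) are all assumptions made here. The rate of 6.9 × 10^-4 per 25 years is borrowed from entry 4 purely as an example value.

```python
def rho_and_asd(founder, haplotypes):
    """Return (rho, asd) for STR haplotypes relative to a founder haplotype.

    rho: average number of repeat steps from the founder to each haplotype,
         summed over loci (assumes single-step mutations).
    asd: average squared difference in repeat number, averaged over both
         haplotypes and loci.
    """
    n = len(haplotypes)
    n_loci = len(founder)
    steps = sum(abs(a - f) for h in haplotypes for a, f in zip(h, founder))
    squares = sum((a - f) ** 2 for h in haplotypes for a, f in zip(h, founder))
    return steps / n, squares / (n * n_loci)


def ages_in_years(rho, asd, n_loci, mu, years_per_generation):
    """Convert rho and ASD to ages, with mu in mutations/locus/generation."""
    age_rho = rho / (n_loci * mu) * years_per_generation
    age_asd = asd / mu * years_per_generation
    return age_rho, age_asd


# Toy data (hypothetical): repeat counts at three loci.
founder = (14, 13, 24)
sample = [(14, 13, 24), (15, 13, 24), (14, 12, 24), (14, 13, 26)]

rho, asd = rho_and_asd(founder, sample)
age_rho, age_asd = ages_in_years(rho, asd, n_loci=3, mu=6.9e-4,
                                 years_per_generation=25)
print(f"rho = {rho}, ASD = {asd}")
print(f"age (rho) ~ {age_rho:.0f} years, age (ASD) ~ {age_asd:.0f} years")
```

In this toy sample the single two-step difference pulls the ASD estimate above the ρ estimate, which shows how sensitive the squared-difference statistic is to multi-step changes.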

11.
Mutations that have recently increased in frequency by positive natural selection are an important component of naturally occurring variation that affects fitness. To identify such variants, we developed a method to test for recent selection by estimating the age of an allele from the extent of haplotype sharing at linked sites. Neutral coalescent simulations are then used to determine the likelihood of this age given the allele's observed frequency. We applied this method to a common disease allele, the hemochromatosis-associated HFE C282Y mutation. Our results allow us to reject neutral models incorporating plausible human demographic histories for HFE C282Y and one other young but common allele, indicating positive selection at HFE or a linked locus. This method will be useful for scanning the human genome for alleles under selection using the haplotype map now being constructed.

12.
Recently a candidate gene for the primary testis-determining factor (TDF) encoding a zinc finger protein (ZFY) has been cloned from the human Y chromosome. A highly homologous X-linked copy has also been identified. Using this human sequence it is possible to identify two Y loci, an X and an autosomal locus in the mouse (Zfy-1, Zfy-2, Zfx and Zfa, respectively). Surprisingly ZFY is more homologous to the mouse X and autosomal sequences than it is to either of the Y-linked loci. Both Zfy-1 and Zfy-2 are present in the Sxr region of the Y but Zfy-2 is absent in the Sxr deletion variant Sxrb (or Sxr") suggesting it is not necessary for male determination. Extensive backcross analyses map Zfa to mouse chromosome 10 and Zfx to a 5-cM interval between anonymous X probe MDXS120 and the tabby locus (Ta). We also show that the mouse androgen receptor locus (m-AR) believed to underlie the testicular feminization mutation (Tfm) shows complete linkage to Zfx. Comparative mapping indicates that in man these genes lie in separate conserved DNA segments.

13.
How important is DNA replication for mutagenesis?   Total citations: 4, self-citations: 0, citations by others: 4
Rates of mutation and substitution in mammals are generally greater in the germ lines of males. This is usually explained as resulting from the larger number of germ cell divisions during spermatogenesis compared with oogenesis, with the assumption made that mutations occur primarily during DNA replication. However, the rate of cell division is not the only difference between male and female germ lines, and mechanisms are known that can give rise to mutations independently of DNA replication. We investigate the possibility that there are other causes of male-biased mutation. First, we show that patterns of variation at approximately 5,200 short tandem repeat (STR) loci indicate a higher mutation rate in males. We estimate a ratio of male-to-female mutation rates of approximately 1.9. This is significantly greater than 1 and supports a greater rate of mutation in males, affecting the evolution of these loci. Second, we show that there are chromosome-specific patterns of nucleotide and dinucleotide composition in mammals that have been shaped by mutation at CpG dinucleotides. Comparable patterns occur in birds. In mammals, male germ lines are more methylated than female germ lines, and these patterns indicate that differential methylation has played a role in male-biased vertebrate evolution. However, estimates of male mutation bias obtained from both classes of mutation are substantially lower than estimates of cell division bias from anatomical data. This discrepancy, along with published data indicating slipped-strand mispairing arising at STR loci in nonreplicating DNA, suggests that a substantial percentage of mutation may occur in nonreplicating DNA.

14.
Huang Yujie  Liu Cong  Xiao Chao  Chen Xiaoying  Han Xueli  Yi Shaohua  Huang Daixin 《Molecular biology reports》2021,48(6):5363-5369

Short tandem repeats (STRs) have been extensively used in forensic genetics. However, according to previous studies, the mutation rates of STRs are relatively high and are affected by many factors. Therefore, it is important to analyze STR mutations and determine the influence of underlying factors on STR mutation rates. Mutation rates of 28 autosomal STRs were determined from 8708 paternity testing cases in the Chinese Han population, and the relationships between STR mutation rates and population, sex, age, allele length and heterozygosity were investigated. A total of 279 mutations were observed at 27 loci in a total of 233,530 meioses, including 273 (97.8%) one-step, 5 (1.8%) two-step and 1 (0.4%) three-step mutations. The overall average mutation rate was 1.19 × 10^-3 (95% CI 1.06 × 10^-3 to 1.34 × 10^-3), ranging from 0 (TPOX) to 2.79 × 10^-3 (D13S325). Mutation rate comparisons revealed statistically significant differences at several STRs among populations. Paternal mutations occurred more frequently than maternal mutations, at a ratio of 6.04:1, and the mutation rate tended to increase with paternal age. Moreover, our study revealed a bias towards contraction mutations for long alleles and expansion mutations for short alleles. No obvious bias was observed in the overall mutation direction. In addition, STR loci with higher expected heterozygosity (Hexp) tended to have higher mutation rates. This work revealed the relationships between STR mutation rates and several influencing factors, providing useful data and information for further research on STR mutations in forensic genetics.
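The overall rate quoted in the abstract above follows directly from the two counts it reports (279 mutations in 233,530 meioses). The sketch below recomputes the point estimate and attaches a 95% interval. The abstract does not say which interval method the authors used; the Wilson score interval here is an assumption, chosen because it behaves well for small proportions, and the function name is hypothetical.

```python
import math


def mutation_rate_with_wilson_ci(mutations, meioses, z=1.96):
    """Return (rate, lower, upper) using the Wilson score interval.

    mutations: number of observed mutation events
    meioses:   number of allele transfers examined
    z:         normal quantile (1.96 for a two-sided 95% interval)
    """
    p = mutations / meioses
    denom = 1 + z * z / meioses
    center = (p + z * z / (2 * meioses)) / denom
    half = z * math.sqrt(p * (1 - p) / meioses
                         + z * z / (4 * meioses ** 2)) / denom
    return p, center - half, center + half


# Counts taken from the abstract.
rate, lower, upper = mutation_rate_with_wilson_ci(279, 233_530)
print(f"rate = {rate:.2e}, 95% CI = {lower:.2e} to {upper:.2e}")
```

With these counts the interval comes out close to the range reported in the abstract, although other interval methods (for example an exact binomial interval) would give similar values and may be what the authors actually used.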

15.
The origin of the Kerala non tribal population has been a matter of contention for centuries. While some claim that Negritos were the first inhabitants, some historians suggest a Dravidian origin for all Keralites. The aim of our study has been to provide sufficient scientific evidence based on Y chromosome short tandem repeat (Y STR) analysis for tracing the paternal lineage and also to create a database of the Y STR haplotype of the male population for future forensic analysis. Whole blood samples (n = 168) were collected from unrelated healthy men of the Kerala non-tribal population over a period of 2 years from October 2009. Genomic DNA was extracted by salting out method. All samples were genotyped for the 17 Y STR loci by the AmpFLSTR Y-filer PCR Amplification Kit. The haplotype and allele frequencies were determined by direct counting and analyzed using Arlequin 3.1 software, and molecular variance was calculated with the Y chromosome haplotype reference database online analysis tool. Haplotype diversity was calculated using HaPYDive. The majority of haplotypes were unique (149/168). The variant allele 17.1 was observed in DYS 385 loci in three samples. Fifteen samples (8.93%) showed the presence of alleles that are not within the established marker range denoted as outside marker range (OMR). The allele frequency of Kerala non tribal population ranged from 0.00003 to 0.5809. The most polymorphic single locus marker was DYS 458. The haplotype diversity value for Kerala non tribal population was 0.9978. The pairwise difference value ranged from 0.0531 to 0.0854 on comparison of the haplotypes of the Kerala non tribals with other Indian populations. The multi dimensional scaling plot depicted the proximity of Kerala non tribal population with Vasterbotten population (Swedish) and Paiwan, Patyal population of Taiwan, Thailand, and Zhuang population of China. 
The results of the study point towards a European paternal lineage in the non tribal Kerala population.
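Haplotype diversity values such as the 0.9978 quoted above are commonly computed with Nei's unbiased estimator, h = n/(n-1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i among n sampled chromosomes. The abstract does not state the formula used, so this is an assumption, and the sketch below does not attempt to reproduce the reported value because the underlying haplotype counts are not given. The function name and the toy haplotypes are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of hashable haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy sample (hypothetical): five chromosomes typed at three Y-STR loci,
# two of which share the same haplotype.
sample = [
    (14, 13, 24),
    (14, 13, 24),
    (15, 13, 24),
    (14, 12, 25),
    (16, 13, 23),
]
print(round(haplotype_diversity(sample), 4))
```

The estimator reaches 1 when every sampled haplotype is distinct and 0 when all are identical, which is why populations with mostly unique haplotypes score close to 1.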

16.
Deleterious Background Selection with Recombination   Total citations: 23, self-citations: 10, citations by others: 13
R. R. Hudson  N. L. Kaplan 《Genetics》1995,141(4):1605-1617
An analytic expression for the expected nucleotide diversity is obtained for a neutral locus in a region with deleterious mutation and recombination. Our analytic results are used to predict levels of variation for the entire third chromosome of Drosophila melanogaster. The predictions are consistent with the low levels of variation that have been observed at loci near the centromeres of the third chromosome of D. melanogaster. However, the low levels of variation observed near the tips of this chromosome are not predicted using currently available estimates of the deleterious mutation rate and of selection coefficients. If considerably smaller selection coefficients are assumed, the low observed levels of variation at the tips of the third chromosome are consistent with the background selection model.

17.
Estimating the age of alleles by use of intraallelic variability.   Total citations: 9, self-citations: 6, citations by others: 3
A method is presented for estimating the age of an allele by use of its frequency and the extent of variation among different copies. The method uses the joint distribution of the number of copies in a population sample and the coalescence times of the intraallelic gene genealogy conditioned on the number of copies. The linear birth-death process is used to approximate the dynamics of a rare allele in a finite population. A maximum-likelihood estimate of the age of the allele is obtained by Monte Carlo integration over the coalescence times. The method is applied to two alleles at the cystic fibrosis (CFTR) locus, deltaF508 and G542X, for which intraallelic variability at three intronic microsatellite loci has been examined. Our results indicate that G542X is somewhat older than deltaF508. Although absolute estimates depend on the mutation rates at the microsatellite loci, our results support the hypothesis that deltaF508 arose < 500 generations (approximately 10,000 years) ago.

18.
Zhang K  Rosenberg NA 《Genetics》2007,177(4):2109-2122
When a microsatellite locus is duplicated in a diploid organism, a single pair of PCR primers may amplify as many as four distinct alleles. To study the evolution of a duplicated microsatellite, we consider a coalescent model with symmetric stepwise mutation. Conditional on the time of duplication and a mutation rate, both in a model of completely unlinked loci and in a model of completely linked loci, we compute the probabilities for a sampled diploid individual to amplify one, two, three, or four distinct alleles with one pair of microsatellite PCR primers. These probabilities are then studied to examine the nature of their dependence on the duplication time and the mutation rate. The mutation rate is observed to have a stronger effect than the duplication time on the four probabilities, and the unlinked and linked cases are seen to behave similarly. Our results can be useful for helping to interpret genetic variation at microsatellite loci in species with a very recent history of gene and genome duplication.

19.
Analysis of complete mitochondrial genome sequences is becoming increasingly common in genetic studies. The availability of full genome datasets enables an analysis of the information content distributed throughout the mitochondrial genome in order to optimize the research design of future evolutionary studies. The goal of our study was to identify informative regions of the human mitochondrial genome using two criteria: (1) accurate reconstruction of a phylogeny and (2) consistent estimates of time to most recent common ancestor (TMRCA). We created two series of datasets by deleting individual genes of varied length and by deleting 10 equal-size fragments throughout the coding region. Phylogenies were statistically compared to the full-coding-region tree, while coalescent methods were used to estimate the TMRCA and associated credible intervals. Individual fragments important for maintaining a phylogeny similar to the full-coding-region tree encompassed bp 577-2122 and 11,399-16,023, including all or part of 12S rRNA, 16S rRNA, ND4, ND5, ND6, and cytb. The control region only tree was the most poorly resolved with the majority of the tree manifest as an unresolved polytomy. Coalescent estimates of TMRCA were less sensitive to removal of any particular fragment(s) than reconstruction of a consistent phylogeny. Overall, we discovered that half the genome, i.e., bp 3669-11,398, could be removed with no significant change in the phylogeny (p(AU)=0.077) while still maintaining overlap of TMRCA 95% credible intervals. Thus, sequencing a contiguous fragment from bp 11,399 through the control region to bp 3668 would create a dataset that optimizes the information necessary for phylogenetic and coalescent analyses and also takes advantage of the wealth of data already available on the control region.

20.
Carbone I  Liu YC  Hillman BI  Milgroom MG 《Genetics》2004,166(4):1611-1629
Genealogy-based methods were used to estimate migration of the fungal virus Cryphonectria hypovirus 1 between vegetative compatibility types of the host fungus, Cryphonectria parasitica, as a means of estimating horizontal transmission within two host populations. Vegetative incompatibility is a self/non-self recognition system that inhibits virus transmission under laboratory conditions but its effect on transmission in nature has not been clearly demonstrated. Recombination within and among different loci in the virus genome restricted the genealogical analyses to haplotypes with common mutation and recombinational histories. The existence of recombination necessitated that we also use genealogical approaches that can take advantage of both the mutation and recombinational histories of the sample. Virus migration between populations was significantly restricted. In contrast, estimates of migration between vegetative compatibility types were relatively high within populations despite previous evidence that transmission in the laboratory was restricted. The discordance between laboratory estimates and migration estimates from natural populations highlights the challenges in estimating pathogen transmission rates. Genealogical analyses inferred migration patterns throughout the entire coalescent history of one viral region in natural populations and not just recent patterns of migration or laboratory transmission. This application of genealogical analyses provides markedly stronger inferences on overall transmission rates than laboratory estimates do.
