Similar articles
 20 similar articles found (search time: 15 ms)
1.
The frequency distribution of pairwise differences between sequences of mtDNA has recently been used to estimate the size of human populations before and after a hypothetical episode of rapid population growth and the time at which the population grew. To test the internal consistency of this method, we used three different sets of human mtDNA data and the corresponding demographic parameters estimated from the distribution of pairwise differences to determine by simulation the expected number of segregating sites, S, and its empirical distribution. The results indicate that the observed values of S are significantly lower than expected in two of three cases under the assumption of the infinite-sites model. Further simulations in which mutations were allowed to occur more than once at the same site and in which there was variation in mutation rate among sites show that the expected number of segregating sites can be much lower than under the infinite-sites assumption. Nevertheless, the observed value of S is still significantly different from the value expected under the expansion hypothesis in two of three cases.
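The two summaries compared in the abstract above can be computed directly from an alignment. The following is a minimal sketch, assuming equal-length aligned sequences with no gaps; the function names and the toy `sample` are illustrative and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(seqs):
    """Number of differing sites for every pair of aligned sequences."""
    return [sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)]


def mismatch_distribution(seqs):
    """Relative frequency of each pairwise-difference count."""
    diffs = pairwise_differences(seqs)
    counts = Counter(diffs)
    return {k: counts[k] / len(diffs) for k in sorted(counts)}


def segregating_sites(seqs):
    """Number of alignment columns that show more than one nucleotide."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


# Toy alignment (hypothetical data, for illustration only).
sample = ["ACGTACGT", "ACGTACGA", "ACCTACGA", "ACGTTCGT"]
print(mismatch_distribution(sample))
print(segregating_sites(sample))
```

A consistency check of the kind described would estimate demographic parameters from the first summary and then ask whether the second falls within its simulated distribution.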

2.
It is known that under neutral mutation at a known mutation rate a sample of nucleotide sequences, within which there is assumed to be no recombination, allows estimation of the effective size of an isolated population. This paper investigates the case of very long sequences, where each pair of sequences allows a precise estimate of the divergence time of those two gene copies. The average divergence time of all pairs of copies estimates twice the effective population number and an estimate can also be derived from the number of segregating sites. One can alternatively estimate the genealogy of the copies. This paper shows how a maximum likelihood estimate of the effective population number can be derived from such a genealogical tree. The pairwise and the segregating sites estimates are shown to be much less efficient than this maximum likelihood estimate, and this is verified by computer simulation. The result implies that there is much to gain by explicitly taking the tree structure of these genealogies into account.

3.
F. Tajima. Genetics, 1989, 123(3): 585-595
The relationship between the two estimates of genetic variation at the DNA level, namely the number of segregating sites and the average number of nucleotide differences estimated from pairwise comparison, is investigated. It is found that the correlation between these two estimates is large when the sample size is small, and decreases slowly as the sample size increases. Using the relationship obtained, a statistical method for testing the neutral mutation hypothesis is developed. This method needs only the data of DNA polymorphism, namely the genetic variation within population at the DNA level. A simple method of computer simulation, that was used in order to obtain the distribution of a new statistic developed, is also presented. Applying this statistical method to the five regions of DNA sequences in Drosophila melanogaster, it is found that large insertion/deletion (greater than 100 bp) is deleterious. It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.
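The test statistic introduced in this paper is now usually called Tajima's D. A minimal sketch of its commonly quoted form follows. It assumes that the inputs are the sample size n, the number of segregating sites S, and the mean number of pairwise differences per sequence; the function and argument names are illustrative.

```python
import math


def tajimas_d(n, num_segregating, mean_pairwise_diff):
    """Tajima's D from n, S and the mean number of pairwise differences."""
    # For n < 4 the two estimators coincide, so the variance term is zero.
    if n < 4 or num_segregating < 1:
        return float("nan")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    s = num_segregating
    # Estimated variance of the difference between the two estimators.
    variance = e1 * s + e2 * s * (s - 1)
    return (mean_pairwise_diff - s / a1) / math.sqrt(variance)


# Hypothetical inputs: 10 sequences, 16 segregating sites.
print(tajimas_d(10, 16, 3.888889))
```

A negative value means the pairwise estimate is smaller than the segregating-sites estimate, which corresponds to an excess of rare variants.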

4.
Coalescent theory is commonly used to perform population genetic inference at the nucleotide level. Here, we examine the procedure that fixes the number of segregating sites (henceforth the FS procedure). In this approach a fixed number of segregating sites (S) are placed on a coalescent tree (independently of the total and internode lengths of the tree). Thus, although widely used, the FS procedure does not strictly follow the assumptions of coalescent theory and must be considered an approximation of (i) the standard procedure that uses a fixed population mutation parameter θ, and (ii) procedures that condition on the number of segregating sites. We study the differences in the false positive rate for nine statistics by comparing the FS procedure with the procedures (i) and (ii), using several evolutionary models with single-locus and multilocus data. Our results indicate that for single-locus data the FS procedure is accurate for the equilibrium neutral model, but problems arise under the alternative models studied; furthermore, for multilocus data, the FS procedure becomes inaccurate even for the standard neutral model. Therefore, we recommend a procedure that fixes the value of θ (or alternatively, procedures that condition on S and take into account the uncertainty in θ) for analysing evolutionary models with multilocus data. With single-locus data, the FS procedure should not be employed for models other than the standard neutral model.
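The contrast between the fixed-θ and FS procedures can be sketched with a small simulation. This is an illustration under stated assumptions, not the authors' code: the genealogy is a standard Kingman coalescent with time in units of 2N generations, the FS variant is assumed to place each mutation on a branch chosen in proportion to its length, and all function names are hypothetical.

```python
import random


def simulate_branches(n, rng):
    """Simulate a coalescent genealogy for n samples.

    Returns (branch_length, number_of_sampled_descendants) pairs.
    """
    # Each active lineage: [sampled descendants, time the lineage started].
    lineages = [[1, 0.0] for _ in range(n)]
    branches = []
    t = 0.0
    while len(lineages) > 1:
        k = len(lineages)
        t += rng.expovariate(k * (k - 1) / 2)  # waiting time to next merger
        i, j = rng.sample(range(k), 2)
        a, b = lineages[i], lineages[j]
        branches.append((t - a[1], a[0]))
        branches.append((t - b[1], b[0]))
        lineages = [l for idx, l in enumerate(lineages) if idx not in (i, j)]
        lineages.append([a[0] + b[0], t])
    return branches


def mutations_fixed_theta(branches, theta, rng):
    """Fixed-theta procedure: Poisson(theta / 2 * length) mutations per branch."""
    spectrum = {}
    for length, size in branches:
        # Count the events of a Poisson process that fall within the branch.
        position = rng.expovariate(theta / 2)
        while position < length:
            spectrum[size] = spectrum.get(size, 0) + 1
            position += rng.expovariate(theta / 2)
    return spectrum


def mutations_fixed_s(branches, s, rng):
    """FS procedure: exactly s mutations, whatever the total tree length."""
    weights = [length for length, _ in branches]
    spectrum = {}
    for _, size in rng.choices(branches, weights=weights, k=s):
        spectrum[size] = spectrum.get(size, 0) + 1
    return spectrum


rng = random.Random(2024)
tree = simulate_branches(6, rng)
print(mutations_fixed_theta(tree, 5.0, rng))
print(mutations_fixed_s(tree, 7, rng))
```

Under the fixed-θ procedure the number of segregating sites varies with the tree length from replicate to replicate, whereas the FS procedure removes that variation; this is the approximation the abstract evaluates.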

5.
The nucleotide composition of the genome is a balance between the origin and fixation rates of different mutations. For example, it is well-known that transitions occur more frequently than transversions, particularly at CpG sites. Differences in fixation rates of mutation types are less explored. Specifically, recombination-associated GC-biased gene conversion (gBGC) may differentially impact GC-changing mutations, due to differences in their genomic distributions and efficiency of mismatch repair mechanisms. Given that recombination evolves rapidly across species, we explore gBGC of different mutation types across human populations and great ape species. We report a stronger correlation between segregating GC frequency and recombination for transitions than for transversions. Notably, CpG transitions are most strongly affected by gBGC in humans and chimpanzees. We show that the overall strength of gBGC is generally correlated with effective population sizes in humans, with some notable exceptions, such as a stronger effect of gBGC on non-CpG transitions in populations of European descent. Furthermore, species of the genera Gorilla and Pongo have a greatly reduced gBGC effect on CpG sites. We also study the dependence of gBGC dynamics on flanking nucleotides and show that some mutation types evolve in opposition to the gBGC expectation, likely due to the hypermutability of specific nucleotide contexts. Our results highlight the importance of different gBGC dynamics experienced by GC-changing mutations and their impact on nucleotide composition evolution.

6.
Properties of a neutral allele model with intragenic recombination   Cited by: 35 (self-citations: 0; citations by others: 35)
An infinite-site neutral allele model with crossing-over possible at any of an infinite number of sites is studied. A formula for the variance of the number of segregating sites in a sample of gametes is obtained. An approximate expression for the expected homozygosity is also derived. Simulation results are presented to indicate the accuracy of the approximations. The results concerning the number of segregating sites and the expected homozygosity indicate that a two-locus model and the infinite-site model behave similarly for 4Nu ≤ 2 and r ≤ 5u, where N is the population size, u is the neutral mutation rate, and r is the recombination rate. Simulations of a two-locus model and a four-locus model were also carried out to determine the effect of intragenic recombination on the homozygosity test of Watterson (Genetics 85, 789-814; 88, 405-417) and on the number of unique alleles in a sample. The results indicate that for 4Nu ≤ 2 and r ≤ 10u, the effect of recombination is quite small.

7.
Pálsson S. Hereditas, 2004, 141(1): 74-80
Deleterious mutations affect genetic variation at linked neutral loci. Neutral variation can be reduced due to background selection, but in small populations and with tight linkage such variation may increase due to associative overdominance. Here I report the results of computer simulations of diploid genotypes in small populations, where I look at the effect of deleterious mutations and linkage on comparisons of intra- and interspecific variation. Each chromosome consisted of 2000 loci where deleterious and neutral mutations occurred. The ratio of nonsynonymous to synonymous substitution rates (Ka/Ks) either increases with tight linkage or is unaffected, depending on the strength of selection. The ratio of the number of segregating mutations to the number of fixed mutations decreases under the conditions leading to background selection but can increase at tight linkage. Numbers of segregating sites (Sn) are less affected than nucleotide site diversity (π): π is reduced more than Sn at intermediate linkage, but increases more than Sn when linkage is tight. Effects similar to those found for Sn and π are observed for heterozygosity and variance in allele size of tandem repeat loci.

8.
F. Tajima. Genetics, 1996, 143(3): 1457-1465
The expectations of the average number of nucleotide differences per site (π), the proportion of segregating sites (s), the minimum number of mutations per site (s*) and some other quantities were derived under the finite site models with and without rate variation among sites, where the finite site models include Jukes and Cantor's model, the equal-input model and Kimura's model. As a model of rate variation, the gamma distribution was used. The results indicate that if the distribution parameter α is small, the effect of rate variation on these quantities is substantial, so that estimates of θ based on the infinite site model are substantial underestimates, where θ = 4Nv, N is the effective population size and v is the mutation rate per site per generation. New methods for estimating θ are also presented, which are based on the finite site models with and without rate variation. Using these methods, underestimation can be corrected.
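The direction of the bias can be illustrated for the simplest case, the Jukes and Cantor model without rate variation. The sketch below is a worked derivation, not the estimators of the paper. It assumes two sequences whose coalescence time is exponential with mean 2N generations, and that v is the total mutation rate per site with the three alternative nucleotides equally likely. These assumptions give E(π) = 3θ/(3 + 4θ), which inverts to θ = 3π/(3 - 4π). The function names are illustrative.

```python
def expected_pi_jc(theta):
    """Expected per-site diversity under Jukes-Cantor, equal rates across sites."""
    return 3.0 * theta / (3.0 + 4.0 * theta)


def theta_from_pi_jc(pi):
    """Invert the relation above; pi must lie below the saturation value 0.75."""
    if not 0.0 <= pi < 0.75:
        raise ValueError("pi must satisfy 0 <= pi < 0.75")
    return 3.0 * pi / (3.0 - 4.0 * pi)


# Hypothetical value of pi; the infinite-site estimate of theta is pi itself.
pi_observed = 0.1
print("infinite-site estimate:", pi_observed)
print("Jukes-Cantor estimate: ", theta_from_pi_jc(pi_observed))
```

Because repeated mutations at a site hide some differences, the corrected value always exceeds π, and the gap widens as π grows. Rate variation among sites, as modelled in the paper with a gamma distribution, is not covered by this sketch.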

9.
Mitochondrial DNA (mtDNA) sequences that include (a) a part of the cytochrome b gene, (b) two tRNA genes, and (c) a part of the noncoding D-loop region of 31 Anguilla japonica (Japanese eel) and 1 A. marmorata collected from Taiwan, Japan, and mainland China were determined to evaluate the population structure of Japanese eel. Among 30 genotypes identified from the 31 Japanese eel mtDNAs sequenced, there are 58 variable sites, predominantly clustered at the D-loop region. The phylogenetic tree constructed by the unweighted pair-group method with arithmetic mean shows neither significant genealogical branches nor geographic clusters. Furthermore, the sequence-statistics test reveals little, if any, significant genetic differentiation. These results indicate that the 31 Japanese eels might come from a single population. Analysis of sequence variation in mtDNA by using the relationship between the number of segregating sites and the average number of nucleotide differences under the neutral mutation hypothesis reveals that neutral mutation acts as a major factor influencing the evolutionary divergence of the Japanese eel mitochondrial genome sequenced, especially in the noncoding region.

10.
A. Pluzhnikov, P. Donnelly. Genetics, 1996, 144(3): 1247-1262
Two commonly used measures of genetic diversity for intraspecies DNA sequence data are based, respectively, on the number of segregating sites, and on the average number of pairwise nucleotide differences. Expressions are derived for their variance in the presence of intragenic recombination for a panmictic population of fixed size that is at neutral equilibrium at the region sequenced. We show that, in contrast to the slow decrease in variance with increasing sample size, if the recombination rate is nonzero, the asymptotic rate of decrease of variance with increasing sequence length, for fixed sample size, is quite rapid. In particular, it is close to that which would be obtained by sequencing independent chromosome regions. The correlation between measures of diversity from linked regions is also examined. For a given total number of bases sequenced in a particular region, optimal sequencing strategies are derived. These typically involve sequencing relatively few (three to 10) long copies of the region. Under optimal strategies, the variances of the two measures are very similar for most parameter values considered. Results concerning optimal sequencing strategies will be sensitive to gross departures from the underlying assumptions, such as population bottlenecks, selective sweeps, and substantial population substructure.

11.
In population genetics, under a neutral Wright-Fisher model, the scaling parameter θ = 4Nμ represents twice the average number of new mutants per generation, where N is the effective population size and μ is the mutation rate per sequence per generation. Watterson proposed a consistent estimator of this parameter based on the number of segregating sites in a sample of nucleotide sequences. We study the distribution of the Watterson estimator. Letting the sample size grow, we establish a central limit theorem for the Watterson estimator, which is asymptotically normal with a slow rate of convergence. We then prove the asymptotic efficiency of this estimator. In the second part, we illustrate the slow rate of convergence found in the central limit theorem. To this end, by studying the confidence intervals, we show that the asymptotic Gaussian distribution is not a good approximation for the Watterson estimator.
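The estimator discussed above divides the number of segregating sites S by the harmonic sum a_n = 1 + 1/2 + ... + 1/(n - 1). The sketch below computes it together with its standard deviation for a non-recombining locus, using the classical result Var(S) = a_n θ + b_n θ², where b_n is the corresponding sum of 1/i². The function names are illustrative, and the loop only illustrates how slowly the spread shrinks with n; it does not reproduce the paper's analysis.

```python
import math


def harmonic_terms(n):
    """Return a_n = sum of 1/i and b_n = sum of 1/i^2 for i = 1 .. n - 1."""
    a_n = sum(1.0 / i for i in range(1, n))
    b_n = sum(1.0 / i**2 for i in range(1, n))
    return a_n, b_n


def watterson_theta(num_segregating, n):
    """Watterson's estimate of theta from S and the sample size n."""
    a_n, _ = harmonic_terms(n)
    return num_segregating / a_n


def watterson_std_dev(theta, n):
    """Standard deviation of the estimator when there is no recombination."""
    a_n, b_n = harmonic_terms(n)
    return math.sqrt(theta / a_n + b_n * theta**2 / a_n**2)


# The spread shrinks only logarithmically as the sample grows.
for n in (10, 100, 1000, 10000):
    print(n, watterson_std_dev(5.0, n))
```

Since a_n grows like ln n, increasing the sample a thousandfold reduces the standard deviation only modestly, in line with the slow convergence the abstract reports.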

12.
This study addresses the question of how purifying selection operates during recent rapid population growth such as has been experienced by human populations. This is not a straightforward problem because the human population is not at equilibrium: population genetics predicts that, on the one hand, the efficacy of natural selection increases as population size increases, eliminating ever more weakly deleterious variants; on the other hand, a larger number of deleterious mutations will be introduced into the population and will be more likely to increase in their number of copies as the population grows. To understand how patterns of human genetic variation have been shaped by the interaction of natural selection and population growth, we examined the trajectories of mutations with varying selection coefficients, using computer simulations. We observed that while population growth dramatically increases the number of deleterious segregating sites in the population, it only mildly increases the number carried by each individual. Our simulations also show an increased efficacy of natural selection, reflected in a higher fraction of deleterious mutations eliminated at each generation and a more efficient elimination of the most deleterious ones. As a consequence, while each individual carries a larger number of deleterious alleles than expected in the absence of growth, the average selection coefficient of each segregating allele is less deleterious. Combined, our results suggest that the genetic risk of complex diseases in growing populations might be distributed across a larger number of more weakly deleterious rare variants.

13.
The Effect of Deleterious Mutations on Neutral Molecular Variation   Cited by: 12 (self-citations: 12; citations by others: 0)
Selection against deleterious alleles maintained by mutation may cause a reduction in the amount of genetic variability at linked neutral sites. This is because a new neutral variant can only remain in a large population for a long period of time if it is maintained in gametes that are free of deleterious alleles, and hence are not destined for rapid elimination from the population by selection. Approximate formulas are derived for the reduction below classical neutral values resulting from such background selection against deleterious mutations, for the mean times to fixation and loss of new mutations, nucleotide site diversity, and number of segregating sites. These formulas apply to random-mating populations with no genetic recombination, and to populations reproducing exclusively asexually or by self-fertilization. For a given selection regime and mating system, the reduction is an exponential function of the total mutation rate to deleterious mutations for the section of the genome involved. Simulations show that the effect decreases rapidly with increasing recombination frequency or rate of outcrossing. The mean time to loss of new neutral mutations and the total number of segregating neutral sites are less sensitive to background selection than the other statistics, unless the population size is of the order of a hundred thousand or more. The stationary distribution of allele frequencies at the neutral sites is correspondingly skewed in favor of rare alleles, compared with the classical neutral result. Observed reductions in molecular variation in low recombination genomic regions of sufficiently large size, for instance in the centromere-proximal regions of Drosophila autosomes or in highly selfing plant populations, may be partly due to background selection against deleterious mutations.

14.
K. Misawa, F. Tajima. Genetics, 1997, 147(4): 1959-1964
Knowing the amount of DNA polymorphism is essential to understand the mechanism of maintaining DNA polymorphism in a natural population. The amount of DNA polymorphism can be measured by the average number of nucleotide differences per site (π), the proportion of segregating (polymorphic) sites (s) and the minimum number of mutations per site (s*). Since the latter two quantities depend on the sample size, θ is often used as a measure of the amount of DNA polymorphism, where θ = 4Nμ, N is the effective population size and μ is the neutral mutation rate per site per generation. It is known that θ estimated from π, s and s* under the infinite site model can be biased when the mutation rate varies among sites. We have therefore developed new methods for estimating θ under the finite site model. Using computer simulations, it has been shown that the new methods give almost unbiased estimates even when the mutation rate varies among sites substantially. Furthermore, we have also developed new statistics for testing neutrality by modifying Tajima's D statistic. Computer simulations suggest that the new test statistics can be used even when the mutation rate varies among sites.

15.
In order to analyze the pattern of DNA polymorphism in detail, we have developed a simple method using a new statistic θ(i), which estimates 4Nμ from the number of segregating sites whose allelic nucleotide frequency is i/n among n DNA sequences, where N is the effective population size and μ is the mutation rate per generation per nucleotide site. Under the assumption that mutations are selectively neutral and the population size is constant, the expectation of θ(i) is equal to that of θ, which estimates 4Nμ from the number of segregating sites, so that the distribution of θ(i) is flat. Therefore, the departure of the distribution of θ(i) from the horizontal line, which represents the value of θ, reflects change in population size and natural selection. Results of the coalescent simulation show that the distributions of θ(i) in the populations which experienced expansion and reduction are U-shaped and upside-down U-shaped, respectively, and that the distributions of θ(i) in some populations that experienced a bottleneck are W-shaped. Furthermore, we have applied this method to the SNP data in the International HapMap Project. Results of data analyses show that the distributions of θ(i) in the CEU (European), CHB and JPT (Asian) populations are different from that in the YRI population (African). From these results of data analyses in nuclear DNA and the pattern of polymorphism in human mitochondrial DNA already known, we infer that the CEU, CHB and JPT populations experienced a bottleneck.

16.
We have analyzed human genomic diversity in 32 individuals representing four continental populations of Homo sapiens in the context of four ape species. We used DNA resequencing chips covering 898 expressed sequence tags (ESTs), corresponding to 109 kb of sequence. Based on the intra-species data, the neutral hypothesis could not be rejected. However, the mutation rate was two times lower than typically observed in functionally unconstrained genomic segments, suggesting a certain level of selection. The worldwide diversity (297 segregating sites and nucleotide diversity of 0.054%) was partitioned among continents, with the greatest amount of variation observed in the African sample. The long-term effective population size of the human population was estimated at 13,000; a similar figure was obtained for the African sample and a 20% lower estimate was obtained for the other continents. Africans also differed in having a higher number of continental-specific polymorphisms contributing to the higher average nucleotide diversity. These results are consistent with the existence of two distinct lineages of modern humans: amalgamation of these lineages in Africa led to the higher present-day diversity on that continent, whereas colonization of other continents by one of them gave the effect of a population bottleneck.

17.
The distribution of the number of nucleotide differences between two randomly chosen cistrons in a finite population is studied here when the population size changes from generation to generation. When genetic variability is measured by heterozygosity (i.e., the probability that two cistrons are different), by the probability that two cistrons differ at two or more nucleotide sites, or by mean number of site differences between cistrons, it is seen that in a population going through a small bottleneck all of these measures decline rapidly but, as soon as population size becomes large, they start to increase owing to new mutations. The amount of reduction in these measures depends not only on the size of bottleneck but also on the rate of population growth. The implications of this study for explaining the observed variations in the rates of amino acid substitutions during the evolutionary process are also discussed.
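The qualitative behaviour described above, a rapid decline during the bottleneck followed by a mutation-driven recovery, can be reproduced with the classical homozygosity recursion for the infinite-alleles model, F' = (1 - v)² [1/(2N) + (1 - 1/(2N)) F]. This is a sketch under that assumption and not the derivation used in the paper; the function name, the population sizes and the mutation rate are illustrative.

```python
def heterozygosity_trajectory(sizes, v, h0):
    """Expected heterozygosity per generation for a sequence of population sizes.

    sizes: diploid population size N in each successive generation
    v:     mutation rate per cistron per generation (every mutation is new)
    h0:    heterozygosity in the starting generation
    """
    trajectory = [h0]
    homozygosity = 1.0 - h0
    for n in sizes:
        drift = 1.0 / (2.0 * n)
        # Two copies are identical if neither mutated and they either share
        # a parent copy or descend from two identical parent copies.
        homozygosity = (1.0 - v) ** 2 * (drift + (1.0 - drift) * homozygosity)
        trajectory.append(1.0 - homozygosity)
    return trajectory


# Hypothetical scenario: 20 generations at N = 10, then 1000 at N = 1,000,000.
sizes = [10] * 20 + [10**6] * 1000
path = heterozygosity_trajectory(sizes, v=1e-4, h0=0.5)
print(path[0], path[20], path[-1])
```

Lengthening the bottleneck or slowing the subsequent growth in `sizes` deepens the loss, which is the dependence on bottleneck size and growth rate noted in the abstract.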

18.
Statistical Properties of a DNA Sample under the Finite-Sites Model   Cited by: 1 (self-citations: 0; citations by others: 1)
Z. Yang. Genetics, 1996, 144(4): 1941-1950
Statistical properties of a DNA sample from a random-mating population of constant size are studied under the finite-sites model. It is assumed that there is no migration and no recombination occurs within the locus. A Markov process model is used for nucleotide substitution, allowing for multiple substitutions at a single site. The evolutionary rates among sites are treated as either constant or variable. The general likelihood calculation using numerical integration involves intensive computation and is feasible for three or four sequences only; it may be used for validating approximate algorithms. Methods are developed to approximate the probability distribution of the number of segregating sites in a random sample of n sequences, with either constant or variable substitution rates across sites. Calculations using parameter estimates obtained for human D-loop mitochondrial DNAs show that among-site rate variation has a major effect on the distribution of the number of segregating sites; the distribution under the finite-sites model with variable rates among sites is quite different from that under the infinite-sites model.

19.
Relationship between DNA Polymorphism and Fixation Time   Cited by: 5 (self-citations: 3; citations by others: 2)
F. Tajima. Genetics, 1990, 125(2): 447-454
When there is no recombination among nucleotide sites in DNA sequences, DNA polymorphism and fixation of mutants at nucleotide sites are mutually related. Using the method of gene genealogy, the relationship between the DNA polymorphism and the fixation of mutant nucleotide was quantitatively investigated under the assumption that mutants are selectively neutral, that there is no recombination among nucleotide sites, and that the population is a random mating population with N diploid individuals. The results obtained indicate that the expected number of nucleotide differences between two DNA sequences randomly sampled from the population is 42% less when a mutant at a particular nucleotide site reaches fixation than at a random time, and that heterozygosity is also expected to be less when fixation takes place than at a random time, but the amount of reduction depends on the value of 4Nv in this case, where v is the mutation rate per DNA sequence per generation. The formula for obtaining the expected number of nucleotide differences between the two DNA sequences for a given fixation time is also derived, and indicates that, even when it takes a large number of generations for a mutant to reach fixation, this number is 33% less than at a random time. The computer simulation conducted suggests that the expected number of nucleotide differences between the two DNA sequences at the time when an advantageous mutant becomes fixed is essentially the same as that of neutral mutant if the fixation time is the same. The effect of recombination on the amount of DNA polymorphism was also investigated by using computer simulation.

20.
We have developed the first comprehensive simulator for polyploid genomes (PolySim) and demonstrated its value by performing large-scale simulations to examine the effect of different population parameters on the evolution of polyploids. PolySim is unlimited in terms of ploidy, population size or number of simulated loci. Our process considered the evolution of polyploids from diploid ancestors, polysomic inheritance, inbreeding, recombination rate change in polyploids and gene flow from lower to higher ploidies. We compared the number of segregating single nucleotide polymorphisms, minor allele frequency, heterozygosity, R² and average kinship relatedness between different simulated scenarios, and to real data from polyploid species. As expected, allotetraploid populations showed no difference from their ancestral diploids when population size remained constant and there was no gene flow or multivalent (MV) pairing between subgenomes. Autotetraploid populations showed significant differences from their ancestors for most parameters and diverged from their ancestral populations faster than allotetraploids. Autotetraploids can have significantly higher heterozygosity, relatedness and extended linkage disequilibrium compared with allotetraploids. Interestingly, autotetraploids were more sensitive to increasing selfing rate and decreasing population size. MV formation can homogenize allotetraploid subgenomes, but this homogenization requires a higher MV rate than previously proposed. Our results can be considered as the first building block to understand polyploid population evolutionary dynamics. PolySim can be used to simulate a wide variety of polyploid organisms that mimic empirical populations, which, in combination with quantitative genetics tools, can be used to investigate the power of genomewide association, genomic selection or breeding programme designs in these species.
