Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
In the absence of selection, the structure of equilibrium allelic diversity is described by the elegant sampling formula of Ewens. This formula has helped to shape our expectations of empirical patterns of molecular variation. Along with coalescent theory, it provides statistical techniques for rejecting the null model of neutrality. However, we still do not fully understand the statistics of the allelic diversity expected in the presence of natural selection. Earlier work has described the effects of strongly deleterious mutations linked to many neutral sites, and allelic variation in models where offspring fitness is unrelated to parental fitness, but it has proven difficult to understand allelic diversity in the presence of purifying selection at many linked sites. Here, we study the population genetics of infinitely many perfectly linked sites, some neutral and some deleterious. Our approach is based on studying the lineage structure within each class of individuals of similar fitness in the deleterious mutation-selection balance. Consistent with previous observations, we find that for moderate and weak selection pressures, the patterns of allelic diversity cannot be described by a neutral model for any choice of the effective population size. We compute precisely how purifying selection at many linked sites distorts the patterns of allelic diversity, by developing expressions for the likelihood of any configuration of allelic types in a sample analogous to the Ewens sampling formula.
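For orientation, the neutral baseline that this abstract refers to can be sketched in a few lines. The following is a minimal illustration of the standard Ewens sampling formula, not the selection-aware expressions developed in the paper. The function name `ewens_probability` and the input convention (`counts[j-1]` is the number of allelic types seen exactly `j` times, `theta` is the scaled mutation rate) are assumptions made for this example.

```python
from fractions import Fraction
from math import factorial


def ewens_probability(counts, theta):
    """Probability of an allelic configuration under the neutral Ewens formula.

    counts[j-1] is the number of allelic types represented exactly j times
    in the sample; theta is the scaled mutation rate (assumed positive).
    """
    # Sample size implied by the configuration: n = sum_j j * a_j.
    n = sum(j * a for j, a in enumerate(counts, start=1))
    theta = Fraction(theta)

    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = Fraction(1)
    for i in range(n):
        rising *= theta + i

    # n! / rising factorial, times the product over j of theta^a_j / (j^a_j * a_j!).
    prob = Fraction(factorial(n)) / rising
    for j, a in enumerate(counts, start=1):
        prob *= theta ** a / (Fraction(j) ** a * factorial(a))
    return prob


# The three possible configurations for a sample of size 3, with theta = 1.
configs = {
    "three singletons": [3, 0, 0],
    "one singleton and one doubleton": [1, 1, 0],
    "one allele seen three times": [0, 0, 1],
}
for label, counts in configs.items():
    print(label, ewens_probability(counts, 1))
```

Exact rational arithmetic is used here so that the probabilities of all configurations for a given sample size can be checked to sum to one without rounding error.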

2.
Zeng K  Charlesworth B 《Genetics》2011,189(1):251-266
Background selection, the effects of the continual removal of deleterious mutations by natural selection on variability at linked sites, is potentially a major determinant of DNA sequence variability. However, the joint effects of background selection and genetic recombination on the shape of the neutral gene genealogy have proved hard to study analytically. The only existing formula concerns the mean coalescent time for a pair of alleles, making it difficult to assess the importance of background selection from genome-wide data on sequence polymorphism. Here we develop a structured coalescent model of background selection with recombination and implement it in a computer program that efficiently generates neutral gene genealogies for an arbitrary sample size. We check the validity of the structured coalescent model against forward-in-time simulations and show that it accurately captures the effects of background selection. The model produces more accurate predictions of the mean coalescent time than the existing formula and supports the conclusion that the effect of background selection is greater in the interior of a deleterious region than at its boundaries. The level of linkage disequilibrium between sites is elevated by background selection, to an extent that is well summarized by a change in effective population size. The structured coalescent model is readily extendable to more realistic situations and should prove useful for analyzing genome-wide polymorphism data.

3.
The serial coalescent extends traditional coalescent theory to include genealogies in which not all individuals were sampled at the same time. Inference in this framework is powerful because population size and evolutionary rate may be estimated independently. However, when the sequences in question are affected by selection acting at many sites, the genealogies may differ significantly from their neutral expectation, and inference of demographic parameters may become inaccurate. I demonstrate that this inaccuracy is severe when the mutation rate and strength of selection are jointly large, and I develop a new likelihood calculation that, while approximate, improves the accuracy of population size estimates. When used in a Bayesian parameter estimation context, the new calculation allows for estimation of the shape of the pairwise coalescent rate function and can be used to detect the presence of selection acting at many sites in a sequence. Using the new method, I investigate two sets of dengue virus sequences from Puerto Rico and Thailand, and show that both genealogies are likely to have been distorted by selection.

4.
Payseur BA  Nachman MW 《Gene》2002,300(1-2):31-42
Theoretical and empirical work indicates that patterns of neutral polymorphism can be affected by linked, selected mutations. Under background selection, deleterious mutations removed from a population by purifying selection cause a reduction in linked neutral diversity. Under genetic hitchhiking, the rise in frequency and fixation of beneficial mutations also reduces the level of linked neutral polymorphism. Here we review the evidence that levels of neutral polymorphism in humans are affected by selection at linked sites. We then discuss four approaches for distinguishing between background selection and genetic hitchhiking based on (i) the relationship between polymorphism level and recombination rate for neutral loci with high mutation rates, (ii) relative levels of variation on the X chromosome and the autosomes, (iii) the frequency distribution of neutral polymorphisms, and (iv) population-specific patterns of genetic variation. Although the evidence for selection at linked sites in humans is clear, current methods and data do not allow us to clearly assess the relative importance of background selection and genetic hitchhiking in humans. These results contrast with those obtained for Drosophila, where the signals of positive selection are stronger.

5.
Kai Zeng  Pádraic Corcoran 《Genetics》2015,201(4):1539-1554
It is well known that most new mutations that affect fitness exert deleterious effects and that natural populations are often composed of subpopulations (demes) connected by gene flow. To gain a better understanding of the joint effects of purifying selection and population structure, we focus on a scenario where an ancestral population splits into multiple demes and study neutral diversity patterns in regions linked to selected sites. In the background selection regime of strong selection, we first derive analytic equations for pairwise coalescent times and FST as a function of time after the ancestral population splits into two demes and then construct a flexible coalescent simulator that can generate samples under complex models such as those involving multiple demes or nonconservative migration. We have carried out extensive forward simulations to show that the new methods can accurately predict diversity patterns both in the nonequilibrium phase following the split of the ancestral population and in the equilibrium between mutation, migration, drift, and selection. In the interference selection regime of many tightly linked selected sites, forward simulations provide evidence that neutral diversity patterns obtained from both the nonequilibrium and equilibrium phases may be virtually indistinguishable for models that have identical variance in fitness, but are nonetheless different with respect to the number of selected sites and the strength of purifying selection. This equivalence in neutral diversity patterns suggests that data collected from subdivided populations may have limited power for differentiating among the selective pressures to which closely linked selected sites are subject.

6.
Barton NH  Etheridge AM 《Genetics》2004,166(2):1115-1131
The coalescent process can describe the effects of selection at linked loci only if selection is so strong that genotype frequencies evolve deterministically. Here, we develop methods proposed by Kaplan, Darden, and Hudson to find the effects of weak selection. We show that the overall effect is given by an extension to Price's equation: the change in properties such as moments of coalescence times is equal to the covariance between those properties and the fitness of the sample of genes. The distribution of coalescence times differs substantially between allelic classes, even in the absence of selection. However, the average coalescence time between randomly chosen genes is insensitive to the current allele frequency and is affected significantly by purifying selection only if deleterious mutations are common and selection is strong (i.e., the product of population size and selection coefficient, Ns>3). Balancing selection increases mean coalescence times, but the effect becomes large only when mutation rates between allelic classes are low and when selection is extremely strong. Our analysis supports previous simulations that show that selection has surprisingly little effect on genealogies. Moreover, small fluctuations in allele frequency due to random drift can greatly reduce any such effects. This will make it difficult to detect the action of selection from neutral variation alone.

7.
K Zeng 《Heredity》2013,110(4):363-371
There is increasing evidence that background selection, the effects of the elimination of recurring deleterious mutations by natural selection on variability at linked sites, may be a major factor shaping genome-wide patterns of genetic diversity. To accurately quantify the importance of background selection, it is vital to have computationally efficient models that include essential biological features. To this end, a structured coalescent procedure is used to construct a model of background selection that takes into account the effects of recombination, recent changes in population size and variation in selection coefficients against deleterious mutations across sites. Furthermore, this model allows a flexible organization of selected and neutral sites in the region concerned, and has the ability to generate sequence variability at both selected and neutral sites, allowing the correlation between these two types of sites to be studied. The accuracy of the model is verified by checking against the results of forward simulations. These simulations also reveal several patterns of diversity that are in qualitative agreement with observations reported in recent studies of DNA sequence polymorphisms. These results suggest that the model should be useful for data analysis.

8.
Coalescent theory is routinely used to estimate past population dynamics and demographic parameters from genealogies. While early work in coalescent theory only considered simple demographic models, advances in theory have allowed for increasingly complex demographic scenarios to be considered. The success of this approach has led to coalescent-based inference methods being applied to populations with rapidly changing population dynamics, including pathogens like RNA viruses. However, fitting epidemiological models to genealogies via coalescent models remains a challenging task, because pathogen populations often exhibit complex, nonlinear dynamics and are structured by multiple factors. Moreover, it often becomes necessary to consider stochastic variation in population dynamics when fitting such complex models to real data. Using recently developed structured coalescent models that accommodate complex population dynamics and population structure, we develop a statistical framework for fitting stochastic epidemiological models to genealogies. By combining particle filtering methods with Bayesian Markov chain Monte Carlo methods, we are able to fit a wide class of stochastic, nonlinear epidemiological models with different forms of population structure to genealogies. We demonstrate our framework using two structured epidemiological models: a model with disease progression between multiple stages of infection and a two-population model reflecting spatial structure. We apply the multi-stage model to HIV genealogies and show that the proposed method can be used to estimate the stage-specific transmission rates and prevalence of HIV. Finally, using the two-population model we explore how much information about population structure is contained in genealogies and what sample sizes are necessary to reliably infer parameters like migration rates.

9.
A population genetic model with a single locus at which balancing selection acts and many linked loci at which neutral mutations can occur is analysed using the coalescent approach. The model incorporates geographic subdivision with migration, as well as mutation, recombination, and genetic drift of neutral variation. It is found that geographic subdivision can affect genetic variation even with high rates of migration, provided that selection is strong enough to maintain different allele frequencies at the selected locus. Published sequence data from the alcohol dehydrogenase locus of Drosophila melanogaster are found to fit the proposed model slightly better than a similar model without subdivision.

10.
The evolution of the human immunodeficiency virus (HIV-1) during chronic infection involves the rapid, continuous turnover of genetic diversity. However, the role of natural selection, relative to random genetic drift, in governing this process is unclear. We tested a stochastic model of genetic drift using partial envelope sequences sampled longitudinally in 28 infected children. In each case the Bayesian posterior (empirical) distribution of coalescent genealogies was estimated using Markov chain Monte Carlo methods. Posterior predictive simulation was then used to generate a null distribution of genealogies assuming neutrality, with the null and empirical distributions compared using four genealogy-based summary statistics sensitive to nonneutral evolution. Because both null and empirical distributions were generated within a coalescent framework, we were able to explicitly account for the confounding influence of demography. From the distribution of corrected P-values across patients, we conclude that empirical genealogies are more asymmetric than expected if evolution is driven by mutation and genetic drift only, with an excess of low-frequency polymorphisms in the population. This indicates that although drift may still play an important role, natural selection has a strong influence on the evolution of HIV-1 envelope. A negative relationship between effective population size and substitution rate indicates that as the efficacy of selection increases, a smaller proportion of mutations approach fixation in the population. This suggests the presence of deleterious mutations. We therefore conclude that intrahost HIV-1 evolution in envelope is dominated by purifying selection against low-frequency deleterious mutations that do not reach fixation.

11.
Navarro A  Barton NH 《Genetics》2002,161(2):849-863
We studied the effect of multilocus balancing selection on neutral nucleotide variability at linked sites by simulating a model where diallelic polymorphisms are maintained at an arbitrary number of selected loci by means of symmetric overdominance. Different combinations of alleles define different genetic backgrounds that subdivide the population and strongly affect variability. Several multilocus fitness regimes with different degrees of epistasis and gametic disequilibrium are allowed. Analytical results based on a multilocus extension of the structured coalescent predict that the expected linked neutral diversity increases exponentially with the number of selected loci and can become extremely large. Our simulation results show that although variability increases with the number of genetic backgrounds that are maintained in the population, it is reduced by random fluctuations in the frequencies of those backgrounds and does not reach high levels even in very large populations. We also show that previous results on balancing selection in single-locus systems do not extend to the multilocus scenario in a straightforward way. Different patterns of linkage disequilibrium and of the frequency spectrum of neutral mutations are expected under different degrees of epistasis. Interestingly, the power to detect balancing selection using deviations from a neutral distribution of allele frequencies seems to be diminished under the fitness regime that leads to the largest increase of variability over the neutral case. This and other results are discussed in the light of data from the Mhc.

12.
To model deviations from selectively neutral genetic variation caused by different forms of selection, it is necessary to first understand patterns of neutral variation. Best understood is neutral genetic variation at a single locus. But, as is well known, additional insights can be gained by investigating multiple loci. The resulting patterns reflect the degree of association (linkage) between loci and provide information about the underlying multilocus gene genealogies. The statistical properties of two-locus gene genealogies have been intensively studied for populations of constant size, as well as for simple demographic histories such as exponential population growth and single bottlenecks. By contrast, the combined effect of recombination and sustained demographic fluctuations is poorly understood. Addressing this issue, we study a two-locus Wright-Fisher model of a population subject to recurrent bottlenecks. We derive coalescent approximations for the covariance of the times to the most recent common ancestor at two loci in samples of two chromosomes. This covariance reflects the degree of association and thus linkage disequilibrium between these loci. We find, first, that an effective population-size approximation describes the numerically observed association between two loci provided that recombination occurs either much faster or much more slowly than the population-size fluctuations. Second, when recombination occurs frequently between but rarely within bottlenecks, we observe that the association of gene histories becomes independent of physical distance over a certain range of distances. Third, we show that in this case, a commonly used measure of linkage disequilibrium, σ_d² (closely related to r²), fails to capture the long-range association between two loci. The reason is that constituent terms, each reflecting the long-range association, cancel. Fourth, we analyze a limiting case in which the long-range association can be described in terms of a Xi coalescent allowing for simultaneous multiple mergers of ancestral lines.

13.
Natural populations are structured spatially into local populations and genetically into diverse 'genetic backgrounds' defined by different combinations of selected alleles. If selection maintains genetic backgrounds at constant frequency then neutral diversity is enhanced. By contrast, if background frequencies fluctuate then diversity is reduced. Provided that the population size of each background is large enough, these effects can be described by the structured coalescent process. Almost all the extant results based on the coalescent deal with a single selected locus. Yet we know that very large numbers of genes are under selection and that any substantial effects are likely to be due to the cumulative effects of many loci. Here, we set up a general framework for the extension of the coalescent to multilocus scenarios and we use it to study the simplest model, where strong balancing selection acting on a set of n loci maintains 2^n backgrounds at constant frequencies and at linkage equilibrium. Analytical results show that the expected linked neutral diversity increases exponentially with the number of selected loci and can become extremely large. However, simulation results reveal that the structured coalescent approach breaks down when the number of backgrounds approaches the population size, because of stochastic fluctuations in background frequencies. A new method is needed to extend the structured coalescent to cases with large numbers of backgrounds.

14.
Coop G  Ralph P 《Genetics》2012,192(1):205-224
Two major sources of stochasticity in the dynamics of neutral alleles result from resampling of finite populations (genetic drift) and the random genetic background of nearby selected alleles on which the neutral alleles are found (linked selection). There is now good evidence that linked selection plays an important role in shaping polymorphism levels in a number of species. One of the best-investigated models of linked selection is the recurrent full-sweep model, in which newly arisen selected alleles fix rapidly. However, the bulk of selected alleles that sweep into the population may not be destined for rapid fixation. Here we develop a general model of recurrent selective sweeps in a coalescent framework, one that generalizes the recurrent full-sweep model to the case where selected alleles do not sweep to fixation. We show that in a large population, only the initial rapid increase of a selected allele affects the genealogy at partially linked sites, which under fairly general assumptions are unaffected by the subsequent fate of the selected allele. We also apply the theory to a simple model to investigate the impact of recurrent partial sweeps on levels of neutral diversity and find that for a given reduction in diversity, the impact of recurrent partial sweeps on the frequency spectrum at neutral sites is determined primarily by the frequencies rapidly achieved by the selected alleles. Consequently, recurrent sweeps of selected alleles to low frequencies can have a profound effect on levels of diversity but can leave the frequency spectrum relatively unperturbed. In fact, the limiting coalescent model under a high rate of sweeps to low frequency is identical to the standard neutral model. The general model of selective sweeps we describe goes some way toward providing a more flexible framework to describe genomic patterns of diversity than is currently available.

15.
The neutral theory of molecular evolution predicts that the amount of neutral polymorphisms within a species will increase proportionally with the census population size (Nc). However, this prediction has not been borne out in practice: while the range of Nc spans many orders of magnitude, levels of genetic diversity within species fall in a comparatively narrow range. Although theoretical arguments have invoked the increased efficacy of natural selection in larger populations to explain this discrepancy, few direct empirical tests of this hypothesis have been conducted. In this work, we provide a direct test of this hypothesis using population genomic data from a wide range of taxonomically diverse species. To do this, we relied on the fact that the impact of natural selection on linked neutral diversity depends on the local recombinational environment. In regions of relatively low recombination, selected variants affect more neutral sites through linkage, and the resulting correlation between recombination and polymorphism allows a quantitative assessment of the magnitude of the impact of selection on linked neutral diversity. By comparing whole genome polymorphism data and genetic maps using a coalescent modeling framework, we estimate the degree to which natural selection reduces linked neutral diversity for 40 species of obligately sexual eukaryotes. We then show that the magnitude of the impact of natural selection is positively correlated with Nc, based on body size and species range as proxies for census population size. These results demonstrate that natural selection removes more variation at linked neutral sites in species with large Nc than those with small Nc and provide direct empirical evidence that natural selection constrains levels of neutral genetic diversity across many species. This implies that natural selection may provide an explanation for this longstanding paradox of population genetics.

16.
When long‐lasting, balancing selection can lead to “trans‐species” polymorphisms that are shared by two or more species identical by descent. In such cases, the gene genealogy at the selected site clusters by allele instead of by species, and nearby neutral sites also have unusual genealogies because of linkage. While this scenario is expected to leave discernible footprints in genetic variation data, the specific patterns remain poorly characterized. Motivated by recent findings in primates, we focus on the case of a biallelic polymorphism under ancient balancing selection and derive approximations for summaries of the polymorphism data from two species. Specifically, we characterize the length of the segment that carries most of the footprints, the expected number of shared neutral single nucleotide polymorphisms (SNPs), and the patterns of allelic associations among them. We confirm the accuracy of our approximations by coalescent simulations. We further show that for humans and chimpanzees (and, more generally, for pairs of species with low genetic diversity levels) these patterns are highly unlikely to be generated by neutral recurrent mutations. We discuss the implications for the design and interpretation of genome scans for ancient balanced polymorphisms in primates and other taxa.

17.
Genetic diversity is shaped by mutation, genetic drift, gene flow, recombination, and selection. The dynamics and interactions of these forces shape genetic diversity across different parts of the genome, between populations and species. Here, we have studied the effects of linked selection on nucleotide diversity in outcrossing populations of two Brassicaceae species, Arabidopsis lyrata and Capsella grandiflora, with contrasting demographic history. In agreement with previous estimates, we found evidence for a modest population size expansion thousands of generations ago, as well as efficient purifying selection in C. grandiflora. In contrast, the A. lyrata population exhibited evidence for very recent strong population size decline and weaker efficacy of purifying selection. Using multiple regression analyses with recombination rate and other genomic covariates as explanatory variables, we can explain 47% of the variance in neutral diversity in the C. grandiflora population, while in the A. lyrata population, only 11% of the variance was explained by the model. Recombination rate had a significant positive effect on neutral diversity in both species, suggesting that selection at linked sites has an effect on patterns of neutral variation. In line with this finding, we also found reduced neutral diversity in the vicinity of genes in the C. grandiflora population. However, in A. lyrata no such reduction in diversity was evident, a finding that is consistent with expectations of the impact of a recent bottleneck on patterns of neutral diversity near genes. This study thus empirically demonstrates how differences in demographic history modulate the impact of selection at linked sites in natural populations.

18.
Brendan O’Fallon 《Genetics》2013,194(2):485-492
The extent to which selective forces shape patterns of genetic and genealogical variation is unknown in many species. Recent theoretical models have suggested that even relatively weak purifying selection may produce significant distortions in gene genealogies, but few studies have sought to quantify this effect in humans. Here, we employ a reconstruction method based on the ancestral recombination graph to infer genealogies across the length of the human X chromosome and to examine time to most recent common ancestor (TMRCA) and measures of tree imbalance at both broad and very fine scales. In agreement with theory, TMRCA is significantly reduced and genealogies are significantly more imbalanced in coding regions and introns when compared to intergenic regions, and these effects are increased in areas of greater evolutionary constraint. These distortions are present at multiple scales, and chromosomal regions as broad as 5 Mb show a significant negative correlation in TMRCA with exon density. We also show that areas of recent TMRCA are significantly associated with the disease-causing potential of a site as measured by the MutationTaster prediction algorithm. Together, these findings suggest that purifying selection has significantly distorted human genealogical structure on both broad and fine scales and that few chromosomal regions escape selection-induced distortions.

19.
Balancing selection at one locus can increase the amount of selectively neutral variation within neighboring genomic regions. Discrete phenotypic polymorphisms studied in natural populations are frequently determined by sets of interacting genes instead of alternative alleles at single loci. We extend coalescent theory to investigate balancing selection on combinations of linked genes. We find that variation at neutral sites is increased across a much larger genomic region relative to the single-locus models: the entire region lying between the two loci in balanced combination is affected to some degree. Epistatic selection maintains these high levels of neutral variation because it directly opposes the homogenizing effect of recombination. The results of the theory are discussed in relation to published gene sequence data, primarily from Drosophila.

20.
The signature of positive selection on standing genetic variation   Total citations: 12, self-citations: 0, citations by others: 12
Considerable interest is focused on the use of polymorphism data to identify regions of the genome that underlie recent adaptations. These searches are guided by a simple model of positive selection, in which a mutation is favored as soon as it arises. This assumption may not be realistic, as environmental changes and range expansions may lead previously neutral or deleterious alleles to become beneficial. We examine what effect this mode of selection has on patterns of variation at linked neutral sites by implementing a new coalescent model of positive directional selection on standing variation. In this model, a neutral allele arises and drifts in the population, then at frequency f becomes beneficial, and eventually reaches fixation. Depending on the value of f, this scenario can lead to a large variance in allele frequency spectra and in levels of linkage disequilibrium at linked, neutral sites. In particular, for intermediate f, the beneficial substitution often leads to a loss of rare alleles, a pattern that differs markedly from the signature of directional selection currently relied on by researchers. These findings highlight the importance of an accurate characterization of the effects of positive selection, if we are to reliably identify recent adaptations from polymorphism data.
