Similar literature
 20 similar articles found (search time: 15 ms)
1.
Thornton KR  Jensen JD 《Genetics》2007,175(2):737-750
Rapid typing of genetic variation at many regions of the genome is an efficient way to survey variability in natural populations in an effort to identify segments of the genome that have experienced recent natural selection. Following such a genome scan, individual regions may be chosen for further sequencing and a more detailed analysis of patterns of variability, often to perform a parametric test for selection and to estimate the strength of a recent selective sweep. We show here that not accounting for the ascertainment of loci in such analyses leads to false inference of natural selection when the true model is selective neutrality, because the procedure of choosing unusual loci (in comparison to the rest of the genome-scan data) selects regions of the genome with genealogies similar to those expected under models of recent directional selection. We describe a simple and efficient correction for this ascertainment bias, which restores the false-positive rate to near-nominal levels. For the parameters considered here, we find that obtaining a test with the expected distribution of P-values depends on accurately accounting both for ascertainment of regions and for demography. Finally, we use simulations to explore the utility of relying on outlier loci to detect recent selective sweeps. We find that measures of diversity and of population differentiation are more effective than summaries of the site-frequency spectrum and that sequencing larger regions (2.5 kbp) in genome-scan studies leads to more power to detect recent selective sweeps.

2.
Coevolution between hosts and their parasites is expected to follow a range of possible dynamics, the two extreme cases being called trench warfare (or Red Queen) and arms races. Long-term stable polymorphism at the host and parasite coevolving loci is characteristic of trench warfare, and is expected to promote molecular signatures of balancing selection, while the recurrent allele fixation in arms races should generate selective sweeps. We compare these two scenarios using a finite size haploid gene-for-gene model that includes both mutation and genetic drift. We first show that trench warfare does not necessarily display larger numbers of coevolutionary cycles per unit of time than arms races. We subsequently perform coalescent simulations under these dynamics to generate sequences at both host and parasite loci. Genomic footprints of recurrent selective sweeps are often found, whereas trench warfare yields signatures of balancing selection only in parasite sequences, and only in a limited parameter space. Our results suggest that deterministic models of coevolution with infinite population sizes do not reliably predict the observed genomic signatures, and it may be best to study parasite rather than host populations to find genomic signatures of coevolution, such as selective sweeps or balancing selection.

3.
Adaptation from standing genetic variation or recurrent de novo mutation in large populations should commonly generate soft rather than hard selective sweeps. In contrast to a hard selective sweep, in which a single adaptive haplotype rises to high population frequency, in a soft selective sweep multiple adaptive haplotypes sweep through the population simultaneously, producing distinct patterns of genetic variation in the vicinity of the adaptive site. Current statistical methods were expressly designed to detect hard sweeps and most lack power to detect soft sweeps. This is particularly unfortunate for the study of adaptation in species such as Drosophila melanogaster, where all three confirmed cases of recent adaptation resulted in soft selective sweeps and where there is evidence that the effective population size relevant for recent and strong adaptation is large enough to generate soft sweeps even when adaptation requires mutation at a specific single site at a locus. Here, we develop a statistical test based on a measure of haplotype homozygosity (H12) that is capable of detecting both hard and soft sweeps with similar power. We use H12 to identify multiple genomic regions that have undergone recent and strong adaptation in a large population sample of fully sequenced Drosophila melanogaster strains from the Drosophila Genetic Reference Panel (DGRP). Visual inspection of the top 50 candidates reveals that in all cases multiple haplotypes are present at high frequencies, consistent with signatures of soft sweeps. We further develop a second haplotype homozygosity statistic (H2/H1) that, in combination with H12, is capable of differentiating hard from soft sweeps. Surprisingly, we find that the H12 and H2/H1 values for all top 50 peaks are much more easily generated by soft rather than hard sweeps. We discuss the implications of these results for the study of adaptation in Drosophila and in species with large census population sizes.
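The abstract above names H12 and H2/H1 without giving formulas. The following is a minimal sketch, assuming the commonly cited definitions: H1 is the sum of squared haplotype frequencies in a window, H12 pools the two most frequent haplotypes into one class, and H2 leaves out the most frequent haplotype. The function name and the toy window are illustrative and are not taken from the authors' software.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Return (H1, H12, H2/H1) for the haplotypes observed in one window.

    `haplotypes` is a list of hashable items (for example strings),
    one per sampled chromosome. Definitions are assumed, see lead-in.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    # Haplotype frequencies, most common first.
    freqs = sorted((c / n for c in Counter(haplotypes).values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(p * p for p in freqs)
    # (p1 + p2)^2 + sum_{i>2} p_i^2 equals H1 + 2*p1*p2.
    h12 = h1 + 2.0 * p1 * p2
    # H2 is H1 without the contribution of the most frequent haplotype.
    h2 = h1 - p1 * p1
    return h1, h12, h2 / h1


# Toy window: three distinct haplotypes at frequencies 0.5, 0.3 and 0.2.
window = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
h1, h12, h2_over_h1 = haplotype_homozygosity(window)
```

Under these definitions a hard sweep tends to give high H12 with low H2/H1 (one dominant haplotype), while a soft sweep can give high H12 together with a larger H2/H1.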

4.
Genetic adaptation to external stimuli occurs through the combined action of mutation and selection. A central problem in genetics is to identify loci responsive to specific selective constraints. Many tests have been proposed to identify the genomic signatures of natural selection by quantifying the skew in the site frequency spectrum (SFS) under selection relative to neutrality. We build upon recent work that connects many of these tests under a common framework, by describing how selective sweeps affect the scaled SFS. We show that the specific skew depends on many attributes of the sweep, including the selection coefficient and the time under selection. Using supervised learning on extensive simulated data, we characterize the features of the scaled SFS that best separate different types of selective sweeps from neutrality. We develop a test, SFselect, that consistently outperforms many existing tests over a wide range of selective sweeps. We apply SFselect to polymorphism data from a laboratory evolution experiment of Drosophila melanogaster adapted to hypoxia and identify loci that strengthen the role of the Notch pathway in hypoxia tolerance, but were missed by previous approaches. We further apply our test to human data and identify regions that are in agreement with earlier studies, as well as many novel regions.
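To make the "scaled SFS" in the abstract above concrete, here is a small sketch. It assumes the usual convention that the unfolded SFS entry xi[i] counts sites where the derived allele appears in i of n sampled haplotypes, and that the scaled SFS is i * xi[i], which is flat in expectation under the standard neutral model. The input layout and function names are illustrative and do not reproduce SFselect itself.

```python
def site_frequency_spectrum(sites):
    """Unfolded SFS from a list of sites.

    Each site is a list of 0/1 flags (1 = derived allele), one flag per
    sampled haplotype. Returns a list xi of length n where xi[i] is the
    number of segregating sites with derived count i (index 0 is unused).
    """
    n = len(sites[0])
    xi = [0] * n
    for site in sites:
        k = sum(site)
        if 0 < k < n:  # skip sites that are not segregating in the sample
            xi[k] += 1
    return xi


def scaled_sfs(xi):
    """Scale each class by its derived count: entry i becomes i * xi[i]."""
    return [i * count for i, count in enumerate(xi)]


# Toy data: six sites typed in four haplotypes.
sites = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],  # monomorphic ancestral, ignored
    [1, 1, 1, 1],  # fixed derived, ignored
]
xi = site_frequency_spectrum(sites)
scaled = scaled_sfs(xi)
```

A sweep typically distorts this flat expectation, for example by adding excess weight at low and high derived counts; a classifier can be trained on such vectors.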

5.
Population and locus-specific reduction of variability of polymorphic loci could be an indication of positive selection at a linked site (selective sweep) and therefore point toward genes that have been involved in recent adaptations. Analysis of microsatellite variability offers a way to identify such regions and to ask whether they occur more often than expected by chance. We studied four populations of the house mouse (Mus musculus) to assess the frequency of such signatures of selective sweeps under natural conditions. Three samples represent the subspecies Mus m. domesticus and came from Germany, France, and Cameroon. One sample came from Kazakhstan and constitutes a population of the subspecies Mus m. musculus. Mitochondrial D-loop sequences from all animals confirm their respective assignments. Approximately 200 microsatellite loci were typed for up to 60 unrelated individuals from each population and evaluated for signs of selective sweeps on the basis of Schlötterer's ln RV and ln RH statistics. Our data suggest that there are slightly more signs of selective sweeps than would have been expected by chance alone in each of the populations and also highlight some of the statistical challenges faced in genome scans for detecting selection. Single-nucleotide polymorphism typing of one sweep signature in the M. m. domesticus populations around the beta-defensin 6 locus confirms a lowered nucleotide diversity in this region and limits the potential sweep region to about 20 kb. However, no amino acid exchange has occurred in the coding region when compared to M. m. musculus. If this sweep signature is due to a recent adaptation, it is expected that a regulatory change would have caused it. Our data provide a framework for conducting a systematic whole genome scan for signatures of selective sweeps in the mouse genome.
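The ln RV and ln RH statistics used above compare microsatellite variability between two populations, locus by locus. The sketch below assumes the usual published forms: ln RV is the log ratio of allele-size (repeat-number) variances, and ln RH is the log ratio of ((1 / (1 - H))^2 - 1) computed from expected heterozygosity H in each population. Values are then standardized across loci so that outliers can be flagged. Function names and the example numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def ln_rv(var_a, var_b):
    """Log ratio of repeat-number variances at one locus (both must be > 0)."""
    return math.log(var_a / var_b)


def ln_rh(het_a, het_b):
    """Log ratio of heterozygosity-based diversity estimates at one locus.

    Both heterozygosities must lie strictly between 0 and 1.
    """
    def diversity(h):
        return (1.0 / (1.0 - h)) ** 2 - 1.0

    return math.log(diversity(het_a) / diversity(het_b))


def standardize(values):
    """Z-scores across loci (sample standard deviation)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


# Hypothetical heterozygosities for five loci in populations A and B.
het_a = [0.70, 0.65, 0.20, 0.72, 0.68]
het_b = [0.69, 0.66, 0.71, 0.70, 0.67]
z = standardize([ln_rh(a, b) for a, b in zip(het_a, het_b)])
# A strongly negative z marks a locus with reduced variability in A,
# which is a candidate sweep signature rather than proof of selection.
candidate = min(range(len(z)), key=lambda i: z[i])
```

A typical cut-off is |z| > 1.96, but with hundreds of loci some outliers are expected by chance, which is the statistical challenge the abstract points to.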

6.
Hermisson J  Pennings PS 《Genetics》2005,169(4):2335-2352
A population can adapt to a rapid environmental change or habitat expansion in two ways. It may adapt either through new beneficial mutations that subsequently sweep through the population or by using alleles from the standing genetic variation. We use diffusion theory to calculate the probabilities for selective adaptations and find a large increase in the fixation probability for weak substitutions, if alleles originate from the standing genetic variation. We then determine the parameter regions where each scenario (standing variation vs. new mutations) is more likely. Adaptations from the standing genetic variation are favored if either the selective advantage is weak or the selection coefficient and the mutation rate are both high. Finally, we analyze the probability of "soft sweeps," where multiple copies of the selected allele contribute to a substitution, and discuss the consequences for the footprint of selection on linked neutral variation. We find that soft sweeps with weaker selective footprints are likely under both scenarios if the mutation rate and/or the selection coefficient is high.
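As a rough illustration of why standing variants fix more readily than new mutations, the sketch below evaluates Kimura's classic diffusion approximation for the fixation probability of an allele at initial frequency p0. This is the textbook formula, not the more detailed expressions derived in the paper. It assumes a diploid population of effective size ne with genic selection, where s is the advantage of the heterozygote. The function name and parameter values are illustrative.

```python
import math


def fixation_probability(p0, s, ne):
    """Kimura's diffusion approximation: (1 - e^(-4*ne*s*p0)) / (1 - e^(-4*ne*s)).

    p0: initial allele frequency in [0, 1]
    s:  selection coefficient (assumed heterozygote advantage)
    ne: effective population size (diploid)
    Intended for moderate values of 4*ne*s; very large negative s can overflow.
    """
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("p0 must be between 0 and 1")
    if s == 0:
        return p0  # neutral limit: fixation probability equals frequency
    a = 4.0 * ne * s
    # expm1 keeps precision when a * p0 is small.
    return math.expm1(-a * p0) / math.expm1(-a)


ne = 1000
s = 0.01
new_mutation = fixation_probability(1.0 / (2 * ne), s, ne)  # single new copy
standing = fixation_probability(0.05, s, ne)                # variant at 5%
```

With these assumed values a single new copy fixes with probability close to 2s, whereas the same allele already at 5% frequency fixes most of the time.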

7.
Identification of partial sweeps, which include both hard and soft sweeps that have not currently reached fixation, provides crucial information about ongoing evolutionary responses. To this end, we introduce partialS/HIC, a deep learning method to discover selective sweeps from population genomic data. partialS/HIC uses a convolutional neural network for image processing, which is trained with a large suite of summary statistics derived from coalescent simulations incorporating population-specific history, to distinguish between completed versus partial sweeps, hard versus soft sweeps, and regions directly affected by selection versus those merely linked to nearby selective sweeps. We perform several simulation experiments under various demographic scenarios to demonstrate partialS/HIC’s performance, which exhibits excellent resolution for detecting partial sweeps. We also apply our classifier to whole genomes from eight mosquito populations sampled across sub-Saharan Africa by the Anopheles gambiae 1000 Genomes Consortium, elucidating both continent-wide patterns as well as sweeps unique to specific geographic regions. These populations have experienced intense insecticide exposure over the past two decades, and we observe a strong overrepresentation of sweeps at insecticide resistance loci. Our analysis thus provides a list of candidate adaptive loci that may be relevant to mosquito control efforts. More broadly, our supervised machine learning approach introduces a method to distinguish between completed and partial sweeps, as well as between hard and soft sweeps, under a variety of demographic scenarios. As whole-genome data rapidly accumulate for a greater diversity of organisms, partialS/HIC addresses an increasing demand for useful selection scan tools that can track in-progress evolutionary dynamics.

8.
Methods for detecting the genomic signatures of natural selection have been heavily studied, and they have been successful in identifying many selective sweeps. For most of these sweeps, the favored allele remains unknown, making it difficult to distinguish carriers of the sweep from non-carriers. In an ongoing selective sweep, carriers of the favored allele are likely to contain a future most recent common ancestor. Therefore, identifying them may prove useful in predicting the evolutionary trajectory (for example, in contexts involving drug-resistant pathogen strains or cancer subclones). The main contribution of this paper is the development and analysis of a new statistic, the Haplotype Allele Frequency (HAF) score. The HAF score, assigned to individual haplotypes in a sample, naturally captures many of the properties shared by haplotypes carrying a favored allele. We provide a theoretical framework for computing expected HAF scores under different evolutionary scenarios, and we validate the theoretical predictions with simulations. As an application of HAF score computations, we develop an algorithm (PreCIOSS: Predicting Carriers of Ongoing Selective Sweeps) to identify carriers of the favored allele in selective sweeps, and we demonstrate its power on simulations of both hard and soft sweeps, as well as on data from well-known sweeps in human populations.
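The abstract above does not spell out how a HAF score is computed. A minimal sketch, assuming the basic (1-HAF) definition in which a haplotype's score is the sum, over the derived alleles it carries, of the number of sampled haplotypes carrying that allele. The 0/1 matrix layout and the function name are illustrative; this is not the PreCIOSS algorithm.

```python
def haf_scores(haplotypes):
    """Return the 1-HAF score of every haplotype.

    `haplotypes` is a list of equal-length lists of 0/1 flags,
    where 1 marks the derived allele at that site.
    """
    n_sites = len(haplotypes[0])
    # Derived-allele count per site across the whole sample.
    derived_counts = [sum(h[j] for h in haplotypes) for j in range(n_sites)]
    # Each haplotype sums the counts of the derived alleles it carries.
    return [
        sum(derived_counts[j] for j in range(n_sites) if h[j])
        for h in haplotypes
    ]


# Toy sample: four haplotypes, three sites.
sample = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
scores = haf_scores(sample)
```

Haplotypes carrying a sweeping allele share many high-frequency hitchhiking variants, so they tend to receive higher scores than non-carriers.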

9.
While hundreds of loci have been identified as reflecting strong positive selection in human populations, connections between candidate loci and specific selective pressures often remain obscure. This study investigates broader patterns of selection in African populations, which are underrepresented despite their potential to offer key insights into human adaptation. We scan for hard selective sweeps using several haplotype and allele-frequency statistics with a data set of nearly 500,000 genome-wide single-nucleotide polymorphisms in 12 highly diverged African populations that span a range of environments and subsistence strategies. We find that positive selection does not appear to be a strong determinant of allele-frequency differentiation among these African populations. Haplotype statistics do identify putatively selected regions that are shared across African populations. However, as assessed by extensive simulations, patterns of haplotype sharing between African populations follow neutral expectations and suggest that tails of the empirical distributions contain false-positive signals. After highlighting several genomic regions where positive selection can be inferred with higher confidence, we use a novel method to identify biological functions enriched among populations’ empirical tail genomic windows, such as immune response in agricultural groups. In general, however, it seems that current methods for selection scans are poorly suited to populations that, like the African populations in this study, are affected by ascertainment bias and have low levels of linkage disequilibrium, possibly old selective sweeps, and potentially reduced phasing accuracy. Additionally, population history can confound the interpretation of selection statistics, suggesting that greater care is needed in attributing broad genetic patterns to human adaptation.

10.
Characterizing the nature of the adaptive process at the genetic level is a central goal for population genetics. In particular, we know little about the sources of adaptive substitution or about the number of adaptive variants currently segregating in nature. Historically, population geneticists have focused attention on the hard-sweep model of adaptation in which a de novo beneficial mutation arises and rapidly fixes in a population. Recently more attention has been given to soft-sweep models, in which alleles that were previously neutral, or nearly so, drift until such a time as the environment shifts and their selection coefficient changes to become beneficial. It remains an active and difficult problem, however, to tease apart the telltale signatures of hard vs. soft sweeps in genomic polymorphism data. Through extensive simulations of hard- and soft-sweep models, here we show that indeed the two might not be separable through the use of simple summary statistics. In particular, it seems that recombination in regions linked to, but distant from, sites of hard sweeps can create patterns of polymorphism that closely mirror what is expected to be found near soft sweeps. We find that a very similar situation arises when using haplotype-based statistics that are aimed at detecting partial or ongoing selective sweeps, such that it is difficult to distinguish the shoulder of a hard sweep from the center of a partial sweep. While knowing the location of the selected site mitigates this problem slightly, we show that stochasticity in signatures of natural selection will frequently cause the signal to reach its zenith far from this site and that this effect is more severe for soft sweeps; thus inferences of the target as well as the mode of positive selection may be inaccurate. In addition, both the time since a sweep ends and biologically realistic levels of allelic gene conversion lead to errors in the classification and identification of selective sweeps. 
This general problem of “soft shoulders” underscores the difficulty in differentiating soft and partial sweeps from hard-sweep scenarios in molecular population genomics data. The soft-shoulder effect also implies that the more common hard sweeps have been in recent evolutionary history, the more prevalent spurious signatures of soft or partial sweeps may appear in some genome-wide scans.

11.
The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease.

12.
Speciation, the evolution of reproductive isolation among populations, is continuous, complex, and involves multiple, interacting barriers. Until it is complete, the effects of this process vary along the genome and can lead to a heterogeneous genomic landscape with peaks and troughs of differentiation and divergence. When gene flow occurs during speciation, barriers restricting gene flow locally in the genome lead to patterns of heterogeneity. However, genomic heterogeneity can also be produced or modified by variation in factors such as background selection and selective sweeps, recombination and mutation rate variation, and heterogeneous gene density. Extracting the effects of gene flow, divergent selection and reproductive isolation from such modifying factors presents a major challenge to speciation genomics. We argue one of the principal aims of the field is to identify the barrier loci involved in limiting gene flow. We first summarize the expected signatures of selection at barrier loci, at the genomic regions linked to them and across the entire genome. We then discuss the modifying factors that complicate the interpretation of the observed genomic landscape. Finally, we end with a road map for future speciation research: a proposal for how to account for these modifying factors and to progress towards understanding the nature of barrier loci. Despite the difficulties of interpreting empirical data, we argue that the availability of promising technical and analytical methods will shed further light on the important roles that gene flow and divergent selection have in shaping the genomic landscape of speciation.

13.

Background

Animal domestication involved drastic phenotypic changes driven by strong artificial selection and also resulted in new populations of breeds, established by humans. This study aims to identify genes that show evidence of recent artificial selection during pig domestication.

Results

Whole-genome resequencing of 30 individual pigs from domesticated breeds, Landrace and Yorkshire, and 10 Asian wild boars at ~16-fold coverage was performed, resulting in over 4.3 million SNPs for 19,990 genes. We constructed a comprehensive genome map of directional selection by detecting selective sweeps using an FST-based approach that detects directional selection in lineages leading to the domesticated breeds and using a haplotype-based test that detects ongoing selective sweeps within the breeds. We show that candidate genes under selection are significantly enriched for loci implicated in quantitative traits important to pig reproduction and production. The candidate gene with the strongest signals of directional selection belongs to group III of the metabotropic glutamate receptors, known to affect brain functions associated with eating behavior, suggesting that loci under strong selection include loci involved in behavioral traits in domesticated pigs, including tameness.

Conclusions

We show that a significant proportion of selection signatures coincide with loci that were previously inferred to affect phenotypic variation in pigs. We further identify functional enrichment related to behavior, such as signal transduction and neuronal activities, for those targets of selection during domestication in pigs.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1330-x) contains supplementary material, which is available to authorized users.
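Entry 13 relies on an FST-based scan but the abstract does not say which estimator was used. The sketch below therefore assumes the simplest textbook form for one biallelic SNP and two equally weighted populations, FST = (H_T - H_S) / H_T, with H_T the expected heterozygosity of the pooled allele frequency and H_S the mean within-population heterozygosity. The function name and the frequencies are hypothetical.

```python
def fst_biallelic(p1, p2):
    """Wright's FST for one biallelic SNP from two allele frequencies.

    p1, p2: frequency of the same allele in population 1 and 2.
    Populations are weighted equally; no sample-size correction is applied.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)          # total expected heterozygosity
    if h_t == 0.0:
        return 0.0                              # monomorphic in both populations
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in a domestic breed and in wild boar.
differentiated = fst_biallelic(0.8, 0.2)
undifferentiated = fst_biallelic(0.3, 0.3)
```

In a genome scan, per-SNP values like these are usually averaged in sliding windows, and windows in the upper tail of the distribution are reported as candidate regions.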

14.
Jeremy J. Berg  Graham Coop 《Genetics》2015,201(2):707-725
The use of genetic polymorphism data to understand the dynamics of adaptation and identify the loci that are involved has become a major pursuit of modern evolutionary genetics. In addition to the classical “hard sweep” hitchhiking model, recent research has drawn attention to the fact that the dynamics of adaptation can play out in a variety of different ways and that the specific signatures left behind in population genetic data may depend somewhat strongly on these dynamics. One particular model for which a large number of empirical examples are already known is that in which a single derived mutation arises and drifts to some low frequency before an environmental change causes the allele to become beneficial and sweeps to fixation. Here, we pursue an analytical investigation of this model, bolstered and extended via simulation study. We use coalescent theory to develop an analytical approximation for the effect of a sweep from standing variation on the genealogy at the locus of the selected allele and sites tightly linked to it. We show that the distribution of haplotypes that the selected allele is present on at the time of the environmental change can be approximated by considering recombinant haplotypes as alleles in the infinite-alleles model. We show that this approximation can be leveraged to make accurate predictions regarding patterns of genetic polymorphism following such a sweep. We then use simulations to highlight which sources of haplotypic information are likely to be most useful in distinguishing this model from neutrality, as well as from other sweep models, such as the classic hard sweep and multiple-mutation soft sweeps. We find that in general, adaptation from a unique standing variant will likely be difficult to detect on the basis of genetic polymorphism data from a single population time point alone, and when it can be detected, it will be difficult to distinguish from other varieties of selective sweeps. 
Samples from multiple populations and/or time points have the potential to ease this difficulty.

15.
Marshall JM  Weiss RE 《Genetics》2006,173(4):2357-2370
The distribution of microsatellite allele sizes in populations aids in understanding the genetic diversity of species and the evolutionary history of recent selective sweeps. We propose a heterogeneous Bayesian analysis of variance model for inferring loci involved in recent selective sweeps by analyzing the distribution of allele sizes at multiple loci in multiple populations. Our model is shown to be consistent with a multilocus test statistic, ln RV, proposed for identifying microsatellite loci involved in recent selective sweeps. Our methodology differs in that it accepts original allele size data rather than summary statistics and allows the incorporation of prior knowledge about allele frequencies using a hierarchical prior distribution consisting of log normal and gamma probability distributions. Interesting features of the model are its ability to simultaneously analyze allele size data for any number of populations and to cope with the presence of any number of selected loci. The utility of the method is illustrated by application to two sets of microsatellite allele size data for a group of West African Anopheles gambiae populations. The results are consistent with the suppressed-recombination model of speciation, and additional candidate loci on chromosomes 2 (079 and 175) and 3 (088) are discovered that escaped former analysis.

16.
Teschke M  Mukabayire O  Wiehe T  Tautz D 《Genetics》2008,180(3):1537-1545
Genome scans of polymorphisms promise to provide insights into the patterns and frequencies of positive selection under natural conditions. The use of microsatellites as markers has the potential to focus on very recent events, since in contrast to SNPs, their high mutation rates should remove signatures of older events. We assess this concept here in a large-scale study. We have analyzed two population pairs of the house mouse, one pair of the subspecies Mus musculus domesticus and the other of M. m. musculus. A total of 915 microsatellite loci chosen to cover the whole genome were assessed in a prescreening procedure, followed by individual typing of candidate loci. Schlötterer's ratio statistics (lnRH) were applied to detect loci with significant deviations from patterns of neutral expectation. For eight loci from each population pair we have determined the size of the potential sweep window and applied a second statistical procedure (linked locus statistics). For the two population pairs, we find five and four significant sweep loci, respectively, with an average estimated window size of 120 kb. On the basis of the analysis of individual allele frequencies, it is possible to identify the most recent sweep, for which we estimate an onset of 400–600 years ago. Given the known population history for the French–German population pair, we infer that the average frequency of selective sweeps in these populations is higher than 1 in 100 generations across the whole genome. We discuss the implications for adaptation processes in natural populations.

17.
M. J. Ford  C. F. Aquadro 《Genetics》1996,144(2):689-703
We present the results of a restriction site survey of variation at five loci in Drosophila athabasca, complementing a previous study of the period locus. There is considerably greater differentiation between the three semispecies of D. athabasca at the period locus and two other X-linked genes (no-on-transient-A and E74A) than at three autosomal genes (Xdh, Adh and RC98). Using a modification of the HKA test, which uses fixed differences between the semispecies and a test based on differences in Fst among loci, we show that the greater differentiation of the X-linked loci compared with the autosomal loci is inconsistent with a neutral model of molecular evolution. We explore several evolutionary scenarios by computer simulation, including differential migration of X and autosomal genes, very low levels of migration among the semispecies, selective sweeps, and background selection, and conclude that X-linked selective sweeps in at least two of the semispecies are the best explanation for the data. This evidence that natural selection acted on the X-chromosome suggests that another X-linked trait, mating song differences among the semispecies, may have been the target of selection.

18.
We have evaluated a pooling approach that can reduce the number of polymerase chain reactions in a screen for selective sweeps by more than an order of magnitude. We show that the complex peak pattern that results from pooling of all samples from a given population is a faithful reflection of the composite pattern of the individual alleles, although with an under-representation of the larger alleles. Candidate loci for selective sweeps can be identified by visual inspection of the pool patterns. We have also implemented a software tool, which can find suitable microsatellite loci in the vicinity of annotated genes.

19.
We estimated the intensity of selection on preferred codons in Drosophila pseudoobscura and D. miranda at X-linked and autosomal loci, using a published data set on sequence variability at 67 loci, by means of an improved method that takes account of demographic effects. We found evidence for stronger selection at X-linked loci, consistent with their higher levels of codon usage bias. The estimates of the strength of selection and mutational bias in favor of unpreferred codons were similar to those found in other species, after taking into account the fact that D. pseudoobscura showed evidence for a recent expansion in population size. We examined correlates of synonymous and nonsynonymous diversity in these species and found no evidence for effects of recurrent selective sweeps on nonsynonymous mutations, which is probably because this set of genes has much higher than average levels of selective constraints. There was evidence for correlated effects of levels of selective constraints on protein sequences and on codon usage, as expected under models of selection for translational accuracy. Our analysis of a published data set on D. melanogaster provided evidence for the effects of selective sweeps of nonsynonymous mutations on linked synonymous diversity, but only in the subset of loci that experienced the highest rates of nonsynonymous substitutions (about one-quarter of the total) and not at more slowly evolving loci. Our correlational analysis of this data set suggested that both selective constraints on protein sequences and recurrent selective sweeps affect the overall level of codon usage.

20.
Summary: To maximize parameter estimation efficiency and statistical power and to estimate epistasis, the parameters of multiple quantitative trait loci (QTLs) must be simultaneously estimated. If multiple QTLs affect a trait, then estimates of means of QTL genotypes from individual locus models are statistically biased. In this paper, I describe methods for estimating means of QTL genotypes and recombination frequencies between marker and quantitative trait loci using multilocus backcross, doubled haploid, recombinant inbred, and testcross progeny models. Expected values of marker genotype means were defined using no double or multiple crossover frequencies and flanking markers for linked and unlinked quantitative trait loci. The expected values for a particular model comprise a system of nonlinear equations that can be solved using an iterative algorithm, e.g., the Gauss-Newton algorithm. The solutions are maximum likelihood estimates when the errors are normally distributed. A linear model for estimating the parameters of unlinked quantitative trait loci was found by transforming the nonlinear model. Recombination frequency estimators were defined using this linear model. Certain means of linked QTLs are less efficiently estimated than means of unlinked QTLs.
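To illustrate how a Gauss-Newton iteration can solve such a system, here is a sketch on a deliberately simplified toy model that is an assumption of this example, not the paper's full multilocus model: one QTL between two flanking markers in a backcross, no double crossovers, so the four marker-class means are mu1, mu2, (1 - rho) * mu1 + rho * mu2 and rho * mu1 + (1 - rho) * mu2, where rho is the relative position of the QTL in the interval. All names and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gauss_newton(residuals, jacobian, theta, n_iter=50, tol=1e-12):
    """Minimize the sum of squared residuals r(theta) = y - f(theta)."""
    k = len(theta)
    for _ in range(n_iter):
        r = residuals(theta)
        j = jacobian(theta)  # Jacobian of f, one row per observation
        jtj = [[sum(row[a] * row[b] for row in j) for b in range(k)]
               for a in range(k)]
        jtr = [sum(row[a] * ri for row, ri in zip(j, r)) for a in range(k)]
        step = solve_linear(jtj, jtr)  # normal equations: (J'J) step = J'r
        theta = [t + d for t, d in zip(theta, step)]
        if max(abs(d) for d in step) < tol:
            break
    return theta


def flanking_marker_model(observed):
    """Residuals and Jacobian for the assumed four marker-class means."""
    def residuals(theta):
        mu1, mu2, rho = theta
        fitted = [mu1, mu2,
                  (1 - rho) * mu1 + rho * mu2,
                  rho * mu1 + (1 - rho) * mu2]
        return [y - f for y, f in zip(observed, fitted)]

    def jacobian(theta):
        mu1, mu2, rho = theta
        return [[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [1 - rho, rho, mu2 - mu1],
                [rho, 1 - rho, mu1 - mu2]]

    return residuals, jacobian


# Marker-class means generated from mu1 = 10, mu2 = 14, rho = 0.3.
observed_means = [10.0, 14.0, 11.2, 12.8]
res, jac = flanking_marker_model(observed_means)
mu1_hat, mu2_hat, rho_hat = gauss_newton(res, jac, [9.0, 15.0, 0.5])
```

With noisy data the same iteration returns least-squares estimates, which coincide with maximum likelihood estimates when errors are normal with equal variance; real marker classes have unequal sizes, so a weighted version would be needed in practice.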
