Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Detecting positive selection using genomic data is critical to understanding the role of adaptive evolution. Of particular interest in this context are sex chromosomes, since they are thought to play a special role in local adaptation and speciation. We sought to circumvent the challenges associated with statistical phasing when using haplotype‐based statistics in sweep scans by exploiting the fact that whole‐chromosome haplotypes of the sex chromosomes can be obtained by resequencing individuals of the hemizygous sex. We analyzed whole Z chromosome haplotypes from 100 females from several populations of four black and white flycatcher species (in birds, females are ZW and males ZZ). Based on the integrated haplotype score (iHS) and number of segregating sites by length (nSL) statistics, we found strong and frequent haplotype structure in several regions of the Z chromosome in each species. Most of these sweep signals were population‐specific, with essentially no evidence for regions under selection shared among species. Some completed sweeps were revealed by the cross‐population extended haplotype homozygosity (XP‐EHH) statistic. Importantly, when using statistically phased Z chromosome data from resequencing of males, we failed to recover the signals of selection detected in analyses based on whole‐chromosome haplotypes from females; instead, what likely represent false signals of selection were frequently seen. This highlights the power issues in statistical phasing and cautions against drawing conclusions from selection scans using such data. The detection of frequent selective sweeps on the avian Z chromosome supports a large role of sex chromosomes in adaptive evolution.

2.
In several Drosophila species, the XY Mendelian ratio is disturbed by X-linked segregation distorters (sex-ratio drive). We used a collection of recombinants between a nondistorting chromosome and a distorting X chromosome originating from the Seychelles to map a candidate sex-ratio region in Drosophila simulans using molecular biallelic markers. Our data were compatible with the presence of a sex-ratio locus in the 7F cytological region. Using sequence polymorphism at the Nrg locus, we showed that sex-ratio has induced a strong selective sweep in populations from Madagascar and Réunion, where distorting chromosomes are close to a 50% frequency. The complete association between the marker and the sex-ratio phenotype and the near absence of mutations and recombination in the studied fragment after the sweep event indicate that this event is recent. Examples of selective sweeps are increasingly reported in a number of genomes. This case identifies the causal selective force. It illustrates that not all selective sweeps are necessarily indicative of an increase in the average fitness of populations.

3.
Several hypotheses have been elaborated to account for the evolutionary decay commonly observed in full-fledged Y chromosomes. Enhanced drift, background selection and selective sweeps, which are expected to result from reduced recombination, may all share responsibilities in the initial decay of proto-Y chromosomes, but little empirical information has been gathered so far. Here we take advantage of three markers that amplify on both of the morphologically undifferentiated sex chromosomes of the European tree frog (Hyla arborea) to show that recombination is suppressed in males (the heterogametic sex) but not in females. Accordingly, genetic variability is reduced on the Y, but in a way that can be accounted for by merely the number of chromosome copies per breeding pair, without the need to invoke background selection or selective sweeps.
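
The copy-number argument in this abstract can be illustrated with a small calculation. This is only a sketch of the textbook neutral expectation, not the authors' analysis: it assumes an equal sex ratio, equal mutation rates across chromosomes and no selection, and the function name is hypothetical.

```python
def expected_diversity_relative_to_autosomes():
    """Neutral expectation for diversity from chromosome copy numbers alone.

    Assumption (not taken from the abstract): one breeding pair consists of
    an XX female and an XY male, so it carries 4 copies of an autosomal
    locus, 3 copies of an X-linked locus and 1 copy of a Y-linked locus.
    """
    copies_per_pair = {"autosome": 4, "X": 3, "Y": 1}
    autosomal = copies_per_pair["autosome"]
    # Expected diversity scales with the number of copies in the population.
    return {name: n / autosomal for name, n in copies_per_pair.items()}


ratios = expected_diversity_relative_to_autosomes()
```

Under these assumptions the Y is expected to carry one quarter of the autosomal diversity and one third of the X-linked diversity, so a reduction of that size needs no additional explanation.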

4.
Restriction‐site associated DNA sequencing (RADSeq) facilitates rapid generation of thousands of genetic markers at relatively low cost; however, several sources of error specific to RADSeq methods often lead to biased estimates of allele frequencies and thereby to erroneous population genetic inference. Estimating the distribution of sample allele frequencies without calling genotypes was shown to improve population inference from whole genome sequencing data, but the ability of this approach to account for RADSeq‐specific biases remains unexplored. Here we assess the extent to which genotype‐free methods of allele frequency estimation affect demographic inference from empirical RADSeq data. Using the well‐studied pied flycatcher (Ficedula hypoleuca) as a study system, we compare allele frequency estimation and demographic inference from whole genome sequencing data with that from RADSeq data matched for samples, using both genotype‐based and genotype‐free methods. The demographic history of pied flycatchers as inferred from RADSeq data was highly congruent with that inferred from whole genome resequencing (WGS) data when allele frequencies were estimated directly from the read data. In contrast, when allele frequencies were derived from called genotypes, RADSeq‐based estimates of most model parameters fell outside the 95% confidence interval of estimates derived from WGS data. Notably, more stringent filtering of the genotype calls tended to increase the discrepancy between parameter estimates from WGS and RADSeq data. The results from this study demonstrate the ability of genotype‐free methods to improve allele frequency spectrum‐ (AFS‐) based demographic inference from empirical RADSeq data and highlight the need to account for uncertainty in NGS data regardless of sequencing method.

5.
Current methods of identifying positively selected regions in the genome are limited in two key ways: the underlying models cannot account for the timing of adaptive events and the comparison between models of selective sweeps and sequence data is generally made via simple summaries of genetic diversity. Here, we develop a tractable method of describing the effect of positive selection on the genealogical histories in the surrounding genome, explicitly modeling both the timing and context of an adaptive event. In addition, our framework allows us to go beyond analyzing polymorphism data via the site frequency spectrum or summaries thereof and instead leverage information contained in patterns of linked variants. Tests on both simulations and a human data example, as well as a comparison to SweepFinder2, show that even with very small sample sizes, our analytic framework has higher power to identify old selective sweeps and to correctly infer both the time and strength of selection. Finally, we derived the marginal distribution of genealogical branch lengths at a locus affected by selection acting at a linked site. This provides a much-needed link between our analytic understanding of the effects of sweeps on sequence variation and recent advances in simulation and heuristic inference procedures that allow researchers to examine the sequence of genealogical histories along the genome.

6.
The nonrecombining Drosophila melanogaster Y chromosome is heterochromatic and has few genes. Despite these limitations, there remains ample opportunity for natural selection to act on the genes that are vital for male fertility and on Y factors that modulate gene expression elsewhere in the genome. Y chromosomes of many organisms have low levels of nucleotide variability, but a formal survey of D. melanogaster Y chromosome variation had yet to be performed. Here we surveyed Y-linked variation in six populations of D. melanogaster spread across the globe. We find surprisingly low levels of variability in African relative to Cosmopolitan (i.e., non-African) populations. While the low levels of Cosmopolitan Y chromosome polymorphism can be explained by the demographic histories of these populations, the staggeringly low polymorphism of African Y chromosomes cannot be explained by demographic history. An explanation that is entirely consistent with the data is that the Y chromosomes of Zimbabwe and Uganda populations have experienced recent selective sweeps. Interestingly, the Zimbabwe and Uganda Y chromosomes differ: in Zimbabwe, a European Y chromosome appears to have swept through the population.

7.
The human and chimpanzee X chromosomes are less divergent than expected based on autosomal divergence. We study incomplete lineage sorting patterns between humans, chimpanzees and gorillas to show that this low divergence can be entirely explained by megabase-sized regions comprising one-third of the X chromosome, where polymorphism in the human-chimpanzee ancestral species was severely reduced. We show that background selection can explain at most 10% of this reduction of diversity in the ancestor. Instead, we show that several strong selective sweeps in the ancestral species can explain it. We also report evidence of population specific sweeps in extant humans that overlap the regions of low diversity in the ancestral species. These regions further correspond to chromosomal sections shown to be devoid of Neanderthal introgression into modern humans. This suggests that the same X-linked regions that undergo selective sweeps are among the first to form reproductive barriers between diverging species. We hypothesize that meiotic drive is the underlying mechanism causing these two observations.

8.
Sequencing of pooled samples (Pool-Seq) using next-generation sequencing technologies has become increasingly popular, because it represents a rapid and cost-effective method to determine allele frequencies for single nucleotide polymorphisms (SNPs) in population pools. Validation of allele frequencies determined by Pool-Seq has been attempted using an individual genotyping approach, but these studies tend to use samples from existing model organism databases or DNA stores, and do not validate a realistic setup for sampling natural populations. Here we used pyrosequencing to validate allele frequencies determined by Pool-Seq in three natural populations of Arabidopsis halleri (Brassicaceae). The allele frequency estimates of the pooled population samples (consisting of 20 individual plant DNA samples) were determined after mapping Illumina reads to (i) the publicly available, high-quality reference genome of a closely related species (Arabidopsis thaliana) and (ii) our own de novo draft genome assembly of A. halleri. We then pyrosequenced nine selected SNPs using the same individuals from each population, resulting in a total of 540 samples. Our results show a highly significant and accurate relationship between pooled and individually determined allele frequencies, irrespective of the reference genome used. Allele frequencies differed on average by less than 4%. Neither the Pool-Seq nor the individual-based approach tended to yield systematically higher or lower estimates of allele frequencies. Moreover, the rather high coverage in the mapping to the two reference genomes, ranging from 55 to 284x, had no significant effect on the accuracy of Pool-Seq. A resampling analysis showed that only very low coverage values (below 10-20x) would substantially reduce the precision of the method. We therefore conclude that a pooled re-sequencing approach is well suited for analyses of genetic variation in natural populations.

9.
Copy number variations (CNVs) are being used as genetic markers or functional candidates in gene-mapping studies. However, unlike single nucleotide polymorphism or microsatellite genotyping techniques, most CNV detection methods are limited to detecting total copy numbers, rather than copy number in each of the two homologous chromosomes. To address this issue, we developed a statistical framework for intensity-based CNV detection platforms using family data. Our algorithm identifies CNVs for a family simultaneously, thus avoiding the generation of calls with Mendelian inconsistency while maintaining the ability to detect de novo CNVs. Applications to simulated data and real data indicate that our method significantly improves both call rates and accuracy of boundary inference, compared to existing approaches. We further illustrate the use of Mendelian inheritance to infer SNP allele compositions in each of the two homologous chromosomes in CNV regions using real data. Finally, we applied our method to a set of families genotyped using both the Illumina HumanHap550 and Affymetrix genome-wide 5.0 arrays to demonstrate its performance on both inherited and de novo CNVs. In conclusion, our method produces accurate CNV calls, gives probabilistic estimates of CNV transmission and builds a solid foundation for the development of linkage and association tests utilizing CNVs.

10.
Detecting the targets of adaptive natural selection from whole genome sequencing data is a central problem for population genetics. However, to date most methods have shown sub-optimal performance under realistic demographic scenarios. Moreover, over the past decade there has been a renewed interest in determining the importance of selection from standing variation in adaptation of natural populations, yet very few methods for inferring this model of adaptation at the genome scale have been introduced. Here we introduce a new method, S/HIC, which uses supervised machine learning to precisely infer the location of both hard and soft selective sweeps. We show that S/HIC has unrivaled accuracy for detecting sweeps under demographic histories that are relevant to human populations, and distinguishing sweeps from linked as well as neutrally evolving regions. Moreover, we show that S/HIC is uniquely robust among its competitors to model misspecification. Thus, even if the true demographic model of a population differs catastrophically from that specified by the user, S/HIC still retains impressive discriminatory power. Finally, we apply S/HIC to the case of resequencing data from human chromosome 18 in a European population sample, and demonstrate that we can reliably recover selective sweeps that have been identified earlier using less specific and sensitive methods.

11.
Testing models of selection and demography in Drosophila simulans   (Total citations: 8; self-citations: 0; citations by others: 8)
Wall JD, Andolfatto P, Przeworski M. Genetics, 2002, 162(1): 203-216
We analyze patterns of nucleotide variability at 15 X-linked loci and 14 autosomal loci from a North American population of Drosophila simulans. We show that there is significantly more linkage disequilibrium on the X chromosome than on chromosome arm 3R and much more linkage disequilibrium on both chromosomes than expected from estimates of recombination rates, mutation rates, and levels of diversity. To explore what types of evolutionary models might explain this observation, we examine a model of recurrent, nonoverlapping selective sweeps and a model of a recent drastic bottleneck (e.g., founder event) in the demographic history of North American populations of D. simulans. The simple sweep model is not consistent with the observed patterns of linkage disequilibrium nor with the observed frequencies of segregating mutations. Under a restricted range of parameter values, a simple bottleneck model is consistent with multiple facets of the data. While our results do not exclude some influence of selection on X vs. autosome variability levels, they suggest that demography alone may account for patterns of linkage disequilibrium and the frequency spectrum of segregating mutations in this population of D. simulans.

12.
The relatively recent origin of sex chromosomes in the plant genus Silene provides an opportunity to study the early stages of sex chromosome evolution and, potentially, to test between the different population genetic processes likely to operate in nonrecombining chromosomes such as Y chromosomes. We previously reported much lower nucleotide polymorphism in a Y-linked gene (SlY1) of the plant Silene latifolia than in the homologous X-linked gene (SlX1). Here, we report a more extensive study of nucleotide diversity in these sex-linked genes, including a larger S. latifolia sample and a sample from the closely related species Silene dioica, and we also study the diversity of an autosomal gene, CCLS37.1. We demonstrate that nucleotide diversity in the Y-linked genes of both S. latifolia and S. dioica is very low compared with that of the X-linked gene. However, the autosomal gene also has low DNA polymorphism, which may be due to a selective sweep. We use a single individual of the related hermaphrodite species Silene conica as an outgroup to show that the low SlY1 diversity is not due to a lower mutation rate than that for the X-linked gene. We also investigate several other possibilities for the low SlY1 diversity, including differential gene flow between the two species for Y-linked, X-linked, and autosomal genes. The frequency spectrum of nucleotide polymorphism on the Y chromosome deviates significantly from that expected under a selective-sweep model. However, we detect population subdivision in both S. latifolia and S. dioica, so it is not simple to test for selective sweeps. We also discuss the possibility that Y-linked diversity is reduced due to highly variable male reproductive success, and we conclude that this explanation is unlikely.

13.
Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.

14.
Adaptation from standing genetic variation or recurrent de novo mutation in large populations should commonly generate soft rather than hard selective sweeps. In contrast to a hard selective sweep, in which a single adaptive haplotype rises to high population frequency, in a soft selective sweep multiple adaptive haplotypes sweep through the population simultaneously, producing distinct patterns of genetic variation in the vicinity of the adaptive site. Current statistical methods were expressly designed to detect hard sweeps and most lack power to detect soft sweeps. This is particularly unfortunate for the study of adaptation in species such as Drosophila melanogaster, where all three confirmed cases of recent adaptation resulted in soft selective sweeps and where there is evidence that the effective population size relevant for recent and strong adaptation is large enough to generate soft sweeps even when adaptation requires mutation at a specific single site at a locus. Here, we develop a statistical test based on a measure of haplotype homozygosity (H12) that is capable of detecting both hard and soft sweeps with similar power. We use H12 to identify multiple genomic regions that have undergone recent and strong adaptation in a large population sample of fully sequenced Drosophila melanogaster strains from the Drosophila Genetic Reference Panel (DGRP). Visual inspection of the top 50 candidates reveals that in all cases multiple haplotypes are present at high frequencies, consistent with signatures of soft sweeps. We further develop a second haplotype homozygosity statistic (H2/H1) that, in combination with H12, is capable of differentiating hard from soft sweeps. Surprisingly, we find that the H12 and H2/H1 values for all top 50 peaks are much more easily generated by soft rather than hard sweeps. We discuss the implications of these results for the study of adaptation in Drosophila and in species with large census population sizes.
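
The abstract names H12 and H2/H1 without giving formulas. The sketch below uses the definitions as they are commonly stated for these statistics, which is an assumption here: with haplotype frequencies p1 >= p2 >= ... in a window, H1 is the sum of squared frequencies, H12 treats the two most common haplotypes as one, and H2 is H1 without the most common haplotype. The function name and the toy windows are hypothetical.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Return (H1, H12, H2/H1) for the haplotypes sampled in one window.

    `haplotypes` is a list of hashable haplotype labels (e.g. strings of
    alleles), one per sampled chromosome.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    # Haplotype frequencies, most common first.
    freqs = sorted((c / n for c in Counter(haplotypes).values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(p * p for p in freqs)
    # H12 pools the two most frequent haplotypes into a single class.
    h12 = (p1 + p2) ** 2 + sum(p * p for p in freqs[2:])
    # H2 drops the most frequent haplotype from the homozygosity sum.
    h2 = h1 - p1 * p1
    return h1, h12, h2 / h1


# Toy windows: one dominant haplotype versus two common haplotypes.
hard_like = ["AAAA"] * 9 + ["ABAA"]
soft_like = ["AAAA"] * 5 + ["BBBB"] * 4 + ["ABAB"]
print(haplotype_homozygosity(hard_like))
print(haplotype_homozygosity(soft_like))
```

In both toy windows H12 is high, but H2/H1 is much larger when two haplotypes are common, which is the contrast the abstract relies on to separate soft from hard sweeps.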

15.
Selection at linked sites has important consequences for the properties of neutral variation and for tests of the predictions of the neutral theory of molecular evolution. We review the theory of the effect of adaptive gene substitutions on neutral variability at linked sites (hitchhiking or selective sweeps) and discuss theoretical results on the effect of selection against deleterious alleles on variation at linked sites (background selection). In Drosophila melanogaster there is a clear relation between the frequency of recombination in a given region of the chromosome and the amount of natural variability in that region. Attempts to predict this relation have given rise to models of selective sweeps and background selection. We describe possible methods of discriminating between these models, and also discuss the probable strong influence of selective sweeps on variation in largely nonrecombining genomes, with particular reference to Escherichia coli. Finally we present some unresolved questions and possible directions for future research.

16.
Kim Y. Genetics, 2006, 172(3): 1967-1978
The allele frequency of a neutral variant in a population is pushed either upward or downward by directional selection on a linked beneficial mutation ("selective sweeps"). DNA sequences sampled after the fixation of the beneficial allele thus contain an excess of rare neutral alleles. This study investigates the allele frequency distribution under selective sweep models using analytic approximation and simulation. First, given a single selective sweep at a fixed time, I derive an expression for the sampling probabilities of neutral mutants. This solution can be used to estimate the time of the fixation of a beneficial allele from sequence data. Next, I obtain an approximation to mean allele frequencies under recurrent selective sweeps. Under recurrent sweeps, the frequency spectrum is skewed toward rare alleles. However, the excess of high-frequency derived alleles, previously shown to be a signature of single selective sweeps, disappears with recurrent sweeps. It is shown that, using this approximation and multilocus polymorphism data, genomewide parameters of directional selection can be estimated.

17.
Kauer MO, Dieringer D, Schlötterer C. Genetics, 2003, 165(3): 1137-1148
We report a "hitchhiking mapping" study in D. melanogaster, which searches for genomic regions with reduced variability. The study's aim was to identify selective sweeps associated with the "out of Africa" habitat expansion. We scanned 103 microsatellites on chromosome 3 and 102 microsatellites on the X chromosome for reduced variability in non-African populations. When the chromosomes were analyzed separately, the number of loci with a significant reduction in variability only slightly exceeded the expectation under neutrality (six loci on the third chromosome and four loci on the X chromosome). However, non-African populations also have a more pronounced average loss in variability on the X chromosomes as compared to the third chromosome, which suggests the action of selection. Therefore, comparing the X chromosome to the autosome yields a higher number of significantly reduced loci. However, a more pronounced loss of variability on the X chromosome may be caused by demographic events rather than by natural selection. We therefore explored a range of demographic scenarios and found that some of these captured most, but not all, aspects of our data. More theoretical work is needed to evaluate how demographic events might differentially affect X chromosomes and autosomes and to estimate the most likely scenario associated with the out of Africa expansion of D. melanogaster.

18.
Roze D, Barton NH. Genetics, 2006, 173(3): 1793-1811
In finite populations, genetic drift generates interference between selected loci, causing advantageous alleles to be found more often on different chromosomes than on the same chromosome, which reduces the rate of adaptation. This "Hill-Robertson effect" generates indirect selection to increase recombination rates. We present a new method to quantify the strength of this selection. Our model represents a new beneficial allele (A) entering a population as a single copy, while another beneficial allele (B) is sweeping at another locus. A third locus affects the recombination rate between selected loci. Using a branching process model, we calculate the probability distribution of the number of copies of A on the different genetic backgrounds, after it is established but while it is still rare. Then, we use a deterministic model to express the change in frequency of the recombination modifier, due to hitchhiking, as A goes to fixation. We show that this method can give good estimates of selection for recombination. Moreover, it shows that recombination is selected through two different effects: it increases the fixation probability of new alleles, and it accelerates selective sweeps. The relative importance of these two effects depends on the relative times of occurrence of the beneficial alleles.

19.
Speciation may be promoted in hybrid zones if there is an interruption to gene flow between the hybridizing forms. For hybridizing chromosome races of the house mouse in Valtellina (Italy), distinguished by whole‐arm chromosomal rearrangements, previous studies have shown that there is greater interruption to gene flow at the centromeres of chromosomes that differ between the races than at distal regions of the same chromosome or at the centromeres of other chromosomes. Here, by increasing the number of markers along race‐specific chromosomes, we reveal a decay in between‐race genetic differentiation from the centromere to the distal telomere. For the first time, we use simulation models to investigate the possible role of recombination suppression and hybrid breakdown in generating this pattern. We also consider epistasis and selective sweeps as explanations for isolated chromosomal regions away from the centromere showing differentiation between the races. Hybrid breakdown alone is the simplest explanation for the decay in genetic differentiation with distance from the centromere. Robertsonian fusions/whole‐arm reciprocal translocations are common chromosomal rearrangements characterizing both closely related species and races within species, and this fine‐scale empirical analysis suggests that the unfitness associated with these rearrangements in the heterozygous state may contribute to the speciation process.

20.