Similar articles
 Found 20 similar articles (search time: 93 ms)
1.
The detection of molecular signatures of selection is one of the major concerns of modern population genetics. A widely used strategy in this context is to compare samples from several populations and to look for genomic regions with outstanding genetic differentiation between these populations. Genetic differentiation is generally based on allele frequency differences between populations, which are measured by FST or related statistics. Here we introduce a new statistic, denoted hapFLK, which focuses instead on the differences of haplotype frequencies between populations. In contrast to most existing statistics, hapFLK accounts for the hierarchical structure of the sampled populations. Using computer simulations, we show that each of these two features (the use of haplotype information and of the hierarchical structure of populations) significantly improves the power to detect selected loci and that combining them in the hapFLK statistic provides even greater power. We also show that hapFLK is robust with respect to bottlenecks and migration and improves over existing approaches in many situations. Finally, we apply hapFLK to a set of six sheep breeds from Northern Europe and identify seven regions under selection, which include already reported regions but also several new ones. We propose a method to help identify the population(s) under selection in a detected region, which reveals that in many of these regions selection most likely occurred in more than one population. Furthermore, several of the detected regions correspond to incomplete sweeps, where the favorable haplotype is only at intermediate frequency in the population(s) under selection.

2.
Humans show tremendous phenotypic diversity across geographically distributed populations, and much of this diversity undoubtedly results from genetic adaptations to different environmental pressures. The availability of genome-wide genetic variation data from densely sampled populations offers unprecedented opportunities for identifying the loci responsible for these adaptations and for elucidating the genetic architecture of human adaptive traits. Several approaches have been used to detect signals of selection in human populations, and these approaches differ in the assumptions they make about the underlying mode of selection. We contrast the results of approaches based on haplotype structure and differentiation of allele frequencies to those from a method for identifying single nucleotide polymorphisms strongly correlated with environmental variables. Although the first group of approaches tends to detect new beneficial alleles that were driven to high frequencies by selection, the environmental correlation approach has power to identify alleles that experienced small shifts in frequency owing to selection. We suggest that the first group of approaches tends to identify only variants with relatively strong phenotypic effects, whereas the environmental correlation methods can detect variants that make smaller contributions to an adaptive trait.

3.
The recent advent of high-throughput sequencing and genotyping technologies makes it possible to produce, easily and cost effectively, large amounts of detailed data on the genotype composition of populations. Detecting locus-specific effects may help identify those genes that have been, or are currently, targeted by natural selection. How best to identify these selected regions, loci, or single nucleotides remains a challenging issue. Here, we introduce a new model-based method, called SelEstim, to distinguish putative selected polymorphisms from the background of neutral (or nearly neutral) ones and to estimate the intensity of selection at the former. The underlying population genetic model is a diffusion approximation for the distribution of allele frequency in a population subdivided into a number of demes that exchange migrants. We use a Markov chain Monte Carlo algorithm for sampling from the joint posterior distribution of the model parameters, in a hierarchical Bayesian framework. We present evidence from stochastic simulations, which demonstrates the good power of SelEstim to identify loci targeted by selection and to estimate the strength of selection acting on these loci, within each deme. We also reanalyze a subset of SNP data from the Stanford HGDP–CEPH Human Genome Diversity Cell Line Panel to illustrate the performance of SelEstim on real data. In agreement with previous studies, our analyses point to a very strong signal of positive selection upstream of the LCT gene, which encodes the enzyme lactase–phlorizin hydrolase and is associated with adult-type hypolactasia. The geographical distribution of the strength of positive selection across the Old World matches the interpolated map of lactase persistence phenotype frequencies, with the strongest selection coefficients in Europe and in the Indus Valley.

4.
Studies of the apportionment of human genetic variation have long established that most human variation is within population groups and that the additional variation between population groups is small but greatest when comparing different continental populations. These studies often used Wright’s FST, which apportions the standardized variance in allele frequencies within and between population groups. Because local adaptations increase population differentiation, high FST may be found at closely linked loci under selection and used to identify genes undergoing directional or heterotic selection. We re-examined these processes using HapMap data. We analyzed 3 million SNPs on 602 samples from eight worldwide populations and a consensus subset of 1 million SNPs found in all populations. We identified four major features of the data: First, a hierarchical FST analysis showed that only a small fraction (12%) of the total genetic variation is distributed between continental populations and an even smaller fraction (1%) is found between populations within continents. Second, the global FST distribution closely follows an exponential distribution. Third, although the overall FST distribution is similarly shaped (inverse J), FST distributions vary markedly by allele frequency when divided into non-overlapping groups by allele frequency range. Because the mean allele frequency is a crude indicator of allele age, these distributions mark the time-dependent change in genetic differentiation. Finally, the change in mean FST of these groups is linear in allele frequency. These results suggest that investigating the extremes of the FST distribution for each allele frequency group is more efficient for detecting selection. Consequently, we demonstrate that such extreme SNPs are more clustered along the chromosomes than expected from linkage disequilibrium for each allele frequency group. These genomic regions are therefore likely candidates for natural selection.
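As a rough illustration of the statistic discussed in this abstract, the sketch below computes Wright's FST for a single biallelic locus from per-population allele frequencies. It assumes equally weighted populations and uses the simple heterozygosity form FST = (H_T - H_S) / H_T; the function name `fst_biallelic` is hypothetical, and this is not claimed to be the estimator the authors applied to the HapMap data.

```python
from statistics import fmean


def fst_biallelic(freqs):
    """Wright's F_ST at one biallelic locus.

    freqs: allele frequency of the same allele in each population.
    Assumption: all populations are weighted equally (no sample-size correction).
    """
    p_bar = fmean(freqs)
    # Expected heterozygosity in the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # locus is monomorphic overall, so there is no differentiation
    # Mean expected heterozygosity within populations.
    h_within = fmean(2.0 * p * (1.0 - p) for p in freqs)
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    print(fst_biallelic([0.2, 0.8]))
```

Grouping SNPs by mean allele frequency before looking at the upper tail of these values would mirror the per-frequency-class comparison the abstract describes.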

5.
Resolving the role of natural selection is a basic objective of evolutionary biology. It is generally difficult to detect the influence of selection because ubiquitous non-selective stochastic change in allele frequencies (genetic drift) degrades evidence of selection. As a result, selection scans typically only identify genomic regions that have undergone episodes of intense selection. Yet it seems likely such episodes are the exception; the norm is more likely to involve subtle, concurrent selective changes at a large number of loci. We develop a new theoretical approach that uncovers a previously undocumented genome-wide signature of selection in the collective divergence of allele frequencies over time. Applying our approach to temporally resolved allele frequency measurements from laboratory and wild Drosophila populations, we quantify the selective contribution to allele frequency divergence and find that selection has substantial effects on much of the genome. We further quantify the magnitude of the total selection coefficient (a measure of the combined effects of direct and linked selection) at a typical polymorphic locus, and find this to be large (of order 1%) even though most mutations are not directly under selection. We find that selective allele frequency divergence is substantially elevated at intermediate allele frequencies, which we argue is most parsimoniously explained by positive, not negative, selection. Thus, in these populations most mutations are far from evolving neutrally in the short term (tens of generations), including mutations with neutral fitness effects, and the result cannot be explained simply as an ongoing purging of deleterious mutations.

6.
Phenotypic divergences between modern human populations have developed as a result of genetic adaptation to local environments over the past 100,000 years. To identify genes involved in population-specific phenotypes, it is necessary to detect signatures of recent positive selection in the human genome. Although detection of elongated linkage disequilibrium (LD) has been a powerful tool in the field of evolutionary genetics, current LD-based approaches are not applicable to already fixed loci. Here, we report a method of scanning for population-specific strong selective sweeps that have reached fixation. In this method, genome-wide SNP data is used to analyze differences in the haplotype frequency, nucleotide diversity, and LD between populations, using the ratio of haplotype homozygosity between populations. To estimate the detection power of the statistics used in this study, we performed computer simulations and found that these tests are relatively robust against the density of typed SNPs and demographic parameters if the advantageous allele has reached fixation. Therefore, we could determine the threshold for maintaining high detection power, regardless of SNP density and demographic history. When this method was applied to the HapMap data, it was able to identify candidates for population-specific strong selective sweeps more efficiently than the outlier approach that depends on the empirical distribution. This study, confirming strong positive selection on genes previously reported to be associated with specific phenotypes, also identifies other candidates that are likely to contribute to phenotypic differences between human populations.
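The ratio of haplotype homozygosity between populations mentioned above can be sketched as follows. This is a minimal illustration that assumes phased haplotypes are given as strings over one genomic window and that homozygosity is estimated as the sum of squared sample haplotype frequencies; the function names are hypothetical, and the paper's exact statistic and window definition may differ.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Sum of squared haplotype frequencies within one window of one population."""
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


def homozygosity_ratio(pop_a, pop_b):
    """Ratio of haplotype homozygosity in pop_a to that in pop_b.

    Values well above 1 suggest that pop_a carries far fewer distinct
    haplotypes in this window, as expected after a completed sweep in pop_a.
    """
    return haplotype_homozygosity(pop_a) / haplotype_homozygosity(pop_b)


if __name__ == "__main__":
    swept = ["AAT", "AAT", "AAT", "AAT"]      # one haplotype fixed
    neutral = ["AAT", "AAT", "GCT", "GCA"]    # several haplotypes segregating
    print(homozygosity_ratio(swept, neutral))
```

A ratio is used rather than the raw homozygosity so that regions with low diversity in both populations (for example, low-recombination regions) are not flagged on their own.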

7.
Roze D, Barton NH. Genetics, 2006, 173(3): 1793-1811
In finite populations, genetic drift generates interference between selected loci, causing advantageous alleles to be found more often on different chromosomes than on the same chromosome, which reduces the rate of adaptation. This "Hill-Robertson effect" generates indirect selection to increase recombination rates. We present a new method to quantify the strength of this selection. Our model represents a new beneficial allele (A) entering a population as a single copy, while another beneficial allele (B) is sweeping at another locus. A third locus affects the recombination rate between selected loci. Using a branching process model, we calculate the probability distribution of the number of copies of A on the different genetic backgrounds, after it is established but while it is still rare. Then, we use a deterministic model to express the change in frequency of the recombination modifier, due to hitchhiking, as A goes to fixation. We show that this method can give good estimates of selection for recombination. Moreover, it shows that recombination is selected through two different effects: it increases the fixation probability of new alleles, and it accelerates selective sweeps. The relative importance of these two effects depends on the relative times of occurrence of the beneficial alleles.

8.
Nuzhdin SV, Harshman LG, Zhou M, Harmon K. Heredity, 2007, 99(3): 313-321
Identification of genes underlying complex traits is an important problem. Quantitative trait loci (QTL) are mapped using marker-trait co-segregation in large panels of recombinant genotypes. Most frequently, recombinant inbred lines derived from two isogenic parents are used. Segregation patterns are also studied in pedigrees from multiple families. Great advances have been made through creative use of these techniques, but narrow sampling and inadequate power represent strong limitations. Here, we propose an approach combining the strengths of both techniques. We established a mapping population from a sample of natural genotypes, and applied artificial selection for a complex character. Selection changed the frequencies of alleles in QTLs contributing to the selection response. We infer QTLs with dense genotyping microarrays by identifying blocks of linked markers undergoing selective changes in allele frequency. We demonstrated this approach with an experimental population composed of 20 isogenic strains. Selection for starvation survival was executed in three replicated populations with three control non-selected populations. Three individuals per population were genotyped using Affymetrix GeneChips. Two regions of the genome, one each on the left arms of the second and third chromosomes, showed significant divergence between control and selected populations. For the former region, we inferred allele frequencies in selected and control populations by pyrosequencing. We conclude that the allele frequency difference, averaging approximately 40% between selected and control lines, contributed to selection response. Our approach can contribute to the fine scale decomposition of the genetics of direct and indirect selection responses, and genotype by environment interactions.

9.
As species struggle to keep pace with the rapidly warming climate, adaptive introgression of beneficial alleles from closely related species or populations provides a possible avenue for rapid adaptation. We investigate the potential for adaptive introgression in the copepod, Tigriopus californicus, by hybridizing two populations with divergent heat tolerance limits. We subjected hybrids to strong heat selection for 15 generations followed by whole-genome resequencing. Utilizing a hybridize evolve and resequence (HER) technique, we can identify loci responding to heat selection via a change in allele frequency. We successfully increased the heat tolerance (measured as LT50) in selected lines, which was coupled with higher frequencies of alleles from the southern (heat tolerant) population. These repeatable changes in allele frequencies occurred on all 12 chromosomes across all independent selected lines, providing evidence that heat tolerance is polygenic. These loci contained genes with lower protein-coding sequence divergence than the genome-wide average, indicating that these loci are highly conserved between the two populations. In addition, these loci were enriched in genes that changed expression patterns between selected and control lines in response to a nonlethal heat shock. Therefore, we hypothesize that the mechanism of heat tolerance divergence is explained by differential gene expression of highly conserved genes. The HER approach offers a unique solution to identifying genetic variants contributing to polygenic traits, especially variants that might be missed through other population genomic approaches.

10.
Genetic architecture of a selection response in Arabidopsis thaliana (total citations: 1; self-citations: 0; citations by others: 1)
Quantitative trait locus (QTL) mapping has become an established and effective method for studying the genetic architecture of complex traits. In this report, we use a QTL mapping approach in combination with data from a large selection experiment in Arabidopsis thaliana to explore a response to selection of experimental populations with differentiated genetic backgrounds. Experimental populations with genetic backgrounds derived from ecotypes Landsberg and Niederzenz were exposed to multiple generations of fertility and viability selection. This selection resulted in phenotypic shifts in a number of life-history and fitness-related characters including early development time, flowering time, dry biomass, longevity, and fruit production. Quantitative trait loci were mapped for these traits and their positions were compared to previously characterized allele frequency changes in the experimental populations (Ungerer et al. 2003). Quantitative trait locus positions largely colocalized with genomic regions under strong and consistent selection in populations with differentiated genetic backgrounds, suggesting that alleles for these traits were selected similarly in differentiated genetic backgrounds. However, one QTL region exhibited a more variable response, being positively selected on one genetic background but apparently neutral in another. This study demonstrates how QTL mapping approaches can be combined with map-based population genetic data to study how selection acts on standing genetic variation in populations.

11.
M. J. Mackinnon, M. A. J. Georges. Genetics, 1992, 132(4): 1177-1185
The effects of within-sample selection on the outcome of analyses detecting linkage between genetic markers and quantitative traits were studied. It was found that selection by truncation for the trait of interest significantly reduces the differences between marker genotype means thus reducing the power to detect linked quantitative trait loci (QTL). The size of this reduction is a function of proportion selected, the magnitude of the QTL effect, recombination rate between the marker locus and the QTL, and the allele frequency of the QTL. Proportion selected was the most influential of these factors on bias, e.g., for an allele substitution effect of one standard deviation unit, selecting the top 80%, 50% or 20% of the population required 2, 6 or 24 times the number of progeny, respectively, to offset the loss of power caused by this selection. The effect on power was approximately linear with respect to the size of gene effect, almost invariant to recombination rate, and a complex function of QTL allele frequency. It was concluded that experimental samples from animal populations which have been subjected to even minor amounts of selection will be inefficient in yielding information on linkage between markers and loci influencing the quantitative trait under selection.

12.
The inference of positive selection in genomes is a problem of great interest in evolutionary genomics. By identifying putative regions of the genome that contain adaptive mutations, we are able to learn about the biology of organisms and their evolutionary history. Here we introduce a composite likelihood method that identifies recently completed or ongoing positive selection by searching for extreme distortions in the spatial distribution of the haplotype frequency spectrum along the genome relative to the genome-wide expectation taken as neutrality. Furthermore, the method simultaneously infers two parameters of the sweep: the number of sweeping haplotypes and the “width” of the sweep, which is related to the strength and timing of selection. We demonstrate that this method outperforms the leading haplotype-based selection statistics, though strong signals in low-recombination regions merit extra scrutiny. As a positive control, we apply it to two well-studied human populations from the 1000 Genomes Project and examine haplotype frequency spectrum patterns at the LCT and MHC loci. We also apply it to a data set of brown rats sampled in NYC and identify genes related to olfactory perception. To facilitate use of this method, we have implemented it in user-friendly open source software.

13.
Skin pigmentation is a human phenotype that varies greatly among human populations and it has long been speculated that this variation is adaptive. We therefore expect the genes that contribute to these large differences in phenotype to show large allele frequency differences among populations and to possibly harbor signatures of positive selection. To identify the loci that likely contribute to among-population human skin pigmentation differences, we measured allele frequency differentiation among Europeans, Chinese and Africans for 24 human pigmentation genes from 2 publicly available, large scale SNP data sets. Several skin pigmentation genes show unusually large allele frequency differences among these populations. To determine whether these allele frequency differences might be due to selection, we employed a within-population test based on long-range haplotype structure and identified several outliers that have not been previously identified as putatively adaptive. Most notably, we identify the DCT gene as a candidate for recent positive selection in the Chinese. Moreover, our analyses suggest that it is likely that different genes are responsible for the lighter skin pigmentation found in different non-African populations. Electronic supplementary material: supplementary material is available in the online version of this article and is accessible to authorized users.

14.
Thanks to genome-scale diversity data, present-day studies can provide a detailed view of how natural and cultivated species adapt to their environment and particularly to environmental gradients. However, owing to this sensitivity, such studies might also be more strongly affected by undocumented demographic effects such as the pattern of migration and the reproduction regime. In this study, we provide guidelines for the use of popular or recently developed statistical methods to detect footprints of selection. We simulated 100 populations along a selective gradient and explored different migration models, sampling schemes and rates of self-fertilization. We investigated the power and robustness of eight methods to detect loci potentially under selection: three designed to detect genotype–environment correlations and five designed to detect adaptive differentiation (based on FST or similar measures). We show that genotype–environment correlation methods have substantially more power to detect selection than differentiation-based methods but that they generally suffer from high rates of false positives. This effect is exacerbated whenever allele frequencies are correlated, either between populations or within populations. Our results suggest that, when the underlying genetic structure of the data is unknown, a number of robust methods are preferable. Moreover, in the simulated scenario we used, sampling many populations led to better results than sampling many individuals per population. Finally, care should be taken when using methods to identify genotype–environment correlations without correcting for allele frequency autocorrelation because of the risk of spurious signals due to allele frequency correlations between populations.

15.
New strategies are required to identify the most important targets of protective immunity in complex eukaryotic pathogens. Natural selection maintains allelic variation in some antigens of the malaria parasite Plasmodium falciparum. Analysis of allele frequency distributions could identify the loci under most intense selection. The merozoite surface protein 1 (Msp1) is the most-abundant surface component on the erythrocyte-invading stage of P. falciparum. Immunization with whole Msp1 has protected monkeys completely against homologous and partially against non-homologous parasite strains. The single-copy msp1 gene, of about 5 kilobases, has highly divergent alleles with stable frequencies in endemic populations. To identify the region of msp1 under strongest selection to maintain alleles within populations, we studied multiple intragenic sequence loci in populations in different regions of Africa and Southeast Asia. On both continents, the locus with the lowest inter-population variance in allele frequencies was block 2, indicating selection in this part of the gene. To test the hypothesis of immune selection, we undertook a large prospective longitudinal cohort study. This demonstrated that serum IgG antibodies against each of the two most frequent allelic types of block 2 of the protein were strongly associated with protection from P. falciparum malaria.

16.
While hundreds of loci have been identified as reflecting strong-positive selection in human populations, connections between candidate loci and specific selective pressures often remain obscure. This study investigates broader patterns of selection in African populations, which are underrepresented despite their potential to offer key insights into human adaptation. We scan for hard selective sweeps using several haplotype and allele-frequency statistics with a data set of nearly 500,000 genome-wide single-nucleotide polymorphisms in 12 highly diverged African populations that span a range of environments and subsistence strategies. We find that positive selection does not appear to be a strong determinant of allele-frequency differentiation among these African populations. Haplotype statistics do identify putatively selected regions that are shared across African populations. However, as assessed by extensive simulations, patterns of haplotype sharing between African populations follow neutral expectations and suggest that tails of the empirical distributions contain false-positive signals. After highlighting several genomic regions where positive selection can be inferred with higher confidence, we use a novel method to identify biological functions enriched among populations’ empirical tail genomic windows, such as immune response in agricultural groups. In general, however, it seems that current methods for selection scans are poorly suited to populations that, like the African populations in this study, are affected by ascertainment bias and have low levels of linkage disequilibrium, possibly old selective sweeps, and potentially reduced phasing accuracy. Additionally, population history can confound the interpretation of selection statistics, suggesting that greater care is needed in attributing broad genetic patterns to human adaptation.

17.
Natural hybrid zones offer a powerful framework for understanding the genetic basis of speciation in progress because ongoing hybridization continually creates unfavorable gene combinations. Evidence indicates that postzygotic reproductive isolation is often caused by epistatic interactions between mutations in different genes that evolved independently of one another (hybrid incompatibilities). We examined the potential to detect epistatic selection against incompatibilities from genome sequence data using the site frequency spectrum (SFS) of polymorphisms by conducting individual-based simulations in SLiM. We found that the genome-wide SFS in hybrid populations assumes a diagnostic shape, with the continual input of fixed differences between source populations via migration inducing a mass at intermediate allele frequency. Epistatic selection locally distorts the SFS as non-incompatibility alleles rise in frequency in a manner analogous to a selective sweep. Building on these results, we present a statistical method to identify genomic regions containing incompatibility loci that locates departures in the local SFS compared with the genome-wide SFS. Cross-validation studies demonstrate that our method detects recessive and codominant incompatibilities across a range of scenarios varying in the strength of epistatic selection, migration rate, and hybrid zone age. Our approach takes advantage of whole genome sequence data, does not require knowledge of demographic history, and can be applied to any pair of nascent species that forms a hybrid zone.
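To make the idea of comparing a local SFS with the genome-wide SFS concrete, here is a minimal sketch. It assumes each polymorphic site is summarized by its derived-allele count in a sample of `n_chromosomes` chromosomes, and it measures the departure with a simple total variation distance between normalized spectra. Both function names and the choice of distance are illustrative assumptions, not the statistic developed in the paper.

```python
def site_frequency_spectrum(derived_counts, n_chromosomes):
    """Unfolded SFS: entry k-1 counts sites whose derived allele appears k times."""
    sfs = [0] * (n_chromosomes - 1)
    for k in derived_counts:
        # Sites fixed for either allele (k == 0 or k == n) are not polymorphic.
        if 0 < k < n_chromosomes:
            sfs[k - 1] += 1
    return sfs


def sfs_distance(local_sfs, genome_sfs):
    """Total variation distance between two spectra after normalizing each to sum to 1."""
    local_total = sum(local_sfs)
    genome_total = sum(genome_sfs)
    if local_total == 0 or genome_total == 0:
        raise ValueError("both spectra must contain at least one polymorphic site")
    return 0.5 * sum(
        abs(a / local_total - b / genome_total)
        for a, b in zip(local_sfs, genome_sfs)
    )


if __name__ == "__main__":
    window = site_frequency_spectrum([1, 1, 2, 3, 0, 4], n_chromosomes=4)
    print(window, sfs_distance(window, [1, 1, 2]))
```

Scanning windows along the genome and ranking them by such a distance is one simple way to flag regions whose local spectrum departs from the genome-wide background.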

18.
Selection mapping applies the population genetics theory of hitchhiking to the localization of genomic regions containing genes under selection. This approach predicts that neutral loci linked to genes under positive selection will have reduced diversity due to their shared history with a selected locus, and thus, genome scans of diversity levels can be used to identify regions containing selected loci. Most previous approaches to this problem ignore the spatial genomic pattern of diversity expected under selection. The regression-based approach advocated in this paper takes into account the expected pattern of decreasing genetic diversity with increased proximity to a selected locus. Simulated data are used to examine the patterns of diversity under different scenarios, in order to assess the power of a regression-based approach to the identification of regions under selection. Application of this method to both simulated and empirical data demonstrates its potential to detect selection. In contrast to some other methods, the regression approach described in this paper can be applied to any marker type. Results also suggest that this approach may give more precise estimates of the location of the selected locus than alternative methods, although the power is slightly lower in some cases.

19.
Experimental evolution studies can be used to explore genomic response to artificial and natural selection. In such studies, loci that display larger allele frequency change than expected by genetic drift alone are assumed to be directly or indirectly associated with traits under selection. However, such studies report surprisingly many loci under selection, suggesting that current tests for allele frequency change may be subject to P-value inflation and hence be anticonservative. One factor known from genomewide association (GWA) studies to cause P-value inflation is population stratification, such as relatedness among individuals. Here, we suggest that by treating presence of an individual in a population after selection as a binary response variable, existing GWA methods can be used to account for relatedness when estimating allele frequency change. We show that accounting for relatedness like this effectively reduces false-positives in tests for allele frequency change in simulated data with varying levels of population structure. However, once relatedness has been accounted for, the power to detect causal loci under selection is low. Finally, we demonstrate the presence of P-value inflation in allele frequency change in empirical data spanning multiple generations from an artificial selection experiment on tarsus length in two free-living populations of house sparrow and correct for this using genomic control. Our results indicate that since allele frequencies in large parts of the genome may change when selection acts on a heritable trait, such selection is likely to have considerable and immediate consequences for the eco-evolutionary dynamics of the affected populations.
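The genomic control correction mentioned at the end of this abstract can be sketched as below. The example assumes one-degree-of-freedom chi-square test statistics, estimates the inflation factor lambda as the observed median divided by the theoretical median of that distribution (about 0.4549), and rescales every statistic by lambda. The function name and the choice to never let lambda fall below 1 are assumptions for illustration; the abstract does not specify the authors' exact implementation.

```python
import math
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics, corrected P-values).

    chi2_stats: 1-df chi-square statistics, one per tested locus.
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # Common convention (an assumption here): correct only inflation, never deflation.
    lam = max(lam, 1.0)
    corrected = [x / lam for x in chi2_stats]
    # Upper-tail probability of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    pvalues = [math.erfc(math.sqrt(x / 2.0)) for x in corrected]
    return lam, corrected, pvalues


if __name__ == "__main__":
    lam, corrected, pvalues = genomic_control([0.2, 0.9, 1.5, 3.1, 12.0])
    print(lam, pvalues)
```

Because the median is insensitive to a small number of truly selected loci, lambda mainly reflects genome-wide inflation such as that caused by relatedness.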

20.
Genetic similarities within and between human populations (total citations: 2; self-citations: 0; citations by others: 2)
The proportion of human genetic variation due to differences between populations is modest, and individuals from different populations can be genetically more similar than individuals from the same population. Yet sufficient genetic data can permit accurate classification of individuals into populations. Both findings can be obtained from the same data set, using the same number of polymorphic loci. This article explains why. Our analysis focuses on the frequency, omega, with which a pair of random individuals from two different populations is genetically more similar than a pair of individuals randomly selected from any single population. We compare omega to the error rates of several classification methods, using data sets that vary in number of loci, average allele frequency, populations sampled, and polymorphism ascertainment strategy. We demonstrate that classification methods achieve higher discriminatory power than omega because of their use of aggregate properties of populations. The number of loci analyzed is the most critical variable: with 100 polymorphisms, accurate classification is possible, but omega remains sizable, even when using populations as distinct as sub-Saharan Africans and Europeans. Phenotypes controlled by a dozen or fewer loci can therefore be expected to show substantial overlap between human populations. This provides empirical justification for caution when using population labels in biomedical settings, with broad implications for personalized medicine, pharmacogenetics, and the meaning of race.

