Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Directional selection and the site-frequency spectrum.   Total citations: 4 (self-citations: 0, citations by others: 4)
C D Bustamante  J Wakeley  S Sawyer  D L Hartl 《Genetics》2001,159(4):1779-1788
In this article we explore statistical properties of the maximum-likelihood estimates (MLEs) of the selection and mutation parameters in a Poisson random field population genetics model of directional selection at DNA sites. We derive the asymptotic variances and covariance of the MLEs and explore the power of the likelihood ratio tests (LRT) of neutrality for varying levels of mutation and selection as well as the robustness of the LRT to deviations from the assumption of free recombination among sites. We also discuss the coverage of confidence intervals on the basis of two standard-likelihood methods. We find that the LRT has high power to detect deviations from neutrality and that the maximum-likelihood estimation performs very well when the ancestral states of all mutations in the sample are known. When the ancestral states are not known, the test has high power to detect deviations from neutrality for negative selection but not for positive selection. We also find that the LRT is not robust to deviations from the assumption of independence among sites.

2.
Zhu L  Bustamante CD 《Genetics》2005,170(3):1411-1421
We present a novel composite-likelihood-ratio test (CLRT) for detecting genes and genomic regions that are subject to recurrent natural selection (either positive or negative). The method uses the likelihood functions of Hartl et al. (1994) for inference in a Wright-Fisher genic selection model and corrects for nonindependence among sites by application of coalescent simulations with recombination. Here, we (1) characterize the distribution of the CLRT statistic (Lambda) as a function of the population recombination rate (R = 4N_e r); (2) explore the effects of bias in estimation of R on the size (type I error) of the CLRT; (3) explore the robustness of the model to population growth, bottlenecks, and migration; (4) explore the power of the CLRT under varying levels of mutation, selection, and recombination; (5) explore the discriminatory power of the test in distinguishing negative selection from population growth; and (6) evaluate the performance of maximum composite-likelihood estimation (MCLE) of the selection coefficient. We find that the test has excellent power to detect weak negative selection and moderate power to detect positive selection. Moreover, the test is quite robust to bias in the estimate of local recombination rate, but not to certain demographic scenarios such as population growth or a recent bottleneck. Last, we demonstrate that the MCLE of the selection parameter has little bias for weak negative selection and has downward bias for positively selected mutations.

3.
Anisimova M  Nielsen R  Yang Z 《Genetics》2003,164(3):1229-1236
Maximum-likelihood methods based on models of codon substitution accounting for heterogeneous selective pressures across sites have proved to be powerful in detecting positive selection in protein-coding DNA sequences. Those methods are phylogeny based and do not account for the effects of recombination. When recombination occurs, such as in population data, no unique tree topology can describe the evolutionary history of the whole sequence. This violation of assumptions raises serious concerns about the likelihood method for detecting positive selection. Here we use computer simulation to evaluate the reliability of the likelihood-ratio test (LRT) for positive selection in the presence of recombination. We examine three tests based on different models of variable selective pressures among sites. Sequences are simulated using a coalescent model with recombination and analyzed using codon-based likelihood models ignoring recombination. We find that the LRT is robust to low levels of recombination (with fewer than three recombination events in the history of a sample of 10 sequences). However, at higher levels of recombination, the type I error rate can be as high as 90%, especially when the null model in the LRT is unrealistic, and the test often mistakes recombination as evidence for positive selection. The test that compares the more realistic models M7 (beta) against M8 (beta and omega) is more robust to recombination, where the null model M7 allows the selective pressure (omega) to vary between 0 and 1 (and so does not account for positive selection), and the alternative model M8 allows an additional discrete class with omega = dN/dS that could be estimated to be >1 (and thus accounts for positive selection). Identification of sites under positive selection by the empirical Bayes method appears to be less affected than the LRT by recombination.

4.
The selective pressure at the protein level is usually measured by the nonsynonymous/synonymous rate ratio (omega = dN/dS), with omega < 1, omega = 1, and omega > 1 indicating purifying (or negative) selection, neutral evolution, and diversifying (or positive) selection, respectively. The omega ratio is commonly calculated as an average over sites. As every functional protein has some amino acid sites under selective constraints, averaging rates across sites leads to low power to detect positive selection. Recently developed models of codon substitution allow the omega ratio to vary among sites and appear to be powerful in detecting positive selection in empirical data analysis. In this study, we used computer simulation to investigate the accuracy and power of the likelihood ratio test (LRT) in detecting positive selection at amino acid sites. The test compares two nested models: one that allows for sites under positive selection (with omega > 1), and another that does not, with the χ² distribution used for significance testing. We found that use of the χ² distribution makes the test conservative, especially when the data contain very short and highly similar sequences. Nevertheless, the LRT is powerful. Although the power can be low with only 5 or 6 sequences in the data, it was nearly 100% in data sets of 17 sequences. Sequence length, sequence divergence, and the strength of positive selection also were found to affect the power of the LRT. The exact distribution assumed for the omega ratio over sites was found not to affect the effectiveness of the LRT.
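Illustrative note (not part of the abstract): the nested-model test described above can be sketched in a few lines. This is a minimal sketch, assuming the two maximized log-likelihoods are already available from a codon-model fit. The function names and the log-likelihood values in the demo are hypothetical, and the closed-form χ² tail is implemented only for 1 and 2 degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (closed forms for df = 1 or 2)."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch supports only df = 1 or df = 2")


def lrt(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for two nested models."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)


def classify_omega(omega, tol=1e-9):
    """Map omega = dN/dS to the selection regime named in the abstract."""
    if omega > 1.0 + tol:
        return "positive (diversifying) selection"
    if omega < 1.0 - tol:
        return "purifying (negative) selection"
    return "neutral evolution"


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    stat, p = lrt(lnl_null=-1000.0, lnl_alt=-996.0, df=2)
    print(f"2*dlnL = {stat:.2f}, p = {p:.4f}")
    print(classify_omega(1.8))
```

Because the abstract reports that the χ² approximation makes the test conservative, a p-value computed this way is best read as an upper bound on the evidence against the null model rather than an exact figure.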

5.
M. J. Mackinnon  MAJ. Georges 《Genetics》1992,132(4):1177-1185
The effects of within-sample selection on the outcome of analyses detecting linkage between genetic markers and quantitative traits were studied. It was found that selection by truncation for the trait of interest significantly reduces the differences between marker genotype means thus reducing the power to detect linked quantitative trait loci (QTL). The size of this reduction is a function of proportion selected, the magnitude of the QTL effect, recombination rate between the marker locus and the QTL, and the allele frequency of the QTL. Proportion selected was the most influential of these factors on bias, e.g., for an allele substitution effect of one standard deviation unit, selecting the top 80%, 50% or 20% of the population required 2, 6 or 24 times the number of progeny, respectively, to offset the loss of power caused by this selection. The effect on power was approximately linear with respect to the size of gene effect, almost invariant to recombination rate, and a complex function of QTL allele frequency. It was concluded that experimental samples from animal populations which have been subjected to even minor amounts of selection will be inefficient in yielding information on linkage between markers and loci influencing the quantitative trait under selection.

6.
The advent of molecular genetic markers has stimulated interest in detecting linkage between a marker locus and a quantitative trait locus (QTL) because the marker locus, even without direct effect on the quantitative trait, could be useful in increasing the response to selection. A correlation method for detecting and estimating linkage between a marker locus and a QTL is described using selfing and sib-mating populations. Computer simulations were performed to estimate the power of the method, the sample size (N) needed to detect linkage, and the recombination value (r). The power of this method was a function of the expected recombination value E(r), the standardized difference (d) between the QTL genotypic means, and N. The power was highest at complete linkage, decreased with an increase in E(r), and then increased at E(r)=0.5. A larger d and N led to a higher power. The sample size needed to detect linkage was dependent upon E(r) and d. The sample size had a minimum value at E(r)=0, increased with an increase in E(r) and a decrease in d. In general, the r was overestimated. With an increase in d, the r was closer to its expectation. Detection of linkage by the proposed method under incomplete linkage was more efficient than estimation of recombination values. The correlation method and the method of comparison of marker-genotype means have a similar power when there is linkage, but the former has a slightly higher power than the latter when there is no linkage.

7.
Cutter AD 《Genetics》2008,178(3):1661-1672
Natural selection and neutral processes such as demography, mutation, and gene conversion all contribute to patterns of polymorphism within genomes. Identifying the relative importance of these varied components in evolution provides the principal challenge for population genetics. To address this issue in the nematode Caenorhabditis remanei, I sampled nucleotide polymorphism at 40 loci across the X chromosome. The site-frequency spectrum for these loci provides no evidence for population size change, and one locus presents a candidate for linkage to a target of balancing selection. Selection for codon usage bias leads to the non-neutrality of synonymous sites, and despite its weak magnitude of effect (N_e s ≈ 0.1), is responsible for profound patterns of diversity and divergence in the C. remanei genome. Although gene conversion is evident for many loci, biased gene conversion is not identified as a significant evolutionary process in this sample. No consistent association is observed between synonymous-site diversity and linkage-disequilibrium-based estimators of the population recombination parameter, despite theoretical predictions about background selection or widespread genetic hitchhiking, but genetic map-based estimates of recombination are needed to rigorously test for a diversity-recombination relationship. Coalescent simulations also illustrate how a spurious correlation between diversity and linkage-disequilibrium-based estimators of recombination can occur, due in part to the presence of unbiased gene conversion. These results illustrate the influence that subtle natural selection can exert on polymorphism and divergence, in the form of codon usage bias, and demonstrate the potential of C. remanei for detecting natural selection from genomic scans of polymorphism.

8.
According to population genetics models, genomic regions with lower crossing-over rates are expected to experience less effective selection because of Hill-Robertson interference (HRi). The effect of genetic linkage is thought to be particularly important for a selection of weak intensity such as selection affecting codon usage. Consistent with this model, codon bias correlates positively with recombination rate in Drosophila melanogaster and Caenorhabditis elegans. However, in these species, the G+C content of both noncoding DNA and synonymous sites correlates positively with recombination, which suggests that mutation patterns and recombination are associated. To remove this effect of mutation patterns on codon bias, we used the synonymous sites of lowly expressed genes that are expected to be effectively neutral sites. We measured the differences between codon biases of highly expressed genes and their lowly expressed neighbors. In D. melanogaster we find that HRi weakly reduces selection on codon usage of genes located in regions of very low recombination; but these genes only comprise 4% of the total. In C. elegans we do not find any evidence for the effect of recombination on selection for codon bias. Computer simulations indicate that HRi poorly enhances codon bias if the local recombination rate is greater than the mutation rate. This prediction of the model is consistent with our data and with the current estimate of the mutation rate in D. melanogaster. The case of C. elegans, which is highly self-fertilizing, is discussed. Our results suggest that HRi is a minor determinant of variations in codon bias across the genome.

9.
A simple nearly neutral mutation model of protein evolution was studied using computer simulation assuming a constant population size. In this model, a gene consists of a finite number of codons and there is no recombination within a gene. Each codon has two replacement sites and one silent site. The fitness of a gene was determined multiplicatively by amino acids specified by codons (the independent multicodon model). Nucleotide diversity at replacement sites decreases as selection becomes stronger. A reduction of nucleotide diversity at silent sites also occurs as selection intensifies but the magnitude of the reduction is not a monotone function of the intensity of selection. The dispersion index is close to one. The average values of Tajima's and Fu and Li's statistics are negative and their absolute values increase as selection intensifies. However, their power to detect selection under the present model was not high unless the number of sites is large or the mutation rate is high. The MK test was shown to detect intermediate selection fairly well. For comparison, the house-of-cards model was also investigated and its behavior was shown to be more sensitive to changes of population size than that of the independent multicodon model. The relevance of the present model for explaining protein evolution was discussed by comparing its predictions with recent DNA data. Received: 24 May 1999 / Accepted: 17 August 1999

10.
Brian Charlesworth 《Genetics》2013,194(4):955-971
Genomic traits such as codon usage and the lengths of noncoding sequences may be subject to stabilizing selection rather than purifying selection. Mutations affecting these traits are often biased in one direction. To investigate the potential role of stabilizing selection on genomic traits, the effects of mutational bias on the equilibrium value of a trait under stabilizing selection in a finite population were investigated, using two different mutational models. Numerical results were generated using a matrix method for calculating the probability distribution of variant frequencies at sites affecting the trait, as well as by Monte Carlo simulations. Analytical approximations were also derived, which provided useful insights into the numerical results. A novel conclusion is that the scaled intensity of selection acting on individual variants is nearly independent of the effective population size over a wide range of parameter space and is strongly determined by the logarithm of the mutational bias parameter. This is true even when there is a very small departure of the mean from the optimum, as is usually the case. This implies that studies of the frequency spectra of DNA sequence variants may be unable to distinguish between stabilizing and purifying selection. A similar investigation of purifying selection against deleterious mutations was also carried out. Contrary to previous suggestions, the scaled intensity of purifying selection with synergistic fitness effects is sensitive to population size, which is inconsistent with the general lack of sensitivity of codon usage to effective population size.

11.
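Human admixed populations can be regarded as a special case of mixed populations. Assuming an infinite, randomly mating population with no selection and no mutation, we studied how the gene frequencies of the parental populations affect the linkage disequilibrium structure of an admixed population and of its descendant populations, derived expressions for the linkage disequilibrium value in each population, and established a simple method for estimating the recombination rate between genes. In addition, using a statistical method for estimating the linkage disequilibrium coefficient between a molecular marker and a QTL, we analyzed the relationship between QTL detection and estimation in human admixed populations and their descendant populations, and established a series of theoretical formulas for this relationship. The results indicate that the method is suitable not only for mapping human disease genes (including genes underlying complex genetic diseases) but also for mapping normal human genes, and for QTL analysis of ordinary human polygenic traits.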
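Illustrative note (not part of the abstract): the abstract does not reproduce its formulas, so the sketch below uses the standard textbook result instead. It assumes two parental populations that are each in linkage equilibrium, a single admixture event with proportion m from the first population, and random mating afterwards. Under those assumptions the initial disequilibrium is D0 = m(1 - m)(p1 - p2)(q1 - q2), and it decays by a factor (1 - r) per generation. All function and parameter names are hypothetical.

```python
def admixture_ld(m, p1, p2, q1, q2):
    """Initial LD created by admixture: D0 = m(1-m)(p1-p2)(q1-q2).

    m       proportion of the admixed population drawn from parental population 1
    p1, p2  allele frequency at locus A in parental populations 1 and 2
    q1, q2  allele frequency at locus B in parental populations 1 and 2
    """
    return m * (1.0 - m) * (p1 - p2) * (q1 - q2)


def ld_after(d0, r, generations):
    """LD after t generations of random mating: D_t = (1 - r)^t * D0."""
    return d0 * (1.0 - r) ** generations


def recombination_from_decay(d0, dt, generations):
    """Invert the decay law to estimate r from LD measured t generations apart."""
    if d0 == 0 or dt / d0 <= 0:
        raise ValueError("decay ratio must be positive")
    return 1.0 - (dt / d0) ** (1.0 / generations)


if __name__ == "__main__":
    d0 = admixture_ld(m=0.5, p1=0.9, p2=0.1, q1=0.8, q2=0.2)
    d2 = ld_after(d0, r=0.1, generations=2)
    print(d0, d2, recombination_from_decay(d0, d2, 2))
```

The inversion in `recombination_from_decay` is one simple way to recover a recombination rate from disequilibrium in successive generations. Whether it matches the estimator proposed in the paper is not stated in the abstract.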

12.
The composite-likelihood estimator (CLE) of the population recombination rate considers only sites with exactly two alleles under a finite-sites mutation model (McVean, G. A. T., P. Awadalla, and P. Fearnhead. 2002. A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics 160:1231-1241). While in such a model the identity of alleles is not considered, the CLE has been shown to be robust to minor misspecification of the underlying mutational model. However, there are many situations where the putative mutation and demographic history can be quite complex. One good example is rapidly evolving pathogens, like HIV-1. First we evaluated the performance of the CLE and the likelihood permutation test (LPT) under more complex, realistic models, including a general time reversible (GTR) substitution model, rate heterogeneity among sites (Gamma), positive selection, population growth, population structure, and noncontemporaneous sampling. Second, we relaxed some of the assumptions of the CLE allowing for a four-allele, GTR + Gamma model in an attempt to use the data more efficiently. Through simulations and the analysis of real data, we concluded that the CLE is robust to severe misspecifications of the substitution model, but underestimates the recombination rate in the presence of exponential growth, population mixture, selection, or noncontemporaneous sampling. In such cases, the use of more complex models slightly increases performance on some occasions, especially in the case of the LPT. Thus, our results provide for a more robust application of the estimation of recombination rates.

13.
Stochastic simulations of the infinite sites model were used to study the behavior of genetic diversity at a neutral locus in a genomic region without recombination, but subject to selection against deleterious alleles maintained by recurrent mutation (background selection). In large populations, the effect of background selection on the number of segregating sites approaches the effect on nucleotide site diversity, i.e., the reduction in genetic variability caused by background selection resembles that caused by a simple reduction in effective population size. We examined, by coalescence-based methods, the power of several tests for the departure from neutral expectation of the frequency spectra of alleles in samples from randomly mating populations (TAJIMA's, FU and LI's, and WATTERSON's tests). All of the tests have low power unless the selection against mutant alleles is extremely weak. In Drosophila, significant TAJIMA's tests are usually not obtained with empirical data sets from loci in genomic regions with restricted recombination frequencies and that exhibit low genetic diversity. This is consistent with the operation of background selection as opposed to selective sweeps. It remains to be decided whether background selection is sufficient to explain the observed extent of reduction in diversity in regions of restricted recombination.
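Illustrative note (not part of the abstract): Tajima's test contrasts the two diversity measures mentioned above, the number of segregating sites and nucleotide diversity. The sketch below computes Tajima's D from the standard published formula (Tajima 1989). It assumes the sample size, the number of segregating sites, and the mean pairwise difference have already been computed, and the function name is hypothetical.

```python
import math


def tajimas_d(n, seg_sites, pi):
    """Tajima's D from sample size n, segregating sites S, and mean pairwise difference pi."""
    if n < 4:
        # For n = 2 or 3 the variance term is identically zero, so D is undefined.
        raise ValueError("Tajima's D needs at least 4 sequences")
    if seg_sites < 1:
        raise ValueError("Tajima's D is undefined without segregating sites")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    # Numerator: difference between the two estimators of theta.
    diff = pi - seg_sites / a1
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return diff / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical sample: 10 sequences, 10 segregating sites, pi = 2.0.
    print(tajimas_d(10, 10, 2.0))
```

A negative D indicates an excess of low-frequency variants relative to the neutral expectation, which is the direction expected under selection against deleterious alleles. The abstract's point is that this signal is often too weak to reach significance.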

14.
The MHC class II loci encoding cell surface antigens exhibit extremely high allelic polymorphism. There is considerable uncertainty in the literature over the relative roles of recombination and de novo mutation in generating this diversity. We studied class II sequence diversity and allelic polymorphism in two populations of Peromyscus maniculatus, which are among the most widespread and abundant mammals of North America. We find that intragenic recombination (or gene conversion) has been the predominant mode for the generation of allelic polymorphism in this species, with the amount of population recombination per base pair exceeding mutation by at least an order of magnitude during the history of the sample. Despite this, patchwork motifs of sites with high linkage disequilibrium are observed. This does not appear to be consistent with the much larger amount of recombination versus mutation in the history of the sample, unless the recombination rate is highly non-uniform over the sequence or selection maintains certain sites in linkage disequilibrium. We conclude that selection is most likely to be responsible for preserving sequence motifs in the presence of abundant recombination.

15.
The efficient design of association mapping studies relies on a knowledge of the rate of decay of linkage disequilibrium with distance. This rate depends on the population recombination rate, C. An estimate of C for humans is usually obtained from a comparison of physical and genetic maps, assuming an effective population size of approximately 10^4. We demonstrate that under both a constant population size model and a model of long-term exponential growth, there is evidence for more recombination in polymorphism data than is expected from this estimate. An important contribution of gene conversion to meiotic recombination helps to explain our observation, but does not appear to be sufficient. The occurrence of multiple hits at CpG sites and the presence of population structure are not explanations.
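Illustrative note (not part of the abstract): the map-based estimate of C described above amounts to C = 4 N_e c, where c is the per-generation recombination rate between two sites. The sketch below works through that arithmetic. The rate of 1e-8 per base pair (roughly 1 cM/Mb) is an assumed round figure, and the 1 / (1 + C) expression for expected r² is a classical large-sample approximation (Sved 1971) used here only to show how decay with distance follows from C. It is not the method of the paper.

```python
def population_recombination_rate(n_e, c_per_bp, distance_bp=1):
    """C = 4 * N_e * c, scaled to a physical distance in base pairs."""
    return 4.0 * n_e * c_per_bp * distance_bp


def expected_r2(big_c):
    """Approximate expected r^2 between two sites: 1 / (1 + C)."""
    return 1.0 / (1.0 + big_c)


if __name__ == "__main__":
    n_e = 1e4        # effective population size quoted in the abstract
    c_per_bp = 1e-8  # assumed rate, about 1 cM/Mb (hypothetical round figure)
    for distance in (100, 2500, 10000, 100000):
        big_c = population_recombination_rate(n_e, c_per_bp, distance)
        print(f"{distance:>7} bp  C = {big_c:8.3f}  E[r^2] ~ {expected_r2(big_c):.3f}")
```

If polymorphism data imply more recombination than this map-based C, as the abstract reports, linkage disequilibrium would decay over shorter distances than the table printed by this sketch suggests.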

16.
Comeron JM  Kreitman M 《Genetics》2000,156(3):1175-1190
Intron length is negatively correlated with recombination in both Drosophila melanogaster and humans. This correlation is not likely to be the result of mutational processes alone: evolutionary analysis of intron length polymorphism in D. melanogaster reveals equivalent ratios of deletion to insertion in regions of high and low recombination. The polymorphism data do reveal, however, an excess of deletions relative to insertions (i.e., a deletion bias), with an overall deletion-to-insertion events ratio of 1.35. We propose two types of selection favoring longer intron lengths. First, the natural mutational bias toward deletion must be opposed by strong selection in very short introns to maintain the minimum intron length needed for the intron splicing reaction. Second, selection will favor insertions in introns that increase recombination between mutations under the influence of selection in adjacent exons. Mutations that increase recombination, even slightly, will be selectively favored because they reduce interference among selected mutations. Interference selection acting on intron length mutations must be very weak, as indicated by frequency spectrum analysis of Drosophila intron length polymorphism, making the equilibrium for intron length sensitive to changes in the recombinational environment and population size. One consequence of this sensitivity is that the advantage of longer introns is expected to decrease inversely with the rate of recombination, thus leading to a negative correlation between intron length and recombination rate. Also in accord with this model, intron length differs between closely related Drosophila species, with the longest variant present more often in D. melanogaster than in D. simulans. We suggest that the study of the proposed dynamic model, taking into account interference among selected sites, might shed light on many aspects of the comparative biology of genome sizes including the C value paradox.

17.
Stabilizing selection around a fixed phenotypic optimum is expected to disfavor sexual reproduction, since asexually reproducing organisms can maintain a higher fitness at equilibrium, while sex disrupts combinations of compensatory mutations. This conclusion rests on the assumption that mutational effects on phenotypic traits are unbiased, that is, mutation does not tend to push phenotypes in any particular direction. In this article, we consider a model of stabilizing selection acting on an arbitrary number of polygenic traits coded by biallelic loci, and show that mutational bias may greatly reduce the mean fitness of asexual populations compared with sexual ones in regimes where mutations have weak to moderate fitness effects. Indeed, mutation and drift tend to push the population mean phenotype away from the optimum, this effect being enhanced by the low effective population size of asexual populations. In a second part, we present results from individual-based simulations showing that positive rates of sex are favored when mutational bias is present, while the population evolves toward complete asexuality in the absence of bias. We also present analytical (QLE) approximations for the selective forces acting on sex in terms of the effect of sex on the mean and variance in fitness among offspring.

18.
Kao CH 《Genetics》2000,156(2):855-865
The differences between maximum-likelihood (ML) and regression (REG) interval mapping in the analysis of quantitative trait loci (QTL) are investigated analytically and numerically by simulation. The analytical investigation is based on the comparison of the solution sets of the ML and REG methods in the estimation of QTL parameters. Their differences are found to relate to the similarity between the conditional posterior and conditional probabilities of QTL genotypes and depend on several factors, such as the proportion of variance explained by QTL, relative QTL position in an interval, interval size, difference between the sizes of QTL, epistasis, and linkage between QTL. The differences in mean squared error (MSE) of the estimates, likelihood-ratio test (LRT) statistics in testing parameters, and power of QTL detection between the two methods become larger as (1) the proportion of variance explained by QTL becomes higher, (2) the QTL locations are positioned toward the middle of intervals, (3) the QTL are located in wider marker intervals, (4) epistasis between QTL is stronger, (5) the difference between QTL effects becomes larger, and (6) the positions of QTL get closer in QTL mapping. The REG method is biased in the estimation of the proportion of variance explained by QTL, and it may have a serious problem in detecting closely linked QTL when compared to the ML method. In general, the differences between the two methods may be minor, but can be significant when QTL interact or are closely linked. The ML method tends to be more powerful and to give estimates with smaller MSEs and larger LRT statistics. This implies that ML interval mapping can be more accurate, precise, and powerful than REG interval mapping. The REG method is faster in computation, especially when the number of QTL considered in the model is large. Recognizing the factors that affect the differences between REG and ML interval mapping can help in outlining an efficient strategy that uses both methods in QTL mapping.

19.
A contentious issue in molecular evolution and population genetics concerns the roles of recombination as a facilitator of natural selection and as a potential source of mutational input into genomes. The budding yeast Saccharomyces cerevisiae, in particular, has injected both insights and confusion into this topic, as an early system subject to genomic analysis with subsequent conflicting reports. Here, we revisit the role of recombination in mutation and selection with recent genome-wide maps of population polymorphism and recombination for S. cerevisiae. We confirm that recombination-associated mutation does not leave a genomic signature in yeast and conclude that a previously observed, enigmatic, negative recombination-divergence correlation is largely a consequence of weak selection and other genomic covariates. We also corroborate the presence of biased gene conversion from patterns of polymorphism. Moreover, we identify significant positive relations between recombination and population polymorphism at putatively neutrally evolving sites, independent of other factors and the genomic scale of interrogation. We conclude that widespread natural selection across the yeast genome has left its imprint on segregating genetic variation, but that this signature is much weaker than in Drosophila and Caenorhabditis.

20.
We explore factors affecting patterns of polymorphism and divergence (as captured by the neutrality index) at mammalian mitochondrial loci. To do this, we develop a population genetic model that incorporates a fraction of neutral amino acid sites, mutational bias, and a probability distribution of selection coefficients against new nonsynonymous mutations. We confirm, by reanalyzing publicly available datasets, that the mitochondrial cyt-b gene shows a broad range of neutrality indices across mammalian taxa, and explore the biological factors that can explain this observation. We find that observed patterns of differences in the neutrality index, polymorphism, and divergence are not caused by differences in mutational bias. They can, however, be explained by a combination of a small fraction of neutral amino acid sites, weak selection acting on most amino acid mutations, and differences in effective population size among taxa.
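Illustrative note (not part of the abstract): the neutrality index mentioned above is conventionally defined from the four counts of a McDonald-Kreitman table as NI = (Pn / Ps) / (Dn / Ds). The sketch below shows that arithmetic only. The function name and the counts in the demo are hypothetical, and the population genetic model developed in the paper is not reproduced here.

```python
def neutrality_index(pn, ps, dn, ds):
    """NI = (Pn / Ps) / (Dn / Ds), computed as (Pn * Ds) / (Ps * Dn).

    pn, ps  nonsynonymous and synonymous polymorphisms within a species
    dn, ds  nonsynonymous and synonymous fixed differences between species
    """
    if ps == 0 or dn == 0:
        raise ValueError("NI is undefined when Ps or Dn is zero")
    return (pn * ds) / (ps * dn)


def interpret(ni):
    """Conventional reading of NI relative to the neutral expectation of 1."""
    if ni > 1:
        return "excess amino acid polymorphism (consistent with weak purifying selection)"
    if ni < 1:
        return "excess amino acid divergence (consistent with positive selection)"
    return "matches the neutral expectation"


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    ni = neutrality_index(pn=10, ps=20, dn=5, ds=40)
    print(ni, interpret(ni))
```

Values above 1 are the pattern usually attributed to weakly deleterious amino acid variants that segregate but rarely fix, which is in line with the abstract's conclusion that weak selection and effective population size shape the index.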
