Similar literature
 20 similar articles found (search time: 31 ms)
1.
Contrasting the efficacy of selection on the X and autosomes in Drosophila   Total citations: 1 (self-citations: 0, citations by others: 1)
To investigate the relative efficacy of both positive and purifying natural selection on the X chromosome and the autosomes in Drosophila, we compared rates and patterns of molecular evolution between these chromosome sets using the newly available alignments of orthologous genes from 12 species. Parameters that may influence the relative X versus autosomal substitution rates include the relative effective population sizes, the male and female germline mutation rates, the distribution of allelic effects on fitness, and the degree of dominance of novel mutations. Our analysis reveals that codon usage bias is consistently greater for X-linked genes, suggesting that purifying selection consistently has greater efficacy on the X chromosome than on the autosomes across the Drosophila phylogeny. However, our results are less consistent with respect to the efficacy of positive selection, with only some lineages showing a higher substitution rate on the X chromosome. This suggests that either the distribution of selective effects of mutations or other relevant parameters are sufficiently variable across species to tip the balance in different ways in individual lineages. These data suggest that rates of substitution are not solely governed by adaptive evolution. This genome-wide analysis provides a clear picture that the efficacy of selection varies intragenomically and that this effect is markedly more consistent across the phylogeny in the case of purifying selection. Our results also suggest that simple models that predict systematic differences in rates of evolution between the X and the autosomes can only be made to be compatible with these Drosophila data if the relevant population genetic parameters that drive substitution rates differ among species and chromosomal contexts.

2.
We use a likelihood-based method for mapping mutations on a phylogeny in a way that allows for both site-specific and lineage-specific variation in selection intensity. The method accounts for many of the potential sources of bias encountered in mapping of mutations on trees while still being computationally efficient. We apply the method to a previously published influenza data set to investigate hypotheses about changes in selection intensity in influenza strains. Influenza virus is sometimes propagated in chicken cells for several generations before sequencing, a process that has been hypothesized to induce mutations adapting the virus to the lab medium. Our analysis suggests that there are approximately twice as many replacement substitutions in lineages propagated in chicken eggs as in lineages that are not. Previous studies have attempted to predict which viral strains future epidemics may arise from using inferences regarding positive selection. The assumption is that future epidemics are more likely to arise from the strains in which positive selection on the so-called “trunk lineages” of the evolutionary tree is most pervasive. However, we find no difference in the strength of selection in the trunk lineages versus other evolutionary lineages. Our results suggest that it may be more difficult to use inferences regarding the strength of selection on mutations to make predictions regarding viral epidemics than previously thought. Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users. Reviewing Editor: Dr. Willie Swanson

3.
Agrawal AF, Whitlock MC. Genetics, 2011, 187(2): 553-566
Data from several thousand knockout mutations in yeast (Saccharomyces cerevisiae) were used to estimate the distribution of dominance coefficients. We propose a new unbiased likelihood approach to measuring dominance coefficients. On average, deleterious mutations are partially recessive, with a mean dominance coefficient of ~0.2. Alleles with large homozygous effects are more likely to be more recessive than are alleles of weaker effect. Our approach allows us to quantify, for the first time, the substantial variance and skew in the distribution of dominance coefficients. This heterogeneity is so great that many analyses of population genetic processes based on the mean dominance coefficient alone will be in substantial error. These results are applied to the debate about various mechanisms for the evolution of dominance, and we conclude that they are most consistent with models that depend on indirect selection on homeostatic gene expression or on the ability to perform well under periods of high demand for a protein.
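For readers unfamiliar with the term, the dominance coefficient in item 3 can be read against the usual textbook parameterization of genotype fitnesses. The symbols below (wild-type allele A, mutant allele a, homozygous effect s, dominance coefficient h) follow a common convention and are an assumption here; the abstract itself does not spell out its notation.

```latex
w_{AA} = 1, \qquad w_{Aa} = 1 - hs, \qquad w_{aa} = 1 - s
```

Under this convention h = 0 is fully recessive, h = 0.5 is additive, and h = 1 is fully dominant. A mean of h ≈ 0.2 would mean that, on average, a heterozygote carries about one fifth of the homozygous fitness cost; for an illustrative s = 0.1 the heterozygote fitness would be about 0.98.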

4.
Amei A, Sawyer S. PLoS ONE, 2012, 7(4): e34413
We apply a recently developed time-dependent Poisson random field model to aligned DNA sequences from two related biological species to estimate selection coefficients and divergence time. We use Markov chain Monte Carlo methods to estimate species divergence time and selection coefficients for each locus. The model assumes that the selective effects of non-synonymous mutations are normally distributed across genetic loci but constant within loci, and synonymous mutations are selectively neutral. In contrast with previous models, we do not assume that the individual species are at population equilibrium after divergence. Using a data set of 91 genes in two Drosophila species, D. melanogaster and D. simulans, we estimate the species divergence time $t_{div} = 2.16 N_e$ (or 1.68 million years, assuming the haploid effective population size $N_e = 6.45 \times 10^5$ years) and a mean selection coefficient per generation $\mu_\gamma = 1.98/N_e$. Although the average selection coefficient is positive, the magnitude of the selection is quite small. Results from numerical simulations are also presented as an accuracy check for the time-dependent model.

5.
We present a likelihood method for estimating codon usage bias parameters along the lineages of a phylogeny. The method is an extension of the classical codon-based models used for estimating dN/dS ratios along the lineages of a phylogeny. However, we add one extra parameter for each lineage: the selection coefficient for optimal codon usage (S), allowing joint maximum likelihood estimation of S and the dN/dS ratio. We apply the method to previously published data from Drosophila melanogaster, Drosophila simulans, and Drosophila yakuba and show, in accordance with previous results, that the D. melanogaster lineage has experienced a reduction in the selection for optimal codon usage. However, the D. melanogaster lineage has also experienced a change in the biological mutation rates relative to D. simulans, in particular, a relative reduction in the mutation rate from A to G and an increase in the mutation rate from C to T. However, neither a reduction in the strength of selection nor a change in the mutational pattern can alone explain all of the data observed in the D. melanogaster lineage. For example, we also confirm previous results showing that the Notch locus has experienced positive selection for previously classified unpreferred mutations.

6.
Recent genome sequencing studies with large sample sizes in humans have discovered a vast quantity of low-frequency variants, providing an important source of information to analyze how selection is acting on human genetic variation. In order to estimate the strength of natural selection acting on low-frequency variants, we have developed a likelihood-based method that uses the lengths of pairwise identity-by-state between haplotypes carrying low-frequency variants. We show that in some nonequilibrium populations (such as those that have had recent population expansions) it is possible to distinguish between positive or negative selection acting on a set of variants. With our new framework, one can infer a fixed selection intensity acting on a set of variants at a particular frequency, or a distribution of selection coefficients for standing variants and new mutations. We show an application of our method to the UK10K phased haplotype dataset of individuals.

7.
8.
The distribution of fitness effects (DFE) of new mutations is of fundamental importance in evolutionary genetics. Recently, methods have been developed for inferring the DFE that use information from the allele frequency distributions of putatively neutral and selected nucleotide polymorphic variants in a population sample. Here, we extend an existing maximum-likelihood method that estimates the DFE under the assumption that mutational effects are unconditionally deleterious, by including a fraction of positively selected mutations. We allow one or more classes of positive selection coefficients in the model and estimate both the fraction of mutations that are advantageous and the strength of selection acting on them. We show by simulations that the method is capable of recovering the parameters of the DFE under a range of conditions. We apply the method to two data sets on multiple protein-coding genes from African populations of Drosophila melanogaster. We use a probabilistic reconstruction of the ancestral states of the polymorphic sites to distinguish between derived and ancestral states at polymorphic nucleotide sites. In both data sets, we see a significant improvement in the fit when a category of positively selected amino acid mutations is included, but no further improvement if additional categories are added. We estimate that between 1% and 2% of new nonsynonymous mutations in D. melanogaster are positively selected, with a scaled selection coefficient representing the product of the effective population size, $N_e$, and the strength of selection on heterozygous carriers of ~2.5.

9.
We studied the evolution of the HA1 domain of the H3 hemagglutinin gene from human influenza virus type A. The phylogeny of these genes showed a single dominant lineage persisting over time. We tested the hypothesis that the progenitors of this single evolutionarily successful lineage were viruses carrying mutations at codons at which prior mutations had helped the virus to avoid human immune surveillance. We found evidence that eighteen hemagglutinin codons appeared to have been under positive selection to change the amino acid they encoded in the past. Retrospective tests show that viral lineages undergoing the greatest number of mutations in the positively selected codons were the progenitors of future H3 lineages in nine of eleven recent influenza seasons. Codons under positive selection were associated with antibody combining sites A or B or the sialic acid receptor binding site. However, not all codons in these sites had predictive value. Monitoring new H3 isolates for additional changes in positively selected codons might help identify the most fit extant viral strains that arise during antigenic drift.

10.
Tamuri AU, dos Reis M, Goldstein RA. Genetics, 2012, 190(3): 1101-1115
Estimation of the distribution of selection coefficients of mutations is a long-standing issue in molecular evolution. In addition to population-based methods, the distribution can be estimated from DNA sequence data by phylogenetic-based models. Previous models have generally found unimodal distributions where the probability mass is concentrated between mildly deleterious and nearly neutral mutations. Here we use a sitewise mutation-selection phylogenetic model to estimate the distribution of selection coefficients among novel and fixed mutations (substitutions) in a data set of 244 mammalian mitochondrial genomes and a set of 401 PB2 proteins from influenza. We find a bimodal distribution of selection coefficients for novel mutations in both the mitochondrial data set and for the influenza protein evolving in its natural reservoir, birds. Most of the mutations are strongly deleterious with the rest of the probability mass concentrated around mildly deleterious to neutral mutations. The distribution of the coefficients among substitutions is unimodal and symmetrical around nearly neutral substitutions for both data sets at adaptive equilibrium. About 0.5% of the nonsynonymous mutations and 14% of the nonsynonymous substitutions in the mitochondrial proteins are advantageous, with 0.5% and 24% observed for the influenza protein. Following a host shift of influenza from birds to humans, however, we find among novel mutations in PB2 a trimodal distribution with a small mode of advantageous mutations.

11.
The ratio of nonsynonymous (dN) to synonymous (dS) substitution rates, omega, provides a measure of selection at the protein level. Models have been developed that allow omega to vary among lineages. However, these models require the lineages in which differential selection has acted to be specified a priori. We propose a genetic algorithm approach to assign lineages in a phylogeny to a fixed number of different classes of omega, thus allowing variable selection pressure without a priori specification of particular lineages. This approach can identify models with a better fit than a single-ratio model, and with fits that are better than (in an information theoretic sense) a fully local model, in which all lineages are assumed to evolve under different values of omega, but with far fewer parameters. By averaging over models which explain the data reasonably well, we can assess the robustness of our conclusions to uncertainty in model estimation. Our approach can also be used to compare results from models in which branch classes are specified a priori with a wide range of credible models. We illustrate our methods on primate lysozyme sequences and compare them with previous methods applied to the same data sets.
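The comparison "in an information theoretic sense" and the averaging over well-fitting models in item 11 can be illustrated with a small sketch. The abstract does not name the criterion, so plain AIC and Akaike weights are assumed here as one common choice; the model names, log-likelihoods, and parameter counts are made-up illustrative values, not results from the paper.

```python
import math

def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_lik

def akaike_weights(aics: dict) -> dict:
    """Relative support for each model, normalized to sum to 1."""
    best = min(aics.values())
    raw = {name: math.exp(-(a - best) / 2) for name, a in aics.items()}
    total = sum(raw.values())
    return {name: r / total for name, r in raw.items()}

# Hypothetical (log-likelihood, number of omega parameters) per model.
models = {
    "single-ratio": (-1050.0, 1),   # one omega for all lineages
    "three-class": (-1040.0, 3),    # lineages assigned to three omega classes
    "fully-local": (-1035.0, 13),   # a separate omega for every lineage
}

aics = {name: aic(lnl, k) for name, (lnl, k) in models.items()}
weights = akaike_weights(aics)
for name in models:
    print(name, aics[name], round(weights[name], 4))
```

With these invented numbers the intermediate model wins: the fully local model fits slightly better in raw likelihood but pays for its extra parameters. The weights could then serve to average a quantity of interest across models.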

12.
Bollback JP, York TL, Nielsen R. Genetics, 2008, 179(1): 497-502
We develop a new method for estimating effective population sizes, $N_e$, and selection coefficients, $s$, from time-series data of allele frequencies sampled from a single diallelic locus. The method is based on calculating transition probabilities, using a numerical solution of the diffusion process, and assuming independent binomial sampling from this diffusion process at each time point. We apply the method in two example applications. First, we estimate selection coefficients acting on the CCR5-Δ32 mutation on the basis of published samples of contemporary and ancient human DNA. We show that the data are compatible with the assumption of $s = 0$, although moderate amounts of selection acting on this mutation cannot be excluded. In our second example, we estimate the selection coefficient acting on a mutation segregating in an experimental phage population. We show that the selection coefficient acting on this mutation is approximately 0.43.

13.
We develop a maximum penalized-likelihood (MPL) method to estimate the fitnesses of amino acids and the distribution of selection coefficients (S = 2Ns) in protein-coding genes from phylogenetic data. This improves on a previous maximum-likelihood method. Various penalty functions are used to penalize extreme estimates of the fitnesses, thus correcting overfitting by the previous method. Using a combination of computer simulation and real data analysis, we evaluate the effect of the various penalties on the estimation of the fitnesses and the distribution of S. We show the new method regularizes the estimates of the fitnesses for small, relatively uninformative data sets, but it can still recover the large proportion of deleterious mutations when present in simulated data. Computer simulations indicate that as the number of taxa in the phylogeny or the level of sequence divergence increases, the distribution of S can be more accurately estimated. Furthermore, the strength of the penalty can be varied to study how informative a particular data set is about the distribution of S. We analyze three protein-coding genes (the chloroplast rubisco protein, mammal mitochondrial proteins, and an influenza virus polymerase) and show the new method recovers a large proportion of deleterious mutations in these data, even under strong penalties, confirming the distribution of S is bimodal in these real data. We recommend the use of the new MPL approach for the estimation of the distribution of S in species phylogenies of protein-coding genes.

14.
Eyre-Walker A, Woolfit M, Phelps T. Genetics, 2006, 173(2): 891-900
The distribution of fitness effects of new mutations is a fundamental parameter in genetics. Here we present a new method by which the distribution can be estimated. The method is fairly robust to changes in population size and admixture, and it can be corrected for any residual effects if a model of the demography is available. We apply the method to extensively sampled single-nucleotide polymorphism data from humans and estimate the distribution of fitness effects for amino acid changing mutations. We show that a gamma distribution with a shape parameter of 0.23 provides a good fit to the data and we estimate that >50% of mutations are likely to have mild effects, such that they reduce fitness by between one one-thousandth and one-tenth. We also infer that <15% of new mutations are likely to have strongly deleterious effects. We estimate that on average a nonsynonymous mutation reduces fitness by a few percent and that the average strength of selection acting against a nonsynonymous polymorphism is approximately $9 \times 10^{-5}$. We argue that the relaxation of natural selection due to modern medicine and reduced variance in family size is not likely to lead to a rapid decline in genetic quality, but that it will be very difficult to locate most of the genes involved in complex genetic diseases.

15.
Epidemiological models have highlighted the importance of population structure in the transmission dynamics of infectious diseases. Using HIV-1 as an example of a model evolutionary system, we consider how population structure affects the shape and the structure of a viral phylogeny in the absence of strong selection at the population level. For structured populations, the number of lineages as a function of time is insufficient to describe the shape of the phylogeny. We develop deterministic approximations for the dynamics of tips of the phylogeny over evolutionary time, the number of ‘cherries’, tips that share a direct common ancestor, and Sackin's index, a commonly used measure of phylogenetic imbalance or asymmetry. We employ cherries both as a measure of asymmetry of the tree as well as a measure of the association between sequences from different groups. We consider heterogeneity in infectiousness associated with different stages of HIV infection, and in contact rates between groups of individuals. In the absence of selection, we find that population structure may have relatively little impact on the overall asymmetry of a tree, especially when only a small fraction of infected individuals is sampled, but may have marked effects on how sequences from different subpopulations cluster and co-cluster.
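The two tree statistics named in item 15 are easy to compute on a small example. The sketch below is an illustration only: the nested-tuple tree encoding and the function names are hypothetical, a cherry is counted as an internal node whose children are exactly two tips (following the abstract's description), and Sackin's index is taken in its commonly used form, the sum over tips of the number of edges between root and tip.

```python
from typing import Union

# Hypothetical encoding: a tip is a string label, an internal node is a tuple of subtrees.
Tree = Union[str, tuple]

def count_cherries(tree: Tree) -> int:
    """Count internal nodes whose children are exactly two tips."""
    if isinstance(tree, str):
        return 0
    if len(tree) == 2 and all(isinstance(child, str) for child in tree):
        return 1
    return sum(count_cherries(child) for child in tree)

def sackin_index(tree: Tree, depth: int = 0) -> int:
    """Sum, over all tips, of the number of edges from the root to the tip."""
    if isinstance(tree, str):
        return depth
    return sum(sackin_index(child, depth + 1) for child in tree)

balanced = (("a", "b"), ("c", "d"))       # fully symmetric four-tip tree
caterpillar = ((("a", "b"), "c"), "d")    # maximally asymmetric four-tip tree

print(count_cherries(balanced), sackin_index(balanced))        # 2 8
print(count_cherries(caterpillar), sackin_index(caterpillar))  # 1 9
```

For the same number of tips, the more asymmetric tree has fewer cherries and a larger Sackin's index, which is why both quantities can serve as imbalance measures.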

16.
Kai Zeng, Pádraic Corcoran. Genetics, 2015, 201(4): 1539-1554
It is well known that most new mutations that affect fitness exert deleterious effects and that natural populations are often composed of subpopulations (demes) connected by gene flow. To gain a better understanding of the joint effects of purifying selection and population structure, we focus on a scenario where an ancestral population splits into multiple demes and study neutral diversity patterns in regions linked to selected sites. In the background selection regime of strong selection, we first derive analytic equations for pairwise coalescent times and $F_{ST}$ as a function of time after the ancestral population splits into two demes and then construct a flexible coalescent simulator that can generate samples under complex models such as those involving multiple demes or nonconservative migration. We have carried out extensive forward simulations to show that the new methods can accurately predict diversity patterns both in the nonequilibrium phase following the split of the ancestral population and in the equilibrium between mutation, migration, drift, and selection. In the interference selection regime of many tightly linked selected sites, forward simulations provide evidence that neutral diversity patterns obtained from both the nonequilibrium and equilibrium phases may be virtually indistinguishable for models that have identical variance in fitness, but are nonetheless different with respect to the number of selected sites and the strength of purifying selection. This equivalence in neutral diversity patterns suggests that data collected from subdivided populations may have limited power for differentiating among the selective pressures to which closely linked selected sites are subject.

17.
Accurate estimates of genome-wide rates and fitness effects of new mutations are essential for an improved understanding of molecular evolutionary processes. Although eukaryotic genomes generally contain a large noncoding fraction, functional noncoding regions and fitness effects of mutations in such regions are still incompletely characterized. A promising approach to characterize functional noncoding regions relies on identifying accessible chromatin regions (ACRs) tightly associated with regulatory DNA. Here, we applied this approach to identify and estimate selection on ACRs in Capsella grandiflora, a crucifer species ideal for population genomic quantification of selection due to its favorable population demography. We describe a population-wide ACR distribution based on ATAC-seq data for leaf samples of 16 individuals from a natural population. We use population genomic methods to estimate fitness effects and proportions of positively selected fixations (α) in ACRs and find that intergenic ACRs harbor a considerable fraction of weakly deleterious new mutations, as well as a significantly higher proportion of strongly deleterious mutations than comparable inaccessible intergenic regions. ACRs are enriched for expression quantitative trait loci (eQTL) and depleted of transposable element insertions, as expected if intergenic ACRs are under selection because they harbor regulatory regions. By integrating empirical identification of intergenic ACRs with analyses of eQTL and population genomic analyses of selection, we demonstrate that intergenic regulatory regions are an important source of nearly neutral mutations. These results improve our understanding of selection on noncoding regions and the role of nearly neutral mutations for evolutionary processes in outcrossing Brassicaceae species.

18.
19.
Nielsen R. Genetics, 2001, 159(1): 401-411
This article describes a new Markov chain Monte Carlo (MCMC) method applicable to DNA sequence data, which treats mutations in the genealogy as missing data. The method facilitates inferences regarding the age and identity of specific mutations while taking the full complexities of the mutational process in DNA sequences into account. We demonstrate the utility of the method in three applications. First, we demonstrate how the method can be used to make inferences regarding population genetical parameters such as theta (the effective population size times the mutation rate). Second, we show how the method can be used to estimate the ages of mutations in finite sites models and for making inferences regarding the distribution and ages of nonsynonymous and synonymous mutations. The method is applied to two previously published data sets and we demonstrate that in one of the data sets the average age of nonsynonymous mutations is significantly lower than the average age of synonymous mutations, suggesting the presence of slightly deleterious mutations. Third, we demonstrate how the method in general can be used to evaluate the posterior distribution of a function of a mapping of mutations on a gene genealogy. This application is useful for evaluating the uncertainty associated with methods that rely on mapping mutations on a phylogeny or a gene genealogy.

20.
Iain Mathieson, Gil McVean. Genetics, 2013, 193(3): 973-984
Inferring the nature and magnitude of selection is an important problem in many biological contexts. Typically when estimating a selection coefficient for an allele, it is assumed that samples are drawn from a panmictic population and that selection acts uniformly across the population. However, these assumptions are rarely satisfied. Natural populations are almost always structured, and selective pressures are likely to act differentially. Inference about selection ought therefore to take account of structure. We do this by considering evolution in a simple lattice model of spatial population structure. We develop a hidden Markov model-based maximum-likelihood approach for estimating the selection coefficient in a single population from time series data of allele frequencies. We then develop an approximate extension of this to the structured case to provide a joint estimate of migration rate and spatially varying selection coefficients. We illustrate our method using classical data sets of moth pigmentation morph frequencies, but it has wide applications in settings ranging from ecology to human evolution.
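The single-population step described in item 20 can be sketched as a small hidden Markov model. This is a minimal illustration, not the authors' implementation: it assumes a haploid Wright-Fisher chain with a hypothetical population size `n` as the hidden process, one binomial sample per generation as the observation, a uniform prior on the initial allele count, and a coarse grid search in place of a proper optimizer. All function names and the example data are invented for illustration.

```python
import math

def wf_transition(n: int, s: float) -> list:
    """Haploid Wright-Fisher transition matrix with selection coefficient s (s > -1).
    Row i is the distribution of next-generation allele counts given i copies now."""
    rows = []
    for i in range(n + 1):
        x = i / n
        p = x * (1 + s) / (1 + s * x)  # allele frequency after selection
        rows.append([math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(n + 1)])
    return rows

def emission(k: int, m: int, x: float) -> float:
    """Binomial probability of seeing k copies in a sample of m at true frequency x."""
    return math.comb(m, k) * x**k * (1 - x)**(m - k)

def log_likelihood(samples: list, n: int, s: float) -> float:
    """Scaled forward algorithm; samples is a list of (k, m), one per generation."""
    trans = wf_transition(n, s)
    alpha = [1.0 / (n + 1)] * (n + 1)  # assumed uniform prior on the initial count
    loglik = 0.0
    for t, (k, m) in enumerate(samples):
        if t > 0:  # propagate the hidden allele count by one generation
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n + 1))
                     for j in range(n + 1)]
        alpha = [a * emission(k, m, j / n) for j, a in enumerate(alpha)]
        total = sum(alpha)
        loglik += math.log(total)
        alpha = [a / total for a in alpha]  # rescale to avoid underflow
    return loglik

def grid_mle(samples: list, n: int, grid: list) -> float:
    """Return the grid value of s with the highest log-likelihood."""
    return max(grid, key=lambda s: log_likelihood(samples, n, s))

# Invented time series: (copies observed, sample size) in six consecutive generations.
samples = [(2, 20), (4, 20), (7, 20), (10, 20), (14, 20), (17, 20)]
grid = [-0.5, -0.25, 0.0, 0.25, 0.5, 1.0, 1.5]
s_hat = grid_mle(samples, n=50, grid=grid)
print(s_hat)
```

Because the invented frequencies rise steadily, the grid estimate comes out positive. The structured-population extension mentioned in the abstract, with migration and spatially varying selection, is not attempted here.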
