Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Evolutionary pressures on proteins are often quantified by the ratio of substitution rates at non-synonymous and synonymous sites. The dN/dS ratio was originally developed for application to distantly diverged sequences, the differences among which represent substitutions that have fixed along independent lineages. Nevertheless, the dN/dS measure is often applied to sequences sampled from a single population, the differences among which represent segregating polymorphisms. Here, we study the expected dN/dS ratio for samples drawn from a single population under selection, and we find that in this context, dN/dS is relatively insensitive to the selection coefficient. Moreover, the hallmark signature of positive selection over divergent lineages, dN/dS>1, is violated within a population. For population samples, the relationship between selection and dN/dS does not follow a monotonic function, and so it may be impossible to infer selection pressures from dN/dS. These results have significant implications for the interpretation of dN/dS measurements among population-genetic samples.

2.
This is the first study to quantify genomic sequence variation of the major histocompatibility complex (MHC) in wild and ornamental guppies, Poecilia reticulata. We sequenced 196-219 bp of exon 2 MHC class IIB (DAB) in 56 wild Trinidadian guppies and 14 ornamental strain guppies. Each of two natural populations possessed high allelic richness (15-16 alleles), whereas only three or fewer DAB alleles were amplified from ornamental guppies. The disparity in allelic richness between wild and ornamental fish cannot be fully explained by fixation of alleles by inbreeding, nor by the presence of non-amplified sequences (i.e., null alleles). Rather, we suggest that the same allele is fixed at duplicated MHC DAB loci owing to gene conversion. Alternatively, the number of loci in the ornamental strains has contracted during >100 generations in captivity, a hypothesis consistent with the accordion model of MHC evolution. We furthermore analysed the substitution patterns by making pairwise comparisons of sequence variation at the putative peptide binding region (PBR). The rate of non-synonymous substitutions (dN) only marginally exceeded that of synonymous substitutions (dS) in PBR codons. Highly diverged sequences showed no evidence for diversifying selection, possibly because synonymous substitutions have accumulated since their divergence. Also, the substitution pattern of similar alleles did not show evidence for diversifying selection, plausibly because advantageous non-synonymous substitutions have not yet accumulated. Intermediately diverged sequences showed the highest relative rate of non-synonymous substitutions, with dN/dS>14 in some pairwise comparisons. Consequently, a curvilinear relationship was observed between the dN/dS ratio and the level of sequence divergence.

3.
The ratio of non-synonymous (dN) to synonymous (dS) changes between taxa is frequently computed to assay the strength and direction of selection. Here we note that, for comparisons between closely related strains and/or species, a second parameter needs to be considered, namely the time since divergence of the two sequences under scrutiny. We demonstrate that a simple time lag model provides a general, parsimonious explanation of the extensive variation in the dN/dS ratio seen when comparing closely related bacterial genomes. We explore this model through simulation and comparative genomics, and suggest a role for hitch-hiking in the accumulation of non-synonymous mutations. We also note taxon-specific differences in the change of dN/dS over time, which may indicate variation in selection, or in population genetics parameters such as population size or the rate of recombination. The effect of comparing intra-species polymorphism and inter-species substitution, and the problems associated with these concepts for asexual prokaryotes, are also discussed. We conclude that, because of the critical effect of time since divergence, inter-taxa comparisons are only possible by comparing trajectories of dN/dS over time and it is not valid to compare taxa on the basis of single time points.

4.
The selective pressure at the protein level is usually measured by the nonsynonymous/synonymous rate ratio (omega = dN/dS), with omega < 1, omega = 1, and omega > 1 indicating purifying (or negative) selection, neutral evolution, and diversifying (or positive) selection, respectively. The omega ratio is commonly calculated as an average over sites. As every functional protein has some amino acid sites under selective constraints, averaging rates across sites leads to low power to detect positive selection. Recently developed models of codon substitution allow the omega ratio to vary among sites and appear to be powerful in detecting positive selection in empirical data analysis. In this study, we used computer simulation to investigate the accuracy and power of the likelihood ratio test (LRT) in detecting positive selection at amino acid sites. The test compares two nested models: one that allows for sites under positive selection (with omega > 1), and another that does not, with the χ² distribution used for significance testing. We found that use of the χ² distribution makes the test conservative, especially when the data contain very short and highly similar sequences. Nevertheless, the LRT is powerful. Although the power can be low with only 5 or 6 sequences in the data, it was nearly 100% in data sets of 17 sequences. Sequence length, sequence divergence, and the strength of positive selection also were found to affect the power of the LRT. The exact distribution assumed for the omega ratio over sites was found not to affect the effectiveness of the LRT.
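
The omega thresholds and the nested-model test described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' simulation code: the function names are hypothetical, the log-likelihood values would come from a codon-model fitting program, and the choice of 2 degrees of freedom is an assumption (the abstract does not give the number of extra parameters). With 2 degrees of freedom the χ² tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def classify_omega(omega, tol=1e-9):
    """Map an omega (dN/dS) value to the regime named in the abstract."""
    if omega < 1.0 - tol:
        return "purifying"
    if omega > 1.0 + tol:
        return "positive"
    return "neutral"


def lrt_chi2_df2(lnl_null, lnl_alt):
    """Likelihood ratio test for two nested models.

    ASSUMPTION: the alternative model has exactly 2 more free parameters
    than the null model, so the statistic is referred to chi-square(2).
    Returns (test statistic, p-value).
    """
    # The alternative nests the null, so the statistic cannot be negative;
    # clamp small negative values caused by optimiser noise.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of chi-square with 2 df is exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    stat, p = lrt_chi2_df2(lnl_null=-1000.0, lnl_alt=-995.0)
    print(f"2*delta(lnL) = {stat:.2f}, p = {p:.4f}")
    print(classify_omega(0.2), classify_omega(1.0), classify_omega(2.5))
```

Because the abstract reports that the χ² approximation is conservative, a p-value computed this way would, if anything, understate the evidence for sites with omega > 1.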

5.
A common approach to estimating the strength and direction of selection acting on protein-coding sequences is to calculate the dN/dS ratio. The method has been widely used, and its application has been critically reviewed many times since it was proposed by Nei and Gojobori in 1986. However, the method is still evolving to account for non-uniform substitution rates and pretermination codons. In our study of SNPs in 586 genes across 156 Escherichia coli strains, synonymous polymorphism in 2-fold degenerate codons was higher than that in 4-fold degenerate codons, which could be attributed to the difference between transition (Ti) and transversion (Tv) substitution rates, where, in general, the average rate of a transition is four times more than that of a transversion. We considered both the Ti/Tv ratio and nonsense mutations in pretermination codons to improve estimates of synonymous (S) and non-synonymous (NS) sites. The accuracy of estimating dN/dS was improved by considering the Ti/Tv ratio and nonsense substitutions in pretermination codons. We showed that applying the modified approach based on the Ti/Tv ratio and pretermination codons results in higher values of dN/dS in 29 common genes of equal reading-frames between E. coli and Salmonella enterica. This study emphasizes the robustness of amino acid composition with varying codon degeneracy, as well as the pretermination codons when calculating dN/dS values.
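
The transition/transversion distinction that this abstract relies on can be made concrete with a short Python sketch. It is a minimal illustration, not the authors' modified dN/dS method: the function names are hypothetical, and the code only counts observed differences between two aligned sequences rather than estimating substitution rates. A transition stays within purines (A, G) or within pyrimidines (C, T); a transversion crosses between the two classes. In the standard genetic code, the synonymous alternative at a 2-fold degenerate third position is reached by a transition, which is one way to read the abstract's link between 2-fold sites and the Ti/Tv bias.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")
BASES = PURINES | PYRIMIDINES


def classify_substitution(a, b):
    """Classify one aligned pair of nucleotides."""
    a, b = a.upper(), b.upper()
    if a not in BASES or b not in BASES:
        return "skipped"          # gaps or ambiguity codes are ignored
    if a == b:
        return "identical"
    # Same chemical class on both sides means a transition.
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


def count_ti_tv(seq1, seq2):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    ti = tv = 0
    for a, b in zip(seq1, seq2):
        kind = classify_substitution(a, b)
        if kind == "transition":
            ti += 1
        elif kind == "transversion":
            tv += 1
    return ti, tv


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    print(count_ti_tv("AAGCT", "GACTA"))
```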

6.
An excess of nonsynonymous substitutions over synonymous ones is an important indicator of positive selection at the molecular level. A lineage that underwent Darwinian selection may have a nonsynonymous/synonymous rate ratio (dN/dS) that is different from those of other lineages or greater than one. In this paper, several codon-based likelihood models that allow for variable dN/dS ratios among lineages were developed. They were then used to construct likelihood ratio tests to examine whether the dN/dS ratio is variable among evolutionary lineages, whether the ratio for a few lineages of interest is different from the background ratio for other lineages in the phylogeny, and whether the dN/dS ratio for the lineages of interest is greater than one. The tests were applied to the lysozyme genes of 24 primate species. The dN/dS ratios were found to differ significantly among lineages, indicating that the evolution of primate lysozymes is episodic, which is incompatible with the neutral theory. Maximum-likelihood estimates of parameters suggested that about nine nonsynonymous and zero synonymous nucleotide substitutions occurred in the lineage leading to hominoids, and the dN/dS ratio for that lineage is significantly greater than one. The corresponding estimates for the lineage ancestral to colobine monkeys were nine and one, and the dN/dS ratio for the lineage is not significantly greater than one, although it is significantly higher than the background ratio. The likelihood analysis thus confirmed most, but not all, conclusions Messier and Stewart reached using reconstructed ancestral sequences to estimate synonymous and nonsynonymous rates for different lineages.

7.
Here we present a new sliding window-based method specially designed to detect selective constraints in specific regions of a multiple protein-coding sequence alignment. In contrast to previous window-based procedures, our method is based on a nonarbitrary statistical approach to find the appropriate codon-window size to test deviations of synonymous (dS) and nonsynonymous (dN) nucleotide substitutions from the expectation. The probabilities of dN and dS are obtained from simulated data and used to detect significant deviations of dN and dS in a specific window region of the real sequence alignment. The nonsynonymous-to-synonymous rate ratio (w = dN/dS) was used to highlight selective constraints in any window wherein dS or dN was significantly different from the expectation. In these significant windows, w and its variance [V(w)] were calculated and used to test the neutral hypothesis. Computer simulations showed that the method is accurate even for highly divergent sequences. The main advantages of the new method are that it (i) uses a statistically appropriate window size to detect different selective patterns, (ii) is computationally less intensive than maximum likelihood methods, and (iii) detects saturation of synonymous sites, which can give deviations from neutrality. Hence, it allows the analysis of highly divergent sequences and the testing of different alternative hypotheses as well. The application of the method to different human immunodeficiency virus type 1 and to foot-and-mouth disease virus genes confirms the action of positive selection on previously described regions as well as on new regions.
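
The basic mechanics of a windowed w = dN/dS scan can be sketched as follows. This is a deliberately simplified illustration and not the method of the abstract: the window size here is fixed by the caller, whereas the abstract describes choosing it statistically and testing each window against simulated expectations. The function name and the per-codon input lists are hypothetical.

```python
def sliding_window_omega(dn, ds, window):
    """Compute w = dN/dS in overlapping windows of `window` codons.

    dn, ds : per-codon nonsynonymous / synonymous estimates (hypothetical
             inputs; the abstract does not specify how they are stored).
    Returns a list of (start_codon_index, w), with w = None where the
    window contains no synonymous change and the ratio is undefined.
    """
    if len(dn) != len(ds):
        raise ValueError("dn and ds must have the same length")
    if not 1 <= window <= len(dn):
        raise ValueError("window must be between 1 and the alignment length")
    results = []
    for start in range(len(dn) - window + 1):
        sum_dn = sum(dn[start:start + window])
        sum_ds = sum(ds[start:start + window])
        omega = sum_dn / sum_ds if sum_ds > 0 else None
        results.append((start, omega))
    return results


if __name__ == "__main__":
    # Toy per-codon values, for illustration only.
    dn = [0.0, 0.5, 1.0, 0.0]
    ds = [0.25, 0.25, 0.0, 0.0]
    for start, omega in sliding_window_omega(dn, ds, window=2):
        print(start, omega)
```

Returning None for windows without synonymous change keeps undefined ratios visible instead of silently reporting an infinite or zero value.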

8.
9.
Côte d'Ivoire continues to have the highest HIV-1 prevalence rate in West Africa, although the number of infections is in constant decline. The external envelope protein of the virus is a likely site of selection and is responsible for receptor binding and entry into host cells, and therefore constitutes an ideal region with which to investigate the evolutionary processes acting on HIV-1. In this study, we analyse 189 envelope glycoprotein V3 loop region sequences of virus isolates from 1995 to 2009, from untreated HIV-1 patients living in Côte d'Ivoire, to decipher the temporal relationship between disease diversity, divergence and selection. Our analyses show that the ratio of nonsynonymous to synonymous substitutions (dN/dS) was lower than 1 for viral populations analysed within 15 years, which showed that the sequences did not undergo adequate immune pressure. The phylogenetic tree of the sequences analysed demonstrated distinctly long internal branches and short external branches, suggesting that only a small number of viruses infected the new host cell at each transmission. In addition to identifying sites under purifying selection, we also identified neutral sites that can cause false positive inference of selection. The sites presented form a resource for future studies of selection pressures acting on the HIV-1 env gene in Côte d'Ivoire and other West African countries.

10.
The relative rates of nucleotide substitution at synonymous and nonsynonymous sites within protein-coding regions have been widely used to infer the action of natural selection from comparative sequence data. It is known, however, that mutational and repair biases can affect rates of evolution at both synonymous and nonsynonymous sites. More importantly, it is also known that synonymous sites are particularly prone to the effects of nucleotide bias. This means that nucleotide biases may affect the calculated ratio of substitution rates at synonymous and nonsynonymous sites. Using a large data set of animal mitochondrial sequences, we demonstrate that this is, in fact, the case. Highly biased nucleotide sequences are characterized by significantly elevated dN/dS ratios, but only when the nucleotide frequencies are not taken into account. When the analysis is repeated taking the nucleotide frequencies at each codon position into account, such elevated ratios disappear. These results suggest that the recently reported differences in dN/dS ratios between vertebrate and invertebrate mitochondrial sequences could be explained by variations in mitochondrial nucleotide frequencies rather than the effects of positive Darwinian selection.

11.
Codon-based substitution models are routinely used to measure selective pressures acting on protein-coding genes. To this effect, the nonsynonymous to synonymous rate ratio (dN/dS = omega) is estimated. The proportion of amino-acid sites potentially under positive selection, as indicated by omega > 1, is inferred by fitting a probability distribution where some sites are permitted to have omega > 1. These sites are then inferred by means of an empirical Bayes or by a Bayes empirical Bayes approach that, respectively, ignores or accounts for sampling errors in maximum-likelihood estimates of the distribution used to infer the proportion of sites with omega > 1. Here, we extend a previous full-Bayes approach to include models with high power and low false-positive rates when inferring sites under positive selection. We propose some heuristics to alleviate the computational burden, and show that (i) full Bayes can be superior to empirical Bayes when analyzing a small data set or small simulated data, (ii) full Bayes has only a small advantage over Bayes empirical Bayes with our small test data, and (iii) Bayesian methods appear relatively insensitive to mild misspecifications of the random process generating adaptive evolution in our simulations, but in practice can prove extremely sensitive to model specification. We suggest that the codon model used to detect amino acids under selection should be carefully selected, for instance using Akaike information criterion (AIC).
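
The model-selection step suggested at the end of this abstract can be illustrated with the standard definition AIC = 2k - 2 lnL, where k is the number of free parameters and lnL is the maximised log-likelihood; the model with the lowest AIC is preferred. The sketch below is an assumption-laden example rather than the authors' procedure: the model names, log-likelihoods, and parameter counts are hypothetical placeholders.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 lnL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(models):
    """Return (name, AIC) of the model with the lowest AIC.

    models : dict mapping a model name to (log_likelihood, n_params).
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in models.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # Hypothetical fitted codon models, for illustration only.
    candidates = {
        "codon_model_A": (-1000.0, 3),
        "codon_model_B": (-995.0, 5),
    }
    print(best_by_aic(candidates))
```

In this toy comparison the richer model gains 5 log-likelihood units for 2 extra parameters, which is enough to lower its AIC.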

12.
MOTIVATION: Accurate detection of positive Darwinian selection can provide important insights to researchers investigating the evolution of pathogens. However, many pathogens (particularly viruses) undergo frequent recombination and the phylogenetic methods commonly applied to detect positive selection have been shown to give misleading results when applied to recombining sequences. We propose a method that makes maximum likelihood inference of positive selection robust to the presence of recombination. This is achieved by allowing tree topologies and branch lengths to change across detected recombination breakpoints. Further improvements are obtained by allowing synonymous substitution rates to vary across sites. RESULTS: Using simulation we show that, even for extreme cases where recombination causes standard methods to reach false positive rates >90%, the proposed method decreases the false positive rate to acceptable levels while retaining high power. We applied the method to two HIV-1 datasets for which we have previously found that inference of positive selection is invalid owing to high rates of recombination. In one of these (env gene) we still detected positive selection using the proposed method, while in the other (gag gene) we found no significant evidence of positive selection. AVAILABILITY: A HyPhy batch language implementation of the proposed methods and the HIV-1 datasets analysed are available at http://www.cbio.uct.ac.za/pub_support/bioinf06. The HyPhy package is available at http://www.hyphy.org, and it is planned that the proposed methods will be included in the next distribution. RDP2 is available at http://darwin.uvigo.es/rdp/rdp.html

13.
Alternative splicing (AS) is known to significantly affect exon-level protein evolutionary rates in mammals. Particularly, alternatively spliced exons (ASEs) have a higher nonsynonymous-to-synonymous substitution rate (dN/dS) ratio than constitutively spliced exons (CSEs), possibly because the former are required only occasionally for normal biological functions. Meanwhile, intrinsically disordered regions (IDRs), the protein regions lacking fixed 3D structures, are also reported to have an increased evolutionary rate due to lack of structural constraint. Interestingly, IDRs tend to be located in alternative protein regions. Yet which of these two factors is the major determinant of the increased dN/dS in mammalian ASEs remains unclear. By comparing human-macaque and human-mouse one-to-one orthologous genes, we demonstrate that AS and protein structural disorder have independent effects on mammalian exon evolution. We performed analyses of covariance to demonstrate that the slopes of the (dN/dS-percentage of IDR) regression lines differ significantly between CSEs and ASEs. In other words, the dN/dS ratios of both ASEs and CSEs increase with the proportion of IDR (PIDR), whereas ASEs have higher dN/dS ratios than CSEs when they have similar PIDRs. Since ASEs and IDRs may less frequently overlap with protein domains (which also affect dN/dS), we also examined the correlations between dN/dS ratio and exon type/PIDR by controlling for the density of protein domain. We found that the effects of exon type and PIDR on dN/dS are both independent of domain density. Our results imply that nature can select for different biological features with regard to ASEs and IDRs, even though the two biological features tend to be localized in the same protein regions.

14.
15.
The higher rate of non-synonymous over synonymous substitutions (dN/dS) of the X chromosome compared with autosomes is often interpreted as a consequence of X hemizygosity. However, other factors, such as gene expression, are also known to vary between X and autosomes. Analysing 4800 orthologues in six mammals, we found that gene expression levels, associated with GC content, fully account for the variation in dN/dS between X and autosomes with no detectable effect of hemizygosity. We also report an extensive variance in dN/dS and gene expression between autosomes.

16.
A popular approach to detecting positive selection is to estimate the parameters of a probabilistic model of codon evolution and perform inference based on its maximum likelihood parameter values. This approach has been evaluated intensively in a number of simulation studies and found to be robust when the available data set is large. However, uncertainties in the estimated parameter values can lead to errors in the inference, especially when the data set is small or there is insufficient divergence between the sequences. We introduce a Bayesian model comparison approach to infer whether the sequence as a whole contains sites at which the rate of nonsynonymous substitution is greater than the rate of synonymous substitution. We incorporated this probabilistic model comparison into a Bayesian approach to site-specific inference of positive selection. Using simulated sequences, we compared this approach to the commonly used empirical Bayes approach and investigated the effect of tree length on the performance of both methods. We found that the Bayesian approach outperforms the empirical Bayes method when the amount of sequence divergence is small and is less prone to false-positive inference when the sequences are saturated, while the results are indistinguishable for intermediate levels of sequence divergence.

17.
Rapidly evolving proteins can aid the identification of genes underlying phenotypic adaptation across taxa, but functional and structural elements of genes can also affect evolutionary rates. In plants, the 'edges' of exons, flanking intron junctions, are known to contain splice enhancers and to have a higher degree of conservation compared to the remainder of the coding region. However, the extent to which these regions may be masking indicators of positive selection or account for the relationship between dN/dS and other genomic parameters is unclear. We investigate the effects of exon edge conservation on the relationship of dN/dS to various sequence characteristics and gene expression parameters in the model plant Arabidopsis thaliana. We also obtain lineage-specific dN/dS estimates, making use of the recently sequenced genome of Thellungiella parvula, the second closest sequenced relative after the sister species Arabidopsis lyrata. Overall, we find that the effect of exon edge conservation, as well as the use of lineage-specific substitution estimates, upon dN/dS ratios partly explains the relationship between the rates of protein evolution and expression level. Furthermore, the removal of exon edges shifts dN/dS estimates upwards, increasing the proportion of genes potentially under adaptive selection. We conclude that lineage-specific substitutions and exon edge conservation have an important effect on dN/dS ratios and should be considered when assessing their relationship with other genomic parameters.

18.
When most amino acid substitutions in protein-coding genes are slightly deleterious rather than selectively neutral, life history differences can potentially modify the effective population size or the selective regime, resulting in altered ratios of non-synonymous to synonymous substitutions among taxa. We studied substitution patterns for the mitochondrial cytochrome oxidase subunit I (COI) gene in a sea star genus (Leptasterias spp.) with an obligate brood-protecting mode of reproduction and small-scale population genetic subdivision, and compared the results to available COI sequences in nine other genera of echinoderms with pelagic larvae: three sea stars, five sea urchins and one brittle star. We predicted that this life history difference would be associated with differences in the ratio of non-synonymous (dN) to synonymous (dS) substitution rates. Leptasterias had a significantly greater dN/dS ratio (both between species and within species), a significantly smaller transition/transversion rate ratio, and a significantly lower average nucleotide diversity within species, than did the non-brooding genera. Other explanations for the results, such as altered mutation rates or selective sweeps, were not supported by the data analysis. These findings highlight the potential influence of reproductive traits and other life history factors on patterns of nucleotide substitution within and between species.

19.
Molecular Evolution of the Genomic RNA of Apple Stem Grooving Capillovirus   (cited by: 1; self-citations: 0; citations by others: 1)
The complete genome of the German isolate AC of Apple stem grooving virus (ASGV) was sequenced. It encodes two overlapping open reading frames (ORFs), similarly to previously described ASGV isolates. Two regions of high variability were detected between the ASGV isolates, variable region 1 (V1, from amino acids (aa) 532 to 570), and variable region 2 (V2, from aa 1,583 to 1,868). The phylogenetic analysis of the V1 and V2 regions suggested that the ASGV diversity was structured by host plant species rather than geographical origin. The dN/dS ratio between nonsynonymous and synonymous nucleotide substitution rates varied greatly along the ASGV genome. Most of ORF1 showed predominant negative selection except for the two regions V1 and V2. V1 showed an elevated dN and an average dS when compared to the ORF1 background but no significant positive selection was detected. The V2 region of ORF1 showed an elevated dN and a low dS when compared to the ORF1 background, with an average dN/dS of approximately 3.0, indicative of positive selection. However, the V2 area includes overlapping ORFs, making the dN/dS estimate biased. Joint estimates of the selection intensity in the different ORFs by a recent method indicated that this region of ORF1 was in fact evolving close to neutrality. This was consistent with previous results showing that introduction of stop codons in this region of ORF1 did not impair plant infection. These data suggest that the elimination of a stop codon caused the overprinting of a novel coding region over the ancestral ORF.

20.
Tai V, Poon AF, Paulsen IT, Palenik B. PLoS ONE, 2011, 6(9): e24249
Environmental metagenomics provides snippets of genomic sequences from all organisms in an environmental sample and is an unprecedented resource of information for investigating microbial population genetics. Current analytical methods, however, are poorly equipped to handle metagenomic data, particularly of short, unlinked sequences. A custom analytical pipeline was developed to calculate dN/dS ratios, a common metric to evaluate the role of selection in the evolution of a gene, from environmental metagenomes sequenced using 454 technology of flow-sorted populations of marine Synechococcus, the dominant cyanobacteria in coastal environments. The large majority of genes (98%) have evolved under purifying selection (dN/dS<1). The metagenome sequence coverage of the reference genomes was not uniform and genes that were highly represented in the environment (i.e. high read coverage) tended to be more evolutionarily conserved. Of the genes that may have evolved under positive selection (dN/dS>1), 77 out of 83 (93%) were hypothetical. Notable among annotated genes, ribosomal protein L35 appears to be under positive selection in one Synechococcus population. Other annotated genes, in particular a possible porin, a large-conductance mechanosensitive channel, an ATP binding component of an ABC transporter, and a homologue of a pilus retraction protein, had regions of the gene with elevated dN/dS. With the increasing use of next-generation sequencing in metagenomic investigations of microbial diversity and ecology, analytical methods need to accommodate the peculiarities of these data streams. By developing a means to analyze population diversity data from these environmental metagenomes, we have provided the first insight into the role of selection in the evolution of Synechococcus, a globally significant primary producer.
