Similar articles
 20 similar articles found (search time: 15 ms)
1.
A number of statistical tests have been proposed to detect positive Darwinian selection affecting a few amino acid sites in a protein, exemplified by an excess of nonsynonymous nucleotide substitutions. These tests are often more powerful than pairwise sequence comparison, which averages synonymous (d(S)) and nonsynonymous (d(N)) rates over the whole gene. In a recent study, however, Hughes AL and Friedman R (2005. Variation in the pattern of synonymous and nonsynonymous difference between two fungal genomes. Mol Biol Evol. 22: 1320-1324) argue that d(S) and d(N) are expected to fluctuate along the sequence by chance and that an excess of nonsynonymous differences in individual codons is no evidence for positive selection. The authors compared codons in protein-coding genes from the genomes of 2 yeast species, Saccharomyces cerevisiae and Saccharomyces paradoxus. They calculated the proportions of synonymous and nonsynonymous differences per site (p(S) and p(N)) in every codon and discovered that p(N) is often greater than p(S) and that among some codons p(S) and p(N) are negatively correlated. The authors argued that these results invalidate previous tests of codons under positive selection. Here I discuss several statistical errors in the analysis of Hughes and Friedman, including confusion of statistics with parameters, arbitrary data filtering, and derivation of hypotheses from data. I also apply likelihood ratio tests of positive selection to the yeast data and illustrate empirically that Hughes and Friedman's criticisms of such tests are not valid.

2.
Using basic probability theory, we show that there is a substantial likelihood that even in the presence of strong purifying selection, there will be a number of codons in which the number of synonymous nucleotide substitutions per site (d(S)) exceeds the number of non-synonymous nucleotide substitutions per site (d(N)). In an empirical study, we examined the numbers of synonymous (b(S)) and non-synonymous substitutions (b(N)) along branches of the phylogenies of 69 single-copy orthologous genes from seven species of mammals. A pattern of b(N) > b(S) was most commonly seen in the shortest branches of the tree and was associated with a high coefficient of variation in both b(N) and b(S), suggesting that high stochastic error in b(N) and b(S) on short branches, rather than positive Darwinian selection, is the explanation of most cases where b(N) is greater than b(S) on a given branch. The branch-site method of Zhang et al. (Zhang, Nielsen, Yang, Mol Biol Evol, 22:2472-2479, 2005) identified 117 codons on 35 branches as "positively selected," but a majority of these codons lacked synonymous substitutions, while in the others, synonymous and non-synonymous differences per site occurred in approximately equal frequencies. Thus, it was impossible to rule out the hypothesis that chance variation in the pattern of mutation across sites, rather than positive selection, accounted for the observed pattern. Our results showed that b(N)/b(S) was consistently elevated in immune system genes, but neither the search for branches with b(N) > b(S) nor the branch-site method revealed this trend.

3.
The patterns of nucleotide difference were compared at 3,473,111 codons from 9,390 aligned orthologous genes of mouse (Mus musculus), rat (Rattus norvegicus), and human (Homo sapiens). The results showed evidence of a higher frequency of both synonymous and nonsynonymous differences from human in the rat than in the mouse. However, contrary to a previous report, there was no evidence of a greater frequency of codons with multiple nonsynonymous substitutions between the two rodent species than expected under random substitution.

4.
In order to understand the impact of overlapping reading frames on natural selection by host CD8+ T lymphocytes (CD8(+)-TL), we analyzed the pattern of nucleotide substitution in simian immunodeficiency virus (SIV) genomes sampled from populations at time of death in 35 rhesus monkeys. Both the mean number of nonsynonymous nucleotide substitutions per nonsynonymous site (d(N)) and the mean number of synonymous nucleotide substitutions per synonymous site (d(S)) were elevated in overlap regions in comparison to non-overlap regions. Mean d(N) exceeded mean d(S) in CD8(+)-TL epitopes restricted by the host's class I major histocompatibility complex molecules. This pattern, which is indicative of positive Darwinian selection favoring amino acid changes in these epitopes, was seen in both overlap and non-overlap regions; but mean d(N) was particularly elevated in restricted CD8(+)-TL epitopes encoded in overlap regions. Amino acid changes from the inoculum were defined as parallel if the same amino acid change occurred at the same site independently in two or more monkeys, and a surprisingly high proportion (71.9%) of observed amino acid changes throughout the SIV genome occurred in parallel in different monkeys. The proportion of parallel changes in restricted epitopes encoded by overlapping reading frames was still higher (80%), supporting the hypothesis that the interaction of positive selection and overlapping reading frames enhances the probability of convergent or parallel amino acid change.

5.
The sporozoite threonine-asparagine-rich protein (STARP) of Plasmodium falciparum is an attractive target for a pre-erythrocytic stage malaria vaccine because both naturally acquired and experimentally induced anti-STARP antibodies can block sporozoite invasion of hepatocytes. To explore the extent of sequence variation, we surveyed nucleotide polymorphism across the entire gene, encompassing 2 exons and an intron, of 124 P. falciparum-infected blood samples from Thailand and 10 from 4 other endemic areas. In total 24 haplotypes were identified despite low-level nucleotide diversity at this locus. The mean number of nonsynonymous substitutions per nonsynonymous site (d(N)) significantly exceeded that of synonymous substitutions per synonymous site (d(S)), suggesting that the STARP gene has evolved under positive selection, probably from host immune pressure. The preponderance of conservative amino acid exchanges and a strong bias toward T at the third position of codons in repeat arrays reflect simultaneous constraints on this molecule, probably arising from its as yet unknown function and from nucleotide composition, respectively. Sequence conservation in the STARP locus among clinical isolates from different disease endemic areas suggests that sequence variation would not compromise the incorporation of STARP into a vaccine.

6.
ADAPTSITE: detecting natural selection at single amino acid sites.   Cited by: 12 (self-citations: 0, citations by others: 12)
ADAPTSITE is a program package for detecting natural selection at single amino acid sites, using a multiple alignment of protein-coding sequences for a given phylogenetic tree. The program infers ancestral codons at all interior nodes, and computes the total numbers of synonymous (c(S)) and nonsynonymous (c(N)) substitutions as well as the average numbers of synonymous (s(S)) and nonsynonymous (s(N)) sites for each codon site. The probabilities of occurrence of synonymous and nonsynonymous substitutions are approximated by s(S) / (s(S) + s(N)) and s(N) / (s(S) + s(N)), respectively. The null hypothesis of selective neutrality is tested for each codon site, assuming a binomial distribution for the probability of obtaining c(S) and c(N). AVAILABILITY: ADAPTSITE is available free of charge at the World-Wide Web sites http://mep.bio.psu.edu/adaptivevol.html and http://www.cib.nig.ac.jp/dda/yossuzuk/welcome.html. The package includes the source code written in C, binary files for UNIX operating systems, manual, and example files.
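The per-site binomial test described in this abstract can be illustrated with a short Python sketch. It is not the ADAPTSITE source (which is written in C). The function names, the use of one-sided tail probabilities, and the example counts are assumptions made here for illustration only.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def site_test(c_s, c_n, s_s, s_n):
    """Neutrality test at one codon site (illustrative sketch).

    c_s, c_n: counted synonymous / nonsynonymous substitutions at the site.
    s_s, s_n: average numbers of synonymous / nonsynonymous sites.
    Returns one-sided p-values for an excess of nonsynonymous changes
    (candidate positive selection) and an excess of synonymous changes
    (candidate negative selection). One-sided tails are an assumption.
    """
    n = c_s + c_n
    p_syn = s_s / (s_s + s_n)   # probability that a neutral change is synonymous
    p_non = s_n / (s_s + s_n)   # probability that a neutral change is nonsynonymous
    p_positive = binomial_tail(c_n, n, p_non)
    p_negative = binomial_tail(c_s, n, p_syn)
    return p_positive, p_negative


if __name__ == "__main__":
    # Hypothetical site: 5 synonymous and 0 nonsynonymous changes,
    # with 1 synonymous and 2 nonsynonymous sites in the codon.
    print(site_test(5, 0, 1.0, 2.0))
```

With these hypothetical counts, the probability of seeing five synonymous changes out of five under neutrality is (1/3)^5, so the site would be a candidate for negative selection.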

7.
8.
In the analysis of protein-coding nucleotide sequences, the ratio of the number of nonsynonymous substitutions to that of synonymous substitutions (d(N)/d(S)) is used as an indicator for the direction and magnitude of natural selection operating at the amino acid sequence level. The d(S) and d(N) values are estimated based on the comparison of homologous codons, which are often identified by converting (reverse-translating) aligned amino acid sequences into codon sequences. In this method, however, homologous codons may be mis-identified when frame-shifts have occurred or amino acid sequences have been mis-aligned, which may lead to overestimation of the d(N)/d(S) ratio. Here the effect of reverse-translating aligned amino acid sequences on the estimation of the d(N)/d(S) ratio was examined through a large-scale analysis of protein-coding nucleotide sequences from vertebrate species. Apparently, 1-9% of codon sites that were identified as homologous with reverse-translation contained non-homologous codons, where the d(N)/d(S) ratio was unduly high. By correcting the d(N)/d(S) ratio for these codon sites, it was inferred that the ratio was 5-43% overestimated with reverse-translation. These results suggest that caution should be exercised in the study of natural selection using the d(N)/d(S) ratio by reverse-translating aligned amino acid sequences.
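The reverse-translation step discussed in this abstract can be sketched as follows: each residue of an aligned protein is replaced by the next codon of its unaligned coding sequence, and each gap by three gap characters. The function name `thread_codons` and the length check are assumptions for illustration. The sketch does not verify that each codon actually encodes the residue it is placed under, which is one way non-homologous codons can end up in the same column.

```python
def thread_codons(aligned_protein, cds, gap="-"):
    """Map an unaligned coding sequence onto its aligned protein sequence.

    aligned_protein: protein sequence containing gap characters.
    cds: the corresponding coding sequence, without gaps or stop codon.
    Returns the codon-level alignment row as a string.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) != 3 * n_residues:
        raise ValueError("CDS length does not equal 3 x number of residues")
    row, pos = [], 0
    for aa in aligned_protein:
        if aa == gap:
            row.append(gap * 3)          # one protein gap becomes three nucleotide gaps
        else:
            row.append(cds[pos:pos + 3])  # next codon in reading order
            pos += 3
    return "".join(row)


if __name__ == "__main__":
    print(thread_codons("M-K", "ATGAAA"))
```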

9.
We consider three approaches for estimating the rates of nonsynonymous and synonymous changes at each site in a sequence alignment in order to identify sites under positive or negative selection: (1) a suite of fast likelihood-based "counting methods" that employ either a single most likely ancestral reconstruction, weighting across all possible ancestral reconstructions, or sampling from ancestral reconstructions; (2) a random effects likelihood (REL) approach, which models variation in nonsynonymous and synonymous rates across sites according to a predefined distribution, with the selection pressure at an individual site inferred using an empirical Bayes approach; and (3) a fixed effects likelihood (FEL) method that directly estimates nonsynonymous and synonymous substitution rates at each site. All three methods incorporate flexible models of nucleotide substitution bias and variation in both nonsynonymous and synonymous substitution rates across sites, facilitating the comparison between the methods. We demonstrate that the results obtained using these approaches show broad agreement in levels of Type I and Type II error and in estimates of substitution rates. Counting methods are well suited for large alignments, for which there is high power to detect positive and negative selection, but appear to underestimate the substitution rate. A REL approach, which is more computationally intensive than counting methods, has higher power than counting methods to detect selection in data sets of intermediate size but may suffer from higher rates of false positives for small data sets. A FEL approach appears to capture the pattern of rate variation better than counting methods or random effects models, does not suffer from as many false positives as random effects models for data sets comprising few sequences, and can be efficiently parallelized. 
Our results suggest that previously reported differences between results obtained by counting methods and random effects models arise due to a combination of the conservative nature of counting-based methods, the failure of current random effects models to allow for variation in synonymous substitution rates, and the naive application of random effects models to extremely sparse data sets. We demonstrate our methods on sequence data from the human immunodeficiency virus type 1 env and pol genes and simulated alignments.

10.
The number of synonymous mutations per synonymous site (K(s)), the number of nonsynonymous mutations per nonsynonymous site (K(a)), and the codon usage statistic (N(c)) were calculated for several hepatitis A virus (HAV) isolates. While K(s) was similar to those of poliovirus (PV) and foot-and-mouth disease virus (FMDV), K(a) was 1 order of magnitude lower. The N(c) parameter provides information on codon usage bias and decreases when bias increases. The N(c) value in HAV was about 38, while in PV and FMDV, it was about 53. The emergence of 22 rare codons, compared with 8 in PV and 7 in FMDV, was detected. Most of the conserved rare codons of the P1 region were strategically located at the carboxy borders of beta barrels and alpha helices, their potential function being the assurance of proper folding of the capsid proteins through a decrease in the translation speed. This strategic location was not observed for amino acids encoded by the conserved rare codons of the 3D region. The percentage of bases with low pairing number values was higher in the latter region, suggesting a role of the conserved rare codons in the maintenance of RNA structure. Many of the rare codons in HAV are among the most frequent in humans, unlike in PV or in FMDV. This fact may be explained by the lack of cellular shutoff in HAV. One hypothesis is that HAV has evolved in order to avoid competition with its host for cellular tRNAs.

11.
Summary: Based on the rates of synonymous substitution in 42 protein-coding gene pairs from rat and human, a correlation is shown to exist between the frequency of the nucleotides in all positions of the codon and the synonymous substitution rate. The correlation coefficients were positive for A and T and negative for C and G. This means that AT-rich genes accumulate more synonymous substitutions than GC-rich genes. Biased patterns of mutation could not account for this phenomenon. Thus, the variation in synonymous substitution rates and the resulting unequal codon usage must be the consequence of selection against A and T in synonymous positions. Most of the variation in rates of synonymous substitution can be explained by the nucleotide composition in synonymous positions. Codon-anticodon interactions, dinucleotide frequencies, and contextual factors influence neither the rates of synonymous substitution nor codon usage. Interestingly, the nucleotide in the second position of codons (always a nonsynonymous position) was found to affect the rate of synonymous substitution. This finding links the rate of nonsynonymous substitution with the synonymous rate. Consequently, highly conservative proteins are expected to be encoded by genes that evolve slowly in terms of synonymous substitutions, and are consequently highly biased in their codon usage.

12.
It has been suggested that codon volatility (the proportion of the point-mutation neighbors of a codon that encode different amino acids) can be used as an index of past positive selection. We compared codon volatility with patterns of synonymous and nonsynonymous nucleotide substitution in genome-wide comparisons of orthologous genes between three pairs of related genomes: (1) the protists Plasmodium falciparum and P. yoelii, (2) the fungi Saccharomyces cerevisiae and S. paradoxus, and (3) the mammals mouse and rat. Codon volatility was not consistently associated with an elevated rate of nonsynonymous substitution, as would be expected under positive selection. Rather, the most consistent and powerful correlate of elevated codon volatility was nucleotide content at the second codon position, as expected, given the nature of the genetic code.
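The definition of codon volatility given in this abstract can be made concrete with a small Python sketch. It assumes the standard genetic code and excludes neighbors that are stop codons from the denominator. That exclusion is an assumption, since the abstract does not state how stop codons are handled, and the function names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def neighbors(codon):
    """Yield the nine codons that differ from `codon` by one point mutation."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def volatility(codon):
    """Proportion of non-stop point-mutation neighbors encoding a different amino acid."""
    aa = CODE[codon]
    if aa == "*":
        raise ValueError("volatility is not defined here for stop codons")
    sense = [n for n in neighbors(codon) if CODE[n] != "*"]
    return sum(CODE[n] != aa for n in sense) / len(sense)


if __name__ == "__main__":
    for codon in ("ATG", "GGG", "CGA"):
        print(codon, volatility(codon))
```

Under these assumptions, CGA (arginine) has eight non-stop neighbors, four of which encode a different amino acid, whereas every neighbor of ATG (methionine) changes the amino acid. Volatility therefore depends heavily on which codon is used, which is consistent with the abstract's point that it tracks the structure of the genetic code.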

13.
A new method is proposed for estimating the number of synonymous and nonsynonymous nucleotide substitutions between homologous genes. In this method, a nucleotide site is classified as nondegenerate, twofold degenerate, or fourfold degenerate, depending on how often nucleotide substitutions will result in amino acid replacement; nucleotide changes are classified as either transitional or transversional, and changes between codons are assumed to occur with different probabilities, which are determined by their relative frequencies among more than 3,000 changes in mammalian genes. The method is applied to a large number of mammalian genes. The rate of nonsynonymous substitution is extremely variable among genes; it ranges from 0.004 × 10^-9 (histone H4) to 2.80 × 10^-9 (interferon gamma), with a mean of 0.88 × 10^-9 substitutions per nonsynonymous site per year. The rate of synonymous substitution is also variable among genes; the highest rate is three to four times higher than the lowest one, with a mean of 4.7 × 10^-9 substitutions per synonymous site per year. The rate of nucleotide substitution is lowest at nondegenerate sites (the average being 0.94 × 10^-9), intermediate at twofold degenerate sites (2.26 × 10^-9), and highest at fourfold degenerate sites (4.2 × 10^-9). The implications of our results for the mechanisms of DNA evolution and for the relative likelihood of codon interchanges in parsimonious phylogenetic reconstruction are discussed.
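The site classification described in this abstract can be sketched in Python under the standard genetic code. A site is called nondegenerate (0) if none of the three possible changes is synonymous, fourfold degenerate (4) if all three are, and twofold degenerate (2) otherwise. Lumping sites with two synonymous changes (the third position of isoleucine codons) into the twofold class, and counting changes to stop codons as nonsynonymous, are simplifying assumptions of this sketch.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def degeneracy(codon):
    """Return the degeneracy class (0, 2 or 4) of each of the three codon positions."""
    aa = CODE[codon]
    classes = []
    for pos in range(3):
        # Count how many of the three alternative bases leave the amino acid unchanged.
        synonymous = sum(
            CODE[codon[:pos] + base + codon[pos + 1:]] == aa
            for base in BASES
            if base != codon[pos]
        )
        classes.append(0 if synonymous == 0 else 4 if synonymous == 3 else 2)
    return classes


if __name__ == "__main__":
    for codon in ("GGG", "TTA", "ATT"):
        print(codon, degeneracy(codon))
```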

14.
Natural selection operating at the amino acid sequence level can be detected by comparing the rates of synonymous (r(S)) and nonsynonymous (r(N)) nucleotide substitutions, where r(N)/r(S) (omega) > 1 and omega < 1 suggest positive and negative selection, respectively. The branch-site test has been developed for detecting positive selection operating at a group of amino acid sites for a pre-specified (foreground) branch of a phylogenetic tree by taking into account the heterogeneity of omega among sites and branches. Here the performance of the branch-site test was examined by computer simulation, with special reference to the false-positive rate when the divergence of the sequences analyzed was small. The false-positive rate was found to inflate when the assumptions made on the omega values for the foreground and other (background) branches in the branch-site test were violated. In addition, under a similar condition, false-positive results were often obtained even when Bonferroni correction was conducted and the false-discovery rate was controlled in a large-scale analysis. False-positive results were also obtained even when the number of nonsynonymous substitutions for the foreground branch was smaller than the minimum value required for detecting positive selection. The existence of a codon site with a possibility of occurrence of multiple nonsynonymous substitutions for the foreground branch often caused the branch-site test to falsely identify positive selection. In the re-analysis of orthologous trios of protein-coding genes from humans, chimpanzees, and macaques, most of the genes previously identified to be positively selected for the human or chimpanzee branch by the branch-site test contained such a codon site, suggesting a possibility that a significant fraction of these genes are false-positives.
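The two multiple-testing procedures mentioned in this abstract can be sketched generically in Python. The abstract does not say which false-discovery-rate procedure was used, so the Benjamini-Hochberg step-up rule below is an assumption, as are the function names and the example p-values.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject test i if its p-value is at most alpha divided by the number of tests."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false-discovery rate."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k_max = rank  # largest rank whose p-value passes its threshold
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


if __name__ == "__main__":
    pvals = [0.001, 0.02, 0.03, 0.5]  # hypothetical per-gene p-values
    print(bonferroni(pvals))
    print(benjamini_hochberg(pvals))
```

Both procedures only adjust the threshold applied to the p-values. If the p-values themselves are too small because the test's assumptions are violated, as the abstract reports, neither correction removes the resulting false positives.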

15.
Carlini DB, Stephan W. Genetics 2003, 163(1): 239-243
The evolution of codon bias, the unequal usage of synonymous codons, is thought to be due to natural selection for the use of preferred codons that match the most abundant species of isoaccepting tRNA, resulting in increased translational efficiency and accuracy. We examined this hypothesis by introducing 1, 6, and 10 unpreferred codons into the Drosophila alcohol dehydrogenase gene (Adh). We observed a significant decrease in ADH protein production with number of unpreferred codons, confirming the importance of natural selection as a mechanism leading to codon bias. We then used this empirical relationship to estimate the selection coefficient (s) against unpreferred synonymous mutations and found the value (s ≥ 10^-5) to be approximately one order of magnitude greater than previous estimates from population genetics theory. The observed differences in protein production appear to be too large to be consistent with current estimates of the strength of selection on synonymous sites in D. melanogaster.

16.
Li H, Liu J, Wu K, Chen Y. PLoS ONE 2012, 7(7): e41167
Glutamine tandem repeats are common in eukaryotic proteins. Although some studies have proposed that replication slippage plays an important role in shaping these repeats, the role of natural selection in glutamine tandem repeat evolution is somewhat unclear. In this study, we identified all of the glutamine tandem repeats containing four or more glutamines in human proteins and then estimated the nonsynonymous (d(N)) and synonymous (d(S)) substitution rates for the regions flanking the glutamine tandem repeats and the proteins containing them. The results indicated that most of the proteins containing polyglutamine (polyQ) tracts of four or more glutamines have undergone purifying selection, and that the purifying selection for the regions flanking the repeats is weaker. Additionally, we observed that the conserved repeats were under stronger selection constraints than the nonconserved repeats. Interestingly, we found that there was a higher level of purifying selection for the regions flanking the polyQ tracts encoded by pure CAG codons compared with those encoded by mixed codons. Based on our findings, we propose that selection has played a more important role than was previously speculated in constraining the expansion of polyQ tracts encoded by pure codons.
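The first step of this study, locating runs of four or more glutamines and checking whether a run is encoded purely by CAG, can be sketched in Python. The function names, the 0-based coordinates, and the toy sequences are assumptions for illustration and are not the authors' pipeline.

```python
import re


def find_polyq(protein, min_len=4):
    """Return (start, length) for every run of at least `min_len` glutamines (Q)."""
    pattern = re.compile("Q{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(protein)]


def tract_is_pure_cag(cds, start, length):
    """True if every codon of the tract is CAG (mixed tracts also contain CAA).

    cds: coding sequence in frame with the protein; start/length in residues.
    """
    codons = [cds[3 * (start + i):3 * (start + i) + 3] for i in range(length)]
    return all(codon == "CAG" for codon in codons)


if __name__ == "__main__":
    protein = "MAQQQQQLKQQQAQQQQ"   # hypothetical sequence
    print(find_polyq(protein))
    print(tract_is_pure_cag("ATG" + "CAG" * 4, 1, 4))
```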

17.
It has been suggested that volatility, the proportion of mutations which change an amino acid, can be used to infer the level of natural selection acting upon a gene. This conjecture is supported by a correlation between volatility and the rate of nonsynonymous substitution (dN), or the ratio of nonsynonymous and synonymous substitution rates, in a variety of organisms. These organisms include yeast, in which the correlations are quite strong. Here we show that these correlations are a by-product of a correlation between synonymous codon bias toward translationally optimal codons and dN. Although this analysis suggests that volatility is not a good measure of the selection, we suggest that it might be possible to infer something about the level of natural selection, from a single genome sequence, using translational codon bias.

18.
Natural selection operating on amino acid substitution at single amino acid sites can be detected by comparing the rates of synonymous (r(S)) and nonsynonymous (r(N)) nucleotide substitution at single codon sites. Amino acid substitutions can be classified as conservative or radical according to whether they retain the properties of the substituted amino acid. Here methods for comparing the rates of conservative (r(C)) and radical (r(R)) nonsynonymous substitution with r(S) at single codon sites were developed to detect natural selection operating on these substitutions at single amino acid sites. A method for comparing r(C) and r(R) at single codon sites was also developed to detect biases toward these substitutions at single amino acid sites. Charge was used as the property of the amino acids. In a computer simulation, false-positive rates of these methods were always < 5%, unless termination sites were included in the computation of the numbers of sites and estimates of transition/transversion rate ratio were highly biased. The frequency of detection of natural selection operating on conservative substitution was almost independent of the presence of natural selection operating on radical substitution, and vice versa. Natural selection operating specifically on conservative and radical substitution was detected more efficiently by comparing r(S) with r(C) and r(S) with r(R) than by comparing r(S) with r(N). These methods also appeared to be robust against the occurrence of recombination during evolution. In an analysis of class I human leukocyte antigen, negative selection operating on conservative substitution, but not positive selection operating on radical substitution, was observed at some of the codon sites with r(R) > r(C), suggesting that r(R) > r(C) may not necessarily be an indicator of positive selection operating on radical substitution.
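The conservative/radical distinction used in this abstract can be illustrated with a minimal Python sketch based on charge. The grouping below (positive: R, H, K; negative: D, E; neutral: all others) is a commonly used classification and is an assumption here, since the abstract does not list the groups. The function names are illustrative.

```python
POSITIVE = set("RHK")   # assumed positively charged residues
NEGATIVE = set("DE")    # assumed negatively charged residues


def charge_class(aa):
    """Return 'positive', 'negative' or 'neutral' for a one-letter amino acid code."""
    if aa in POSITIVE:
        return "positive"
    if aa in NEGATIVE:
        return "negative"
    return "neutral"


def classify_substitution(old, new):
    """Classify a nonsynonymous substitution as conservative or radical by charge."""
    if old == new:
        raise ValueError("not an amino acid replacement")
    same_class = charge_class(old) == charge_class(new)
    return "conservative" if same_class else "radical"


if __name__ == "__main__":
    for pair in (("K", "R"), ("K", "E"), ("A", "D")):
        print(pair, classify_substitution(*pair))
```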

19.
In this work, we have investigated the relationships between synonymous and nonsynonymous rates and base composition in coding sequences from Gramineae to analyze the factors underlying the variation in substitutional rates. We have shown that in these genes the rates of nucleotide divergence, both synonymous and nonsynonymous, are, to some extent, dependent on each other and on the base composition. In the first place, the variation in nonsynonymous rate is related to the GC level at the second codon position (the higher the GC2 level, the higher the amino acid replacement rate). The correlation is especially strong with T2, the coefficients being significant in the three data sets analyzed. This correlation between nonsynonymous rate and base composition at the second codon position is also detectable at the intragenic level, which implies that the factors that tend to increase the intergenic variance in nonsynonymous rates also affect the intragenic variance. On the other hand, we have shown that the synonymous rate is strongly correlated with the GC3 level. This correlation is observed both across genes and at the intragenic level. Similarly, the nonsynonymous rate is also affected at the intragenic level by GC3 level, like the silent rate. In fact, synonymous and nonsynonymous rates exhibit a parallel behavior in relation to GC3 level, indicating that the intragenic patterns of both silent and amino acid divergence rates are influenced in a similar way by the intragenic variation of GC3. This result, taken together with the fact that the number of genes displaying intragenic correlation coefficients between synonymous and nonsynonymous rates is not very high, but higher than random expectation (in the three data sets analyzed), strongly suggests that the processes of silent and amino acid replacement divergence are, at least in part, driven by common evolutionary forces in genes from Gramineae. Received: 2 July 1998 / Accepted: 18 April 1999
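The GC2 and GC3 levels referred to in this abstract are the G+C fractions at the second and third codon positions. A minimal Python sketch of that calculation follows. The function name and the requirement that the input be an in-frame coding sequence are assumptions for illustration.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3): the G+C fraction at each codon position.

    cds: in-frame coding sequence whose length is a non-zero multiple of three.
    """
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of three")
    cds = cds.upper()
    n_codons = len(cds) // 3
    return tuple(
        sum(cds[i] in "GC" for i in range(pos, len(cds), 3)) / n_codons
        for pos in range(3)
    )


if __name__ == "__main__":
    gc1, gc2, gc3 = gc_by_codon_position("ATGGCC")  # hypothetical two-codon sequence
    print(gc1, gc2, gc3)
```

For intragenic analyses such as the one described, the same function could be applied to successive windows of codons along a gene.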

20.