Similar articles
20 similar articles found (search time: 15 ms)
1.
ADAPTSITE: detecting natural selection at single amino acid sites.   Total citations: 12 (self-citations: 0, citations by others: 12)
ADAPTSITE is a program package for detecting natural selection at single amino acid sites, using a multiple alignment of protein-coding sequences for a given phylogenetic tree. The program infers ancestral codons at all interior nodes, and computes the total numbers of synonymous (c(S)) and nonsynonymous (c(N)) substitutions as well as the average numbers of synonymous (s(S)) and nonsynonymous (s(N)) sites for each codon site. The probabilities of occurrence of synonymous and nonsynonymous substitutions are approximated by s(S) / (s(S) + s(N)) and s(N) / (s(S) + s(N)), respectively. The null hypothesis of selective neutrality is tested for each codon site, assuming a binomial distribution for the probability of obtaining c(S) and c(N). AVAILABILITY: ADAPTSITE is available free of charge at the World-Wide Web sites http://mep.bio.psu.edu/adaptivevol.html and http://www.cib.nig.ac.jp/dda/yossuzuk/welcome.html. The package includes the source code written in C, binary files for UNIX operating systems, manual, and example files.
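The per-site test described in this abstract can be sketched as follows. This is a minimal illustration, not ADAPTSITE's actual code: the function names are hypothetical, c(S) and c(N) are assumed to be integer counts, and the use of two one-sided tail probabilities is an assumption, since the abstract states only that a binomial distribution is used.

```python
from math import comb


def binomial_tail_ge(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def site_neutrality_test(c_s, c_n, s_s, s_n):
    """One-sided binomial tail probabilities for a single codon site.

    c_s, c_n: counted synonymous / nonsynonymous substitutions (assumed integers).
    s_s, s_n: average numbers of synonymous / nonsynonymous sites.
    Returns (p_positive, p_negative): the probability of observing at least
    c_n nonsynonymous changes, and at least c_s synonymous changes, under neutrality.
    """
    n = c_s + c_n
    p_n = s_n / (s_s + s_n)  # neutral probability that a change is nonsynonymous
    p_positive = binomial_tail_ge(c_n, n, p_n)
    p_negative = binomial_tail_ge(c_s, n, 1 - p_n)
    return p_positive, p_negative


# Hypothetical site: 10 nonsynonymous and 0 synonymous changes, s(S) = 1, s(N) = 2.
p_pos, p_neg = site_neutrality_test(0, 10, 1.0, 2.0)
```

A small p_pos would point to an excess of nonsynonymous changes at the site, and a small p_neg to an excess of synonymous changes.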

2.
A method for detecting positive selection at single amino acid sites   Total citations: 23 (self-citations: 0, citations by others: 23)
A method was developed for detecting the selective force at single amino acid sites given a multiple alignment of protein-coding sequences. The phylogenetic tree was reconstructed using the number of synonymous substitutions. Then, the neutrality was tested for each codon site using the numbers of synonymous and nonsynonymous changes throughout the phylogenetic tree. Computer simulation showed that this method accurately estimated the numbers of synonymous and nonsynonymous substitutions per site, as long as the substitution number on each branch was relatively small. The false-positive rate for detecting the selective force was generally low. On the other hand, the true-positive rate for detecting the selective force depended on the parameter values. Within the range of parameter values used in the simulation, the true-positive rate increased as the strength of the selective force and the total branch length (namely the total number of synonymous substitutions per site) in the phylogenetic tree increased. In particular, with the relative rate of nonsynonymous substitutions to synonymous substitutions being 5.0, most of the positively selected codon sites were correctly detected when the total branch length in the phylogenetic tree was ≥ 2.5. When this method was applied to the human leukocyte antigen (HLA) gene, which included antigen recognition sites (ARSs), positive selection was detected mainly on ARSs. This finding confirmed the effectiveness of the present method with actual data. Moreover, two amino acid sites were newly identified as positively selected in non-ARSs. The three-dimensional structure of the HLA molecule indicated that these sites might be involved in antigen recognition. Positively selected amino acid sites were also identified in the envelope protein of human immunodeficiency virus and the influenza virus hemagglutinin protein. This method may be helpful for predicting functions of amino acid sites in proteins, especially in the present situation, in which sequence data are accumulating at an enormous speed.

3.
Zhang Z, Inomata N, Ohba T, Cariou ML, Yamazaki T. Genetics, 2002, 161(3): 1187-1196
We examined the pattern of synonymous substitutions in the duplicated Amylase (Amy) genes (called the Amy1- and Amy3-type genes, respectively) in the Drosophila montium species subgroup. The GC content at the third synonymous codon sites of the Amy1-type genes was higher than that of the Amy3-type genes, while the GC content in the 5'-flanking region was the same in both genes. This suggests that the difference in the GC content at third synonymous sites between the duplicated genes is not due to the temporal or regional changes in mutation bias. We inferred the direction of synonymous substitutions along branches of a phylogeny. In most lineages, there were more synonymous substitutions from G/C (G or C) to A/T (A or T) than from A/T to G/C. However, in one lineage leading to the Amy1-type genes, which is immediately after gene duplication but before speciation of the montium species, synonymous substitutions from A/T to G/C were predominant. According to a simple model of synonymous DNA evolution in which major codons are selectively advantageous within each codon family, we estimated the selection intensity for specific lineages in a phylogeny on the basis of inferred patterns of synonymous substitutions. Our result suggested that the difference in GC content at synonymous sites between the two Amy-type genes was due to the change of selection intensity immediately after gene duplication but before speciation of the montium species.

4.
Sexually induced gene 1 (Sig1) in the centric diatom Thalassiosira weissflogii is considered to encode a gamete recognition protein. Sorhannus (2003) analyzed nucleotide sequences of Sig1 using parsimony analysis and the maximum-likelihood (ML)-based Bayesian method for inferring positive selection at single amino acid sites and reported that positively selected sites were detected by the latter method but not by the former. He then concluded that for this type of study, the ML-based method is more reliable than parsimony analysis. Here we show that his results apparently represent false-positive cases of the ML-based method and that there is no solid evidence that this gene contains positively selected sites. We further demonstrate that in the tax gene of human T-cell lymphotropic virus type I (HTLV-I), all codon sites, including invariable sites, can be inferred as positively selected sites by the ML-based method. These observations indicate that the ML-based method may produce many false-positive sites. One of the main reasons for the occurrence of false positives is that in the ML-based method, codon sites are grouped into several categories, with different nonsynonymous/synonymous rate ratios (omegas), on a purely statistical basis, and positive selection is inferred indirectly by examining whether the average omega for each category is greater than 1. In parsimony analysis, however, the evolutionary change of nucleotides at each codon site is examined. For this reason, parsimony-based methods rarely produce false positives and are safer than ML-based methods for detecting positive selection at individual codon sites, although a large number of sequences are necessary.

5.
Maliarchuk BA. Genetika, 2012, 48(6): 713-718
Sequence analysis of a cytochrome b gene fragment in salamanders of the genus Salamandrella, the Siberian salamander and the Schrenk salamander, was performed to elucidate the effect of natural selection on the evolution of mitochondrial DNA (mtDNA) in these species. It was demonstrated that, despite the notable influence of negative selection (expressed as very low dN/dS values), speciation and intraspecific divergence in salamanders were accompanied by the appearance of radical amino acid substitutions caused by positive (directional) selection. To examine the evolutionary pattern of synonymous mtDNA sites, the distribution of conservative and non-conservative substitutions was analyzed. The rates of conservative and non-conservative substitutions were nearly equal, pointing to neutrality of the mutation process at synonymous mtDNA sites of salamanders. Analysis of the distributions of conservative and non-conservative synonymous substitutions in different parts of phylogenetic trees showed that the differences between the synonymous groups compared were statistically significant in only one phylogenetic group of the Siberian salamander (haplogroup C) (P = 0.02). In the group of single substitutions located at terminal phylogenetic branches of Siberian salamanders from group C, an increased rate of conservative substitutions was observed. Based on these findings, it was suggested that selective processes could have influenced the formation of the synonymous substitution profile in the Siberian salamander mtDNA fragment examined.

6.
The nature of selection on capsid genes of foot-and-mouth disease virus (FMDV) was characterized by examining the ratio of nonsynonymous to synonymous substitutions in 11 data sets of sequences obtained from six different serotypes of FMDV. Using a method of analysis that assigns each codon position to one of a number of estimated values of nonsynonymous to synonymous ratio, significant evidence of positive selection was identified in 5 data sets, operating at 1-7% of codon positions. Evidence of positive selection was identified in complete capsid sequences of serotypes A and C and in VP1 sequences of serotypes SAT 1 and 2. Sequences of serotype SAT-2 recovered from a persistently infected African buffalo also revealed evidence for positive selection. Locations of codons under positive selection coincide closely with those of antigenic sites previously identified with the use of monoclonal antibody escape mutants. The vast majority of codons are under mild to strong purifying selection. However, these results suggest that arising antigenic variants benefit from a selective advantage in their interaction with the immune system, either during the course of an infection or in transmission to individuals with previous exposure to antigen. Analysis of amino acid usage at sites under positive selection indicates that this selective advantage can be conferred by amino acid substitutions that share physicochemically similar properties.

7.
Akashi H, Ko WY, Piao S, John A, Goel P, Lin CF, Vitins AP. Genetics, 2006, 172(3): 1711-1726
Although mutation, genetic drift, and natural selection are well established as determinants of genome evolution, the importance (frequency and magnitude) of parameter fluctuations in molecular evolution is less understood. DNA sequence comparisons among closely related species allow specific substitutions to be assigned to lineages on a phylogenetic tree. In this study, we compare patterns of codon usage and protein evolution in 22 genes (>11,000 codons) among Drosophila melanogaster and five relatives within the D. melanogaster subgroup. We assign changes to eight lineages using a maximum-likelihood approach to infer ancestral states. Uncertainty in ancestral reconstructions is taken into account, at least to some extent, by weighting reconstructions by their posterior probabilities. Four of the eight lineages show potentially genomewide departures from equilibrium synonymous codon usage; three are decreasing and one is increasing in major codon usage. Several of these departures are consistent with lineage-specific changes in selection intensity (selection coefficients scaled to effective population size) at silent sites. Intron base composition and rates and patterns of protein evolution are also heterogeneous among these lineages. The magnitude of forces governing silent, intron, and protein evolution appears to have varied frequently, and in a lineage-specific manner, within the D. melanogaster subgroup.

8.
Nucleotide polymorphism at the pantophysin (Pan I) locus in walleye pollock, Theragra chalcogramma, was examined using DNA sequence data. Two distinct allelic lineages were detected in pollock, resulting from three amino acid replacement mutations in the first intravesicular domain of the protein. The common Pan I allelic group, comprising 94% of the samples, was less polymorphic (π = 0.005) than the uncommon group (π = 0.008), and nucleotide diversity in both was higher than for two allelic lineages in the related Atlantic cod, Gadus morhua. Phylogenetic analyses of Pan I sequences from these two species did not clearly resolve orthology among allelic groups, in part because of recombination that has occurred between the two pollock lineages. Conventional tests of neutrality comparing polymorphisms within and between homologous regions of the Pan I locus in walleye pollock and Atlantic cod did not detect the effects of selection. This result is likely attributable to low levels of synonymous divergence among allelic lineages and a lack of mutation-drift equilibrium inferred from nucleotide mismatch frequency distributions. However, the ratio of nonsynonymous to synonymous substitutions per site (dN/dS) exceeded unity in two intravesicular domains of the protein and the influence of positive selection at multiple codon sites was strongly inferred through the use of maximum-likelihood analyses. In addition, the frequency spectrum of linked neutral variation showed indirect effects of adaptive hitchhiking in pollock resulting from a selective sweep of the common allelic lineage. Recombination between the two allelic classes may have prevented complete loss of the older, more polymorphic lineage. The results suggest that recurrent sweeps driven by positive selection are the principal mode of evolution at the Pan I locus in gadid fishes.

9.
Summary: In species where actin genes exist as single copies, analysis of their synonymous codon usage and of the substitutions occurring between the genes of closely related species shows that there is a positive selection for codons that do not have highly mutable CpG dinucleotides in codon positions 2 and 3 when the GC content of these genes is less than 57%.

10.
All established methods for detecting positive selection at the molecular level rely on comparisons between nucleotide sequences. An exceptional method that purports to detect selection on the basis of a single genomic sequence has recently been proposed. This method uses a measure called "codon volatility," defined for each codon as the ratio between the number of nonsynonymous codons that differ from the codon under study at a single nucleotide position and the number of sense codons that differ from the codon under study at a single nucleotide position. Here, we examine various properties of codon volatility and its derivatives and use simulation of evolutionary processes to determine whether they can be used to detect selective pressures. Codons for only four amino acids (glycine, leucine, arginine, and serine) show any variation in codon volatility. Thus, codon volatility is mainly a proxy for amino acid usage, rather than for codon usage, with 65% of all synonymous changes and 27% of all nonsynonymous changes being undetectable by this measure. Genes identified by the volatility method as being subject to positive selection tend to have idiosyncratic amino acid compositions (e.g., they are glycine rich or arginine poor). An additional property of codon volatility is the near zero variance of its mean expectation, which translates into overestimated statistical significance estimates, especially in the absence of corrections for multiple comparisons. A comparison with measures of selection inferred through comparative methodology reveals no relationship between the results of the two methods. Finally, we show that codon volatility can increase in the absence of positive Darwinian selection; that is, increased codon volatility is not indicative of positive selection.
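The definition of codon volatility quoted in this abstract can be made concrete with a short sketch. It assumes the standard nuclear genetic code and uses hypothetical function names; it computes only the per-codon ratio as defined above, not the gene-level statistics of the volatility method.

```python
from fractions import Fraction
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def volatility(codon):
    """Nonsynonymous one-step neighbours divided by sense one-step neighbours."""
    sense = nonsyn = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            neighbour = codon[:pos] + base + codon[pos + 1:]
            aa = CODON_TABLE[neighbour]
            if aa == "*":
                continue  # stop codons are not sense codons
            sense += 1
            if aa != CODON_TABLE[codon]:
                nonsyn += 1
    return Fraction(nonsyn, sense)


def amino_acids_with_variable_volatility():
    """Amino acids whose synonymous codons do not all share one volatility value."""
    by_aa = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            by_aa.setdefault(aa, set()).add(volatility(codon))
    return {aa for aa, values in by_aa.items() if len(values) > 1}


variable = amino_acids_with_variable_volatility()
```

Under these assumptions, `variable` contains G, L, R, and S, in line with the abstract's statement that only glycine, leucine, arginine, and serine codons vary in volatility.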

11.
In many organisms, synonymous codon usage is biased by a history of natural selection. However, codon bias itself does not indicate that selection is ongoing; it may be a vestige of past selection. Simple statistical tests have been devised to infer ongoing selection on codon usage by comparing the derived state frequency spectra at polymorphic sites segregating either derived preferred codons or derived unpreferred codons; if selection is effective, the frequency of derived states should be higher in the former. We propose a new test that uses the inferred degree of preference, essentially calculating the correlation of derived state frequency and the difference in preference between the derived and the ancestral states; the correlation should be positive if selection is effective. When implementing the test, derived and ancestral states can be assigned by parsimony or on the basis of relative probability. In either case, statistical significance is estimated by a simple permutation test. We explored the statistical power of the test by sampling polymorphism data from 14 loci in 16 strains of D. simulans, finding that the test retains 80% power even when quite a few of the data are discarded. The power of the test likely reflects better use of multiple features of the data, combining population frequencies of polymorphic variants and quantitative estimates of codon preferences. We also applied this novel test to 14 newly sequenced loci in five strains of D. mauritiana, showing for the first time ongoing selection on codon usage in this species.
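The correlation-plus-permutation idea in this abstract can be sketched as below. The abstract does not say which correlation coefficient or permutation scheme the authors used, so Pearson's r, the one-sided p-value, the function names, and the toy data are all assumptions made for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_test(derived_freq, pref_diff, n_perm=2000, seed=0):
    """One-sided permutation test for a positive correlation.

    derived_freq: derived-state frequency at each polymorphic site.
    pref_diff: preference of the derived codon minus that of the ancestral codon.
    Returns (observed_r, p_value).
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = pearson(derived_freq, pref_diff)
    shuffled = list(pref_diff)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(derived_freq, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (invented): frequency rises with the preference difference.
freqs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
diffs = [1, 2, 3, 4, 5, 6, 7, 8]
r, p = permutation_test(freqs, diffs)
```

A positive r with a small p would be read, in the abstract's terms, as evidence that selection on codon usage is effective.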

12.
In the analysis of protein-coding nucleotide sequences, the ratio of the number of nonsynonymous substitutions to that of synonymous substitutions (d(N)/d(S)) is used as an indicator for the direction and magnitude of natural selection operating at the amino acid sequence level. The d(S) and d(N) values are estimated based on the comparison of homologous codons, which are often identified by converting (reverse-translating) aligned amino acid sequences into codon sequences. In this method, however, homologous codons may be misidentified when frame-shifts have occurred or amino acid sequences have been misaligned, which may lead to overestimation of the d(N)/d(S) ratio. Here the effect of reverse-translating aligned amino acid sequences on the estimation of the d(N)/d(S) ratio was examined through a large-scale analysis of protein-coding nucleotide sequences from vertebrate species. Apparently, 1-9% of codon sites that were identified as homologous with reverse-translation contained non-homologous codons, where the d(N)/d(S) ratio was unduly high. By correcting the d(N)/d(S) ratio for these codon sites, it was inferred that the ratio was overestimated by 5-43% with reverse-translation. These results suggest that caution should be exercised in the study of natural selection using the d(N)/d(S) ratio by reverse-translating aligned amino acid sequences.

13.
Natural selection operating on amino acid substitution at single amino acid sites can be detected by comparing the rates of synonymous (r(S)) and nonsynonymous (r(N)) nucleotide substitution at single codon sites. Amino acid substitutions can be classified as conservative or radical according to whether they retain the properties of the substituted amino acid. Here methods for comparing the rates of conservative (r(C)) and radical (r(R)) nonsynonymous substitution with r(S) at single codon sites were developed to detect natural selection operating on these substitutions at single amino acid sites. A method for comparing r(C) and r(R) at single codon sites was also developed to detect biases toward these substitutions at single amino acid sites. Charge was used as the property of the amino acids. In a computer simulation, false-positive rates of these methods were always < 5%, unless termination sites were included in the computation of the numbers of sites and estimates of transition/transversion rate ratio were highly biased. The frequency of detection of natural selection operating on conservative substitution was almost independent of the presence of natural selection operating on radical substitution, and vice versa. Natural selection operating specifically on conservative and radical substitution was detected more efficiently by comparing r(S) with r(C) and r(S) with r(R) than by comparing r(S) with r(N). These methods also appeared to be robust against the occurrence of recombination during evolution. In an analysis of class I human leukocyte antigen, negative selection operating on conservative substitution, but not positive selection operating on radical substitution, was observed at some of the codon sites with r(R) > r(C), suggesting that r(R) > r(C) may not necessarily be an indicator of positive selection operating on radical substitution.

14.
Mitochondrial genetic codons can be categorized by four patterns of nucleotide-site degeneracy based on varying combinations of twofold- or nondegenerate sites at first codon positions and twofold- or fourfold-degenerate sites at third codon positions. Herein, a model of molecular evolution is introduced that uses these patterns to calculate expected substitution frequencies for each codon position and substitution type relative to overall number of synonymous or nonsynonymous substitutions. Regions of the pocket gopher cytochrome oxidase subunit I (COI) and cytochrome b (cyt-b) genes are analyzed using this model. Chi-square distributions are used to produce relative goodness-of-fit (GF) scores for measuring the difference between substitution frequencies predicted by the codon-degeneracy model (CDM), and frequencies inferred using a well-supported phylogenetic tree of closely related species. The GF scores for expected and observed synonymous (GFsyn = 0.429, p = 0.807) and nonsynonymous (GFns = 2.309, p = 0.679) substitution frequencies resulted in a failure to reject the CDM as a null hypothesis for the molecular evolution of COI and cyt-b in pocket gophers. Alternative tree topologies and calculations of transition bias for these data result in higher GF scores. Received: 25 March 1999 / Accepted: 17 September 1999

15.
A number of statistical tests have been proposed to detect positive Darwinian selection affecting a few amino acid sites in a protein, exemplified by an excess of nonsynonymous nucleotide substitutions. These tests are often more powerful than pairwise sequence comparison, which averages synonymous (d(S)) and nonsynonymous (d(N)) rates over the whole gene. In a recent study, however, Hughes AL and Friedman R (2005. Variation in the pattern of synonymous and nonsynonymous difference between two fungal genomes. Mol Biol Evol. 22: 1320-1324) argue that d(S) and d(N) are expected to fluctuate along the sequence by chance and that an excess of nonsynonymous differences in individual codons is no evidence for positive selection. The authors compared codons in protein-coding genes from the genomes of 2 yeast species, Saccharomyces cerevisiae and Saccharomyces paradoxus. They calculated the proportions of synonymous and nonsynonymous differences per site (p(S) and p(N)) in every codon and discovered that p(N) is often greater than p(S) and that among some codons p(S) and p(N) are negatively correlated. The authors argued that these results invalidate previous tests of codons under positive selection. Here I discuss several errors of statistics in the analysis of Hughes and Friedman, including confusion of statistics with parameters, arbitrary data filtering, and derivation of hypotheses from data. I also apply likelihood ratio tests of positive selection to the yeast data and illustrate empirically that Hughes and Friedman's criticisms of such tests are not valid.

16.
Selection Intensity for Codon Bias   Total citations: 26 (self-citations: 7, citations by others: 19)
D. L. Hartl, E. N. Moriyama, S. A. Sawyer. Genetics, 1994, 138(1): 227-234
The patterns of nonrandom usage of synonymous codons (codon bias) in enteric bacteria were analyzed. Poisson random field (PRF) theory was used to derive the expected distribution of frequencies of nucleotides differing from the ancestral state at aligned sites in a set of DNA sequences. This distribution was applied to synonymous nucleotide polymorphisms and amino acid polymorphisms in the gnd and putP genes of Escherichia coli. For the gnd gene, the average intensity of selection against disfavored synonymous codons was estimated as approximately 7.3 × 10^(-9); this value is significantly smaller than the estimated selection intensity against selectively disfavored amino acids in observed polymorphisms (2.0 × 10^(-8)), but it is approximately of the same order of magnitude. The selection coefficients for optimal synonymous codons estimated from PRF theory were consistent with independent estimates based on codon usage for threonine and glycine. Across 118 genes in E. coli and Salmonella typhimurium, the distribution of estimated selection coefficients, expressed as multiples of the effective population size, has a mean and standard deviation of 0.5 +/- 0.4. No significant differences were found in the degree of codon bias between conserved positions and replacement positions, suggesting that translational misincorporation is not an important selective constraint among synonymous polymorphic codons in enteric bacteria. However, across the first 100 codons of the genes, conserved amino acids with identical codons have significantly greater codon bias than that of either synonymous or nonidentical codons, suggesting that there are unique selective constraints, perhaps including mRNA secondary structures, in this part of the coding region.

17.
Codon Substitution in Evolution and the "Saturation" of Synonymous Changes   Total citations: 4 (self-citations: 1, citations by others: 3)
Takashi Gojobori. Genetics, 1983, 105(4): 1011-1027
A mathematical model for codon substitution is presented, taking into account unequal mutation rates among different nucleotides and purifying selection. This model is constructed by using a 61 × 61 transition probability matrix for the 61 nonterminating codons. Under this model, a computer simulation is conducted to study the numbers of silent (synonymous) and amino acid-altering (nonsynonymous) nucleotide substitutions when the underlying mutation rates among the four kinds of nucleotides are not equal. It is assumed that the substitution rates are constant over evolutionary time, the codon frequencies being in equilibrium, and, thus, the numbers of synonymous and nonsynonymous substitutions both increase linearly with evolutionary time. It is shown that, when the mutation rates are not equal, the estimate of synonymous substitutions obtained by F. Perler, A. Efstratiadis, P. Lomedico, W. Gilbert, R. Kolodner and J. Dodgson's "Percent Corrected Divergence" method increases nonlinearly, although the true number of synonymous substitutions increases linearly. It is, therefore, possible that the "saturation" of synonymous substitutions observed by Perler et al. is due to the inefficiency of their method to detect all synonymous substitutions.

18.
A O Urrutia, L D Hurst. Genetics, 2001, 159(3): 1191-1199
In numerous species, from bacteria to Drosophila, evidence suggests that selection acts even on synonymous codon usage: codon bias is greater in more abundantly expressed genes, the rate of synonymous evolution is lower in genes with greater codon bias, and there is consistency between genes in the same species in which codons are preferred. In contrast, in mammals, while nonequal use of alternative codons is observed, the bias is attributed to the background variance in nucleotide concentrations, reflected in the similar nucleotide composition of flanking noncoding and exonic third sites. However, a systematic examination of the covariants of codon usage controlling for background nucleotide content has yet to be performed. Here we present a new method to measure codon bias that corrects for background nucleotide content and apply this to 2396 human genes. Nearly all (99%) exhibit a higher amount of codon bias than expected by chance. The patterns associated with selectively driven codon bias are weakly recovered: Broadly expressed genes have a higher level of bias than do tissue-specific genes, the bias is higher for genes with lower rates of synonymous substitutions, and certain codons are repeatedly preferred. However, while these patterns are suggestive, the first two patterns appear to be methodological artifacts. The last pattern reflects in part biases in usage of nucleotide pairs. We conclude that we find no evidence for selection on codon usage in humans.

19.
The spatial distribution of synonymous substitutions in enterobacterial genes is investigated. It is shown that synonymous substitutions are significantly clustered in such a way that a synonymous substitution in one codon elevates the rate of synonymous substitution in an adjacent codon by about 10%. The level of clustering does not appear to be related to the level of gene expression, and it is restricted to a range of two or three codons. There are at least three possible explanations: (1) sequence-directed mutagenesis, (2) recombination, and (3) selection.

20.
To elucidate the evolutionary mechanisms of the human immunodeficiency virus type 1 gp120 envelope glycoprotein at the single-site level, the degree of amino acid variation and the numbers of synonymous and nonsynonymous substitutions were examined in 186 nucleotide sequences for gp120 (subtype B). Analyses of amino acid variabilities showed that the level of variability was very different from site to site in both conserved (C1 to C5) and variable (V1 to V5) regions previously assigned. To examine the relative importance of positive and negative selection for each amino acid position, the numbers of synonymous and nonsynonymous substitutions that occurred at each codon position were estimated by taking phylogenetic relationships into account. Among the 414 codon positions examined, we identified 33 positions where nonsynonymous substitutions were significantly predominant. These positions where positive selection may be operating, which we call putative positive selection (PS) sites, were found not only in the variable loops but also in the conserved regions (C1 to C4). In particular, we found seven PS sites at the surface positions of the alpha-helix (positions 335 to 347 in the C3 region) in the opposite face for CD4 binding. Furthermore, two PS sites in the C2 region and four PS sites in the C4 region were detected in the same face of the protein. The PS sites found in the C2, C3, and C4 regions were separated in the amino acid sequence but close together in the three-dimensional structure. This observation suggests the existence of discontinuous epitopes in the protein's surface including this alpha-helix, although the antigenicity of this area has not been reported yet.
