Similar literature
 20 similar articles found (search time: 31 ms)
1.
Natural selection operating at the amino acid sequence level can be detected by comparing the rates of synonymous (r(S)) and nonsynonymous (r(N)) nucleotide substitutions, where r(N)/r(S) (omega) > 1 and omega < 1 suggest positive and negative selection, respectively. The branch-site test has been developed for detecting positive selection operating at a group of amino acid sites for a pre-specified (foreground) branch of a phylogenetic tree by taking into account the heterogeneity of omega among sites and branches. Here the performance of the branch-site test was examined by computer simulation, with special reference to the false-positive rate when the divergence of the sequences analyzed was small. The false-positive rate was found to be inflated when the assumptions made on the omega values for the foreground and other (background) branches in the branch-site test were violated. In addition, under a similar condition, false-positive results were often obtained even when Bonferroni correction was conducted and the false-discovery rate was controlled in a large-scale analysis. False-positive results were also obtained even when the number of nonsynonymous substitutions for the foreground branch was smaller than the minimum value required for detecting positive selection. The existence of a codon site with a possibility of occurrence of multiple nonsynonymous substitutions for the foreground branch often caused the branch-site test to falsely identify positive selection. In the re-analysis of orthologous trios of protein-coding genes from humans, chimpanzees, and macaques, most of the genes previously identified to be positively selected for the human or chimpanzee branch by the branch-site test contained such a codon site, suggesting that a significant fraction of these genes may be false positives.

2.
ADAPTSITE: detecting natural selection at single amino acid sites.   Total citations: 12 (self-citations: 0; citations by others: 12)
ADAPTSITE is a program package for detecting natural selection at single amino acid sites, using a multiple alignment of protein-coding sequences for a given phylogenetic tree. The program infers ancestral codons at all interior nodes, and computes the total numbers of synonymous (c(S)) and nonsynonymous (c(N)) substitutions as well as the average numbers of synonymous (s(S)) and nonsynonymous (s(N)) sites for each codon site. The probabilities of occurrence of synonymous and nonsynonymous substitutions are approximated by s(S) / (s(S) + s(N)) and s(N) / (s(S) + s(N)), respectively. The null hypothesis of selective neutrality is tested for each codon site, assuming a binomial distribution for the probability of obtaining c(S) and c(N). AVAILABILITY: ADAPTSITE is available free of charge at the World-Wide Web sites http://mep.bio.psu.edu/adaptivevol.html and http://www.cib.nig.ac.jp/dda/yossuzuk/welcome.html. The package includes the source code written in C, binary files for UNIX operating systems, manual, and example files.
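
The per-site neutrality test described in this abstract can be sketched roughly as follows. This is an illustration only, not ADAPTSITE's actual code: the function name `binomial_tail_probs` is hypothetical, the counts c(S) and c(N) are assumed to be integers, and returning both one-tailed probabilities is an assumption, since the abstract does not say which tails the program reports.

```python
from math import comb


def binomial_tail_probs(c_s, c_n, s_s, s_n):
    """Return (p_positive, p_negative) for a single codon site.

    c_s, c_n: total synonymous / nonsynonymous substitutions at the site
              (assumed to be integers in this sketch)
    s_s, s_n: average numbers of synonymous / nonsynonymous sites
    """
    n = c_s + c_n
    # Under neutrality, a substitution is nonsynonymous with this probability.
    p_n = s_n / (s_s + s_n)
    # Binomial probability of observing k nonsynonymous changes out of n.
    pmf = [comb(n, k) * p_n**k * (1 - p_n) ** (n - k) for k in range(n + 1)]
    p_positive = sum(pmf[c_n:])      # P(X >= c_n): excess of nonsynonymous changes
    p_negative = sum(pmf[: c_n + 1])  # P(X <= c_n): deficit of nonsynonymous changes
    return p_positive, p_negative
```

A small `p_positive` would point toward positive selection at the site and a small `p_negative` toward negative selection, under the stated assumptions.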

3.
N G Smith, L D Hurst. Genetics, 1999, 153(3): 1395-1402
Nonsynonymous substitutions in DNA cause amino acid substitutions while synonymous substitutions in DNA leave amino acids unchanged. The cause of the correlation between the substitution rates at nonsynonymous (K(A)) and synonymous (K(S)) sites in mammals is a contentious issue, and one that impacts on many aspects of molecular evolution. Here we use a large set of orthologous mammalian genes to investigate the causes of the K(A)-K(S) correlation in rodents. The strength of the K(A)-K(S) correlation exceeds the neutral theory expectation when substitution rates are estimated using algorithmic methods, but not when substitution rates are estimated by maximum likelihood. Irrespective of this methodological uncertainty the strength of the K(A)-K(S) correlation appears mostly due to tandem substitutions, an excess of which is generated by substitutional nonindependence. Doublet mutations cannot explain the excess of tandem synonymous-nonsynonymous substitutions, and substitution patterns indicate that selection on silent sites is the likely cause. We find no evidence for selection on codon usage. The nature of the relationship between synonymous divergence and base composition is unclear because we find a significant correlation if we use maximum-likelihood methods but not if we use algorithmic methods. Finally, we find that K(S) is reduced at the start of genes, which suggests that selection for RNA structure may affect silent sites in mammalian protein-coding genes.

4.
To understand the process and mechanism of protein evolution, it is important to know what types of amino acid substitutions are more likely to be under selection and what types are mostly neutral. An amino acid substitution can be classified as either conservative or radical, depending on whether it involves a change in a certain physicochemical property of the amino acid. Assuming Kimura's two-parameter model of nucleotide substitution, I present a method for computing the numbers of conservative and radical nonsynonymous (amino acid altering) nucleotide substitutions per site and estimate these rates for 47 nuclear genes from mammals. The results are as follows. (1) The average radical/conservative rate ratio is 0.81 for charge changes, 0.85 for polarity changes, and 0.49 when both polarity and volume changes are considered. (2) The radical/conservative rate ratio is positively correlated with the nonsynonymous/synonymous rate ratio for charge changes or when both polarity and volume changes are considered. (3) Both the conservative/synonymous rate ratio and the radical/synonymous rate ratio are lower in the rodent lineage than in the primate or artiodactyl lineage, suggesting more intense purifying selection in the rodent lineage, for both conservative and radical nonsynonymous substitutions. (4) Neglecting transition/transversion bias would cause an underestimation of both radical and conservative rates and the ratio thereof. (5) Transversions induce more dramatic genetic alterations than transitions in that transversions produce more amino acid altering changes and, among these, more radical changes. Received: 6 April 1999 / Accepted: 16 August 1999
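
The conservative/radical distinction in this abstract can be illustrated with a short sketch for the charge property. The grouping below (R, H, K positive; D, E negative; all others neutral) is a commonly used assumption and is not taken from the abstract, which does not list its classes; the names `CHARGE_CLASS` and `classify_by_charge` are hypothetical.

```python
# Assumed charge classes for the 20 standard amino acids (not from the abstract).
CHARGE_CLASS = {}
for aa in "RHK":
    CHARGE_CLASS[aa] = "positive"
for aa in "DE":
    CHARGE_CLASS[aa] = "negative"
for aa in "ACFGILMNPQSTVWY":
    CHARGE_CLASS[aa] = "neutral"


def classify_by_charge(aa_from, aa_to):
    """Label an amino acid replacement as 'identical', 'conservative' or 'radical'.

    A replacement is 'radical' when it moves between charge classes and
    'conservative' when both residues fall in the same class.
    """
    if aa_from == aa_to:
        return "identical"
    if CHARGE_CLASS[aa_from] == CHARGE_CLASS[aa_to]:
        return "conservative"
    return "radical"
```

Swapping in a polarity or polarity-and-volume grouping would only require a different class table; the comparison logic stays the same.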

5.
In the analysis of protein-coding nucleotide sequences, the ratio of the number of nonsynonymous substitutions to that of synonymous substitutions (d(N)/d(S)) is used as an indicator for the direction and magnitude of natural selection operating at the amino acid sequence level. The d(S) and d(N) values are estimated based on the comparison of homologous codons, which are often identified by converting (reverse-translating) aligned amino acid sequences into codon sequences. In this method, however, homologous codons may be mis-identified when frame-shifts occurred or amino acid sequences were mis-aligned, which may lead to overestimation of the d(N)/d(S) ratio. Here the effect of reverse-translating aligned amino acid sequences on the estimation of d(N)/d(S) ratio was examined through a large-scale analysis of protein-coding nucleotide sequences from vertebrate species. Apparently, 1-9% of codon sites that were identified as homologous with reverse-translation contained non-homologous codons, where the d(N)/d(S) ratio was unduly high. By correcting the d(N)/d(S) ratio for these codon sites, it was inferred that the ratio was 5-43% overestimated with reverse-translation. These results suggest that caution should be exercised in the study of natural selection using the d(N)/d(S) ratio by reverse-translating aligned amino acid sequences.
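
The reverse-translation step this abstract refers to can be sketched as below: gaps in an aligned amino acid sequence are expanded to three-nucleotide gaps and each residue is replaced by the next codon of the unaligned coding sequence. The function name `reverse_translate_alignment` and the `-` gap symbol are assumptions for illustration, not a tool used in the study.

```python
def reverse_translate_alignment(aligned_protein, cds, gap="-"):
    """Thread an unaligned coding sequence onto an aligned protein sequence.

    aligned_protein: amino acid sequence with gap characters from an alignment
    cds: the corresponding coding sequence, without gaps or stop codon
    Returns the codon-level aligned sequence.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if n_residues != len(codons):
        raise ValueError("number of residues and codons differ")
    codon_iter = iter(codons)
    # Each gap becomes a three-column gap; each residue takes the next codon.
    return "".join(gap * 3 if aa == gap else next(codon_iter)
                   for aa in aligned_protein)
```

Note that this sketch never checks that a codon actually encodes the residue it is paired with, so a frame-shifted or mis-aligned input passes through silently; that is the kind of mis-identified homology the abstract warns about.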

6.
To elucidate the evolutionary mechanisms of the human immunodeficiency virus type 1 gp120 envelope glycoprotein at the single-site level, the degree of amino acid variation and the numbers of synonymous and nonsynonymous substitutions were examined in 186 nucleotide sequences for gp120 (subtype B). Analyses of amino acid variabilities showed that the level of variability was very different from site to site in both conserved (C1 to C5) and variable (V1 to V5) regions previously assigned. To examine the relative importance of positive and negative selection for each amino acid position, the numbers of synonymous and nonsynonymous substitutions that occurred at each codon position were estimated by taking phylogenetic relationships into account. Among the 414 codon positions examined, we identified 33 positions where nonsynonymous substitutions were significantly predominant. These positions where positive selection may be operating, which we call putative positive selection (PS) sites, were found not only in the variable loops but also in the conserved regions (C1 to C4). In particular, we found seven PS sites at the surface positions of the alpha-helix (positions 335 to 347 in the C3 region) in the opposite face for CD4 binding. Furthermore, two PS sites in the C2 region and four PS sites in the C4 region were detected in the same face of the protein. The PS sites found in the C2, C3, and C4 regions were separated in the amino acid sequence but close together in the three-dimensional structure. This observation suggests the existence of discontinuous epitopes in the protein's surface including this alpha-helix, although the antigenicity of this area has not been reported yet.

7.
There are 2 ways to infer selection pressures in the evolution of protein-coding genes, the nonsynonymous and synonymous substitution rate ratio (K(A)/K(S)) and the radical and conservative amino acid replacement rate ratio (K(R)/K(C)). Because the K(R)/K(C) ratio depends on the definition of radical and conservative changes in the classification of amino acids, we develop an amino acid classification that maximizes the correlation between K(A)/K(S) and K(R)/K(C). An analysis of 3,375 orthologous gene groups among 5 mammalian species shows that our classification gives a significantly higher correlation coefficient between the 2 ratios than those of existing classifications. However, there are many orthologous gene groups with a low K(A)/K(S) but a high K(R)/K(C) ratio. Examining the functions of these genes, we found an overrepresentation of functional categories related to development. To determine if the overrepresentation is stage specific, we examined the expression patterns of these genes at different developmental stages of the mouse. Interestingly, these genes are highly expressed in the early middle stage of development (blastocyst to amnion). It is commonly thought that developmental genes tend to be conservative in evolution, but some molecular changes in developmental stages should have contributed to morphological divergence in adult mammals. Therefore, we propose that the relaxed pressures indicated by the K(R)/K(C) ratio but not by K(A)/K(S) in the early middle stage of development may be important for the morphological divergence of mammals at the adult stage, whereas purifying selection detected by K(A)/K(S) occurs in the early middle developmental stage.

8.
Substitution rates at the three codon positions (r1, r2, and r3) of mammalian mitochondrial genes are in the order of r3 > r1 > r2, and the rate heterogeneity at the three positions, as measured by the shape parameter of the gamma distribution (alpha 1, alpha 2, and alpha 3), is in the order of alpha 3 > alpha 1 > alpha 2. The causes for the rate heterogeneity at the three codon positions remain unclear and, in particular, there has been no satisfactory explanation for the observation of alpha 1 > alpha 2. I attempted to dissect the causes of rate heterogeneity by studying the pattern of nonsynonymous substitutions with respect to codon positions in 10 mitochondrial genes from 19 mammalian species. Nonsynonymous substitutions involve more different amino acid replacements at the second than at the first codon position, which results in r1 > r2. The difference between r1 and r2 increases with the intensity of purifying selection, and so does the rate heterogeneity in nonsynonymous substitutions among sites at the same codon position. All mitochondrial genes appear to have functionally important and unimportant codons, with the latter having all three codon positions prone to nonsynonymous substitutions. Within the functionally important codons, the second codon position is much more conservative than the first codon position. This explains why alpha 1 > alpha 2. The result suggests that overweighting of the second codon position in phylogenetic analysis may be a misguided practice.

9.
Maximum-likelihood models of codon substitution were used to analyze sperm lysin genes of 25 abalone (Haliotis) species to identify lineages and amino acid sites under diversifying selection. The models used the nonsynonymous/synonymous rate ratio (omega = d(N)/d(S)) as an indicator of selective pressure and allowed the ratio to vary among lineages or sites. Likelihood ratio tests suggested significant variation in selective pressure among lineages. The variable selective pressure provided an explanation for the previous observation that the omega ratio is >1 in comparisons of closely related species and <1 in comparisons of distantly related species. Computer simulations demonstrated that saturation of nonsynonymous substitutions and constraint on lysin structure were unlikely to account for the observed pattern. Lineages linking closely related sympatric species appeared to be under diversifying selection, while lineages separating distantly related species from different geographic locations were associated with low evolutionary rates. The selective pressure indicated by the omega ratio was found to vary greatly among amino acid sites in lysin. Sites under potential diversifying selection were identified. Ancestral lysins were inferred to trace the route of evolution at individual sites and to provide lysin sequences for future laboratory studies.
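
A likelihood ratio test of the kind mentioned in this abstract compares the maximized log-likelihoods of two nested models. The sketch below assumes the usual chi-square approximation for the test statistic, which the abstract does not state, and handles only 1 or 2 degrees of freedom so that it needs nothing beyond the standard library. The function name `lrt_p_value` is hypothetical.

```python
from math import erfc, exp, sqrt


def lrt_p_value(lnl_null, lnl_alt, df):
    """Return (statistic, p_value) for a likelihood ratio test.

    lnl_null, lnl_alt: maximized log-likelihoods of the null and the
                       alternative (more general) model
    df: difference in the number of free parameters (1 or 2 in this sketch)
    """
    # Test statistic: twice the log-likelihood difference, floored at zero.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        p_value = erfc(sqrt(stat / 2.0))  # chi-square survival function, 1 df
    elif df == 2:
        p_value = exp(-stat / 2.0)        # chi-square survival function, 2 df
    else:
        raise ValueError("this sketch only supports df = 1 or df = 2")
    return stat, p_value
```

For other degrees of freedom a statistics library would be the natural choice; the closed forms here are used only to keep the example self-contained.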

10.
New Methods for Detecting Positive Selection at Single Amino Acid Sites   Total citations: 15 (self-citations: 0; citations by others: 15)
Inferring positive selection at single amino acid sites is of particular importance for studying evolutionary mechanisms of a protein. For this purpose, Suzuki and Gojobori (1999) developed a method (SG method) for comparing the rates of synonymous and nonsynonymous substitutions at each codon site in a protein-coding nucleotide sequence, using ancestral codons at interior nodes of the phylogenetic tree as inferred by the maximum parsimony method. In the SG method, however, selective neutrality of nucleotide substitutions cannot be tested at codon sites, where only termination codons are inferred at any interior node or the number of equally parsimonious inferences of ancestral codons at all interior nodes exceeds 10,000. Here I present a modified SG method which is free from these problems. Specifically, I use the distance-based Bayesian method for inferring the single most likely ancestral codon from 61 sense codons at each interior node. In the computer simulation and real data analysis, the modified SG method showed a higher overall efficiency of detecting positive selection than the original SG method, particularly at highly polymorphic codon sites. These results indicate that the modified SG method is useful for inferring positive selection at codon sites where neutrality cannot be tested by the original SG method. I also discuss that the p-distance is preferable to the number of synonymous substitutions for inferring the phylogenetic tree in the SG method, and present a maximum likelihood method for detecting positive selection at single amino acid sites, which produced reasonable results in the real data analysis.

11.
Maximum-likelihood models of codon and amino acid substitution were used to analyze the lung-specific surfactant protein C (SP-C) from terrestrial, semi-aquatic, and diving mammals to identify lineages and amino acid sites under positive selection. Site models used the nonsynonymous/synonymous rate ratio (ω) as an indicator of selection pressure. Mechanistic models used physicochemical distances between amino acid substitutions to specify nonsynonymous substitution rates. Site models strongly identified positive selection at different sites in the polar N-terminal extramembrane domain of SP-C in the three diving lineages: site 2 in the cetaceans (whales and dolphins), sites 7, 9, and 10 in the pinnipeds (seals and sea lions), and sites 2, 9, and 10 in the sirenians (dugongs and manatees). The only semi-aquatic contrast to indicate positive selection at site 10 was that including the polar bear, which had the largest body mass of the semi-aquatic species. Analysis of the biophysical properties that were influential in determining the amino acid substitutions showed that isoelectric point, chemical composition of the side chain, polarity, and hydrophobicity were the crucial determinants. Amino acid substitutions at these sites may lead to stronger binding of the N-terminal domain to the surfactant phospholipid film and to increased adsorption of the protein to the air-liquid interface. Both properties are advantageous for the repeated collapse and reinflation of the lung upon diving and resurfacing and may reflect adaptations to the high hydrostatic pressures experienced during diving. Electronic supplementary material The online version of this article (doi:) contains supplementary material, which is available to authorized users. Reviewing Editor: Dr. Richard Kliman  相似文献   

12.
A method for detecting positive selection at single amino acid sites   Total citations: 23 (self-citations: 0; citations by others: 23)
A method was developed for detecting the selective force at single amino acid sites given a multiple alignment of protein-coding sequences. The phylogenetic tree was reconstructed using the number of synonymous substitutions. Then, the neutrality was tested for each codon site using the numbers of synonymous and nonsynonymous changes throughout the phylogenetic tree. Computer simulation showed that this method accurately estimated the numbers of synonymous and nonsynonymous substitutions per site, as long as the substitution number on each branch was relatively small. The false-positive rate for detecting the selective force was generally low. On the other hand, the true-positive rate for detecting the selective force depended on the parameter values. Within the range of parameter values used in the simulation, the true-positive rate increased as the strength of the selective force and the total branch length (namely the total number of synonymous substitutions per site) in the phylogenetic tree increased. In particular, with the relative rate of nonsynonymous substitutions to synonymous substitutions being 5.0, most of the positively selected codon sites were correctly detected when the total branch length in the phylogenetic tree was ≥ 2.5. When this method was applied to the human leukocyte antigen (HLA) gene, which included antigen recognition sites (ARSs), positive selection was detected mainly on ARSs. This finding confirmed the effectiveness of the present method with actual data. Moreover, two amino acid sites were newly identified as positively selected in non-ARSs. The three-dimensional structure of the HLA molecule indicated that these sites might be involved in antigen recognition. Positively selected amino acid sites were also identified in the envelope protein of human immunodeficiency virus and the influenza virus hemagglutinin protein. This method may be helpful for predicting functions of amino acid sites in proteins, especially in the present situation, in which sequence data are accumulating at an enormous speed.

13.
We surveyed the molecular evolutionary characteristics of 11 nuclear genes from 10 conifer trees belonging to the Taxodioideae, the Cupressoideae, and the Sequoioideae. Comparisons of substitution rates among the lineages indicated that the synonymous substitution rates of the Cupressoideae lineage were higher than those of the Taxodioideae. This result parallels the pattern previously found in plastid genes. Likelihood-ratio tests showed that the nonsynonymous-synonymous rate ratio did not change significantly among lineages. In addition, after adjustments for lineage effects, the dispersion indices of synonymous and nonsynonymous substitutions were considerably reduced, and the latter was close to 1. These results indicated that the acceleration of evolutionary rates in the Cupressoideae lineage occurred in both the nuclear and plastid genomes, and that generally, this lineage effect affected synonymous and nonsynonymous substitutions similarly. We also investigated the relationship of synonymous substitution rates with the nonsynonymous substitution rate, base composition, and codon bias in each lineage. Synonymous substitution rates were positively correlated with nonsynonymous substitution rates and GC content at third codon positions, but synonymous substitution rates were not correlated with codon bias. Finally, we tested the possibility of positive selection at the protein level, using maximum likelihood models, assuming heterogeneous nonsynonymous-synonymous rate ratios among codon (amino acid) sites. Although we did not detect strong evidence of positively selected codon sites, the analysis suggested that significant variation in nonsynonymous-synonymous rate ratio exists among the sites. The most likely sites for action of positive selection were found in the ferredoxin gene, which is an important component of the apparatus for photosynthesis.  相似文献   

14.
Maliarchuk BA. Genetika, 2012, 48(6): 713-718
Sequence analysis of the cytochrome b gene fragment in salamanders of the genus Salamandrella (Siberian salamander and Schrenk salamander) was performed to elucidate the effect of natural selection on the evolution of mitochondrial DNA (mtDNA) in these species. It was demonstrated that, despite the notable influence of negative selection (expressed as very low dN/dS values), speciation and intraspecific divergence in salamanders were accompanied by the appearance of radical amino acid substitutions caused by positive (directional) selection. To examine the evolutionary pattern of synonymous mtDNA sites, the distribution of conservative and non-conservative substitutions was analyzed. The rates of conservative and non-conservative substitutions were nearly equal, pointing to neutrality of the mutation process at synonymous mtDNA sites of salamanders. Analysis of the distributions of conservative and non-conservative synonymous substitutions in different parts of the phylogenetic trees showed that the differences between the synonymous groups compared were statistically significant in only one phylogenetic group of the Siberian salamander (haplogroup C) (P = 0.02). In the group of single substitutions located at terminal phylogenetic branches of Siberian salamanders from group C, an increased rate of conservative substitutions was observed. Based on these findings, it was suggested that selective processes could have influenced the formation of the synonymous substitution profile in the Siberian salamander mtDNA fragment examined.

15.
A number of statistical tests have been proposed to detect positive Darwinian selection affecting a few amino acid sites in a protein, exemplified by an excess of nonsynonymous nucleotide substitutions. These tests are often more powerful than pairwise sequence comparison, which averages synonymous (d(S)) and nonsynonymous (d(N)) rates over the whole gene. In a recent study, however, Hughes AL and Friedman R (2005. Variation in the pattern of synonymous and nonsynonymous difference between two fungal genomes. Mol Biol Evol. 22: 1320-1324) argue that d(S) and d(N) are expected to fluctuate along the sequence by chance and that an excess of nonsynonymous differences in individual codons is no evidence for positive selection. The authors compared codons in protein-coding genes from the genomes of 2 yeast species, Saccharomyces cerevisiae and Saccharomyces paradoxus. They calculated the proportions of synonymous and nonsynonymous differences per site (p(S) and p(N)) in every codon and discovered that p(N) is often greater than p(S) and that among some codons p(S) and p(N) are negatively correlated. The authors argued that these results invalidate previous tests of codons under positive selection. Here I discuss several errors of statistics in the analysis of Hughes and Friedman, including confusion of statistics with parameters, arbitrary data filtering, and derivation of hypotheses from data. I also apply likelihood ratio tests of positive selection to the yeast data and illustrate empirically that Hughes and Friedman's criticisms of such tests are not valid.

16.
The number of N-linked glycosylation sites in the globular head of hemagglutinin (HA) has increased during evolution of H3N2 human influenza A virus. Here natural selection operating on the gains of N-linked glycosylation sites was examined by using the single-site analysis and the single-substitution analysis. In the single-site analysis, positive selection was not inferred at the amino acid sites where the substitutions generating N-linked glycosylation sites were observed, but was detected at antigenic sites. In contrast, in the single-substitution analysis, positive selection was detected for the amino acid substitutions generating N-linked glycosylation sites. The single-site analysis and the single-substitution analysis appeared to be suitable for detecting recurrent and episodic natural selection, respectively. The gains of N-linked glycosylation sites were likely to be positively selected for the function of shielding antigenic sites from immune responses. At the antigenic sites, positive selection appeared to have operated not only on the radical substitution but also on the conservative substitution in terms of the charge of amino acids, suggesting that the antigenic drift is not a by-product of the evolution of receptor binding avidity in HA of human H3N2 virus.  相似文献   

17.
The nature of selection on capsid genes of foot-and-mouth disease virus (FMDV) was characterized by examining the ratio of nonsynonymous to synonymous substitutions in 11 data sets of sequences obtained from six different serotypes of FMDV. Using a method of analysis that assigns each codon position to one of a number of estimated values of nonsynonymous to synonymous ratio, significant evidence of positive selection was identified in 5 data sets, operating at 1-7% of codon positions. Evidence of positive selection was identified in complete capsid sequences of serotypes A and C and in VP1 sequences of serotypes SAT 1 and 2. Sequences of serotype SAT-2 recovered from a persistently infected African buffalo also revealed evidence for positive selection. Locations of codons under positive selection coincide closely with those of antigenic sites previously identified with the use of monoclonal antibody escape mutants. The vast majority of codons are under mild to strong purifying selection. However, these results suggest that arising antigenic variants benefit from a selective advantage in their interaction with the immune system, either during the course of an infection or in transmission to individuals with previous exposure to antigen. Analysis of amino acid usage at sites under positive selection indicates that this selective advantage can be conferred by amino acid substitutions that share physicochemically similar properties.  相似文献   

18.
In order to understand the impact of overlapping reading frames on natural selection by host CD8+ T lymphocytes (CD8(+)-TL), we analyzed the pattern of nucleotide substitution in simian immunodeficiency virus (SIV) genomes sampled from populations at time of death in 35 rhesus monkeys. Both the mean number of nonsynonymous nucleotide substitutions per nonsynonymous site (d(N)) and the mean number of synonymous nucleotide substitutions per synonymous site (d(S)) were elevated in overlap regions in comparison to non-overlap regions. Mean d(N) exceeded mean d(S) in CD8(+)-TL epitopes restricted by the host's class I major histocompatibility complex molecules. This pattern, which is indicative of positive Darwinian selection favoring amino acid changes in these epitopes, was seen in both overlap and non-overlap regions; but mean d(N) was particularly elevated in restricted CD8(+)-TL epitopes encoded in overlap regions. Amino acid changes from the inoculum were defined as parallel if the same amino acid change occurred at the same site independently in two or more monkeys, and a surprisingly high proportion (71.9%) of observed amino acid changes throughout the SIV genome occurred in parallel in different monkeys. The proportion of parallel changes in restricted epitopes encoded by overlapping reading frames was still higher (80%), supporting the hypothesis that the interaction of positive selection and overlapping reading frames enhances the probability of convergent or parallel amino acid change.
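
The definition of a parallel change given in this abstract (the same amino acid change at the same site in two or more animals) can be sketched as a simple tally. The function name `parallel_fraction`, the input layout, and the choice to count each change once per animal are assumptions made for illustration; the abstract does not specify how its proportions were computed.

```python
from collections import defaultdict


def parallel_fraction(changes_by_animal):
    """Return the fraction of observed changes that occurred in parallel.

    changes_by_animal: dict mapping an animal id to a set of
                       (site, from_aa, to_aa) tuples relative to the inoculum
    A change is counted as parallel when the identical tuple is seen in
    two or more animals; each animal's copy counts as one observation.
    """
    animals_per_change = defaultdict(set)
    for animal, changes in changes_by_animal.items():
        for change in changes:
            animals_per_change[change].add(animal)
    total = sum(len(animals) for animals in animals_per_change.values())
    parallel = sum(len(animals) for animals in animals_per_change.values()
                   if len(animals) >= 2)
    return parallel / total if total else 0.0
```

Restricting the input to changes inside a given set of epitopes or overlap regions would give the corresponding region-specific proportion.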

19.
Rapid evolution of mammalian X-linked testis-expressed homeobox genes   Total citations: 5 (self-citations: 0; citations by others: 5)
Wang X, Zhang J. Genetics, 2004, 167(2): 879-888

20.
Models of amino acid substitution were developed and compared using maximum likelihood. Two kinds of models are considered. "Empirical" models do not explicitly consider factors that shape protein evolution, but attempt to summarize the substitution pattern from large quantities of real data. "Mechanistic" models are formulated at the codon level and separate mutational biases at the nucleotide level from selective constraints at the amino acid level. They account for features of sequence evolution, such as transition-transversion bias and base or codon frequency biases, and make use of physicochemical distances between amino acids to specify nonsynonymous substitution rates. A general approach is presented that transforms a Markov model of codon substitution into a model of amino acid replacement. Protein sequences from the entire mitochondrial genomes of 20 mammalian species were analyzed using different models. The mechanistic models were found to fit the data better than empirical models derived from large databases. Both the mutational distance between amino acids (determined by the genetic code and mutational biases such as the transition-transversion bias) and the physicochemical distance are found to have strong effects on amino acid substitution rates. A significant proportion of amino acid substitutions appeared to have involved more than one codon position, indicating that nucleotide substitutions at neighboring sites may be correlated. Rates of amino acid substitution were found to be highly variable among sites.   相似文献   
