Similar articles
 20 similar articles found (search time: 390 ms)
1.
A method for estimating the numbers of synonymous (Ks) and nonsynonymous (Ka) substitutions per site is proposed. The method is based on those of Li (J. Mol. Evol. 36:96–99, 1993) and Pamilo and Bianchi (Mol. Biol. Evol. 10:271–281, 1993), but a putative source of bias is removed. It is proposed that the numbers of synonymous substitutions that are actually transitions or transversions should be computed by separating the twofold degenerate sites into two types, 2S-fold and 2V-fold, at which only transitional and only transversional substitutions are synonymous, respectively. Kimura's (J. Mol. Evol. 16:111–120, 1980) two-parameter method of correcting for multiple substitutions at a site is then applied, using the overall observed synonymous transversion frequency to estimate both the number of synonymous transversional (Bs) and the number of synonymous transitional (As) substitutions per site. This approach therefore also minimizes stochastic errors. Computer simulations indicate that the method gives more accurate Ks and Ka estimates than the aforementioned methods. Furthermore, it is proposed that confidence intervals for divergence estimates be obtained by computer simulation.
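As a rough illustration of the correction step mentioned in this abstract, the sketch below applies the standard Kimura (1980) two-parameter formulas to an observed proportion of transitional differences (P) and transversional differences (Q). It is not the authors' full procedure: the function name `k2p_components` and the example proportions are assumptions made for illustration, and the separation of twofold sites into 2S-fold and 2V-fold classes is not implemented.

```python
import math


def k2p_components(p_transition, q_transversion):
    """Kimura two-parameter correction for multiple substitutions.

    p_transition:   observed proportion of sites differing by a transition (P)
    q_transversion: observed proportion of sites differing by a transversion (Q)
    Returns (A, B): estimated transitional and transversional
    substitutions per site; the total distance is A + B.
    """
    a = 1.0 - 2.0 * p_transition - q_transversion
    b = 1.0 - 2.0 * q_transversion
    if a <= 0.0 or b <= 0.0:
        # The logarithms are undefined once the sequences are saturated.
        raise ValueError("observed differences too large for the K2P correction")
    transitions = -0.5 * math.log(a) + 0.25 * math.log(b)
    transversions = -0.5 * math.log(b)
    return transitions, transversions


# Hypothetical observed proportions, chosen only to exercise the function.
A, B = k2p_components(0.10, 0.05)
print(f"A = {A:.4f}, B = {B:.4f}, K = {A + B:.4f}")
```

The corrected values are always at least as large as the observed proportions, because the correction accounts for multiple hits at the same site.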

2.
A new method is proposed for estimating the number of synonymous and nonsynonymous nucleotide substitutions between homologous genes. In this method, a nucleotide site is classified as nondegenerate, twofold degenerate, or fourfold degenerate, depending on how often nucleotide substitutions will result in amino acid replacement; nucleotide changes are classified as either transitional or transversional, and changes between codons are assumed to occur with different probabilities, which are determined by their relative frequencies among more than 3,000 changes in mammalian genes. The method is applied to a large number of mammalian genes. The rate of nonsynonymous substitution is extremely variable among genes; it ranges from 0.004 × 10⁻⁹ (histone H4) to 2.80 × 10⁻⁹ (interferon gamma), with a mean of 0.88 × 10⁻⁹ substitutions per nonsynonymous site per year. The rate of synonymous substitution is also variable among genes; the highest rate is three to four times higher than the lowest one, with a mean of 4.7 × 10⁻⁹ substitutions per synonymous site per year. The rate of nucleotide substitution is lowest at nondegenerate sites (the average being 0.94 × 10⁻⁹), intermediate at twofold degenerate sites (2.26 × 10⁻⁹), and highest at fourfold degenerate sites (4.2 × 10⁻⁹). The implications of our results for the mechanisms of DNA evolution, and of the relative likelihood of codon interchanges for parsimonious phylogenetic reconstruction, are discussed.
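The site classification described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard nuclear genetic code, the names `CODON_TABLE` and `site_degeneracy` are hypothetical, and sites where two of the three possible changes are synonymous (the third position of the isoleucine codons) are lumped with twofold sites, which is a common simplification.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def site_degeneracy(codon):
    """Return the degeneracy class (0, 2 or 4) of each codon position.

    0 = nondegenerate: every change replaces the amino acid
    2 = twofold: some, but not all, changes are synonymous
    4 = fourfold: every change is synonymous
    Stop codons are not treated specially here; callers would
    normally exclude them before counting sites.
    """
    classes = []
    for pos in range(3):
        synonymous = 0
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                synonymous += 1
        if synonymous == 3:
            classes.append(4)
        elif synonymous == 0:
            classes.append(0)
        else:
            classes.append(2)
    return classes


for codon in ("GGT", "TTT", "CTA", "ATG"):
    print(codon, CODON_TABLE[codon], site_degeneracy(codon))
```

Summing these classes over all codons of an alignment gives the numbers of nondegenerate, twofold, and fourfold sites against which substitutions are then counted.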

3.
One method for diagnosing the mode of sequence evolution considers the ratio of nonsynonymous substitutions per nonsynonymous site (KA) to the corresponding figure for synonymous substitutions (KS). A ratio (KA/KS) greater than unity is taken as evidence for positive selection. This, however, need not necessarily be the case. Notably, there is one instance of a high intragenic KA/KS peak, revealed by sliding window analysis and observed in two pairwise comparisons, better accounted for by localised purifying selection on synonymous mutations that affect splicing. Is this example exceptional? To address this we isolate intragenic domains with KA/KS > 1 from more than 1000 long mouse-rat orthologues. Approximately one KA/KS > 1 peak is found per 12–15 kb of coding sequence. Surprisingly, low synonymous substitution rates underpin more incidences than do high nonsynonymous rates. Several reasons, however, prevent us from supposing that the low synonymous rates reflect purifying selection on synonymous mutations. First, for many peaks, the null hypothesis that the peak is no higher than expected given the underlying rates of evolution cannot be rejected. Second, of 18 statistically significant incidences with unusually low KS values, only 3 are repeatable across independent comparisons. At least two of these are within alternatively spliced exons. We conclude that repeatable, statistically significant intragenic domains of low KS are rare. As so few KA/KS peaks reflect increased rates of protein evolution and so few hold statistical support, we additionally conclude that sliding window analysis to infer domains of positive selection is highly error-prone.

4.
The activity of complex I of the mitochondrial respiratory chain has been found to be decreased in patients with Parkinson's disease (PD), but no mutations have been identified in genes encoding complex I subunits. Recent studies have suggested that polymorphisms in mitochondrial DNA (mtDNA)-encoded complex I genes (MTND) modify susceptibility to PD. We hypothesize that the risk of PD is conveyed by the total number of nonsynonymous substitutions in the MTND genes in various mtDNA lineages rather than by single mutations. To test this possibility, we determined the number of nonsynonymous substitutions of the seven MTND genes from 183 Finns. The differences in the total number of nonsynonymous substitutions and the nonsynonymous to synonymous substitution rate ratio (Ka/Ks) of MTND genes between the European mtDNA haplogroup clusters (HV, JT, KU, IWX) were analysed by using a statistical approach. Patients with PD (n=238) underwent clinical examination together with mtDNA haplogroup analysis, and the clinical features of patient groups defined by the number of nonsynonymous substitutions were compared. Our analysis revealed that the haplogroup clusters HV and KU had a lower average number of amino acid replacements and a lower Ka/Ks ratio in the MTND genes than clusters JT and IWX. Supercluster JTIWX, with the highest number of amino acid replacements, was more frequent among PD patients and even more frequent among patients with PD who developed dementia. Our results suggest that a relative excess of nonsynonymous mutations in MTND genes in supercluster JTIWX is associated with an increased risk of PD and of disease progression to dementia.

5.
6.
Genetic factors may play an important role in species extinction but their actual effect remains poorly understood, particularly because of a strong and potentially masking effect expected from ecological traits. We investigated the role of genetics in mammal extinction taking both ecological and genetic factors into account. As a proxy for the role of genetics we used the ratio of the rates of nonsynonymous (amino acid changing) to synonymous (leaving the amino acid unchanged) nucleotide substitutions, Ka/Ks. Because most nonsynonymous substitutions are likely to be slightly deleterious and thus selected against, this ratio is a measure of the inefficiency of selection: if large (but less than 1), it implies a low efficiency of selection against nonsynonymous mutations. As a result, nonsynonymous mutations may accumulate and thus contribute to extinction. As a proxy for the role of ecology we used body mass W, with which most extinction-related ecological traits strongly correlate. As a measure of extinction risk we used species' affiliation with the five levels of extinction threat according to the IUCN Red List of Threatened Species. We calculated Ka/Ks for mitochondrial protein-coding genes of 211 mammalian species, each of which was characterized by body mass and the level of threat. Using logistic regression analysis, we then constructed a set of logistic regression models of extinction risk on ln(Ka/Ks) and ln W. We found that Ka/Ks and body mass are responsible for a 38% and a 62% increase in extinction risk, respectively. Given that the standard error of these values is 13%, the contribution of genetic factors to extinction risk in mammals is estimated to be one-quarter to one-half of the total of ecological and genetic effects. We conclude that the effect of genetics on extinction is significant, though it is almost certainly smaller than the effect of ecological traits.
Synthesis: Mutation provides the material for evolution. However, most mutations that play a role in evolution are slightly deleterious and thus may contribute to extinction. We assess the role of mitochondrial DNA mutations in mammalian extinction risk and find it to be one-quarter to one-half of the total of mutation and body mass effects, where body mass represents an integral measure of extinction-related ecological traits. Genetic factors may be all the more important, because ecological traits associated with large body mass would both promote and protect from extinction, while mutation accumulation caused by low effective population size seems to have no counterbalance.

7.
Borrelia burgdorferi is a spirochete pathogen transmitted among warm-blooded hosts by ixodid ticks. Frequency-dependent selection for variant outer-surface proteins might be expected to arise in this species, since rare variants are more likely to avoid immune surveillance in previously infected hosts. We sequenced the OspA and OspB genes of nine North American strains and compared them with nine strains previously described. For each gene, the mean number of synonymous substitutions per synonymous site and the mean number of nonsynonymous substitutions per nonsynonymous site show only a twofold excess of silent mutations. Synonymous rates vary widely along the OspB protein. Some regions show a significant excess of silent substitutions, while divergence in other regions is constrained by biased base composition or selection. The presence, in antigenically important regions of the protein, of significant variation among strains, as well as evidence for recombination among strains, should be considered in attempts to develop vaccines against this disease.

8.
The Artemia hemoglobin is a dimer comprising two nine-domain covalent polymers in quaternary association. Each polymer is encoded by a gene representing nine successive globin domains which have different sequences and are presumed to have been copied originally from a single-domain gene. Two different polymers exist as the result of a complete duplication of the nine-domain gene, allowing the formation of either homodimers or the heterodimer. The total population size of 18 domains comprising nine corresponding pairs, coupled with the probability that they reflect several hundred million years of evolution in the same lineage, provides a unique model in which the process of gene multiplication can be analyzed. The outcome has important implications for the reliability of local molecular clocks. The two polymers differ from each other at 11.7% of amino acid sites; however, when corresponding individual domains are compared between polymers, amino acid substitution fluctuates by a factor of 2.7 from lowest to highest. This variation is not obvious at the DNA level: domain pair identity values fluctuate by 1.3-fold. Identity values are, however, uncorrected for multiple substitutions, and both silent and nonsilent changes are pooled. Therefore, to determine the variability in relative substitution rates at the DNA level, we have used the method of Li (1993, J Mol Evol 36:96–99) to determine estimates of nonsynonymous (KA) and synonymous (KS) substitutions per site for the nine pairs of domains. As expected, the overall level of silent substitutions (KS of 56.9%) far exceeded nonsilent substitutions (KA of 6.7%); however, for corresponding domain pairs, KA fluctuates by 2.3-fold and KS by 1.7-fold.
The large discrepancies reflected in the expressed protein have accrued within a single lineage, and the implication is that divergence dates of different genera based on amino acid sequences, even with well-studied proteins of reasonable size, can be wrong by a factor well in excess of 2. Received: 4 June 1997 / Accepted: 17 December 1997

9.
On transition bias in mitochondrial genes of pocket gophers   Total citations: 1, self-citations: 0, citations by others: 1
The relative contribution of mutation and purifying selection to transition bias has not been quantitatively assessed in mitochondrial protein genes. The observed transition/transversion (s/v) ratio is (μs Ps)/(μv Pv), where μs and μv denote the mutation rates of transitions and transversions, respectively, and Ps and Pv denote the fixation probabilities of transitions and transversions, respectively. Because selection against synonymous transitions can be assumed to be roughly equal to that against synonymous transversions, Ps/Pv ≈ 1 at fourfold degenerate sites, so that the s/v ratio at fourfold degenerate sites is approximately μs/μv, which is a measure of the mutational contribution to transition bias. Similarly, the s/v ratio at nondegenerate sites is also an estimate of μs/μv if we assume that selection against nonsynonymous transitions is roughly equal to that against nonsynonymous transversions. In two mitochondrial genes of pocket gophers, cytochrome oxidase subunit I (COI) and cytochrome b (cyt-b), the s/v ratio is about 2 at nondegenerate and fourfold degenerate sites. This implies that the mutational contribution to transition bias is relatively small. In contrast, the s/v ratio is much greater at twofold degenerate sites, being 48 for COI and 40 for cyt-b. Given that the μs/μv ratio is about 2, the Ps/Pv ratio at twofold degenerate sites must be on the order of 20 or greater. This suggests a great effect of purifying selection on transition bias in mitochondrial protein genes, because transitions are synonymous and transversions are nonsynonymous at twofold degenerate sites in mammalian mitochondrial genes. We also found that nonsynonymous mutations at twofold degenerate sites are more neutral than nonsynonymous mutations at nondegenerate sites, and that the COI gene is subject to stronger purifying selection than is the cyt-b gene.
A model is presented to integrate the effects of purifying selection, codon bias, DNA repair and GC content on the s/v ratio of protein-coding genes. Correspondence to: X. Xia
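The arithmetic behind the "order of 20 or greater" estimate in this abstract can be written out as below. The symbols are those of the abstract; the only added step is rearranging the s/v definition and substituting the reported values, with μs/μv taken as roughly 2 as the abstract assumes.

```latex
\[
\frac{s}{v} = \frac{\mu_s P_s}{\mu_v P_v}
\quad\Longrightarrow\quad
\frac{P_s}{P_v} = \frac{s/v}{\mu_s/\mu_v}
\]
\[
\text{COI: } \frac{P_s}{P_v} \approx \frac{48}{2} = 24,
\qquad
\text{cyt-}b\text{: } \frac{P_s}{P_v} \approx \frac{40}{2} = 20
\]
```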

10.
The hypervariable region 1 (HVR-1) of the putative envelope encoding E2 region of hepatitis C virus (HCV) RNA was analyzed in sequential samples from three patients with acute type C hepatitis infected from different sources to address (i) the dynamics of intrahost HCV variability during the primary infection and (ii) the role of host selective pressure in driving viral genetic evolution. HVR-1 sequences from 20 clones per time point were analyzed after amplification, cloning, and purification of plasmid DNA from single colonies of transformed cells. The intrasample evolutionary analysis (nonsynonymous mutations per nonsynonymous site [Ka], synonymous mutations per synonymous site [Ks], Ka/Ks ratio, and genetic distances [gd]) documented low gd in early samples (ranging from 2.11 to 7.79%) and a further decrease after seroconversion (from 0 to 4.80%), suggesting that primary HCV infection is an oligoclonal event, and found different levels and dynamics of host pressure in the three cases. The intersample analysis (pairwise comparisons of intrapatient sequences; rKa, rKs, rKa/rKs ratio, and gd) confirmed the individual features of HCV genetic evolution in the three subjects and pointed to the relative contribution of either neutral evolution or selective forces in driving viral variability, documenting that adaptation of HCV for persistence in vivo follows different routes, probably representing the molecular counterpart of the viral fitness for individual environments.

11.
New methods for estimating the numbers of synonymous and nonsynonymous substitutions per site were developed. The methods are unweighted pathway methods based on Kimura's two-parameter model. Computer simulations were conducted to evaluate the accuracies of the new methods, Nei and Gojobori's (NG) method, Miyata and Yasunaga's (MY) method, Li, Wu, and Luo's (LWL) method, and Pamilo, Bianchi, and Li's (PBL) method. The following results were obtained: (1) The NG, MY, and LWL methods give overestimates of the number of synonymous substitutions and underestimates of the number of nonsynonymous substitutions. The major cause for the biased estimation is that these three methods underestimate the number of synonymous sites and overestimate the number of nonsynonymous sites. (2) The PBL method gives better estimates of the numbers of synonymous and nonsynonymous substitutions than those obtained by the NG, MY, and LWL methods. (3) The new methods also give better estimates of the numbers of synonymous and nonsynonymous substitutions than those obtained by the NG, MY, and LWL methods. In addition, estimates of the numbers of synonymous and nonsynonymous sites obtained by the new methods are reasonably accurate. (4) In some cases, the new methods and the PBL method give biased estimates of substitution numbers. However, from the number of nucleotide substitutions at the third position of codons, we can examine whether estimates obtained by the new methods are good or not, whereas no such check is possible for estimates obtained by the PBL method. (5) When there are strong transition/transversion and nucleotide-frequency biases, as in mitochondrial genes, all of the above methods give biased estimates of substitution numbers. In such cases, Kondo et al.'s method is recommended for estimating the number of synonymous substitutions, although their method cannot estimate the number of nonsynonymous substitutions and is time-consuming.
These results, particularly result (1), call for reexamination of some genes. This is because evolutionary pictures of genes have often been discussed on the basis of results obtained by the NG, MY, and LWL methods, which are favorable for the neutral theory of molecular evolution.

12.
Data on gene expression in the development of the root in Arabidopsis thaliana were used to test for expression profile differences among multi-gene families and to examine the extent to which expression differences accompanied coding sequence divergence within families. Significant differences among families were observed on two principal axes, accounting for over 80% of the variance in the expression data. The number of synonymous nucleotide substitutions per synonymous site (dS) and the number of nonsynonymous nucleotide substitutions per nonsynonymous site (dN) were estimated between the members of two-member families (N=428) and between phylogenetically independent sister pairs (N=190) of sequences within larger families. Ribosomal proteins and a few other proteins were exceptional in showing highly divergent expression patterns in spite of very low levels of amino acid sequence divergence, as indicated by the low dN relative to dS. However, the majority of gene duplicates showed relatively high levels of amino acid sequence divergence without appreciable change in expression pattern in the cell types analyzed. Reviewing Editor: Dr. Manyuan Long

13.
While gammacoronaviruses mainly comprise infectious bronchitis virus (IBV) and its closely related bird coronaviruses (CoVs), the only mammalian gammacoronavirus was discovered from a white beluga whale (beluga whale CoV [BWCoV] SW1) in 2008. In this study, we discovered a novel gammacoronavirus from fecal samples from three Indo-Pacific bottlenose dolphins (Tursiops aduncus), which we named bottlenose dolphin CoV (BdCoV) HKU22. All three BdCoV HKU22-positive samples were collected on the same date, suggesting a cluster of infection, with viral loads of 1 × 10³ to 1 × 10⁵ copies per ml. Clearance of virus was associated with a specific antibody response against the nucleocapsid of BdCoV HKU22. Complete genome sequencing and comparative genome analysis showed that BdCoV HKU22 and BWCoV SW1 have similar genome characteristics and structures. Their genome size is about 32,000 nucleotides, the largest among all CoVs, as a result of multiple unique open reading frames (NS5a, NS5b, NS5c, NS6, NS7, NS8, NS9, and NS10) between their membrane (M) and nucleocapsid (N) protein genes. Although comparative genome analysis showed that BdCoV HKU22 and BWCoV SW1 should belong to the same species, a major difference was observed in the proteins encoded by their spike (S) genes, which showed only 74.3 to 74.7% amino acid identities. The high ratios of the number of nonsynonymous substitutions per nonsynonymous site (Ka) to the number of synonymous substitutions per synonymous site (Ks) in multiple regions of the genome, especially the S gene (Ka/Ks ratio, 2.5), indicated that BdCoV HKU22 may be evolving rapidly, supporting a recent transmission event to the bottlenose dolphins. We propose a distinct species, Cetacean coronavirus, in Gammacoronavirus, to include BdCoV HKU22 and BWCoV SW1, whereas IBV and its closely related bird CoVs represent another species, Avian coronavirus, in Gammacoronavirus.

14.
Nei and Gojobori (1986) developed a simple method to estimate the numbers of synonymous (dS) and nonsynonymous (dN) substitutions per site. In the present paper, we have developed a method for computing variances and covariances of dS's and dN's and of the proportions of synonymous (pS) and nonsynonymous (pN) differences. We also have developed a method for computing the variances of mean dS, dN, pS, and pN without constructing a phylogenetic tree of the genes. We have conducted computer simulations based on simple evolutionary models and have shown that the new method gives good estimates of variances and covariances.
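The relationship between the proportions (pS, pN) and the distances (dS, dN) named in this abstract can be sketched with the Jukes-Cantor correction, which is the correction the Nei and Gojobori method applies to each proportion. The function name and the example proportions below are assumptions for illustration; the variance and covariance formulas that are the subject of the paper are not reproduced here.

```python
import math


def jukes_cantor_distance(p):
    """Convert an observed proportion of differences p into an
    estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3)."""
    if not 0.0 <= p < 0.75:
        # At p >= 0.75 the logarithm's argument is zero or negative.
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical proportions of synonymous and nonsynonymous differences.
p_s, p_n = 0.30, 0.05
d_s = jukes_cantor_distance(p_s)
d_n = jukes_cantor_distance(p_n)
print(f"dS = {d_s:.4f}, dN = {d_n:.4f}, dN/dS = {d_n / d_s:.3f}")
```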

15.
Fimbriae or pili are essential adherence factors usually found in pathogenic bacteria to aid colonization of host cells. Three major structural pilin genes, fimA, sfaA, and papA, from Escherichia coli natural isolates were examined, and nucleotide sequence data revealed elevated levels of both synonymous and nonsynonymous site variation at these loci. Examination of synonymous site variation shows a fivefold increase in fimA sites relative to the housekeeping gene mdh; similarly, the sfaA and papA genes have increased synonymous site variation relative to fimA. Nonsynonymous site variation is also elevated at all three loci but, in particular, at the papA locus (kN = 0.44). The kN/kS ratios for the three genes are among the highest yet reported for E. coli genes. Regional variation in nucleotide polymorphism within each of the genes reveals hypervariable segments where nonsynonymous substitutions exceed synonymous substitutions. We propose that at the fimA, papA, and sfaA genes, diversifying selection has brought about the increased levels of polymorphism. Received: 7 August 1997 / Accepted: 8 March 1998

16.
We surveyed the molecular evolutionary characteristics of 25 plant gene families, with the goal of better understanding general processes in plant gene family evolution. The survey was based on 247 GenBank sequences representing four grass species (maize, rice, wheat, and barley). For each gene family, orthology and paralogy relationships were uncertain. Recognizing this uncertainty, we characterized the molecular evolution of each gene family in four ways. First, we calculated the ratio of nonsynonymous to synonymous substitutions (dN/dS) both on branches of gene phylogenies and across codons. Our results indicated that the dN/dS ratio was statistically heterogeneous across branches in 17 of 25 (68%) gene families. The vast majority of dN/dS estimates were <<1.0, suggestive of selective constraint on amino acid replacements, and no estimates were >1.0, either across phylogenetic lineages or across codons. Second, we tested separately for nonsynonymous and synonymous molecular clocks. Sixty-eight percent of gene families rejected a nonsynonymous molecular clock, and 52% of gene families rejected a synonymous molecular clock. Thus, most gene families in this study deviated from clock-like evolution at either synonymous or nonsynonymous sites. Third, we calculated the effective number of codons and the proportion of G+C synonymous sites for each sequence in each gene family. One or both quantities vary significantly within 18 of 25 gene families. Finally, we tested for gene conversion, and only six gene families provided evidence of gene conversion events. Altogether, evolution for these 25 gene families is marked by selective constraint that varies among gene family members, a lack of molecular clock at both synonymous and nonsynonymous sites, and substantial variation in codon usage. Received: 25 May 2000 / Accepted: 16 October 2000

17.
Investigating ancient duplication events in the Arabidopsis genome   Total citations: 10, self-citations: 0, citations by others: 10
The complete genomic analysis of Arabidopsis thaliana has shown that a major fraction of the genome consists of paralogous genes that probably originated through one or more ancient large-scale gene or genome duplication events. However, the number and timing of these duplications remain unclear, and several different hypotheses have been put forward recently. Here, we reanalyzed duplicated blocks found in the Arabidopsis genome described previously and determined their date of divergence based on silent substitution estimations between the paralogous genes and, where possible, by phylogenetic reconstruction. We show that methods based on averaging protein distances of heterogeneous classes of duplicated genes lead to unreliable conclusions and that a large fraction of blocks duplicated much more recently than assumed previously. We found clear evidence for one large-scale gene or even complete genome duplication event somewhere between 70 and 90 million years ago. Traces pointing to a much older (probably more than 200 million years) large-scale gene duplication event could be detected. However, for now it is impossible to conclude whether these old duplicates are the result of one or more large-scale gene duplication events. Abbreviations: dA, fraction of amino acid substitutions; Kn, number of nonsynonymous substitutions per nonsynonymous site; Ks, number of synonymous substitutions per synonymous site; MYA, million years ago

18.
The variation in nucleotide sequence observed in the envelope (E) gene and the prM (precursor of M protein) region of different strains of Japanese encephalitis virus (JEV) was analysed. The presence of selective forces acting on these regions was investigated by computing the relative rates of synonymous (Ks) and nonsynonymous (Ka) substitutions. The ratio Ks/Ka was used as an indicator of the overall selective constraints on the amino acid sequence of JEV proteins. The possibility that different regions of the gene may be subject to varying selective pressures was tested by dividing the gene into three regions and estimating the Ks/Ka ratio for each region. On the basis of analysis of a limited number (17) of strains of JEV, evidence suggestive of positive selection acting on certain regions of the E gene of the virus, and in some cases on the entire gene, was obtained. Analysis of Ka diversity in the prM region of 46 JEV strains grouped into three genotypes revealed that strains included in genotype II were more heterogeneous than strains belonging to genotype I, while the differences between mean Ka values for genotypes I and III and genotypes II and III were not statistically significant. Analysis of host-specific heterogeneity in the prM region revealed that pig isolates were more Ka-diverse than human isolates.

19.
Two simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions are presented. Although they give no weights to different types of codon substitutions, these methods give essentially the same results as those obtained by Miyata and Yasunaga's and by Li et al.'s methods. Computer simulation indicates that estimates of synonymous substitutions obtained by the two methods are quite accurate unless the number of nucleotide substitutions per site is very large. It is shown that all available methods tend to give an underestimate of the number of nonsynonymous substitutions when the number is large.

20.
Summary: Focusing on the synonymous substitution rate, we carried out detailed sequence analyses of hominoid mitochondrial (mt) DNAs of ca. 5-kb length. Owing to the outnumbered transitions and strong biases in the base compositions, synonymous substitutions in mtDNA rapidly reach a rather low saturation level. The extent of the compositional biases differs from gene to gene. Such changes in base compositions, even if small, can bring about considerable variation in observed synonymous differences and may result in a region-dependent estimate of the synonymous substitution rate. We demonstrate that such a region dependency is due to a failure to take proper account of heterogeneous compositional biases from gene to gene but that the actual synonymous substitution rate is rather uniform. The synonymous substitution rate thus estimated is 2.37 ± 0.11 × 10⁻⁸ per site per year and comparable to the overall rate for the noncoding region. On the other hand, the rate of nonsynonymous substitutions differs considerably from gene to gene, as expected under the neutral theory of molecular evolution. The lowest rate is 0.8 × 10⁻⁹ per site per year for COI and the highest rate is 4.5 × 10⁻⁹ for ATPase 8, the degree of functional constraints (measured by the ratio of the nonsynonymous to the synonymous substitution rate) being 0.03 and 0.19, respectively. Transfer RNA (tRNA) genes also show variability in the base contents and thus in the nucleotide differences. The average rate for 11 tRNAs contained in the 5-kb region is 3.9 × 10⁻⁹ per site per year. The nucleotide substitutions in the genome suggest that the transition rate is about 17 times faster than the transversion rate.
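The functional-constraint figures quoted in this abstract follow directly from the reported rates, as the short check below shows. The variable and function names are illustrative assumptions; the rates themselves are the ones stated in the abstract, and the uncertainty on the synonymous rate is ignored.

```python
# Rates in substitutions per site per year, as reported in the abstract.
SYNONYMOUS_RATE = 2.37e-8
NONSYNONYMOUS_RATES = {"COI": 0.8e-9, "ATPase 8": 4.5e-9}


def functional_constraint(nonsynonymous_rate, synonymous_rate=SYNONYMOUS_RATE):
    """Ratio of the nonsynonymous to the synonymous substitution rate."""
    return nonsynonymous_rate / synonymous_rate


constraints = {
    gene: round(functional_constraint(rate), 2)
    for gene, rate in NONSYNONYMOUS_RATES.items()
}
print(constraints)
```

A lower ratio indicates stronger constraint on amino acid change, so COI is the more tightly constrained of the two genes.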
