Similar literature
20 similar articles found
1.
It has long been known, from the distribution of multiple amino acid replacements, that not all amino acids of a sequence are replaceable. More recently, the phenomenon was observed at the nucleotide level in mitochondrial DNA even after allowing for different rates of transition and transversion substitutions. We have extended the search to globin gene sequences from various organisms, with the following results: (1) Nearly every data set showed evidence of invariable nucleotide positions. (2) In all data sets, substitution rates of transversions and transitions were never in the ratio of 2/1, and rarely was the ratio even constant. (3) Only rarely (e.g., the third codon position of beta hemoglobins) was it possible to fit the data set solely by making allowance for the number of invariable positions and for the relative rates of transversion and transition substitutions. (4) For one data set (the second codon position of beta hemoglobins) we were able to simulate the observed data by making the allowance in (3) and having the set of covariotides (concomitantly variable nucleotides) be small in number and be turned over in a stochastic manner with a probability that was appreciable. (5) The fit in the latter case suggests, if the assumptions are correct and at all common, that current procedures for estimating the total number of nucleotide substitutions in two genes since their divergence from their common ancestor could be low by as much as an order of magnitude. (6) The fact that only a small fraction of the nucleotide positions differ is no guarantee that one is not seriously underestimating the total amount of divergence (substitutions). (7) Most data sets are so heterogeneous in their number of transition and transversion differences that none of the current models of nucleotide substitution seem to fit them even after (a) segregation of coding from noncoding sequences and (b) splitting of the codon into three subsets by codon position. (8) These frequently occurring problems cannot be seen unless several reasonably divergent orthologous genes are examined together.

2.
Characteristics of human and mouse orthologous gene sequences which have large G+C content variations were investigated in this study. The orthologous gene pairs were classified into two groups according to the deviation between human and mouse G+C content at the third codon position (GC3) and were subsequently analyzed. In one group, mouse genes had higher GC3 than the corresponding human genes and in another group, human genes had higher GC3 than mouse. Furthermore, the orthologous pairs were separated based on the deviation between human or mouse GC3 and the G+C content at the third codon position of identical codons (IC3), to examine the effect of increased or decreased G+C content in human or mouse sequences. The nucleotide substitution patterns between human and mouse sequences in the two groups were remarkably distinct, and consistent with the state of G+C-rich or G+C-poor sequences. The effect of increase or decrease of G+C content in human or mouse sequences was not clear in the nucleotide substitution patterns. The chromosomal locations of human and mouse orthologous gene pairs were different between the two groups. The genes located on an identical syntenic segment showed the trend of having similar G+C content. Moreover, the same gene order of some genes on different chromosomes of both species demonstrated the gene rearrangements between human and mouse. Our study indicated that the chromosomal locations and rearrangements are associated with the GC3 variation between human and mouse sequences. Key Words: Human mouse orthologs, G+C content variation, nucleotide substitution, gene location, gene rearrangement.
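The GC3 quantity used to group the ortholog pairs can be illustrated with a minimal sketch. It assumes an in-frame coding sequence given as a plain string; the function name `gc3` and the two toy sequences are hypothetical and are not taken from the study.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C among the third codon positions of an in-frame CDS."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    # Third codon positions sit at indices 2, 5, 8, ...
    third = [seq[i] for i in range(2, usable, 3)]
    third = [b for b in third if b in "ACGT"]  # skip ambiguous bases
    if not third:
        raise ValueError("no complete codons with unambiguous third positions")
    return sum(b in "GC" for b in third) / len(third)


# Hypothetical toy "orthologs", for illustration only.
human = "ATGGCCCTGAAG"
mouse = "ATGGCTCTAAAA"

# A positive deviation puts the pair in the "human GC3 higher" group.
deviation = gc3(human) - gc3(mouse)
print(deviation)
```

The sign of `deviation` is what separates the two groups described in the abstract; any threshold for a "large" deviation would have to come from the paper itself.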

3.
A model of nucleotide substitution that allows the transition/transversion rate bias to vary across sites was constructed. We examined the fit of this model using likelihood-ratio tests by analyzing 13 protein coding genes and 1 pseudogene. Likelihood-ratio testing indicated that a model that allows variation in the transition/transversion rate bias across sites provided a significant improvement in fit for most protein coding genes but not for the pseudogene. When the analysis was repeated with parameters estimated separately for first, second, and third codon positions, strong heterogeneity was uncovered for the first and second codon positions; the variation in the transition/transversion rate was generally weaker at the third codon position. The transition rate bias and branch lengths are underestimated when variation in the transition/transversion rate was not accommodated, suggesting that it may be important to accommodate variation in the pattern of nucleotide substitution for accurate estimation of evolutionary parameters. Received: 4 November 1997 / Accepted: 19 May 1998

4.
Phylogenetic codon models are routinely used to characterize selective regimes in coding sequences. Their parametric design, however, is still a matter of debate, in particular concerning the question of how to account for differing nucleotide frequencies and substitution rates. This problem relates to the fact that nucleotide composition in protein-coding sequences is the result of the interactions between mutation and selection. In particular, because of the structure of the genetic code, the nucleotide composition differs between the three coding positions, with the third position showing a more extreme composition. Yet, phylogenetic codon models do not correctly capture this phenomenon and instead predict that the nucleotide composition should be the same for all three positions. Alternatively, some models allow for different nucleotide rates at the three positions, an approach conflating the effects of mutation and selection on nucleotide composition. In practice, it results in inaccurate estimation of the strength of selection. Conceptually, the problem comes from the fact that phylogenetic codon models do not correctly capture the fixation bias acting against the mutational pressure at the mutation–selection equilibrium. To address this problem and to more accurately identify mutation rates and selection strength, we present an improved codon modeling approach where the fixation rate is not seen as a scalar, but as a tensor. This approach gives an accurate representation of how mutation and selection oppose each other at equilibrium and yields a reliable estimate of the mutational process, while disentangling the mean fixation probabilities prevailing in different mutational directions.

5.
The genomes of the ancestors of mammals and birds underwent a compositional change in which the gene-richest regions increased their GC levels. Here we investigated this compositional transition by analyzing the levels of G and C in third codon positions, as well as the codon frequencies of orthologous genes from human, chicken and Xenopus. The results may be summed up as follows: (i) GC-poor genes, that did not undergo the compositional transition, showed only minor differences in orthologous sets from Xenopus, human and chicken; this is remarkable in view of the very many nucleotide substitutions that occurred over the long evolutionary times separating these species; (ii) GC-rich genes, that underwent the compositional transition, showed large differences between Xenopus and warm-blooded vertebrates, but not between chicken and human. In other words, the independent changes that occurred in avian and mammalian genes, on the average, were the same.

6.
Sueoka N, Kawanishi Y. Gene, 2000, 261(1): 53-62
The human genome, as in other eukaryotes, has a wide heterogeneity in the DNA base composition. The evolutionary basis for this heterogeneity has been unknown. A previous study of the human genome (846 genes analyzed) has shown that, in the major range of the G+C content in the third codon position (0.25-0.75), biases from the Parity Rule 2 (PR2) among the synonymous codons of the four-codon amino acids are similar except in the highest G+C range (Sueoka, N., 1999. Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position. Gene 238, 53-58.). PR2 is an intra-strand rule where A=T and G=C are expected when there are no biases between the two complementary strands of DNA in mutation and selection rates (substitution rates). In this study, 14,026 human genes were analyzed. In addition, the third codon positions of two-codon amino acids were analyzed. New results show the following: (a) The G+C contents of the third codon position of human genes are scattered in the G+C range of 0.22-0.96 in the third codon position. (b) The PR2 biases are similar in the range of 0.25-0.75, whereas, in the high G+C range (0.75-0.96; 13% of the genes), the PR2-bias fingerprints are different from those of the major range. (c) Unlike the PR2 biases, the G+C contents of the third codon position for both four-codon and two-codon amino acids are all correlated almost perfectly with the G+C content of the third codon position over the total G+C ranges. These results support the notion that the directional mutation pressure, rather than the directional selection pressure, is mainly responsible for the heterogeneity of the G+C content of the third codon position.
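Since PR2 predicts A=T and G=C within a strand, one simple way to express a deviation from it is through the ratios A/(A+T) and G/(G+C), both of which equal 0.5 when the rule holds. The sketch below is a simplification under stated assumptions: it pools all third codon positions of one gene, whereas the abstract analyzes four-codon and two-codon amino acids separately. The function name `pr2_bias` is hypothetical.

```python
from collections import Counter


def pr2_bias(cds: str):
    """Return (A/(A+T), G/(G+C)) at third codon positions; 0.5 each under PR2.

    Assumption: all third positions are pooled, without splitting by amino acid.
    A ratio is None when its denominator is zero.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i] for i in range(2, usable, 3))
    at = counts["A"] + counts["T"]
    gc = counts["G"] + counts["C"]
    a_ratio = counts["A"] / at if at else None
    g_ratio = counts["G"] / gc if gc else None
    return a_ratio, g_ratio


# Four alanine codons with third positions A, T, G, C: no PR2 bias.
print(pr2_bias("GCAGCTGCGGCC"))
```

Plotting the two ratios against each other for many genes gives the kind of "fingerprint" the abstract compares between G+C ranges, though the exact procedure used there is not reproduced here.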

7.
Based on the rates of synonymous substitution in 42 protein-coding gene pairs from rat and human, a correlation is shown to exist between the frequency of the nucleotides in all positions of the codon and the synonymous substitution rate. The correlation coefficients were positive for A and T and negative for C and G. This means that AT-rich genes accumulate more synonymous substitutions than GC-rich genes. Biased patterns of mutation could not account for this phenomenon. Thus, the variation in synonymous substitution rates and the resulting unequal codon usage must be the consequence of selection against A and T in synonymous positions. Most of the variation in rates of synonymous substitution can be explained by the nucleotide composition in synonymous positions. Codon-anticodon interactions, dinucleotide frequencies, and contextual factors influence neither the rates of synonymous substitution nor codon usage. Interestingly, the nucleotide in the second position of codons (always a nonsynonymous position) was found to affect the rate of synonymous substitution. This finding links the rate of nonsynonymous substitution with the synonymous rate. Consequently, highly conservative proteins are expected to be encoded by genes that evolve slowly in terms of synonymous substitutions, and are consequently highly biased in their codon usage.

8.
The nucleotide contents of the three codon positions show a number of statistical pairwise correlations, some of which are universal for all analysed genomes. Among the most prominent of these correlations are negative correlations between G and T contents found in genes of all species analysed. The pair A/C, which is complementary to G/T, shows a similar negative correlation in genes of most species. In the genes of several species, including all mammalian genes studied, positive correlations between A and T contents, and G and C contents, are found. Since these regularities are observed in all three codon positions they are connected with amino-acid content of proteins. Such correlations may originate from features of the mutation process and/or translation reading frame check. The well-known bias of the preference for G in the first codon position and its deficiency in the second is accompanied by opposite bias in T content. In the third codon position there is no general nucleotide preference, but its content is often biased with regard to GC content of the gene. G and T contents in this case are always shifted in opposite directions. Several ideas are drawn to explain this preference.

9.
It has been suggested that codon volatility (the proportion of the point-mutation neighbors of a codon that encode different amino acids) can be used as an index of past positive selection. We compared codon volatility with patterns of synonymous and nonsynonymous nucleotide substitution in genome-wide comparisons of orthologous genes between three pairs of related genomes: (1) the protists Plasmodium falciparum and P. yoelii, (2) the fungi Saccharomyces cerevisiae and S. paradoxus, and (3) the mammals mouse and rat. Codon volatility was not consistently associated with an elevated rate of nonsynonymous substitution, as would be expected under positive selection. Rather, the most consistent and powerful correlate of elevated codon volatility was nucleotide content at the second codon position, as expected, given the nature of the genetic code.
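The definition of codon volatility given in parentheses above can be made concrete with a short sketch. Two points are assumptions not stated in the abstract: the standard genetic code is used, and neighbors that are stop codons are left out of both numerator and denominator. The names `CODE` and `volatility` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def volatility(codon: str) -> float:
    """Share of single-base neighbors that encode a different amino acid.

    Assumption: stop-codon neighbors are excluded from the count.
    """
    codon = codon.upper()
    own = CODE[codon]
    if own == "*":
        raise ValueError("volatility is not defined here for stop codons")
    different = total = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            neighbor = CODE[codon[:pos] + base + codon[pos + 1:]]
            if neighbor == "*":
                continue  # skip stop-codon neighbors
            total += 1
            different += neighbor != own
    return different / total


# Two synonymous arginine codons with different volatility.
print(volatility("CGA"), volatility("AGA"))
```

The example also shows why second-position content correlates with volatility: every second-position change in the standard code alters the encoded amino acid or produces a stop, so the codon's remaining bases largely determine how many neighbors are synonymous.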

10.
We conducted a genome-wide analysis of variations in guanine plus cytosine (G+C) content at the third codon position at silent substitution sites of orthologous human and mouse protein-coding nucleotide sequences. Alignments of 3776 human protein-coding DNA sequences with mouse orthologs having >50 synonymous codons were analyzed, and nucleotide substitutions were counted by comparing sequences in the alignments extracted from gap-free regions. The G+C content at silent sites in these pairs of genes showed a strong negative correlation (r = -0.93). Some gene pairs showed significant differences in G+C content at the third codon position at silent substitution sites. For example, human thymine-DNA glycosylase was A+T-rich at the silent substitution sites, while the orthologous mouse sequence was G+C-rich at the corresponding sites. In contrast, human matrix metalloproteinase 23B was G+C-rich at silent substitution sites, while the mouse ortholog was A+T-rich. We discuss possible implications of this significant negative correlation of G+C content at silent sites.

11.
In analyzing the silent nucleotide substitutions in some mammalian mitochondrial mRNA coding genes, we had found that the frequency of each of the four nucleotides in rat, mouse, and cow, but not in humans, is the same in the silent third codon position (Lanave C, Preparata G, Saccone C, Serio G (1984) J Mol Evol 20:86-93). Because our findings for these three species were compatible with a stationary Markov process for the evolution of nucleotide sequences, we applied such a model to calculate the effective evolutionary silent substitution rate (vs) and the divergence times among the species. In this paper we have analyzed the first and second codon positions in the same mammalian mitochondrial genes. We found that in the first and second codon positions the human mitochondrial genes satisfy the stationarity conditions. This has allowed us to use the stochastic model mentioned above to calculate the divergence times among mouse, rat, cow, and human. Furthermore, we have analyzed the silent substitution rate in one nuclear gene for these four mammals. We found that in this gene the effective silent substitution rate is about 3 times lower than in mitochondrial genes, and that humans are in this case stationary with respect to the other three mammals in the third codon position as well. Application of our Markov model to this latter gene yields divergence times consistent with our previous determinations.

12.
The hepatitis B virus (HBV) has a circular DNA genome of about 3,200 base pairs. Economical use of the genome with overlapping reading frames may have led to severe constraints on nucleotide substitutions along the genome and to highly variable rates of substitution among nucleotide sites. Nucleotide sequences from 13 complete HBV genomes were compared to examine such variability of substitution rates among sites and to examine the phylogenetic relationships among the HBV variants. The maximum likelihood method was employed to fit models of DNA sequence evolution that can account for the complexity of the pattern of nucleotide substitution. Comparison of the models suggests that the rates of substitution are different in different genes and codon positions; for example, the third codon position changes at a rate over ten times higher than the second position. Furthermore, substantial variation of substitution rates was detected even after the effects of genes and codon positions were corrected; that is, rates are different at different sites of the same gene or at the same codon position. Such rates after the correction were also found to be positively correlated at adjacent sites, which indicated the existence of conserved and variable domains in the proteins encoded by the viral genome. A multiparameter model validates the earlier finding that the variation in nucleotide conservation is not random around the HBV genome. The test for the existence of a molecular clock suggests that substitution rates are more or less constant among lineages. The phylogenetic relationships among the viral variants were examined. Although the data do not seem to contain sufficient information to resolve the details of the phylogeny, it appears quite certain that the serotypes of the viral variants do not reflect their genetic relatedness. Correspondence to: Z. Yang

13.
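A study of the non-independence of nucleotide substitutions during DNA sequence evolution   Total citations: 4 (self-citations: 2; citations by others: 2)
Yang Ziheng (杨子恒). Acta Genetica Sinica (遗传学报), 1990, 17(5): 354-359
This paper reviews methods for estimating the number of nucleotide substitutions between DNA sequences and examines models of DNA evolution by comparing histone genes from seven species. The base composition at the third codon position of the H2A gene was found to vary greatly among species and to be strongly positively correlated with the base composition at the first position of the H2A gene, at the first and third positions of the H4 gene, and in the upstream and downstream sequences of H2A, suggesting that species-specific regional constraints operate during DNA sequence evolution. Possible causes are the increase in GC content in higher eukaryotes, or chromosomal recombination that placed these homologous sequences in different isochores and thereby exposed them to different selection and mutation pressures. Correlation analysis of nucleotide substitutions at the positions within codons shows that substitutions at different positions are not independent, possibly because a single substitution event changes several positions. The implications of these results for the inference of evolutionary trees are discussed.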

14.
Along the gene, nucleotides in various codon positions tend to exert a slight but observable influence on the nucleotide choice at neighboring positions. Such context biases are different in different organisms and can be used as genomic signatures. In this paper, we will focus specifically on the dinucleotide composed of a third codon position nucleotide and its succeeding first position nucleotide. Using the 16 possible dinucleotide combinations, we calculate how well individual genes conform to the observed mean dinucleotide frequencies of an entire genome, forming a distance measure for each gene. It is found that genes from different genomes can be separated with a high degree of accuracy, according to these distance values. In particular, we address the problem of recent horizontal gene transfer, and how imported genes may be evaluated by their poor assimilation to the host's context biases. By concentrating on the third- and succeeding first position nucleotides, we eliminate most spurious contributions from codon usage and amino-acid requirements, focusing mainly on mutational effects. Since imported genes are expected to converge only gradually to genomic signatures, it is possible to question whether a gene present in only one of two closely related organisms has been imported into one organism or deleted in the other. Striking correlations between the proposed distance measure and poor homology are observed when Escherichia coli genes are compared to Salmonella typhi, indicating that sets of outlier genes in E. coli may contain a high number of genes that have been imported into E. coli, and not deleted in S. typhi. Received: 16 January 2001 / Accepted: 30 August 2001
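The dinucleotide described above, made of a third codon position and the first position of the next codon, can be tallied as in the following sketch. The abstract does not give the formula for its distance measure, so the sum of absolute frequency differences used in `context_distance` is an assumption chosen for illustration, and both function names are hypothetical.

```python
from itertools import product

DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]  # the 16 combinations


def dinuc31_freqs(cds: str) -> dict:
    """Frequencies of (codon position 3, next codon position 1) dinucleotides."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = dict.fromkeys(DINUCS, 0)
    # Position 3 of codon k is index 3k+2; its successor is index 3k+3.
    for i in range(2, usable - 1, 3):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing ambiguous bases
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no usable 3-1 dinucleotides")
    return {d: c / total for d, c in counts.items()}


def context_distance(gene_freqs: dict, genome_freqs: dict) -> float:
    """Assumed distance: sum of absolute frequency differences over 16 dinucleotides."""
    return sum(abs(gene_freqs[d] - genome_freqs[d]) for d in DINUCS)


gene = dinuc31_freqs("ATGGCCCTGAAG")    # toy gene
reference = dinuc31_freqs("AAAAAAAAA")  # toy stand-in for genome-wide means
print(context_distance(gene, reference))
```

In practice `reference` would be the mean frequencies over all genes of a genome, and genes with unusually large distances would be the outlier candidates the abstract discusses.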

16.
We propose a method by which the intensity of purifying selection on a functional protein-coding gene is estimated by using three aligned homologous sequences: a processed pseudogene (psi), a functional paralog from the same species (g), and a functional ortholog from a different species (o). For each such trio, we calculate the numbers of nucleotide substitutions along the branches leading to psi and g, i.e., K(psi) and K(g). If we assume that the mutation rates are the same in the genes and the pseudogenes and that mutations occurring in a pseudogene do not affect the fitness of the organism, we can show that the fraction of mutations that are selectively neutral, fg, is equal to the ratio K(g)/K(psi). Since advantageous mutations occur only very rarely, such that they do not contribute significantly to the rate of molecular evolution, the fraction of deleterious mutations that are subject to purifying selection is 1-fg. Therefore, the K(g)/K(psi) ratio can be used directly to estimate the intensity of purifying selection, thereby isolating its effects on the rate of evolution from those of mutation. We compared the selection intensities of 12 orthologous protein-coding pairs from humans and murids. As expected, the fraction of mutations that are subject to purifying selection is strongest in the second codon position and weakest in the third. Interestingly, the mean fractions of effectively neutral mutations in the third codon position were only 41% and 42% for murids and humans, respectively, indicating that many synonymous mutations are subject to selective constraint. In several orthologous genes, we found that the intensity of purifying selection is very different between murid and human orthologous genes. There was no statistically significant difference in overall intensity of purifying selection between humans and murids. Thus, purifying selection does not seem to be an important factor contributing to the observed differences in the rates of evolution between these two taxa.
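The relation described in this abstract can be written out as a short worked equation. The subscripted symbols are only a typeset form of the branch lengths to the functional gene and to the pseudogene; the numerical values are the third-position means quoted in the abstract.

```latex
% Neutral fraction and intensity of purifying selection
f_g = \frac{K_g}{K_\psi}, \qquad 1 - f_g = 1 - \frac{K_g}{K_\psi}

% Third codon position, using the reported mean neutral fractions
\text{murids: } 1 - 0.41 = 0.59, \qquad \text{humans: } 1 - 0.42 = 0.58
```

Read this way, roughly 58 to 59 percent of third-position mutations would be removed by purifying selection, which is the basis for the statement that many synonymous mutations are under constraint.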

17.
We characterized rates and patterns of synonymous and nonsynonymous substitution in 242 duplicated gene pairs on chromosomes 2 and 4 of Arabidopsis thaliana. Based on their collinear order along the two chromosomes, the gene pairs were likely duplicated contemporaneously, and therefore comparison of genetic distances among gene pairs provides insights into the distribution of nucleotide substitution rates among plant nuclear genes. Rates of synonymous substitution varied 13.8-fold among the duplicated gene pairs, but 90% of gene pairs differed by less than 2.6-fold. Average nonsynonymous rates were approximately fivefold lower than average synonymous rates; this rate difference is lower than that of previously studied nonplant lineages. The coefficient of variation of rates among genes was 0.65 for nonsynonymous rates and 0.44 for synonymous rates, indicating that synonymous and nonsynonymous rates vary among genes to roughly the same extent. The causes underlying rate variation were explored. Our analyses tentatively suggest an effect of physical location on synonymous substitution rates but no similar effect on nonsynonymous rates. Nonsynonymous substitution rates were negatively correlated with GC content at synonymous third codon positions, and synonymous substitution rates were negatively correlated with codon bias, as observed in other systems. Finally, the 242 gene pairs permitted investigation of the processes underlying divergence between paralogs. We found no evidence of positive selection, little evidence that paralogs evolve at different rates, and no evidence of differential codon usage or third position GC content.

18.
19.
We present a further application of the stochastic model previously described (Lanave et al., 1984, 1985) for measuring the nucleotide substitution rate in the mammalian evolution of the mitochondrial DNA (mtDNA). The applicability of this method depends on the validity of "stationarity conditions" (equal nucleotide frequencies at first, second and third silent codon positions in homologous protein coding genes). In the comparison of homologous sequences satisfying the stationarity condition at the silent sites, only the four codon families (quartets) for which both transitions and transversions are silent at the third position are considered here. This has allowed us to estimate the transition and transversion rates for any pair of species. We have analyzed the third silent codon position of the triplet rat-mouse-cow, of a series of slightly divergent primates and of two Drosophila species. Using two external dating inputs, we have then determined the phylogenetic trees for rat, mouse, and cow as well as for a number of primates including man. The phylogenetic tree that we have derived for the triplet rat, mouse and cow agrees with the one we had previously determined by analyzing the first, second and third silent codon positions (in both duets and quartets) of mt genes (Lanave et al., 1985). For primates our method leads to the following branching order from the oldest to the most recent: Gibbon, Orangutan, Gorilla, Chimpanzee and Man. In absolute time, fixing the distance Chimpanzee-Man as 5 million years (Myr) we estimate the dating of the divergence nodes as: Gorilla 7 Myr; Orangutan 16 Myr; Gibbon 20 Myr. In all cases analyzed, the transition rate has been found to be substantially higher than the transversion rate. Moreover we have found that the transition/transversion ratio is different in the various lineages. We suggest that this fact is probably related to the nucleotide frequencies at the third silent codon position.

20.
Analysis of DNA sequences of 132 introns and 140 exons from 42 pairs of orthologous genes of mouse and rat was used to compare patterns of evolutionary change between introns and exons. The mean of the absolute difference in length (measured in base pairs) between the two species was nearly five times as high in the case of introns as in the case of exons. The average rate of nucleotide substitution in introns was very similar to the rate of synonymous substitution in exons, and both were about three times the rate of substitution at nonsynonymous sites in exons. G+C content of introns and exons of the same gene were correlated; but mean G+C content at the third positions of exons was significantly higher than that of introns or positions 1–2 of exons from the same gene. G+C content was conserved over evolutionary time, as indicated by strong correlations between mouse and rat; but the change in G+C content was greatest at position 3 of exons, intermediate in introns, and lowest at positions 1–2 in introns. Received: 23 December 1996 / Accepted: 1 April 1997
