Similar articles
20 similar articles found (search time: 31 ms)
1.
Past analyses of the genome of the yeast Saccharomyces cerevisiae have revealed substantial regional variation in G+C content. Important questions remain, though, as to the origin, nature, significance, and generality of this variation. We conducted an extensive analysis of the yeast genome to try to answer these questions. Our results indicate that open reading frames (ORFs) with similar G+C contents at silent codon positions are significantly clustered on chromosomes. This clustering can be explained by very short range correlations of silent-site G+C contents at neighboring ORFs. ORFs of high silent-site G+C content are disproportionately concentrated on shorter chromosomes, which causes a negative relationship between chromosome length and G+C content. Contrary to previous reports, there is no correlation between gene density and silent-site G+C content in yeast. Chromosome III is atypical in many regards, and possible reasons for this are discussed.
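
As a rough illustration of the quantity discussed in this abstract, the sketch below computes G+C content at third codon positions (often called GC3) of an in-frame ORF. This is only a proxy for silent-site G+C content: not every third position is silent, and the abstract does not say exactly how silent sites were defined. The function name `gc3` is hypothetical.

```python
def gc3(orf: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame ORF.

    Assumption: the sequence starts at codon position 1. Third positions
    are used as a simple stand-in for silent sites.
    """
    seq = orf.upper()
    # Drop a trailing partial codon so only complete codons are counted.
    usable = len(seq) - len(seq) % 3
    thirds = seq[2:usable:3]
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


# Codons ATG GCC AAG TAA have third positions G, C, G, A.
print(gc3("ATGGCCAAGTAA"))  # 0.75
```

Comparing such values between neighboring ORFs is one way the clustering described above could be explored.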

2.
BACKGROUND: Nucleotide substitution rates and G + C content vary considerably among mammalian genes. It has been proposed that the mammalian genome comprises a mosaic of regions - termed isochores - with differing G + C content. The regional variation in gene G + C content might therefore be a reflection of the isochore structure of chromosomes, but the factors influencing the variation of nucleotide substitution rate are still open to question. RESULTS: To examine whether nucleotide substitution rates and gene G + C content are influenced by the chromosomal location of genes, we compared human and murid (mouse or rat) orthologues known to belong to one of the chromosomal (autosomal) segments conserved between these species. Multiple members of gene families were excluded from the dataset. Sets of neighbouring genes were defined as those lying within 1 centiMorgan (cM) of each other on the mouse genetic map. For both synonymous substitution rates and G + C content at silent sites, neighbouring genes were found to be significantly more similar to each other than sets of genes randomly drawn from the dataset. Moreover, we demonstrated that the regional similarities in G + C content (isochores) and synonymous substitution rate were independent of each other. CONCLUSIONS: Our results provide the first substantial statistical evidence for the existence of a regional variation in the synonymous substitution rate within the mammalian genome, indicating that different chromosomal regions evolve at different rates. This regional phenomenon which shapes gene evolution could reflect the existence of 'evolutionary rate units' along the chromosome.
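
The comparison of neighbouring gene sets against randomly drawn sets can be sketched as a permutation test. The abstract does not state which statistic or resampling scheme the authors used, so everything below is an assumption: the statistic (mean within-set variance), the function name `neighbour_similarity_p`, and the example values are all hypothetical.

```python
import random
from statistics import pvariance


def neighbour_similarity_p(groups, n_perm=10000, seed=0):
    """One-sided permutation p-value for 'neighbours are more similar'.

    groups: list of lists, each holding a per-gene value (for example
    silent-site G+C content) for one set of neighbouring genes.
    Assumed statistic: mean within-set variance; smaller means more similar.
    """
    rng = random.Random(seed)
    values = [v for group in groups for v in group]
    sizes = [len(group) for group in groups]

    def mean_within_variance(vals):
        total, start = 0.0, 0
        for size in sizes:
            total += pvariance(vals[start:start + size])
            start += size
        return total / len(sizes)

    observed = mean_within_variance(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)  # reassign genes to sets at random
        if mean_within_variance(values) <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


# Made-up values in which each set of neighbours is internally similar.
example = [[0.30, 0.31, 0.32], [0.60, 0.61, 0.62], [0.44, 0.45, 0.46]]
print(neighbour_similarity_p(example, n_perm=2000))
```

A small p-value here means that randomly assembled sets are rarely as homogeneous as the observed neighbouring sets.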

3.
The patterns of synonymous codon usage in 91 Drosophila melanogaster genes have been examined. Codon usage varies strikingly among genes. This variation is associated with differences in G+C content at silent sites, but (unlike the situation in mammalian genes) these differences are not correlated with variation in intron base composition and so are not easily explicable in terms of mutational biases. Instead, those genes with high G+C content at silent sites, resulting from a strong "preference" for a particular subset of the codons that are mostly C-ending, appear to be the more highly expressed genes. This suggests that G+C content is reduced in sequences where selective constraints are weaker, as indeed seen in a pseudogene. These and other data discussed are consistent with the effects of translational selection among synonymous codons, as seen in unicellular organisms. The existence of selective constraints on silent substitutions, which may vary in strength among genes, has implications for the use of silent molecular clocks.

4.
We conducted a genome-wide analysis of variations in guanine plus cytosine (G+C) content at the third codon position at silent substitution sites of orthologous human and mouse protein-coding nucleotide sequences. Alignments of 3776 human protein-coding DNA sequences with mouse orthologs having >50 synonymous codons were analyzed, and nucleotide substitutions were counted by comparing sequences in the alignments extracted from gap-free regions. The G+C content at silent sites in these pairs of genes showed a strong negative correlation (r = -0.93). Some gene pairs showed significant differences in G+C content at the third codon position at silent substitution sites. For example, human thymine-DNA glycosylase was A+T-rich at the silent substitution sites, while the orthologous mouse sequence was G+C-rich at the corresponding sites. In contrast, human matrix metalloproteinase 23B was G+C-rich at silent substitution sites, while the mouse ortholog was A+T-rich. We discuss possible implications of this significant negative correlation of G+C content at silent sites.

5.
A novel method to calculate the G+C content of genomic DNA sequences.
The base composition of a DNA fragment or genome is usually measured by the proportion of A+T or G+C in the sequence. The G+C content along genomic sequences is usually calculated using an overlapping or non-overlapping sliding window method. The result and accuracy of such an approach depend on the size of the window and the moving distance adopted. In this paper, a novel windowless technique to calculate the G+C content of genomic sequences is proposed. By this method, the G+C content can be calculated at different "resolutions". In an extreme case, the G+C content may be computed at a specific point, rather than in a window of finite size. This is particularly useful for analyzing the fine variation of base composition along genomic sequences. As the first example, the variation of G+C content along each of the 16 yeast chromosomes is analyzed. The G+C-rich regions longer than 5 kb are detected and listed in detail. It is found that each chromosome consists of alternating G+C-rich and G+C-poor regions, i.e., a mosaic structure. Another example is the analysis of the G+C content of each of the two chromosomes of the Vibrio cholerae genome. Based on the variations of the G+C content in each chromosome, it is shown that some fragments in the Vibrio cholerae genome may have been transferred from other species. In particular, the position and size of the large integron island on the smaller chromosome were precisely predicted. This method would be a useful tool for analyzing genomic sequences.
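
For context, the conventional sliding-window calculation that this abstract contrasts with can be sketched as follows. This is not the windowless method proposed in the paper, whose formula is not given in the abstract. The function name `sliding_gc` and its parameters are hypothetical.

```python
def sliding_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """G+C fraction in each window of length `window`, moved by `step`.

    step < window gives overlapping windows; step == window gives
    non-overlapping windows. Returns (start_index, gc_fraction) pairs.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append((start, sum(base in "GC" for base in chunk) / window))
    return profile


# Overlapping windows of 4 bases moved by 2: GGCC, CCAA, AATT.
print(sliding_gc("GGCCAATT", window=4, step=2))
# [(0, 1.0), (2, 0.5), (4, 0.0)]
```

Changing `window` or `step` changes the resulting profile, which is the dependence on window size and moving distance that the abstract points out.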

6.
We compared levels of sequence divergence between fourfold synonymous coding sites and noncoding sites from the intergenic and intronic regions of the Plasmodium falciparum and Plasmodium reichenowi genomes. We observed significant differences in the level of divergence between these classes of silent sites. Fourfold synonymous coding sites exhibited the highest level of sequence divergence, followed by introns, and then intergenic sequences. This pattern of relative divergence rates has been observed in primate genomes but was unexpected in Plasmodium due to a paucity of variation at silent sites in P. falciparum and the corollary hypothesis that silent sites in this genome may be subject to atypical selective constraints. Exclusion of hypermutable CpG dinucleotides reduces the divergence level of synonymous coding sites to that of intergenic sites but does not diminish the significantly higher divergence level of introns relative to intergenic sites. A greater than expected incidence of CpG dinucleotides in intergenic regions less than 500 bp from genes may indicate selective maintenance of regulatory motifs containing CpGs. Divergence rates of different classes of silent sites in these Plasmodium genomes are determined by a combination of mutational and selective pressures.

7.
DNA sequences of 56 human genes for which information on both exons and introns was available were examined. The variance in G+C content among genes is estimated and shown to be substantial. There is a high correlation in G+C content between exons and introns within the same gene. The dinucleotide frequencies of introns are similar to those of intergenic spacer regions and are in reasonable agreement with predictions from substitution rates estimated from pseudogenes, except that the observed deficiency of TA doublets is not predicted. Duplicated bases also show a frequency greater than the expectation under independence. There is marked variability among genes in the frequency of the doublet CG relative to its expectation under independence. This variation is evolutionarily conserved and is correlated with the G+C content. Pseudogenes behave as if they are in a low-G+C, CG-deficient part of the genome, although the genes from which they arose are variable in these respects.
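
The "frequency of the doublet CG relative to its expectation under independence" can be illustrated with a simple observed/expected ratio: the observed CG dinucleotide frequency divided by the product of the C and G base frequencies. This is a minimal sketch and may differ in detail from the estimator used in the paper. The name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed CG doublet frequency divided by f(C) * f(G).

    Values well below 1 indicate a CG deficiency relative to what
    independent base frequencies would predict.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two bases")
    f_c = seq.count("C") / n
    f_g = seq.count("G") / n
    if f_c == 0 or f_g == 0:
        raise ValueError("ratio undefined without both C and G")
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    f_cg = seq.count("CG") / (n - 1)
    return f_cg / (f_c * f_g)


print(cpg_obs_exp("GGCC"))  # 0.0 (no CG doublet present)
print(cpg_obs_exp("CCGG"))  # one CG in three doublets, ratio 4/3
```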

8.
DNA mismatch repair and synonymous codon evolution in mammals
It has been suggested that the differences in synonymous codon use between mammalian genes within a genome are due to differences in the efficiency of DNA mismatch repair. This hypothesis was tested by developing a model of mismatch repair, which was used to predict the expected relationship between the rate of substitution and G+C content at silent sites. It was found that the silent-substitution rate should decline with increasing G+C content over most of the G+C-content range, if it is assumed that mismatch repair is G+C biased, an assumption which is supported by data. This prediction was then tested on a set of 58 primate and artiodactyl genes. There was no evidence of a direct decline in substitution rate with increasing G+C content, for either twofold- or fourfold-degenerate sites. It was therefore concluded that variation in the efficiency of mismatch repair is not responsible for the differences in synonymous codon use between mammalian genes. In support of this conclusion, analysis of the model also showed that the parameter range over which mismatch repair can explain the differences in synonymous codon use between genes is very small.

9.
Lobry JR  Sueoka N 《Genome biology》2002,3(10):research0058.1-research005814

Background

When there are no strand-specific biases in mutation and selection rates (that is, in the substitution rates) between the two strands of DNA, the average nucleotide composition is theoretically expected to be A = T and G = C within each strand. Deviations from these equalities are therefore evidence for an asymmetry in selection and/or mutation between the two strands. By focusing on weakly selected regions that could be oriented with respect to replication in 43 out of 51 completely sequenced bacterial chromosomes, we have been able to detect asymmetric directional mutation pressures.

Results

Most of the 43 chromosomes were found to be relatively enriched in G over C and T over A, and slightly depleted in G+C, in their weakly selected positions (intergenic regions and third codon positions) in the leading strand compared with the lagging strand. Deviations from A = T and G = C were highly correlated between third codon positions and intergenic regions, with a lower degree of deviation in intergenic regions, and were not correlated with overall genomic G+C content.

Conclusions

During the course of bacterial chromosome evolution, the effects of asymmetric directional mutation pressures are commonly observed in weakly selected positions. The degree of deviation from equality is highly variable among species, and within species is higher in third codon positions than in intergenic regions. The orientation of these effects is almost universal and is compatible in most cases with the hypothesis of an excess of cytosine deamination in the single-stranded state during DNA replication. However, the variation in G+C content between species is influenced by factors other than asymmetric mutation pressure.
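
Deviations from A = T and G = C within a strand are commonly summarized as skews, (G − C)/(G + C) and (A − T)/(A + T). The sketch below computes both for a single-strand sequence. Whether the authors used exactly these indices is an assumption, and the function name `strand_skews` is hypothetical.

```python
def strand_skews(seq: str) -> tuple[float, float]:
    """Return (gc_skew, at_skew) for one DNA strand.

    gc_skew = (G - C) / (G + C); at_skew = (A - T) / (A + T).
    Both are zero when G = C and A = T hold within the strand.
    """
    seq = seq.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    if g + c == 0 or a + t == 0:
        raise ValueError("skew undefined without both G/C and A/T bases")
    return (g - c) / (g + c), (a - t) / (a + t)


# A toy strand enriched in G over C and T over A.
print(strand_skews("GGGCATTT"))  # (0.5, -0.5)
```

In this convention, the leading-strand pattern described in the Results (G over C, T over A) corresponds to a positive GC skew and a negative AT skew.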

10.
The rates and patterns of molecular evolution in many eukaryotic organisms have been shown to be influenced by the compartmentalization of their genomes into fractions of distinct base composition and mutational properties. We have examined the Drosophila genome to explore relationships between the nucleotide content of large chromosomal segments and the base composition and rate of evolution of genes within those segments. Direct determination of the G + C contents of yeast artificial chromosome clones containing inserts of Drosophila melanogaster DNA ranging from 140 to 340 kb revealed significant heterogeneity in base composition. The G + C content of the large segments studied ranged from 36.9% G + C for a clone containing the hunchback locus in polytene region 85, to 50.9% G + C for a clone that includes the rosy region in polytene region 87. Unlike other organisms, however, there was no significant correlation between the base composition of large chromosomal regions and the base composition at fourfold degenerate nucleotide sites of genes encompassed within those regions. Despite the situation seen in mammals, there was also no significant association between base composition and rate of nucleotide substitution. These results suggest that nucleotide sequence evolution in Drosophila differs from that of many vertebrates and does not reflect distinct mutational biases, as a function of base composition, in different genomic regions. Significant negative correlations between codon-usage bias and rates of synonymous site divergence, however, provide strong support for an argument that selection among alternative codons may be a major contributor to variability in evolutionary rates within Drosophila genomes.

11.
Summary: The G+C content of DNA varies widely in different organisms, especially microorganisms. This variation is accompanied by changes in the nucleotide composition of silent positions in codons. (Silent positions are defined and explained in the text.) These changes are mostly neutral or near neutral, and appear to result from mutation pressure in the direction of increasing either A+T (AT pressure) or G+C (GC pressure) content. Variations in G+C content are also accompanied by substitutions at replacement positions in codons. These substitutions produce changes in the amino acid content of homologous proteins. The examples studied were genes for 13 mitochondrial proteins in five species, and A and B genes for bacterial tryptophan synthase in four species. In microorganisms, varying AT and GC mutational pressures, presumably resulting from shifts in the DNA polymerase system, exert strong effects on molecular evolution by changing the G+C content of DNA. These effects may be greater than those of random drift. The effects of GC pressure on silent substitutions in the systems examined are several times as great as the effects on replacement substitutions. GC pressure is exerted on noncoding as well as coding regions in mitochondrial DNA. This is shown by the close correlation (correlation coefficient, 0.99) of the G+C content of the noncoding D loop of mitochondria with the G+C content of silent positions in the corresponding mitochondrial genes.

12.
The G + C content of silent sites in codons varies greatly among Serratia marcescens genes; the value in any one gene seems to reflect a balance between mutation pressure towards high G + C content and natural selection constraining choice among synonymous codons. Interestingly, non-coding sequences have substantially lower G + C content than silent sites thought to be under little selective constraint.

13.
Stability, structure and complexity of yeast chromosome III.
G J King 《Nucleic acids research》1993,21(18):4239-4245
The complete sequence of yeast chromosome III provides a model for studies relating DNA sequence and structure at different levels of organisation in eukaryotic chromosomes. DNA helical stability, intrinsic curvature and sequence complexity have been calculated for the complete chromosome. These features are compartmentalised at different levels of organisation. Compartmentalisation of thermal stability is observed from the level delineating coding/non-coding sequences, to higher levels of organisation which correspond to regions varying in G + C content. The three-dimensional path reveals a symmetrical structure for the chromosome, with a densely packed central region and more diffuse and linear subtelomeric regions. This interspersion of regions of high and low curvature is reflected at lower levels of organisation. Complexity of n-tuplets (n = 1 to 6) also reveals compartmentalisation of the chromosome at different levels of organisation, in many cases corresponding to the structural features. DNA stability, conformation and complexity delineate telomeres, centromere, autonomous replication sequences (ARS), transposition hotspots, recombination hotspots and the mating-type loci.

14.
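Analysis of potential positive transcriptional regulatory sites in the upstream sequences of yeast genes
Previous work showed that the introns of highly transcribed yeast genes differ from those of weakly transcribed genes in sequence length, oligonucleotide usage, and positional distribution. Further observation shows that the length of the upstream intergenic region is related to gene transcription frequency in the same way as intron length: the upstream intergenic sequences of highly transcribed genes are generally longer than those of weakly transcribed genes. The oligonucleotide usage frequencies of the upstream intergenic sequences of highly and weakly transcribed genes were compared statistically, and candidate positive transcriptional regulatory elements were extracted from the upstream regions of highly transcribed genes. Compared with all non-coding sequences in yeast, these candidate positive regulatory elements are also largely over-represented, and most of them agree with the features of some experimentally determined sites. These elements are rich in G and C, which is to some extent complementary in base composition to the candidate positive regulatory elements found in introns. Judging from these features, the sequence structure upstream of highly transcribed genes does indeed favor gene transcription.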

15.
Selection on Silent Sites in the Rodent H3 Histone Gene Family
R. W. DeBry  W. F. Marzluff 《Genetics》1994,138(1):191-202
Selection promoting differential use of synonymous codons has been shown for several unicellular organisms and for Drosophila, but not for mammals. Selection coefficients operating on synonymous codons are likely to be extremely small, so that a very large effective population size is required for selection to overcome the effects of drift. In mammals, codon-usage bias is believed to be determined exclusively by mutation pressure, with differences between genes due to large-scale variation in base composition around the genome. The replication-dependent histone genes are expressed at extremely high levels during periods of DNA synthesis, and thus are among the most likely mammalian genes to be affected by selection on synonymous codon usage. We suggest that the extremely biased pattern of codon usage in the H3 genes is determined in part by selection. Silent site G + C content is much higher than expected based on flanking sequence G + C content, compared to other rodent genes with similar silent site base composition but lower levels of expression. Dinucleotide-mediated mutation bias does affect codon usage, but the effect is limited to the choice between G and C in some fourfold degenerate codons. Gene conversion between the two clusters of histone genes has not been an important force in the evolution of the H3 genes, but gene conversion appears to have had some effect within the cluster on chromosome 13.

16.
A Eyre-Walker 《Genetics》1999,152(2):675-683
It has been suggested that mutation bias is the major determinant of base composition bias at synonymous, intron, and flanking DNA sites in mammals. Here I test this hypothesis using population genetic data from the major histocompatibility genes of several mammalian species. The results of two tests are inconsistent with the mutation hypothesis in coding, noncoding, CpG-island, and non-CpG-island DNA, but are consistent with selection or biased gene conversion. It is argued that biased gene conversion is unlikely to affect silent site base composition in mammals. The results therefore suggest that selection is acting upon silent site G + C content. This may have broad implications, since silent site base composition reflects large-scale variation in G + C content along mammalian chromosomes. The results therefore suggest that selection may be acting upon the base composition of isochores and large sections of junk DNA.

17.
The spontaneous deamination of cytosine produces uracil mispaired with guanine in DNA, which will produce a mutation, unless repaired. In all domains of life, uracil-DNA glycosylases (UDGs) are responsible for the elimination of uracil from DNA. Thus, UDGs contribute to the integrity of the genetic information and their loss results in mutator phenotypes. We are interested in understanding the role of UDG genes in the evolutionary variation of the rate and the spectrum of spontaneous mutations. To this end, we determined the presence or absence of the five main UDG families in more than 1,000 completely sequenced genomes and analyzed their patterns of gene loss and gain in eubacterial lineages. We observe nonindependent patterns of gene loss and gain between UDG families in Eubacteria, suggesting extensive functional overlap in an evolutionary timescale. Given that UDGs prevent transitions at G:C sites, we expected the loss of UDG genes to bias the mutational spectrum toward a lower equilibrium G + C content. To test this hypothesis, we used phylogenetically independent contrasts to compare the G + C content at intergenic and 4-fold redundant sites between lineages where UDG genes have been lost and their sister clades. None of the main UDG families present in Eubacteria was associated with a higher G + C content at intergenic or 4-fold redundant sites. We discuss the reasons for this negative result and report several features of the evolution of the UDG superfamily with implications for their functional study. Keywords: uracil-DNA glycosylase, mutation rate evolution, mutational bias, GC content, DNA repair, mutator gene.

18.
Summary: This paper reports on the relationship between the number of silent differences and the codon usage changes in the lineages leading to human and rat. Examination of 102 pairs of homologous genes gives rise to four main conclusions: (1) We have previously demonstrated the existence of a codon usage change (called the minor shift) between human and rat; this was confirmed here with a larger sample. For genes with extreme C+G frequencies, the C+G level in the third codon position is less extreme in rat than in human. (2) Protein similarity and percentage of positive differences are the two main factors that discriminate homologous genes when characterized by differences between rat and human. By definition, positive differences result from silent changes between A or T and C or G with a direction implying a C+G content variation in the same direction as the overall gene variation. (3) For genes showing both codon usage change and low protein similarity, a majority of amino acid replacements contributes to C+G level variation in positions I and II in the same direction as the variation in position III. This is thus a new example of protein evolution due to constraints acting at the DNA level. (4) In heavy isochores (high C+G content) no direct correlation exists between codon usage change (measured by the dissymmetry of differences) and silent dissimilarity. In light isochores the opposite situation is observed: modification of codon usage is associated with a high synonymous dissimilarity. This result shows that, in some cases, modification of constraints acting at the DNA level could accelerate divergence between genomes.

19.
The global, rather than local, variation in G+C content along the nuclear DNA sequences of various organisms was studied using GenBank sequence data. When long DNA sequences of the genomes of Escherichia coli and Saccharomyces cerevisiae were examined, the levels of their G+C content (G+C%) were found to be within a narrow range around that of the whole genome. The G+C% levels for sequences of vertebrate genomes, however, were found to cover a wide range, showing that their genome is a mosaic of sequences with different G+C% levels, in each of which the sequence is fairly homogeneous in its G+C% for a very long distance. Through surveying a human genetic map and GenBank DNA sequences, the global variations in G+C% along the human genome DNA were found to be correlated with chromosome band structures.

20.
Mammalian gene evolution: Nucleotide sequence divergence between mouse and rat
As a paradigm of mammalian gene evolution, the nature and extent of DNA sequence divergence between homologous protein-coding genes from mouse and rat have been investigated. The data set examined includes 363 genes totalling 411 kilobases, making this by far the largest comparison conducted between a single pair of species. Mouse and rat genes are on average 93.4% identical in nucleotide sequence and 93.9% identical in amino acid sequence. Individual genes vary substantially in the extent of nonsynonymous nucleotide substitution, as expected from protein evolution studies; here the variation is characterized. The extent of synonymous (or silent) substitution also varies considerably among genes, though the coefficient of variation is about four times smaller than for nonsynonymous substitutions. A small number of genes mapped to the X-chromosome have a slower rate of molecular evolution than average, as predicted if molecular evolution is male-driven. Base composition at silent sites varies from 33% to 95% G + C in different genes; mouse and rat homologues differ on average by only 1.7% in silent-site G + C, but it is shown that this is not necessarily due to any selective constraint on their base composition. Synonymous substitution rates and silent site base composition appear to be related (genes at intermediate G + C have on average higher rates), but the relationship is not as strong as in our earlier analyses. Rates of synonymous and nonsynonymous substitution are correlated, apparently because of an excess of substitutions involving adjacent pairs of nucleotides. Several factors suggest that synonymous codon usage in rodent genes is not subject to selection.
