Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Bernardi G 《Gene》2000,241(1):3-17
The nuclear genomes of vertebrates are mosaics of isochores, very long stretches (>300 kb) of DNA that are homogeneous in base composition and are compositionally correlated with the coding sequences that they embed. Isochores can be partitioned into a small number of families that cover a range of GC levels (GC is the molar ratio of guanine+cytosine in DNA), which is narrow in cold-blooded vertebrates, but broad in warm-blooded vertebrates. This difference is essentially due to the fact that the GC-richest 10-15% of the genomes of the ancestors of mammals and birds underwent two independent compositional transitions characterized by strong increases in GC levels. The similarity of isochore patterns across mammalian orders, on the one hand, and across avian orders, on the other, indicates that these higher GC levels were then maintained, at least since the appearance of ancestors of warm-blooded vertebrates. After a brief review of our current knowledge on the organization of the vertebrate genome, evidence will be presented here in favor of the idea that the generation and maintenance of the GC-richest isochores in the genomes of warm-blooded vertebrates were due to natural selection.

2.
Summary: We have investigated the compositional properties of coding sequences from cold-blooded vertebrates and compared them with those from warm-blooded vertebrates. Moreover, we have studied the compositional correlations of coding sequences with the genomes in which they are contained, as well as the compositional correlations among the codon positions of the genes analyzed. The distributions of GC levels of the third codon positions of genes from cold-blooded vertebrates are distinctly different from those of warm-blooded vertebrates in that they do not reach the high values attained by the latter. Moreover, coding sequences from cold-blooded vertebrates are either equal or, in most cases, lower in GC (not only in third, but also in first and second codon positions) than homologous coding sequences from warm-blooded vertebrates; higher values are exceptional. These results at the gene level are in agreement with the compositional differences between cold-blooded and warm-blooded vertebrates previously found at the whole genome (DNA) level (Bernardi and Bernardi 1990a,b). Two linear correlations were found: one between the GC levels of coding sequences (or of their third codon positions) and the GC levels of the genomes of cold-blooded vertebrates containing them; and another between the GC levels of third and first + second codon positions of genes from cold-blooded vertebrates. The first correlation applies to the genomes (or genome compartments) of all vertebrates and the second to the genes of all living organisms. These correlations are tantamount to a genomic code.

3.
Clay O  Carels N  Douady C  Macaya G  Bernardi G 《Gene》2001,276(1-2):15-24
GC level distributions of a species' nuclear genome, or of its compositional fractions, encode key information on structural and functional properties of the genome and on its evolution. They can be calculated either from absorbance profiles of the DNA in CsCl density gradients at sedimentation equilibrium, or by scanning long contigs of largely sequenced genomes. In the present study, we address the quantitative characterization of the compositional heterogeneity of genomes, as measured by the GC distributions of fixed-length fragments. Special attention is given to mammalian genomes, since their compartmentalization into isochores implies two levels of heterogeneity, intra-isochore (local) and inter-isochore (global). This partitioning is a natural one, since large-scale compositional properties vary much more among isochores than within them. Intra-isochore GC distributions become roughly Gaussian for long fragments, and their standard deviations decrease only slowly with increasing fragment length, unlike those of random sequences. This effect can be explained by 'long-range' correlations, often overlooked, that are present along isochores.
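The measurement described in this abstract, the GC distribution of fixed-length fragments and its standard deviation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `fragment_gc` and `gc_spread` are hypothetical, fragments are taken as non-overlapping, and a trailing partial fragment is dropped.

```python
from statistics import mean, pstdev


def fragment_gc(seq: str, length: int) -> list[float]:
    """GC fraction of each non-overlapping fragment of the given length.

    A trailing fragment shorter than `length` is ignored (an assumption
    made here for simplicity).
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - length + 1, length):
        frag = seq[start:start + length]
        values.append((frag.count("G") + frag.count("C")) / length)
    return values


def gc_spread(seq: str, length: int) -> tuple[float, float]:
    """Mean and population standard deviation of the fragment GC levels."""
    values = fragment_gc(seq, length)
    return mean(values), pstdev(values)
```

Calling `gc_spread` on the same sequence with increasing values of `length` gives the kind of standard-deviation-versus-fragment-length curve the abstract discusses.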

4.
Isochore patterns and gene distributions in fish genomes   Total citations: 2 (self-citations: 0; citations by others: 2)
The compositional approach developed in our laboratory many years ago revealed a large-scale compositional heterogeneity in vertebrate genomes, in which GC-rich and GC-poor regions, the isochores, were found to be characterized by high and low gene densities, respectively. Here we mapped isochores on fish chromosomes and assessed gene densities in isochore families. Because of the availability of sequence data, we have concentrated our investigations on four species, zebrafish (Brachydanio rerio), medaka (Oryzias latipes), stickleback (Gasterosteus aculeatus), and pufferfish (Tetraodon nigroviridis), which belong to four distant orders and cover almost the entire GC range of fish genomes. These investigations produced isochore maps that were drastically different not only from those of mammals (in that only two major isochore families were essentially present in each genome vs five in the human genome) but also from each other (in that different isochore families were represented in different genomes). Gene density distributions for these fish genomes were also obtained and shown to follow the expected increase with increasing isochore GC. Finally, we discovered a remarkable conservation of the average size of the isochores (which match replicon clusters in the case of human chromosomes) and of the average GC levels of isochore families in both fish and human genomes. Moreover, in each genome the GC-poorest isochore families comprised a group of "long isochores" (2-20 Mb in size), which were the lowest in GC and varied in size distribution and relative amount from one genome to the other.

5.
The number of completely sequenced archaeal genomes has been sufficient for a large-scale bioinformatic study. We have conducted analyses for each coding region from 36 archaeal genomes using the original CGS algorithm by calculating the total GC content (G+C), GC content in first, second and third codon positions as well as in fourfold and twofold degenerate sites from third codon positions, levels of arginine codon usage (Arg2: AGA/G; Arg4: CGX), levels of amino acid usage and the entropy of amino acid content distribution. In archaeal genomes with strong GC pressure, arginine is coded preferably by GC-rich Arg4 codons, whereas in most archaeal genomes with G+C < 0.6, arginine is coded preferably by AT-rich Arg2 codons. In the genome of Haloquadratum walsbyi, which is closely related to GC-rich archaea, GC content has decreased mostly in third codon positions, while the Arg4 > Arg2 bias still persists. Proteomes of archaeal species carry characteristic amino acid biases: levels of isoleucine and lysine are elevated, while levels of alanine, histidine, glutamine and cysteine are relatively decreased. Numerous genomic and proteomic biases observed can be explained by the hypothesis of a previously existing strong mutational AT pressure in the common predecessor of all archaea.

6.
The DNA strands in most prokaryotic genomes experience strand-biased spontaneous mutation, especially C→T mutations produced by deamination that occur preferentially in the leading strand. This has often been invoked to account for the asymmetry in nucleotide composition, typically measured by GC skew, between the leading and the lagging strand. Casting such strand asymmetry in the framework of a nucleotide substitution model is important for understanding genomic evolution and phylogenetic reconstruction. We present a substitution model showing that the increased C→T mutation will lead to positive GC skew in one strand but negative GC skew in the other, with greater C→T mutation pressure associated with greater differences in GC skew between the leading and the lagging strand. However, the model based on mutation bias alone does not predict any positive correlation in GC skew between the leading and lagging strands. We computed GC skew for coding sequences collinear with the leading and lagging strands across 339 prokaryotic genomes and found a strong and positive correlation in GC skew between the two strands. We show that the observed positive correlation can be satisfactorily explained by an improved substitution model with one additional parameter incorporating a general trend of C avoidance.
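GC skew is conventionally defined as (G - C)/(G + C) over a stretch of sequence. The sketch below assumes that convention (the abstract does not spell out its formula) and uses hypothetical helper names. It also shows why a sequence and its reverse complement have skews of opposite sign, which is the behavior the mutation-bias-only model predicts for the two strands.

```python
def gc_skew(seq: str) -> float:
    """GC skew (G - C) / (G + C); returns 0.0 when the sequence has no G or C."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]
```

For any sequence, `gc_skew(reverse_complement(s))` equals `-gc_skew(s)`, since every G on one strand pairs with a C on the other.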

7.
Compositional Properties of Green-Plant Plastid Genomes   Total citations: 2 (self-citations: 0; citations by others: 2)
We studied variation of GC contents among plastid (Pt) genomes of green plants. In the green plants, the GC contents of the whole Pt genomes range from 42.14 to 28.81%. These values are similar to those observed in the mitochondrial (Mt) genomes of the green plants; however, the GC contents in the Pt genomes are not related to those in the Mt genomes or the nuclear (Nc) genomes. In addition, some compositional properties of the three types of genomes are different. Thus, it is suggested that the GC contents of the Pt genomes are maintained independently of the other genomes within a cell. We found that the compositional bias toward AT is strong at the third codon position and in intergenic spacer (IGS) regions in the Pt genomes, and the GC contents (GC3 and GCIGS) at these sites are generally similar within each genome. Additionally, the GC3 and GCIGS are strongly related to the whole-genome GC content. Therefore, the interspecific variation of the GC contents in the Pt genomes is suggested to be mainly caused by the variation of the GC3 and GCIGS, both of which are considered to be under weak selective constraints. Using a maximum likelihood approach, we estimated equilibrium GC3 (eqGC3) of 12 genes in the land-plant Pt genomes. We found an increase in eqGC3 after the divergence of liverworts. These results suggest that genome-wide factors such as GC mutational bias are important for the biased base composition in the Pt genomes. Reviewing Editor: Dr. Brian Morton

8.
Eubacterial genomes have highly variable GC content (0.17-0.75), and the primary mechanism behind this variability remains unknown. The place to look is the machinery that actually catalyzes the synthesis of DNA, where DNA polymerase III takes center stage, particularly one of its 10 subunits, the alpha subunit. According to the dimeric combination of alpha subunits, GC contents of eubacterial genomes were partitioned into three groups with distinct GC content variation spectra: dnaE1 (full-spectrum), dnaE2/dnaE1 (high-GC), and polC/dnaE3 (low-GC). Therefore, genomic GC content variability is believed to be governed primarily by the alpha subunit grouping of DNA polymerase III; it is essential in genome composition analysis to take full account of this grouping principle. Since horizontal gene transfer is very frequent among bacterial genomes, exceptions to the grouping scheme, a few percent of the total, are readily identifiable and should be excluded from in-depth analyses of nucleotide composition.

9.
10.
The compositional distributions of large (main-band) DNA fragments from eight birds belonging to eight different orders (including both paleognathous and neognathous species) are very broad and extremely close to each other. These findings, which are paralleled by the compositional similarity of homologous coding sequences and their codon positions, support the idea that birds are a monophyletic group. The compositional distribution of third-codon positions of genes from chicken, the only avian species for which a relatively large number of coding sequences is known, is very broad and bimodal, the minor GC-richer peak reaching 100% GC. The very high compositional heterogeneity of avian genomes is accompanied (as in the case of mammalian genomes) by a very high speciation rate compared to cold-blooded vertebrates, which are characterized by genomes that are much less heterogeneous. The higher GC levels attained by avian compared to mammalian genomes might be correlated with the higher body temperature (41–43°C) of birds compared to mammals (37°C). A comparison of GC levels of coding sequences and codon positions from man and chicken revealed very close average GC levels and standard deviations. Homologous coding sequences and codon positions from man and chicken showed a surprisingly high degree of compositional similarity which was, however, higher for GC-poor than for GC-rich sequences. This indicates that GC-poor isochores of warm-blooded vertebrates reflect the composition of the isochores of the genome of the common reptilian ancestor of mammals and birds, which underwent only a small compositional change at the transition from cold- to warm-blooded vertebrates. In contrast, the GC-rich isochores of birds and mammals are the result of large compositional changes at the same evolutionary transition, which were in part different in the two classes of warm-blooded vertebrates. Correspondence to: G. Bernardi

11.
Compositional distributions in the three codon positions of the coding sequences of 12 fully sequenced prokaryotic genomes, which are publicly available, were investigated. A universal compositional correlation was observed in most of the genomes under investigation irrespective of their overall genomic GC contents. In all the genomes, the GC contents at the first codon positions are always greater than the overall GC contents of the genomes, whereas the reverse is true in the case of second codon positions. GC contents at the third codon positions are higher than the overall genomic GC contents in high-GC genomes, and the opposite situation was found in the case of low-GC genomes except for Helicobacter pylori. In high-GC genomes, the GC contents at the first + second codon positions are less than the GC contents at the third codon positions, and they are low in low-GC genomes except for Helicobacter pylori. The distributions of the four bases at the three different positions were also investigated for all 12 organisms. It was observed that in high-GC genomes G is the most dominant base and in low-GC genomes A is the most dominant base in the first codon positions. But purine bases, i.e., (A + G), predominantly occur in the first codon position. In the second codon position, A is the most dominant base in most of the organisms and G is the least dominant base in all the organisms. There is no unique regular pattern of individual bases at the third codon positions; however, there are significant differences in the occurrences of (G + C) contents in the third codon positions among the different organisms. Calculations of dinucleotide frequencies in 12 different organisms indicate that in GC-rich genomes GG, GC, CC, and CG dinucleotides are the most dominant, whereas the reverse is true in the case of low-GC genomes. Biological implications of these results are discussed in this paper.
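The two quantities this abstract relies on, GC content at each codon position and dinucleotide counts, can be computed as in the sketch below. The function names are hypothetical, the coding sequence is assumed to be in frame starting at its first base, and any trailing bases that do not complete a codon are ignored.

```python
from collections import Counter


def codon_position_gc(cds: str) -> tuple[float, ...]:
    """GC fraction at the first, second and third codon positions.

    Assumes `cds` is in frame from its first base; incomplete trailing
    codons are dropped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    result = []
    for pos in range(3):
        bases = cds[pos:usable:3]
        gc = sum(1 for base in bases if base in "GC")
        result.append(gc / len(bases) if bases else 0.0)
    return tuple(result)


def dinucleotide_counts(seq: str) -> Counter:
    """Counts of all overlapping dinucleotides in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))
```

For the toy sequence `GCAGCTGGA` (codons GCA, GCT, GGA), `codon_position_gc` returns `(1.0, 1.0, 0.0)`.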

12.
The vertebrate genome: isochores and evolution   Total citations: 18 (self-citations: 6; citations by others: 12)

13.
The GC contents of 2670 prokaryotic genomes that belong to diverse phylogenetic lineages were analyzed in this paper. These genomes had GC contents that ranged from 13.5% to 74.9%. We analyzed the distance of base frequencies at the three codon positions, codon frequencies, and amino acid compositions across genomes with respect to the differences in the GC content of these prokaryotic species. We found that although the phylogenetic lineages were remote among some species, a similar genomic GC content forced them to adopt similar base usage patterns at the three codon positions, codon usage patterns, and amino acid usage patterns. Our work demonstrates that in prokaryotic genomes: a) base usage, codon usage, and amino acid usage change with GC content with a linear correlation; b) the distance of each usage has a linear correlation with the GC content difference; and c) GC content is more essential than phylogenetic lineage in determining base usage, codon usage, and amino acid usage. This work is exceptional in that we adopted intuitively graphic methods for all analyses, and we used these analyses to examine as many as 2670 prokaryotes. We hope that this work is helpful for understanding common features in the organization of microbial genomes.

14.
Musto et al. [FEBS Lett. 573 (2004) 73] studied the correlations between GC levels and optimal growth temperatures in 20 prokaryotic families. They reported that positive correlations are generally observed, and many of these are significant. Here, we have shown that these correlations are not "robust," i.e., correlation coefficients and/or significance of correlations can be considerably influenced by exclusion of very few (even as few as one) species from each dataset. The sensitivity of the correlations is assumed to result from high levels of bias in the family datasets. We concluded that, based solely on these data, one cannot establish that GC contents of prokaryotic genomes increase as a result of growth temperature increments.
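One simple way to probe the kind of robustness discussed here is to recompute the correlation coefficient after dropping each data point in turn. The sketch below is an assumption about how such a check could be done, not the authors' stated procedure; the function names are hypothetical, and the inputs must have non-zero variance in both variables.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length lists.

    Raises ZeroDivisionError if either variable has zero variance.
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def leave_one_out(xs: list[float], ys: list[float]) -> list[float]:
    """Correlation coefficients obtained by excluding each point in turn."""
    return [
        pearson(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        for i in range(len(xs))
    ]
```

With the purely illustrative toy values `xs = [1.0, 2.0, 3.0, 10.0]` and `ys = [1.0, 0.0, 1.0, 10.0]` (not real GC or temperature data), the full correlation is strong, but excluding the last point leaves no correlation at all. That is the pattern a non-robust correlation would show.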

15.
We report here results which indicate (i) that the nuclear genomes of angiosperms are characterized by a compositional compartmentalization and an isochore structure; and (ii) that the nuclear genomes of some Gramineae exhibit strikingly different compositional patterns compared to those of many dicots. Indeed, the compositional distributions of nuclear DNA molecules (in the 50-100 Kb size range) from three dicots (pea, sunflower and tobacco) and three monocots (maize, rice and wheat) were found to be centered around lower (41%) and higher (45% for rice, 48% for maize and wheat) GC levels, respectively (and to trail towards even higher GC values in maize and wheat). Experiments on gene localization in density gradient fractions showed a remarkable compositional homogeneity in vast (greater than 100-200 Kb) regions surrounding the genes. On the other hand, the compositional distribution of coding sequences (GenBank and literature data) from dicots (several orders) was found to be narrow, symmetrical and centered around 46% GC, whereas that from monocots (essentially barley, maize and wheat) was found to be broad, asymmetrical and characterized by an upward trend towards high GC values, with the majority of sequences between 60 and 70% GC. Introns exhibited a similar compositional distribution, but lower GC levels, compared to exons from the same genes.

16.
The nucleotide composition of genomes undergoes dramatic variations among all three kingdoms of life. GC content, an important characteristic for a genome, is related to many important functions, and therefore GC content and its distribution are routinely reported for sequenced genomes. Traditionally, GC content distribution is assessed by computing GC contents in windows that slide along the genome. Disadvantages of this routinely used window-based method include low resolution and low sensitivity. Additionally, different window sizes result in different GC content distribution patterns within the same genome. We proposed a windowless method, the GC profile, for displaying GC content variations across the genome. Compared to the window-based method, the GC profile has the following advantages: 1) higher sensitivity, because of variation-amplifying procedures; 2) higher resolution, because boundaries between domains can be determined at one single base pair; 3) uniqueness, because the GC profile is unique for a given genome; and 4) the capacity to show both global and regional GC content distributions. These characteristics are useful in identifying horizontally transferred genomic islands and homogeneous GC-content domains. Here, we review the applications of the GC profile in identifying genomic islands and genome segmentation points, and in serving as a platform to integrate with other algorithms for genome analysis. A web server generating GC profiles and implementing relevant genome segmentation algorithms is available at: www.zcurve.net.
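The contrast between window-based and windowless views can be illustrated with a small sketch. The code below is not the implementation behind www.zcurve.net: `windowed_gc` is an ordinary sliding-window calculation, and `cumulative_gc_curve` is a simplified cumulative curve (running count of A+T minus G+C, with the overall linear trend removed using the end-point slope rather than a fitted one). Both names and the detrending choice are assumptions made for illustration.

```python
def windowed_gc(seq: str, window: int, step: int) -> list[float]:
    """GC fraction in windows of size `window`, advanced by `step` bases."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append((chunk.count("G") + chunk.count("C")) / window)
    return values


def cumulative_gc_curve(seq: str) -> list[float]:
    """Windowless cumulative curve: running (A+T) - (G+C), detrended.

    The curve falls through GC-rich stretches and rises through AT-rich
    ones, so turning points mark candidate domain boundaries at
    single-base resolution.
    """
    seq = seq.upper()
    running = 0
    raw = []
    for base in seq:
        if base in "AT":
            running += 1
        elif base in "GC":
            running -= 1
        raw.append(running)
    # End-point slope used as a simple stand-in for a fitted trend line.
    slope = raw[-1] / len(raw) if raw else 0.0
    return [value - slope * (i + 1) for i, value in enumerate(raw)]
```

On the toy sequence `GGGGAAAA`, a window of 4 reports two distinct domains while a window of 8 reports a single value of 0.5, which is the window-size dependence the abstract criticizes. The cumulative curve, by contrast, has its minimum exactly at the last G.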

17.
The compositional properties of human genes   Total citations: 8 (self-citations: 0; citations by others: 8)
Summary: The present work represents the first attempt to study in greater detail previously proposed compositional correlations in genomes, based on a body of additional data relating to gene localizations as well as to extended flanking sequences extracted from gene banks. We have investigated the correlations that exist between (1) the GC levels of exons of human genes, and (2) the GC levels of either intergenic sequences or introns associated with the genes under consideration. In both cases, linear relationships with slopes close to unity were found. The similarity of the linear relationships indicates similar GC levels in intergenic sequences and introns located in the same isochores. Moreover, both intergenic sequences and introns showed GC levels 5–10% lower than the corresponding exons. The above findings considerably strengthen the previously drawn conclusion that coding and noncoding sequences (both inter- and intragenic) from the same isochores of the human genome are compositionally correlated. In addition, we find linear correlations between the GC levels of codon positions and of the intergenic sequences or introns associated with the corresponding genes, as well as among the GC levels of codon positions of genes.

18.
The genomic distribution of 23 nuclear genes from three dicotyledons (pea, sunflower, tobacco) and five monocotyledons of the Gramineae family (barley, maize, rice, oat, wheat) was studied by localizing these genes in DNA fractions obtained by preparative centrifugation in Cs2SO4/BAMD density gradients. Each one of these genes (and of many other related genes and pseudogenes) was found to be located in DNA fragments (50-100 Kb in size) that were less than 1-2% GC apart from each other. This definitively demonstrates the existence of isochores in plant genomes, namely of compositionally homogeneous DNA regions at least 100-200 Kb in size. Moreover, the GC levels of the 23 coding sequences studied, of their first, second and third codon positions, and of the corresponding introns were found to be linearly correlated with the GC levels of the isochores harboring those genes. Compositional correlations displayed increasing slopes when going from second to first to third codon position with obvious effects on codon usage. Coding sequences for seed storage proteins and phytochrome of Gramineae deviate from the compositional correlations just described. Finally, CpG doublets of coding sequences were characterized by a shortage that decreased and vanished with increasing GC levels of the sequences. A number of these findings bear a striking similarity with results previously obtained for vertebrate genes.

19.
CpG islands: features and distribution in the genomes of vertebrates   Total citations: 4 (self-citations: 0; citations by others: 4)
B Aïssani  G Bernardi 《Gene》1991,106(2):173-183
We have investigated the distribution of unmethylated CpG islands in vertebrate genomes fractionated according to their base composition. Genomes from warm-blooded vertebrates (man, mouse and chicken) are characterized by abundant CpG islands, whose frequency increases in DNA fractions of increasing % of guanine + cytosine; % G + C (GC), in parallel with the distribution of genes and CpG doublets. Small, yet significant, differences in the distribution of CpG islands were found in the three genomes. In contrast, genomes from cold-blooded vertebrates (two reptiles, one amphibian, and two fishes) were characterized by an extreme scarcity or absence of CpG islands (detected in these experiments as HpaII tiny fragments or HTF). CpG islands associated with homologous genes from cold- and warm-blooded vertebrates were then compared by analyzing CpG frequencies, GC levels, HpaII sites, rare-cutter sites and G/C boxes (GGGGCGGGGC and closely related motifs) in sequences available in gene banks. Small, yet significant, differences were again detected among the CpG islands associated with homologous genes from warm-blooded vertebrates, in that CpG islands associated with mouse or rat genes often showed low CpG and/or GC levels, as well as low numbers of HpaII sites, rare-cutter sites and G/C boxes, compared to homologous human genes; more rarely, CpG islands were just absent. As far as cold-blooded vertebrates were concerned, a number of genes showed CpG islands, which exhibited a much lower frequency of CpG doublets than that found in CpG islands of warm-blooded vertebrates, but still approached the statistically expected frequency; none of the other features of CpG islands associated with genes from warm-blooded vertebrates were present. Other genes did not show any associated CpG islands, unlike their homologues from warm-blooded vertebrates.
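Comparing an observed CpG frequency with its "statistically expected" value is commonly done with the observed/expected ratio (CpG count × sequence length) / (C count × G count). The abstract does not state which formula the authors used, so the sketch below simply assumes this widely used normalization; the function name is hypothetical.

```python
def cpg_observed_expected(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G).

    A value near 1.0 means CpG doublets occur about as often as base
    composition alone predicts; returns 0.0 if the sequence lacks C or G.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    if c == 0 or g == 0:
        return 0.0
    return cpg * n / (c * g)
```

For example, `cpg_observed_expected("CCGG")` is 1.0, while a sequence with the same base composition but no CpG doublet, such as `"GGCC"`, gives 0.0.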

20.
Zhang SH  Wang L 《Genomics》2011,97(5):330-331
It has been reported that there is a majority triplet profile among genomes, which was considered as a reflection of general mechanisms of genome evolution (Albrecht-Buehler, 2007). However, there are actually, according to our further analysis and at least among prokaryotic genomes, two common triplet profiles: one is from low-GC content genomes; the other is from high-GC content genomes. Both common profiles would be direct reflections of GC content variations and strand symmetry of genomic sequences.
