Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Clay O  Carels N  Douady C  Macaya G  Bernardi G 《Gene》2001,276(1-2):15-24
GC level distributions of a species' nuclear genome, or of its compositional fractions, encode key information on structural and functional properties of the genome and on its evolution. They can be calculated either from absorbance profiles of the DNA in CsCl density gradients at sedimentation equilibrium, or by scanning long contigs of largely sequenced genomes. In the present study, we address the quantitative characterization of the compositional heterogeneity of genomes, as measured by the GC distributions of fixed-length fragments. Special attention is given to mammalian genomes, since their compartmentalization into isochores implies two levels of heterogeneity, intra-isochore (local) and inter-isochore (global). This partitioning is a natural one, since large-scale compositional properties vary much more among isochores than within them. Intra-isochore GC distributions become roughly Gaussian for long fragments, and their standard deviations decrease only slowly with increasing fragment length, unlike random sequences. This effect can be explained by 'long-range' correlations, often overlooked, that are present along isochores.
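As an illustration of the measurement described in this abstract, the following minimal sketch computes the GC levels of non-overlapping fixed-length fragments and their standard deviation. It is not the authors' code; the function names `gc_fraction`, `fragment_gc_levels`, and `gc_spread` are hypothetical, and the choice of non-overlapping fragments is an assumption.

```python
import statistics


def gc_fraction(seq):
    """Fraction of G+C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return gc / acgt if acgt else 0.0


def fragment_gc_levels(seq, fragment_length):
    """GC level of each non-overlapping fragment; a trailing partial fragment is dropped."""
    last_start = len(seq) - fragment_length
    return [
        gc_fraction(seq[start:start + fragment_length])
        for start in range(0, last_start + 1, fragment_length)
    ]


def gc_spread(seq, fragment_length):
    """Population standard deviation of the fragment GC levels."""
    return statistics.pstdev(fragment_gc_levels(seq, fragment_length))
```

For an uncorrelated random sequence with GC fraction p, the standard deviation is expected to fall as sqrt(p(1 - p) / l) with fragment length l. Calling `gc_spread` at several fragment lengths and comparing against that curve is one way to see the slower decrease the abstract attributes to long-range correlations.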

2.
Chen LL  Gao F 《The FEBS journal》2005,272(13):3328-3336
Eukaryotic genomes are composed of isochores, i.e. long sequences relatively homogeneous in GC content. In this paper, the isochore structure of the Arabidopsis thaliana genome has been studied using a windowless technique based on the Z curve method, and intuitive curves are drawn for all five chromosomes. Using these curves, we can calculate the GC content at any resolution, even at the base level. It is observed that all five chromosomes are composed of several alternating GC-rich and AT-rich regions. Usually, these regions, named 'isochore-like regions', have large fluctuations in GC content. Five isochores with little fluctuation are also observed. Detailed analyses have been performed for these isochores. A GC-rich 'isochore-like region' and a GC-isochore in chromosomes II and IV, respectively, are the nucleolar organizer regions (NORs), and genes located in the two regions prefer to use GC-ending codons. Another GC-isochore located in chromosome II is a mitochondrial DNA insertion region; the position and size of this region are precisely predicted by the current method. The amino acid usage and codon preference of genes in this organellar-to-nuclear transfer region show significant differences from other regions. Moreover, the centromeres are located in GC-rich 'isochore-like regions' in all five chromosomes. The current method can provide a useful tool for analyzing whole genomic sequences of eukaryotes.
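The windowless idea can be sketched as follows, assuming the usual Z curve component that accumulates +1 for A or T and -1 for G or C. This is an illustrative reconstruction rather than the authors' implementation, and the names `z_component` and `gc_content_between` are hypothetical. Over any interval the slope of this cumulative curve equals (AT fraction) - (GC fraction) = 1 - 2·GC, so the GC content follows from two curve values without choosing a window size in advance.

```python
from itertools import accumulate


def z_component(seq):
    """Cumulative (A+T) - (G+C) count; z[n] covers the first n bases, z[0] == 0.

    Ambiguous bases (e.g. N) contribute 0, which is a simplifying assumption.
    """
    steps = [1 if b in "AT" else -1 if b in "GC" else 0 for b in seq.upper()]
    return [0] + list(accumulate(steps))


def gc_content_between(z, start, end):
    """GC content of bases start..end-1, read off the slope of the curve."""
    if not 0 <= start < end < len(z):
        raise ValueError("need 0 <= start < end <= sequence length")
    slope = (z[end] - z[start]) / (end - start)
    return 0.5 * (1.0 - slope)
```

With `end = start + 1` the same formula returns 1.0 or 0.0 for a single unambiguous base, which corresponds to the "base level" resolution mentioned in the abstract.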

3.
Vanishing GC-rich isochores in mammalian genomes   Cited by: 25 (self-citations: 0; citations by others: 25)
Duret L  Semon M  Piganeau G  Mouchiroud D  Galtier N 《Genetics》2002,162(4):1837-1847
To understand the origin and evolution of isochores (the peculiar spatial distribution of GC content within mammalian genomes), we analyzed the synonymous substitution pattern in coding sequences from closely related species in different mammalian orders. In primates and cetartiodactyls, GC-rich genes are undergoing a large excess of GC → AT substitutions over AT → GC substitutions: GC-rich isochores are slowly disappearing from the genomes of these two mammalian orders. In rodents, our analyses suggest both a decrease in GC content of GC-rich isochores and an increase in GC-poor isochores, but more data will be necessary to assess the significance of this pattern. These observations question the conclusions of previous works that assumed that base composition was at equilibrium. Analysis of allele frequency in human polymorphism data, however, confirmed that in the GC-rich parts of the genome, GC alleles have a higher probability of fixation than AT alleles. This fixation bias appears not strong enough to overcome the large excess of GC → AT mutations. Thus, whatever the evolutionary force (neutral or selective) at the origin of GC-rich isochores, this force is no longer effective in mammals. We propose a model based on the biased gene conversion hypothesis that accounts for the origin of GC-rich isochores in the ancestral amniote genome and for their decline in present-day mammals.

4.
In a recent paper in these pages, Cohen et al. search for isochores in the human genome, based on a system of attributes that they assign to isochores. The putative isochores that they find and choose for presentation are almost all below 45% GC and cover only about 41% of the genome. Closer inspection reveals that the authors' methodology systematically loses GC-rich isochores because it does not anticipate the considerable fluctuations and corresponding long-range correlations that characterize mammalian DNA and that are highest in GC-rich DNA. Thus, they over-fragment GC-rich isochores (and also many GC-poor isochores) beyond recognition.

5.
The compositional properties of human genes   Cited by: 8 (self-citations: 0; citations by others: 8)
Summary. The present work represents the first attempt to study in greater detail previously proposed compositional correlations in genomes, based on a body of additional data relating to gene localizations as well as to extended flanking sequences extracted from gene banks. We have investigated the correlations that exist between (1) the GC levels of exons of human genes, and (2) the GC levels of either intergenic sequences or introns associated with the genes under consideration. In both cases, linear relationships with slopes close to unity were found. The similarity of the linear relationships indicates similar GC levels in intergenic sequences and introns located in the same isochores. Moreover, both intergenic sequences and introns showed GC levels 5–10% lower than the corresponding exons. The above findings considerably strengthen the previously drawn conclusion that coding and noncoding sequences (both inter- and intragenic) from the same isochores of the human genome are compositionally correlated. In addition, we find linear correlations between the GC levels of codon positions and of the intergenic sequences or introns associated with the corresponding genes, as well as among the GC levels of codon positions of genes.

6.
Summary. We have analyzed the correlation that exists between the GC levels of third and first or second codon positions for about 1400 human coding sequences. The linear relationship that was found indicates that the large differences in GC level of third codon positions of human genes are paralleled by smaller differences in GC levels of first and second codon positions. Whereas third codon position differences correspond to very large differences in codon usage within the human genome, the first and second codon position differences correspond to smaller, yet very remarkable, differences in the amino acid composition of encoded proteins. Because GC levels of codon positions are linearly correlated with the GC levels of the isochores harboring the corresponding genes, both codon usage and amino acid composition are different for proteins encoded by genes located in isochores of different GC levels. Furthermore, we have also shown that a linear relationship with a unity slope and a correlation coefficient of 0.77 exists between GC levels of introns and exons from the 238 human genes currently available for this analysis. Introns are, however, about 5% lower in GC, on average, than exons from the same genes.
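The quantity compared in this abstract, the GC level at each codon position, can be computed along the following lines. This is a minimal sketch assuming an in-frame coding sequence that starts at the first codon position; the function name `codon_position_gc` is hypothetical and the handling of incomplete trailing codons is an assumption.

```python
def codon_position_gc(cds):
    """Return (GC1, GC2, GC3): the GC fraction at codon positions 1, 2 and 3.

    Assumes cds is in frame; bases of an incomplete trailing codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # length covered by complete codons
    levels = []
    for offset in range(3):
        bases = cds[offset:usable:3]  # every third base, starting at this codon position
        gc = sum(1 for base in bases if base in "GC")
        levels.append(gc / len(bases) if bases else 0.0)
    return tuple(levels)
```

Applying this to each gene in a set gives one (GC1, GC2, GC3) triple per gene, which is the kind of data on which a linear relationship between third and first or second positions could then be fitted.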

7.
Arabidopsis thaliana is an important model system for the study of plant biology. We have analyzed the complete genome sequences of Arabidopsis by using a newly developed windowless method for the GC content computation, the cumulative GC profile. It is shown that the Arabidopsis genome is organized into a mosaic structure of isochores. All the centromeric regions are located in GC-rich isochores, called centromere-isochores, which are characterized by a high GC content but low gene and T-DNA insertion densities. This characteristic distinguishes centromere-isochores from the other class of GC-rich isochores, called GC-isochores, which have high gene and T-DNA insertion densities. Consequently, 15 isochores have been identified, i.e., 7 AT-isochores, 3 GC-isochores, and 5 centromere-isochores. The genes in centromere-isochores, which have the highest GC content, have much shorter intron lengths and lower intron numbers, compared to those of the other two types. There is also considerable difference in the numbers and lengths of transposable elements (TEs) between AT and GC-isochores, i.e., the TE number (length) of AT-isochores is 6.3 (7.3) times that of GC-isochores. It is generally believed that TEs are accumulated in the regions surrounding the centromeres. However, within these TE-rich regions, there are regions of extremely low TE numbers (TE deserts), which correspond to the positions of centromere-isochores. In addition, a heterochromatic knob is located at the boundary of an AT-isochore. Furthermore, we show that the differences in GC content among isochores are mainly due to the GC content variation of introns, the third codon positions and intergenic regions. [Reviewing Editor: Martin Kreitman]

8.
The genomic distribution of 23 nuclear genes from three dicotyledons (pea, sunflower, tobacco) and five monocotyledons of the Gramineae family (barley, maize, rice, oat, wheat) was studied by localizing these genes in DNA fractions obtained by preparative centrifugation in Cs2SO4/BAMD density gradients. Each one of these genes (and of many other related genes and pseudogenes) was found to be located in DNA fragments (50-100 Kb in size) that were less than 1-2% GC apart from each other. This definitively demonstrates the existence of isochores in plant genomes, namely of compositionally homogeneous DNA regions at least 100-200 Kb in size. Moreover, the GC levels of the 23 coding sequences studied, of their first, second and third codon positions, and of the corresponding introns were found to be linearly correlated with the GC levels of the isochores harboring those genes. Compositional correlations displayed increasing slopes when going from second to first to third codon position with obvious effects on codon usage. Coding sequences for seed storage proteins and phytochrome of Gramineae deviate from the compositional correlations just described. Finally, CpG doublets of coding sequences were characterized by a shortage that decreased and vanished with increasing GC levels of the sequences. A number of these findings bear a striking similarity with results previously obtained for vertebrate genes.
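A CpG "shortage" is usually expressed relative to what base composition alone would predict. The sketch below uses the widely used observed/expected ratio, (number of CpG × sequence length) / (number of C × number of G); whether the abstract's authors used exactly this statistic is an assumption, and the function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio; values below 1 indicate a CpG shortage.

    Returns None when the sequence contains no C or no G, since the
    expectation is then zero and the ratio is undefined.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return None
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return cpg_count * len(seq) / (c_count * g_count)
```

Plotting this ratio against the GC level of each coding sequence would be one way to look for the trend the abstract describes, with the ratio approaching 1 as GC increases.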

9.
An analysis of silent substitutions in pairwise comparisons of homologous genes from different mammals has shown that, in spite of individual fluctuations, their frequencies (which are very strongly correlated with the frequency of substitutions per synonymous site calculated according to Li et al. 1985) do not vary, on the average, with the GC levels of silent positions. This holds in the general case, in which silent positions of pairs of homologous genes share the same composition, namely in the human/other primates, human/artiodactyls, and in the mouse/rat pairs, as well as in the special cases in which the compositions of silent positions are different, namely in the human/rabbit and the human/rat (or human/mouse) pairs. A slightly lower frequency found for low GC values in the human/bovine and human/pig pairs seems to be due to the specific gene samples used. These results contradict the previously claimed existence of differences in mutation rates and of mutational biases in third codon positions of coding sequences located in different isochores of mammalian genomes. They also imply that the variations in nucleotide precursor pools through the cell cycle and the differences in replication timing, or in repair efficiency, which were reported for different isochores, do not lead, as claimed, to differences in mutation rates, nor in mutational biases in mammals. The differences claimed appear to be due to using small gene samples when individual fluctuations from gene to gene are relatively large. Correspondence to: G. Bernardi

10.
Summary. We have made pairwise comparisons between the coding sequences of 21 genes from cold-blooded vertebrates and 41 homologous sequences from warm-blooded vertebrates. In the case of 12 genes, GC levels were higher, especially in third codon positions, in warm-blooded vertebrates compared to cold-blooded vertebrates. Six genes showed no remarkable difference in GC level and three showed a lower level. In the first case, higher GC levels appear to be due to a directional fixation of mutations, presumably under the influence of body temperature (see Bernardi and Bernardi 1986b). These GC-richer genes of warm-blooded vertebrates were located, in all cases studied, in isochores higher in GC than those comprising the homologous genes of cold-blooded vertebrates. In the third case, increases appear to be due to a limited formation of GC-rich isochores which took place in some cold-blooded vertebrates after the divergence of warm-blooded vertebrates. The directional changes in the GC content of coding sequences and the evolutionary conservation of both increased and unchanged GC levels are in keeping with the existence of compositional constraints on the genome.

11.
DNA methylation is a major epigenetic modification of the genome that affects basic biological functions, such as gene expression and cell development. We used the human genome sequences and the DNA methylation data that are available in order to establish a map of the levels of GC and methylation in isochores. We also looked for the correlations that hold between GC levels and the distribution of the (1) dinucleotide CpG, (2) ratio 5mC/CpG, and (3) CpG islands. Our results show that methylation levels, CpG frequencies, and the density of CpG islands are positively correlated with the GC level of isochores. In contrast, the correlation between the 5mC/CpG ratio and GC is a negative one because the increase in methylation lags behind that of CpG, to reach a plateau in the GC-richest, gene-richest isochore families H2 and H3. In conclusion, there are more CpG targets that remain unmethylated in the GC-richest, gene-richest isochores in comparison with the other isochores. This conclusion supports the idea that the widespread methylation under consideration here has a general inhibitory effect on gene expression.

12.
We compared the exon/intron organization of vertebrate genes belonging to different isochore classes, as predicted by their GC content at third codon position. Two main features have emerged from the analysis of sequences published in GenBank: (1) genes coding for long proteins (i.e., 500 aa) are almost two times more frequent in GC-poor than in GC-rich isochores; (2) intervening sequences (=sum of introns) are on average three times longer in GC-poor than in GC-rich isochores. These patterns are observed among human, mouse, rat, cow, and even chicken genes and are therefore likely to be common to all warm-blooded vertebrates. Analysis of Xenopus sequences suggests that the same patterns exist in cold-blooded vertebrates. It could be argued that such results do not reflect reality because sequence databases are not representative of entire genomes. However, analysis of biases in GenBank revealed that the observed discrepancies between GC-rich and GC-poor isochores are not artifactual, and are probably largely underestimated. We investigated the distribution of microsatellites and interspersed repeats in introns of human and mouse genes from different isochores. This analysis confirmed previous studies showing that L1 repeats are almost absent from GC-rich isochores. Microsatellites and SINEs (Alu, B1, B2) are found at roughly equal frequencies in introns from all isochore classes. Globally, the presence of repeated sequences does not account for the increased intron length in GC-poor isochores. The relationships between gene structure and global genome organization and evolution are discussed.

13.
Hümbelin M  Thomas A  Lin J  Li J  Jore J  Berry A 《Gene》2002,300(1-2):129-139
Three statistical/mathematical analyses are carried out on isochore sequences: spectral analysis, analysis of variance, and segmentation analysis. Spectral analysis shows that there are GC content fluctuations at different length scales in isochore sequences. The analysis of variance shows that the null hypothesis (the mean value of a group of GC contents remains the same along the sequence) may or may not be rejected for an isochore sequence, depending on the subwindow sizes at which GC contents are sampled, and the window size within which group members are defined. The segmentation analysis shows that there are stronger indications of GC content changes at isochore borders than within an isochore. These analyses support the notion of isochore sequences, but reject the assumption that isochore sequences are homogeneous at the base level. An isochore sequence may pass a homogeneity test when GC content fluctuations at smaller length scales are ignored or averaged out.
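The abstract does not say which segmentation criterion was used. As one common possibility, the sketch below finds the single cut point that maximizes the Jensen-Shannon divergence between the GC/AT compositions of the left and right parts. Both the choice of criterion and the names `_binary_entropy` and `best_split` are assumptions made for illustration.

```python
import math


def _binary_entropy(gc_count, length):
    """Shannon entropy (bits) of the GC-versus-AT composition of a segment."""
    if length == 0:
        return 0.0
    p = gc_count / length
    return -sum(x * math.log2(x) for x in (p, 1.0 - p) if x > 0)


def best_split(seq):
    """Return (cut, divergence) for the cut that best separates GC levels.

    cut is the length of the left segment, or None if no cut gives a
    positive divergence (for example, in a perfectly uniform sequence).
    """
    is_gc = [1 if base in "GC" else 0 for base in seq.upper()]
    n, total_gc = len(is_gc), sum(is_gc)
    whole = _binary_entropy(total_gc, n)
    best_cut, best_div = None, 0.0
    left_gc = 0
    for cut in range(1, n):
        left_gc += is_gc[cut - 1]
        # Jensen-Shannon divergence: entropy of the whole minus the
        # length-weighted entropies of the two parts.
        div = (whole
               - (cut / n) * _binary_entropy(left_gc, cut)
               - ((n - cut) / n) * _binary_entropy(total_gc - left_gc, n - cut))
        if div > best_div:
            best_cut, best_div = cut, div
    return best_cut, best_div
```

Comparing the divergence obtained at a candidate isochore border with the values obtained inside an isochore would be one way to express the "stronger indications" of change at borders. A practical analysis would also need a significance threshold, which this sketch omits.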

14.
15.
Recent investigations have revealed 1) that the isochores of the human genome group into two super‐families characterized by two different long‐range 3D structures, and 2) that these structures, essentially based on the distribution and topology of short sequences, mold primary chromatin domains (and define nucleosome binding). More specifically, GC‐poor, gene‐poor isochores are low‐heterogeneity sequences with oligo‐A spikes that mold the lamina‐associated domains (LADs), whereas GC‐rich, gene‐rich isochores are characterized by single or multiple GC peaks that mold the topologically associating domains (TADs). The formation of these “primary TADs” may be followed by extrusion under the action of cohesin and CTCF. Finally, the genomic code, which is responsible for the pervasive encoding and molding of primary chromatin domains (LADs and primary TADs, namely the “gene spaces”/“spatial compartments”) resolves the longstanding problems of “non‐coding DNA,” “junk DNA,” and “selfish DNA” leading to a new vision of the genome as shaped by DNA sequences.

16.
Warm-blooded isochore structure in Nile crocodile and turtle.   Cited by: 11 (self-citations: 0; citations by others: 11)

17.
The genomes of homeothermic (warm-blooded) vertebrates are mosaic interspersions of homogeneously GC-rich and GC-poor regions (isochores). Evolution of genome compartmentalization and GC-rich isochores is hypothesized to reflect either selective advantages of an elevated GC content or chromosome location and mutational pressure associated with the timing of DNA replication in germ cells. To address the present controversy regarding the origins and maintenance of isochores in homeothermic vertebrates, newly obtained as well as published nucleotide sequences of the insulin and insulin-like growth factor (IGF) genes, members of a well-characterized gene family believed to have evolved by repeated duplication and divergence, were utilized to examine the evolution of base composition in nonconstrained (flanking) and weakly constrained (introns and fourfold degenerate sites) regions. A phylogeny derived from amino acid sequences supports a common evolutionary history for the insulin/IGF family genes. In cold-blooded vertebrates, insulin and the IGFs were similar in base composition. In contrast, insulin and IGF-II demonstrate dramatic increases in GC richness in mammals, but no such trend occurred in IGF-I. Base composition of the coding portions of the insulin and IGF genes across vertebrates correlated (r = 0.90) with that of the introns and flanking regions. The GC content of homologous introns differed dramatically between insulin/IGF-II and IGF-I genes in mammals but was similar to the GC level of noncoding regions in neighboring genes. Our findings suggest that the base composition of introns and flanking regions is determined by chromosomal location and the mutational pressure of the isochore in which the sequences are embedded. An elevated GC content at codon third positions in the insulin and the IGF genes may reflect selective constraints on the usage of synonymous codons.

18.
《Gene》1997,194(1):107-113
A compositional map of the centromere and of the subcentromeric region of the long arm of human chromosome 21 was established by determining the GC levels (GC is the molar fraction of guanine+cytosine in DNA) of 11 YACs (yeast artificial chromosomes) covering this 13–14 Mb region which extends from the α-satellite sequences of the C(entromeric) band q11.1, through R(everse) band q11.2, to the proximal part of G(iemsa) band q21. The entire region is made up of GC-poor, or L, isochores with only one GC-rich H1 isochore, at least 2 Mb in size, located in band q21. The almost identical GC levels of the centromeric α-satellite repeats (38.5%), of R band q11.2 (39%), and of G bands (38–40%) provide a direct demonstration that base composition cannot be the only cause of the cytogenetic differences between C, G, and the majority of R bands, namely the H3⁻ R bands (which do not contain the GC-richest H3 isochores). The results obtained also show that isochores may be as long as 6 Mb, at least in the GC-poor regions of the genome, and support previous observations suggesting that YACs from isochore borders are unstable and/or difficult to clone. Genes and CpG islands are very rare in the GC-poor region investigated, as expected from the fact that their concentration is proportional to the GC levels of the isochores in which they are contained.

19.
The compositional distributions of large (main-band) DNA fragments from eight birds belonging to eight different orders (including both paleognathous and neognathous species) are very broad and extremely close to each other. These findings, which are paralleled by the compositional similarity of homologous coding sequences and their codon positions, support the idea that birds are a monophyletic group. The compositional distribution of third-codon positions of genes from chicken, the only avian species for which a relatively large number of coding sequences is known, is very broad and bimodal, the minor GC-richer peak reaching 100% GC. The very high compositional heterogeneity of avian genomes is accompanied (as in the case of mammalian genomes) by a very high speciation rate compared to cold-blooded vertebrates which are characterized by genomes that are much less heterogeneous. The higher GC levels attained by avian compared to mammalian genomes might be correlated with the higher body temperature (41–43°C) of birds compared to mammals (37°C). A comparison of GC levels of coding sequences and codon positions from man and chicken revealed very close average GC levels and standard deviations. Homologous coding sequences and codon positions from man and chicken showed a surprisingly high degree of compositional similarity which was, however, higher for GC-poor than for GC-rich sequences. This indicates that GC-poor isochores of warm-blooded vertebrates reflect the composition of the isochores of the genome of the common reptilian ancestor of mammals and birds, which underwent only a small compositional change at the transition from cold- to warm-blooded vertebrates. In contrast, the GC-rich isochores of birds and mammals are the result of large compositional changes at the same evolutionary transition, which were in part different in the two classes of warm-blooded vertebrates. Correspondence to: G. Bernardi

20.