Similar articles
 20 similar articles found (search time: 109 ms)
1.
Analytical DNA ultracentrifugation revealed that eukaryotic genomes are mosaics of isochores: long DNA segments (>300 kb on average) relatively homogeneous in G+C. Important genome features are dependent on this isochore structure, e.g. genes are found predominantly in the GC-richest isochore classes. However, no reliable method is available to rigorously partition the genome sequence into relatively homogeneous regions of different composition, thereby revealing the isochore structure of chromosomes at the sequence level. Homogeneous regions are currently ascertained by plain statistics on moving windows of arbitrary length, or simply by eye on G+C plots. By contrast, the entropic segmentation method is able to divide a DNA sequence into relatively homogeneous, statistically significant domains. An early version of this algorithm only produced domains having an average length far below the typical isochore size. Here we show that an improved segmentation method, specifically intended to determine the most statistically significant partition of the sequence at each scale, is able to identify the boundaries between long homogeneous genome regions displaying the typical features of isochores. The algorithm precisely locates classes II and III of the human major histocompatibility complex region, two well-characterized isochores at the sequence level, the boundary between them being the first isochore boundary experimentally characterized at the sequence level. The analysis is then extended to a collection of human large contigs. The relatively homogeneous regions we find show many of the features (G+C range, relative proportion of isochore classes, size distribution, and relationship with gene density) of the isochores identified through DNA centrifugation. Isochore chromosome maps, with many potential applications in genomics, are then drawn for all the completely sequenced eukaryotic genomes available.
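The core step of an entropic segmentation can be illustrated with a minimal sketch. It assumes a two-symbol alphabet (G/C versus A/T) and only finds the single cut that maximises the Jensen-Shannon divergence; the function names are hypothetical, and the significance test and multi-scale partitioning described in the abstract are not reproduced here.

```python
import math


def gc_entropy(gc, n):
    """Shannon entropy (in bits) of a GC-versus-AT composition with gc G/C bases out of n."""
    if n == 0:
        return 0.0
    h = 0.0
    for count in (gc, n - gc):
        if count:
            p = count / n
            h -= p * math.log2(p)
    return h


def best_cut(seq):
    """Return (position, divergence) of the cut with the largest Jensen-Shannon divergence.

    The cut at position i splits seq into seq[:i] and seq[i:].
    Returns (None, 0.0) if no cut yields a positive divergence.
    """
    is_gc = [1 if base in "GCgc" else 0 for base in seq]
    n = len(is_gc)
    total_gc = sum(is_gc)
    h_whole = gc_entropy(total_gc, n)
    best_pos, best_d = None, 0.0
    left_gc = 0
    for i in range(1, n):
        left_gc += is_gc[i - 1]
        # Divergence = entropy of the whole minus length-weighted entropies of the parts.
        d = (h_whole
             - (i / n) * gc_entropy(left_gc, i)
             - ((n - i) / n) * gc_entropy(total_gc - left_gc, n - i))
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d
```

For a toy sequence made of an AT-only half followed by a GC-only half, `best_cut` returns the midpoint. A real analysis would additionally ask whether the divergence at that cut is statistically significant before accepting the boundary.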

2.
Prior to genome sequencing, information on base composition (GC level) and its variation in mammalian genomes could be obtained using density gradient ultracentrifugation. Analyses using this approach led to the conclusion that mammalian genomes are organized into mosaics of fairly homogeneous regions, called isochores. We present an initial compositional overview of the chromosomes of the recently available draft human genome sequence, in the form of color-coded moving window plots and corresponding GC level histograms. Results obtained from the draft human genome sequence agree well with those obtained or deduced earlier from CsCl experiments. The draft sequence now permits the visualization of the mosaic organization of the human genome at the DNA sequence level.
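A moving-window GC profile of the kind mentioned above can be computed along the following lines. This is a minimal sketch: the function name, the non-overlapping default step and the window size are illustrative assumptions, not the settings used in the cited study.

```python
def window_gc(seq, window, step=None):
    """Return a list of (start, gc_fraction) pairs for windows of fixed length.

    step defaults to the window length, which gives non-overlapping windows.
    A trailing stretch shorter than one window is ignored.
    """
    step = step or window
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window].upper()
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, gc / window))
    return profile
```

The GC fractions in the returned list can then be colour-coded along the chromosome or binned into a histogram. The result depends on the arbitrary window length, which is the limitation several of the other abstracts on this page set out to address.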

3.
Isochore structures in the mouse genome   Total citations: 2, self-citations: 0, citations by others: 2
Zhang CT, Zhang R. Genomics, 2004, 83(3): 384-394
The distribution of the G+C content in the mouse genome has been studied using a windowless technique. We have found that: (i) abrupt variations of the G+C content from a GC-rich region to a GC-poor region, and vice versa, occur frequently at some sites along the sequence of the mouse genome; and (ii) long domains with relatively homogeneous G+C content (isochores) exist, which usually have sharp boundaries. Consequently, 28 isochores longer than 1 Mb have been identified in the mouse genome. A homogeneity index was used to quantify the variations of the G+C content within isochores. The precise boundaries, sizes, and G+C contents of these isochores have been determined. The windowless technique for the G+C content computation was also used to analyze the DNA sequence containing the mouse MHC region, which has a GC-poor isochore. This isochore is located in the central part of the sequence, with boundaries at 468,459 and 812,716 bp, where the sequence is oriented from the centromeric end to the telomeric end. In addition, the analysis of a segment of the rat genome shows that the rat genome also has clear isochore structures.

4.
An isochore map of the human genome based on the Z curve method   Total citations: 4, self-citations: 0, citations by others: 4
Zhang CT, Zhang R. Gene, 2003, 317(1-2): 127-135
The distribution of the G+C content in the human genome has been studied by using a windowless technique derived from the Z curve method. The most important findings presented in this paper are twofold. First, abrupt variations of the G+C content along human chromosome sequences are the main variation patterns of G+C content. It is found that at some sites, the G+C content changes abruptly from a G+C-rich region to a G+C-poor region, and vice versa. Second, it is shown that long domains with relatively homogeneous G+C content along each chromosome do exist. These domains are thought to be isochores, which usually have sharp boundaries. Consequently, 56 isochores longer than 3 Mb have been identified in chromosomes 1-22, X and Y. Boundaries, size and G+C content of each isochore identified are listed in detail. As an example to demonstrate the power of the method, the boundary between the Class III and Class II isochores of the MHC sequence has been determined and found to be at 2,477,936, which is in good agreement with the experimental evidence. A homogeneity index is introduced to measure the homogeneity of G+C content in isochores. We emphasize that the homogeneity of G+C content is relative. Isochores in which the G+C content stays absolutely constant do not exist. Isochore structures appear to be a basic organization of the human genome. Owing to their relevance to many important biological functions, the clarification of isochore structures will provide much insight into the understanding of the human genome.
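The idea behind a windowless G+C computation can be sketched as follows. The sketch assumes the cumulative quantity z_n = (A_n + T_n) - (G_n + C_n) and a detrended curve z'_n = z_n - k*n; here k is simply taken as the overall slope z_N / N, which is a simplifying assumption and may differ from the fit used in the cited paper. The function names are hypothetical. Under these definitions, the G+C fraction of any region follows from the average slope of z' across it, with no window length to choose.

```python
def z_prime_curve(seq):
    """Return (k, zp) where zp[n] = z_n - k*n and z_n = (A+T count) - (G+C count) in seq[:n].

    k is the overall slope z_N / N (an assumption made for this sketch).
    Bases other than A, C, G, T leave z unchanged and would bias the result.
    """
    z = 0
    curve = [0]
    for base in seq.upper():
        if base in "AT":
            z += 1
        elif base in "GC":
            z -= 1
        curve.append(z)
    n = len(seq)
    k = curve[-1] / n if n else 0.0
    return k, [zn - k * i for i, zn in enumerate(curve)]


def segment_gc(k, zp, start, end):
    """G+C fraction of seq[start:end], recovered from the average slope of z' over the region."""
    slope = (zp[end] - zp[start]) / (end - start)
    return 0.5 * (1.0 - k - slope)
```

On a plot of z' against position, a falling stretch corresponds to a region richer in G+C than the sequence average and a rising stretch to a poorer one, so sharp turning points are natural candidates for domain boundaries.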

5.
The human genome is a mosaic of isochores, which are long DNA segments (300 kbp) relatively homogeneous in G+C. Human isochores were first identified by density-gradient ultracentrifugation of bulk DNA, and differ in important features, e.g. genes are found predominantly in the GC-richest isochores. Here, we use a reliable segmentation method to partition the longest contigs in the human genome draft sequence into long homogeneous genome regions (LHGRs), thereby revealing the isochore structure of the human genome. The advantages of the isochore maps presented here are: (1) sequence heterogeneities at different scales are shown in the same plot; (2) pair-wise compositional differences between adjacent regions are all statistically significant; (3) isochore boundaries are accurately defined to single base pair resolution; and (4) both gradual and abrupt isochore boundaries are simultaneously revealed. Taking advantage of the wide sample of genome sequence analyzed, we investigate the correspondence between LHGRs and true human isochores revealed through DNA centrifugation. LHGRs show many of the typical isochore features, mainly size distribution, G+C range, and proportions of the isochore classes. The relative density of genes, Alu and long interspersed nuclear element repeats and the different types of single nucleotide polymorphisms on LHGRs also coincide with expectations in true isochores. Potential applications of isochore maps range from the improvement of gene-finding algorithms to the prediction of linkage disequilibrium levels in association studies between marker genes and complex traits. The coordinates for the LHGRs identified in all the contigs longer than 2 Mb in the human genome sequence are available at the online resource on isochore mapping: http://bioinfo2.ugr.es/isochores.

6.
The human genome is composed of large sequence segments with fairly homogeneous GC content, namely isochores, which have been linked to many important functions; biological implications of most isochore boundaries, however, remain elusive, partly due to the difficulty in determining these boundaries at high resolution. Using the segmentation algorithm based on the quadratic divergence, we re-determined all 79 boundaries of previously identified human isochores at single-nucleotide resolution, and then compared the boundary coordinates with other genome features. We found that 55.7% of isochore boundaries coincide with termini of repeat elements; 45.6% of isochore boundaries coincide with termini of highly conserved sequences based on alignment of 17 vertebrate genomes, i.e., the highly conserved genome sequence switches to a less conserved or non-conserved one at the isochore boundary; some isochore boundaries coincide with abrupt changes in CpG island distribution (note that one boundary can be associated with more than one genome feature). In addition, sequences around isochore boundaries are highly conserved. It seems reasonable to deduce that the boundaries of all the isochores studied here would be replication timing sites in the human genome. These results suggest possible key roles of the isochore boundaries and may further our understanding of the human genome organization.

7.
It has been suggested that the mammalian genome is composed mainly of long compositionally homogeneous domains. Such domains are frequently identified using recursive segmentation algorithms based on the Jensen–Shannon divergence. However, a common difficulty with such methods is deciding when to halt the recursive partitioning and what criteria to use in deciding whether a detected boundary between two segments is real or not. We demonstrate that commonly used halting criteria are intrinsically biased, and propose IsoPlotter, a parameter-free segmentation algorithm that overcomes such biases by using a simple dynamic halting criterion and tests the homogeneity of the inferred domains. IsoPlotter was compared with an alternative segmentation algorithm, DJS, using two sets of simulated genomic sequences. Our results show that IsoPlotter was able to infer both long and short compositionally homogeneous domains with low GC content dispersion, whereas DJS failed to identify short compositionally homogeneous domains and sequences with low compositional dispersion. By segmenting the human genome with IsoPlotter, we found that one-third of the genome is composed of compositionally nonhomogeneous domains and the remainder is a mixture of many short compositionally homogeneous domains and relatively few long ones.

8.
Since the G+C content of a gene is correlated with that of the isochore in which it resides, and early replicating isochores are thought to be relatively G+C rich, early replicating genes should also be rich in G+C. This hypothesis is tested on a sample of 44 mammalian genes for which replication time data and sequence information are available. Early replicating genes do not appear to be more G+C rich than late replicating genes; instead, there is considerable variation in the G+C content of genes replicated during both halves of S phase. These results show that both G+C-rich and G+C-poor fractions of the genome are replicated early and late in the cell cycle, and suggest that isochores are not maintained by the replication of DNA sequences in compositionally biased free nucleotide pools.

10.
The global, rather than local, variation in G+C content along the nuclear DNA sequences of various organisms was studied using GenBank sequence data. When long DNA sequences of the genomes of Escherichia coli and Saccharomyces cerevisiae were examined, the levels of their G+C content (G+C%) were found to be within a narrow range around that of the whole genome. The G+C% levels for sequences of vertebrate genomes, however, were found to cover a wide range, showing that their genome is a mosaic of sequences with different G+C% levels, in each of which the sequence is fairly homogeneous in its G+C% for a very long distance. Through surveying a human genetic map and GenBank DNA sequences, the global variations in G+C% along the human genome DNA were found to be correlated with chromosome band structures.

11.
The human genome is described in the literature as being composed of isochores, i.e., long (hundreds of kilobases) segments with a homogeneous (G+C) content. We calculated the (G+C) content variations along the DNA molecules of human chromosomes 21 and 22 and found the variations to be higher everywhere compared to the randomized sequences. Hence the (G+C) content is certainly not homogeneous on the isochore scale in the two human chromosomes. In addition, we found no significant difference between the two human molecules and the genome of E. coli regarding the (G+C) content variations. Hence either no isochores are present in the DNA molecules of human chromosomes 21 and 22, or isochores are also present in the genome of Escherichia coli. In any case, the present communication demonstrates that isochores should be defined in unambiguous molecular terms if they are to be used for an up-to-date genome structure characterization.
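A comparison against randomized sequences of the kind described above can be sketched as follows. The choice of statistic (standard deviation of window G+C fractions), the window size, the number of shuffles and the function names are all assumptions made for illustration; the abstract does not specify how the variation was measured.

```python
import random
import statistics


def gc_window_sd(seq, window):
    """Population standard deviation of G+C fractions over non-overlapping windows.

    Requires len(seq) >= window, otherwise statistics.pstdev raises StatisticsError.
    """
    values = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        values.append(sum(base in "GC" for base in chunk) / window)
    return statistics.pstdev(values)


def compare_to_shuffled(seq, window, n_shuffles=100, seed=0):
    """Return (observed_sd, mean_sd_of_shuffled) for the given window size.

    Shuffling keeps the overall base composition but destroys any long-range structure.
    """
    rng = random.Random(seed)
    seq = seq.upper()
    observed = gc_window_sd(seq, window)
    bases = list(seq)
    shuffled_sds = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        shuffled_sds.append(gc_window_sd("".join(bases), window))
    return observed, statistics.mean(shuffled_sds)
```

An observed value well above the shuffled mean indicates compositional heterogeneity at that window scale. As the abstract points out, this alone does not distinguish isochore-like structure from other sources of heterogeneity.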

12.
Li W. Gene, 2001, 276(1-2): 57-72
The concept of homogeneity of G+C content is always relative and subjective. This point is emphasized and quantified in this paper using a simple example of one sequence segmented into two subsequences. Whether the sequence is homogeneous or not can be answered by whether the two-subsequence model describes the DNA sequence better than the one-sequence model. There are at least three equivalent ways of looking at the 1-to-2 segmentation: Jensen-Shannon divergence measure, log likelihood ratio test, and model selection using Bayesian information criterion. Once a criterion is chosen, a DNA sequence can be recursively segmented into multiple domains. We use one subjective criterion called segmentation strength based on the Bayesian information criterion. Whether or not a sequence is homogeneous and how many domains it has depend on this criterion. We compare six different genome sequences (yeast S. cerevisiae chromosomes III and IV, bacterium M. pneumoniae, human major histocompatibility complex sequence, longest contigs in human chromosomes 21 and 22) by recursive segmentations at different strength criteria. Results by recursive segmentation confirm that yeast chromosome IV is more homogeneous than yeast chromosome III, human chromosome 21 is more homogeneous than human chromosome 22, and bacterial genomes may not be homogeneous due to short segments with distinct base compositions. The recursive segmentation also provides a quantitative criterion for identifying isochores in human sequences. Some features of our recursive segmentation, such as the possibility of delineating domain borders accurately, are superior to those of the moving-window approach commonly used in such analyses.
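The link between the Jensen-Shannon divergence, the likelihood ratio and the Bayesian information criterion can be made concrete with a small recursive sketch. It relies on several assumptions that are not taken from the abstract: a two-symbol alphabet (G/C versus A/T), a penalty of delta_k * ln(N) with delta_k = 2 extra parameters for the two-segment model, and a strength defined as the relative excess of 2*N*D_JS over that penalty. The function names are hypothetical.

```python
import math


def _entropy_nats(gc, n):
    """Entropy (natural log) of a GC-versus-AT composition with gc G/C bases out of n."""
    h = 0.0
    for count in (gc, n - gc):
        if count:
            h -= (count / n) * math.log(count / n)
    return h


def strongest_cut(flags):
    """Return (cut, divergence) maximising the Jensen-Shannon divergence, or (None, 0.0)."""
    n, total = len(flags), sum(flags)
    h_all = _entropy_nats(total, n)
    best, left = (None, 0.0), 0
    for i in range(1, n):
        left += flags[i - 1]
        d = (h_all
             - (i / n) * _entropy_nats(left, i)
             - ((n - i) / n) * _entropy_nats(total - left, n - i))
        if d > best[1]:
            best = (i, d)
    return best


def segment(seq, min_strength=0.0, delta_k=2):
    """Recursively split seq; return the sorted list of accepted border positions.

    2*N*D_JS (in nats) equals twice the log likelihood ratio of the two-segment model
    over the one-segment model, so comparing it with delta_k*ln(N) is a BIC-style test.
    delta_k = 2 is an assumption of this sketch.
    """
    flags = [1 if base in "GCgc" else 0 for base in seq]
    borders = []

    def recurse(lo, hi):
        n = hi - lo
        if n < 2:
            return
        cut, d = strongest_cut(flags[lo:hi])
        if cut is None:
            return
        penalty = delta_k * math.log(n)
        strength = (2 * n * d - penalty) / penalty
        if strength > min_strength:
            recurse(lo, lo + cut)
            borders.append(lo + cut)
            recurse(lo + cut, hi)

    recurse(0, len(flags))
    return borders
```

Raising `min_strength` yields fewer, longer domains, which mirrors the abstract's point that homogeneity is relative to the chosen criterion. The sketch recomputes every cut from scratch and is therefore only suitable for short toy sequences.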

13.
Bernardi G. Gene, 2000, 241(1): 3-17
The nuclear genomes of vertebrates are mosaics of isochores, very long stretches (>300 kb) of DNA that are homogeneous in base composition and are compositionally correlated with the coding sequences that they embed. Isochores can be partitioned into a small number of families that cover a range of GC levels (GC is the molar ratio of guanine+cytosine in DNA), which is narrow in cold-blooded vertebrates, but broad in warm-blooded vertebrates. This difference is essentially due to the fact that the GC-richest 10-15% of the genomes of the ancestors of mammals and birds underwent two independent compositional transitions characterized by strong increases in GC levels. The similarity of isochore patterns across mammalian orders, on the one hand, and across avian orders, on the other, indicates that these higher GC levels were then maintained, at least since the appearance of ancestors of warm-blooded vertebrates. After a brief review of our current knowledge on the organization of the vertebrate genome, evidence will be presented here in favor of the idea that the generation and maintenance of the GC-richest isochores in the genomes of warm-blooded vertebrates were due to natural selection.

14.
Combined with the Z curve method, the technique of wavelet multiresolution (also known as multiscale) analysis has been proposed to identify the boundaries of isochores in the human genome. The human MHC sequence and the longest contigs of human chromosomes 21 and 22 are used as examples. The boundary between the isochores of Class III and Class II in the MHC sequence has been detected and found to be situated at position 2,490,368 bp. This result is in good agreement with the experimental evidence. An isochore with a length of about 7 Mb in chromosome 21 has been identified and found to be gene- and Alu-poor. We have also found that the G+C content of chromosome 21 is more homogeneous than that of chromosome 22. Compared with the window-based methods, the present method has the highest resolution for identifying the boundaries of isochores, even at the scale of a single base. Compared with the entropic segmentation method, the present method has the merits of being more intuitive and requiring less computation. The important conclusion drawn in this study is that the segmentation points, at which the G+C content undergoes relatively dramatic changes, do exist in the human genome. These 'singularity' points may be considered candidates for isochore boundaries in the human genome. The method presented is a general one and can be used to analyze any other genome.

15.
A compositional map of human chromosome 21.   Total citations: 9, self-citations: 0, citations by others: 9
Gardiner K, Aissani B, Bernardi G. The EMBO Journal, 1990, 9(6): 1853-1858
GC-poor and GC-rich isochores, the long (greater than 300 kb) compositionally homogeneous DNA segments that form the genome of warm-blooded vertebrates, are located in G- and R-bands respectively of metaphase chromosomes. The precise correspondence between GC-rich isochores and R-band structure is still, however, an open problem, because GC-rich isochores are compositionally heterogeneous and only represent one-third of the genome, with the GC-richest family (which is by far the highest in gene concentration) corresponding to less than 5% of the genome. In order to clarify this issue and, more generally, to correlate DNA composition and chromosomal structure in an unequivocal way, we have developed a new approach, compositional mapping. This consists of assessing the base composition over 0.2-0.3 Mb (megabase) regions surrounding landmarks that were previously localized on the physical map. Compositional mapping was applied here to the long arm of human chromosome 21, using 53 probes that had already been used in physical mapping. The results obtained provide a direct demonstration that the DNA stretches of G-bands essentially correspond to GC-poor isochores, and that R-band DNA is characterized by a compositional heterogeneity that is much more striking than expected, in that it comprises isochores covering the full spectrum of GC levels. GC-poor isochores of R-bands may, however, correspond to 'thin' G-bands, as visualized at high resolution, leaving GC-rich and very GC-rich isochores as the real components of (high-resolution) R-band DNA. (ABSTRACT TRUNCATED AT 250 WORDS)

16.
17.
The mammalian genome is not a random sequence but shows a specific, evolutionarily conserved structure that becomes manifest in its isochore pattern. Isochores, i.e. stretches of DNA with a distinct sequence composition and thus a specific GC content, cause the chromosomal banding pattern. This fundamental level of genome organization is related to several functional features like the replication timing of a DNA sequence. GC richness of genomic regions generally corresponds to an early replication time during S phase. Recently, we demonstrated this interdependency on a molecular level for an abrupt transition from a GC-poor isochore to a GC-rich one in the NF1 gene region; this isochore boundary also separates late from early replicating chromatin. Here, we analyzed another genomic region containing four isochores separated by three sharp isochore transitions. Again, the GC-rich isochores were found to be replicating early, the GC-poor isochores late in S phase; one of the replication time zones was discovered to consist of one single replicon. At the boundaries between isochores, none of which shows special sequence elements, the replication machinery stopped for several hours. Thus, our results emphasize the importance of isochores as functional genomic units, and of isochore transitions as genomic landmarks with a key function for chromosome organization and basic biological properties.

18.
Bettecken T, Aissani B, Müller CR, Bernardi G. Gene, 1992, 122(2): 329-335
The genomes of warm-blooded vertebrates are mosaics of long DNA segments (>300 kb on average), the isochores, homogeneous in GC levels, which belong to a small number of compositional families. In the present work, the human dystrophin-encoding gene, spanning more than 2.3 Mb in Giemsa band Xp21 (on the short arm of the X chromosome), was analyzed in its isochore organization by hybridizing cDNA probes, corresponding to eight contiguous segments of the coding sequence, on compositional fractions from human DNA. Five DNA regions of uniform (±0.5%) GC content, separated by compositional discontinuities of about 2% GC, were found, thus providing the first high-resolution compositional map obtained for a human genome locus and the first direct estimate of isochore size (360 kb to more than 770 kb, in the locus under consideration). One of the isochores contains 71% and another one 21% of deletion breakpoints found in patients suffering from Duchenne's and Becker's muscular dystrophies.

19.
Arabidopsis thaliana is an important model system for the study of plant biology. We have analyzed the complete genome sequence of Arabidopsis by using a newly developed windowless method for the GC content computation, the cumulative GC profile. It is shown that the Arabidopsis genome is organized into a mosaic structure of isochores. All the centromeric regions are located in GC-rich isochores, called centromere-isochores, which are characterized by a high GC content but low gene and T-DNA insertion densities. This characteristic distinguishes centromere-isochores from the other class of GC-rich isochores, called GC-isochores, which have high gene and T-DNA insertion densities. Consequently, 15 isochores have been identified, i.e., 7 AT-isochores, 3 GC-isochores, and 5 centromere-isochores. The genes in centromere-isochores, which have the highest GC content, have much shorter intron lengths and lower intron numbers, compared to those of the other two types. There is also a considerable difference in the numbers and lengths of transposable elements (TEs) between AT- and GC-isochores, i.e., the TE number (length) of AT-isochores is 6.3 (7.3) times that of GC-isochores. It is generally believed that TEs are accumulated in the regions surrounding the centromeres. However, within these TE-rich regions, there are regions of extremely low TE numbers (TE deserts), which correspond to the positions of centromere-isochores. In addition, a heterochromatic knob is located at the boundary of an AT-isochore. Furthermore, we show that the differences in GC content among isochores are mainly due to the GC content variation of introns, third codon positions and intergenic regions. [Reviewing Editor: Martin Kreitman]

20.
The vertebrate genome: isochores and evolution   Total citations: 18, self-citations: 6, citations by others: 12