Similar articles
Found 20 similar articles.
1.
CpG islands in vertebrate genomes   Cited by: 120 (self-citations: 0; by others: 120)

2.
Summary: The extent to which CpG dinucleotides were depleted in a large set of angiosperm genes was, on average, very similar to the extent of CpG depletion in total angiosperm genomic DNA and far less than the extent of CpG depletion in vertebrate genes. Gene sequences from Arabidopsis thaliana, a dicotyledonous species with relatively low levels of total 5-methylcytosine, were just as CpG depleted as the angiosperm genes in general. Furthermore, levels of TpG and CpA, the potential deamination mutation products of methylated CpG, were elevated in A. thaliana genes, supporting a high rate of deamination mutation as the cause of the CpG deficiency. Using a method that takes into account the dinucleotide frequencies within each sequence of interest, we calculated the expected frequencies of CpNpG trinucleotides, which are also highly methylated in angiosperm genomes. CpNpG trinucleotides were not extensively enriched or depleted in the angiosperm genes. Two hypotheses could account for our results. Differential depletion of CpG and CpNpG within angiosperm genes and differential depletion of CpG in angiosperm and vertebrate genes could arise from different efficiencies of mismatch repair or from different levels of cytosine methylation in the cell lineages that contribute to germ cells. Offprint requests to: M. Gardiner-Garden
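The abstract above does not spell out how the expected CpNpG frequencies were computed. As a rough illustration only, the Python sketch below uses a first-order Markov estimate, in which the expected count of a trinucleotide XYZ is N(XY) × N(YZ) / N(Y). This estimator is an assumption, not necessarily the authors' method, and the function name `cng_observed_expected` is hypothetical.

```python
from collections import Counter


def cng_observed_expected(seq):
    """Return (observed, expected) CpNpG counts for one DNA sequence.

    Assumption: the expected count uses a first-order Markov estimate built
    from the sequence's own dinucleotide counts, N(CN) * N(NG) / N(N),
    summed over the middle base N. The abstract does not give the exact formula.
    """
    s = seq.upper()
    mono = Counter(s)
    di = Counter(s[i:i + 2] for i in range(len(s) - 1))
    tri = Counter(s[i:i + 3] for i in range(len(s) - 2))

    observed = 0
    expected = 0.0
    for n in "ACGT":
        observed += tri["C" + n + "G"]
        if mono[n]:  # skip middle bases that never occur
            expected += di["C" + n] * di[n + "G"] / mono[n]
    return observed, expected
```

An observed count close to the expected one would correspond to the abstract's finding that CpNpG is neither strongly enriched nor depleted.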

3.

Background

Mammalian CpG islands (CGIs) normally escape DNA methylation in all adult tissues and developmental stages. However, in our previous study we unexpectedly identified many methylated CGIs in human peripheral blood leukocytes. Methylated CpG dinucleotides convert to TpG dinucleotides through deamination of their cytosine bases more frequently than hypomethylated CpG dinucleotides. Therefore, we wondered how methylated CGIs in germline or non-germline cells maintain their CpG-rich sequences. It is known that events such as germline hypomethylation, CpG selection, biased gene conversion (BGC), and frequent CpG fixation can contribute to the maintenance of CpG-rich sequences in methylated CGIs in germline or non-germline cells. However, it has not been investigated which of these processes maintain the CpG-rich sequences of methylated CGIs at each genomic position.

Results

In this study, we comprehensively examined the contribution of the processes described above to the maintenance of CpG-rich sequences in methylated CGIs in germline and non-germline cells, which were classified by genomic position. Approximately 60–80% of CGIs with high methylation in the H1 cell line (H1-HM) in all the genomic positions showed a low average CpG → TpG/CpA substitution rate. In contrast, fewer than half of the CGIs with H1-HM in all the genomic positions showed both a low average CpG → TpG/CpA substitution rate and low levels of methylation in sperm cells (SPM-LM). Furthermore, a small fraction of CGIs with a low average CpG → TpG/CpA substitution rate and high levels of methylation in sperm cells (SPM-HM) showed CpG selection. On the other hand, independent of the positions in genes, most CGIs with SPM-HM showed a slightly higher average TpG/CpA → CpG substitution rate compared with those with SPM-LM.

Conclusions

Relatively high numbers (approximately 60–80%) of CGIs with H1-HM in all the genomic positions preserve their CpG-rich sequences by a low CpG → TpG/CpA substitution rate caused mainly by their SPM-LM, and for those with SPM-HM partly by CpG selection and TpG/CpA → CpG fixation. BGC has little contribution to the maintenance of CpG-rich sequences of CGIs with SPM-HM which were classified by genomic positions.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1286-x) contains supplementary material, which is available to authorized users.

4.
We screened plant genome sequences, primarily from rice and Arabidopsis thaliana, for CpG islands, and identified DNA segments rich in CpG dinucleotides within these sequences. These CpG-rich clusters appeared in the analysed sequences as discrete peaks and occurred at the frequencies of one per 4.7 kb in rice and one per 4.0 kb in A. thaliana. In rice and A. thaliana, most of the CpG-rich clusters were associated with genes, which suggests that these clusters are useful landmarks in genome sequences for identifying genes in plants with small genomes. In contrast, in plants with larger genomes, only a few of the clusters were associated with genes. These plant CpG-rich clusters satisfied the criteria used for identifying human CpG islands, which suggests that these CpG clusters may be regarded as plant CpG islands. The position of each island relative to the 5'-end of its associated gene varied considerably. Genes in the analysed sequences were grouped into five classes according to the position of the CpG islands within their associated genes. A large proportion of the genes belonged to one of two classes, in which a CpG island occurred near the 5'-end of the gene or covered the whole gene region. The position of a plant CpG island within its associated gene appeared to be related to the extent of tissue-specific expression of the gene; the CpG islands of most of the widely expressed rice genes occurred near the 5'-end of the genes.

5.
The DNA of most vertebrates is depleted in CpG dinucleotides, the target for DNA methylation. The remaining CpGs tend to cluster in regions referred to as CpG islands (CGI). CGI have been useful for marking functionally relevant epigenetic loci in genome studies. For example, CGI are enriched in the promoters of vertebrate genes and are thought to play an important role in regulation. Currently, CGI are defined algorithmically as regions with an observed-to-expected ratio (O/E) of CpG greater than 0.6, G+C content greater than 0.5, and usually but not necessarily greater than a certain length. Here we find that the current definition leaves out important CpG clusters associated with epigenetic marks, relevant to development and disease, and does not apply at all to nonvertebrate genomes. We propose an alternative Hidden Markov model-based approach that solves these problems. We fit our model to genomes from 30 species, and the results support a new epigenomic view toward the development of DNA methylation in species diversity and evolution. The O/E of CpG in islands and nonislands segregated closely phylogenetically and showed substantial loss in both groups in animals of greater complexity, while maintaining a nearly constant difference in CpG O/E between islands and nonisland compartments. Lists of CGI for some species are available at http://www.rafalab.org.
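The threshold-based definition quoted in this abstract (CpG O/E above 0.6, G+C content above 0.5, optional minimum length) can be sketched in Python as follows. This is a minimal illustration, not the authors' Hidden Markov model: the function names are hypothetical, the expected CpG count is assumed to be (number of C × number of G) / length, and the 200-base default for `min_len` is an assumption, since the abstract gives no specific length.

```python
def cpg_stats(seq):
    """Return (G+C fraction, CpG observed/expected ratio) for a DNA string."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed expectation: neighbouring bases are independent
    expected = c * g / n if n else 0.0
    oe = cpg / expected if expected else 0.0
    return gc_fraction, oe


def meets_cgi_thresholds(seq, min_oe=0.6, min_gc=0.5, min_len=200):
    """Apply the O/E and G+C thresholds; min_len=200 is an assumed default."""
    gc_fraction, oe = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and oe > min_oe
```

A fixed-threshold test like this is exactly the kind of rule the abstract argues is too rigid, which is the motivation it gives for a model-based alternative.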

6.
Summary: Sequence data from regions of five vertebrate vitellogenin genes were used to examine the frequency, distribution, and mutability of the dinucleotide CpG, the preferred modification site for eukaryotic DNA methyltransferases. The observed level of the CpG dinucleotide in all five genes was markedly lower than that expected from the known mononucleotide frequencies. CpG suppression was greater in introns than in exons. CpG-containing codons were found to be avoided in the vitellogenin genes, but not completely despite the redundancy of the genetic code. Frequency and distribution patterns of this dinucleotide varied dramatically among these otherwise closely related genes. Dense clusters of CpG dinucleotides tended to appear in regions of either functional or structural interest (e.g., in the transposon-like Vi-element of Xenopus) and these clusters contained 5-methylcytosine (5mC). 5mC is known to undergo deamination to form thymidine, but the extent to which this transition occurs in the heavily methylated genomes of vertebrates and its contribution to CpG suppression are still unclear. Sequence comparison of the methylated vitellogenin gene regions identified C→T and G→A substitutions that were found to occur at relatively high frequencies. The predicted products of CpG deamination, TpG and CpA, were elevated. These findings are consistent with the view that CpG distribution and methylation are interdependent and that deamination of 5mC plays an important role in promoting evolutionary change at the nucleotide sequence level.

7.
CpG islands in genes showing tissue-specific expression   Cited by: 2 (self-citations: 0; by others: 2)
Patterns of DNA methylation at CpG dinucleotides and their relations with gene expression are complex. Methylation-free CpG clusters, so-called HTF islands, are most often associated with the promoter regions of housekeeping genes, whereas genes expressed in a single cell type are usually deficient in these sequences. However, in the human carbonic anhydrase (CA) gene family, both the ubiquitously expressed CAII and the muscle-specific CAIII appear to have such CpG islands, although erythrocyte-specific CAI does not. The CAII island is quantitatively more CpG rich than that of CAIII, with a CpG:GpC ratio of 0.94 compared with 0.82 for CAIII. Estimation of CpG:GpC ratios in the proximal-promoter regions of 44 vertebrate genes suggests that 40% of genes with tissue-specific or limited tissue distribution may show methylation-free CpG clusters in their promoter regions. In many cases the CpG:GpC ratio is less than that found in housekeeping genes and this may reflect variation in the interaction of CpG clusters with regulatory factors that define different patterns of tissue expression.
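The CpG:GpC ratio used in this abstract compares CpG with the GpC dinucleotide, which has the same base composition, so a value near 1 suggests little CpG depletion. A minimal Python sketch is given below; the function name is hypothetical and the simple count-based definition is an assumption, since the abstract does not state how the ratio was computed.

```python
def cpg_gpc_ratio(seq):
    """Return the CpG:GpC count ratio, or None when the sequence has no GpC.

    Assumption: the ratio is a plain count ratio on one strand.
    """
    s = seq.upper()
    cpg = s.count("CG")
    gpc = s.count("GC")
    if gpc == 0:
        return None  # ratio undefined without any GpC
    return cpg / gpc
```

For example, `cpg_gpc_ratio("GCGCCG")` evaluates to 1.0, because the string contains two CpG and two GpC dinucleotides.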

8.
We report the isolation of the complete genes encoding nucleolin from rat and hamster. The DNA clones were obtained from partial genomic libraries by probing with a genomic DNA fragment containing the leader and promoter regions of the mouse nucleolin gene. We have determined the complete nucleotide sequence of the 5'-terminal region for the three rodent species. The sequenced regions extend over 1 kb downstream and upstream from the cap sites and include a conserved CpG island 1500 nucleotides (nt) long. The 5' end of the CpG island in each species has maintained a long alternating purine-pyrimidine sequence which could adopt a Z-DNA conformation. By sequence comparison, 42 blocks of homology are defined in the 5'-terminal region, of which 36 appear in the CpG island and contain numerous conserved CpG dinucleotides. Two blocks, 110 and 49 nt long, encompassing the cap sites and the region immediately upstream, respectively, present features characteristic of regulated genes: a possible TATA box (ATTA), two pyrimidine-rich nucleotide stretches and two inverted juxtaposed CCAAT-like boxes (GGTTGG). Furthermore, the adjacent upstream conserved region presents features characteristic of housekeeping genes: four G/C boxes, embedded in a high G+C-content sequence, among them one presenting a perfect consensus Sp1-binding site (GCCCCGCCCC). Among unusual features, we report numerous large G+C-rich conserved sequences located in the first intron. One of these sequences contains two G/C boxes which border a sequence presenting a dyad symmetry (GCGCACGTGCTC). Our findings shed some light on the putative role of the CpG island. We show that CpG-rich sequence motifs are under strong selective pressure over the whole 5'-terminal region and are presumably involved in regulatory mechanisms.

9.
In vertebrate genomes the dinucleotide CpG is heavily methylated, except in CpG islands, which are normally unmethylated. It is not clear why the CpG islands are such poor substrates for DNA methyltransferase. Plant genomes display methylation, but otherwise the genomes of plants and animals represent two very divergent evolutionary lines. To gain a further understanding of the resistance of CpG islands to methylation, we introduced a human CpG island from the proteasome-like subunit I gene into the genome of the plant Arabidopsis thaliana. Our results show that prevention of methylation is an intrinsic property of CpG islands, recognized even if a human CpG island is transferred to a plant genome. Two different parts of the human CpG island (the promoter region/first exon and exons 2–4) both displayed resistance against methylation, but the promoter/exon 1 construct seemed to be most resistant. In contrast, certain sites in a plant CpG-rich region used as a control transgene were always methylated. The frequency of silencing of the adjacent nptII (KmR) gene in the human CpG constructs was lower than observed for the plant CpG-rich region. These results have implications for understanding DNA methylation, and for construction of vectors that will reduce transgene silencing.

10.
CpG islands: features and distribution in the genomes of vertebrates   Cited by: 4 (self-citations: 0; by others: 4)
B Aïssani, G Bernardi. Gene, 1991, 106(2): 173-183
We have investigated the distribution of unmethylated CpG islands in vertebrate genomes fractionated according to their base composition. Genomes from warm-blooded vertebrates (man, mouse and chicken) are characterized by abundant CpG islands, whose frequency increases in DNA fractions of increasing % of guanine + cytosine; % G + C (GC), in parallel with the distribution of genes and CpG doublets. Small, yet significant, differences in the distribution of CpG islands were found in the three genomes. In contrast, genomes from cold-blooded vertebrates (two reptiles, one amphibian, and two fishes) were characterized by an extreme scarcity or absence of CpG islands (detected in these experiments as HpaII tiny fragments or HTF). CpG islands associated with homologous genes from cold- and warm-blooded vertebrates were then compared by analyzing CpG frequencies, GC levels, HpaII sites, rare-cutter sites and G/C boxes (GGGGCGGGGC and closely related motifs) in sequences available in gene banks. Small, yet significant, differences were again detected among the CpG islands associated with homologous genes from warm-blooded vertebrates, in that CpG islands associated with mouse or rat genes often showed low CpG and/or GC levels, as well as low numbers of HpaII sites, rare-cutter sites and G/C boxes, compared to homologous human genes; more rarely, CpG islands were just absent. As far as cold-blooded vertebrates were concerned, a number of genes showed CpG islands, which exhibited a much lower frequency of CpG doublets than that found in CpG islands of warm-blooded vertebrates, but still approached the statistically expected frequency; none of the other features of CpG islands associated with genes from warm-blooded vertebrates were present. Other genes did not show any associated CpG islands, unlike their homologues from warm-blooded vertebrates.

11.
CpG islands of the X chromosome are gene associated.   Cited by: 6 (self-citations: 0; by others: 6)
Unmethylated CpG rich islands are a feature of vertebrate DNA: they are associated with housekeeping and many tissue specific genes. CpG islands on the active X chromosome of mammals are also unmethylated. However, islands on the inactive X chromosome are heavily methylated. We have identified a CpG island in the 5' region of the G6PD gene, and two islands forty Kb 3' from the G6PD gene, on the human X chromosome. Expression of the G6PD gene is associated with concordant demethylation of all three CpG islands. We have shown that one of the two islands is in the promoter region of a housekeeping gene, GdX. In this paper we show that the second CpG island is also associated with a gene, P3. The P3 gene has no homology to previously described genes. It is a single copy, 4 kb gene, conserved in evolution, and it has the features of a housekeeping two genes is within the CpG island and that sequences in the islands have promoter function.

12.
Initiation of DNA replication at CpG islands in mammalian chromosomes.   Cited by: 19 (self-citations: 2; by others: 17)
S Delgado, M Gómez, A Bird, F Antequera. The EMBO Journal, 1998, 17(8): 2426-2435

13.
A cDNA library was constructed from mRNA prepared from light-treated seedlings of Scots pine (Pinus sylvestris L.) and cDNAs for the chlorophyll a/b-binding protein LHC-II were identified using a pea gene as the heterologous probe. Three cDNA clones were sequenced. The deduced amino acid sequences of two of the genes corresponded to Type I and one to Type II LHC-II proteins which were ca. 90% homologous to their angiosperm counterparts. The transit peptides of the Scots pine preLHC-II showed features common to angiosperm transit peptides. The three cDNAs had a 70 to 75% preference for G+C in the third base position. CpG and GpC profiles and degenerate codon position bias suggested that two of the corresponding genes lie within CpG islands.

14.
CpG islands, genes and isochores in the genomes of vertebrates   Cited by: 6 (self-citations: 0; by others: 6)
B Aïssani, G Bernardi. Gene, 1991, 106(2): 185-195
We have shown that human genes associated with CpG islands increase in number as they increase in % of guanine + cytosine (GC) levels, and that most genes associated with CpG islands are located in the GC-richest compartment of the human genome. This is an independent confirmation of the concentration gradient of CpG islands (detected as HpaII tiny fragments, or HTF) which was demonstrated in the genome of warm-blooded vertebrates [Aïssani and Bernardi, Gene 106 (1991) 173-183]. We then reassessed the location of CpG islands using the data currently available and confirmed that CpG islands are most frequently located in the 5'-flanking sequences of genes and that they overlap genes to variable extents. We have shown that such extents increase with the increasing GC levels of genes, the GC-richest genes being completely included in CpG islands. Under such circumstances, we have investigated the properties of the 'extragenic' CpG islands located in the 5'-flanking segments of homologous genes from both warm- and cold-blooded vertebrates. We have confirmed that, in cold-blooded vertebrates, CpG islands are often absent; when present, they have lower GC and CpG levels; the latter attain, however, statistically expected values. Finally, we have shown that CpG doublets increase with the increasing GC of exons, introns and intergenic sequences (including 'extragenic' CpG islands) in the genomes from both warm- and cold-blooded vertebrates. The correlations found are the same for both classes of vertebrates, and are similar for exons, introns and intergenic sequences (including 'extragenic' CpG islands). The findings just outlined indicate that the origin and evolution of CpG islands in the vertebrate genome are associated with compositional transitions (GC increases) in genes and isochores.

15.
The frequencies of neighboring base pairs in more than 1100 genes of vertebrates in the EMBL bank (1000 kb) have been analysed. It has been found that the majority of such genes exhibit a lack of CpG duplexes and an excess of TpG+CpA. The loss of CpG may indicate that the major part of these sites in the genome is methylated and has been subjected to the pressure of CpG → TpG+CpA mutations. The methylated genes grouped into compartment M+ are represented by a fraction of repeated sequences and by genes of the most rapidly diverging families of proteins (globins, immunoglobulins, structural proteins, etc.). The genes of this compartment are characterized by a correlation between the G+C content and the value of CpG-suppression. A group of genes has been detected in which the CpG mutation process has gone so far that nearly all of these dinucleotides have disappeared from DNA. Judging by the value of CpG-suppression, these genes, grouped in the Mo+ compartment, used to be strongly methylated before. However, in the now extant vertebrates they have fully depleted their CpG reserve and for this reason lost the methylation capacity. Transitions in methylated CpG may be one of the sources of spontaneous mutagenesis resulting in the enhanced genetic instability of the cell. A gene compartment has been detected with an intermediate level of CpG deficiency; this compartment has been designated as M+/-. In these genes only a few of the available CpGs have been steadily methylated (and subjected to mutation). It has been found that the genome of vertebrates contains a specific CpG-rich fraction which exhibits no CpG-suppression, irrespective of the overall content of G+C. Probably, CpG sites have persisted unmethylated throughout the existence of these genes. We suggest that they constitute an M- compartment. This compartment comprises the genes of tRNA and rRNA (5S, 5.8S, 18S, 28S) and small nuclear RNAs U2-U6, as well as the genes of core histones, some enzymes, viruses and 5'-flanking sequences of certain protein-coding genes. In the genome of vertebrates, the genes of the evolutionarily most conserved proteins and RNAs have not undergone methylation. A list of genes belonging to different compartments of the vertebrate genome is given. Compartment Mo+ constitutes 19%, M+ 35%, M+/- 28% and M- 8% of all the vertebrate genes studied. Possible mechanisms protecting the functionally most significant genes of vertebrates from methylation are discussed.

16.
CpG islands as gene markers in the human genome.   Cited by: 65 (self-citations: 0; by others: 65)
F Larsen, G Gundersen, R Lopez, H Prydz. Genomics, 1992, 13(4): 1095-1107

17.
18.
CpG and TpA dinucleotides are underrepresented in the human genome. The CpG deficiency is due to the high mutation rate from C to T in methylated CpGs. The TpA suppression was thought to reflect a counterselection against TpA's destabilizing effect in RNA. Unexpectedly, the TpA and CpG deficiencies vary according to the G+C contents of sequences. It has been proposed that the variation in CpG suppression was correlated with a particular chromatin organization in G+C-rich isochores. Here, we present an improved model of dinucleotide evolution accounting for the overlap between successive dinucleotides. We show that an increased mutation rate from CpG to TpG or CpA induces both an apparent TpA deficiency and a correlation between CpG and TpA deficiencies and G+C content. Moreover, this model shows that the ratio of observed over expected CpG frequency underestimates the real CpG deficiency in G+C-rich sequences. The predictions of our model fit well with observed frequencies in human genomic data. This study suggests that previously published selectionist interpretations of patterns of dinucleotide frequencies should be taken with caution. Moreover, we propose new criteria to identify unmethylated CpG islands taking into account this bias in the measure of CpG depletion.

19.
20.
Vertebrate genomes are characterized by CpG deficiency, particularly in GC-poor regions. The GC content-related CpG deficiency is probably caused by context-dependent deamination of methylated CpG sites. This hypothesis was examined in this study by comparing nucleotide frequencies at CpG flanking positions among invertebrate and vertebrate genomes. The main finding is a transition in nucleotide preference from 5' T to 5' A at the invertebrate-vertebrate boundary, indicating that a large number of CpG sites with 5' Ts were depleted because of the global DNA methylation that developed in vertebrates. At the genome level, we investigated CpG observed/expected (obs/exp) values in 500 bp fragments, and found that higher CpG obs/exp values occur in GC-poor regions of invertebrate genomes (except sea urchin) but in GC-rich sequences of vertebrate genomes. We next compared GC content at CpG flanking positions with the genomic average, showing that the GC content is lower than the average in invertebrate genomes, but higher than the average in vertebrate genomes. These results indicate that although 5' T and 5' A differ in inducing deamination of methylated CpG sites, GC content is even more important in affecting the deamination rate. In all the tests, the results for sea urchin are similar to those for vertebrates, perhaps due to its fractional DNA methylation. CpG deficiency is therefore suggested to be mainly a result of high mutation rates of methylated CpG sites in GC-poor regions.
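The fragment-level analysis described above (CpG obs/exp values in 500 bp fragments, compared against GC content) can be sketched in Python as follows. This is a rough illustration under stated assumptions: the fragments are taken as non-overlapping, the expected CpG count is assumed to be (number of C × number of G) / fragment length, and the function name `window_cpg_profile` is hypothetical. The abstract does not specify any of these details.

```python
def window_cpg_profile(seq, size=500):
    """Return a list of (start, gc_fraction, cpg_obs_exp) per fragment.

    Assumptions: fragments are non-overlapping, a trailing partial fragment
    is dropped, and CpGs that straddle a fragment boundary are not counted.
    cpg_obs_exp is None when the fragment has no C or no G.
    """
    s = seq.upper()
    rows = []
    for start in range(0, len(s) - size + 1, size):
        w = s[start:start + size]
        c = w.count("C")
        g = w.count("G")
        gc_fraction = (c + g) / size
        obs_exp = (w.count("CG") * size) / (c * g) if c and g else None
        rows.append((start, gc_fraction, obs_exp))
    return rows
```

Sorting or binning the returned rows by `gc_fraction` would then show whether higher obs/exp values fall in GC-poor or GC-rich fragments, which is the comparison the abstract reports.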
