Linking the epigenome to the genome: correlation of different features to DNA methylation of CpG islands |
| |
Authors: | Clemens Wrzodek, Finja Büchel, Georg Hinselmann, Johannes Eichner, Florian Mittag, Andreas Zell |
| |
Affiliation: | Center for Bioinformatics Tübingen, ZBIT, University of Tübingen, Tübingen, Germany. clemens.wrzodek@uni-tuebingen.de |
| |
Abstract: | DNA methylation of CpG islands plays a crucial role in the regulation of gene expression. More than half of all human promoters contain CpG islands with a tissue-specific methylation pattern in differentiated cells. Even today, it is not fully understood how DNA methyltransferases determine which regions should be methylated. Many hypotheses about which genomic features are correlated with the epigenome have not yet been evaluated. Furthermore, many explorative approaches to measuring DNA methylation are limited to a subset of the genome and thus cannot be employed, e.g., for genome-wide biomarker prediction methods. In this study, we evaluated the correlation of genetic, epigenetic and hypothesis-driven features with DNA methylation of CpG islands. To this end, various binary classifiers were trained and evaluated by cross-validation on a dataset comprising DNA methylation data for 190 CpG islands in HEPG2, HEK293, fibroblasts and leukocytes. We achieved an accuracy of up to 91% with a Matthews correlation coefficient (MCC) of 0.8 using ten-fold cross-validation with ten repetitions. With these models, we extended the existing dataset to the whole genome and thus predicted the methylation landscape for the given cell types. The method used for these predictions was also validated on an additional, external whole-genome dataset. Our results reveal features correlated with DNA methylation and confirm or disprove various hypotheses about DNA methylation-related features. This study confirms correlations between DNA methylation and histone modifications, DNA structure, DNA sequence, genomic attributes and CpG island properties. Furthermore, the method has been validated on a genome-wide dataset from the ENCODE consortium. The developed software, as well as the predicted datasets and a web service to compare methylation states of CpG islands, are available at http://www.cogsys.cs.uni-tuebingen.de/software/dna-methylation/. |
| |
Keywords: | |
This article is indexed in PubMed and other databases. |
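
The evaluation protocol named in the abstract (ten-fold cross-validation with ten repetitions, scored by accuracy and MCC) can be illustrated with a minimal sketch. This is not the authors' software: the function names, the plain shuffled (non-stratified) fold assignment, and the confusion-matrix counts are all assumptions made for illustration only.

```python
import math
import random


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal sum is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def repeated_kfold_indices(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) tuples.

    Hypothetical helper: each repetition reshuffles the sample order and
    deals the samples into k folds; the paper's exact splitting scheme
    is not stated in the abstract.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for f in range(k):
            test = order[f::k]  # every k-th sample, starting at offset f
            test_set = set(test)
            train = [i for i in order if i not in test_set]
            yield r, f, train, test


if __name__ == "__main__":
    # Invented counts, NOT taken from the study.
    print(accuracy(45, 45, 5, 5), mcc(45, 45, 5, 5))
    # 190 samples mirrors the number of CpG islands in the abstract.
    print(sum(1 for _ in repeated_kfold_indices(190)))
```

With 190 samples, each of the 100 splits holds out 19 samples and trains on the remaining 171. MCC is reported alongside accuracy because it also accounts for the balance between the two classes, which accuracy alone can hide.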
|