Principal component analysis- and tensor decomposition-based unsupervised feature extraction to select more suitable differentially methylated cytosines: Optimization of standard deviation versus state-of-the-art methods |
| |
Affiliation: | 1. Department of Physics, Chuo University, 1-13-27, Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan; 2. Department of Computer Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia |
| |
Abstract: | Whereas RNA-seq analysis has several standard methods, no standard method exists for identifying differentially methylated cytosines (DMCs). To identify DMCs, we tested principal component analysis- and tensor decomposition-based unsupervised feature extraction with optimized standard deviation, which has been shown to be effective for identifying differentially expressed genes (DEGs). The proposed method outperformed certain conventional methods, including those that assume a beta-binomial distribution for methylation, an assumption that the proposed method does not require. Its advantage was especially pronounced when applied to methylation profiles measured using high-throughput sequencing. DMCs identified by the proposed method also overlapped significantly with various functional sites, including known differentially methylated regions, enhancers, and DNase I hypersensitive sites. The proposed method was also applied to data sets retrieved from The Cancer Genome Atlas to identify DMCs using American Joint Committee on Cancer staging system edition labels. These results suggest that the proposed method is a promising standard method for identifying DMCs. |
| |
Keywords: | Tensor decomposition; Unsupervised learning; DNA methylation; Methylation profiles; Applications in epigenetics |
This article is indexed in ScienceDirect and other databases. |
|