首页 | 本学科首页   官方微博 | 高级检索  
     


scLM: Automatic Detection of Consensus Gene Clusters Across Multiple Single-cell Datasets
Authors:Qianqian Song  Jing Su  Lance D.Miller  Wei Zhang
Affiliation:1. Center for Cancer Genomics and Precision Oncology, Wake Forest Baptist Comprehensive Cancer Center, Wake Forest Baptist Medical Center, Winston Salem, NC 27157, USA;2. Department of Cancer Biology, Wake Forest School of Medicine, Winston Salem, NC 27157, USA;3. Department of Biostatistics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
Abstract:In gene expression profiling studies, including single-cell RNA sequencing(sc RNA-seq)analyses, the identification and characterization of co-expressed genes provides critical information on cell identity and function. Gene co-expression clustering in sc RNA-seq data presents certain challenges. We show that commonly used methods for single-cell data are not capable of identifying co-expressed genes accurately, and produce results that substantially limit biological expectations of co-expressed genes. Herein, we present single-cell Latent-variable Model(sc LM), a gene coclustering algorithm tailored to single-cell data that performs well at detecting gene clusters with significant biologic context. Importantly, sc LM can simultaneously cluster multiple single-cell datasets, i.e., consensus clustering, enabling users to leverage single-cell data from multiple sources for novel comparative analysis. sc LM takes raw count data as input and preserves biological variation without being influenced by batch effects from multiple datasets. Results from both simulation data and experimental data demonstrate that sc LM outperforms the existing methods with considerably improved accuracy. To illustrate the biological insights of sc LM, we apply it to our in-house and public experimental sc RNA-seq datasets. sc LM identifies novel functional gene modules and refines cell states, which facilitates mechanism discovery and understanding of complex biosystems such as cancers. A user-friendly R package with all the key features of the sc LM method is available at https://github.com/QSong-github/sc LM.
Keywords:Single-cell RNA sequencing  Consensus clustering  Latent space  Markov Chain Monte Carlo  Maximum likelihood approach
本文献已被 CNKI 万方数据 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号