首页 | 本学科首页   官方微博 | 高级检索  
     


SIDEseq: A Cell Similarity Measure Defined by Shared Identified Differentially Expressed Genes for Single-Cell RNA sequencing Data
Authors:Courtney Schiffman  Christina Lin  Funan Shi  Luonan Chen  Lydia Sohn  Haiyan Huang
Affiliation:1.Department of Biostatistics,UC Berkeley,Berkeley,USA;2.Department of Chemical Biology and Department of Molecular and Cellular Biology,UC Berkeley,Berkeley,USA;3.Department of Statistics,UC Berkeley,Berkeley,USA;4.Key Laboratory of Systems Biology, Innovation Center for Cell Signaling Network, Institute of Biochemistry and Cell Biology,Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences,Shanghai,China;5.Department of Mechanical Engineering,UC Berkeley,Berkeley,USA
Abstract:One goal of single-cell RNA sequencing (scRNA seq) is to expose possible heterogeneity within cell populations due to meaningful, biological variation. Examining cell-to-cell heterogeneity, and further, identifying subpopulations of cells based on scRNA seq data has been of common interest in life science research. A key component to successfully identifying cell subpopulations (or clustering cells) is the (dis)similarity measure used to group the cells. In this paper, we introduce a novel measure, named SIDEseq, to assess cell-to-cell similarity using scRNA seq data. SIDEseq first identifies a list of putative differentially expressed (DE) genes for each pair of cells. SIDEseq then integrates the information from all the DE gene lists (corresponding to all pairs of cells) to build a similarity measure between two cells. SIDEseq can be implemented in any clustering algorithm that requires a (dis)similarity matrix. This new measure incorporates information from all cells when evaluating the similarity between any two cells, a characteristic not commonly found in existing (dis)similarity measures. This property is advantageous for two reasons: (a) borrowing information from cells of different subpopulations allows for the investigation of pairwise cell relationships from a global perspective and (b) information from other cells of the same subpopulation could help to ensure a robust relationship assessment. We applied SIDEseq to a newly generated human ovarian cancer scRNA seq dataset, a public human embryo scRNA seq dataset, and several simulated datasets. The clustering results suggest that the SIDEseq measure is capable of uncovering important relationships between cells, and outperforms or at least does as well as several popular (dis)similarity measures when used on these datasets.
Keywords:
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号