
1.

Independent component analysis based gene co-expression network inference (ICAnet) to decipher functional modules for better single-cell clustering and batch integration

Weixu Wang Huanhuan Tan Mingwan Sun Yiqing Han Wei Chen Shengnu Qiu Ke Zheng Gang Wei Ting Ni 《Nucleic acids research》2021,49(9):e54

With the tremendous increase in publicly available single-cell RNA-sequencing (scRNA-seq) datasets, bioinformatics methods based on gene co-expression networks are becoming efficient tools for analyzing scRNA-seq data, improving cell type prediction accuracy and in turn facilitating biological discovery. However, current methods are mainly based on overall co-expression correlation and overlook co-expression that exists in only a subset of cells; they therefore fail to discover certain rare cell types and are sensitive to batch effects. Here, we developed independent component analysis-based gene co-expression network inference (ICAnet), which decomposes scRNA-seq data into a series of independent gene expression components and infers co-expression modules, improving cell clustering and rare cell-type discovery. ICAnet showed efficient performance for cell clustering and batch integration using scRNA-seq datasets spanning multiple cells/tissues/donors/library types. It works stably on datasets produced by different library construction strategies and with different sequencing depths and cell numbers. We demonstrated the capability of ICAnet to discover rare cell types in multiple independent scRNA-seq datasets from different sources. Importantly, the identified modules activated in acute myeloid leukemia scRNA-seq datasets have the potential to serve as new diagnostic markers. Thus, ICAnet is a competitive tool for cell clustering and biological interpretation of single-cell RNA-seq data.

2.

SDImpute: A statistical block imputation method based on cell-level and gene-level information for dropouts in single-cell RNA-seq data

Jing Qi Yang Zhou Zicen Zhao Shuilin Jin 《PLoS computational biology》2021,17(6)

Single-cell RNA sequencing (scRNA-seq) technologies measure gene expression at single-cell resolution and provide a tool for exploring cell heterogeneity and cell types. Because only a small number of mRNA copies is extracted per cell, scRNA-seq data exhibit a large number of dropouts, which hinders downstream analysis. We propose a statistical method, SDImpute (Single-cell RNA-seq Dropout Imputation), to implement block imputation for dropout events in scRNA-seq data. SDImpute automatically identifies dropout events based on gene expression levels and the variation of gene expression across similar cells and similar genes, and it implements block imputation for dropouts by utilizing gene expression unaffected by dropouts from similar cells. Results on simulated and real datasets suggest that SDImpute is an effective tool to recover the data and preserve the heterogeneity of gene expression across cells. Compared with state-of-the-art imputation methods, SDImpute improves the accuracy of downstream analyses, including clustering, visualization, and differential expression analysis.

3.

Clustering approaches to identifying gene expression patterns from DNA microarray data

Do JH Choi DK 《Molecules and cells》2008,25(2):279-288


4.

Approximate distance correlation for selecting highly interrelated genes across datasets

Qunlun Shen Shihua Zhang 《PLoS computational biology》2021,17(11)

With the rapid accumulation of biological omics datasets, decoding the underlying relationships of cross-dataset genes becomes an important issue. Previous studies have attempted to identify differentially expressed genes across datasets. However, it is hard for them to detect interrelated ones. Moreover, existing correlation-based algorithms can only measure the relationship between genes within a single dataset or two multi-modal datasets from the same samples. It is still unclear how to quantify the strength of association of the same gene across two biological datasets with different samples. To this end, we propose Approximate Distance Correlation (ADC) to select interrelated genes with statistical significance across two different biological datasets. ADC first obtains the k most correlated genes for each target gene as its approximate observations, and then calculates the distance correlation (DC) for the target gene across two datasets. ADC repeats this process for all genes and then performs the Benjamini-Hochberg adjustment to control the false discovery rate. We demonstrate the effectiveness of ADC with simulation data and four real applications to select highly interrelated genes across two datasets. These four applications include 21 cancer RNA-seq datasets of different tissues; six single-cell RNA-seq (scRNA-seq) datasets of mouse hematopoietic cells across six different cell types along the hematopoietic cell lineage; five scRNA-seq datasets of pancreatic islet cells across five different technologies; and coupled single-cell ATAC-seq (scATAC-seq) and scRNA-seq data of peripheral blood mononuclear cells (PBMC). Extensive results demonstrate that ADC is a powerful tool to uncover interrelated genes with strong biological implications and is scalable to large-scale datasets. Moreover, the number of such genes can serve as a metric to measure the similarity between two datasets, which could characterize the relative difference of diverse cell types and technologies.
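The two generic statistical building blocks named in this abstract, distance correlation and the Benjamini-Hochberg adjustment, can be sketched in plain Python. This is a minimal illustration of the textbook definitions only; it does not reproduce ADC's step of choosing the k most correlated genes as approximate observations, and the function names are hypothetical rather than taken from the authors' software.

```python
import math


def _centered_distances(values):
    """Double-centered matrix of pairwise absolute distances for a 1-D sample."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length 1-D samples."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(v * v for row in a for v in row) / n**2
    dvar_y = sum(v * v for row in b for v in row) / n**2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # a constant sample carries no dependence information
    return math.sqrt(max(dcov2, 0.0) / denom)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In a workflow of the kind described, one p-value per gene would be passed to `benjamini_hochberg`, and genes whose adjusted value falls below the chosen false discovery rate would be reported. How ADC derives those p-values is not specified in the abstract.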

5.

DeBi: Discovering Differentially Expressed Biclusters using a Frequent Itemset Approach

Serin A Vingron M 《Algorithms for molecular biology : AMB》2011,6(1):18-12


6.

Comprehensive analysis of forty yeast microarray datasets reveals a novel subset of genes (APha-RiB) consistently negatively associated with ribosome biogenesis

Basel Abu-Jamous Rui Fa David J Roberts Asoke K Nandi 《BMC bioinformatics》2014,15(1)


7.

SSRE: Cell Type Detection Based on Sparse Subspace Representation and Similarity Enhancement

Zhenlan Liang Min Li Ruiqing Zheng Yu Tian Xuhua Yan Jin Chen Fang-Xiang Wu Jianxin Wang 《Genomics, Proteomics & Bioinformatics》2021,19(2):282-291

Accurate identification of cell types from single-cell RNA sequencing (scRNA-seq) data plays a critical role in a variety of scRNA-seq analysis studies. This task corresponds to solving an unsupervised clustering problem, in which the similarity measurement between cells affects the result significantly. Although many approaches for cell type identification have been proposed, the accuracy still needs to be improved. In this study, we proposed a novel single-cell clustering framework based on similarity learning, called SSRE. SSRE models the relationships between cells based on a subspace assumption and generates a sparse representation of the cell-to-cell similarity. The sparse representation retains the most similar neighbors for each cell. In addition, three classical pairwise similarities are incorporated with a gene selection and enhancement strategy to further improve the effectiveness of SSRE. Tested on ten real scRNA-seq datasets and five simulated datasets, SSRE achieved superior performance in most cases compared to several state-of-the-art single-cell clustering methods. In addition, SSRE can be extended to visualization of scRNA-seq data and identification of differentially expressed genes. The MATLAB and Python implementations of SSRE are available at https://github.com/CSUBioGroup/SSRE.

8.

Methods for predicting single-cell miRNA in breast cancer

《Genomics》2022,114(3):110353

It has been demonstrated that miRNAs are involved in many biological processes, including cell proliferation and differentiation, apoptosis, and stress responses. Although single-cell RNA sequencing technology is now prevalent, quantifying miRNA at the single-cell level remains challenging. Herein, we present computational methods to infer single-cell miRNA expression levels from target gene abundances. First, we developed an enrichment-based approach for estimating miRNA expression that considers miRNA-mRNA regulation information and the miRNA-mRNA correlation signal captured from existing TCGA datasets. Further efforts were made to infer miRNA expression with machine learning models. The methods were compared for accuracy and robustness on simulated single-cell data. Finally, we applied the method to single-cell RNA-seq data from triple-negative breast cancer (TNBC) patients to discover miRNA markers at the single-cell level for the malignant cells. Our tool is available online at: https://github.com/ChengkuiZhao/Single-cell-miRNA-prediction.

9.

SAIC: an iterative clustering approach for analysis of single cell RNA-seq data

Lu Yang Jiancheng Liu Qiang Lu Arthur D. Riggs Xiwei Wu 《BMC genomics》2017,18(6):689


10.

CellDepot: A Unified Repository for scRNA-seq Data and Visual Exploration

《Journal of molecular biology》2022,434(11):167425

CellDepot, which contains over 270 datasets from 8 species and many tissues, is an integrated web application that enables scientists to explore single-cell RNA-seq (scRNA-seq) datasets and compare datasets across studies through a user-friendly interface with advanced visualization and analytical capabilities. First, it provides an efficient data management system through which users can upload single-cell datasets and query the database by multiple attributes such as species and cell type. In addition, the graphical multi-logic, multi-condition query builder and convenient filtering tool, backed by a MySQL database system, allow users to quickly find datasets of interest and compare the expression of gene(s) across them. Moreover, by embedding the cellxgene VIP tool, CellDepot enables fast, interactive, and scalable exploration of individual datasets to gain more refined insights such as cell composition, gene expression profiles, and differentially expressed genes among cell types by leveraging more than 20 frequently applied plotting functions and high-level analysis methods in single-cell research. In summary, the web portal, available at http://celldepot.bxgenomics.com, promotes large-scale single-cell data sharing, facilitates meta-analysis and visualization, and encourages scientists to contribute to the single-cell community in a tractable and collaborative way. Finally, CellDepot is released as open-source software under the MIT license to motivate crowd contribution, broad adoption, and local deployment for private datasets.

11.

Gene expression data analysis using multiobjective clustering improved with SVM based ensemble

Mukhopadhyay A Maulik U Bandyopadhyay S 《In silico biology》2011,11(1-2):19-27

Microarray technology facilitates the monitoring of the expression levels of thousands of genes over different experimental conditions simultaneously. Clustering is a popular data mining tool that can be applied to microarray gene expression data to identify co-expressed genes. Most traditional clustering methods optimize a single clustering goodness criterion and thus may not perform well on all kinds of datasets. Motivated by this, in this article, a multiobjective clustering technique that optimizes cluster compactness and separation simultaneously has been improved through a novel support vector machine classification based cluster ensemble method. The superiority of MOCSVMEN (MultiObjective Clustering with Support Vector Machine based ENsemble) has been established by comparing its performance with that of several well-known existing microarray data clustering algorithms. Two real-life benchmark gene expression datasets have been used for testing the comparative performance of the different algorithms. A recently developed metric, called Biological Homogeneity Index (BHI), which computes the clustering goodness with respect to functional annotation, has been used for the comparison.
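As a rough illustration of how a homogeneity score "with respect to functional annotation" can be computed, the sketch below follows one commonly cited form of the Biological Homogeneity Index: for each cluster, the fraction of annotated gene pairs that share at least one functional class, averaged over clusters. The exact definition used in the article is not given in the abstract, so the formula, the handling of clusters with fewer than two annotated genes, and all names here are assumptions.

```python
from itertools import combinations


def biological_homogeneity_index(clusters, annotations):
    """Average, over clusters, of the share of annotated gene pairs that
    have at least one functional class in common.

    clusters    -- list of lists of gene identifiers
    annotations -- dict mapping gene identifier -> set of functional classes
    """
    scores = []
    for genes in clusters:
        # Only genes with at least one annotation take part in the score.
        annotated = [g for g in genes if annotations.get(g)]
        n = len(annotated)
        if n < 2:
            # Assumption: a cluster with no comparable pair contributes 0.
            scores.append(0.0)
            continue
        shared = sum(1 for a, b in combinations(annotated, 2)
                     if annotations[a] & annotations[b])
        scores.append(shared / (n * (n - 1) / 2))
    return sum(scores) / len(scores)


# Hypothetical toy data: gene names and classes are made up for illustration.
toy_annotations = {
    "g1": {"ribosome"},
    "g2": {"ribosome", "cell cycle"},
    "g3": {"cell cycle"},
    "g4": {"dna repair"},
    "g5": {"dna repair"},
}
toy_clusters = [["g1", "g2", "g3"], ["g4", "g5"]]
toy_bhi = biological_homogeneity_index(toy_clusters, toy_annotations)
```

A value near 1 means that genes grouped together tend to share annotations. Values from different clusterings are only comparable when the same annotation source is used.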

12.

Functional clustering and lineage markers: Insights into cellular differentiation and gene function from large-scale microarray studies of purified primary cell populations

David A. Hume Kim M. Summers Sobia Raza J. Kenneth Baillie Thomas C. Freeman 《Genomics》2010,95(6):328-338


13.

Identification of cancer subtypes from single-cell RNA-seq data using a consensus clustering method

Yanglan Gan Ning Li Guobing Zou Yongchang Xin Jihong Guan 《BMC medical genomics》2018,11(6):117

Background

Human cancers are complex ecosystems composed of cells with distinct molecular signatures. Such intratumoral heterogeneity poses a major challenge to cancer diagnosis and treatment. Recent advancements of single-cell techniques such as scRNA-seq have brought unprecedented insights into cellular heterogeneity. Subsequently, a challenging computational problem is to cluster high dimensional noisy datasets with substantially fewer cells than the number of genes.

Methods

In this paper, we introduced a consensus clustering framework, conCluster, for cancer subtype identification from single-cell RNA-seq data. Using an ensemble strategy, conCluster fuses multiple basic partitions into consensus clusters.

Results

Applied to real cancer scRNA-seq datasets, conCluster can more accurately detect cancer subtypes than the widely used scRNA-seq clustering methods. Further, we conducted co-expression network analysis for the identified melanoma subtypes.

Conclusions

Our analysis demonstrates that these subtypes exhibit distinct gene co-expression networks and significant gene sets with different functional enrichment.
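The general idea of fusing several basic partitions into consensus clusters can be sketched with a co-association matrix: count how often each pair of cells lands in the same cluster, then group cells that co-cluster in most partitions. This is a generic ensemble-clustering illustration, not conCluster's actual fusion procedure, which the abstract does not describe. The threshold and the connected-components grouping are assumptions.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of cells shares a cluster."""
    n = len(partitions[0])
    m = len(partitions)
    return [[sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
            for i in range(n)]


def consensus_clusters(partitions, threshold=0.5):
    """Link cells whose co-association exceeds `threshold` and return the
    connected components as consensus cluster labels."""
    matrix = co_association(partitions)
    n = len(matrix)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] > threshold:
                parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three hypothetical basic partitions of five cells; label values are
# arbitrary within each partition.
basic_partitions = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
consensus = consensus_clusters(basic_partitions)
```

Because the grouping uses connected components, two cells can end up together through a chain of strongly linked neighbours even if they rarely co-cluster directly. A stricter threshold or a hierarchical cut on the co-association matrix would behave differently.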


14.

A high-resolution cell atlas of the domestic pig lung and an online platform for exploring lung single-cell data

《Journal of Genetics and Genomics》2021,48(5):411-425


15.

Cell function and identity revealed by comparative scRNA-seq analysis in human nasal, bronchial and epididymis epithelia

《European journal of cell biology》2022,101(3):151231


16.

A metric for evaluating biological information in gene sets and its application to identify co-expressed gene clusters in PBMC

Jason Bennett Mikhail Pomaznoy Akul Singhania Bjoern Peters 《PLoS computational biology》2021,17(10)


17.

Revealing allele-specific gene expression by single-cell transcriptomics

《The international journal of biochemistry & cell biology》2017

Single-cell sequencing has emerged as a revolutionary method that reveals biological processes with unprecedented resolution and scale, and has already greatly impacted biology and medicine. To investigate processes such as alternative splicing, novel exon detection and allele-specific expression (ASE), full-length based single-cell RNA-seq methods are required for broad sequence coverage and single nucleotide polymorphism (SNP) identification. In this review, we revisit recent achievements from studies that used single-cell RNA-seq to advance our understanding of ASE in the context of both autosomal and X-chromosome genes. We also recapitulate useful bioinformatic tools developed to identify haplotype phase.

18.

Paradigm of Tunable Clustering Using Binarization of Consensus Partition Matrices (Bi-CoPaM) for Gene Discovery

Basel Abu-Jamous Rui Fa David J. Roberts Asoke K. Nandi 《PloS one》2013,8(2)

Clustering analysis has a growing role in the study of co-expressed genes for gene discovery. Conventional binary and fuzzy clustering do not embrace the biological reality that some genes may be irrelevant for a problem and not be assigned to a cluster, while other genes may participate in several biological functions and should simultaneously belong to multiple clusters. Also, these algorithms cannot generate tight clusters that focus on their cores or wide clusters that overlap and contain all possibly relevant genes. In this paper, a new clustering paradigm is proposed. In this paradigm, all three eventualities of a gene being exclusively assigned to a single cluster, being assigned to multiple clusters, and being not assigned to any cluster are possible. These possibilities are realised through the primary novelty of the introduction of tunable binarization techniques. Results from multiple clustering experiments are aggregated to generate one fuzzy consensus partition matrix (CoPaM), which is then binarized to obtain the final binary partitions. This is referred to as Binarization of Consensus Partition Matrices (Bi-CoPaM). The method has been tested with a set of synthetic datasets and a set of five real yeast cell-cycle datasets. The results demonstrate its validity in generating relevant tight, wide, and complementary clusters that can meet requirements of different gene discovery studies.
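The aggregate-then-binarize idea in this abstract can be illustrated with a small sketch: several hard clusterings are averaged into a fuzzy consensus partition matrix, which is then binarized with a tunable threshold. The sketch assumes that cluster labels are already aligned across runs and uses a simple value threshold as the binarization rule. The paper's own relabelling step and its specific binarization techniques are not described in the abstract, and the names below are hypothetical.

```python
def consensus_partition_matrix(partitions, n_clusters):
    """Fuzzy consensus matrix (clusters x genes): entry [c][g] is the
    fraction of clustering runs that placed gene g in cluster c.
    Assumes cluster labels mean the same thing in every run."""
    n_genes = len(partitions[0])
    counts = [[0] * n_genes for _ in range(n_clusters)]
    for labels in partitions:
        for gene, cluster in enumerate(labels):
            counts[cluster][gene] += 1
    runs = len(partitions)
    return [[c / runs for c in row] for row in counts]


def binarize(copam, threshold):
    """Assign a gene to every cluster whose consensus membership is at
    least `threshold`. High thresholds give tight clusters and may leave
    genes unassigned; low thresholds give wide, overlapping clusters."""
    return [[1 if value >= threshold else 0 for value in row]
            for row in copam]


# Three hypothetical runs clustering four genes into two clusters.
runs = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]
copam = consensus_partition_matrix(runs, n_clusters=2)
tight = binarize(copam, threshold=1.0)   # genes 1 and 2 stay unassigned
wide = binarize(copam, threshold=0.3)    # genes 1 and 2 join both clusters
```

The single threshold parameter shows how one consensus matrix can yield all three outcomes the abstract lists: exclusive assignment, multiple assignment, and no assignment.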

19.

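Quantifying cell differentiation state based on gene interaction network entropy

Tianhao Guan Jie Gao 《Chinese Journal of Biotechnology》2022,38(2):820-830

Studies of dynamic cellular processes show that cells change state during these processes, and that the changes are controlled mainly by gene expression within the cell. With the development of high-throughput sequencing, large volumes of gene expression data can capture the true gene expression of cells at the single-cell level. However, most existing methods require information other than gene expression, which brings additional complexity and uncertainty. In addition, pervasive "missing value" events further interfere with cellular dynamic...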

20.

Fuzzy J-Means and VNS methods for clustering genes from microarray data (total citations: 4; self-citations: 0; citations by others: 4)

Belacel N Cuperlović-Culf M Laflamme M Ouellette R 《Bioinformatics (Oxford, England)》2004,20(11):1690-1701

MOTIVATION: In the interpretation of gene expression data from a group of microarray experiments that include samples from either different patients or conditions, special consideration must be given to the pleiotropic and epistatic roles of genes, as observed in the variation of gene coexpression patterns. Crisp clustering methods assign each gene to one cluster, thereby omitting information about the multiple roles of genes. RESULTS: Here, we present the application of a local search heuristic, Fuzzy J-Means, embedded into the variable neighborhood search metaheuristic for the clustering of microarray gene expression data. We show that for all the datasets studied this algorithm outperforms the standard Fuzzy C-Means heuristic. Different methods for the utilization of cluster membership information in determining gene coregulation are presented. The clustering and data analyses were performed on simulated datasets as well as experimental cDNA microarray data for breast cancer and human blood from the Stanford Microarray Database. AVAILABILITY: The source code of the clustering software (C programming language) is freely available from Nabil.Belacel@nrc-cnrc.gc.ca
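To make the contrast between crisp and fuzzy assignment concrete, the sketch below computes the membership degrees used by the standard Fuzzy C-Means baseline mentioned in the abstract: each gene receives a weight for every cluster, based on its relative distance to the centroids. It covers only the textbook membership formula with Euclidean distance. It does not implement the Fuzzy J-Means local search or the variable neighborhood search, and the function name and default fuzzifier are assumptions.

```python
import math


def fcm_memberships(point, centroids, fuzzifier=2.0):
    """Fuzzy C-Means membership of one point in each cluster:
    u_i = 1 / sum_j (d_i / d_j) ** (2 / (fuzzifier - 1)),
    where d_i is the Euclidean distance from the point to centroid i."""
    if fuzzifier <= 1.0:
        raise ValueError("fuzzifier must be greater than 1")
    distances = [math.dist(point, c) for c in centroids]
    # A point lying exactly on one or more centroids belongs fully to them.
    on_centroid = [i for i, d in enumerate(distances) if d == 0.0]
    if on_centroid:
        share = 1.0 / len(on_centroid)
        return [share if i in on_centroid else 0.0
                for i in range(len(centroids))]
    exponent = 2.0 / (fuzzifier - 1.0)
    inverse = [d ** (-exponent) for d in distances]
    total = sum(inverse)
    return [value / total for value in inverse]


# Hypothetical two-dimensional expression profile and two centroids.
memberships = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

A crisp method would assign this gene only to the nearer centroid, whereas the fuzzy memberships keep a nonzero weight for the farther cluster, which is the information about multiple gene roles that the abstract says crisp clustering omits.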