Clustering single-cell RNA-seq data with regularized Gaussian graphical model

Research output: Contribution to journal › Article › peer-review

Abstract

Single-cell RNA-seq (scRNA-seq) is a powerful tool for measuring the expression patterns of individual cells and discovering heterogeneity and functional diversity among cell populations. Because of the variability in such data, analyzing them efficiently is challenging. Many existing clustering methods rely on at least one free parameter, and different choices for these parameters may lead to substantially different visualizations and clusters. Tuning free parameters is also time consuming. Thus there is a need for a simple, robust, and efficient clustering method. In this paper, we propose a new regularized Gaussian graphical clustering (RGGC) method for scRNA-seq data. RGGC is based on high-order (partial) correlations and subspace learning, and is robust over a wide range of the regularization parameter λ. Therefore, we can simply set λ = 2 or λ = log(p) for AIC (Akaike information criterion) or BIC (Bayesian information criterion), respectively, without cross-validation. Cell subpopulations are discovered by the Louvain community detection algorithm, which determines the number of clusters automatically. There is no free parameter to be tuned with RGGC. When evaluated on simulated and benchmark scRNA-seq data sets against widely used methods, RGGC is computationally efficient and one of the top performers. When applied to glioblastoma scRNA-seq data, it can detect inter-sample cell heterogeneity.
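
The abstract's reliance on partial rather than ordinary correlations can be illustrated with a small sketch. It is not the RGGC algorithm: it applies no regularization, no subspace learning, and no Louvain step, and the three-variable covariance matrix and the helper names `invert` and `partial_correlations` are assumptions made for illustration only. It uses the standard identity that the partial correlation between variables i and j, given all the others, equals -Θ_ij / sqrt(Θ_ii Θ_jj), where Θ is the inverse covariance (precision) matrix.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]


def partial_correlations(cov):
    """Partial correlation of each pair of variables given all the others."""
    theta = invert(cov)  # precision matrix
    n = len(theta)
    return [
        [
            1.0 if i == j
            else -theta[i][j] / math.sqrt(theta[i][i] * theta[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical chain-like covariance: variable 0 relates to variable 2 only
# through variable 1, so cov[i][j] = 0.5 ** abs(i - j).
cov = [
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 1.00],
]
pcor = partial_correlations(cov)
```

In this toy matrix the ordinary correlation between variables 0 and 2 is 0.25, yet their partial correlation is zero up to rounding, because the association is fully explained by variable 1. A graphical model keeps only such direct links, which is the kind of graph a community detection step would then partition.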

Original language: English (US)
Article number: 311
Pages (from-to): 1-12
Number of pages: 12
Journal: Genes
Volume: 12
Issue number: 2
State: Published - Feb 2021

All Science Journal Classification (ASJC) codes

  • Genetics
  • Genetics (clinical)
