Deep co-clustering

Dongkuan Xu, Wei Cheng, Bo Zong, Jingchao Ni, Dongjin Song, Wenchao Yu, Yuncong Chen, Haifeng Chen, Xiang Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Co-clustering partitions instances and features simultaneously by leveraging the duality between them, and it often yields impressive performance improvements over traditional clustering algorithms. Recent developments in deep representation learning have demonstrated its advantage in extracting effective features. However, research on leveraging deep learning frameworks for co-clustering is limited for two reasons: 1) current deep clustering approaches usually decouple feature learning and cluster assignment into two separate steps, which cannot yield task-specific feature representations; 2) existing deep clustering approaches cannot learn representations for instances and features simultaneously. In this paper, we propose a deep learning model for co-clustering called DeepCC. DeepCC uses a deep autoencoder for dimensionality reduction and employs a variant of the Gaussian Mixture Model (GMM) to infer the cluster assignments. A mutual information loss is proposed to bridge the training of instances and features. DeepCC jointly optimizes the parameters of the deep autoencoder and the mixture model in an end-to-end fashion on both the instance and the feature spaces, which can help the deep autoencoder escape from local optima and allows the mixture model to circumvent the Expectation-Maximization (EM) algorithm. To the best of our knowledge, DeepCC is the first deep learning model for co-clustering. Experimental results on various datasets demonstrate the effectiveness of DeepCC.
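
The abstract does not give the form of the mutual information loss, so the following is only an illustrative sketch of the general idea, not the DeepCC objective. It assumes soft cluster assignments for instances (R) and for features (C), treats a nonnegative data matrix X as a joint distribution over (instance, feature) pairs, and computes the mutual information between the two cluster variables. The function name, the argument names, and the way the joint distribution is built are all assumptions made for illustration.

```python
import numpy as np


def cocluster_mutual_information(X, R, C, eps=1e-12):
    """Mutual information between instance clusters and feature clusters.

    Hypothetical helper, not taken from the paper.
    X: (n, m) nonnegative data matrix.
    R: (n, g) soft instance-cluster assignments, each row sums to 1.
    C: (m, k) soft feature-cluster assignments, each row sums to 1.
    """
    # Normalize X into a joint distribution over (instance, feature) pairs.
    P_xy = X / X.sum()
    # Aggregate the joint onto (instance cluster, feature cluster) pairs.
    P = R.T @ P_xy @ C
    # Marginals over instance clusters and feature clusters.
    p_r = P.sum(axis=1, keepdims=True)
    p_c = P.sum(axis=0, keepdims=True)
    # Sum of P * log(P / (p_r * p_c)); eps guards against log(0) and 0/0.
    ratio = P / (p_r * p_c + eps)
    return float(np.sum(P * np.log(ratio + eps)))


# Toy block-structured matrix: two instance groups, two feature groups.
X = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
aligned = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
uninformative = np.full((4, 2), 0.5)

mi_aligned = cocluster_mutual_information(X, aligned, aligned)
mi_uninformative = cocluster_mutual_information(X, uninformative, uninformative)
```

In this toy case the assignments that match the block structure give a mutual information close to log 2, while uniform assignments give a value close to zero. A training objective built on this quantity would typically use its negative as the loss, so that instance and feature assignments that agree with each other are rewarded.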

Original language: English (US)
Title of host publication: SIAM International Conference on Data Mining, SDM 2019
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 414-422
Number of pages: 9
ISBN (Electronic): 9781611975673
State: Published - Jan 1 2019
Event: 19th SIAM International Conference on Data Mining, SDM 2019 - Calgary, Canada
Duration: May 2 2019 to May 4 2019

Publication series

Name: SIAM International Conference on Data Mining, SDM 2019

Conference

Conference: 19th SIAM International Conference on Data Mining, SDM 2019
Country: Canada
City: Calgary
Period: 5/2/19 to 5/4/19

Fingerprint

Clustering algorithms
Deep learning

All Science Journal Classification (ASJC) codes

  • Software

Cite this

Xu, D., Cheng, W., Zong, B., Ni, J., Song, D., Yu, W., ... Zhang, X. (2019). Deep co-clustering. In SIAM International Conference on Data Mining, SDM 2019 (pp. 414-422). (SIAM International Conference on Data Mining, SDM 2019). Society for Industrial and Applied Mathematics Publications.
Xu, Dongkuan ; Cheng, Wei ; Zong, Bo ; Ni, Jingchao ; Song, Dongjin ; Yu, Wenchao ; Chen, Yuncong ; Chen, Haifeng ; Zhang, Xiang. / Deep co-clustering. SIAM International Conference on Data Mining, SDM 2019. Society for Industrial and Applied Mathematics Publications, 2019. pp. 414-422 (SIAM International Conference on Data Mining, SDM 2019).
@inproceedings{6162e149ba884c9a962ec0a6d7aff80d,
title = "Deep co-clustering",
author = "Dongkuan Xu and Wei Cheng and Bo Zong and Jingchao Ni and Dongjin Song and Wenchao Yu and Yuncong Chen and Haifeng Chen and Xiang Zhang",
year = "2019",
month = "1",
day = "1",
language = "English (US)",
series = "SIAM International Conference on Data Mining, SDM 2019",
publisher = "Society for Industrial and Applied Mathematics Publications",
pages = "414--422",
booktitle = "SIAM International Conference on Data Mining, SDM 2019",
address = "United States",
}

Xu, D, Cheng, W, Zong, B, Ni, J, Song, D, Yu, W, Chen, Y, Chen, H & Zhang, X 2019, Deep co-clustering. in SIAM International Conference on Data Mining, SDM 2019. SIAM International Conference on Data Mining, SDM 2019, Society for Industrial and Applied Mathematics Publications, pp. 414-422, 19th SIAM International Conference on Data Mining, SDM 2019, Calgary, Canada, 5/2/19.

Deep co-clustering. / Xu, Dongkuan; Cheng, Wei; Zong, Bo; Ni, Jingchao; Song, Dongjin; Yu, Wenchao; Chen, Yuncong; Chen, Haifeng; Zhang, Xiang.

SIAM International Conference on Data Mining, SDM 2019. Society for Industrial and Applied Mathematics Publications, 2019. p. 414-422 (SIAM International Conference on Data Mining, SDM 2019).

TY - GEN

T1 - Deep co-clustering

AU - Xu, Dongkuan

AU - Cheng, Wei

AU - Zong, Bo

AU - Ni, Jingchao

AU - Song, Dongjin

AU - Yu, Wenchao

AU - Chen, Yuncong

AU - Chen, Haifeng

AU - Zhang, Xiang

PY - 2019/1/1

Y1 - 2019/1/1

UR - http://www.scopus.com/inward/record.url?scp=85066110843&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85066110843&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85066110843

T3 - SIAM International Conference on Data Mining, SDM 2019

SP - 414

EP - 422

BT - SIAM International Conference on Data Mining, SDM 2019

PB - Society for Industrial and Applied Mathematics Publications

ER -

Xu D, Cheng W, Zong B, Ni J, Song D, Yu W et al. Deep co-clustering. In SIAM International Conference on Data Mining, SDM 2019. Society for Industrial and Applied Mathematics Publications. 2019. p. 414-422. (SIAM International Conference on Data Mining, SDM 2019).