Correlated multi-label feature selection

Quanquan Gu, Zhenhui Li, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

47 Citations (Scopus)

Abstract

Multi-label learning studies the problem where each instance is associated with a set of labels. There are two challenges in multi-label learning: (1) the labels are interdependent and correlated, and (2) the data are high-dimensional. In this paper, we aim to tackle both challenges at once. In particular, we propose to learn the label correlation and perform feature selection simultaneously. We introduce a matrix-variate Normal prior distribution on the weight vectors of the classifier to model the label correlation. Our goal is to find a subset of features on which the label-correlation-regularized label ranking loss is minimized. The resulting multi-label feature selection problem is a mixed integer program, which is reformulated as a quadratically constrained linear program (QCLP). It can be solved by a cutting plane algorithm, in each iteration of which a minimax optimization problem is solved by alternating between dual coordinate descent and projected sub-gradient descent. Experiments on benchmark data sets show that the proposed methods outperform single-label feature selection methods and many other state-of-the-art multi-label learning methods.
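
The abstract gives no formulas, so the following is only a rough sketch of the kind of objective it describes, not the authors' formulation. It assumes a weight matrix `W` (features x labels), a label covariance `omega`, a binary feature mask, and a pairwise hinge loss as the label ranking loss. The penalty `tr(W omega^{-1} W^T)` is the standard negative log-density, up to constants, of a matrix-variate Normal prior with identity row covariance. All function names and the trade-off parameter `lam` are hypothetical, and the sketch only evaluates the objective; it does not implement the QCLP reformulation or the cutting plane solver.

```python
import numpy as np

def correlation_penalty(W, omega):
    """tr(W @ inv(omega) @ W.T) for W of shape (features, labels).

    Assumed form: negative log of a matrix-variate Normal prior on W with
    identity row covariance and label covariance omega, up to constants.
    """
    # Solve omega @ Z = W.T rather than forming the inverse explicitly.
    return float(np.trace(W @ np.linalg.solve(omega, W.T)))

def ranking_hinge_loss(scores, Y):
    """Average pairwise hinge loss (an assumed choice of ranking loss).

    For each instance, every relevant label (Y == 1) should outscore every
    irrelevant label (Y == 0) by a margin of at least 1.
    """
    total, pairs = 0.0, 0
    for s, y in zip(scores, Y):
        pos, neg = s[y == 1], s[y == 0]
        if pos.size == 0 or neg.size == 0:
            continue  # no relevant/irrelevant pair to rank for this instance
        margins = 1.0 - (pos[:, None] - neg[None, :])
        total += np.maximum(margins, 0.0).sum()
        pairs += margins.size
    return total / pairs if pairs else 0.0

def objective(X, Y, W, omega, mask, lam=1.0):
    """Ranking loss on the selected features plus the correlation penalty."""
    Xs = X * mask  # zero out the columns of unselected features
    return ranking_hinge_loss(Xs @ W, Y) + lam * correlation_penalty(W, omega)
```

With `omega` set to the identity, the penalty reduces to the squared Frobenius norm of `W`, which corresponds to treating the labels as uncorrelated. A feature selection search would compare `objective` across candidate masks.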

Original language: English (US)
Title of host publication: CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management
Pages: 1087-1096
Number of pages: 10
DOI: https://doi.org/10.1145/2063576.2063734
State: Published - Dec 13 2011
Event: 20th ACM Conference on Information and Knowledge Management, CIKM'11 - Glasgow, United Kingdom
Duration: Oct 24 2011 to Oct 28 2011

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: 20th ACM Conference on Information and Knowledge Management, CIKM'11
Country: United Kingdom
City: Glasgow
Period: 10/24/11 to 10/28/11

Fingerprint

Feature selection
Minimax
Dimensionality
Ranking
Optimization problem
Experiment
Mixed integer programming
Benchmark
Gradient
Cutting planes
Classifier
Linear programming
Learning methods

All Science Journal Classification (ASJC) codes

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Cite this

Gu, Q., Li, Z., & Han, J. (2011). Correlated multi-label feature selection. In CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management (pp. 1087-1096). (International Conference on Information and Knowledge Management, Proceedings). https://doi.org/10.1145/2063576.2063734
@inproceedings{a9f2cf05472b4ea38fa73daf1efd931a,
title = "Correlated multi-label feature selection",
author = "Quanquan Gu and Zhenhui Li and Jiawei Han",
year = "2011",
month = "12",
day = "13",
doi = "10.1145/2063576.2063734",
language = "English (US)",
isbn = "9781450307178",
series = "International Conference on Information and Knowledge Management, Proceedings",
pages = "1087--1096",
booktitle = "CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management",

}

TY - GEN
T1 - Correlated multi-label feature selection
AU - Gu, Quanquan
AU - Li, Zhenhui
AU - Han, Jiawei
PY - 2011/12/13
Y1 - 2011/12/13
UR - http://www.scopus.com/inward/record.url?scp=83055191234&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=83055191234&partnerID=8YFLogxK
U2 - 10.1145/2063576.2063734
DO - 10.1145/2063576.2063734
M3 - Conference contribution
SN - 9781450307178
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 1087
EP - 1096
BT - CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management
ER -