Cuckoo feature hashing

Dynamic weight sharing for sparse analytics

Jinyang Gao, Beng Chin Ooi, Yanyan Shen, Wang-chien Lee

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Feature hashing is widely used to process large-scale sparse features when learning predictive models. Collisions inherently happen in the hashing process and hurt model performance. In this paper, we develop a new feature hashing scheme called Cuckoo Feature Hashing (CCFH), which treats feature hashing as a problem of dynamic weight sharing during model training. By leveraging a set of indicators to dynamically decide the weight of each feature based on alternative hash locations, CCFH effectively prevents collisions between features that are important to the model, i.e., predictive features, and thus avoids model performance degradation. Experimental results on prediction tasks with hundreds of millions of features demonstrate that CCFH can achieve the same level of performance using only 15%-25% of the parameters required by conventional feature hashing.
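The idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how indicators are stored, learned, or updated, so everything below is an assumption made for illustration. The names `bucket`, `HashedWeights`, `TwoChoiceHashedWeights`, and `move_to_alternative` are hypothetical, and the per-feature indicator dictionary is a simplification. The first class shows conventional feature hashing, where each feature has one hash location and colliding features share a weight. The second gives each feature two candidate locations, as in cuckoo hashing, and uses an indicator to select which one supplies the feature's weight.

```python
import hashlib


def bucket(feature: str, seed: int, num_buckets: int) -> int:
    """Deterministically hash a feature name into [0, num_buckets)."""
    digest = hashlib.sha256(f"{seed}:{feature}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little") % num_buckets


class HashedWeights:
    """Conventional feature hashing: one hash location per feature."""

    def __init__(self, num_buckets: int) -> None:
        self.w = [0.0] * num_buckets

    def slot(self, feature: str) -> int:
        return bucket(feature, 0, len(self.w))

    def score(self, features: dict) -> float:
        # Features that hash to the same slot share one weight.
        return sum(self.w[self.slot(name)] * value
                   for name, value in features.items())


class TwoChoiceHashedWeights(HashedWeights):
    """Two candidate locations per feature; an indicator picks one.

    Assumption: indicators live in a plain dict keyed by feature name.
    This is for readability only and is not taken from the paper.
    """

    def __init__(self, num_buckets: int) -> None:
        super().__init__(num_buckets)
        self.indicator = {}  # feature name -> 0 or 1

    def slot(self, feature: str) -> int:
        choice = self.indicator.get(feature, 0)
        return bucket(feature, choice, len(self.w))

    def move_to_alternative(self, feature: str) -> None:
        """Flip the indicator so the feature uses its other location."""
        self.indicator[feature] = 1 - self.indicator.get(feature, 0)
```

With this sketch, two features that collide at their first location can be separated by calling `move_to_alternative` on one of them, provided their second locations differ. How and when such a switch should happen during training is exactly what the paper addresses and is not modeled here.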

Original language: English (US)
Title of host publication: Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
Editors: Jerome Lang
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 2135-2141
Number of pages: 7
ISBN (Electronic): 9780999241127
State: Published - Jan 1 2018
Event: 27th International Joint Conference on Artificial Intelligence, IJCAI 2018 - Stockholm, Sweden
Duration: Jul 13 2018 to Jul 19 2018

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2018-July
ISSN (Print): 1045-0823

Other

Other: 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
Country: Sweden
City: Stockholm
Period: 7/13/18 to 7/19/18

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence

Cite this

Gao, J., Ooi, B. C., Shen, Y., & Lee, W. (2018). Cuckoo feature hashing: Dynamic weight sharing for sparse analytics. In J. Lang (Ed.), Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018 (pp. 2135-2141). (IJCAI International Joint Conference on Artificial Intelligence; Vol. 2018-July). International Joint Conferences on Artificial Intelligence.
Gao, Jinyang ; Ooi, Beng Chin ; Shen, Yanyan ; Lee, Wang-chien. / Cuckoo feature hashing : Dynamic weight sharing for sparse analytics. Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018. editor / Jerome Lang. International Joint Conferences on Artificial Intelligence, 2018. pp. 2135-2141 (IJCAI International Joint Conference on Artificial Intelligence).
@inproceedings{0a3a0e8b0c71432894c45141bca9df76,
title = "Cuckoo feature hashing: Dynamic weight sharing for sparse analytics",
abstract = "Feature hashing is widely used to process large scale sparse features for learning of predictive models. Collisions inherently happen in the hashing process and hurt the model performance. In this paper, we develop a new feature hashing scheme called Cuckoo Feature Hashing (CCFH), which treats feature hashing as a problem of dynamic weight sharing during model training. By leveraging a set of indicators to dynamically decide the weight of each feature based on alternative hash locations, CCFH effectively prevents the collisions between important features to the model, i.e. predictive features, and thus avoid model performance degradation. Experimental results on prediction tasks with hundred-millions of features demonstrate that CCFH can achieve the same level of performance by using only 15{\%}-25{\%} parameters compared with conventional feature hashing.",
author = "Jinyang Gao and Ooi, {Beng Chin} and Yanyan Shen and Wang-chien Lee",
year = "2018",
month = "1",
day = "1",
language = "English (US)",
series = "IJCAI International Joint Conference on Artificial Intelligence",
publisher = "International Joint Conferences on Artificial Intelligence",
pages = "2135--2141",
editor = "Jerome Lang",
booktitle = "Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018",

}

Gao, J, Ooi, BC, Shen, Y & Lee, W 2018, Cuckoo feature hashing: Dynamic weight sharing for sparse analytics. in J Lang (ed.), Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018. IJCAI International Joint Conference on Artificial Intelligence, vol. 2018-July, International Joint Conferences on Artificial Intelligence, pp. 2135-2141, 27th International Joint Conference on Artificial Intelligence, IJCAI 2018, Stockholm, Sweden, 7/13/18.

Cuckoo feature hashing : Dynamic weight sharing for sparse analytics. / Gao, Jinyang; Ooi, Beng Chin; Shen, Yanyan; Lee, Wang-chien.

Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018. ed. / Jerome Lang. International Joint Conferences on Artificial Intelligence, 2018. p. 2135-2141 (IJCAI International Joint Conference on Artificial Intelligence; Vol. 2018-July).

TY - GEN

T1 - Cuckoo feature hashing

T2 - Dynamic weight sharing for sparse analytics

AU - Gao, Jinyang

AU - Ooi, Beng Chin

AU - Shen, Yanyan

AU - Lee, Wang-chien

PY - 2018/1/1

Y1 - 2018/1/1

N2 - Feature hashing is widely used to process large scale sparse features for learning of predictive models. Collisions inherently happen in the hashing process and hurt the model performance. In this paper, we develop a new feature hashing scheme called Cuckoo Feature Hashing (CCFH), which treats feature hashing as a problem of dynamic weight sharing during model training. By leveraging a set of indicators to dynamically decide the weight of each feature based on alternative hash locations, CCFH effectively prevents the collisions between important features to the model, i.e. predictive features, and thus avoid model performance degradation. Experimental results on prediction tasks with hundred-millions of features demonstrate that CCFH can achieve the same level of performance by using only 15%-25% parameters compared with conventional feature hashing.

UR - http://www.scopus.com/inward/record.url?scp=85055715599&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85055715599&partnerID=8YFLogxK

M3 - Conference contribution

T3 - IJCAI International Joint Conference on Artificial Intelligence

SP - 2135

EP - 2141

BT - Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018

A2 - Lang, Jerome

PB - International Joint Conferences on Artificial Intelligence

ER -

Gao J, Ooi BC, Shen Y, Lee W. Cuckoo feature hashing: Dynamic weight sharing for sparse analytics. In Lang J, editor, Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018. International Joint Conferences on Artificial Intelligence. 2018. p. 2135-2141. (IJCAI International Joint Conference on Artificial Intelligence).