Cache-aware approximate computing for decision tree learning

Orhan Kislal, Mahmut Kandemir, Jagadish Kotra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

The memory performance of data mining applications has become crucial due to increasing dataset sizes and multi-level cache hierarchies. Decision tree learning is one of the most important algorithms in this field, and numerous researchers have worked on improving the accuracy of the model tree as well as enhancing the overall performance of the learning process. Most modern applications that employ decision tree learning favor creating multiple models for higher accuracy by sacrificing performance. In this work, we exploit the flexibility inherent in decision tree learning based applications regarding performance and accuracy tradeoffs, and propose a framework to improve performance with negligible accuracy losses. This framework employs a data access skipping module (DASM), using which costly cache accesses are skipped according to the aggressiveness of the strategy specified by the user, and a heuristic to predict skipped data accesses to keep accuracy losses at a minimum. Our experimental evaluation shows that the proposed framework offers significant performance improvements (up to 25%) with relatively much smaller losses in accuracy (up to 8%) over the original case. We demonstrate that our framework is scalable under various accuracy requirements via exploring accuracy changes over time and replacement policies. In addition, we explore NoC/SNUCA systems for similar opportunities of memory performance improvement.
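The core idea the abstract describes, skipping costly data accesses at a user-controlled aggressiveness level and predicting the skipped values with a heuristic, can be illustrated with a minimal sketch. This is not the paper's DASM implementation; the function name, the probabilistic skip policy, and the running-mean predictor are all illustrative assumptions standing in for the actual cache-miss-driven skipping and prediction heuristic.

```python
import random

def scan_with_skipping(values, aggressiveness=0.25, rng=None):
    """Illustrative sketch: with probability `aggressiveness`, skip
    reading an element (standing in for a costly cache access) and
    substitute a predicted value -- here, the running mean of the
    elements actually read. aggressiveness=0.0 reads everything."""
    rng = rng or random.Random(0)
    total, count = 0.0, 0   # running sum/count of real accesses
    out = []
    for v in values:
        if count and rng.random() < aggressiveness:
            out.append(total / count)   # skipped: use predicted value
        else:
            out.append(v)               # real access
            total += v
            count += 1
    return out
```

Higher aggressiveness trades more prediction error (accuracy loss) for fewer real accesses (performance gain), mirroring the up-to-25% speedup versus up-to-8% accuracy loss tradeoff reported above.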

Original language: English (US)
Title of host publication: Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1413-1422
Number of pages: 10
ISBN (Electronic): 9781509021406
DOIs: 10.1109/IPDPSW.2016.116
State: Published - Jul 18 2016
Event: 30th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2016 - Chicago, United States
Duration: May 23 2016 - May 27 2016

Publication series

Name: Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016

Other

Other: 30th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2016
Country: United States
City: Chicago
Period: 5/23/16 - 5/27/16

Fingerprint

  • Decision trees
  • Data storage equipment
  • Data mining

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications

Cite this

Kislal, O., Kandemir, M., & Kotra, J. (2016). Cache-aware approximate computing for decision tree learning. In Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016 (pp. 1413-1422). [7530032] (Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IPDPSW.2016.116
Kislal, Orhan ; Kandemir, Mahmut ; Kotra, Jagadish. / Cache-aware approximate computing for decision tree learning. Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016. Institute of Electrical and Electronics Engineers Inc., 2016. pp. 1413-1422 (Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016).
@inproceedings{22c25fd632984b9890cd44d94b10029e,
title = "Cache-aware approximate computing for decision tree learning",
abstract = "The memory performance of data mining applications became crucial due to increasing dataset sizes and multi-level cache hierarchies. Decision tree learning is one of the most important algorithms in this field, and numerous researchers worked on improving the accuracy of model tree as well as enhancing the overall performance of the learning process. Most modern applications that employ decision tree learning favor creating multiple models for higher accuracy by sacrificing performance. In this work, we exploit the flexibility inherent in decision tree learning based applications regarding performance and accuracy tradeoffs, and propose a framework to improve performance with negligible accuracy losses. This framework employs a data access skipping module (DASM) using which costly cache accesses are skipped according to the aggressiveness of the strategy specified by the user and a heuristic to predict skipped data accesses to keep accuracy losses at minimum. Our experimental evaluation shows that the proposed framework offers significant performance improvements (up to 25{\%}) with relatively much smaller losses in accuracy (up to 8{\%}) over the original case. We demonstrate that our framework is scalable under various accuracy requirements via exploring accuracy changes over time and replacement policies. In addition, we explore NoC/SNUCA systems for similar opportunities of memory performance improvement.",
author = "Orhan Kislal and Mahmut Kandemir and Jagadish Kotra",
year = "2016",
month = "7",
day = "18",
doi = "10.1109/IPDPSW.2016.116",
language = "English (US)",
series = "Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "1413--1422",
booktitle = "Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016",
address = "United States",
}

Kislal, O, Kandemir, M & Kotra, J 2016, Cache-aware approximate computing for decision tree learning. in Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016., 7530032, Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016, Institute of Electrical and Electronics Engineers Inc., pp. 1413-1422, 30th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2016, Chicago, United States, 5/23/16. https://doi.org/10.1109/IPDPSW.2016.116

Cache-aware approximate computing for decision tree learning. / Kislal, Orhan; Kandemir, Mahmut; Kotra, Jagadish.

Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016. Institute of Electrical and Electronics Engineers Inc., 2016. p. 1413-1422 7530032 (Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016).


TY - GEN

T1 - Cache-aware approximate computing for decision tree learning

AU - Kislal, Orhan

AU - Kandemir, Mahmut

AU - Kotra, Jagadish

PY - 2016/7/18

Y1 - 2016/7/18

N2 - The memory performance of data mining applications became crucial due to increasing dataset sizes and multi-level cache hierarchies. Decision tree learning is one of the most important algorithms in this field, and numerous researchers worked on improving the accuracy of model tree as well as enhancing the overall performance of the learning process. Most modern applications that employ decision tree learning favor creating multiple models for higher accuracy by sacrificing performance. In this work, we exploit the flexibility inherent in decision tree learning based applications regarding performance and accuracy tradeoffs, and propose a framework to improve performance with negligible accuracy losses. This framework employs a data access skipping module (DASM) using which costly cache accesses are skipped according to the aggressiveness of the strategy specified by the user and a heuristic to predict skipped data accesses to keep accuracy losses at minimum. Our experimental evaluation shows that the proposed framework offers significant performance improvements (up to 25%) with relatively much smaller losses in accuracy (up to 8%) over the original case. We demonstrate that our framework is scalable under various accuracy requirements via exploring accuracy changes over time and replacement policies. In addition, we explore NoC/SNUCA systems for similar opportunities of memory performance improvement.

AB - The memory performance of data mining applications became crucial due to increasing dataset sizes and multi-level cache hierarchies. Decision tree learning is one of the most important algorithms in this field, and numerous researchers worked on improving the accuracy of model tree as well as enhancing the overall performance of the learning process. Most modern applications that employ decision tree learning favor creating multiple models for higher accuracy by sacrificing performance. In this work, we exploit the flexibility inherent in decision tree learning based applications regarding performance and accuracy tradeoffs, and propose a framework to improve performance with negligible accuracy losses. This framework employs a data access skipping module (DASM) using which costly cache accesses are skipped according to the aggressiveness of the strategy specified by the user and a heuristic to predict skipped data accesses to keep accuracy losses at minimum. Our experimental evaluation shows that the proposed framework offers significant performance improvements (up to 25%) with relatively much smaller losses in accuracy (up to 8%) over the original case. We demonstrate that our framework is scalable under various accuracy requirements via exploring accuracy changes over time and replacement policies. In addition, we explore NoC/SNUCA systems for similar opportunities of memory performance improvement.

UR - http://www.scopus.com/inward/record.url?scp=84991593702&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84991593702&partnerID=8YFLogxK

U2 - 10.1109/IPDPSW.2016.116

DO - 10.1109/IPDPSW.2016.116

M3 - Conference contribution

T3 - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016

SP - 1413

EP - 1422

BT - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Kislal O, Kandemir M, Kotra J. Cache-aware approximate computing for decision tree learning. In Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016. Institute of Electrical and Electronics Engineers Inc. 2016. p. 1413-1422. 7530032. (Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016). https://doi.org/10.1109/IPDPSW.2016.116