LEMNA: Explaining deep learning based security applications

Wenbo Guo, Dongliang Mu, Jun Xu, Purui Su, Gang Wang, Xinyu Xing

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Citations (Scopus)

Abstract

While deep learning has shown great potential in various domains, its lack of transparency has limited its application in security- or safety-critical areas. Existing research has attempted to develop explanation techniques that provide interpretable explanations for each classification decision. Unfortunately, current methods are optimized for non-security tasks (e.g., image analysis). Their key assumptions are often violated in security applications, leading to poor explanation fidelity. In this paper, we propose LEMNA, a high-fidelity explanation method dedicated to security applications. Given an input data sample, LEMNA generates a small set of interpretable features to explain how the input sample is classified. The core idea is to approximate a local area of the complex deep learning decision boundary using a simple interpretable model. The local interpretable model is specially designed to (1) handle feature dependency to better work with security applications (e.g., binary code analysis); and (2) handle nonlinear local boundaries to boost explanation fidelity. We evaluate our system using two popular deep learning applications in security (a malware classifier and a function start detector for binary reverse engineering). Extensive evaluations show that LEMNA’s explanations have much higher fidelity than existing methods. In addition, we demonstrate practical use cases of LEMNA that help machine learning developers validate model behavior, troubleshoot classification errors, and automatically patch the errors of the target models.
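
For intuition, below is a minimal sketch (not the paper's implementation) of the local-surrogate idea the abstract describes: perturb a single input, query the black-box classifier on the perturbations, and fit a small interpretable model around that one decision so its largest coefficients point to the most influential features. The function name explain_locally, the masking-based perturbation, and the use of plain Lasso are illustrative assumptions; LEMNA itself fits a fused-lasso-regularized mixture regression model to handle feature dependency and nonlinear local boundaries.

    import numpy as np
    from sklearn.linear_model import Lasso

    def explain_locally(predict_proba, x, n_samples=500, top_k=5, seed=0):
        # Explain one black-box decision: perturb x, query the model,
        # and fit a sparse linear surrogate whose largest coefficients
        # mark the features that matter most for this sample.
        rng = np.random.default_rng(seed)
        d = x.shape[0]

        # Perturbations: randomly keep or zero out each feature of x.
        mask = rng.integers(0, 2, size=(n_samples, d))
        perturbed = mask * x

        # predict_proba is assumed to return, for each perturbed sample,
        # the probability of the class being explained (a 1-D array).
        y = predict_proba(perturbed)

        # Fit the surrogate on the binary presence masks (LIME-style
        # interpretable representation); plain Lasso stands in here for
        # LEMNA's fused-lasso-regularized mixture regression.
        surrogate = Lasso(alpha=0.01).fit(mask, y)

        # Rank features by the magnitude of their surrogate coefficients.
        return np.argsort(-np.abs(surrogate.coef_))[:top_k]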

Original language: English (US)
Title of host publication: CCS 2018 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security
Publisher: Association for Computing Machinery
Pages: 364-379
Number of pages: 16
ISBN (Electronic): 9781450356930
DOIs: https://doi.org/10.1145/3243734.3243792
State: Published - Oct 15 2018
Event: 25th ACM Conference on Computer and Communications Security, CCS 2018 - Toronto, Canada
Duration: Oct 15 2018 → …

Publication series

Name: Proceedings of the ACM Conference on Computer and Communications Security
ISSN (Print): 1543-7221

Other

Other: 25th ACM Conference on Computer and Communications Security, CCS 2018
Country: Canada
City: Toronto
Period: 10/15/18 → …

Fingerprint

  • Binary codes
  • Reverse engineering
  • Transparency
  • Image analysis
  • Learning systems
  • Classifiers
  • Deep learning
  • Detectors
  • Malware

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Networks and Communications

Cite this

Guo, W., Mu, D., Xu, J., Su, P., Wang, G., & Xing, X. (2018). LEMNA: Explaining deep learning based security applications. In CCS 2018 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (pp. 364-379). (Proceedings of the ACM Conference on Computer and Communications Security). Association for Computing Machinery. https://doi.org/10.1145/3243734.3243792