Anomaly detection of attacks (ADA) on DNN classifiers at test time

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

A significant threat to the wide deployment of machine learning-based classifiers is adversarial learning attacks, especially at test time. Recently there has been significant development in defending against such attacks. Several such works seek to robustify the classifier to make "correct" decisions on perturbed patterns. We argue it is often operationally more important to detect the attack than to "correctly classify" in the face of it (classification can proceed if no attack is detected). We hypothesize that, even if human-imperceptible, adversarial perturbations are machine-detectable. We propose a purely unsupervised anomaly detector (AD), based on suitable (null hypothesis) density models for the different layers of a deep neural net and a novel decision statistic built upon the Kullback-Leibler divergence. This paper addresses: 1) When is it appropriate to aim to "correctly classify" a perturbed pattern? 2) What is a good AD detection statistic, one which exploits all likely sources of anomalousness associated with a test-time attack? 3) Where in a deep neural net (DNN) (in an early layer, a middle layer, or at the penultimate layer) will the most anomalous signature manifest? Tested on the MNIST and CIFAR-10 image databases under three prominent attack strategies, our approach outperforms previous detection methods, achieving strong ROC AUC detection accuracy on two attacks and substantially better accuracy than previously reported on the third (strongest) attack.
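
The method outlined in the abstract (per-layer null-hypothesis density models combined with a Kullback-Leibler divergence decision statistic) can be illustrated at a high level. The sketch below is a minimal illustration under assumptions of our own, not the authors' implementation: Gaussian mixtures as the class-conditional null models on a single internal layer's activations, equal class priors, and illustrative function names.

import numpy as np
from sklearn.mixture import GaussianMixture
from scipy.special import softmax

def fit_null_models(layer_feats, labels, n_classes, n_components=5, seed=0):
    # One Gaussian mixture per class, fit on clean (unattacked) samples'
    # activations from a chosen DNN layer; these act as the null-hypothesis
    # density models. (Model family and layer choice are assumptions here.)
    models = []
    for c in range(n_classes):
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag", random_state=seed)
        gmm.fit(layer_feats[labels == c])
        models.append(gmm)
    return models

def kl_detection_statistic(feat, dnn_logits, null_models):
    # KL divergence between the class posterior implied by the null density
    # models (equal class priors assumed) and the DNN's own softmax posterior.
    # A large value indicates the layer activations disagree with the DNN's
    # decision, i.e. a possible test-time attack.
    log_liks = np.array([m.score_samples(feat[None, :])[0] for m in null_models])
    p_density = softmax(log_liks)
    p_dnn = softmax(dnn_logits)
    eps = 1e-12
    return float(np.sum(p_density * (np.log(p_density + eps) - np.log(p_dnn + eps))))

In use, the statistic would be computed for each test pattern and compared against a threshold set from its empirical distribution on clean data (for example, a high percentile), declaring a detection when the threshold is exceeded.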

Original language: English (US)
Title of host publication: 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings
Editors: Nelly Pustelnik, Zheng-Hua Tan, Zhanyu Ma, Jan Larsen
Publisher: IEEE Computer Society
ISBN (Electronic): 9781538654774
DOIs: https://doi.org/10.1109/MLSP.2018.8517069
State: Published - Oct 31, 2018
Event: 28th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Aalborg, Denmark
Duration: Sep 17, 2018 - Sep 20, 2018

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
Volume: 2018-September
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371

Other

Other: 28th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018
Country: Denmark
City: Aalborg
Period: 9/17/18 - 9/20/18

Fingerprint

Classifiers
Statistics
Detectors
Neural networks
Learning systems

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Signal Processing

Cite this

Miller, D. J., Wang, Y., & Kesidis, G. (2018). Anomaly detection of attacks (ADA) on DNN classifiers at test time. In N. Pustelnik, Z-H. Tan, Z. Ma, & J. Larsen (Eds.), 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings [8517069] (IEEE International Workshop on Machine Learning for Signal Processing, MLSP; Vol. 2018-September). IEEE Computer Society. https://doi.org/10.1109/MLSP.2018.8517069
Miller, David Jonathan ; Wang, Yujia ; Kesidis, George. / Anomaly detection of attacks (ADA) on DNN classifiers at test time. 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings. editor / Nelly Pustelnik ; Zheng-Hua Tan ; Zhanyu Ma ; Jan Larsen. IEEE Computer Society, 2018. (IEEE International Workshop on Machine Learning for Signal Processing, MLSP).
@inproceedings{60d05ee525604e82b43714cb74830f23,
title = "Anomaly detection of attacks (ADA) on DNN classifiers at test time",
abstract = "A significant threat to wide deployment of machine learning-based classifiers is adversarial learning attacks, especially at test-time. Recently there has been significant development in defending against such attacks. Several such works seek to robustify the classifier to make «correct» decisions on perturbed patterns. We argue it is often operationally more important to detect the attack, rather than to «correctly classify» in the face of it (Classification can proceed if no attack is detected). We hypothesize that, even if human-imperceptible, adversarial perturbations are machine-detectable. We propose a purely unsupervised anomaly detector (AD), based on suitable (null hypothesis) density models for the different layers of a deep neural net and a novel decision statistic built upon the Kullback-Leibler divergence. This paper addresses: 1) when is it appropriate to aim to «correctly classify» a perturbed pattern?; 2) What is a good AD detection statistic, one which exploits all likely sources of anomalousness associated with a test-time attack? 3) Where in a deep neural net (DNN) (in an early layer, a middle layer, or at the penultimate layer) will the most anomalous signature manifest? Tested on MNIST and CIFAR-10 image databases under three prominent attack strategies, our approach outperforms previous detection methods, achieving strong ROC AUC detection accuracy on two attacks and substantially better accuracy than previously reported on the third (strongest) attack.",
author = "Miller, {David Jonathan} and Yujia Wang and George Kesidis",
year = "2018",
month = "10",
day = "31",
doi = "10.1109/MLSP.2018.8517069",
language = "English (US)",
series = "IEEE International Workshop on Machine Learning for Signal Processing, MLSP",
publisher = "IEEE Computer Society",
editor = "Nelly Pustelnik and Zheng-Hua Tan and Zhanyu Ma and Jan Larsen",
booktitle = "2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings",
address = "United States",

}

Miller, DJ, Wang, Y & Kesidis, G 2018, Anomaly detection of attacks (ADA) on DNN classifiers at test time. in N Pustelnik, Z-H Tan, Z Ma & J Larsen (eds), 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings., 8517069, IEEE International Workshop on Machine Learning for Signal Processing, MLSP, vol. 2018-September, IEEE Computer Society, 28th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018, Aalborg, Denmark, 9/17/18. https://doi.org/10.1109/MLSP.2018.8517069

Anomaly detection of attacks (ADA) on DNN classifiers at test time. / Miller, David Jonathan; Wang, Yujia; Kesidis, George.

2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings. ed. / Nelly Pustelnik; Zheng-Hua Tan; Zhanyu Ma; Jan Larsen. IEEE Computer Society, 2018. 8517069 (IEEE International Workshop on Machine Learning for Signal Processing, MLSP; Vol. 2018-September).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

TY - GEN

T1 - Anomaly detection of attacks (ADA) on DNN classifiers at test time

AU - Miller, David Jonathan

AU - Wang, Yujia

AU - Kesidis, George

PY - 2018/10/31

Y1 - 2018/10/31

N2 - A significant threat to wide deployment of machine learning-based classifiers is adversarial learning attacks, especially at test-time. Recently there has been significant development in defending against such attacks. Several such works seek to robustify the classifier to make "correct" decisions on perturbed patterns. We argue it is often operationally more important to detect the attack, rather than to "correctly classify" in the face of it (Classification can proceed if no attack is detected). We hypothesize that, even if human-imperceptible, adversarial perturbations are machine-detectable. We propose a purely unsupervised anomaly detector (AD), based on suitable (null hypothesis) density models for the different layers of a deep neural net and a novel decision statistic built upon the Kullback-Leibler divergence. This paper addresses: 1) when is it appropriate to aim to "correctly classify" a perturbed pattern?; 2) What is a good AD detection statistic, one which exploits all likely sources of anomalousness associated with a test-time attack? 3) Where in a deep neural net (DNN) (in an early layer, a middle layer, or at the penultimate layer) will the most anomalous signature manifest? Tested on MNIST and CIFAR-10 image databases under three prominent attack strategies, our approach outperforms previous detection methods, achieving strong ROC AUC detection accuracy on two attacks and substantially better accuracy than previously reported on the third (strongest) attack.

AB - A significant threat to wide deployment of machine learning-based classifiers is adversarial learning attacks, especially at test-time. Recently there has been significant development in defending against such attacks. Several such works seek to robustify the classifier to make "correct" decisions on perturbed patterns. We argue it is often operationally more important to detect the attack, rather than to "correctly classify" in the face of it (Classification can proceed if no attack is detected). We hypothesize that, even if human-imperceptible, adversarial perturbations are machine-detectable. We propose a purely unsupervised anomaly detector (AD), based on suitable (null hypothesis) density models for the different layers of a deep neural net and a novel decision statistic built upon the Kullback-Leibler divergence. This paper addresses: 1) when is it appropriate to aim to "correctly classify" a perturbed pattern?; 2) What is a good AD detection statistic, one which exploits all likely sources of anomalousness associated with a test-time attack? 3) Where in a deep neural net (DNN) (in an early layer, a middle layer, or at the penultimate layer) will the most anomalous signature manifest? Tested on MNIST and CIFAR-10 image databases under three prominent attack strategies, our approach outperforms previous detection methods, achieving strong ROC AUC detection accuracy on two attacks and substantially better accuracy than previously reported on the third (strongest) attack.

UR - http://www.scopus.com/inward/record.url?scp=85057066178&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057066178&partnerID=8YFLogxK

U2 - 10.1109/MLSP.2018.8517069

DO - 10.1109/MLSP.2018.8517069

M3 - Conference contribution

AN - SCOPUS:85057066178

T3 - IEEE International Workshop on Machine Learning for Signal Processing, MLSP

BT - 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings

A2 - Pustelnik, Nelly

A2 - Tan, Zheng-Hua

A2 - Ma, Zhanyu

A2 - Larsen, Jan

PB - IEEE Computer Society

ER -

Miller DJ, Wang Y, Kesidis G. Anomaly detection of attacks (ADA) on DNN classifiers at test time. In Pustelnik N, Tan Z-H, Ma Z, Larsen J, editors, 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings. IEEE Computer Society. 2018. 8517069. (IEEE International Workshop on Machine Learning for Signal Processing, MLSP). https://doi.org/10.1109/MLSP.2018.8517069