TY - JOUR
T1 - Adversarial Learning Targeting Deep Neural Network Classification
T2 - A Comprehensive Review of Defenses Against Attacks
AU - Miller, David J.
AU - Xiang, Zhen
AU - Kesidis, George
N1 - Funding Information:
Manuscript received April 12, 2019; revised November 22, 2019; accepted January 24, 2020. Date of publication February 26, 2020; date of current version March 4, 2020. This work was supported in part by the Air Force Office of Scientific Research Dynamic Data-Driven Application Systems (AFOSR DDDAS) Grant, in part by the Cisco Systems URP Gift, and in part by the AWS Credits Gift. (Corresponding author: David J. Miller.) The authors are with the School of Electrical Engineering and Computer Science, Pennsylvania State University, University Park, PA 16802 USA (e-mail: djm25@psu.edu; gik2@psu.edu).
Publisher Copyright:
© 2020 IEEE.
PY - 2020/3
Y1 - 2020/3
N2 - With wide deployment of machine learning (ML)-based systems for a variety of applications including medical, military, automotive, genomic, multimedia, and social networking, there is great potential for damage from adversarial learning (AL) attacks. In this article, we provide a contemporary survey of AL, focused particularly on defenses against attacks on deep neural network classifiers. After introducing relevant terminology and the goals and range of possible knowledge of both attackers and defenders, we survey recent work on test-time evasion (TTE), data poisoning (DP), backdoor DP, and reverse engineering (RE) attacks and particularly defenses against the same. In so doing, we distinguish robust classification from anomaly detection (AD), unsupervised from supervised, and statistical hypothesis-based defenses from ones that do not have an explicit null (no attack) hypothesis. We also consider several scenarios for detecting backdoors. We provide a technical assessment for reviewed works, including identifying any issues/limitations, required hyperparameters, needed computational complexity, as well as the performance measures evaluated and the obtained quality. We then delve deeper, providing novel insights that challenge conventional AL wisdom and that target unresolved issues, including: Robust classification versus AD as a defense strategy; the belief that attack success increases with attack strength, which ignores susceptibility to AD; small perturbations for TTE attacks: A fallacy or a requirement; validity of the universal assumption that a TTE attacker knows the ground-truth class for the example to be attacked; black, gray, or white-box attacks as the standard for defense evaluation; and susceptibility of query-based RE to an AD defense. We also discuss attacks on the privacy of training data. We then present benchmark comparisons of several defenses against TTE, RE, and backdoor DP attacks on images. The article concludes with a discussion of continuing research directions, including the supreme challenge of detecting attacks whose goal is not to alter classification decisions, but rather simply to embed, without detection, 'fake news' or other false content.
AB - With wide deployment of machine learning (ML)-based systems for a variety of applications including medical, military, automotive, genomic, multimedia, and social networking, there is great potential for damage from adversarial learning (AL) attacks. In this article, we provide a contemporary survey of AL, focused particularly on defenses against attacks on deep neural network classifiers. After introducing relevant terminology and the goals and range of possible knowledge of both attackers and defenders, we survey recent work on test-time evasion (TTE), data poisoning (DP), backdoor DP, and reverse engineering (RE) attacks and particularly defenses against the same. In so doing, we distinguish robust classification from anomaly detection (AD), unsupervised from supervised, and statistical hypothesis-based defenses from ones that do not have an explicit null (no attack) hypothesis. We also consider several scenarios for detecting backdoors. We provide a technical assessment for reviewed works, including identifying any issues/limitations, required hyperparameters, needed computational complexity, as well as the performance measures evaluated and the obtained quality. We then delve deeper, providing novel insights that challenge conventional AL wisdom and that target unresolved issues, including: Robust classification versus AD as a defense strategy; the belief that attack success increases with attack strength, which ignores susceptibility to AD; small perturbations for TTE attacks: A fallacy or a requirement; validity of the universal assumption that a TTE attacker knows the ground-truth class for the example to be attacked; black, gray, or white-box attacks as the standard for defense evaluation; and susceptibility of query-based RE to an AD defense. We also discuss attacks on the privacy of training data. We then present benchmark comparisons of several defenses against TTE, RE, and backdoor DP attacks on images. The article concludes with a discussion of continuing research directions, including the supreme challenge of detecting attacks whose goal is not to alter classification decisions, but rather simply to embed, without detection, 'fake news' or other false content.
UR - http://www.scopus.com/inward/record.url?scp=85080115696&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85080115696&partnerID=8YFLogxK
U2 - 10.1109/JPROC.2020.2970615
DO - 10.1109/JPROC.2020.2970615
M3 - Article
AN - SCOPUS:85080115696
SN - 0018-9219
VL - 108
SP - 402
EP - 433
JO - Proceedings of the IEEE
JF - Proceedings of the IEEE
IS - 3
M1 - 9013065
ER -