TY - GEN
T1 - SirenAttack
T2 - 15th ACM Asia Conference on Computer and Communications Security, ASIA CCS 2020
AU - Du, Tianyu
AU - Ji, Shouling
AU - Li, Jinfeng
AU - Gu, Qinchen
AU - Wang, Ting
AU - Beyah, Raheem
N1 - Funding Information:
This work was partly supported by the Zhejiang Provincial Natural Science Foundation for Distinguished Young Scholars under No. LR19F020003, the Provincial Key Research and Development Program of Zhejiang, China under No. 2019C01055, NSFC under No. 61772466, U1936215, and U1836202, the Ant Financial Research Funding, the National Key Research and Development Program of China under No. 2018YFB0804102, and the Alibaba-ZJU Joint Research Institute of Frontier Technologies.
PY - 2020/10/5
Y1 - 2020/10/5
N2 - Despite their immense popularity, deep learning-based acoustic systems are inherently vulnerable to adversarial attacks, wherein maliciously crafted audios trigger target systems to misbehave. In this paper, we present SirenAttack, a new class of attacks to generate adversarial audios. Compared with existing attacks, SirenAttack is distinguished by a set of significant features: (i) versatile - it is able to deceive a range of end-to-end acoustic systems under both white-box and black-box settings; (ii) effective - it is able to generate adversarial audios that can be recognized as specific phrases by target acoustic systems; and (iii) stealthy - it is able to generate adversarial audios indistinguishable from their benign counterparts to human perception. We empirically evaluate SirenAttack on a set of state-of-the-art deep learning-based acoustic systems (including speech command recognition, speaker recognition, and sound event classification), with results showing the versatility, effectiveness, and stealthiness of SirenAttack. For instance, it achieves a 99.45% attack success rate on the IEMOCAP dataset against the ResNet18 model, while the generated adversarial audios are also misinterpreted by multiple popular ASR platforms, including Google Cloud Speech, Microsoft Bing Voice, and IBM Speech-to-Text. We further evaluate three potential defense methods to mitigate such attacks, namely adversarial training, audio downsampling, and moving average filtering, which point to promising directions for further research.
AB - Despite their immense popularity, deep learning-based acoustic systems are inherently vulnerable to adversarial attacks, wherein maliciously crafted audios trigger target systems to misbehave. In this paper, we present SirenAttack, a new class of attacks to generate adversarial audios. Compared with existing attacks, SirenAttack is distinguished by a set of significant features: (i) versatile - it is able to deceive a range of end-to-end acoustic systems under both white-box and black-box settings; (ii) effective - it is able to generate adversarial audios that can be recognized as specific phrases by target acoustic systems; and (iii) stealthy - it is able to generate adversarial audios indistinguishable from their benign counterparts to human perception. We empirically evaluate SirenAttack on a set of state-of-the-art deep learning-based acoustic systems (including speech command recognition, speaker recognition, and sound event classification), with results showing the versatility, effectiveness, and stealthiness of SirenAttack. For instance, it achieves a 99.45% attack success rate on the IEMOCAP dataset against the ResNet18 model, while the generated adversarial audios are also misinterpreted by multiple popular ASR platforms, including Google Cloud Speech, Microsoft Bing Voice, and IBM Speech-to-Text. We further evaluate three potential defense methods to mitigate such attacks, namely adversarial training, audio downsampling, and moving average filtering, which point to promising directions for further research.
UR - http://www.scopus.com/inward/record.url?scp=85091900184&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85091900184&partnerID=8YFLogxK
U2 - 10.1145/3320269.3384733
DO - 10.1145/3320269.3384733
M3 - Conference contribution
AN - SCOPUS:85091900184
T3 - Proceedings of the 15th ACM Asia Conference on Computer and Communications Security, ASIA CCS 2020
SP - 357
EP - 369
BT - Proceedings of the 15th ACM Asia Conference on Computer and Communications Security, ASIA CCS 2020
PB - Association for Computing Machinery, Inc
Y2 - 5 October 2020 through 9 October 2020
ER -