TEXTSHIELD: Robust text classification based on multimodal embedding and neural machine translation

Jinfeng Li, Tianyu Du, Shouling Ji, Rong Zhang, Quan Lu, Min Yang, Ting Wang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Scopus citations

Abstract

Text-based toxic content detection is an important tool for reducing harmful interactions in online social media environments. Yet, its underlying mechanism, deep learning-based text classification (DLTC), is inherently vulnerable to maliciously crafted adversarial texts. To mitigate such vulnerabilities, intensive research has been conducted on strengthening English-based DLTC models. However, the existing defenses are not effective for Chinese-based DLTC models, due to the unique sparseness, diversity, and variation of the Chinese language. In this paper, we bridge this striking gap by presenting TEXTSHIELD, a new adversarial defense framework specifically designed for Chinese-based DLTC models. TEXTSHIELD differs from previous work in several key aspects: (i) generic - it applies to any Chinese-based DLTC models without requiring re-training; (ii) robust - it significantly reduces the attack success rate even under the setting of adaptive attacks; and (iii) accurate - it has little impact on the performance of DLTC models over legitimate inputs. Extensive evaluations show that it outperforms both existing methods and the industry-leading platforms. Future work will explore its applicability in broader practical tasks.

Original languageEnglish (US)
Title of host publicationProceedings of the 29th USENIX Security Symposium
PublisherUSENIX Association
Pages1381-1398
Number of pages18
ISBN (Electronic)9781939133175
StatePublished - 2020
Event29th USENIX Security Symposium - Virtual, Online
Duration: Aug 12 2020Aug 14 2020

Publication series

NameProceedings of the 29th USENIX Security Symposium

Conference

Conference29th USENIX Security Symposium
CityVirtual, Online
Period8/12/208/14/20

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Safety, Risk, Reliability and Quality

Fingerprint Dive into the research topics of 'TEXTSHIELD: Robust text classification based on multimodal embedding and neural machine translation'. Together they form a unique fingerprint.

Cite this