Status | Published |
Title | A Unified Framework for Detecting Audio Adversarial Examples |
Author | Xia Du (1); Chi-Man Pun (1); Zheng Zhang (1, 2) |
Date | 2020-10-12 |
Conference Name | The 28th ACM International Conference on Multimedia |
Source Publication | MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia |
Pages | 3986-3994 |
Conference Date | 12 - 16 October 2020 |
Conference Place | Seattle WA USA |
Country | USA |
Abstract | Adversarial attacks have been widely recognized as the security vulnerability of deep neural networks, especially in deep automatic speech recognition (ASR) systems. The advanced detection methods against adversarial attacks mainly focus on pre-processing the input audio to alleviate the threat of adversarial noise. Although these methods could detect some simplex adversarial attacks, they fail to handle robust complex attacks especially when the attacker knows the detection details. In this paper, we propose a unified adversarial detection framework for detecting adaptive audio adversarial examples, which combines noise padding with sound reverberation. Specifically, a well-designed adaptive artificial utterances generator is proposed to balance the design complexity, such that the artificial utterances (speech with reverberation) are efficiently determined to reduce the false positive rate and false negative rate of detection results. Moreover, to destroy the continuity of the adversarial noise, we develop a novel multi-noise padding strategy, which implants the Gaussian noises in the silent fragments of the input speech by the voice activity detector. Furthermore, our proposed method can effectively tackle the robust adaptive attacks in an adaptive learning manner. Importantly, the conceived system is easily embedded into any ASR models without requiring additional retraining or modification. The experimental results show that our method consistently outperforms the state-of-the-art audio defense methods, even for the adaptive and robust attacks. |
Keyword | Adversarial Examples Detecting ; Artificial Utterances Generation ; Multi-fragment Noise Padding ; Unified Pre-processing Mechanism |
DOI | 10.1145/3394171.3413603 |
Indexed By | CPCI-S |
Language | English |
WOS Research Area | Computer Science ; Imaging Science & Photographic Technology |
WOS Subject | Computer Science, Artificial Intelligence ; Computer Science, Information Systems ; Computer Science, Interdisciplinary Applications ; Computer Science, Software Engineering ; Imaging Science & Photographic Technology |
WOS ID | WOS:000810735004005 |
Scopus ID | 2-s2.0-85106948984 |
Document Type | Conference paper |
Collection | Faculty of Science and Technology DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Corresponding Author | Chi-Man Pun |
Affiliation | 1. Department of Computer and Information Science, University of Macau, Macau, China; 2. Harbin Institute of Technology, Shenzhen, China |
First Author Affiliation | University of Macau |
Corresponding Author Affiliation | University of Macau |
Recommended Citation GB/T 7714 | Xia Du, Chi-Man Pun, Zheng Zhang. A Unified Framework for Detecting Audio Adversarial Examples[C], 2020, 3986-3994. |
APA | Xia Du, Chi-Man Pun, & Zheng Zhang (2020). A Unified Framework for Detecting Audio Adversarial Examples. MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia, 3986-3994. |