Constructive Noise Defeats Adversarial Noise: Adversarial Example Detection for Commercial DNN Services

Meng Shen (Beijing Institute of Technology), Jiangyuan Bi (Beijing Institute of Technology), Hao Yu (National University of Defense Technology), Zhenming Bai (Beijing Institute of Technology), Wei Wang (Xi'an Jiaotong University), Liehuang Zhu (Beijing Institute of Technology)

Commercial DNN services have been developed in the form of machine learning as a service (MLaaS). To mitigate the potential threats of adversarial examples, various detection methods have been proposed. However, the existing methods usually require access to details or the training dataset of the target model, which is commonly unavailable in MLaaS scenarios. Their detection accuracy experiences a significant drop in a setting where neither the details nor the training dataset of the target model can be acquired.

In this paper, we propose Falcon, an adversarial example detection method offered by a third party, which achieves accuracy and efficiency simultaneously. Based on the disparity in noise tolerance between clean and adversarial examples, we explore constructive noise that cannot affect the model’s output labels when added to clean examples while causing noticeable changes in model outputs when added to adversarial examples. For each input, Falcon generates constructive noise with a specific distribution and intensity and achieves detection through differences in the output of the target model before and after adding constructive noise. Extensive experiments are conducted on 4 public datasets to evaluate the performance of Falcon in detecting 10 typical attacks. Falcon outperforms SOTA detection methods with the highest True Positive Rate (TPR) of adversarial examples and the lowest False Positive Rate (FPR) of clean examples. Furthermore, Falcon achieves a TPR of about 80% with an FPR of 5% on 6 well-known commercial DNN services, which outperforms the SOTA methods. Falcon can also maintain its accuracy although the adversary has complete knowledge of the detection details.

Paper

View More Papers

Replication: A Study on How Users (Don’t) Use Password...

Pithayuth Charnsethikul (University of Southern California), Anushka Fattepurkar (University of Southern California), Dipsy Desai (University of Southern California), Gale Lucas (University of Southern California), Jelena Mirkovic (University of Southern California)

EXIA: Trusted Transitions for Enclaves via External-Input Attestation

Zhen Huang (Shanghai Jiao Tong University), Yidi Kao (Auburn University), Sanchuan Chen (Auburn University), Guoxing Chen (Shanghai Jiao Tong University), Yan Meng (Shanghai Jiao Tong University), Haojin Zhu (Shanghai Jiao Tong University)

IsolatOS: Detecting Double Fetch Bugs in COTS RTOS by...

Yingjie Cao (Sun Yat-sen University and The Hong Kong Polytechnic University), Xiaogang Zhu (Adelaide University), Dean Sullivan (University of New Hampshire, US), Haowei Yang, Lei Xue (Sun Yat-sen University), Xian Li (Swinburne University of Technology, Australia), Chenxiong Qian (University of Hong Kong, China), Minrui Yan (Swinburne University of Technology, Australia), Xiapu Luo (The Hong Kong…