Hanlei Zhang (Zhejiang University), Yijie Bai (Zhejiang University), Yanjiao Chen (Zhejiang University), Zhongming Ma (Zhejiang University), Wenyuan Xu (Zhejiang University)

Backdoor attacks are an essential risk to deep learning model sharing. Fundamentally, backdoored models are different from benign models considering latent separability, i.e., distinguishable differences in model latent representations. However, existing methods quantify latent separability by clustering latent representations or computing distances between latent representations, which are easy to be compromised by adaptive attacks. In this paper, we propose BARBIE, a backdoor detection approach that can pinpoint latent separability under adaptive backdoor attacks. To achieve this goal, we propose a new latent separability metric, named relative competition score (RCS), by characterizing the dominance of latent representations over model output, which is robust against various backdoor attacks and is hard to compromise. Without the need to access any benign or backdoored sample, we invert two sets of latent representations of each label, reflecting the normal latent representations of benign models and intensifying the abnormal ones of backdoored models, to calculate RCS. We compute a series of RCS-based indicators to comprehensively reflect the differences between backdoored models and benign models. We validate the effectiveness of BARBIE on more than 10,000 models on 4 datasets against 14 types of backdoor attacks, including the adaptive attacks against latent separability. Compared with 7 baselines, BARBIE improves the average true positive rate by 17.05% against source-agnostic attacks, 27.72% against source-specific attacks, 43.17% against sample-specific attacks and 11.48% against clean-label attacks. BARBIE also maintains lower false positive rates than baselines. The source code is available at: https://github.com/Forliqr/BARBIE.

View More Papers

SecuWear: Secure Data Sharing Between Wearable Devices

Sujin Han (KAIST) Diana A. Vasile (Nokia Bell Labs), Fahim Kawsar (Nokia Bell Labs, University of Glasgow), Chulhong Min (Nokia Bell Labs)

Read More

Work-in-Progress: Towards Browser-Based Consent Management

Gayatri Priyadarsini Kancherla and Abhishek Bichhawat (Indian Institute of Technology Gandhinagar)

Read More

TZ-DATASHIELD: Automated Data Protection for Embedded Systems via Data-Flow-Based...

Zelun Kong (University of Texas at Dallas), Minkyung Park (University of Texas at Dallas), Le Guan (University of Georgia), Ning Zhang (Washington University in St. Louis), Chung Hwan Kim (University of Texas at Dallas)

Read More