Yansong Gao (The University of Western Australia), Huaibing Peng (Nanjing University of Science and Technology), Hua Ma (CSIRO's Data61), Zhi Zhang (The University of Western Australia), Shuo Wang (Shanghai Jiao Tong University), Rayne Holland (CSIRO's Data61), Anmin Fu (Nanjing University of Science and Technology), Minhui Xue (CSIRO's Data61), Derek Abbott (The University of Adelaide, Australia)

In the Data as a Service (DaaS) model, data curators, such as commercial providers like Amazon Mechanical Turk, Appen, and TELUS International, aggregate quality data from numerous contributors and monetize it for deep learning (DL) model providers. However, malicious contributors can poison this data, embedding backdoors in the trained DL models. Existing methods for detecting poisoned samples face significant limitations: they often rely on reserved clean data; they are sensitive to the poisoning rate, trigger type, and backdoor type; and they are specific to classification tasks. These limitations hinder their practical adoption by data curators.

This work, for the first time, investigates the textit{training trajectory} of poisoned samples in the textit{spectrum domain}, revealing distinctions from benign samples that are not apparent in the original non-spectrum domain. Building on this novel perspective, we propose TellTale to detect and sanitize poisoned samples as a one-time effort, addressing textit{all} of the aforementioned limitations of prior work. Through extensive experiments, TellTale demonstrates the ability to defeat both universal and challenging partial backdoor types without relying on any reserved clean data. TellTale is also validated to be agnostic to various trigger types, including the advanced clean-label trigger attack, Narcissus (CCS'2023). Moreover, TellTale proves effective across diverse data modalities (e.g., image, audio and text) and non-classification tasks (e.g., regression)---making it the only known training phase poisoned sample detection method applicable to non-classification tasks. In all our evaluations, TellTale achieves a detection accuracy (i.e., accurately identifying poisoned samples) of at least 95.52% and a false positive rate (i.e., falsely recognizing benign samples as poisoned ones) no higher than 0.61%. Comparisons with state-of-the-art methods, ASSET (Usenix'2023) and CT (Usenix'2023), further affirm TellTale's superior performance. More specifically, ASSET fails to handle partial backdoor types and incurs an unbearable false positive rate with clean/benign datasets common in practice, while CT fails against the Narcissus trigger. In contrast, TellTale proves highly effective across testing scenarios where prior work fails. The source code is released at https://github.com/MPaloze/Telltale.

View More Papers

DShield: Defending against Backdoor Attacks on Graph Neural Networks...

Hao Yu (National University of Defense Technology), Chuan Ma (Chongqing University), Xinhang Wan (National University of Defense Technology), Jun Wang (National University of Defense Technology), Tao Xiang (Chongqing University), Meng Shen (Beijing Institute of Technology, Beijing, China), Xinwang Liu (National University of Defense Technology)

Read More

On Borrowed Time – Preventing Static Side-Channel Analysis

Robert Dumitru (Ruhr University Bochum and The University of Adelaide), Thorben Moos (UCLouvain), Andrew Wabnitz (Defence Science and Technology Group), Yuval Yarom (Ruhr University Bochum)

Read More

MTZK: Testing and Exploring Bugs in Zero-Knowledge (ZK) Compilers

Dongwei Xiao (The Hong Kong University of Science and Technology), Zhibo Liu (The Hong Kong University of Science and Technology), Yiteng Peng (The Hong Kong University of Science and Technology), Shuai Wang (The Hong Kong University of Science and Technology)

Read More

The Midas Touch: Triggering the Capability of LLMs for...

Yi Yang (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Jinghua Liu (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Kai Chen (Institute of Information Engineering, Chinese Academy of…

Read More