Mengyuan Sun (Wuhan University), Yu Li (Wuhan University), Yunjie Ge (Wuhan University), Yuchen Liu (Wuhan University), Bo Du (Wuhan University), Qian Wang (Wuhan University)

Multimodal contrastive learning models like CLIP have demonstrated remarkable vision-language alignment capabilities and now serve as foundational components in many large-scale multimodal systems. However, their vulnerability to backdoor attacks poses critical security risks. Attackers can implant latent triggers that persist through downstream tasks, enabling malicious control of model behavior upon trigger presentation. Despite great success in recent defense mechanisms, they remain impractical due to strong assumptions about attacker knowledge or excessive clean data requirements.

In this paper, we introduce InverTune, the first backdoor defense framework for multimodal models under minimal attacker assumptions, requiring neither prior knowledge of attack targets nor access to the poisoned dataset. Unlike existing defense methods that rely on the same dataset used in the poisoning stage, InverTune effectively identifies and removes backdoor artifacts through three key components, achieving robust protection against backdoor attacks. Specifically, (1) InverTune first exposes attack signatures through adversarial simulation, probabilistically identifying the target label by analyzing model response patterns. (2) Building on this, we develop a gradient inversion technique to reconstruct latent triggers through activation pattern analysis. (3) Finally, a clustering-guided fine-tuning strategy is employed to erase the backdoor function with only a small amount of arbitrary clean data, while preserving the original model capabilities. Experimental results show that InverTune reduces the average attack success rate (ASR) by 97.87% against the state-of-the-art (SOTA) attacks while limiting clean accuracy (CA) degradation to just 3.07%. This work establishes a new paradigm for securing multimodal systems, advancing security in foundation model deployment without compromising performance.

View More Papers

ACTS: Attestations of Contents in TLS Sessions

Pierpaolo Della Monica (Sapienza University of Rome), Ivan Visconti (Sapienza University of Rome), Andrea Vitaletti (Sapienza University of Rome), Marco Zecchini (Sapienza University of Rome)

Read More

On the Difficulty of Selecting Few-Shot Examples for Effective...

Md Abdul Hannan (Colorado State University), Ronghao Ni (Carnegie Mellon University), Chi Zhang (Carnegie Mellon University), Limin Jia (Carnegie Mellon University), Ravi Mangal (Colorado State University), Corina S. Pasareanu (Carnegie Mellon University)

Read More

Icarus: Achieving Performant Asynchronous BFT with Only Optimistic Paths

Xiaohai Dai (Huazhong University of Science and Technology), Yiming Yu (Huazhong University of Science and Technology), Sisi Duan (Tsinghua University), Rui Hao (Wuhan University of Technology), Jiang Xiao (Huazhong University of Science and Technology), Hai Jin (Huazhong University of Science and Technology)

Read More