Yinan Zhong (Zhejiang University), Qianhao Miao (Zhejiang University), Yanjiao Chen (Zhejiang University), Jiangyi Deng (Zhejiang University), Yushi Cheng (Zhejiang University), Wenyuan Xu (Zhejiang University)

Large Language Models (LLMs) have been integrated into many applications (e.g., web agents) to perform more sophisticated tasks. However, LLM-empowered applications are vulnerable to Indirect Prompt Injection (IPI) attacks, where instructions are injected via untrustworthy external data sources. This paper presents Rennervate, a defense framework to detect and prevent IPI attacks. Rennervate leverages attention features to detect the covert injection at a fine-grained token level, enabling precise sanitization that neutralizes IPI attacks while maintaining LLM functionalities. Specifically, the token-level detector is materialized with a 2-step attentive pooling mechanism, which aggregates attention heads and response tokens for IPI detection and sanitization. Moreover, we establish a fine-grained IPI dataset, FIPI, to be open-sourced to support further research. Extensive experiments verify that Rennervate outperforms 15 commercial and academic IPI defense methods, achieving high precision on 5 LLMs and 6 datasets. We also demonstrate that Rennervate is transferable to unseen attacks and robust against adaptive adversaries.

View More Papers

Losing the Beat: Understanding and Mitigating Desynchronization Risks in...

Zhi Li (Huazhong University of Science and Technology), Zhen Xu (Huazhong University of Science and Technology), Weijie Liu (Nankai University), XiaoFeng Wang (Nanyang Technological University), Hai Jin (Huazhong University of Science and Technology), Zheli Liu (Nankai University)

Read More

Select-Then-Compute: Encrypted Label Selection and Analytics over Distributed Datasets...

Nirajan Koirala (University of Notre Dame), Seunghun Paik (Hanyang University), Sam Martin (University of Notre Dame), Helena Berens (University of Notre Dame), Tasha Januszewicz (University of Notre Dame), Jonathan Takeshita (Old Dominion University), Jae Hong Seo (Hanyang University), Taeho Jung (University of Notre Dame)

Read More

Cross-Consensus Reliable Broadcast and its Applications

Yue Huang (Tsinghua University), Xin Wang (Tsinghua University), Haibin Zhang (Yangtze Delta Region Institute of Tsinghua University, Zhejiang), Sisi Duan (Tsinghua University)

Read More