Evan Li (Northeastern University), Tushin Mallick (Northeastern University), Evan Rose (Northeastern University), William Robertson (Northeastern University), Alina Oprea (Northeastern University), Cristina Nita-Rotaru (Northeastern University)

LLM-integrated app systems extend the utility of Large Language Models (LLMs) with third-party apps that a system LLM invokes through interleaved planning and execution phases to answer user queries. These systems introduce new attack vectors: malicious apps can violate the integrity of planning or execution, break availability, or compromise privacy during execution.

In this work, we identify new attacks impacting the integrity of planning, as well as the integrity and availability of execution, in LLM-integrated apps, and demonstrate them against IsolateGPT, a recent solution designed to mitigate attacks from malicious apps. We propose Abstract-Concrete-Execute (ACE), a new secure architecture for LLM-integrated app systems that provides security guarantees for system planning and execution. Specifically, ACE decouples planning into two phases: it first creates an abstract execution plan using only trusted information, and then maps the abstract plan to a concrete plan using the installed system apps. Via static analysis on the structured plan output, we verify that the plans generated by our system satisfy user-specified secure information-flow constraints. During execution, ACE enforces data and capability barriers between apps and ensures that execution follows the trusted abstract plan. We show experimentally that ACE is secure against indirect prompt injection attacks from the InjecAgent and Agent Security Bench benchmarks, as well as against our newly introduced attacks. We also evaluate the utility of ACE in realistic environments, using the Tool Usage suite from the LangChain benchmark. Our architecture represents a significant advancement towards hardening LLM-based systems using system security principles.
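To make the two-phase planning idea concrete, the sketch below illustrates one plausible reading of the abstract: an abstract plan built only from trusted inputs, a static information-flow check over the structured plan, and a mapping onto installed apps. This is a minimal, hypothetical illustration; all names (AbstractStep, APP_REGISTRY, ALLOWED_FLOWS, plan_abstract) are assumptions for exposition, not the paper's actual interfaces.

```python
# Hypothetical sketch of ACE-style abstract/concrete planning with a
# static information-flow check. Names and policies are illustrative.
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractStep:
    capability: str    # abstract capability needed, e.g. "fetch_calendar"
    source_label: str  # label of the data the step reads
    sink_label: str    # label of the data the step writes


def plan_abstract(query: str) -> list[AbstractStep]:
    """Phase 1: build an abstract plan from trusted information only
    (the user query and trusted system metadata), so untrusted
    app-supplied text cannot steer planning. Toy fixed plan here."""
    return [
        AbstractStep("fetch_calendar", source_label="calendar", sink_label="private"),
        AbstractStep("send_summary", source_label="private", sink_label="user"),
    ]


# Installed apps, keyed by the abstract capability each one implements.
APP_REGISTRY = {
    "fetch_calendar": "cal_app",
    "send_summary": "mail_app",
}


def plan_concrete(abstract_plan: list[AbstractStep]) -> list[tuple[str, AbstractStep]]:
    """Phase 2: map the trusted abstract plan onto concrete installed apps."""
    return [(APP_REGISTRY[step.capability], step) for step in abstract_plan]


# User-specified policy: which label-to-label flows are permitted.
ALLOWED_FLOWS = {("calendar", "private"), ("private", "user")}


def check_flows(abstract_plan: list[AbstractStep]) -> bool:
    """Static check over the structured plan output: every step's
    source-to-sink flow must appear in the allowed-flow policy."""
    return all(
        (step.source_label, step.sink_label) in ALLOWED_FLOWS
        for step in abstract_plan
    )


if __name__ == "__main__":
    plan = plan_abstract("Summarize my meetings today and email me the result")
    assert check_flows(plan), "plan violates the information-flow policy"
    for app, step in plan_concrete(plan):
        print(f"{app}: {step.capability} ({step.source_label} -> {step.sink_label})")
```

In this reading, the flow check runs before any app executes, and execution-time barriers (not modeled above) would hold each app to the capability and labels fixed by its abstract step.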
