XiangFan Wu (Ocean University of China; QI-ANXIN Technology Research Institute), Lingyun Ying (QI-ANXIN Technology Research Institute), Guoqiang Chen (QI-ANXIN Technology Research Institute), Yacong Gu (Tsinghua University; Tsinghua University-QI-ANXIN Group JCNS), Haipeng Qu (Department of Computer Science and Technology, Ocean University of China)

Large Language Models (LLMs) are rapidly reshaping digital interactions. Their performance and efficiency are critically dependent on advanced caching mechanisms, such as prefix caching and semantic caching.
However, these mechanisms introduce a new attack surface. Unlike prior work focused on LLM poisoning attacks during the training phase, this paper presents the first comprehensive investigation into cache-related security risks that arise during LLM inference.

We conducted a systematic study of the cache implementations in mainstream LLM serving frameworks and then identified six novel attack vectors categorized as: (1) User-oriented Fraud Attacks, which manipulate cache entries to deliver malicious content to users via prefix cache collisions and semantic fuzzy poisoning; and (2) System Integrity Attacks, which exploit cache vulnerabilities to bypass security checks, such as using block-wise or multimodal collisions to evade content moderation.
Our experiments on leading open-source frameworks validated these attack vectors and evaluated their impact and cost.
Furthermore, we proposed five multilayer defense strategies and assessed their effectiveness.
We responsibly disclosed our findings to affected vendors, including vLLM, SGLang, GPTCache, AIBrix, rtp-llm, and LMDeploy. All of them have acknowledged the vulnerabilities, and notably, vLLM, GPTCache, and AIBrix have adopted our proposed mitigation methods and fixed their vulnerabilities.
Our findings underscore the importance of securing the caching infrastructure in the rapidly expanding LLM ecosystem.
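
To make the notion of a prefix cache collision concrete, the following is a minimal toy sketch in Python. It is not the paper's attack and not the implementation of any framework named above. The names `ToyPrefixCache`, `block_keys`, `find_colliding_block`, and the block size are hypothetical, and the hash is deliberately truncated to one byte so that a collision can be found quickly for illustration. The idea it shows is that when cache blocks are looked up by a hash of their tokens, two different prefixes that map to the same key will share a cache entry.

```python
import hashlib

BLOCK_SIZE = 4  # tokens per cache block (hypothetical value for this sketch)


def block_keys(tokens, digest_bytes):
    """Chained keys for each full block: a key covers the block and all blocks before it."""
    keys, parent = [], b""
    full = len(tokens) - len(tokens) % BLOCK_SIZE
    for i in range(0, full, BLOCK_SIZE):
        block = ",".join(map(str, tokens[i:i + BLOCK_SIZE])).encode()
        # Truncating the digest is an assumption made only to keep the demo fast.
        parent = hashlib.sha256(parent + b"|" + block).digest()[:digest_bytes]
        keys.append(parent)
    return keys


class ToyPrefixCache:
    """Maps block keys to an opaque payload standing in for cached model state."""

    def __init__(self, digest_bytes):
        self.digest_bytes = digest_bytes
        self.store = {}

    def insert(self, tokens, payload):
        for key in block_keys(tokens, self.digest_bytes):
            self.store.setdefault(key, payload)

    def lookup(self, tokens):
        """Return the payload of the longest cached prefix, or None on a miss."""
        hit = None
        for key in block_keys(tokens, self.digest_bytes):
            if key not in self.store:
                break
            hit = self.store[key]
        return hit


def find_colliding_block(target, digest_bytes, max_tries=100_000):
    """Brute-force a different block whose first key equals the target's first key."""
    target_key = block_keys(target, digest_bytes)[0]
    for n in range(max_tries):
        candidate = [n, n + 1, n + 2, n + 3]
        if candidate != target and block_keys(candidate, digest_bytes)[0] == target_key:
            return candidate
    return None


victim_prompt = [101, 102, 103, 104]
colliding_prompt = find_colliding_block(victim_prompt, digest_bytes=1)

# Weak (1-byte) keys: the victim's lookup is served from the other prompt's entry.
weak = ToyPrefixCache(digest_bytes=1)
weak.insert(colliding_prompt, "state cached for a different prompt")
weak_hit = weak.lookup(victim_prompt)

# Full 32-byte keys: the same two prompts no longer share an entry.
strong = ToyPrefixCache(digest_bytes=32)
strong.insert(colliding_prompt, "state cached for a different prompt")
strong_hit = strong.lookup(victim_prompt)
```

In this sketch `weak_hit` holds the payload inserted for the other prompt, while `strong_hit` is `None`. Real serving frameworks differ in how they derive and verify cache keys, so this only illustrates why key strength and key verification matter.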
