Prεεmpt: Sanitizing Sensitive Prompts for LLMs

Amrita Roy Chowdhury (University of Michigan, Ann Arbor), David Glukhov (University of Toronto), Divyam Anshumaan (University of Wisconsin), Prasad Chalasani (Langroid), Nicholas Papernot (University of Toronto), Somesh Jha (University of Wisconsin), Mihir Bellare (UCSD)

The rise of large language models (LLMs) has introduced new privacy challenges, particularly during textit{inference} where sensitive information in prompts may be exposed to proprietary LLM APIs. In this paper, we address the problem of formally protecting the sensitive information contained in a prompt while maintaining response quality. To this end, first, we introduce a cryptographically inspired notion of a textit{prompt sanitizer} which transforms an input prompt to protect its sensitive tokens. Second, we propose Pr$epsilonepsilon$mpt, a novel system that implements a prompt sanitizer, focusing on the sensitive information that can be derived solely from the individual tokens. Pr$epsilonepsilon$mpt categorizes sensitive tokens into two types: (1) those where the LLM's response depends solely on the format (such as SSNs, credit card numbers), for which we use format-preserving encryption (FPE); and (2) those where the response depends on specific values, (such as age, salary) for which we apply metric differential privacy (mDP). Our evaluation demonstrates that Pr$epsilonepsilon$mpt is a practical method to achieve meaningful privacy guarantees, while maintaining high utility compared to unsanitized prompts, and outperforming prior methods.

Paper

Slides

View More Papers

The Role of Privacy Guarantees in Voluntary Donation of...

Ruizhe Wang (University of Waterloo), Roberta De Viti (MPI-SWS), Aarushi Dubey (University of Washington), Elissa Redmiles (Georgetown University)

Crack in the Armor: Underlying Infrastructure Threats to RPKI...

Yunhao Liu (Tsinghua University & Zhongguancun Laboratory), Jessie Hui Wang (Tsinghua University & Zhongguancun Laboratory), Yuedong Xu (Fudan University), Zongpeng Li (Tsinghua University), Yangyang Wang (Tsinghua University & Zhongguancun Laboratory), Jilong Wang (Tsinghua University & Zhongguancun Laboratory)

LAPSE: Automatic, Formal Fault-Tolerant Correctness Proofs for Native Code

Charles Averill, Ilan Buzzetti (The University of Texas at Dallas), Alex Bellon (UC San Diego), Kevin Hamlen (The University of Texas at Dallas)