Ruixuan Liu (Emory University), Toan Tran (Emory University), Tianhao Wang (University of Virginia), Hongsheng Hu (Shanghai Jiao Tong University), Shuo Wang (Shanghai Jiao Tong University), Li Xiong (Emory University)

As large language models increasingly memorize web-scraped training content, they risk exposing copyrighted or private information. Existing protections require compliance from crawlers or model developers, fundamentally limiting their effectiveness. We propose ExpShield, a proactive self-guard that mitigates memorization while maintaining readability via invisible perturbations, and we formulate it as a constrained optimization problem. Due to the lack of an individual-level risk metric for natural text, we first propose instance exploitation, a metric that measures how much training on a specific text increases the chance of guessing that text from a set of candidates—with zero indicating perfect defense. Directly solving the problem is infeasible for defenders without sufficient knowledge, thus we develop two effective proxy solutions: single-level optimization and synthetic perturbation. To enhance the defense, we reveal and verify the memorization trigger hypothesis, which can help to identify key tokens for memorization. Leveraging this insight, we design targeted perturbations that (i) neutralize inherent trigger tokens to reduce memorization and (ii) introduce artificial trigger tokens to misdirect model memorization. Experiments validate our defense across attacks, model scales, and tasks in language and vision-to-language modeling. Even with privacy backdoor, the Membership Inference Attack (MIA) AUC drops from 0.95 to 0.55 under the defense, and the instance exploitation approaches zero. This suggests that compared to the ideal no-misuse scenario, the risk of exposing a text instance remains nearly unchanged despite its inclusion in the training data.

View More Papers

ropbot: Reimaging Code Reuse Attack Synthesis

Kyle Zeng (Arizona State University), Moritz Schloegel (CISPA Helmholtz Center for Information Security), Christopher Salls (UC Santa Barbara), Adam Doupé (Arizona State University), Ruoyu Wang (Arizona State University), Yan Shoshitaishvili (Arizona State University), Tiffany Bao (Arizona State University)

Read More

NetRadar: Enabling Robust Carpet Bombing DDoS Detection

Junchen Pan (Tsinghua University), Lei Zhang (Zhongguancun Laboratory), Xiaoyong Si (Tencent Technology (Shenzhen) Company Limited), Jie Zhang (Tsinghua University), Xinggong Zhang (Peking University), Yong Cui (Tsinghua University)

Read More

Discovering Blind-Trust Vulnerabilities in PLC Binaries via State Machine...

Fangzhou Dong (Arizona State University), Arvind S Raj (Arizona State University), Efrén López-Morales (New Mexico State University), Siyu Liu (Arizona State University), Yan Shoshitaishvili (Arizona State University), Tiffany Bao (Arizona State University), Adam Doupé (Arizona State University), Muslum Ozgur Ozmen (Arizona State University), Ruoyu Wang (Arizona State University)

Read More