Anna Ablove (University of Michigan), Shreyas Chandrashekaran (University of Michigan), Xiao Qiang (University of California at Berkeley), Roya Ensafi (University of Michigan)

From network-level censorship by the Great Firewall to platform-specific mechanisms implemented by third-party services like TOM-Skype and WeChat, Internet censorship in China has continually evolved in response to new technologies. In the current era of AI, emerging tools like Large Language Models (LLMs) are no exception. Yet, ensuring compliance with China’s strict, legally mandated censorship standards presents a unique and complex challenge for service providers. While current research on content moderation in LLMs focuses primarily on alignment techniques, these techniques are not reliable enough to ensure compliance with strictly enforced information controls.

In this work, we present the first study of overt blocking embedded in Chinese LLM services. We leverage information leaks in the communication between the server and client during active chat sessions to identify where blocking decisions are embedded within the LLM services' workflow. We observe a persistent reliance on traditional, dated blocking strategies in prominent services: Baidu-Chat, DeepSeek, Doubao, Kimi, and Qwen. We find blocking placements during the input, output, and search phases, with the latter two leaking varying amounts of censored information to client machines, including near-complete responses and search references not rendered in the browser.

Seeing the need to balance competition on the global stage with homegrown censorship restrictions, we observe in real time the concessions made by service providers hosting models at war with themselves. Through this work, we emphasize the importance of a more holistic threat model of LLM content accessibility, integrating live deployments to study access as it pertains to real-world usage, especially in heavily censored regions.
