Wen-jie Lu (Ant Group), Zhicong Huang (Ant Group), Zhen Gu (Alibaba Group), Jingyu Li (Ant Group & Zhejiang University), Jian Liu (Zhejiang University), Cheng Hong (Ant Group), Kui Ren (Zhejiang University), Tao Wei (Ant Group), WenGuang Chen (Ant Group)

Large transformer-based models achieve state-of-the-art performance on many real-world tasks in areas such as natural language processing and computer vision.
However, with the increasing sensitivity of the data and tasks they handle, privacy has become a major concern during model deployment.
In this work, we focus on private inference in two-party settings, where one party holds private inputs and the other holds the model.
We introduce BumbleBee, a fast and communication-friendly two-party private transformer inference system.
Our contributions are three-fold:
First, we propose optimized protocols for matrix multiplication, which reduce communication costs by 80% to 90% compared to previous techniques.
Second, we develop a methodology for constructing efficient protocols tailored to the non-linear activation functions employed in transformer models.
The proposed activation protocols run significantly faster and reduce communication costs by 80% to 95% compared with two prior methods.
Finally, we perform extensive benchmarks on five transformer models.
BumbleBee can evaluate the LLaMA-7B model, generating one token in approximately 8 minutes using CPUs.
Our results further show that BumbleBee outperforms Iron (NeurIPS22) by over an order of magnitude and is three times faster than BOLT (Oakland24) while using one-tenth of the communication.
