Wen-jie Lu (Ant Group), Zhicong Huang (Ant Group), Zhen Gu (Alibaba Group), Jingyu Li (Ant Group & Zhejiang University), Jian Liu (Zhejiang University), Cheng Hong (Ant Group), Kui Ren (Zhejiang University), Tao Wei (Ant Group), WenGuang Chen (Ant Group)

Large transformer-based models have realized state-of-the-art performance on lots of real-world tasks such as natural language processing and computer vision.
However, with the increasing sensitivity of the data and tasks they handle, privacy has become a major concern during model deployment.
In this work, we focus on private inference in two-party settings, where one party holds private inputs and the other holds the model.
We introduce BumbleBee, a fast and communication-friendly two-party private transformer inference system.
Our contributions are three-fold:
First, we propose optimized protocols for matrix multiplication, which significantly reduce communication costs by 80% -- 90% compared to previous techniques.
Secondly, we develop a methodology for constructing efficient protocols tailored to the non-linear activation functions employed in transformer models.
The proposed activation protocols have realized a significant enhancement in processing speed, alongside a remarkable reduction in communication costs by 80% -- 95% compared with two prior methods.
Lastly, we have performed extensive benchmarks on five transformer models.
BumbleBee demonstrates its capability by evaluating the LLaMA-7B model, generating one token in approximately 8 minutes using CPUs.
Our results further reveal that BumbleBee outperforms Iron (NeurIPS22) by over an order of magnitude and is three times faster than BOLT (Oakland24) with one-tenth communication.

View More Papers

Vision: Comparison of AI-assisted Policy Development Between Professionals and...

Rishika Thorat (Purdue University), Tatiana Ringenberg (Purdue University)

Read More

Horcrux: Synthesize, Split, Shift and Stay Alive; Preventing Channel...

Anqi Tian (Institute of Software, Chinese Academy of Sciences; School of Computer Science and Technology, University of Chinese Academy of Sciences), Peifang Ni (Institute of Software, Chinese Academy of Sciences; Zhongguancun Laboratory, Beijing, P.R.China), Yingzi Gao (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Jing Xu (Institute of Software, Chinese…

Read More

L-HAWK: A Controllable Physical Adversarial Patch Against a Long-Distance...

Taifeng Liu (Xidian University), Yang Liu (Xidian University), Zhuo Ma (Xidian University), Tong Yang (Peking University), Xinjing Liu (Xidian University), Teng Li (Xidian University), Jianfeng Ma (Xidian University)

Read More