Wen-jie Lu (Ant Group), Zhicong Huang (Ant Group), Zhen Gu (Alibaba Group), Jingyu Li (Ant Group & Zhejiang University), Jian Liu (Zhejiang University), Cheng Hong (Ant Group), Kui Ren (Zhejiang University), Tao Wei (Ant Group), WenGuang Chen (Ant Group)

Large transformer-based models have achieved state-of-the-art performance on many real-world tasks in areas such as natural language processing and computer vision.
However, with the increasing sensitivity of the data and tasks they handle, privacy has become a major concern during model deployment.
In this work, we focus on private inference in two-party settings, where one party holds private inputs and the other holds the model.
We introduce BumbleBee, a fast and communication-friendly two-party private transformer inference system.
Our contributions are three-fold:
First, we propose optimized protocols for matrix multiplication, which reduce communication costs by 80% -- 90% compared to previous techniques.
Second, we develop a methodology for constructing efficient protocols tailored to the non-linear activation functions employed in transformer models.
The proposed activation protocols run significantly faster and reduce communication costs by 80% -- 95% compared with two prior methods.
Lastly, we have performed extensive benchmarks on five transformer models.
BumbleBee demonstrates its capability by evaluating the LLaMA-7B model, generating one token in approximately 8 minutes using CPUs.
Our results further show that BumbleBee outperforms Iron (NeurIPS22) by over an order of magnitude and is three times faster than BOLT (Oakland24) while using one-tenth of the communication.
