Qiao Zhang (Old Dominion University), Chunsheng Xin (Old Dominion University), Hongyi Wu (Old Dominion University)

Machine Learning as a Service (MLaaS) is enabling a wide range of smart applications on end devices. However, privacy remains a fundamental challenge. Schemes that combine Homomorphic Encryption (HE)-based linear computation with Garbled Circuit (GC)-based nonlinear computation have demonstrated superior performance for privacy-preserved MLaaS. Nevertheless, there is still a significant gap in computation speed. Our investigation has found that the HE-based linear computation dominates the total computation time for state-of-the-art deep neural networks. Furthermore, the most time-consuming component of the HE-based linear computation is a series of Permutation (Perm) operations that are imperative for the dot product and convolution in privacy-preserved MLaaS. This work focuses on a deep optimization of the HE-based linear computation to minimize the Perm operations and thus substantially reduce the overall computation time. To this end, we propose GALA: Greedy ComputAtion for Linear Algebra in privacy-preserved neural networks, which views the HE-based linear computation as a series of Homomorphic Add, Mult, and Perm operations and chooses the least expensive operation in each linear computation step to reduce the overall cost. GALA makes the following contributions: (1) It introduces a row-wise weight matrix encoding, combined with the share generation needed for the GC-based nonlinear computation, to reduce the Perm operations for the dot product; (2) It designs a first-Add-second-Perm approach (named kernel grouping) to reduce the Perm operations for convolution. As such, GALA efficiently reduces the cost of the HE-based linear computation, which is a critical building block in almost all recent frameworks for privacy-preserved neural networks, including GAZELLE (Usenix Security’18), DELPHI (Usenix Security’20), and CrypTFlow2 (CCS’20). With its deep optimization of the HE-based linear computation, GALA can be integrated as a plug-and-play module into these systems to further boost their efficiency. Our experiments show that GALA achieves a significant speedup of up to 700× for the dot product and 14× for the convolution under different data dimensions. Meanwhile, GALA demonstrates an encouraging runtime boost of 2.5×, 2.7×, 3.2×, 8.3×, 7.7×, and 7.5× over GAZELLE and 6.5×, 6×, 5.7×, 4.5×, 4.2×, and 4.1× over CrypTFlow2, on AlexNet, VGG, ResNet-18, ResNet-50, ResNet-101, and ResNet-152, respectively.
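To give a flavor of the first-Add-second-Perm (kernel grouping) idea, the following toy Python sketch simulates SIMD-packed ciphertext slots with plain lists. It is a minimal illustration under stated assumptions, not GAZELLE's or GALA's actual implementation: `perm`, `add`, and the `PERM_COUNT` counter are hypothetical stand-ins for the homomorphic Perm and Add operations, and the example rests only on the algebraic fact that a slot rotation distributes over slot-wise addition, so adding first and rotating once gives the same result as rotating every operand.

```python
# Toy simulation of SIMD-packed ciphertext slots using plain Python lists.
# perm/add are illustrative stand-ins for homomorphic Perm/Add, not a
# real HE library API; the packing layout is hypothetical.
from typing import List

PERM_COUNT = 0  # counts simulated Perm operations for cost comparison

def perm(ct: List[int], k: int) -> List[int]:
    """Rotate slots left by k positions (stand-in for a homomorphic Perm)."""
    global PERM_COUNT
    PERM_COUNT += 1
    return ct[k:] + ct[:k]

def add(a: List[int], b: List[int]) -> List[int]:
    """Slot-wise addition (stand-in for a homomorphic Add)."""
    return [x + y for x, y in zip(a, b)]

# Partial results (e.g., per-channel convolution sums) that all require
# the same rotation amount k before accumulation into the output slots.
partials = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
k = 1

# Baseline (Perm-then-Add): rotate every partial, then sum -> 3 Perms.
PERM_COUNT = 0
acc = perm(partials[0], k)
for p in partials[1:]:
    acc = add(acc, perm(p, k))
baseline_perms, baseline_result = PERM_COUNT, acc

# Kernel grouping (first-Add-second-Perm): sum first, rotate once -> 1 Perm.
PERM_COUNT = 0
grouped = partials[0]
for p in partials[1:]:
    grouped = add(grouped, p)
grouped = perm(grouped, k)

# Rotation distributes over slot-wise addition, so the results agree.
assert grouped == baseline_result
print(f"baseline Perms: {baseline_perms}, grouped Perms: {PERM_COUNT}")
# baseline Perms: 3, grouped Perms: 1
```

In this toy cost model, accumulating g partial results that share the same rotation amount drops from g Perm operations to one; since Perm is far more expensive than Add in HE schemes, grouping kernels so that additions happen before the rotation is where the convolution savings come from.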
