Chenxu Wang (Southern University of Science and Technology (SUSTech) and The Hong Kong Polytechnic University), Fengwei Zhang (Southern University of Science and Technology (SUSTech)), Yunjie Deng (Southern University of Science and Technology (SUSTech)), Kevin Leach (Vanderbilt University), Jiannong Cao (The Hong Kong Polytechnic University), Zhenyu Ning (Hunan University), Shoumeng Yan (Ant Group), Zhengyu He (Ant…

Confidential computing is an emerging technique that provides users and third-party developers with an isolated and transparent execution environment. To support this technique, Arm introduced the Confidential Computing Architecture (CCA), which creates multiple isolated address spaces, known as realms, to ensure data confidentiality and integrity in security-sensitive tasks. Arm recently proposed the concept of confidential computing on GPU hardware, which is widely used in general-purpose, high-performance, and artificial intelligence computing scenarios. However, hardware and firmware supporting confidential GPU workloads remain unavailable. Existing studies leverage Trusted Execution Environments (TEEs) to secure GPU computing on Arm- or Intel-based platforms, but they are not suitable for CCA's realm-style architecture, such as using incompatible hardware or introducing a large trusted computing base (TCB). Therefore, there is a need to complement existing Arm CCA capabilities with GPU acceleration.

To address this challenge, we present CAGE to support confidential GPU computing for Arm CCA. By leveraging the existing security features in Arm CCA, CAGE ensures data security during confidential computing on unified-memory GPUs, the mainstream accelerators in Arm devices. To adapt the GPU workflow to CCA's realm-style architecture, CAGE proposes a novel shadow task mechanism to manage confidential GPU applications flexibly. Additionally, CAGE leverages the memory isolation mechanism in Arm CCA to protect data confidentiality and integrity from the strong adversary. Based on this, CAGE also optimizes security operations in memory isolation to mitigate performance overhead. Without hardware changes, our approach uses the generic hardware security primitives in Arm CCA to defend against a privileged adversary. We present two prototypes to verify CAGE's functionality and evaluate performance, respectively. Results show that CAGE effectively provides GPU support for Arm CCA with an average of 2.45% performance overhead.

View More Papers

ReqsMiner: Automated Discovery of CDN Forwarding Request Inconsistencies and...

Linkai Zheng (Tsinghua University), Xiang Li (Tsinghua University), Chuhan Wang (Tsinghua University), Run Guo (Tsinghua University), Haixin Duan (Tsinghua University; Quancheng Laboratory), Jianjun Chen (Tsinghua University; Zhongguancun Laboratory), Chao Zhang (Tsinghua University; Zhongguancun Laboratory), Kaiwen Shen (Tsinghua University)

Read More

MacOS versus Microsoft Windows: A Study on the Cybersecurity...

Cem Topcuoglu (Northeastern University), Andrea Martinez (Florida International University), Abbas Acar (Florida International University), Selcuk Uluagac (Florida International University), Engin Kirda (Northeastern University)

Read More

AutoWatch: Learning Driver Behavior with Graphs for Auto Theft...

Paul Agbaje, Abraham Mookhoek, Afia Anjum, Arkajyoti Mitra (University of Texas at Arlington), Mert D. Pesé (Clemson University), Habeeb Olufowobi (University of Texas at Arlington)

Read More

DorPatch: Distributed and Occlusion-Robust Adversarial Patch to Evade Certifiable...

Chaoxiang He (Huazhong University of Science and Technology), Xiaojing Ma (Huazhong University of Science and Technology), Bin B. Zhu (Microsoft Research), Yimiao Zeng (Huazhong University of Science and Technology), Hanqing Hu (Huazhong University of Science and Technology), Xiaofan Bai (Huazhong University of Science and Technology), Hai Jin (Huazhong University of Science and Technology), Dongmei Zhang…

Read More