Chenxu Wang (Southern University of Science and Technology (SUSTech) and The Hong Kong Polytechnic University), Fengwei Zhang (Southern University of Science and Technology (SUSTech)), Yunjie Deng (Southern University of Science and Technology (SUSTech)), Kevin Leach (Vanderbilt University), Jiannong Cao (The Hong Kong Polytechnic University), Zhenyu Ning (Hunan University), Shoumeng Yan (Ant Group), Zhengyu He (Ant Group)

Confidential computing is an emerging technique that provides users and third-party developers with an isolated and transparent execution environment. To support this technique, Arm introduced the Confidential Computing Architecture (CCA), which creates multiple isolated address spaces, known as realms, to ensure data confidentiality and integrity in security-sensitive tasks. Arm recently proposed the concept of confidential computing on GPU hardware, which is widely used in general-purpose, high-performance, and artificial intelligence computing scenarios. However, hardware and firmware supporting confidential GPU workloads remain unavailable. Existing studies leverage Trusted Execution Environments (TEEs) to secure GPU computing on Arm- or Intel-based platforms, but they are not suitable for CCA's realm-style architecture, such as using incompatible hardware or introducing a large trusted computing base (TCB). Therefore, there is a need to complement existing Arm CCA capabilities with GPU acceleration.

To address this challenge, we present CAGE to support confidential GPU computing for Arm CCA. By leveraging the existing security features in Arm CCA, CAGE ensures data security during confidential computing on unified-memory GPUs, the mainstream accelerators in Arm devices. To adapt the GPU workflow to CCA's realm-style architecture, CAGE proposes a novel shadow task mechanism to manage confidential GPU applications flexibly. Additionally, CAGE leverages the memory isolation mechanism in Arm CCA to protect data confidentiality and integrity from the strong adversary. Based on this, CAGE also optimizes security operations in memory isolation to mitigate performance overhead. Without hardware changes, our approach uses the generic hardware security primitives in Arm CCA to defend against a privileged adversary. We present two prototypes to verify CAGE's functionality and evaluate performance, respectively. Results show that CAGE effectively provides GPU support for Arm CCA with an average of 2.45% performance overhead.

View More Papers

Not your Type! Detecting Storage Collision Vulnerabilities in Ethereum...

Nicola Ruaro (University of California, Santa Barbara), Fabio Gritti (University of California, Santa Barbara), Robert McLaughlin (University of California, Santa Barbara), Ilya Grishchenko (University of California, Santa Barbara), Christopher Kruegel (University of California, Santa Barbara), Giovanni Vigna (University of California, Santa Barbara)

Read More

Beyond the Surface: Uncovering the Unprotected Components of Android...

Hao Zhou (The Hong Kong Polytechnic University), Shuohan Wu (The Hong Kong Polytechnic University), Chenxiong Qian (University of Hong Kong), Xiapu Luo (The Hong Kong Polytechnic University), Haipeng Cai (Washington State University), Chao Zhang (Tsinghua University)

Read More

ShapFuzz: Efficient Fuzzing via Shapley-Guided Byte Selection

Kunpeng Zhang (Shenzhen International Graduate School, Tsinghua University), Xiaogang Zhu (Swinburne University of Technology), Xi Xiao (Shenzhen International Graduate School, Tsinghua University), Minhui Xue (CSIRO's Data61), Chao Zhang (Tsinghua University), Sheng Wen (Swinburne University of Technology)

Read More