NDSS

CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples

Honggang Yu (University of Florida), Kaichen Yang (University of Florida), Teng Zhang (University of Central Florida), Yun-Yun Tsai (National Tsing Hua University), Tsung-Yi Ho (National Tsing Hua University), Yier Jin (University of Florida)

Cloud-based Machine Learning as a Service (MLaaS) is gradually gaining acceptance as a reliable solution for various real-life scenarios. These services typically utilize Deep Neural Networks (DNNs) to perform classification and detection tasks and are accessed through Application Programming Interfaces (APIs). Unfortunately, an adversary can steal models from cloud-based platforms, even under black-box constraints, by repeatedly querying the public prediction API with malicious inputs. In this paper, we introduce an effective and efficient black-box attack methodology that extracts large-scale DNN models from cloud-based platforms with near-perfect performance. In comparison to existing attack methods, we significantly reduce the number of queries required to steal the target model by incorporating several novel algorithms, including active learning, transfer learning, and adversarial attacks. In our experimental evaluations, we validate the proposed method by conducting theft attacks on various commercialized MLaaS platforms, including two Microsoft Custom Vision APIs (Microsoft Traffic Recognition API and Microsoft Flower Recognition API), the Face++ Emotion Recognition API, the IBM Watson Visual Recognition API, the Google AutoML API, and the Clarifai Safe for Work (NSFW) API. Our results demonstrate that the proposed method can easily steal large-scale DNN models from these cloud platforms. Further, the proposed attack method can also be used to accurately evaluate the robustness of DNN-based MLaaS image classifiers against theft attacks.