Dark Hazard: Learning-based, Large-Scale Discovery of Hidden Sensitive Operations in Android Apps
Author(s): Xiaorui Pan, Xueqiang Wang, Yue Duan, XiaoFeng Wang, Heng Yin
Download: Paper (PDF)
Date: 27 Feb 2017
Document Type: Reports
Additional Documents: Slides Video
Associated Event: NDSS Symposium 2017
Hidden sensitive operations (HSO) such as stealing privacy user data upon receiving an SMS message are increasingly utilized by mobile malware and other potentially-harmful apps (PHAs) to evade detection. Identification of such behaviors is hard, due to the challenge in triggering them during an app s runtime. Current static approaches rely on the trigger conditions or hidden behaviors known beforehand and therefore cannot capture previously unknown HSO activities. Also these techniques tend to be computationally intensive and therefore less suitable for analyzing a large number of apps. As a result, our understanding of real-world HSO today is still limited, not to mention effective means to mitigate this threat.
In this paper, we present HSOMINER, an innovative machinelearning based program analysis technique that enables a largescale discovery of unknown HSO activities. Our approach leverages a set of program features that characterize an HSO branch1 and can be relatively easy to extract from an app. These features summarize a set of unique observations about an HSO condition, its paths and the relations between them, and are designed to be general for finding hidden suspicious behaviors. Particularly, we found that a trigger condition is less likely to relate to the path of its branch through data flows or shared resources, compared with a legitimate branch. Also, the behaviors exhibited by the two paths of an HSO branch tend to be conspicuously different (innocent on one side and sinister on the other). Most importantly, even though these individual features are not sufficiently accurate for capturing HSO on their own, collectively they are shown to be highly effective in identifying such behaviors. This differentiating power is harnessed by HSOMINER to classify Android apps, which achieves a high precision (98%) and coverage (94%), and is also efficient as discovered in our experiments. The new tool was further used in a measurement study involving 338,354 realworld apps, the largest one ever conducted on suspicious hidden operations. Our research brought to light the pervasiveness of HSO activities, which are present in 18.7% of the apps we analyzed, surprising trigger conditions (e.g., click on a certain region of a view) and behaviors (e.g., hiding operations in a dynamically generated receiver), which help better understand the problem and contribute to more effective defense against this new threat to the mobile platform.