Jiayun Xu (Singapore Management University), Yingjiu Li (University of Oregon), Robert H. Deng (Singapore Management University)

A common problem in machine learning-based malware detection is that training data may contain noisy labels and it is challenging to make the training data noise-free at a large scale. To address this problem, we propose a generic framework to reduce the noise level of training data for the training of any machine learning-based Android malware detection. Our framework makes use of all intermediate states of two identical deep learning classification models during their training with a given noisy training dataset and generate a noise-detection feature vector for each input sample. Our framework then applies a set of outlier detection algorithms on all noise-detection feature vectors to reduce the noise level of the given training data before feeding it to any machine learning based Android malware detection approach. In our experiments with three different Android malware detection approaches, our framework can detect significant portions of wrong labels in different training datasets at different noise ratios, and improve the performance of Android malware detection approaches.

View More Papers

QPEP: An Actionable Approach to Secure and Performant Broadband...

James Pavur (Oxford University), Martin Strohmeier (armasuisse), Vincent Lenders (armasuisse), Ivan Martinovic (Oxford University)

Read More

FARE: Enabling Fine-grained Attack Categorization under Low-quality Labeled Data

Junjie Liang (The Pennsylvania State University), Wenbo Guo (The Pennsylvania State University), Tongbo Luo (Robinhood), Vasant Honavar (The Pennsylvania State University), Gang Wang (University of Illinois at Urbana-Champaign), Xinyu Xing (The Pennsylvania State University)

Read More

A Formal Analysis of the FIDO UAF Protocol

Haonan Feng (Beijing University of Posts and Telecommunications), Hui Li (Beijing University of Posts and Telecommunications), Xuesong Pan (Beijing University of Posts and Telecommunications), Ziming Zhao (University at Buffalo)

Read More

From Library Portability to Para-rehosting: Natively Executing Microcontroller Software...

Wenqiang Li (State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences; Department of Computer Science, the University of Georgia, USA; School of Cyber Security, University of Chinese Academy of Sciences; Department of Electrical Engineering and Computer Science, the University of Kansas, USA), Le Guan (Department of Computer Science, the University…

Read More