Jiayun Xu (Singapore Management University), Yingjiu Li (University of Oregon), Robert H. Deng (Singapore Management University)

A common problem in machine learning-based malware detection is that training data may contain noisy labels and it is challenging to make the training data noise-free at a large scale. To address this problem, we propose a generic framework to reduce the noise level of training data for the training of any machine learning-based Android malware detection. Our framework makes use of all intermediate states of two identical deep learning classification models during their training with a given noisy training dataset and generate a noise-detection feature vector for each input sample. Our framework then applies a set of outlier detection algorithms on all noise-detection feature vectors to reduce the noise level of the given training data before feeding it to any machine learning based Android malware detection approach. In our experiments with three different Android malware detection approaches, our framework can detect significant portions of wrong labels in different training datasets at different noise ratios, and improve the performance of Android malware detection approaches.

View More Papers

SquirRL: Automating Attack Analysis on Blockchain Incentive Mechanisms with...

Charlie Hou (CMU, IC3), Mingxun Zhou (Peking University), Yan Ji (Cornell Tech, IC3), Phil Daian (Cornell Tech, IC3), Florian Tramèr (Stanford University), Giulia Fanti (CMU, IC3), Ari Juels (Cornell Tech, IC3)

Read More

FlowLens: Enabling Efficient Flow Classification for ML-based Network Security...

Diogo Barradas (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa), Nuno Santos (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa), Luis Rodrigues (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa), Salvatore Signorello (LASIGE, Faculdade de Ciências, Universidade de Lisboa), Fernando M. V. Ramos (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa), André Madeira (INESC-ID, Instituto Superior Técnico, Universidade de…

Read More

Processing Dangerous Paths – On Security and Privacy of...

Jens Müller (Ruhr University Bochum), Dominik Noss (Ruhr University Bochum), Christian Mainka (Ruhr University Bochum), Vladislav Mladenov (Ruhr University Bochum), Jörg Schwenk (Ruhr University Bochum)

Read More

Data Poisoning Attacks to Deep Learning Based Recommender Systems

Hai Huang (Tsinghua University), Jiaming Mu (Tsinghua University), Neil Zhenqiang Gong (Duke University), Qi Li (Tsinghua University), Bin Liu (West Virginia University), Mingwei Xu (Tsinghua University)

Read More