Qi Wang (University of Illinois Urbana-Champaign), Wajih Ul Hassan (University of Illinois Urbana-Champaign), Ding Li (NEC Laboratories America, Inc.), Kangkook Jee (University of Texas at Dallas), Xiao Yu (NEC Laboratories America, Inc.), Kexuan Zou (University Of Illinois Urbana-Champaign), Junghwan Rhee (NEC Laboratories America, Inc.), Zhengzhang Chen (NEC Laboratories America, Inc.), Wei Cheng (NEC Laboratories America, Inc.), Carl A. Gunter (University of Illinois Urbana-Champaign), Haifeng Chen (NEC Laboratories America, Inc.)

To subvert recent advances in perimeter and host security, the attacker community has developed and employed various attack vectors to make a malware much more stealthy than before to penetrate the target system and prolong its presence. The advanced malware, or stealthy malware, impersonates or abuses benign applications and legitimate system tools to minimize its footprints in the target system. One example of such stealthy malware is fileless malware, which resides its malicious logic mostly in the memory of well-trusted processes. It is difficult for traditional detection tools, such as malware scanners, to detect it, as the malware normally does not expose its malicious payload in a file and hides its malicious behaviors among the benign behaviors of the processes.

In this paper, we present PROVDETECTOR, a provenance-based approach for detecting stealthy malware. The intuition behind PROVDETECTOR is that although a stealthy malware may impersonate or abuse a benign process, it still exposes its malicious behaviors in the OS (operating system) level provenance. Based on this intuition, PROVDETECTOR first employs a novel selection algorithm to identify possibly malicious parts in the OS level provenance data of a process. Then, it applies a neural embedding and machine learning pipeline to automatically detect any behavior that deviates significantly from normal behaviors. We evaluate our approach on a large provenance dataset from an enterprise network and demonstrate that it achieves very high detection performance (an average F1 score of 0.974) of stealthy malware. Further, we conduct thorough interpretability studies to understand the internals of the learned machine learning models.

View More Papers

A Practical Approach for Taking Down Avalanche Botnets Under...

Victor Le Pochat (imec-DistriNet, KU Leuven), Tim Van hamme (imec-DistriNet, KU Leuven), Sourena Maroofi (Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG), Tom Van Goethem (imec-DistriNet, KU Leuven), Davy Preuveneers (imec-DistriNet, KU Leuven), Andrzej Duda (Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG), Wouter Joosen (imec-DistriNet, KU Leuven), Maciej Korczyński (Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG)

Read More

Data-Driven Debugging for Functional Side Channels

Saeid Tizpaz-Niari (University of Colorado Boulder), Pavol Černý (TU Wien), Ashutosh Trivedi (University of Colorado Boulder)

Read More

HotFuzz: Discovering Algorithmic Denial-of-Service Vulnerabilities Through Guided Micro-Fuzzing

William Blair (Boston University), Andrea Mambretti (Northeastern University), Sajjad Arshad (Northeastern University), Michael Weissbacher (Northeastern University), William Robertson (Northeastern University), Engin Kirda (Northeastern University), Manuel Egele (Boston University)

Read More

Prevalence and Impact of Low-Entropy Packing Schemes in the...

Alessandro Mantovani (EURECOM), Simone Aonzo (University of Genoa), Xabier Ugarte-Pedrero (Cisco Systems), Alessio Merlo (University of Genoa), Davide Balzarotti (EURECOM)

Read More