ALchemist: Fusing Application and Audit Logs for Precise Attack Provenance without Instrumentation

Le Yu (Purdue University), Shiqing Ma (Rutgers University), Zhuo Zhang (Purdue University), Guanhong Tao (Purdue University), Xiangyu Zhang (Purdue University), Dongyan Xu (Purdue University), Vincent E. Urias (Sandia National Laboratories), Han Wei Lin (Sandia National Laboratories), Gabriela Ciocarlie (SRI International), Vinod Yegneswaran (SRI International), Ashish Gehani (SRI International)

Cyber-attacks are becoming more persistent and complex. Most state-of-the-art attack forensics techniques either require annotating and instrumenting software applications or rely on high quality execution profile to serve as the basis for anomaly detection. We propose a novel attack forensics technique ALchemist. It is based on the observations that built-in application logs provide critical high-level semantics and audit log provides low-level fine-grained information; and the two share a lot of common elements. ALchemist is hence a log fusion technique that couples application logs and audit log to derive critical attack information invisible in either log. It is based on a relational reasoning engine Datalog and features the capabilities of inferring new relations such as the task structure of execution(e.g., tabs in firefox), especially in the presence of complex asynchronous execution models, and high-level dependencies between log events. Our evaluation on 15 popular applications including firefox, Chromium, and OpenOffice, and 14 APT attacks from the literature demonstrates that although ALchemist does not require instrumentation, it is highly effective in partitioning execution to autonomous tasks(in order to avoid bogus dependencies) and deriving precise attack provenance graphs, with very small overhead. It also outperforms NoDoze and OmegaLog, two state-of-art techniques that do not require instrumentation.