Qizhi Cai (Zhejiang University), Lingzhi Wang (Northwestern University), Yao Zhu (Zhejiang University), Zhipeng Chen (Zhejiang University), Xiangmin Shen (Hofstra University), Zhenyuan Li (Zhejiang University)

In recent years, provenance-based intrusion detection and forensic systems have attracted significant attention, and related research has grown rapidly. However, progress in this area has been hindered by a long-standing lack of up-to-date datasets and benchmarks. Existing datasets suffer from several critical limitations, including outdated attack techniques, short temporal scales, and incomplete or fragmented attack chains. As a result, they fail to capture the characteristics of recent real-world Advanced Persistent Threat (APT) attacks. Moreover, the attack procedures underlying existing datasets are unclear and coarse-grained, which makes accurate labeling and reliable evaluation difficult. Consequently, the absence of a comprehensive, up-to-date dataset has become a major bottleneck for the field. To address this gap, we present our efforts in building a large-scale, diverse, and well-annotated dataset for provenance-based intrusion analysis. Our dataset is generated using an automated attack emulation framework that incorporates recent attack techniques and supports fine-grained ground-truth labeling. Using this dataset, we conduct a comprehensive evaluation of state-of-the-art provenance-based intrusion detection systems, revealing weaknesses that cannot be effectively benchmarked with existing datasets. Our results demonstrate the dataset’s value in enabling clearer, more informative evaluations and highlight its potential to advance future research in provenance-based intrusion detection and graph-based security analysis.
