Jiongchi Yu (Singapore Management University), Xiaofei Xie (Singapore Management University), Qiang Hu (Tianjin University), Yuhan Ma (Tianjin University), Ziming Zhao (Zhejiang University)

Insider threats represent a significant and persistent security risk, yet remain difficult to detect in complex enterprise environments, where malicious activities are often concealed within subtle user behaviors. While machine-learning-based insider threat detection (ITD) techniques have shown promising results, their effectiveness is fundamentally constrained by the lack of high-quality and realistic training data. This challenge has two causes. First, enterprise internal data is highly sensitive and rarely accessible. Second, existing datasets are limited: public datasets are typically small in scale, and synthetic datasets often lack sufficient generalization, rich semantic context, and realistic behavioral patterns.

To address this challenge, we propose Chimera, a large language model (LLM)-based multi-agent framework that automatically simulates both benign and malicious insider activities and monitors comprehensive system logs across diverse enterprise environments. Chimera models each agent as an individual employee with fine-grained roles and incorporates group meetings, pairwise interactions, and self-organized scheduling to capture realistic organizational dynamics. Based on 15 insider attack types abstracted from real-world incidents, we deploy Chimera in three representative data-sensitive organizational scenarios and construct a new dataset, ChimeraLog, to support the development and evaluation of ITD methods.

We evaluate ChimeraLog through comprehensive human studies and quantitative analyses, demonstrating its diversity and realism. Experiments with existing ITD methods show that detection performance on ChimeraLog is substantially lower than on existing ITD datasets, indicating a more challenging and realistic benchmark. Despite distribution shifts, ITD models trained on ChimeraLog exhibit strong generalization capability, highlighting the practical value of LLM-based multi-agent simulation for advancing ITD.
