Marius Vangeli (KTH Royal Institute of Technology, Sweden), Joel Brynielsson (KTH Royal Institute of Technology, Sweden and FOI Swedish Defence Research Agency, Sweden), Mika Cohen (KTH Royal Institute of Technology, Sweden and FOI Swedish Defence Research Agency, Sweden), Farzad Kamrani (FOI Swedish Defence Research Agency, Sweden)

While large language model (LLM)-driven penetration testing is rapidly improving, autonomous agents still struggle with longer-duration multi-stage exploits. As agents perform reconnaissance, attempt exploits, and pivot through systems, the token context window fills up with exploration and failed attempts, degrading decision quality. We introduce context handoff for autonomous penetration testing (CHAP), a context-relay system for LLM-driven agents. CHAP enables agents to sustain long-running penetration tests by transferring accumulated knowledge as compact protocols to fresh agent instances.

We evaluate CHAP on an extended version of the AutoPen- Bench benchmark, targeting 11 real-world vulnerabilities. CHAP improved per-run success from 27.3% to 36.4% while reducing token expenditure by 32.4% compared to a baseline agent. We release our full implementation, benchmark enhancements, and a dataset of command logs with LLM reasoning traces.

View More Papers

WIP: Runtime Consistency Enforcement Between SBOM and Software Execution

Yuta Shimamoto (Okayama University, Okayama, Japan), Hiroyuki Uekawa (NTT Social Informatics Laboratories, Tokyo, Japan), Mitsuaki Akiyama (NTT Social Informatics Laboratories, Tokyo, Japan), Toshihiro Yamauchi (Okayama University, Okayama, Japan)

Read More

Indicator of Benignity: An Industry View of False Positive...

Daiping Liu (Palo Alto Networks, Inc.), Danyu Sun (University of California, Irvine), Zhenhua Chen (Palo Alto Networks, Inc.), Shu Wang (Palo Alto Networks, Inc.), Zhou Li (University of California, Irvine)

Read More

Cross-Boundary Mobile Tracking: Exploring Java-to-JavaScript Information Diffusion in WebViews

Sohom Datta (North Carolina State University, USA), Michalis Diamantaris (TTechnical University of Crete, Greece), Ahsan Zafar (North Carolina State University, USA), Junhua Su (North Carolina State University, USA), Anupam Das (North Carolina State University, USA), Jason Polakis (University of Illinois Chicago, USA), Alexandros Kapravelos (North Carolina State University, USA)

Read More