Akshat Singh Jaswal (Stux Labs), Ashish Baghel (Stux Labs)

Modern web applications are increasingly produced through AI-assisted development and rapid no-code deployment pipelines, widening the gap between accelerating software velocity and the limited adaptability of existing security tooling. Pattern-driven scanners fail to reason about novel contexts, while emerging LLM-based penetration testers rely on unconstrained exploration, yielding high cost, unstable behavior, and poor reproducibility.

We introduce AWE, a memory-augmented multi-agent framework for autonomous web penetration testing that embeds structured, vulnerability-specific analysis pipelines within a lightweight LLM orchestration layer. Unlike general-purpose agents, AWE couples context-aware payload mutation and generation with persistent memory and browser-backed verification to produce deterministic, exploitation-driven results.

Evaluated on the 104-challenge XBOW benchmark, AWE achieves substantial gains on injection-class vulnerabilities, with 87% XSS success (+30.5% over MAPTA) and 66.7% blind SQL injection success (+33.3%), while being much faster, cheaper, and more token-efficient than MAPTA, despite using a mid-tier model (Claude Sonnet 4) versus MAPTA’s GPT-5. MAPTA retains higher overall coverage due to its broader exploratory capabilities, underscoring the complementary strengths of specialized and general-purpose architectures. Our results demonstrate that architecture matters as much as model reasoning capability: integrating LLMs into principled, vulnerability-aware pipelines yields substantial gains in accuracy, efficiency, and determinism for injection-class exploits. The source code for AWE is available at: https://github.com/stuxlabs/AWE
