Local LLMs for NL2Bash: A Large-Scale Open-Source Model Evaluation for Bash Command Generation

Jef Jacobs (DistriNet, KU Leuven), Jorn Lapon (DistriNet, KU Leuven), Vincent Naessens (DistriNet, KU Leuven)

Large Language Models (LLMs) are increasingly used as autonomous agents in domains such as cybersecurity and system administration. The performance of these agents depends heavily on their ability to interact effectively with operating systems, often through Bash commands. Current implementations primarily rely on proprietary cloud-based models, which raise privacy and data confidentiality concerns when deployed in real-world environments. Locally hosted open-source LLMs offer a promising alternative, but their performance for such tasks remains unclear.

This paper presents an empirical evaluation of 22 open-source language models (ranging from 1B to 32B parameters) on Natural Language–to–Bash (NL2Bash) translation tasks. We introduce an improved scoring system for assessing task success and analyze performance under 10 distinct prompting techniques. Our findings show that Qwen3 models achieve strong results in NL2Bash tasks, that role-play prompting significantly benefits most models, and that Chain-of-Thought and Retrieval-Augmented Generation (RAG) can surprisingly hurt local model performance if not carefully designed. We further observe that the impact of prompting strategies varies with model size.
