Shreyash Tiwari (Computer and Information Science, University of Massachusetts Dartmouth), Nathaniel D. Bastian (Electrical Engineering and Computer Science, United States Military Academy), Gokhan Kul (Computer and Information Science, University of Massachusetts Dartmouth)
Intrusion Detection Systems (IDS) remain vulnerable to zero-day attacks, which manifest as previously unseen traffic patterns. Traditional neural IDS models, constrained by closed-world assumptions, often misclassify such traffic as benign, leading to significant security risks. We present DQN-IDS, a deep reinforcement learning framework that integrates a Convolutional Neural Network (CNN) for feature extraction with a Deep Q-Network (DQN) for uncertainty-aware decision-making. Unlike threshold-based open-set methods, DQN-IDS dynamically learns to separate known and unknown traffic using three softmax-derived confidence metrics (maximum probability, probability gap, and entropy) as its state representation. Evaluated on the CICIDS-2017 and UNSW-NB15 datasets, the proposed system achieves a binary F1-score of 97.8% (known vs. unknown) and reduces missed zero-day traffic compared to state-of-the-art threshold-based approaches. The DQN stage introduces negligible runtime overhead relative to CNN inference, yielding a deployable two-stage open-set network IDS (NIDS) suitable for IoT and other resource-constrained environments.
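As an illustration of the state representation named in the abstract, the following minimal sketch derives the three confidence metrics from a classifier's output logits. It is not the authors' implementation. The function name `confidence_state` is hypothetical, and two details are assumptions the abstract does not specify: the probability gap is taken as the difference between the two largest softmax probabilities, and the entropy is the Shannon entropy in nats.

```python
import math


def confidence_state(logits):
    """Return (max_prob, prob_gap, entropy) for one vector of class logits.

    Hypothetical helper: the three values mirror the confidence metrics
    listed in the abstract, under the assumptions stated in the text above.
    """
    if len(logits) < 2:
        raise ValueError("need at least two classes to compute a gap")

    # Numerically stable softmax: shift by the largest logit first.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    ranked = sorted(probs, reverse=True)
    max_prob = ranked[0]
    # Assumption: gap = top-1 probability minus top-2 probability.
    prob_gap = ranked[0] - ranked[1]
    # Assumption: Shannon entropy in nats; zero-probability terms contribute 0.
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return max_prob, prob_gap, entropy


# A sharply peaked output versus a maximally ambiguous one.
confident = confidence_state([8.0, 0.5, 0.1])
ambiguous = confidence_state([1.0, 1.0, 1.0])
```

A peaked softmax yields a high maximum probability, a large gap, and low entropy, while a flat softmax yields the opposite. A fixed-threshold detector would cut on one of these values directly; in the approach described here, the triple is instead passed to the DQN as its state, and the agent learns the known-versus-unknown decision.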