Targeted Password Guessing Using k-Nearest Neighbors

Zhen Li (Nankai University), Ding Wang (Nankai University)

As the number of password-protected accounts per user keeps growing, users are increasingly inclined to reuse passwords. Recently, considerable effort has gone into constructing targeted password guessing models that characterize users' password reuse behaviors. However, existing studies mainly focus on characterizing slight modifications by training only on similar password pairs (e.g., Shark0301 → shark03). This leads to overfitting and causes existing models to overlook users' large modification behaviors (e.g., Shark0301 → Bear03). To fill this gap, this paper introduces a new non-parametric method named k-nearest-neighbors targeted password guessing (KNN-TPG). KNN-TPG builds a datastore that retains the context vectors of all source passwords along with prefixes of the targeted passwords. When generating a new password, KNN-TPG retrieves the k nearest neighbor vectors from the datastore so that the generated passwords align better with realistic password distributions. By combining KNN-TPG with our proposed Transformer-based password model, we obtain a new targeted password guessing model, namely KNNGuess. At each step of generating a new password, KNNGuess predicts and utilizes three distinct distributions, aiming to comprehensively model users' password reuse behaviors.
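The retrieval step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy datastore of (context vector, next character) pairs, squared Euclidean distance, a softmax over negative distances, and a linear interpolation weight `lam`; none of these choices is stated in the abstract. The names `knn_distribution` and `interpolate` are hypothetical, and for simplicity the sketch mixes only two distributions (model and retrieved), whereas KNNGuess is said to use three.

```python
import math
from collections import defaultdict


def knn_distribution(datastore, query, k, temperature=1.0):
    """Turn the k nearest datastore entries into a next-character distribution.

    datastore: list of (key_vector, next_char) pairs (assumed layout).
    query: context vector for the current generation step.
    """
    # Squared L2 distance from the query to every stored key (assumed metric).
    scored = sorted(
        ((sum((a - b) ** 2 for a, b in zip(key, query)), ch) for key, ch in datastore),
        key=lambda pair: pair[0],
    )[:k]
    if not scored:
        return {}
    # Softmax over negative distances; subtract the minimum for numerical stability.
    min_dist = scored[0][0]
    weights = [(math.exp(-(d - min_dist) / temperature), ch) for d, ch in scored]
    total = sum(w for w, _ in weights)
    dist = defaultdict(float)
    for w, ch in weights:
        dist[ch] += w / total  # neighbors voting for the same character add up
    return dict(dist)


def interpolate(p_model, p_knn, lam):
    """Mix the parametric model's distribution with the retrieved one."""
    chars = set(p_model) | set(p_knn)
    return {c: lam * p_knn.get(c, 0.0) + (1.0 - lam) * p_model.get(c, 0.0) for c in chars}


# Toy example: two stored contexts near the query both continue with "3".
toy_datastore = [([0.0, 0.0], "3"), ([0.1, 0.0], "3"), ([5.0, 5.0], "a")]
p_knn = knn_distribution(toy_datastore, query=[0.0, 0.0], k=2)
p_model = {"3": 0.5, "a": 0.5}  # made-up model output for illustration
p_final = interpolate(p_model, p_knn, lam=0.5)
```

In this toy run, both retrieved neighbors vote for the character "3", so the mixed distribution shifts probability mass toward it relative to the model's own prediction. A real system would replace the linear scan with an approximate nearest-neighbor index.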

We demonstrate the effectiveness of our KNNGuess model and the KNN-TPG method through extensive experiments on 12 large-scale real-world password datasets containing 4.8 billion passwords. More specifically, when the victim's password at site A (namely $pw_A$) is compromised, the success rate of KNNGuess in guessing her password at site B (namely $pw_B$, with $pw_B \neq pw_A$) within 100 guesses is 25.40% (for common users) and 10.26% (for security-savvy users), which is 8.52%-119.0% (avg. 55.33%) higher than its foremost counterparts. Compared with the state-of-the-art password models (i.e., Pass2Edit and PointerGuess), the improvement is 8.52%-27.66% (avg. 18.09%). Our results highlight that the threat of password tweaking attacks is higher than users expect.
