Yunkai Zou (Nankai University), Ding Wang (Nankai University), Fei Duan (Nankai University)
While traditional whole-password guessing attacks have been extensively studied, few studies have explored mask password guessing, where an attacker has obtained partial information about the victim's password (e.g., its length and/or some characters) by exploiting side-channel attacks (e.g., shoulder surfing, smudge attacks, and keystroke audio feedback).
To evaluate the threats posed by mask attackers with varied capabilities, we investigate four major mask guessing scenarios, each based on a different kind of information exploited by the attacker (e.g., the length of the victim's password and some characters). For the first time, we systematically and comprehensively characterize the impact of mask guessing that incorporates side-channel priors, personally identifiable information (PII), and previously leaked (sister) passwords, by proposing two password models: the neural network-based PassSeq and the probability statistics-based Kneser-Ney. Using maximum likelihood estimation, we propose a new guess number estimation method that accurately and efficiently estimates the number of guesses required to recover the target password under a given password model. Extensive experiments on 15 large-scale datasets demonstrate the effectiveness of PassSeq and Kneser-Ney. In particular, within ten guesses: (1) when a trawling attacker knows the character composition (without order) of the victim's 4-digit PIN, the success rate increases by 152% (from 14% to 35%); (2) when a PII-based targeted attacker knows the length of the victim's password, the success rate increases by 47%-82%; and (3) when this targeted attacker further knows one character of the victim's password (besides the length), the success rate generally doubles, reaching 7%-29% (33%-73% for a targeted attacker who can exploit the victim's sister password).
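To make the mask setting concrete, the following is a minimal sketch of how partial information narrows a ranked guess list. It is not PassSeq or Kneser-Ney: the ranked list is a made-up toy example standing in for the output of some password model, and the names `matches_mask`, `matches_composition`, and `mask_guesses`, as well as the `*` wildcard convention, are assumptions introduced purely for illustration.

```python
from collections import Counter

def matches_mask(candidate, mask, wildcard="*"):
    """True if the candidate has the mask's length and agrees on every known position.

    Assumption: `wildcard` marks an unknown position, so this toy mask format
    cannot express a known literal '*' character.
    """
    if len(candidate) != len(mask):
        return False
    return all(m == wildcard or m == c for c, m in zip(candidate, mask))

def matches_composition(candidate, known_chars):
    """True if the candidate uses exactly the known characters, in any order."""
    return Counter(candidate) == Counter(known_chars)

def mask_guesses(ranked_candidates, predicate, limit=10):
    """Keep the first `limit` candidates (in ranked order) that satisfy the predicate."""
    guesses = []
    for candidate in ranked_candidates:
        if predicate(candidate):
            guesses.append(candidate)
            if len(guesses) == limit:
                break
    return guesses

# Hypothetical ranked output of a password model, most likely first.
ranked = ["1234", "1111", "0000", "1212", "4321", "2580", "1324",
          "password1", "p@ssw0rd"]

# Scenario: the attacker knows the PIN consists of the digits 1, 2, 3, 4 (order unknown).
pin_guesses = mask_guesses(ranked, lambda c: matches_composition(c, "1234"))

# Scenario: the attacker knows the length (9) plus the first and last characters.
pw_guesses = mask_guesses(ranked, lambda c: matches_mask(c, "p*******1"))
```

Filtering a fixed list is the simplest possible baseline; a model that conditions its generation on the mask can rank the surviving candidates differently rather than merely discarding the inconsistent ones.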
To further validate the practicality of our mask guessing models, we collect real-world keystroke audio data from 11 popular keyboards (e.g., Apple, Dell, Lenovo) and replicate attacks where partial password information is inferred via acoustic side channels. Experiments show that our PassSeq significantly boosts the success rates of existing keystroke inference attacks, achieving an additional 5.6%-166.7% improvement within 10 guesses. This work highlights that mask password guessing is a damaging threat that deserves more attention.