PhishLang: A Real-Time, Fully Client-Side Phishing Detection Framework Using MobileBERT

Sayak Saha Roy (The University of Texas at Arlington), Shirin Nilizadeh (The University of Texas at Arlington)

We present PhishLang, the first fully client-side anti-phishing framework implemented as a Chromium-based browser extension. PhishLang enables real-time, on-device detection of phishing websites by utilizing a lightweight language model (MobileBERT). Unlike traditional heuristic or static feature-based models that struggle with evasive threats, and deep learning approaches that are too resource-intensive for client-side use, PhishLang analyzes the contextual structure of a page’s source code, achieving detection performance on par with several state-of-the-art models while consuming up to 7 times less memory than comparable architectures. Over a 3.5-month period, we deployed the framework in real-time, successfully identifying approximately 26k phishing URLs, many of which were undetected by popular antiphishing blocklists, thus demonstrating PhishLang's potential to aid current detection measures. On the other hand, the browser extension outperformed several anti-phishing tools, detecting over 91% of the threats during zero-day. PhishLang also showed strong adversarial robustness, resisting 16 categories of realistic problem space evasions through a combination of parser-level defenses and adversarial retraining. To aid both end-users and the research community, we have open-sourced both the PhishLang framework and the browser extension.

Paper

View More Papers

An LLM-Driven Fuzzing Framework for Detecting Logic Instruction Bugs...

Jiaxing Cheng (Institute of Information Engineering, CAS; School of Cyber Security, UCAS), Ming Zhou (School of Cyber Science and Engineering, Nanjing University of Science and Technology), Haining Wang (Virginia Tech), Xin Chen (Institute of Institute of Information Engineering, CAS; School of Cyber Security, UCAS), Yuncheng Wang (Institute of Institute of Information Engineering, CAS; School of…

Lightening the Load: A Cluster-Based Framework for A Lower-Overhead,...

Khashayar Khajavi (Simon Fraser University), Tao Wang (Simon Fraser University)

NetCap: Data-Plane Capability-Based Defense Against Token Theft in Network...

Osama Bajaber (Virginia Tech), Bo Ji (Virginia Tech), Peng Gao (Virginia Tech)