He Shuang (University of Toronto), Lianying Zhao (Carleton University and University of Toronto), David Lie (University of Toronto)

Web tracking harms user privacy. As a result, the use of tracker detection and blocking tools is a common practice among Internet users. However, no such tool can be perfect, and thus there is a trade-off between avoiding breakage (caused by unintentionally blocking some required functionality) and neglecting to block some trackers. State-of-the-art tools usually rely on user reports and developer effort to detect breakages, which can be broadly categorized into two causes: 1) misidentifying non-trackers as trackers, and 2) blocking mixed trackers which blend tracking with functional components.

We propose incorporating a machine learning-based break- age detector into the tracker detection pipeline to automatically avoid misidentification of functional resources. For both tracker detection and breakage detection, we propose using differential features that can more clearly elucidate the differences caused by blocking a request. We designed and implemented a prototype of our proposed approach, Duumviri, for non-mixed trackers. We then adopt it to automatically identify mixed trackers, drawing differential features at partial-request granularity.

In the case of non-mixed trackers, evaluating Duumviri on 15K pages shows its ability to replicate the labels of human-generated filter lists, EasyPrivacy, with an accuracy of 97.44%. Through a manual analysis, we find that Duumviri can identify previously unreported trackers and its breakage detector can identify overly strict EasyPrivacy rules that cause breakage. In the case of mixed trackers, Duumviri is the first automated mixed tracker detector, and achieves a lower bound accuracy of 74.19%. Duumviri has enabled us to detect and confirm 22 previously unreported unique trackers and 26 unique mixed trackers.

View More Papers

Vision: Retiring Scenarios — Enabling Ecologically Valid Measurement in...

Oliver D. Reithmaier (Leibniz University Hannover), Thorsten Thiel (Atmina Solutions), Anne Vonderheide (Leibniz University Hannover), Markus Dürmuth (Leibniz University Hannover)

Read More

Decoupling Permission Management from Cryptography for Privacy-Preserving Systems

Ruben De Smet (Department of Engineering Technology (INDI), Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel), Tom Godden (Department of Engineering Technology (INDI), Vrije Universiteit Brussel), Kris Steenhaut (Department of Engineering Technology (INDI), Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel), An Braeken (Department of Engineering Technology (INDI), Vrije Universiteit Brussel)

Read More

An Empirical Study on Fingerprint API Misuse with Lifecycle...

Xin Zhang (Fudan University), Xiaohan Zhang (Fudan University), Zhichen Liu (Fudan University), Bo Zhao (Fudan University), Zhemin Yang (Fudan University), Min Yang (Fudan University)

Read More

DiStefano: Decentralized Infrastructure for Sharing Trusted Encrypted Facts and...

Sofia Celi (Brave Software), Alex Davidson (NOVA LINCS & Universidade NOVA de Lisboa), Hamed Haddadi (Imperial College London & Brave Software), Gonçalo Pestana (Hashmatter), Joe Rowell (Information Security Group, Royal Holloway, University of London)

Read More