Daniel Huici, Ricardo J. Rodríguez (University of Zaragoza), Andrei Costin (University of Jyväskylä), Narges Yousefnezhad (Binare Oy)
Tracking N-day vulnerabilities across fragmented firmware ecosystems remains an open challenge, largely because abstract CVE descriptions are disconnected from the binary code actually shipped on production, connected devices. In this paper, we present a generic CVE-based framework that uses similarity digests to correlate vulnerable files across heterogeneous firmware images. Our approach leverages APOTHEOSIS, an open-source approximate nearest neighbor search system, to scale similarity queries across massive collections of artifacts. To bridge the semantic gap between vulnerability reports and binary reality, we introduce an automated process that lifts confirmed vulnerable implementations to high-level intermediate representations and generates function-level search signatures. Using the OPENWRT ecosystem as a case study, we demonstrate the effectiveness of this system as a rapid triage tool. When a new CVE is disclosed, analysts can query the pre-built APOTHEOSIS index to immediately obtain a prioritized list of affected firmware versions, significantly accelerating impact assessment without depending on reliable or accurate vendor/CVE metadata or on source code.
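As a rough illustration of the triage idea summarized above, the following minimal sketch ranks firmware files by their similarity to a known-vulnerable signature. It rests on several assumptions and is not the framework's implementation: `toy_digest` is a hypothetical stand-in for a real similarity digest (it simply collects byte n-grams), `similarity` is plain Jaccard overlap, the search is a brute-force scan rather than an approximate nearest neighbor index such as APOTHEOSIS, and the corpus, file paths, and threshold are invented for the example. It also compares raw bytes rather than a lifted intermediate representation.

```python
from typing import Dict, FrozenSet, List, Tuple


def toy_digest(data: bytes, n: int = 4) -> FrozenSet[bytes]:
    """Hypothetical stand-in for a similarity digest: the set of byte n-grams."""
    return frozenset(data[i:i + n] for i in range(len(data) - n + 1))


def similarity(a: FrozenSet[bytes], b: FrozenSet[bytes]) -> float:
    """Jaccard similarity between two toy digests, in the range [0, 1]."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_firmware(
    signature: bytes,
    corpus: Dict[str, Dict[str, bytes]],
    threshold: float = 0.5,
) -> List[Tuple[str, str, float]]:
    """Return (version, path, score) hits at or above the threshold, best first.

    This is a brute-force scan over every file; a real deployment would
    query a pre-built index instead of recomputing digests per query.
    """
    sig = toy_digest(signature)
    hits = []
    for version, files in corpus.items():
        for path, blob in files.items():
            score = similarity(sig, toy_digest(blob))
            if score >= threshold:
                hits.append((version, path, score))
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits


# Invented example data: one vulnerable snippet and three firmware versions.
vulnerable = b"memcpy(buf, src, len); return parse(buf);"
corpus = {
    "fw-1.0": {"lib/a.so": b"memcpy(buf, src, len); return parse(buf);"},
    "fw-1.1": {"lib/a.so": b"memcpy(buf, src, len); return parse2(buf);"},
    "fw-2.0": {"lib/a.so": b"if (len > sizeof buf) return -1; copy_checked(buf, src, len);"},
}

hits = rank_firmware(vulnerable, corpus)
for version, path, score in hits:
    print(f"{version}\t{path}\t{score:.2f}")
```

In this toy setup the exact copy ranks first, the lightly modified variant follows, and the restructured file falls below the threshold, which mirrors the prioritized list an analyst would review during triage.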