Huaijin Wang (The Ohio State University), Zhiqiang Lin (The Ohio State University)

Binary Code Similarity Analysis (BCSA) plays a vital role in many security tasks, including malware analysis, vulnerability detection, and software supply chain security. While numerous BCSA techniques have been proposed over the past decade, few leverage the semantics of register and memory values for comparison, despite promising initial results. Existing value-based approaches often focus narrowly on values that remain invariant across compilation settings, thereby overlooking a broader spectrum of semantically rich information. In this paper, we identify three core challenges limiting the effectiveness of value-based BCSA: unscalable value extraction, lack of noise filtering, and inefficient value comparison. These shortcomings hinder both semantic coverage and scalability. To unlock the full potential of value-based BCSA, we propose vSim, a novel framework that systematically captures values from all register and memory operations, filters out semantically irrelevant values (e.g., global addresses), and normalizes and propagates the remaining values to enable robust and scalable similarity analysis. Extensive evaluation shows that vSim consistently outperforms state-of-the-art BCSA systems in accuracy, robustness, and scalability. It generalizes well across architectures and toolchains, producing reliable results on diverse datasets.
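
The filter, normalize, and compare steps named in the abstract can be illustrated with a toy sketch. This is not vSim's algorithm: the abstract does not specify how filtering, normalization, or comparison are implemented, and value propagation is omitted entirely. The address range `GLOBAL_RANGE`, the 32-bit truncation, the Jaccard metric, and all function names are assumptions made purely for illustration.

```python
from typing import Iterable, Set

# Hypothetical address range treated as "global addresses" in this sketch.
# A real system would derive this from the binary's section layout.
GLOBAL_RANGE = range(0x400000, 0x800000)


def filter_values(values: Iterable[int]) -> Set[int]:
    """Drop values that fall in the assumed global-address range (noise)."""
    return {v for v in values if v not in GLOBAL_RANGE}


def normalize(values: Iterable[int], bits: int = 32) -> Set[int]:
    """Truncate values to a fixed width (an assumed normalization) so that
    the same constant observed at different register widths compares equal."""
    mask = (1 << bits) - 1
    return {v & mask for v in values}


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity of two value sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def value_similarity(f1: Iterable[int], f2: Iterable[int]) -> float:
    """Compare two functions by the values they were observed to produce."""
    return jaccard(normalize(filter_values(f1)), normalize(filter_values(f2)))


# Values observed for the "same" function built for two hypothetical targets.
func_a = [1, 2, 0x401000, 0xFFFFFFFFFFFFFFFF]
func_b = [1, 2, 0x602000, 0xFFFFFFFF, 3]
score = value_similarity(func_a, func_b)
```

In this example the two address-like values are discarded, the 64-bit and 32-bit all-ones constants collapse to the same normalized value, and the remaining sets share three of four distinct values.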
