vSim: Semantics-Aware Value Extraction for Efficient Binary Code Similarity Analysis

Huaijin Wang (The Ohio State University), Zhiqiang Lin (The Ohio State University)

Binary Code Similarity Analysis (BCSA) plays a vital role in many security tasks, including malware analysis, vulnerability detection, and software supply chain security. While numerous BCSA techniques have been proposed over the past decade, few leverage the semantics of register and memory values for comparison, despite promising initial results. Existing value-based approaches often focus narrowly on values that remain invariant across compilation settings, thereby overlooking a broader spectrum of semantically rich information. In this paper, we identify three core challenges limiting the effectiveness of value-based BCSA: unscalable value extraction, lack of noise filtering, and inefficient value comparison. These shortcomings hinder both semantic coverage and scalability. To unlock the full potential of value-based BCSA, we propose vSim, a novel framework that systematically captures values from all register and memory operations, filters out semantically irrelevant values (e.g., global addresses), and normalizes and propagates the remaining values to enable robust and scalable similarity analysis. Extensive evaluation shows that vSim consistently outperforms state-of-the-art BCSA systems in accuracy, robustness, and scalability. It generalizes well across architectures and toolchains, producing reliable results on diverse datasets.
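
As a rough illustration of the filter, normalize, and compare idea described in the abstract, the following minimal Python sketch scores two functions by the overlap of their observed values. It is not vSim's implementation: the address range `GLOBAL_RANGE`, the 32-bit truncation used as normalization, the Jaccard score, and all function names are assumptions chosen only to make the idea concrete, and value propagation is not modeled.

```python
from typing import Iterable, Set

# Hypothetical address window treated as "global data" (an assumption for
# illustration; a real tool would take this from the binary's section layout).
GLOBAL_RANGE = range(0x400000, 0x800000)


def filter_values(values: Iterable[int]) -> Set[int]:
    """Drop values that look like global addresses (semantically irrelevant noise)."""
    return {v for v in values if v not in GLOBAL_RANGE}


def normalize(values: Iterable[int], bits: int = 32) -> Set[int]:
    """Truncate values to a common width so that, for example, -1 stored in a
    32-bit and in a 64-bit register compares equal (assumed normalization)."""
    mask = (1 << bits) - 1
    return {v & mask for v in values}


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Set overlap in [0, 1]; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def value_similarity(vals_a: Iterable[int], vals_b: Iterable[int]) -> float:
    """Filter, normalize, then compare the value sets of two functions."""
    a = normalize(filter_values(vals_a))
    b = normalize(filter_values(vals_b))
    return jaccard(a, b)


# Two hypothetical builds of the same function: the global addresses differ and
# -1 is held at different widths, but the remaining values agree.
score = value_similarity(
    [0x401000, 7, 0xFFFFFFFF],
    [0x601234, 7, 0xFFFFFFFFFFFFFFFF],
)
```

A set-based score is used here only because it is simple to reason about; the abstract does not state which comparison vSim uses.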

