Linxi Jiang (The Ohio State University), Xin Jin (The Ohio State University), Zhiqiang Lin (The Ohio State University)

Function name inference in stripped binaries is an important yet challenging task for many security applications, such as malware analysis and vulnerability discovery, due to the need to grasp binary code semantics amidst diverse instruction sets, architectures, compiler optimizations, and obfuscations. While machine learning has made significant progress in this field, existing methods often struggle with unseen data, constrained by their reliance on a limited vocabulary-based classification approach. In this paper, we present SymGen, a novel framework employing an autoregressive generation paradigm powered by domain-adapted generative large language models (LLMs) for enhanced binary code interpretation. We have evaluated SymGen on a dataset comprising 2,237,915 binary functions across four architectures (x86-64, x86-32, ARM, MIPS) with four levels of optimizations (O0-O3) where it surpasses the state-of-the-art with up to 409.3%, 553.5%, and 489.4% advancement in precision, recall, and F1 score, respectively, showing superior effectiveness and generalizability. Our ablation and case studies also demonstrate the significant performance boosts achieved by our design, e.g., the domain adaptation approach, alongside showcasing SymGen’s practicality in analyzing real-world binaries, e.g., obfuscated binaries and malware executables.

View More Papers

Target-Centric Firmware Rehosting with Penguin

Andrew Fasano, Zachary Estrada, Luke Craig, Ben Levy, Jordan McLeod, Jacques Becker, Elysia Witham, Cole DiLorenzo, Caden Kline, Ali Bobi (MIT Lincoln Laboratory), Dinko Dermendzhiev (Georgia Institute of Technology), Tim Leek (MIT Lincoln Laboratory), William Robertson (Northeastern University)

Read More

SongBsAb: A Dual Prevention Approach against Singing Voice Conversion...

Guangke Chen (Pengcheng Laboratory), Yedi Zhang (National University of Singapore), Fu Song (Key Laboratory of System Software (Chinese Academy of Sciences) and State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Science; Nanjing Institute of Software Technology), Ting Wang (Stony Brook University), Xiaoning Du (Monash University), Yang Liu (Nanyang Technological University)

Read More

Poster: FORESIGHT, A Unified Framework for Threat Modeling and...

ChaeYoung Kim (Seoul Women's University), Kyounggon Kim (Naif Arab University for Security Sciences)

Read More

Spatial-Domain Wireless Jamming with Reconfigurable Intelligent Surfaces

Philipp Mackensen (Ruhr University Bochum), Paul Staat (Max Planck Institute for Security and Privacy), Stefan Roth (Ruhr University Bochum), Aydin Sezgin (Ruhr University Bochum), Christof Paar (Max Planck Institute for Security and Privacy), Veelasha Moonsamy (Ruhr University Bochum)

Read More