Linxi Jiang (The Ohio State University), Xin Jin (The Ohio State University), Zhiqiang Lin (The Ohio State University)

Function name inference in stripped binaries is important for many security applications, such as malware analysis and vulnerability discovery, yet it is challenging because it requires grasping binary code semantics across diverse instruction sets, architectures, compiler optimizations, and obfuscations. While machine learning has made significant progress in this field, existing methods often struggle with unseen data because they rely on classification over a limited vocabulary. In this paper, we present SymGen, a novel framework that employs an autoregressive generation paradigm powered by domain-adapted generative large language models (LLMs) for enhanced binary code interpretation. We evaluated SymGen on a dataset of 2,237,915 binary functions across four architectures (x86-64, x86-32, ARM, MIPS) and four optimization levels (O0-O3), where it surpasses the state of the art by up to 409.3%, 553.5%, and 489.4% in precision, recall, and F1 score, respectively, showing superior effectiveness and generalizability. Our ablation and case studies also demonstrate the significant performance gains from our design choices, e.g., the domain adaptation approach, and showcase SymGen’s practicality in analyzing real-world binaries, e.g., obfuscated binaries and malware executables.
