Linxi Jiang (The Ohio State University), Xin Jin (The Ohio State University), Zhiqiang Lin (The Ohio State University)

Function name inference in stripped binaries is an important yet challenging task for many security applications, such as malware analysis and vulnerability discovery, due to the need to grasp binary code semantics amidst diverse instruction sets, architectures, compiler optimizations, and obfuscations. While machine learning has made significant progress in this field, existing methods often struggle with unseen data, constrained by their reliance on a limited vocabulary-based classification approach. In this paper, we present SymGen, a novel framework employing an autoregressive generation paradigm powered by domain-adapted generative large language models (LLMs) for enhanced binary code interpretation. We have evaluated SymGen on a dataset comprising 2,237,915 binary functions across four architectures (x86-64, x86-32, ARM, MIPS) with four levels of optimizations (O0-O3) where it surpasses the state-of-the-art with up to 409.3%, 553.5%, and 489.4% advancement in precision, recall, and F1 score, respectively, showing superior effectiveness and generalizability. Our ablation and case studies also demonstrate the significant performance boosts achieved by our design, e.g., the domain adaptation approach, alongside showcasing SymGen’s practicality in analyzing real-world binaries, e.g., obfuscated binaries and malware executables.

View More Papers

What’s Done Is Not What’s Claimed: Detecting and Interpreting...

Chang Yue (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China), Kai Chen (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China), Zhixiu Guo (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China), Jun Dai, Xiaoyan Sun (Department of Computer Science, Worcester Polytechnic Institute), Yi Yang (Institute of Information Engineering, Chinese Academy…

Read More

Cross-Origin Web Attacks via HTTP/2 Server Push and Signed...

Pinji Chen (Tsinghua University), Jianjun Chen (Tsinghua University & Zhongguancun Laboratory), Mingming Zhang (Zhongguancun Laboratory), Qi Wang (Tsinghua University), Yiming Zhang (Tsinghua University), Mingwei Xu (Tsinghua University), Haixin Duan (Tsinghua University)

Read More

Work-in-Progress: Detecting Browser-in-the-Browser Attacks from Their Behaviors and DOM...

Ryusei Ishikawa, Soramichi Akiyama, and Tetsutaro Uehara (Ritsumeikan University)

Read More

Security Advice on Content Filtering and Circumvention for Parents...

Ran Elgedawy (The University of Tennessee, Knoxville), John Sadik (The University of Tennessee, Knoxville), Anuj Gautam (The University of Tennessee, Knoxville), Trinity Bissahoyo (The University of Tennessee, Knoxville), Christopher Childress (The University of Tennessee, Knoxville), Jacob Leonard (The University of Tennessee, Knoxville), Clay Shubert (The University of Tennessee, Knoxville), Scott Ruoti (The University of Tennessee,…

Read More