Linxi Jiang (The Ohio State University), Xin Jin (The Ohio State University), Zhiqiang Lin (The Ohio State University)

Function name inference in stripped binaries is an important yet challenging task for many security applications, such as malware analysis and vulnerability discovery, due to the need to grasp binary code semantics amidst diverse instruction sets, architectures, compiler optimizations, and obfuscations. While machine learning has made significant progress in this field, existing methods often struggle with unseen data, constrained by their reliance on a limited vocabulary-based classification approach. In this paper, we present SymGen, a novel framework employing an autoregressive generation paradigm powered by domain-adapted generative large language models (LLMs) for enhanced binary code interpretation. We have evaluated SymGen on a dataset comprising 2,237,915 binary functions across four architectures (x86-64, x86-32, ARM, MIPS) with four levels of optimizations (O0-O3) where it surpasses the state-of-the-art with up to 409.3%, 553.5%, and 489.4% advancement in precision, recall, and F1 score, respectively, showing superior effectiveness and generalizability. Our ablation and case studies also demonstrate the significant performance boosts achieved by our design, e.g., the domain adaptation approach, alongside showcasing SymGen’s practicality in analyzing real-world binaries, e.g., obfuscated binaries and malware executables.

View More Papers

ABElity: Attribute Based Encryption for Securing RIC Communication in...

K Sowjanya (Indian Institute of Technology Delhi), Rahul Saini (Eindhoven University of Technology), Dhiman Saha (Indian Institute of Technology Bhilai), Kishor Joshi (Eindhoven University of Technology), Madhurima Das (Indian Institute of Technology Delhi)

Read More

TZ-DATASHIELD: Automated Data Protection for Embedded Systems via Data-Flow-Based...

Zelun Kong (University of Texas at Dallas), Minkyung Park (University of Texas at Dallas), Le Guan (University of Georgia), Ning Zhang (Washington University in St. Louis), Chung Hwan Kim (University of Texas at Dallas)

Read More

Ring of Gyges: Accountable Anonymous Broadcast via Secret-Shared Shuffle

Wentao Dong (City University of Hong Kong), Peipei Jiang (Wuhan University; City University of Hong Kong), Huayi Duan (ETH Zurich), Cong Wang (City University of Hong Kong), Lingchen Zhao (Wuhan University), Qian Wang (Wuhan University)

Read More

Decoupling Permission Management from Cryptography for Privacy-Preserving Systems

Ruben De Smet (Department of Engineering Technology (INDI), Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel), Tom Godden (Department of Engineering Technology (INDI), Vrije Universiteit Brussel), Kris Steenhaut (Department of Engineering Technology (INDI), Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel), An Braeken (Department of Engineering Technology (INDI), Vrije Universiteit Brussel)

Read More