Idioms: A Simple and Effective Framework for Turbo-Charging Local Neural Decompilation with Well-Defined Types

Luke Dramko (Carnegie Mellon University), Claire Le Goues (Carnegie Mellon University), Edward J. Schwartz (Carnegie Mellon University)

Decompilers help reverse engineers analyze software at a higher level of abstraction than assembly code. Unfortunately, because compilation is lossy, traditional decompilers, which are deterministic, produce code that lacks many characteristics that make source code readable in the first place, such as variable and type names. Neural decompilers offer the exciting possibility of statistically filling in these details. Unfortunately, existing work in neural decompilation suffers from substantial limitations that preclude its use on real code, such as the inability to provide definitions for user-defined composite types. In this work, we introduce Idioms, a simple, generalizable, and effective neural decompilation approach that can finetune any LLM into a neural decompiler capable of generating the appropriate user-defined type definitions alongside the decompiled code, and a new dataset, Realtype, that includes substantially more complicated and realistic types than existing neural decompilation benchmarks. We show that our approach yields state-of-the-art results in neural decompilation. On the most challenging existing benchmark---Exebench---our model achieves 54.4% accuracy vs. 46.3% for LLM4Decompile and 37.5% for Nova; on Realtype, our model performs at least 95% better.

Paper

View More Papers

FirmAgent: Leveraging Fuzzing to Assist LLM Agents with IoT...

Jiangan Ji (Information Engineering University,Tsinghua University), Chao Zhang (Tsinghua University), Shuitao Gan (Labortory for Advanced Computing and Intelligence Engineering), Lin Jian (Information Engineering University), Hangtian Liu (Information Engineering University), Tieming Liu (Information Engineering University), Lei Zheng (Tsinghua university), Zhipeng Jia (Information Engineering University)

Cascading and Proxy Membership Inference Attacks

Yuntao Du (Purdue University), Jiacheng Li (Purdue University), Yuetian Chen (Purdue University), Kaiyuan Zhang (Purdue University), Zhizhen Yuan (Purdue University), Hanshen Xiao (Purdue University), Bruno Ribeiro (Purdue University), Ninghui Li (Purdue University)

Actively Understanding the Dynamics and Risks of the Threat...

Tillson Galloway (Georgia Institute of Technology), Omar Alrawi (Georgia Institute of Technology), Allen Chang (Georgia Institute of Technology), Athanasios Avgetidis (Georgia Institute of Technology), Manos Antonakakis (Georgia Institute of Technology), Fabian Monrose (Georgia Institute of Technology)