Michael Pucher (University of Vienna), Christian Kudera (SBA Research), Georg Merzdovnik (SBA Research)

The complexity and functionality of malware is ever-increasing. Obfuscation is used to hide the malicious intent from virus scanners and increase the time it takes to reverse engineer the binary. One way to minimize this effort is function clone detection. Detecting whether a function is already known, or similar to an existing function, can reduce analysis effort. Outside of malware, the same function clone detection mechanism can be used to find vulnerable versions of functions in binaries, making it a powerful technique.

This work introduces a slim approach for the identification of obfuscated function clones, called OFCI, building on recent advances in machine learning based function clone detection. To tackle the issue of obfuscation, OFCI analyzes the effect of known function calls on function similarity. Furthermore, we investigate function similarity classification on code obfuscated through virtualization by applying function clone detection on execution traces. While not working adequately, it nevertheless provides insight into potential future directions.

Using the ALBERT transformer OFCI can achieve an 83% model size reduction in comparison to state-of-the-art approaches, while only causing an average 7% decrease in the ROC-AUC scores of function pair similarity classification. However, the reduction in model size comes at the cost of precision for function clone search. We discuss the reasons for this as well as other pitfalls of building function similarity detection tooling.

View More Papers

DRAWN APART: A Device Identification Technique based on Remote...

Tomer Laor (Ben-Gurion Univ. of the Negev), Naif Mehanna (Univ. Lille, CNRS, Inria), Antonin Durey (Univ. Lille, CNRS, Inria), Vitaly Dyadyuk (Ben-Gurion Univ. of the Negev), Pierre Laperdrix (Univ. Lille, CNRS, Inria), Clémentine Maurice (Univ. Lille, CNRS, Inria), Yossi Oren (Ben-Gurion Univ. of the Negev), Romain Rouvoy (Univ. Lille, CNRS, Inria / IUF), Walter Rudametkin…

Read More

HeadStart: Efficiently Verifiable and Low-Latency Participatory Randomness Generation at...

Hsun Lee (National Taiwan University), Yuming Hsu (National Taiwan University), Jing-Jie Wang (National Taiwan University), Hao Cheng Yang (National Taiwan University), Yu-Heng Chen (National Taiwan University), Yih-Chun Hu (University of Illinois at Urbana-Champaign), Hsu-Chun Hsiao (National Taiwan University)

Read More

Cross-Language Attacks

Samuel Mergendahl (MIT Lincoln Laboratory), Nathan Burow (MIT Lincoln Laboratory), Hamed Okhravi (MIT Lincoln Laboratory)

Read More

DRAGON: Predicting Decompiled Variable Data Types with Learned Confidence...

Caleb Stewart, Rhonda Gaede, Jeffrey Kulick (University of Alabama in Huntsville)

Read More