André Pacteau, Antonino Vitale, Davide Balzarotti, Simone Aonzo (EURECOM)

Cryptographic function detection in binaries is a crucial task in software reverse engineering (SRE), with significant implications for secure communications, regulatory compliance, and malware analysis. While traditional approaches based on cryptographic signatures are common, they are challenging to maintain and often prone to false negatives in the case of custom implementations or false positives when short signatures are used. Alternatively, techniques based on statistical analysis of mnemonics in disassembled code have emerged, positing that cryptographic functions tend to involve a high frequency of arithmetic and logic operations. However, these methods have predominantly been formulated as heuristics, with thresholds that may not always be optimal or universally applicable.

In this paper, we present Mnemocrypt, a machine learningbased tool for detecting cryptographic functions in x86 executables, which we release as an IDA Pro plugin. Using a random forest classifier, Mnemocrypt leverages both structural and content-related metrics of functions at varying levels of granularity to make its predictions. The primary design goal of Mnemocrypt is to minimize false positives, as misleading results could lead analysts down incorrect investigative paths, undermining the efficacy of reverse engineering efforts. Trained on a diverse dataset of cryptographic libraries compiled with different optimization levels, Mnemocrypt achieves robust detection capabilities without relying on predefined signatures or computationally expensive data flow graph analysis, ensuring high efficiency.

Our evaluation, conducted on 231 Portable Executable x86 Windows malware samples from different families, demonstrates that Mnemocrypt, when configured with a high confidence threshold, significantly outperforms existing solutions in terms of false positives. The few false positives detected by Mnemocrypt were only related to compression functions or complex data processing routines, further emphasizing the tool’s precision in distinguishing algorithms that use instructions similar to cryptographic processes. Finally, with a median execution time of six seconds, Mnemocrypt provides the reverse engineering community with a practical and efficient solution for identifying cryptographic functions, paving the way for further studies to improve this type of model.

View More Papers

Exploring User Perceptions of Security Auditing in the Web3...

Molly Zhuangtong Huang (University of Macau), Rui Jiang (University of Macau), Tanusree Sharma (Pennsylvania State University), Kanye Ye Wang (University of Macau)

Read More

Evaluating LLMs Towards Automated Assessment of Privacy Policy Understandability

Keika Mori (Deloitte Tohmatsu Cyber LLC, Waseda University), Daiki Ito (Deloitte Tohmatsu Cyber LLC), Takumi Fukunaga (Deloitte Tohmatsu Cyber LLC), Takuya Watanabe (Deloitte Tohmatsu Cyber LLC), Yuta Takata (Deloitte Tohmatsu Cyber LLC), Masaki Kamizono (Deloitte Tohmatsu Cyber LLC), Tatsuya Mori (Waseda University, NICT, RIKEN AIP)

Read More

Securing BGP ASAP: ASPA and other Post-ROV Defenses

Justin Furuness (University of Connecticut), Cameron Morris (University of Connecticut), Reynaldo Morillo (University of Connecticut), Arvind Kasiliya (University of Connecticut), Bing Wang (University of Connecticut), Amir Herzberg (University of Connecticut)

Read More