Michael Kadoshnikov, Clemente Izurieta, Matthew Revelle (Montana State University)
Program graphs have become essential for vulnerability detection in program binaries, particularly for approaches based on machine learning. However, many researchers focus on comparing the performance of their technique against others and often neglect the rationale for the graph structure they chose. This paper compares the performance of various program graphs, including abstract syntax trees (ASTs), control flow graphs (CFGs), data dependence graphs (DDGs), and their combinations. Each graph variation is evaluated by measuring the classification performance of representation-specific graph neural networks in detecting vulnerabilities at the program level in compiled programs from the NIST SARD Juliet dataset. By evaluating each combination’s strengths and weaknesses, we identify the most effective graph structure for binary vulnerability detection. Performance across all variations is compared through a statistical analysis of the experimental results.
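To make the space of graph variations concrete, the following minimal Python sketch enumerates every non-empty combination of the three base representations named above. It assumes that all such combinations are candidates, which the abstract does not state explicitly; the names `BASE_GRAPHS` and `graph_variations` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Base representations named in the abstract.
BASE_GRAPHS = ("AST", "CFG", "DDG")


def graph_variations(base=BASE_GRAPHS):
    """Return every non-empty combination of the base graph types.

    Assumption: each combination is treated as one candidate graph
    structure. The actual set evaluated in the paper may differ.
    """
    return [
        combo
        for size in range(1, len(base) + 1)
        for combo in combinations(base, size)
    ]


variations = graph_variations()
for combo in variations:
    # Label a combined graph by joining its component names.
    print("+".join(combo))
```

Under this assumption, three base graphs yield seven candidate structures: three single graphs, three pairs, and the full combination.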