Aravind Machiry (UC Santa Barbara), Nilo Redini (UC Santa Barbara), Eric Gustafson (UC Santa Barbara), Hojjat Aghakhani (UC Santa Barbara), Christopher Kruegel (UC Santa Barbara), Giovanni Vigna (UC Santa Barbara)

Binary static analysis has seen a recent surge in interest, due to a rise in analysis targets for which no other method is appropriate, such as, embedded firmware. This has led to the proposal of a number of binary static analysis tools and techniques, handling various kinds of programs, and answering different research questions. While static analysis tools that focus on binaries inherit the undecidability of static analysis, they bring with them other challenges, particularly in dealing with the aliasing of code and data pointers. These tools may tackle these challenges in different ways, but unfortunately, there is currently no concrete means of comparing their effectiveness at solving these central, problem-independent aspects of static analysis.

In this paper, we propose a new method for creating a dataset of real-world programs, paired with the ground truth for static analysis. Our approach involves the injection of synthetic “facts” into a set of open-source programs, consisting of new variables and their possible values. The analyses’ goal is then to evaluate the possible values of these facts at certain program points. As the facts are injected randomly within an arbitrarily-large set of programs, the kinds of data flows that can be measured are widely-varied in size and complexity. We implemented this idea as a prototype system, AUTOFACTS, and used it to create a ground truth dataset of 29 programs, with various types and number of facts, resulting in a total of 2,088 binaries (with 72 versions for each program). To our knowledge, this is the first dataset aimed at the problem-independent evaluation of static analysis tools, and we contribute all code and the dataset itself to the community as open-source.

View More Papers

Investigating Graph Embedding Neural Networks with Unsupervised Features Extraction...

Luca Massarelli (Sapienza University of Rome), Giuseppe A. Di Luna (CINI - National Laboratory of Cybersecurity), Fabio Petroni (Independent Researcher), Leonardo Querzoni (Sapienza University of Rome), Roberto Baldoni (Italian Presidency of Ministry Council)

Read More

FirmDiff: Improving the Configuration of Linux Kernels Geared Towards...

Ioannis Angelakopoulos (Boston University), Gianluca Stringhini (Boston University), Manuel Egele (Boston University)

Read More

TBD

Ryo Ichikawa, Captain of CTF Team TokyoWesterns

Read More

PyPANDA: Taming the PANDAmonium of Whole System Dynamic Analysis

Luke Craig, Tim Leek (MIT Lincoln Laboratory), Andrew Fasano, Tiemoko Ballo (MIT Lincoln Laboratory, Northeastern University), Brendan Dolan-Gavitt (New York University), William Robertson (Northeastern University)

Read More