Yu Liang (The Pennsylvania State University), Peng Liu (The Pennsylvania State University)
Syntax-based testing is a promising technique for finding bugs in Database Management Systems (DBMSs). All existing syntax-based SQL generation tools use a Top-down generation method: to construct a SQL query (a syntax tree), the generator explores the SQL grammar forward from the root node and stops when no further grammar rule can be applied to the leaves of the syntax tree. However, Top-down generation tends to spend more effort on the shallow grammar rules close to the root and neglects the feature-rich rules deeper in the grammar space. As a result, it is inefficient at finding DBMS bugs.
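To make the Top-down procedure concrete, the following is a minimal sketch over a toy grammar. The grammar, the function name `generate_top_down`, and the uniform random choice among alternatives are assumptions made for illustration; they are not taken from any existing tool.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of alternatives,
# and each alternative is a list of symbols. Symbols that are not keys of
# GRAMMAR are terminals.
GRAMMAR = {
    "stmt": [["select"]],
    "select": [["SELECT", "expr", "FROM", "table"]],
    "expr": [["column"], ["func"]],
    "func": [["window_func"], ["COUNT(*)"]],
    "window_func": [["RANK()", "OVER", "(", "ORDER", "BY", "column", ")"]],
    "column": [["c0"], ["c1"]],
    "table": [["t0"]],
}


def generate_top_down(symbol, rng, depth=0, trace=None):
    """Expand `symbol` from the root down until only terminals remain.

    If `trace` is a list, every applied nonterminal is recorded together
    with its depth below the root.
    """
    if symbol not in GRAMMAR:
        # Terminal symbol: no grammar rule applies, so expansion stops here.
        return [symbol]
    if trace is not None:
        trace.append((symbol, depth))
    # Assumption: alternatives are chosen uniformly at random.
    alternative = rng.choice(GRAMMAR[symbol])
    tokens = []
    for child in alternative:
        tokens.extend(generate_top_down(child, rng, depth + 1, trace))
    return tokens


if __name__ == "__main__":
    rng = random.Random(0)
    print(" ".join(generate_top_down("stmt", rng)))
```

In this toy grammar, the `window_func` rule is reached only when `expr` selects `func` and `func` then selects `window_func`, so under uniform choice only about one in four generated queries exercises it. This illustrates, on a very small scale, how rules far from the root can receive less attention.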
This paper proposes a new Bottom-up syntax-based SQL generation technique that devotes more testing resources to exploring feature-rich grammar rules. Grammar exploration begins with one interesting grammar rule that describes the syntax of a feature-rich SQL functionality. The generator then backtracks from this rule up to the root (Bottom-up) to create a syntax path that reaches the interesting rule. Multiple Bottom-up generated syntax paths are then expanded and merged to create diverse SQL queries for fuzzing. A prototype tool, SQLBull, implements the Bottom-up generation technique for fuzzing. In the evaluation, SQLBull found 63 zero-day bugs in 5 well-tested DBMSs: MySQL, MariaDB, CockroachDB, DuckDB, and PostgreSQL. It outperforms all existing tools in both bug finding and code coverage. These results demonstrate the effectiveness of the Bottom-up generation technique.
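The backtracking and expansion steps described above can be sketched on a toy grammar as follows. This is an illustrative approximation, not SQLBull's implementation: the grammar, the names `backtrack_path` and `expand_along_path`, and the choice of the first parent rule found are all assumptions, and the merging of multiple syntax paths is omitted. The sketch also assumes an acyclic grammar.

```python
import random

# Hypothetical toy grammar: nonterminal -> list of alternatives.
# Symbols that are not keys of GRAMMAR are terminals.
GRAMMAR = {
    "stmt": [["select"]],
    "select": [["SELECT", "expr", "FROM", "table"]],
    "expr": [["column"], ["func"]],
    "func": [["window_func"], ["COUNT(*)"]],
    "window_func": [["RANK()", "OVER", "(", "ORDER", "BY", "column", ")"]],
    "column": [["c0"], ["c1"]],
    "table": [["t0"]],
}
ROOT = "stmt"


def backtrack_path(target, root=ROOT):
    """Walk from `target` up to `root`.

    Returns a root-first list of (nonterminal, alternative_index) pairs
    naming the alternatives that lead down to `target`.
    Assumption: the first parent found is used, and the grammar is acyclic.
    """
    path = []
    current = target
    while current != root:
        parent = next(
            ((nt, i)
             for nt, alts in GRAMMAR.items()
             for i, alt in enumerate(alts)
             if current in alt),
            None,
        )
        if parent is None:
            raise ValueError(f"{target!r} is not reachable from {root!r}")
        path.append(parent)
        current = parent[0]
    path.reverse()
    return path


def expand_along_path(path, rng):
    """Expand from the root, forcing the alternatives recorded in `path`
    and choosing uniformly at random everywhere else."""
    forced = dict(path)  # nonterminal -> forced alternative index

    def expand(symbol):
        if symbol not in GRAMMAR:
            return [symbol]  # terminal
        alts = GRAMMAR[symbol]
        alt = alts[forced[symbol]] if symbol in forced else rng.choice(alts)
        return [tok for child in alt for tok in expand(child)]

    return expand(ROOT)


if __name__ == "__main__":
    path = backtrack_path("window_func")
    print(path)
    print(" ".join(expand_along_path(path, random.Random(0))))
```

Because the path forces every choice between the root and `window_func`, each query produced this way exercises the chosen rule, while the remaining nonterminals (here, `column`) are still expanded randomly.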