*Shoham Shitrit(University of Rochester) and Sreepathi Pai (University of Rochester)*

Formal semantics for instruction sets can be used to validate implementations through formal verification. However, testing is often the only feasible method when checking an artifact such as a hardware processor, a simulator, or a compiler. In this work, we construct a pipeline that can be used to automatically generate a test suite for an instruction set from its executable semantics. Our method mutates the formal semantics, expressed as a C program, to introduce bugs in the semantics. Using a bounded model checker, we then check the mutated semantics to the original for equivalence. Since the mutated and original semantics are usually not equivalent, this yields counterexamples which can be used to construct a test suite. By combining a mutation testing engine with a bounded model checker, we obtain a fully automatic method for constructing test suites for a given formal semantics. We intend to instantiate this on a formal semantics of a portion of NVIDIA’s PTX instruction set for GPUs that we have developed. We will compare to our existing method of testing that uses stratified random sampling and evaluate effectiveness, cost, and feasibility.