CS 6340 – Lab 1 – Fuzzing
In Part 1 you will implement a simple tool to automatically check for divide-by-zero errors in C
programs at runtime. You will create an LLVM pass that will instrument C code with additional
instructions that will perform runtime checks, thus creating a sanitizer, a form of lightweight
dynamic analysis. In the spirit of automated testing, your tool will provide a code coverage
mechanism that will show the actual instructions that execute when a program runs.
In Part 2 you will implement a fuzzer that will create random inputs to automatically test simple
programs. As we discussed in the lesson, we hope to get lucky and cause the input program to
crash on some randomly generated data. You will see how this specialized form of mutation
analysis can perform well enough to encourage developers to use this technique to help test their
code.
In Part 3 you will extend your fuzzer from Part 2 to make more interesting choices about the
kinds of input it generates to test a program. The fuzzer will use output from previous rounds of
test as feedback to direct future test generation. You will use the code coverage metrics
implemented in Part 1 to help select more interesting seed cases for your fuzzer to mutate.
Part 1 – Simple Dynamic Analysis
The skeleton code is located under /fuzzing/part1/. We will refer to the top level directory for
Part 1 as part1 when describing file locations.
Run the following commands to set up this part:
$ cd part1
$ mkdir build
$ cd build
$ cmake ..
$ make
You should see several files created in the current directory. This builds an LLVM pass from
code that we provide, part1/src/Instrument.cpp, named InstrumentPass.so.
Note that each time you update Instrument.cpp, you will need to rerun the make command in the
build directory before testing.
Next, let’s run our dummy Instrument pass over some C code that contains a divide-by-zero
error.
If you’ve done everything correctly up to this point, you should see Floating point exception
(core dumped). For the lab, you will complete the Instrument pass to catch this error at
runtime.
Format of Input Programs
All C programs are valid input programs.
In this lab, you will implement a dynamic analysis tool that catches divide-by-zero errors at
runtime. A key component of dynamic analysis is that we inspect a running program for
information about its state and behavior. We will use an LLVM pass to insert runtime checking
and monitoring code into an existing program. In this lab, our instrumentation will perform
divide-by-zero error checking, and record coverage information for a running program. In the
following part of the lab, we will introduce an automated testing framework using our dynamic
analysis.
Instrumentation and Code Coverage Primer. Consider the following code snippet where we
have two potential divide-by-zero errors, one at Line 1, the other at Line 2.
We have transformed our unsafe version of the code in the first example to a safe one by
instrumenting all division instructions with some code that performs a divisor check. In this lab,
you will automate this process at the LLVM IR level using an LLVM compiler pass.
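As a minimal hand-written sketch of this transformation, the C fragment below pairs an unchecked division with an instrumented one. The helper dbz_check is a hypothetical stand-in for the lab's __dbz_sanitizer__ (its message text and exit behavior are assumptions, not the contents of runtime.c), and the line and column arguments are illustrative constants rather than real debug locations.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the lab's __dbz_sanitizer__.
 * Assumption: report the location and stop the program when divisor is 0. */
static void dbz_check(int divisor, int line, int col) {
    if (divisor == 0) {
        fprintf(stderr, "Divide-by-zero detected at %d:%d\n", line, col);
        exit(1);
    }
}

/* Unsafe version: the behavior is undefined when b == 0
 * (on x86 Linux this typically ends in "Floating point exception"). */
int unsafe_div(int a, int b) {
    return a / b;
}

/* Instrumented version: the divisor is checked just before the division.
 * The location 17:14 is an illustrative constant, not a real debug location. */
int checked_div(int a, int b) {
    dbz_check(b, 17, 14);
    return a / b;
}
```

With this sketch, checked_div(1, 0) would print the location and exit with status 1 before the division executes. The pass you write inserts the equivalent call at the LLVM IR level instead of in the C source.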
Debug Location Primer. When you compile C code with the -g option, LLVM will include
debug information for LLVM IR instructions. Using the aforementioned instrumentation
techniques, your LLVM pass can gather this debug information for an Instruction, and forward
it to __dbz_sanitizer__ to report the location where a divide-by-zero error occurs. We will discuss the
specifics of this interface in the following sections.
Instrumentation Pass. We have provided a framework from which to build your LLVM
instrumentation pass. You will need to edit the part1/src/Instrument.cpp file to implement
your divide-by-zero sanitizer, as well as the code coverage analysis. The file part1/lib/runtime.c
contains functions that you will use in your lab:
– void __dbz_sanitizer__(int divisor, int line, int col)
– Output an error for line:col if divisor is 0
– void __coverage__(int line, int col)
– Append coverage information for line:col in a file for the currently executing
program
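To make the coverage side concrete, here is a minimal sketch of an append-style logger. The name log_coverage, the explicit path parameter, and the "line,col" record format are all assumptions made for illustration; the provided __coverage__ in runtime.c chooses its own file name and format.

```c
#include <stdio.h>

/* Hypothetical sketch of a coverage logger (not the lab's __coverage__).
 * Appends one "line,col" record per call to the file at `path`.
 * Returns 0 on success and -1 if the file cannot be opened. */
int log_coverage(const char *path, int line, int col) {
    FILE *f = fopen(path, "a");   /* "a": keep records from earlier calls */
    if (f == NULL) {
        return -1;
    }
    fprintf(f, "%d,%d\n", line, col);
    fclose(f);
    return 0;
}
```

Opening the file in append mode means that records from every instrumented instruction executed during one run accumulate in execution order, which is what lets the file be read afterwards as a trace of what ran.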
As you will create a runtime sanitizer, your dynamic analysis pass should instrument the code
with these functions. In particular, you will modify the runOnFunction method in
Instrument.cpp to perform this instrumentation for all LLVM instructions encountered inside a
function.
Note that our runOnFunction method returns true in Lab 1. In Lab 0, we returned false in similar
places. As we are instrumenting the input code with additional functionality, we return true to
indicate that the pass modifies, or transforms, the code it traverses.
In short, Part 1 consists of the following tasks:
1. Implement the instrumentSanitizer function to insert a __dbz_sanitizer__ check for
a supplied Instruction
2. Modify runOnFunction to instrument all signed and unsigned integer division
instructions with the sanitizer for a block of code
3. Implement the instrumentCoverage function to insert __coverage__ checks for all
instructions
4. Modify runOnFunction to instrument all instructions with the coverage check
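The selection logic behind tasks 2 and 4 can be sketched with a toy model in plain C. Everything here is hypothetical: the toy_opcode enum, the toy_inst struct, and plan_instrumentation are not part of the LLVM API or the lab skeleton. They only mirror the decision that runOnFunction makes for each instruction: every instruction gets a coverage call, and only signed and unsigned integer divisions also get a sanitizer call.

```c
#include <stddef.h>

/* Toy opcodes for illustration only; they loosely echo LLVM opcode names
 * but this is not the LLVM API. */
enum toy_opcode { TOY_ADD, TOY_SDIV, TOY_UDIV, TOY_FDIV, TOY_RET };

/* Toy instruction: an opcode plus a stand-in for its debug location. */
struct toy_inst {
    enum toy_opcode op;
    int line;
    int col;
};

/* How many calls of each kind the instrumentation would insert. */
struct toy_counts {
    size_t coverage_calls;
    size_t sanitizer_calls;
};

/* Only signed and unsigned integer division need the divisor check. */
static int is_integer_division(enum toy_opcode op) {
    return op == TOY_SDIV || op == TOY_UDIV;
}

/* Walk the instructions once and decide what to insert for each one. */
struct toy_counts plan_instrumentation(const struct toy_inst *insts, size_t n) {
    struct toy_counts counts = {0, 0};
    for (size_t i = 0; i < n; i++) {
        counts.coverage_calls++;              /* task 4: every instruction */
        if (is_integer_division(insts[i].op)) {
            counts.sanitizer_calls++;         /* task 2: sdiv and udiv only */
        }
    }
    return counts;
}
```

For a five-instruction sequence of add, sdiv, fdiv, udiv, ret, this plan yields five coverage calls and two sanitizer calls, since the floating-point division is not an integer division.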
Inserting Instructions into LLVM code. By now you are familiar with the BasicBlock and
Instruction classes and working with LLVM instructions in general. For this lab you will need
to use the LLVM API to insert additional instructions into the code when traversing a
BasicBlock. There are many ways to traverse programs in LLVM. One common pattern when
working with LLVM is to create a new instruction and insert it directly after some previous