[ Home | Download | User Guide | Publications | Acknowledgement ]
1. Prerequisites and Download
2. Extracting and Building the Tools
3. Debugging the bc-1.06 program
3.1 Understanding the bug
3.2 Running the Automated Tools and Interpreting the Outputs
4. Debugging your own programs
5. Known problems/bugs
6. Questions and Feedback
1) Our automated debugging tool collects data about the program using binary instrumentation, built on the PIN binary instrumentation platform. To use our binary instrumentation tools (pin-tools), you therefore have to install PIN first. The latest copy of PIN can be obtained here.
2) The PIN platform and our pin-tools should work on Windows, Linux and MacOS. However, we have developed and tested our tools on Linux only, so the rest of this document assumes a Linux OS. If you want to use our tools on Windows, you should only have to modify some of the automation scripts (we explain those scripts in detail later in this document; they are basically used to run the program repeatedly for the debugging process).
3) Download our pin-tools and scripts as well as the example bug using the download link on this page.
1) The two files downloaded from the previous step should be AutoDebug.tgz and bc-1.06_BUG.tgz.
Place AutoDebug.tgz in the source/tools/ directory of PIN (where the PIN source code examples reside).
Extract AutoDebug.tgz:
tar -xzvf AutoDebug.tgz
This should create a directory AutoDebug/, which contains the source code for our pin-tools.
To build the pin-tools:
cd CHECK
make
cd ..
make
This should create a folder obj-ia32 where the compiled pin-tools reside.
2) Create a folder for the example buggy program and copy bc-1.06_BUG.tgz into that folder. Extract bc-1.06_BUG.tgz, which creates a folder bc-1.06_BUG/. Inside bc-1.06_BUG/ there is a folder src/, which contains the source code for the buggy program. Change into the source folder and compile the bc program. Rename the executable to my_bc and move it back into the bc-1.06_BUG/ folder.
cd src
make
mv bc ../my_bc
Once we build the bc program, we should also disassemble the executable. This is necessary, since we are going to be looking at the assembly code during our debugging process.
objdump --line-numbers --source --disassemble-all my_bc > my_bc.dasm
To automate the process, we use several scripts, which reside in the bc-1.06_BUG/ directory. Those scripts have to be updated to use the proper location of PIN. Search for $PIN in the scripts and update it to point to your installation of PIN. In my case, the PIN variable was set to:
my $PIN = "/home/dimitrov/Spring2009/pin-2.5-24110-gcc.4.0.0-ia32_intel64-linux/pin";
BC is a program that implements an arbitrary-precision calculator language. The bug we are going to investigate is a memory corruption bug, and the relevant code from file storage.c is shown below. The faulty code is in function more_arrays(). This function is called when more storage needs to be allocated to an array. It allocates a new, larger array, copies the elements of the old array into the new one, and initializes the remaining entries of the new array to NULL. The defect is on line 18: the variable v_count is mistakenly used instead of the correct variable a_count. Thus, whenever v_count happens to be larger than a_count, the buffer arrays is overflowed and its size information, which is located right after the buffer, is lost. This results in a segmentation fault when more_arrays() is called one more time and the buffer with corrupted size information is freed at line 23.
1 void more_arrays () { |
Copy the pin-tools (the .so files) from their obj-ia32/ directory to the bc-1.06_BUG/ directory.
Then execute the script:
pin_RUN_split_diduce.pl
This script will execute the bc program repeatedly. First, it executes the program with passing inputs and records certain properties of the program during each execution. Those properties are accumulated and updated across the passing runs in a file named trained.0.diduce_dump.txt.
After the training step, the script runs bc with a failing input (an input which results in a segmentation fault), and our pin-tool detects the execution anomalies that occur during the failing run. Notice the output file diduce.1.unique.txt. This file contains ALL the execution anomalies that our tool detected. On my machine, and for my executable (compiled with the -static option), the number of anomalies is 24. Each line of this file begins with "pc" or "NEW_pc" and corresponds to an assembly instruction in the program for which our tool detected anomalous behavior. The "NEW_pc" instructions are marked anomalous because they were never executed during any of the training runs; they are simply new code. The remaining instructions showed anomalous behavior compared to the passing runs.
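As a quick sanity check on the anomaly list, the two kinds of entries can be tallied from the shell. This is a minimal sketch that relies only on the line prefixes described above ("pc" and "NEW_pc"). The helper name count_anomalies is hypothetical and not part of our scripts, and nothing is assumed about the rest of each line.

```shell
# Hypothetical helper (not part of the provided scripts): tally the entries
# in an anomaly file such as diduce.1.unique.txt by their line prefix.
count_anomalies() {
    file=$1
    # Lines starting with "NEW_pc": code never executed during training.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    new=$(grep -c '^NEW_pc' "$file" || true)
    # Lines starting with "pc": code that behaved differently than in training.
    changed=$(grep -c '^pc' "$file" || true)
    echo "NEW_pc=$new pc=$changed total=$((new + changed))"
}

# Usage: count_anomalies diduce.1.unique.txt
```

The same function can be pointed at any other file that uses these two prefixes.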
The file diduce.1.CrashToken.txt contains a (usually small) subset of all the anomalies. Those anomalies are related to the instruction which crashed the program, through data dependencies. Thus, they are much more likely to contain the root cause of the bug. We refer to this set of anomalies as the "isolated anomalies" - since we isolate only the relevant anomalies and discard the irrelevant ones. In my case I have 3 isolated anomalies.
At this point, if only a few anomalies were isolated, the programmer may search for them in the assembly code and determine to which C/C++ instruction they actually correspond. For example, the first anomaly in my diduce.1.CrashToken.txt file is for pc=0x804d9c0. Searching for that pc in the assembly, we find:
arrays[indx] = NULL; /*infection: overflows its size information */
804d9c0: c7 04 91 00 00 00 00 movl $0x0,(%ecx,%edx,4)
This is exactly the point where memory is corrupted due to the buffer overflow and thus we have successfully detected the root cause of the problem.
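The lookup above can be scripted. The following is a minimal sketch, assuming the disassembly was written to my_bc.dasm with the objdump command shown earlier, and that instruction lines start with the bare hex address followed by a colon, as in the excerpt above. The helper name find_pc is hypothetical, and the three lines of leading context are an arbitrary choice meant to surface the interleaved C source line. The -B option requires GNU or BSD grep.

```shell
# Hypothetical helper: print the disassembly line for a given pc together
# with a few preceding lines (objdump --source places the C source line
# before the instructions generated for it).
find_pc() {
    pc=${1#0x}              # strip a leading "0x"; objdump prints bare hex
    dasm=${2:-my_bc.dasm}   # disassembly file, defaults to my_bc.dasm
    grep -B 3 -i "^[[:space:]]*${pc}:" "$dasm"
}

# Usage: find_pc 0x804d9c0 my_bc.dasm
```

If the anomaly falls in code compiled without debug information, only the assembly will be shown, and the enclosing function has to be identified by scrolling up in the file.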
Sometimes, even after isolating only the relevant anomalies, there are still too many of them to be examined manually. Thus, we have developed another (optional) step in our automated approach. In this step, which we call "validation", we test each of the isolated anomalies by attempting to automatically "fix" them. Our way of "fixing" is simple. During program execution, we skip the anomalous instruction, thus preventing it from corrupting memory. Our experiments show that this simple approach is surprisingly effective and can reduce the number of anomalies to be examined.
To perform the validation step, simply execute the script:
validate_split.pl
After running this script, you should obtain a file: CrashToken.0.diduce.validated. This file contains a summary of the validation step. It classifies the isolated anomalies as: validated, unresolved and dismissed. Those ranked as validated are the most likely root cause of the bug, because skipping these instructions causes the program crash to disappear. In this particular case, the instruction 0x804d9c0 is classified as validated.
The process of debugging your own programs should be similar to debugging bc. You need to modify the scripts so that they run your program. For example, in the pin_RUN_split_diduce.pl script, there is a section which defines how to run the program. The code looks as shown below. This code defines how to perform the training for program bc and also how to trigger the bug. Simply replace the command after "--" with your own commands. The same has to be done for the script RUN_delete_dynamic, which is part of the validation step: replace the command after "--" with your own.
my @training_set = ( |
The process described in this document works automatically for programs that crash. During a crash, our pin-tool automatically takes control and isolates the anomalies relevant to the crash point. However, if the program does not crash but simply produces incorrect results, some more effort is required. We have to determine the point of program failure (for example, the point where the incorrect result is printed). Once we determine an instruction (pc) that should be considered the point of failure, we can specify it to our pin-tool with the TOKEN_PC option when running the pin-tool. (Note that the pc has to be supplied on the command line as an unsigned integer value, not as a hex value.)
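Because the pc must be given in decimal, a hexadecimal address taken from the disassembly has to be converted first. A minimal sketch using the shell's printf follows. The helper name pc_to_decimal is hypothetical, and how the resulting value is passed to TOKEN_PC on the pin-tool command line is not shown here.

```shell
# Hypothetical helper: convert a hexadecimal pc into the unsigned decimal
# value expected on the command line.
pc_to_decimal() {
    pc=${1#0x}              # accept the address with or without "0x"
    printf '%u\n' "0x$pc"
}

# Usage: pc_to_decimal 0x804d9c0   (prints 134535616)
```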
We have provided more scripts and pin-tools than we describe in this brief user guide. Those pin-tools implement different methods for detecting anomalies during program execution (namely the AccMon and Loop-count methods, as described in our publication). The process for using them is the same as the one described here. The scripts for using them are also provided.
Since we are instrumenting (using dynamic binary translation) a program that corrupts memory, the buggy program may sometimes cause PIN itself to crash, giving an error: C:Tool (or Pin) caused signal 11 at PC 0x128a013. Sometimes we can prevent this by simply running the program on a different machine, or by recompiling it with different options (such as -static). At this point we still do not have a solution to this problem, but we are communicating with the PIN developers about the issue.
Newer toolchains may come with a glibc whose malloc() attempts to catch some memory corruption bugs. In this case, the crash (for example, one due to a double free) will be intercepted by glibc instead of our pin-tool. If you want to disable the malloc() checks, set the environment variable MALLOC_CHECK_: export MALLOC_CHECK_=0 (glibc silently ignores the heap corruption it detects) or export MALLOC_CHECK_=2 (glibc calls abort() immediately instead of printing its diagnostic).
If you have questions about how to use the tool, or want to leave feedback, please email us at slice4e AT gmail.com or zhou AT eecs.ucf.edu