Sunday, August 4, 2013

Code coverage - introduction

According to Wikipedia:
In computer science, code coverage is a measure used to describe the degree to which the source code of a program is tested by a particular test suite.
Basically, code coverage works by instrumenting the code during compilation.
This means that special counters are added to our code to monitor the program's execution paths.
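To make the idea concrete, here is a hand-written illustration of what such counters amount to. It is only a sketch: the names arc_counter, classify and arc_count are made up for this example, and the code GCC really emits is generated automatically and looks different.

```c
#include <stdint.h>

/* Hypothetical counters, one per instrumented path.
 * GCC creates and names its own counters; these are illustrative only. */
static uint64_t arc_counter[2];

/* Example function with two execution paths. */
int classify(int value)
{
    if (value < 0) {
        arc_counter[0]++;   /* path taken for negative input */
        return -1;
    }
    arc_counter[1]++;       /* path taken for zero or positive input */
    return 1;
}

/* Read back how often a given path was executed. */
uint64_t arc_count(unsigned index)
{
    return arc_counter[index];
}
```

A counter that is still zero after the tests have run points at a path the test suite never exercised.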
In all my examples I will be using the GNU compiler GCC, which generates the code coverage data, and the gcov code coverage tool.
I will be working in the Eclipse IDE.
For Eclipse you should download the gcov integration from the Linux Tools package:

I also recommend installing the CDT gcov plugin for Eclipse. Thanks to the plugin, after downloading the coverage data from the target and refreshing the project we can see the results immediately:


In order for our target to generate coverage data, we have to compile it with the --coverage option, which is described at the end of this post.
Building won't be that easy, because the linker will report a lot of errors:
undefined reference to `_exit'
undefined reference to `_fstat'
undefined reference to `_open'
undefined reference to `_sbrk'
undefined reference to `_kill'
undefined reference to `_getpid'
undefined reference to `_write'
undefined reference to `_close'
undefined reference to `_isatty'
undefined reference to `_lseek'
undefined reference to `_read'
We need to provide function stubs to satisfy the linker.
The simplest way is to leave the functions empty. We will fill them in during the next steps.
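A minimal set of empty stubs might look like the sketch below. The signatures follow the convention commonly used with the newlib C library, which is an assumption about the toolchain, so check them against your own C library. Every stub simply reports failure.

```c
#include <errno.h>
#include <sys/stat.h>

/* Placeholder stubs, assuming newlib-style system call hooks.
 * They do nothing useful yet; they only let the program link. */

void _exit(int status)
{
    (void)status;
    for (;;) { }                 /* nothing to return to on bare metal */
}

int _open(const char *name, int flags, int mode)
{
    (void)name; (void)flags; (void)mode;
    errno = ENOSYS;
    return -1;
}

int _close(int file)             { (void)file; errno = EBADF; return -1; }

int _read(int file, char *ptr, int len)
{
    (void)file; (void)ptr; (void)len;
    errno = EBADF;
    return -1;
}

int _write(int file, char *ptr, int len)
{
    (void)file; (void)ptr; (void)len;
    errno = EBADF;
    return -1;
}

int _lseek(int file, int ptr, int dir)
{
    (void)file; (void)ptr; (void)dir;
    errno = EBADF;
    return -1;
}

int _fstat(int file, struct stat *st)
{
    (void)file; (void)st;
    errno = EBADF;
    return -1;
}

int _isatty(int file)            { (void)file; return 0; }

int _getpid(void)                { return 1; }   /* single "process" */

int _kill(int pid, int sig)
{
    (void)pid; (void)sig;
    errno = EINVAL;
    return -1;
}

/* No heap yet. The C library and the coverage runtime may need a real
 * implementation later if they allocate memory. */
void *_sbrk(int incr)
{
    (void)incr;
    errno = ENOMEM;
    return (void *)-1;
}
```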
After compiling our program with the --coverage option, the compiler will create a .gcno data file for each source file.
The next step is to run our program and terminate it properly.
A successful program exit is very important, because the .gcda files, which contain the actual coverage data, are created at this point.

The code coverage data is generated when the program exits.
Our program calls _open to open the .gcda file. If the file exists, the program calls _read to read it, merges the new coverage data into it, then calls _write to write the file and _close to close it.
When the file doesn't exist, the program just calls _write to write the new data to the file and then _close.

Why merge files?
Files are merged so that we can execute the program several times and the coverage data is the sum of all executions.
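Conceptually, merging is just an element-wise sum of counters. The function below is a simplified illustration with a made-up name (merge_counters); the real merge is done inside the gcov runtime library and also has to deal with the file format.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the counters of the current run onto the counters that were
 * read from an existing file. Illustrative only. */
void merge_counters(uint64_t *stored, const uint64_t *current, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        stored[i] += current[i];
    }
}
```

For example, a path that was executed once in a previous run and twice in the current one ends up with a count of three.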

At the beginning, just to get any coverage data, we can return 0 from the _open function, indicating that a new file was created (we get the file name from the _open argument). The program will then go to the _write function, giving us the file size and a pointer to the memory location with the file content. After that we can use gdb to dump the memory contents into a file.
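This first approach could be sketched as follows, with these functions taking the place of the empty _open, _write and _close stubs. It is a minimal sketch under some assumptions: the variables gcda_name, gcda_data and gcda_size are made-up names, the signatures follow the common newlib convention, and the data is assumed to arrive in a single _write call (if the library splits it across several calls, each chunk has to be dumped separately).

```c
#include <stddef.h>

/* Hypothetical capture variables, meant to be inspected from gdb. */
const char *gcda_name;   /* file name passed to _open */
const char *gcda_data;   /* start of the data passed to the last _write */
int gcda_size;           /* number of bytes in that _write */

int _open(const char *name, int flags, int mode)
{
    (void)flags; (void)mode;
    gcda_name = name;    /* remember which .gcda file this is */
    return 0;            /* pretend that a new file was created */
}

int _write(int file, char *ptr, int len)
{
    (void)file;
    gcda_data = ptr;     /* where the file content lives in target RAM */
    gcda_size = len;     /* how many bytes it occupies */
    return len;          /* report that everything was written */
}

int _close(int file)
{
    (void)file;
    return 0;
}

/* With the target halted at the end of _write, a gdb command such as
 *
 *   dump binary memory out.gcda gcda_data gcda_data+gcda_size
 *
 * saves the region to a file on the host. The name out.gcda is a
 * placeholder; use the name found in gcda_name. */
```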

Problems!
1. Instrumentation and gcov library.
Instrumentation of our code needs extra program and data memory. This can be a problem on an embedded target. Keep it in mind!
2. .gcda files.
The .gcda coverage files themselves aren't a problem, but getting them off the target is.
I think the simplest way is to dump the memory in which the file is located.
In my examples I will be using the gdb debugger, which makes this possible.
A more "sophisticated" way is to get the data out over some interface, for example a serial port or an Ethernet connection.
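As a rough idea of the serial variant, the sketch below sends a buffer as ASCII hex so that it survives a text-only terminal. Everything here is an assumption made for illustration: send_hex, put_byte_fn and the RAM-backed sink_put are hypothetical names, and on a real target the sink would be replaced by the UART transmit routine.

```c
#include <stddef.h>

/* Hypothetical byte-output hook: point it at the UART transmit routine. */
typedef void (*put_byte_fn)(char c);

/* Send a buffer as ASCII hex, two characters per byte, then a newline. */
void send_hex(const unsigned char *data, size_t len, put_byte_fn put)
{
    static const char digits[] = "0123456789abcdef";
    for (size_t i = 0; i < len; i++) {
        put(digits[data[i] >> 4]);      /* high nibble */
        put(digits[data[i] & 0x0f]);    /* low nibble */
    }
    put('\n');
}

/* Example sink that collects the output in RAM, handy for trying
 * the function out on a host machine. */
char sink[64];
size_t sink_len;

void sink_put(char c)
{
    if (sink_len < sizeof sink) {
        sink[sink_len++] = c;
    }
}
```

On the host, the captured text can be turned back into a binary .gcda file, for example with xxd -r -p.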


From GCC compiler documentation:
--coverage
This option is used to compile and link code instrumented for coverage analysis. The option is a synonym for -fprofile-arcs -ftest-coverage (when compiling) and -lgcov (when linking). See the documentation for those options for more details.
  • Compile the source files with -fprofile-arcs plus optimization and code generation options. For test coverage analysis, use the additional -ftest-coverage option. You do not need to profile every source file in a program.
  • Link your object files with -lgcov or -fprofile-arcs (the latter implies the former).
  • Run the program on a representative workload to generate the arc profile information. This may be repeated any number of times. You can run concurrent instances of your program, and provided that the file system supports locking, the data files will be correctly updated. Also fork calls are detected and correctly handled (double counting will not happen).
  • For profile-directed optimizations, compile the source files again with the same optimization and code generation options plus -fbranch-probabilities (see Options that Control Optimization).
  • For test coverage analysis, use gcov to produce human readable information from the .gcno and .gcda files. Refer to the gcov documentation for further information.
With -fprofile-arcs, for each function of your program GCC creates a program flow graph, then finds a spanning tree for the graph. Only arcs that are not on the spanning tree have to be instrumented: the compiler adds code to count the number of times that these arcs are executed. When an arc is the only exit or only entrance to a block, the instrumentation code can be added to the block; otherwise, a new basic block must be created to hold the instrumentation code.
-fprofile-arcs
Add code so that program flow arcs are instrumented. During execution the program records how many times each branch and call is executed and how many times it is taken or returns. When the compiled program exits it saves this data to a file called auxname.gcda for each source file. The data may be used for profile-directed optimizations (-fbranch-probabilities), or for test coverage analysis (-ftest-coverage). Each object file's auxname is generated from the name of the output file, if explicitly specified and it is not the final executable, otherwise it is the basename of the source file. In both cases any suffix is removed (e.g. foo.gcda for input file dir/foo.c, or dir/foo.gcda for output file specified as -o dir/foo.o). See Cross-profiling.
-ftest-coverage
Produce a notes file that the gcov code-coverage utility (see gcov—a Test Coverage Program) can use to show program coverage. Each source file's note file is called auxname.gcno. Refer to the -fprofile-arcs option above for a description of auxname and instructions on how to generate test coverage data. Coverage data matches the source files more closely if you do not optimize.
About .gcno & .gcda files:
gcov uses two files for profiling. The names of these files are derived from the original object file by substituting the file suffix with either .gcno, or .gcda. All of these files are placed in the same directory as the object file, and contain data stored in a platform-independent format.
The .gcno file is generated when the source file is compiled with the GCC -ftest-coverage option. It contains information to reconstruct the basic block graphs and assign source line numbers to blocks.
The .gcda file is generated when a program containing object files built with the GCC -fprofile-arcs option is executed. A separate .gcda file is created for each object file compiled with this option. It contains arc transition counts, and some summary information.
The full details of the file format is specified in gcov-io.h, and functions provided in that header file should be used to access the coverage files.