What is Code Coverage?
Code coverage measurement simply determines those statements in a body of code which have been executed through a test run and those which have not. In general, a code coverage system collects information about the running program and then combines that with source information to generate a report on test suite’s code coverage.
Code coverage is part of a feedback loop in the development process. As tests are developed, code coverage highlights aspects of the code which may not be adequately tested and which require additional testing. This loop will continue until coverage meets some specified target.
Why Measure Code Coverage?
It is well understood that unit testing improves the quality and predictability of your software releases. However, how well your unit tests actually test your code? How many tests are enough? Do you need more tests? These are the questions code coverage measurement seeks to answer.
Coverage measurement also helps to avoid test entropy. As your code goes through multiple release cycles, there can be a tendency for unit tests to atrophy. As new code is added, it may not meet the same testing standards you put in place when the project was first released. Measuring code coverage can keep your testing up to the standards you require. You can be confident that when you go into production there will be minimal problems because you know the code not only passes its tests but that it is well tested.
In summary, we measure code coverage for the following reasons:
* To know how well our tests actually test our code
* To know whether we have enough testing in place
* To maintain the test quality over the lifecycle of a project
Code coverage is not a panacea. Coverage generally follows an 80-20 rule. Increasing coverage values becomes difficult with new tests delivering less and less incrementally. If you follow defensive programming principles where failure conditions are often checked at many levels in your software, some code can be very difficult to reach with practical levels of testing. Coverage measurement is not a replacement for good code review and good programming practices.
In general you should adopt a sensible coverage target and aim for even coverage across all of the modules that make up your code. Relying on a single overall coverage figure can hide large gaps in coverage.
How Code Coverage Works
There are many approaches to code coverage measurement. Broadly there are three approaches, which may be used in combination:
Source Code Instrumentation: This approach adds instrumentation statements to the source code and compiles the code with the normal compile tool chain to produce an instrumented assembly.
Intermediate code Instrumentation: Here the compiled class files are instrumented by adding new bytecodes and a new instrumented class generated.
Runtime Information collection: This approach collects information from the runtime environment as the code executes to determine coverage information
The Total Coverage Percentage allows entities to be ranked in reports. The Total Coverage Percentage (TPC) is calculated as follows:
TPC = (BT + BF + SC + MC)/(2*B + S + M)
BT – branches that evaluated to “true” at least once
BF – branches that evaluated to “false” at least once
SC – statements covered
MC – methods entered
B – total number of branches
S – total number of statements
M – total number of methods
Code Coverage Criteria
To measure how well the program is exercised by a test suite, one or more coverage criteria are used. There are a number of coverage criteria, the main ones being:
* Function coverage – Has each function in the program been executed?
* Statement coverage – Has each line of the source code been executed?
* Condition coverage (also known as Branch coverage) – Has each evaluation point (such as a true/false decision) been executed?
* Path coverage – Has every possible route through a given part of the code been executed?
* Entry/exit coverage – Has every possible call and return of the function been executed?
Safety-critical applications are often required to demonstrate that testing achieves 100% of some form of code coverage.
Some of the coverage criteria above are connected. For instance, path coverage implies condition, statement and entry/exit coverage. Condition coverage implies statement coverage, because every statement is part of a branch.
Full path coverage, of the type described above, is usually impractical or impossible. Any module with a succession of n decisions in it can have up to 2n paths within it; loop constructs can result in an infinite number of paths. Many paths may also be infeasible, in that there is no input to the program under test that can cause that particular path to be executed. However, a general-purpose algorithm for identifying infeasible paths has been proven to be impossible (such an algorithm could be used to solve the halting problem). Techniques for practical path coverage testing instead attempt to identify classes of code paths that differ only in the number of loop executions, and to achieve “basis path” coverage the tester must cover all the path classes.
This metric reports whether each executable statement is encountered. The chief advantage of this metric is that it can be applied directly to object code and does not require processing source code. Performance profilers commonly implement this metric. The chief disadvantage of statement coverage is that it is insensitive to some control structures. For example, consider the following C/C++ code fragment:
int* p = NULL;
p = &variable;
*p = 123;
Without a test case that causes condition to evaluate false, statement coverage rates this code fully covered. In fact, if condition ever evaluates false, this code fails. This is the most serious shortcoming of statement coverage. If-statements are very common.
Statement coverage does not report whether loops reach their termination condition – only whether the loop body was executed. With C, C++, and Java, this limitation affects loops that contain break statements.
Since do-while loops always execute at least once, statement coverage considers them the same rank as non-branching statements.
Statement coverage is completely insensitive to the logical operators (|| and &&).
Statement coverage cannot distinguish consecutive switch labels.
Test cases generally correlate more to decisions than to statements. You probably would not have 10 separate test cases for a sequence of 10 non-branching statements; you would have only one test case. For example, consider an if-else statement containing one statement in the then-clause and 99 statements in the else-clause. After exercising one of the two possible paths, statement coverage gives extreme results: either 1% or 99% coverage. Basic block coverage eliminates this problem.
One argument in favor of statement coverage over other metrics is that bugs are evenly distributed through code; therefore the percentage of executable statements covered reflects the percentage of faults discovered. However, one of our fundamental assumptions is that faults are related to control flow, not computations. Additionally, we could reasonably expect that programmers strive for a relatively constant ratio of branches to statements.
In summary, this metric is affected more by computational statements than by decisions.
This metric reports whether boolean expressions tested in control structures (such as the if-statement and while-statement) evaluated to both true and false. The entire boolean expression is considered one true-or-false predicate regardless of whether it contains logical-and or logical-or operators. Additionally, this metric includes coverage of switch-statement cases, exception handlers, and interrupt handlers.
Also known as: branch coverage, all-edges coverage, basis path coverage, decision-decision-path testing. “Basis path” testing selects paths that achieve decision coverage.
This metric has the advantage of simplicity without the problems of statement coverage.
A disadvantage is that this metric ignores branches within boolean expressions which occur due to short-circuit operators. For example, consider the following C/C++/Java code fragment:
if (condition1 && (condition2 || function1()))
This metric could consider the control structure completely exercised without a call to function1. The test expression is true when condition1 is true and condition2 is true, and the test expression is false when condition1 is false. In this instance, the short-circuit operators preclude a call to function1.
This metric reports whether each of the possible paths in each function have been followed. A path is a unique sequence of branches from the function entry to the exit.
Also known as predicate coverage. Predicate coverage views paths as possible combinations of logical conditions.
Since loops introduce an unbounded number of paths, this metric considers only a limited number of looping possibilities. A large number of variations of this metric exist to cope with loops. Boundary-interior path testing considers two possibilities for loops: zero repetitions and more than zero repetitions. For do-while loops, the two possibilities are one iteration and more than one iteration.
Path coverage has the advantage of requiring very thorough testing. Path coverage has two severe disadvantages. The first is that the number of paths is exponential to the number of branches. For example, a function containing 10 if-statements has 1024 paths to test. Adding just one more if-statement doubles the count to 2048. The second disadvantage is that many paths are impossible to exercise due to relationships of data. For example, consider the following C/C++ code fragment:
Path coverage considers this fragment to contain 4 paths. In fact, only two are feasible: success=false and success=true.
Researchers have invented many variations of path coverage to deal with the large number of paths. For example, n-length sub-path coverage reports whether you exercised each path of length n branches.