Regression-Verification for C code
Download
Report
Transcript Regression-Verification for C code
Regression-Verification
Benny Godlin
Ofer Strichman
Technion
1
Software Formal Verification
Tony Hoare’s grand challenge: "verifying compiler"
Suppose we do not require completeness.
Still, there are two major bottlenecks when applying functional
verification:
Needs formal specification
assertions, or
higher level temporal properties
Complexity.
2
A more modest challenge: Regression Verification
“For every problem that you cannot solve, there is an easier
problem that you cannot solve”
Develop a method for formally verifying the equivalence of two
closely-related programs.
Does not require formal specification.
Computationally easier than functional verification
On the other hand: defines a weaker notion of correctness.
3
Advantages of Regression Verification
Specification by reference to an earlier version.
Can be applied from early development stages.
Equivalence proofs are potentially easier than functional
verification.
Ideally, the complexity should depend on the semantic difference
between the programs, and not on their size.
4
Regression-based technologies
In Testing: Regression testing is the most popular automatic
testing method
In hardware verification: Equivalence checking is the most
used verification method in the industry
6
Hoare’s
1969
paper
has it all.
...
…and 6 pages later:
7
Previous work
In the theorem-proving world:
Not dealing with realistic programs / realistic programming
languages
Not utilizing the equivalence of most of the code for simplifying
the computational challenge
Industrial / realistic programs:
Loop-free/Recursion-free code:
(microcode @ Intel, embedded code @ Hu)
8
Goals
Develop notions of equivalence
Develop corresponding proof rules
Develop a prototype for verifying equivalence of C programs,
that will
incorporate the equivalence rules
and be sensitive to the magnitude of change rather than the
magnitude of the programs
9
Notions of equivalence (1 / 6)
Partial equivalence
For any 2 terminating executions of P1 and P2 on equal inputs
the results of the executions are equal.
Undecidable
10
Notions of equivalence (2 / 6)
Mutual termination
Given equal inputs,
P1 terminates , P2 terminates
Undecidable
11
Notions of equivalence (3 / 6)
Reactive equivalence
Let P1 and P2 be reactive programs.
Given equal inputs, every 2 executions of P1 and P2
which also receive the same input sequence,
emit the same output sequence.
The sequences may be finite or infinite
Undecidable
12
Notions of equivalence (4 / 6)
k-equivalence
For any 2 executions of P1 and P2 on equal inputs
where each loop and recursion is restricted to iterate up to k
times,
results in equal output.
Decidable
13
Notions of equivalence (5 / 6)
Full equivalence
P1 and P2 are partially equivalent and mutually terminate
Undecidable
14
Notions of equivalence (6 / 6)
Total equivalence
P1 and P2 are partially equivalent and both terminate
Undecidable
15
Notions of equivalence: summary
1.
2.
3.
4.
5.
6.
Partial equivalence
Mutual termination
Reactive equivalence
k-equivalence
Full equivalence* = Partial equivalence + mutual termination
Total equivalence* = partial equivalence + both terminate
* Appeared in earlier publications.
16
Outline
Inference rules for proving equivalence
Hoare’s rule of recursion
Rule 1: partial equivalence
Rule 2: mutual termination
Rule 3: reactive equivalence
Regression verification for C programs
CBMC – the back engine
Architecture
Decomposition
Handling dynamic data structures
17
Hoare’s Rule of Recursion
Let A be a recursive procedure.
Where
symbolizes a sound proof relation.
(explained on the next slide)
18
A recursive run: Diagram
level 1:
level 2:
last level:
A(...)
A(...)
A(...)
{
{
{
}
. . .
. . .
. . .
. . .
call A(...);
call A(...);
. . .
. . .
. . .
call A(...);
. . .
. . .
. . .
. . .
. . .
. . .
}
}
19
Rule 1: partial equivalence of recursive procedures.
//in[A]
A( . . . )
{
. . .
//in[call A]
call A(. . .);
//out[call A]
. . .
}
//out[A]
A
//in[B]
B( . . . )
{
. . .
// in[call B]
call B(. . .);
//out[call B]
. . .
}
//out[B]
B
21
Rule 1: encoding the premise
The only thing we know about the called function is that it
satisfies the congruence axiom:
(congruence) Calls to U with the same input values
return the same output values.
Making the called function Uninterpreted is enough.
We call such a program isolated.
22
Rule 1. Isolation example
unsigned gcd1(unsigned a, unsigned b)
unsigned gcd1(unsigned a, unsigned b)
{ unsigned g;
{ unsigned g;
if (b == 0)
if (b == 0)
g = a;
g = a;
else {
else {
}
a = a % b;
a = a % b;
g = gcd1(b, a);
g = U(b, a);
}
}
return g;
return g;
}
U is an uninterpreted function
23
Rule 1. Definitions
f: a function
UF(f): an uninterpreted function such that f, UF(f) have the
same prototype.
FUF : the isolated version of a function F :
24
Rule 1: partial equivalence made simple
Earlier we had:
Now we have:
25
Rule 1: partial equivalence made simple
//in[AUF]
AUF( . . . )
{
. . .
AUF
call U(. . .);
. . .
}
//out[AUF]
//in[BUF]
BUF( . . . )
{
. . .
BUF
call U(. . .);
. . .
}
//out[BUF]
U is an uninterpreted function
26
Rule 1: partial equivalence made simple
I has to reason about a combination of
programs without loops or functions calls, and
uninterpreted functions.
For a programming language such as C, there exists decision
Procedures (and tools) with these features.
27
Rule 1: example
unsigned gcd1(unsigned a, unsigned b)
unsigned gcd2(unsigned x, unsigned y)
{ unsigned g;
{ unsigned z;
if (b == 0)
g = a;
else {
a = a % b;
g = gcd1(b, a);
}
z = x;
?
=
if (y > 0)
z = gcd2(y, z % y);
}
return z;
}
return g;
}
28
Rule 1: example (side 1)
First, compute the transition relation using Static Single Assignment (SSA)
unsigned gcd1(unsigned a, unsigned b)
{ unsigned g;
if (b == 0)
g = a;
else {
a = a % b;
g = H(b, a);
}
return g;
}
29
Rule 1: example (side 2)
unsigned gcd2(unsigned x, unsigned y)
{ unsigned z;
z = x;
if (y > 0)
z = H(y, z % y);
}
return z; }
30
Rule 1: example
unsigned gcd1(unsigned a, unsigned b)
unsigned gcd2(unsigned x, unsigned y)
{
{
}
. . .
. . .
return g;
return z;
}
31
Rule 2: Proving Mutual termination
The rule:
If all paired procedures satisfy :
1.
Partial equivalence,
2.
The conditions under which they call each pair of mapped
procedures are equal, and
3.
The read arguments of the called procedures are the same when
they are called,
then
all paired procedures mutually terminate.
32
Rule 3: Proving reactive equivalence
The rule:
If all paired procedures satisfy
1.
given the same arguments and the same input sequences, they
return the same values,
2.
they consume the same number of inputs, and
3.
the interleaved sequence of procedure calls and output
statements inside the mapped procedures is the same (and the
procedure calls are made with the same arguments)
then
all paired procedures are reactive equivalent.
33
Outline
Inference rules for proving equivalence
Hoare’s rule of recursion
Rule 1: partial equivalence
Rule 2: mutual termination
Rule 3: reactive equivalence
Regression verification for C programs
CBMC – the back engine
Architecture
Decomposition
Handling dynamic data structures
34
CBMC
A C verification tool by D. Kroening
Normal use – bounded model checking of C programs.
We use CBMC as follows:
Given a loop-free, recursion-free C program and an assertion, check
whether the assertion can fail.
Supports “assume(…)” construct to enforce constraints.
35
The Regression Verification Tool (RVT)
Version A
result
counterexample
feedback
Version B
RVT Preprocessor
C program
CBMC
rename identical globals
enforce equality of inputs.
insert check points code
Supports:
Decomposition
Abstraction
some static analysis
…
36
The main challenge: complexity
The goal: complexity depends on the semantic difference
The key: decomposition.
Preprocessing: all loops are rewritten as recursive functions.
Map the functions in the two programs.
Mapping heuristics: Name, signature, ‘role’ played in the code.
Mapped functions that have already been proven to be equal are
abstracted with uninterpreted functions.
39
Call Graph Algorithm
Legend:
Equivalent pair
Equivalence undecided yet
Non-equivalent pair
Syntactically equivalent pair
CBMC
A:
CBMC
Unpaired function
B:
f1()
f2()
U
f1’()
f2’()
f5()
U
f5’()
f7’()
U f3()
U f4()
f6()
U f3’()
U f4’()
40
Mutual Recursion
Mutually Recursive Functions:
Recognized by MSCCs in the call graph
We can treat this case by an extension of the solution to the
simple recursion case.
41
Call Graph Algorithm (with SCCs)
Legend:
Equivalent pair
Equivalence undecided yet
Non-equivalent pair
Syntactically equivalent pair
Equivalent if MSCC
A:
CBMC
B:
f1()
f2() U
U f3()
f1’()
f2’()
U
U f5()
U f4()
f6()
U
U f3’()
U f5’()
U f4’()
f6’()
42
The Regression Verification Tool (RVT)
We tested RVT on several programs:
Small programs that calculate GCD, power, sum of digits ...
Simple interactive (reactive) calculator program.
Automatically generated sizable programs
up-to thousands of code lines
but without arrays (unsupported yet)
Limited-size industrial programs:
Parts of TCAS - Traffic Alert and Collision Avoidance System.
Core of MicroC/OS - real-time kernel for embedded systems.
Matlab examples: parts of engine-fuel-injection simulation.
43
Testing RVT on programs: Conclusions
For equivalent programs, partial-equivalence checks were very
fast:
proving equivalence in minutes.
For non-equivalent programs:
RVT attempts to prove partial-equivalence but fails
then RVT tries to prove k-equivalence
k-equivalence sometimes succeeds to show counter examples:
executions on equal inputs with non-equal values at check-points
k-equivalence may run for several hours or even run out of
memory.
44
Long list of future work
Finish implementation of:
Future work:
Rules 2/3, arrays (especially in UFs), slicing,…
How to strengthen the rules with invariants.
Test on real changes (CVS repository).
Use other program verification tools for C to prove
equivalence.
45
More Challenges
Identifying the connection between functional properties and
equivalence properties:
Assume that a high level property P is satisfied by some version
of the program,
Q: Which values should be followed by regression verification to
guarantee that this property still holds ?
Q: What is the ideal gap between two versions of the same
program, that makes Regression Verification most effective ?
46