
Realization of solver based
techniques for Dynamic Software
Verification
Andreas S Scherbakov
Intel Corporation
andreas.s.scherbakov@intel.com
What’s program testing here?
• The problem: to test a program means
• Find at least one set of input values such that
– a crash or an illegal operation occurs
or
– some user-defined property is violated (unexpected results/behaviour)
or
• Prove the correctness of the program
– or at least demonstrate that it is correct with high probability
SW testing: basic approaches
• Random testing
-> You execute your program repeatedly with random input values..
+ covers a lot of unpredictable cases
─ too many redundant iterations -> out of resources
• “Traditional” testing - Custom test suites
-> You know your code and therefore you can create the necessary examples to test it?..
+ targets known critical points
─ misses most unusual use cases
─ large effort, requires intimate knowledge of the code
• Directed testing
-> Try to get a significantly different run on each attempt..
+ explores execution alternatives rapidly
+ effective for mixed whitebox/blackbox code
─ usually needs some collateral code
─ takes large resources if poorly optimized
SW testing: basic approaches - 2
• Static Analysis
Commercial tools: Coverity, Klocwork, …
─ Find dumb bugs, not application logic errors
─ Finds some “false positive” bugs, misses many real bugs
+ Good performance
+ Little expertise required
• Model Checking
Academic and competitor tools: BLAST, CBMC, SLAM/SDV
+ Finds application logic errors
─ Finds some “false positive” bugs, but doesn’t miss any real ones
─ Significant user expertise required
• Formal Verification
Academic tools: HOL, Isabelle, …
+ Ultimate guarantee: proves conformance with specification
─ Scaling constraint is human effort, not machine time
─ Ultimate user expertise required: multiple FV PhDs
Directed Testing:
as few runs as possible
• executes the program with two test
cases: i=0 and i=5
• 100% branch coverage
DART: Directed Automated Random Testing
• The main idea was proposed in
Patrice Godefroid, Nils Klarlund, and Koushik Sen. DART: Directed Automated Random Testing. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation. PLDI 2005: 213-223.
• Depends on Satisfiability Modulo Theories (SMT) solvers
-> SMT solvers are tools that can solve sets of equations/constraints. A theory here means the methods associated with a particular set of allowed data types and operations
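-> As a hedged illustration (not from the original deck), here is the kind of query such a solver answers for a single branch; the fragment, its numbers, and the name fragment are made up:
/* Sketch only: the else-branch below yields the constraint handed to the solver. */
int fragment (int x, int y) {
  if (3 * x + y > 10) {   /* a linear integer arithmetic condition */
    return 1;             /* branch A */
  } else {
    return 0;             /* branch B: to reach it on another run, the tool asks
                             the solver for integers x, y with 3*x + y <= 10;
                             e.g. x = 0, y = 0 satisfies it. */
  }
}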
What does it check?
• It does not verify the correctness of the program unless you express the meaning of correctness in the form of assertion checkers
– it cannot infer what the ‘correct’ behavior of the program is
• What does it check?
– allows users to add assumptions to limit the search space and assertions (‘ensure‘) to define the expected behavior (see the sketch below)
– assertions are treated as (‘bad’) branches, so the test process will try to reach them, or formally verify that it is impossible
– ‘built in’ checks for crashes, divide by 0, memory corruption
• Requires some familiarity with the software under test to be effective
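A minimal usage sketch, built from the choose_int / require / ensure primitives that appear later in this deck; the unit under test, my_divide, is a hypothetical example:
int main () {
  const int a = choose_int ("a");   /* generated input                        */
  const int b = choose_int ("b");
  require (b != 0);                 /* assumption: limits the search space    */
  const int q = my_divide (a, b);   /* hypothetical unit under test           */
  ensure (q * b + a % b == a);      /* assertion: defines the expected result */
  return 0;
}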
Looking for a Snark in a Forest
Looking for a Bug in a Program
• A bug is like a snark
• A program is like a forest with many
paths
• Source code is like a map of the
forest
Just the place for a Snark!
I have said it twice:
That alone should encourage the crew.
Just the place for a Snark!
I have said it thrice:
What I tell you three times is true.
The Hunting of the Snark
Lewis Carroll
Proof Rather than Snark Hunting
Forest searching can be a very effective way to show the presence of snarks, but it is hopelessly inadequate for showing their absence.
The Humble Snark Hunter
• How can we prove there are no snarks in the forest?
– Get a map of the forest
– Find the area between trees
– Assume a safe minimum diameter of a snark
– If minimum snark diameter > minimum tree separation, no snarks in forest
• The gold standard, but:
– You need a formal model of the forest
– A mathematician
– Substantial effort
– As good as your model of forests and snarks (are snarks really spherical?)
Snark Hunting Via Random Testing
• REPEAT
– Walk through the forest with a coin.
– On encountering a fork, toss the coin:
• heads, go left
• tails, go right
• UNTIL snark found or exhausted
• Easy to do: You don’t even need a map!
• But:
– Very low probability of finding a snark
Traditional Snark Hunting
• Study the forest map and use your experience to
choose the places where snarks are likely to hide.
• For each likely hiding place, write a sequence of “turn
left”, “turn right” instructions that will take you there.
• REPEAT
– Choose an unused instruction sequence
– Walk through the forest following the instructions
• UNTIL snark found or all instructions used
• But…
– Snarks are notoriously good at hiding where you don’t expect them
Snark Hunting Via Static Coverage Analysis
• Get a map of the forest
• Have a computer calculate instruction sequences that
go through all locations in the forest.
• REPEAT
– Choose an unused instruction sequence
– Walk through the forest following the instructions
• UNTIL snark found or enough of the forest covered
• But…
– Lots of computing power needed to calculate the paths
– There will be a lot of paths
Effective Snark Hunting Without A
Map
• Start with a blank Map
He had bought a large map representing the sea,
Without the least vestige of land:
And the crew were much pleased when they found it to be
A map they could all understand.
• REPEAT
– REPEAT
• Walk through the forest with
– a map (initially blank)
– sequence of instructions (initially blank)
• Add each fork that you haven’t seen before to your map.
• When encountering a fork:
– If there is an unused instruction, follow it
– Otherwise, toss a coin as in random testing
– UNTIL you exit the forest
• If there is a fork on your map with a branch not taken
– Write a sequence of instructions that lead down such a branch
• UNTIL snark found, no untaken branches on map, or you’re tired
Comparison of alternatives
[Chart: accuracy versus expertise/effort, positioning Formal Verification, Model Checking, DART, Traditional testing, and Static analysis]
How it Works
• f(x,y) run 1
– arbitrary inputs: x=0, y=9
– observed path: (x > y) false; x1 = x - 1 = -1; (x1 > y) false
void f (int x, int y) {
  if (x > y) {
    x = x + y;
    y = x - y - 3;
    x = x - y;
  }
  x = x - 1;
  if (x > y) {
    abort ();
  }
  return;
}
How it Works
• f(x,y) run 2, first attempt
– choose x, y so that
• (x > y) = false
• x1 = x - 1
• (x1 > y) = true
– no such x, y! (this path is infeasible)
(program as above)
How it Works
• f(x,y) run 2
– choose x, y so that (x > y) = true
– solver returns inputs: x=9, y=0
– observed path: (x > y) true; x1 = x + y = 9; y1 = x1 - y - 3 = 6;
x2 = x1 - y1 = 3; x3 = x2 - 1 = 2; (x3 > y1) false
(program as above)
How it Works
• f(x,y) run 3
– choose x, y so that
• (x > y) = true
• x1 = x + y
• y1 = x1 - y - 3
• x2 = x1 - y1
• x3 = x2 - 1
• (x3 > y1) = true
– solver returns inputs: x=1, y=0
– observed path: (x > y) true; x1 = 1; y1 = -2; x2 = 3; x3 = 2;
(x3 > y1) true -> abort reached
(program as above)
A Simple Test Harness
The Program
void snarky (int x, int y) {
  if (x > y) {
    x = x + y;
    y = x - y - 3;
    x = x - y;
  }
  x = x - 1;
  if (x > y) {
    abort ();
  }
}

int main () {
  const int x = choose_int ("x");
  const int y = choose_int ("y");
  snarky (x, y);
  return 0;
}
• choose_int is an instrumentation library routine
Quick Example
void
string_copy (const char *s, char *t) {
  int i;
  for (i=0; s[i] != '\0'; ++i) {
    t[i] = s[i];
  }
}

int
string_equal (const char *s, const char *t)
{
  int i = 0;
  while (s[i] != '\0' && s[i] == t[i]) {
    ++i;
  }
  return s[i] == t[i];
}

int
main () {
  const size_t source_length = choose_size_atmost (…);
  const char *source = choose_valid_string (…);
  const size_t target_size = choose_size_atleast (…);
  char *target = choose_valid_char_array (…);
  string_copy (source, target);
  ensure (string_equal (source, target));
  return 0;
}
Quick example: Bug found
Bug found with the parameters:
target_size = 1
target[0] = 1
source_length = 0
(Killed by signal)
Overall Design
• Harness Library
– Supplies specified values for inputs, or arbitrary values
– Checks required/ensured constraints
• Instrumentation
– Modifies a C program so that it produces a trace of the executed path
• Observed Execution
– Observes the path taken by a run and calculates a predicate describing a new path
• Constraint Solver
– The solver is used to discover, for a specified path condition:
• whether the path is feasible
• inputs that would cause it to be executed
(how these pieces interact is sketched below)
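An illustrative, self-contained toy (not the tool’s code): a brute-force search over a tiny domain stands in for the SMT solver, just to show the observe / negate / solve loop on the two branch conditions of the f(x,y) example above. All names here are made up for the sketch.
// Toy sketch of the negate-and-solve loop; a brute-force search replaces the SMT solver.
#include <cstdio>
#include <functional>
#include <vector>

struct Branch { std::function<bool(int,int)> cond; bool want; };

// Stand-in "solver": find (x, y) in a small domain satisfying all constraints.
static bool solve (const std::vector<Branch>& cs, int& x, int& y) {
    for (x = -10; x <= 10; ++x)
        for (y = -10; y <= 10; ++y) {
            bool ok = true;
            for (const Branch& b : cs) ok = ok && (b.cond (x, y) == b.want);
            if (ok) return true;
        }
    return false;
}

int main () {
    auto c1 = [] (int x, int y) { return x > y; };        // first branch condition
    auto c2 = [] (int x, int y) { return (x - 1) > y; };  // second branch on the path
                                                          // where c1 is false
    int x = 0, y = 9;                                     // run 1: arbitrary inputs
    std::printf ("run 1: x=%d y=%d, path: %d %d\n", x, y, c1 (x, y), c2 (x, y));

    // Try to flip the last branch while keeping the prefix: infeasible.
    std::vector<Branch> q1 = { { c1, false }, { c2, true } };
    if (!solve (q1, x, y))
        std::printf ("no inputs keep (x>y) false and make (x-1>y) true\n");

    // Flip the first branch instead: feasible, yields inputs for the next run.
    std::vector<Branch> q2 = { { c1, true } };
    if (solve (q2, x, y))
        std::printf ("run 2: x=%d y=%d\n", x, y);
    return 0;
}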
Testing Time
[Chart: testing time over successive runs]
• Don’t expect to test all paths for realistically sized data
• You can, however, run many useful tests quickly
You Provide The Controllability
• For each “unit” you write
– A harness to call the unit’s functions
– Stubs for the functions the unit calls
• The framework provides functions to generate values
– For harnesses to call with
– For stubs to return
– Declarative specification of constraints on the values
• This provides
– A model of the unit’s environment
– Controllability over the unit
(see the sketch below)
[Diagram: harness code and stub code surrounding the code under test]
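A minimal sketch of such a harness and stub, reusing the choose_int / require / ensure primitives from this deck; the unit under test (check_level) and its stubbed dependency (read_sensor) are hypothetical names:
/* Stub: instead of querying the real environment, the dependency returns a
   value generated by the harness library. */
int read_sensor (void) {
  int v = choose_int ("sensor");
  require (v >= 0 && v <= 100);     /* declarative constraint on the value */
  return v;
}
/* Harness: call the unit's function with generated inputs and check a
   user-defined property of the result. */
int main (void) {
  const int threshold = choose_int ("threshold");
  const int alarm = check_level (threshold);   /* hypothetical unit under test */
  ensure (alarm == 0 || alarm == 1);           /* expected behaviour */
  return 0;
}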
Front End:
Instrumentation
Why do we track symbolic data?
We want to be able to choose another branch on the next run..
if (x==y+3) {
/* branch A */
} else {
/* branch B */
}
To choose a given branch, we need to solve:
( x==y+3 ) == false/true
To pass this to the solver, we need the expression x==y+3 in symbolic form at the if.
To know it at that point, we must track the assignments of its constituent components..
Tracing symbolic data
• Solution: add special tracing statements alongside the source statements (sketched below)
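A hedged sketch of what such instrumentation could look like for one assignment and one branch; trace_assign and trace_branch are illustrative names, not the tool’s real API:
#include <stdio.h>
/* Toy tracing routines standing in for the real instrumentation library. */
static void trace_assign (const char *lhs, const char *rhs_expr)
{ printf ("assign %s := %s\n", lhs, rhs_expr); }
static void trace_branch (const char *cond_expr, int taken)
{ printf ("branch (%s) -> %d\n", cond_expr, taken); }

int f (int x, int y) {
  int z = x + y;
  trace_assign ("z", "x + y");       /* record the symbolic side of the assignment       */
  trace_branch ("z > 10", z > 10);   /* record the condition text and its concrete value */
  if (z > 10)
    return 1;
  return 0;
}
int main (void) { return f (3, 9); }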
CIL
• “CIL (C Intermediate Language) is a high-level representation along with a set of tools that permit easy analysis and source-to-source transformation of C programs.”
http://www.cs.berkeley.edu/~necula/cil/
• CIL enables a user application to explore and refactor various kinds of C source constructs (functions, blocks, statements, instructions, expressions, variables, etc.) in a convenient way while keeping the remaining code structure.
Tool Framework
[Diagram: user input, the user-written harness and the software under test feed the frontend (CIL-based instrumentation); the backend runs the instrumented program, a scoreboard tracks coverage, and an input generator backed by an SMT solver produces inputs for the next run]
Problem: the CIL-based frontend does not support C++
Solution: replace the CIL-based frontend with LLVM to support C++
How CIL simplifies handling the code..
• Automatically rewrites C expressions with side effects:
a = b += --c ---> c = c-1; b = b+c; a = b;
• Uniformly represents memory references: (base+offset)
• Converts do/for/while loops to (concrete example after this list)
while (1) {
if (cond1) break; /* if needed */
if (cond2) continue; /* if needed */
body;
}
• Traces control flow
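For instance, an ordinary for loop would be normalized roughly as follows (a sketch of the shape of the rewrite, not CIL’s literal output):
#include <stddef.h>
/* Before: an ordinary for loop. */
int sum_before (const int *a, size_t n) {
  int sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += a[i];
  return sum;
}
/* After: the same loop in the while(1)/break form described above. */
int sum_after (const int *a, size_t n) {
  int sum = 0;
  size_t i = 0;
  while (1) {
    if (!(i < n)) break;
    sum += a[i];
    i++;
  }
  return sum;
}
int main (void) { int a[3] = {1, 2, 3}; return sum_before (a, 3) - sum_after (a, 3); /* 0 */ }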
What is LLVM?
• LLVM – Low Level Virtual Machine
• Modular and reusable collection of libraries
• Developed at UIUC and at Apple®
• LLVM Intermediate Representation (IR) is well designed and documented
• Has a production quality C++ frontend that is compatible with GCC
• Open-source with industry friendly license
• More info at www.llvm.org
LLVM frontend
LLVM Based Frontend
[Diagram: the user-written harness and the software under test are parsed by the Clang C/C++ parser into LLVM IR; an instrumentation compiler pass plus the rest of the compile produce the instrumented program, which feeds the backend]
LLVM provides modular libraries and tool infrastructure to develop compiler passes (a skeleton pass is sketched below)
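A hedged skeleton of such a pass, modeled on LLVM’s public “Hello” pass tutorial (legacy pass manager API); it only walks the IR and reports where a tracing call would be inserted, and is not the tool’s actual pass:
#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

namespace {
struct TraceBranches : public FunctionPass {
  static char ID;
  TraceBranches() : FunctionPass(ID) {}
  bool runOnFunction(Function &F) override {
    for (BasicBlock &BB : F)
      for (Instruction &I : BB)
        if (isa<BranchInst>(&I))
          errs() << "would instrument a branch in " << F.getName() << "\n";
    return false;   // this skeleton does not modify the IR
  }
};
}
char TraceBranches::ID = 0;
static RegisterPass<TraceBranches> X("trace-branches", "Example branch-tracing pass");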
Using C++ overloads
• Idea: redefine operators in such a way that they output trace data:
my_int operator + (my_int x, my_int y) {
  symbolic s = trace_addition(x.symbol(), y.symbol());
  int c = x.val() + y.val();
  return my_int (s, c);
}
• Instrumentation is still needed (control tracing, types..)
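A hedged sketch of the kind of wrapper type such an overload assumes; the symbolic handle type, trace_addition, and the class layout are illustrative, not the tool’s real types:
typedef int symbolic;                              // placeholder for a symbolic handle
symbolic trace_addition (symbolic a, symbolic b);  // records "a + b" in the trace

class my_int {
public:
  my_int (symbolic s, int v) : s_ (s), v_ (v) {}
  symbolic symbol () const { return s_; }   // symbolic side, fed to the solver
  int val () const { return v_; }           // concrete side, used for execution
private:
  symbolic s_;
  int v_;
};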
Reducing branches
• This 2-branch control:
if (x && y) action1; else action2;
really produces 3 branches in C/C++:
if (x)
{
if (y) action1; else action2;
}
else action2;
• x && y is not really a logical and.
– We cannot simply hand (x && y) to an SMT solver..
Reducing branches: solution
• But.. sometimes it IS a logical and
– namely, if y can be evaluated just as safely at x==false as at any other x value
which means
– y has no side effects
and
– y’s crash conditions don’t depend on x
• If we can prove this statically, use the form:
if (logical_and(x,y)) action1; else action2;
• Else use the 3-branch form
Solver Theories
• Different solver theories:
– Linear Integer Arithmetic: (a*x + b*y + ….) {><=} C
– Linear Real Arithmetic
– BitVector Arithmetic
• Most conditions in C source code fit one of them. But some mixed/complex ones don’t
– alas, we sometimes fall back to random alternation
– luckily, theories are being developed actively
• Need to recognize theory patterns for better performance
-> Sometimes the supported scope is wider than the declared theory scope
Path exploration strategy
• Usually we explore all paths in Depth First Search mode:
– flip the deeper branch alternatives first
– when they are exhausted, go back up one level and try again
• But the number of execution paths may turn out to be far too high to explore all of them
Path exploration strategy - 2
• If we don’t have the resources to explore all paths, DFS is not the best strategy: some nodes are never visited while others are explored exhaustively
– low coverage
– most of the dumb bugs may be missed
[Diagram: execution tree with explored and unexplored regions]
• Good strategy principle: first visit new nodes, next explore new paths (one possible expression of this policy is sketched below)
– details are a subject of research
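A hedged sketch of one way such a “new nodes first” policy could be expressed; the data structures and names are illustrative, not the tool’s:
#include <set>
#include <vector>

struct PendingBranch { int location_id; /* plus the path prefix, omitted here */ };

/* Pick the next pending branch to flip: prefer branches whose program location
   has never been covered, then fall back to any pending branch (a new path). */
int pick_next (const std::vector<PendingBranch>& pending,
               const std::set<int>& covered_locations) {
  for (size_t i = 0; i < pending.size (); ++i)
    if (!covered_locations.count (pending[i].location_id))
      return (int) i;
  return pending.empty () ? -1 : 0;
}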
An optimization: get function properties
• Idea: take advantage of the code hierarchy by using I/O properties of a function/procedure call
-> try to proceed with the assumptions only, rather than descending into the subroutine body

Example: y = string_copy (x)
require valid_pointer(x)
property valid_pointer(y) /* assuming we still have memory */
property length(x) == length(y)
property i < length(x) -> y[i] == x[i]
• For black box (external library) code, assumptions must be supplied as collateral
• For available source code, they can also be extracted automatically
-> but what to extract is an open question

Original:
if (length(s) > 2) {
  p = string_copy(s);
  if (length(p) > 1) {
  } else {
    do_something();
  }}

With the property applied:
if (length(s) > 2) {
  p = ???; assume length(p) == length(s);
  if (length(p) > 1) {
  } else {
    /* length(p) <= 1 && length(p) == length(s)
       && length(s) > 2 ---- infeasible */
  }}
An optimization: Separate
independent alternations
if (z == 2) {
x=b;
do_something1();
}
if (y == x) {
do_something2();
}
Dependent choices
We should try 2*2 combinations:
•z=2, y=b
•z=2, y≠b
•z≠2, y=x
•z≠2, y≠x
(all variables are sampled at the beginning of the code piece presented)
Separate independent alternations -2
if (z == 2) {
q = b;
}
if (y == x) {
p = c;
}
Independent choices
We can try only 2 combinations,
for example:
•z=2, y=x
•z≠2, y≠x
(provided that the effects of the two branches don’t interdepend)
Separate independent alternations -3
if (z == 2) {
q = b;
}
if (y == x) {
p = c;
}
if (q == p) …
Dependent choices
again!
An optimization: re-using unsatisfied conditions
if (a && b && c) {
…
}
if (a && b && c) {
…
}
if (a && b && c
    && d && e) {
…
}
Suppose we have proved that we cannot get inside the first block.
Then we can be sure that we cannot get inside the other two either
-> no need to call the solver again
Handling Black Boxed Code
Contents
• Motivation
• Losing control with black boxes
• Return Value Representation
• Randomization
• Characterizing
• Learning
• Stubbing/Wrapping
• Example: The encryption problem
• Selective/Dynamic Black-Boxing
• Embedded White-Boxes
• Afterwords
Motivation
• Testing a portion of code within a large system, e.g.:
– Code over infrastructure/library functions
– Firmware over a hardware API/Virtual Platform
– Binary infrastructure
• Hiding code the solver can’t cope with
– Non-linear arithmetic (a*b = C)
– Assembly
• Handling deep paths/recursion
Losing Control with Black Boxes
• Black-boxes impair our controllability when program paths are influenced by black-box outputs.
int a = choose_int("a");
int b = bb(a);
if (b > 10) { … } else { … }
• We have no information to pick “a” such that it drives (b > 10) in both directions.
Return Value Representation
• The flow treats the return value of an uninstrumented function as concrete only (not symbolic).
• But it can be explicitly assigned a fresh symbolic variable with fresh_*
int a = blackboxed_func(…); // a is concrete
fresh_integer(a, "a");      // a is symbolic
• The reverse could be done as well with concrete_* (later).
Example: The Encryption Problem
ulong x = choose_uint ("x%d", count);
ulong y = choose_uint ("y%d", count);
if (y == encrypt(x))
<…>;
else
<…>;
• We pathologically can’t guess x and y beforehand such that y == encrypt(x).
Coping with it:
• Run once with y=x=0; the condition fails.
• “See”: (y == <concrete encrypt(0)>)
• Choose x=0, y = encrypt(0) for the 2nd run.
Randomization
• We can increase our chances of gaining coverage by adding randomization
int a = choose_random_int("a");
int b = bb(a);
if (b > 10) { … } else { … }
Characterizing: assert
• Our first step to gain back control is having the user tell us something about the function.
• A new construct is added: assert(<cond>)
Reminder:
– require(<cond>) : Assume <cond> holds. If it doesn’t, ignore the current path and move on. This is actually a branch equivalent to
if (!cond) exit(0);
– ensure(<cond>) : Make sure <cond> holds.
If it doesn’t, stop execution and report it.
If it does, try to make it fail.
This is actually a branch equivalent to
if (!cond) abort();
assert – 2.
• E.g.: a strictly monotonic black-boxed function.
int a = choose_int("a");
int b = bb(a);
fresh_integer(b, "b");
assert(b > a);
if (b > 10) { … } else { … }
• assert(<cond>): Assume <cond> holds. If it doesn’t, stop execution and report. But don’t try to make it fail.
• Must be used with fresh_*
• Full characterization: solves the problem, but impractical.
• Partial characterization:
– May help the solver, depending on its internal heuristics.
– The more assertions the better.
Learning
• Learn from concrete inputs and outputs of a black-boxed function, and use them in future runs.
• But what to learn?
The function is not always deterministic: it has implicit “inputs” and “outputs” / internal state.
• Instead of learning functions, we learn a “subject” in many lessons. Each lesson can have multiple inputs and outputs.
Learning - 2.
int a = choose_int("a");
lesson l = begin_lesson ();
learn_integer(l, a, LEARN_INPUT);
int b = bb(a);
fresh_integer(b, "b");
learn_integer(l, b, LEARN_OUTPUT);
end_lesson(l);
if (b > 10) { … } else { … }
Learning - 3.
• When it misses a path, it can retry several times and learn new concrete values in each try.
• Previous learning will add constraints to the solver, so previous inputs will not be reused when trying to get a different output.
Learning – 4.
• The “subject” is supposed to be common to all invocations of a black-boxed function, but is different from function to function.
• To make this easier: begin_lesson() implicitly creates a unique subject from the code location.
• Problem: the function is invoked from different places in the code. We want to write the learning code once.
• Solution: We shall later see how we can easily write wrappers to divert all calls to one place.
Stubbing / Wrapping
• The user writes a wrapper to add characterization and learning sugar to all invocations of a function:
int bb_wrap(int a)
{
  lesson l = begin_lesson ();
  learn_integer(l, a, LEARN_INPUT);
  int b = bb(a);
  fresh_integer(b, "b");
  learn_integer(l, b, LEARN_OUTPUT);
  end_lesson(l);
  assert(b > a);
  return(b);
}
Stubbing / Wrapping – 2.
• Now we want to call the wrapper bb_wrap instead of bb.
• But we don’t want to manually change all invocations in our code.
• instrument does it for us:
instrument -stub bb:bb_wrap -stub … -stub …
• Conveniently, it won’t replace calls within the stub code itself.
Selective Blackboxing
• We can selectively blackbox instrumented code:
begin_blackbox(true);
int x = 10;
end_box();
concrete_integer(x);
• We must use fresh_* or concrete_* on values that were defined/modified inside the blackbox and are visible outside of it.
• Otherwise the tool might think a value is uninitialized, or miscalculate paths.
Dynamic Blackboxing - 1.
• Sometimes we just have too many paths:
char* s = choose_valid_string("str", 100);
int count_a = 0;
for(int i=0; i<100; i++)
{
if (s[i] == 'a') {
count_a++;
}
}
• If we can’t change the string length it will see
2^100 paths…
Dynamic Black-boxing – 2.
• We can dynamically hide code:
char* s = choose_valid_string("str", 100);
int count_a = 0;
for(int i=0; i<100; i++)
{
  begin_blackbox(i>2);
  if (s[i] == 'a') {
    count_a++;
  }
  end_box();
}
Now XXX sees only 2^2 paths.
Embedded White-boxes
• What if we want to look into some code that is called from black-boxed code? E.g. a callback:
void callback(int x)
{ … } // we want to see this code
int main ()
{
int x = choose_uint("x");
bb_foo(x, callback); // bb_foo is blackboxed
return 0;
}
Embedded White-boxes - 2.
• On the “inside” we “whitebox” it explicitly, and freshen (or concretize) the inputs:
void callback(int x)
{
  static int count = 0;
  begin_whitebox(true);
  fresh_integer(x, "new_x%d", count);
  end_box();
  ++count;
} // we want to see this code
Afterwords
• We have established a “Swiss army knife” of features to support future black-box challenges.
• This helps overcome some simple synthetic examples.
• Since loss of controllability is a hard problem in general, we expect we’ll need to refine this set of features as we hit real-life test cases.