A verification method of VLSI system

Intelligent automatic test pattern
generation for C-based HW/SW co-design
descriptions through combined use of
concrete and symbolic simulations
Masahiro Fujita
Yoshihisa Kojima
University of Tokyo
May 2, 2008
Background

In high-level SoC design, system behavior can be described in C-like programming languages
  Target both hardware and software
  Tool support is not sufficient

Difficulties compared with RTL or lower design descriptions
  Many wide-bit word-level signals (large exploration space)
  Complicated control flow (many paths)
  Difficulty in modeling various descriptions
    SW: pointers, pointer-arithmetic, casting, dynamic allocation, recursive calls…
    HW: concurrency, synchronization, throughput, latency…

Our goal is to assist test case generation for system-level descriptions in C-like languages
  Automatic input pattern generation
    Assertion-based verification to find bugs
    For higher code coverage that results in higher confidence
Most important issues in debugging

Generally speaking, counterexamples generated by simulation/emulation are very “long”
  Could be billions of cycles
  Not easy at all to understand why the error occurs
  Need much shorter counterexamples just to understand why the bug happens
  Are those long sequences really necessary?

Bounded model checking is based on assertions with “constraints”
  Bounds cannot be large
  Can we derive good constraints from the counterexamples found in simulation/emulation?

[Figure: state space with an initial state and a bug state. There can be a more direct path to the bug; loops can be skipped.]
Target language

SpecC = ANSI-C + mechanisms for HW
  Structural hierarchy
  Parallelism
  Synchronization

Languages discussed here
  C language
  Some additional features

[Figure: SpecC behavior B with ports p1 and p2, channel c1 with its interfaces, variable (wire) v1, and child behaviors b1 and b2]
Outline

Background
Problem definitions for input pattern generation
Preliminaries
  branch / path / coverage definitions
  Concrete simulation, symbolic simulation
Concrete/symbolic hybrid simulation
  Hybrid simulation
  Proposed Method for branch coverage
  Implementation
Experimental Results
Conclusion and Future work
Requirements for input pattern generation (1)

For assertion failure detection
  Given a design description annotated with
    Input variable definitions
    Assumptions for input variables as predicates
    Assertion predicates
  Possible results
    Assertion violation (and input value assignments)
    Assertion holds for all possible input values
    Unknown

int func(int x, int y) {
    int r = 0;
    if (x - y > 0)
        r = x - y;
    else
        r = y - x;
    return r;
}

int x, y;
FL_INPUT(x);
FL_INPUT(y);
FL_ASSUME(x >= 0);
FL_ASSUME(y >= 0);
FL_ASSERT(func(x, y) > 0);

Assertion failure
Counterexamples exist:
(x = 0, y = 0)
(x = 3, y = 3)
...
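The assertion failure above can be reproduced without the tool by a bounded exhaustive search. The following is only a sketch: `find_counterexample` and its `bound` parameter are hypothetical names introduced here, the loop bounds stand in for the `FL_ASSUME` predicates, and brute force stands in for the solver that the slides describe.

```c
#include <stdbool.h>

/* Function under test, as on the slide. */
int func(int x, int y) {
    int r = 0;
    if (x - y > 0)
        r = x - y;
    else
        r = y - x;
    return r;
}

/* Hypothetical helper (not part of the presented tool): searches
 * 0 <= x, y < bound, which encodes the assumptions x >= 0 and y >= 0,
 * for an input that violates func(x, y) > 0.
 * Returns true and stores the first violating input found. */
bool find_counterexample(int bound, int *cx, int *cy) {
    for (int x = 0; x < bound; x++) {
        for (int y = 0; y < bound; y++) {
            if (!(func(x, y) > 0)) {
                *cx = x;
                *cy = y;
                return true;
            }
        }
    }
    return false;
}
```

Calling `find_counterexample(4, &cx, &cy)` reports (x = 0, y = 0); every input with x equal to y violates the assertion, because `func` then returns 0.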
Requirements for input pattern generation (2)

For branch coverage:
  Given design description with annotations and target branch coverage
  Generate set of test cases (input value assignments) to cover branches
  Tell how to activate as many code fragments as possible (over multiple runs)

int x, y;
FL_INPUT(x);
FL_INPUT(y);
if (x > 2) {
}
if (y > 2) {
}

Test cases of
(1) (x = 0, y = 0)
(2) (x = 3, y = 3)
will achieve 100% branch coverage
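The 100% figure can be checked with a hand-instrumented copy of the fragment. This is a minimal sketch, assuming that each side of each `if` counts as one branch; the names `covered`, `run` and `branches_covered` are made up for illustration and are not taken from the tool.

```c
#include <stdbool.h>

enum { NUM_BRANCHES = 4 };  /* true side and false side of each of the two ifs */

bool covered[NUM_BRANCHES];

/* Instrumented copy of the slide's code: every branch direction sets a flag. */
void run(int x, int y) {
    if (x > 2) covered[0] = true; else covered[1] = true;
    if (y > 2) covered[2] = true; else covered[3] = true;
}

/* Number of branch directions exercised so far, accumulated over all runs. */
int branches_covered(void) {
    int n = 0;
    for (int i = 0; i < NUM_BRANCHES; i++)
        if (covered[i])
            n++;
    return n;
}
```

After `run(0, 0)` two of the four branches are covered; adding `run(3, 3)` covers the remaining two.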
Branch / path definitions

A (pair of) conditional branch(es):
  Associated with if, do-while, for, switch-case, and while statements
  A branch is covered when the associated condition has been evaluated as true (or false) at least once (over multiple runs)

if (cond)
  then: BC = cond
  else: BC = !cond
Branch / path definitions

A path is a sequence of branches taken
A path condition is defined as the conjunction of all the branch conditions taken
A false (infeasible) path is a path such that there is no value assignment which satisfies the path condition

void func(int x, int y) {
    if (x > 2) {
    } else {
    }
    if (y > 2) {
    } else {
    }
}
There are 4 paths;
The path condition of one of them is (x > 2) AND NOT(y > 2)

void func(int x, int y) {
    if (x > 2) {
    }
    if (x < 2) {
    }
}
There appear to be 4 paths;
But the path condition of the path taking both THEN branches is
(x > 2) AND (x < 2)
INFEASIBLE!
Branch / path coverage definitions

Branch coverage
  # of branches covered out of # of all branches
Path coverage
  # of paths covered out of # of all (or feasible) paths
  Difficult to use in practice because:
    The number of feasible paths cannot be known so easily
    The number of possible paths can be huge
      Exponential w.r.t. # of if-statements * loop iterations

[Figure: two consecutive if statements]
Exercised 2 runs:
branch coverage: 4 / (2 + 2) (100%)
path coverage: 2 / (2 * 2) (50%)
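To make the 2 / (2 * 2) figure concrete, the sketch below records which of the four paths through two consecutive `if` statements each run takes. It assumes the conditions `x > 2` and `y > 2` from the earlier example; `path_seen`, `run_and_record` and `paths_covered` are illustrative names only.

```c
#include <stdbool.h>

enum { NUM_IFS = 2, NUM_PATHS = 1 << NUM_IFS };  /* 2 ifs give 2^2 = 4 paths */

bool path_seen[NUM_PATHS];

/* Executes the two consecutive ifs and records the path taken.
 * Bit 0 of the path id is the outcome of the first if, bit 1 of the second. */
void run_and_record(int x, int y) {
    int id = 0;
    if (x > 2) id |= 1;
    if (y > 2) id |= 2;
    path_seen[id] = true;
}

/* Number of distinct paths exercised so far. */
int paths_covered(void) {
    int n = 0;
    for (int i = 0; i < NUM_PATHS; i++)
        if (path_seen[i])
            n++;
    return n;
}
```

The runs (0, 0) and (3, 3) exercise every branch direction but only two of the four paths; (3, 0) and (0, 3) would be needed for full path coverage. Each additional `if` doubles `NUM_PATHS`, which is the exponential growth the slide refers to.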
Traditional (concrete) simulation approach

Create test cases (input values) by hand
  Not so easy
Or, generate randomly
  Automated, but maybe difficult to activate the corner cases
  In system level descriptions, the search space can be huge (e.g. 32-bit word level signals)
Run simulation
  Very simple, but how long does it take to hit the failure?
  Incomplete: cannot prove the assertion ALWAYS holds
    unless all possible values have been exercised (not practically possible)
  Confidence (quality of tests): given by coverage metrics
    E.g. Branch-coverage

Try (x=3, y=100) => r=97 > 0 OK
Try (x=1, y=20) => r=19 > 0 OK
...
Try (x=10, y=10) => r=0 > 0 NG! (may eventually happen, but very rarely)
Formal approach

Build the formal expressions and mathematically solve the constraints
  Precise & Complete
  Computationally expensive
Word-level approach: Symbolic simulation
  Evaluates values as symbolic expressions instead of concrete values
Symbolic Simulation

Needs to enumerate all the paths
  Sometimes the path can be infeasible (false-path problem)

int func(int x, int y) {
    int r = 0;
    if (x - y > 0)
        r = x - y;    /* Path1 */
    else
        r = y - x;    /* Path2 */
    return r;
}

Enumerates possible paths (including infeasible ones)

Path1 path-condition:
  (r_1=0)
  (x - y > 0)
  (r_2=x - y)
  (x>=0)
  (y>=0)
  -> (r_2>0)
  VALID for all x,y

Path2 path-condition:
  (r_1=0)
  NOT(x - y > 0)
  (r_2=y - x)
  (x>=0)
  (y>=0)
  -> (r_2>0)
  INVALID
  Counter Example: (y - x=0) (some of them may be reported)
Symbolic simulation (cont’d)

Employs SMT (satisfiability modulo theories) solver
  To solve path conditions
  To evaluate assertions
For each path:
  One symbolic simulation on a path corresponds to concrete simulations of all possible values on that path
Limitations:
  # of paths (including false paths)
  Size of symbolic expressions
  Solver capability (non-linear algebra)
  How to model complicated descriptions
May not be applied straightforwardly to complex / large descriptions
Concrete-symbolic hybrid approach

Combines concrete simulation and symbolic simulation (originally proposed by Larson[5])
CUTE[11] is proposed for unit testing
  Exhaustive traversal on all paths
  Concrete run guides the path for symbolic simulation (initially random simulation)
  Symbolic run on that path derives the path-condition
    Use concrete values for approximation if the constraints cannot be processed (e.g. non-linear)
  Solve the constraints to guide the path to another
    Negate some path-condition term to take another branch
Concolic Simulation (1st)

Find the inputs to reach reach_me()

void test(int x, int y, int z) {
    if (x > 3)                  // B1
        if (y > 11)             // B2
            if (z == y*y)       // B3
                if (x < 5)      // B4
                    reach_me();
}

Concrete States (initially random)
  x=0, y=0, z=0
  (0 > 3)? -> no!
Symbolic States
  x=i1, y=i2, z=i3
  (i1 > 3)?
Path Condition
  (i1 <= 3)
  Negate this condition and solve to take THEN branch at B1
Concolic Simulation (2nd)

Find the inputs to reach reach_me()

void test(int x, int y, int z) {
    if (x > 3)                  // B1
        if (y > 11)             // B2
            if (z == y*y)       // B3
                if (x < 5)      // B4
                    reach_me();
}

Concrete States
  x=10, y=0, z=0
  (10 > 3)
  (0 > 11)? -> no!
Symbolic States
  x=i1, y=i2, z=i3
  (x > 3)
  (y <= 11)
Path Condition
  (i1 > 3)
  (i2 <= 11)
  Negate this (last) condition and solve to take THEN branch at B2
Concolic Simulation (3rd)

Find the inputs to reach reach_me()

void test(int x, int y, int z) {
    if (x > 3)                  // B1
        if (y > 11)             // B2
            if (z == y*y)       // B3
                if (x < 5)      // B4
                    reach_me();
}

Concrete States
  x=10, y=20, z=0
  (10 > 3)
  (20 > 11)
  (0 == 400)? -> no!
Symbolic States
  x=i1, y=i2, z=i3
  (x > 3)
  (y > 11)
  (z == y*y)
Path Condition
  (i1 > 3)
  (i2 > 11)
  (i3 != 400)
  Non-linear i2*i2 is replaced by 400.
  Negate this (last) condition and solve to take THEN branch at B3
Concolic Simulation (4th)

Find the inputs to reach reach_me()

void test(int x, int y, int z) {
    if (x > 3)                  // B1
        if (y > 11)             // B2
            if (z == y*y)       // B3
                if (x < 5)      // B4
                    reach_me();
}

Concrete States
  x=10, y=20, z=400
  (10 > 3)
  (20 > 11)
  (400 == 400)
  (10 < 5)? -> no!
Symbolic States
  x=i1, y=i2, z=i3
  (x > 3)
  (y > 11)
  (z == 400)
  (x >= 5)
Path Condition
  (i1 > 3)
  (i2 > 11)
  (i3 == 400)
  (i1 >= 5)
  Negate this (last) condition and solve to take THEN branch at B4
Concolic Simulation (5th)

Find the inputs to reach reach_me()

void test(int x, int y, int z) {
    if (x > 3)                  // B1
        if (y > 11)             // B2
            if (z == y*y)       // B3
                if (x < 5)      // B4
                    reach_me();
}

Concrete States
  x=4, y=20, z=400
  (4 > 3)
  (20 > 11)
  (400 == 400)
  (4 < 5)
Symbolic States
  x=i1, y=i2, z=i3
  (x > 3)
  (y > 11)
  (z == 400)
  (x < 5)
Path Condition
  (i1 > 3)
  (i2 > 11)
  (i3 == 400)
  (i1 < 5)
Reached successfully!
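The five runs can be replayed with a small driver. This is a sketch under strong assumptions: `solve_flip` is a hard-coded stand-in for the constraint solver, valid only for this `test()` function, and the values it picks (10, 20, 4) are simply the ones shown on the slides. `run_concrete`, `solve_flip`, `concolic_search` and `max_runs` are names introduced here for illustration.

```c
#include <stdbool.h>

bool reached = false;

void reach_me(void) { reached = true; }

/* Concrete run of test(): returns the number of branch conditions that held
 * before one failed; 4 means all held and reach_me() was called. */
int run_concrete(int x, int y, int z) {
    if (!(x > 3)) return 0;       /* B1 */
    if (!(y > 11)) return 1;      /* B2 */
    if (!(z == y * y)) return 2;  /* B3 */
    if (!(x < 5)) return 3;       /* B4 */
    reach_me();
    return 4;
}

/* Hard-coded stand-in for the solver (assumption: a real tool would hand the
 * negated path condition to an SMT solver). It flips the failed condition
 * while keeping the earlier branch outcomes unchanged. */
void solve_flip(int failed, int *x, int *y, int *z) {
    switch (failed) {
    case 0: *x = 10; break;           /* satisfies i1 > 3 */
    case 1: *y = 20; break;           /* satisfies i2 > 11 */
    case 2: *z = (*y) * (*y); break;  /* i2*i2 replaced by its concrete value */
    case 3: *x = 4; break;            /* satisfies i1 > 3 and i1 < 5 */
    default: break;
    }
}

/* Alternates concrete runs and flips until reach_me() is hit.
 * Returns the number of runs used, or -1 if max_runs is exhausted. */
int concolic_search(int *x, int *y, int *z, int max_runs) {
    for (int run = 1; run <= max_runs; run++) {
        int failed = run_concrete(*x, *y, *z);
        if (failed == 4)
            return run;
        solve_flip(failed, x, y, z);
    }
    return -1;
}
```

Starting from (0, 0, 0), the driver visits (10, 0, 0), (10, 20, 0), (10, 20, 400) and finally (4, 20, 400), reaching `reach_me()` on the fifth run as in the walkthrough.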
Concolic approach

Can be applied to work around non-linear constraints
Can be used to enumerate the paths
  Good for path coverage
Can be used to guide the path
But CUTE does not think about which path should be tried next
  As CUTE’s strategy is exhaustive
  May not terminate if # of paths is huge
Proposed method

Flip a branch condition on a path only when not covered yet
  Gives the priority for path enumeration
    Skips the uncovered paths that do not contribute to the branch coverage
Terminates when the target coverage is achieved
  Tries to avoid enumerating all the paths
Not guaranteed to cover all possible branches
  Derived alternative paths may not be feasible
  Worst case: all paths need to be enumerated
  Also limited by the solver’s capability (i.e. path condition may not be solved)
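The selection rule in the first bullet can be sketched as follows. The slides do not say in which order candidate branches on a path are examined, so scanning from the last step backwards is an assumption made here, as are the names `struct step`, `mark_covered` and `pick_flip` and the `covered[branch][direction]` table layout.

```c
#include <stdbool.h>

/* One step of an executed path: which branch was reached and which way it went. */
struct step {
    int branch;   /* branch id */
    bool taken;   /* true if the condition evaluated to true */
};

/* Marks every (branch, direction) pair on the executed path as covered.
 * covered[b][1] is the true direction of branch b, covered[b][0] the false one. */
void mark_covered(const struct step *path, int len, bool covered[][2]) {
    for (int i = 0; i < len; i++)
        covered[path[i].branch][path[i].taken ? 1 : 0] = true;
}

/* Returns the index of the step whose condition should be negated next:
 * the latest step whose opposite direction has not been covered yet.
 * Returns -1 when flipping any step would only revisit covered branches,
 * so this path offers nothing new for branch coverage. */
int pick_flip(const struct step *path, int len, bool covered[][2]) {
    for (int i = len - 1; i >= 0; i--) {
        int other = path[i].taken ? 0 : 1;
        if (!covered[path[i].branch][other])
            return i;
    }
    return -1;
}
```

An exhaustive strategy would negate every step of every path in turn; here a step is skipped as soon as its opposite direction is already covered, which is what lets the search stop once the target coverage is met.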
Our implementation

Implemented on FLEC (our C-Equivalence Checker)
  Used as SpecC[3] frontend
  Control/data/communication/… dependencies have been extracted
AST interpreter
  Evaluates AST node (expression / statement) one by one
    C.f. CUTE: instrument & compile
    We can start from any point in the program!
  Concrete simulator evaluates with concrete values
  Symbolic simulator evaluates with symbolic expressions
  Branch/Path coverage profiler
  Input pattern generator
    For alternative path
    For assertion failure
SMT solver: CVC3[12]
  To generate input patterns
  To evaluate assertions
  C.f. CUTE: lpsolve
Experimental results (1/3)

int func(int x, int y) {
    int r = 0;
    if (x - y > 0)
        r = x - y;
    else
        r = y - x;
    return r;
}
void main() {
    int x, y;
    FL_INPUT(x);
    FL_INPUT(y);
    FL_ASSUME(x >= 0);
    FL_ASSUME(y >= 0);
    FL_ASSERT(func(x, y) > 0);
}

Simple example
  Achieved 2 / 2 (100%) branch coverage with 2 runs
  Detected assertion failure with (x=0, y=0)
Experimental results (2/3)

unsigned int fact_rec(unsigned int s) {
    if (s <= 1) {
        return 1;
    } else {
        unsigned int t;
        unsigned int p;
        t = s * fact_rec(s - 1);
        return t;
    }
}
unsigned int fact_for(unsigned int s) {
    unsigned int i;
    unsigned int p;
    p = 1;
    for (i = 1; i <= s; i++) {
        p *= i;
    }
    return p;
}
void main() {
    int i, o1, o2;
    FL_INPUT(i);
    FL_ASSUME(i <= 10);
    o1 = fact_for(i);
    o2 = fact_rec(i);
    FL_ASSERT(o1 == o2);
}

Calculate factorial with two implementations
  With recursive function calls
  With for-loop
Validated for one path (i = 8)
Achieved 4/4 (100%) branch coverage with 1 run
Experimental results (3/3)

int f(int x, int y, int z) {
    int p;
    if (x + y + z == 6)
        if (2*x + 7*y + 3*z == 25)
            if (-4*x - 2*y + 2*z == -2)
                FL_ASSERT(0);
    for (p = 0; p < 100; p++) {
        if (p == z) {
        }
    }
}
void main() {
    int x, y, z;
    FL_INPUT(x);
    FL_INPUT(y);
    FL_INPUT(z);
    f(x, y, z);
}

# of branches: 10
# of paths: 4 * 2^100
Achieved 10 / 10 (100%) branch coverage with 5 runs
Detected assertion failure with (x=1, y=2, z=3)
CUTE got stuck due to too many paths
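The reported input can be checked independently of the tool. The sketch below evaluates the three nested conditions and counts, by brute force over a small box, how many inputs reach `FL_ASSERT(0)`; `reaches_assert`, `count_hits` and the `range` parameter are hypothetical names, and the exhaustive loop is only a cross-check, not how the solver finds the point.

```c
#include <stdbool.h>

/* The three nested conditions guarding FL_ASSERT(0) on the slide. */
bool reaches_assert(int x, int y, int z) {
    return x + y + z == 6
        && 2 * x + 7 * y + 3 * z == 25
        && -4 * x - 2 * y + 2 * z == -2;
}

/* Counts the inputs in [-range, range]^3 that reach the assertion. */
int count_hits(int range) {
    int hits = 0;
    for (int x = -range; x <= range; x++)
        for (int y = -range; y <= range; y++)
            for (int z = -range; z <= range; z++)
                if (reaches_assert(x, y, z))
                    hits++;
    return hits;
}
```

With `range` set to 10 exactly one of the 9261 candidate inputs, (1, 2, 3), reaches the assertion. The three conditions form a linear system with a single solution, so random simulation over 32-bit inputs is very unlikely to hit it, while a solver for linear constraints finds it directly.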
Elevator controller profile

Elevator controller (abstracted model)
  Cycle-based behavior
  Simple, but designed by real engineer
    There is an unintended bug
Inputs:
  3 Floors
    Up request buttons on 1F and 2F
    Down request buttons on 2F and 3F
  1 Cabin
    3 buttons for floor stop request
    2 buttons for door open / close
Outputs:
  Up, Down request status
  Floor stop request status
  Door open/close
  Cabin vertical speed (0: stopped, +1: up, -1: down)
  Cabin position (on 1F, b/w 1F and 2F, on 2F, b/w 2F and 3F, on 3F)
  Service direction (0: none, +1: up, -1: down)

[Figure: elevator with floors 1F, 2F, 3F and door open / close buttons]
Elevator controller profile (cont’d)

State variables:
  Up/Down request status (2+2)
  Floor stop request status (3)
  Door status (1)
  Cabin position (on 1F, b/w 1F and 2F, on 2F, b/w 2F and 3F, on 3F)
  Cabin speed (0: stopped, +1: up, -1: down)
  Service direction (0: none, +1: up, -1: down)
  2^8 * 5 * 3 * 3 = 11.5k states (including infeasible ones)
  Initially stopped on 1F, door closed, no request active
Original code: 396 lines in SpecC
  145 million paths (including infeasible)
Replaced if-then-else & switch-case statements with conditional (cond ? true : false) expressions
  To handle multiple paths at once
  Simple control flow (straight line), but very complex data flow
  Reduced to 155 lines
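The rewrite from branching statements to conditional expressions can be illustrated on a tiny fragment. The fragment below is hypothetical and not taken from the elevator model; `next_speed_if`, `next_speed_ite` and `stop_requested` are names invented for this sketch.

```c
#include <stdbool.h>

/* Branching form: two control-flow paths through the function. */
int next_speed_if(int speed, bool stop_requested) {
    int next;
    if (stop_requested)
        next = 0;
    else
        next = speed;
    return next;
}

/* Conditional-expression form: a single straight-line path. The choice moves
 * from control flow into the data flow, so a symbolic run yields one
 * if-then-else (ITE) expression covering both cases. */
int next_speed_ite(int speed, bool stop_requested) {
    return stop_requested ? 0 : speed;
}
```

Both functions compute the same value for every input. The second form trades path count for expression size, which matches the slide's remark that the control flow becomes simple while the data flow becomes very complex.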
Elevator controller profile (cont’d)

Property examples
  Elevator must be on or between 1F and 3F
    ASSERT((out_position >= 0) && (out_position <= 4));
  Door opens only when the elevator is stopped on one of 1F, 2F and 3F
    ASSERT(!out_door || ((out_speed == 0) && ((out_position == 0) || (out_position == 2) || (out_position == 4))))
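The two properties can be written as plain C predicates to see which output combinations they reject. This is a sketch assuming the position encoding implied by the slides (0: on 1F, 1: between 1F and 2F, 2: on 2F, 3: between 2F and 3F, 4: on 3F); the function names are illustrative.

```c
#include <stdbool.h>

/* Property 1: the cabin is on or between 1F and 3F. */
bool position_in_range(int out_position) {
    return (out_position >= 0) && (out_position <= 4);
}

/* Property 2: the door is open only while the cabin is stopped exactly
 * on a floor (position 0, 2 or 4). */
bool door_safe(bool out_door, int out_speed, int out_position) {
    return !out_door ||
           ((out_speed == 0) &&
            ((out_position == 0) || (out_position == 2) || (out_position == 4)));
}
```

For example, `door_safe(true, 0, 1)` is false because the cabin is between floors, and `position_in_range(5)` is false because position 5 would be above 3F.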
Symbolic simulation result

Symbolic expression explodes in 3-4 cycles of symbolic simulation
  With constant propagation/substitution
  With simplifications for ITE, AND, OR, and other operators
  Without concrete-value substitution (approximation)
  Without common sub-expression sharing
# of cycles of symbolic simulation must be highly bounded!

[Figure: number of expression nodes (log scale, 1.E+00 to 1.E+06) versus cycle (1 to 5), for a typical signal and for all signals. Symbolic simulation begins after the reset sequence and grows to 300k nodes and more.]
User guided simulation

Starts symbolic simulation from the state specified by the user
  Explore with respect to the states of user’s interest
    Some of the states (proved to be) reachable by concrete (random) simulation
    Jump into the states (which may or may not be feasible)
      Will need to check its feasibility later

[Figure: state space. Concrete simulation leads from the initial states to a state from which cycle-bounded symbolic simulation starts. Symbolic simulation started from a state whose paths from the initial states are unknown might be infeasible.]
User guided result (1)

Try to generate the input pattern to make a situation where
  Located on 2F
  Speed = -1 (down)
  I.e. to violate ASSERT(!((out_speed == -1) && (out_position == 2))) (not a bug)
This state is out of bound from the initial state (stopped on 1F)
  Need more than 3 cycles for elevator to accept request on 1F, start moving, go up at least to 2F, and go down…
User guided result (1) (cont’d)

So let’s jump into one of the feasible states
  state_position = 4, state_door = false, state_speed = 0
  …
  Known a priori as a reachable state by random simulation
Found one of the input patterns to violate the assertion @ cycle 5 (3rd cycle of symbolic sim.)
  Up request on 1F @ cycle 1 = true
  Up request on 2F @ cycle 1 = false
  Down request on 2F @ cycle 1 = false
  Stop on 1F request @ cycle 1 = false
  Stop on 2F request @ cycle 1 = false
User guided result (2)

Try to violate the assertion
  Elevator must be on or between 1F and 3F
    ASSERT((out_position >= 0) && (out_position <= 4));
Let’s jump into one of the states
  state_position = 4 (on 3F)
  state_speed = +1 (up)
  next state goes into out_position = 5 (higher than 3F!)
  And violates the assertion!
However, the state (state_position = 4, state_speed = +1) is actually infeasible
  A wrong assumption may lead to a wrong conclusion
  The feasibility of the originating state should be verified in some way
Conclusion & Future work

Conclusion
  Implemented concrete/symbolic hybrid simulator based on AST interpreter
  Proposed a method for input pattern generation for branch coverage
  Experimental results demonstrate the input pattern generation
    For assertion failure detection
    For better branch coverage
Future work
  Capability to cover the specified target branch
  Handling of concurrent executions
  Hybrid simulation heuristic tuning
  Efficient management of symbolic expressions
References

[3] D. D. Gajski, J. Zhu, R. Domer, A. Gerstlauer, and S. Zhao. SpecC: Specification Language and Methodology. Kluwer Academic Publishers, 2000.
[5] E. Larson and T. Austin. High coverage detection of input-related security faults. In SSYM’03: Proc. of 12th Conf. on USENIX Security Symposium, 2003.
[11] K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for C. In Proc. of ESEC/SIGSOFT FSE-13, 2005.
[12] A. Stump, C. Barrett, and D. Dill. CVC: a cooperating validity checker. In 14th Int’l Conf. on Computer-Aided Verification, 2002.
Difficulty compared with RTL or lower

In traditional methodology for RTL or gate-level
  Word signals are converted into bit-vector
  Then, solved with Boolean algebra
  Efficient algorithms available: SAT, BDDs…
In system-level descriptions
  Too many word signals, too wide words (32 bit / 64 bit)
    Too wide space to explore
  Complicated control-flow
    Data-flow dynamically changes depending on the path
    Control-conditions are complex
    Too many paths
Difficulty compared with RTL or lower (cont’d)

In system-level descriptions
  To model software
    Recursive calls, pointers, pointer-arithmetic, typecasting, dynamic-allocations…
  To model hardware
    Concurrency, synchronization, throughput, latency…
As word-level solvers, SMT solvers can be employed, but with limited capability
  Usually up to linear algebra
  Need approximation / workaround, otherwise it would not work!