Eli Singerman
Design and Technology Solutions
Intel Corp.
FMCAD’2007, Austin
•
– Michael Mishaeli, Elad Elster, Ronen Kofman, Tamarah Arons, Andreas Tiemeyer, Shlomit Ozer, Jonathan Shalev, Pavel Mikhlin, Terry Murphi, Sela Mador-Haim (past)
•
– Lenore Zuck, Amir Pnueli, Moshe Vardi
E. Singerman FMCAD Talk November 2007
• Motivation
• Embedded Software Intro
– Characteristics
– Verification Landscape
• Application of Formal Methods
– Modeling
– Reasoning
– Verification Flows
• The way ahead …
• Related work
•
– E.g., virtualization, security …
– We expect this to grow as CPUs gradually move to SoC design paradigm
• Verification is on the critical path
– Limits the introduction of new features
– Dominant in Time-To-Market
• Embedded software (mostly microcode) is “responsible” for a significant portion of bugs
– Implementation is challenging (and error prone)
• Verification methodologies/tools/technologies are challenged
• In this talk, I will overview some of the directions we have pursued in addressing this growing gap (extending what we reported at CAV’05)
• Motivation
• Embedded Software Intro
– Characteristics
– Verification Landscape
• Application of Formal Methods
– Modeling
– Reasoning
– Verification Flows (it is not all FV … )
• The way ahead …
• Related work
•
•
– Denoting state variables such as registers, memory, microarchitectural control and configuration bits
•
– Where the labels appear in the program or indicate a call to an external procedure
– Sometimes, label + offset is used
– It is possible to have indirect jumps (target is known only at run time)
– Have loops (non-recursive, almost all with fixed bound)
• Atomic statements of microprograms are called microinstructions
– These are implemented via dedicated hardware executed in various units of the CPU (e.g., ALU)
– Can be thought of as functions that (typically) get as input two register arguments, perform some computation and assign the final result to a third register
• Possibly raising various exceptions
• A simple microprogram (not actual … )
BEGIN FLOW(example) {
    reg1 := memory_read (reg2, reg3);
    reg4 := add (reg1, reg5);
}
– Registers are bitvectors of width 64
– Memory is an array[32][64]
• That is, an address space of 32 bits is mapped to entries of 64 bits
– memory_read and add are microinstructions
• In addition to invoking microinstructions, we have interaction with the hardware by reading/setting various shared machine-state variables
– E.g., memory (both persistent and volatile), special control bits for signaling microarchitectural events, etc.
• The latter is used for governing the microarchitectural state and is mostly modeled as side-effects (not visible in the source code)
•
– Except for spin-loops waiting for HW event to occur in order to resume execution
•
– Normal: Nothing bad happened
– Exception: Some fault occurred, e.g., arithmetic (underflow … ), memory (page not found … )
• Simulation based
• Try to cover all possible execution paths in each microprogram (at least once …)
[Figure: simulation-based verification flow. Boxes: Source, Lint Checker, Path Manager, Paths DB, Test Generator, Tests, Simulator, HW Env model]
• Path extraction is manual -- annotation based
– Very difficult to write and get correct
– Missing real paths
– Generating un-real paths that can never be covered
– Significant maintenance burden
• The test-simulate-cover loop is too long
• Verification is control oriented
– In critical microprograms data should be taken into account
• Native Embedded SW dialects are extremely complex, with many implicit side-effects, and intricate semantics
• We introduce an intermediate format, which we call IRL (Intermediate Representation Language)
• IRL is a simple programming language
– Basic Data Types are bits and bit-vectors (with a rich set of operations)
– Basic statements are conditional assignments and GOTOs.
• IRL is
– Expressive enough to describe fully and explicitly the behavior of microprograms and their (implicit) interaction with their hardware environment at the “right” abstraction level
– Yet, its sequential semantics is simple enough to enable formal reasoning
• IRL uses a Template Mechanism
– Each microinstruction is implemented by an IRL Template
– Template body is a sequence of (plain) IRL statements that compute the effect of the microinstruction
• Including side effects of a microinstruction computation, by updating the relevant auxiliary variables
• When compiling a microprogram, templates are instantiated to generate IRL code
• This enables
– Compositional build
– Write once, use many times
• In addition, exceptions/faulty behaviors are modeled as executions at the end of which various variables are made observable
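Template instantiation can be pictured as plain parameter substitution. The sketch below is a Python illustration only, not the actual IRL compiler: the `TEMPLATES` table, the `instantiate` and `compile_flow` helpers, and the string-based statement format are all hypothetical.

```python
import re

# Hypothetical template table: name -> (formal parameters, body statements).
# The bodies imitate IRL-style statements but are just strings here.
TEMPLATES = {
    "add": (
        ["result", "src1", "src2"],
        ["result := src1 + src2;", "zeroFlag := (result = 0);"],
    ),
}


def instantiate(name, actuals):
    """Expand one microinstruction call into plain statements."""
    formals, body = TEMPLATES[name]
    if len(formals) != len(actuals):
        raise ValueError(f"{name} expects {len(formals)} arguments")
    binding = dict(zip(formals, actuals))
    # Replace whole identifiers only, in a single pass, so substituted
    # names are never substituted again.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, formals)) + r")\b")
    return [pattern.sub(lambda m: binding[m.group(1)], stmt) for stmt in body]


def compile_flow(calls):
    """Compile a list of (template name, actual arguments) into flat statements."""
    out = []
    for name, actuals in calls:
        out.extend(instantiate(name, actuals))
    return out
```

Calling `instantiate("add", ["reg4", "reg1", "reg5"])` expands the template once; reusing the same template with other registers is what "write once, use many times" amounts to in this toy setting.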
• A simple microprogram (not actual uCode)
BEGIN FLOW(example) {
    reg1 := memory_read (reg2, reg3);
    reg4 := add (reg1, reg5);
}
– Registers are bitvectors of width 64
– Memory is an array[32][64]
• That is, an address space of 32 bits is mapped to entries of 64 bits
– memory_read and add are microinstructions
template add (reg result, reg src1, reg src2) {
    result := src1 + src2;
    zeroFlag := (result = 0);
}
– Note that a side effect – setting zeroFlag – is explicit (for simplicity, we ignore the possibility of add overflow)
• memory_read is more involved
• It includes a possible exception. The address is calculated as tmp_address + offset.
• If this is out of the 32-bit memory address range, then an address overflow exception is signaled with the relevant variables
exception address_overflow (bit[32] address);
template memory_read (reg result, reg tmp_address, reg offset) {
    TMP0 := tmp_address + offset;
    if (TMP0 > 0xFFFFFFFF) exit address_overflow (TMP0[63:32]);
    result := memory[TMP0[31:0]];
    zeroFlag := (result = 0);
    Found_valid_address := 1;
}
– Slide callouts on this code: “observable”, “side effect”
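To see what the two templates compute on concrete values, here is a small executable model in Python. It is a sketch under stated assumptions, not IRL semantics as implemented at Intel: the `State` and `AddressOverflow` classes and `run_example` are hypothetical names, memory is a sparse dictionary whose unwritten entries read as 0, and 64-bit addition wraps around.

```python
MASK64 = (1 << 64) - 1


class AddressOverflow(Exception):
    """Models 'exit address_overflow'; carries the upper address bits."""

    def __init__(self, upper_bits):
        super().__init__(f"address_overflow({upper_bits:#x})")
        self.upper_bits = upper_bits


class State:
    """Hypothetical machine state: registers, sparse memory, side-effect bits."""

    def __init__(self, regs=None, memory=None):
        self.regs = dict(regs or {})
        self.memory = dict(memory or {})  # address -> entry; missing reads as 0 (assumption)
        self.zero_flag = 0
        self.found_valid_address = 0


def add(state, result, src1, src2):
    # Wrap to 64 bits; add overflow is ignored, as on the slide.
    value = (state.regs[src1] + state.regs[src2]) & MASK64
    state.regs[result] = value
    state.zero_flag = int(value == 0)


def memory_read(state, result, tmp_address, offset):
    tmp0 = (state.regs[tmp_address] + state.regs[offset]) & MASK64
    if tmp0 > 0xFFFFFFFF:
        raise AddressOverflow(tmp0 >> 32)            # TMP0[63:32]
    value = state.memory.get(tmp0 & 0xFFFFFFFF, 0)   # memory[TMP0[31:0]]
    state.regs[result] = value
    state.zero_flag = int(value == 0)
    state.found_valid_address = 1


def run_example(state):
    """The 'example' flow: reg1 := memory_read(reg2, reg3); reg4 := add(reg1, reg5)."""
    memory_read(state, "reg1", "reg2", "reg3")
    add(state, "reg4", "reg1", "reg5")
    return state
```

Both kinds of exit are visible here: a normal return leaves the updated state, while the exceptional exit carries the value the template passes to `address_overflow`.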
•
– States defined by means of state variables
– Transitions defined by means of logical constraints between pre and post values
•
– Each exit has its own observable expressions
• Reasoning is done through an IRL symbolic simulator
• All inputs are assigned initial symbolic values
– Memory interaction is modeled via uninterpreted functions using a stack mechanism
• Constraints computed by the simulator are propositional formulas involving bit-vector expressions over initial values
– We compute necessary and sufficient conditions to traverse any path the program can execute
– For each path, compute the final state mapped to selected observables
• For evaluation, the conditions are submitted to a propositional SAT solver
– We are very encouraged by initial results using academic bitvector solvers, more on this later …
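The slides do not spell out the stack mechanism for memory. One common reading, assumed here, is that the initial memory is an uninterpreted function and every write is pushed onto a stack; a read then becomes a chain of if-then-else terms that tests the most recent write first. The Python sketch below illustrates that reading only; `SymbolicMemory`, `evaluate`, and the tuple-based expression format are hypothetical.

```python
class SymbolicMemory:
    """Memory as an uninterpreted function plus a stack of pending writes."""

    def __init__(self, base_name="mem0"):
        self.base_name = base_name
        self.writes = []  # (address expr, value expr), oldest first

    def write(self, addr, value):
        self.writes.append((addr, value))

    def read(self, addr):
        # Start from the uninterpreted initial memory, then wrap one
        # if-then-else per write so the latest write ends up outermost.
        expr = ("apply", self.base_name, addr)
        for w_addr, w_value in self.writes:
            expr = ("ite", ("eq", addr, w_addr), w_value, expr)
        return expr


def evaluate(expr, env, mem0):
    """Evaluate an expression on concrete values (used to sanity-check the model)."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]          # a symbolic variable
    op = expr[0]
    if op == "apply":
        return mem0(evaluate(expr[2], env, mem0))
    if op == "eq":
        return int(evaluate(expr[1], env, mem0) == evaluate(expr[2], env, mem0))
    if op == "ite":
        branch = expr[2] if evaluate(expr[1], env, mem0) else expr[3]
        return evaluate(branch, env, mem0)
    raise ValueError(f"unknown operator {op!r}")
```

The read expression never needs to know whether two symbolic addresses alias; that question is deferred to the solver, which is the point of keeping memory uninterpreted.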
[Figure: MicroFormal block diagram. Labels: Embedded SW, IRL, Microinstruction, User Properties, Compiler, Symbolic Simulator, Verification Conditions generator, SAT Solver, Debug]
• Simulating industrial strength embedded SW requires resolving some difficulties
– First, expressions get REALLY BIG, e.g., a microprogram can consist of thousands of execution paths, several of which have a sequential length of ~10^4
– In addition, have to handle indirect jump statements
– Lastly, have to account for loops …
– Have to handle these on-the-fly due to first issue
• Avoiding expression blow-up by dynamic
– Pruning of infeasible execution paths (evaluate path condition on-the-fly)
– Merging of simulation paths at strategic control locations (both automatically detected and user provided)
– Resolution (at least reduction) of indirect jump targets by a combination of static expression analysis and SAT
– Caching and grouping of conditions
• Speed up current simulation using control info computed at previous simulations
BEGIN(toy_program)
start:
I1: if (CPL > 0) fault;
I2: if (EAX > 7) then EBX := 8 else EBX := EAX - 2;
I3: if (EBX < 5) goto skip_mask;
I4: EAX := EAX & 0x000F;
skip_mask:
I5: if (EAX < EBX) fault;
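Pruning can be tried out on toy_program directly. The Python sketch below writes the condition of each of its five control paths (named P0 to P4 here) as a predicate over the initial values and keeps only the paths that some initial state satisfies. Exhaustive search stands in for the SAT solver, and several things are assumptions made to keep it cheap: registers are 8-bit unsigned with wrap-around, CPL ranges over 0..3, and all comparisons are unsigned.

```python
from itertools import product

WIDTH = 8                  # assumption: 8-bit registers keep brute force cheap
MASK = (1 << WIDTH) - 1


def ebx1(eax0):
    """Value of EBX after I2, as a function of the initial EAX."""
    return 8 if eax0 > 7 else (eax0 - 2) & MASK


def eax1(eax0):
    """Value of EAX after the mask at I4."""
    return eax0 & 0x000F


# Path name -> (exit kind, path condition over the initial values CPL0, EAX0).
PATHS = {
    "P0": ("fault", lambda cpl0, eax0: cpl0 > 0),
    "P1": ("fault", lambda cpl0, eax0: cpl0 == 0 and ebx1(eax0) < 5 and eax0 < ebx1(eax0)),
    "P2": ("end", lambda cpl0, eax0: cpl0 == 0 and ebx1(eax0) < 5 and eax0 >= ebx1(eax0)),
    "P3": ("fault", lambda cpl0, eax0: cpl0 == 0 and ebx1(eax0) >= 5 and eax1(eax0) < ebx1(eax0)),
    "P4": ("end", lambda cpl0, eax0: cpl0 == 0 and ebx1(eax0) >= 5 and eax1(eax0) >= ebx1(eax0)),
}


def initial_states():
    """All (CPL0, EAX0) pairs under the assumed ranges."""
    return product(range(4), range(MASK + 1))


def feasible_paths():
    """Keep only the paths whose condition has at least one satisfying initial state."""
    return {
        name
        for name, (_, cond) in PATHS.items()
        if any(cond(cpl0, eax0) for cpl0, eax0 in initial_states())
    }
```

Under these assumptions `feasible_paths()` drops P1: taking the jump at I3 requires EBX < 5, which forces EBX = EAX - 2 without wrap-around, so EAX < EBX cannot hold at I5.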
[Figure: symbolic execution tree of toy_program from start through I1 to I5, with callouts “Remove infeasible …” and “Merge …”]
– Symbolic values: EBX_1 := (EAX_0 > 7) ? 8 : EAX_0 - 2 (at I2), EAX_1 := EAX_0 & 0x000F (at I4)
Path Conditions:
P0: (CPL_0 > 0) → fault
P1: (CPL_0 = 0) & (EBX_1 < 5) & (EAX_0 < EBX_1) → fault
P2: (CPL_0 = 0) & (EBX_1 < 5) & (EAX_0 ≥ EBX_1) → end
P3: (CPL_0 = 0) & (EBX_1 ≥ 5) & (EAX_1 < EBX_1) → fault
P4: (CPL_0 = 0) & (EBX_1 ≥ 5) & (EAX_1 ≥ EBX_1) → end
– After merging at skip_mask, the end condition becomes ((EBX_1 < 5) ? EAX_0 : EAX_1) ≥ EBX_1
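Merging at skip_mask means I5 is simulated once on an if-then-else value of EAX instead of once per incoming path. A quick way to convince oneself that nothing is lost is to compare the merged and the path-by-path computation on every initial state. The Python sketch below does that under assumptions of its own (8-bit unsigned registers with wrap-around, CPL in 0..3); the function names are hypothetical.

```python
WIDTH = 8                  # assumption: small registers so all states can be enumerated
MASK = (1 << WIDTH) - 1


def ebx1(eax0):
    """EBX after I2."""
    return 8 if eax0 > 7 else (eax0 - 2) & MASK


def merged_eax(eax0):
    """EAX at skip_mask after merging: one if-then-else instead of two paths."""
    return eax0 if ebx1(eax0) < 5 else eax0 & 0x000F


def merged_exit(cpl0, eax0):
    """Exit kind computed on the merged state (I5 is simulated once)."""
    if cpl0 > 0:
        return "fault"
    return "fault" if merged_eax(eax0) < ebx1(eax0) else "end"


def unmerged_exit(cpl0, eax0):
    """Exit kind computed path by path (I5 is simulated on both branches of I3)."""
    if cpl0 > 0:
        return "fault"
    if ebx1(eax0) < 5:
        return "fault" if eax0 < ebx1(eax0) else "end"
    return "fault" if (eax0 & 0x000F) < ebx1(eax0) else "end"


def merge_is_sound():
    """True if both computations agree on every initial state."""
    return all(
        merged_exit(cpl0, eax0) == unmerged_exit(cpl0, eax0)
        for cpl0 in range(4)
        for eax0 in range(MASK + 1)
    )
```

The trade-off is the usual one: merging reduces the number of paths carried forward, at the price of larger if-then-else expressions inside each remaining condition.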
[Figure: the simulation-based verification flow (Source, Lint Checker, Path Manager, Paths DB, Test Generator, Tests, Simulator, HW Env model), annotated at two points]
– Compute paths automatically w/o manual annotations
– Guide a direct test generator
• Using the path conditions, tests can be created to exercise paths in simulation
• State is mapped between the microarchitectural representation and the architectural means of setting it
• The program simulated is only a small piece of the whole environment simulated
– Other structures are needed to bootstrap test, handle faulting conditions, and reach the program under test
– For this reason, it is not possible to simply “jam” the values (example: memory layout)
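The first bullet can be illustrated on toy_program from earlier in the talk: pick a target path, find an initial state that drives execution down it, and use that state as the test. In the Python sketch below exhaustive search stands in for the satisfying assignment a SAT solver would return for the path condition. The 8-bit unsigned registers, the CPL range 0..3, and the helper names are assumptions, and the sketch ignores the harder part named above, namely reaching that state through architectural means.

```python
WIDTH = 8                  # assumption: small registers so search is exhaustive
MASK = (1 << WIDTH) - 1


def run_toy(cpl, eax):
    """Concrete interpreter for toy_program; returns (visited labels, exit kind)."""
    trace = ["I1"]
    if cpl > 0:
        return trace, "fault"
    trace.append("I2")
    ebx = 8 if eax > 7 else (eax - 2) & MASK
    trace.append("I3")
    if not ebx < 5:
        trace.append("I4")
        eax = eax & 0x000F
    trace.append("I5")
    return trace, ("fault" if eax < ebx else "end")


def find_test(target_trace, target_exit):
    """Return the first (cpl, eax) that follows the target path, or None if there is none."""
    for cpl in range(4):
        for eax in range(MASK + 1):
            if run_toy(cpl, eax) == (target_trace, target_exit):
                return cpl, eax
    return None
```

A `None` result marks a path that no test can ever cover, which is exactly the kind of "un-real" path that manual annotation tends to produce.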
• Goal: verify intended (partial) behavior of microprograms
• Specifying properties
– Essentially, state predicates expressing uArch/Arch properties
– User can specify “what” and “where” to check
– These are natural – based on control flow of the program, relating to significant program locations
– Examples:
“If EAX[3]=true at start then XYZ = true at end”
“If ECX contains an initial value of a given set, then a General Protection Fault will occur”
…
• Basic directives
– Assume a predicate at a specific location
– Assert (verify) a predicate at a specific location (or at a set of locations)
– Constrain program simulation paths during execution
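The assume/assert pair can be sketched on toy_program: assume a predicate on the initial state, assert a predicate on the final state, and search for a counterexample. The Python below is an illustration only. It checks by exhaustive enumeration rather than by symbolic simulation and SAT, it assumes 8-bit unsigned registers and CPL in 0..3, and `run_toy` and `check` are hypothetical names, not the tool's directives.

```python
WIDTH = 8                  # assumption: small registers so all states can be enumerated
MASK = (1 << WIDTH) - 1


def run_toy(cpl, eax):
    """Concrete toy_program; returns (exit kind, final EAX, final EBX)."""
    if cpl > 0:
        return "fault", eax, None      # EBX is never assigned on this path
    ebx = 8 if eax > 7 else (eax - 2) & MASK
    if not ebx < 5:
        eax &= 0x000F
    return ("fault" if eax < ebx else "end"), eax, ebx


def check(assume, assertion):
    """Return an initial state that satisfies `assume` but violates `assertion`, or None."""
    for cpl in range(4):
        for eax in range(MASK + 1):
            if not assume(cpl, eax):
                continue
            if not assertion(*run_toy(cpl, eax)):
                return cpl, eax        # counterexample
    return None                        # property holds on every assumed state
```

For example, `check(lambda cpl, eax: cpl == 0 and 2 <= eax <= 6, lambda exit_kind, eax, ebx: exit_kind == "end")` asks whether those initial values always reach the normal exit; a returned state plays the role of the debug trace in a real flow.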
• Goals:
– Ensure “backward compatibility” w/ IA-32
– Verify that optimizations do not break functionality
• Given two microprograms “new”, “legacy”
• A set of (global) constraints
– Mapping predicates -- relating the two different CPU micro-architectures
– Predicates specifying “new” features are disabled
• “new” is backward compatible with “legacy” if both exhibit the same observable behavior (under constraints):
– For every initial state, both exit in the same manner
• Either both reach normal exit or both have the same “fault”
– Both produce the same values on relevant observables
– Both write the same values into the same locations of external memory, in the same order
• Compatible means: equivalent under constraints
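The definition above, equivalence under constraints, can be made concrete with two made-up programs. In the Python sketch below, `legacy` and `new` are hypothetical stand-ins for the two microprograms, each returning its exit kind, its observables, and its ordered list of external memory writes; the constraint is a predicate saying the new feature is disabled. Exhaustive enumeration over a tiny state space replaces the symbolic check.

```python
from itertools import product


def legacy(mode_bit, x):
    """Hypothetical legacy microprogram: (exit kind, observables, memory writes)."""
    if x == 0:
        return "fault", {"code": 1}, []
    return "normal", {"result": x + 1}, [(0x10, x), (0x14, x + 1)]


def new(mode_bit, x):
    """Hypothetical new microprogram with an extra feature enabled by mode_bit."""
    if mode_bit:                       # new feature, absent from legacy
        return "normal", {"result": x + 2}, [(0x10, x)]
    if x == 0:
        return "fault", {"code": 1}, []
    y = x + 1                          # restructured, same legacy behavior
    return "normal", {"result": y}, [(0x10, x), (0x14, y)]


def compatible(constraint, states):
    """Return None if both programs agree on every constrained state, else a counterexample."""
    for state in states:
        # Tuple equality compares exit kind, observables, and the write list in order.
        if constraint(*state) and new(*state) != legacy(*state):
            return state
    return None


STATES = list(product((0, 1), range(16)))
```

With the constraint `mode_bit == 0` the two programs agree everywhere; dropping the constraint exposes a state where they differ, which shows why the "new features are disabled" predicates belong to the statement being proved.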
• Goals:
– Verify correct input/output behavior against a full architectural specification
– Account for both software and hardware implementation
• We developed a high-level specification, written in IRL, based on the programmer's reference manual
• Since we have IRL representation for the SW implementation, its verification reduces to checking equivalence of IRL programs
– With some auxiliary mapping to bridge the abstraction gap
• The HW RTL implementation of microinstructions is formally verified separately
– Again, using symbolic simulation (STE)
• Together, this implies full verification (for in-order execution)
– Sometimes, it works
• In the past couple of years, we have made progress in the introduction of formal methods to verification of embedded SW at Intel
• Current toolset provides automation in several key verification activities
– Contributes to quality and productivity of traditional methods
• Future (and on-going) work includes
– Application (and adaptation) to other types of embedded SW
– On-going search for efficiency improvements in all levels
– On the solver level we are very encouraged by latest results using academic word-level solvers
• R.E. Bryant, “Symbolic simulation techniques and applications”, DAC’1990
• D. Currie et al, “Embedded software verification using symbolic execution and uninterpreted functions”, Int. J. Parallel Program, 2006
• A. Koelbl and C. Pixley, “Constructing efficient formal models from high-level descriptions using symbolic simulation”, Int. J. Parallel Program, 2005
• D. Babic and A. Hu, “Structural abstraction of software verification conditions”, CAV’2005
• D. Currie, A. Hu and S. Rajan, “Automatic formal verification of DSP software”, DAC’2000
• C. Flanagan and J. Saxe, “Avoiding exponential explosion: generating compact verification conditions”, POPL’2001
• E.M. Clarke, D. Kroening and K. Yorav, “Behavioral consistency of C and Verilog programs using bounded model checking”, DAC’2003
• R. C. Ho et al, “Architecture validation for processors”, Proc. Int. Symp. Computer Architecture (ISCA’95)
• D. Lugato et al, “Automated Functional Test Case Synthesis from THALES industrial Requirements”, RTAS’04
• P. Mishra and N. Dutt, “Graph-Based Functional Test Program Generation for Pipelined Processors”, DATE’2004
• S. Ur and Y. Yadin, “Micro Architecture coverage directed generation of test programs”, DAC’1999
• D. Cyrluk, “Microprocessor Verification in PVS: A Methodology and Simple Example”, Technical Report SRI-CSL-93-12, 1993
• S.Y. Huang and K.T. Cheng, “Formal Equivalence Checking and Design Debugging”, Kluwer, 1998
• D. Harel and A. Pnueli, “On the development of reactive systems”, In Logics and Models of Concurrent Systems, 1985
• A. Pnueli, M. Siegel, and E. Singerman, “Translation validation”, In TACAS’1998
• J. Sawada and W.A. Hunt, “Verification of FM9801: An out-of-order microprocessor model with speculative execution, exceptions, and program-modifying capability”, J. on Formal Methods in System Design, 2002
• M. Srivas and S. Miller, “Applying formal verification to the AAMP5 microprocessor: A case study in the industrial use of formal methods”, J. on Formal Methods in System Design, 1996
• A. Aharon et al, “Test Program Generation for Functional Verification of PowerPC Processors in IBM”, DAC’1995
• R. S. Boyer, B. Elspas and K. N. Levitt, “SELECT - a formal system for testing and debugging programs by symbolic execution”, 1975
• T. Ball and S. K. Rajamani, “Automatically Validating Temporal Safety Properties of Interfaces”, SPIN 2001
• S. Fine and A. Ziv, “Coverage Directed Test Generation for Functional Verification using Bayesian Networks”, DAC’2003
• D. Geist et al, “Coverage directed test generation using symbolic techniques”, FMCAD’1996
• A. Gupta et al, “Property-Specific Testbench Generation for Guided Simulation”, VLSID’2002
• T. Arons et al, “Formal verification of backward compatibility of microcode”, CAV’2005
• T. Arons et al, “Embedded Software Validation: Applying formal techniques for coverage and test Generation”, MTV’2006
• T. Arons et al, “Efficient symbolic simulation of low-level software”, to appear in DATE’2008