Summer Formal 2011
Homework and Lab
Jason Baumgartner
www.research.ibm.com/sixthsense
IBM Corporation
May 2011

Homework 1: Netlist Modeling Exercises

1.1) Properties are specially annotated as "outputs" in the AIGER format. However, there is no special way to annotate "constraints". How may the netlist be manipulated to allow constraints to be reflected in an AIGER netlist?

[Figure: example constraint, labeled: assume (busy not req_valid)]

1.2) Latches are assumed to have a constant-0 initial value in the AIGER format. Assume you wish to initialize a set of latches to an arbitrary one-hot state: i.e., exactly one of them will be active at any point in time. How may this be represented in the netlist?

[Figure: netlist diagram, labeled: assertable? 0 0 0]

1.3) Certain types of logic functions, such as multipliers, are very difficult to reason about using bit-level algorithms. "Uninterpreted functions" are sometimes used to facilitate the verification of designs with such functions, wherein two instances of a particular function (e.g., one in the implementation and one in a reference model) are replaced with nondeterministic behavior. In particular, these two instances are each replaced by a multiplexor: if the arguments to the abstracted functions are identical, the same nondeterministic values are sensitized through both multiplexors. Otherwise, different nondeterministic values are sensitized through.

Uninterpreted functions are useful when the correctness of the verification task is not dependent upon the precise values produced from the abstracted functions; only the *consistency* of identical results being produced under identical arguments is relevant.

When verifying sequential netlists, a challenge with using uninterpreted functions is that the function pairs to be abstracted may receive their arguments at different points in time.
I.e., the implementation may be pipelined, hence the timing with which it receives relevant arguments may not match the un-pipelined reference model. How could one model a precise "sequentially consistent" uninterpreted function to cope with this? Comment on the size of the resulting implementation with respect to the width of the abstracted function. Could you think of lossy yet "sound" shortcuts which are of smaller size and retain sequential consistency?

1.4) Recall that “liveness checking” may be reduced to “safety checking” through a netlist transformation, which entails adding a “shadow register” for every register in the original netlist, against which a state repetition (i.e., a lasso loop) may be detected.

Consider checking a liveness property of the form: every request eventually gets a grant. A single assertion net may be synthesized which remembers that a request has occurred and is awaiting a grant; hence the liveness check consists of checking whether this assertion net may stick at logical 1 forever. Work through the exercise of how to convert this overall check to safety, e.g. how to model the shadow registers, to end up with an AIGER-style safety assertion net capturing all liveness failures of the above.

Liveness checking often also entails fairness constraints, which are logical conditions which must hold “infinitely often” along any counterexample trace. Consider a set of fairness constraints, expressed as nets which assert to 1 when they are satisfied, used to qualify the liveness check. Work through the exercise of how to support fairness constraints in the above modeling.

Homework 2: Algorithmic Exercises

2.1) Netlist Modeling Exercise 1.1 asks for a way to reflect constraints without a dedicated netlist construct. Can you think of drawbacks of this modeling in simulation and semi-formal verification frameworks?
Can you think of potential drawbacks to such an approach in various verification frameworks such as induction and redundancy removal? Can you think of algorithmic ways to compensate for such a modeling if desired?

[Figure: stimulus generator diagram, labeled: constrain!, assertable? 0 1 0]

2.2) Redundancy removal, which identifies and eliminates pairs of gates which are equivalent in all reachable states, is a powerful simplification technique capable of dramatically reducing overall verification resources (i.e., through simplifying the netlist for a subsequent proof technique), if not outright solving many intricate verification problems. In some cases, a netlist may have pairs of gates which are equivalent only after several time-frames from the desired initial state set. Can you think of several ways to try to exploit this condition, i.e. to attain the desired reduction while strictly (or at least, conservatively) preserving property checking? E.g., ways to alternatively model a testbench; an automated transformation to accomplish something similar; a type of invariant which may capture the desired information?

[Figure: two miters over A and B with =0? checks, titled "Miter without spec reduction" and "Miter with spec reduction"]

2.3) Reachability analysis may be performed using BDDs as follows. A BDD variable may be allocated for each primary input I_i, and each "current state variable" (register) C_i. Another BDD variable may be allocated for each next-state variable N_i, allowing a "transition function" to be built for each register r_i of the form: N_i = f_i(I, C), where f_i corresponds to the combinational netlist driving the next-state function of r_i. The "transition relation" is the conjunction of all "transition functions". A core operation of reachability analysis is the "image computation".
Given a set of "current states" whose onset is expressed as a BDD over "C" variables, the image computation returns the set of "next states", expressed over "N" variables, which may be transitioned to under some valuation of inputs. This is performed by conjoining the desired set of "current states" with the "transition relation", then existentially quantifying away inputs "I" and current-state variables "C".

A fundamental bottleneck of image computation is the intermediate complexity that comes from having all variables alive concurrently in the same BDD. Can you think of ways to optimize the image computation to minimize the number of live variables, while preserving the precision of the computation? Hint: the "support" of each next-state function is not necessarily identical, nor even overlapping.

Can you also formulate preimage computation, which maps from next states to current states? Can you think of reasons why one may be more effective than the other in (un)reachability analysis?

Homework 3: Testbench Modeling Exercises
Refers to Lecture 3

3.1) The "Cache Associativity" case study described how to minimize testbench complexity through a shortcut: instead of explicitly tracking the "age" of every entry to determine the "least-recently used" victim, one need only nondeterministically choose a single entry to monitor. Specifically, whenever the monitored entry is accessed, one resets the vector of "accessed" tags representing all other entries; whenever another entry is accessed, its tag is set; and whenever the monitored entry is selected by the netlist as the victim, the checker asserts that all other entries' tags have been set. This modeling is lossless in an exhaustive "formal" setting; i.e., any design flaw which prematurely casts out any entry without it being "least recently used" will be flagged as an error. Can you think of a drawback with such modeling in an explicit-state simulation environment?
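The single-entry monitor described in 3.1 can be sketched in a few lines of Python. This is only an illustrative model, not code from the case study: the class name `LruMonitor`, its method names, and the choice to start with all "accessed" tags cleared are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LruMonitor:
    """Hypothetical checker that watches one arbitrarily chosen cache entry."""
    num_entries: int
    monitored: int  # index of the monitored entry (chosen nondeterministically)
    accessed: List[bool] = field(init=False)

    def __post_init__(self) -> None:
        if not 0 <= self.monitored < self.num_entries:
            raise ValueError("monitored entry out of range")
        # One "accessed" tag per entry; the monitored entry's own tag is unused.
        # Assumption: all tags start cleared at reset.
        self.accessed = [False] * self.num_entries

    def on_access(self, entry: int) -> None:
        if entry == self.monitored:
            # The monitored entry is now the most recently used: clear all tags.
            self.accessed = [False] * self.num_entries
        else:
            self.accessed[entry] = True

    def on_victim(self, entry: int) -> bool:
        """Return True if this victim choice is consistent with LRU."""
        if entry != self.monitored:
            return True  # this monitor only checks its own entry
        return all(tag for i, tag in enumerate(self.accessed)
                   if i != self.monitored)


mon = LruMonitor(num_entries=4, monitored=2)
mon.on_access(2)
mon.on_access(0)
mon.on_access(1)
premature = mon.on_victim(2)  # entry 3 not accessed since entry 2 was
mon.on_access(3)
legal = mon.on_victim(2)      # every other entry accessed since entry 2 was
```

Note that the monitor keeps one bit per entry rather than a full age ordering, which is the register saving the exercise refers to.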
Can you think of any approach which is less "costly" in terms of testbench registers which somewhat improves upon this drawback?

3.2) Consider that we need to implement a testbench checking that a netlist associates the proper data with the proper tag. E.g., the design under verification is a "load queue" which first samples a "tag" from the driver, then enqueues that tag waiting for "data" marked with the same "tag" to be subsequently driven, and when it has the awaited data it finally presents the associated tag and data at its outputs. To enable a simpler testbench, one may merely embed the tag within the associated data in the driver, and check that the output data/tag pair properly reflects the embedded tag within the corresponding data. What type of design errors may be missed by the above check and driving shortcut? What steps may be taken in the driver to help minimize any missed bugs? Would any additional properties be relevant to capture missed bugs?

3.3) Consider a testbench intended to verify a FIFO-style design over a deep queue. How could the checker be modeled in a way which is independent of data width? How could the checker be modeled in a way which is only logarithmically dependent upon the queue depth?
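Returning to the tag-embedding shortcut of exercise 3.2, the driver and checker can be sketched as below. This is a minimal Python illustration under stated assumptions: the widths `TAG_BITS` and `DATA_BITS`, the choice of placing the tag in the low-order bits, and the function names are all hypothetical and not taken from the lecture.

```python
TAG_BITS = 4    # hypothetical tag width
DATA_BITS = 32  # hypothetical data width

TAG_MASK = (1 << TAG_BITS) - 1
DATA_MASK = (1 << DATA_BITS) - 1


def drive_data(tag: int, payload: int) -> int:
    """Driver: embed the tag in the low TAG_BITS of the driven data word."""
    return ((payload << TAG_BITS) | (tag & TAG_MASK)) & DATA_MASK


def check_output(out_tag: int, out_data: int) -> bool:
    """Checker: the presented tag must equal the tag embedded in the data."""
    return (out_data & TAG_MASK) == out_tag


word = drive_data(tag=0x5, payload=0xABC)
matched = check_output(0x5, word)     # tag and data agree
mismatched = check_output(0x6, word)  # data presented with the wrong tag
```

The checker here is purely combinational: it needs no registers to remember which data was driven for which tag, which is what makes the testbench simpler.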
Lab

We discussed how transformation-based verification may eliminate hardware implementation artifacts which entail verification complexities
ABC is a state-of-the-art hardware model checker
Numerous transformations, formal / semiformal engines, synthesis routines
Overall 1st place winner of all Hardware Model Checking Competitions
http://www.eecs.berkeley.edu/~alanmi/abc/
ABC takes numerous benchmark formats as input
We will use the And / Inverter Graph format AIGER
http://fmv.jku.at/aiger/

There are strong freely-available model checkers
Hardware Model Checking Competition: http://fmv.jku.at/hwmcc10
However, freely-available front-end language processors are dreadfully lacking!
We will use smvtoaig from the AIGER toolset to convert a very crude subset of SMV into AIGER
Unfortunately, this crude subset is almost identical to AIGER!
Fortunately, this problem should be remedied in roughly the next year, as hardware model checking formats migrate to word-level support and as SMT formats consider supporting sequential netlist constructs

Lab: Part 1

Download and install ABC and AIGER
http://www.eecs.berkeley.edu/~alanmi/abc/
http://fmv.jku.at/aiger/
Download Hardware Model Checking Competition 2008 benchmarks
http://fmv.jku.at/hwmcc08/
These are already in AIGER format
Browse the summary of ABC’s capabilities and its HWMCC 2008 performance
http://www.eecs.berkeley.edu/~alanmi/presentations/dprove2_02.ppt
Run ABC on some of the downloaded benchmarks to familiarize yourself with basic commands
Run some of the individual transforms listed on slide 8, not just dprove

Lab: Part 2

Let’s study how hardware implementation artifacts hurt verification scalability, and how transformations cope with this complexity
This exercise illustrates the impact of pipelining
We equivalence-check 2 netlists comprising a 32-bit pipelined multiplier
Note that the multiplier was retimed across a pipeline stage

[Figure: two pipelines, each with inputs data1 and data2, producing out1 and out1’]

This example is called
pipeline_hotclocked
The output is the disjunction over each bit-wise XOR check of out1 vs out1’
Why is this example potentially difficult to verify?
Which transformation(s) that we discussed should be adept at simplifying it?
Once simplified, what sort of proof techniques may be adept at verifying the simplified problem?
Find a minimal set of ABC commands to solve this example, not using conglomerate operations such as dprove

Lab: Part 3

This example illustrates the impact of clocking on verification complexity
pipeline_altclocked includes an oscillating 1,0,1,0… clock
Why is this example potentially difficult to verify?
Which transformation(s) that we discussed should be adept at simplifying it?
Find a minimal set of ABC commands to solve this example, not using conglomerate operations such as dprove

[Figure: two pipelines with inputs data1 and data2 and outputs out1 and out1’, labeled "updates when clock=0" and "updates when clock=1"]

Lab: Part 4

This example further illustrates the complexity of clocking
pipeline_unclocked removes the internal oscillating clock
Can you find any ABC command that can verify it???

Edit pipeline_unclocked.smv to add an oscillating clock
See SMV example on next slide; note that smvtoaig uses a very restrictive subset of SMV
Now solve this example, as per Lab Part 3
Can you see any verification impact if the clock oscillates 010101… vs 101010… ?
What sort of design bugs may be missed through using an oscillator?
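As a small behavioral illustration of the oscillating clock requested above (a Python model, not the SMV edit itself): an oscillator is a single one-bit register whose next-state function is its own negation, and its initial value alone selects between the 010101… and 101010… phases. The function name `oscillator` is an assumption made for this sketch.

```python
from itertools import islice
from typing import Iterator


def oscillator(init: int) -> Iterator[int]:
    """Yield successive values of a one-bit register with next(clk) = !clk."""
    clk = init
    while True:
        yield clk
        clk ^= 1  # toggle on every time step


phase0 = list(islice(oscillator(0), 6))  # clock initialized to 0
phase1 = list(islice(oscillator(1), 6))  # clock initialized to 1
```

The two phases differ only in which half of the design updates at time 0, which is a useful starting point for the question about 010101… vs 101010….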
Lab: Part 4 SMV Example

smvtoaig supports a very restrictive subset of SMV
req1, req2 are primary inputs; sv is a register with defined init and next-state function

    MODULE main
    VAR req1 : boolean;
    VAR req2 : boolean;
    ASSIGN ack1 := req1;
    ASSIGN ack2 := req2 & !req1;
    VAR sv : boolean;
    ASSIGN init(sv) := 0;
    ASSIGN next(sv) := req1;
    ASSIGN a1 := !(ack1 & ack2);
    SPEC AG a1=1

Lab: Part 5

This example illustrates driver options
Function f’ has been optimized knowing that vec should always be onehot: exactly one bit of this three-bit vec will be 1 at any point in time
Run ABC on pipeline_unclocked_onehot_assumption without this assumption
What is the outcome?

[Figure: function f with inputs vec, data1, data2 producing out1; function f’ with inputs vec, data1, data2 producing out1’]

Next edit pipeline_unclocked_onehot_assumption.smv to drive this onehot constraint on input vec(0 to 2)
Run ABC on the result; what is the outcome?

Need to drive an oscillating clock to get a conclusive answer?
Edit the smv file to add an oscillator
Run ABC on the result; what is the outcome now?

Lab: Part 6

Next use a constraint-style approach to model this input constraint
Undo your overriding of vec; instead use an antecedent-conditioning approach to preclude the property from failing if the input assumption is ever violated
I.e., you can add a register which remembers violations of the input assumption, and conjoin the appropriate logic to the output being checked
Can you find a way to solve the resulting problem using ABC???
Refer to Homework 1.1 and 2.1

Lab: Part 7

Run ABC on more examples from the HWMCC benchmarks
Assess the impact of various commands, such as strash, iprove, phase, dretime, dret -f, scl, scorr, dc2, bmc, itp
For any you cannot solve, try dprove or dprove2
Are any transforms more useful than others?
Is there a particular order that is best for these reductions?

Lab: Part 8 (Extracurricular)

Devise a new engine, or improvement to an existing engine, of ABC
Implement that technique; solve the remainder of the HWMCC problems
Publish that technique, and base your PhD dissertation upon it
Apply for a job at IBM!