Word Level Feature Discovery to Enhance Quality of Assertion Mining
Reviewer: Shuo-Ren, Lin
ALCom 2012/5/11

Abstract
Bit level automatically generated assertions are numerous, making them unreadable and frequently unusable. This work proposes a methodology to discover word level features using static and dynamic analysis of the RTL source code. Post-processing of assertions is employed to remove redundant propositions. Experimental results show that the generated word level assertions have higher expressiveness and readability than their corresponding bit level assertions.

Outline
  Introduction
  Background
  A motivating example
  Procedure for automatic word level assertion generation
  Simulation guided weakest precondition computation to discover word level features
  Removing redundant propositions
  Experimental evaluation
  Reviewer conclusion

Introduction
  Assertion based verification
    Manually constructing assertions with high functional coverage is time-consuming
    Several academic and industrial solutions exist for automatic assertion generation
      Machine learning (decision tree learning in this work)
    Previous works learn assertions from dynamic simulation or execution traces of RTL designs.
      Generating assertions automatically is not a new idea
  Disadvantages of bit level assertions
    Low readability
    Low coverage of the input space by a single assertion
    Tend to be repetitive

Introduction
  Weakest precondition (wp) computation
    Has been used in software program analysis
    In this work, used to discover word level features
    Computing wp for all possible paths is subject to blowup.
      Use concrete simulation to guide wp computation
  Remove redundant assertions
    Post-process based on design knowledge
    Identify known exclusive features
  Result (compared to bit level assertions)
    Fewer assertions are generated
    Higher percentage of true assertions

Background
  Target: a variable for which we want to generate assertions.
  Logic cone of target: variables that can affect the target's value
  Feature: a variable that is used to predict the target's value
  Generated assertion: A => B
    Antecedent (A), in terms of the features
    Consequent (B), in terms of the target

Background
  Word level variable: a variable with bit width larger than 1
  Conditional expression: an expression in RTL that evaluates to true/false to determine which branch should be executed.
  Word level predicate: a first order formula in terms of word level variables that evaluates to true/false.
  Word level assertion: an assertion that has at least one word level predicate as a proposition in its antecedent or consequent
  Mining window length: the number of cycles over which we want the generated assertions to capture temporal behavior

Background
  Use-Definition chain (UD chain): a data structure consisting of a used variable and all the definitions of that variable that can reach that use without any other intervening definitions.
  wp(st, P): weakest precondition
    st: statement
    P: post-condition (a predicate)
    If wp(st, P) is true before the execution of st, P is guaranteed to hold after the execution of st.

Background
  Consider the truth table of a target's function in terms of features:
    A table entry is covered by a given assertion if the concrete value of the entry satisfies the antecedent of that assertion
    The input space coverage of a given assertion is the percentage of truth table entries covered by the assertion

Background
  Feature selection
    Selection of a subset of input variables by eliminating variables with little or no predictive information.
    This work includes only primary input variables.
    For temporal assertion generation, the design is unrolled, and a variable at different cycles is treated as different sequential variables.
      E.g. [expression missing], where a and b are at different cycles.

A Motivating Example
  [Figure: RTL excerpt with the following annotations]
  Determine the instruction for decoding
  Word level target
    alu_op[3:0] = `ALU_OR
  Bit level features
  Word level features
    if_insn[31:26] = `IR32_ORI
    id_insn[31:26] = `IR32_ORI
  Logic cone
  Assign values to alu_op

Procedure for Automatic Word Level Assertion Generation
  Phase 1: Discover word level targets
  Phase 2: Discover word level features
  The dotted block is the extension made by this paper

Phase 1: Discovering Word Level Targets
  Consider bit-vector output variables with constant assignments in the RTL code
  Bit level variables can only have one of two values, 0 or 1.
  Word level variables can have many possible values.
    It's natural for a machine learning algorithm
  To avoid generating too many assertions, the word level variable along with its intended value is given as a proposition.
    Provide the word level predicate itself as a target to the learning algorithm
    E.g. in Figure 1

Phase 2: Discovering Word Level Features
  Phase 2.1: identify word level conditional expressions; these expressions may not be in terms of primary inputs.
  Phase 2.2: use simulation guided weakest precondition computation to discover all word level features in terms of the primary inputs.
  Variables in the target's logic cone that are not used by any discovered word level feature should also be output as features.

Data Generation and Learning Algorithm
  The discovered targets and features are instrumented back into the RTL, which is then re-simulated.
  There are many heuristics for picking the best splitting variables from the feature variables.

RTL to CDFG
  CDFG
    Branch node: b1
    Assignment node: b2
    Merge node: b6
    The path condition for an assignment statement is the conjunction of all conditional expressions leading to the execution of that assignment statement on the path
    The multi-cycle path is recorded during simulation and is used to guide wp computation
  UD chain
    Points to all statements that assign the variable
    E.g. UD chain: id_insn

wp Computation
  Recursive substitution until PIs or constants are reached
  Notation
    I: primary input variables
    r = {r1, r2, ..., rn}: all state variables
    [symbol missing]: transition function for a state variable
    [symbol missing]: transition relation for the corresponding state variables
    [symbol missing]: wp of P
    [symbol missing]: wp for k consecutive cycles

wp Computation Example
  Problems
    Complex and unreadable predicates
    The number of static paths increases exponentially for large k
    Path conditions for different variables used in P may conflict

Simulation Guided wp Computation
  Follow the concrete path
  Not trying to extract the complete function of the given target
  E.g. target: alu_op = `ALU_OR

Removing Redundant Propositions
  Mutually exclusive features lead to over-constrained or meaningless assertions.
    E.g. with features State=S1 and State=S2: (State=S1 ∧ (State=S2)') => B, where (State=S2)' is already implied by State=S1
  Identification of mutually exclusive features
    Group all discovered features from the same post-condition predicate
    Check whether there are mutually exclusive features within each group
  Post-process all generated assertions

Experimental Result
  Synopsys VCS for simulation
  Three designs: EMAC, I2C and OpenRISC
  Cadence IFV for formal verification
  Most generation processes complete within half an hour, depending on the IFV runtime.
  Bit level vs. word level from six perspectives
    Number of generated candidate assertions
    Percentage of true assertions
    Average number of propositions in an assertion's antecedent
    Input space coverage analysis of generated assertions
    Relationship between word level assertions and bit level assertions
    Injecting bugs in the RTL and using the generated assertions to detect the injected bugs

Experimental Result
  Number of generated candidate assertions
    Fewer word level features
    Window length and target signals are determined by the user beforehand

Experimental Result
  Candidate assertions
  Percentage of true assertions

Experimental Result
  Average number of propositions in an assertion's antecedent
  Input space coverage
    alu_op=OR is the target

Experimental Result
  Relationship between word level assertions and bit level assertions, i.e. one word level assertion can cover several bit level assertions.
    The antecedent of each bit level assertion implies the antecedent of the word level assertion
    They assert the same value on the target

Experimental Result
  Bug detection ability
    Compared using a systematic mutation-based method
    Four types of bugs: operator replacement, variable to constant replacement, constant replacement, and relational operator replacement.

Reviewer Conclusion
  Easy to read
  Contribution is clear
  Some unclear points
    Is the reported runtime the IFV runtime rather than the runtime of the whole process?
    The number of test cases decreases
    The experiments seem more powerful than the paper's comments on them
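
As a supplement to the Removing Redundant Propositions slide, the post-processing step can be illustrated with a small sketch. This is not the paper's implementation: the `Prop` type and the `remove_redundant` function are hypothetical names introduced here, and the sketch assumes that two propositions on the same variable with different values are mutually exclusive (as with State=S1 and State=S2), so a negated proposition is redundant once a positive one pins that variable to another value.

```python
from typing import NamedTuple


class Prop(NamedTuple):
    """One antecedent proposition `var == value`, optionally negated (hypothetical type)."""
    var: str
    value: str
    negated: bool = False


def remove_redundant(antecedent: list[Prop]) -> list[Prop]:
    """Drop negated propositions already implied by a positive one on the same variable.

    Assumption: different values of one variable are mutually exclusive,
    so `State == S1` implies `not (State == S2)`.
    """
    # Variables pinned to a concrete value by a positive proposition.
    pinned = {p.var: p.value for p in antecedent if not p.negated}
    kept = []
    for p in antecedent:
        implied = p.negated and p.var in pinned and pinned[p.var] != p.value
        if not implied:
            kept.append(p)
    return kept


# Antecedent from the slide: State=S1 AND NOT(State=S2)
antecedent = [Prop("State", "S1"), Prop("State", "S2", negated=True)]
simplified = remove_redundant(antecedent)
print(simplified)
```

With the slide's example, only the State=S1 proposition should remain in the antecedent. The sketch does not detect contradictory antecedents (two positive propositions with different values on one variable); a real flow would need design knowledge to decide which feature groups are mutually exclusive.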