EE 290A – Sequential Optimization and Verification
http://www-cad.eecs.berkeley.edu/~alanmi/courses/290A/index.htm
Tues. Thurs. 9:30-11, Cory 540 A/B
3 hours credit
Instructors: Robert Brayton, Roland Jiang, Alan Mishchenko

Outline
• Introduction
• Schedule
• Latches and Flip-flops
• Verification of a clock-schedule

Requirements
• Attend class and participate in discussions
• Do research on an individual project, collaborating with a mentor
• Give two presentations on your project: a preliminary overview (March 3) and a final project lecture (May 10, 12). The lecture should be similar to a conference presentation of 20-25 minutes
• Final project report (May 20) should be similar to a conference paper
• Learn a lot and have fun

Web site contents
• News (we will probably send e-mail to the class roster instead of posting here)
• General Overview
• Syllabus (rough and preliminary description of course content)
• Lectures (lectures will be posted here, hopefully before the class)
• Examples (examples to which some CAD programs can be applied)
• Reading (a list of relevant papers for more information)
• Projects (some proposed ones are here already with mentors, but this list is preliminary; the final list will be given out early in February)
• Summary

Computing Resources
• All students will be given an account on the CAD computers; if you don't have one, see Roland Jiang
• These can be used to run programs (MVSIS, VIS, SIS, ESPRESSO, SPICE, …) on examples
• Used for project work

Coordination with 219B – Logic Synthesis (8-9:30 Tues. Thurs.)
• The schedule with 219B has been coordinated with Prof. Kuehlmann, so some basic material will be presented there prior to its use in 290A. This required the 219B schedule to be changed from previous years.
• This makes it possible to take both courses at the same time
• This also makes it possible to take just 290A without having taken 219B; just make sure you attend 219B when the basic material is taught.
• Other than this dependency, 290A is self-contained.
• Auditors are welcome to attend classes and can opt to do a project.

Sequential Optimization
• Much research exists; a lot of effort in the ’90s
• Could lead to important improvements in area, performance, etc.
• Interesting work from a theoretical standpoint
• Not really used in commercial CAD, except retiming
• Computationally expensive, so many techniques can only be applied to small examples
• Use of optimization is restricted by the need to verify.

Sequential Verification
• Big area with a lot of current work
• Abstraction methods help in scaling to larger problems
• Interesting theoretical foundations
• Lots of commercial interest in CAD companies and in system design houses.
• Recent SAT-based methods have increased capability (also applied to optimization)

Purpose of Course
• Only a very small subset has been taught previously
• Bring into focus the best state-of-the-art methods to get a synergistic view
• See if these can be integrated and advanced in light of recent advancements in logic synthesis and verification
• Expose fruitful directions of research
• Create a few conference papers.

Schedule

Week 1 (January 18-20)
• Jan 18: Introduction, latches and flip-flops, clock schedule analysis (Bob)
• Jan 20: Basics of reachability analysis (Alan)
Week 2 (January 25-27)
• Jan 25: Cyclic circuits – Part 1 (Roland)
• Jan 27: Cyclic circuits – Part 2 (Roland)
Week 3 (February 1-3)
• Feb 1: Asynchronous synthesis – Part 1 (Alex Kondratyev, CBL)
• Feb 3: Asynchronous synthesis – Part 2 (Alex Kondratyev, CBL)
Week 4 (February 8-10)
• Feb 8: Asynchronous synthesis – Part 3 (Alex Kondratyev, CBL)
• Feb 10: Clocking networks (Rajeev Murgai, Fujitsu)
Week 5 (February 15-17)
• Feb 15: State-based FSM manipulations – Advanced reachability (Alan)
• Feb 17: State-based FSM manipulations – Sequential flexibility (Alan)
Week 6 (February 22-24)
• Feb 22: State-based FSM manipulations – State minimization (Alan)
• Feb 24: Structure-based FSM manipulations – Clock skewing (Alan / Aaron Hurst)
Week 7 (March 1-3)
• Mar 1: Structure-based FSM manipulations – Retiming (Alan)
• Mar 3: Preliminary project presentations (290A students)
Week 8 (March 8-10)
• Mar 8: Structure-based FSM manipulations – Initialization sequences, peripheral retiming (Roland)
• Mar 10: Structure-based FSM manipulations – Inherent power of retiming and resynthesis (Roland)
Week 9 (March 15-17)
• Mar 15: Structure-based FSM manipulations – Sequential testing and redundancy removal (Bob)
• Mar 17: Structure-based FSM manipulations – High-level retiming (Bob)
Week 10 (spring break)
Week 11 (March 29-31)
• Mar 29: Structure-based FSM manipulations – Retiming and technology mapping (Alan)
• Mar 31: Structure-based FSM manipulations – Sequential optimization w/o reachability (Alan)
Week 12 (April 5-7)
• Apr 5: Formal verification – Temporal logic, omega automata, language containment (Bob)
• Apr 7: Formal verification – Bounded model checking, temporal induction (Alan)
Week 13 (April 12-14)
• Apr 12: Formal verification – Unbounded model checking – Part 1 (Ken McMillan, CBL)
• Apr 14: Formal verification – Unbounded model checking – Part 2 (Ken McMillan, CBL)
Week 14 (April 19-21)
• Apr 19: Formal verification – Sequential equivalence checking – Part 1 (Roland)
• Apr 21: Formal verification – Sequential equivalence checking – Part 2 (Roland)
Week 15 (April 26-28)
• Apr 26: Formal verification – Functional dependency (Roland)
• Apr 28: GALS and Latency Insensitive Design (Bob)
Week 16 (May 3-5)
• May 3: Other topics
• May 5: Other topics
Week 17 (May 10-12)
• May 10: Final project presentations – Part 1 (290A students)
• May 12: Final project presentations – Part 2 (290A students)
May 13-20: Final examinations period
• May 20: Hand in Final Project Reports

Latches and Flip-flops
• Bi-stable circuit (no inputs)

SR Latch (bi-stable circuit with control)
[Figure: cross-coupled SR latch with inputs $R$, $S$ and outputs $Q$, $\bar{Q}$, with its gate equations]
• Assumption: $RS = 0$
• Next-state equation: $Q^* = S + \bar{R}Q$
• Next-state map of $Q^*$ (rows $SQ$, columns $R$; X marks the excluded case $RS = 1$):

SQ \ R    0    1
00        0    0
01        1    0
11        1    X
10        1    X

D latch
[Figure: D latch built from an SR latch, with inputs $C$, $D$ and outputs $Q$, $\bar{Q}$, and its symbol]
• $R = C\bar{D}$, $S = CD$
• $Q^* = S + \bar{R}Q = CD + (\bar{C} + D)Q = CD + \bar{C}Q$ (the term $DQ$ is absorbed as the consensus of $CD$ and $\bar{C}Q$)

D flip-flop
[Figure: master-slave D flip-flop made of two D latches (master, slave) sharing the clock $C$]
• Each latch obeys $Q^* = S + \bar{R}Q = CD + \bar{C}Q$
• When C is asserted (= 1):
  • The old value of Q is captured in the slave
  • The master is opened; its Q follows the D input
• When C is de-asserted (= 0):
  • The master is closed and the D value is captured in the master
  • The slave is opened and transmits the output Q from the master

Summary of latch and flip-flop characteristics

Device type                      Characteristic equation
SR latch                         $Q^* = S + \bar{R}Q$
Gated SR latch                   $Q^* = SC + Q\bar{R} + \bar{C}Q$
D latch                          $Q^* = DC + \bar{C}Q$
SR flip-flop                     $Q^* = S + \bar{R}Q$
D flip-flop                      $Q^* = D$
JK flip-flop                     $Q^* = \bar{K}Q + J\bar{Q}$
T flip-flop (edge triggered)     $Q^* = \bar{Q}$
T flip-flop (clocked)            $Q^* = T\bar{Q} + \bar{T}Q$

Set-up and hold times
• Set-up time: time during which data must be stable before the clock edge arrives
• Hold time: time during which data must be stable after the clock edge arrives

Analog analysis
• Assume pure CMOS thresholds, 5 V rail
• Theoretical threshold center is 2.5 V
[Figure: transfer characteristic $V_{out} = T(V_{in})$, plotted as $V_{out}$ versus $V_{in}$]

Metastability
• Metastability is inherent in any bi-stable system
[Figure: the two transfer curves plotted on axes $V_{in1} = V_{out2}$ and $V_{in2} = V_{out1}$; their intersections are labeled stable, metastable, stable]

Metastability dynamics
[Figure: same plot with the separatrix drawn through the metastable point, dividing the regions that settle to each stable point]

D-latch operation
• The latch acts like a wire while its control is active
• When the control later changes, the latch grabs (holds) the data

D-latch timing parameters
• Propagation delay (from C or D)
• Setup time (D before C edge)
• Hold time (D after C edge)
[Figure: D-latch schematic ($C$, $D$, $R$, $S$, $Q$, $\bar{Q}$) annotated "metastability", with the metastability-dynamics plot (separatrix; stable, metastable, stable points) repeated]

Pos.-edge-triggered D flip-flop behavior
[Figure: master-slave D flip-flop (master and slave D latches)]

D flip-flop timing parameters
• Propagation delay (from CLK)
• Setup time (D before CLK)
• Hold time (D after CLK)
[Figure: timing diagram annotated "metastability"]

Timing verification of synchronous circuits
• "checkTc and minTc: Timing Verification and Optimal Clocking of Synchronous Digital Circuits", K. Sakallah, T. Mudge and O. Olukotun
• "Verifying Clock Schedules", T. Szymanski and N. Shenoy

Sakallah et al. formulation
• Handles arbitrary multi-phase clocks
• Captures signal propagation along short as well as long paths
• Simple formulation

Clocking system
• A set of $k$ clock phases $C = (\phi_1, \ldots, \phi_k)$
• A set of $n$ D-latches $L = (L_1, \ldots, L_n)$
• Assume that all clock phases are active high, i.e. the latches are transparent when their clock is high
• Data is captured when the clock transitions low.

Parameters
• $n$: number of latches in the circuit
• $p_i$: clock phase controlling latch $i$
• $S_i$: setup time of latch $i$
• $H_i$: hold time of latch $i$
• $\delta_{ij}$: minimum combinational delay from latch $i$ to latch $j$
• $\Delta_{ij}$: maximum combinational delay from latch $i$ to latch $j$

Variables Defining Clock Schedule
• $\pi$: clock period
• $w_i$: length of time that clock phase $i$ is active
• $e_i$: absolute time within the first period when phase $i$ begins, i.e. when the clock phase rises.
(This definition of $e_i$ differs from the one in the Szymanski paper.)

Other Variables
• $E_{ij}$: time between the start of phase $i$ and the next start of phase $j$
• $a_i$: earliest signal arrival time at latch $i$ (relative to the start time of $p_i$)
• $A_i$: latest signal arrival time at latch $i$ (relative to the start time of $p_i$)
• $d_i$: earliest signal departure time from latch $i$ (relative to the start time of $p_i$)
• $D_i$: latest signal departure time from latch $i$ (relative to the start time of $p_i$)
• All arrival and departure times are in their local time frame, whose origin is at $e_i$ (i.e. $t_i^{local} = 0$ when $t = e_i + n\pi$). $E_{ij}$ is used to translate between time frames.

Equations and Constraints
[Figure: waveforms of phases $\phi_i$ and $\phi_j$ over one period starting at 0, marking $e_j$, $w_j$, $e_i$, $w_i$]
• $E_{ij} = e_j - e_i$ if $e_j > e_i$; $E_{ij} = \pi + e_j - e_i$ otherwise
• Note: here (with $e_j < e_i$) $E_{ij} = \pi + e_j - e_i$ and $E_{ji} = e_i - e_j$
• Note: $E_{ii} = \pi$
• Data departs a latch as soon as it arrives if the latch is transparent; otherwise it departs when the latch opens (local time 0):
  $d_i = \max(0, a_i)$
  $D_i = \max(0, A_i)$
• Arrival times are obtained from the departure times of the fan-in latches, translated into the local time frame of latch $i$:
  $a_i = \min_{j \to i} (d_j + \delta_{ji} - E_{p_j p_i})$
  $A_i = \max_{j \to i} (D_j + \Delta_{ji} - E_{p_j p_i})$

Constraints for Correct Operation (setup)
• $A_i \le w_i - S_i$
• Derivation for a path from latch $j$ to latch $i$ (with $e_j < e_i$ as in the figure):
  $e_j + \max(0, A_j) + \Delta_{ji} \le e_i + w_i - S_i$
  $\max(0, A_j) + \Delta_{ji} - (e_i - e_j) \le w_i - S_i$
  $\max(0, A_j) + \Delta_{ji} - E_{ji} \le w_i - S_i$
  $A_i \le w_i - S_i$

Constraints for Correct Operation (hold)
• $a_i \ge w_i - \pi + H_i$
• Derivation: new data must not arrive until $H_i$ after the previous closing edge of latch $i$, which is at local time $w_i - \pi$:
  $e_j + \max(0, a_j) + \delta_{ji} \ge e_i + w_i - \pi + H_i$
  $\max(0, a_j) + \delta_{ji} - (e_i - e_j) \ge w_i - \pi + H_i$
  $\max(0, a_j) + \delta_{ji} - E_{ji} \ge w_i - \pi + H_i$
  $a_i \ge w_i - \pi + H_i$

Verification and Optimization
• Clock schedule optimization problem: find the minimum value of $\pi$ for which there is an assignment to all variables (including $w$ and $e$) consistent with the constraints.
• Clock schedule verification problem: given values for $\pi$, $w$ and $e$, find values for the rest of the variables that satisfy the constraints.

Iterative construction
  $a_i^0 = A_i^0 = -\infty$
  $d_i^m = \max(a_i^m, 0)$
  $D_i^m = \max(A_i^m, 0)$
  $a_i^m = \min_{j \to i} (d_j^{m-1} + \delta_{ji} - E_{ji})$
  $A_i^m = \max_{j \to i} (D_j^{m-1} + \Delta_{ji} - E_{ji})$
Note:
• a solution is a fixed point
• iteration from below
• both min and max times are computed simultaneously

Lemmas
1. For all $i$ and $m > 0$ (monotone increasing): $a_i^m \ge a_i^{m-1}$, $A_i^m \ge A_i^{m-1}$, $d_i^m \ge d_i^{m-1}$, $D_i^m \ge D_i^{m-1}$
2. Let $(a, A, d, D)$ be a solution.
Then for all $i$ and $m > 0$: $a_i \ge a_i^{m-1}$, $A_i \ge A_i^{m-1}$, $d_i \ge d_i^{m-1}$, $D_i \ge D_i^{m-1}$
3. If $d_i^n \ne d_i^{n-1}$ or $D_i^n \ne D_i^{n-1}$ for some $i$, then the equations have no solution. ($n$ is the number of latches)
4. If the iteration converges, then it converges to the minimum solution.

Uniqueness results
• The construction is run to find a solution to the equations, and this solution is then checked against the setup and hold inequalities.
• But what if the solution is not unique? There can be multiple solutions.
• If there is more than one solution, then the clock period is optimum
• If there is more than one solution, then some of them violate the setup constraints.
• The minimum solution is the "correct" one, because if we slow down the clock by just $\epsilon$, all other solutions disappear.

Example
[Figure: a single latch (Latch 1, $S = 2$, $H = 3$) with a feedback path, $\delta = \Delta = 10$, clock period $\pi = 10$, so $E_{11} = \pi$; clock waveform segments labeled 2 and 8]
  $D_i = \max(A_i, 0)$
  $d_i = \max(a_i, 0)$
  $A_i = \max_{j \to i} (D_j + \Delta_{ji} - E_{ji})$
  $a_i = \min_{j \to i} (d_j + \delta_{ji} - E_{ji})$
  $A_i \le w_i - S_i$
  $a_i \ge w_i - \pi + H_i$
• Try $\pi = 10 + \epsilon$

Theorem
If $\pi$ is optimum, then there exists a cycle $j_0, j_1, \ldots, j_k = j_0$ such that
  $\sum_{i=0}^{k-1} \Delta_{j_i j_{i+1}} = \sum_{i=0}^{k-1} E_{j_i j_{i+1}}$

Pseudo code
    for each i with 1 ≤ i ≤ n do
        A_i = a_i = −∞
    for m = 1 step 1 until m = n
        for each i with 1 ≤ i ≤ n do
            D_i = max(A_i, 0)
            d_i = max(a_i, 0)
        for each i with 1 ≤ i ≤ n do
            A_i = max(A_i, max_{j→i}(D_j + Δ_ji − E_ji))
            a_i = max(a_i, min_{j→i}(d_j + δ_ji − E_ji))
        if no A_i or a_i changed during this pass, then return "algorithm converged"
    return "algorithm diverged"

Experience (Szymanski & Shenoy)
• Run on the ISCAS’89 benchmarks
• The largest circuit had 3272 latches and 67704 edges $j \to i$
• Run time on this largest example was 20 sec. (1992 computers), most of which was spent reading in the circuit.
• Typically only a few iterations were needed for convergence.
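
The iterative construction and the setup/hold check can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the edge-list representation `(j, i, min_delay, max_delay)`, and the handling of latches without fan-in are all hypothetical. Convergence is tested on the departure times, in the spirit of Lemma 3, and the hold check assumes the bound is measured from the previous closing edge, i.e. a_i >= w_i - period + H_i.

```python
import math


def phase_shift(e, period, src, dst):
    """E between two phases: time from the start of phase `src`
    to the next start of phase `dst` (equals `period` when src == dst)."""
    diff = e[dst] - e[src]
    return diff if diff > 0 else period + diff


def solve_arrival_times(n, phase, edges, e, period):
    """Fixed-point iteration from below for the arrival/departure equations.

    n      -- number of latches
    phase  -- phase[i] is the clock phase controlling latch i
    edges  -- (j, i, min_delay, max_delay) for each combinational path j -> i
    e      -- e[p] is the start time of phase p within the first period
    Returns (a, A, d, D), or None if there is no convergence within n passes.
    """
    fanin = [[] for _ in range(n)]
    for j, i, dmin, dmax in edges:
        shift = phase_shift(e, period, phase[j], phase[i])
        fanin[i].append((j, dmin, dmax, shift))

    a = [-math.inf] * n          # earliest arrivals, start from below
    A = [-math.inf] * n          # latest arrivals
    d = [0.0] * n                # max(-inf, 0)
    D = [0.0] * n

    for _ in range(n):
        for i in range(n):
            # Assumption: latches without fan-in keep arrival time -inf.
            if fanin[i]:
                a[i] = min(d[j] + dmin - s for j, dmin, _, s in fanin[i])
                A[i] = max(D[j] + dmax - s for j, _, dmax, s in fanin[i])
        new_d = [max(x, 0.0) for x in a]
        new_D = [max(x, 0.0) for x in A]
        if new_d == d and new_D == D:
            return a, A, d, D    # departures unchanged: fixed point reached
        d, D = new_d, new_D
    return None                  # still changing after n passes


def check_schedule(solution, phase, w, S, H, period):
    """Return (setup_ok, hold_ok) for a solution of the equations.
    w is indexed by phase; S and H are indexed by latch."""
    a, A, _, _ = solution
    setup_ok = all(A[i] <= w[phase[i]] - S[i] for i in range(len(A)))
    hold_ok = all(a[i] >= w[phase[i]] - period + H[i] for i in range(len(a)))
    return setup_ok, hold_ok
```

For a single latch with a feedback path of delay 10, `solve_arrival_times(1, [0], [(0, 0, 10, 10)], [0], 11)` converges in one pass, while the same call with a period of 9 returns `None`, matching the intuition that the loop delay cannot exceed the clock period.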