Power Optimization Toolbox


Magic: An Industrial-Strength Logic Optimization, Technology Mapping, and Formal Verification System

Alan Mishchenko

UC Berkeley


Overview

Motivation

Big picture

Problem representation

Algorithms

Sequential synthesis

Combinational synthesis with choices

Technology mapping

Minimum-perturbation retiming

Experimental results

Future work


Historical Perspective

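[Figure: design size (gate count, from 10 to 1,000,000) versus time (years, 1950 to 2010). Representations along the time axis: truth tables (1950-1970), sum-of-products (1980), binary decision diagrams (1990), And-Inverter Graphs and conjunctive normal forms (2000-2010). Tools, from smaller to larger design sizes: Espresso, MIS, SIS; SIS, VIS, MVSIS; ABC, Magic.]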


Motivation

• ABC is a public-domain system for logic synthesis and formal verification, under development at Berkeley since 2005
    • A successor of Espresso, MIS, SIS, VIS, MVSIS
• The baseline version of ABC is not applicable to industrial designs because it does not support
    • Complex flops
    • Multiple clock domains
    • Special objects (adders, RAMs, DSPs, etc.)
    • Standard-cell libraries
• A fresh start, called Magic, was made in Fall 2009
    • Includes a new design database that supports these features
    • Integrates application packages for better memory use and runtime
    • Achieves better scalability


Big Picture

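[Figure: block diagram of Magic. Inputs: Verilog, EDIF, and BLIF files and programmable APIs, through the file / code interface. Center: design database. Application packages: AIG rewriting, sequential synthesis, computing choices, retiming, tech mapping, post-place resynthesis, structuring for delay, verification.]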

A. Mishchenko, N. Een, R. K. Brayton, S. Jang, M. Ciesielski, and T. Daniel, "Magic: An industrial-strength logic optimization, technology mapping, and formal verification tool", Proc. IWLS'10.


Application Packages

• Framework
    • Design database
    • File input / output
    • Programmable APIs
• Combinational optimization
    • AIG rewriting
    • Choice computation
    • Technology mapping
• Sequential optimization
    • Retiming
    • Merging equivalent nodes
• Technology mapping
    • Mapping with choices
    • Speedup
• Verification
    • Simulation
    • Comb equivalence checking
    • Seq equivalence checking


Representations

• Netlist
    • Original / current / resulting design with “industrial stuff”
• AIG: the main data structure of ABC / Magic
    • Represents local / global functions
    • Gets synthesized / mapped / verified
• Logic network
    • Represents the result of technology mapping


AIG: Definition and Examples

An AIG is a Boolean network composed of two-input ANDs and inverters.

Karnaugh map of the example function F(a,b,c,d) (rows cd, columns ab):

cd \ ab   00  01  11  10
   00      0   0   1   0
   01      0   0   1   1
   11      0   1   1   0
   10      0   0   1   0

Two AIG structures for the same function:

F(a,b,c,d) = ab + d(ac’ + bc): 6 nodes, 4 levels

F(a,b,c,d) = ac’(b’d’)’ + bc(a’d’)’ = ac’(b+d) + bc(a+d): 7 nodes, 3 levels

[Figure: the two AIGs, drawn over inputs a, b, c, d]
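The claim that both structures compute the same function, and their node and level counts, can be checked with a small sketch. The tuple-based AIG encoding and the helper names below are assumptions made for illustration; they are not the data structures used in ABC or Magic.

```python
from itertools import product

# Hypothetical encoding: a leaf is a variable name, ("and", x, y) is a
# two-input AND node, and ("not", x) is an inverter. OR uses De Morgan's law.
def AND(x, y):
    return ("and", x, y)

def NOT(x):
    return ("not", x)

def OR(x, y):
    return NOT(AND(NOT(x), NOT(y)))

def evaluate(node, env):
    if isinstance(node, str):
        return env[node]
    if node[0] == "not":
        return not evaluate(node[1], env)
    return evaluate(node[1], env) and evaluate(node[2], env)

def and_nodes(node, seen=None):
    """Collect distinct AND nodes; inverters are not counted as nodes."""
    seen = set() if seen is None else seen
    if isinstance(node, str):
        return seen
    if node[0] == "and":
        seen.add(node)
    for child in node[1:]:
        and_nodes(child, seen)
    return seen

def levels(node):
    """Longest path from a leaf to this node, counting AND nodes only."""
    if isinstance(node, str):
        return 0
    if node[0] == "not":
        return levels(node[1])
    return 1 + max(levels(node[1]), levels(node[2]))

def truth_table(f):
    return [bool(evaluate(f, dict(zip("abcd", bits))))
            for bits in product([False, True], repeat=4)]

# F = ab + d(ac' + bc)
f1 = OR(AND("a", "b"), AND("d", OR(AND("a", NOT("c")), AND("b", "c"))))
# F = ac'(b + d) + bc(a + d)
f2 = OR(AND(AND("a", NOT("c")), OR("b", "d")),
        AND(AND("b", "c"), OR("a", "d")))

same = truth_table(f1) == truth_table(f2)
size1, depth1 = len(and_nodes(f1)), levels(f1)
size2, depth2 = len(and_nodes(f2)), levels(f2)
print(same, size1, depth1, size2, depth2)  # True 6 4 7 3
```

The second structure spends one extra node to save one level, which is the kind of area/delay trade-off the slide illustrates.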


AIG: A Unifying Representation

• An underlying data structure for various computations
    • Represents both local and global functions
    • Used in rewriting, resubstitution, simulation, SAT sweeping, induction, etc.
• A unifying representation for the whole flow
    • Synthesis, mapping, and verification pass AIGs around
    • Stores multiple structures for mapping (‘AIG with choices’)
• The main functional representation in ABC
    • Foundation of ‘contemporary’ logic synthesis
    • Source of ‘signature features’ (speed, scalability, etc.)


Magic Optimization Flow

Inputting the design

Sequential synthesis

Comb synthesis with choices

Tech mapping

Retiming and resynthesis

Outputting the design

The design is entered from a file or through programmable APIs

Internal representation is based on a light-weight data structure for improved memory and runtime

Sequential synthesis is applied to detect and merge sequentially equivalent objects

Combinational synthesis and mapping are iterated several times, while saving the best result

Optionally, min-perturbation retiming and resynthesis are applied to reduce delay/area after mapping

The design is written to a file or returned through programmable APIs

Verification is performed between any two points in the flow
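The "iterate and save the best result" step of the flow can be sketched as a simple loop. Everything here is a stand-in: the `optimize` function, its callback parameters, and the toy area numbers are assumptions for illustration, not Magic's API or measured data.

```python
def optimize(design, synthesize, map_design, cost, rounds=3):
    """Iterate synthesis and mapping, remembering the cheapest mapped result."""
    best, best_cost = None, float("inf")
    current = design
    for _ in range(rounds):
        current = synthesize(current)   # restructure the unmapped logic
        mapped = map_design(current)    # map the current structure
        mapped_cost = cost(mapped)
        if mapped_cost < best_cost:     # keep the best result seen so far
            best, best_cost = mapped, mapped_cost
    return best, best_cost

# Toy stand-ins: a "design" is just a pass counter, and mapping looks up a
# made-up area for each pass.
toy_areas = {1: 12, 2: 9, 3: 11}
best, best_cost = optimize(
    design=0,
    synthesize=lambda d: d + 1,
    map_design=lambda d: {"pass": d, "area": toy_areas[d]},
    cost=lambda m: m["area"],
)
print(best, best_cost)  # {'pass': 2, 'area': 9} 9
```

Saving the best mapped result matters because a later synthesis pass is not guaranteed to map better than an earlier one, as the third toy pass shows.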


Sequential Synthesis (Motivation)

• Combinational equivalence
    • Two functions, F and G, produce the same output for all input combinations
• Sequential equivalence
    • Two functions, F and G, produce the same value for all reachable states

[Figure: two pairs of Karnaugh maps for F and G. In the first pair, the complete Boolean space is highlighted. In the second pair, the reachable state space of a 1-hot encoding is highlighted.]
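The difference between the two notions can be made concrete with a toy example. The 3-flop one-hot ring counter and the functions F and G below are assumptions chosen for illustration; they are not the functions shown on the slide.

```python
from itertools import product

def next_state(state):
    """One-hot ring counter: rotate the hot bit by one position."""
    s0, s1, s2 = state
    return (s2, s0, s1)

def reachable(init):
    """Enumerate all states reachable from the initial state."""
    seen, frontier = {init}, [init]
    while frontier:
        nxt = next_state(frontier.pop())
        if nxt not in seen:
            seen.add(nxt)
            frontier.append(nxt)
    return seen

# Two state functions that differ on unreachable states such as (0, 0, 0).
F = lambda s: s[0] | s[1]
G = lambda s: 1 - s[2]

all_states = list(product((0, 1), repeat=3))
reach = reachable((1, 0, 0))

comb_equiv = all(F(s) == G(s) for s in all_states)  # whole Boolean space
seq_equiv = all(F(s) == G(s) for s in reach)        # reachable states only
print(sorted(reach), comb_equiv, seq_equiv)
```

F and G disagree on the all-zero and all-one states, so they are not combinationally equivalent, but a one-hot machine never reaches those states, so they can be merged by sequential synthesis.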


Sequential Synthesis

• Detect, prove, and merge sequentially equivalent nodes
    • Seq equiv nodes are equivalent on reachable states
    • Special case: comb equiv nodes are equivalent in every state

[Figure: nodes A and B]

• Observations
    • Can be done using simulation and SAT (without BDDs)
    • Leads to substantial reduction for large designs (> 10% in area)
    • Works for large designs (10-15 minutes for 1M gates)

A. Mishchenko, M. L. Case, R. K. Brayton, and S. Jang, "Scalable and scalably-verifiable sequential synthesis", Proc. ICCAD'08.
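The detect, prove, and merge loop can be sketched on a toy combinational example. This is a simplification under stated assumptions: random simulation groups nodes into candidate classes, and an exhaustive check stands in for the SAT-based proof (the sequential case additionally needs induction over reachable states, which is not shown). The node names and functions are made up.

```python
import random
from collections import defaultdict
from itertools import product

# Toy single-output "nodes" over inputs (a, b, c).
nodes = {
    "n1": lambda a, b, c: (a & b) | (a & c),
    "n2": lambda a, b, c: a & (b | c),
    "n3": lambda a, b, c: a ^ b,
    "n4": lambda a, b, c: (a | b) & (1 - (a & b)),
    "n5": lambda a, b, c: a & b,
}

def candidate_classes(nodes, patterns):
    """Group nodes whose simulation signatures agree on all patterns."""
    classes = defaultdict(list)
    for name, fn in nodes.items():
        classes[tuple(fn(*p) for p in patterns)].append(name)
    return list(classes.values())

def proved_equal(f, g, num_inputs=3):
    """Stand-in for the SAT call: exhaustive check, feasible only for few inputs."""
    return all(f(*p) == g(*p) for p in product((0, 1), repeat=num_inputs))

def merge_equivalent(nodes, patterns):
    merged = []  # (redundant node, representative) pairs
    for group in candidate_classes(nodes, patterns):
        reps = []
        for name in group:
            rep = next((r for r in reps
                        if proved_equal(nodes[r], nodes[name])), None)
            if rep is None:
                reps.append(name)  # candidate disproved: new representative
            else:
                merged.append((name, rep))
    return merged

rng = random.Random(0)
patterns = [tuple(rng.randint(0, 1) for _ in range(3)) for _ in range(16)]
merged = merge_equivalent(nodes, patterns)
print(merged)  # [('n2', 'n1'), ('n4', 'n3')]
```

Simulation is cheap and only filters candidates; every merge is still backed by a proof, so a poor set of patterns costs runtime but never correctness.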


Experimental Results

[Chart: results, in percent change]

LUT: -13%

Register: -13.10%

Level: -1.50%

Results collected using a suite of 20 industrial designs

Comb Synthesis (AIG rewriting)

• Restructures AIG by applying the following transforms:
    • Rewriting / refactoring / redecomposition
    • Tree-balancing
    • Resubstitution
    • Minimization with don't-cares, etc.
• Case study: AIG rewriting
    • Pre-compute AIG subgraphs for F = abc (Subgraph 1, Subgraph 2, Subgraph 3)
    • Rewriting node A

[Figure: three pre-computed AIG subgraphs for F = abc, and node A shown with Subgraph 1 (leaves a, b, a, c) and with Subgraph 2 (leaves a, b, c)]

A. Mishchenko, S. Chatterjee, and R. Brayton, "DAG-aware AIG rewriting: A fresh look at combinational logic synthesis", Proc. DAC'06.
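The idea of choosing among pre-computed subgraphs while accounting for logic that already exists in the network can be sketched as follows. The three candidate structures, the tuple encoding, and the cost measure (number of AND nodes that would have to be added) are illustrative assumptions; they are not taken from the paper or from ABC's rewriting code.

```python
from itertools import product

def AND(x, y):
    """Two-input AND with canonical operand order, so that structurally
    identical nodes compare equal (a simple form of structural hashing)."""
    return ("and",) + tuple(sorted((x, y), key=repr))

def and_nodes(node, acc=None):
    """Collect the distinct AND nodes of a structure."""
    acc = set() if acc is None else acc
    if isinstance(node, tuple):
        acc.add(node)
        and_nodes(node[1], acc)
        and_nodes(node[2], acc)
    return acc

def evaluate(node, env):
    if isinstance(node, str):
        return env[node]
    return evaluate(node[1], env) & evaluate(node[2], env)

def truth_table(node):
    return tuple(evaluate(node, dict(zip("abc", bits)))
                 for bits in product((0, 1), repeat=3))

# Hypothetical pre-computed structures for F = abc.
candidates = {
    "(ab)(ac)": AND(AND("a", "b"), AND("a", "c")),
    "(ab)c": AND(AND("a", "b"), "c"),
    "a(bc)": AND("a", AND("b", "c")),
}

# All candidates must implement the same function.
tables = {truth_table(c) for c in candidates.values()}

# Assume the surrounding network already contains the node b AND c.
existing = and_nodes(AND("b", "c"))

# Cost of a candidate = AND nodes that are not already in the network.
cost = {name: len(and_nodes(c) - existing) for name, c in candidates.items()}
best = min(cost, key=cost.get)
print(cost, best)
```

With the shared node present, the structure a(bc) needs only one new node even though (ab)c has the same size in isolation, which is why the surrounding DAG is taken into account.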


Combinational Synthesis with Structural Choices

Perform synthesis and keep track of changes

Iterate fast local AIG rewriting with a global view (via hash table)

Collect AIG snapshots and prove equivalences across them

Use equivalences (choices) during technology mapping

Observations

Leads to improved QoR after technology mapping

Successfully applied to 1M gate designs

[Figure: traditional synthesis shown as a chain of designs D1, D2, D3, D4; synthesis with choices shown as D1, D2, and D3 feeding a HAIG that leads to D4]


Technology Mapping

Customizable structural mapping with priority cuts

Computes a small subset of cuts without impacting the QoR

Uses structural choices

Observations

Controls QoR tradeoffs

Minimizes delay/area, wire count, switching activity, etc

Successfully applied to 1M gate designs

[Figure: an AIG for f containing a choice node, and the mapped network for f built from three LUTs; both are drawn from primary inputs a, b, c, d, e to the primary outputs]

A. Mishchenko, S. Cho, S. Chatterjee, and R. Brayton, "Combinational and sequential mapping with priority cuts", Proc. ICCAD'07.
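The idea of keeping only a small subset of cuts per node can be sketched as below. This is a simplified illustration, not the algorithm from the paper: the `priority_cuts` function, the toy AIG, and the ranking rule (fewer leaves first) are assumptions, whereas a real mapper would rank cuts by delay, area, or other cost functions.

```python
from itertools import product

def priority_cuts(aig, inputs, k=4, max_cuts=3):
    """Compute at most `max_cuts` k-feasible cuts per node.

    aig: dict mapping each AND node to its two fanins, in topological order.
    """
    cuts = {pi: [frozenset([pi])] for pi in inputs}
    for node, (f0, f1) in aig.items():
        # Merge every pair of fanin cuts and drop those with too many leaves.
        merged = {c0 | c1
                  for c0, c1 in product(cuts[f0], cuts[f1])
                  if len(c0 | c1) <= k}
        # Assumed priority: fewer leaves first, ties broken by leaf names.
        ranked = sorted(merged, key=lambda c: (len(c), sorted(c)))[:max_cuts]
        # The trivial cut {node} is always kept so fanouts can build on it.
        cuts[node] = ranked + [frozenset([node])]
    return cuts

# Toy AIG: n1 = a&b, n2 = c&d, n3 = n1&n2, n4 = n3&e.
aig = {
    "n1": ("a", "b"),
    "n2": ("c", "d"),
    "n3": ("n1", "n2"),
    "n4": ("n3", "e"),
}
cuts = priority_cuts(aig, inputs="abcde")
for node in aig:
    print(node, [sorted(c) for c in cuts[node]])
```

Node n3 has four non-trivial cuts, but only the three highest-priority ones are stored, so the number of cuts considered at n4 stays bounded regardless of circuit depth.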


Minimum-Perturbation Retiming

Reduces delay, while minimizing the number of flops moved

Produces a trade-off: delay gain vs. the number of flops moved

Handles “industrial stuff”; retimes over white boxes such as adders

Computes new initial state after backward retiming

Allows the user to control the resources

Desired delay gain

Maximum allowed number of flops moved

Maximum area increase after retiming

Observations

Can be useful before and after placement

Can be implemented efficiently

• Runs in less than a minute for 1M gates

[Figure: delay versus number of flops moved]

S. Ray, A. Mishchenko, R. K. Brayton, S. Jang, and T. Daniel, "Minimum-perturbation retiming for delay optimization". Proc. IWLS'10.


Sequential Verification

• Property checking
    • Takes a design and a property and makes a miter (AIG)
• Equivalence checking
    • Takes two designs and makes a miter (AIG)
• The goal is to transform the AIG until the output can be proved constant 0
• Equivalence checking in Magic is based on the model checker that won the Hardware Model Checking Competition in 2008 and 2010: http://fmv.jku.at/hwmcc10/results.html

[Figure: property checking miter (design D1 with property p, output 0) and equivalence checking miter (designs D1 and D2, output 0)]
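A miter for equivalence checking can be sketched for the combinational case. The designs `d1`, `d2`, `d3` and the helper names are made up, and exhaustive enumeration stands in for the model checker; a sequential miter would also have to account for flops and reachable states.

```python
from itertools import product

def miter(d1, d2):
    """Return a function that outputs 1 exactly when the two designs disagree."""
    return lambda *inputs: d1(*inputs) ^ d2(*inputs)

def counterexample(m, num_inputs):
    """Return an input pattern that sets the miter to 1, or None if it is constant 0."""
    for pattern in product((0, 1), repeat=num_inputs):
        if m(*pattern):
            return pattern
    return None

# Toy designs over inputs (a, b, c).
d1 = lambda a, b, c: (a & b) | (a & c)
d2 = lambda a, b, c: a & (b | c)   # equivalent to d1
d3 = lambda a, b, c: a & b         # not equivalent to d1

cex_equal = counterexample(miter(d1, d2), 3)
cex_diff = counterexample(miter(d1, d3), 3)
print(cex_equal, cex_diff)  # None (1, 0, 1)
```

Proving the miter output constant 0 establishes equivalence, while any pattern that sets it to 1 is a counterexample that can be replayed on both designs.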


A Naïve Way to Use ABC

Convert all persistent logic to black boxes

Box IOs are treated as PI/POs in synthesis

Adverse effects

Losing the correlation of box outputs/inputs

Restricting synthesis due to broken logic paths

Not being able to propagate delays through the boxes

Sequential synthesis doesn’t work well

A Better Way to Use ABC

Clock domains

Represent clock signal in the data-base

Annotate flops with their clock-domain number in the AIG

Separate clock domains in sequential transforms

Complex controls of the flops

Use parametrized flop model

Perform elaboration of control signals if needed

Handle asynchronous reset carefully!

Industrial primitives (adders, RAMs, DSPs, etc)

Use boxes (black/white, comb/seq, merge/no_merge, etc)

Currently propagates timing information, improves quality of synthesis

Elaborate boxes for seq synthesis, but do not map them

Need better support for user-specified attributes (don’t-touch, etc.)


Experimental Setup

• Integrated Magic into an industrial FPGA synthesis flow
• Experimented with the full flow, including P&R
    • Did not use retiming
    • Did not use post-placement re-synthesis
• Verified by running Magic and in-house simulation tools
• Experimented with 20 designs, from 175K to 648K LUT4
• Two experimental runs:
    • “Reference” stands for the typical industrial flow without Magic
    • “Magic” stands for the new flow with Magic

Flow:
Frontend: design entry, high-level synthesis, quick mapping
Magic: seq and comb synthesis, mapping, legalization
Backend: placement, routing, design rule checking, etc.


Experimental Results

          Profile    Reference                               Magic
Circuit   PI   PO     LUT      FF  Lev    fMAX  Time       LUT      FF  Lev    fMAX
C1       736  369  174972  113157   12  128.53  1.05    173561  100398   10  133.87
C2       150   67  187037  112991   18   91.32  0.53    161303   93930   16   95.69
C3         4   80  199097   53954   27   68.49  0.69    137126   36190   20   75.59
C4       517  253  206725  132416   11  105.37  1.31    197029  114745    8  129.20
C5         4  280  212124   64120   26   68.82  0.65    152799   49513   19   77.70
C6       803  258  255415  166644   11  113.25  2.08    255026  148445    8  123.00
C7        24   10  296152  133704   17   89.93  0.72    246908  114002   14  120.48
C8       124   58  323818   86712   32   40.68  1.99    346516   86662   25   47.08
C9       268  132  413017  195150   18   81.50  1.40    375481  174306   15   79.81
C10      205   94  439963  134139   20   63.17  3.55    445950  133575   15   69.06
C11      148  456  455429  160450   96   27.53  2.23    398428  149126   56   33.11
C12        4    3  455630   20277    6   66.67  0.78    152414   19446    6  100.40
C13        4  240  470436  230811   28   53.59  3.30    462010  225676   18   57.34
C14      218   69  522988  311436   17   68.78  1.83    448426  257996   15   69.40
C15      377  183  575355  351911   10  136.05  2.59    575672  349715    8  136.99
C16       73   33  599413  216051    4  202.02  1.07    599413  216051    4  209.21
C17      136   66  618377  259844   56   47.66  2.75    562367  243084   34   53.53
C18      136   66  621875  249327   27   45.68  4.60    606135  247825   27   52.58
C19      146  391  630918  275871   55   46.36  2.50    572834  259336   36   50.76
C20      135   32  648849  353940    7  127.71  2.45    645501  353616    5  136.43

Magic Time column, in the order the values appear in the source (not matched to circuits): 1.61, 2.64, 1.90, 0.41, 6.18, 2.19, 2.95, 1.79, 0.70, 0.67, 0.77, 0.67, 0.74, 1.00, 0.90, 1.94, 2.61, 4.03, 2.51, 2.91

                        LUT      FF    Lev    fMAX   Time
Geomean, Reference   377883  150015  18.54  74.768  1.591
Geomean, Magic       329751  135972  14.40  83.572  1.541
Ratio, Reference          1       1      1       1      1
Ratio, Magic          0.873   0.906  0.777   1.118  0.969


Cumulative Improvement (retiming excluded)

[Chart: QoR change, in percent]

fMAX: 11.80%

LUT count: -12.70%

Registers: -9.40%

Levels: -22.30%

Total runtime: -3.10%

P&R runtime: -50%
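The percentages above follow from the Geomean and Ratio rows of the results table. The short sketch below recomputes them; the variable names are illustrative, and the numbers are copied from the table (the P&R runtime figure is not derivable from the table and is left out).

```python
import math

def geomean(values):
    """Geometric mean, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Geomean row of the results table.
reference = {"LUT": 377883, "FF": 150015, "Lev": 18.54, "fMAX": 74.768, "Time": 1.591}
magic = {"LUT": 329751, "FF": 135972, "Lev": 14.40, "fMAX": 83.572, "Time": 1.541}

# Ratio row: Magic geomean divided by Reference geomean.
ratios = {k: round(magic[k] / reference[k], 3) for k in reference}

# Cumulative improvement: the ratio expressed as a percent change.
changes = {k: round((r - 1) * 100, 1) for k, r in ratios.items()}

# Cross-check of one geomean entry from the per-circuit Reference "Lev" column.
ref_levels = [12, 18, 27, 11, 26, 11, 17, 32, 18, 20,
              96, 6, 28, 17, 10, 4, 56, 27, 55, 7]
ref_lev_geomean = geomean(ref_levels)

print(ratios)
print(changes)
print(ref_lev_geomean)
```

A geometric mean is the natural choice here because the designs span a wide range of sizes, and it keeps the largest circuits from dominating the average.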

Future Work

• Continue to improve application packages
    • AIG rewriting, tech-mapping, sequential synthesis, etc.
• Improve integration of logic and physical synthesis
    • Synthesis/mapping/retiming before placement
    • Retiming/restructuring after placement
• Extend the flow to work for other technologies
    • Macro cells
    • Standard cells
