Prof. Kurt Keutzer
EECS keutzer@eecs.berkeley.edu
Introduction to Kurt … and others
Where does CAD fit in?
Brief overview of CAD
Goals of Course
Project
Professor in EECS
B. S. in mathematics from Maharishi International University 1978
(yes, I’m serious)
M. S. in
Ph.D. in CS from Indiana University 1984
AT&T Bell Labs, Area 11, 1984-1991
  Developed a number of (internally) successful tools for hardware developers
    Plaid – Programmable Logic AID – used to create racks of switching system hardware
    DAGON – worked with Chuck Stroud and Mark Vancura to create a logic synthesis system for Bell Labs – dozens of ICs developed with the system
Synopsys, Inc. 1991-1998 (now 14th largest software company)
  From Member of Research Staff of a $30M, 200-person company to SVP/CTO of a $600M, 3000-person company in 7 years
  As CTO, oversaw and reviewed technology of over 25 software products accounting for $600M in revenue
    Identified new technology and market opportunities
    Initiated and participated in a dozen corporate acquisitions
  As Manager => Director => VP => SVP of research
    Initiated a number of product ideas and two complete products:
    FPGA Express – FPGA synthesis software – brought to “product roll-out”
    Formality – market leader in formal verification of circuits –
UC Berkeley 1998-present
  Professor of EECS
  As teacher – EECS 244 (Intro to CAD), CS169 (Software Engineering)
  Associate Director – Gigascale Systems Research Center 1998-2001
  As a research advisor
    MESCAL: modern embedded systems, compilers, architectures, and languages – 8 students
As an entrepreneur:
  Cadabra (acquired by Numerical Technologies 2000, acquired by Synopsys 2003) – investor/Corporate Board
  Everest Design (acquired by SNPS, 1999) – investor/TAB
  Right Track CAD (acquired by Altera, 2000) – angel investor/TAB
  0-in Design Automation – Series A investor 1998/TAB, acquired by Mentor Graphics 8/2004
  Tensilica, Inc (upside top 100), Series A investor 1998/TAB
  Catalytic Compilers – angel investor/TAB – founded Fall 2002, $6M in funding from NEA July 2003
  Stretch Inc. – Series A investor/TAB – founded 2002, $15M in funding from Worldview, July 2003
As a consultant: Cadence (2001-present), Synopsys (1998 – 2000), Ammocore, C-Cube Microsystems (IPO), CoWare, Hier Design (acquired by Xilinx), Reshape, a number of venture capital firms

Where does CAD fit in?
The world is increasingly dependent on electronic systems
The “first world” is entirely dependent on electronic systems
  World economy: $33.4 trillion (http://www.imf.org/)
  Electronic systems: $1 trillion
Sources: Gartner Group/Dataquest, Rose Associates; January, 2000 http://www.facsnet.org/tools/sci_tech/tech/biz/

Electronic systems are entirely dependent on semiconductor components
  Electronic systems: $1 trillion
  Semiconductor industry: $160 billion
Sources: Gartner Group/Dataquest, Rose Associates; January, 2000 http://www.facsnet.org/tools/sci_tech/tech/biz/
[Figure: layered diagram: Real World, Electronic Systems, Integrated Circuit, Computer-aided Design, Silicon Foundries]
[Figure: a designer using CAD sits between the real world and electronic systems on one side and the semiconductor industry and silicon foundries on the other. CAD areas shown, from system level down to process level:]
  Speech processing
  Signal processing
  Performance analysis
  System modeling and synthesis
  Formal verification
  Logic synthesis
  Place and route
  Circuit simulation
  Device simulation
  Process modeling
[Figure: the same diagram, with example tools in place of the CAD areas:]
  Matlab
  Labview
  SystemC
  Metropolis
  Ptolemy
  Design Compiler
  Physical Compiler
  Apollo
  BSIM
  Spice
  Pisces
Electronic systems and semiconductor components are entirely dependent on computer-aided design/electronic design automation tools
  Electronic systems: $1 trillion
  Semiconductor industry: $160B
  EDA industry: $3B
Sources: Gartner Group/Dataquest, Rose Associates; January, 2000 http://www.facsnet.org/tools/sci_tech/tech/biz/
[Figure: pie chart, “Entire Technology Sector $1290B”, share by revenue (TTM)]
  Transportation: 2%
  Technology: 9%
  Utilities: 6%
  Basic: 5%
  Goods: 4%
  Conglomerates: 4%
  Consumer Cyclical: 11%
  Services: 22%
  Healthcare: 5%
  Financial: 15%
  Energy: 11%
  Consumer/Non-Cyclical: 6%
[Figure: pie chart, “Entire Technology Sector $1290B”, TTM revenue breakdown; the slice values sum to 1,290,319, i.e. they are in $ millions]
  Software & Programming: 127,737
  Comm Eqpt: 226,929
  Semis: 147,624
  Sci & Tech Instr: 31,610
  Computer Hardware: 201,226
  Office Eqpt: 49,305
  Electronic Instr. & Controls: 144,429
  Computer Storage Devices: 27,023
  Computer Services: 192,348
  Computer Networks: 15,805
  Computer Peripherals: 126,283
Source: Multex Investor – bottom up TTM
[Figure: pie chart, “Software Market (not incl services) $100.9B”]
  Technical & System Software: $10.44B
  Security Software: $2.83B
  Multimedia & Graphics Software: $8.18B
  Business Software: $5.33B
  Application Software: $74.11B

[Figure: pie chart, “Software & Services Market $229.8B”]
  Multimedia & Graphics Software: $8.18B
  Security Software & Services: $2.83B
  Technical & System Software: $10.44B
  Information & Delivery Services: $59.72B
  Healthcare Information Services: $3.41B
  Business Software & Services: $55.75B
  Application Software: $74.11B
Rank  Corporation           Ticker  Market  Rev    Marg    P/E   Price  High   Low
14.   Synopsys, Inc         SNPS    $5051   $1106  (16.2)  NM    $65.0  $65.5  $31.8
15.   Amdocs Ltd            DOX     $4598   $1427  8.1     40.3  $21.3  $27.3  $5.85
16.   Siebel                SEBL    $4595   $1418  (8.2)   NM    $9.30  $12.2  $5.33
17.   Check Point Software  CHKP    $3974   $425   58.2    16.6  $16.2  $22.2  $12.6
18.   Cadence Design        CDN     $3543   $1136  6.2     50.8  $13.0  $15.6  $8.65
19.   Mercury Interactive   MERQ    $3398   $444   15.1    52.7  $39.8  $45.6  $15.2
20.   Verisign              VRSN    $3307   $1112  (28.4)  NM    $13.9  $16.1  $3.92
World economy > $33 Trillion
Electronic systems > $1 Trillion
Semiconductor > $160B
CAD $3B
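
The nesting of these four markets can be made concrete with a little arithmetic. The sketch below only divides the dollar figures quoted on the preceding slides; the dictionary and function names are illustrative, not part of the original material.

```python
# Market sizes in billions of US dollars, as quoted on the slides above.
MARKETS_BILLION_USD = {
    "World economy": 33_400.0,
    "Electronic systems": 1_000.0,
    "Semiconductors": 160.0,
    "CAD/EDA": 3.0,
}


def leverage_ratios(markets):
    """Return (outer, inner, ratio) for each consecutive pair of layers."""
    names = list(markets)
    return [
        (outer, inner, markets[outer] / markets[inner])
        for outer, inner in zip(names, names[1:])
    ]


for outer, inner, ratio in leverage_ratios(MARKETS_BILLION_USD):
    print(f"{outer} is about {ratio:.1f}x the size of {inner}")
```

Each layer is small next to the layer it enables, which is the point of the summary that follows.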
Computer-aided design (CAD)/Electronic design automation (EDA) enables electronic systems, and electronic systems enable the world economy
CAD/EDA software companies are big players in the world software market, but modestly sized relative to the industries they serve

Brief overview of CAD
[Figure: transistors per microprocessor (log scale, 1 to 10M) versus year (1975 to 1995), showing the 4004, 8080, 8086, 68000, 68020, 80386, 80486, 68040, MIPS R4000, PPC601, Pentium, PPC603, and Pentium Pro]
[Figure: clock speed in GHz (0 to 10.0) versus year (1997 to 2012), with curves for on-chip global clock and on-chip local clock, high performance]
[Figure: processor complexity (transistors, log scale, 1975 to 1995, from the 4004 to the Pentium Pro) plotted against average human IQ (intelligence quotient axis, 50 to 180)]
Because the capability of integrated circuit technology scales so rapidly, traditionally we have had:
  Exponentially more devices every process generation
  Exponential increases in speed every process generation
Will these trends continue?
After a few process generations we need to do something fundamentally different
CAD is not a field you can relax in!
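
To see why a few process generations are enough to force a change of method, here is a minimal sketch of compound growth. The starting count of 1,000 devices and the growth factor of 2 per generation are assumptions chosen for illustration, not figures from the slides.

```python
def devices_after(generations, start_devices=1_000, growth_per_generation=2.0):
    """Device count after compound growth over a number of process generations.

    start_devices and growth_per_generation are hypothetical defaults.
    """
    return start_devices * growth_per_generation ** generations


for generation in range(0, 11, 2):
    print(f"generation {generation:2d}: {devices_after(generation):>12,.0f} devices")
```

Under these assumptions, a tool that is comfortable at generation 0 faces a design roughly a thousand times larger ten generations later.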
[Figure: McKinsey S-curve of results (design productivity) versus effort (EDA tools effort), with year markers 1978, 1985, 1992, and 1999. Successive design-entry eras: transistor entry (Calma, Computervision), schematic entry (Daisy, Mentor, Valid), RTL synthesis (Cadence, Synopsys), then “What’s next?”. Inset: a mux (inputs a, b, select s) driving input d of an element clocked by clk with output q]
Key tools:
  Transistor-level layout – e.g. Calma workstation
  Transistor-level simulation – e.g. Spice
  Bonus: transistor-level compaction – e.g. Cabbage
Size of circuits: 10s of transistors to a few thousand
Key abstractions and technologies:
  Transistor-level modeling
  Logical gates – NAND, NOR, FF – and cell libraries
  Compaction
Key tools:
  Gate-level layout editor – Daisy, Mentor, Valid workstation
  Gate-level simulator
  Automated place and route
Size of circuits: 3,000 – 35,000 gates (12,000 to 140,000 transistors)
Key abstractions and technologies:
  Logic-level simulation
  Cell-based place and route
  Static-timing analysis
[Figure: schematic of Add_full_0_delay (inputs a, b, c_in; outputs sum, c_out) built from two Add_half_0_delay blocks and an OR gate, with internal signals w1 = a ⊕ b, w2 = ab, w3 = (a ⊕ b) c_in, and c_out = (a + b) c_in + ab]
Border between transistor domain (analog) and digital domain
Digital gate-level models introduced to speed up digital simulation.
Gate-level model contains:
  Logic behavior
  Delays depending on: operating conditions, process, loading, signal slew rates
  Setup and hold timing violation checks
Gate-level model parameters extracted from transistor-level simulations and characterization of real gates.
J. Christiansen, CERN - EP/MIC, Jorgen.Christiansen@cern.ch
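
The three ingredients of a gate-level model listed above can be sketched in a few lines. This is an illustration under stated assumptions: the linear delay formula, the `GateModel` class, its coefficient names, and the single `derate` factor standing in for process and operating conditions are all hypothetical; real cell libraries use characterized tables.

```python
from dataclasses import dataclass


@dataclass
class GateModel:
    """Hypothetical linear delay model for one logic gate."""

    intrinsic_delay_ns: float  # delay with no load and an ideal input edge
    delay_per_ff_ns: float     # extra delay per femtofarad of output load
    delay_per_slew_ns: float   # extra delay per nanosecond of input slew

    def delay_ns(self, load_ff, input_slew_ns, derate=1.0):
        """Propagation delay; derate lumps process and operating conditions."""
        base = (
            self.intrinsic_delay_ns
            + self.delay_per_ff_ns * load_ff
            + self.delay_per_slew_ns * input_slew_ns
        )
        return derate * base


def nand2(a, b):
    """Logic behavior of a 2-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


def check_setup_hold(data_change_ns, clock_edge_ns, setup_ns, hold_ns):
    """Return 'setup', 'hold', or None for one data change near one clock edge."""
    if clock_edge_ns - setup_ns < data_change_ns <= clock_edge_ns:
        return "setup"
    if clock_edge_ns < data_change_ns < clock_edge_ns + hold_ns:
        return "hold"
    return None


gate = GateModel(intrinsic_delay_ns=0.1, delay_per_ff_ns=0.01, delay_per_slew_ns=0.2)
print(nand2(1, 1), gate.delay_ns(load_ff=10, input_slew_ns=0.5))
print(check_setup_hold(9.9, clock_edge_ns=10.0, setup_ns=0.2, hold_ns=0.1))
```

The coefficients play the role of the parameters that, per the slide, are extracted from transistor-level simulation and characterization of real gates.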
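[Figure: design hierarchy of a 4-bit ripple adder: four full adders (FA); each full adder is built from two half adders and an OR gate; each half adder is built from an XOR and an AND gate. Source: Ciletti, M. D.]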
Key tools:
  Hardware-description language simulator – Verilog, VHDL
  Logic synthesis tool – Synopsys
  Automated place and route – Cadence, Avant!, Magma
Size of circuits: 35,000 gates to …?
Key abstractions and technologies:
  HDL simulation
  Logic synthesis
  Cell-based place and route
  Static-timing analysis
  Automatic test-pattern generation

module Half_adder (Sum, C_out, A, B);
  output Sum, C_out;
  input A, B;
  xor M1 (Sum, A, B);    // sum bit
  and M2 (C_out, A, B);  // carry bit
endmodule

module Full_adder (sum, c_out, a, b, c_in);
  output sum, c_out;
  input a, b, c_in;
  wire w1, w2, w3;
  Half_adder M1 (w1, w2, a, b);       // w1 = a ^ b, w2 = a & b
  Half_adder M2 (sum, w3, w1, c_in);  // adds c_in to the partial sum w1
  or M3 (c_out, w2, w3);              // a carry from either stage
endmodule

module Full_Adder_4 (sum, c_out, a, b, c_in);
  output [3:0] sum;
  output c_out;
  input [3:0] a, b;
  input c_in;
  wire c_in2, c_in3, c_in4;           // carries between bit positions
  Full_adder M1 (sum[0], c_in2, a[0], b[0], c_in);
  Full_adder M2 (sum[1], c_in3, a[1], b[1], c_in2);
  Full_adder M3 (sum[2], c_in4, a[2], b[2], c_in3);
  Full_adder M4 (sum[3], c_out, a[3], b[3], c_in4);
endmodule
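
A quick way to convince yourself that the hierarchy is wired correctly is to model it in software and compare against ordinary integer addition. The sketch below is a bit-level Python model written for this purpose, not part of the original slides; it follows the same structure (half adder, full adder from two half adders and an OR, four full adders in a ripple chain), with the second half adder taking the first half adder's sum output.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def full_adder(a, b, c_in):
    """Full adder from two half adders and an OR gate."""
    w1, w2 = half_adder(a, b)      # w1 = a xor b, w2 = a and b
    s, w3 = half_adder(w1, c_in)   # add the carry-in to the partial sum
    return s, w2 | w3


def ripple_adder_4(a, b, c_in=0):
    """Add two 4-bit integers (0..15); return (4-bit sum, carry out)."""
    carry = c_in
    total = 0
    for bit in range(4):
        s, carry = full_adder((a >> bit) & 1, (b >> bit) & 1, carry)
        total |= s << bit
    return total, carry


# Exhaustive check against integer addition: 16 * 16 * 2 cases.
for a in range(16):
    for b in range(16):
        for c_in in (0, 1):
            total, c_out = ripple_adder_4(a, b, c_in)
            assert (c_out << 4) | total == a + b + c_in
print("all 512 cases match")
```

Exhaustive checking works here only because the input space is tiny; for real designs this is the job of HDL simulation and formal verification.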
[Figure: the Verilog description above shown alongside its schematic: Add_full_0_delay (inputs a, b, c_in; outputs sum, c_out) built from two Add_half_0_delay blocks and an OR gate]
[Figure: the design-productivity S-curve again: transistor entry, schematic entry, RTL synthesis, “What’s next?”, with year markers 1978, 1985, 1992, and 1999]
[Figure: schematic of a 2-to-1 mux (inputs a and b, select s) driving input d of an element clocked by clk with output q]
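[Figure: design effort in staff months (62.5, 125, 625, 6250, 62,500) at the behavioral (Beh), RTL, and lower levels of abstraction, with power, delay, and area as quality axes. Annotations at the higher levels: “Implementations here are often not good enough” and “Because implementations here are inferior/unpredictable”]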
[Figure: speed of designer versus quality of design, showing an acceptance curve with combinational synthesis, sequential synthesis, behavioral synthesis, and the expert designer. Source: A. DeGeus]
module foobar (q, clk, s, a, b);
  input clk, s, a, b;
  output q;
  reg q;
  reg d;

  always @(a or b or s) // mux
  begin
    if (!s) d = a;
    else if (s) d = b;
    else d = 'bx;
  end // always @ (a or b or s)

  always @(clk) // latch
  begin
    if (clk == 1) q = d;
    else if (clk !== 0) q = 'bx;
  end // always @ (clk)
endmodule
[Figure: RTL synthesis flow: HDL, RTL synthesis, netlist, logic optimization, netlist, physical design, layout, with library/module generators feeding the flow and HDL simulation applied to the HDL description (the Full_Adder_4 module shown earlier)]
: specify and enter the design intent

Goals of Course
Help to develop the core competences of a CAD engineer
  Software expertise
  Algorithmic facility
  Domain expertise in IC design
Communicate the essence of the current IC design flow in a semester
  Goal: “If Avant!, Cadence, and Synopsys employees were all abducted by aliens, their software could be recreated by this class.”
Prepare you for performing publishable research – aim high, a real publication!

Processing, Devices students – understand the tool flow, examine ways of bridging the gap between processing, design, and CAD
Circuits students – understand how the tools that you will be using for the rest of your life work
CAD students – give you foundation material for the field, prepare you for preliminary examinations
Theory types – understand how algorithms are applied in this algorithm-rich area

Each week
  Examine a portion of the IC design flow
  Identify one or more key problems
  Formulate the problem mathematically
  Solve the problem, examining trade-offs between
    The computational efficiency of the algorithms
    The quality/optimality of the result
  Look at contemporary practice
  See how closely the classroom work approaches industrial practice
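
The trade-off between computational efficiency and quality of result can be seen on even a toy problem. As one hypothetical instance (not taken from the course material), the sketch below splits a 6-node netlist into two equal halves so as to cut as few two-pin nets as possible, once by exhaustive search (optimal, exponential time) and once by a simple swap-based local search (fast, no optimality guarantee). The netlist and function names are made up for illustration.

```python
from itertools import combinations

# Hypothetical netlist: two triangles {0, 2, 4} and {1, 3, 5} joined by one net.
N_NODES = 6
EDGES = [(0, 2), (0, 4), (2, 4), (1, 3), (1, 5), (3, 5), (1, 4)]


def cut_size(left, edges):
    """Number of edges with exactly one endpoint in the set `left`."""
    return sum((u in left) != (v in left) for u, v in edges)


def exhaustive_bisection(n_nodes, edges):
    """Optimal balanced split: tries every half-size subset (exponential time)."""
    best_cost, best_left = None, None
    for subset in combinations(range(n_nodes), n_nodes // 2):
        cost = cut_size(set(subset), edges)
        if best_cost is None or cost < best_cost:
            best_cost, best_left = cost, set(subset)
    return best_cost, best_left


def local_search_bisection(n_nodes, edges):
    """Heuristic: start from nodes 0..n/2-1, apply improving swaps until stuck."""
    left = set(range(n_nodes // 2))
    improved = True
    while improved:
        improved = False
        current = cut_size(left, edges)
        for u in sorted(left):
            for v in sorted(set(range(n_nodes)) - left):
                trial = (left - {u}) | {v}
                if cut_size(trial, edges) < current:
                    left, improved = trial, True
                    break
            if improved:
                break
    return cut_size(left, edges), left


print("exhaustive:", exhaustive_bisection(N_NODES, EDGES))
print("local search:", local_search_bisection(N_NODES, EDGES))
```

The exhaustive version examines every balanced subset, which is hopeless beyond a few dozen nodes; the local search scales far better but can stop at a split that is not the best one.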

EECS 244
Cory 521, Monday, Wednesday 1:00 – 2:30 PM
Prof. Kurt Keutzer, Cory 566, Office hour: Wednesday 2:30 – 3:30, or by appointment, keutzer@eecs.berkeley.edu
Exam 1: 30%
Exam 2: 30%
Final project: 40% (20% general content, 10% content in presentation, 10% content in written report)
No TA for course – still working out web pages, pdf etc.
Syllabus, Web page: up soon
The course material will not be hard for you – but the project may be …

Project
Your first couple of years of graduate school are about making the transition from
  Excellent course/test taker → creative researcher
  Solitary student → active team member
  Assimilating well-defined information → pursuing open questions
The project portion of the course is to help you make this transition

Motivation
Problem statement
Investigative approach
Results
Summary
Conclusions
Future Work

Research will be conducted in groups of 2-3
Individual projects highly discouraged
Research may be coordinated with other class projects: EECS249, EECS290N etc.
Research will culminate in a:
  PowerPoint presentation/demo
  Written report

Use your skill set
Circuits, devices, processing, software development, system-level applications
Great idea:
Topical – e.g. system level, deep submicron effects, power
Tractable – can make an impact in a semester, have all the software, examples, data files that you need
Get started early
Get mentorship (senior grad students, post-docs, prof of course)
Follow deadlines
Formulate the problem clearly
Formulate your results clearly
“Getting to the Bottom of Deep Submicron”, D. Sylvester, K. Keutzer, in Proceedings of the International Conference on Computer-Aided Design, November 1998, pp. 203-211.
“Towards True Crosstalk Noise Analysis”, P. Chen, K. Keutzer, in Proceedings of the International Conference on Computer-Aided Design, November 1999, pp. 132-137.
“Impact of Systematic Spatial Intra-Chip Gate Length Variability on Performance of High-speed Digital Circuits”, M. Orshansky, L. Milor, P. Chen, K. Keutzer, C. Hu, in Proceedings of the International Conference on Computer-Aided Design, November 2000, pp. 62-67.
“Bus Encoding to Prevent Crosstalk Delay”, B. Victor, K. Keutzer, in Proceedings of the International Conference on Computer-Aided Design, November 2001.
“Constraint Driven Communication Synthesis”, A. Pinto, L. Carloni, A. Sangiovanni-Vincentelli, DAC 2002.
“Multi-Domain Clock Skew Scheduling”, Kaushik Ravindran, Andreas Kuehlmann, Ellen Sentovich, ICCAD 2003.

“Constraint Driven Communication Synthesis”, A. Pinto, L. Carloni, A. Sangiovanni-Vincentelli, DAC 2002
Problem:
  Addresses the design of the communication architecture of a complex system from a library of pre-defined Intellectual Property (IP) components. The key communication parameters that govern all the point-to-point interactions among system modules are captured as a set of arc constraints in the communication constraint graph.
  Similarly, the communication features offered by each of the components available in the IP communication library are captured as a set of feature resources together with its cost figures.
Solution:
  Each communication architecture that can be built using the available components while satisfying all constraints is implicitly considered (as an implementation graph matching the constraint graph) to derive the optimum design solution with respect to the desired cost figure.
  The corresponding constrained optimization problem is efficiently solved.

“Bus Encoding to Prevent Crosstalk Delay”, B. Victor, K. Keutzer.
Problem:
  Delay in global wires is a large factor in overall circuit performance
  Capacitive coupling (cross-talk) between wires may greatly increase this delay
Solution:
  Old solution: place shielding wires between signal wires
    Large (2X) area penalty
  Proposed solution: use data encoding to eliminate coupling between neighboring bus lines
    Smaller (1.25X) area penalty
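
The encoding idea can be illustrated with a small sketch. It assumes that the harmful case is two neighboring wires switching in opposite directions in the same cycle, and it builds a codebook greedily; both the greedy construction and the function names are illustrative assumptions, not the algorithm from the paper, and the greedy codebook is not guaranteed to be the largest possible.

```python
from itertools import product


def has_opposing_neighbors(prev_word, next_word):
    """True if some adjacent wire pair switches in opposite directions."""
    for i in range(len(prev_word) - 1):
        left = next_word[i] - prev_word[i]            # +1 rise, -1 fall, 0 steady
        right = next_word[i + 1] - prev_word[i + 1]
        if left * right == -1:
            return True
    return False


def greedy_codebook(n_wires):
    """Collect words so that no transition between two codewords is harmful."""
    codebook = []
    for word in product((0, 1), repeat=n_wires):
        if all(not has_opposing_neighbors(word, other) for other in codebook):
            codebook.append(word)
    return codebook


for n_wires in range(2, 6):
    book = greedy_codebook(n_wires)
    print(f"{n_wires} wires: {len(book)} codewords usable out of {2 ** n_wires}")
```

Because fewer words are usable than the wires could carry unencoded, the bus needs some extra wires; the claim on the slide is that this overhead is smaller than the doubling caused by shielding every signal wire.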

“Getting to the Bottom of Deep Submicron”, D. Sylvester, K. Keutzer
Problem:
  Total chip wiring length increases dramatically
  It is claimed that wire delay contribution becomes dominant
  Traditional design flows (as we will see) treat wire contribution as something secondary
  It is claimed that traditional design flows will thus fail in deep submicron
Solution:
  Need to perform careful analysis of local and global interconnect
  If design is done on blocks of 50K-100K gates, traditional flows will work

“Impact of […] Variability on Performance of High-speed Digital Circuits”, M. Orshansky, L. Milor, P. Chen, K. Keutzer, C. Hu, ICCAD 2000
Problem:
  Chip performance depends on path delay distribution and on critical path values
  Manufacturing of advanced technologies inevitably creates large variations in properties of transistors
  Neglecting this variability and uncertainty leads to underperforming and malfunctioning circuits
Solution:
  A probabilistic model is derived to predict critical path degradation due to uncertainty
  Circuit and process-level fixes proposed for most critical factors
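
Why variability degrades the critical path can be seen with a small Monte Carlo experiment. This is a sketch under simplifying assumptions that are not from the paper: path delays are independent, Gaussian, and share the same nominal value, and the chip's delay is the slowest of its near-critical paths. The function names and the 5% sigma are hypothetical.

```python
import random
import statistics


def simulate_chip_delay(n_paths, nominal, sigma, rng):
    """Delay of one chip: the slowest of n_paths independently varying paths."""
    return max(rng.gauss(nominal, sigma) for _ in range(n_paths))


def mean_chip_delay(n_paths, nominal=1.0, sigma=0.05, trials=2000, seed=0):
    """Average chip delay over many simulated chips (fixed seed for repeatability)."""
    rng = random.Random(seed)
    return statistics.fmean(
        simulate_chip_delay(n_paths, nominal, sigma, rng) for _ in range(trials)
    )


for n_paths in (1, 10, 100):
    print(f"{n_paths:3d} near-critical paths: mean delay {mean_chip_delay(n_paths):.3f}")
```

Even though every path has the same nominal delay, the expected chip delay grows with the number of near-critical paths, because the maximum of many random values sits in the upper tail of the distribution.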

Computer-aided design (CAD)/Electronic design automation (EDA) enables electronic systems, and electronic systems enable the world economy
CAD/EDA software companies are big players in the world software market, but modestly sized relative to the industries they serve
Although many “higher productivity” design flows have been developed, the industry continues to rely on the RTL synthesis flow
Essence of being a CAD engineer is the ability to:
  Build large, robust software systems that
  Embody sophisticated CAD algorithms, targeted toward
  Solving real IC design problems

More on projects
Overview of the RTL design flow