Digital VLSI Design - University of Hartford's Academic Web Server

Digital VLSI Design
Design of Very Large Scale Integrated Digital circuits using CAD tools
http://uhaweb.hartford.edu/ilumokanw
Syllabus
University of Hartford – College of Engineering
Electrical Engineering Department
ECE565 Digital VLSI Design
Fall 2005
Professor:
Dr. Abby Ilumoka, Room UT 235, Ph: (860) 768-5231
Email: ilumokanw@hartford.edu
Website: http://uhaweb.hartford.edu/ilumokanw
Class Time:
Tue Thu, 4.15-5.30pm
Office Hrs:
Wed 2-3.30pm, Tues, Thur 1.30pm – 2.30pm
(other consultation by appointment)
Credit Hours 3
Lecture Hours 1.75hr/wk Laboratory Hours 0.75hr/wk
Prerequisites/Co-requisites
Digital System Logic (EE231), Digital Laboratory (EE232), Electronics Circuits
(EE362), Electronics Lab II (EE364), Senior or graduate standing
Textbook
Digital Integrated Circuit Design by Martin, Oxford Publishing
References
CMOS Digital and Analog Circuit Design by John Uyemura, Oxford Publishing
Software Tanner VLSI Design Suite: LEDIT Pro Full Custom Layout Editor,
TSPICE Pro Circuit Simulator, UPLib, CMOS Lib, SEDIT Schematic Editor,
LVS Netlist Comparator
Syllabus (contd)
Bulletin Description
Techniques for CMOS digital integrated circuit design at circuit,
subsystem and system levels. CAD tools for design from schematic
capture to physical layout. Design methodologies – programmable
logic, standard cell, full custom; CMOS fabrication technology;
design issues – speed, power, reliability, testability; CMOS design
case studies. Laboratory project.
Course Outcomes
On completing this course, students will be able to design
state-of-the-art digital integrated circuits. They will have acquired
in-depth knowledge of VLSI design constraints, as well as the degrees
of design freedom available to them, enabling standard cell and full
custom design of digital integrated circuits using both mask and
netlist level tools.
Assessment
3 X 75min Exams. Each exam counts 25% toward final grade.
Cell Library Design counts 25%
Other Course Information
Exam Dates: Exam 1 Sep 27, Exam 2 Oct 27, Exam 3 Nov 17, 4.15 –
5.30pm (Final), Mini-Projects due Fri Dec 16
TOPICS
• Introduction and MOSFET Electrical Properties
• Design Methodology (Fabrication)
• Digital System Building Blocks
• Design of Microprocessor datapath
• VLSI Circuit Concepts (R,C Delays and Crosstalk)
• Partitioning, Floorplanning and Placement
• Grid Global and Channel Routing
• VLSI Circuit Optimization and Testing
• Supplementary Topics
Historical: 2003 Technology
Intel Itanium Line 64-bit dual-processor chips
• Itanium Deerfield - low-power 1GHz Itanium 2 processor
• Consumed about half as much power (62 watts) as predecessor
• For lower-cost systems, power conservation important ($744)
• Itanium's Madison 1.4 GHz processor, 1.5M bytes of level 3
cache, cost $1,172
• For systems running at least two processors
• Supercomputing-like performance for the scientific and
technical markets.
Historical: 2004 Technology
• World’s highest performance 2004 desktop processor - Intel
Pentium 4
• Operated @ 2.8-3.4GHz
• Built with 0.13um technology, 533MHz system bus
• Hyper-pipelined technology - longer pipeline boosts speed
• Intel released retooled version of Pentium 4 code-named
Prescott - came with 31-stage pipeline, functions like internal
assembly line (older Pentium 4s had only 20-stage pipeline;
Pentium III had ten-stage pipeline)
• Intel developed Pentium M - energy-efficient chip for
notebooks, shared characteristics of both Pentium III & 4
• Pentium 4s feature enhanced floating point and multimedia
performance for the Digital Lifestyle – reduced time required to
encode digital media e.g. music, pictures, movies
• Processor cost = $508 in 2004, slashed by average $200 in 2005
2004 Intel Itanium
Low Power
Power Headaches
• Problem of heat dissipation in modern semiconductors is
causing manufacturers like Intel to abandon faster clock speeds
• Over past decades engineers have scaled microprocessors
to smaller dimensions in accordance with Moore’s Law, so
that today some elements are only a few layers of atoms
thick. Thinness of structures contributes to power
headaches - current leakage, power consumption and high
operating temperatures.
• High power consumption generates unwanted heat and
decreases battery life of portable devices like notebooks
and handhelds. The well-known leakage problem gets
worse with successive process generations
• Big dilemma for entire semiconductor industry.
• Latest Intel® Pentium® 4 processors with over 125 million
transistors built on 90nm process technology consume as
much as 100 watts (like a glowing 100W light bulb – ouch!)
• Today’s PCs - Large cooling elements, noisy fans, and
massive heat sinks
• Solution??
Eureka! Enter Multicore Technology
• Dual-core and multicore chips change the game
• By placing more than one computational engine or core on each die,
Intel can continue to add more and more transistors to its
processors and diminish troublesome effects of processor scaling.
• Intel plans to run dual-core chips at lower frequencies than single
core chips so they’ll require lower voltage and throw off less heat
• Two cores on a single chip will enable a processor to do more
without a proportional increase in power
• Dual-core chips not the same as dual-processor systems. Many
servers today have two or more processors on same motherboard.
These dual-processor or multi-processor systems widely used in
enterprise computing environments
• By contrast, dual-core components have two complete processor
cores inside each package - big manufacturing change from today's
single-core chips
• Promises temporary relief from power and thermal challenges
threatening processor performance
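A rough way to see why two slower cores can beat one fast core on power is the standard first-order CMOS switching-power relation P = a·C·V²·f. The sketch below is only an illustration: the capacitance, voltage and frequency values are assumed, not Intel figures, and leakage power is ignored.

```python
def dynamic_power(c_eff, vdd, freq, activity=1.0):
    """First-order CMOS switching power: P = activity * C * Vdd^2 * f (watts)."""
    return activity * c_eff * vdd ** 2 * freq

# Hypothetical numbers chosen only for illustration.
C_EFF = 15e-9   # effective switched capacitance per core, farads (assumed)
VDD = 1.4       # supply voltage, volts (assumed)
FREQ = 3.6e9    # clock frequency, hertz (assumed)

single = dynamic_power(C_EFF, VDD, FREQ)
# Two cores, each clocked 20% slower at a 15% lower supply voltage.
dual = 2 * dynamic_power(C_EFF, VDD * 0.85, FREQ * 0.8)

print(f"single core: {single:.1f} W")
print(f"dual core:   {dual:.1f} W, {dual / single:.2f}x the switching power")
print(f"peak throughput ratio (2 cores at 0.8x clock): {2 * 0.8:.1f}x")
```

With these assumed numbers, the dual-core configuration offers up to 1.6x the peak throughput for roughly 1.16x the switching power, provided the software can keep both cores busy.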
Era of Parallelism: 2005
Double Vision??
Smithfield
• Pentium who?
Pentium Extreme Edition 840: Intel dual-core chip of thoroughly Pentium 4 heritage
• Code-named "Smithfield," pair of Pentium 4 "Prescott"
cores situated together on single piece of silicon. Each
core has 1MB of L2 cache onboard, and two cores share
an 800MHz front-side bus. Siamese twin action
• Smithfield manufactured using same basic 90nm
fabrication process as current Pentium 4 chips. However,
roughly twice the size of Prescott core at 230 million
transistors and 206 mm² of die space
• IBM produced the first multicore chip, the Power4, in 2001 (Intel aims
to be first in volume production of the new chips across
all market segments: server, desktop, and mobile)
2005/06 Technology
• Parallelism revolution continues
• Intel Developer Forum (IDF) CA, Aug 2005
• Intel CEO introduces new 65nm dual-core microprocessor designed to
bring increased performance per watt; production begins end 2005, in market
by 2nd half 2006
• 2006 shipments (60 million) based on 65nm to surpass current 90nm
• Processors allow chipmakers to get more performance out of a single
piece of silicon without boosting power consumption and heat
generation.
• Enables computer programs to work on more than 1 task at same time
• For example, multi-core technology helps Google process data in
parallel, while controlling power and electricity costs
• New processor - applied to laptops - code-named Merom
• Applied to Desktop computers - code-named Conroe
• Applied on Server platforms – code-named Woodcrest
Software Adjustments: Hyper-Threading
• Many software vendors have already programmed their code to utilize
the multithreaded capabilities of Hyper-Threading technology
• Hyper-Threading Technology enables software applications to execute
threads in parallel. To improve performance, threading enabled in
software by splitting instructions into multiple streams so that multiple
processors can act upon them.
• Delivers faster response times for multi-tasking
• Multicore processors benefit from the same programming optimizations
as for Hyper-Threading
• Dual-core will provide an immediate performance improvement to
Hyper-Threading applications
• Operating systems such as Windows XP and Linux have been optimized
for multicore processors and are ready to support Intel's next generation
processors as soon as they are launched.
• Multicore has also raised question of software licensing and customer
billing ($$). Some vendors have considered charging license fees on a
per processor basis, charging more for dual or multi core systems.
Microsoft has announced that its software will be licensed on a per
processor package basis - only one license necessary regardless of
how many cores are contained within processor.
Intel Family Overview
• >100 million devices/chip (gigascale integration)
• 80286 - 100,000 devices
• 80386 - 275,000 devices
• 80486 - 1,000,000 devices
• Pentium III – 3,000,000 devices
• Pentium 4 – over 5,000,000 devices/chip (VLSI, ULSI, Gigascale)
• MultiCore – Smithfield, Merom, Conroe, Woodcrest
• How is a design of this complexity realized?
• Must automate design, powerful CAD tools
• CAD Tools research and development
• Decompose design process into different levels of abstraction
Levels of Abstraction in VLSI Design
Idea for New VLSI Chip
Design Step | CAD/Subproblem Level | Generic CAD Tools
Architectural Design | Behavioral/Architectural Level | Behavioral Level & Simulation Tools
Logical Design | Register Transfer/Logic Level | Logic Minimization & Simulation Tools
Physical Design | Cell/Mask Level | Layout Editing, Partitioning, Placement & Routing Tools
Levels of Abstraction: Architectural Design
Architectural Design
• Carried out by human experts
• Decisions affect Cost & performance
e.g. Architectural Design of Microprocessor
1. What should instruction set be?
2. Should instruction pipelining be employed?
3. Should processor have on-chip cache? How big?
4. Should arithmetic unit be bit-serial or parallel?
• CAD Programs aid system architect
• Once architecture defined, 2 tasks
Two Tasks at logic level
Task 1: DATA PATH DESIGN
• What is the datapath?
• Functional blocks, storage elements, hardware components which allow
transfer of data
• E.g. Adders, Multipliers, Shift registers, RAMs
• Data transferred using tristate busses or mux/demux
Task 2: CONTROL PATH DESIGN
• What is the control path?
• Modules which generate control signals necessary to operate circuit
• E.g. initializing storage elements, initiating data transfer
• Hardwired or microprogrammed
Design of 8-bit Adder A ← A+B
• Sum in 8bit A Reg
• 8bit B Reg unchanged
• Economical Design
• Some Possibilities:
1. 8bit CLA Adder
2. 8bit ripple carry adder
3. Two 4bit CLA adders with ripple carry between
4. 1bit adder, perform addition serially (8 clock cycles)
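Options 1 and 2 above compute the same sum and differ only in how the carries are produced. The sketch below models both at bit level; the function names are illustrative, and a software model like this says nothing about real gate delay or area.

```python
def ripple_carry_add(a, b, width=8):
    """Add two unsigned ints bit by bit; each carry waits for the previous stage."""
    carry, total = 0, 0
    for k in range(width):
        ak, bk = (a >> k) & 1, (b >> k) & 1
        total |= (ak ^ bk ^ carry) << k
        carry = (ak & bk) | (carry & (ak ^ bk))
    return total, carry

def lookahead_carry(g, p, k, c0=0):
    """Carry into bit k written only in terms of g, p and c0 (no carry chain)."""
    carry = c0 & all(p[:k])
    for j in range(k):
        # A carry generated at bit j reaches bit k if bits j+1..k-1 all propagate.
        carry |= g[j] & all(p[j + 1:k])
    return int(carry)

def carry_lookahead_add(a, b, width=8):
    """Same sum, with every carry derived directly from generate/propagate terms."""
    g = [((a >> k) & (b >> k)) & 1 for k in range(width)]   # generate
    p = [((a >> k) ^ (b >> k)) & 1 for k in range(width)]   # propagate
    total = sum((p[k] ^ lookahead_carry(g, p, k)) << k for k in range(width))
    return total, lookahead_carry(g, p, width)

print(ripple_carry_add(200, 100))
print(carry_lookahead_add(200, 100))
```

In hardware, the lookahead terms are evaluated in parallel, which is where the speed advantage and the extra gate cost both come from.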
Consider Option 4:
Serial Adder Data & Control Paths
• Serial approach gives minimum cost, uses 2 shift
registers
• Ak, Bk are kth significant bits of reg A & B
• Full Adder adds Ak, Bk and Carry Ck-1 during kth
clock
• Carry generated in kth cycle saved in D flip flop
(init set to 0)
• Data Path: Two 8bit SR, 1 FA, 1 DFF, 2 Mux, 3bit
counter
• Multiplexer A selects between DataIn and Sum
output
Control Path Design
Control Signals needed
• SA - Shift A register right by 1 bit
• SB - Shift B register right by 1 bit
• MA - Control Mux A
• MB - Control Mux B
• RD - Reset D flip-flop
• RC - Reset Counter
• STRT - Start Addition
Control Algorithm
• forever do
    while (STRT = 0) skip
    Reset DFF & Counter
    Set MA & MB to 0
    repeat
        Shift A & B Right by one
        counter = counter+1
    until counter = 8
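The serial adder's data path and control algorithm can be checked with a small bit-level simulation. This is a sketch under assumptions: MA = MB = 0 is taken to select the feedback inputs (Sum into register A, B's own output bit back into register B, so B ends unchanged), and the 3-bit counter is modelled as a plain integer.

```python
def serial_add(a, b, width=8):
    """Simulate A <- A + B using one full adder, one D flip-flop, two shift registers."""
    mask = (1 << width) - 1
    reg_a, reg_b = a & mask, b & mask
    dff = 0        # RD: reset the carry flip-flop
    counter = 0    # RC: reset the counter
    # Assumed: MA = MB = 0 selects Sum for A's serial input and
    # recirculates B's output bit into B's serial input.
    while True:
        ak, bk = reg_a & 1, reg_b & 1                 # bits at the register outputs
        s = ak ^ bk ^ dff                             # full-adder sum
        carry = (ak & bk) | (dff & (ak ^ bk))         # full-adder carry out
        reg_a = (reg_a >> 1) | (s << (width - 1))     # SA: shift right, Sum enters at MSB
        reg_b = (reg_b >> 1) | (bk << (width - 1))    # SB: shift right, B recirculates
        dff = carry                                   # carry saved for the next clock
        counter += 1
        if counter == width:                          # "until counter = 8"
            break
    return reg_a, reg_b, dff

total, b_after, carry_out = serial_add(0b10110101, 0b01101110)
print(f"A = {total:08b}, B = {b_after:08b}, carry out = {carry_out}")
```

After eight clocks the first sum bit has been shifted down to the least significant position of A, and the flip-flop holds the final carry.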
Tradeoffs at Architectural Level
• Serial adder cheap but slow and difficult to test
• trade-off between cost, performance, testability,
power etc.
• 8bit parallel CLA adder fastest & most costly
• view alternative options as points in design
space
• Specs may impose more constraints
• Automated generation of data and control
signals: high level synthesis may be necessary
Levels of Abstraction: Logical Design
Design at Logic level
• Data & Control paths contain logic blocks such
as shift regs, muxes, buffers, ALU
• Q: How is cct to be implemented? As PCB, VLSI
or MCM?
• If PCB, are components available off the shelf?
• If VLSI, what strategy? Full custom, standard cell
or gate array?
• In either case, components placed on layout
surface and wired together
Levels of Abstraction: Physical Design
Physical Design
• Refers to all synthesis steps which succeed logic design but
precede fabrication e.g. partitioning, placement, routing
• Physical layout crucial in determining circuit performance, area,
catastrophic yield, reliability
• 1. Circuit Performance: Timing delays, Crosstalk
Metal, poly interconnect have finite impedance. Long lines have
large impedance, longer delays, crosstalk. Contacts, vias slow
signals down
• 2. Area: functional and wiring
Affects yield (# of defect-free chips)
Large chip area = low catastrophic yield
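The claim that long lines mean longer delays can be made quantitative with the Elmore delay of a wire modelled as an RC ladder. The per-micron resistance and capacitance below are assumed, illustrative values, not data for any particular process.

```python
def wire_elmore_delay(length_um, r_per_um, c_per_um, segments=100):
    """Elmore delay (seconds) of a wire modelled as an N-segment RC ladder."""
    r_seg = r_per_um * length_um / segments
    c_seg = c_per_um * length_um / segments
    # Each capacitor charges through all the resistance between it and the driver.
    return sum((i * r_seg) * c_seg for i in range(1, segments + 1))

R_PER_UM = 0.1        # ohms per micron (assumed)
C_PER_UM = 0.2e-15    # farads per micron (assumed)

d1 = wire_elmore_delay(1000, R_PER_UM, C_PER_UM)
d2 = wire_elmore_delay(2000, R_PER_UM, C_PER_UM)
print(f"1 mm wire: {d1 * 1e12:.2f} ps")
print(f"2 mm wire: {d2 * 1e12:.2f} ps, ratio {d2 / d1:.2f}")
```

Because both total resistance and total capacitance grow with length, doubling the wire length roughly quadruples the delay in this model.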
Physical Design: Layout Effects
• low yield = high prod cost = high cct unit cost
• large area = modules widely spaced = long
wires = delays and crosstalk
• layout affects reliability: e.g. vias unreliable,
layout with large #’s of vias prone to defects; line
widths of metal tracks must be wide enough to
prevent metal migration
• course focuses on Physical, Custom Design
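The chain "large area = low yield = high unit cost" can be illustrated with the simple Poisson yield model Y = exp(-A·D), one of several yield models in common use. The defect density, wafer cost and wafer area below are assumed values, and edge losses are ignored.

```python
import math

def poisson_yield(area_cm2, defects_per_cm2):
    """Fraction of dies with zero fatal defects under the Poisson model."""
    return math.exp(-area_cm2 * defects_per_cm2)

def cost_per_good_die(area_cm2, defects_per_cm2, wafer_cost, wafer_area_cm2):
    """Wafer cost divided by the expected number of defect-free dies."""
    dies = int(wafer_area_cm2 // area_cm2)   # ignores edge losses
    return wafer_cost / (dies * poisson_yield(area_cm2, defects_per_cm2))

D0 = 0.5            # defects per cm^2 (assumed)
WAFER_COST = 3000   # dollars per processed wafer (assumed)
WAFER_AREA = 300    # usable wafer area in cm^2 (assumed)

for area in (1.0, 2.0):
    y = poisson_yield(area, D0)
    cost = cost_per_good_die(area, D0, WAFER_COST, WAFER_AREA)
    print(f"die area {area} cm^2: yield {y:.2f}, cost per good die ${cost:.2f}")
```

Doubling the die area more than doubles the cost per good die, because fewer dies fit on the wafer and a smaller fraction of them are defect-free.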
Physical Design Strategies
3 main approaches differing in 2 ways:
1. Layout surface
2. Structural constraints imposed on layout elements
Full Custom: layout editing to generate physical description of circuit
Field Prog Gate Array: realize cct by placing metal connections between
transistors prefab on wafer in 3D array
Standard Cell Design: realization using predefined logic blocks or cells
stored in library
Full Custom Layout
• Full control to the artwork designer in placing and
interconnecting circuit blocks
• expert can achieve high degree of optimization in area
and circuit performance
• difficult and expensive - many person months to layout
ULSI chip - only used in mass prod cases
• requires powerful CAD tools - layout editor with DRC,
Compaction, Extraction
• not for low prod volume ASICs
• standard layout architectures to cut design time
Layout Styles: Gate array
• Mask programmable gate array or field prog
• 2/3D array of unconnected transistors
• Connections placed by either masking (MPGA) or applied voltage
(FPGA)
• 2 types of personalization: intra-cell or inter-cell
• cell library maintained, intercell wiring by layout software
• after personalization, wafer diced, chips packaged
• foundries stock large #’s of pre-fab wafers
• quick to fabricate
• few processing steps, high catastrophic yield, cheap
Layout Style:Standard Cell Layout
• Standard cell - logic block performing specific function e.g. nand,
xor, nor, d flipflop
• cell library - data on std cells (function, pin structure, layout in given
technology); cells have same height
• develop floorplan for layout
• select library cells, place in Si, interconnect
• placement & route simplified by dividing layout into rows sep by
horiz routing channels
• very flexible cf gate array, wiring space not pre-assigned, cell size
can vary
• Fab more complex than gate array
Example of Std Cell
• Inverter function
• rectangular shape
• dimensions 0.6um X 4.8um, CMOS
0.18um technology
• lower left corner at (-1, -1)
• top right corner at (0.6, 4.8)
• input a available at left
• output available at right
• VDD & GND lines available
Macrocells, PLA & FPGA
• Macrocells - no restrictions on cell size, to allow more compact layout.
Increased cell complexity (regs, ALUs, memory); efficient layout design of
complex macrocells
• PLAs - minimal Sum-of-Products expression can be realized using 2-level
logic: AND terms formed in 1st level, OR terms in 2nd level, e.g. Z = A0.A1
+ A0.A2 + A1.A2. Easy to automate
• FPGAs (e.g. Xilinx, Altera) - 2D array of configurable logic blocks, can
implement any logic fn. Channels between blocks for interconnect. I/O
blocks on periphery; interconnect and logic blocks field prog by user.
Cheap prototyping, re-usable, slower. 100% use of gates not possible
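The two-level PLA structure can be mimicked in a few lines: an AND plane forms the product terms and an OR plane combines them. The sketch below evaluates the example Z = A0.A1 + A0.A2 + A1.A2; the table format and function name are illustrative choices, not a real PLA description format.

```python
from itertools import product

# AND plane: each product term lists the inputs that must all be 1.
AND_PLANE = [("A0", "A1"), ("A0", "A2"), ("A1", "A2")]

def pla_eval(inputs, and_plane=AND_PLANE):
    """Evaluate a single-output PLA given a dict of input values."""
    terms = [all(inputs[name] for name in term) for term in and_plane]  # level 1: AND
    return int(any(terms))                                              # level 2: OR

for a0, a1, a2 in product((0, 1), repeat=3):
    z = pla_eval({"A0": a0, "A1": a1, "A2": a2})
    print(a0, a1, a2, "->", z)
```

This particular expression is the three-input majority function: Z is 1 whenever at least two inputs are 1.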
Complexity of Physical Design Problem
• Problem can be viewed as complex optimization problem
with multiple objectives and conflicting constraints
• Good layout - min area, short wires, few vias, meet all
specs/constraints e.g. target tech, routing space
• difficult to fully automate
• How can we simplify task?
• Adopt stepwise approach: subdivide problem into
manageable subproblems, each one a constrained
optimization problem
Problem Subdivision & Solution
• Subproblems
1. Circuit Partitioning
2. Floorplanning and Channel definition
3. Circuit Placement
4. Routing (global)
5. Channel Routing
• Find feasible solution to each constrained opt problem
• Optimize objective
• Stay within constraints
• Subproblems NP-hard
• Heuristic techniques
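As one example of a heuristic for the first subproblem, the sketch below does greedy pair-swap bipartitioning: start from an arbitrary balanced split and keep swapping one module from each side while the number of cut nets shrinks. It is a simplified relative of Kernighan-Lin style methods, not the full algorithm, and it can stop in a local minimum. The netlist is a made-up example with two-pin nets only.

```python
from itertools import product

def cut_size(edges, side):
    """Number of two-pin nets crossing the partition."""
    return sum(1 for u, v in edges if side[u] != side[v])

def swap_partition(nodes, edges):
    """Greedy pair-swap bipartitioning: keep swapping while the cut shrinks."""
    half = len(nodes) // 2
    side = {n: (0 if i < half else 1) for i, n in enumerate(nodes)}
    improved = True
    while improved:
        improved = False
        best = cut_size(edges, side)
        left = [n for n in nodes if side[n] == 0]
        right = [n for n in nodes if side[n] == 1]
        for u, v in product(left, right):
            side[u], side[v] = 1, 0            # tentatively swap u and v
            if cut_size(edges, side) < best:
                improved = True                # keep the swap and rescan
                break
            side[u], side[v] = 0, 1            # undo the swap
    return side, cut_size(edges, side)

# Hypothetical netlist: two tightly connected clusters joined by one net.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "c"), ("c", "e"), ("e", "a"),
         ("b", "d"), ("d", "f"), ("f", "b"),
         ("a", "b")]
side, cut = swap_partition(nodes, edges)
print(side, "cut =", cut)
```

The balance constraint (equal-sized halves) is preserved automatically because modules only ever move in pairs.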