CSCE 612: VLSI System Design

CSCE 212
Introduction to Computer Architecture
Instructor: Jason D. Bakos
What is Computer Architecture?
• The design of computer systems to…
– Improve “performance”
  • Run programs faster
  • Use less power, last longer on battery power
  • Generate less or more uniformly distributed heat
  • Improve video, 3D rendering, encoding, or decoding frame rate
  • Handle more secure encryption standards with reasonable latency
  • Achieve routing or network intrusion detection at higher line speeds
  • Be more scalable
  • Be less expensive (e.g., higher integration)
– Can be achieved via:
  • Software (better OS, more optimized application code), or
  • Hardware (processor)
• Designing any complex system requires abstraction
Abstraction
• Abstraction is used to manage the complexity of design
– Hide details that are not important
• Levels of abstraction, top to bottom (examples in parentheses, course numbers as labeled on the slide):
– Application Software (Programs): 145/146/240/245
– Compiler: 330
– Operating Systems (Device Drivers): 311
– Architecture (Instructions, Registers): 212
– Microarchitecture (Datapaths, Controllers)
– Logic (Adders, Memories)
– Digital circuits (AND gates, NOT gates)
– Analog circuits (Amplifiers, Filters)
– Devices (Transistors, Diodes)
– Physics (Electrons)
• Course labels shown beside the lower levels: 211/611, 211, ELCT 371
Domains and Levels of Modeling
• Three domains of modeling: Functional, Structural, and Geometric
• Within each domain, levels range from a high level of abstraction to a low level of abstraction (listed below from high to low)
• Functional domain:
– Algorithm (behavioral)
– Register-Transfer Language
– Boolean Equation
– Differential Equation
• Structural domain:
– Processor-Memory Switch
– Register-Transfer
– Gate
– Transistor
• Geometric domain:
– Floor Plan
– Standard Cells
– Sticks
– Polygons
[Figure: “Y-chart” from Gajski & Kuhn]
Structure
MIPS Microarchitecture
RTL (datapath): fetch instruction
1. Address <= PC
2. MemRead
3. PC <= PC + 1
4. IR <= MemData
Control: fetch instruction
1. IorD = 0
2. MemRead = 1
3. PCEn = 1, ALUSrcA = 0, ALUSrcB = 01, ALUOp = ADD, PCSource = 01
4. IRWrite = 1
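The four fetch steps can be traced in software. The following is a minimal Python sketch, assuming a word-addressed memory (so that PC + 1 selects the next instruction, as in the RTL above). The `Machine` class, its field names, and the sample instruction words are hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Machine:
    """Hypothetical register/memory state used to trace the fetch RTL."""
    memory: list = field(default_factory=list)  # word-addressed (assumption)
    pc: int = 0   # program counter
    ir: int = 0   # instruction register

def fetch(m: Machine) -> None:
    address = m.pc                # 1. Address <= PC
    mem_data = m.memory[address]  # 2. MemRead
    m.pc = m.pc + 1               # 3. PC <= PC + 1
    m.ir = mem_data               # 4. IR <= MemData

# Two arbitrary 32-bit words stand in for instructions.
m = Machine(memory=[0x8C010000, 0x00221820])
fetch(m)
print(hex(m.ir), m.pc)
```

Calling `fetch(m)` a second time loads the next word into `ir` and advances `pc` again.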
Structure
Logic Synthesis
• Behavior:
– C = A + B
– Assume A is 2 bits, B is 2 bits, C is 3 bits

A       B       C
00 (0)  00 (0)  000 (0)
00 (0)  01 (1)  001 (1)
00 (0)  10 (2)  010 (2)
00 (0)  11 (3)  011 (3)
01 (1)  00 (0)  001 (1)
01 (1)  01 (1)  010 (2)
01 (1)  10 (2)  011 (3)
01 (1)  11 (3)  100 (4)
10 (2)  00 (0)  010 (2)
10 (2)  01 (1)  011 (3)
10 (2)  10 (2)  100 (4)
10 (2)  11 (3)  101 (5)
11 (3)  00 (0)  011 (3)
11 (3)  01 (1)  100 (4)
11 (3)  10 (2)  101 (5)
11 (3)  11 (3)  110 (6)

• Sum of minterms for C2 (rows where C ≥ 4), followed by simplification:
$C_2 = \overline{A_1} A_0 B_1 B_0 + A_1 \overline{A_0} B_1 \overline{B_0} + A_1 \overline{A_0} B_1 B_0 + A_1 A_0 \overline{B_1} B_0 + A_1 A_0 B_1 \overline{B_0} + A_1 A_0 B_1 B_0$
$C_2 = B_1 B_0 (\overline{A_1} A_0 + A_1 \overline{A_0} + A_1 A_0) + A_1 B_1 \overline{B_0} (\overline{A_0} + A_0) + A_1 A_0 \overline{B_1} B_0$
$C_2 = B_1 B_0 (\overline{A_1} A_0 + A_1 (\overline{A_0} + A_0)) + A_1 B_1 \overline{B_0} + A_1 A_0 \overline{B_1} B_0$
$C_2 = B_1 B_0 (\overline{A_1} A_0 + A_1) + A_1 (B_1 \overline{B_0} + A_0 \overline{B_1} B_0)$
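A simplified expression for C2 can be checked exhaustively against the adder behavior. The sketch below is a minimal Python check, assuming C2 is the most significant bit of the 3-bit sum of the 2-bit inputs A and B. The function name `c2_simplified` is introduced here for illustration, and the expression it implements is written out in its docstring.

```python
from itertools import product

def c2_simplified(a1, a0, b1, b0):
    """C2 = B1 B0 (A1' A0 + A1) + A1 (B1 B0' + A0 B1' B0), where ' means NOT."""
    def inv(x):
        return 1 - x
    first = b1 & b0 & ((inv(a1) & a0) | a1)
    second = a1 & ((b1 & inv(b0)) | (a0 & inv(b1) & b0))
    return first | second

# Compare against bit 2 of A + B for all 16 input combinations.
mismatches = []
for a1, a0, b1, b0 in product((0, 1), repeat=4):
    a = 2 * a1 + a0
    b = 2 * b1 + b0
    expected = ((a + b) >> 2) & 1
    if c2_simplified(a1, a0, b1, b0) != expected:
        mismatches.append((a, b))

print(mismatches)
```

An empty list means the expression agrees with the truth table on every row.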
Logic Gates
inv: $Y = \overline{A}$
NAND2: $Y = \overline{A B}$
NAND3: $Y = \overline{A B C}$
NOR2: $Y = \overline{A + B}$
Y  A B
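The gate functions named on this slide can be written as small Boolean functions. This is a minimal Python sketch using 0/1 integers. The function names are chosen here for illustration, and the loop checks the standard fact that a NAND2 with both inputs tied together behaves as an inverter.

```python
def inv(a):
    """Inverter: Y = NOT A."""
    return 1 - a

def nand2(a, b):
    """2-input NAND: Y = NOT (A AND B)."""
    return 1 - (a & b)

def nand3(a, b, c):
    """3-input NAND: Y = NOT (A AND B AND C)."""
    return 1 - (a & b & c)

def nor2(a, b):
    """2-input NOR: Y = NOT (A OR B)."""
    return 1 - (a | b)

# A NAND2 with its inputs tied together acts as an inverter.
for a in (0, 1):
    assert nand2(a, a) == inv(a)

# Truth table of NAND2 for inputs 00, 01, 10, 11.
print([nand2(a, b) for a in (0, 1) for b in (0, 1)])
```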
Latches
Positive edge-sensitive latch
Elements
Semiconductors
• Silicon is a group IV element (4 valence electrons; shell capacities: 2, 8, 18, 32…)
– Forms covalent bonds with four neighbor atoms (3D cubic crystal lattice)
– Si is a poor conductor, but conduction characteristics may be altered
– Add impurities/dopants (each replaces a silicon atom in the lattice):
  • Makes a better conductor
  • Group V element (phosphorus/arsenic) => 5 valence electrons
    – Leaves an electron free => n-type semiconductor (electrons, negative carriers)
  • Group III element (boron) => 3 valence electrons
    – Borrows an electron from neighbor => p-type semiconductor (holes, positive carriers)
[Figure: P-N junction under forward bias and reverse bias]
MOSFETs
[Figure: NMOS/NFET and PMOS/PFET cross-sections showing current through the channel. Labels: positive voltage (Vdd); negative voltage relative to body (GND); NMOS body/bulk at GROUND; PMOS body/bulk HIGH]
• S/D to body is reverse-biased
• Shorter channel length => faster transistor (shorter distance for electrons)
• Metal-poly-Oxide-Semiconductor structures built onto substrate
– Diffusion: Inject dopants into substrate
– Oxidation: Form layer of SiO2 (glass)
– Deposition and etching: Add aluminum/copper wires
IC Fabrication
• Chips are fabricated using a set of masks
– Photolithography
• Basic steps
– oxidize
– apply photoresist
– remove photoresist with mask
– HF acid eats oxide but not photoresist
– piranha acid eats photoresist
– ion implantation (diffusion, wells)
– vapor deposition (poly)
– plasma etching (metal)
Layout
3-input NAND
Cell Library (Snap Together)
Layout
Layout
Synthesized and P&R’ed MIPS Architecture
IC Fabrication
8” Wafer
• 8 inch (200 mm) wafer containing Pentium 4 processors
– 165 dies, die area = 250 mm², 55 million transistors, 0.18 µm
Another 8” Wafer
Feature Size
• Shrink minimum feature size…
– Smaller L decreases carrier time and increases current
– Therefore, W may also be reduced for fixed current
– Cg, Cs, and Cd are reduced
– Transistor switches faster (~linear relationship)
Minimum Feature Size
Year  Processor       Speed            Transistors    Process
1982  i286            6 – 25 MHz       ~134,000       1.5 µm
1986  i386            16 – 40 MHz      ~270,000       1 µm
1989  i486            16 – 133 MHz     ~1 million     0.8 µm
1993  Pentium         60 – 300 MHz     ~3 million     0.6 µm
1995  Pentium Pro     150 – 200 MHz    ~4 million     0.5 µm
1997  Pentium II      233 – 450 MHz    ~5 million     0.35 µm
1999  Pentium III     450 – 1400 MHz   ~10 million    0.25 µm
2000  Pentium 4       1.3 – 3.8 GHz    ~50 million    0.18 µm
2005  Pentium D       2 cores/package  ~200 million   0.09 µm
2006  Core 2          2 cores/die      ~300 million   0.065 µm
2008  Core i7         4 cores/die      ~800 million   0.040 µm
2010  “Sandy Bridge”  8 cores/die      ??             0.032 µm
Clock Speed
• Clock speed is affected by:
– Fabrication technology
– Architecture: how much work performed in a single cycle
• Execution time =
– instructions per program * cycles per instruction * seconds per cycle
• Now we must add to the product:
– (number of program threads / number of processor cores)
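The execution-time product above can be evaluated directly. A minimal Python sketch follows. The function name `execution_time` and the sample workload numbers are hypothetical, chosen only to make the arithmetic easy to follow, and the threads-per-core factor is not modeled here.

```python
def execution_time(instructions, cycles_per_instruction, clock_hz):
    """Seconds = instructions * CPI * seconds per cycle (1 / clock_hz)."""
    seconds_per_cycle = 1.0 / clock_hz
    return instructions * cycles_per_instruction * seconds_per_cycle

# Hypothetical workload: 1 billion instructions at 2 cycles per instruction.
base = execution_time(1_000_000_000, 2.0, 2_000_000_000)          # 2 GHz clock
faster_clock = execution_time(1_000_000_000, 2.0, 4_000_000_000)  # 4 GHz clock
lower_cpi = execution_time(1_000_000_000, 1.0, 2_000_000_000)     # CPI halved

print(base, faster_clock, lower_cpi)
```

In this sketch, doubling the clock rate and halving the cycles per instruction each cut the execution time in half, which is why both fabrication technology and architecture matter.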
Integration Density
Core 2 Duo (2007) has ~300M transistors
Integration Density
Microprocessor Technology
• Advances in fabrication (lithography, photoresist, metal
layers)
• …faster transistor switching (faster processor)
• …smaller transistors/wires
• …higher integration density
• …more “real estate”
• …architectural improvements!
Microarchitectural Parallelism
• Parallelism => perform multiple operations simultaneously
– Instruction-level parallelism
  • Execute multiple instructions at the same time
  • Multiple issue
  • Out-of-order execution
  • Speculation
  • Branch prediction
– Thread-level parallelism (hyper-threading)
  • Execute multiple threads at the same time on one CPU
  • Threads share memory space and pool of functional units
– Chip multiprocessing
  • Execute multiple processes/threads at the same time on multiple CPUs
  • Cores are symmetrical and completely independent but share a common level-2 cache