A Physical Perspective of Computer Architecture
Peter Hsu, Ph.D.
Chief Architect, Microprocessor Development
Toshiba America Electronic Components, Inc.
Presented February 13, 2001 at Univ. of Wisconsin in Madison

Introduction
Computer Architecture Perspectives
– Logical, Performance
  • Abstract
  • Quantitative
  • Academia
– Physical, Cost
  • Constrained by History, Emotions, Physics
  • Modulated by Current World Affairs
  • Apprenticeship

A Physical View of Computer Architecture 2

Content
Tour a Particular Design Point
– Rationale of Choices
– Confluence of Decisions
– Rules of Thumb
Computer Markets
– PC Infrastructure Defines Most Cost-Efficient
– Consumer Volume Advancing Technology
– All other Computers Competing with PC

Multichip Module
Figure labels: heat distributor; 88 chip stacks; silicon substrate; alignment cage; printed circuit board; pressure plate; 3000 wire bonds; dimensions 10mm, 12mm, 14cm, 4mm

Chip Stack
Figure labels: router; DRAMs; processors; dimensions 12mm, 0.3mm, 10mm, 12mm; 10μm width, 20μm pitch; stack shown upside down

Stack to Substrate Connection
Figure labels: wirebond springs; heat; conventional wirebond pads; DRAMs; router chip; silicon substrate

System Unit
Figure labels: heat sink; multichip module; flex signal PCB; input/output connectors; rigid power PCB; dimensions 1.5 in, 19 inches

Scalable Configurations
Figure labels: system units with peripherals; 64-node supercomputer (80 Kilowatt); office (110V 15A); copier room (220V 30A)

Physical Architecture
Figure labels: three chip stacks, each with CPU, DRAMs, and router; silicon multichip substrate; system unit; cables between system units

Logical Architecture
Figure labels: chip stack with 4 CPUs, each with an L1$; level 2 cache; system unit; router; main memory; byte-wide point-to-point network; serial point-to-point
cable network

Guiding Principles
Performance
1. Latency (Memory, Interprocessor, etc.),
2. Bandwidth, then
3. Microarchitecture
Cost
– Silicon Portion Scales With Process (e.g. learning curve of copper-on-silicon substrate)
– Non-Silicon Portion Does Not Scale (e.g. liquid immersion cooling hardware)

Silicon Substrate
Figure labels: 200mm (8 inch) wafer; 14cm; chip (12mm, 12mm); 4mm spacer; 4mm; 150μm pitch; maximum trace length 24.8cm; maximum cutset 2048 p-to-p links; 3200 wire bonds substrate to PCB

Stack to Substrate Connection
Figure labels: alignment cage; substrate; chip stack; 75μm tolerance; 75μm clearance (0.003 inch or 3 mils); 250μm pitch, 125μm pad, 125μm space
Rule of thumb
• Machined parts need several mils tolerance

Substrate Design
Internal Signals
– Link
  • 8 data, 2 clock bits (20% overhead)
  • Source Synchronous
– Density
  • 20,480 signals across cutset (≈7μm per track)
  • 63 × 2 × 10 = 1260 signals / stack (2304 total)
Rule of thumb
• High speed: 50% signal pads

Substrate Design (cont'd)
External Connections
– Signal
  • Node to multicomputer node ((2 in, 2 out) × 64)
  • Node to peripheral device ((2 in, 2 out) × 64)
– Power
  • 1280 power/ground pairs
  • 20W per stack (VDD = 1V)
Rule of thumb
• A wire bond: 1A sustained current

Interconnect Dimensions
Figure labels: width W; space S; pitch; height H; insulation thickness T; VDD; VSS
Dimension values shown: 4, 3.5, 7.5, 6, 8, 2, 4.5, 3, 5, 10, 5.5, 15, 7.5

Electrical Characteristics
\[ R = \frac{\rho L}{W H} \]
\[ C = \varepsilon \left[ 1.15\,\frac{W}{T} + 2.80 \left(\frac{H}{T}\right)^{0.222} + \left( 0.06\,\frac{W}{T} + 1.66\,\frac{H}{T} - 0.14 \left(\frac{H}{T}\right)^{0.222} \right) \left(\frac{T}{S}\right)^{1.34} \right] \]
\[ Z_0 = \frac{\sqrt{\varepsilon_r}}{c_0\, C} \]
Bakoglu, H.B., Circuits, Interconnections, and Packaging for VLSI, Addison-Wesley, 1990.
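The formulas above can be evaluated numerically as a rough sanity check. The sketch below is only illustrative: the function names are hypothetical, and the dimensions (W = 4μm, S = 3.5μm, H = 6μm, T = 2μm) are an assumed reading of the Interconnect Dimensions figure for one design rule, not values the slides state unambiguously. The resistivity (1.7 μΩ·cm), relative permittivity (3.0), and 7.2cm trace length are taken from the substrate construction slide. Here c0 is assumed to be the speed of light in vacuum and C the capacitance per unit length.

```python
import math

EPS0 = 8.854e-12   # vacuum permittivity, F/m
C0 = 2.998e8       # speed of light in vacuum, m/s (assumed meaning of c0)


def line_resistance(rho, length, w, h):
    """R = rho * L / (W * H), all quantities in SI units."""
    return rho * length / (w * h)


def line_capacitance(eps_r, w, h, t, s):
    """Capacitance per meter of a line with a neighbor on each side."""
    wt, ht = w / t, h / t
    to_ground = 1.15 * wt + 2.80 * ht ** 0.222
    to_neighbors = (0.06 * wt + 1.66 * ht - 0.14 * ht ** 0.222) * (t / s) ** 1.34
    return eps_r * EPS0 * (to_ground + to_neighbors)


def characteristic_impedance(eps_r, cap_per_m):
    """Z0 = sqrt(eps_r) / (c0 * C), with C in F/m."""
    return math.sqrt(eps_r) / (C0 * cap_per_m)


RHO_CU = 1.7e-8    # 1.7 micro-ohm*cm expressed in ohm*m
EPS_R = 3.0        # "low-k" insulator
LENGTH = 0.072     # 7.2 cm trace

# Assumed dimensions in meters (hypothetical reading of the figure).
W, H, S, T = 4e-6, 6e-6, 3.5e-6, 2e-6

r = line_resistance(RHO_CU, LENGTH, W, H)
c = line_capacitance(EPS_R, W, H, T, S)
z0 = characteristic_impedance(EPS_R, c)

print(f"R  = {r:.1f} ohm")
print(f"C  = {c * 1e10:.2f} pF/cm")   # 1 F/m = 1e10 pF/cm
print(f"Z0 = {z0:.1f} ohm")
```

Under these assumptions the resistance comes out at about 51Ω and the impedance in the mid-20s of ohms, which is in line with the values quoted for the first design rule.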
Lossy Transmission Line
Figure: three voltage-versus-time waveforms (V from 0 to 1)
Self terminating if Z0 ≤ R ≤ 2Z0

Substrate Design (cont'd)
Construction
– Material
  • Copper, ρ = 1.7 μΩ·cm
  • “Low-k” Insulator, εr = 3.0
– Design Rules
  • Rule 1: L 7.2cm, R 51Ω, Z0 27Ω
  • Rule 2: L 18.4cm, R 52Ω, Z0 26Ω
  • Rule 3: L 24.8cm, R 47Ω, Z0 27Ω
– 7 Layers (3 × X•Y + pad)

Package Design
Thermal
– System Unit
  • Ambient: θA = 40°C
  • Airflow: 1 m/s (200 ft/min)
  • 1280W: 16 × 16 × 1in
Rule of thumb
• Heat sink with fan dissipates 5W per inch³

Package Design (cont'd)
– Thermal Resistance
  • DRAM leakage: θJ ≤ 80°C
  • 1280W, 40°C: θJA ≈ 0.03°C/W
Figure labels (heat pipe): heat; boil; gas; condense; liquid; metal wick
Rule of thumb
• Solid heat sink: θJA ≈ 1°C/W

Thermal Limits
Figure labels:
– router: 3W logic + 1W substrate (4W total)
– DRAMs: 4 × 1W active simultaneously (4W total)
– processors: 4 × 2.5W CPU + 2W L2 cache (12W total)
Major Design Implications
– PC Processor 10-30W
– First Order Constraint
  • MHz
  • Latencies
Observation
• Activity vs. State Retention Density

Package Design (cont'd)
Power Supply
– PC 10¢ / W
– Server 30¢ / W
– Exotic $1 / W
Figure labels: 15A 110V AC (10% variation); first stage (-5%); 120A 12V DC; second stage (-10%); 1,280A 1V DC
Rule of thumb
• Standard 15A, 110V AC outlet: 1300W

Package Design (cont'd)
Cable Connectors
Figure labels: flex signal PCB
– Serial, e.g.
USB
– 64 per side
Figure labels: Finger Access, Airflow; 1 inch
Rule of thumb
• Connector: 0.3 in² panel, 0.7 in² clearance

Memory Latencies
Figure: memory access timing diagrams for four paths (PC133 reference, chip stack, substrate, and 3m cable).
Recoverable labels: PC133 “3-2-3”, 7.5ns, 37.5ns; FSB; NB; addr; global; cell array; DQ; data; PCB trace; transceiver; 3m cable; latencies 22ns, 58ns, 76ns, 82.5ns, 176ns; unlabeled values 7, 29
Rule of thumb
• PCB 10cm/ns (5ns/foot), coax 20cm/ns

Memory Latencies (cont'd)
Benefits
– Performance
  • Cache miss penalty
– Robustness
  • Global vs. local memory: 30%
  • Remote access: 3× local
– Marketability
  • Minimize application speed variance
  • “No surprises”

Highlights
Scalability
– Partial node ... 64-node supercomputer
Performance
– Latency
  • PC133-timing to 1 TBytes
– Bandwidth
  • 32B / stack / cycle on substrate
  • 1/6B / stack / cycle via cable

Highlights (cont'd)
Risk Management
– Not liquid immersion; no pumps, hoses
– Configurable for 110V outlet
– Substrate uses ordinary silicon process
Taken Risks
– Stacking chips not mainstream
– Wirebond spring recent invention
– Heat pipe reliability

Summary
Architecture of a Large Computer
– Performance
– Materials
– Mechanical Assembly
– Thermal Management
– Power Supply
Many More Issues...
– Architect responsible for everything, even if s/he doesn’t know anything about it!
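
The rules of thumb quoted in the thermal, power, and substrate slides can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that 80°C is the junction limit, 40°C the ambient, 1280W the system unit power, and that the 1280 power/ground pairs share the 1V supply current evenly.

```python
def heat_sink_volume_in3(power_w, w_per_in3=5.0):
    """Volume of fan-cooled heat sink needed at 5 W per cubic inch."""
    return power_w / w_per_in3


def required_theta_ja(junction_c, ambient_c, power_w):
    """Junction-to-ambient thermal resistance budget in degC per watt."""
    return (junction_c - ambient_c) / power_w


def amps_per_bond_pair(power_w, vdd_v, pairs):
    """Sustained current per power/ground wire-bond pair."""
    return power_w / vdd_v / pairs


# Per-stack power from the Thermal Limits slide: router + DRAMs + processors.
STACK_POWER_W = 4 + 4 + 12
UNIT_POWER_W = 1280

volume = heat_sink_volume_in3(UNIT_POWER_W)        # compare with 16 x 16 x 1 in
theta = required_theta_ja(80, 40, UNIT_POWER_W)    # compare with 0.03 degC/W
amps = amps_per_bond_pair(UNIT_POWER_W, 1.0, 1280) # compare with 1 A per bond
supercomputer_kw = 64 * UNIT_POWER_W / 1000        # compare with "80 Kilowatt"
outlet_w = 15 * 110                                # raw outlet power before losses

print(f"stack power        : {STACK_POWER_W} W")
print(f"heat sink volume   : {volume:.0f} in^3 (16*16*1 = {16 * 16 * 1})")
print(f"theta_JA budget    : {theta:.4f} degC/W")
print(f"current per pair   : {amps:.2f} A")
print(f"64 units           : {supercomputer_kw:.1f} kW")
print(f"15A, 110V outlet   : {outlet_w} W")
```

The thermal resistance budget of roughly 0.03°C/W is far below the 1°C/W quoted for a solid heat sink, which is consistent with the deck's choice of a heat pipe.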