ESE534: Computer Organization
Day 11: March 3, 2014
Instruction Space Modeling
Penn ESE534 Spring 2014 -- DeHon

Last Time
• Instruction Requirements
• Instruction Space

Architecture Taxonomy
PCs  Pints/PC    depth  width  Architecture
0    N           1      1      FPGA
1    N (48,640)  8      1      Tabula ABAX (A1EC04)
1    1           1024   32     Scalar Processor (RISC)
1    N           D      W      VLIW (superscalar)
1    1           Small  W*N    SIMD, GPU, Vector
N    1           D      W      MIMD
16   1           2048   64     16-core (4?)

Today
• Model Architecture from Instruction Parameters
  – implied costs
  – gross application characteristics
• Focus on width as example
  – Instruction depth on Wed.

Quotes
• If it can’t be expressed in figures, it is not science; it is opinion. -- Lazarus Long

Modeling
• Why do we model?

Motivation
• Need to understand
  – How costly a solution is
    • Big, slow, hot, energy hungry…
  – How it compares to alternatives
  – Cost and benefit of flexibility

What we really want:
• Complete implementation of our application
• For each architectural alternative
  – In same implementation technology
  – w/ multiple area-time points

Reality
• Seldom get it packaged that nicely
  – much work to do so
  – technology keeps moving
• We must deal with
  – estimation from components
  – technology differences
  – few area-time points

Modeling Instruction Effects
• Restrictions from “ideal”
  + save area and energy
  – limit usability (yield) of PE
• May cost more energy, area in the end…
• Want to understand effects
  – Area, energy model
  – utilization/yield model

Preclass
• Energies?
  – 8-bit, 16-bit, 32-bit, 64-bit
• 16-bit on 32-bit?
  – Sources of inefficiency?
• 8-bit operations per 64-bit operation?
• 64-bit on 8-bit?
  – Sources of inefficiency?

Area Target
• Switch focus to area-efficiency
  – Equivalently computational density
• Come back to energy-efficiency at end of lecture

Efficiency/Yield Intuition (Area)
• What happens when
  – Datapath is too wide?
  – Datapath is too narrow?
  – Instruction memory is too deep?
  – Instruction memory is too shallow?

Computing Device
• Composition
  – Bit Processing elements
  – Interconnect: space
  – Interconnect: time
  – Instruction Memory
• Tile together to build device

Relative Sizes
• Bit Operator: 3-5K F²
• Bit Operator Interconnect: 200K-250K F²
• Instruction (w/ interconnect): 20K F²
• Memory bit (SRAM): 250-500 F²

Model Area

Architectures Fall in Space

Calibrate Model
Arch.      w     d   c     k  Area (F²)
FPGA       1     1   1     4  220K (Altera 8K: 230K; Xilinx 4K: 160K)
SIMD       1000  1   0     2  43K (Abacus: 48K)
Processor  32    32  1024  2  650K (MIPS-X: 530K)

Peak Densities from Model
• Only 2 of 4 parameters
  – small slice of space
  – 100× density across
• Large difference in peak densities
  – large design space!

Architectural parameters Peak Densities

Efficiency
• What do we really want to maximize?
  – Not peak, “guaranteed not to exceed” performance, but…
  – Useful work per unit silicon [per Joule]
    • Yield Fraction / Area
    • (or minimize (Area/Yielded performance) )

Efficiency
• For comparison, look at relative efficiency to ideal.
• Ideal = architecture exactly matched to application requirements
• Efficiency = A_ideal / A_arch
• A_arch = Area_Op / Yield

Width Mismatch Efficiency Calculation
\[ E = \frac{\mathrm{Area}(\text{Task on matched Architecture})}{\mathrm{Area}(\text{Task on this Architecture})} \]
\[ E = \frac{W_{task} \cdot A_{bitelm}\big|_{w=w_{task}}}{\left\lceil \frac{W_{task}}{W_{arch}} \right\rceil \cdot W_{arch} \cdot A_{bitelm}\big|_{w=w_{arch}}} \]

Efficiency: Width Mismatch
c=1, 16K PEs

Application Needs
• What are common application datawidths?

Energy Target
• Switch to energy focus for rest of lecture

Efficiency for Preclass
\[ E = \frac{\mathrm{Energy}(\text{Task on matched Architecture})}{\mathrm{Energy}(\text{Task on this Architecture})} \]
• Preclass 6 table

Robust Point
• Can we find an architecture that
  – Is reasonably efficient regardless of application needs?
  – never less efficient than 50%?
• What is Energy Robust Point for preclass model?

Big Ideas [MSB Ideas]
• Applications typically have structure
• Exploit this structure to reduce resource requirements
• Architecture is about understanding and exploiting structure and costs to reduce requirements

Admin
• HW4 grading…in progress
• Office hours on Tuesday
• HW5.1 Due Wed.
• Reading for Wed.
  – Finish chapter
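
Worked sketch: width mismatch efficiency

The width mismatch efficiency calculation and the robust point question from earlier in the lecture can be tried out numerically. The sketch below is only an illustration under stated assumptions: the per-bit-element area model (a fixed operator plus interconnect cost, plus c instructions shared across a w-bit datapath) is a simplification chosen here, not the lecture's calibrated model, and the constants are picked from inside the ranges on the Relative Sizes slide. The function names are hypothetical. It also applies the robust point idea to area, whereas the lecture poses it for the preclass energy model.

```python
import math

# Illustrative constants in F^2, picked from inside the ranges on the
# "Relative Sizes" slide. The exact values are assumptions.
A_BIT_OP = 4_000          # bit operator (slide: 3-5K F^2)
A_INTERCONNECT = 225_000  # bit operator interconnect (slide: 200K-250K F^2)
A_INSTR = 20_000          # one instruction w/ interconnect (slide: 20K F^2)


def bit_element_area(w, c=1):
    """Assumed per-bit-element area: fixed operator + interconnect cost,
    plus c stored instructions shared across a w-bit datapath."""
    return A_BIT_OP + A_INTERCONNECT + c * A_INSTR / w


def efficiency(w_task, w_arch, c=1):
    """Area(task on matched architecture) / Area(task on this architecture)."""
    groups = math.ceil(w_task / w_arch)  # number of w_arch-wide units needed
    matched = w_task * bit_element_area(w_task, c)
    actual = groups * w_arch * bit_element_area(w_arch, c)
    return matched / actual


def worst_case_efficiency(w_arch, task_widths, c=1):
    """Lowest efficiency this architecture width sees over the task widths."""
    return min(efficiency(w_task, w_arch, c) for w_task in task_widths)


def most_robust_width(arch_widths, task_widths, c=1):
    """Architecture width with the best worst-case efficiency."""
    return max(arch_widths,
               key=lambda w: worst_case_efficiency(w, task_widths, c))


if __name__ == "__main__":
    widths = [1, 2, 4, 8, 16, 32, 64]
    for w_arch in widths:
        row = " ".join(f"{efficiency(w_task, w_arch):5.2f}" for w_task in widths)
        print(f"w_arch={w_arch:2d}: {row}")
    print("most robust width (c=1):", most_robust_width(widths, widths))
```

Each printed row is one architecture width and each column one task width. In this simplified model the ceiling term dominates when the datapath is wider than the task (unused bit elements), while a datapath narrower than the task only pays for the less-shared instruction. Raising c makes the shared instruction area matter more and shifts the balance toward wider datapaths.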