CS/ECE 3330 Computer Architecture (Fall 2009)
Chapter 1: Power / Parallelism

Last Time: Performance Analysis
• It's all relative
• CPU Time = (Instructions / Program) x (Clock cycles / Instruction) x (Seconds / Clock cycle)
• Make sure the units cancel out!
• What is a Hz? (one cycle per second)
• Amdahl's Law
• Benchmarking

Why Worry about Power Dissipation?
• Battery life
• Thermal issues: affect cooling, packaging, reliability, timing
• Environment

Power Trends: "The Power Wall"
[Figure: clock rate and power trends across processor generations]

Power Dissipation Has Peaked
• Must design within strict power envelopes
  – 130 W servers, 65 W desktops, 10-30 W laptops, 1 W mobile

How Hot Does It Get?
[figure]

Cooling Issues
• http://www.youtube.com/watch?v=nYhEpHEPqcc

Intel vs. Duracell
[Figure: improvement relative to year 0 over 6 years — processor (MIPS) grows fastest (~16x), then hard disk (capacity), then memory (capacity); battery (energy stored) grows slowest (~2x)]
• No Moore's Law in batteries: 2-3%/year growth

Environment
• Environmental Protection Agency (EPA): computers account for 10% of commercial electricity consumption
  – Includes peripherals, possibly also manufacturing
• Data center growth was cited as a contributor to the 2000/2001 California Energy Crisis
• Air conditioning draws equivalent power (at only 30% efficiency)
• CFCs used for refrigeration
• Lap burn
• Fan noise

Power Matters at Scale... [J. Koomey (LBL), 2007]
• Eric Schmidt, CEO of Google: "What matters most to the computer designers at Google is not speed but power, low power, because data centers can consume as much electricity as a city."

But Remember Amdahl's Law
[figure]

Power vs.
Energy

• Power consumption is measured in watts
  – Determines battery life in hours
  – Sets packaging limits
• Energy is measured in joules
  – Power is the rate at which energy is consumed over time
  – Energy = power x delay (joules = watts x seconds)
  – A lower energy number means less power is needed to perform the same computation at the same frequency

Another Fallacy: Low Power at Idle
• Opteron X4 power benchmark
  – At 100% load: 295 W
  – At 50% load: 246 W (83%)
  – At 10% load: 180 W (61%)
• Google data centers
  – Mostly operate at 10%-50% load
  – At 100% load less than 1% of the time
• Consider designing processors to make power proportional to load

Capacitive Power Dissipation
• Power ~ C x V^2 x f
  – Capacitance C: a function of wire length and transistor size
  – Supply voltage V: has been dropping with successive fab generations
  – Frequency switched f: clock frequency and likelihood of change

Reducing Power
• Suppose a new CPU has 75% of the capacitive load of the old CPU, plus a 25% voltage reduction and a 25% frequency reduction:

  P_new / P_old = [(0.75 C_old) x (0.75 V_old)^2 x (0.75 F_old)] / (C_old x V_old^2 x F_old) = 0.75^4 ≈ 0.32

The Power Wall
• We can't reduce voltage further
• We can't remove more heat
• How else can we improve performance?
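The power-reduction arithmetic above can be sanity-checked numerically. A minimal sketch in Python (the baseline values are arbitrary placeholders, not from the slides; only the ratio matters):

```python
# Dynamic power dissipation: P = C * V**2 * f
def dynamic_power(c, v, f):
    return c * v * v * f

# Arbitrary baseline for the "old" CPU.
c_old, v_old, f_old = 1.0, 1.0, 1.0

# New CPU: 75% of the capacitive load, 25% voltage reduction,
# 25% frequency reduction (i.e., each factor scaled by 0.75).
p_old = dynamic_power(c_old, v_old, f_old)
p_new = dynamic_power(0.75 * c_old, 0.75 * v_old, 0.75 * f_old)

print(round(p_new / p_old, 2))  # 0.75**4 ≈ 0.32
```

Note that voltage contributes twice (the V^2 term), which is why voltage scaling was historically the most effective power-reduction lever.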
Uniprocessor Performance
[Figure: single-processor performance growth over time]
• Constrained by power, instruction-level parallelism, memory latency

Multiprocessors
• Multicore microprocessors
  – More than one processor per chip
• Multiprocessors and clusters: covered in another course
• Require explicitly parallel programming
  – Compare with instruction-level parallelism, where the hardware executes multiple instructions at once, hidden from the programmer
  – Hard to do: programming for performance, load balancing, optimizing communication and synchronization

Multicore Architecture Examples
• 2 x quad-core Intel Xeon e5345 (Clovertown)
• 2 x quad-core AMD Opteron X4 2356 (Barcelona)
• 2 x oct-core Sun UltraSPARC T2 5140 (Niagara 2)
• 2 x oct-core IBM Cell QS20

Key Points
• Power has become a limiting factor
  – Power vs. energy
  – P = C x V^2 x F
• One solution: multicore processors
  – Different scale than "old" parallel processors
  – More detail in Chapter 7
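The power-versus-energy distinction in the key points can be illustrated with a short sketch (the constants are hypothetical, and time per computation is modeled simply as cycles / f):

```python
# Dynamic power: P = C * V**2 * f; energy for a computation = power * time.
def power(c, v, f):
    return c * v * v * f

def energy(c, v, f, cycles):
    return power(c, v, f) * (cycles / f)  # time = cycles / f

base      = energy(1.0, 1.0, 1.0, 1e6)   # baseline design
half_freq = energy(1.0, 1.0, 0.5, 1e6)   # halve frequency only
low_volt  = energy(1.0, 0.8, 1.0, 1e6)   # lower voltage by 20%

# Halving f halves power, but the computation takes twice as long,
# so energy per computation is unchanged.
print(half_freq / base)               # 1.0
# Lowering V reduces both power and energy (V**2 scaling).
print(round(low_volt / base, 2))      # 0.64
```

This is why the slides distinguish the two metrics: frequency scaling alone helps with packaging and thermal limits (power) but not battery drain per task (energy), whereas voltage scaling helps with both.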