What Computer Architects Need to Know About Memory Throttling

IBM Research – Austin
Heather Hanson
Karthick Rajamani
WEED 2010
June 20, 2010
© 2010 IBM Corporation
Outline
Memory throttling overview
Experimental platform
– System configuration
– Memory throttling implementation
Memory throttling characterization
– Bandwidth
– Power
– Performance
Summary
Memory throttling in a nutshell

Memory throttling is a power-performance knob that:
– impacts memory reference rates of both instruction and data streams
– controls power
– can be used for safety or optimization
• regulate DIMM temperatures
• enforce memory power budgets

Memory throttling restricts read & write traffic
– directly controls memory power
– indirectly affects processors and other components

Several implementation styles in commercial systems
– insert periodic idle cycles
– allow an arbitrary number of transactions up to an estimated power threshold
– run + hold windows
– enforce read & write quotas [this paper]
• the first N transactions in each time window proceed
• any further requests wait until the next time period
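The quota style can be illustrated with a minimal sketch. It assumes in-order (FIFO) service and fixed, back-to-back windows; the function name and parameters are hypothetical and do not describe the POWER6 hardware.

```python
def quota_throttle(arrival_cycles, quota, period):
    """Return the cycle at which each request is released.

    arrival_cycles: sorted request arrival times, in memory cycles
    quota:          N, the maximum number of requests released per window (>= 1)
    period:         window length in memory cycles
    Assumption: requests are released in arrival order (FIFO).
    """
    release = []
    window = 0   # index of the window currently being filled
    used = 0     # requests already released in that window
    for t in arrival_cycles:
        # A request cannot be released before it arrives, nor in a window
        # earlier than the one used by the request ahead of it.
        w = max(t // period, window)
        if w > window:
            window, used = w, 0
        if used == quota:
            # Quota exhausted: wait for the next time period.
            window, used = window + 1, 0
        release.append(max(t, window * period))
        used += 1
    return release


# Four requests, N = 2 per 10-cycle window: the third and fourth wait.
print(quota_throttle([0, 1, 2, 3], quota=2, period=10))  # [0, 1, 10, 10]
```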
Comparison to clock throttling
Run-hold clock throttling:
– regular frequency during run portion
– clock halted during hold portion
Quota-style memory throttling:
– reads & writes proceed as requested, up to N requests per period
– after the Nth request in a period, additional requests are queued for later service
Example: N = 6
– up to 6 transactions serviced per period, regardless of request timing
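To make the "regardless of request timing" point concrete, the sketch below counts how many requests in one period are serviced without waiting under each style. It treats the run-hold gate as if it applied directly to the request stream, purely for illustration; the trace, the 60-cycle run portion, and the function names are all hypothetical.

```python
def immediate_run_hold(offsets, period, run_cycles):
    """Requests that arrive while the run portion of the period is active."""
    return sum(1 for o in offsets if o % period < run_cycles)


def immediate_quota(offsets, quota):
    """Requests within one period that fit in the quota; timing is irrelevant."""
    return min(len(offsets), quota)


# Six requests in one 100-cycle period, most of them arriving late.
trace = [5, 70, 75, 80, 90, 95]
print(immediate_run_hold(trace, period=100, run_cycles=60))  # 1
print(immediate_quota(trace, quota=6))                       # 6
```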
POWER6 Memory Throttling
IBM JS12 blade system
– Processor
• POWER6
• 1 socket × 2 cores per socket
• 3.8 GHz frequency (fixed in these experiments)
• SLES10 Linux
– Memory
• 16 GB capacity
• 8 DIMMs × 2 GB each
• DDR2
• 667 MHz bus
Quota-style memory throttling
– N transactions per M memory cycles
– 100% throttle level == unthrottled
– Time period is shorter than thermal and power-supply timescales
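An "N transactions per M memory cycles" quota implies a bandwidth ceiling that can be estimated with simple arithmetic. The sketch below is only that arithmetic: the transaction size and the N and M values are hypothetical, and the slides do not state how a throttle percentage maps to N and M on this platform.

```python
def quota_bandwidth_cap(n, m_cycles, clock_hz, bytes_per_txn):
    """Upper bound on bytes/second when at most n transactions run per m_cycles.

    clock_hz and bytes_per_txn are assumptions supplied by the caller;
    neither the cycle definition nor the transaction size is given in the slides.
    """
    periods_per_second = clock_hz / m_cycles
    return n * bytes_per_txn * periods_per_second


# Hypothetical values: 10 transactions of 128 bytes per 1000 cycles at 667 MHz.
cap = quota_bandwidth_cap(10, 1000, 667_000_000, 128)
print(f"{cap / 1e6:.1f} MB/s ceiling")
```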
Memory throttle characterization methodology
1. Sweep throttle settings
• Set throttle
• Run steady-behavior benchmark
– DAXPY (double-precision A * X plus Y)
– FPMAC (floating-point multiply accumulate)
– RandomMemory (generate random addresses)
– SPECPower_ssj2008 calibration phase (peak throughput for warehouse transactions)
• Record sensor data, 256 ms per sample
– Memory power
– Memory reads & writes
– Instruction throughput
– And other sensors not shown here
• Decrement throttle
• Repeat for full range of throttle settings
2. Repeat throttle sweep for multiple benchmarks and memory footprints
– Microbenchmarks: L1 cache contained and main memory footprints
– SPECPower_ssj2008: behaves as nearly contained in on-chip caches
3. Calculate median sensor data for each permutation {benchmark, footprint, throttle}
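The three steps above can be sketched as a small harness. The callbacks `set_throttle` and `run_and_sample` are hypothetical stand-ins for platform-specific firmware and sensor access, which the slides do not describe; only the loop structure and the median reduction come from the methodology.

```python
import statistics


def sweep(benchmarks, throttle_levels, set_throttle, run_and_sample):
    """Return {(benchmark, footprint, throttle): {sensor: median value}}.

    benchmarks:      iterable of (name, footprint) pairs
    throttle_levels: iterable of throttle settings to visit
    set_throttle:    hypothetical callback that applies a throttle setting
    run_and_sample:  hypothetical callback returning a list of per-sample
                     dicts, e.g. one dict of sensor readings per 256 ms
    """
    results = {}
    for name, footprint in benchmarks:
        # Start at the highest setting and decrement, as in step 1.
        for level in sorted(throttle_levels, reverse=True):
            set_throttle(level)
            samples = run_and_sample(name, footprint)
            sensors = samples[0].keys()
            # Step 3: median per sensor for this permutation.
            results[(name, footprint, level)] = {
                s: statistics.median(x[s] for x in samples) for s in sensors
            }
    return results
```

Using the median rather than the mean keeps a few outlier samples at the start or end of a run from skewing the per-setting value.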
Memory throttle effect on bandwidth
[Figure: bandwidth under throttling, with three annotated regions: linear, transition between linear & saturated, and saturated]
Subtle but very important point about transition region
Actual bandwidth < max bandwidth
– bandwidth restrictions → pipeline starvation → reduced request rate
A closer look at RandomMemory-DIMM
• uses less bandwidth than other benchmarks at the same throttle levels
• also uses less bandwidth than its own saturation level
Simply measuring bandwidth at a single (current) throttle level is not enough to identify the region of operation
– bandwidth below the maximum could indicate either the saturated or the transition region
– a controller will not be able to accurately predict the effect of a throttle level change on bandwidth
– or predict the effect on power or performance
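One way a controller could work around this is to compare measurements at two throttle levels instead of one. The sketch below is an illustrative probe, not the authors' method; the tolerance, the function name, and the sample values are assumptions.

```python
def classify_region(bw_low, bw_high, level_low, level_high, tol=0.05):
    """Classify the operating region from bandwidth at two throttle levels.

    bw_low, bw_high:       measured bandwidth at the lower and higher setting
    level_low, level_high: the two throttle settings (same units, level_high > level_low)
    tol:                   hypothetical tolerance for measurement noise
    """
    if bw_low <= 0 or level_low <= 0 or level_high <= level_low:
        raise ValueError("need positive values and level_high > level_low")
    observed_gain = bw_high / bw_low - 1        # relative bandwidth change
    ideal_gain = level_high / level_low - 1     # change if bandwidth tracked the throttle
    ratio = observed_gain / ideal_gain
    if ratio >= 1 - tol:
        return "linear"       # bandwidth follows the throttle setting
    if ratio <= tol:
        return "saturated"    # workload demand, not the throttle, sets bandwidth
    return "transition"


print(classify_region(2.0, 4.0, 20, 40))   # linear
print(classify_region(4.0, 4.4, 40, 50))   # transition
print(classify_region(5.0, 5.0, 60, 80))   # saturated
```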
Memory Power
Memory power is basically linear with bandwidth, so this chart looks familiar.
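A linear relationship means memory power can be approximated as a fixed (static) term plus a term proportional to bandwidth. The sketch below fits such a line with ordinary least squares; the data points are made up for illustration and are not measurements from the talk.

```python
def fit_power_model(bandwidth, power):
    """Least-squares fit of power = static + slope * bandwidth.

    Returns (static, slope). Inputs are equal-length sequences of numbers.
    """
    n = len(bandwidth)
    mean_b = sum(bandwidth) / n
    mean_p = sum(power) / n
    sxx = sum((b - mean_b) ** 2 for b in bandwidth)
    sxy = sum((b - mean_b) * (p - mean_p) for b, p in zip(bandwidth, power))
    slope = sxy / sxx
    static = mean_p - slope * mean_b
    return static, slope


# Hypothetical samples: bandwidth in GB/s, power in watts.
static, slope = fit_power_model([0, 1, 2, 3], [10, 12, 14, 16])
print(static, slope)  # 10.0 2.0
```

The intercept corresponds to the portion of memory power that throttling alone cannot remove.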
Throttling effects relative to each benchmark

Generally more performance reduction than power reduction (in %)
– Throttling alone doesn’t affect the static portion of memory power
• Leveraging idle low-power modes of memory can improve the power-performance curve for memory request-rate throttling
– Possible to waste energy from longer execution time

Larger bandwidth demands → larger effect from throttling
– Conversely, power is reduced only when performance is impacted
[Figure: performance and power curves under throttling. L1-contained DAXPY: throttling has no effect. DIMM-sized DAXPY: drastic effect.]
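The energy risk follows from energy = power × time: for a fixed amount of work, execution time scales inversely with performance. The sketch below works through that arithmetic with hypothetical percentages; it assumes runtime is exactly 1 / relative performance and considers only the power being measured.

```python
def relative_energy(power_fraction, performance_fraction):
    """Energy for a fixed amount of work, relative to the unthrottled run.

    power_fraction:       throttled power / unthrottled power
    performance_fraction: throttled throughput / unthrottled throughput
    Assumes runtime scales as 1 / performance_fraction.
    """
    if performance_fraction <= 0:
        raise ValueError("performance_fraction must be positive")
    return power_fraction / performance_fraction


# Hypothetical case: power drops to 90% while performance drops to 70%.
print(round(relative_energy(0.9, 0.7), 3))  # 1.286, i.e. more energy overall
```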
Summary
Memory throttling is a power-performance knob available in commercial systems
Memory controller restricts read & write bandwidth
– caps memory power
– controls DIMM temperature
Mileage may vary
– power and performance management depend on bandwidth demand
• throttling a low-bandwidth workload doesn’t reduce much power
– potential to use more energy due to increased execution time
• use highly throttled settings with caution
Effective tool for power capping
– power-constrained configurations
– thermal safety
– power shifting
Acknowledgements
IBM Research – Austin
IBM Systems & Technology Group
– Memory characterization: Joab Henderson, Kenneth Wright
– EnergyScale firmware: Guillermo Silva, Andrew Geissler