slides - Barry Rountree

HPPAC 2012
Monday, May 21st
LLNL-PRES-552151
This work has been authored by Lawrence Livermore National
Security, LLC under contract DE-AC52-07NA27344 with the U.S.
Department of Energy. Accordingly, the United States Government
retains and the publisher, by accepting this work for dissemination,
acknowledges that the United States Government retains a nonexclusive, paid up, irrevocable, world-wide license to publish or
reproduce the disseminated form of this work or allow others to do
so, for United States Government purposes.
 Traditional
• All components can
operate at highest power
level simultaneously
• Power provisioned for
“worst case”
• Users are happily
oblivious (about power)
• Few if any applications
limited by power
Lawrence Livermore National Laboratory
 Exascale (if not
sooner)
• Not all components can
operate at highest power
level simultaneously
• Power provisioning is best
effort
• Users must tune power for
performance
• Nearly every application
limited by power
 Traditional
• Utilization measured in node-hours
• Weak-scaling jobs perform best using as many nodes as possible
• Running all components as fast as possible reliably leads to top performance
 Exascale (if not sooner)
• Utilization measured in kilowatt-hours
• Weak-scaling jobs may perform optimally with fewer, faster nodes
• Running all components as fast as possible cannot be done. Running most components at identical speeds is suboptimal
Average Processor Power Bound
[Figure: power (Watts) per processor, across processors. Panel labels: rzmerl (Early April), (Mid April), exascale (?). Frequency (GHz) marks: non-turbo (2.6 GHz), max turbo (3.3 GHz). One region is labeled "Lost performance".]
• Each processor uses some amount of power
• Sum of processor power draw divided by processor count must be at or below this level.
• Total processor power divided by processor count should be less than the bound
• Linpack + Intel Turbo Boost
• Short-term solution: Disable Turbo Boost globally
• Mid-term solution: Buy more power (This does not scale)
• Long-term solution: Schedule Turbo Boost to optimize performance
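The bound described on this slide is an average: total processor power divided by processor count must not exceed the bound. A minimal sketch of that check follows. The function names and the sample wattages are hypothetical and chosen only for illustration; they are not measurements from the talk.

```python
def average_processor_power(per_processor_watts):
    """Sum of processor power draw divided by processor count."""
    if not per_processor_watts:
        raise ValueError("need at least one processor")
    return sum(per_processor_watts) / len(per_processor_watts)


def within_average_bound(per_processor_watts, bound_watts):
    """True if the average per-processor draw is at or below the bound."""
    return average_processor_power(per_processor_watts) <= bound_watts


# Hypothetical draws (watts) for four processors.
draws = [60.0, 72.5, 55.0, 80.5]
print(average_processor_power(draws))      # 67.0
print(within_average_bound(draws, 70.0))   # True
print(within_average_bound(draws, 65.0))   # False
```

Note that an average bound lets individual processors exceed the bound as long as others draw less, which is why scheduling power across processors can matter.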
 Running Average Power Limit (RAPL)
• Measures cumulative joules (power x time)
• Three separate power meters
• Clamping on package and DRAM power
 Turbo suppression
 Effective frequency
 libmsr currently under development
• Introduced on Sandy Bridge Processors
• Onboard energy meters measure accumulated joules. Divide by time to get average power.
• Can place user-specified limit on average power over a user-specified time window.
Source: Intel 64 and IA-32 Software Developer’s Manual,
Volume 3B
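The "divide by time" step can be sketched as follows. This is an illustration, not libmsr code: it assumes the energy meter is exposed as a 32-bit counter that wraps around, and that one counter increment is worth 1/2^16 joules. Both values should be read from the hardware (the unit is reported by a separate units register) rather than taken from this sketch. The function names are hypothetical.

```python
COUNTER_BITS = 32  # assumption: the energy counter is 32 bits wide and wraps


def joules_per_count(energy_unit_exponent):
    """One counter increment is 1 / 2**exponent joules."""
    return 1.0 / (1 << energy_unit_exponent)


def average_watts(raw_start, raw_end, seconds, energy_unit_exponent=16):
    """Average power between two raw counter samples taken `seconds` apart.

    The modulo handles a single counter wraparound between the samples.
    The default exponent of 16 is an assumption; query the real value.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta_counts = (raw_end - raw_start) % (1 << COUNTER_BITS)
    return delta_counts * joules_per_count(energy_unit_exponent) / seconds


# 50 joules accumulated over one second -> 50 W.
print(average_watts(1000, 1000 + 50 * 65536, 1.0))   # 50.0
# Counter wrapped between samples: 2 joules over one second -> 2 W.
print(average_watts(2**32 - 65536, 65536, 1.0))      # 2.0
```

Sampling must be frequent enough that the counter wraps at most once between reads; otherwise the computed energy is too low.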
• Setting LOCK fixes power limits until reboot
• Limits are ignored until enable bits are set
• Power limit is enforced using average watts over user-specified window.
• Watts granularity: 0.125W
• Minimum power bound: 51W
• Two windows allow tweaking peak and average power
• Higher bound, smaller window for peak power
• Lower bound, wider window for average power
• Resolution: ~1ms
• Max Window: ~46ms
Source: Intel 64 and IA-32 Software Developer’s Manual,
Volume 3B
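To make the register fields concrete, here is a rough sketch of packing two power limits into one 64-bit value. The bit positions are my reading of the package power limit register in the cited manual (15-bit limit field, enable at bit 15, clamp at bit 16, second limit in the upper 32 bits, lock at bit 63) and should be checked against the manual before use. The time-window fields are left at zero for brevity, and all names are hypothetical.

```python
POWER_UNIT_WATTS = 0.125        # granularity stated on the slide

LIMIT_MASK = 0x7FFF             # assumed 15-bit power-limit field
ENABLE_BIT = 1 << 15            # assumed: limit is ignored unless set
CLAMP_BIT = 1 << 16             # assumed clamping bit
SECOND_LIMIT_SHIFT = 32         # assumed offset of the second limit
LOCK_BIT = 1 << 63              # assumed: fixes the limits until reboot


def encode_limit(watts, enable=True, clamp=True):
    """Encode one limit field; time-window bits are left at zero here."""
    counts = round(watts / POWER_UNIT_WATTS)
    if not 0 <= counts <= LIMIT_MASK:
        raise ValueError("power limit out of range")
    field = counts
    if enable:
        field |= ENABLE_BIT
    if clamp:
        field |= CLAMP_BIT
    return field


def encode_register(average_watts, peak_watts, lock=False):
    """Lower bound in the first field, higher bound in the second."""
    value = encode_limit(average_watts)
    value |= encode_limit(peak_watts) << SECOND_LIMIT_SHIFT
    if lock:
        value |= LOCK_BIT
    return value


def decode_watts(field):
    """Recover the wattage from a single limit field."""
    return (field & LIMIT_MASK) * POWER_UNIT_WATTS


print(hex(encode_limit(51.0)))                            # 0x18198
print(decode_watts(encode_register(51.0, 80.0) >> 32))    # 80.0
```

Leaving `lock` off by default is deliberate: once set, a mistaken limit could not be corrected without a reboot.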
Similar interface for DRAM power control
Only one power limit supported
Source: Intel 64 and IA-32 Software Developer’s Manual,
Volume 3B
[Figure: rzzin, mg.C.8, 64 processors, 34 power bounds. x-axis: Processors. Panels: No Power Bound; 51W Power Bound.]
• Processors are heterogeneous under a power bound
• Processors take similar time
• Significant variation in power
• Power variation is expected and acceptable
• Processors require same amount of power
• Individual processor efficiency has not changed
• Efficiency variation manifests as performance variation
• Where should the hot processors go?
• Is it worth paying a premium for efficient processors?
[Figure: rzmerl, NPB C.8, 234 processors. y-axis: Average Watts. x-axis: Processors ordered by cg.C.8 average PKG power.]
• Wide variation in power consumption across applications
• Provisioning power for most power-hungry application leaves remaining applications node-bound, not power-bound
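The node-bound versus power-bound distinction can be illustrated with a small calculation. The sketch below provisions power for the hungriest application and then asks, for each application, whether the budget could power more processors than exist. The application labels and wattages are hypothetical placeholders, not values read from the figure.

```python
def classify(avg_watts_per_processor, budget_watts, processors_available):
    """Return ('node-bound' | 'power-bound', unused watts at full node count).

    An application is treated as node-bound when the budget could power
    more processors than the machine has.
    """
    affordable = int(budget_watts // avg_watts_per_processor)
    unused = budget_watts - avg_watts_per_processor * processors_available
    kind = "node-bound" if affordable > processors_available else "power-bound"
    return kind, max(unused, 0.0)


# Hypothetical average per-processor draw (watts) for three applications.
apps = {"hungry": 80.0, "medium": 65.0, "light": 50.0}
processors = 234

# Provision for the worst case: the most power-hungry application.
budget = max(apps.values()) * processors   # 18720.0 W

for name, watts in apps.items():
    print(name, classify(watts, budget, processors))
# hungry ('power-bound', 0.0)
# medium ('node-bound', 3510.0)
# light ('node-bound', 7020.0)
```

The unused watts in the node-bound cases are provisioned power that the application cannot convert into performance.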
[Figure: rzmerl, NPB C.8, 234 processors. y-axis: Average Watts. x-axis: Processors ordered by cg.C.8 average PKG power.]
• Memory power substantially lower than package power
 Overprovision hardware
• Processors are cheap and plentiful
• Power is not
 Measure performance at max power
consumption
• May require turning off nodes
• Running out of nodes before running out of power means
application is not power-bound
 Expect heterogeneous processor performance
• Put most-efficient nodes on the critical path if possible
• Put least-efficient nodes where they will do the least harm
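One simple way to act on the last two bullets is a greedy pairing: rank nodes by measured efficiency, rank tasks by how critical they are, and match them in order. This is a sketch under stated assumptions, not a method from the talk. The efficiency metric (for example, performance under the power bound) and the criticality scores are hypothetical inputs that would have to be measured or estimated.

```python
def place_tasks(node_efficiency, task_criticality):
    """Pair the most critical tasks with the most efficient nodes.

    node_efficiency:  {node name: efficiency score, higher is better}
    task_criticality: {task name: criticality score, higher is more critical}
    Returns {task name: node name}.
    """
    if len(task_criticality) > len(node_efficiency):
        raise ValueError("more tasks than nodes")
    nodes = sorted(node_efficiency, key=node_efficiency.get, reverse=True)
    tasks = sorted(task_criticality, key=task_criticality.get, reverse=True)
    # The least critical tasks end up on the least efficient nodes used.
    return dict(zip(tasks, nodes))


# Hypothetical scores.
nodes = {"n0": 0.9, "n1": 1.2, "n2": 1.0}
tasks = {"t_crit": 10, "t_mid": 5, "t_slack": 1}
print(place_tasks(nodes, tasks))
# {'t_crit': 'n1', 't_mid': 'n2', 't_slack': 'n0'}
```

A greedy pairing ignores communication topology and load imbalance, so it is only a starting point for placement.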