practical dynamic thermal management on intel desktop computer

PRACTICAL DYNAMIC THERMAL MANAGEMENT
ON INTEL DESKTOP COMPUTER
Guanglei Liu
Department of Electrical and Computer Engineering
Florida International University
July 12, 2012
Major Professor: Dr. Gang Quan
Thermal Design Challenges
Number of transistors keeps increasing
• Nearly 40 billion transistors are integrated into a single die [Mizunuma, ICCAD 2009]
More complicated architectures are built
• An 80-core single-chip processor has been demonstrated by Intel [Vangal, ISSCC 2007]
Figure from Intel Microprocessor Technology Lab, 2011
High transistor density increases power density
Electric Bill
• U.S. datacenters: 120 billion kilowatt-hours in 2012
• 9 billion dollars, 15% of all energy in the U.S.
Environmental concerns
• In the U.S., 46% of electricity is generated from fossil fuels.
Source: Environmental Protection Agency (EPA) Report
High power density raises on-chip temperatures and causes thermal issues
Thermal Issues
Increase package/cooling costs
• 1-3 dollars per watt [Skadron, ICSA 2003]
• In a data center, each watt spent on computing requires ½ - 1 watt for cooling [Brill, 2007]
Affect reliability
• As much as a 50% reduction of a device's life span for every 10°C increase [Yeo, DAC 2008]
Degrade performance
• 10-15% more circuit delay for each 15°C increase [Santarini, EDN 2005]
Increase leakage power consumption
• Raising the temperature from 65°C to 110°C can increase the leakage power of integrated circuits by 38% [Santarini, EDN 2005]
Crash the computing system
• The processor's self-protection mechanism automatically shuts down the processor to avoid physical damage [Rohou, WFDO 1999]
Computing system cooling solutions
Mechanical Cooling Solution
Air-cooling (e.g. fan + heat sink)
• Cooling cost takes 51% of the overall server power budget [Lefurgy, COM 2003]
• Noise level increases by 10 dB as fan speed increases by 50% [Lyon, STMMS 2004]
Liquid-cooling
• High-density liquid absorbs 3500 times more heat than air [Chu, DMR 2004]
High cooling cost
Dynamic Thermal Management (DTM)
• Dynamic voltage and frequency scaling (DVFS) technique [Kim, HPCA 2008]
• Task migration [Lim, QED 2002]
• Clock gating
• Fetch toggling [Brooks, HPCA 2001]
Sacrifice system performance [Gunther, ITJ 2001]
Related Theoretical Work
Thermal-aware throughput maximization
[Chantem et al., ISLPED 2009]
[Zhang et al., ICCAD 2007]
[Chatha et al., DAC 2010]
Overall energy reduction under peak
temperature constraints
[Bao et al., DATE 2010]
[Andrei et al., DAC 2009]
[Huang et al., DATE 2011]
Peak temperature minimization
[Chaturvedi et al., ASPDAC 2011]
[Liu et al., RTAS 2010]
[Qiu et al., ICESS 2010]
Real-time guarantee under peak temperature
constraint
[Chaturvedi et al., CIT 2010]
[Wang et al., RTS 2006]
[Huang et al., RTSS 2009]
These theoretical works are derived from simplified mathematical
thermal models and idealized assumptions
Our Research Goal:
To develop a practical hardware platform that enables us to investigate the
limitations of the existing theoretical work, and to develop practical and effective
DTM techniques that accommodate those limitations
Major contributions
Practical hardware platform
• Intel i5 Quad core
• Linux operating system
[SouthEast 2011]
Thermal management validation
• DTM techniques vs. air-cooling
• DTM vs. DPM algorithm
• Fundamental DTM principles validation
[SUSCOM 2012]
Reactive DTM (single-core)
• Limitations of theoretical works
• Non-constant sampling period
• Thermal profiling analysis
[GreenCom 2012]
Proactive DTM algorithm (multi-core)
• Neighbor-aware temperature prediction
• Algorithm for multicore with task migration
[DATE 2012] [ASP 2012]
Practical Hardware Platform
Dell Precision T1500 workstation
• Intel i5 quad core
• Linux kernel version 2.6.23
SPEC CPU2000 Benchmark
• Integer and floating point operations
Task migration
• CPU_affinity module: migrate process between cores
DVFS technique
• Cpufreq module: 12 different speed levels
Power measurement
• Fluke current clamp, multimeter
Temperature capturing
• CoreTemp driver: read on-chip thermal sensor
Fan speed control
• Cooling/CPU power consumption
• Fancontrol shell script: manually adjust fan speed
Lm-sensors tool
• Computing system hardware monitoring tool
• Monitor system information: temperature value, fan speed, voltage value
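The platform components listed here correspond to standard Linux interfaces, so a small sketch may help make them concrete. It assumes the usual sysfs conventions: hwmon temperature files (as exposed by the coretemp driver) hold millidegrees Celsius, and the cpufreq file scaling_available_frequencies holds a space-separated list of kHz values. The helper names are hypothetical and the slides do not show the scripts that were actually used.

```python
import os
from pathlib import Path


def read_temperature_celsius(sensor_file):
    """Read a hwmon-style temperature file (millidegrees Celsius) as degrees Celsius."""
    return int(Path(sensor_file).read_text().strip()) / 1000.0


def read_speed_levels_khz(freq_file):
    """Parse a cpufreq 'scaling_available_frequencies' file into a sorted list of kHz values."""
    return sorted(int(token) for token in Path(freq_file).read_text().split())


def pin_to_core(pid, core):
    """Restrict a process to a single core (Linux only); pid 0 means the calling process."""
    os.sched_setaffinity(pid, {core})
```

On a typical system the sensor file would be something like /sys/class/hwmon/hwmon0/temp2_input and the frequency file /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies, but the exact paths depend on the kernel and driver, so they are passed in as arguments rather than hard-coded.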
Our Approach
Enhanced reactive DTM (ERDTM)
Buffer zone and safe region
• The maximum possible temperature increment is 4°C
[Figure: temperature vs. time, marking Tthreshold, the buffer zone, Tsafe, and the safe region]
Offline thermal profiling analysis
• Build up a temperature vs. speed lookup table
• Run benchmarks with different speed levels
• Collect corresponding peak temperatures
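One way the buffer zone and the offline lookup table could work together is sketched below. This is an illustration under stated assumptions, not the ERDTM algorithm itself: the buffer zone is assumed to be the band of width 4°C (the maximum temperature increment) just below the threshold, the lookup table is assumed to map each speed level to its profiled peak temperature, and all function names are hypothetical.

```python
def classify_region(temp_c, threshold_c, max_increment_c=4.0):
    """Label a reading as 'violation', 'buffer', or 'safe'.

    Assumption: the buffer zone is the band of width max_increment_c
    directly below the threshold; anything cooler is the safe region.
    """
    if temp_c > threshold_c:
        return "violation"
    if temp_c > threshold_c - max_increment_c:
        return "buffer"
    return "safe"


def fastest_safe_speed(peak_temp_by_speed, threshold_c):
    """Highest speed whose profiled peak temperature is at or below the threshold, or None."""
    candidates = [s for s, peak in peak_temp_by_speed.items() if peak <= threshold_c]
    return max(candidates) if candidates else None


def choose_speed(temp_c, threshold_c, peak_temp_by_speed):
    """Run at top speed in the safe region; otherwise fall back to the lookup table."""
    if classify_region(temp_c, threshold_c) == "safe":
        return max(peak_temp_by_speed)
    safe = fastest_safe_speed(peak_temp_by_speed, threshold_c)
    # If no profiled speed stays under the threshold, use the slowest level.
    return safe if safe is not None else min(peak_temp_by_speed)
```

With a made-up table such as {1200: 45.0, 2000: 52.0, 2667: 60.0} (speed in MHz, peak temperature in °C) and a 55°C threshold, a reading in the buffer zone would select the 2000 MHz level, while a reading in the safe region would keep the top speed.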
Experimental results
Experiment setup
• Four identical tasks assigned to four cores to simulate a single-core environment
• Temperature threshold is 55°C
• Construct the frequency lookup table offline
DTM algorithm performance evaluation
[Figure: throughput of FSDTM, VS-DTM, and ERDTM on the SPEC CPU2000 benchmarks galgel, ammp, lucas, equake, vpr, gcc, parser, and crafty]
ERDTM average throughput improvement is 8.1%
[Figure: temperature traces for the FSDTM, VS-DTM, and ERDTM algorithms, each annotated with its number of violations; the values shown on the slide are 87, 12, and 0]
Neighbor-aware temperature prediction
Our neighbor-aware prediction
• Individual increment factor: processor temperature increment
• Neighbor increment factor: heat transfer from neighbor processor
• The weights in the prediction are obtained offline by collecting training data
Training process
• Run the tasks and record temperature information
• Apply least-square estimation
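The training step can be illustrated with a small least-squares fit. The prediction equation itself did not survive on the slide, so this sketch assumes a simple linear form: the observed temperature increment is modeled as a times the individual increment factor plus b times the neighbor increment factor. The function names and the linear form are assumptions made for illustration.

```python
def fit_two_weights(samples):
    """Least-squares fit of z ~ a*x + b*y.

    samples: iterable of (individual_factor, neighbor_factor, observed_increment).
    Solves the 2x2 normal equations directly and returns (a, b).
    """
    sxx = sxy = syy = sxz = syz = 0.0
    for x, y, z in samples:
        sxx += x * x
        sxy += x * y
        syy += y * y
        sxz += x * z
        syz += y * z
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("training data cannot separate the two factors")
    a = (sxz * syy - syz * sxy) / det
    b = (syz * sxx - sxz * sxy) / det
    return a, b


def predict_increment(a, b, individual_factor, neighbor_factor):
    """Predicted temperature increment under the assumed linear model."""
    return a * individual_factor + b * neighbor_factor
```

The weights would be fitted once offline from recorded runs and then reused online, where each prediction costs only two multiplications.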
Neighbor-aware Task Migration
Conventional approach:
• Always migrate the task from the hottest core to the coolest core.
Our migration strategy
• Heat factor: evaluates the processor hotness
• Increasing factor: evaluates the temperature increment
• Choose the migration candidate with the minimum [quantity missing in source]
NADTM Algorithm
[Flow chart: predict thermal emergency, migrate task, DVFS technique]
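The difference between the conventional policy and a neighbor-aware one can be sketched as follows. The quantity that NADTM minimizes is not recoverable from the slide, so the score used here (a core's own temperature plus a weighted mean of its neighbors' temperatures) is a hypothetical stand-in, as are the function names and the neighbor_weight parameter.

```python
def coolest_core(temps, exclude):
    """Conventional policy: pick the coolest core other than the source core."""
    return min((c for c in temps if c != exclude), key=lambda c: temps[c])


def neighbor_aware_core(temps, neighbors, exclude, neighbor_weight=0.5):
    """Hypothetical neighbor-aware policy.

    Score = own temperature + neighbor_weight * mean temperature of adjacent
    cores. A core with no listed neighbors uses its own temperature instead.
    """
    def score(core):
        adjacent = neighbors.get(core, [])
        if adjacent:
            mean_adjacent = sum(temps[n] for n in adjacent) / len(adjacent)
        else:
            mean_adjacent = temps[core]
        return temps[core] + neighbor_weight * mean_adjacent

    return min((c for c in temps if c != exclude), key=score)
```

For four cores in a row with made-up temperatures {0: 50, 1: 40, 2: 41, 3: 42} and core 0 as the source, the conventional policy picks core 1 because it is coolest, while the neighbor-aware score prefers core 2 because core 1 sits next to the hot source core.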
Performance analysis
[Figure: temperature (Celsius) vs. time (seconds) for NADTM and the OS default, with the threshold marked]
• The NADTM algorithm can effectively keep the temperature under the threshold
• It has a small temperature oscillation of 1°C
Multiple task
• An average of 5.8% overall throughput improvement
Single task
• An average of 3.6% overall throughput improvement
Journals
1. Guanglei Liu, M. Fan, G. Quan, M. Qiu, “On-Line Predictive Thermal Management under Peak Temperature Constraints for Practical Multi-core Platforms”, Journal of Low Power Electronics (ASP) (under review), 2012.
2. Guanglei Liu, G. Quan, M. Qiu, “Practical Dynamic Thermal Management on An Intel Desktop Computer”, Embedded Software Design, Journal of Sustainable Computing (SUSCOM) (under review), 2012.
3. H. Huang, V. Chaturvedi, Guanglei Liu, G. Quan, “Leakage Aware Scheduling On Maximum Temperature Minimization For Periodic Hard Real-Time Systems”, Journal of Low Power Electronics (ASP), 2012.
Peer Reviewed Conferences
1. Guanglei Liu, M. Fan, G. Quan, “Neighbor-Aware Dynamic Thermal Management for Multi-core Platform”, The 15th Design, Automation, and Test in Europe (DATE 2012), Dresden, Germany, March 12-16, 2012.
2. Guanglei Liu, G. Quan, M. Qiu, “The Practical On-line Scheduling for Throughput Maximization on Intel Desktop Platform under the Maximum Temperature Constraint”, The 2011 IEEE/ACM Green Computing and Communications (GreenCom 2011), Sichuan, China, August 4-5, 2011.
3. Guanglei Liu, G. Quan, “Thermal Aware Scheduling on an Intel Desktop Computer”, IEEE SouthEast Conference (SouthEast 2011), Nashville, Tennessee, March 17-20, 2011.
4. Guanglei Liu, J. Fan, “Framework for Statistical Analysis of Homogeneous Multi-core Power Grid Networks”, IEEE 8th International Conference on ASIC (ASICON 2009), Changsha, China, October 20-23, 2009.
5. C. Liu, J. Tan, R. Chen, Guanglei Liu, J. Fan, “Thermal Aware Clocktree Optimization in Nanometer VLSI Systems Considering Temperature Variations”, IEEE 40th Southeastern Symposium on System Theory (SSST 2008), New Orleans, LA, March 17-18, 2008.
Thank You for Your Attention!