Early Validation of MPSoCs Thermal Mitigation through Integration of Thermal Simulation in SystemC Virtual Prototyping
Tanguy Sassolas (1), Charly Bechara (1), Pascal Vivet (2), Hela Boussetta (3) & Luca Ferro (3)
(1) CEA List, (2) CEA Leti, (3) Docea Power
Pascal.Vivet@cea.fr
Thermal issues in modern SoCs
 Increasing thermal issues
 Technology scaling => higher power density
 3D stacking with TSV => greater thermal issues
 Temperature impacts
 Power consumption
 Peak performance
 Ageing
 Package costs
 MPSoC architectures
 Dynamic applications, variable execution time
 Power management solutions (DVFS) can even worsen thermal behaviour!
 Thermal mitigation schemes must be proposed at design time
Pascal Vivet| DAC’2013| June 2013
© CEA. All rights reserved
Thermal design flow
[Figure: design flow timeline. Steps: new architecture concept, high level model (e.g. SystemC), architecture exploration, RTL model / VHDL design, synthesis, netlist, floorplan exploration, place & route with thermal sign-off, tape out, fab, circuit test, chip sales. Phase durations shown: 12-24 months, 2-3 months, 6 months. Annotations: early power evaluation; area & power consumption; refined power studies; fixed floorplan constraints; development & validation; online thermal control.]
 Need for thermal evaluation tools that take into account a complete SoC environment and its dynamic behaviour
 Need early evaluation of HW/SW thermal effects => ESL
 Linking thermal/power/functional tools in the same design flow is mandatory
Thermal Simulation State of the art
 Low level tools: Multiphysics simulation
 Multiphysics level : FloTherm (Mentor Graphics), Icepak (Ansys/Apache), …
 Mostly oriented to package/board and active cooling (fan) design
 Post-layout level : Heatwave (Gradient DA)
 Adapted to short term analysis at die level but long simulation times
 Fine grain but not adapted to SoC/Platform level exploration
 High level tools:
 HotSpot: complex grid, long transient simulation
 3D-ICE: built for 3D stacks, little packaging description
 Mostly academic tools
 No native evaluation of thermal impact on power, and no link with VP
 Existing Virtual Platforms (SystemC/TLM)
 May include power models but no thermal models
No Thermal/Power/Functional ESL framework for early design
ESL Modeling & Simulation Framework
[Figure: framework linking the Power, Thermal and Functional (TLM) domains]
 Aceplorer™: power modeling and coupled power/thermal simulation
 AceTLMConnect™: library for functional, power and thermal parameter monitoring, providing the co-simulation link
 AceThermalModeler™: dynamic Compact Thermal Model
MPSoC use case: LOCOMOTIV
 LOCOMOTIV architecture
 Multi-Core with shared memory
 Thermal sensors
 Power management
 Local: adapts to process/ageing/temperature
 Global: DVFS control per core
 Hardware Assisted Runtime Software (HARS)
 Pedestrian Detection Application
 Variable execution time
 Parallel execution
Power & Thermal Aware Virtual Platform
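[Figure: layered virtual platform with three columns: existing Virtual Platform, Power, Thermal]
 Application layer: Application
 Runtime layer: Runtime sync (existing VP), Power actuator (Power), Thermal management (Thermal)
 Hardware dependent SW layer: Sync HAL (existing VP), DVFS HAL (Power), Thermal sensor HAL (Thermal)
 Timed HW platform or PVT platform layer: ISS & HW blocks (existing VP), Power unit (Power), Thermal sensor (Thermal)
 AceTLMConnect links the platform to the power & thermal analysis tools
 Data exchanged over the link: activity, clock domain & voltage supply, static & dynamic power, temperature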
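The data exchange on this slide can be pictured as a lock-step loop. The sketch below is only an illustration under stated assumptions: it assumes activity, clock and voltage flow from the virtual platform to the analysis tools, and that power and temperature flow back. `PlatformSample`, `AnalysisResult`, `StubAnalysisTool` and every coefficient are hypothetical names and values; none of this is the AceTLMConnect or Aceplorer API.

```python
from dataclasses import dataclass


@dataclass
class PlatformSample:
    """Hypothetical message: virtual platform -> analysis tools."""
    activity: float   # fraction of busy cycles in the step (0..1)
    freq_mhz: float   # clock domain frequency
    vdd: float        # supply voltage


@dataclass
class AnalysisResult:
    """Hypothetical message: analysis tools -> virtual platform."""
    static_mw: float
    dynamic_mw: float
    temperature_c: float


class StubAnalysisTool:
    """Placeholder for the external power/thermal solver (made-up models)."""

    def __init__(self, t_amb_c=25.0):
        self.t_amb_c = t_amb_c
        self.temperature_c = t_amb_c

    def step(self, sample, dt_s):
        # Made-up coefficients, only there to produce plausible numbers.
        dynamic = 0.5 * sample.activity * sample.vdd ** 2 * sample.freq_mhz
        static = 20.0 * sample.vdd
        target = self.t_amb_c + 0.1 * (dynamic + static)
        # Crude first-order relaxation towards the target temperature.
        self.temperature_c += (target - self.temperature_c) * min(1.0, dt_s)
        return AnalysisResult(static, dynamic, self.temperature_c)


def cosimulate(samples, dt_s=0.1):
    """Feed platform samples to the tool and collect the sensor readings."""
    tool = StubAnalysisTool()
    readings = []
    for sample in samples:
        result = tool.step(sample, dt_s)
        # This is the value a thermal sensor HAL would hand to the runtime.
        readings.append(result.temperature_c)
    return readings
```

In this picture the thermal management software only ever sees `readings`, which mirrors the layering on the slide: the runtime reaches temperature through the sensor HAL, not through the analysis tool directly.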
LOCOMOTIV : Power Modeling
 Aceplorer power model
 5 DPM modes (Idle 0, Idle 1, Idle 2, Idle 3, NoFetch)
 5 frequencies (500 MHz, 375 MHz, 187.5 MHz, 68.75 MHz, 0 MHz)
 1 generic frequency mode (Vdd-Hopping)
[Figure labels: DVFS modes, DPM modes, Vdd hopping mode]
 Power values refined from post layout simulations
 Full temperature impact is characterized
 Impact on leakage
 Impact on dynamic consumption
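To make the idea of a temperature-aware power model concrete, here is a minimal sketch. The five frequencies come from the slide; everything else is an assumption: the supply voltages, the effective capacitance, the reference leakage and the "leakage doubles every 20 °C" rule are hypothetical placeholders, not LOCOMOTIV characterization data. The temperature impact on dynamic consumption mentioned above is left out for brevity.

```python
# Frequencies (MHz) from the slide; the voltages (V) are hypothetical.
DVFS_MODES = {
    500.0: 1.2,
    375.0: 1.1,
    187.5: 1.0,
    68.75: 0.9,
    0.0: 0.9,
}

C_EFF_NF = 0.5          # hypothetical effective switched capacitance (nF)
LEAK_REF_MW = 20.0      # hypothetical leakage at T_REF_C and 1.0 V (mW)
T_REF_C = 25.0          # reference temperature (deg C)
LEAK_DOUBLING_C = 20.0  # hypothetical: leakage doubles every 20 deg C


def dynamic_power_mw(freq_mhz, activity=1.0):
    """Classic a * C * V^2 * f estimate; nF * V^2 * MHz yields mW."""
    vdd = DVFS_MODES[freq_mhz]
    return activity * C_EFF_NF * vdd ** 2 * freq_mhz


def leakage_power_mw(freq_mhz, temp_c):
    """Leakage scaled linearly with Vdd and exponentially with temperature."""
    vdd = DVFS_MODES[freq_mhz]
    return LEAK_REF_MW * vdd * 2.0 ** ((temp_c - T_REF_C) / LEAK_DOUBLING_C)


def total_power_mw(freq_mhz, temp_c, activity=1.0):
    """Sum of the dynamic and temperature-dependent static contributions."""
    return dynamic_power_mw(freq_mhz, activity) + leakage_power_mw(freq_mhz, temp_c)
```

With these placeholder numbers, `total_power_mw(500.0, 95.0)` is larger than `total_power_mw(500.0, 25.0)` purely because of leakage, which is the feedback loop that makes coupled power/thermal simulation necessary.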
LOCOMOTIV : Thermal Modeling
 Thermal description with AceThermalModeler
 Model geometries and material properties
 Generate a Compact Thermal Model
 Die model according to floorplan
 SBGA304 package (5.08x5.08 mm)
 PCB model (4 dies, 1 FPGA)
 Thermal sensors
 Real Thermal Sensor built of Ring Oscillators
 Sensor temperature provided by Aceplorer
 Thermal values accessed through runtime HAL
[L. Vincent, DAC’12]
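As a rough illustration of what a compact transient thermal model computes, the sketch below uses a single lumped RC node, C dT/dt = P - (T - T_amb)/R, advanced with the exact solution for constant power over each step. This is a deliberate simplification: a generated compact model would have many coupled nodes for die, package and PCB, and the values of `r_th`, `c_th` and `t_amb_c` here are hypothetical.

```python
import math


def step_temperature(temp_c, power_w, dt_s, r_th=30.0, c_th=0.05, t_amb_c=25.0):
    """Advance one lumped thermal node by dt_s seconds at constant power.

    r_th is a thermal resistance in K/W and c_th a thermal capacitance in J/K;
    both defaults are hypothetical.
    """
    t_steady = t_amb_c + power_w * r_th       # temperature reached after a long time
    decay = math.exp(-dt_s / (r_th * c_th))   # exponential relaxation factor
    return t_steady + (temp_c - t_steady) * decay


def simulate(power_trace_w, dt_s, t_start_c=25.0):
    """Return the temperature after each step of a power trace."""
    temps = []
    temp = t_start_c
    for power in power_trace_w:
        temp = step_temperature(temp, power, dt_s)
        temps.append(temp)
    return temps
```

Because each step uses the closed-form exponential rather than forward Euler, the update stays stable for any step size, which matters when the thermal solver is stepped at the coarser pace of a virtual platform.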
Online thermal management
 Software stack enhanced with thermal control
 Cooperative runtime (thread queue handled by master core)
 Schedule power hungry tasks to low temperature cores
 Reactive thermal control: Adapt DVFS to meet thermal budget
 Exploit data-dependent computation to reduce PE performance
 Slack Reclamation + Idle mode when reaching temperature threshold
Slack Reclamation with Temperature Thresholds:
• If t > WCET & Temp < Thigh, then DVFS +
• If t < WCET, then DVFS -
• If t > Tframe, then lose next frame
• If Temp > Tcrit, go to IDLE mode
• If Temp < Tlow, start next frame
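The threshold rules above can be read as a small decision function. The sketch below is one possible interpretation, not the authors' runtime code: it assumes `t_est` is the estimated completion time of the current frame, that "DVFS +" and "DVFS -" move one step along the list of non-zero frequencies, and that an idle core stays idle until it cools below Tlow. The default values 70 °C and 90 °C echo the thresholds quoted on the results slides, but their mapping to Tlow and Tcrit, and the 85 °C value for Thigh, are assumptions.

```python
# Non-zero frequencies (MHz) from the power-model slide, lowest first.
FREQS_MHZ = [68.75, 187.5, 375.0, 500.0]


def next_action(t_est, wcet, t_frame, temp, level, idle,
                t_low=70.0, t_high=85.0, t_crit=90.0):
    """Apply the threshold rules once.

    Returns (level, idle, skip_next_frame), where level indexes FREQS_MHZ.
    """
    # Rule: if the frame overruns its period, the next frame is lost.
    skip_next_frame = t_est > t_frame

    # Rule: above the critical temperature, go to IDLE mode.
    if temp > t_crit:
        return level, True, skip_next_frame

    # Rule: an idle core restarts only once it has cooled below t_low.
    if idle:
        return level, temp >= t_low, skip_next_frame

    if t_est > wcet and temp < t_high:
        # Running late with thermal headroom: DVFS +.
        level = min(level + 1, len(FREQS_MHZ) - 1)
    elif t_est < wcet:
        # Slack available: DVFS -.
        level = max(level - 1, 0)
    return level, False, skip_next_frame
```

For example, `next_action(12, 10, 20, 60.0, 1, False)` steps the frequency level up from 1 to 2, while the same call at 92 °C puts the core in idle instead.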
Thermal management results (1/3)
No thermal mitigation
Fast Simulation of Temperature
Warm-up phase (Simulators are decoupled)
Thermal budget is crossed : T > 95 °C
Peak temperature = 114°C
Thermal control is needed
Tools are mandatory for early exploration and validation
Thermal management results (2/3)
[Figure legend: No thermal mitigation; Idle Time Management + Thresholds]
Thermal budget is respected : T < 95°C
Using thresholds : 70°C < T < 90 °C
Peak temperature = 90.71°C
Thermal control works but with poor application results:
Skipped frames: 10/52, and the skipped frames are successive!
Thermal management results (3/3)
[Figure legend: No thermal mitigation; Idle Time Management + Thresholds; Slack Reclamation + Thresholds]
Fine grain power and thermal analysis simulation time: 16 minutes
(SystemC VP simulation time only : 5 minutes)
Compliant with early design phases!
Thermal budget respected : T < 95°C, same thresholds : 70°C < T < 90 °C
Peak temperature = 91.2°C
Slack reclamation : use estimated remaining time to reduce DVFS
Skipped frames : 3/52. Application rendering is preserved.
Conclusion
 Validated thermal mitigation algorithm
 Thermal mitigation is becoming mandatory with power & thermal issues
 Thermal/Power/Functional ESL design flow demonstrated with an MPSoC architecture on a real video application
 Limited development costs
 Power & Thermal modelling : using commercial ESL tools (Aceplorer & ATM)
 Can be easily connected to standard Virtual Platforms (SystemC/TLM)
 Limited simulation costs, using compact transient thermal models
 Perspectives
 More advanced (predictive) thermal mitigation algorithm
 Comparison of the design flow against LOCOMOTIV silicon results
 Thank you for your attention. Any questions?
 Come and discuss at the poster
 Come and see the demo at DOCEA's booth, exhibitor 2113