Early Validation of MPSoCs Thermal Mitigation through Integration of Thermal Simulation in SystemC Virtual Prototyping

Tanguy Sassolas (1), Charly Bechara (1), Pascal Vivet (2), Hela Boussetta (3) & Luca Ferro (3)
Pascal.Vivet@cea.fr
(1) CEA List, (2) CEA Leti, (3) Docea Power

Thermal issues in modern SoCs
• Increasing thermal issues
  - Technology scaling => higher power density
  - 3D stacking with TSV => greater thermal issues
• Temperature impacts
  - Power consumption
  - Peak performance
  - Ageing
  - Package costs
• MPSoC architectures
  - Dynamic applications, variable execution time
  - Power management solutions (DVFS) can even worsen thermal properties!
• Thermal mitigation schemes must be proposed at design time

Pascal Vivet | DAC’2013 | June 2013 © CEA. All rights reserved

Thermal design flow
[Figure: design flow timeline. Phase durations: 12-24 months, 2-3 months, 6 months. Labels: high-level model (e.g. SystemC), new architecture concept, RTL model, architecture exploration, VHDL design, area & power consumption, netlist, synthesis, refined power studies, fixed floorplan constraints, floorplan exploration, chip sales, tape out, place & route with thermal sign-off, fab, circuit test, development & validation, online thermal control, early power evaluation]
• Need for thermal evaluation tools that take into account a complete SoC environment and its dynamic behaviour
• Need early HW/SW thermal effects evaluation => ESL
• Linking thermal/power/functional tools in the same design flow is mandatory

Thermal Simulation State of the art
• Low-level tools: multiphysics simulation
  - Multiphysics level: FloTherm (Mentor Graphics), Icepak (Ansys/Apache), …
      Mostly oriented to package/board and active cooling (fan) design
  - Post-layout level: Heatwave (Gradient DA)
      Adapted to short-term analysis at die level, but long simulation times
  - Fine grain, but not adapted to SoC/platform-level exploration
• High-level tools
  - HotSpot: complex grid, long transient simulation
  - 3D-ICE: built for 3D stacks, little packaging description
  - Mostly academic tools
  - No native evaluation of thermal impact on power, nor link with VP
• Existing Virtual Platforms (SystemC/TLM)
  - May include power models but no thermal models
• No Thermal/Power/Functional ESL framework for early design

ESL Modeling & Simulation Framework
[Figure labels: Power, Thermal, Functional, TLM]
• Aceplorer™: power modeling and coupled power/thermal simulation
• AceTLMConnect™: library for functional, power and thermal parameter monitoring, providing the co-simulation link
• AceThermalModeler™: dynamic Compact Thermal Model

MPSoC use case: LOCOMOTIV
• LOCOMOTIV architecture
  - Multi-core with shared memory
  - Thermal sensors
  - Power management
      Local: adapts to process/ageing/temperature
      Global: DVFS control per core
• Hardware Assisted Runtime Software (HARS)
• Pedestrian Detection Application
  - Variable execution time
  - Parallel execution

Power & Thermal Aware Virtual Platform
[Figure: virtual platform block diagram; labels grouped below]
• Existing Virtual Platform: application, runtime, runtime sync, hardware-dependent SW, sync HAL, ISS & HW blocks, timed HW platform or PVT platform
• Power: power actuator, DVFS HAL, power unit
• Thermal: thermal management, thermal sensor HAL, thermal sensor
• AceTLMConnect, power & thermal analysis tools: activity, clock domain & voltage supply, static & dynamic power, temperature

LOCOMOTIV: Power Modeling
• Aceplorer power model
  - 5 DPM modes (Idle 0, Idle 1, Idle 2, Idle 3, NoFetch)
  - 5 frequencies (500 MHz, 375 MHz, 187.5 MHz, 68.75 MHz, 0 MHz)
  - 1 generic frequency mode (Vdd-Hopping)
[Figure labels: DVFS modes, DPM modes, Vdd hopping mode]
• Power values refined from post-layout simulations
• Full temperature impact is characterized
  - Impact on leakage
  - Impact on dynamic consumption

LOCOMOTIV: Thermal Modeling
• Thermal description with AceThermalModeler
  - Model geometries and material properties
  - Generate a Compact Thermal Model
      Die model according to floorplan
      SBGA304 package (5.08 x 5.08 mm)
      PCB model (4 dies, 1 FPGA)
• Thermal sensors
  - Real thermal sensor built of ring oscillators
  - Sensor temperature provided by Aceplorer
  - Thermal values accessed through runtime HAL
[L. Vincent, DAC’12]

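The coupling between a compact thermal model and temperature-dependent leakage, as described on the power and thermal modeling slides, can be illustrated with a minimal sketch. It is not the AceThermalModeler or Aceplorer model: it assumes a single lumped thermal node (one thermal resistance `r_th` and one capacitance `c_th`), a hypothetical exponential leakage law, and made-up parameter values chosen only so the example runs.

```python
import math


def leakage_power(temp_c, p_leak_ref=0.05, t_ref=25.0, k=0.02):
    """Hypothetical leakage law (W): grows exponentially with temperature."""
    return p_leak_ref * math.exp(k * (temp_c - t_ref))


def simulate(p_dynamic, t_amb=25.0, r_th=20.0, c_th=0.5,
             dt=0.01, steps=5000, with_leakage=True):
    """Forward-Euler integration of a single-node RC thermal model.

    Assumed equation: c_th * dT/dt = P_total(T) - (T - t_amb) / r_th
    All parameter values are illustrative assumptions, not LOCOMOTIV data.
    Returns the temperature trace in degrees Celsius.
    """
    temp = t_amb
    trace = []
    for _ in range(steps):
        p_total = p_dynamic
        if with_leakage:
            # Power depends on temperature, which closes the feedback loop.
            p_total += leakage_power(temp)
        temp += dt * (p_total - (temp - t_amb) / r_th) / c_th
        trace.append(temp)
    return trace


if __name__ == "__main__":
    without = simulate(2.0, with_leakage=False)
    coupled = simulate(2.0, with_leakage=True)
    print(f"final temperature without leakage: {without[-1]:.2f} C")
    print(f"final temperature with leakage:    {coupled[-1]:.2f} C")
```

A real compact thermal model has many nodes (die blocks, package, PCB), but the same feedback appears: higher temperature raises leakage, which raises temperature further, so the coupled run settles above the uncoupled one.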
Online thermal management
• Software stack enhanced with thermal control
  - Cooperative runtime (thread queue handled by master core)
  - Schedule power-hungry tasks to low-temperature cores
• Reactive thermal control: adapt DVFS to meet thermal budget
  - Profit from data-dependent computation to reduce PE performance
  - Slack reclamation + idle mode when reaching temperature threshold
• Slack reclamation with temperature thresholds:
  - If t > WCET & Temp < Thigh, then DVFS +
  - If t < WCET, then DVFS -
  - If t > Tframe, then lose next frame
  - If Temp > Tcrit, go to IDLE mode
  - If Temp < Tlow, start next frame

Thermal management results (1/3)
[Plot: No thermal mitigation]
• Fast simulation of temperature
  - Warm-up phase (simulators are decoupled)
• Thermal budget is crossed: T > 95 °C
  - Peak temperature = 114 °C
• Thermal control is needed
• Tools are mandatory for early exploration and validation

Thermal management results (2/3)
[Plots: No thermal mitigation; Idle Time Management + Thresholds]
• Thermal budget is respected: T < 95 °C
  - Using thresholds: 70 °C < T < 90 °C
  - Peak temperature = 90.71 °C
• Thermal control works, but with poor application results
  - Skipped frames: 10/52, and the skipped frames are consecutive!

Thermal management results (3/3)
[Plots: No thermal mitigation; Idle Time Management + Thresholds; Slack Reclamation + Thresholds]
• Fine-grain power and thermal analysis simulation time: 16 minutes (SystemC VP simulation time only: 5 minutes)
  - Compliant with early design phases!
• Thermal budget respected: T < 95 °C, same thresholds: 70 °C < T < 90 °C
  - Peak temperature = 91.2 °C
• Slack reclamation: use estimated remaining time to reduce DVFS
• Skipped frames: 3/52. Application rendering is preserved.

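The slack reclamation rules with temperature thresholds listed on the online thermal management slide can be sketched as a small decision function. This is an illustration under stated assumptions, not the HARS runtime code: the names `Action`, `decide` and `may_start_next_frame` are hypothetical, the priority order between rules is an assumption (the slide does not give one), the rule for t < WCET is read as a DVFS decrease, and the default threshold values are placeholders loosely based on the 70 °C, 90 °C and 95 °C figures quoted with the results.

```python
from enum import Enum


class Action(Enum):
    IDLE = "go to IDLE mode"
    SKIP_NEXT_FRAME = "lose next frame"
    DVFS_UP = "DVFS +"
    DVFS_DOWN = "DVFS -"
    HOLD = "keep current operating point"


def decide(t, wcet, t_frame, temp, t_high=90.0, t_crit=95.0):
    """Pick one action from the slide's rules.

    t       : estimated execution time of the current frame
    wcet    : worst-case execution time reference
    t_frame : frame period (deadline)
    temp    : sensed temperature in degrees Celsius
    The rule ordering below is an assumption: thermal safety first,
    then the frame deadline, then DVFS tuning.
    """
    if temp > t_crit:
        return Action.IDLE
    if t > t_frame:
        return Action.SKIP_NEXT_FRAME
    if t > wcet and temp < t_high:
        return Action.DVFS_UP
    if t < wcet:
        # Slack available: reclaim it by lowering the operating point.
        return Action.DVFS_DOWN
    return Action.HOLD


def may_start_next_frame(temp, t_low=70.0):
    """Rule 'If Temp < Tlow, start next frame' (assumed to gate leaving idle)."""
    return temp < t_low


if __name__ == "__main__":
    # Hypothetical numbers: times in ms, temperatures in degrees Celsius.
    for t, temp in [(30, 80), (45, 80), (45, 92), (55, 80), (30, 96)]:
        print(t, temp, decide(t, wcet=40, t_frame=50, temp=temp).value)
```

Returning a single action per call keeps the sketch easy to test; a real runtime could equally evaluate the thermal and timing rules independently and combine their outcomes.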
Conclusion
• Validated thermal mitigation algorithm
  - Thermal mitigation is becoming mandatory with power & thermal issues
  - Thermal/Power/Functional ESL design flow demonstrated with an MPSoC architecture on a real video application
• Limited development costs
  - Power & thermal modelling using commercial ESL tools (Aceplorer & ATM)
  - Can be easily connected to standard Virtual Platforms (SystemC/TLM)
  - Limited simulation costs, using compact transient thermal models
• Perspectives
  - More advanced (predictive) thermal mitigation algorithm
  - Comparison of the design flow against LOCOMOTIV silicon results

Thank you for your attention.
Any questions?
Come and discuss around the poster
Come and see the demo @ DOCEA’s booth, exhibitor 2113