Thermal Management Using PCM-Based Heatsinks Elon Bauer, Joseph Carlos Electrical and Computer Engineering Carnegie Mellon University Pittsburgh, PA eob@andrew.cmu.edu, jcarlos@andrew.cmu.edu Abstract—Existing passive cooling solutions limit the short-term thermal output of systems, thereby either limiting instantaneous performance or requiring active cooling solutions. Active cooling requires relatively large amounts of power, and cannot easily and efficiently handle temperature spikes. Phase change materials (PCMs) are materials that can change phase at a device’s operating temperature, absorbing energy in the process. PCM-based heatsinks are able to tolerate short-term spikes in energy output with substantially less (or even no) active cooling. In this project we examine the tradeoffs and benefits present in PCM-based heatsinks by simulating software workloads and thermal responses on a temperature-aware simulator. I. INTRODUCTION Conventional electronic cooling systems usually consist of a metal heatsink coupled to a fan, which turns on and off based on the load in the system. These systems are inefficient at dissipating heat in a dynamic system that outputs large amounts of heat for short periods of time. This is due to the large thermal inertia of the metals used and the delay between heat input and the time when active cooling begins to take effect. Phase change materials are materials that are special only in that they undergo a change in phase at a certain temperature. During this phase change, PCMs are able to absorb very large amounts of heat without increasing in temperature. This ability to absorb energy without requiring airflow over short periods could lead to a reduced reliance on active cooling techniques. These materials therefore present a great opportunity in heatsink design to absorb the heat dissipated during particular workloads with greater power efficiency over short time periods than conventional systems. The major limitations to PCM-based heat sinks are the short time of phase change and finding a way to contain the PCM in both solid and liquid states. These limitations will be addressed and overcome over the course of this paper. Our main focus will be on the use of Lauric Acid as a PCM, which melts at 44.2 °C, but we will also consider water as a base case. Our simulation methods include the use of Sniper, McPAT, HotSpot, and Matlab to arrive at our conclusions. In section II we will discuss previous work in the field of PCM-based heatsinks. In section III we will provide a brief technical description of PCMs and touch on some challenges present in using them as heatsink materials. In sections IV-VIII we will discuss our simulation method and in the final sections we will discuss our results and conclude the paper. II. PREVIOUS WORK The use of PCMs for cooling electronics is a new area of research. Most works focus on creating accurate parameterizations of various materials or validating their robustness in real systems, while few have discussed end-toend simulation of the interaction between a chip running a specific benchmark and the PCM that is cooling it. Tan & Tso and Kandasamy et al. developed numerical models from empirical data using different materials (n-eicosane and paraffin wax, respectively) [1] [2]. Rostamizadeh et al. developed a theoretical model for PCM from first principles and validated it with experimental results using calcium chloride hexahydate [3]. Efforts have been made, specifically by Wang and Baldea, to simulate power-management systems using PCMs [4]. However, none of these works involves a study of how different software workloads might affect the use of PCM versus conventional cooling methods. Fig. 1. Temperature of a PCM over time. III. TECHNICAL DESCRIPTION Phase change materials are materials which undergo a change in physical phase when a sufficient amount of heat is supplied. The specific phase change we will focus on in this paper will be turning from solid into liquid (melting). During the time when the material is actively melting, additional heat applied to the material will not change the temperature of the material, but is instead absorbed and used in the melting process. This phenomenon continues until the entirety of the material is melted. The temperature graph of a PCM over time looks similar to the graph in figure 1 [4]. Utilizing these materials to absorb heat in their phase change region is more efficient than using materials which do not undergo a phase change because during the phase change an essentially unlimited amount of heat can be input into the material without the external temperature changing, which is untrue of typical metallic heat sink materials, which undergo a linear change in heat as heat is applied. One of the challenges to this task is that phase change materials typically also change shape and physical characteristics as they undergo changes in phase. This poses challenges for use in mobile systems which have restrictions on space and do not allow for expansion and contraction of the materials inside. Existing research has shown that suspending the PCM inside of a thermally conductive matrix of something like epoxy provides similar heat sinking capacities to pure PCM while keeping the overal heatsink from changing shape [4][3]. This has the added benefit of increasing the surface area of the PCM, thus allowing it to absorb and release heat more quickly. We will assume for this paper that this approach is taken, and that the PCM heatsink can be modeled as a block consisting of only PCM material. Another limitation of PCMs is that they can only absorb a finite amount of heat in their phase change region before they return to a proportional relationship of output heat vs. applied heat as shown in figure 1. Thus, PCM-based heat sinks still need active cooling if the average applied heat over time will exceed the melting temperature of the PCM. However, if the application being run consists of short spikes in temperature, interspersed with periods of low temperature, then the PCM would be able to disperse the accumulated heat on its own without needing any active cooling. Fig. 2. McPAT per-unit power values over time. VI. HOTSPOT HEAT SIMULATION McPAT outputs its power values in a JSON format, thus we were forced to create a Matlab utility to parse the data and output it in a power trace – “ptrace” – file that HotSpot could recognize. The code for that utility is included on this project’s website, listed at the end of this paper. Beyond that step, most of the process for going from Sniper to HotSpot is outlined in [7]. We followed that process quite closely, but with a few exceptions. We created our own floorplan of the Nehlaem chip, as shown in Figure 3, using the numbers presented in [7]. However, Sniper did not output as many functional units as were listed in the paper, so many things have been lumped into “other” or combined to form more basic blocks. The Nehalem chip has four cores and an L3 cache, so we concatenated four of our floorplans together and included an L3 cache, as shown in Figure 4. IV. SNIPER POWER SIMULATION In order to simulate the reaction between the PCM-based heat sink and a CPU for various workloads, we first needed to obtain power consumption values for a processor under different workloads. The tool we chose to complete this task was the Sniper simulator [5]. Sniper has a number of built-in benchmarks, which can be easily run with different sizes of data. For our purposes we chose three benchmarks from three different libraries: Splash2-fft, Parsec-blackscholes, and NPBEP. All three were run with the largest data size, while NPB was also run with the smaller data size for comparison. We determined that Sniper by default uses a Nehalem chip, which we determined would suit our purposes in later steps. V. MCPAT POWER SIMULATION By default, Sniper only reports cumulative power values at the conclusion of the benchmark. In order to force it to produce values over time, you have to provide the “--viz" and “-power" flags to Sniper. This causes Sniper to invoke McPAT [6] during its execution to continuously produce power values over time. In addition to full-chip power values, McPAT produces per-unit power values for the different functional units present on the Nehalem chip as shown in Figure 2. This format turned out to be perfect for use in HotSpot which produces per-unit heat values by taking in per-unit power values over time. Fig. 3. Single core floorplan for Nehalem CPU From this point we diverge from [7] to compute the heat numbers generated for the different benchmarks listed in section IV. With a series of ptrace files generated from Matlab and a floorplan containing matching blocks, it was easy to run HotSpot with the basic settings to obtain temperature values over time. A thermal image of the Nehalem chip running the Splash2-fft benchmark can be seen in Figure 5. Note that each core is simply a scaled version of a single core since McPAT does not produce per-core outputs over time for its per-unit power values. We ran two iterations for every benchmark, as suggested in the HotSpot HowTo, to account for initialization errors as the chip warms up. We also modified the temperature.c file of the HotSpot code to reflect more accurate ambient air temperatures and chip startup temperatures. (K) Fig. 4. Full-chip floorplan of Nehalem CPU. Dimensions in meters. VII. PCM ANALYTICAL MODEL In order to characterize the behavior of a PCM-based heat sink, we would first need a model that could take temperature over time as an input and report changes in the heat of the PCM as well as its phase. We attempted to use the transfer function derived in [4] to accomplish this goal, however we eventually determined that the transfer function was not timeinvariant and thus we could not perform a Fourier Transform on it to perform calculations in the frequency domain. After this setback, we came to the realization that we would need to create an analytical model from scratch by ourselves. We did so using Matlab. The model assumes that the heatsink is a 7 cm by 7 cm by 3.5 cm box with a copper container. This container is attached to the chip via thermal paste, which is assumed to have a significantly higher thermal conductivity than copper. 𝑑𝑄 𝑇! − 𝑇! 𝐻= =𝑘∗𝐴∗ (1) 𝑑𝑡 𝐿 Equation 1 gives the rate of transfer of energy through a surface of length L and area A given the "hot" region's temperature, TH, and the "cold" region's temperature, TC. Since the simulation results from HotSpot are temperatures sampled at a static sampling interval, we can solve for the energy input into the heatsink from the chip at a given timestep, as shown in equation 2. Fig. 5. HotSpot heatmap of Nehalem CPU running Splash2-fft. 𝑑𝑄 = 𝑑𝑡 ∗ 𝑘 ∗ 𝐴 ∗ 𝑇!"# − 𝑇!"# (2) 𝐿 In equation 2, k is the thermal conductivity of the material separating the CPU and PCM (in our case, copper), A is the area of the chip connected to the PCM, which is just the chip's total area, and L is the width of the wall of the copper PCM container. A similar quantity can be obtained for the heat transfer due to the ambient air. Adding these quantities results in the total energy transfer in the system in a single time step. 𝑄 = 𝑚 ∗ 𝑐 ∗ 𝑑𝑇 (3) Equation 3 represents the heat required (Q) for a temperature change in a mass m. Here, c is the specific heat of the material and dT is the change in temperature. Using the specific heat of the PCM and the mass of the PCM (calculated using the volume of PCM and it's density), this equation can be rearranged to give equation 4, which is the change in temperature of the PCM in a single time step. Q in this equation is found using equation 2. 𝑑𝑇 = 𝑄 (4) 𝑚∗𝑐 When the PCM is not undergoing a phase change, the change in temperature can be obtained using equation 4. During this phase change we use equation 5 to determine whether the entire PCM has melted or solidified. m is the mass of the PCM and L is its latent heat of fusion. During a phase change, the PCM has constant temperature, so we do not use dQ (from equation 2) to find dT. Instead, we accumulate dQ until the necessary amount of energy (m*L) has been accumulated or dispensed by the PCM. At this point, the PCM has changed phase, and we resume using equations 2 and 4 to find the change in temperature. 𝑄 = 𝑚 ∗ 𝐿 (5) else if state is liquid: dT = dQ / (m * c_liquid) if T_PCM + dT < T_melt: T_PCM_next = T_melt state_next = changing phase else if state is changing phase: Q_acc_next = Q_acc + dQ if Q_acc + dQ > m * L_PCM: state_next = liquid Q_acc_next = 0 The pseudocode for our model’s state machine is as follows: Inputs: CPU_temp_input = list of input CPU temperatures, one per sampling interval dt = sampling interval ACPU = area of CPU Aair = area of PCM in contact with air Lbox = width of copper container wall k = thermal conductivity of copper Tair = constant m = mass of PCM cliquid = specific heat of PCM in liquid phase csolid = specific heat of PCM in solid phase LPCM = latent heat of fusion of PCM TPCM = initial PCM temperature state = initial PCM state Outputs: PCM_temp_output = list of output PCM temperatures, one per sampling interval Pseudo code: Qacc = 0 while CPU_temp_input is nonempty: TCPU = pop(CPU_temp_input) TPCM_next = TPCM state_next = state Qacc_next = Qacc dQCPU = dt * k * ACPU * (TCPU - TPCM) / L_box dQ_air = dt * k * A_air * (T_air - T_PCM) / L_box dQ = dQ_CPU + dQ_air if state is solid: dT = dQ / (m * c_solid) if T_PCM + dT > T_melt: T_PCM_next = T_melt state_next = changing phase else if Q_acc + dQ < - m * L_PCM: state_next = solid Q_acc_next = 0 push(PCM_temp_output, T_PCM) T_PCM = T_PCM_next state = state_next Q_acc = Q_acc_next The heat transfer equations in this section were found in [8]. VIII. PCM SIMULATION The final step towards simulating our PCM-based heatsink was to find a PCM with thermal properties that would perform well as a heatsink at the temperatures the chip was experiencing. A list of PCMs and their thermal properties can be found in [9], and based on the fact that for all benchmarks the CPU was running at around 50 °C, we determined that Lauric Acid, with a melting point of 44.2 °C would be an ideal PCM for our purposes. We also include water as a baseline. As mentioned in section VII, the PCM analytical model uses the heat of fusion, the specific heats of solid and liquid phases, the melting point and the density of the material to characterize its response to temperature input. This provides an accurate response characterization and shows how the phase change would occur in the presence of heat from the CPU and cold from the ambient air around it. IX. RESULTS In order to get a wide range of results, we created temperature profiles of the Nehalem chip in HotSpot for four different benchmarks. These were Splash2-fft with the large dataset, Parsec-blackscholes with the large dataset, and NPB3.3-MPI (EP) with both the large and small datasets. The differences in chip utilization between these benchmarks led to interesting variations in the performance of the PCM as a heatsink. However, because HotSpot produces very small variations in temperature, we increased the variation in HotSpot’s outputs. We also set the room temperature for our PCM to be very close to the melting point, simply because this makes it easier to observe the melting reaction in graph form. We found that for benchmarks like Splash2-fft, which exhibits CPU temperatures as shown in Figure 6, the PCM behaved in an ideal manner. The PCM response for Splash2fft is shown in Figure 7 and shows that the PCM reaches its melting point very rapidly, but then stays at a constant temperature due to the fluctuations of the CPU temperature. With a workload that fluctuates such as this, the PCM will never gain enough heat to completely melt and will thus never increase in temperature. This is the ideal scenario. Fig. 8. Parsec-blackscholes CPU temperature response Fig. 6. Splash2-fft CPU temperature response Fig. 9. PCM heatsink response to Parsec-blackscholes benchmark is for many iterations of the Parsec benchmark, which only runs for a few milliseconds for a single run. So, if the load is high, but for a short time, i.e. a temperature spike, PCM-based heat sinks should be able to effectively dissipate that heat. Fig. 7. PCM heatsink response to Splash2-fft benchmark A less ideal scenario is one like that shown by the Parsec benchmark which climbs quickly to a high load and maintains that load for a long time. This causes the CPU to stay at a high temperature (about 51 °C in this case) for a very long time as shown in Figure 8. Because this temperature is greater than the melting temperature of the PCM, the PCM completely melts and then resumes a linear relationship between heat input and output, eventually reaching equilibrium with the ambient air around 44.3 °C after 120 seconds of simulation time as shown in Figure 9. This is the worst-case scenario, since the PCM is not doing any better than a regular heatsink after it completely melts. However, it should be noted that this The NPB benchmarks that were run behaved similarly to the Parsec benchmarks. If only run for 5-10 iterations of the benchmark, the PCM stayed in the phase transition and perfectly absorbed all the heat. However if the benchmark were run for long enough, it too would completely melt the PCM and cause it to enter the linear region. As a reference, and to verify our model, we created a synthetic temperature profile for use with water as the PCM. The temperature profile starts below the melting point of water and then jumps to 40 °C so that the water can completely melt, then it goes back down so we can see the water freeze again. The temperature profile of the chip and PCM are shown in Figures 10 and 11 respectively. These figures demonstrate that our analytical model is sound and performs as expected, thus validating the rest of our results. X. CONCLUSION In this paper the capability of phase change materials to function as heatsinks for CPUs under various workloads has been explored in detail. Power utilization values were obtained using the Sniper simulator and McPAT for various benchmarks running on a Nehalem CPU. A floorplan of that CPU was created and used for HotSpot simulations, which output temperature values of the various units on the chip. Finally these temperature values were inputted into a custombuilt analytical model of a Lauric Acid-based heatsink to characterize its performance as a heat dissipater. The model was validated with water being used as the phase change material and a synthetic heat input. will become proportional to input temperature. This is not a huge problem for short periods of time, since in this region the PCM is simply performing similarly to how a regular heatsink would perform. The field of PCM-based heatsinks is a growing and exciting field and the authors hope that much more research will be done towards making these thermal wonders a reality. XI. MILESTONES AND SCHEDULE Table 1 contains a list of project milestones and the dates by which they were completed. The year is 2014. TABLE I. PROJECT MILESTONES Milestone Obtain or create a thermal model of a processor for use in HotSpot and Sniper. Obtain power numbers for various benchmarks through Sniper and McPAT. Create a floorplan of the processor for use in HotSpot Obtain heat values over time from the processor using HotSpot. Obtain or create a simulation model for a PCM-based heat sink. Obtain hard numbers for PCM-based heat sink performance. Analyze PCM-based heatsink performance and comment on feasibility Final paper. Date November 11 November 11/20 December 3 December 3 December 6 December 7 December 9 December 11 XII. WORK DISTRIBUTION & WEBSITE Fig. 10. Synthetic temperature response for use with water The work was split evenly between the authors for the entirety of the project. Joseph researched and discovered how to make Sniper output power numbers over time, and handled everything related to Matlab including the script for parsing McPAT output and the PCM analytical model. He was also the primary researcher, contributing most of the previous work section and finding out what the Nehalem CPU specifications were. Elon found benchmarks suitable for the application and ran the simulations for Sniper, converted and input that data into HotSpot, and created Python scripts to make the output consistent with the PCM model inputs. He also created the floorplan for the Nehalem CPU for use in HotSpot. Both remembers contributed to writing the report and creating the website. Our project website was created by Joseph and Elon and can be found at http://www.andrew.cmu.edu/user/eob/. We hope you enjoy it. Fig. 11. Water-based heatsink melting and freezing response We determined that properly selected phase change materials can be highly effective at dissipating heat generated from a CPU over relatively short periods of time. If there are short spikes of heat interspersed with temperatures well below the melting point of the PCM, the PCM will stay in the phase change region and will stay at a constant temperature. If, however, the input temperature is consistently very high, the PCM will melt completely and the response to temperature REFERENCES [1] [2] [3] [4] F.L. Tan, C.P. Tso. “Cooling of mobile electronic devices using phase change materials,” Applied Thermal Engineering vol. 24, issues 2-3, pp. 159-169, February 2004. Ravi Kandasamy, Xiang-Qi Wang, Arun S. Mujumdar. “Application of phase change materials in thermal management of electronics,” Applied Thermal Engineering, vol. 24, issues 17-18, pp.2822-2832, December 2007. Mohammad Rostamizadeh, Mehrdad Khanlarkhani, S. Mojtaba Sadrameli. “Simulation of energy storage system with phase change material (PCM),” Energy and Buildings, vol.49, pp.419-422, June 2012. Siyun Wang, M. Baldea. “Storage-enhanced thermal management for mobile devices,” American Control Conference (ACC), pp.5344-5349, June 2013. [5] [6] [7] [8] [9] Sniper Power Simulator, http://snipersim.org/ McPAT Multicore Power, Area and Timing Framework, http://www.hpl.hp.com/research/mcpat/ Adrian Florea, Claudiu Buduleci, Radu Chis, Arpad Gellert, Lucian Vintan. “Enhancing the Sniper Simulator with Thermal Measurement,” 18th International Conference on System Theory, Control and Computing, Sinaia, Romania, Oct. 17-19, 2014 Hugh Young, Roger Freedman, Lewis Ford. University Physics with Modern Physics (12th Edition). Benjamin Cummings, 2008. Wikipedia, “Phase-change Material, Thermophysical Properties of Common PCMs,” http://en.wikipedia.org/wiki/Phase-change_material.