A SIMULATED ANNEALING APPROACH TO THERMAL PLACEMENT PROBLEMS

Shin-Fa Chen and Jing Lee
Department of Electronic Engineering, Southern Taiwan University of Technology
Email: leejing@mail.stut.edu.tw

ABSTRACT

This paper presents a simulated annealing approach to the thermal placement problem. To speed up the convergence of simulated annealing, we propose a new perturbation technique that mixes greedy perturbation with random perturbation. Our method begins with a random initial placement. Then, a candidate placement is found by the mixed perturbation. The candidate is accepted or rejected based on its system failure rate. A heat transfer solver is used to determine the temperature distribution on the placement substrate and to calculate the system failure rate of the placement. Three industry circuits designed by IBM are used to examine the present method and to compare it with an existing placement algorithm.

1. INTRODUCTION

Placement is the process of arranging circuit components on a layout surface such that multiple, possibly conflicting, design objectives are satisfied. Historically, placement techniques have been developed primarily on the basis of routability. Sherwani [1] provides a summary of the classic techniques. With the increasing demand for high quality and reliable performance over time, however, techniques that address placement for reliability, known as the thermal placement problem, have become necessary [2].

Previous work on the thermal placement problem falls into two major categories: thermal force-directed placement algorithms (TFPA) [3, 4] and metaheuristics [5-8]. In the TFPA, the thermal placement problem is mapped into finding a force-equilibrium placement by solving a system of thermal force equations. Since this method is a constructive algorithm, it is fast but easily falls into local optima. Metaheuristics, on the other hand, are able to avoid being trapped in local optima.
However, these methods are too time-consuming, especially for the thermal placement problem, because at each iteration they must calculate the temperature distribution on the substrate and the system failure rate in order to decide whether to accept or reject the new placement. In this study, we propose a simulated annealing approach with a fast heat transfer solver for the thermal placement problem. The present method can be applied to VLSI, hybrid circuit, and MCM designs. For simplicity, however, the following description is restricted to MCM designs.

2. PROBLEM DESCRIPTION

2.1 Package Structure

Here, we consider an electronic package as illustrated in Fig. 1. The model consists of a sandwich structure formed from a multilayer substrate, an epoxy adhesive layer, and an aluminum heat sink, with thicknesses of 9, 0.076, and 1.27 mm, respectively. Within each layer, the material is assumed to be linear, isotropic, and homogeneous. The thermal conductivities of the multilayer substrate, the epoxy layer, and the heat sink are 39.4 W/mK, 0.276 W/mK, and 195 W/mK, respectively. Heat losses from the top and bottom sides of the package are quantified by the heat transfer coefficients $h_1$ and $h_2$. Since the heat loss from the sidewalls of the substrate is insignificant compared to that from the top and bottom sides, the sidewalls are treated as perfect insulators.

Fig. 1 The structure of an electronic package

2.2 Thermal Placement Problem

Consider a two-dimensional substrate on which chips are to be placed in a checkerboard model. The substrate is characterized in terms of a finite array of chip sites. A matrix location, or chip site, is represented by a point in an x-y coordinate system as depicted in Fig. 1. Chips are the entities to be assigned to chip sites on the substrate. If a chip is assigned to a chip site, the chip is positioned at the center of that chip site.
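The checkerboard model described above can be illustrated with a minimal sketch. It assumes a uniform grid of equally sized chip sites on a substrate of length L along x and width W along y; the function name `site_centers`, the row-major site numbering, and the example dimensions are hypothetical and are not taken from the paper.

```python
def site_centers(rows, cols, length, width):
    """Map each chip-site index to the (x, y) center of that site.

    Assumption: the substrate spans [0, length] along x and [0, width]
    along y and is divided into rows x cols equal chip sites, numbered
    row by row starting from 0.
    """
    dx = length / cols  # site size along x
    dy = width / rows   # site size along y
    return {
        r * cols + c: ((c + 0.5) * dx, (r + 0.5) * dy)
        for r in range(rows)
        for c in range(cols)
    }


# Hypothetical 2 x 3 array of sites on a 60 x 40 substrate.
centers = site_centers(rows=2, cols=3, length=60.0, width=40.0)
```

With this representation, a placement is simply an assignment of chips to site indices, and the heat source of each chip is located at the center returned for its site.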
The thermal placement problem can be stated briefly as follows: given a set of chips C with its set of heat dissipations Q, and a set of chip sites S on the substrate, assign each chip to one of the chip sites such that the system failure rate is minimized.

3. SOLUTION METHODOLOGY

The solution methodology, illustrated in Fig. 2, has two elements: a heat transfer solver and a simulated annealing algorithm. The heat transfer solver predicts the temperature of each heat source, which is used to calculate the individual failure rates. The simulated annealing algorithm performs the adaptive search for optimal or near-optimal solutions.

Fig. 2 Illustration of the solution methodology

3.1 Heat Transfer Solver

Several analytical methods have been presented to determine the temperature distribution of the package depicted in Fig. 1 [9]. In this study, the TAMS (Thermal Analyzer for Multilayer Structures) program developed by Ellison [10] is used to predict the chip temperatures.

The failure rate, measured in failures per megahour (fr/Mh), is used to assess the cost of a placement. In this study, the failure rate of chip i is estimated using the Arrhenius relation:

$$\lambda_i = \lambda_r \exp\left[\frac{E_a}{B}\left(\frac{1}{T_r} - \frac{1}{T_i}\right)\right] \qquad (1)$$

where $\lambda_i$ and $\lambda_r$ are the failure rates of chip i at a temperature of $T_i$ K and at a reference temperature of $T_r$ K, respectively; $E_a$ is the activation energy (eV); and $B$ is Boltzmann's constant. The system failure rate of a placement p, denoted $\lambda(p)$, is given by the sum of the individual chip failure rates.

3.2 Simulated Annealing

The simulated annealing algorithm is a general-purpose combinatorial optimization technique that is analogous to the process of metallurgical annealing, in which a system is heated and then cooled gradually until the material achieves certain desired metallurgical properties [11]. Our algorithm starts by randomly generating an initial placement p and by initializing the so-called temperature parameter T. Then, at each iteration, a candidate placement p’ is found by a perturbation technique, and whether p’ is accepted as the new placement depends on $\lambda(p)$, $\lambda(p')$, and T. p’ replaces p if $\lambda(p') < \lambda(p)$ or, in case $\lambda(p') \ge \lambda(p)$, with a probability that is a function of T and $\Delta\lambda = \lambda(p') - \lambda(p)$. The probability is generally computed following the Boltzmann distribution $e^{-\Delta\lambda/T}$.

In the beginning, T is set to a very high value such that most of the candidate solutions are accepted. Then T is gradually decreased, so candidates with higher failure rates than the current solution have less chance of being accepted. Finally, T is reduced to a very low value so that only candidates with a lower system failure rate than the current placement are accepted, and the algorithm converges to a placement with a low system failure rate. The details of our algorithm are shown in Fig. 3.

procedure SA( )
    p ← random initial placement
    λ(p) ← TAMS(p)              // calculate the system failure rate
    T ← T0                      // initial temperature
    p_best ← p                  // currently best placement
    λ_best ← λ(p)
    while (T > Tf)              // Tf is the frozen temperature
        for i ← 1 to M          // M is the length of the Markov chain
            p’ ← PERTURB(p)
            λ(p’) ← TAMS(p’)
            Δλ ← λ(p’) - λ(p)
            if λ(p’) < λ_best then
                p_best ← p’ ; λ_best ← λ(p’)
            endif
            if Δλ < 0 or RANDOM(0,1) < e^(-Δλ/T) then
                p ← p’ ; λ(p) ← λ(p’)
            endif
        end
        T ← SCHEDULE(T)
    end
    OUTPUT(p_best)

Fig. 3 The procedure of the simulated annealing

3.2.1 Perturbations

In this study, we apply two very different perturbation techniques. The first, named random perturbation, is a swap of two randomly selected chips. The second, named greedy perturbation, is a swap of the highest-temperature chip and the lowest-temperature chip. We expect the greedy perturbation to help the search converge faster, while the random perturbation helps it escape from local minima. Our perturbation technique is therefore a mix of the two.
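The two perturbations described above, and one possible way of mixing them, can be sketched as follows. This is only an illustration under stated assumptions: a placement is represented as a list in which `placement[s]` is the chip at site `s`, `chip_temps[s]` is the temperature of the chip currently at site `s` (as a heat transfer solver would supply it), every site is occupied, and the perturbation type is chosen at random with probability `greedy_fraction` at each call. All function and parameter names are hypothetical; the paper does not state how the two perturbations are interleaved.

```python
import random


def random_perturbation(placement, rng=random):
    """Random perturbation: swap two randomly selected chips."""
    i, j = rng.sample(range(len(placement)), 2)  # two distinct sites
    candidate = list(placement)                  # leave the input untouched
    candidate[i], candidate[j] = candidate[j], candidate[i]
    return candidate


def greedy_perturbation(placement, chip_temps):
    """Greedy perturbation: swap the hottest chip with the coolest chip."""
    sites = range(len(placement))
    hot = max(sites, key=lambda s: chip_temps[s])   # site of hottest chip
    cold = min(sites, key=lambda s: chip_temps[s])  # site of coolest chip
    candidate = list(placement)
    candidate[hot], candidate[cold] = candidate[cold], candidate[hot]
    return candidate


def mixed_perturbation(placement, chip_temps, greedy_fraction=0.2, rng=random):
    """Apply the greedy swap with probability greedy_fraction, else a random swap.

    Assumption: the mix is realized by a random choice at every call.
    """
    if rng.random() < greedy_fraction:
        return greedy_perturbation(placement, chip_temps)
    return random_perturbation(placement, rng)


# Hypothetical four-chip example; temperatures are illustrative only.
placement = [0, 1, 2, 3]
chip_temps = [90.0, 110.0, 80.0, 100.0]
candidate = mixed_perturbation(placement, chip_temps, greedy_fraction=0.2,
                               rng=random.Random(0))
```

Returning a new list rather than swapping in place keeps the current placement available in case the candidate is rejected by the acceptance test.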
For example, we can execute 20% greedy perturbations and 80% random perturbations in the iterative procedure.

3.2.2 Initial Temperature

The initial temperature must be chosen so that almost all candidate solutions are accepted initially. Here we use the method developed in [12] to determine the initial temperature, that is,

$$T_0 = \frac{\Delta\lambda_{av}}{\ln(\chi_0^{-1})} \qquad (2)$$

where $\Delta\lambda_{av}$ is the average decrease of $\lambda$ and $\chi_0$ is the accepted rate.

3.2.3 Cooling Schedule

The choice of an appropriate cooling schedule is crucial for the performance of the simulated annealing algorithm. The cooling schedule defines the value of T at each iteration k, $T_{k+1} = f(T_k, k)$. Theoretical results on non-homogeneous Markov chains [13] state that, under particular conditions on the cooling schedule, simulated annealing converges in probability to global optima for $k \to \infty$. The logarithmic law fulfils these conditions, but it is too slow for practical applications. Here, we adopt the geometric law $T_{k+1} = \alpha \times T_k$, where $\alpha \in (0, 1)$, which corresponds to an exponential decay of the temperature. To save running time, the cooling schedule is divided into two stages. Initially, α is set to 0.85, so the temperature is reduced rapidly. When the acceptance probability becomes smaller than 0.6, α is reset to 0.95, so the temperature is reduced slowly.

3.2.4 Length of Markov Chain

The length of the Markov chain, M, is the number of trials at each temperature. In general, the value of M is chosen according to the size of the problem, n. In this study, we set

$$M = 2n \qquad (3)$$

4. EXPERIMENTAL RESULTS

The simulated annealing heuristic and the heat transfer solver have been implemented in C++ and run on a 2.5 GHz Pentium IV personal computer. To test the program and compare it with existing placement techniques, three industry circuits designed by IBM have been considered. The 30-chip-site module and the 31-chip-site module are derived from IBM's GEMI modules [14]. A larger example, the 121-chip-site module, is derived from IBM's TCM with 110 chips in [15].

Figs. 4-6 show the cost variation for the test cases under two different perturbation techniques. One can see that 100% random perturbation produces more unaccepted solutions. Adding 20% greedy perturbation significantly reduces these unaccepted solutions. The mixed perturbation technique therefore converges faster than random perturbation alone.

Fig. 4 Cost variation for the 30-chip-site case

Fig. 5 Cost variation for the 31-chip-site case

Fig. 6 Cost variation for the 121-chip-site case

Table 1 lists the results obtained by the present method (SA), by the thermal force placement algorithm (TFPA) [4], and by IBM [14, 15], where $T_{av}$, $T_{max}$, $T_{min}$, and $\Delta T_{max}$ are the average temperature of all chips, the highest chip temperature, the lowest chip temperature, and the temperature range, respectively.

Table 1 The comparisons of SA, TFPA, and IBM

| Items | 30-chip-site IBM | 30-chip-site TFPA | 30-chip-site SA | 31-chip-site IBM | 31-chip-site TFPA | 31-chip-site SA | 121-chip-site IBM | 121-chip-site TFPA | 121-chip-site SA |
|---|---|---|---|---|---|---|---|---|---|
| $T_{av}$ (°C) | 98.4 | 97.8 | 97.4 | 103 | 103 | 102.7 | 158.9 | 161.1 | 160.6 |
| $T_{max}$ (°C) | 108 | 107 | 104 | 115 | 112 | 110 | 228 | 185 | 181 |
| $T_{min}$ (°C) | 80 | 82 | 85 | 88 | 85 | 91 | 100 | 145 | 151 |
| $\Delta T_{max}$ (°C) | 28 | 25 | 19 | 27 | 27 | 19 | 128 | 40 | 30 |
| $\lambda$ (fr/Mh) | 78 | 70 | 67 | 121 | 116 | 110 | 124960 | 28143 | 26030 |

Note that SA generates the placements with the minimum values of $T_{max}$ and $\Delta T_{max}$. Thus, the placements obtained by SA are clearly more reliable than the placements of IBM and TFPA. For the 30-chip-site module, SA reduced the system failure rate by 4.3% and 14.1% relative to TFPA and IBM, respectively. For the 31-chip-site module, SA reduced the system failure rate by 5.2% and 9.1% relative to TFPA and IBM, respectively. For the 121-chip-site module, SA reduced the system failure rate by 7.5% and 79.2% relative to TFPA and IBM, respectively.

5. CONCLUSION

This paper proposes a solution methodology for placing electronic components on a substrate in a checkerboard arrangement so as to minimize the system failure rate. A simulated annealing algorithm with a mixed perturbation technique is presented for the thermal placement problem.
Our experimental results show that mixing greedy and random perturbations speeds up the entire procedure. Experiments on three industrial MCMs show that the obtained placements significantly improve on the original designs (i.e., IBM's) in system reliability.

ACKNOWLEDGMENT

This work was supported by the National Science Council under contract no. NSC91-2215-E-218-015.

REFERENCES

[1] N. Sherwani, Algorithms for VLSI Physical Design Automation, 3rd ed. Boston: Kluwer Academic Publishers, 1999.
[2] T. Kam, S. Rawat, D. Kirkpatrick, R. Roy, G. S. Spirakis, and N. Sherwani, “EDA challenges facing future microprocessor design,” IEEE Trans. on Computer-Aided Design, vol. 19, no. 12, pp. 1498-1506, 2000.
[3] J. Lee, “Thermal placement algorithm based on heat conduction analogy,” IEEE Trans. on Comp. Packag. Technol., vol. 26, no. 2, pp. 473-482, 2003.
[4] J. Lee, “An approach to thermal placement in MCM using thermal force model,” VLSI Design/CAD Symposium, pp. 21-24, 2003.
[5] C. N. Chu and D. F. Wong, “A matrix synthesis approach to thermal placement,” IEEE Trans. Computer-Aided Design, vol. 17, no. 11, pp. 1166-1174, 1998.
[6] M. C. Tang and J. D. Carothers, “Consideration of thermal constraints during multichip module placement,” Electronics Letters, vol. 33, no. 12, pp. 1043-1045, June 1997.
[7] C. Beebe, J. D. Carothers, and A. Ortega, “MCM placement using a realistic thermal model,” in Proc. of the Tenth Great Lakes Symp. on VLSI, pp. 189-192.
[8] C.-H. Tsai and S.-M. Kang, “Cell-level placement for improving substrate thermal distribution,” IEEE Trans. Computer-Aided Design, vol. 19, no. 2, pp. 253-266, 2000.
[9] C. C. Lee, A. L. Palisoc, and Y. J. Min, “Thermal analysis of integrated circuit devices and packages,” IEEE Trans. Comp. Hybrids. Manufact. Technol., vol. 12, no. 4, pp. 701-709, 1989.
[10] G. N. Ellison, Thermal Computations for Electronic Equipment, New York: Van Nostrand Reinhold, 1983.
[11] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science, vol. 220, no. 4598, pp. 671-680, May 1983.
[12] D. S. Johnson, C. R. Aragon, L. A. McGeoch, and C. Schevon, Optimization by simulated annealing: an experimental evaluation, Part I, AT&T Bell Lab., Murray Hill, 1987.
[13] E. H. L. Aarts and J. K. Lenstra, Eds., Local Search in Combinatorial Optimization, Wiley, Chichester, UK, 1997.
[14] G. A. Katopis, W. D. Becker, T. R. Mazzawy, H. H. Smith, C. K. Vakirtzis, S. A. Kuppinger, B. Singh, P. C. Lin, J. Bartells, G. V. Kihlmire, P. N. Venkatachalam, H. I. Stoller, and J. L. Frankel, “MCM technology and design for the S/390 G5 system,” IBM Journal of Research and Development, vol. 43, no. 5/6, pp. 21-49, 1999.
[15] G. F. Goth, M. L. Zumbrunnen, and K. P. Moran, “Dual-tapered-piston (DTP) module cooling for IBM enterprise system/9000 systems,” IBM Journal of Research and Development, vol. 36, no. 4, pp. 805-816, 1992.