General Thermal Force Model with Experimental Studies

Jing Lee
Department of Electronic Engineering, Southern Taiwan University of Technology, Tainan, Taiwan 701, R.O.C.
Email: leejing@mail.stut.edu.tw
TCPT-2004-031.R1

Abstract

Thermal force has proven to be a useful concept for managing the package-level thermal placement problem. However, the previous thermal force model is based on an insulated-edge boundary condition, so it is suitable only for face-cooled packages. A general thermal force model is proposed that extends the applicability of the thermal force to general cooling conditions by introducing the concept of the heat transparency of a boundary, $\eta$. By varying $\eta$, the present method generates a series of thermal-force-equilibrium placements suited to situations ranging from face-cooled to edge-cooled packages. Experimental results indicate that, for generating the most reliable placements, the best values of $\eta$ are in the range $0 \le \eta \le 0.05$ for face-cooled packages and in the range $0.1 \le \eta \le 0.25$ for volumetric-cooled packages. For edge-cooled packages, the best values of $\eta$ are the largest values of $\eta$ that still yield convergent thermal-force-equilibrium placements.

Index Terms—Force-directed algorithm, reliability, thermal force, thermal placement.

I. INTRODUCTION

Placement is the process of arranging circuit components on a layout surface such that multiple, possibly conflicting, design objectives are satisfied [1]. Historically, placement techniques have been developed primarily on the basis of routability; Sherwani [2] provides a summary of the classic techniques. With the increasing demand for high quality and reliable performance over time, however, techniques that address placement for reliability, known as the thermal placement problem, have become necessary [3], [4].

The thermal placement problem occurs at three different levels: system-level, board-level, and package-level.
At the system level, the placement problem is to place all the subsystems together so that the occupied volume is minimized. At the same time, the heat dissipated by each subsystem must be removed properly so that the system does not malfunction because some components overheat. At the board level, all the chips on a board, along with other solid-state devices, have to be placed within a fixed area of the PCB such that the system reliability is optimized. At the package level, electronic components are placed on a substrate. The electronic components are logic circuits for VLSI designs and chips for multichip module (MCM) layouts. The objective is again to optimize the system reliability.

The key differences between the board-level and package-level thermal placement problems are the thermal models and the cooling considerations. At the board level, the temperature difference between the P-N junction and the case of the package is simply modeled by a thermal resistance. The thermal analysis therefore focuses on heat convection from the heat-dissipating components to the ambient. At the package level, heat transfer from the case to the ambient is simply modeled by a heat transfer coefficient. The thermal analysis therefore concentrates on calculating the on-chip temperature profile, for which heat is transferred mainly by conduction. Because of the differences in boundary conditions and problem granularity, the thermal placement techniques developed for the board level cannot be used for the package level, and vice versa [2].

Earlier research addressed the thermal placement problem at the system level [5], [6] and the board level [7]-[14]. However, the continuing increase in the level of circuit integration and the concomitant increase in performance are sustaining the trend of increasing power dissipation in a package. Recent research [15]-[24] on the thermal placement problem has concentrated on the package level.
Since the system failure rate of a package depends strongly on the temperature distribution on the substrate, the package-level thermal placement problem is usually simplified to finding a placement with a uniform temperature distribution across the substrate. Dealing with the uniform temperature distribution problem requires knowledge of the package structure and its cooling condition, because the temperature profiles on the substrate depend directly on them. However, a practical package structure is generally not completely determined at the placement phase, and a practical structure is usually too complex for calculating the temperature profiles. Simplified thermal models of packages are therefore typically used to reduce the computation time of the temperature profile calculations.

The simplest thermal model is the one without any boundary effect. In other words, every component on the substrate is considered to have the same cooling condition, and the boundary effect is ignored. The uniform temperature distribution problem can then be further simplified to a uniform heat distribution placement. Most previous thermal placement algorithms are based on this model. Two neural-network-based approaches are presented in [15], [16]. Two top-down approaches based on a partitioning technique are proposed in [17], [18]. Chu and Wong [19] treat the problem as a matrix synthesis problem and present several heuristic algorithms for gate arrays. Tang and Carothers [20] solve the matrix synthesis problem by a hybrid approach that combines a genetic algorithm with a simulated annealing heuristic. All the above studies model an electronic component as a particle whose size is ignored. Beebe et al. [21] present a placement technique similar to that of [20], but they use a circle model instead of the particle model for electronic components.
The main weakness of the model without any boundary effect is that a placement with a uniform heat distribution does not lead to a placement with a uniform temperature distribution, because of the effects of boundary conditions and the finite thermal conductivity of the packaging components [22]. A more practical thermal model of a package is the face-cooled model. In this model, heat is removed from the top and bottom sides of the package, and the edge sides of the package are considered insulated. A face-cooled substrate can be transformed into an infinite substrate by the image method [25], [26]. Because an infinite substrate has no boundary, the uniform temperature distribution problem on a face-cooled substrate can be transformed into the uniform heat distribution problem on an infinite substrate. A thermal force model based on the image method and a heat conduction analogy was presented by the author for the face-cooled thermal placement problem [23]. With the thermal force model, the placement problem of obtaining a uniform temperature distribution on a face-cooled package is mapped into obtaining a thermal-force-equilibrium (TFE) placement by solving a system of thermal force equations.

Some authors have also studied the thermal placement problem on a volumetric-cooled package [22] and an edge-cooled package [24]. In [22], a 3-D mesh of thermal resistors and current sources is used to model the package and the thermal boundary conditions, and two simulated annealing algorithms are presented for chip placement in the standard cell and macro cell design styles, respectively. In [24], a thermal force based on a fuzzy model is introduced to manage the edge-cooled thermal placement problem for MCM designs.

Most previous package-level placement algorithms are based on randomized iterative techniques [19]-[22].
Theoretically, these randomized iterative approaches need an infinite number of iterations to find a global optimum [27]. They also need to calculate the temperature distribution on the substrate and the system failure rate in every iteration in order to select one or more better placements. Additionally, to achieve high-quality solutions, a set of parameters that govern the convergence of these algorithms must be specified, and the best values of these parameters usually have to be found by trial and error [28]. Randomized iterative algorithms therefore suffer from long runtimes on large problems.

In this paper, a general thermal force model is presented for package-level thermal placement problems. This model extends the previous thermal force model [23] to include the edge-cooled and volumetric-cooled boundary conditions. Compared with the randomized iterative algorithms, the present method, which is a root-finding approach, takes fewer iterations to obtain a convergent solution. In addition, the present method calculates the temperature distribution on the substrate and the system failure rate only once, after the final placement has been obtained, so it is computationally more efficient than randomized iterative algorithms. Another important merit is that the present thermal force model can easily be combined with other force models presented for the optimization of routability and performance [29]-[32], so the multiobjective optimal placement problem can be solved by the same technique presented in this paper.

II. THEORETICAL BACKGROUND

This paper deals with the problem of placing electronic components on a substrate such that the temperature is uniformly distributed across the substrate. The present method can be applied to VLSI, hybrid circuit, and MCM designs. For simplicity, however, the following description is restricted to MCM designs.

A. Placement Styles

There are basically two categories of placement problems, referred to as the continuous and the discrete (array) problems. In the continuous case, the placement substrate is treated as a continuous plane on which the chips to be placed are free to reside. In the discrete case, the substrate is partitioned into a matrix of slots into which the chips are placed. In the following, we focus on the continuous placement problem. It should be pointed out, however, that the present method can easily be converted to the discrete problem by a two-phase placement procedure. That is, the discrete placement is first treated as a continuous placement, and a topological placement solution is generated by the present method. The topological placement is then mapped to a discrete placement by an assignment procedure. The assignment problem has been extensively studied in the literature [33].

B. Packaging Model and Failure Rate Evaluation

System failure rate prediction is important at the placement stage, since it is a measure used to assess the performance of a placement. Temperature is generally considered a key parameter in failure mechanisms; virtually all failure mechanisms are accelerated by increased temperature [34], [35]. An Arrhenius relation has generally been adopted to model the strong dependence of failure rate on temperature:

$$\lambda_i = \lambda_0 \exp\left[\frac{E_a}{k}\left(\frac{1}{T_0} - \frac{1}{T_i}\right)\right] \tag{1}$$

where $\lambda_i$ and $\lambda_0$ are the failure rates of chip $c_i$ at a temperature of $T_i$ K and at a reference temperature of $T_0$ K, respectively; $E_a$ is the activation energy (eV); and $k$ is Boltzmann's constant. To determine the failure rate of an individual chip, various operating parameters need to be specified. Without loss of generality, all chips in this paper are assumed to have the same values of $\lambda_0$ and $E_a$, namely 1 FIT (i.e., $10^{-9}$/hour) and 1 eV, respectively.
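As a minimal sketch, the Arrhenius relation (1) can be evaluated as follows. The function name, the argument names, and the temperatures used in the usage note are illustrative assumptions; in particular, the reference temperature $T_0$ is not specified in this section.

```python
import math

# Boltzmann's constant in eV/K, so that E_a can be given in eV as in the text.
BOLTZMANN_EV_PER_K = 8.617333262e-5


def chip_failure_rate(t_chip_k, t_ref_k, lambda_ref=1.0, e_a=1.0):
    """Failure rate of one chip from the Arrhenius relation (1).

    t_chip_k   -- chip temperature T_i in kelvin
    t_ref_k    -- reference temperature T_0 in kelvin (value is an assumption)
    lambda_ref -- failure rate at T_0; 1 FIT as assumed in the paper
    e_a        -- activation energy in eV; 1 eV as assumed in the paper
    The result is in the same unit as lambda_ref.
    """
    exponent = (e_a / BOLTZMANN_EV_PER_K) * (1.0 / t_ref_k - 1.0 / t_chip_k)
    return lambda_ref * math.exp(exponent)
```

For instance, `chip_failure_rate(373.15, 323.15)` gives the failure rate in FIT of a chip at 100 °C relative to a hypothetical 50 °C reference. Because the exponent grows with $T_i$, any chip hotter than the reference has a rate above $\lambda_0$.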
The objective of this work is to minimize the system failure rate of an MCM, which is given by the sum of the individual chip failure rates. That is,

$$\lambda(P) = \sum_{i=1}^{n} \lambda_i \tag{2}$$

where $n$ and $P$ are the number of chips and the chip placement, respectively.

To estimate $\lambda(P)$, it is necessary to know the temperature profiles on the substrate. Since the temperature profiles depend directly on the chip placement, packaging structure, and cooling conditions, these must be determined beforehand. In this paper, the simplified thermal model of an MCM illustrated in figure 1 is considered. The package consists of a sandwich structure formed from a ceramic multilayer substrate, an epoxy adhesive, and an aluminum heat sink with thicknesses of 9, 0.076, and 1.27 mm, respectively. Within each layer, the material is assumed to be linear, isotropic, and homogeneous. The temperature and heat flow are continuous at the interfaces between layers. The thermal conductivities of the multilayer substrate, the epoxy layer, and the heat sink are 39.4 W/mK, 0.276 W/mK, and 195 W/mK, respectively. Chips are treated as heat fluxes applied directly to the substrate. Heat losses from the substrate through the top side, the bottom side, and the edge sides of the package are quantified by the heat transfer coefficients $h_{top}$, $h_{bot}$, and $h_e$. The temperature distributions on the substrate and package are obtained with FLOTHERM, a commercial software package designed specifically for predicting airflow and heat transfer within electronic systems [36].

Fig. 1. Multilayer thermal model for an MCM.

C. Newton’s Law of Cooling and Boundary Conditions

The rate of heat transfer from a convection boundary can be expressed by Newton’s law of cooling as

$$q = hA(T - T_0) \tag{3}$$

where $T$ and $T_0$ are the surface and ambient temperatures, respectively; $h$ is the convection heat transfer coefficient; and $A$ is the heat transfer surface area.
According to Newton’s law of cooling, no heat is transferred from the boundary if $h = 0$. Hence, an insulated boundary can be defined in terms of Newton’s law of cooling as $h_{ib} = 0$. On the other hand, a perfect heat sink maintains the temperature of a boundary at a fixed level. This boundary can be defined in terms of Newton’s law of cooling as $h_{hs} = \infty$. Since a perfectly insulated boundary and a perfect heat sink do not exist in practice, it is more reasonable to set $h_{ib}$ to a very small value instead of zero and $h_{hs}$ to a very large but finite value instead of infinity. Because the insulated boundary and the perfect heat sink are the two limits of general thermal boundaries, it is reasonable to express the heat transfer coefficient of a general thermal boundary as a combination of the two extremes. That is,

$$h = (1-\eta)\,h_{ib} + \eta\,h_{hs} \tag{4}$$

where $0 \le \eta \le 1$.

III. IMAGE METHOD

The image method for the insulated boundary and the heat sink boundary was introduced by Dean [25]. Mathematical details and some applications can be found in [25], [26]. In the following, this technique is generalized to cover the general thermal boundary condition.

A. Insulated Boundary

Since no heat flows across an insulated boundary and heat flow is proportional to the temperature gradient, there is no temperature gradient normal to an insulated boundary. A perfect insulator can therefore be represented as a zero temperature gradient normal to the boundary. As a result, a plane of symmetry, as shown in figure 2, can be used to replace an insulated boundary by means of a reflected virtual source.

Fig. 2. (a) An infinite insulated boundary with a heat source and (b) the insulated boundary replaced by a reflected virtual source [23].

B. Heat Sink

A perfect heat sink maintains the temperature of a boundary at a fixed level. A plane of negative symmetry, as shown in figure 3, can also produce a plane of constant temperature.
Therefore, a plane of negative symmetry can be used to replace a perfect heat sink by means of a reflected negative virtual source.

Fig. 3. (a) An infinite heat sink boundary with a heat source and (b) the boundary replaced by a negative virtual source.

C. General Thermal Boundary

According to the principle of superposition and (4), a general thermal boundary can be decoupled into a combination of an insulated boundary and a perfect heat sink when applying the image method. Weighting the insulated-boundary image $+q$ by $1-\eta$ and the heat-sink image $-q$ by $\eta$ gives $(1-\eta)q - \eta q = (1-2\eta)q$. That is, a general thermal boundary is replaced by a reflected virtual source $(1-2\eta)q$ at the position symmetric about the boundary, as shown in figure 4. Physically, $\eta$ can be interpreted as the heat transparency of the boundary. For a perfect heat sink, heat passes through the boundary completely, thus $\eta = 1$. In contrast, heat cannot pass through an insulated boundary, thus $\eta = 0$.

Fig. 4. (a) A general thermal boundary with a heat source and (b) the boundary replaced by a reflected heat source of $(1-2\eta)q$.

By applying the image method, a rectangular substrate with several heat sources under general thermal boundaries can be transformed into an unbounded substrate containing an infinite number of heat sources, which is depicted as a series of concentric rectangular ring substrates surrounding the real substrate, as shown in figure 5(b). The virtual substrate at the $r$th row and the $c$th column is called the $r$-$c$-substrate. A heat source on the $r$-$c$-substrate is denoted by the superscript $(r,c)$. The heat dissipation of $c_j^{(r,c)}$ is

$$q_j^{(r,c)} = \begin{cases} q_j, & \text{if } a \text{ is even} \\ (1-2\eta)\,q_j, & \text{if } a \text{ is odd} \end{cases} \tag{5}$$

where $a = \max(|r|, |c|)$ is the index of the concentric rectangular ring substrate.
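The image-source rule (5) can be sketched as below. This is only an illustration: the helper names `ring_cells` and `image_heat`, and the parameter name `eta` for the heat transparency of the boundary, are assumptions introduced here rather than names from the original implementation.

```python
def ring_cells(a):
    """All (row, column) indices of virtual substrates on ring a.

    Ring 0 is the real substrate itself; ring a >= 1 contains 8 * a
    virtual substrates, those with max(|r|, |c|) == a.
    """
    return [(r, c)
            for r in range(-a, a + 1)
            for c in range(-a, a + 1)
            if max(abs(r), abs(c)) == a]


def image_heat(q_j, r, c, eta):
    """Heat dissipation of the image of chip j on the r-c-substrate, per (5).

    eta is the heat transparency of the boundary: 0 for an insulated
    boundary, 1 for a perfect heat sink.
    """
    a = max(abs(r), abs(c))  # index of the concentric rectangular ring
    return q_j if a % 2 == 0 else (1.0 - 2.0 * eta) * q_j
```

With `eta = 0` every image carries the full heat `q_j`, which is the insulated (face-cooled) case of the earlier model, while with `eta = 1` the images on odd rings carry `-q_j`, as for a perfect heat sink.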
By using the image method, the geometric information of the substrate and the cooling conditions of the package are implicitly included in the unbounded substrate with reflected heat sources. Since the boundary conditions are removed from the mathematical model, this technique reduces the difficulty of managing the discontinuities at the boundaries.

Fig. 5. Image method: (a) a rectangular substrate with two heat sources under general thermal boundaries and (b) an unbounded substrate with an infinite number of heat sources.

IV. THERMAL FORCE MODEL

A. Thermal Force

In this study, the substrate with chips is first transformed into an unbounded substrate by the image method. On the unbounded substrate, the temperature rise of a given chip results from two sources: the heat generated by the chip itself and the heat conducted from other chips (including the reflected heat sources). The temperature rise caused by the heat conducted from other chips can be reduced by enlarging the distances between the given chip and the other chips. The heat transfer from other chips to the given chip can therefore be thought of as the other chips pushing the given chip with forces, namely thermal forces [23]. If the given chip can move freely on the substrate, it will move away from these chips and thus reach a lower temperature.
Since the heat flux decreases with the square of the distance from the heat source in an infinite body, it is reasonable to formulate the thermal force exerted on $c_i$ by $c_j^{(r,c)}$ as

$$\mathbf{f}_{i,j}^{(r,c)} = \delta_{i,j}^{(r,c)}\,\frac{q_j^{(r,c)}}{\left|\mathbf{r}_{i,j}^{(r,c)}\right|^{2}}\,\mathbf{u}_{i,j}^{(r,c)} \tag{6}$$

where $\mathbf{r}_{i,j}^{(r,c)}$ is the position vector from $c_j^{(r,c)}$ to $c_i$, $\mathbf{u}_{i,j}^{(r,c)}$ is the unit vector of $\mathbf{r}_{i,j}^{(r,c)}$, and

$$\delta_{i,j}^{(r,c)} = \begin{cases} 0, & \text{if } c_i \text{ is fixed or } c_j^{(r,c)} = c_i \\ 1, & \text{otherwise.} \end{cases}$$

Expanding this formulation to cover all $n$ chips, one obtains the net thermal force on $c_i$ as

$$\mathbf{F}_i = \sum_{j=1}^{n}\left\{\mathbf{f}_{i,j}^{(0,0)} + \sum_{a=1}^{\infty}\left[\sum_{c=-a}^{a}\left(\mathbf{f}_{i,j}^{(a,c)} + \mathbf{f}_{i,j}^{(-a,c)}\right) + \sum_{r=-a+1}^{a-1}\left(\mathbf{f}_{i,j}^{(r,a)} + \mathbf{f}_{i,j}^{(r,-a)}\right)\right]\right\} \tag{7}$$

Theoretically, the maximum value of $a$ is infinite. However, since $\mathbf{f}_{i,j}^{(r,c)}$ is inversely proportional to $\left|\mathbf{r}_{i,j}^{(r,c)}\right|^{2}$, setting the maximum value of $a$ to five is adequate [23].

A thermal-force-equilibrium (TFE) placement is a placement in which every chip $c_i$ is located at a thermal-force-equilibrium position $(x_i, y_i)$. To ensure a legal placement, the position must satisfy the following geometry constraints:

$$x_i - \frac{l_i}{2} \ge 0, \quad x_i + \frac{l_i}{2} \le L_x, \quad y_i - \frac{w_i}{2} \ge 0, \quad \text{and} \quad y_i + \frac{w_i}{2} \le L_y \tag{8}$$

where $l_i$ and $w_i$ are the length and width of $c_i$, and $L_x$ and $L_y$ are the length and width of the substrate.

B. Solution of Thermal Force Equations

The system of thermal force equations is too complex to admit a closed-form analytical solution. Here, a modified Newton-Raphson method is used to solve the system of thermal force equations. The method begins with a randomly generated initial placement. Then, one chip at a time is selected and moved to a new position given by

$$x_i^{(new)} = x_i - \alpha\,\frac{F_{i,x}}{F'_{i,x}}, \qquad y_i^{(new)} = y_i - \alpha\,\frac{F_{i,y}}{F'_{i,y}} \tag{9}$$

where $\alpha$ is a positive parameter; $F'_{i,x}$ and $F'_{i,y}$ are the partial derivatives of $F_{i,x}$ and $F_{i,y}$ with respect to $x_i$ and $y_i$, respectively; and $i$ runs from 1 to $n$. The above procedure is repeated until

$$\mathrm{Norm} = \frac{1}{2n}\sum_{i=1}^{n}\left(F_{i,x}^{2} + F_{i,y}^{2}\right) \tag{10}$$

is below 0.0001. This stopping criterion was chosen on the basis of experimental results.
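The force evaluation in (6) and (7) and a single update of (9) might be sketched as follows. Several points are assumptions made for illustration and are not taken from the original C implementation: the origin is placed at a corner of the real substrate, so that repeated mirror images alternate between reflected and shifted copies; the derivatives in (9) are approximated by central finite differences; fixed chips are not modeled; and all function and parameter names (`eta` for the heat transparency, `alpha` for the step parameter) are hypothetical.

```python
import math


def image_heat(q, r, c, eta):
    """Heat of the image of a chip on the (r, c) virtual substrate, per (5)."""
    a = max(abs(r), abs(c))  # ring index
    return q if a % 2 == 0 else (1.0 - 2.0 * eta) * q


def mirror(z, k, length):
    """Coordinate of the k-th repeated mirror image along one axis.

    Assumption: the origin is at a corner of the real substrate, so odd
    images are reflected copies and even images are shifted copies.
    """
    return k * length + (z if k % 2 == 0 else length - z)


def net_force(i, chips, eta, lx, ly, max_ring=5):
    """Net thermal force (Fx, Fy) on chip i, following (6) and (7).

    chips is a list of (x, y, q) tuples; max_ring truncates the ring index a.
    """
    xi, yi, _ = chips[i]
    fx = fy = 0.0
    for j, (xj, yj, qj) in enumerate(chips):
        for r in range(-max_ring, max_ring + 1):
            for c in range(-max_ring, max_ring + 1):
                if j == i and r == 0 and c == 0:
                    continue  # delta = 0: a chip exerts no force on itself
                dx = xi - mirror(xj, c, lx)
                dy = yi - mirror(yj, r, ly)
                d = math.hypot(dx, dy)  # zero only if two sources coincide
                q = image_heat(qj, r, c, eta)
                fx += q * dx / d**3  # (q / d^2) times the unit vector
                fy += q * dy / d**3
    return fx, fy


def newton_step(i, chips, eta, lx, ly, alpha=0.5, size=5.0, h=1e-4):
    """One damped Newton-Raphson update (9) of chip i, clamped to satisfy (8)."""
    x, y, q = chips[i]

    def force_at(px, py):
        trial = list(chips)
        trial[i] = (px, py, q)  # the images of chip i move with it
        return net_force(i, trial, eta, lx, ly)

    fx, fy = force_at(x, y)
    # Central finite differences stand in for the partial derivatives.
    dfx = (force_at(x + h, y)[0] - force_at(x - h, y)[0]) / (2.0 * h)
    dfy = (force_at(x, y + h)[1] - force_at(x, y - h)[1]) / (2.0 * h)
    half = size / 2.0  # square chip: l_i = w_i = size
    x_new = min(max(x - alpha * fx / dfx, half), lx - half)
    y_new = min(max(y - alpha * fy / dfy, half), ly - half)
    return x_new, y_new
```

As a usage example, a single 1-W chip at the center of a 30 mm square substrate is already in equilibrium, so `net_force(0, [(15.0, 15.0, 1.0)], 0.0, 30.0, 30.0)` returns a force that is zero up to rounding. The same chip placed near the left edge is pushed back toward the center by its own images, and `newton_step` moves it part of the way there.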
Note that the TFE placement attained depends on the random initial placement. In this work, different random initial placements were therefore tried on the same problems. The final placements differed in configuration, but the final values of the thermal properties and the system failure rate were very close to each other in all these cases under the stopping criterion.

C. Range of $\eta$

When $\eta \ge 0.5$, the heat dissipations of the virtual chips in rectangular ring substrate I are negative or zero. Since virtual chips with negative heat dissipation attract the real chips, the TFE positions of some real chips lie on the image substrates. The geometry constraints, however, confine the chip positions to the real substrate, so the program cannot converge to the stopping criterion. To obtain a convergent solution, $\eta$ is therefore limited to values less than 0.5.

D. Ill-Conditioning and $\alpha$

A thermal placement problem is ill-conditioned when small changes in chip positions produce large changes in the thermal forces exerted on these chips. Consider a chip $c_i$ placed very close to the left boundary of the substrate, such that the thermal force caused by $c_i^{(-1,0)}$ dominates the net thermal force exerted on $c_i$. The net thermal force exerted on $c_i$ can then be approximated as

$$F_{i,x} \approx \frac{(1-2\eta)\,q_i}{(2x_i)^{2}} \tag{11}$$

where $x_i$ denotes the distance from the left boundary to $c_i$. The partial derivative of $F_{i,x}$ with respect to $x_i$ is

$$F'_{i,x} = -\frac{4(1-2\eta)\,q_i}{(2x_i)^{3}} \tag{12}$$

So the correction distance of $c_i$ in the x-direction is

$$\Delta x_i = -\alpha\,\frac{F_{i,x}}{F'_{i,x}} = 0.5\,\alpha\,x_i \tag{13}$$

Notice that $F_{i,x}$ is proportional to the reciprocal of the square of $x_i$. If $x_i$ is small, a position change of $0.5\,\alpha\,x_i$ can produce a large change in $F_{i,x}$ unless $\alpha$ is small. Conventionally, $\alpha$ is set to a constant value during the entire iterative procedure; for example, $\alpha$ is 0.5 in [23], [29]. For an ill-conditioned thermal placement problem, however, $\alpha$ must be very small for chips near the boundaries.
But a small $\alpha$ also slows the convergence of chips far from the boundaries, so a constant $\alpha$ is unsuitable for ill-conditioned problems. A better strategy is to give different chips different values of $\alpha$ depending on their positions. A better formula, obtained from experimental results, is

$$\alpha_z = 0.5\left[1 - \left(1 - \frac{2z_i}{L_z}\right)^{2}\right] \tag{14}$$

where $z$ is either $x$ or $y$. Figure 6 shows $\alpha_z$ versus $z_i/L_z$. The effects of (14) are discussed in Section V.

Fig. 6. $\alpha_z$ versus $z_i/L_z$.

V. EXAMPLES AND COMPUTATIONAL RESULTS

A. Description of Test Cases

Eight MCMs with the thermal model shown in figure 1 were used to test the present method and to provide some insight into it. For all MCMs, the chips are 5 mm square, and the substrates are also square but have different dimensions in different cases. The MCM information is shown in Table I. The number in the module name indicates the chip count of the module. The eight MCMs fall into two categories: four MCMs have chips of uniform heat dissipation, and the other four have chips of various heat dissipations.

TABLE I
MCM INFORMATION

MCM    No. of chips  Edge length (mm)  Heat dissipations (W) of chips
M8A    8             30                1 W × 8
M8B    8             30                0.6 W, 0.7 W, 0.8 W, 0.9 W, 1.0 W, 1.1 W, 1.2 W, 1.3 W
M16A   16            40                1 W × 16
M16B   16            40                1.5 W × 4, 1 W × 8, 0.5 W × 4
M16C   16            40                0.3 W, 0.4 W, 0.5 W, …, 1.8 W
M29A   29            50                2.3 W × 29
M29B   29            50                3 W × 12, 2.7 W × 2, 2.5 W × 4, 1.6 W × 8, 1.3 W × 2, 0.7 W × 1
M49    49            50                1 W × 49

For each MCM, the present method generates a series of TFE placements for $\eta$ = 0, 0.05, 0.1, …, until no convergent solution can be obtained. For convenience, $P_c$ denotes the TFE placement for $\eta = c$. Then, for each $P_\eta$, the FLOTHERM software solves for the temperature distributions and the system failure rates under five different cooling conditions.
The five cooling conditions share the same values of $h_{top}$ = 32.6 W/m² K and $h_{bot}$ = 194 W/m² K but have different values of $h_e$, namely 0, 3.27, 32.6, 194, and 832 W/m² K, corresponding to insulated, natural convection, forced air convection, jet impingement forced air convection, and forced water convection conditions, respectively. Table II shows the percentage of heat loss conducted through the package edges for the different MCMs under the different cooling conditions. Here, the chip placement is obtained by setting $\eta$ = 0. Clearly, as $h_e$ increases, the percentage of heat loss conducted through the package edges also increases. For $h_e \le 32.6$ W/m² K, less than 20% of the heat loss flows through the package edges, so the package can be considered face-cooled; for $h_e$ = 194 W/m² K, about 50% of the heat loss flows through the package edges, so the package can be considered volumetric-cooled; for $h_e$ = 832 W/m² K, 75% or more of the heat loss flows through the package edges, so the package can be considered edge-cooled. Our experiments thus cover a wide range of the cooling conditions imposed on a package by its environment.

TABLE II
PERCENTAGE OF HEAT LOSS CONDUCTED FROM PACKAGE EDGES

         h_e (W/m² K)
MCM      0     3.27    32.6     194      832
M8A      0     2%      17.3%    55.5%    84.3%
M8B      0     2%      17.3%    55.5%    84.2%
M16A     0     1.5%    13.5%    48%      79.1%
M16B     0     1.5%    13.5%    48%      79.2%
M16C     0     1.5%    13.5%    48%      79.2%
M29A     0     1.2%    11.2%    42%      75%
M29B     0     1.2%    11.2%    42%      75%
M49      0     1.2%    11.2%    42%      75%

B. Runtimes

The present algorithm has been implemented in the C language and runs on a 2.8 GHz Pentium IV personal computer. Table III shows the runtimes of the tested examples with $\alpha$ = 0.5 and with $\alpha$ given by (14), respectively. For each case, the program was run twenty times, each time starting from a random initial placement. The runtime in Table III is the average over the twenty trials.
The term ‘Illed’ in the table denotes that the thermal placement problem at this value of $\eta$ is ill-conditioned and the program frequently fails to converge to the stopping criterion. In addition, the runtimes versus $\eta$ for M29B and M49 are depicted in figure 7.

TABLE III
RUNTIMES (seconds)

                      η
MCM    α          0       0.05    0.1     0.15    0.2     0.25
M8A    0.5        0.173   0.143   0.177   0.114   0.188   0.159
M8A    Eq. (14)   0.302   0.302   0.198   0.164   0.169   0.193
M8B    0.5        0.156   0.180   0.175   0.141   0.168   Illed
M8B    Eq. (14)   0.155   0.206   0.182   0.18    0.227   0.219
M16A   0.5        0.92    0.83    0.85    0.78    0.78    0.89
M16A   Eq. (14)   1.07    1.19    1.11    1.02    1.10    1.18
M16B   0.5        1.5     1.79    0.97    1.54    Illed   Illed
M16B   Eq. (14)   2.88    2.33    2.50    2.33    1.94    2.08
M16C   0.5        1.88    2.13    2.64    2.99    Illed   Illed
M16C   Eq. (14)   1.79    2.69    2.14    2.67    2.86    2.51
M29A   0.5        15.2    15.3    18.9    26.3    46.5    24.2
M29A   Eq. (14)   17.7    17.7    17.6    21.8    20.2    8
M29B   0.5        10.1    10.2    10.3    11.0    19.6    Illed
M29B   Eq. (14)   17.7    16.1    15.8    13.8    13.3    11.9
M49    0.5        39.4    43.8    48.4    50.2    55.3    60.7
M49    Eq. (14)   65.1    60.5    53.2    52.4    51.8    43.1

In general, the runtimes with $\alpha$ = 0.5 are less than the runtimes with $\alpha$ defined by (14) for $\eta \le 0.15$. For large problems, however, the runtimes with $\alpha$ = 0.5 are usually larger than the runtimes with $\alpha$ defined by (14) for $\eta > 0.15$. In addition, the runtimes with $\alpha$ = 0.5 increase as $\eta$ increases, and eventually the program cannot converge to the stopping criterion because of the ill-conditioning of the problem. By contrast, the runtimes with $\alpha$ defined by (14) decrease as $\eta$ increases. Formula (14) thus largely overcomes the ill-conditioning of the high-$\eta$ problems, but it also increases the runtimes of the low-$\eta$ problems. A better formula is therefore a combination of $\alpha$ = 0.5 and (14). That is,

$$\alpha_z = \begin{cases} 0.5, & \text{for } \eta \le 0.15 \\ 0.5\left[1 - \left(1 - \dfrac{2z_i}{L_z}\right)^{2}\right], & \text{for } \eta > 0.15 \end{cases} \tag{15}$$

where $z$ is either $x$ or $y$.

Fig. 7. Runtimes versus $\eta$.

C. Spreading Number

There are two mechanisms by which the thermal forces may be balanced. The first is by changing the distances among chips.
As an example, figures 8(a) to (c) show the placements of M8A at various $\eta$. As $\eta$ increases, the heat dissipations of the virtual chips in rectangular ring substrate I decrease, so these virtual chips push the real chips less strongly, and the real chips at the substrate border are therefore placed closer to the boundary. The second mechanism for balancing the thermal forces is to rearrange the positions of chips. An example is shown in figures 9(a) through (c). As $\eta$ increases from 0 to 0.1, the 1-watt chip and the 0.7-watt chip swap their relative positions, and the 0.6-watt chip and the 1.2-watt chip swap their relative positions. As $\eta$ increases from 0.1 to 0.2, more chips exchange their relative positions.

Fig. 8. TFE placements with chips’ power (W) for M8A: (a) $\eta = 0$, (b) $\eta = 0.2$, and (c) $\eta = 0.4$.

Fig. 9. TFE placements with chips’ power (W) for M8B: (a) $\eta = 0$, (b) $\eta = 0.1$, and (c) $\eta = 0.2$.

Whether the chips are uniform or non-uniform, the chips in a high-$\eta$ TFE placement are placed farther apart than those in a low-$\eta$ TFE placement. A spreading number $\sigma$ is used to measure the separation of the chips in a TFE placement. It is defined as

$$\sigma = \frac{2\,d_{av}}{L_x + L_y} \tag{16}$$

where $L_x$ and $L_y$ are the length and width of the substrate, and

$$d_{av} = \frac{2}{n(n-1)}\sum_{i=1}^{n}\sum_{j=i+1}^{n} d_{ij} \tag{17}$$

is the average distance between any two chips, where $d_{ij}$ is the distance between $c_i$ and $c_j$. Figures 10(a) and (b) show the relationships between $\sigma$ and $\eta$ for the uniform and the non-uniform chips, respectively. For uniform chips, $\sigma$ increases almost linearly with $\eta$ because only the first force-balance mechanism is effective; for non-uniform chips, $\sigma$ increases with $\eta$ in a jagged manner because both force-balance mechanisms are effective.
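The spreading number defined in (16) and (17) is simple to compute from the chip center positions. The following is a minimal sketch; the function name and the representation of a placement as a list of `(x, y)` tuples are assumptions made for illustration.

```python
import math
from itertools import combinations


def spreading_number(positions, lx, ly):
    """Spreading number of a placement, following (16) and (17).

    positions -- list of (x, y) chip center coordinates
    lx, ly    -- length and width of the substrate
    """
    n = len(positions)
    if n < 2:
        raise ValueError("at least two chips are needed")
    # Sum of d_ij over every unordered pair i < j.
    total = sum(math.dist(p, q) for p, q in combinations(positions, 2))
    d_av = 2.0 * total / (n * (n - 1))  # average pairwise distance (17)
    return 2.0 * d_av / (lx + ly)       # normalized by the substrate size (16)
```

For two chips at `(0, 0)` and `(3, 4)` on a 5 × 5 substrate, the average distance is 5 and the spreading number is 2 · 5 / 10 = 1. Normalizing by $L_x + L_y$ makes values from substrates of different sizes comparable.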
These results also show that the ill-conditioning of a thermal placement problem depends directly on its $\eta$ value. Figures 8 and 9 show that chips are placed closer to the border of the substrate as $\eta$ increases. A higher-$\eta$ thermal placement problem is therefore intrinsically more ill-conditioned than a lower-$\eta$ one.

Fig. 10. $\sigma$ versus $\eta$: (a) MCMs of uniform chips and (b) MCMs of non-uniform chips.

D. Thermal Properties

Since only repulsive forces, and no attractive forces, exist in the thermal force model, a TFE placement has the important property that the chips are spread out to fill the substrate; hence a TFE placement can have a cooler and more even temperature profile. Figure 11 is a line plot showing the thermal properties of the TFE placements obtained with various values of $\eta$ for M29B under different cooling conditions. Note that this example is the worst case, exhibiting the largest temperature ranges of all the examples tested. In the figure, the upper and lower ends of each vertical bar denote the highest and the lowest chip temperatures of the corresponding TFE placement, respectively; the solid lines indicate the average chip temperatures for the different cooling conditions. In addition, the average temperature $T_{av}$ and the temperature range $\Delta T$ of each TFE placement are given by the numbers above or below its bar.

Fig. 11. $T_{av}$ (°C) and $\Delta T$ (in parentheses, °C) versus $\eta$ for M29B at various cooling conditions.
It is interesting that Tav is almost independent of the heat transparency. There are two reasons for this. First, a TFE placement always places chips apart, and heat flux decreases with the square of the distance from the heat source; so if each chip already has enough substrate area to carry away the heat it dissipates, further enlarging the distances between the chip and its neighbors cannot significantly reduce the chip's temperature. Second, lowering one chip's temperature by interchanging its position with that of another chip always raises the temperature of the other chip. So Tav also changes insignificantly when chips exchange positions.

Another important observation is that TFE placements under the edge-cooled condition have a higher ΔT than the same placements under the face-cooled condition. To give more insight into this phenomenon, the power and temperature distributions of $P_0$ and $P_{0.25}$ (the TFE placements at heat transparency 0 and 0.25) for M29B under the face-cooled condition (he = 0) and the edge-cooled condition (he = 832 W/m² K) are shown in Figures 12(a), (b) and Figures 13(a), (b), respectively. Several observations are apparent from these figures. First, the chips are evenly placed on the substrate in $P_0$. By contrast, the chips are not evenly placed in $P_{0.25}$: the chips in the central region are sparser than those around the border of the substrate. Second, in $P_0$, the temperature is evenly distributed across the substrate on a face-cooled package, but the central region is hotter than the border region on an edge-cooled package. In $P_{0.25}$, the chips in the central region are slightly cooler than those around the border on a face-cooled package. By contrast, the chips in the central region are hotter than those around the border on an edge-cooled package.
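The contrast between the face-cooled and edge-cooled temperature profiles of an evenly heated substrate can be illustrated with a one-dimensional steady-state conduction sketch. This is not the thermal model used in the experiments: it assumes a uniformly heated strip, closed-form solutions, and hypothetical material and geometry values, and all function and parameter names are introduced here for illustration only.

```python
def face_cooled_profile(xs, q_flux, h_face, t_amb):
    """Insulated edges, uniform face cooling.

    With uniform heating every point balances locally, so the
    temperature q_flux / h_face above ambient is the same everywhere.
    """
    return [t_amb + q_flux / h_face for _ in xs]


def edge_cooled_profile(xs, q_flux, k, thickness, length, h_edge, t_amb):
    """No face cooling; all heat conducts to the edges at x = +/- length/2."""
    # Edge temperature from an energy balance per unit depth:
    #   q_flux * length = 2 * h_edge * thickness * (t_edge - t_amb)
    t_edge = t_amb + q_flux * length / (2.0 * h_edge * thickness)
    # Closed-form solution of k * thickness * T'' + q_flux = 0
    # with T(+/- length/2) = t_edge.
    coeff = q_flux / (2.0 * k * thickness)
    return [t_edge + coeff * (length ** 2 / 4.0 - x ** 2) for x in xs]


if __name__ == "__main__":
    # Hypothetical values: 5 cm strip, 1 mm thick, k = 150 W/m K,
    # uniform heating of 1e4 W/m^2, ambient at 25 deg C.
    length, thickness, k, q_flux, t_amb = 0.05, 0.001, 150.0, 1.0e4, 25.0
    n = 11
    xs = [-length / 2 + i * length / (n - 1) for i in range(n)]

    face = face_cooled_profile(xs, q_flux, h_face=200.0, t_amb=t_amb)
    edge = edge_cooled_profile(xs, q_flux, k, thickness, length,
                               h_edge=832.0, t_amb=t_amb)

    print("face-cooled range (deg C):", max(face) - min(face))
    print("edge-cooled center minus border (deg C):", edge[n // 2] - edge[0])
```

In this simplified setting the face-cooled strip is isothermal, while the edge-cooled strip must be hotter at its center by q·L²/(8·k·t) so that the heat generated there can be conducted to the edges.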
From the above observations, one can conclude that a TFE placement is also an even-temperature placement on a face-cooled package. However, this conclusion does not hold for a TFE placement on an edge-cooled package. The reason is straightforward. For an edge-cooled package, heat is removed mainly through the sidewalls of the package. Since heat always flows from hotter regions to cooler regions, the chips in the central region must be at a higher temperature than the chips around the border so that the heat generated in the central region can be conducted to the boundaries.

Fig. 12. Heat (W) and temperature distributions (°C) of $P_0$ for M29B: (a) he = 0 W/m² K and (b) he = 832 W/m² K.

Fig. 13. Heat (W) and temperature distributions (°C) of $P_{0.25}$ for M29B: (a) he = 0 W/m² K and (b) he = 832 W/m² K.

E. Relative System Failure Rate

Figures 14(a) through (h) plot the relative system failure rate μ against the heat transparency for various cooling conditions. The relative system failure rate is defined as

$$\mu = \frac{\text{system failure rate of } P}{\text{system failure rate of } P_0} \qquad (18)$$

where $P$ is the TFE placement at the given heat transparency and $P_0$ is the TFE placement at zero heat transparency. In general, as the heat transparency increases, μ also increases for a face-cooled package but decreases for an edge-cooled package. For a volumetric-cooled package, μ first decreases to a minimum and then increases as the heat transparency increases. To achieve the most reliable placement, the best heat transparencies are in the ranges 0 to 0.05 and 0.1 to 0.25 for face-cooled and volumetric-cooled packages, respectively. For an edge-cooled package, the best heat transparencies are the largest values that can generate convergent TFE placements. In addition, the results show that assuming insulated edges in a thermal placement problem is usually reasonable unless the package is edge-cooled, since the relative failure rate at zero heat transparency, μ0, is less than 5% higher than that of the best solution except for an edge-cooled package.

Fig. 14. μ versus heat transparency for various cooling conditions (he = 0, 3.27, 32.6, 194, and 832 W/m² K): (a) M8A, (b) M8B, (c) M16A, (d) M16B, (e) M16C, (f) M29A, (g) M29B, and (h) M49.

VI. CONCLUSION

In this paper, an elegant thermal force model is proposed to deal with the thermal placement problem. The new model extends the previous thermal force model, which assumes insulated edges, to cover general edge cooling conditions by controlling the heat transparency of the edges. Eight MCMs, each under five different edge cooling conditions, are used to examine the present method and provide some insight into it. The main conclusions are:

1. A TFE placement has the desirable feature that chips are spread apart to fill the substrate, so placements with a cooler and more even thermal profile can be obtained.

2. By managing the heat transparency, the present method can generate a series of TFE placements suited to situations ranging from face-cooled to edge-cooled packages. In general, to obtain the reliability-best placement, the best heat transparencies are in the ranges 0 to 0.05 and 0.1 to 0.25 for face-cooled and volumetric-cooled packages, respectively. For an edge-cooled package, the best heat transparencies are the largest values that can generate convergent TFE placements.

3.
Except for an edge-cooled package, treating the substrate as having insulated edges is acceptable in the thermal placement problem, since the experimental results show that the system failure rate of the obtained solution is no more than 5% higher than that of the solution that considers the practical edge cooling conditions.

ACKNOWLEDGMENT

This work was supported by the National Science Council, Taiwan, R.O.C., under contract no. NSC92-2218-E-218-015. The author wishes to thank Dr. J.-H. Chou for his valuable comments and suggestions concerning this paper.

REFERENCES

1. L. L. Moresco, “Electronic system packaging: The search for manufacturing the optimum in a sea of constraints,” IEEE Trans. Comp., Hybrids, Manufact. Technol., vol. 13, no. 3, pp. 494-508, 1990.
2. N. Sherwani, Algorithms for VLSI Physical Design Automation, 3rd ed. Boston, MA: Kluwer, 1999.
3. T. Kam, S. Rawat, D. Kirkpatrick, R. Roy, G. S. Spirakis, and N. Sherwani, “EDA challenges facing future microprocessor design,” IEEE Trans. Computer-Aided Design, vol. 19, no. 12, pp. 1498-1506, 2000.
4. S. V. Garimella, Y. K. Joshi, A. Bar-Cohen, R. Mahajan, K. C. Toh, V. P. Carey, M. Baelmans, J. Lohan, B. Sammakia, and F. Andros, “Thermal challenges in next generation electronic systems - summary of panel presentations and discussions,” IEEE Trans. Comp. Packag. Technol., vol. 25, no. 4, pp. 569-575, 2002.
5. M. I. Campbell, C. H. Amon, and J. Cagan, “Optimal three-dimensional placement of heat generating electronic components,” Trans. ASME J. Electron. Packag., vol. 119, pp. 106-113, 1997.
6. H. Wong and T.-Y. T. Lee, “Thermal evaluation of a PowerPC 620 microprocessor in a multiprocessor computer,” IEEE Trans. Comp. Packag. Manufact. Technol., Part A, vol. 19, no. 4, pp. 469-477, 1996.
7. M. Pecht, M. Palmer, and J. Naft, “Thermal reliability management in PCB design,” in Proc. 1987 Annu. Rel. and Maintainability Symp., 1987, pp. 27-29.
8. D. Dancer and M. Pecht, “Reliability optimization technique for convectively electronics,” IEEE Trans. Rel., vol. 38, no. 2, pp. 199-205, 1989.
9. M. Osterman and M. Pecht, “Component placement for reliability on conductivity cooled printed wiring boards,” Trans. ASME J. Electron. Packag., vol. 111, pp. 149-156, 1989.
10. R. Eliasi, T. Elperin, and A. Bar-Cohen, “Monte Carlo thermal optimization of populated printed circuit board,” IEEE Trans. Comp., Hybrids, Manufact. Technol., vol. 13, no. 4, pp. 953-960, 1990.
11. M. D. Osterman, “A physics of failure approach to component placement,” Trans. ASME J. Electron. Packag., vol. 114, pp. 305-309, 1992.
12. J. Lee, J. H. Chou, and S. L. Fu, “Reliability and wireability optimizations for module placement on convectively cooled printed wiring board,” Integration, VLSI J., vol. 18, no. 2&3, pp. 173-186, 1995.
13. N. V. Queipo and G. F. Gil, “Multiobjective optimal placement of convectively and conductively cooled electronic components on printed wiring boards,” Trans. ASME J. Electron. Packag., vol. 122, pp. 152-159, 2000.
14. R. Cole, T. Dalton, J. Punch, M. R. Davies, and R. Grimes, “Forced convection board level thermal design methodology for electronic systems,” Trans. ASME J. Electron. Packag., vol. 123, pp. 120-126, 2001.
15. A. Kos, “An approach to thermal placement in power electronics using neural networks,” in Proc. Intern. Symp. Circuits Syst., 1993, pp. 2427-2430.
16. C.-X. Zhang, “Timing-, heat-, and area-driven placement using self-organizing semantic maps,” in Proc. Intern. Symp. Circuits Syst., 1993, pp. 2067-2070.
17. J. Lee and J. H. Chou, “Hierarchical placement for power hybrid circuits under reliability and wireability constraints,” IEEE Trans. Rel., vol. 45, no. 2, pp. 200-207, 1996.
18. G. Chen and S. Sapatnekar, “Partition-driven standard cell thermal placement,” in Proc. Intern. Symp. Physical Design, 2003, pp. 75-80.
19. C. N. Chu and D. F. Wong, “A matrix synthesis approach to thermal placement,” IEEE Trans. Computer-Aided Design, vol. 17, no. 11, pp. 1166-1174, 1998.
20. M. C. Tang and J. D. Carothers, “Consideration of thermal constraints during multichip module placement,” Electron. Lett., vol. 33, no. 12, pp. 1043-1045, June 1997.
21. C. Beebe, J. D. Carothers, and A. Ortega, “MCM placement using a realistic thermal model,” in Proc. 10th Great Lakes Symp. VLSI, 2000, pp. 189-192.
22. C.-H. Tsai and S.-M. Kang, “Cell-level placement for improving substrate thermal distribution,” IEEE Trans. Computer-Aided Design, vol. 19, no. 2, pp. 253-266, 2000.
23. J. Lee, “Thermal placement algorithm based on heat conduction analogy,” IEEE Trans. Comp. Packag. Technol., vol. 26, no. 2, pp. 473-482, 2003.
24. Y.-J. Huang and S.-L. Fu, “Thermal placement design for MCM applications,” Trans. ASME J. Electron. Packag., vol. 122, pp. 115-120, 2000.
25. D. J. Dean, Thermal Design of Electronic Circuit Boards and Packages. Edinburgh, Scotland: Electrochemical Publications, 1985.
26. A. L. Palisoc and C. C. Lee, “Exact thermal representation of multilayer rectangular structures by infinite plate structures using the method of images,” J. Appl. Phys., vol. 64, no. 12, pp. 6851-6857, 1988.
27. C. Blum and A. Roli, “Metaheuristics in combinatorial optimization: overview and conceptual comparison,” ACM Computing Surveys, vol. 35, no. 3, pp. 268-308, 2003.
28. J. C. Spall, Introduction to Stochastic Search and Optimization. New Jersey: John Wiley & Sons, 2003.
29. N. Quinn and M. Breuer, “A forced directed component placement procedure for printed circuit boards,” IEEE Trans. Circuits Syst., vol. 26, no. 6, pp. 377-388, 1979.
30. M. D. Osterman and M. Pecht, “Placement for reliability and routability of convectively cooled PWB's,” IEEE Trans. Computer-Aided Design, vol. 9, no. 7, pp. 734-744, 1990.
31. H. Eisenmann and F. M. Johannes, “Generic global placement and floorplanning,” in Proc. ACM/IEEE Design Automation Conf., 1998, pp. 269-274.
32. F. Mo, A. Tabbara, and R. K. Brayton, “A force-directed macro-cell placer,” in Proc. Intern. Conf. Computer-Aided Design, 2000, pp. 177-180.
33. R. E. Burkard and E. Cela, “Linear assignment problems and extensions,” in Handbook of Combinatorial Optimization, Supplement Volume A, D.-Z. Du and P. M. Pardalos, Eds. Netherlands: Kluwer Academic Publishers, 1999, pp. 75-149.
34. P. Lall, M. Pecht, and E. B. Hakim, Influence of Temperature on Microelectronics and System Reliability. Orlando, FL: CRC Press, 1997.
35. K. Banerjee, A. Mehrotra, A. Sangiovanni-Vincentelli, and C. Hu, “On thermal effects in deep sub-micron VLSI interconnects,” in Proc. ACM/IEEE Design Automation Conf., 1999, pp. 885-891.
36. Flomerics, http://www.flotherm.com