Thermal Placement Algorithm Based on Heat Conduction Analogy

Jing Lee
Department of Electronic Engineering
Southern Taiwan University of Technology
Tainan, Taiwan 701, R.O.C.
Email: leejing@mail.stut.edu.tw

Abstract

A thermal force-directed placement algorithm, called TFPA, based on a heat conduction analogy, is proposed for MCM design. TFPA begins with the transformation of the real substrate with chips into an unbounded substrate with an infinite number of chips. Then, each chip pushes every other chip with a force based on the heat conduction analogy. Each chip thus moves in the direction of the force until the system reaches equilibrium. TFPA generates high quality placement results and maintains a cooler and more uniform thermal profile by distributing chip powers as evenly as possible. Unlike conventional force-directed algorithms, which may suffer from serious component overlap, TFPA places chips apart, so that little or no overlap occurs. In practice, the initial placements obtained by TFPA are very close to the final placements.

Index Terms: Physical design, reliability, thermal placement, MCM.

I. INTRODUCTION

The trend in microelectronic packaging continues to be towards greater packaging density, higher speed, and higher power dissipation in both chip packages and printed circuit boards (PCB). In addition, chips are frequently packaged in close proximity to each other in multichip modules (MCM) in order to minimize interchip propagation delays and the packaging volume. However, this trend results in higher heat flux densities at the substrate. As a result, higher operating temperatures can occur if the dissipated heat is not properly removed. A higher temperature not only affects circuit performance directly by slowing down the transistors on chips, but also decreases their reliability. It is well known that most of the physical and chemical processes that can cause component failure are accelerated at elevated temperatures [1].
In addition, the unevenly distributed power dissipation of chips on a substrate may produce hot spots, which can induce thermal stresses. When the stresses are severe enough and go through enough cycles, they can cause the chips to fail, usually by rupturing the solder bumps [2].

TCPT-2002-019.R1

Since thermal management is the most important reliability factor in many MCMs, it should be considered as early as possible in the overall packaging process. The best opportunity may be during the chip placement stage, because the temperature distributions depend directly on the results of chip placement. It is conceivable that a placement tool without thermal considerations could place some chips with high heat dissipation close together. This would result in a hot spot on the substrate, even though the total power consumption is constrained.

Historically, placement techniques have been developed primarily on the basis of routability. Sherwani [3] provides a summary of the classic techniques. These algorithms typically concentrate on minimizing total net length, while others focus on minimizing wire crossovers and vias [4]. However, with the increased demand for high quality and long-term reliable performance, techniques have been developed to address placement for reliability. Some studies have focused on reliability improvement for power hybrid circuits (PHC) [5], for convectively cooled PCB [6-11], for VLSI [12-13], and for MCM [14-17]. In addition, placements that consider both reliability and routability have been presented for convectively cooled PCB [18-21] and for PHC [22].

Force-Directed Algorithms (FDA) have been widely used for routability-driven placement on PCB and VLSI [4, 23, 24]. FDA is based on a model in which each component exerts attractive forces on the components connected to it by signal nets, while repulsive forces keep unconnected components apart.
The conventional FDAs are clearly unsuitable for the thermal placement problem (i.e., placement for reliability). Recently, a thermal FDA based on a fuzzy model was introduced by Huang and Fu [16] to handle the thermal placement problem for MCM. In their fuzzy model, the repulsive force between any two chips is proportional to the product of the power dissipations of the two chips, and decreases as the distance between the two chips increases. In this paper, a new force-directed placement algorithm is presented. The presented thermal force model is based on a heat conduction analogy, so it is well suited to solving thermal placement problems for MCMs and hybrid circuits.

The rest of this paper is organized as follows: the formulation of the approach is described in Section II, examples with computational results are given in Section III, and conclusions are drawn in Section IV.

II. FORMULATION OF THE APPROACH

Let d1, d2, …, dm be the chips to be placed on the substrate of an MCM, and (xi, yi) be the Cartesian coordinates of di. The thermal placement problem is to determine the locations of the chips such that the system reliability is optimized. This problem is considered to be NP-complete in the sense of Cook and Karp [25]. Hence, a heuristic procedure should be applied in order to obtain a nearly optimal solution in reasonable computation time. There are, in general, two types of heuristic methods for this kind of problem. One is a constructive method, which obtains an initial solution using heuristic rules, often in a sequential, deterministic manner. The other is an iterative improvement method, which improves a solution by means of local transformations. The algorithm proposed here is a constructive algorithm.

The thermal placement problem is handled in two stages. First, the real substrate with chips is transformed into an unbounded substrate with an infinite number of chips by the multiple reflection technique.
Next, a thermal force-directed placement algorithm is used to determine the locations of the chips.

A. Multiple reflection technique

For an MCM, the heat loss from the sidewalls of the substrate is insignificant compared to that from the top and bottom sides of the substrate. Therefore, the sidewalls of the substrate can be treated as perfect insulators. As there is no heat flow across insulated boundaries and because heat flow is proportional to the temperature gradient, there is no temperature gradient normal to an insulated boundary. A perfect insulator can therefore be represented as a zero temperature gradient normal to the boundary. As a result, a plane of symmetry can be used to replace an insulated boundary by means of a reflected mirror image source, as shown in Fig. 1. A rectangular substrate with several heat sources and four insulated boundaries can similarly be transformed into an unbounded substrate containing an infinite number of mirror image heat sources. The new configuration generated by this transformation leaves the temperature distribution within the original region unchanged. An example including two heat sources is shown in Fig. 2. The mirror image substrate at the rth row and the cth column, as shown in Fig. 2, is called the r-c-substrate. The location of $d_j$ in the r-c-substrate is denoted by $(x_j^{(r,c)}, y_j^{(r,c)})$, where

$$ x_j^{(r,c)} = \left(c + \tfrac{1}{2}\right) L + (-1)^c \left(x_j - \tfrac{L}{2}\right) \tag{1} $$

$$ y_j^{(r,c)} = \left(r + \tfrac{1}{2}\right) W + (-1)^r \left(y_j - \tfrac{W}{2}\right) \tag{2} $$

and L and W are the length and the width of the substrate, respectively. If $d_j$ is located in the real substrate (i.e., the 0-0-substrate), the superscript is omitted and its location is simply denoted by $(x_j, y_j)$.

Fig. 1. (a) An infinite insulated boundary with a heat source q in the physical region; (b) the insulated boundary replaced by a reflected mirror image source in the mirror image region.
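The mirror image coordinates of equations (1) and (2) can be checked numerically with a minimal Python sketch. The function name `mirror_location` and its parameter names are illustrative assumptions, not part of the original implementation.

```python
def mirror_location(xj, yj, r, c, length, width):
    """Return the location of chip d_j in the r-c-substrate, per Eqs. (1) and (2).

    (xj, yj) is the chip location in the real (0-0) substrate; `length` and
    `width` are the substrate dimensions L and W. All names are hypothetical.
    """
    # (-1)**c: even indices give translated copies, odd indices mirrored copies.
    sign_c = 1 if c % 2 == 0 else -1
    sign_r = 1 if r % 2 == 0 else -1
    x = (c + 0.5) * length + sign_c * (xj - 0.5 * length)
    y = (r + 0.5) * width + sign_r * (yj - 0.5 * width)
    return x, y


if __name__ == "__main__":
    # A chip at (3, 4) on a 10 x 20 substrate and two of its mirror images.
    print(mirror_location(3.0, 4.0, 0, 1, 10.0, 20.0))   # reflected across x = L
    print(mirror_location(3.0, 4.0, 0, -1, 10.0, 20.0))  # reflected across x = 0
```

For c = 1 the expression reduces to 2L - x_j, which is the reflection of the chip across the boundary x = L, and for c = -1 it reduces to -x_j, the reflection across x = 0.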
Fig. 2. Multiple reflection technique. (a) A rectangular region of length L and width W with insulated boundaries and heat sources qi and qj; (b) an unbounded region with infinite mirror image heat sources (substrates in rows -2 to 2 and columns -2 to 2 are shown).

Mathematically, the discontinuities at boundaries constitute the major impediment to the prediction of temperatures. By using the multiple reflection technique, the boundaries are implicitly included in the unbounded substrate; thus, they are removed from the solution procedure. It is frequently easier to solve for the temperature distribution on the unbounded substrate than on the original finite domain. Some previous studies have used the multiple reflection technique with the principle of superposition to predict the temperatures of electronic devices [26-28].

B. Thermal force model

In an unbounded body, heat flows everywhere from every heat source, and the heat flux decreases with the square of the distance from the heat source in the steady state condition. The present thermal force model is an analogue of this heat conduction mechanism. Every chip dj pushes every other chip di with a force that is directly proportional to the heat dissipation of dj and inversely proportional to the square of the distance separating them.
This statement can be expressed as

$$ f_{i,j}^{(r,c)} = \frac{\alpha_i q_j}{\left(r_{i,j}^{(r,c)}\right)^2} \tag{3} $$

where $f_{i,j}^{(r,c)}$ is the force exerted on $d_i$ in the real substrate by $d_j$ in the r-c-substrate; $\alpha_i$ is the thermal sensitivity factor of $d_i$, with $\alpha_i = 1$ for normal chips and $\alpha_i > 1$ for chips that are sensitive to temperature; $q_j$ is the heat dissipation of $d_j$; and

$$ r_{i,j}^{(r,c)} = \sqrt{\left(x_j^{(r,c)} - x_i\right)^2 + \left(y_j^{(r,c)} - y_i\right)^2} \tag{4} $$

$f_{i,j}^{(r,c)}$ can be broken down into the components

$$ fx_{i,j}^{(r,c)} = \frac{\alpha_i q_j}{\left(r_{i,j}^{(r,c)}\right)^2} \cos\left(\theta_{i,j}^{(r,c)}\right) \tag{5} $$

and

$$ fy_{i,j}^{(r,c)} = \frac{\alpha_i q_j}{\left(r_{i,j}^{(r,c)}\right)^2} \sin\left(\theta_{i,j}^{(r,c)}\right) \tag{6} $$

where

$$ \theta_{i,j}^{(r,c)} = \operatorname{atan2}\left(y_j^{(r,c)} - y_i,\; x_j^{(r,c)} - x_i\right) \tag{7} $$

Expanding this formulation to cover all m chips, one obtains

$$ Fz_i = \sum_{j=1}^{m} \left[ fz_{i,j} + \sum_{a=1}^{\infty} \left( \sum_{c=-a}^{a} \left( fz_{i,j}^{(-a,c)} + fz_{i,j}^{(a,c)} \right) + \sum_{r=-a+1}^{a-1} \left( fz_{i,j}^{(r,-a)} + fz_{i,j}^{(r,a)} \right) \right) \right] \tag{8} $$

where z is x or y. In theory, the upper limit of a in formula (8) must be infinity. However, since $fz_{i,j}^{(r,c)}$ is proportional to $1/\left(r_{i,j}^{(r,c)}\right)^2$, a small integer bound on a is sufficient. As shown in the Appendix, a maximum value of a = 5 is adequate. That is, the real substrate is expanded into a region with 121 times the area of the real substrate.

C. Thermal force placement algorithm

The set of simultaneous equations (8) is solved by setting $Fz_i$ to zero and solving for $z_i$. A modified Newton-Raphson (NR) method is used to solve this system of equations to find the correct positions of the chips.
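The net force of equation (8), which the NR iteration drives to zero, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the author's C implementation: the function names, the tuple layout of `chips`, and the choice to skip only the self-term (chip i acting on itself in the real substrate, while keeping the mirror images of chip i) are assumptions. The sum is truncated at a = 5, i.e. the 11 × 11 block of substrates.

```python
import math


def mirror_location(xj, yj, r, c, length, width):
    """Location of d_j in the r-c-substrate, per Eqs. (1) and (2)."""
    sign_c = 1 if c % 2 == 0 else -1
    sign_r = 1 if r % 2 == 0 else -1
    x = (c + 0.5) * length + sign_c * (xj - 0.5 * length)
    y = (r + 0.5) * width + sign_r * (yj - 0.5 * width)
    return x, y


def net_force(i, chips, length, width, a_max=5):
    """Truncated net force (Fx_i, Fy_i) of Eq. (8) acting on chip i.

    chips: list of (x, y, q, alpha) tuples (hypothetical layout).
    a_max: largest mirror ring index 'a' kept in the sum.
    """
    xi, yi, _, alpha_i = chips[i]
    fx_total = 0.0
    fy_total = 0.0
    for j, (xj, yj, qj, _) in enumerate(chips):
        for r in range(-a_max, a_max + 1):
            for c in range(-a_max, a_max + 1):
                if j == i and r == 0 and c == 0:
                    continue  # assumption: a chip exerts no force on itself
                xm, ym = mirror_location(xj, yj, r, c, length, width)
                dx, dy = xm - xi, ym - yi
                dist = math.hypot(dx, dy)             # Eq. (4)
                magnitude = alpha_i * qj / dist ** 2  # Eq. (3)
                fx_total += magnitude * dx / dist     # Eq. (5), cos(theta) = dx/dist
                fy_total += magnitude * dy / dist     # Eq. (6), sin(theta) = dy/dist
    return fx_total, fy_total


if __name__ == "__main__":
    # One chip left of center on a 10 x 10 substrate.
    print(net_force(0, [(3.0, 5.0, 1.0, 1.0)], 10.0, 10.0))
```

The sketch does not guard against a zero distance, which would occur if two chips coincide or a chip sits exactly on a substrate edge. With the sign convention of equation (7), each component points from d_i toward the source, so a single chip at the center of the substrate experiences a net force of zero by symmetry.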
To apply the modified NR method, the partial derivative of $Fz_i$ with respect to $z_i$ is needed:

$$ F'z_i = \sum_{j=1}^{m} \left[ f'z_{i,j} + \sum_{a=1}^{\infty} \left( \sum_{c=-a}^{a} \left( f'z_{i,j}^{(-a,c)} + f'z_{i,j}^{(a,c)} \right) + \sum_{r=-a+1}^{a-1} \left( f'z_{i,j}^{(r,-a)} + f'z_{i,j}^{(r,a)} \right) \right) \right] \tag{9} $$

where

$$ f'x_{i,j}^{(r,c)} = \frac{d\,fx_{i,j}^{(r,c)}}{dx_i} = \frac{\alpha_i q_j}{\left(r_{i,j}^{(r,c)}\right)^3} \left( 3\cos^2\left(\theta_{i,j}^{(r,c)}\right) - 1 \right) \tag{10} $$

$$ f'y_{i,j}^{(r,c)} = \frac{d\,fy_{i,j}^{(r,c)}}{dy_i} = \frac{\alpha_i q_j}{\left(r_{i,j}^{(r,c)}\right)^3} \left( 3\sin^2\left(\theta_{i,j}^{(r,c)}\right) - 1 \right) \tag{11} $$

Then, the moving distance of $d_i$ is chosen as

$$ \Delta z_i = -0.5\, \frac{Fz_i}{F'z_i} \tag{12} $$

The factor of 0.5 in equation (12) is used because any two chips push each other; hence each chip needs to move only half of the computed distance. The new location of $(x_i, y_i)$ after each iteration is

$$ z_i^{(new)} = z_i + \Delta z_i \tag{13} $$

However, a new position $z_i^{(new)}$ outside the real substrate is not acceptable. To avoid this condition, $\Delta z_i$ is modified as follows: if $z_i + \Delta z_i$ in equation (13) is out of range, $\Delta z_i$ is reduced to half of its previous value. The procedure is repeated until $z_i + \Delta z_i$ lies within the real substrate.

procedure TFPA(Q, A, X, Y, L, W, ε1, ε2)
// Q = {q_i | 1 ≤ i ≤ m}, A = {α_i | 1 ≤ i ≤ m}, (X, Y) = {(x_i, y_i) | 1 ≤ i ≤ m} //
// L and W are the length and the width of the substrate. //
// ε1 and ε2 are two stopping criterion values. //
begin
  do
    Random(X, Y)    // 0 < x_i < L, 0 < y_i < W //
    Norm ← a large value
    do
      Old_Norm ← Norm
      Norm ← 0
      for i ← 1 to m
        (Fx_i, Fy_i) ← Equation (8)
        (F'x_i, F'y_i) ← Equation (9)
        (Δx_i, Δy_i) ← -0.5 × (Fx_i / F'x_i, Fy_i / F'y_i)
        while x_i + Δx_i ≥ L or x_i + Δx_i ≤ 0 do Δx_i ← Δx_i / 2 repeat
        while y_i + Δy_i ≥ W or y_i + Δy_i ≤ 0 do Δy_i ← Δy_i / 2 repeat
        (x_i, y_i) ← (x_i + Δx_i, y_i + Δy_i)
        Norm ← Norm + |Fx_i| + |Fy_i|
      repeat
    while |Old_Norm - Norm| / Norm > ε1
  while Norm > ε2
  output(X, Y)
end

Fig. 3. Procedure of TFPA.

The present method, which is called the Thermal Force Placement Algorithm (TFPA), is outlined in Fig. 3. TFPA begins by generating a random initial placement. New placements are iteratively computed by the modified NR method until a convergent solution is obtained.
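The update of equations (12) and (13), together with the step-halving rule that keeps a chip on the real substrate, can be sketched for a single coordinate as follows. The function names are hypothetical. The sketch assumes the standard Newton-Raphson sign (a step of minus F over F', scaled by 0.5), that the current coordinate already lies strictly inside the substrate, and that the derivative is nonzero.

```python
def clamp_step(z, dz, upper):
    """Halve dz until z + dz lies strictly inside the open interval (0, upper).

    Assumes 0 < z < upper, so the loop terminates: in the worst case dz
    shrinks toward zero and z + dz approaches z.
    """
    while not (0.0 < z + dz < upper):
        dz /= 2.0
    return dz


def nr_step(z, force, dforce, upper):
    """One modified Newton-Raphson update of a coordinate z (x_i or y_i).

    force  : Fz_i from Eq. (8)
    dforce : F'z_i from Eq. (9), assumed nonzero
    upper  : substrate length L (for x) or width W (for y)
    """
    dz = -0.5 * force / dforce              # Eq. (12), sign is an assumption
    return z + clamp_step(z, dz, upper)     # Eq. (13) with the halving rule


if __name__ == "__main__":
    # A full step of 4.0 from z = 9.0 would leave a substrate of length 10,
    # so the step is halved until the new position is inside.
    print(nr_step(9.0, 8.0, -1.0, 10.0))
```

Applying `nr_step` once to x_i and once to y_i for every chip corresponds to one pass of the inner `for` loop in Fig. 3.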
To define a convergent solution properly, set

$$ \mathrm{Norm} = \sum_{i=1}^{m} \left( |Fx_i| + |Fy_i| \right) \tag{14} $$

Old_Norm denotes the Norm of the previous placement. ε1 and ε2 are two stopping criterion values, which are small positive values defined by the user. Finding the optimum values of ε1 and ε2 requires some trials, but when both ε1 and ε2 are at most 0.01, the placements obtained differ only slightly. Thus, to provide a basis for comparison, both ε1 and ε2 are set to 0.01 in the present study. If |Old_Norm - Norm| / Norm > ε1, the current placement has not yet converged, and TFPA proceeds to obtain a new placement. If |Old_Norm - Norm| / Norm ≤ ε1 but Norm > ε2, the current placement is trapped at a local optimum and cannot proceed further. Thus, TFPA generates a new random initial placement and executes the whole procedure again. If |Old_Norm - Norm| / Norm ≤ ε1 and Norm ≤ ε2, the placement is considered convergent, and TFPA terminates.

D. Algorithm Validation

In TFPA, the thermal forces $Fz_i$ are analogues of the heat flux in an unbounded region, so the $F'z_i$ are equivalent to the gradient of the heat flux. Since the gradient of the heat flux always points in the direction of the lower temperature region, TFPA always moves the chips in the direction of the lower temperature region, hence leading to a convergent solution with a uniform temperature distribution.

E. Space Complexity

TFPA requires 10m cells of storage for {α_i, q_i, x_i, y_i, Fx_i, Fy_i, F'x_i, F'y_i, Δx_i, Δy_i | 1 ≤ i ≤ m}. That is, the space complexity of TFPA is O(m).

F. Time Complexity

Each iteration of TFPA consists of applying the one-dimensional NR method 2m times, once for each coordinate of each chip. Since the time needed for the NR method is mainly spent in computing formulas (8) and (9), the time needed for other steps is neglected in the following analysis.
In computing formulas (8) and (9) per step, the number of evaluations of $fz_{i,j}^{(r,c)}$ and $f'z_{i,j}^{(r,c)}$ is equal to

$$ \sum_{j=1}^{m} \left[ 1 + \sum_{a=1}^{5} \left( \sum_{c=-a}^{a} 2 + \sum_{r=-a+1}^{a-1} 2 \right) \right] = 121m $$

Let the time needed for evaluating $fz_{i,j}^{(r,c)}$ and $f'z_{i,j}^{(r,c)}$ be one unit of time. Then, 2m NR steps of 121m evaluations each require 242m² units of time per iteration of TFPA. From the above analysis, one can conclude that the time complexity of TFPA is O(m²) for each iteration. The number of iterations required to obtain a solution depends on the problem size (m), the heat dissipation of the chips, and the stopping criterion values (ε1 and ε2).

G. Accuracy and Limitation of TFPA

The accuracy of a particular solution depends upon at least three factors. The first is the extent to which the TFPA model corresponds to the physical problem. In TFPA, the packaging structure is simplified as a rectangular plane, the material of the substrate is assumed to be isotropic and homogeneous in its thermal conductivity, and the edge of the substrate is assumed to be perfectly insulated. All these simplifications and assumptions introduce some error. However, since TFPA always generates a placement in which chips are placed apart, it can still be used to obtain a "good" solution even when a practical package does not satisfy these assumptions. Second, TFPA neglects the dimensions of the chips and treats each chip as a point to simplify the problem. This may induce chip overlaps. However, the overlapping problem is not serious, as shown in Sec. III. Finally, the accuracy is also influenced by truncating the high order terms in formula (8). However, this error has no significant effect on accuracy, as shown in the Appendix.

III. EXAMPLES AND COMPUTATIONAL RESULTS

TFPA has been implemented in the C language and run on a 300 MHz Pentium II personal computer. Since there are no well-established MCM benchmark circuits or package data, the present method is applied to several previously published examples.
To simplify the analysis, all examples are treated as having the same module packaging and cooling conditions, but they may have different geometric dimensions. The TAMS program [29-30], based on a Fourier series solution, is used to predict the temperature distributions of chips on the substrate. Fig. 4 shows a TAMS model of an MCM. The different layers model the different parts of the MCM structure, such as the multi-layer substrate, the epoxy layer, and the heat sink. Within each layer, the material is assumed to be linear, isotropic, and homogeneous. Temperature and heat flow are continuous at interfaces between layers. Air conduction or convection within the space between the MCM ceramic surface and the cover is neglected (a worst-case assumption). The thermal conductivities of the multi-layer substrate (i.e., Cordierite substrate, 2MgO·2Al2O3·5SiO2), the epoxy layer, and the heat sink are 2.5×10⁻³ W/mm·℃, 1.17×10⁻³ W/mm·℃, and 1.95×10⁻¹ W/mm·℃, respectively. Because the average heat flux is very high in our examples, we select forced convection at a velocity of 4 m/s for the board, and jet impingement forced convection at a velocity of 0.3 m/s for the finned sides. Correspondingly, the average heat transfer coefficients h1 and h2 are 3.26×10⁻⁵ W/mm²·℃ and 1.94×10⁻⁴ W/mm²·℃.

Fig. 4. Schematic representation of the MCM model used here (chips, epoxy layer, multilayer substrate, and heat sink, with heat transfer coefficients h1 and h2 and substrate dimensions L and W).

The failure rate of a chip is estimated by the Arrhenius formula

$$ \lambda_T^{(i)} = \lambda_o \exp\left[ \frac{E_a}{k} \left( \frac{1}{298} - \frac{1}{T} \right) \right] \tag{15} $$

where $\lambda_T$ and $\lambda_o$ are the failure rates at T K and at 298 K, respectively; $E_a$ is the activation energy (eV); and k is Boltzmann's constant.

To determine the system failure rate of an MCM, various operating parameters need to be specified. Without loss of generality, all chips are assumed to have the same values of $\lambda_o$ and $E_a$, which are 1 Fit (i.e., 10⁻⁹/hour) and 1 eV, respectively.
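As a worked illustration of equation (15), the per-chip failure rate can be evaluated with a short Python sketch. The function name, the parameter names, and the conversion from ℃ to K by adding 273.15 are assumptions made for this example; the default values follow the 1 Fit and 1 eV stated above.

```python
import math

# Boltzmann's constant in eV/K (CODATA value), so that E_a can be given in eV.
BOLTZMANN_EV_PER_K = 8.617333262e-5


def failure_rate(temp_celsius, lambda_0=1.0, activation_energy_ev=1.0):
    """Arrhenius failure rate of Eq. (15), in the same unit as lambda_0 (Fit).

    temp_celsius         : chip temperature in degrees Celsius
    lambda_0             : failure rate at the 298 K reference temperature
    activation_energy_ev : activation energy E_a in eV
    """
    temp_kelvin = temp_celsius + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / 298.0 - 1.0 / temp_kelvin
    )
    return lambda_0 * math.exp(exponent)


if __name__ == "__main__":
    # Failure rates at a few chip temperatures, relative to 1 Fit at 298 K.
    for temp in (44.0, 64.0, 94.0):
        print(temp, failure_rate(temp))
```

Because of the exponential dependence on temperature, a few hot chips dominate any total built from these rates, which is why lowering the maximum chip temperature has such a strong effect on reliability.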
The system failure rate, $\lambda_S$, of an MCM with m chips is calculated as the sum of the individual failure rates of the chips. That is,

$$ \lambda_S = \sum_{i=1}^{m} \lambda_T^{(i)} \tag{16} $$

A. Examples of uniform chips

The first two examples, Cases 1 and 2, have twelve and sixteen chips, respectively. All chips are 5.1 mm square and dissipate 1 W. The substrate for both cases has length, width, and thickness of 30.5 mm, 30.5 mm, and 5 mm, respectively. The placements with temperature distributions obtained by TFPA are shown in Figs. 5(a) and (b). One can see that the chips are regularly placed on the substrate, and the temperatures of the chips are very evenly distributed. It is no surprise that the results obtained are optimal in both cases.

Fig. 5. Placements with temperature (℃) distribution obtained by TFPA. (a) 12 chips example, with chip temperatures of 117 to 118 ℃; (b) 16 chips example, with chip temperatures of 139 to 140 ℃.

B. Examples of chips with different heat dissipations

The following four examples are from [16]. All four cases have the same substrate and chip sizes, but the chips have different heat dissipations in each case. A 50 mm square substrate with a thickness of 5 mm is used. The chips are also square, with a 5 mm edge. Table 1 lists the heat dissipation of each chip.

The placement results obtained by TFPA are compared to those obtained by Huang and Fu [16]. For reference, the algorithm presented by Huang and Fu is referred to as FMFDA (Fuzzy Model Force-Directed Algorithm). The maximum, minimum, and average temperatures, the maximum temperature difference, and the system failure rate (i.e., Tmax, Tmin, Tav, ΔTmax, and $\lambda_S$) obtained by TFPA and by FMFDA for the various cases are summarized in Table 2. The chip locations for FMFDA are taken directly from [16], but the temperature distributions are analyzed under the present packaging and cooling conditions.
As shown in Table 2, the results obtained by TFPA have lower Tmax than, and the same Tmin as, those obtained by FMFDA. Therefore, the values of Tav, ΔTmax, and S obtained by TFPA are also lower than those obtained by FMFDA. In particular, the S of TFPA is only 15% and 18% of the S of FMFDA for Cases A and B, respectively.

Table 1. Four cases with various chip numbers and power dissipations [16].

| Case | Number of chips | Power dissipation values (W) |
|------|-----------------|------------------------------|
| A | 5 | 0.8, 1.3, 1.5, 1.5, 1.7 |
| B | 6 | 0.5, 0.8, 0.9, 1.1, 1.3, 1.5 |
| C | 8 | 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0 |
| D | 10 | 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2 |

Table 2. Comparison of TFPA and FMFDA for the cases in Table 1.

| Quantity | A: FMFDA | A: TFPA | B: FMFDA | B: TFPA | C: FMFDA | C: TFPA | D: FMFDA | D: TFPA |
|----------|----------|---------|----------|---------|----------|---------|----------|---------|
| Tmax (℃) | 119 | 94 | 108 | 86 | 82 | 68 | 81 | 78 |
| Tmin (℃) | 64 | 64 | 52 | 52 | 44 | 44 | 47 | 47 |
| ΔTmax = Tmax - Tmin (℃) | 55 | 30 | 56 | 34 | 38 | 24 | 34 | 30 |
| Tav (℃) | 100.2 | 82.4 | 82.2 | 69.5 | 62 | 55.8 | 64.6 | 63 |
| S (Fit) | 25432 | 3830 | 8755 | 1610 | 1324 | 415 | 1550 | 1252 |

For further comparison, the detailed chip locations with temperature distributions for Cases A and C, obtained by FMFDA and TFPA, are shown in Figs. 6(a), (b) and 7(a), (b), respectively. FMFDA clearly places the chips with larger heat dissipations at the corners of the substrate to avoid a hot spot in the central region of the substrate. However, chips at the corners generally have worse cooling, since there is not enough area to transport the heat generated by the chips. As a result, FMFDA generates placements with four hot spots at the four corners of the substrate, as shown in Figs. 6(a) and 7(a). This can induce serious thermal stress at the corners and decrease the system reliability of an MCM. On the other hand, TFPA places chips apart, so each chip has enough substrate area to transport the heat dissipated within the chip; hence, placements with a cooler and more uniform thermal profile can be obtained.
However, in cases where most of the heat can be conducted away through the four sidewalls, FMFDA may give better results.

Fig. 6. Placements with temperature (℃) distribution and heat dissipations for Case A. (a) Placement result by FMFDA; (b) placement result by TFPA.

Fig. 7. Placements with temperature (℃) distribution and heat dissipations for Case C. (a) Placement result by FMFDA; (b) placement result by TFPA.

C. Examples of uneven chips

The final two examples are two consumer microprocessors, PowerPC 603 and 604. Table 3 shows the chip size and power information for both MCMs [31]. The chips have varying heights, widths, and heat dissipations. The PowerPC 604 can be considered a critical example, because the processor dissipates up to 13 W, which is considerably higher than the dissipation of the other chips. For comparison, Figs. 8(a) and (b) show the chip layouts of the PowerPC 603 and 604 MCMs on a 44 mm × 44 mm substrate as depicted in [31]. The temperature distributions in Figs. 8(a) and (b) are analyzed under the present packaging and cooling conditions.

Table 3. Chip size and power information for PowerPC 603 and 604 MCM [31].

| MCM | Quantity | Processor | Bridge controller | Bridge buffer | Clock Distr. | SRAM | Tag RAM | Bus Buffer |
|-----|----------|-----------|-------------------|---------------|--------------|------|---------|------------|
| PowerPC 603 | Chip size (mm) | 7.4×11.4 | 6.7×7.1 | 7.8×7.4 | 4.1×4.1 | 5.7×12.1 | 9.8×4.1 | 3.2×1.3 |
| PowerPC 603 | Max. chip power (W) | 3 | 0.75 | 0.5 | 0.9 | 1 | 0.7 | 0.4 |
| PowerPC 604 | Chip size (mm) | 12.4×15.8 | 6.7×7.1 | 7.8×7.4 | 4.1×4.1 | 5.7×12.1 | 9.8×4.1 | 3.2×1.3 |
| PowerPC 604 | Max. chip power (W) | 13 | 0.75 | 0.5 | 0.9 | 1 | 0.7 | 0.4 |

Fig. 8. Chip layout with temperature (℃) distribution and heat dissipations for PowerPC 603 and 604 in [31]. (a) PowerPC 603; (b) PowerPC 604.

The initial placements of PowerPC 603 and 604 generated by TFPA are shown in Figs. 9(a) and (b), respectively. Both initial placements are of very high quality; chips are placed apart and only a little overlap occurs in Figs. 9(a) and (b). Therefore, one can easily transform the initial placements into the final placements, simply by removing the overlaps according to the design rules. The final placements are shown in Figs. 10(a) and (b).

Fig. 9. Initial placements obtained by TFPA. (a) PowerPC 603; (b) PowerPC 604.

Fig. 10. Final placements after design rule checking. (a) PowerPC 603; (b) PowerPC 604.

Tmax, Tmin, Tav, ΔTmax, and S for the placements in Figs. 8(a), (b), and 10(a), (b) are summarized in Table 4. As shown in Table 4, the PowerPC 603 results obtained by TFPA are slightly better than those of the original product. The system failure rate is improved over the original one by about 15%. For the PowerPC 604, the placement generated by TFPA has lower values of Tmax and ΔTmax than the original placement. Thus, the values of Tav and S obtained by TFPA are also lower than the original results. The system failure rate is improved over the original result by about 8%.

It is well known that the initial placements generated by conventional FDA [23] usually have serious block overlaps, since the force model does not include the geometric information of the substrate.
In contrast, the geometric information of the substrate is implicitly included in the thermal force model of TFPA; thus, the overlapping problem can be largely reduced.

Table 4. Comparison of the placements obtained by TFPA with the placements in [31].

| Quantity | PowerPC 603: Original | PowerPC 603: TFPA | PowerPC 604: Original | PowerPC 604: TFPA |
|----------|-----------------------|-------------------|-----------------------|-------------------|
| Tmax (℃) | 94 | 94 | 180 | 179 |
| Tmin (℃) | 59 | 58 | 83 | 85 |
| ΔTmax = Tmax - Tmin (℃) | 35 | 36 | 97 | 94 |
| Tav (℃) | 72.5 | 71.7 | 102.1 | 101.1 |
| S (Fit) | 4445 | 3788 | 659875 | 605712 |

IV. CONCLUSIONS

A Thermal Force Placement Algorithm (TFPA), based on a heat conduction analogy, is proposed for the thermal placement of MCM. TFPA has proved to be an efficient and effective algorithm. The space complexity of TFPA is O(m) and its time complexity is O(m²) per iteration. In TFPA, chips are iteratively moved in the direction of the lower temperature region. As a result, it leads to a placement with a cooler and more uniform temperature distribution.

Eight examples with three different types of chips have been examined with TFPA. For uniform chips, TFPA generates optimal placements for reliability; for chips of the same size but with different heat dissipations, TFPA generates placements with a more uniform temperature distribution than the method compared; for uneven chips, TFPA generates placements with better system reliability than the present consumer products (PowerPC 603 and 604).

Appendix

From Sec. II, equation (8) gives the net thermal force exerted on $d_i$ by $d_j$ as

$$ Fz_{i,j} = fz_{i,j} + \sum_{a=1}^{\infty} Pz(a) = fz_{i,j} + \sum_{a=1}^{\infty} \left( Pzc(a) + Pzr(a) \right) \tag{8} $$

where

$$ Pzc(a) = \sum_{c=-a}^{a} \left( fz_{i,j}^{(-a,c)} + fz_{i,j}^{(a,c)} \right), \qquad Pzr(a) = \sum_{r=-a+1}^{a-1} \left( fz_{i,j}^{(r,-a)} + fz_{i,j}^{(r,a)} \right), $$

$Pz(a) = Pzc(a) + Pzr(a)$, and z is either x or y.

This section discusses the truncation error of neglecting $\sum_{a=6}^{\infty} Pz(a)$ in formula (8). Since the derivation in the y-direction is the same as the derivation in the x-direction, only the derivation in the x-direction is considered in the following. For convenience, consider the substrate to be a square of unit length, and let $d_i$ be at the center point.
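Therefore, $L = W = 1$, $x_i = 0.5$, and $y_i = 0.5$. The coordinate differences between $d_i$ and $d_j^{(r,c)}$ can be represented by

$$ \Delta x_{i,j}^{(r,c)} = c + \Delta x^{(c)} \tag{17a} $$

and

$$ \Delta y_{i,j}^{(r,c)} = r + \Delta y^{(r)} \tag{17b} $$

where $\Delta x^{(c)} = (-1)^c (x_j - x_i)$ and $\Delta y^{(r)} = (-1)^r (y_j - y_i)$. Obviously, $|\Delta x^{(c)}| \le 0.5$ and $|\Delta y^{(r)}| \le 0.5$.

The square of the distance between $d_i$ and $d_j^{(-a,-c)}$ can be described as

$$ \left(r_{i,j}^{(-a,-c)}\right)^2 = \left(\Delta x_{i,j}^{(-a,-c)}\right)^2 + \left(\Delta y_{i,j}^{(-a,-c)}\right)^2 = \left(-c + \Delta x^{(c)}\right)^2 + \left(-a + \Delta y^{(a)}\right)^2 $$

$$ = \left(a^2 + c^2\right) \left[ 1 - 2\,\frac{c\,\Delta x^{(c)} + a\,\Delta y^{(a)}}{a^2 + c^2} + \frac{\left(\Delta x^{(c)}\right)^2 + \left(\Delta y^{(a)}\right)^2}{a^2 + c^2} \right] \tag{18} $$

Since

$$ \frac{\left(\Delta x^{(c)}\right)^2 + \left(\Delta y^{(a)}\right)^2}{a^2 + c^2} \le \frac{1}{2\left(a^2 + c^2\right)} \le \frac{1}{2a^2} \le 0.014 \ll 1 \quad \text{as } a \ge 6, $$

formula (18) can be approximated as

$$ \left(r_{i,j}^{(-a,-c)}\right)^2 \approx \left(a^2 + c^2\right) \left[ 1 - 2\left(h_1 + h_2\right) \right] \quad \text{as } a \ge 6 \tag{19a} $$

where $h_1 = \dfrac{a\,\Delta y^{(a)}}{a^2 + c^2}$ and $h_2 = \dfrac{c\,\Delta x^{(c)}}{a^2 + c^2}$.

Similarly, we can obtain

$$ \left(r_{i,j}^{(a,c)}\right)^2 \approx \left(a^2 + c^2\right) \left[ 1 + 2\left(h_1 + h_2\right) \right] \quad \text{as } a \ge 6 \tag{19b} $$

A. Derivation of Pxc(a)

According to (5), we find

$$ fx_{i,j}^{(-a,-c)} + fx_{i,j}^{(a,c)} = \alpha_i q_j \left[ \frac{\Delta x_{i,j}^{(-a,-c)}}{\left(r_{i,j}^{(-a,-c)}\right)^3} + \frac{\Delta x_{i,j}^{(a,c)}}{\left(r_{i,j}^{(a,c)}\right)^3} \right] \tag{20} $$

Substituting (17a), (17b), (19a), and (19b) into (20) yields

$$ fx_{i,j}^{(-a,-c)} + fx_{i,j}^{(a,c)} = \frac{\alpha_i q_j}{\left(a^2 + c^2\right)^{3/2}} \left[ \frac{-c + \Delta x^{(c)}}{\left[1 - 2(h_1 + h_2)\right]^{3/2}} + \frac{c + \Delta x^{(c)}}{\left[1 + 2(h_1 + h_2)\right]^{3/2}} \right] $$

$$ = \frac{\alpha_i q_j}{\left(a^2 + c^2\right)^{3/2}} \cdot \frac{\left(-c + \Delta x^{(c)}\right)\left[1 + 2(h_1 + h_2)\right]^{3/2} + \left(c + \Delta x^{(c)}\right)\left[1 - 2(h_1 + h_2)\right]^{3/2}}{\left[1 - 4(h_1 + h_2)^2\right]^{3/2}} \tag{21} $$

By applying the Taylor series expansion,

$$ \left[1 \pm 2(h_1 + h_2)\right]^{3/2} = 1 \pm 3(h_1 + h_2) + \tfrac{3}{2}(h_1 + h_2)^2 \pm \cdots \tag{22} $$

and

$$ \left[1 - 4(h_1 + h_2)^2\right]^{3/2} = 1 - 6(h_1 + h_2)^2 + \cdots \tag{23} $$

The high order terms in (22) and (23) can reasonably be neglected according to the following analysis.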
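$$ \tfrac{3}{2}\left(h_1 + h_2\right)^2 = \tfrac{3}{2}\left( \frac{c\,\Delta x^{(c)} + a\,\Delta y^{(a)}}{a^2 + c^2} \right)^2 \le \tfrac{3}{8}\left( \frac{a + c}{a^2 + c^2} \right)^2 \le \tfrac{3}{8}\left( \frac{1 + \sqrt{2}}{2a} \right)^2 \le 0.015 \ll 1 \quad \text{as } a \ge 6 $$

and

$$ 6\left(h_1 + h_2\right)^2 \le 0.06 \ll 1 \quad \text{as } a \ge 6 $$

Hence (22) and (23) can be approximated as

$$ \left[1 \pm 2(h_1 + h_2)\right]^{3/2} \approx 1 \pm 3(h_1 + h_2) \quad \text{as } a \ge 6 \tag{24} $$

and

$$ \left[1 - 4(h_1 + h_2)^2\right]^{3/2} \approx 1 \quad \text{as } a \ge 6 \tag{25} $$

By inserting (24) and (25) into (21), we obtain

$$ fx_{i,j}^{(-a,-c)} + fx_{i,j}^{(a,c)} \approx \alpha_i q_j\, \frac{2\Delta x^{(c)} - 6c\left(h_1 + h_2\right)}{\left(a^2 + c^2\right)^{3/2}} \tag{26} $$

Pxc(a) can be obtained by summing the series (26) from c = -a to c = a:

$$ Pxc(a) \approx \sum_{c=-a}^{a} \alpha_i q_j\, \frac{2\Delta x^{(c)} - 6c\left(h_1 + h_2\right)}{\left(a^2 + c^2\right)^{3/2}} $$

$$ = \alpha_i q_j \left\{ \frac{2\Delta x^{(0)}}{a^3} + \sum_{c=1}^{a} \left[ \frac{2\Delta x^{(c)} - 6c\left(h_1 + h_2\right)}{\left(a^2 + c^2\right)^{3/2}} + \frac{2\Delta x^{(c)} + 6c\left(h_1 - h_2\right)}{\left(a^2 + c^2\right)^{3/2}} \right] \right\} $$

$$ = \alpha_i q_j \left\{ \frac{2\Delta x^{(0)}}{a^3} + \sum_{c=1}^{a} \frac{4\Delta x^{(c)} - 12c\,h_2}{\left(a^2 + c^2\right)^{3/2}} \right\} = \alpha_i q_j \left\{ \frac{2\Delta x^{(0)}}{a^3} + 4 \sum_{c=1}^{a} \Delta x^{(c)}\, \frac{a^2 - 2c^2}{\left(a^2 + c^2\right)^{5/2}} \right\} \tag{27} $$

B. Derivation of Pxr(a)

Pxr(a) can be described as

$$ Pxr(a) = \sum_{r=-a}^{a} \left( fx_{i,j}^{(r,-a)} + fx_{i,j}^{(r,a)} \right) - \left( fx_{i,j}^{(-a,-a)} + fx_{i,j}^{(-a,a)} + fx_{i,j}^{(a,-a)} + fx_{i,j}^{(a,a)} \right) \tag{28} $$

In the same way as in the derivation of Pxc(a), the first term on the right side of (28) is

$$ \sum_{r=-a}^{a} \left( fx_{i,j}^{(r,-a)} + fx_{i,j}^{(r,a)} \right) \approx \alpha_i q_j \left\{ \frac{2\Delta x^{(0)}}{a^3} + 4 \sum_{r=1}^{a} \Delta x^{(r)}\, \frac{a^2 - 2r^2}{\left(a^2 + r^2\right)^{5/2}} \right\} \tag{29} $$

The second term on the right side of (28) is estimated as

$$ fx_{i,j}^{(-a,-a)} + fx_{i,j}^{(-a,a)} + fx_{i,j}^{(a,-a)} + fx_{i,j}^{(a,a)} = \alpha_i q_j \left[ \frac{\Delta x_{i,j}^{(-a,-a)}}{\left(r_{i,j}^{(-a,-a)}\right)^3} + \frac{\Delta x_{i,j}^{(-a,a)}}{\left(r_{i,j}^{(-a,a)}\right)^3} + \frac{\Delta x_{i,j}^{(a,-a)}}{\left(r_{i,j}^{(a,-a)}\right)^3} + \frac{\Delta x_{i,j}^{(a,a)}}{\left(r_{i,j}^{(a,a)}\right)^3} \right] $$

$$ \approx \frac{\alpha_i q_j}{2\sqrt{2}\,a^3} \left[ \frac{-a + \Delta x^{(a)}}{1 - \frac{3}{2}\,\frac{\Delta x^{(a)} + \Delta y^{(a)}}{a}} + \frac{a + \Delta x^{(a)}}{1 + \frac{3}{2}\,\frac{\Delta x^{(a)} - \Delta y^{(a)}}{a}} + \frac{-a + \Delta x^{(a)}}{1 - \frac{3}{2}\,\frac{\Delta x^{(a)} - \Delta y^{(a)}}{a}} + \frac{a + \Delta x^{(a)}}{1 + \frac{3}{2}\,\frac{\Delta x^{(a)} + \Delta y^{(a)}}{a}} \right] $$

$$ = \frac{\alpha_i q_j}{2\sqrt{2}\,a^3} \left[ \frac{-\Delta x^{(a)} - 3\Delta y^{(a)}}{1 - \frac{9}{4}\left( \frac{\Delta x^{(a)} + \Delta y^{(a)}}{a} \right)^2} + \frac{-\Delta x^{(a)} + 3\Delta y^{(a)}}{1 - \frac{9}{4}\left( \frac{\Delta x^{(a)} - \Delta y^{(a)}}{a} \right)^2} \right] \approx -\frac{\alpha_i q_j\, \Delta x^{(a)}}{\sqrt{2}\,a^3} \quad \text{as } a \ge 6 \tag{30} $$

Substituting (29) and (30) into (28), we find

$$ Pxr(a) \approx \alpha_i q_j \left\{ \frac{2\Delta x^{(0)}}{a^3} + \frac{\Delta x^{(a)}}{\sqrt{2}\,a^3} + 4 \sum_{r=1}^{a} \Delta x^{(r)}\, \frac{a^2 - 2r^2}{\left(a^2 + r^2\right)^{5/2}} \right\} \quad \text{as } a \ge 6 \tag{31} $$

C. Derivation of $\sum_{a=6}^{\infty} Px(a) / fx_{i,j}$

Substituting (27) and (31) into (8) yields

$$ Px(a) \approx \alpha_i q_j \left\{ \frac{4\Delta x^{(0)}}{a^3} + \frac{\Delta x^{(a)}}{\sqrt{2}\,a^3} + 8 \sum_{c=1}^{a} \Delta x^{(c)}\, \frac{a^2 - 2c^2}{\left(a^2 + c^2\right)^{5/2}} \right\} \quad \text{as } a \ge 6 \tag{32} $$

Obviously, the above series is absolutely convergent.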
Summing Px(a) from a = 6 to 1000 with Matlab gives

$$ \sum_{a=6}^{\infty} Px(a) \approx 0.0043\, \alpha_i q_j\, \Delta x^{(0)} \tag{33} $$

Dividing by $fx_{i,j}$ yields

$$ \frac{\sum_{a=6}^{\infty} Px(a)}{fx_{i,j}} \approx \frac{0.0043\, \alpha_i q_j\, \Delta x^{(0)}}{\alpha_i q_j\, \Delta x^{(0)} / r_{i,j}^3} = 0.0043\, r_{i,j}^3 \tag{34} $$

Since

$$ r_{i,j}^3 \le \left(0.5^2 + 0.5^2\right)^{3/2} = 0.3536, $$

then

$$ \frac{\sum_{a=6}^{\infty} Px(a)}{fx_{i,j}} \le 0.0015 \tag{35} $$

Therefore, it can be concluded that neglecting $\sum_{a=6}^{\infty} Px(a)$ gives a relative truncation error of less than 0.15% compared to $fx_{i,j}$.

ACKNOWLEDGMENT

This work was supported by the National Science Council, Republic of China, under contract no. NSC 89-2215-E-218-013. I am pleased to thank Professor Jung-Hua Chou for his valuable comments and suggestions concerning this paper.

REFERENCES

1. M. Pecht, Integrated Circuit, Hybrid, and Multichip Module Package Design Guidelines – A Focus on Reliability, New York: Wiley, 1994.
2. E. Suhir, "Thermal stress failures in microelectronic components – review and extension," in Advances in Thermal Modeling of Electronic Components and System, A. Bar-Cohen and A. D. Kraus, ed., New York: Hemisphere Publishing Corporation, 1988, pp. 337-412.
3. N. Sherwani, Algorithms for VLSI Physical Design Automation, 3rd ed. Boston: Kluwer Academic Publishing, 1999.
4. G. Wippler, M. Wiesal, and D. Mlynski, "A combined force and cut algorithm for hierarchical VLSI layout," in Proc. 22th Design Automation Conf., 1982, pp. 671-677.
5. W. Maly and A. P. Piotrowski, "Heat exchange optimization technique for high-power hybrid IC's," IEEE Trans. Comp, Hybrids, Manufact. Technol., vol. 2, no. 2, pp. 226-231, 1979.
6. A. H. Mayer, "Computer-aided thermal design of avionics for optimum reliability and minimum life cycle cost," Technical Report AFFDL-TR-78-48, Air Force Flight Dynamics Laboratory, 1978.
7. M. Pecht and J. Naft, "Thermal reliability management in PCB design," in Proc. 1987 Ann. Reliability and Maintainability Symp., 1987, pp. 27-29.
8. D. Dancer and M. Pecht, "Reliability optimization technique for convectively electronics," IEEE Trans.
on Reliability, vol. 38, no. 2, pp. 199-205, 1989.
9. M. Osterman and M. Pecht, "Component placement for reliability on conductively cooled printed wiring boards," Trans. of the ASME Journal of Electronic Packaging, vol. 111, pp. 149-156, 1989.
10. R. Eliasi, T. Elperin, and A. Bar-Cohen, "Monte Carlo thermal optimization of populated printed circuit board," IEEE Trans. Comp., Hybrids, Manufact. Technol., vol. 13, no. 4, pp. 953-960, 1990.
11. M. D. Osterman, "A physics of failure approach to component placement," Trans. of the ASME Journal of Electronic Packaging, vol. 114, pp. 305-309, 1992.
12. C. N. Chu and D. F. Wong, "A matrix synthesis approach to thermal placement," IEEE Trans. Computer-Aided Design, vol. 17, no. 11, pp. 1166-1174, 1998.
13. C.-H. Tsai and S.-M. Kang, "Cell-level placement for improving substrate thermal distribution," IEEE Trans. Computer-Aided Design, vol. 19, no. 2, pp. 253-266, 2000.
14. K. Y. Chao and D. F. Wong, "Thermal placement for high-performance multi-chip modules," International Conf. on Computer Design, Austin, Texas, 1995, pp. 218-223.
15. M. C. Tang and J. D. Carothers, "Consideration of thermal constraints during multichip module placement," Electronics Letters, vol. 33, no. 12, pp. 1043-1045, 1997.
16. Y.-J. Huang and S.-L. Fu, "Thermal placement design for MCM applications," Trans. of the ASME Journal of Electronic Packaging, vol. 122, pp. 115-120, 2000.
17. C. Beebe, J. D. Carothers, and A. Ortega, "MCM placement using a realistic thermal model," Proceedings of the Tenth Great Lakes Symposium on VLSI, pp. 189-192, 2000.
18. J. Lee, J. H. Chou, and S. L. Fu, "Reliability and wireability optimizations for module placement on convectively cooled printed wiring board," INTEGRATION, the VLSI Journal, vol. 18, no. 2&3, pp. 173-186, 1995.
19. M. D. Osterman and M. Pecht, "Placement for reliability and routability of convectively cooled PWB's," IEEE Trans. on Computer-Aided Design, vol. 9, no. 7, pp. 734-744, 1990.
20. N. V. Queipo, J. A. C. Humphrey, and A. Ortega, "Multiobjective optimal placement of convectively cooled electronic components on printed wiring boards," IEEE Trans. Comp., Packag., Manufact. Technol., A, vol. 21, no. 1, pp. 142-153, 1998.
21. N. V. Queipo and G. F. Gil, "Multiobjective optimal placement of convectively and conductively cooled electronic components on printed wiring boards," Trans. of the ASME Journal of Electronic Packaging, vol. 122, pp. 152-159, 2000.
22. J. Lee and J. H. Chou, "Hierarchical placement for power hybrid circuits under reliability and wireability constraints," IEEE Trans. on Reliability, vol. 45, no. 2, pp. 200-207, 1996.
23. N. Quinn and M. Breuer, "A forced directed component placement procedure for printed circuit boards," IEEE Trans. Circuits Syst., vol. 26, no. 6, pp. 377-388, 1979.
24. F. Mo, A. Tabbara, and R. K. Brayton, "A force-directed macro-cell placer," in Intl. Conf. on Computer-Aided Design, 2000, pp. 177-180.
25. R. M. Karp, "Reducibility among combinatorial problems," in Complexity of Computer Computations, R. E. Miller and J. W. Thatcher, Eds., New York: Plenum, 1972.
26. D. J. Dean, Thermal Design of Electronic Circuit Boards and Packages, Scotland: Electrochemical Publications Ltd., 1985.
27. A. L. Palisoc and C. C. Lee, "Exact thermal representation of multilayer rectangular structures by infinite plate structures using the method of images," Journal Appl. Phys., vol. 64, no. 12, pp. 6851-6857, 1988.
28. Y.-K. Cheng and S.-M. Kang, "A temperature-aware simulation environment for reliable ULSI chip design," IEEE Trans. Computer-Aided Design, vol. 19, no. 10, pp. 1211-1220, 2000.
29. G. N. Ellison, Thermal Computations for Electronic Equipment, New York: Van Nostrand Reinhold Co., 1983.
30. G. N. Ellison, "Thermal analysis of circuit boards and microelectronic components using an analytical solution to the heat conduction equation," Twelfth IEEE SEMI-THERM Symposium, 1996, pp. 144-150.
31. T. D. Yuan, "Thermal management in PowerPC microprocessor multichip modules applications," Thirteenth IEEE SEMI-THERM Symposium, 1997, pp. 247-256.