DesignCon 2020

Power Coupling Extraction Method Comparison

Sherman Chen, Kandou Bus, shan@kandou.com
John Phillips, Cadence Design Systems Inc., phillips@cadence.com
Long Yang, Cisco Systems Inc., longyan@cisco.com
Gert Havermann, HARTING Stiftung & Co. KG, gert.havermann@HARTING.com

Abstract

As package density has grown drastically over the past decade, cross-power-rail coupling has become steadily worse and is no longer easy to avoid. To handle this increasingly serious problem, accurate modeling of the cross coupling, together with a correct simulation methodology, is indispensable for obtaining trustworthy simulation results. This paper presents a full-scale comparison of the models extracted using a Hybrid solver and a 3D solver, with the focus on coupling-effect modeling. The concept of the cross-rail transfer function is introduced first. Then the Hybrid and 3D models of a 20Gbps package are compared. The coupling effects in the frequency domain and the time domain are plotted and analyzed. At the end, a set of design guidelines is summarized.

Author(s) Biography

Sherman Shan Chen is a principal SI/PI engineer with Kandou Bus. Sherman has many years of experience and worked at EMC, Intel, and Cisco as an SI or PI engineer before joining Kandou. He has a wide range of knowledge in Analog/Power/SI/PI design for various types of systems including telecommunication, storage, and computation, across board, package, and chip levels. Sherman holds an MSEE degree from Zhejiang University, China. His current research interests are SI/PI co-simulation, macromodeling, and high-speed bus encoding schemes.

John Phillips is a principal application engineer with Cadence Design Systems. Prior to joining Cadence, John worked at many different companies covering such diverse markets as high-end compute platforms and mil-aero. During his career he has acquired a broad knowledge of SI, PI, and EMC at chip, board, and system level. John holds an MSc degree from Bolton University, UK. His current interests are SI/PI co-simulation and modelling for both serial and parallel interfaces.

Long Yang received his B.S. and M.S. degrees from Xidian University, Xi'an, China, and the University of Science and Technology of China in 2001 and 2004, respectively. He started his career at Intel in 2004 as an SI engineer. Later he worked at Ansys, Amphenol-TCS, and Intel. Currently he is an SI/PI lead at Cisco, where he is responsible for advanced high-speed switch system design and ASIC package design. His interests include signal integrity, power integrity, high-speed interconnect modeling, and SerDes modeling and characterization.

Gert Havermann is a Signal Integrity Engineer and has over 17 years of professional experience within the HARTING Technology Group. He received his Diplom-Ingenieur (FH) degree in Electrical Communications Engineering and Microwave Technology from the University of Applied Sciences Hannover in 2001. He has significant experience with field-solver-based modeling of RF and high-speed interconnects, digital system simulation, and development of test fixtures and reference designs for compliance tests. He has also designed several antennas and sensors for UHF RFID and holds patents in SI, antennas, and connector design.

Introduction

As one of various solutions for the further extension of Moore's law, denser packaging technologies such as 3D packaging, WLCSP, FOP, etc., have been introduced to the industry, aiming to deliver higher throughput within the same or a smaller package size [1]. This trend has shown good results and has proven able to continue Moore's law in a scale-out rather than a scale-up way[2][3]. Consequently, along with the sharply increasing circuit density, the feature sizes (bump size, trace width/spacing, etc.) on package substrates have been reduced from 300+um to 100um, 30um, and now 10um[10].
Inevitably, the cross coupling between power rails on the package has also increased dramatically. The spacing between power rails is no longer as generous as it was five years ago, when inter-rail coupling could normally be avoided or mitigated without much engineering effort. In today's highly compacted packages, multiple power rails are forced to be routed in very close proximity, and as a result inter-rail coupling can no longer be easily avoided. PI engineers must now understand how to characterize and mitigate the increasing cross-rail coupling in order to ensure that the noise levels seen by the on-die circuits are still within spec after accounting for all of the noise coupled from other power rails.

Following a simulation doctrine similar to that used for SI, the first and foremost task in accurately capturing the cross-rail coupling effect is to obtain a sufficiently accurate PDN model. Unfortunately, the characterization of power noise cross coupling is not straightforward, especially for highly complicated vertical structures. Frequently we see extracted coupling models that are far from reality, causing overdesign, compromised performance, or eventually even chip failure.

In this area there is literature on the theoretical basis for the analysis methodologies[4][5], and certain preliminary studies on the coupling effect have been performed[6]. As for the extractor categories employed for coupling-effect extraction, there are mainly two options: the 3D solver and the Hybrid solver. A traditional 3D full-wave solver provides a highly accurate model, but at the expense of a huge computational cost that is often unrealistic for most companies. Meanwhile, the Hybrid solver has been used for characterizing PDNs for years and has provided reliable results that correlate well with measurement.
Now that the emphasis shifts to the effects of inter-power-rail coupling, can a Hybrid solver still deliver results as reliable as those of a 3D solver?

In this paper, we compare the PDN models extracted with a Hybrid solver and a 3D full-wave solver, respectively. The sanity of the extracted models is examined, and the deviations between the two models in terms of Z(f) (the frequency domain impedance curve) and the time domain voltage noise are compared and analyzed. The relationship between the Z(f) of a combined port and that of separate multiple ports is derived. The correspondence between frequency domain and time domain results is evaluated. Finally, a set of design guidelines is presented.

Concept of cross-rail modeling

The cross-power-rail coupling path can exist in any part of the PDN (Power Delivery Network): the board, the package, or the die. Figure 1 shows two simplified power rail PDNs and the potential coupling paths between them. Note that coupling paths also exist between the grounds of the two PDNs; these are not shown in the diagram in order not to clutter the figure.

Figure 1 Cross-power-rail coupling paths

In subsequent sections, the fundamental concepts of cross-rail modeling are introduced first, and then the modeling methodology is demonstrated through a real design case.

1) The concept of cross-rail modeling

Figure 2 shows a simplified conceptual diagram of cross-rail coupling. In it, v1 and v2 are the VRMs (Voltage Regulator Modules) of the two power rails. Zv1_pdn and Zv2_pdn represent the PDN impedances of the two power rails, respectively. Zv1_v2 is the cross-coupling impedance. Note that although Zv1_v2 is shown as a single path between the two PDNs in this simplified diagram, in reality coupling exists between every pair of paths of the two PDNs.
Figure 2 Conceptual cross-rail coupling path

The cross-rail coupling effect is also called the transfer function between the two power rails which, in the case of Figure 2, is represented by S21 if we regard the setup as a 2-port network. Below is an illustration of how to calculate S21 under some hypothetical assumptions.

Assuming that Zv1_pdn = Zv2_pdn = 1 Ohm and that Zv1_v2 = 500 Ohm, and taking the two PDN impedances as the port terminations with Zv1_v2 as the series element between them, we have:

$$ S_{21} = \frac{2 Z_{v2\_pdn}}{Z_{v1\_pdn} + Z_{v1\_v2} + Z_{v2\_pdn}} = \frac{2}{502} \approx 0.004 $$

Therefore, the coupling coefficient between the two power rails is 0.4%.

Note: In calculating the s-parameter of the cross-rail coupling network, the source is the aggressor power rail itself rather than an external source.

For more than two power rails, the same concept applies, except that the cross-coupling coefficient would be the linear sum of the coefficients of all aggressor rails, provided that the noise sources are statistically uncorrelated. In addition, for improved accuracy, each port of the PDNs, be it on the victim PDN or on the aggressor PDNs, should be terminated in the same way as in the application scenario.

Usually we do not use s-parameters directly for PDN analysis; rather, Z(f) is derived from the PDN's s-parameters for frequency domain analysis, or a rational function model is derived for time domain analysis. It is therefore necessary to develop the corresponding analysis methods for the transfer function in both the frequency domain and the time domain. Figure 3 shows the same network as Figure 2, but in T-topology, to facilitate representing the transfer function using z-parameters.

Figure 3 Transfer function represented by the Z-parameter

From the definition of the z-parameters of a two-port network we have:

$$ Z_{11} = \left.\frac{V_1}{I_1}\right|_{I_2=0}, \qquad Z_{21} = \left.\frac{V_2}{I_1}\right|_{I_2=0} $$

Note: Z12 and Z21 are called the Transfer Impedance. For a passive network, they are always identical.
Manipulating the formulas (with I2 = 0) we obtain:

$$ \frac{V_2}{V_1} = \frac{Z_{21} I_1}{Z_{11} I_1} = \frac{Z_{21}}{Z_{11}} \qquad \text{(Equation 1)} $$

Equation 1 will be used in the frequency domain cross-coupling analysis in the following sections. In time domain analysis, we can directly use the expression V2/V1 as the transfer function without deriving it from the z-parameters.

One aspect that deserves attention is that, for a purely passive network, reciprocity is always guaranteed. That is to say, Znm is always equal to Zmn. However, this does not mean that the coupled time domain noise Vnm is unconditionally equal to Vmn, because the noise levels of the aggressors may not be the same. We will demonstrate this fact later on.

Design project description

Starting from this section, we use a real project to demonstrate the cross-coupling modeling methodology. The package is a 4mm x 4mm CSP BGA carrying eight 26.67Gbps high-speed ports. Figure 4 shows the BGA side view of the package.

Figure 4 BGA side view of the package

There are two power rails on the package, with the following specifications:
VDDA: 0.9V; 500mA; 21 bumps
VDDS: 3.3V; 100mA; 2 bumps

Key settings for model extraction

The PDN model containing the two power rails was extracted using a Hybrid solver and a 3D full-wave solver, respectively. Some setup options have a direct impact on the accuracy of the extracted models.

1) Selection of the reference impedance

A long-standing controversial topic in PI simulation is which reference impedance is best for PDN model extraction. Some authors say that since the PDN impedance is very low over the majority of the frequency range, Zref should be set to a value close to the PDN impedance[3], while others argue that using 50Ohm sometimes delivers the best model quality despite the fact that Zref is then far from Zpdn. Here we examine this issue.
Equation 2 is the s-parameter reflection formula, in which Zpdn represents the impedance of the PDN and Zref is the reference impedance used in extracting the PDN model.

$$ s_{11} = \frac{Z_{pdn} - Z_{ref}}{Z_{pdn} + Z_{ref}} \qquad \text{(Equation 2)} $$

Based on Equation 2, the impact of ΔZpdn (the variation of Zpdn) on s11 and s21, denoted by Δs11 and Δs21, can be calculated. The results are listed in Table 1.

Table 1 Impact of ΔZpdn on s11 and s21 (Blue: inputs; Black: outputs)

Zpdn: 1Ohm; Zref: 50Ohm; s11: -0.96078; s21: 0.27729
ΔZpdn (%)   Δs11 (%)    Δs21 (%)    Relative computational error (%)
1           0.0004      -2.64003    1.355E-16
5           0.001999    -2.63995    2.7121E-17
10          0.003994    -2.63985    1.3574E-17

Zpdn: 1Ohm; Zref: 10Ohm; s11: -0.81818; s21: 0.57496
ΔZpdn (%)   Δs11 (%)    Δs21 (%)    Relative computational error (%)
1           -0.17382    -0.75555    -3.119E-19
5           -0.17194    -0.75551    -3.153E-19
10          -0.1696     -0.75546    -3.196E-19

Zpdn: 1Ohm; Zref: 0.01Ohm; s11: 0.980198; s21: 0.19802
ΔZpdn (%)   Δs11 (%)    Δs21 (%)    Relative computational error (%)
1           1.979802    -4.09732    2.7382E-20
5           1.978235    -4.09721    2.7403E-20
10          1.976279    -4.09707    2.743E-20

Note: Since s21 represents the cross-rail coupling and is used for calculating inter-rail coupled noise, it is presented in the table along with s11.

With Zpdn being 1Ohm, the following observations can be made:
1. When the Zref used is 50Ohm, a change of ΔZpdn from 1% to 10% results in a change of s11 from 0.0004% to 0.004%;
2. When the Zref used is 10Ohm, a change of ΔZpdn from 1% to 10% results in a nearly constant deviation of s11 of about -0.17%;
3. When the Zref used is 0.01Ohm, a change of ΔZpdn from 1% to 10% results in a nearly constant deviation of s11 of about 1.98%;
4. Assuming the computational tolerance of the simulation tool is 1/(2^64 - 1), which is about 5.42E-20, the relative computational errors caused by the variation of Zpdn are on the order of 1e-16 to 1e-17, 1e-19, and 1e-20 for Zref of 50, 10, and 0.01Ohm, respectively.
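As a quick cross-check of the baseline entries in Table 1, the following minimal Python sketch evaluates Equation 2 for the three Zref values. Two points are assumptions of this sketch rather than statements from the paper: the s21 column is reproduced here with the lossless relation s21 = sqrt(1 - s11^2), which happens to match the tabulated baseline values, and the relative computational error is taken as the tool resolution 1/(2^64 - 1) divided by the tabulated Δs11. The function names are illustrative only.

```python
import math


def s11(z_pdn, z_ref):
    """Reflection coefficient of Equation 2."""
    return (z_pdn - z_ref) / (z_pdn + z_ref)


def s21_lossless(z_pdn, z_ref):
    """Assumed lossless relation |s21|^2 = 1 - |s11|^2 (matches Table 1 baselines)."""
    return math.sqrt(1.0 - s11(z_pdn, z_ref) ** 2)


# Assumed resolution of a 64-bit tool, as quoted in observation 4.
TOOL_RESOLUTION = 1.0 / (2 ** 64 - 1)


def relative_computational_error(delta_s11):
    """Tool resolution divided by the resulting change of s11."""
    return TOOL_RESOLUTION / delta_s11


if __name__ == "__main__":
    z_pdn = 1.0  # Ohm, as in Table 1
    for z_ref in (50.0, 10.0, 0.01):
        print(f"Zref={z_ref:>5} Ohm  s11={s11(z_pdn, z_ref):+.6f}  "
              f"s21={s21_lossless(z_pdn, z_ref):.5f}")
    # Last column of the first Table 1 block, from its tabulated delta-s11 entry.
    print(f"{relative_computational_error(0.0004):.4e}")
```

The sketch only reproduces the baseline s11/s21 values and the error ratio; it does not attempt to regenerate the Δs11 and Δs21 columns, whose exact definition is not spelled out in the text.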
The term Relative Computational Error used here is defined as the ratio of the computational resolution of the simulation tool to the resulting delta of s11 due to the change of Zpdn. It reflects how much computational error is produced by different Zref values for a fixed percentage change of Zpdn.

Based on the above analysis, the following conclusions can be reached:
1. Using a Zref close to Zpdn produces a wider (i.e., more effective) variation of the resulting s11 for the same percentage change of Zpdn;
2. Despite the seeming advantage stated in 1, the difference in the ultimate relative computational error is not significant (1e-16 vs 1e-20 for 64-bit software) when varying Zref between 50Ohm and 0.01Ohm.
3. In summary, which Zref value to choose depends more on the particular case under consideration. We suggest first trying a Zref that is close to Zpdn in the DC&LF frequency range. If the resulting s-parameter appears problematic during the sanity check, try other values until a good s-parameter is obtained.

2) Bandwidth selection

The next key factor to consider is what bandwidth should be applied in extracting the PDN model. Some literature says that for a board + package PDN, the frequency range from DC to 200MHz is sufficient for most designs. Others hold the contrary opinion that, due to the high-speed operation of the chip, the PDN model extraction frequency range should be extended up to 1GHz or even 2GHz. We try to answer this question in the following paragraphs.

A typical PDN's impedance curve can be divided into three sections: board, package, and die, as shown in Figure 5. The main reason a PDN impedance profile can be divided into different sections is the LC series/parallel structures present within a typical PDN, as shown in Figure 1 (no Cpkg in this PDN).
The geometric sizes of the board, package, and silicon-level routing normally play a role in forming the resonance structures, which consist of different values of inductance and capacitance in each section. Under the stimulus of wideband signals, the LC series/parallel network produces multiple resonances/anti-resonances at different frequency points, and these resonances/anti-resonances divide the PDN into different sections.

Figure 5 Typical PDN Z(f)

Table 2 gives the frequency range of each section of the PDN, based on our practical experience. Note that the frequency ranges of two adjacent sections overlap somewhat. This is because the upper/lower bounds of each range are determined by the characteristics and locations of the decoupling capacitors employed at the interfaces between board and package, and between package and die. For example, when cavity capacitors, which are installed inside the BGA cavity (available only on larger chips such as CPUs), are heavily applied, they provide good isolation between the board and package PDNs, hence the starting frequency of the package section is limited by the resonance frequency of the cavity capacitors. On the other hand, if cavity capacitors are scarcely applied, the noise on the board may not be sufficiently filtered and a portion of it will enter the package, so a noise path exists between package and board. In this case, since the board-level noise partly relies on the filtering effect of the package PDN, the package PDN section extends its lower bound down into the board frequency range, reaching 10MHz or even lower.

The same concept also applies to the package capacitors close to the interface between package and die, i.e., the C4 bumps and the UBM. When these capacitors are not sufficiently supplied, the frequency range of the die PDN also extends into the package frequency range, to some frequency below 1GHz.
In the case where no decoupling capacitors exist on the package at all, the board capacitors and die capacitance have to cover the conventional decoupling range of the package capacitors. For instance, in this case the board PDN range can be DC – 30MHz, with the die PDN range being 30MHz – 1GHz.

The upper limit of the whole PDN frequency range is determined by the characteristics of the on-die capacitance. For example, when die capacitances (for example transistor, MIM, and MoM capacitances) are heavily provisioned for the power rail, the lower bound of the die-level PDN frequency range can be as low as 100MHz and the range can extend beyond 1GHz.

Table 2 The frequency range of PDN sections

PDN section   Frequency range          Note
Board         DC – 20MHz               The upper bound of the range depends on the decoupling capacitor performance and location.
Package       10MHz – 1GHz             The lower bound of the range depends on the on-board decoupling capacitor performance and location. The upper bound of the range depends on the on-package decoupling capacitor performance and location.
Die           100MHz – 2GHz and beyond The lower bound of the range depends on the on-package decoupling capacitor performance and location. The upper bound of the range depends on the on-die decoupling capacitor performance and location.

In addition, the sectional frequency ranges also depend on the parasitic inductance of the board and package, since a larger inductance creates a sharper anti-resonance rising slope, which needs more capacitance to suppress.

Summing up the above analysis, the bounding frequencies of each section of the PDN depend on the properties of the individual case at hand. While a rough division as shown in Table 2 is generally applicable, the specific bounds for a particular project have to be determined through case-by-case analysis.

With the above setups, the Hybrid model took about 15 minutes to extract, while the traditional 3D-EM model took 17 hours on a 40-core, dual Xeon E5, 512GB machine.
The time benefit here is striking. The remaining questions are: Just how good is the Hybrid model? Under what conditions can we use a Hybrid instead of a 3D extraction? In the following sections we try to answer these questions.

S-parameter quality check

As they say: Life is full of bumps. After a lengthy sequence of operations, the s-parameters of the PDN are finally obtained. Yet the model still has to be brought before a judge who decides whether or not it is usable: the sanity check. There is already plenty of literature [4][5] discussing the s-parameter sanity check, so we will not reiterate the significance of this step here.

The s-parameter quality check result of the 3D model is presented in Table 3. As can be seen, all checked items showed good results except for the phase jump. Although the checking tool rated the overall evaluation of the s-parameter as Poor solely due to the large phase jump, this does not necessarily mean that the s-parameter is unusable. Compared to other critical criteria such as passivity and causality, a large phase jump does not always cause a problem. If the s-parameter actually has good continuity between adjacent data points, it should still be a good model despite the large phase jump.
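To make the passivity and reciprocity criteria mentioned above concrete, here is a minimal Python sketch for a single-frequency 2x2 S-matrix. It assumes that the passivity figure reported by a checking tool is the largest singular value of S (passive when it does not exceed 1), which is the usual definition but is not stated in the paper. The closed-form 2x2 expression and the function names are illustrative; the 74-port models discussed here would need a general SVD from a numerical library.

```python
import math


def max_singular_value_2x2(s):
    """Largest singular value of a 2x2 complex matrix [[s11, s12], [s21, s22]]."""
    (a, b), (c, d) = s
    # Entries of the Hermitian matrix G = S^H * S.
    g11 = abs(a) ** 2 + abs(c) ** 2
    g22 = abs(b) ** 2 + abs(d) ** 2
    g12 = a.conjugate() * b + c.conjugate() * d
    # Largest eigenvalue of a 2x2 Hermitian matrix in closed form.
    mean = 0.5 * (g11 + g22)
    radius = math.sqrt((0.5 * (g11 - g22)) ** 2 + abs(g12) ** 2)
    return math.sqrt(mean + radius)


def is_passive(s, tol=1e-9):
    """Passive at this frequency point if no singular value exceeds 1."""
    return max_singular_value_2x2(s) <= 1.0 + tol


def is_reciprocal(s, tol=1e-9):
    """Reciprocal if S12 equals S21 within tolerance."""
    return abs(s[0][1] - s[1][0]) <= tol


if __name__ == "__main__":
    # Hypothetical sample matrices, not data from the paper.
    good = [[0.5 + 0.1j, 0.2j], [0.2j, 0.4 - 0.1j]]
    bad = [[0.6, 0.9], [0.9, 0.6]]
    print(is_passive(good), is_reciprocal(good))
    print(max_singular_value_2x2(bad), is_passive(bad))
```

A check of this kind is run at every frequency point; the percentage of non-passive points reported in a sanity table is then the fraction of points for which the test fails.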
Table 3 S-parameter sanity check result of the 3D model

No.  Parameter                                                    Result                                      Comment
1    Matrix dimension                                             74 x 74
2    Passivity                                                    Maximum: 1.000021255154033                  Medium passivity violation
                                                                  Number of non-passive points: 28 (11.570%)
3    Causality                                                    No non-causal points                        No violation
4    Reciprocity                                                  No non-reciprocal points                    No violation
5    Lowest frequency                                             0 Hz                                        Low enough
6    Maximum amplitude jump between adjacent sampling points      0.0310                                      Small jump
7    Maximum phase jump between adjacent sampling points          177.760°                                    Large jump
8    Maximum range of phase                                       -643.268° from S(50,62)                     Small range
9    Average number of amplitude extrema per 360° phase change    2                                           Good
10   Overall evaluation                                           Poor                                        Poor

If the s-parameter passes the sanity check, we may proceed to generate the rational model of the s-parameter. The rational polynomial model (also called a macromodel (mm)) represents the original s-parameter in such a way that a SPICE transient simulation can be run without the convolution that would be required if the s-parameters were used directly in a time domain simulation. A natural concern is the fidelity of the generated mm: if the mm's response deviates significantly from the original s-parameter, the resulting time domain result will deviate from the real case as well. Figure 6 shows the s11 comparison between the original s-parameter and the derived mm. The errors between the two are also listed at the bottom of Figure 6. Given the max error of 0.058%, the mm is believed to be accurate enough to faithfully reflect the behavior of its source s-parameter.

Figure 6 Error of the fitted macromodel – 3D

Similarly, the s-parameter sanity check and macromodel fitting of the Hybrid PDN model were carried out, as shown in Table 4 and Figure 7. The results show that the s-parameter had good sanity and that the fitted macromodel had a very small error relative to the original s-parameter.
Table 4 S-parameter sanity check result of the Hybrid model

No.  Parameter                                                    Result                                      Comment
1    Matrix dimension                                             74 x 74
2    Passivity                                                    Maximum: 1.000000000000002                  Medium passivity violation
                                                                  Number of non-passive points: 5 (4.132%)
3    Causality                                                    No non-causal points                        No violation
4    Reciprocity                                                  No non-reciprocal points                    No violation
5    Lowest frequency                                             1.000MHz                                    Low enough
6    Maximum amplitude jump between adjacent sampling points      0.0342                                      Small jump
7    Maximum phase jump between adjacent sampling points          176.252°                                    Large jump
8    Maximum range of phase                                       -710.519° from S(12,48)                     Small range
9    Average number of amplitude extrema per 360° phase change    7                                           Acceptable
10   Overall evaluation                                           Poor                                        Poor

Figure 7 Error of the fitted macromodel – Hybrid

Frequency domain comparison

Now that the extracted s-parameters have been verified to be of good quality, we may compare the results of the 3D and Hybrid models in both the frequency and the time domain. This section addresses the comparison of Z(f), which represents the impedance profile in the frequency domain. We mainly look at the following FoMs (Figures of Merit) of the Z(f) curves of the two models:
1. DC&LF values;
2. Resonance peaks & frequencies;
3. Overall Z(f)

A PDN typically comprises three parts: board, package, and die, as illustrated in Figure 8.

Figure 8 Diagram of PDN

The goal of plotting the Z(f) of the PDN is to show its frequency characteristics, from which we can observe the following:
1. The amplitude and frequency of peaks in the Z(f) curve. This data helps to identify the resonance frequencies of the PDN. By multiplying Zpdn with the spectrum of icct (the die load current profile) and then performing an inverse Fourier transform (ifft), the time domain noise profile can be obtained.
2. The DC and LF value of Zpdn. This value helps us understand the DC resistance of the PDN from the VRM to the die, which determines the DC level of the time domain power supply waveform.
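The "multiply Zpdn by the icct spectrum, then inverse-transform" step in item 1 can be sketched as follows. This is a minimal, standard-library-only Python illustration using a naive O(N^2) DFT; a production flow would use an FFT library. The series R-L impedance, its element values, and the sampling parameters are hypothetical placeholders rather than values from the paper, and the impedance function is assumed to be conjugate-symmetric in frequency so that the resulting waveform is real.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def idft(spec):
    """Naive inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[m] * cmath.exp(2j * math.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]


def bin_frequency(m, n, fs):
    """Signed frequency (Hz) of DFT bin m for n samples at sample rate fs."""
    return (m if m <= n // 2 else m - n) * fs / n


def pdn_noise(i_load, fs, z_of_f):
    """v(t) = IDFT( Z(f) * DFT(i(t)) ); z_of_f must satisfy Z(-f) = conj(Z(f))."""
    n = len(i_load)
    spectrum = dft(i_load)
    v_spec = [z_of_f(bin_frequency(m, n, fs)) * spectrum[m] for m in range(n)]
    # The imaginary part is numerical residue when Z is conjugate-symmetric.
    return [v.real for v in idft(v_spec)]


def z_series_rl(f, r=0.025, l=50e-12):
    """Hypothetical PDN impedance: resistance r (Ohm) in series with inductance l (H)."""
    return complex(r, 2 * math.pi * f * l)


if __name__ == "__main__":
    n, fs = 64, 6.4e9  # hypothetical sampling setup
    # 0.1 A load-current tone at bin 4 (400 MHz).
    icct = [0.1 * math.cos(2 * math.pi * 4 * k / n) for k in range(n)]
    noise = pdn_noise(icct, fs, z_series_rl)
    print(max(abs(v) for v in noise))
```

With a measured or extracted Z(f), `z_of_f` would interpolate the tabulated impedance instead of evaluating a closed-form expression.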
The Z(f) curves of VDDA and VDDS obtained with the 3D model and the Hybrid model are plotted in Figure 9, with the full view shown in (a) and the individual peak and DC&LF sections shown in (b) through (d).

Figure 9 Z(f) comparison, panels (a) through (d) (Blue: 3D; Pink: Hybrid)

Table 5 compares the data captured in (b) – (d), which are the main deviating points between the two Z(f) curves. The following observations can be made:
1. The biggest deviation occurs at DC&LF, where an impedance difference of 8.6% is observed between the VDDS Z(f) of the two models.
2. The second largest deviation also occurs in the VDDS Z(f), at around 310kHz, with a delta of 6.08%.
3. In general, VDDA shows very close correlation in Z(f) between the two models, while VDDS shows some discrepancies.

Table 5 The main deviating points between 3D and Hybrid Z(f)s

Deviating point   3D model (Ohm)     Hybrid model (Ohm)   ΔZ(f) (Hybrid – 3D)/3D
1 (VDDS)          0.378@310.73kHz    0.401@309.03kHz      6.08%
2 (VDDA)          4.27@41.69kHz      4.267@41.69kHz       0.07%
3 (VDDS)          3.27@758.6MHz      3.285@758.6MHz       0.46%
4 (VDDS)          0.0245@DC          0.0266@DC            8.6%

An interesting question is: since the Z(f) of VDDS shows some deviations between the two models, which one is more accurate, i.e., which model is closer to the real value? To answer this question, a DC simulation was performed to obtain accurate DC resistance values for both the VDDA and VDDS power rails. The results are shown in Figure 10.

Figure 10 DCR of VDDA, VDDS

First, we examine VDDS. Comparing the DCR obtained by DC simulation, which is 12.3mOhm, to the values obtained from the 3D and Hybrid models, which are 24.5mOhm and 26.6mOhm, respectively, we find that the Z(f) values are nearly twice the DCR. To find the root cause of this apparent discrepancy, it is necessary to look a bit deeper into how Z(f) is determined by the simulation tools.
A simplified single-power-rail Z(f) topology is illustrated in Figure 11, with the following setup. The power rail has two paths connecting to its loads at port3 and port4. One path consists of z1 and z3, and the other consists of z2 and z4. In the middle of the two paths there is a bridging path that connects them through zb. Port3 and port4 each have one AC current source attached as the frequency sweep stimulus (icct), with each source's current being half an ampere.

Figure 11 Z(f) generation – Separate icct

Table 6 shows the calculated voltage magnitudes at port3 and port4, which are equal to Z(f) in units of Ohm, for different values of z2, z4, and zb. Here we introduce a new concept called Zpath, which is defined as the impedance from all load-side ports lumped together to the VRM-side ports lumped together. In the case of Figure 11, Zpath is the impedance of the Z(f) of port3 and port4 connected in parallel. Note that Zpath is only used to provide a rough evaluation of the overall PDN impedance and should not be used as a representation of Zpdn, because Zpdn is a port-wise concept rather than a lumped concept.

Table 6 Separate port3, port4

z1   z2   z3   z4   zb   I_port3 (A)   I_port4 (A)   V(port3) (Vmag)   V(port4) (Vmag)   Z(f)_port3//Z(f)_port4 (Ohm)   Zpath (Ohm)
1    1    1    1    0    0.5           0.5           1                 1                 0.5                            1
1    1    1    10   0    0.5           0.5           1                 5.50              1//5.5 = 0.85                  0.5 + 0.5//10 = 0.976
1    10   1    10   0    0.5           0.5           1.41              5.91              1.41//5.91 = 1.138             0.5//10 + 0.5//10 = 0.952
1    1    1    10   ∞    0.5           0.5           1                 5.5               1//5.5 = 0.85                  1//11 = 0.917

Notes:
1. The AC magnitude represents the impedance seen from each port toward the VRM.

To further clarify the matter, a second simulation case is introduced, as shown in Figure 12. In this case port3 and port4 are combined into one port, and the stimulus AC current is consequently changed to 1A.
Figure 12 Z(f) generation – Combined icct

Similarly, the calculated Z(f) with the combined icct is shown in Table 7:

Table 7 Combined port3, port4

z1   z2   z3   z4   zb   I_port4 (A)   V(port4) (Vmag)   Zpath (Ohm)
1    1    1    10   0    1             1.41              1.41
1    10   1    10   0    1             1.8               1.8

From the results listed in Table 6 and Table 7 we make the following observations:
1. When the individual impedances from port3 and port4 to the VRM are close to each other, the effective overall Zpath is approximately the individual impedance of each port, rather than these impedances lumped together (i.e., connected in parallel!).
2. When the individual impedances from port3 and port4 to the VRM differ considerably from each other, the effective overall Zpath is relatively close to the individual impedances lumped together.

These observations show that there is no Yes or No answer to the controversial question: Does the individual impedance seen at each port, or all of the impedances lumped in parallel, represent the effective overall PDN impedance? Instead, the correct answer is: It depends on how much the individual impedances of the ports in the PDN model differ.

Now back to our case study. First, why is the DC&LF value of VDDS 24.5mOhm while the actual DCR is 12.3mOhm? The answer lies in the routing of VDDS. Looking at the layout of VDDS on the package substrate, we can see that this power rail has two balls on the BGA side and two bumps on the die side, with the two balls/bumps on each side far from each other. Throughout all the package substrate layers there is no lateral connection of VDDS except one thin trace on the CLB layer. Such a construction makes VDDS effectively two separate routings, so the overall PDN impedance is close to the lumped impedance of the two paths, i.e., half of 24.5mOhm, which is 12.25mOhm. That is very close to the DC simulation value of the DCR.
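The port voltages in Table 6 and Table 7 can be reproduced with a few lines of nodal analysis. The Python sketch below assumes a ladder topology inferred from the tabulated values, since Figures 11 and 12 are not reproduced here: z1 and z2 run from the VRM (treated as AC ground) to two internal nodes, zb bridges those nodes, and z3 and z4 run from the nodes to port3 and port4. The function names are illustrative, and the combined-port helper only covers the zb = 0 case used in Table 7.

```python
import math


def parallel(a, b):
    """Parallel combination of two impedances."""
    return a * b / (a + b)


def separate_port_voltages(z1, z2, z3, z4, zb, i3=0.5, i4=0.5):
    """Voltages at port3/port4 with separate current sources (Table 6 setup)."""
    if zb == 0:
        # Internal nodes shorted together: both see (i3 + i4) through z1 // z2.
        va = vb = (i3 + i4) * parallel(z1, z2)
    elif math.isinf(zb):
        # No bridge: two fully independent paths.
        va, vb = i3 * z1, i4 * z2
    else:
        # KCL at node A: va/z1 + (va - vb)/zb = i3
        # KCL at node B: vb/z2 + (vb - va)/zb = i4
        a11, a12, a22 = 1 / z1 + 1 / zb, -1 / zb, 1 / z2 + 1 / zb
        det = a11 * a22 - a12 * a12
        va = (i3 * a22 - a12 * i4) / det
        vb = (a11 * i4 - a12 * i3) / det
    return va + i3 * z3, vb + i4 * z4


def combined_port_voltage(z1, z2, z3, z4, i_total=1.0):
    """Voltage at the merged port for zb = 0 (Table 7 setup)."""
    return i_total * (parallel(z1, z2) + parallel(z3, z4))


if __name__ == "__main__":
    for row in ((1, 1, 1, 1, 0), (1, 1, 1, 10, 0), (1, 10, 1, 10, 0),
                (1, 1, 1, 10, math.inf)):
        print(row, separate_port_voltages(*row))
    print(combined_port_voltage(1, 1, 1, 10), combined_port_voltage(1, 10, 1, 10))
    # Two equal, unconnected VDDS paths of 24.5 mOhm lumped in parallel.
    print(parallel(24.5, 24.5))
```

The last line mirrors the VDDS argument: two equal, laterally unconnected paths of 24.5mOhm in parallel give 12.25mOhm.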
(a) 22 (b) Figure 13 VDDS routing ((a): Left – BGA side; Right – C4 side; (b): VDDS connection on CLB layer) Then the following question is: How to explain why VDDA shows 10.03mOhm while the DRC is 2.09mOhm? Again, the answer goes to the specific layout. Figure 14 shows the layout of VDDA on layer BL and BLB. Apparently enough, the lateral connection of VDDA is much better than VDDS. Therefore, the overall PDN impedance of VDDA cannot use the same calculation as that used for VDDS. Given that there are 21 bumps for VDDA, the overall impedance should be larger than the individual impedance of VDDA port divided by the number of bumps which is 10.03/21 = 0.476mOhm. indeed, it is smaller than the DC simulation value of DCR which is 2.09mOhm! (a) 23 (b) Figure 14 VDDA routing ((a): BL layer; (b): BLB layer) After sorting out the Z(f) puzzle, now we can plot out the transfer function from VDDS to VDDA. By disabling all AC sources at the load side except that for one connected to VDDS, we can obtain the transfer impedances of Zvdda_vdds as shown by Figure 15 and Figure 16. Selecting the max cross noise which occurs at around 758MHz, the transfer function max values of the 3D and Hybrid models are calculated as listed in Table 8. 24 Figure 15 25 Z(f) (VDDS only) – 3D Figure 16 Z(f) (VDDS only) – Hybrid Table 8 Transfer function of VDDS to VDDA Transfer function (max) 3D Hybrid 3.24% 3.58% (Hybrid – 3D)/3D 10.5% Summary of frequency domain comparison: 1. The max deviation between the Z(f) obtained with Hybrid and 3D model respectively occurs at DC&LF region, which is 8.06%. 2. VDDS shows larger deviation than VDDA. This could indicate that Hybrid solver tends to produce larger error when dealing with power shapes with very few vertical/lateral connections. When it comes to power shapes with multiple vias and solid horizontal connections, a Hybrid may give very close model as 3D solver does. 3. 
The 3D model gave a transfer function from VDDS to VDDA of 3.24% while Hybrid model gave 3.53%. 4. Overall, Hybrid and 3D models give quite close Z(f) in DC&LF, MF and HF ranges. From above frequency domain comparison, we can see that Hybrid solver gives no-lesser performance compared to 3D solver. Although, the time domain simulation result has the 26 final say in terms of whether the noise specification is meet or not, since the frequency domain Z(f) has a definite relationship to the time domain noise. We would not expect to see a huge deviation between the Z(f) and time domain result. Time domain comparison Before diving deep into time domain simulation, we still need to do one more thing – to convert the s-parameter models into rational polynomial model (also known as a macromodel). Figure 17 plots the Z(f) of both VDDA and VDDS based on the original s-parameters against the derived macromodels. It can be seen the macromodel is only very slightly different from the Z(f) generated from the original s-parameter. Note: 3D result is similar hence is omitted here. Figure 17 s-parameter vs macromodel– Hybrid Applying the obtained macromodel into the transient simulations, we can obtain the noise seen at C4 bump under the stimuli of multiple die iccts, with the die PDN model included. With VDDS VRM enabled and VDDA VRM disabled, the time domain noise is captured by Figure 18 in which (a) plots the waveform of VDDS and (b) is the coupled noise measured on VDDA. The noise waveforms of 3D and Hybrid are plotted together for a clear comparison. 27 (a) (b) Figure 18 28 VDDS-only transient noise ((a): VDDS; (b): VDDA; Blue: 3D; Pink: Hybrid) Table 9 records the noise results for a clear view: Table 9 Combined port3, port4 Model 3D Hybrid (Hybrid – 3D)/3D VDDS noise (mV) 59.76 61.79 3.4% VDDA noise (mV) 2.44 3.08 26.2% VDDA/VDDS 4.08% 4.98% 22.1% Summary of time domain comparison: 1. 
In the time domain simulation, the 3D and Hybrid models produce similar noise on the enabled power rail, i.e., VDDS, with the noise from the Hybrid model slightly higher than that from 3D (3.4%).
2. The coupled noise generated by the Hybrid model, normalized to the VDDS noise (VDDA/VDDS), is 22.1% higher than that of 3D (26.2% higher in absolute terms). Compared to the frequency domain transfer function delta of 10.5%, the time domain result is considered within a reasonable range.

Summary

In this paper we compared the coupling effects of a PDN modeled by Hybrid and 3D full-wave solvers. The transfer impedance in the frequency domain and the coupled noise in the time domain were simulated and analyzed. The relationship between the Z(f) of an individual port and the overall PDN impedance was derived. The impact of the reference impedance used for PDN model extraction was investigated. Finally, the correspondence between the frequency domain and time domain simulation results was discussed.

In conclusion, to correctly model the coupling effects in a package, we suggest the following set of design rules:
1. For packages whose coupling effects are not significant, a Hybrid solver can provide a sufficiently accurate model, comparable to that of a 3D full-wave solver.
2. For packages where the coupling effect is critical, a 3D solver is recommended over Hybrid.
3. For packages with highly complicated vertical structures, regardless of whether the coupling is significant, a 3D solver is also recommended in order to model the vertical structures accurately.
4. The selection of the Zref value depends on the particular case under consideration. We suggest first using a Zref that is close to the Zpdn in the DC&LF frequency range. If the resultant s-parameter appears problematic during sanity checking, try other values until a good s-parameter is obtained.
5. The boundary frequencies of each section of the PDN depend on the characteristics of the individual case, including the number and locations of decoupling capacitors as well as the specific layout pattern.
6.
Make sure the DC&LF impedance aligns with the real DCR. Since the relationship is not straightforward when lateral connections exist between ports, PI engineers can follow the example demonstrated in this paper to evaluate the DC&LF impedance of PDNs.

References

[1] Albert Lan, “Advanced Package Solutions & Its Challenges for Mobile and IoT/Wearable Devices”, 2016
[2] Intel, “Evolving Moore’s Law with chiplets and 3D packaging”, 2019
[3] P. G. Emma and E. Kursun, “Is 3D chip technology the next growth engine for performance improvement?”, IBM J. Res. Dev., vol. 52, no. 6, Nov. 2008
[4] M. Swaminathan and E. Engin, “Power Integrity Modeling and Design for Semiconductors and Systems”, Prentice Hall, 2007
[5] Istvan Novak, Jason R. Miller, “Frequency Domain Characterization of Power Distribution Networks”
[6] Byoungjin Bae, et al, “A Preliminary Analysis of Domain Coupling in Package Power Distribution Network”, 2017
[7] Madhavan Swaminathan et al, “Designing and Modeling for Power Integrity”, 2010
[8] Yuriy Shlepnev, “Quality of S-parameter models”, Asian IBIS Summit, Yokohama, 2011
[9] Zheng Zhang et al, “Passivity Check of S-Parameter Descriptor Systems Via S-Parameter Generalized Hamiltonian Methods”, 2010
[10] Yarui Peng, “Die-to-Package Coupling Extraction for Fan-Out Wafer-Level Packaging”, 2017