Microelectronics Reliability 56 (2016) 182–188
Thermal reliability prediction and analysis for high-density electronic systems based on the Markov process
Yi Wan a,⁎, Hailong Huang a, Diganta Das b, Michael Pecht b
a College of Physics and Electronic Information Engineering, Wenzhou University, Wenzhou 325035, China
b Center for Advanced Life Cycle Engineering, University of Maryland, College Park, MD 20742, USA
Article info
Article history:
Received 9 April 2015
Received in revised form 25 August 2015
Accepted 5 October 2015
Available online 14 October 2015
Keywords:
Electronic systems
Thermal reliability estimation and prediction
Stochastic process
Markov theory
The feature parameters of thermal reliability evaluation
Abstract
Thermal-mechanical fatigue is one of the main failure modes for electronic systems, particularly for high-density
electronic systems with high-power components. Thermal reliability estimation and prediction have been an
increasing concern for improving the safety and reliability of electronic systems. In this paper, we propose a
stochastic process prediction model to estimate the thermal reliability of an electronic system based on Markov
theory. We first divided the high-density electronic systems into four modules: the energy transformation and
protection module, the electronic control module, the connection module, and the signal transmission and transformation module. By integrating failure and repair characteristics of the four modules, a stochastic model of
thermal reliability analysis and prediction for a whole electronic system was built based on the Markov process.
The feature parameters of thermal reliability evaluation, including thermal reliability, thermal failure probability,
mean time between thermal faults, and thermal stable availability, were derived based on our comprehensive
model. Finally, we applied the model to an indoor electronic system of DC frequency conversion conditioning.
The thermal reliability was estimated and predicted using tested failure and debugging repair data. Effective
methods for improving thermal reliability are presented and analyzed based on the comprehensive Markov
model.
© 2015 Elsevier Ltd. All rights reserved.
1. Introduction
In an electronic system, the electrical and mechanical connections
between the electronic components and the circuit board are mainly
implemented through the solder joints. With the increase of integration
levels, more attention has been focused on overheating of electronic
products [1]. The thermal failure rate of electronic components will increase by an order of magnitude per 10 °C increment of environmental
temperature [2].
Thermal fatigue is the main failure mode of high-density electronic
systems. Table 1 shows the relation between failure rate and the temperature of electronic components [3,4]. Temperature and thermal features
directly affect the life of electronic products and are key issues for high-density electronic systems [5]. Todd et al. pointed out that the reliability
of an electronic system is mainly the thermal reliability of the system [6].
As solder joints have become smaller, reliability design to prevent thermal failure has been increasingly emphasized [7]. It is therefore inevitable that reliability engineering and reliability theory are applied to
thermal failure analysis and thermal design of an electronic system.
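
The FRHL column of Table 1 can be reproduced directly from the tabulated failure rates. The following Python sketch is illustrative only: the dictionary layout and the function name `failure_ratio` are assumptions introduced here, while the numbers are copied from Table 1.

```python
# Failure rates copied from Table 1: (high-temperature rate, low-temperature rate).
# The data layout and helper name are illustrative assumptions, not part of the paper.
TABLE_1 = {
    "Transistor": (0.0640, 0.008),
    "Ceramic capacitor": (0.029, 0.0009),
    "Transformer": (0.0267, 0.001),
    "Carbon film resistor": (0.0063, 0.0002),
    "IC chip": (0.5100, 0.0068),
}


def failure_ratio(high_rate, low_rate):
    """Return the ratio of the high-temperature to the low-temperature failure rate."""
    return high_rate / low_rate


if __name__ == "__main__":
    for component, (high, low) in TABLE_1.items():
        print(f"{component}: FRHL is about {failure_ratio(high, low):.1f}:1")
```

The computed ratios agree with the rounded FRHL values listed in Table 1 to within about one unit.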
⁎ Corresponding author.
E-mail address: jsj_yiwan@wzu.edu.cn (Y. Wan).
http://dx.doi.org/10.1016/j.microrel.2015.10.006
Many scholars have carried out relevant studies, and some reliability
analysis and prediction models have been proposed. The most common
model used is the MIL standard model; the other common models are
the Schick–Wolverton model, the Shooman model, the Musa execution
time model, the Goel–Okumoto non-homogeneous Poisson process
model, the likelihood Bayesian model, and so on [8–11]. The other
popular methods are the thermal shock test/temperature cycling test
(TST/TCT) method, the nonlinear finite element method, and the
thermal structural reliability method [12–16].
In recent years, thermal reliability models have been developed.
Johann-Peter Sommer et al. presented a parametric finite element analysis (FEA) method for the thermal and thermo-mechanical behavior
of advanced electronic packages, taking into account the initial design
phase in order to evaluate and achieve reliability [17]. Yu-min Lee
et al. proposed an efficient statistical electro-thermal model for analyzing on-chip thermal reliability under process variations by the
collocation-based statistical modeling technique. A mixed-mesh strategy is presented to further enhance the efficiency of the developed statistical electro-thermal model [18]. Catelani et al. evaluated and analyzed
the reliability performance of an electronic device in accordance with
the Arrhenius and modified Coffin–Manson degradation models. The
whole test plan was characterized, and some results were reported
[19]. Chipulis et al. developed an approach to solving the issue of
evaluating the reliability of thermal power engineering based on processing
retrospective information by regression analysis methods [20].
Physics-of-failure (PoF) and T-FEA methods have also been developed and used in
the literature to assess and predict the thermal reliability of electronic
systems [21,22].

Table 1
Failure rate of electronic components at high and low temperatures.

                       Base failure rate
Component              High temperature   Low temperature   ΔT (°C)   FRHL⁎
Transistor             0.0640 at 160 °C   0.008 at 40 °C    120       8:1
Ceramic capacitor      0.029 at 125 °C    0.0009 at 40 °C   85        32:1
Transformer            0.0267 at 85 °C    0.001 at 40 °C    45        27:1
Carbon film resistor   0.0063 at 90 °C    0.0002 at 40 °C   50        31:1
IC chip                0.5100 at 90 °C    0.0068 at 40 °C   50        75:1

⁎ FRHL denotes the ratio of the failure rate at high temperature to that at low temperature.

However, the existing models and methods mainly address thermal
reliability analysis and prediction for a single component. The comprehensive
failure and debugging repair characteristics of an electronic
system were not considered in the above models. An electronic
system is a comprehensive system that consists of many components
and is capable of debugging and repair. Its thermal reliability is
determined not only by the thermal reliability of individual components,
but also by the failure and debugging repair characteristics of every
component.
In this paper, we propose a stochastic process prediction model to
estimate the thermal reliability of electronic systems based on Markov
theory. We first divided high-density electronic systems into four
modules: the energy transformation and protection module, the electronic
control module, the connection module, and the signal transmission
and transformation module. By integrating failure characteristics
and debugging repair characteristics of the four modules, the stochastic
process model of thermal reliability analysis and prediction for the
whole electronic system was built based on Markov theory. Further,
the evaluation feature parameters, including thermal reliability, thermal
failure probability, mean time between thermal faults, and thermal stable
availability, were derived based on the comprehensive
model. Finally, we applied the model to an indoor electronic system of
DC frequency conversion conditioning. The thermal reliability was
estimated and predicted using failure and debugging repair data, and
effective methods of improving thermal reliability were presented and
analyzed based on the comprehensive Markov model. In this work, we
developed a new method for thermal reliability design, accurate thermal
reliability evaluation and prediction, and scientific management
for high-density electronic systems.

2. State analysis of an electronic system
A high-density electronic system consists of an energy transformation and protection module, an electronic control module, a connection
module, and a signal transmission and transformation module (Fig. 1).
Every module consists of many components. An energy transformation
and protection module includes a switching power supply, power-converting chips, transformers, fuses, magnetic rings, voltage-dependent resistors, and so on. An electronic control module includes
relays, optocouplers, light-emitting diodes (LEDs), and so on. A connection module includes connectors, sockets, printed circuit boards (PCBs),
connecting cables, and so on. A signal transmission and transformation
module includes resistors, capacitors, inductors, diodes, transistors,
digital integrated circuits, AD/DA circuits, operational amplifiers, and
so on.
Every module in an electronic system has two states: normal and thermal
failure. The state repeatedly changes from normal to thermal failure and
then, through debugging or repair within a given time, from thermal failure
back to normal. This state transformation is shown in
Fig. 2. Both failure time and debugging repair time are random, and
the failure probability of every module is also random. The lifetime and
state transformations of every module are therefore random processes.
The thermal failure random process of an electronic system can be
considered a continuous-time Markov process with discrete states.
Assume that the failure process of an electronic system satisfies
the following conditions [23]:
(1) The state is discernible.
(2) The thermal failure rate, λ, and debugging repair rate, μ, of every
module can both be regarded as approximately constant,
so the reliability distribution of every module is
exponential.
(3) The probability that a normal module fails within the interval
from t to t + Δt is λΔt.
(4) The probability that a failed module is repaired by debugging
within the interval from t to t + Δt is μΔt.
(5) The probability that two or more modules fail within Δt is
approximately zero.
(6) The thermal failure event and debugging repair event of every
module are independent of each other.
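
Assumptions (3) and (4) fix the transition probabilities of a single module over a short interval Δt. As a sketch in standard two-state Markov notation (the symbols P_0(t) and P_1(t) for the probabilities of the normal and thermal failure states, and the state labels 0 and 1, are introduced here for illustration and are not taken from the original text), the transition matrix, the resulting state equations, and their solution for a module that starts in the normal state can be written as:

```latex
% State 0: normal, state 1: thermal failure (labels assumed for this sketch)
\[
P(\Delta t) =
\begin{pmatrix}
1 - \lambda\,\Delta t & \lambda\,\Delta t \\
\mu\,\Delta t & 1 - \mu\,\Delta t
\end{pmatrix}
\]
% Letting \Delta t \to 0 gives the state equations
\[
\frac{dP_0(t)}{dt} = -\lambda P_0(t) + \mu P_1(t), \qquad
\frac{dP_1(t)}{dt} = \lambda P_0(t) - \mu P_1(t)
\]
% With the initial condition P_0(0) = 1, P_1(0) = 0:
\[
P_0(t) = \frac{\mu}{\lambda + \mu} + \frac{\lambda}{\lambda + \mu}\, e^{-(\lambda + \mu) t}
\]
```

As t grows, P_0(t) tends to μ/(λ + μ), the long-run probability of finding the module in the normal state.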
Based on the above analysis, the thermal failure process of an electronic system is a Markov process wherein the state alternates between
normal and failure.
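
The alternation between the normal and failure states can be illustrated with a small Monte Carlo sketch. It assumes exponentially distributed failure and repair times, consistent with the constant-rate assumption (2); the rate values and the function names are hypothetical and chosen only for illustration. For a two-state process with constant rates, the long-run fraction of time spent in the normal state should approach μ/(λ + μ).

```python
import random


def simulate_availability(failure_rate, repair_rate, cycles=20000, seed=0):
    """Estimate the long-run fraction of time one module spends in the normal state.

    Each cycle draws one exponential time to failure and one exponential
    debugging/repair time, matching the constant-rate assumption.
    """
    rng = random.Random(seed)
    up_time = 0.0
    down_time = 0.0
    for _ in range(cycles):
        up_time += rng.expovariate(failure_rate)   # normal until a thermal failure
        down_time += rng.expovariate(repair_rate)  # failed until debugging/repair ends
    return up_time / (up_time + down_time)


def steady_state_availability(failure_rate, repair_rate):
    """Closed-form value mu / (lambda + mu) for a two-state Markov process."""
    return repair_rate / (failure_rate + repair_rate)


if __name__ == "__main__":
    lam, mu = 0.01, 0.5  # hypothetical rates per hour, not taken from the paper
    print("simulated :", simulate_availability(lam, mu))
    print("analytical:", steady_state_availability(lam, mu))
```

With enough cycles the simulated value should settle close to the analytical one, which gives a quick sanity check on measured λ and μ before a larger model is built.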
Assume that the state of every module is independent of the others
and that two or more modules cannot be in thermal failure at the
same time. A group of professional debugging and repair specialists is
provided, and the state of every module is shown in Table 2.
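
Under these assumptions the system occupies one of five states: all four modules normal, or exactly one module in thermal failure. The sketch below computes the steady-state probabilities of such a chain from its balance equations. The state numbering, the use of one failure rate and one debugging repair rate per module, the function name, and the numerical rates are all assumptions made here for illustration; they are not the tested data of the paper.

```python
def steady_state_probabilities(failure_rates, repair_rates):
    """Steady-state probabilities of a chain with one 'all normal' state (index 0)
    and one 'module i failed' state per module (index i = 1, 2, ...).

    Balance equations: pi_0 * lambda_i = pi_i * mu_i for every module i,
    together with the normalization sum(pi) = 1.
    """
    ratios = [lam / mu for lam, mu in zip(failure_rates, repair_rates)]
    p_normal = 1.0 / (1.0 + sum(ratios))
    return [p_normal] + [r * p_normal for r in ratios]


if __name__ == "__main__":
    # Hypothetical rates (per hour) for the four modules; not data from the paper.
    lams = [2e-4, 1e-4, 5e-5, 3e-4]
    mus = [0.5, 0.4, 0.25, 0.6]
    probs = steady_state_probabilities(lams, mus)
    print("steady-state availability:", probs[0])
    print("failure-state probabilities:", probs[1:])
```

The first entry is the long-run probability that the whole system is normal; it falls as any ratio of failure rate to repair rate grows, so this form makes it easy to see which module limits availability.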
3. Prediction and analysis of thermal reliability for an electronic system based on the Markov process
Let λ1, λ2, λ3, and λ4 be the thermal failure rates, and let μ1, μ2, μ3, and μ4 be
the debugging repair rates of the energy transformation and protection
module, the electronic control module, the connection module, and the
Fig. 1. Functional structure of an electronic system.
Fig. 2. State transformation of an electronic system.