Accelerated Testing
Obtaining Reliability Information Quickly

William Q. Meeker
Department of Statistics and Center for Nondestructive Evaluation
Iowa State University
Ames, IA 50011

Accelerated Testing Background
• Today’s manufacturers face strong pressure to:
    - Improve productivity, product field reliability, and overall quality using new technology.
    - Develop newer, higher-technology products in record time.
• This implies an increased need for up-front testing of materials, components, and systems.
• Accelerated tests provide timely information for product design and development.
• Users must be aware of potential pitfalls.

Overview
• Different kinds of accelerated tests
• Example 1: Evaluation of an insulating structure
• Example 2: New-technology microelectronic logic device
• Accelerated Degradation Tests
• Importance of physics of failure and physical/chemical models (and sensitivity analysis)
• Example 3: Microelectronic RF amplifier device
• Connecting with the field
• Example 4: Appliance field reliability
• Areas for further research

What is Reliability?
• The probability that a system, vehicle, machine, device, and so on, will perform its intended function under encountered operating conditions, for a specified period of time.
• R(t) = 1 − F(t)
• Quality over time
• A powerful marketing tool
• An engineering discipline requiring support from
    - Physics and chemistry
    - Statistics

[Figure: Breakdown Times in Minutes of a Mylar-Polyurethane Insulating Structure (from Kalkanis and Rosso 1989)]

Some Applications of Accelerated Tests
• Assess component or material reliability or durability.
• Make design decisions to improve reliability or lower cost.
• System test to simulate field use at accelerated conditions.
• Predict product field performance.
• Identify and fix potential failure modes at system/subsystem level (HALT and STRIFE tests).
• Screening (100% or audit) testing of manufactured product (e.g., ESS and burn-in).
• Verify predictions produced with physical models (e.g., FEM).

Inverse Power Relationship-Lognormal Model
The inverse power relationship-lognormal model is

$$\Pr[T \le t;\ \mathrm{volt}] = \Phi_{\mathrm{nor}}\left[\frac{\log(t) - \mu}{\sigma}\right]$$

where
• $\mu = \beta_0 + \beta_1 x$, and
• $x = \log(\text{Voltage Stress})$.
• $\sigma$ is assumed to be constant.

[Figure: Plot of Inverse Power Relationship-Lognormal Model Fitted to the Mylar-Polyurethane Data (also Showing 361.4 kV Data Omitted from the ML Estimation). Axes: Minutes versus kV/mm, with 10%, 50%, and 90% quantile lines.]

[Figure: Lognormal Probability Plot of the Inverse Power Relationship-Lognormal Model Fitted to the Mylar-Polyurethane Data. Axes: Proportion Failing versus Minutes; lines labeled 219.0, 157.1, 122.4, 100.3, and 50 kV/mm.]

Methods of Acceleration
Three fundamentally different methods of accelerating a reliability test:
• Increase the use-rate of the product (e.g., test a toaster 400 times/day). A higher use rate reduces test time.
• Use elevated temperature or humidity to increase the rate of a failure-causing chemical/physical process.
• Increase stress (e.g., voltage or pressure) to make degrading units fail more quickly.
Use a physical/chemical (preferable) or empirical model to relate degradation or lifetime at the accelerated conditions to that at use conditions.

[Figure: Interval ALT Data for a New-Technology IC Device. Axes: Hours versus Degrees C.]

New-Technology Integrated Circuit Device ALT Data
• Tests run at 150, 175, 200, 250, and 300°C.
• Failures had been found only at the two higher temperatures.
• After early failures at 250 and 300°C, there was some concern that no failures would be observed at 175°C before decision time.
• Thus the 200°C test was started later than the others.
• Developers were interested in estimating the activation energy of the suspected failure mode and the long-life reliability.
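Interval data of this kind, where a failure is only known to have occurred between two inspections and many units have not failed at all, enter a likelihood through interval and right-censored terms. The following is a minimal Python sketch of that log-likelihood at a single test condition, assuming a lognormal lifetime with given log-scale parameters. The function names, the inspection counts, and the parameter values are hypothetical illustrations, not the actual IC device data or the software used for the analysis.

```python
import math

def lognormal_cdf(t, mu, sigma):
    """Pr[T <= t] for a lognormal lifetime with log-scale location mu and scale sigma."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def interval_loglik(intervals, n_censored, censor_time, mu, sigma):
    """Log-likelihood for interval-censored failures plus right-censored survivors.

    intervals   : list of (lower, upper, count); 'count' units failed in (lower, upper]
    n_censored  : number of units still unfailed at censor_time
    """
    ll = 0.0
    for lower, upper, count in intervals:
        # Probability of failing between two inspections
        p = lognormal_cdf(upper, mu, sigma) - lognormal_cdf(lower, mu, sigma)
        ll += count * math.log(p)
    # Each survivor contributes the probability of lasting beyond censor_time
    ll += n_censored * math.log(1.0 - lognormal_cdf(censor_time, mu, sigma))
    return ll

# Hypothetical inspection results (hours), for illustration only
intervals = [(500.0, 1000.0, 2), (1000.0, 2000.0, 4)]
ll = interval_loglik(intervals, n_censored=14, censor_time=2000.0,
                     mu=math.log(3000.0), sigma=0.6)
print(ll)
```

Maximizing this function over the parameters (summed over all test conditions, with the location tied to the stress through a regression relationship) is what an ML fit does; a general-purpose optimizer could be used for that step.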
Elevated Temperature Acceleration of Chemical Reaction Rates
• The Arrhenius model reaction rate, $R(\mathrm{temp})$, is

$$R(\mathrm{temp}) = \gamma_0 \exp\left[\frac{-E_a}{k_B\,(\mathrm{temp}\,^{\circ}\mathrm{C} + 273.15)}\right] = \gamma_0 \exp\left(\frac{-E_a \times 11605}{\mathrm{temp\,K}}\right)$$

where temp K = temp °C + 273.15 is temperature in kelvin and $k_B = 1/11605$ is Boltzmann’s constant in units of electron volts per K. The reaction activation energy, $E_a$, and $\gamma_0$ are characteristics of the product or material being tested.
• The reaction rate Acceleration Factor is

$$AF(\mathrm{temp}, \mathrm{temp}_U, E_a) = \frac{R(\mathrm{temp})}{R(\mathrm{temp}_U)} = \exp\left[E_a\left(\frac{11605}{\mathrm{temp}_U\,\mathrm{K}} - \frac{11605}{\mathrm{temp\,K}}\right)\right]$$

• When $\mathrm{temp} > \mathrm{temp}_U$, $AF(\mathrm{temp}, \mathrm{temp}_U, E_a) > 1$.

The Arrhenius-Lognormal Regression Model
The Arrhenius-lognormal regression model is

$$\Pr[T \le t;\ \mathrm{temp}] = \Phi_{\mathrm{nor}}\left[\frac{\log(t) - \mu}{\sigma}\right]$$

where
• $\mu = \beta_0 + \beta_1 x$,
• $x = 11605/(\mathrm{temp\,K}) = 11605/(\mathrm{temp}\,^{\circ}\mathrm{C} + 273.15)$,
• $\beta_1 = E_a$ is the activation energy, and
• $\sigma$ is constant.

[Figure: Arrhenius Plot Showing ALT Data and the Arrhenius-Lognormal Model ML Estimation Results for the New-Technology IC Device. Axes: Hours versus Degrees C on Arrhenius scale, with 1%, 10%, and 50% quantile lines.]

[Figure: Lognormal Probability Plot Showing the Arrhenius-Lognormal Model ML Estimation Results for the New-Technology IC Device. Axes: Proportion Failing versus Hours; lines for 300, 250, 200, 175, 150, and 100 Deg C.]

[Figure: Lognormal Probability Plot Showing the Arrhenius-Lognormal Model ML Estimation Results for the New-Technology IC Device with Given Ea = 0.8. Axes: Proportion Failing versus Hours; lines for 300, 250, 200, 175, 150, and 100 Deg C.]

Pitfall 4: Masked Failure Mode
• An accelerated test may focus on one known failure mode, masking another!
• A masked failure mode may be the first one to show up in the field.
• Masked failure modes could dominate in the field.
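The masking pitfall follows directly from the Arrhenius acceleration factor: a mode with a lower activation energy is accelerated less by temperature, so it can be invisible in the test yet arrive first at use conditions. The sketch below illustrates this in Python. It assumes the Arrhenius-lognormal model, under which a median life at use temperature equals the test-temperature median multiplied by the acceleration factor. The two modes, their activation energies, and their test medians are hypothetical values chosen only for illustration.

```python
import math

INV_KB = 11605.0  # 1/k_B, with Boltzmann's constant in electron volts per kelvin

def arrhenius_af(temp_c, temp_use_c, ea_ev):
    """Arrhenius acceleration factor AF(temp, temp_U, Ea) for temperatures in degrees C."""
    temp_k = temp_c + 273.15
    temp_use_k = temp_use_c + 273.15
    return math.exp(ea_ev * (INV_KB / temp_use_k - INV_KB / temp_k))

# Hypothetical failure modes observed (or not) at a 120 degree C test
test_temp_c, use_temp_c = 120.0, 40.0
modes = {
    "Mode 1": {"ea_ev": 0.8, "median_test_hours": 1_000.0},   # seen in the test
    "Mode 2": {"ea_ev": 0.3, "median_test_hours": 20_000.0},  # masked in the test
}

median_use_hours = {}
for name, mode in modes.items():
    af = arrhenius_af(test_temp_c, use_temp_c, mode["ea_ev"])
    # Under the Arrhenius-lognormal model, life quantiles scale by AF
    median_use_hours[name] = mode["median_test_hours"] * af
    print(name, "AF =", round(af, 1), "median at use =", round(median_use_hours[name]))
```

With these illustrative numbers, Mode 2 fails much later than Mode 1 at the test temperature but earlier than Mode 1 at the use temperature, which is the situation the pitfall warns about.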
[Figure: Possible results for a typical temperature-accelerated failure mode on an IC device. Axes: Hours versus Degrees C, with a 10% quantile line.]

[Figure: Unmasked Failure Mode with Lower Activation Energy. Axes: Hours versus Degrees C; 10% quantile lines for Mode 1 and Mode 2.]

[Figure: Percent Increase in Resistance Over Time for Carbon-Film Resistors (Shiomi and Yanagisawa 1979). Axes: Percent Increase versus Hours; curves at 83, 133, and 173 Degrees C.]

Advantages of Using Degradation Data Instead of Time-to-Failure Data
• Degradation is the natural response for some tests.
• Can be more informative than time-to-failure data. (Reduction to failure-time data loses information.)
• Useful reliability inferences even with 0 failures.
• More justification and credibility for extrapolation. (Modeling is closer to the physics of failure.)

[Figure: Percent Increase in Operating Current for GaAs Lasers Tested at 80°C. Axes: Percent Increase in Operating Current versus Hours.]

Limitations of Degradation Data
• Analyses are more complicated and require statistical methods not yet widely available. (Modern computing capabilities should help here.)
• Substantial measurement error can diminish the information in degradation data.
• Obtaining degradation data may have an effect on future product degradation (e.g., taking apart a motor to measure wear).
• Degradation data may be difficult or impossible to obtain (e.g., destructive measurements).
• Degradation level may not correlate well with failure.
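To make the point about useful inferences with 0 failures concrete, here is a small Python sketch of one simple way degradation readings can be used before any unit fails: fit a straight line to a unit's degradation path and project when it would cross a failure threshold. This is only an illustration under strong assumptions (a linear path, negligible measurement error, and a known threshold); the readings, the 10 percent threshold, and the function names are hypothetical, and this is not presented as the method behind the plots above.

```python
def fit_line(times, values):
    """Ordinary least-squares intercept and slope for one unit's degradation path."""
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope

def projected_crossing_time(times, values, threshold):
    """Time at which the fitted line reaches the threshold, or None if it never does."""
    intercept, slope = fit_line(times, values)
    if slope <= 0:
        return None  # the fitted path is not moving toward the threshold
    return (threshold - intercept) / slope

# Hypothetical readings for one unit: percent increase versus hours on test
hours = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
percent_increase = [0.0, 1.1, 1.9, 3.1, 4.0]

# The unit has not failed by 4000 hours, yet a crossing time can be projected
crossing = projected_crossing_time(hours, percent_increase, threshold=10.0)
print(crossing)
```

The projection extrapolates well beyond the observed readings, so it inherits the limitations listed above, in particular sensitivity to measurement error and to the assumed path shape.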
[Figure: Device-B Power Drop Accelerated Degradation Test Results at 150°C, 195°C, and 237°C (Use conditions 80°C). Axes: Power drop in dB versus Hours.]

Arrhenius Model Temperature Effect on Chemical Degradation
For the one-step reaction

$$A_1 \xrightarrow{\;k_1\;} A_2$$

the rate equations are

$$\frac{dA_1}{dt} = -k_1 A_1 \quad\text{and}\quad \frac{dA_2}{dt} = k_1 A_1, \qquad k_1 > 0. \tag{1}$$

Solving these gives

$$A_1(t) = A_1(0)\exp(-k_1 t)$$

$$A_2(t) = A_2(0) + A_1(0)\,[1 - \exp(-k_1 t)]$$

where $A_1(0)$ and $A_2(0)$ are initial conditions. The Arrhenius model describing the effect that temperature has on the rate of a simple first-order chemical reaction is

$$k_1 = \gamma_0 \exp\left[\frac{-E_a}{k_B \times (\mathrm{temp} + 273.15)}\right]$$

[Figure: Lognormal-Arrhenius Model Fit to the Device-B Time-to-Failure Data with Degradation Model Estimates. Axes: Proportion Failing versus Hours; lines for 237, 195, 150, and 80 Degrees C.]

What Do Accelerated Test Results Tell Us About Field Reliability?
Need information on:
• Effects of acceleration (e.g., cycling rate).
• Distribution of use-rates in actual use.
• Distribution of environmental conditions (e.g., stress spectra distributions).
These factors may be given or, in some situations, inferred from the available data.

Establish a Transfer Function Relating Laboratory Tests and Field Performance
• Carefully compare laboratory test results and field failures.
    - Same failure mechanisms operating in laboratory tests?
    - Same factors (environmental noises) exciting the failure mechanisms?
    - Identify laboratory/field discrepancies to improve test procedures.
    - Seek understanding of reasons for lack of agreement.
• Find a model (transfer function) to relate laboratory test to field use.
• Understanding the relationship between the laboratory test results and product field reliability will provide a stronger basis for using future laboratory tests to predict field performance.

[Figure: Component-A Laboratory Test Cycles to Failure. Axis: Cycles.]

[Figure: Appliance Use-Rate Distribution (discretized lognormal distribution). Axes: Relative Frequency versus Appliance Uses per Week.]

Example Use-Rate Model
• Life of a component, in cycles of use, has a distribution

$$F_C(c) = P(C \le c) = \Phi\left[\frac{\log(c) - \mu}{\sigma}\right]$$

• Actual use-rate has a distribution given by the proportion of users $\pi_i$ ($i = 1, \ldots, k$) that use the appliance at constant rate $R_i$, where $\sum_{i=1}^{k} \pi_i = 1$.
• Then the failure probability as a function of time is

$$F_T(t; \theta) = P(T \le t) = \sum_{i=1}^{k} \pi_i\, \Phi\left[\frac{\log(t) - \mu_i}{\sigma}\right]$$

where $\theta = (\mu_1, \ldots, \mu_k, \sigma)$ and $\mu_i = \mu - \log(R_i)$.
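The use-rate model above is a finite mixture and is straightforward to evaluate numerically. The Python sketch below computes $F_T(t;\theta)$ from the cycles-to-failure parameters and a discrete use-rate distribution. The function names and every numerical value (the lognormal parameters, the rates in uses per week, and the user proportions) are hypothetical and are not the Component-A estimates.

```python
import math

def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def field_failure_prob(t, mu, sigma, rates, proportions):
    """F_T(t) = sum_i pi_i * Phi((log t - mu_i) / sigma), with mu_i = mu - log(R_i).

    t           : time in service (e.g., weeks)
    mu, sigma   : lognormal parameters of life in cycles of use
    rates       : constant use rates R_i (cycles per unit of time)
    proportions : proportions of users pi_i at each rate (must sum to 1)
    """
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("user proportions must sum to 1")
    total = 0.0
    for rate, pi in zip(rates, proportions):
        mu_i = mu - math.log(rate)  # shift from the cycles scale to the time scale
        total += pi * std_normal_cdf((math.log(t) - mu_i) / sigma)
    return total

# Hypothetical inputs: median life of 5000 cycles, three groups of users
mu, sigma = math.log(5000.0), 0.5
rates = [2.0, 5.0, 10.0]        # uses per week
proportions = [0.3, 0.5, 0.2]   # share of users at each rate

for weeks in (100.0, 500.0, 1000.0):
    print(weeks, field_failure_prob(weeks, mu, sigma, rates, proportions))
```

A useful sanity check is the single-rate case: with one rate R and proportion 1, the result equals the cycles distribution evaluated at c = R × t.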
[Figure: Predicted Field Reliability of Component-A as a Weighted Average of Lognormal Distributions. Horizontal axis: Weeks of Service.]

[Figure: Predicted Field Reliability of Component-A as a Weighted Average of Weibull Distributions. Horizontal axis: Weeks of Service.]

[Figure: Fitted Use-Rate Model for the Wear Failure Mode. Laboratory (subset AccWear, Appliance B Wear Failure Mode ALT data) and Field (subset Field, Appliance B Wear Failure Mode). Axes: Fraction Failing versus lab time in test cycles and field time in weeks.]

[Figure: Field Variability Lognormal f(r; ηR, σR) Density for the Wear Failure Mode (unloaded cycles relative to field days of use). Horizontal axis: Test Cycles per Week.]

[Figure: Simulation of a Proposed Accelerated Life Test Plan. Plot annotations: Temp = 78, 98, 120; n = 155, 60, 84; centime = 183, 183, 183; parameters = -16.7330, 0.7265, 0.6000. Log time quantiles at 50 Degrees C: Average(0.1 quantile) = 8.014, SD(0.1 quantile) = 0.4632; Average(0.5 quantile) = 9.138, SD(0.5 quantile) = 0.5116; Average(Ea) = 0.7266, SD(Ea) = 0.08594.]

Planning Accelerated Tests
• Most basic ideas of traditional DOE still hold.
• In censored accelerated life tests (failure time is the response), allocate more test units to low acceleration-factor levels than to high acceleration-factor levels.
• Consider including some tests at the use conditions.
• Use simulation to investigate properties of alternative ALT plans.
• Limit, as much as possible, the amount of extrapolation used.

[Figure note, Simulation of a Proposed Accelerated Life Test Plan: axes are Days versus Degrees C, with 10% quantile lines. Results based on 500 simulations; lines shown for 50 simulations.]

Concluding Remarks
• Accelerated Testing can be a valuable tool when used carefully.
• There is no magic in Accelerated Testing.
• Cross-disciplinary teams are needed to deal effectively with all issues:
    - Product/reliability/design engineers to identify product-use profiles, environmental considerations, potential failure modes or weaknesses that need to be evaluated, etc.
    - Experts in materials and the chemistry/physics of failure to help in understanding, and to suggest or develop, appropriate models for acceleration of particular failure modes.
    - Statisticians to help with stochastic modeling, plan tests, fit models, and to help quantify uncertainty in results.
• Users of Accelerated Testing must beware of pitfalls.

Areas for Further Research
• Physical/statistical models for failure acceleration
• Methods for sensitivity analysis when empirical models must be used
• Prediction of service life in complicated environments
• Physical/statistical models for the field environment
• Bayesian methods for analysis and planning (especially adaptive test plans)
• Accelerated degradation test planning
• Degradation analysis and planning with coarse (e.g., ordered categorical and censored) data
• Physical comparison of lab and field failures to validate testing methods

References
• D. Byrne and J. Quinlan (1993), Robust function for attaining high reliability at low cost, 1993 Proceedings Annual Reliability and Maintainability Symposium, 183-191.
• L. W. Condra (1993), Reliability Improvement with Design of Experiments. New York: Marcel Dekker, Inc.
• M. Hamada (1995), Using statistically designed experiments to improve reliability and to achieve robust reliability, IEEE Transactions on Reliability R-44, June.
• M. Hamada (1995), Analysis of experiments for reliability improvement and robust reliability, in Recent Advances in Life-Testing and Reliability, N. Balakrishnan, editor. Boca Raton: CRC Press.
• W. Q. Meeker and M. Hamada (1995), Statistical Tools for the Rapid Development & Evaluation of High-Reliability Products, IEEE Transactions on Reliability R-44, 187-198.
• W. Q. Meeker and L. A. Escobar (1998a), Statistical Methods for Reliability Data. John Wiley and Sons, Inc.
• W. Q. Meeker and L. A. Escobar (1998b), Pitfalls of Accelerated Testing, IEEE Transactions on Reliability R-47, 114-118.
• W. Nelson (1990), Accelerated Testing: Statistical Models, Test Plans, and Data Analyses. New York: John Wiley & Sons, Inc.
• G. Taguchi (1987), System of Experimental Design. White Plains, NY: Unipub/Kraus International Publications.
• T. S. Tseng, M. Hamada, and C. H. Chiao (1995), Using degradation data from a factorial experiment to improve fluorescent lamp reliability, Journal of Quality Technology, 363-369.