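System Reliability and Availability Estimation Under Uncertainty

Tongdan Jin, Ph.D.
Ingram School of Engineering
Texas State University, San Marcos, TX
tj17@txstate.edu
4/11/2012

Contents

System Reliability Estimation
* Variance of reliability estimate
* Series and parallel systems
Operational Availability
* Performance based maintenance/logistics/contracting
* Reliability growth or spare parts stocking?
* A unified availability model
Conclusion

Topic One
Modeling System Reliability With Uncertain Estimates

Two Components Having the Same Reliability?

Test Plan 1: testing 100 hours, sample n = 10, survivals = 9
$\hat{r}_1 = 9/10 = 0.9$

Test Plan 2: testing 100 hours, sample n = 20, survivals = 18
$\hat{r}_2 = 18/20 = 0.9$

Which component is more reliable?

Risk-Averse vs. Risk-Neutral Design

[Figure: probability density functions $f(\hat{R})$ for system 1 and system 2, annotated with $E[\hat{R}_1]$, $E[\hat{R}_2]$, $Var(\hat{R}_1)$, and $Var(\hat{R}_2)$]

• $f(\hat{R})$ = probability density function of the reliability estimate $\hat{R}$
• A risk-neutral design would always choose system 1
• A risk-averse design might choose system 2

Variance of Reliability Estimate

$\widehat{\mathrm{var}}(\hat{r}) = \dfrac{\hat{r}(1-\hat{r})}{n-1}$

Test Plan 1: testing 100 hours, sample n = 10, survivals = 9
$\hat{r}_1 = 9/10 = 0.9$, $\widehat{\mathrm{var}}(\hat{r}_1) = \dfrac{0.9(1-0.9)}{10-1} = 0.01$

Test Plan 2: testing 100 hours, sample n = 20, survivals = 18
$\hat{r}_2 = 18/20 = 0.9$, $\widehat{\mathrm{var}}(\hat{r}_2) = \dfrac{0.9(1-0.9)}{20-1} = 0.0047$

Which component is more reliable? Both point estimates are 0.9, but the estimate from the larger sample has less than half the variance.

Variance vs. Sample Size

$\hat{r} = \dfrac{x}{n}$, $\widehat{\mathrm{var}}(\hat{r}) = \dfrac{\hat{r}(1-\hat{r})}{n-1}$, where n = sample size and x = survivals

[Figure: Variance of Component Reliability Estimate; variance (0.000 to 0.200) versus sample size (0 to 40), with curves for r = 0.8 and r = 0.9]

Reliability Variance of Series Systems

Two components in series (Component 1, Component 2):
$\hat{R}_s = \hat{r}_1\hat{r}_2$
$\widehat{\mathrm{var}}(\hat{R}_s) = \hat{r}_1^2\hat{r}_2^2 - (\hat{r}_1^2 - \widehat{\mathrm{var}}(\hat{r}_1))(\hat{r}_2^2 - \widehat{\mathrm{var}}(\hat{r}_2))$

k components in series:
$\hat{R}_s = \prod_{i=1}^{k}\hat{r}_i$
$\widehat{\mathrm{var}}(\hat{R}_s) = \prod_{i=1}^{k}\hat{r}_i^2 - \prod_{i=1}^{k}(\hat{r}_i^2 - \widehat{\mathrm{var}}(\hat{r}_i))$

Numerical Example (series system)

Testing 100 hours
Component 1: $n_1 = 10$, $x_1 = 9$, $\hat{r}_1 = 0.9$, $\widehat{\mathrm{var}}(\hat{r}_1) = 0.01$
Component 2: $n_2 = 20$, $x_2 = 17$, $\hat{r}_2 = 0.85$, $\widehat{\mathrm{var}}(\hat{r}_2) = 0.0067$

$\hat{R}_s = (0.9)(0.85) = 0.765$
$\widehat{\mathrm{var}}(\hat{R}_s) = (0.9^2 \times 0.85^2) - (0.9^2 - 0.01)(0.85^2 - 0.0067) = 0.0126$

Reliability Confidence Estimate (series system)

Assuming $\hat{R}_s$ is normally distributed, the lower bound is $E[\hat{R}_s] - Z\sqrt{\widehat{\mathrm{var}}(\hat{R}_s)}$, with $E[\hat{R}_s] = 0.765$ and $\widehat{\mathrm{var}}(\hat{R}_s) = 0.0126$.

$\hat{R}_s$ lower bound $= 0.765 - (1.28)\sqrt{0.0126} = 0.62$, with 90% confidence
$\hat{R}_s$ lower bound $= 0.765 - (1.64)\sqrt{0.0126} = 0.58$, with 95% confidence

Reliability Variance of Parallel System

Two components in parallel (Component 1, Component 2):
$\hat{R}_p = 1 - (1-\hat{r}_1)(1-\hat{r}_2) = 1 - \hat{q}_1\hat{q}_2$, where $\hat{q}_i = 1 - \hat{r}_i$ for i = 1, 2

Estimates for Reliability and Unreliability

Series system (reliability): $\hat{r} = \dfrac{x}{n}$, $\widehat{\mathrm{var}}(\hat{r}) = \dfrac{\hat{r}(1-\hat{r})}{n-1}$
Parallel system (unreliability): $\hat{q} = \dfrac{n-x}{n}$, $\widehat{\mathrm{var}}(\hat{q}) = \dfrac{(1-\hat{q})\hat{q}}{n-1} = \dfrac{\hat{r}(1-\hat{r})}{n-1} = \widehat{\mathrm{var}}(\hat{r})$
where n = sample size and x = survivals

Variance of Parallel System

k components in parallel, with unreliabilities $\hat{q}_1, \hat{q}_2, \ldots, \hat{q}_k$:
$\hat{R}_p = 1 - \prod_{i=1}^{k}(1-\hat{r}_i) = 1 - \prod_{i=1}^{k}\hat{q}_i$
$\widehat{\mathrm{var}}(\hat{R}_p) = \prod_{i=1}^{k}\hat{q}_i^2 - \prod_{i=1}^{k}(\hat{q}_i^2 - \widehat{\mathrm{var}}(\hat{q}_i))$, where $\hat{q}_i = 1 - \hat{r}_i$

Numerical Example (parallel system)

Testing 100 hours
Component 1: $n_1 = 10$, $x_1 = 9$, $\hat{r}_1 = 0.9$, $\hat{q}_1 = 1 - \hat{r}_1 = 0.1$, $\widehat{\mathrm{var}}(\hat{q}_1) = 0.01$
Component 2: $n_2 = 20$, $x_2 = 17$, $\hat{r}_2 = 0.85$, $\hat{q}_2 = 1 - \hat{r}_2 = 0.15$, $\widehat{\mathrm{var}}(\hat{q}_2) = 0.0067$

$\hat{R}_p = 1 - (0.1)(0.15) = 0.985$
$\widehat{\mathrm{var}}(\hat{R}_p) = (0.1^2 \times 0.15^2) - (0.1^2 - 0.01)(0.15^2 - 0.0067) = 0.000225$

Reliability Confidence Estimate (parallel system)

Assuming $\hat{R}_p$ is normally distributed, the lower bound is $E[\hat{R}_p] - Z\sqrt{\widehat{\mathrm{var}}(\hat{R}_p)}$, with $E[\hat{R}_p] = 0.985$ and $\widehat{\mathrm{var}}(\hat{R}_p) = 0.000225$.

$\hat{R}_p$ lower bound $= 0.985 - (1.28)\sqrt{0.000225} = 0.966$, with 90% confidence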
$\hat{R}_p$ lower bound $= 0.985 - (1.64)\sqrt{0.000225} = 0.960$, with 95% confidence

Series-Parallel Systems

[Figure: series-parallel block diagram with components 1 to 7 and its step-by-step reduction for variance estimation]

Compute r and var(r) over Time

Time (hours)  Sample Size  Failures  Cum. Failures  Reliability  Variance
1             20           0         0              1            0
2             20           0         0              1            0
3             20           0         0              1            0
4             20           1         1              0.95         0.0025
5             20           0         1              0.95         0.0025
6             20           0         1              0.95         0.0025
7             20           1         2              0.9          0.0047
8             20           1         3              0.85         0.0067
9             20           2         5              0.75         0.0099
10            20           1         6              0.7          0.0111

Topic Two
Operational Availability under Performance Based Contract (PBC)

Service Parts Logistics Business

• Represents 8-10% of GDP in the US.
• US airline industry: $45B on MRO in 2008.
• US auto industry: $190B, and $73B for parts, in 2010.
• US DoD: $125B maintenance budget and $70B inventory, with 6,000 suppliers.
• Joint Strike Fighter (F-35): $350B for R/D/P, and $600B for after-production O/M over 30 years.
• EU wind turbine service revenue: €3B in 2011.
• IBM computing/network servers, etc.

Total Ownership Cost Distribution

[Figure: cost ($) across the life-cycle phases Research and Development, Manufacturing, Operation and Support, and Retirement; share labels shown on the chart: 30-40%, 50-60%, 10-20%, and 5% (Retirement)]

PBC aims to lower the cost of ownership while ensuring system performance goals.
Reference: DoD 5000, University of Tennessee

Reliability Allocation and Spare Parts Logistics

Reliability Allocation
[Figure: reliability block diagram with time-dependent component reliabilities $r_1(t), \ldots, r_8(t)$]
Objectives: $\max E[R_{sys}(r(t), n)]$, $\min \mathrm{var}(R_{sys}(r(t), n))$, $\min Cost(r(t), n)$
• Tillman et al. (1977)
• Kuo et al. (1987)
• Chen (1992)
• Jin & Coit (2001)
• Levitin & Lisnianski (2001)
• Coit et al. (2004)
• Ramirez-Marquez et al. (2004)
• Marseguerra, Zio (2005)
• Jin & Ozalp (2009)
• Ramirez-Marquez & Rocco (2010)
• More ...

Spare Parts Logistics
[Figure: multi-echelon spares network with stock levels $s$, $s_{21}$, $s_{22}$, $s_{32}$, ..., $s_{3,n-1}$, $s_{3,n}$ supporting Fleet 1, Fleet 2, ..., Fleet n-1, Fleet n]
Objectives: $\max A_p(s, x)$, $\min Cost(s, x)$, $\min EBO(s, x)$
• Sherbrooke (1968, 1992)
• Muckstadt (1973)
• Graves (1985)
• Lee (1987)
• Cohen et al. (1990)
• Diaz & Fu (1996)
• Alfredsson (1997)
• Zamperini & Freimer (2005)
• Lau & Song (2008)
• Kutanoglu et al. (2009)
• More ...
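The Topic One calculations (component variance, series and parallel system variance, the normal-approximation lower bounds, and the r and var(r) table over time) can be reproduced with a short script. The following is a minimal sketch rather than the author's code: the function names are illustrative, independence between component estimates is assumed, and the Z values 1.28 and 1.64 are the ones used on the slides.

```python
import math


def component_estimate(n, x):
    """Return (r_hat, var_hat) for x survivals out of n tested units."""
    r = x / n
    return r, r * (1.0 - r) / (n - 1)


def product_variance(values, variances):
    """Variance estimate of a product of independent estimates:
    prod(v_i^2) - prod(v_i^2 - var_i)."""
    first = math.prod(v * v for v in values)
    second = math.prod(v * v - s for v, s in zip(values, variances))
    return first - second


def series_system(estimates):
    """estimates: list of (r_hat, var_hat). Returns (R_s, var(R_s))."""
    r = [e[0] for e in estimates]
    v = [e[1] for e in estimates]
    return math.prod(r), product_variance(r, v)


def parallel_system(estimates):
    """Works on unreliabilities q = 1 - r; var(q_hat) equals var(r_hat)."""
    q = [1.0 - e[0] for e in estimates]
    v = [e[1] for e in estimates]
    return 1.0 - math.prod(q), product_variance(q, v)


def lower_bound(estimate, variance, z):
    """One-sided lower bound under the normality assumption."""
    return estimate - z * math.sqrt(variance)


def reliability_over_time(n, failures_per_hour):
    """Rows of (hour, cumulative failures, r_hat, var_hat) for n units on test."""
    rows, cum = [], 0
    for hour, f in enumerate(failures_per_hour, start=1):
        cum += f
        r, v = component_estimate(n, n - cum)
        rows.append((hour, cum, r, v))
    return rows


# Data from the numerical examples: n1=10, x1=9 and n2=20, x2=17.
c1 = component_estimate(10, 9)
c2 = component_estimate(20, 17)
rs, vs = series_system([c1, c2])
rp, vp = parallel_system([c1, c2])

print(f"series:   R={rs:.3f} var={vs:.4f} "
      f"LB90={lower_bound(rs, vs, 1.28):.2f} LB95={lower_bound(rs, vs, 1.64):.2f}")
print(f"parallel: R={rp:.3f} var={vp:.6f} "
      f"LB90={lower_bound(rp, vp, 1.28):.3f} LB95={lower_bound(rp, vp, 1.64):.3f}")

# Failure counts per hour from the r and var(r) table (sample size 20).
rows = reliability_over_time(20, [0, 0, 0, 1, 0, 0, 1, 1, 2, 1])
for hour, cum, r, v in rows:
    print(f"{hour:>2} {cum:>2} {r:.2f} {v:.4f}")
```

Because the parallel case reuses `product_variance` on the unreliabilities, the same helper covers both system types; a mixed series-parallel system could be handled by applying the two functions block by block, again under the independence assumption.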
A 4-Step Performance-Based Contracting

Step 1, Performance Outcome: system readiness, operational reliability, assurance of spare parts supply
Step 2, Performance Measures: system availability, MTBF, MTTR, mean downtime, logistics response time
Step 3, Performance Criteria: minimum availability, maximum failure rate, maximum repair waiting time, maximum cost per unit time
Step 4, Performance Compensation: cost plus incentive fee, cost plus award fee, linear reward, exponential reward

Five Performance Measures by US DoD

• Operational availability (OA)
• Inherent reliability or mission reliability (MR)
• Logistics response time (e.g. MTTR, LDT)
• Cost per unit usage (CUU)
• Logistics footprint

Interactions of Five Performance Measures

$A_o = \dfrac{MTBF}{MTBF + MTTR + MLDT}$

MTBF = Mean Time Between Failures
MTTR = Mean Time to Repair
MLDT = Mean Logistics Delay Time

[Figure: interaction diagram linking Mission Reliability (MR), Operational Availability (OA), Logistics Footprint (LF), Cost Per Unit Usage (CUU), and Logistics Response Time (LRT)]

Evolution of Sustainment/Maintenance Solution

[Figure: total ownership cost across the maintenance strategies CM, PM, CBM, and PBM]

• CM => {Warranty, MBC}
• PM => {MBC}
• CBM => {Warranty, MBC}
• PBM/PBL => {PBC}

PBC aims to lower the cost of ownership while ensuring system performance (e.g. reliability and availability).
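The operational availability relation $A_o = MTBF/(MTBF + MTTR + MLDT)$ listed with the five performance measures can be checked numerically. The sketch below is illustrative only: the function names and the input values (MTBF of 1,000 hours, MTTR of 8 hours, MLDT of 40 hours) are assumptions, not figures from the slides. The second function simply inverts the same formula to show how much reliability is needed to meet a contracted availability target.

```python
def operational_availability(mtbf, mttr, mldt):
    """A_o = MTBF / (MTBF + MTTR + MLDT); all arguments in the same time unit."""
    if mtbf <= 0 or mttr < 0 or mldt < 0:
        raise ValueError("MTBF must be positive; MTTR and MLDT must be non-negative")
    return mtbf / (mtbf + mttr + mldt)


def required_mtbf(target_ao, mttr, mldt):
    """Smallest MTBF that meets target_ao for fixed MTTR and MLDT.

    Obtained by solving A_o = MTBF / (MTBF + D) for MTBF, with D = MTTR + MLDT.
    """
    if not 0.0 < target_ao < 1.0:
        raise ValueError("target availability must be strictly between 0 and 1")
    return target_ao * (mttr + mldt) / (1.0 - target_ao)


# Hypothetical inputs, in hours.
ao = operational_availability(mtbf=1000.0, mttr=8.0, mldt=40.0)
print(f"A_o = {ao:.4f}")

# Halving the logistics delay raises availability without touching reliability.
ao_fast_logistics = operational_availability(mtbf=1000.0, mttr=8.0, mldt=20.0)
print(f"A_o with MLDT halved = {ao_fast_logistics:.4f}")

# MTBF needed for a 0.95 availability criterion with the original delays.
print(f"required MTBF = {required_mtbf(0.95, 8.0, 40.0):.1f} hours")
```

The two levers shown here, raising MTBF or cutting logistics delay, are the same reliability versus logistics trade that a performance-based contract leaves to the supplier.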
Note: in the sustainment evolution above, PBM = performance-based maintenance.

Integrating Manufacturing with Service

[Figure: service network linking the OEM (design and manufacturing), the supplier or OEM, the repair center, local spares stocking, emergency repair, and the customer's system fleet N(t)]

Availability and Variable Fleet Size

$A = \dfrac{MTBF}{MTBF + MDT}$

• Availability
  • MTBF = 100 hours, MDT = 5 hours: $A = \dfrac{100}{100 + 5} \approx 0.95$
  • MTBF = 200 hours, MDT = 10 hours: $A = \dfrac{200}{200 + 10} \approx 0.95$
• Variable Fleet Size
  [Figure 1: System Reliability and Fleet Size (semiconductor industry); MTBF (hours) and cumulative fleet size (system population) versus weeks]
  [Figure: Wind power industry; cumulative installed WT population, 1998 to 2030, with data labels 22,500 and 150,000 and turbine ratings 1.5 MW (2010-2014), 2.0 MW (2015-2020), 2.5 MW (2020-2025), 3.0 MW (2025-2030)]

Performance Measures and Drivers

[Figure: drivers of operational availability ($A_o$), grouped as OEM controlled or customer controlled: inherent reliability (MTBF), maintenance schedule (MTTR), logistics support ($s$, $t_s$, $t_r$; MLDT), and system fleet ($n$)]

A Unified Operational Availability Model

$A_o(\alpha, s, \beta, n, t_r, t_s) = \dfrac{1}{1 + \beta\alpha t_s + \beta\alpha t_r\left[1 - \sum_{x=0}^{s}\dfrac{(n\beta\alpha t_r)^x e^{-n\beta\alpha t_r}}{x!}\right]}$

$\alpha$ = system or subsystem inherent failure rate
$s$ = base stock level
$\beta$ = usage rate, with $0 \le \beta \le 1$
$n$ = installed base size
$t_r$ = repair turn-around time
$t_s$ = time for repair-by-replacement
Ref: Jin & Wang (2011)

Trading Reliability with Spares Stocking (II)

[Figure: Reliability vs. Spare Parts Inventory; number of spare parts (0 to 10) versus MTBF (1/lambda) in hours (2,000 to 16,000), with curves for $A_o = 0.8$ and $A_o = 0.95$; $\beta = 0.5$, $n = 50$, $t_r = 60$ days]
Note: lambda here is the failure rate denoted alpha in the unified model.

Trading Reliability with Spares Stocking (I)

[Figure: Reliability vs. Spare Parts Inventory Level; number of spare parts (0 to 10) versus MTBF (1/lambda) in hours (1,000 to 8,000), with curves for $A_o = 0.8$ and $A_o = 0.95$; $\beta = 0.5$, $n = 50$, $t_r = 30$ days]

Trading Reliability and Spares Stocking (III)

[Figure: Reliability vs. Spare Parts Inventory; number of spare parts (0 to 20) versus MTBF (1/lambda) in hours (0 to 12,000), with curves for $A_o = 0.8$ and $A_o = 0.95$; $\beta = 0.8$, $n = 50$, $t_r = 30$ days]

Key Terminologies

1. Variance of reliability estimate
2. Variance propagation
3. Series/parallel reduction
4. Unbiased estimate
5. Operational availability
6. Mean downtime
7. Mean time to repair
8. Mean logistics delay time
9. Mean time between failures
10. Mean time to failure
11. Performance based logistics/contracting/maintenance
12. Performance measure
13. Performance criteria
14. Material based contracting

Conclusion

1. Variance is a simple yet accurate metric for gauging reliability uncertainty.
2. The reliability variance can be estimated for series, parallel, and mixed series-parallel systems.
3. PBC aims to guarantee system performance while lowering the cost of ownership.
4. PBC incentivizes the OEM/3PL to maximize profit by optimizing development, production, and logistics delivery.

References

Reliability Estimation
1. D. W. Coit, "System reliability confidence intervals for complex systems with estimated component reliability," IEEE Transactions on Reliability, vol. 46, no. 4, 1997, pp. 487-493.
2. J. E. Ramirez-Marquez, W. Jiang, "An improved confidence bounds for system reliability," IEEE Transactions on Reliability, vol. 55, no. 1, 2006, pp. 26-36.
3. E. Borgonovo, "A new uncertainty measure," Reliability Engineering and System Safety, vol. 92, 2007, pp. 771-784.
4. T. Jin, D. Coit, "Unbiased variance estimates for system reliability estimate using block decompositions," IEEE Transactions on Reliability, vol. 57, 2008, pp. 458-464.
5. H. Guo, T. Jin, A. Mettas, "Designing reliability demonstration test for one-shot systems under zero component failures," IEEE Transactions on Reliability, vol. 60, no. 1, 2011, pp. 286-294.

Availability Estimation
1. H.-Z. Huang, H.J. Liu, D.N.P. Murthy, "Optimal reliability, warranty and price for new products," IIE Transactions, vol. 39, no. 8, 2007, pp. 819-827.
2. K. Kang, M. McDonald, "Impact of logistics on readiness and life cycle cost: a design of experiments approach," Proceedings of Winter Simulation Conference, 2010, pp. 1336-1346.
3. S.H. Kim, M.A. Cohen, S. Netessine, "Performance contracting in after-sales service supply chains," Management Science, vol. 53, 2007, pp. 1843-1858.
4. D. Nowicki, U.D. Kumar, H.J. Steudel, D. Verma, "Spares provisioning under performance-based logistics contract: profit-centric approach," The Journal of the Operational Research Society, vol. 59, no. 3, 2008, pp. 342-352.
5. K.B. Öner, G.P. Kiesmüller, G.J. van Houtum, "Optimization of component reliability in the design phase of capital goods," European Journal of Operational Research, vol. 205, no. 3, 2010, pp. 615-624.
6. T. Jin, P. Wang, "Planning performance based contracts considering reliability and uncertain system usage," Journal of the Operational Research Society, 2012 (forthcoming).
7. T. Jin, Y. Tian, "Optimizing reliability and service parts logistics for a time-varying installed base," European Journal of Operational Research, vol. 218, no. 1, 2012, pp. 152-162.

For questions, e-mail tj17@txstate.edu
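Returning to the unified operational availability model of Topic Two, the trade between reliability and spares stocking can be explored with a few lines of code. This is a hedged sketch, not the implementation behind the trade-off figures. It assumes the model reads $A_o = 1/(1 + \beta\alpha t_s + \beta\alpha t_r[1 - \sum_{x=0}^{s}(n\beta\alpha t_r)^x e^{-n\beta\alpha t_r}/x!])$, that all times are in hours with 24-hour days, and that the repair-by-replacement time is 4 hours (a hypothetical value, since the slides do not give $t_s$). The function names are illustrative.

```python
import math


def poisson_cdf(s, mean):
    """P(X <= s) for X ~ Poisson(mean), summed term by term."""
    term = math.exp(-mean)
    total = term
    for x in range(1, s + 1):
        term *= mean / x
        total += term
    return min(total, 1.0)


def unified_availability(alpha, s, beta, n, t_r, t_s):
    """Assumed form of the unified model.

    alpha: inherent failure rate (per hour), s: base stock level,
    beta: usage rate in [0, 1], n: installed base size,
    t_r: repair turn-around time (hours), t_s: repair-by-replacement time (hours).
    """
    rate = beta * alpha                      # failure rate in calendar time
    stockout = 1.0 - poisson_cdf(s, n * rate * t_r)
    return 1.0 / (1.0 + rate * t_s + rate * t_r * stockout)


def min_spares(target, alpha, beta, n, t_r, t_s, s_max=1000):
    """Smallest base stock level s with availability >= target."""
    for s in range(s_max + 1):
        if unified_availability(alpha, s, beta, n, t_r, t_s) >= target:
            return s
    raise ValueError("target availability not reachable within s_max spares")


# beta, n, and t_r = 30 days follow the figure labels; t_s = 4 h is hypothetical.
BETA, N, T_R, T_S = 0.5, 50, 30 * 24.0, 4.0

for mtbf in (2000.0, 4000.0, 8000.0):
    alpha = 1.0 / mtbf
    s80 = min_spares(0.80, alpha, BETA, N, T_R, T_S)
    s95 = min_spares(0.95, alpha, BETA, N, T_R, T_S)
    print(f"MTBF={mtbf:>6.0f} h: spares for Ao>=0.80: {s80}, for Ao>=0.95: {s95}")
```

Sweeping MTBF in this way traces the kind of curve shown in the trade-off figures: as reliability improves, fewer spares are needed to hold the same availability target.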