Quantifying the benefit of prognostic information in maintenance decision making

A. Van Horenbeek* and L. Pintelon**

*Celestijnenlaan 300A, 3001 Heverlee, Belgium, Catholic University Leuven, Adriaan.vanhorenbeek@cib.kuleuven.be
**Celestijnenlaan 300A, 3001 Heverlee, Belgium, Catholic University Leuven, Liliane.pintelon@cib.kuleuven.be

Abstract. Many models and methodologies to predict the remaining useful life (RUL) of a component or system are currently being investigated. However, decision making based on these RUL predictions is still an underexplored area in maintenance management. The objective of this paper is to quantify the added value of this prognostic information. This is done by constructing a stochastic discrete-event simulation model that optimizes the scheduling of maintenance actions based on prognostic information on the different components. Cost and availability are the objectives of this optimization model. The added value of the prognostic information is determined by comparing the prognostic maintenance policy to four conventional maintenance policies: corrective maintenance, preventive maintenance, offline condition-based maintenance and online condition-based maintenance. The benefit of prognostic information and the stochastic discrete-event simulation model are validated by a real-life case study on bearings of manufacturing equipment. Because more than one machine is considered, the case study takes a plant-level approach.

1. Introduction

Condition-based maintenance is a well-studied field in maintenance management. Many models in the literature indicate that a condition-based maintenance policy is capable of reducing cost, increasing productivity and maintaining high equipment reliability and availability while at the same time ensuring a higher safety level. Marseguerra et al.
(2002) use Monte Carlo simulation and genetic algorithms to determine the optimal degradation level beyond which a preventive maintenance intervention should be taken, optimizing profit and availability. Barata et al. (2002) take a multi-component simulation modeling approach to find the optimal degradation threshold for performing preventive maintenance actions. Liao et al. (2006) introduce a condition-based availability limit policy that achieves the maximum availability of a system by optimally scheduling maintenance actions. Other papers not only try to find the optimal degradation threshold, but at the same time optimize the inspection schedule or policy (Grall et al., 2002).

Although condition-based maintenance takes advantage of the known state of components, setting a degradation threshold beyond which preventive maintenance is carried out is not always an optimal solution compared to predictive maintenance. Predictive maintenance uses both current and prognostic information, such as the remaining useful lifetime of components, to optimally schedule maintenance actions, while condition-based maintenance only uses current component state information. The benefit of also using information about future degradation, rather than only currently observed information, is illustrated in different publications (Camci, 2009; Yang et al., 2008). Proactive maintenance decisions can be made based on the prognostic information, which results in a dynamic maintenance schedule.

The objective of this paper is to quantify the benefit of prognostic information in maintenance decision making by performing a real-life case study on manufacturing equipment. A comparison between different maintenance policies is made and the optimal maintenance policy is determined. The maintenance policies considered are corrective, preventive, offline condition-based, online condition-based and prognostic maintenance.
The optimization is done by using stochastic simulation and genetic algorithms. For each maintenance policy a multi-objective optimization is performed by considering cost as well as availability (or downtime) as the maintenance objectives. In many cases both cost and availability objectives are combined into one cost objective function by expressing availability in terms of downtime cost. The reason to consider these as two separate objectives is that expressing availability in terms of value to the company is often difficult. This value can be, for example, increased production output or market share. Moreover, the value of increased availability to the company changes with the business environment. In times of economic prosperity, when availability must be as high as possible, an increase in availability is more valuable to a company than in an economic downturn with fewer orders. In the latter case steering on cost is more important, because increased availability does not contribute to a higher profit. The advantage of performing multi-objective rather than single-objective optimization is that a Pareto set of optimal solutions is found. Based on this set of optimal solutions, the optimal maintenance schedule can be determined according to the business environment and circumstances at the time of decision making.

The structure of this paper is as follows. Section 2 describes the discrete-event simulation together with the different maintenance policies that are considered and how they are handled. In section 3 the case study is introduced, while section 4 summarizes the simulation results for the different maintenance policies. Conclusions and future work are discussed in section 5.

2. Simulation of maintenance policies

Markov models have been widely used in condition-based maintenance to model the state of a system. The advantage of using Markov models is that analytic results to the maintenance problem can be found.
However, using Markov models also has some disadvantages. Many simplifying assumptions are made and the probabilities of the different states in a Markov process are difficult to find. The more realistic and complex the modeled systems get, the more difficult and cumbersome it is to describe the system by analytic models. This is the main reason to resort to simulation tools to model the manufacturing equipment in this paper. Furthermore, simulation does not require any assumptions on the character of the degradation process (Yang et al., 2008).

In this paper a discrete-event Monte Carlo simulation is used to model the dynamic behaviour of the manufacturing equipment over a finite horizon. Five different maintenance policies are simulated: corrective, preventive, offline condition-based, online condition-based and prognostic maintenance. The maintenance policies are evaluated based on the expected value of the distribution of cost and availability. By doing so the optimal maintenance policy is determined and the added value of prognostic information is quantified.

Under corrective maintenance, maintenance is performed only when a component fails and causes a machine breakdown.

Preventive maintenance is a fixed maintenance schedule in which maintenance is performed at regular time intervals. When a component breaks down before a scheduled maintenance action, corrective maintenance is performed on the component. For the preventive maintenance policy, the time interval between two consecutive maintenance actions on the machine is optimized.

Offline condition-based maintenance uses inspections (e.g. vibration measurements) to determine the current state of a machine or component. Inspections are carried out at regular time intervals. When the deterioration level of a component, revealed during the inspection, exceeds a well-defined threshold, preventive maintenance is carried out.
If the deterioration level is below the threshold level, the next inspection is scheduled. Corrective maintenance is performed when a component breaks down between two scheduled inspections, the deterioration level having been below the threshold level at the first of these inspections. Both the time between two consecutive inspections and the deterioration level beyond which preventive maintenance actions are taken are optimized in the simulation model.

Online condition-based maintenance applies online monitoring of all considered components in the machine. In this way the state and deterioration level of each component are continuously known. When the deterioration level exceeds a set deterioration threshold level, a preventive maintenance action is performed. When the online monitoring is unable to detect an incipient failure, corrective maintenance is executed at breakdown. For online condition-based maintenance the preventive maintenance action threshold is optimized in the stochastic simulation model.

Prognostic or predictive maintenance takes advantage of the available predictions of remaining useful lifetime for components. Based on the remaining useful lifetime distributions for all components, an optimal maintenance schedule can be found that optimizes plant-wide maintenance operations. A Genetic Algorithm (GA) (Holland, 1962) is used to find this optimal maintenance schedule. A GA is a heuristic that mimics the process of natural evolution and survival of the fittest, based on crossover and mutation applied to an initial population. The different maintenance schedules are represented by a chromosome defined as an array of binary numbers, where one represents scheduled maintenance at time t and zero represents no scheduled maintenance at time t. A number of iterations, referred to as generations of the GA, are performed to improve the objective or fitness function(s).
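The binary chromosome encoding described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 20-period horizon, the 0.05 mutation rate and the use of bit-flip mutation are assumptions made purely for illustration.

```python
import random

def random_schedule(horizon, rng):
    """Chromosome: bit t is 1 if maintenance is scheduled in period t, else 0."""
    return [rng.randint(0, 1) for _ in range(horizon)]

def scattered_crossover(parent_a, parent_b, rng):
    """Take each gene from parent_a or parent_b according to a random binary mask."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]

def bit_flip_mutation(chromosome, rate, rng):
    """Flip each bit independently with probability `rate` (assumed operator)."""
    return [1 - g if rng.random() < rate else g for g in chromosome]

def scheduled_periods(chromosome):
    """Decode a chromosome into the list of periods with scheduled maintenance."""
    return [t for t, g in enumerate(chromosome) if g == 1]

# Hypothetical usage: two random parents over a 20-period horizon.
rng = random.Random(42)
parent_a = random_schedule(20, rng)
parent_b = random_schedule(20, rng)
child = bit_flip_mutation(scattered_crossover(parent_a, parent_b, rng), 0.05, rng)
print(scheduled_periods(child))
```

In a full GA these operators would be applied to a whole population in every generation, with the simulated cost and downtime of each decoded schedule serving as the fitness values.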
The choice for GAs in this paper is based on two major advantages of the heuristic. Firstly, they handle multi-objective optimization problems in a fast and accurate way. Secondly, no analytically tractable objective function is needed to solve the optimization problem. By comparing both cost and availability objectives for all optimal maintenance policies, the benefit of prognostic information in maintenance decision making is quantified.

3. Case study

The discrete-event simulation is applied to a real-life case study on manufacturing equipment to quantify the added value of prognostic maintenance. The focus is on one specific subassembly of each machine, which consists of two roller bearings with corresponding bearing housings and a driving axle. When one bearing breaks down, the other one is replaced at the same time. Components are always replaced and are restored to the as-good-as-new state after maintenance. Three different maintenance or replacement scenarios exist, both for preventive and corrective maintenance. In the first maintenance scenario only replacement of the bearings is necessary, in the second maintenance scenario replacement of both bearings and bearing housings is required, while in the third maintenance scenario replacement of the whole subassembly is necessary. All maintenance scenarios are initiated by preventive replacement or failure of one of the bearings. For this reason a failure probability distribution is fitted to breakdown data of the bearings. The fitted Weibull distribution with its parameters and the 95% confidence interval on the parameters is shown in Figure 1. These 95% confidence bounds are used to simulate the failure behaviour of the bearings. It is assumed that this Weibull reliability curve correctly reflects the evolution in time of the monitored physical parameters (e.g.
vibration measurement on bearings) and the predicted remaining useful lifetime based on these measured parameters for the condition-based and prognostic maintenance policies.

[Figure 1: cumulative probability versus time to failure, showing the failure data, the fitted Weibull distribution W(10.0352, 2.3571) and the confidence bounds WL(9.2023, 2.0363) and WU(10.9436, 2.7286).]

Figure 1. Weibull distribution fitted to the failure data of the bearings with scale parameter η = 10.0352 and shape parameter β = 2.3571. WL and WU are respectively the Weibull lower bound and Weibull upper bound that form the 95% confidence interval on both scale and shape parameters.

Replacement of the bearing housings and axle is modeled by a probability of having one of the three maintenance scenarios when a bearing fails or a preventive maintenance action is performed. The probabilities of the maintenance scenarios differ between preventive and corrective maintenance. When a bearing breaks down, the probability of having to replace the bearing housings and axle is higher than when a preventive maintenance action is performed. The maintenance scenarios are sampled from a multinomial distribution:

$$ f(x; n, p) = \frac{n!}{x_1! \cdots x_k!} \, p_1^{x_1} \cdots p_k^{x_k}, \quad \text{when } \sum_{i=1}^{k} x_i = n. \qquad (1) $$

Where $x = (x_1, \ldots, x_k)$ gives the number of each of k outcomes in n trials of a process with fixed probabilities $p = (p_1, \ldots, p_k)$ of individual outcomes in any one trial. The vector p has non-negative components that sum to one. The vector p defines the probabilities of the replacement or failure scenarios for both preventive replacement actions ($p_p = (p_1 = 0.95, p_2 = 0.03, p_3 = 0.02)$) and corrective maintenance actions ($p_c = (p_1 = 0.1, p_2 = 0.15, p_3 = 0.75)$). This means that for preventive maintenance 95% of the actions consist of only replacing the bearings, 3% consist of replacing bearings and bearing housings, and in 2% of the cases a replacement of the entire subassembly is necessary.
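As a rough illustration of how these two random inputs could be drawn in a simulation, the following Python sketch samples a bearing time to failure from the fitted Weibull distribution and a maintenance scenario from the scenario probabilities (a multinomial draw with n = 1). The function and constant names are hypothetical, and the sketch uses only the point estimates of the Weibull parameters rather than the confidence bounds.

```python
import random

ETA, BETA = 10.0352, 2.3571          # fitted Weibull scale and shape (Figure 1)
P_PREVENTIVE = (0.95, 0.03, 0.02)    # scenario probabilities, preventive action
P_CORRECTIVE = (0.10, 0.15, 0.75)    # scenario probabilities, corrective action
SCENARIOS = (1, 2, 3)  # 1: bearings, 2: bearings + housings, 3: whole subassembly

def sample_time_to_failure(rng):
    """Draw one time to failure; weibullvariate takes (scale, shape)."""
    return rng.weibullvariate(ETA, BETA)

def sample_scenario(corrective, rng):
    """Draw one maintenance scenario, i.e. a single multinomial trial."""
    weights = P_CORRECTIVE if corrective else P_PREVENTIVE
    return rng.choices(SCENARIOS, weights=weights, k=1)[0]

# Hypothetical usage: one failure time and the scenario of the resulting repair.
rng = random.Random(1)
print(sample_time_to_failure(rng), sample_scenario(True, rng))
```

Drawing the scenario separately for each maintenance action keeps the preventive and corrective probability vectors independent of the failure-time model.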
The same logic holds for the corrective maintenance actions, except that the probabilities of the failure scenarios change when a failure of one of the bearings happens. Failure of a bearing will induce secondary damage to other parts of the machine, for example the cover, with a probability of 0.8. A summary of the other data and parameters used in the simulation is provided in Table 1.

Table 1. Parameters and data used in the discrete-event simulation for all maintenance policies.

Duration parameter   Distribution   Min. time (h)   Mean time (h)   Max. time (h)
Inspection           Triangular     0.4             0.5             0.6
Waiting              Triangular     23              24              25
Replacement          Triangular     3.5             4               4.5
Repair               Triangular     3.5             4               4.5
Installation         Triangular     3.5             4               4.5
Secondary damage     Triangular     0.5             1               1.5

Cost parameter       Cost (€)
Bearing              302.5
Bearing house        232.5
Shaft                1675
Transportation       120
Secondary damage     300
Working              70 €/h

The two objectives considered when optimizing the different maintenance policies are expected cost (€) and downtime (weeks), which are defined as:

$$ E[\mathrm{TotalCost}] = \mathrm{Cost}_p + \mathrm{Cost}_c + \mathrm{Cost}_{insp}. \qquad (2) $$

$$ E[\mathrm{TotalDowntime}] = \mathrm{Downtime}_p + \mathrm{Downtime}_c + \mathrm{Downtime}_{insp}. \qquad (3) $$

Where $\mathrm{Cost}_p$ is preventive maintenance cost, $\mathrm{Cost}_c$ is corrective maintenance cost, $\mathrm{Cost}_{insp}$ is cost of inspection, $\mathrm{Downtime}_p$ is downtime due to preventive maintenance, $\mathrm{Downtime}_c$ is downtime due to corrective maintenance and $\mathrm{Downtime}_{insp}$ is downtime due to inspection. The cost parameters for preventive maintenance, corrective maintenance and inspection are defined as follows:

$$ \mathrm{Cost}_p = T_p \, \mathrm{Cost}_W + \sum_{i=1}^{3} N_{pi} \, \mathrm{Cost}_{pi}. \qquad (4) $$

$$ \mathrm{Cost}_c = T_c \, \mathrm{Cost}_W + \sum_{i=1}^{3} N_{ci} \, \mathrm{Cost}_{ci} + N_{SD} \, \mathrm{Cost}_{SD}. \qquad (5) $$

$$ \mathrm{Cost}_{insp} = T_{insp} \, \mathrm{Cost}_W. \qquad (6) $$

Where $T_p$, $T_c$ and $T_{insp}$ are respectively the total preventive maintenance, corrective maintenance and inspection time during the simulation. $N_{pi}$ is the number of preventive maintenance actions for replacement scenario i. $N_{ci}$ is the number of corrective maintenance actions for failure scenario i. $N_{SD}$ is the number of times secondary damage occurs.
$\mathrm{Cost}_W$ is the cost of working or personnel cost and $\mathrm{Cost}_{SD}$ is the cost of secondary damage. Finally, $\mathrm{Cost}_{pi}$ and $\mathrm{Cost}_{ci}$ are the cost for a preventive action of replacement scenario i and the cost for a corrective action of failure scenario i. When simulating over several years, discounting of costs can have a big influence on the final results of the simulation (van der Weide et al., 2010). For this reason costs are discounted to their present value by using the following formula:

$$ \mathrm{Cost}_{Discounted} = \sum_{j=0}^{k} \frac{\mathrm{Cost}_j}{(1 + d)^j}. \qquad (7) $$

Where k is the number of years simulated, $\mathrm{Cost}_j$ is the total cost in year j and d is the discount rate, which equals the company's Weighted Average Cost of Capital (WACC) of 10%.

4. Results

For all maintenance policies the discrete-event simulation is run over a finite time horizon of 200 weeks with 5000 replications. The number of individuals in each population of the GA is set to 700 and the maximum number of generations is 200. Scattered crossover is selected as the crossover function, with a crossover fraction of 0.8. This crossover fraction specifies the fraction of individuals in the next generation that are created by crossover. Mutation produces the remaining individuals in the next generation by using a Gaussian mutation function. A tournament selection function is used as the parent selection method. The objective functions are those defined in formulas (2) and (3) of section 3.

4.1 Corrective and preventive maintenance

For preventive maintenance the time between two consecutive preventive maintenance actions is optimized; this is in fact an optimization of the block-based preventive maintenance policy. Based on optimization of the cost and downtime functions, a fixed schedule of preventive actions can be determined. The optimal time between two preventive maintenance actions is 7 weeks when the total expected cost (65757.11 €) is minimized and 5 weeks when the total expected downtime (3.18 weeks) is minimized.
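The cost definitions in formulas (4) to (7) of section 3 can be written out as a short Python sketch. The working cost, secondary damage cost and discount rate are taken from the text and Table 1; the per-scenario action costs, maintenance times and action counts below are hypothetical placeholders, since they would come out of the simulation run.

```python
def preventive_cost(t_p, cost_w, n_p, cost_p):
    """Formula (4): labour cost plus replacement cost over the three scenarios."""
    return t_p * cost_w + sum(n * c for n, c in zip(n_p, cost_p))

def corrective_cost(t_c, cost_w, n_c, cost_c, n_sd, cost_sd):
    """Formula (5): as formula (4), plus the cost of secondary damage."""
    return t_c * cost_w + sum(n * c for n, c in zip(n_c, cost_c)) + n_sd * cost_sd

def inspection_cost(t_insp, cost_w):
    """Formula (6): inspection time multiplied by the working cost."""
    return t_insp * cost_w

def discounted_cost(yearly_costs, d):
    """Formula (7): present value of the costs of years j = 0..k at rate d."""
    return sum(c / (1.0 + d) ** j for j, c in enumerate(yearly_costs))

COST_W = 70.0                             # working cost in EUR/h (Table 1)
COST_SD = 300.0                           # secondary damage cost in EUR (Table 1)
WACC = 0.10                               # discount rate d
SCENARIO_COST = (500.0, 1000.0, 2500.0)   # hypothetical cost per scenario

# Hypothetical yearly figures: hours worked and number of actions per scenario.
year_total = (
    preventive_cost(12.0, COST_W, (3, 0, 0), SCENARIO_COST)
    + corrective_cost(4.0, COST_W, (0, 0, 1), SCENARIO_COST, 1, COST_SD)
    + inspection_cost(2.0, COST_W)
)
print(discounted_cost([year_total, year_total], WACC))
```

Keeping the three cost terms as separate functions mirrors the structure of formula (2), so the same decomposition could be reused for the downtime objective in formula (3).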
4.2 Offline and online condition-based maintenance

The deterioration threshold beyond which preventive maintenance is triggered, together with the inspection schedule, are the two parameters that are optimized for the offline condition-based maintenance policy. The isocost and isodowntime curves can be seen in Figure 3.

[Figure 3: two contour plots, expected total cost per machine and expected total downtime per machine, over threshold level and time between inspections (weeks).]

Figure 3. Isocost and isodowntime curves for offline condition-based maintenance.

The deterioration of the components is monitored continuously, which makes the deterioration threshold beyond which a preventive maintenance action is taken the only parameter to optimize in the online condition-based maintenance policy. The results are shown in Figure 4.

[Figure 4: two panels for online CBM, expected total cost per machine and expected total downtime per machine, each as a function of threshold level.]

Figure 4. Expected total cost and downtime for online condition-based maintenance.
4.3 Prognostic maintenance

Prognostic maintenance makes use of the predictions of the remaining useful lifetime of components, which makes it possible to react to the real deterioration of each component in different machines. The last population of the GA together with the Pareto optimal front is given in Figure 5.

[Figure 5: two panels, "Final population of GA" and "Pareto optimal front", each plotting expected total downtime per machine (weeks) against expected total cost per machine; the Pareto front is annotated with the points "Steering on cost" and "Steering on downtime/availability".]

Figure 5. Expected cost and downtime for prognostic maintenance using GA.

4.4 Comparison of all maintenance policies

A comparison between all considered optimal maintenance policies can be made based on the objectives of total cost and downtime (Table 2). This comparison makes clear that the added value of prognostic information is substantial: relative to corrective maintenance, prognostic maintenance lowers the expected cost by 16.11% and the expected downtime by 39.40%. The impact on downtime reduction is therefore particularly large in this specific case. Moreover, the analysis in the previous sections makes clear that a different optimal maintenance policy is found depending on whether cost or downtime is the objective. The business environment at the time of decision making defines the value of availability to a company. Considering cost and downtime as two separate maintenance objectives makes dynamic maintenance scheduling possible based on the value of availability at the time of decision making. This approach not only optimizes maintenance over time, but optimizes maintenance at every time instant while taking into account the business environment of the company.

Table 2. Comparison of maintenance policies based on expected cost and downtime per machine.
Maintenance policy        Cost (€)                                Downtime (weeks)
                          Mean       SD        Improvement (%)    Mean     SD       Improvement (%)
Corrective maintenance    69266.64   7640.30   -                  4.2352   0.4084   -
Preventive maintenance    65757.11   9337.30   5.07               3.1888   0.5390   24.71
Offline CBM               61017.70   8258.50   11.91              3.0646   0.5486   27.64
Online CBM                59607.64   8028.60   13.94              3.0104   0.5078   28.92
Prognostic maintenance    58109.22   2766.60   16.11              2.5664   0.1453   39.40

5. Conclusions and future work

A real-life case study is performed on manufacturing equipment to quantify the benefit of prognostic information in maintenance decision making. It shows that prognostic information substantially reduces total cost and downtime in comparison to the other investigated maintenance policies. Moreover, the simulation makes clear that the optimal maintenance policy differs depending on whether cost or downtime is the objective. According to the business environment and circumstances at the time of decision making, the optimal maintenance policy can be determined based on the presented multi-objective optimization model. Future work will incorporate more components of the machine into the analysis, together with the effect of imperfect maintenance and inspections, and constraints on spare parts and manpower.

References

Barata, J., Soares, C. G., Marseguerra, M. and Zio, E. (2002) Simulation modelling of repairable multicomponent deteriorating systems for 'on condition' maintenance optimisation. Reliability Engineering & System Safety, 76, 255-264.
Camci, F. (2009) System maintenance scheduling with prognostics information using genetic algorithm. IEEE Transactions on Reliability, 58, 539-552.
Grall, A., Dieulle, L., Berenguer, C. and Roussignol, M. (2002) Continuous-time predictive-maintenance scheduling for a deteriorating system. IEEE Transactions on Reliability, 51, 141-150.
Holland, J. H. (1962) Adaptation in natural and artificial systems. University of Michigan: Ann Arbor, MIT Press.
Liao, H., Elsayed, E. A.
and Chan, L.-Y. (2006) Maintenance of continuously monitored degrading systems. European Journal of Operational Research, 175, 821-835.
Marseguerra, M., Zio, E. and Podofillini, L. (2002) Condition-based maintenance optimization by means of genetic algorithms and Monte Carlo simulation. Reliability Engineering & System Safety, 77, 151-165.
Van Der Weide, J. A. M., Pandey, M. D. and Van Noortwijk, J. M. (2010) Discounted cost model for condition-based maintenance optimization. Reliability Engineering & System Safety, 95, 236-246.
Yang, Z. M., Djurdjanovic, D. and Ni, J. (2008) Maintenance scheduling in manufacturing systems based on predicted machine degradation. Journal of Intelligent Manufacturing, 19, 87-98.