ASSET MANAGEMENT AND THE USE OF UNAVAILABILITY DATA FOR POWER PLANT SYSTEMS

H.C. Wels, J.L. Brinkman
DEKRA NRG

1 WHY UNAVAILABILITY DATA

Let us assume a utility firm that operates a power plant with one or more production units to produce electricity. The firm is busy with its core business of production, takes care of maintenance, tries to prevent failures from happening, tries to make sure that when components are repaired the failure does not occur again, and in general is improving this process; in short: Asset Management. The power plant may produce more than electricity alone, such as water, steam, and fly and bottom ash. The company may have business contracts with other parties for delivery of goods and services for or from the plant. Such a firm should be interested in gathering unavailability data from its assets to study the effectiveness of its Asset Management processes.

If the unavailability of the firm's plant is sketched as a function of time (see figure 1), the first part of the well-known bathtub curve with teething problems may have been encountered and survived already. Ageing is not felt to be present right now. However, a number of questions may arise with regard to Asset Management related to this plant. It is reasonable to assume that the forced unavailability of the power plant has been documented. However, is a forced unavailability of 5-10 % a low value, a reasonable value given the layout of the plant, or such a high value that a project should be started for betterment of either the technical systems and components or operation and maintenance? Which forced unavailability is feasible in the long term? How long should the firm measure to be sure?

Figure 1 Bottom of the bathtub curve

Regrettably, while neither ageing nor teething problems seem to be present, an event happens on the steam turbine causing a one month outage. Is this High Impact Low Probability (HILP) failure an incident with a correspondingly low probability, or is the frequency of such incidents out of bounds? Which components are generally related to HILP failures, what are the causal factors, and what can management do to influence this?

Due to increased demand the firm decides to build a new plant. What amount of teething problems can be expected in the first part of the bathtub curve as per figure 2? Which components will cause teething problems as a function of technology and possibly manufacturer? What average value and what range of forced unavailability can one expect during this period of teething troubles?

Figure 2 Teething problems

An alternative is to extend the life of the old plant. It still has a good availability record despite its 25 years of operation. Regular investments have been made and the operation and maintenance departments know precisely how to operate the plant. However, for which components are ageing (an increase in failure frequency) and maintainability problems (an increase in repair time) likely to occur? How large is the uncertainty in the bathtub curve of figure 3?

Figure 3 Life extension of a plant

These Asset Management questions can be answered by analysing failure data of relevant classes of power plants. In the next paragraphs, a basic qualitative model for unavailability is sketched. Depending on the questions to answer, this model may need further quantification. It is necessary to take uncertainties into account. The amount of detail should increase stepwise, starting from a portfolio of plants down to the plant itself with its systems and components.
While human factors are not easy to quantify, they should be taken into account at least by recognising their importance.

2 QUALITATIVE AND QUANTITATIVE MODELS FOR UNAVAILABILITY

A simple qualitative model for the forced unavailability of a production plant is shown in the next figure. When the new plant is initiated, an investment is made. This investment is important since the quality of the components as well as the complexity of the plant may result from it. An investment can also be necessary to modify the plant when system betterments are deemed necessary, when a layout change or operational change is necessary, when spares are to be bought or pooled, etc. Humans are involved in the design, the choice of manufacturers, maintenance and operations. All humans make mistakes. Decisions by humans may not necessarily be optimal in retrospect. As shown in figure 5, management provides the boundary conditions for investment, operation and maintenance in a feedback loop as a function of external influences. Therefore these boundary conditions may change in time.

Figure 5 Qualitative model

In most cases, the first period of operation of the plant is in base load with low wear and tear on components. Depending on the state of the art of the components (their failure behaviour may have improved in time), the size of the plant and the anticipated failure behaviour, redundancy may be present within the plant. Decision making that is focused on low installation costs, for instance with regard to the number and capacity of coal mills, may result in non-optimum redundancy and large life cycle costs which will be hard to resolve once the plant is in operation. From a cost perspective, any component preferably is 1 * 100 %. However, if this is a critical component and it fails, the plant is fully out of operation. Redundancy is an option in this case. From an Asset Management point of view, the outage costs and other consequences of failure should be weighed in a plant portfolio perspective against the costs of the redundancy installed. If a 3 * 50 % or a 2 * 100 % redundant system is chosen, failure of a component does not result in production curtailment. Such a system has advantages with regard to maintenance but disadvantages with regard to costs and complexity. Several configurations are feasible, and simple Reliability Block Diagrams (RBD) provide answers to redundancy questions using component failure rates, repair times, etc. from databases with information derived from power plant practice (retour d'expérience).

After having solved teething problems in some components, and preferably having reported these as such in databases so that future plants can benefit, a stable period of operation can be regarded as normal. The unavailability is at the bottom of the bathtub curve. A Reliability Block Diagram (RBD) with average failure data helps to define a reference value for the bottom part of the bathtub curve. During this phase of operation, especially if the number of failures is low, the failure-repair process tends to be approached deterministically: the focus is on predicting the next failure on an engineering basis and, once a failure has occurred, on solving it. Carrying out systematic root cause analysis for all important failures and implementing the lessons learned makes sure that the failure does not repeat itself. This is a tough process with many human factors involved.
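A minimal sketch of such an RBD reference calculation for the redundancy options mentioned above is given below. The failure rate and repair time are assumed generic values, not data for a specific plant, and failures are taken as independent (no common cause), so the numbers only illustrate the type of answer such a block diagram gives.

# Minimal RBD sketch: steady-state unavailability of redundancy options.
# Failure rate and repair time are assumed generic values; failures are
# treated as independent (no common cause modelled).
failure_rate = 1.0e-4    # failures per operating hour (assumption)
repair_time  = 50.0      # mean time to repair in hours (assumption)
# Steady-state unavailability of one component:
q = failure_rate * repair_time / (1.0 + failure_rate * repair_time)
u_full_1x100 = q                    # 1 * 100 %: full outage if the component is down
u_full_2x100 = q ** 2               # 2 * 100 %: full outage only if both are down
u_full_3x50  = q ** 3               # 3 * 50 %: full outage only if all three are down
u_curtail_3x50 = 3 * q ** 2 * (1 - q) + q ** 3   # 3 * 50 %: output below 100 %
print(f"single component unavailability q        : {q:.4f}")
print(f"full outage, 1 * 100 % configuration     : {u_full_1x100:.2e}")
print(f"full outage, 2 * 100 % configuration     : {u_full_2x100:.2e}")
print(f"full outage, 3 * 50 %  configuration     : {u_full_3x50:.2e}")
print(f"curtailment below 100 %, 3 * 50 % config : {u_curtail_3x50:.2e}")

With generic data from practice inserted for each component, results of this kind serve as the reference value for the bottom of the bathtub curve against which the plant's own record can be compared.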
By establishing contacts with sister plants and analysing those plants that have similar or even identical components, failures can be prevented. Modifications, either due to failures that have occurred or are likely to occur, may be necessary. However, also in this phase of the plant life, surprises can occur. Surprises are the result of errors made in the construction phase (for instance a welding error) or of a design that behaves differently from what is expected (a design error or, for instance, the creep behaviour of a new material). HILP failures are generally caused by combinations of faults (design, operating, maintenance).

Over the life of a plant, the operating conditions will change, since more modern plants will in time produce at lower costs. Their higher efficiency will lead to lower fuel costs. The less modern plant is expected to cope with such lower costs. However, a change in operating conditions, such as frequent cycling (starting and stopping) because the plant can no longer compete in base load, together with the finite life of, for instance, a steam header, can result in a condition change that is faster than expected or even in an unexpected failure. Component conditions have to be investigated to find out about their damage state (inspections) to prevent surprises. Inspections, NDE and condition monitoring form a large part of power plant maintenance and have implications for future maintenance activities as well as for the maintenance regimes of components.

It will be clear that the best quantitative models for unavailability should not only model the influence of time itself, but that the type of model will also be a function of (life)time. To model the first years of operation of the plant without teething problems, simple Reliability Block Diagrams (RBD) based on generic data are sufficient for decision making and forecasting. The number of plant specific failures is simply insufficient. After a period of operation, it will be found that the plant is sufficiently different from other plants to warrant plant specific failures being taken into account. Further in the life of the plant, quantitative models should be used that model ageing of the components in the plant. We consider it sufficiently shown from the above that both technical factors and human factors influence the unavailability of a plant. The human factors can both lower and increase the unavailability. It is relatively easy to wreck a boiler. It is not so easy to systematically improve a power plant to become an asset delivering more value to the utility.

3 NEW BUILD, PLANNING AND PLANT CONCEPTS

3.1 Capacity expansion planning

Plant reliability and availability data are inputs to determine the risk of insufficient capacity while matching supply to demand (power, heat, desalinated water). One should be able to determine this risk as a function of the plant portfolio for existing and future plants running as base load, cycling load, hot and cold reserve, etc. The resulting probability of insufficient supply should be used for medium to long term planning and for decisions on scrapping, life extension or mothballing of plant as well as building new plant. The use of plant specific reliability and availability data in combination with generic plant data (for those components that have not yet failed in a plant) contributes to a better analysis of the risk of undersupply with the existing asset portfolio and with candidate new plants.
Figure 6 Step 1 in capacity expansion planning

Step 1 in capacity expansion planning should be an estimate of the demand. Peak demand can be modelled using regression analysis of historical demand as a function of such parameters as population, GDP, end-user estimates, etc. EPRI CU-6855 [1] is a recommended report on how to incorporate uncertainty while producing forecasts that are easy to understand and are therefore accepted within and outside a utility. Since peak demand is uncertain, it pays to analyse the hourly demand patterns for several years. Analyses of peak demand and energy consumption are complementary.

[1] EPRI CU-6855, May 1990, "Uncertainty in Forecasting"

Figure 7 Step 2 in capacity expansion planning

Step 2 should be a comparison between supply and demand. The simplest comparison incorporates a reserve factor. The reserve factor should however take into account maintenance and forced outages as well as grid connections to supply power to areas. The reserve factor is a function of the portfolio of plants (large / small / type of plant). In Billinton and Allan [2] very basic generation capacity models can be found that are easily programmed in Excel. Quality control of the Excel model is simple due to the worked-out examples in this handbook. A more elaborate analysis should use analytical models or large simulation models (for instance PLEXOS) with "energy not delivered" or "loss of load expectation" (LOLE), etc. as input parameters. In any such model the forced unavailability of the portfolio of plants is an important variable, especially when a utility would like to apply a low capacity reserve factor (say 10 %) which is similar in magnitude to plant forced unavailability (say between 1 % and 10 %).

[2] Roy Billinton and Ronald N. Allan, "Reliability Evaluation of Power Systems", 1984, ISBN 0-273-08485-2

Figure 8 Step 3 (and further) in capacity planning

Step 3 in capacity expansion planning should define new plant. The decision should be robust against scenarios for the future and take into account the production costs (fixed and variable) of new and existing plants, the dispatch characteristics, the forced and planned outage characteristics of new plant, and rehabilitation and longer operation (life extension) of old plant. Uncertainty in demand and fuel prices is to be taken into account. Long term demand forecasts (over 30 years) do not need to be very accurate, since the time needed to realise new plants is relatively short compared to such a long period, assuming that decision making is an orderly process using regular demand forecasts. However, it is interesting to see that long term optimum decisions, for instance for a situation with a nuclear portfolio, may affect decision making in the short term. An example is choosing open cycle GTs instead of the more expensive CCGTs, anticipating that with a future nuclear portfolio such CCGTs will not be operated in base load when the NPPs are in operation.

The plant availability data are to be analysed in relation to actual and expected load on a regular basis, taking into account the plant's daily dispatch and cost structure. For capacity expansion planning it is recommended to analyse plant unavailability in a systematic way at least on a yearly basis, and preferably with data gathered on a state basis (time stamp and coding for start, stop for maintenance, start, unplanned outage, etc.).
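Returning to the basic generation capacity models mentioned under step 2, a minimal sketch of such a model is given below. The portfolio (unit sizes and forced outage rates) and the peak demand are assumed values for illustration only; the approach, enumerating unit outage states into a capacity outage probability table, is in the spirit of the worked examples in Billinton and Allan, not a reproduction of them.

from itertools import product

# Assumed illustrative portfolio: (capacity in MW, forced outage rate).
units = [(600, 0.06), (600, 0.06), (400, 0.04), (250, 0.08), (250, 0.08)]
peak_demand = 1500.0   # MW, assumed

# Build a capacity outage probability table by enumerating unit states
# (adequate for a handful of units; recursive convolution scales better).
table = {}   # capacity out of service -> probability
for states in product([0, 1], repeat=len(units)):   # 1 = unit on forced outage
    cap_out = sum(c for (c, q), s in zip(units, states) if s)
    prob = 1.0
    for (c, q), s in zip(units, states):
        prob *= q if s else (1.0 - q)
    table[cap_out] = table.get(cap_out, 0.0) + prob

installed = sum(c for c, _ in units)
# Loss of load probability: available capacity drops below peak demand.
lolp = sum(p for out, p in table.items() if installed - out < peak_demand)
print(f"Installed capacity : {installed:.0f} MW "
      f"(reserve factor {(installed - peak_demand) / peak_demand:.0%})")
print(f"P(capacity < peak demand) = {lolp:.4f}")
print(f"Indicative LOLE ~ {lolp * 365:.1f} days/year "
      f"(if the peak demand applied every day)")

Lowering the forced outage rates in this table immediately lowers the reserve factor needed for the same LOLE, which is exactly why portfolio forced unavailability matters when a low reserve factor is aimed for.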
Some systems and components fail so often that only a few years of information is sufficient to arrive at an acceptable estimate of the reliability parameters. For other systems and components, such as step-up transformers, one really needs generic data from other plants to arrive at an acceptable value for the forced failure rate, average repair time and unavailability. In order to enlarge the population and make the estimates less susceptible to statistical outliers, data for similar plants having the same operational characteristics should be used. These are present in databases for plant unavailability such as the VGB KISSY database, ORAP and the NERC databases. The VGB KISSY database has the large advantage that, for study purposes, raw event data including texts can be made available. When operators and maintenance personnel supply a fair amount of detail (and many still do), these raw data are invaluable.

A plant that does not often run due to its relatively high costs would have a small forced unavailability on a calendar time basis. However, basically it should function without mishap when needed for opportunity price windows. How to calculate the unavailability "when needed" for a cycling or reserve plant using an analytical model is by no means simple. Monte Carlo simulation is a better tool to allow differentiating between unavailable time when the plant is not needed (low costs) and when it is needed (high costs and opportunities missed). Furthermore, one should use energy based dependability data [3] (instead of time based data) to differentiate between plant deratings (reduced power) and full outages, especially for large plants. For all purposes, one should analyse the unavailability (failure rate, MTBF, MTTR, etc.) of components from reasonably comparable units as a function of their age, knowing for which plants due investments in maintenance have been made and, at the other extreme, which plants have been subject to minimal maintenance, which usually occurs at the end of commercial life.

A large external grid for assistance in case of outages and for trading purposes will result in a lower reserve factor at the same LOLE (loss of load expectation). The LOLE rises sharply with a reduced reserve factor [4]. However, a conservative reserve factor could mean large unused investments in plants and therefore has to be determined with care.

Figure 9 Loss of load expectation as a function of reserve factor

[3] In an energy based scheme a plant is unavailable with an outage that affects the power delivered; in a time based scheme a plant is available as long as it is not fully out of operation due to a failure.
[4] In figure 9 LOLE is plotted logarithmically.

3.2 Power plant concepts

The choice between concepts for an individual plant is a choice of type of plant, configuration, fuel type, etc. By having derived quality availability parameters such as failure rate and repair time for plant components from databases, optimum decisions on layout can be made, with large consequences for Asset Management during the life of the plant. Detailed concepts as well as redundancy (for instance 1, 2 or 3 series or parallel components) can be optimised using Block Diagram type analyses in combination with (probabilistic) cost-benefit analysis.
A typical example is the choice for a flexible plant having 2 smaller but mature gas turbines delivering steam through heat recovery boilers to a steam turbine or to a process requiring steam, such as city heating, cogeneration or desalination, versus an alternative such as a single-shaft plant with a larger, more efficient and advanced gas turbine and steam turbine. Steam for city heating can also be provided by other means (for instance separate boilers). Such a single-shaft plant may be cheaper in installation costs but is clearly less flexible when the shaft is rigidly coupled to the steam turbine. The plant with 2 gas turbines will be more expensive with regard to installation and maintenance costs; however, the probability that zero electricity or zero steam is produced is substantially smaller compared to the single-shaft plant.

Figure 10 Simple RBD for a plant with 2 gas turbines and 1 steam turbine

The probability that zero electricity or zero steam is generated is easily calculated using Reliability Block Diagrams such as shown above. For a plant with 2 or more products, the so-called cut sets define when and how often product 1, product 2, the combination of products, etc. is not delivered. The inputs are failure rates (expected number of failures per hour) and repair times for systems and components. Typical similar layout and concept applications are decisions on auxiliary burners, the choice for a separate GT exhaust for open cycle gas turbine operation during boiler outages, optimum redundancy with regard to coal milling equipment, and the classical example of redundant feedwater and condensate pumps.

Even for components and plants that have just come from the design office, one would like to see the availability characteristics compared to similar components, or at least to components that operate in similar plants. This is possible to some extent, since any new item will consist of components with known behaviour, while for new components engineering judgement can be applied by comparing these new components to existing components.

A Pareto analysis and the resulting diagram is a very useful instrument to decide which components are dominant with respect to unavailability. Only this limited number of dominant components has to be analysed further, since they already define 50 % - 80 % of the plant unavailability. For a Pareto type of analysis one needs to pinpoint the system / component involved, analyse its contribution to forced unavailability and sort the components such that the dominant ones become clear. A data gathering system such as NERC or VGB's KISSY allows analysing systematically which components, over a certain time window and for a specific class of plants, define the majority of plant unavailability. Having a data gathering system within the utility set up in a similar way and with the same coding allows plant specific data to be easily uploaded to the generic database as well as easy comparison with the generic database data. Bayesian statistics are an accepted and systematic way of mixing plant specific and generic data.
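A minimal sketch of such a Bayesian combination of generic and plant specific data is given below. The generic mean failure rate, its uncertainty and the plant specific failure count are assumed numbers, and the gamma prior with a Poisson likelihood is used here as one convenient (conjugate) choice, not as the method prescribed by any particular database.

# Gamma-Poisson (conjugate) update of a component failure rate.
# Generic data define the prior; plant specific observations the likelihood.
# All numbers are illustrative assumptions.
gen_mean = 2.0e-5        # generic mean failure rate per hour (assumed)
gen_sd   = 1.0e-5        # uncertainty (standard deviation) on that mean (assumed)
# Gamma prior with this mean and standard deviation:
alpha0 = (gen_mean / gen_sd) ** 2
beta0  = gen_mean / gen_sd ** 2          # beta has the dimension of hours
plant_failures = 3                       # plant specific failures observed (assumed)
plant_hours    = 60_000.0                # plant specific exposure time (assumed)
# The posterior is again a gamma distribution:
alpha1 = alpha0 + plant_failures
beta1  = beta0 + plant_hours
post_mean = alpha1 / beta1
post_sd   = alpha1 ** 0.5 / beta1
print(f"generic (prior) mean rate : {gen_mean:.2e} /h")
print(f"plant-only estimate       : {plant_failures / plant_hours:.2e} /h")
print(f"posterior mean rate       : {post_mean:.2e} /h  (sd {post_sd:.1e})")

The posterior mean lies between the generic value and the plant-only estimate, and moves towards the plant specific data as more operating hours and failures accumulate.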
When carrying out such a Bayesian analysis one should clearly understand:
- the statistical uncertainty in the mean value of the generic data (usually small);
- the statistical uncertainty in the mean value of the plant specific data (large to small, depending on the number of failures);
- the uncertainty in the realisation of unavailability over a period, given a fixed and certain mean value, as a result of the statistical distribution of failure and repair times;
- the uncertainty due to differences between plants and between the realisations of failures for the plants from which the generic data have been derived. Plotting failure data per plant is recommended to assess such differences.

Figure 11 Pareto analysis showing the main contributors to unavailability (energy as well as number of incidents) per KKS function key, for combined cycle units up to 300 MW over 2004 - 2008, derived from VGB's KISSY database. Dominant function keys include MB (gas turbine plant), MA (steam turbine plant), HA (pressure system, feedwater and steam sections), HF (bunker, feeder and pulverizing system), LA (feedwater system), HH (main firing system), LB (steam system), MK (generator plant) and MM (compressor plant), with the unplanned unavailability split into postponable and not postponable contributions.

In order to model a power plant concept bottom-up from its components, one needs failure data defined at least at the 3-letter KKS code level. This is a sufficient level of detail to pinpoint systems and, in some cases, components. An analysis of some raw KISSY flue gas ventilator data is given in figure 12. From the raw event type data, the failure rate per hour and the average unavailable (repair) time are calculated. By analysis of partial outages and full outages, one is able to assess common cause failures. During common cause failures, more than 1 component in a redundant system is out of operation, for example due to an operating error, a maintenance error, effects of external conditions, etc. Please note that the fraction that can be calculated from the figure (49 / (91 + 49) = 35 %, from cells C43 and C83) is much higher than the usual common cause fraction of 10 % found in handbooks. One reason might be that plant management wants to take both components out of operation to start repair, or wants to investigate whether a problem in component 1 is also present in component 2.

Figure 12 Common cause failure analysis

4 REFERENCE VALUES WHEN BUYING POWER PLANTS

In this area of failure data application, the ability of the plant to deliver a certain amount of energy production may be analysed. The availability data can be sourced from manufacturers, contractors, etc. to assess the new power plant. However, these data should be judged for credibility as well as for the definitions applied, applicability to contracts and guarantees, etc. The data can be used as decision information to choose between contracts. While this area of application is not fully Asset Management, the data originate from plants that are being managed and the choices made will directly influence Asset Management during operation of the new plant.
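As an illustration of how such quoted figures can be confronted with contractual guarantees, a minimal sketch is given below. The quoted outage frequency, mean repair time and guarantee level are assumed numbers, not values from any manufacturer, and the simple compound-Poisson downtime model only indicates the spread of realised yearly availability around the quoted mean.

import numpy as np

# Minimal sketch: spread in realised yearly availability around a quoted mean.
# The quoted outage frequency, repair time and guarantee level are assumptions.
failures_per_year = 4.0      # quoted forced outage frequency (assumption)
mean_repair_hours = 30.0     # quoted mean repair duration (assumption)
guarantee = 0.985            # guaranteed yearly availability (assumption)
hours_year, n_years = 8760.0, 20_000

rng = np.random.default_rng(2)
n_failures = rng.poisson(failures_per_year, size=n_years)
downtime = np.array([rng.exponential(mean_repair_hours, size=n).sum()
                     for n in n_failures])
availability = 1.0 - np.minimum(downtime, hours_year) / hours_year

print(f"quoted (mean) availability        : "
      f"{1 - failures_per_year * mean_repair_hours / hours_year:.3%}")
print(f"mean of realised availabilities   : {availability.mean():.3%}")
print(f"10th percentile of realisations   : {np.percentile(availability, 10):.3%}")
print(f"P(yearly availability < guarantee): {(availability < guarantee).mean():.1%}")

Even when the quoted mean sits above the guarantee, the year-to-year spread of realisations can make shortfalls far from rare, which is one of the uncertainties listed above that should be understood before signing.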
Figure 13 Data sourced from manufacturers when contracting a new plant

5 PRODUCTION

Production in Asset Management is the operation process (monitoring, control, service and maintenance) of the plant. By the start of operation of a new plant, estimates for the probability of failures, the average duration of production curtailments, availability, etc. should preferably be known. By setting targets right away, or after having dealt with teething troubles, one is able to measure the performance of the plant against these measures. Targets should be realistic and deviations from these targets should be analysed taking uncertainties into account. Such uncertainties will show when analysing otherwise identical sister plants.

Figure 14 Target for plant X

The data in figure 14 show that the target for unplanned equivalent hours is met after a period of teething troubles. However, the target for planned hours is not met and asset managers should investigate this further. It is obvious that the definitions for planned and unplanned outages are important. There is no universal agreement on which definitions to use. For example, NERC and VGB definitions differ as shown in the following figures. NERC definitions originate from time based definitions; VGB definitions are essentially energy based. A plant is available in NERC definitions during a power curtailment (derating), whereas in VGB definitions the plant is unavailable in such a system state. Comparisons should therefore be made using identical definitions as far as possible.

Figure 15 VGB definitions

Figure 16 NERC definitions

Contracts to operate plants, or taking over the operation of a plant abroad or in one's own country, are increasingly becoming usual. These contracts should contain precisely defined indicators that allow calculating whether or not the contract is fulfilled. Availability indicators can for example be coupled to a penalty-incentive system. In figure 17 an example is given of uncertainty within the firm itself due to several registration systems and interpretations by the plant, by the dispatch department and by corporate. Differences in calculation may also be present. In columns B - D, registration is on the basis of calendar time. The calculations by Reliability Block Diagrams in column E essentially indicate availability "when needed" under semi-base load conditions. With semi-base load it is meant that the time window of need for the plant is large (weekly at least, not daily) and that effects such as a repair time longer than the (daily) need, cycling effects, etc. have not been modelled in the RBD, except for being implicitly present due to data from plants that were cycling.

Figure 17 Different unavailability assessments in 1 company

6 WEAK POINTS IN PLANTS

Weak points in a plant's components have a higher than generic average failure rate or unavailability. Asset Management should better these weak points on a cost-benefit basis. By comparing for instance superheater failure data between plants, it was found that some plants had troublesome constructions and/or were subject to minimal maintenance, which explained the significant difference in failure rate. As is often evident from the data, many plants have a large fraction of failures where the same failure returns in less than a week. The analysis of weak points and of the return probability of failures makes sure that Asset Management budgets for betterment are spent on the dominant systems for unavailability and on the dominant problems.
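A minimal sketch of such a repeat-failure check on raw event data is shown below; the event list, the KKS-style component codes and the 7-day window are invented for illustration only.

from datetime import date, timedelta

# Minimal sketch: fraction of failures followed by a repeat failure on the
# same component within 7 days. The event list and codes are invented.
events = [                       # (component code, date of forced outage)
    ("HAH10", date(2008, 1, 5)), ("HAH10", date(2008, 1, 9)),
    ("LAC20", date(2008, 2, 2)), ("MAV30", date(2008, 3, 14)),
    ("HAH10", date(2008, 6, 1)), ("LAC20", date(2008, 6, 3)),
    ("LAC20", date(2008, 6, 6)), ("MAN10", date(2008, 9, 21)),
]
window = timedelta(days=7)
events.sort(key=lambda e: (e[0], e[1]))  # group per component, chronological
repeats = 0
for (comp_a, day_a), (comp_b, day_b) in zip(events, events[1:]):
    if comp_a == comp_b and (day_b - day_a) <= window:
        repeats += 1
print(f"failures followed by a repeat on the same component within 7 days: "
      f"{repeats} of {len(events)} ({repeats / len(events):.0%})")

Run against a full plant event register, a count of this kind quickly shows where repairs are not removing the root cause and where betterment budget is likely to pay off.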
Figure 18 Weak point: superheater failures for plants no 7 and 19-21

6.1 Modification of existing concepts / plants

The basis for modification can be a large number of failures, betterment of availability or efficiency, inability to acquire spare parts due to obsolescence, etc. The modified plant may have an increased power compared to the original. A trend analysis of the failure rate underpinning the need for modification is easy to carry out when sufficient plant specific data are gathered. The controls in the next figure show an increasing average repair time; the flue gas fans show an increasing number of failures [5].

[5] The difference between the 2 series of data is related to separate failure data registration schemes.

Figure 19 Ageing of components (controls, flue gas fans)

For other critical components such as generators and step-up transformers, where one cannot tolerate High Impact failures, one should derive bathtub curves from generic data, taking expert judgement on the component into account. In the next figures, a bathtub curve is shown for a reasonably sized generator resulting from analysis of a VGB generator database. Clearly both rotor and stator show generic ageing issues. The rotors in the database also show teething problems. The vertical axis in figure 20 shows the relative frequency of problems, corrected for the number of units as well as possible; the horizontal axis shows the age of the generators. The database is fed by experts who share technical information on the problems. How many generators did have such problems and, equally important, how many generators did not have problems (in short: what is the exposed population) needed to be estimated separately from the VGB database.

Figure 20 Ageing of components (generator rotor and stator)

6.2 Changing the operation of a plant

Let us assume that, due to a different merit order because of changing market conditions, fuel prices, the efficiency of new plant, etc., it becomes necessary to start and stop a plant daily, otherwise known as cycling. To what extent will this influence the forced unavailability and what measures can Asset Managers take? The first step in this analysis should be a qualitative one: carry out a FMECA, taking plant specific expert judgement into account and pinpointing the more sensitive components. Generic data are necessary to devote the attention to the components that are expected to be the most susceptible to cycling. These components can in principle be derived from an analysis of international databases such as NERC or VGB by analysing the subset of plants in the database that cycle, and preferably comparing with the same subset that has also operated in base load. Given a list of such components, the databases in combination with expert judgement certainly allow quantification. This type of analysis is shown in figure 21. Clearly the failure rate increases with the number of starts per year. The increase, however, is not without question marks.

Figure 21 Cycling analysis using NERC data

Figure 22 A RAMP model applicable to a plant cycling

With a simple RAMP model such as that in figure 22, it can be calculated that for about 50 % of the duration of repair times the plant is not needed when cycling daily and not being in operation during the weekend. Therefore this part of the repair time should not be taken into account to the same extent as the repair time when the plant is indeed needed.
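A minimal Monte Carlo sketch of this kind of bookkeeping is given below. The failure probability per operating hour, the repair time distribution and the weekday 06:00-22:00 need profile are assumptions for illustration, not data or results from the RAMP model itself; the sketch simply counts how much forced downtime falls inside and outside the hours the plant is actually needed.

import random

# Assumed illustrative parameters, not data for a real plant.
FAILURE_PROB_PER_OP_HOUR = 2.0e-4   # probability of a forced outage per operating hour
MEAN_REPAIR_HOURS        = 30.0     # mean repair duration, exponential (assumption)
YEARS                    = 200
HOURS                    = YEARS * 8760

def needed(hour):
    """Need profile: weekdays 06:00-22:00 (daily cycling, weekends off)."""
    day, hour_of_day = divmod(hour, 24)
    return day % 7 < 5 and 6 <= hour_of_day < 22

random.seed(1)
repair_left = 0.0
down_total = down_needed = needed_hours = 0
for h in range(HOURS):
    is_needed = needed(h)
    needed_hours += is_needed
    if repair_left > 0:                                   # plant under repair
        down_total += 1
        down_needed += is_needed
        repair_left -= 1.0
    elif is_needed and random.random() < FAILURE_PROB_PER_OP_HOUR:
        repair_left = random.expovariate(1.0 / MEAN_REPAIR_HOURS)

print(f"Forced unavailability on calendar time : {down_total / HOURS:.2%}")
print(f"Forced unavailability when needed      : {down_needed / needed_hours:.2%}")
print(f"Repair hours outside the need window   : {1 - down_needed / down_total:.0%}")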
Since the average repair time and the number of failures per operating hour differ per component, a detailed analysis is not straightforward and needs further modelling. However, given an estimated increase of the failure rate per component affected by cycling and an estimate of the number of operator errors (learning effects should be incorporated), one is able to judge the effects of cycling quantitatively. Of course, such efforts should be accompanied by assessing the life consumed by cycling, for instance for HP parts. As shown in figure 23, the conservative creep and fatigue life calculations that form the design basis should be extended into a probabilistic failure probability (structural reliability). Figure 23 shows for example the difference in creep life using minimum material characteristics versus average material characteristics.

Figure 23 Life reduction of a steam header due to cycling

7 SPARE PART POLICY

The costs of spare parts should be weighed against the financial benefits. When the wrong spares have been bought, not only does the investment not pay off, but substantial additional costs are incurred for waiting on and implementing a different crash spare. Essential elements in a cost-benefit analysis are therefore the need for spares (applicability and frequency) and the difference in repair time warranting the investment. More than 1 spare might be applicable when a large number of plants use the same spare (pooling). Simple queuing models are able to estimate the probability that another spare is needed, as a function of inventory, during the window in which the item from the failed component is being repaired or ordered. Simulation models such as in figure 24 do a better job in calculating that probability. Such an analysis demonstrates the benefits of certain spares used in an overhaul in combination with application of that spare as a strategic spare (for instance gas turbine blading, a spare rotor, etc.). DNV-KEMA uses the RAMP software by Atkins to simulate power plants, for assessment of spares and for maintenance optimisation.

Figure 24 Intermediate results of a spares analysis

Figure 24 (ctnd) Intermediate results of a spares analysis

Also when the number of cases where spares are needed is not precisely known, one can estimate the fraction of failures in which spares might be needed on the basis of the text from raw failure data as well as from the distribution of repair times. As shown in figure 25, for the HP bypass (KKS code MAN), 11 % of the failures have a duration over 24 hrs and are thought to have needed a spare. By counting the number of records in which a spare could have been applied, on the basis of engineering judgement of the problem described, one arrives at 27 % of the failures. This gives an order of magnitude. Due to the enormously large number of components in a power plant, in combination with the fact that the operation and maintenance crew tries to limit failures as much as possible, one cannot refrain from engineering judgement. By doing so, however, Asset Managers should be able to save several million Euros on the spares budget.

Figure 25 Spares analysis by comparing repair times

8 MAINTENANCE

Application of failure data to maintenance is one of the most difficult areas of application. However, it is rewarding due to the understanding of the failures that can be reached, which can be applied to other plants as well.
The problem in general is that data are gathered at a system level and only for those events where plant unavailability has occurred. Maintenance optimisation is however carried out at a component level, mainly by estimating the ageing part of bathtub curves. Weighing preventive costs against corrective costs shows the interval for which maintenance on a specific component is optimum. Many maintenance optimisations assume however that a component is as good as new after maintenance, an assumption equal to replacing the component by a new component. Trending such as in figure 19 shows that this is seldom the case: if a component were as good as new, there would be a straight line without the typical "steps" in the cumulative number of failures.

The simplest way to apply failure data to maintenance optimisation is by analysis of the raw data from a plant unavailability data gathering scheme and to pinpoint those systems and those components where maintenance obviously is not removing the failures to the maximum extent possible. Whereas for Asset Management purposes one would like to have a root cause registered for every failure, such an effort is out of place. Maintenance optimisation is therefore a kind of art rather than science, requiring such information as raw failure data, bathtub curves (Weibull modelling) and indications for the time window between the initiation of a failure and the actual occurrence of the failure (or the time the component is taken out of operation to prevent an imminent failure).

The basis for a proper maintenance optimisation is a qualitative description of the maintenance actions that would be effective against a failure mechanism. If, as is the case with Reliability Centered Maintenance (RCM), these are charted by means of a Failure Mode Effect Analysis such as in figure 26, the basic tool to estimate the contribution of maintenance to unavailability is already present. It certainly pays to estimate the probability of occurrence by using failure data gathering systems and damage investigations (see figure 27) in addition to the usual expert judgement. The failure rates from failure data, damage investigations and expert judgement should be comparable.

Figure 26 FMECA correlated with failure analysis

Figure 27 Superheater degradation and failure rate from damage analysis

Figure 28 Markov model

Figure 29 Optimum inspection results using a Markov model

9 MARKOV MODELLING FOR MAINTENANCE OPTIMISATION

Markov modelling such as in figure 28 is about using discrete system states and their transition rates. Discrete system states described as "new", "moderate", "bad" and "failed" are easily assessed by maintenance and operation experts. The result of a Markov model quantification is an optimised maintenance and inspection schedule without the usual fixed intervals. Instead, for such a schedule, optimum moments for inspection and other maintenance actions are calculated. By application of this type of modelling, Asset Management is able to save Euros risk based, by not carrying out inspections during the stages in component life where analysis shows that degradation will be found only with a low probability. Inputs into such an analysis are the probability that a component is in a certain state at the start and the probabilities that transitions to degraded or failed states occur, from which the probability of being in a degraded state can easily be calculated.
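A minimal sketch of such a calculation is given below; the four states follow the description above, while the transition rates per year and the initial state probabilities are assumed values for illustration only.

import numpy as np

# Minimal discrete-state Markov degradation sketch: "new", "moderate", "bad",
# "failed". Transition rates (per year) and the start vector are assumptions.
states = ["new", "moderate", "bad", "failed"]
r_nm, r_mb, r_bf = 0.20, 0.30, 0.50      # degradation rates per year (assumptions)
# Generator matrix Q: off-diagonal entries are transition rates, rows sum to zero.
Q = np.array([
    [-r_nm,  r_nm,   0.0,  0.0],
    [  0.0, -r_mb,  r_mb,  0.0],
    [  0.0,   0.0, -r_bf, r_bf],
    [  0.0,   0.0,   0.0,  0.0],          # "failed" is absorbing (no repair modelled)
])
p = np.array([1.0, 0.0, 0.0, 0.0])        # the component starts as new (assumption)
steps_per_year = 1000
dt = 1.0 / steps_per_year
for year in range(1, 11):
    for _ in range(steps_per_year):       # simple Euler integration of dp/dt = p Q
        p = p + dt * (p @ Q)
    if year in (2, 5, 10):
        print(f"after {year:2d} years: " +
              ", ".join(f"P({s}) = {v:.2f}" for s, v in zip(states, p)))
# Inspections pay off most around the times where P(moderate) + P(bad) is high
# while P(failed) is still acceptably low; in the early years little degradation
# will be found, which is the basis for skipping inspections risk based.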
The transition rates can be derived from expert judgement, damage analysis and maintenance / failure data. Asset Management is also able to save Euros risk based by intensifying inspection and other maintenance actions when the probability of degradation rises, thereby preventing failures.

10 REPORT PROCEDURES AND PUBLIC RELATIONS

Reports are defined by their content and form and should be arrived at by a systematic way of working in order to clearly present availability characteristics. Asset Management provides the data for these reports. Recording and evaluation must take place in a uniform way, taking account of for instance the VGB definitions. In the annex, the VGB input forms are given. Based on the results, a uniform and regular representation in internal and external reports should be put into operation. Preferably, time series showing the evolution are to be chosen, with corresponding commentaries to help interpretation. The type of representation should be oriented to the interests of the benefiting parties (technicians, authorities, customers, etc.). Two examples of KISSY VGB reports are given. In example 1, availability is shown for supercritical units over a range of time. In example 2, for a specific peer group of plants, the number of units having a certain availability is given. By plotting one's own plant into this figure, a first step in benchmarking is reached.

Figure 30 Standard KISSY availability report (collective: supercritical units, time range 2004 - 2008), showing time availability, time utilization, energy availability and energy utilization, with the energy unavailability split into planned, postponable unplanned and not postponable unplanned:

                                               2004     2005     2006     2007     2008    04-08
Blockyears                                      155      155      151      152      150      763
Nominal power (gross) [MW]                   53,232   53,630   51,045   51,423   51,044  260,374
Time availability [%]                          87.4     86.2     86.8     86.3     85.4     86.4
Time utilization [%]                           54.8     55.9     59.2     61.5     57.8     57.8
Energy availability [%]                        84.2     83.3     83.0     83.6     83.2     83.5
Energy unavailability [%]                      15.8     16.7     17.0     16.4     16.8     16.5
  Planned energy unavailability [%]             6.8      8.6      9.3      8.5      8.7      8.4
  Unplanned energy unavailability [%]           9.0      8.1      7.7      7.9      8.0      8.2
    Postponable unplanned [%]                   1.1      1.2      1.0      0.9      0.8      1.0
    Not postponable unplanned [%]               7.9      6.8      6.6      7.0      7.2      7.1
Energy utilization [%]                         47.2     47.9     48.0     49.2     47.2     47.9

Figure 31 Standard KISSY benchmarking report (collective: fossil fired units, time range 2008): number of power plant units per class of energy availability, from <= 50 % up to > 95 %

Every piece of availability information can be used in the public arena. In this use, detailed information is not in the foreground. A careful selection of parameters is important in stating the message. Furthermore, one should make sure that all specialist expressions and language used are sufficiently explained in simple language, so that they can be understood by a non-specialist public. Any values used must nevertheless be built from well defined and pertinent information based on clear definitions.
11 ECONOMIC ASPECTS

Availability data allow optimum planning of the dispatch of present and future power plants and make possible an economic optimisation of power plant blocks in combination with purchasing additional energy and fuels for the company's own production installations. An interesting application for Asset Managers is to estimate the probability that short term contracts are not fulfilled, by the use of Markov models such as shown in the next figures. The resulting reliability and unavailability as a function of time in figure 33 clearly show that, given that at time t a plant is repaired and in operation, the risk of no delivery at time t + dt, with dt not too large, is indeed smaller than the average unavailability that is, for example, calculated on a yearly basis. Only for such an average value is the long term steady state of the operation-and-repair process valid. Therefore such an average unavailability is larger than the short term value.

Figure 32 Markov model for a plant using discrete system states

Figure 33 Failure probability and unavailability as a function of time

Economically optimum dispatch should take into account accurate values for plant failure rate and unavailability, taking the plant size, operational characteristics, etc. into account. For this economic dispatch process, one can either use commercial software such as PLEXOS or software developed in-house. Such software will result in an optimum spinning reserve as shown in figure 34.

Figure 34 Failure probability and unavailability as a function of spinning reserve

12 BENCHMARKING: COMPARING PLANTS

Benchmarking is ranking one's own plants and comparing the rank to similar plants to judge the efficiency of the processes (including managerial) within the plant. Benchmarking can be difficult when the layout of the plant is uncommon or when steam is delivered from a set of plants that is not comparable to other situations. In such cases we recommend calculating the benchmark parameter using Reliability Block Diagrams or Fault Tree analysis based on plant specific as well as generic data. For the plant shown in figure 35 the probability of having no steam on the LP and IP steam headers was calculated. The calculated value can be regarded as a benchmark value taking the configuration into account. The results showed that maintenance of the gas turbine and boiler during operation of the remainder of the plant would in theory result in only a minimal risk increase for the plant under consideration.

Figure 35 Layout of a plant for steam and power production

13 INSURANCE QUESTIONS

The risks for power plant components, described by the likelihood of failure occurrence and the severity of the consequences of damage, can be insured. Unavailability itself can also be insured. Availability data deliver information on whether it is optimal to insure a risk or to carry the risk by one's own means. Long-term high quality availability data can be used for the reduction of the insurance payments within the contract negotiations. Since usually only large components are insured because of the amount of possible damage incurred, the elaboration for this application can be limited to the availability analysis of large components. For the data of one's own installation, the complete time series since the start of operation is generally demanded. The insurer may ask for a listing of all events including descriptions and affiliated investigation reports.
As comparison values, statistics from, for example, the technical and scientific reports of the VGB can be taken for supplementary information. Questions on the effectiveness of state-of-the-art precautionary measures (especially when these are not yet present but the insurer is sounding out whether to have them installed) should be answered taking the present measures and their influences into account, using reliability analysis.

14 INDICATORS FOR BUDGETARY PLANNING

Availability parameters can be used to set up the budgets for plant maintenance, spare parts and modifications. The optimisation problem involved is weighing direct costs for measures against indirect costs caused by unavailability events, as shown before. Optimisation preferably starts with a Failure Mode Effect & Criticality Analysis. By detailing and confirming the expert judgement with inspection results from the past and accurate condition assessments during an overhaul, a maintenance plan can be drawn up covering "normal" maintenance, replacements that are certain given obsolescence, etc., and replacements that are uncertain since condition based maintenance is cost-optimum. The steps as carried out in KEMA's ALTIMA (Advanced Life Time Management) are shown in the next figure.

Figure 36 ALTIMA = Advanced Life Time Management

15 CONCLUSIONS AND RECOMMENDATIONS

In order to judge production assets with regard to concept, construction, operation, maintenance effectiveness, etc., a consistent, systematic and unified data gathering scheme should feed Asset Managers with reliability and availability indicators for plants as a whole as well as for components, taking into account industry specific definitions and guidelines. We would like to express our gratitude to VGB for allowing the use of KISSY examples as well as for making KISSY reports incorporating unavailability data available.

LITERATURE

VGB PowerTech, "Terms of utility industry", Part B "Power and District Heat", Booklet 3 "Fundamentals and systematics of availability determination for Thermal Power Plants", 2008.

VGB Technical Scientific Reports "Thermal Power Plants", "Analysis of Unavailability of Thermal Power Plants", VGB TW 103Ae.

Annex 1 Input sheet for unavailability analysis of plants
Annex 2 Input sheet for availability of energy (plant as a whole)