See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/324531183
Petri Net Based Reliability Analysis of Thermoelectric Plant Cooling Tower
System: Effects of Operational Strategies on System Reliability and
Availability
Conference Paper · April 2018
5 authors, including:
Adherbal Caminada Netto
Arthur H. A. Melani
University of São Paulo
University of São Paulo
Carlos Murad
São Paulo State University
Proceedings of the joint ICVRAM ISUMA UNCERTAINTIES conference
Florianópolis, SC, Brazil, April 8-11, 2018
Petri Net Based Reliability Analysis of Thermoelectric Plant Cooling Tower
System: Effects of Operational Strategies on System Reliability and Availability
A. Caminada Netto¹, A. H. A. Melani¹, C. A. Murad¹, S. I. Nabeta¹, G. F. M. Souza¹
¹ Escola Politécnica de São Paulo, Av. Prof. Mello Moraes, 2231 – CEP 05508-010,
adherbal@usp.br, melani@usp.br, carlos.murad@usp.br, gfmsouza@usp.br
Keywords. Reliability Analysis, Petri Net, Cooling Tower, Coal-Fired Power Plant.
Abstract. Reliability analysis is a set of studies that characterizes the behavior of a system with respect to the
occurrence of failures. One way to carry out such an analysis is to develop a model of the system under study. The
results obtained from the reliability analysis contribute to decision making in both system design and maintenance
planning. For systems with a large number of components, where component failures usually cannot be considered
independent, a more capable reliability analysis tool, such as a state machine, must be used.
Stochastic Petri Net (SPN) is a mathematical modeling language widely used in different areas of knowledge, especially
when nondeterministic phenomena must be represented. SPN is currently used for system reliability (R),
availability (A), risk and performance analysis in situations where component failure and repair rates cannot be
modeled as time independent. This paper proposes a reliability analysis based on Stochastic Petri
Nets for the cooling tower system of a coal-fired thermoelectric power plant. The cooling tower system plays a
fundamental role in the operation of such a plant, since it is one of the systems responsible for the
thermodynamic balance of the plant, and its failure usually affects the power plant output. Among the other
techniques available for reliability analysis, this paper also uses the Functional Tree and the Functional
Description to support the development of the Petri net model. The cooling tower system is composed of identical
cells for which component failure modes and reliability can be estimated. The resulting SPN model will be tested and
validated against historical failure data. Once the model is developed, the possible operational strategies of the cooling
tower system can be evaluated against the design heat load. This evaluation aims at optimizing
system reliability. Uncertainties in loading conditions and unit failure modes are evaluated by considering variations in
cell reliability. For long-term availability analysis, the influence of the number of maintenance repair teams and of
a predictive maintenance policy is also evaluated. The analyses allow the definition of the most suitable
operational configuration and repair resources to achieve the target availability.
1 INTRODUCTION
Electricity supply has become very competitive over the years. Maintenance, availability and reliability are among
the most important factors for a steam-process plant. Power plants are regarded as the core module of power systems
and are responsible for producing the power that is transmitted and distributed to end consumers. Power plants need to be
available and reliable all the time, and so do their constituent components (Sabouhi et al., 2016). If power
generation stations are not well maintained and reliably operated, the resulting power shortages can impose
significant damage on society. Any improvement in system reliability comes at a
cost; reliability enhancement is justified to the extent that the cost of system unavailability exceeds
that of the normal service provided.
The main goal of maintenance is to find a balance between the cost and time of maintenance, or the most
appropriate moment to execute it (Burhanuddin et al., 2014). To decrease the cost of electricity
production, companies have to apply every available new methodology to their internal procedures in order to achieve
better key performance indicators (KPIs) and thereby improve the efficiency, quality and reliability of their repairable
systems. In other words, they must produce the same amount of energy at lower expense. A repairable system is defined as a
system that can be restored to fully satisfactory operation by parts replacement or adjustment when it fails to perform
one or more of its functions satisfactorily (Pamuk and Uyaroglu, 2012). There are some attempts in the literature to
develop more realistic techniques using simulation models for reliability and availability analysis of systems (Rao and
Naikan, 2014). This study first seeks an understanding of the system by using a Functional Tree (FT) and a
Functional Description (FD), and then applies a Stochastic Petri Net (SPN). It intends to advance the understanding of
Petri net application to reliability analysis and to demonstrate how maintenance team availability can play an important
role in a steam-process plant scenario (Lee and Lu, 2012). Such models are useful for simulation purposes in the case of
complex systems; they are often defined as stochastic Petri nets. For this reason, event systems can be identified
from the analysis of the state sequences observed while the system is working (Leclercq et al., 2009).
A. Caminada Netto, A. H. A. Melani, C. A. Murad, S. I. Nabeta, G. F. M. Souza.
Cooling Tower Analysis Using Petri Net for Reliability and Availability
2 COOLING TOWER BASICS
Cooling towers are heat exchangers that use water and air to transfer heat. The water to be cooled is distributed in the
tower by spray nozzles, splash bars or film-type fill, which exposes a very large water surface area to atmospheric air.
In a coal-fired power plant, this water passes through the condenser to condense the exhaust steam from the low-pressure
turbine. As the hot water flows through the tower, heat is rejected to the ambient air through the evaporative
process. As a result, the hot water is cooled, first through evaporation of a small
percentage of the total water flow. Evaporation is a process by which heat is absorbed by the air, and the remaining
condenser water is cooled to the desired exit temperature, as specifically designed for the cooling tower location
(Stanford III, 2003). Cooling towers are often neglected by operation and maintenance technicians, most of the time because
access to the tower is difficult, which results in low cooling efficiency. They are common in industries such as oil refining,
chemical processing, power plants, steel mills, and many manufacturing processes that require cooled water
(Kiran and Muthukumar, 2017). A make-up water source is used to replenish the water lost to evaporation. Hot water
from heat exchangers is also sent to the cooling tower. The water exits the cooling tower and is sent back to the
exchangers or to other units for further cooling. Figure 1 shows a schematic arrangement of a cooling tower application
in a coal-fired power plant, with six cells only, where it is used to condense exhaust steam from the low-pressure
turbine.
Figure 1: Cooling Tower System.
According to SPX Cooling Technologies (2017), each cell contains basic components such as a concrete
structure or frame that supports the exterior enclosure, the fan, the motor and the speed reducer (gearbox). Most towers employ fill
(made of plastic, wood or even ceramic) to promote heat transfer between air and water. The cold-water basin is
located at the bottom of the tower and receives the cooled water that flows down through the tower and the fill. Drift
eliminators capture water droplets entrained in the air stream so that they are not lost to the atmosphere. The air inlet is the area
where air enters the tower. Nozzles spray the warm water uniformly over the fill, which is essential to achieve
good heat transfer. Fans are used to move large volumes of air efficiently and with minimum vibration. Speed reducers
are used because the optimum speed of a cooling tower fan seldom coincides with the most efficient speed of the driver (motor).
In this study, electric motors drive the cooling tower fans; they must be reliable under extremely adverse
conditions.
The efficiency of a cooling tower depends on the heat rejection load under which the tower must operate. During
the design phase, the manufacturer considers the heat transfer surface area, the ambient wet-bulb temperature, the time during which the
water is exposed to the air flow and the volume of air flowing over the water (Carazas and Souza, 2009). During operation,
water quality is a very important parameter for the plant maintenance staff to control; proper
treatment of the circulating water for biological and corrosion control must follow accepted industry
practice. When water evaporates or is otherwise lost from a cooling tower, the solids and the chemicals used to treat the water remain
in the system; when water is bled from the system, the chemicals lost through the bleed need to be replaced so that the system
remains protected. If the water is left unchecked, solids build up in the system and cause scale,
corrosion, biological growth and sludge, as well as a loss of heat transfer efficiency.
The main components of the cooling tower are:
-Water Distribution System (Hot and Cooled): hot water from the heat exchangers is delivered to the top of the
cooling tower by the condenser pump through distribution piping. The hot water is sprayed through nozzles onto the heat transfer
medium (fill) inside the cooling tower; it falls through the fill and reaches the basin cooled, ready to return to the cycle.
-Heat Transfer (Fill): hot water from the heat exchangers is slowed down and spread out over the fill. Some of the
hot water evaporates in the fill area, which cools the remaining water. Cooling tower fill is typically arranged in packs of thin
corrugated plastic sheets supported by a framework of spaced bars.
-Air Flow System: the cooling tower fan generates large volumes of air flowing through the heat transfer medium (fill). The
size of the fan and the air flow rate are selected to achieve the design conditions of temperature, water flow rate and wet-bulb temperature.
-Water Treatment System: cooling water must be regularly treated with chemicals to prevent the growth of bacteria,
minimize corrosion, and inhibit the buildup of scale on the fill and piping system.
The cooling tower is one of the components of large-scale water cooling in coal-fired power plants; its thermal efficiency
can influence coal consumption to a large extent. Efficient operation and thermal performance of a cooling tower
depend not only on mechanical maintenance, but also on the cleanliness of the entire system.
3 METHODOLOGY PROPOSAL
The proposed methodology performs a reliability analysis of the cooling tower in a coal-fired power plant. Figure 2
shows the steps proposed for this study. Reliability methods have been established to take into account the uncertainties
involved in the analysis of an engineering problem.
Figure 2: Methodology Proposed for Reliability Analysis.
The failure probability and the reliability index are used to quantify risks and therefore to evaluate the consequences of
failure. A component functions for a certain period of time and then fails. It is repaired and put back into
operation, and the whole process is repeated. A component failure occurs when the component cannot perform its required
function; this failure may cause the whole system to deviate from its required operation.
Let MTBF (Mean Time Between Failures) and MTTR (Mean Time To Repair) represent the expected time to failure
and the expected duration of the repair of the component, respectively. Both metrics can be expressed by Eq. (1) and
Eq. (2):
$$\mathrm{MTBF} = \frac{\text{Total Up Time}}{\text{Number of Breakdowns}} \qquad (1)$$
$$\mathrm{MTTR} = \frac{\text{Total Down Time}}{\text{Number of Breakdowns}} \qquad (2)$$
MTBF is a basic measure of a system’s reliability. It is typically represented in units of hours. The higher the MTBF
number is, the higher the reliability of the product or system. Equation (3) illustrates this relationship.
$$R(t) = e^{-t/\mathrm{MTBF}} \qquad (3)$$
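Equations (1) to (3) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the log values (9000 h of up time, 336 h of down time, 4 breakdowns) are hypothetical, MTTR is taken as total down time divided by the number of breakdowns, and Eq. (3) assumes a constant failure rate.

```python
import math


def mtbf(total_up_time_h: float, n_breakdowns: int) -> float:
    """Eq. (1): mean time between failures, in hours."""
    return total_up_time_h / n_breakdowns


def mttr(total_down_time_h: float, n_breakdowns: int) -> float:
    """Eq. (2): mean time to repair, in hours (down time per breakdown)."""
    return total_down_time_h / n_breakdowns


def reliability(t_h: float, mtbf_h: float) -> float:
    """Eq. (3): probability of surviving t_h hours without failure,
    assuming exponentially distributed times to failure."""
    return math.exp(-t_h / mtbf_h)


# Hypothetical maintenance log, for illustration only.
example_mtbf = mtbf(9000.0, 4)   # 2250.0 h
example_mttr = mttr(336.0, 4)    # 84.0 h
print(example_mtbf, example_mttr)
# Reliability at a mission time equal to the MTBF is exp(-1), about 0.3679.
print(round(reliability(example_mtbf, example_mtbf), 4))
```

Note that a mission time equal to the MTBF gives a survival probability of only about 37%, which is why a high MTBF alone does not guarantee failure-free operation up to that time.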
MTBF impacts both reliability and availability. Before MTBF methods can be explained, it is important to
understand these concepts. The difference between reliability and availability is often unknown or misunderstood. High
availability and high reliability often go hand in hand, but they are not interchangeable terms (Torell and Avelar, 1997).
According to IEEE (1990), reliability and availability can be defined as follows:
Reliability: the ability of a system or component to perform its required functions under stated conditions for a
specified period of time. It is the likelihood that the system or component will succeed within its identified mission
time, with no failures.
Availability: the degree to which a system or component is operational and accessible when required for use. It
is the likelihood that the system or component is in a state to perform its required function under given conditions at a
given instant in time. In short, availability describes how time is used, whereas reliability describes
the failure-free interval. Both are expressed as percentages.
MTTR is the expected time to recover a system from a failure. This may include the time it takes to diagnose the
problem, the time it takes to get a repair technician on site, and the time it takes to physically repair the system. It is
expressed in hours. Many people look at the MTBF of a product and make unwarranted assumptions. For example, if
the MTBF is very high, one may think that the system will run for a long time before the
product has to be replaced. That is not necessarily the case. MTBF is the average time between system failures over the entire sample population, which
means that a component can fail before or after this average.
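As a rough illustration of how MTBF and MTTR combine, the standard steady-state (inherent) availability of a single repairable unit can be written as below. This relation is not stated in the paper and is offered only as a sketch under the assumption that repair starts immediately after failure, with no waiting for a maintenance team. The numbers are the MTBF and MTTR reported later in this study.

```latex
A = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}
  = \frac{2253}{2253 + 84}
  \approx 0.964
```

This single-unit value should not be compared directly with the system-level availability obtained from the SPN model, which also depends on the alarm stage and on the repair resources.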
Assumptions are required to simplify the process of estimating MTBF, but they must be realistic. It would be nearly
impossible to collect the data required to calculate an exact number. There are some traps in MTBF and
MTTR calculation. This type of analysis always requires the removal of dirty data, such as planned inspections or
unplanned stops due to causes other than the cooling tower, and the root cause analysis also needs to be
considered. Another common mistake is that some companies do not count as a failure anything that takes less than a
work day to fix. This is a misguided way to compute the indicator, and it will never show the truth about the system or
equipment. Organizations seek to achieve a higher MTBF and a lower MTTR, and these two metrics are important key
performance indicators of system performance.
The failure data for the six cells of the cooling tower were obtained from the maintenance records and the
computerized maintenance management system of the coal-fired power plant, so that MTBF and MTTR could be calculated. In
complex and repairable systems, a failure is considered to occur whenever the system does not meet the design intent.
Even when working within their correct operating environment, individual components fail randomly. A failure puts the system out
of service and places it in a state of repair. Before starting any reliability analysis, one needs to fully understand the
system to be analyzed, so the first phase of this methodology is the construction of a Functional Tree. The objective of
the Functional Tree is to structure, in a logical and hierarchical way, the interdependence between the different
components of the system under study, in order to expose how each one performs its functions. The functions
of all equipment are then listed in the Functional Description. After that, the SPN model of the system can be tested using the
MTBF and MTTR data and the software TimeNET 4.0 for reliability and availability analysis.
3.1 Functional Tree and Functional Description
Although all cooling towers have essentially the same systems, such as air circulation, hot water supply, fill, cold water
supply, cooling tower structure (most of the time a concrete or steel structure) and water treatment, each cooling tower
design has specific characteristics introduced by the design team. Therefore, a functional tree must be
developed for each specific cooling tower. A functional tree is a diagram that shows the interdependencies among
systems, breaking them down into single components. Its purpose is to structure the system in a logical and
hierarchical way, in order to expose how each component performs its function. Figure 3 shows a
shortened version of the functional tree for the cooling tower under study (Souza, 2012).
Figure 3: Basic Functional Tree for a Cooling Tower.
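A functional tree of this kind can be held in a simple nested structure. The sketch below is only an illustration: the six systems come from the text, but the components placed under each system are assumptions made here, since the content of Fig. 3 is not reproduced in this text.

```python
# Nested dict: each key is a node, each value holds its children.
# The system names follow the text; the component assignment is hypothetical.
FUNCTIONAL_TREE = {
    "Cooling tower": {
        "Air circulating system": {
            "Fan": {},
            "Speed reducer (gearbox)": {},
            "Electric motor": {},
        },
        "Hot water supply system": {
            "Distribution piping": {},
            "Spray nozzles": {},
        },
        "Fill": {},
        "Cold water supply system": {"Cold-water basin": {}},
        "Cooling tower structure": {},
        "Water treatment system": {},
    }
}


def leaves(tree, path=()):
    """Yield the full path (root to leaf) of every bottom-level item."""
    for name, children in tree.items():
        here = path + (name,)
        if children:
            yield from leaves(children, here)
        else:
            yield here


for leaf_path in leaves(FUNCTIONAL_TREE):
    print(" > ".join(leaf_path))
```

Listing the paths from root to leaf makes it easy to see which subsystem, and ultimately which tower function, is affected when a bottom-level component fails.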
If a failure occurs at the bottom of the functional tree, it can affect a subsystem, and the cooling tower
will then not deliver water at the design temperature. In other words, the water will be warmer and, when it passes through the
condenser, it will not remove heat as designed. This causes a pressure increase in the condenser (vacuum loss), which can
lead to higher coal consumption. The next phase of this study is to obtain the functional description of each of these
components. At each level of system resolution, the system engineer needs to understand the full implications of the
goals and constraints in order to formulate a representative system. This is accomplished by performing a functional description.
Functional description is the systematic process of identifying, describing, and relating the functions a system must
perform to fulfill its goals and objectives. It is a method for understanding product functions in complex systems by
converting the activities performed in a system into functions (NASA, 1995). In the functional description, the main function
of each component and subsystem is listed according to the functional tree. As an example, Fig. 4 shows the description
of only one system of the cooling tower: the Hot Water Supply System.
Figure 4: Functional Description of the Hot Water Supply System.
3.2 Stochastic Petri Net (SPN)
Petri nets were originated by Carl Adam Petri in 1962; they were initially used as a general-purpose mathematical tool to
describe the causal relationships between conditions and events in computer systems (Gu, 2002). The original PN did
not include the concept of time, so a transition would fire immediately. However, starting in the late 1970s, a
timed PN called Stochastic Petri Net (SPN) was presented (O’Connor and Kleyner, 2012). An SPN allows timed
transitions, which are associated with exponentially distributed firing delays. It uses some basic symbols to describe
the relations between conditions and events. Both PN and SPN can represent and analyze the behavior of a variety
of systems. The PN is a modeling method that shows a system’s structure and dynamic behavior in a formal graph. It can
model the flow of information and the control of systems. It was initially developed for the modeling and analysis of computer
hardware and software, but in recent years PN has been increasingly used in the area of system reliability
(Reddy et al., 1993). The graphical representation of a PN employs the following notation: Places (drawn as
circles), Transitions (drawn as boxes or bars), Arcs (which connect Places to Transitions and vice versa), and Tokens
(drawn as black dots), as shown in Fig. 5. The figure also shows a simple Petri net with an immediate transition (t1) and one
with a timed transition (t2).
Figure 5: Graphical Structure Model of a Simple Petri Net.
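The notation above (places, transitions, arcs and tokens) can be sketched as a small data structure. This is a minimal, untimed illustration with unit arc weights; the class name `PetriNet` and the layout P1 → t1 → P2, P3 → t2 → P4 are assumptions made here, since the exact arrangement of Fig. 5 is not reproduced in this text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PetriNet:
    """Minimal place/transition net; every arc has weight 1."""
    marking: Dict[str, int]                                   # tokens per place
    inputs: Dict[str, List[str]] = field(default_factory=dict)   # transition -> input places
    outputs: Dict[str, List[str]] = field(default_factory=dict)  # transition -> output places

    def add_transition(self, name: str, inputs: List[str], outputs: List[str]) -> None:
        self.inputs[name] = list(inputs)
        self.outputs[name] = list(outputs)

    def enabled(self, name: str) -> bool:
        # A transition is enabled when every input place holds a token.
        return all(self.marking.get(p, 0) >= 1 for p in self.inputs[name])

    def fire(self, name: str) -> None:
        if not self.enabled(name):
            raise ValueError(f"transition {name} is not enabled")
        for p in self.inputs[name]:
            self.marking[p] -= 1                      # consume one token per input arc
        for p in self.outputs[name]:
            self.marking[p] = self.marking.get(p, 0) + 1  # produce one per output arc


# Hypothetical layout: two independent place-transition-place chains.
net = PetriNet(marking={"P1": 1, "P2": 0, "P3": 1, "P4": 0})
net.add_transition("t1", inputs=["P1"], outputs=["P2"])
net.add_transition("t2", inputs=["P3"], outputs=["P4"])
net.fire("t1")
print(net.marking)
```

Timing is deliberately left out of this sketch: in an SPN the only addition is that a timed transition such as t2 waits a random, exponentially distributed delay between becoming enabled and firing.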
Translating these symbols to reliability analysis, Places (P1, P2, P3 and P4) correspond to states of the system or
conditions in the process; Transitions (t1 and t2) correspond to events that cause the system state to change, such as
component faults and maintenance; and Arcs connect Places to Transitions, expressing the relationship between state and
event. The current state of a Petri net is represented by its Tokens. The system model should include both the marking
states and the elements causing the states to change (Mehrez, 1995). Each Place may hold zero or a positive number of
Tokens. The dynamic behavior of a Petri net is described by a sequence of transition firings. Firing a Transition moves
a token from one Place to another; in an SPN, a timed transition has an exponentially distributed firing time.
A Stochastic Petri net is a class of Petri net in which the firing times are random variables (Lee et al., 2003). Figure 6 is a
simple view of a Petri net that models a repairable system. Figure 6(a) shows the system in the
“SYSTEM OPERATING” state (the Place holding the Token); this is the situation before firing. When the precondition of the
transition is met, that is, when the Mean Time Between Failures (MTBF) elapses, the system changes its state from “SYSTEM
OPERATING” to the Place “SYSTEM IN FAILURE”, the situation after firing (the Token has moved from SYSTEM
OPERATING to SYSTEM IN FAILURE), Fig. 6(b). As explained earlier, MTBF does not mean that a system or
piece of equipment will fail at this exact time; it is an average, so the failure may occur some time before or after
this value.
In the next phase of the PN, the system is repaired over the Mean Time To Repair (MTTR), which is the expected time to
recover a system, and finally the system returns to the SYSTEM OPERATING state (the Token moves back to the
Place SYSTEM OPERATING).
Figure 6: The dynamic behavior of a Petri Net.
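The operating/failure cycle of Fig. 6 can be imitated with a short Monte Carlo sketch. It is not the TimeNET model used in the paper: the function name `simulate_two_state`, the horizon and the seed are choices made here, and both firing delays are assumed to be exponentially distributed with means MTBF and MTTR (the values 2253 h and 84 h are those reported later in this study).

```python
import random


def simulate_two_state(mtbf_h: float, mttr_h: float,
                       horizon_h: float, seed: int = 0) -> float:
    """Return the fraction of the horizon spent in SYSTEM OPERATING.

    The single token alternates between SYSTEM OPERATING and
    SYSTEM IN FAILURE; each stay lasts an exponential random time.
    """
    rng = random.Random(seed)
    t = 0.0
    up_time = 0.0
    operating = True                      # token starts in SYSTEM OPERATING
    while t < horizon_h:
        mean = mtbf_h if operating else mttr_h
        dwell = rng.expovariate(1.0 / mean)   # random firing delay
        dwell = min(dwell, horizon_h - t)     # truncate at the horizon
        if operating:
            up_time += dwell
        t += dwell
        operating = not operating         # the transition fires, the token moves
    return up_time / horizon_h


estimate = simulate_two_state(2253.0, 84.0, horizon_h=5_000_000.0, seed=1)
print(f"simulated availability: {estimate:.4f}")
print(f"MTBF / (MTBF + MTTR):   {2253.0 / (2253.0 + 84.0):.4f}")
```

Over a long horizon the simulated fraction of up time should approach MTBF / (MTBF + MTTR); individual failures still occur well before or after the mean, as the text points out.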
The lifetime model has only two states, failed or working; in some cases, however, before the component fails it shows
abnormal behavior between good and failed, which can be detected by sensor readings or inspections.
These defective states help to schedule inspections and preventive maintenance. The delay time, shown in Fig. 7,
divides the time to failure into two stages: the first is the normal working stage, and the second is the failure delay time
stage, in which maintenance must take action to prevent the failure from occurring. In the data for one year of operation
(8760 hours), it was observed that, beyond a certain point, sensor readings started to increase before
the failure occurred. This point was located at 75% to 80% of the time to failure. For the purpose of
this study, it is assumed that MTBF1 = 0.8 × MTBF, with MTBF2 being the remaining 20%, as shown in Fig. 7. This assumption will be used later to
model the Petri net for the whole cooling tower. MTBF1 is the average time at which a signal is sent to the control
room as an alarm, so that operators in the control room know that the maintenance teams must be warned about the
increase in the sensor reading; it means that a tendency toward a defect has been observed and that maintenance activities can be
carried out to prevent the failure from occurring.
Figure 7: MTBF1 Assumption.
The analysis of the data for one year of operation gave MTBF = 2253 hours and MTTR = 84 hours. Therefore,
MTBF1 = 1802 hours, and the remaining time to reach the total MTBF is MTBF2 = 451 hours.
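The split can be checked with a few lines of Python. The helper name `split_mtbf` is hypothetical; the 0.8 fraction is the upper end of the 75% to 80% range mentioned above, and the results are rounded to whole hours as in the text.

```python
def split_mtbf(mtbf_h: float, alarm_fraction: float = 0.8):
    """Split the MTBF into the time to alarm (MTBF1) and the
    remaining delay time until failure (MTBF2)."""
    mtbf1 = alarm_fraction * mtbf_h
    mtbf2 = mtbf_h - mtbf1
    return mtbf1, mtbf2


mtbf1, mtbf2 = split_mtbf(2253.0)
print(round(mtbf1), round(mtbf2))  # 1802 451
```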
Even though all cells of the cooling tower share a single design, the tower is subject to many uncertainties arising from
its daily operation, such as variations in the quality of the components used, in the skill of the workers who
assembled these components on the cooling tower and, more particularly, in the operating durations. When
the data from different sensors in the cooling tower were analyzed, it was observed that, despite having the same design and the same
supplier (same fans, gearboxes, electric motors, pipes, gaskets, flanges, etc.), the cells behaved differently. Some cells
experienced more failures than others. It is difficult to deal with the uncertainty in the relationship
among the components of the mechanical system. Data collection can also be an important source of error or
uncertainty in this process. An error is defined as a discrepancy between an observed value and the true, correct value or
condition (Yang and Liu, 1998). Sometimes only the sensor fails to produce the correct signal for a given stimulus
(hardware degradation, inaccurate readings, or environmental changes), while the cell is still working as designed and
supplying cooled water at the right temperature. Sometimes the values for a certain period are either too high or too low, which
is easily recognized as an error in the sensor readings. From a practical standpoint this is a major disadvantage, since
sensors always carry some degree of uncertainty. These uncertainties need to be clearly understood and removed from the
study so that the model is as close as possible to the real cooling tower performance.
As an example, Figure 8 represents two cells of the cooling tower and a single maintenance team available to fix any
failure that may occur.
Figure 8: Two Cooling Tower Cells and Maintenance Team.
This example starts with two cells in order to explain the Petri net logic applied in this study; the logic is then
extended to all the cooling tower cells.
The system is modeled with places and transitions; the inputs to a transition are the preconditions of the corresponding
event. The transition occurs when its preconditions are satisfied.
For every transition, a precondition must be written so that the Petri net can run. The logic of this system is as follows:
• Cells 1 & 2: Tokens are in the places “EQUIPMENT OPERATING-1” and “EQUIPMENT OPERATING-2”,
respectively.
• Maintenance Team: the token is in the place “AVAILABLE”.
• Suppose cell 1 reaches “MTBF1”; its token then moves (fires) to the place “ALARM-1”.
• Since the maintenance team is available, the token moves (fires) straight to the place “REPAIR-1” through the
immediate transition “T1”, whose precondition has been met.
• Maintenance Team: the token also moves (fires) from the place “AVAILABLE” to “UNAVAILABLE” through the
immediate transition “T5”, whose precondition has been met.
• Suppose that, while the maintenance team is busy in “REPAIR-1”, cell 2 reaches “MTBF3”. Its token goes to the place
“ALARM-2” and, since the maintenance team is not available, the cell keeps working until it reaches “MTBF4”;
the token then moves to the place “EQUIPMENT IN FAILURE-2” and stays there until the maintenance team is
available.
• Meanwhile, the maintenance team working on cell 1 reaches “MTTR1”, and the token of cell 1 moves to the place “EQUIPMENT
OPERATING-1”.
• After that, the maintenance team becomes available again, so its token moves from the place “UNAVAILABLE”
to the place “AVAILABLE” through the immediate transition “T6”, whose precondition has been met.
• The maintenance team is now ready for the next job, so the token in the place “EQUIPMENT IN
FAILURE-2” moves to the place “REPAIR-2” and the token of the maintenance team moves to the place
“UNAVAILABLE” through the immediate transition “T5”, whose precondition has been met.
• When the maintenance team working on cell 2 reaches “MTTR2”, the token moves (fires) to the place “EQUIPMENT
OPERATING-2”.
• If both cells experience a failure at the same time, the software TimeNET 4.0 randomly selects one
of them to repair. As mentioned earlier, the firing time is a random variable in the SPN.
• Finally, the whole process can start again.
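The token logic listed above can be approximated by a small discrete-event simulation. This is a sketch under stated assumptions, not the TimeNET model: the function name `simulate_cells` and its parameters are hypothetical, all timed transitions are taken as exponential with the means reported in this study, a cell in the alarm stage is counted as still operating, and a freed team picks randomly among the cells waiting in ALARM or FAILURE.

```python
import heapq
import random


def simulate_cells(n_cells=2, n_teams=1, mtbf1=1802.0, mtbf2=451.0,
                   mttr=84.0, horizon=8760.0, seed=0):
    """Return the fraction of the horizon each cell spent operating
    (places OPERATING or ALARM), for n_cells cells and n_teams teams."""
    rng = random.Random(seed)
    state = ["OPERATING"] * n_cells
    version = [0] * n_cells          # used to discard stale timed events
    last_change = [0.0] * n_cells
    up_time = [0.0] * n_cells
    free_teams = n_teams
    events = []                      # heap of (time, cell, version, next_state)

    def schedule(now, cell, mean, target):
        delay = rng.expovariate(1.0 / mean)
        heapq.heappush(events, (now + delay, cell, version[cell], target))

    def enter(now, cell, new_state):
        if state[cell] in ("OPERATING", "ALARM"):
            up_time[cell] += now - last_change[cell]
        last_change[cell] = now
        state[cell] = new_state
        version[cell] += 1           # invalidates any pending event of this cell
        if new_state == "OPERATING":
            schedule(now, cell, mtbf1, "ALARM")
        elif new_state == "ALARM":
            schedule(now, cell, mtbf2, "FAILURE")
        elif new_state == "REPAIR":
            schedule(now, cell, mttr, "OPERATING")
        # FAILURE has no timed exit: the cell waits for a free team.

    def dispatch(now):
        """Immediate transitions: send free teams to randomly chosen waiting cells."""
        nonlocal free_teams
        waiting = [c for c in range(n_cells) if state[c] in ("ALARM", "FAILURE")]
        while free_teams > 0 and waiting:
            cell = waiting.pop(rng.randrange(len(waiting)))
            free_teams -= 1
            enter(now, cell, "REPAIR")

    for c in range(n_cells):
        schedule(0.0, c, mtbf1, "ALARM")

    while events:
        now, cell, ver, target = heapq.heappop(events)
        if now > horizon:
            break
        if ver != version[cell]:
            continue                 # event was cancelled by a later state change
        if state[cell] == "REPAIR":
            free_teams += 1          # repair finished, the team is AVAILABLE again
        enter(now, cell, target)
        dispatch(now)

    for c in range(n_cells):         # close the last interval at the horizon
        if state[c] in ("OPERATING", "ALARM"):
            up_time[c] += horizon - last_change[c]
    return [u / horizon for u in up_time]


print(simulate_cells(n_cells=2, n_teams=1, horizon=8760.0, seed=1))
```

Calling `simulate_cells(n_cells=6, n_teams=2, ...)` gives a rough feel for the effect of a second team, but the sketch returns per-cell up-time fractions only; how cell states combine into system reliability and availability is left open here because that criterion is defined by the SPN model.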
Figure 9 shows the six cells of the cooling tower system and one maintenance team available for repairs. All the
cells are in the place “EQUIPMENT OPERATING”, which means that the cooling tower is supplying cooled water as
designed.
Figure 9: Cooling Tower: Six Cells Working (One Maintenance Team).
This first run gives the reliability and availability of the entire system considering only one
maintenance team for repairs. The data for this Petri net model were extracted from the power plant data center. The
following values are used to run the model: MTBF1 = 1802 hours, MTBF2 = 451 hours, and MTTR = 84 hours, as shown earlier.
In Fig. 9, all tokens are in the place “EQUIPMENT OPERATING” and the maintenance
team is in the place “AVAILABLE”. This means that the cooling tower is supplying cooled water at the
design temperature. The model follows the same logic explained earlier in this study, now applied to the
whole cooling tower (all six cells) and one maintenance team.
Figure 10 shows, as an example, a different situation in which some cooling tower cells (chosen randomly) have
experienced some sort of failure while the maintenance team is already busy with another repair.
Figure 10: Cooling Tower Cells Experiencing Failures (Considering One Maintenance Team).
The Petri net logic in Fig. 10 works as follows:
• Suppose Cell 2 is the first to experience a failure in this example. Its token fires to the place “ALARM-2” and,
since the maintenance team is available, fires again to the place “REPAIR-2” through immediate transition “T3”. The
maintenance team token then fires from the place “AVAILABLE” to “UNAVAILABLE” through its immediate
transition “T13”.
• Cell 5 is the second to experience a failure, having reached its “MTBF9”. Its token fires to the place
“ALARM-5”. Since the maintenance team is unavailable, the cell keeps working until either the maintenance team
becomes available and the token fires to the place “REPAIR-5”, or the cell reaches “MTBF10” and the token fires to
the place “EQUIPEMENT IN FAILURE-5”, which is the case in this example.
• The cooling tower will experience other failures, and the PN logic continues as follows.
• Cell 1 reaches its “MTBF1” and its token fires to the place “ALARM-1”. Since the maintenance team is still
unavailable, the cell keeps working until either the maintenance team becomes available and the token fires to the
place “REPAIR-1”, or the cell reaches “MTBF2” and the token fires to the place “EQUIPEMENT IN FAILURE-1”.
• Cell 6 reaches its “MTBF11” and its token fires to the place “ALARM-6”. Since the maintenance team is still
unavailable, the cell keeps working until it reaches its “MTBF12”, and the token then fires to the place
“EQUIPEMENT IN FAILURE-6”, as in the case of Cell 5 above.
• When Cell 2 reaches its “MTTR”, its token fires to the place “EQUIPMENT OPERATING-2” and the maintenance
team becomes available to repair any other cell, chosen randomly, and so on for the other cells.
• The process then continues to run as explained above.
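The token flow described above can be mimicked with a small discrete-event simulation. The following Python sketch is
only an illustration and is not the SPN model used in this study. It assumes exponentially distributed delays, and the
function name `simulate`, its parameters, and all numerical values are hypothetical. Here `mean_to_alarm` plays the
role of the odd-numbered transitions (e.g. “MTBF9”), `mean_alarm_to_failure` that of the even-numbered ones (e.g.
“MTBF10”), and `mean_repair` that of “MTTR”.

```python
import heapq
import random


def simulate(n_cells=6, n_teams=1, horizon=8760.0, mean_to_alarm=2000.0,
             mean_alarm_to_failure=200.0, mean_repair=24.0, seed=0):
    """Return the fraction of cell-time spent in each place (hypothetical values)."""
    rng = random.Random(seed)
    state = ["OPERATING"] * n_cells
    since = [0.0] * n_cells          # time of each cell's last state change
    version = [0] * n_cells          # used to discard timers made obsolete by a repair
    totals = {"OPERATING": 0.0, "ALARM": 0.0, "FAILURE": 0.0, "REPAIR": 0.0}
    free_teams = n_teams
    events = []                      # heap of (time, cell, kind, version)

    def set_state(t, cell, new_state):
        totals[state[cell]] += t - since[cell]
        state[cell], since[cell] = new_state, t

    def schedule(t, cell, kind):
        heapq.heappush(events, (t, cell, kind, version[cell]))

    def start_repair(t, cell):
        nonlocal free_teams
        free_teams -= 1              # AVAILABLE -> UNAVAILABLE
        version[cell] += 1           # cancels a pending alarm-to-failure timer
        set_state(t, cell, "REPAIR")
        schedule(t + rng.expovariate(1.0 / mean_repair), cell, "repaired")

    for cell in range(n_cells):
        schedule(rng.expovariate(1.0 / mean_to_alarm), cell, "alarm")

    while events:
        t, cell, kind, ver = heapq.heappop(events)
        if t > horizon:
            break
        if ver != version[cell]:
            continue                 # stale timer
        if kind == "alarm":
            set_state(t, cell, "ALARM")
            if free_teams > 0:
                start_repair(t, cell)
            else:                    # keep working until a team frees up or failure
                schedule(t + rng.expovariate(1.0 / mean_alarm_to_failure),
                         cell, "failure")
        elif kind == "failure":
            set_state(t, cell, "FAILURE")
        else:                        # "repaired": cell and team are released
            free_teams += 1
            version[cell] += 1
            set_state(t, cell, "OPERATING")
            schedule(t + rng.expovariate(1.0 / mean_to_alarm), cell, "alarm")
            waiting = [c for c in range(n_cells)
                       if state[c] in ("ALARM", "FAILURE")]
            if waiting:              # the freed team picks a waiting cell at random
                start_repair(t, rng.choice(waiting))

    for cell in range(n_cells):      # close the last interval of every cell
        totals[state[cell]] += horizon - since[cell]
    return {place: time / (n_cells * horizon) for place, time in totals.items()}
```

Calling `simulate(n_teams=1)` and `simulate(n_teams=2)` allows the share of time spent in the failure place to be
compared for one and two teams. The numbers produced depend entirely on the assumed parameters and do not reproduce
the results of this study.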
Figure 11 shows the reliability curve obtained for the cooling tower, considering one maintenance team for
repairs.
Figure 11: Cooling Tower – Reliability Curve (One Maintenance Team).
The reliability (R) and availability (A) obtained for this model are R(t) = 83.38% and A = 98.50%.
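As a reminder of what these two indexes measure, the sketch below shows one common way of estimating them from
simulated histories. The estimators used by the SPN tool are not detailed here, so this is an assumption made for
illustration only; the function names and the sample data are hypothetical.

```python
from bisect import bisect_right


def empirical_reliability(first_failure_times, t):
    """Fraction of simulated histories whose first system failure occurs after time t."""
    times = sorted(first_failure_times)
    return (len(times) - bisect_right(times, t)) / len(times)


def empirical_availability(uptime, downtime):
    """Fraction of the observed time during which the system was operating."""
    return uptime / (uptime + downtime)


# Hypothetical data: first-failure times (hours) of four simulated histories.
sample = [1200.0, 3400.0, 5600.0, 9100.0]
print(empirical_reliability(sample, 4000.0))   # 2 of the 4 histories survive past 4000 h
print(empirical_availability(985.0, 15.0))
```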
In the next SPN scenario, a second maintenance team is added to the cooling tower system, so that two maintenance
teams are available, as shown in Fig. 12.
Figure 12: Cooling Tower: Six Cells Working (Two Maintenance Teams)
Following the same logic presented earlier, some cells will experience failures, as shown in Fig. 13.
Figure 13: Cooling Tower Cells Experiencing Failures (Two Maintenance Teams).
The Petri net logic in Fig. 13 works as follows:
• Suppose Cell 2 is the first to experience a failure in this example. Its token fires to the place “ALARM-2” and,
since both maintenance teams are available, fires again to the place “REPAIR-2” through immediate transition “T3”.
One maintenance team token, chosen randomly, then fires from the place “AVAILABLE” to “UNAVAILABLE” through its
immediate transition.
• Cell 4 is the second to experience a failure. Its token fires to the place “ALARM-4” and, since one
maintenance team is still available, fires again to the place “REPAIR-4” through immediate transition “T7”. Now
neither maintenance team is available for further repairs.
• The cooling tower will experience other failures, and the PN logic continues as follows.
• Cell 1 reaches its “MTBF1” and its token fires to the place “ALARM-1”. Since neither maintenance team is
available, the cell keeps working until either a maintenance team becomes available and the token fires to the
place “REPAIR-1”, or the cell reaches “MTBF2” and the token fires to the place “EQUIPEMENT IN FAILURE-1”.
• Cell 5 reaches its “MTBF9” and its token fires to the place “ALARM-5”. Neither maintenance team is available,
so the cell keeps working until it reaches “MTBF10”, and the token then fires to the place “EQUIPEMENT IN
FAILURE-5”. It stays there until a maintenance team becomes available to repair this cell.
• Cell 6 reaches its “MTBF11” and its token fires to the place “ALARM-6”. Since neither maintenance team is
available, the cell keeps working until either a maintenance team becomes available and the token fires to the
place “REPAIR-6”, or the cell reaches “MTBF12” and the token fires to the place “EQUIPEMENT IN FAILURE-6”.
• When Cell 2 or Cell 4 reaches its “MTTR”, its token fires to the place “EQUIPMENT OPERATING” and the
corresponding maintenance team becomes available to repair any other cell, chosen randomly.
• The process then continues to run as explained above.
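The role of the maintenance team tokens as a shared resource can be illustrated with a minimal token game. The Python
sketch below is a simplification made for illustration only: timing is ignored, and each hypothetical transition
built by `start_repair` merges the immediate transition of the cell and the immediate transition of the team into a
single step, whereas the description above treats them as separate transitions.

```python
from collections import Counter


def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking[place] >= n for place, n in transition["in"].items())


def fire(marking, transition):
    """Return the marking reached by firing an enabled transition."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = Counter(marking)
    new_marking.subtract(transition["in"])    # consume input tokens
    new_marking.update(transition["out"])     # produce output tokens
    return new_marking


def start_repair(cell):
    """Hypothetical merged transition: ALARM-k + AVAILABLE -> REPAIR-k + UNAVAILABLE."""
    return {"in": {f"ALARM-{cell}": 1, "AVAILABLE": 1},
            "out": {f"REPAIR-{cell}": 1, "UNAVAILABLE": 1}}


# Cells 2, 4 and 1 are in alarm and two maintenance teams are available.
marking = Counter({"ALARM-2": 1, "ALARM-4": 1, "ALARM-1": 1, "AVAILABLE": 2})
marking = fire(marking, start_repair(2))
marking = fire(marking, start_repair(4))
print(marking["AVAILABLE"], marking["UNAVAILABLE"])   # both teams are now busy
print(enabled(marking, start_repair(1)))              # Cell 1 has to wait
```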
Figure 14 compares the reliability curves obtained for the cooling tower system with one and two maintenance teams.
Figure 14: Cooling Tower – Reliability Curve (One and Two Maintenance Teams)
The new values for reliability and availability are R(t) = 92.41% and A = 100%.
Finally, Tab. 1 shows the results obtained from both analyses made on this study.
Table 1: Reliability and Availability (One & Two Maintenance Teams)
Table 1 shows that having a maintenance team available to repair the cooling tower when needed improves the
performance of the entire system. With the reliability R(t) and availability (A) values in Tab. 1 determined, a
further question is how improvements in equipment design and maintenance plans over the years would affect R(t) and
A. As an example, the SPN is run with MTTR and MTBF improved by 5% per year for three years.
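A minimal sketch of how such a yearly improvement could be applied to the model inputs is given below. It assumes
that “improved” means a 5% longer MTBF and a 5% shorter MTTR, compounded over the previous year, and it uses the
steady-state availability of a single repairable item, A = MTBF / (MTBF + MTTR), only as a rough indicator rather than
as the SPN result. The function name and the starting values are hypothetical and are not the plant data.

```python
def yearly_improvement(mtbf, mttr, rate=0.05, years=3):
    """Return (year, MTBF, MTTR, availability) rows for compounded improvements."""
    rows = []
    for year in range(1, years + 1):
        mtbf *= 1.0 + rate       # assumed: MTBF grows by `rate` each year
        mttr *= 1.0 - rate       # assumed: MTTR shrinks by `rate` each year
        rows.append((year, mtbf, mttr, mtbf / (mtbf + mttr)))
    return rows


# Hypothetical starting point: MTBF = 1000 h, MTTR = 20 h.
for year, mtbf, mttr, availability in yearly_improvement(1000.0, 20.0):
    print(f"year {year}: MTBF = {mtbf:.1f} h, MTTR = {mttr:.2f} h, A = {availability:.4f}")
```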
Figure 15 shows the reliability values for three consecutive years, considering a 5% improvement over the previous
year. Over these three years the reliability values for one and two maintenance teams become very close, so
management could decide not to keep two teams because of the small gain in reliability. However, to achieve those
improvements, component design and maintenance plans will need to change as well, which usually means more investment.
Before changing component design, a good strategy is to improve the maintenance plan by increasing the number of
periodic inspections of the cooling tower system, such as gearbox (oil level in reservoir, gaskets, and fasteners),
pumps (flow and noise), motors (overheating, vibration, and fasteners), fans (blade vibration, noise), water quality
(control of corrosion, deposition, and microbiological growth), hot and cold water piping (leaks, flow, and
cleanliness), etc. Experience shows that pinpointing minor problems during periodic inspections and giving them proper
priority before they become major problems can be useful.
Another way to improve those indexes is to implement a diagnostic system in which failures are pointed out
automatically. This would reduce the time spent investigating root causes, which in turn can reduce the repair
time.
Figure 15: Reliability Improvement over the Years.
4 SUGGESTIONS FOR REFINEMENT
The purpose of this study is to introduce reliability and availability analysis using the SPN concept. A cooling tower
system was chosen to illustrate the analysis. The SPN model considered two scenarios: first, only one maintenance team
available and, second, two maintenance teams available for repairs. Reliability and availability were determined for
both scenarios. A further analysis considered a 5% improvement in MTBF and MTTR over each of the following three
years. However, the SPN model did not consider all the components that form this cooling tower, called here by the
authors the “System”. To bring the model closer to a real cooling tower system, it would be necessary to add more
components to the SPN model (more places and transitions). This is feasible, since sensor readings are available for
most of them, and where there are no sensor readings, records of regular inspections are kept by maintenance staff. It
would also allow more systems to be added to the study, such as the condenser, vacuum pumps, condensate extraction
pumps, feed water pumps, etc. Even without considering all components in the cooling tower analysis, the model has
proven to be useful and practical.
5 CONCLUSION
This study has attempted to model a cooling tower system in a coal-fired power plant. The proposal is based on a
Stochastic Petri Net model as an alternative for determining the reliability and availability of this system. Petri
net models provide a powerful modeling tool for representing complex systems. Control room operators at the power
station need a system to support their decisions during critical situations and to reduce the time delay between
alarm and failure. This helps maintenance management plan how to take action faster.
A Petri net model for the cooling tower system is proposed to deal with alarms and to make visible when to take
action before failure occurs. The study assumed one or two maintenance teams to perform all repairs at the cooling
tower. However, one has to keep in mind that maintenance teams serve the entire power plant, whereas this study
considered them dedicated exclusively to the cooling tower. MTTR was used as extracted from the database system. On a
daily basis, management assigns priorities to the maintenance and repair jobs performed by these teams. This could
introduce some uncertainty into the SPN model itself, since the MTTR for the cooling tower might increase because of
this repair prioritization.
For future study, management could take a deep dive into the system to learn the exact number of failures and their
frequency. Based on the results, management can build business cases for hiring more professionals or for contracting
outsourced suppliers who can come in for emergency repairs. The results have shown that the more maintenance teams are
available, the better the reliability and availability of the cooling tower system will be. Establishing the
relationship between reliability improvement and coal consumption (the fuel used in this plant) is another suggestion
for future study.
Spare parts arrival and/or availability for maintenance plays an important role in planning the operation sequences,
since it strongly affects the MTTR index. The technical knowledge of the maintenance teams is also an important
variable in this scenario: the better trained the maintenance teams are, the better the repair and, as a consequence,
the better the reliability and availability indexes. Another limitation of this model is that, in the real world, the
number of components in the cooling tower is far larger than the number considered in this Petri net model. The
complex system structure and operation options will need more transitions and places in the model. From the results
obtained in this study, it is clear that maintenance team availability is one way to improve quality in power plant
operation.
ACKNOWLEDGEMENTS
The research reported here was supported in part by both the Foundation for Engineering Technological
Development (FDTE) and the CAPES Foundation. The authors are deeply grateful for this support.
REFERENCES
Burhanuddin, M. A., Ghani, M. K. A., Ahamad A., Abas Z. A., Izzah, Z., 2014, “Reliability analysis of the failure data
in industrial repairable systems due to equipment risk factors”, Applied Mathematical Sciences, Vol. 8, nº 31,
pp.1543-1555.
Carazas F. J. G. and Souza, G. M. F., 2009. “Method for cooling towers maintenance policy selection based on RCM
concepts”. 20th International Congress of Mechanical Engineering, November 15-20, Gramado, RS, Brazil.
Gu, T., Bahri, P., A., 2002, “A survey of Petri net application in batch process”. Computers in Industry Vol. 47, pp. 99-111.
IEEE Standard Glossary of Software Engineering Terminology, 1990.
< http://www.mit.jyu.fi/ope/kurssit/TIES462/Materialit/IEEE_SoftwareEngGlossary.pdf > 06Oct17.
Kiran Naik, B. and Muthukumar P., 2017, “A novel approach for performance assessment of mechanical draft wet
cooling towers”. Applied Thermal Engineering. Vol. 121, pp.14-26.
Leclercq, E., Lefebvre, D., Ould El Medhi, S., 2009, “Identification of timed stochastic Petri net models with normal
distributions of firing periods”. Proceedings of the 13th IFAC Symposium on Information Control Problems in
Manufacturing, Moscow, June 3-5, 2009.
Lee, A., Lu, L., 2012, “Petri net modeling for probabilistic safety assessment and its application in the air lock
system of a CANDU nuclear power plant”, Procedia Engineering Vol. 45, pp.11-20.
Lee, J., Liu, K., F., R., Chiang, W., 2003, “Modeling Uncertainty Reasoning With Possibilistic Petri Nets”. IEEE
Transactions on Systems, Man and Cybernetics, Vol. 33, Nº 2, April 2003.
Mehrez, A., Muzumdar, M., Acar, W., Weinroth, G., 1995, “A Petri Net Model View of Decision Making: an
Operational Management Analysis”. International Journal of Management, Vol. 23, Nº 1, pp.63-78.
National Aeronautics and Space Administration, 1995, “Systems Engineering Handbook”. SP610S.
< https://web.stanford.edu/class/cee243/NASASE.pdf > 02 Oct 2017.
O’Connor, P.D.T. and Kleyner, A., 2012. “Practical Reliability Engineering”. WILEY – Fifth Edition.
Pamuk, N., Uyaroglu, Y., 2012, “An Expert System for Power Transformer Fault Diagnosis Using Advanced
Generalized Stochastic Petri Net”, PRZEGLĄD ELEKTROTECHNICZNY (Electrical Review), ISSN 0033-2097,
R. 88.
Reddy, G., B., Murty, S., S., N., Ghosh, K., 1993, “Timed Petri Net: An Expeditious Tool for Modelling and Analysis of
Manufacturing Systems”. Mathl. Comput. Modelling Vol. 18, No. 9, pp.17-30. Pergamon Press.
Sabouhi H., Abbaspour, A., Fotuhi-Firuzabad M., Dehghanian, P., 2016, “Reliability modeling and availability analysis
of combined cycle power plants”. Electrical Power and Energy Systems Vol. 79 pp.108-119.
Stanford III, H., W., 2003, “HVAC Water Chillers and Cooling Towers”, Marcel Dekker, Inc., ISBN: 0-8247-0992-6.
Souza, G. M. F., 2012, “Thermal Power Plant Performance Analysis”. Springer-Verlag London Limited. ISBN 978-1-4471-2308-8.
SPX Cooling Technologies – Cooling Tower Fundamentals. 02 Sep. 2017
< http://spxcooling.com/pdf/Cooling-Tower-Fundamentals.pdf>
Srinivasa Rao, M., Naikan, V. N. A., 2014, “Reliability analysis of repairable systems using system dynamics
modeling and simulation”, J. Ind. Eng. Int. DOI 10.1007/s40092-014-0069-3.
Torell, W., Avelar, V., 1997, “Mean Time Between Failures: Explanation and Standards”. Schneider Electric’s Data
Center Science Center. <http://www.apc.com/salestools/VAVR-5WGTSB/VAVR-5WGTSB_R1_EN.pdf> 06Oct2017.
Yang, S., K., Liu, T., T., 1998, “A Petri Net Approach to Early Failure Detection and Isolation for Preventive
Maintenance”. International Journal of Quality and Reliability Engineering, Vol. 14, Issue 5, pp.319-330.
Zimmermann, A., Knoke, U., 2007, “A Software Tool for the Performability Evaluation with Stochastic and Colored
Petri Nets”. <http://www2.tu-ilmenau.de/sse_file/timenet/ManualHTML4/UserManual.html> 02Oct2017.
RESPONSIBILITY NOTICE
The authors are solely responsible for the printed material included in this paper.