EVOLUTIONARY ALGORITHM AND EXPECTATION MAXIMISATION STRATEGIES FOR IMPROVED DETECTION OF PIPE BURSTS AND OTHER EVENTS IN WATER DISTRIBUTION SYSTEMS

Romano, M., Kapelan, Z., and Savić, D. A.

Abstract

A fully automated data-driven methodology for the detection of pipe bursts and other events which induce similar abnormal pressure/flow variations (e.g., unauthorised consumptions) at the District Metered Area (DMA) level has been recently developed by the authors. This methodology works by simultaneously analysing the data coming on-line from all the pressure and/or flow sensors deployed in a DMA. It makes synergistic use of several self-learning Artificial Intelligence (AI) and statistical techniques. These include: (i) wavelets for the de-noising of the recorded pressure/flow signals, (ii) Artificial Neural Network (ANN) models for the short-term forecasting of pressure/flow signal values, (iii) Statistical Process Control (SPC) techniques for the short and long term analysis of the burst/other event-induced pressure/flow variations, and (iv) Bayesian Inference Systems (BISs) for inferring the probability that a pipe burst/other event has occurred in the DMA being studied, raising the corresponding detection alarms, and providing information useful for performing event diagnosis. This paper focuses on the (re)calibration of the above detection methodology with the aim of improving the forecasting performance of the ANN models and the classification performance of the BIS used to raise the detection alarms (i.e., DMA level BIS).
This is achieved by using: (1) an Evolutionary Algorithm optimisation strategy for selecting the best ANN input structures and related parameter values to be used for training the ANN models, and (2) an Expectation Maximisation strategy for (re)calibrating the values in the Conditional Probability Tables (CPTs) of the DMA level BIS. The (re)calibration procedure is tested on a case study involving several DMAs in the United Kingdom (UK) with real-life pipe bursts/other events, engineered pipe burst events (i.e., simulated by opening fire hydrants) and synthetic pipe burst events (i.e., simulated by arbitrarily adding “burst flows” to an actual flow signal). The results obtained illustrate that the new (re)calibration procedure improves the performance of the event detection methodology in terms of increased detection speed and reliability.

Michele Romano, Centre for Water Systems, College of Engineering, Mathematics and Physical Sciences, University of Exeter, Harrison Building, North Park Road, Exeter, Devon, EX4 4QF, United Kingdom, Email: mr277@exeter.ac.uk (corresponding author)
Zoran Kapelan, Centre for Water Systems, College of Engineering, Mathematics and Physical Sciences, University of Exeter, Harrison Building, North Park Road, Exeter, Devon, EX4 4QF, United Kingdom, Email: Z.Kapelan@exeter.ac.uk
Dragan A. Savić, Centre for Water Systems, College of Engineering, Mathematics and Physical Sciences, University of Exeter, Harrison Building, North Park Road, Exeter, Devon, EX4 4QF, United Kingdom, Email: D.Savic@exeter.ac.uk

INTRODUCTION

Pipe burst events in Water Distribution Systems (WDSs) are a pressing issue for water companies worldwide, as the loss of large volumes of treated and frequently pumped water is environmentally and economically damaging. Furthermore, they have a negative impact on the water companies’ operational performance, customer service and reputation.
Cost-effective reduction of water leakages caused by pipe burst events is, however, a challenging task. New and more efficient methodologies are required for the timely and reliable detection of these events. Note that, although pipe burst events only initiate water leakages, in this paper the two terms are used interchangeably (as is often done in engineering practice).

Despite its importance, the detection of pipe burst events is one of the tasks faced by the water companies that is becoming more and more difficult. The main reasons for this are the increasing complexity of the WDSs, the fact that water companies are starting to routinely implement pressure management programmes, and the installation of more and more polyethylene pipes (on which the conventional, acoustic equipment-based techniques do not work that well). As a result, in many cases, pipe bursts are brought to the attention of a water company only when someone calls in to report a visible event.

A fully automated, data-driven methodology for the near real-time detection of pipe bursts and other events at the District Metered Area (DMA) level has been recently developed (Romano et al. 2012). This methodology is implemented in a fully automated Event Recognition System (ERS) which works by simultaneously analysing the data coming on-line from all the pressure and/or flow sensors deployed in a DMA. It makes synergistic use of several self-learning Artificial Intelligence (AI) and statistical techniques. These include Artificial Neural Network (ANN) models for the short-term forecasting of pressure/flow signal values, and a DMA level Bayesian Inference System (BIS) for inferring the probability that a pipe burst/other event has occurred in the DMA being studied and raising the corresponding detection alarms.
The objective of the work presented here is to develop a new methodology for the effective data-driven (re)calibration of the ERS by using: (1) an Evolutionary Algorithm optimisation strategy (Schwefel 1981) for improving the forecasting performance of the ANN models, and (2) an Expectation Maximisation strategy (Dempster et al. 1977; Lauritzen 1995) for improving the classification performance of the DMA level BIS.

This paper is organised as follows. After this introduction, relevant background information is presented in the next section. Then an overview of the ERS is given. This is followed by two sections presenting the theoretical background and methodological details of the Evolutionary Algorithm and Expectation Maximisation optimisation strategies, respectively. The latter sections constitute the core of the new contribution presented in this paper. Then, the results of the (re)calibration methodology tests on several DMAs in the United Kingdom (UK) with real-life pipe burst/other events, engineered pipe burst events (i.e., simulated by opening fire hydrants) and synthetic pipe burst events (i.e., simulated by arbitrarily adding fictitious “burst flows” to an actual flow signal) are presented in the case study section. Finally, the main conclusions are drawn and acknowledgements given.

BACKGROUND

Current solutions to the pipe burst event detection problem are based on various principles. A large group of techniques utilises highly specialised hardware equipment. Techniques such as leak noise correlators (e.g., Grunwell and Ratcliffe 1981), gas injection (e.g., Field and Ratcliffe 1978), and pig-mounted acoustic sensing (e.g., Mergelas and Henrich 2005) belong to this group. Despite the fact that some of these techniques are the most accurate ones used today (Puust et al.
2010), they are expensive, labour-intensive, slow to run and may require the cessation of pipeline operations for long periods of time. Consequently, much research has been focussed on finding equally effective, but faster techniques that cost less to run.

Several techniques exist that make use of transient analysis. These include inverse transient analysis (e.g., Liggett and Chen 1994) and frequency domain techniques (e.g., Mpesha et al. 2001). Other techniques from this group seek to exploit special features in the unsteady signal rather than infer the presence and location of a pipe burst by reproducing the transient trace in a simulator in the time or frequency domain (e.g., Brunone 1999; Kim 2005; Wang et al. 2002). The techniques in this group are potentially appealing due to their inexpensive and non-invasive nature, good operational range, relative insensitivity to pipe characteristics (e.g., material and diameter), and sensor-to-sensor spacing required. However, transient analysis-based techniques require pressure and other measurements sampled at a high frequency and a significantly larger number of sensors than are normally present in a pipeline system, thereby leading to increased costs. Also, these techniques often rely on complex and inaccurate transient network simulation models and require precise knowledge of the pipeline and instrumentation parameters. As a result, they have had limited success so far, generally on single pipelines only. Various steady state analysis-based techniques have also been developed (e.g., Pudar and Liggett 1992; Misiunas et al. 2006; Puust et al. 2006; Wu et al. 2010). Compared to the transient analysis-based techniques, these methods do not require the collection of data at high frequencies but seem to be reliant on the availability of an extensive number of accurate measurements.
In addition, the need for well calibrated hydraulic models still significantly limits their widespread use by the water companies.

Note that the aforementioned techniques are based on intermittent pipeline inspections in the field and involve manual and resource intensive processes for data collection, transfer to the point of use, and analysis. By contrast, automatic detection techniques that make use of data from permanently installed sensors may provide a more rapid response to pipe burst events. Negative pressure wave-based techniques (e.g., Misiunas et al. 2005; Srirangarajan et al. 2011) belong to this group. The sensors required by these techniques, however, are quite expensive. Furthermore, as they record the data at very high frequencies, the maintenance and data transmission costs are significant as well. These techniques have the same limitations as the transient analysis-based techniques. Additionally, they have to cope with the unknown shape of a pipe burst event-induced transient, noise from the normal pipeline/pipeline system operations and often very weak pipe burst signatures. All this makes these techniques difficult to apply to large pipeline systems with complicated configurations.

Techniques that attempt to use statistical and AI techniques for automatically processing operational variables (e.g., pressure and flow) in near real-time have recently started to appear. This is mainly due to the latest developments in hydraulic sensor technology/on-line data acquisition systems, which have enabled water companies to deploy a larger number of more accurate and cheaper pressure and flow devices. To this group belong techniques such as the hybrid ANN/Fuzzy Inference System described in Mounce et al. (2010), the Principal Component Analysis based technique proposed by Palau et al.
(2011), the Adaptive Kalman Filtering technique proposed by Fenner and Ye (2011), and the Support Vector Machines technique proposed by Mounce et al. (2011). These techniques are very promising in the context of extracting useful information (required for making reliable operational decisions) from the vast amount of often imperfect sensor data collected by modern Supervisory Control And Data Acquisition (SCADA) systems. Statistical/AI-based techniques require pressure and/or flow measurements sampled much less frequently (e.g., every 15 minutes) than those required for transient analysis. Also, the pressure and/or flow measurements come from a limited number of sensors permanently installed in the pipeline system. Furthermore, these techniques rely on empirical observation of the behaviour of the pipeline system, thus precise knowledge of the pipeline and instrumentation parameters is not required.

Despite their initial success, the aforementioned statistical/AI-based techniques can be further improved in terms of both detection reliability and detection time. Romano et al. (2012) described the development of a methodology for the automated near real-time detection of pipe bursts and other events (which induce similar abnormal pressure/flow variations) that offers noticeable improvements over the aforementioned statistical/AI-based techniques.
They showed that the use of: (i) advanced techniques for more efficient and effective processing of the hydraulic data gathered (i.e., wavelets for removing noise from the measured flow and especially pressure signals), (ii) different ensembles of statistical/AI techniques (i.e., Statistical Process Control, and ANNs) for recognising the various types of evidence of a burst/other event occurrence, and (iii) a probabilistic inference engine based on Bayesian Networks (Edwards 2000; Jensen 2001) for simultaneously (synergistically) analysing multiple evidence/signals at the DMA level, resulted in more reliable and faster detections.

EVENT RECOGNITION SYSTEM OVERVIEW

The pressure and flow signals from a DMA show daily, weekly and seasonal variations and are also influenced by socioeconomic and meteorological factors such as population characteristics or number of industrial establishments and air temperature or precipitation. Furthermore, they show a different behaviour under the DMA normal and abnormal (e.g., when a pipe burst has occurred) operating conditions. However, their pattern is specific to a particular location. This means that, during the DMA normal operating conditions, a signal may show a predictable pattern or fingerprint. It should therefore be feasible, during the DMA abnormal operating conditions, to identify the pipe burst/other event-induced pressure/flow deviations from this predictable pattern. Note that hereafter event will be used as a generic term to indicate pipe bursts and other events which induce similar abnormal pressure/flow variations. The acronym NOP (i.e., Normal Operating Pattern) will be used to refer to the pattern of a (pressure or flow) signal assuming that no event occurred in the DMA being studied (i.e., during the DMA normal operating conditions).
In the light of the above assumptions, a computer-based ERS which implements a novel event detection methodology has been recently developed (Romano et al. 2012). This section presents a brief overview of this system together with relevant details associated with the newly introduced Evolutionary Algorithm and Expectation Maximisation based modules which enable implementing the novel (re)calibration methodology presented in this paper. A more detailed description of the ERS is available in the above reference.

The data processing route in the ERS starts by receiving the data communicated by the sensors deployed in the DMA being studied. For each DMA signal and at each communication interval, u readings are obtained (e.g., 2 readings, assuming 15 minute sampled data that are communicated every 30 minutes to improve the sensors’ battery life). These readings update a time series record which is stored in a Time Series database. Once all the DMA signals are fully processed as described below, the resulting u probability values that an event has occurred in the DMA and any additional information that may be used to perform a diagnosis of the event occurring (e.g., to determine the likely cause of an alarm) are stored in the Alarms database. If any of the u probability values exceeds a fixed detection threshold, an alarm is generated. In order to avoid raising unnecessary detection alarms for the same event at the following communication intervals, however, any further detection alarm is suppressed for a user specified ‘alarm inactivity time’ period. Note that the choice of this parameter depends on the time taken to repair the burst and for the pressure/flow variations to return to normal.

Figure 1. To appear here.

Figure 1 shows a diagrammatic representation of the ERS.
As can be observed from this figure, the processing of the pressure/flow data is performed in the four ERS components (i.e., dashed dotted rectangles) as follows:

1. Capturing of the NOPs of the various DMA pressure/flow signals;
2. Identification and estimation of the event-induced deviations between observed (i.e., measured) and captured DMA signal patterns;
3. Inference about the probability that an event has actually occurred based on the above deviations;
4. (Re)calibration of the probabilistic inference engine based on the information about past events that occurred in the DMA being studied.

Note that the first three ERS components are used in a fully automatic fashion for the actual event detection. The last ERS component is used in a semi-automatic fashion (i.e., at the behest of the user when the past events information is available) for the initial calibration and the follow-on periodic recalibrations of the DMA level BIS.

Figure 1 also shows that the aforementioned four ERS components are further organised into six subsystems (i.e., solid snipped corner rectangles) each containing a number of different modules (i.e., solid rectangles). The six ERS subsystems are as follows: (1) the Setup subsystem, (2) the Discrepancy Based Analysis (DBA) subsystem, (3) the Boundary Based Analysis (BBA) subsystem, (4) the Trend Based Analysis (TBA) subsystem, (5) the Inference subsystem, and (6) the Bayesian Inference System (BIS) parameters learning subsystem.

The first ERS subsystem (equivalent to the first ERS component) is used to perform the pressure/flow signal NOP capturing. Its first two modules (i.e., data retrieval, and data pre-processing) are used for retrieving the historical data from the Time Series database and assembling a set of data that best represents the most recent NOP of the DMA signal being analysed (i.e., NOP data set).
Once this is done, the third module (i.e., statistics estimation) is used for estimating several vectors of descriptive statistics from the NOP data set. The remaining modules (i.e., the data de-noising module, the newly introduced ANN parameters & input structure selection module, and the ANN training & testing module), on the other hand, are first used for removing noise from the NOP data set and then for: (i) automatically selecting the optimal (i.e., yielding the best forecasting performance) ANN input structure and parameters set (by means of an Evolutionary Algorithm optimisation strategy), (ii) training and testing a “specialised” (i.e., signal-specific) ANN prediction model (by using the optimised ANN input structure and parameters set), and (iii) estimating the ANN model prediction error’s variability.

The second, third and fourth ERS subsystems are used synergistically to perform the deviations identification and estimation data analysis in the second ERS component. This is done as follows: (i) the DBA subsystem checks that the discrepancies between the incoming observed DMA signal values and their ANN predicted counterparts do not exceed pre-defined limits based on the estimated ANN model prediction error’s variability, (ii) the BBA subsystem checks that the incoming observed DMA signal values lie inside a “data envelope” whose boundaries are defined by using the vectors of descriptive statistics estimated from the NOP data set, and (iii) the TBA subsystem monitors, on a Control Chart (Shewhart 1931), how the mean of the historical DMA signal values recorded during a particular time window during the day (e.g., from midnight to 4 am, 4 am to 8 am, etc.) varies over time.
The reason for using three subsystems is that, by using an ensemble of different statistical and AI techniques, each of them focuses on recognising a specific type of evidence that an event has occurred. Furthermore, since they perform tasks in parallel, they allow simultaneously assessing how an event affects the pressure/flow measurements from different perspectives (e.g., short-term and long-term effects).

The fifth ERS subsystem (equivalent to the third ERS component) is used to perform the event probability inference data analysis. Starting from the event occurrence evidence generated as described above, BISs are used here for: (i) combining the generated event occurrence evidence, (ii) inferring the probability of an event occurrence in the DMA, (iii) raising detection alarms, and (iv) providing additional information for incident diagnosis.

Finally, the newly introduced sixth ERS subsystem, namely the BIS parameters learning subsystem (equivalent to the fourth ERS component), is used to perform the inference engine (re)calibration data analysis. The main aim is to improve the classification performance of the DMA level BIS and hence the detection performance of the ERS. As will be shown in the relevant section, this is achieved by using: (i) an Expectation Maximisation strategy, and (ii) information (i.e., start time and duration) about the past events that occurred in the DMA being studied (e.g., obtained from the water company’s historical records and/or from periodic reviews of the alarms raised by the ERS). The data analyses carried out in this subsystem are organised into the following two modules: (1) data retrieval module, and (2) Expectation Maximisation (EM) based parameters learning module. The first module is used for retrieving the past events information (assuming that it has been obtained and stored in the Alarms database).
This information forms the data set of ‘event cases’ which is then used in the second module for (re)calibrating the parameters of the DMA level BIS. Once this is done, the (re)calibrated parameters are passed on to the Inference subsystem for use in the DMA level BIS.

As shown in Figure 1, the ERS has three main modes of operation: (1) the “Assemble” mode, (2) the “Execute” mode, and (3) the “Learn” mode. These modes of operation define the time schedule according to which the data analyses in each subsystem are performed. The “Assemble” mode is used for ‘tuning’ the data-driven ERS when it is initialised (i.e., used for the first time in a DMA). Later on, it is used: (i) regularly (i.e., every l days - e.g., every week) when the ERS is updated (to capture the latest normal operating conditions of a DMA) thereby providing a continuously adaptive ERS, and (ii) periodically (i.e., every h days - e.g., every three months) when the ERS is reinitialised (to account for the seasonal variations in the DMA’s pressure/flow regime, growing demand over time, etc.; or following occasional operational/other DMA changes - e.g., re-valving). The “Execute” mode is the normal operating mode used at every communication interval to detect the events and raise the alarms. Finally, the “Learn” mode may be used for the initial calibration and for the follow-on periodic recalibrations of the ERS probabilistic inference engine. As the data analyses performed in this mode of operation require the past events information, however, its actual utilisation depends on whether or not this information is available/considered.

ARTIFICIAL NEURAL NETWORK PARAMETERS AND INPUT STRUCTURE OPTIMISATION

An ANN model is used in the ERS for performing short-term prediction (i.e., one time step ahead) of the future values of the particular DMA signal being analysed.
The reason for choosing an ANN model is the inherent complexity of the WDSs. The main aim is to exploit the ability of this powerful data modelling tool to model any function without explicit knowledge of the parameters involved. Although the efficiency and superiority of the ANN models over approaches that employ time-series analysis, regression models, and Autoregressive Integrated Moving Average models in modelling and forecasting water consumption have been demonstrated in a large number of studies (e.g., Adamowski 2008), several issues have to be considered in order to build ANN models that exhibit good forecasting performance for different DMA signals. These issues include the choice of the ANN input structures, and of the ANN parameters.

Figure 2. To appear here.

Generally speaking, as in the ERS presented in Romano et al. (2012), a suitable ANN input structure for solving the problem at hand would include a certain number of past pressure/flow values (i.e., LagSize), and other explanatory variables such as the Time of the Day (TofD) (i.e., a value between 1 and g, where 1 corresponds to midnight and g is the number of samples in one day) and the Day of the Week (DofW) (i.e., a value between 1 and 7) associated with the forecasting horizon (i.e., next time step). This is shown in Figure 2. Note that in this framework a field type form (i.e., ones and zeros) has to be used for encoding the TofD and DofW indices. This is motivated by data representation issues when using ANN models. These indices, in fact, can be seen as a finite set of categorical data. Although they have a natural ordering (e.g., Tuesday follows Monday), they do not have an intrinsic ‘value’ associated with them (e.g., Tuesday is not more important than Monday).
Thus, to prevent the ANN model from learning these artificial relationships, a ‘binary’ flag has to be created for each possible value (e.g., Monday-flag=0000001, Tuesday-flag=0000010, etc.). A strategy for selecting suitable ANN parameters (that enables striking a balance between learning and generalisation), on the other hand, would involve (e.g., Nelson and Illingworth 1991; Moody 1992): (i) choosing a sufficiently large number of hidden neurons (to ensure the ANN model is flexible), and (ii) controlling the number of training cycles and/or applying a penalisation coefficient (i.e., coefficient of Weight Decay Regularisation - Bishop 1995) to the weights of the ANN model (to control the flexibility of the ANN model).

With a view to the application of the ERS to entire WDSs, however, the use of the same LagSize, explanatory variables, number of hidden neurons, number of training cycles, and coefficient of Weight Decay Regularisation for all the analysed DMA signals (i.e., a pre-defined ANN input structure and parameters set) has the potential to lead to the development of ANN prediction models that exhibit sub-optimal forecasting performance. This is because signals from different DMA types (e.g., rural, residential, etc.) and different DMA signal types (e.g., pressure vs. flow) may show extremely varying patterns.

Bearing in mind the above, the objective of the new methodology presented here is, for each analysed DMA signal, to select the ANN input structure and parameters set that enables the resulting ANN prediction model to yield the best forecasting performance. It is important to stress that the potential benefits resulting from doing this are two-fold. On the one hand, the quality of the ANN models’ predictions improves.
On the other hand, as the resulting ANN models are signal-specific, the ERS becomes “tailored” to the particular DMA to which it is applied, whilst also being more generally applicable to different DMAs.

It is clear, however, that the selection of the optimal input structure and parameters set for each ANN model is a combinatorial problem. In this scenario, and bearing in mind that the ERS potentially has to deal with many hundreds of different signals, the use of a manual trial and error procedure would not be feasible. Similarly, the use of an automated full enumeration procedure would be far too computationally expensive. Therefore, an Evolutionary Algorithm optimisation strategy is brought into play here. The main reason for this is that Evolutionary Algorithms do well in large search spaces by working only with a sample population and have the power to discover good solutions rapidly for difficult high-dimensional problems. Thus, they enable circumventing the computational limitations of the “brute-force” methods that use full search space enumeration.

The term Evolutionary Algorithm is used for a broad spectrum of heuristic approaches that simulate evolution. Primary examples include Genetic Algorithms (Holland 1975), and Evolutionary Strategies (Schwefel 1981). Because of their relative efficiency, Evolutionary Algorithms have been extensively applied in the water resources planning and management field to solve a wide range of problems (see Nicklow et al. 2010). These include (similarly to what is presented in this paper) the optimal ‘design’ of ANN prediction models (e.g., Giustolisi and Simeone 2006).
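The ANN input structure described earlier in this section (LagSize past values plus field type TofD and DofW flags) can be illustrated with a short sketch. The function names, the 1-based indices and the left-to-right position of the flag bit are assumptions made for illustration only; they are not taken from the ERS implementation.

```python
import numpy as np

def one_hot(index, size):
    """Return a 0/1 flag vector of length `size` with a single 1 for a 1-based index."""
    flags = np.zeros(size, dtype=int)
    flags[index - 1] = 1
    return flags

def build_input(past_values, lag_size, tofd, dofw, samples_per_day):
    """Assemble one input vector for forecasting the next time step.

    past_values     : recent readings of the signal, oldest first
    lag_size        : number of most recent readings to include (LagSize)
    tofd, dofw      : Time of the Day (1..g) and Day of the Week (1..7)
                      of the time step being forecast
    samples_per_day : g, the number of samples in one day
    """
    lags = np.asarray(past_values[-lag_size:], dtype=float)
    return np.concatenate([lags, one_hot(tofd, samples_per_day), one_hot(dofw, 7)])

# Hypothetical readings; 15 minute sampling gives g = 96 samples per day.
x = build_input([10.2, 10.5, 11.0, 11.4], lag_size=3, tofd=1, dofw=2, samples_per_day=96)
print(x.shape)
```

With this encoding the length of the input vector is LagSize + g + 7, so the choice of LagSize and of which explanatory variables to include directly changes the size of the ANN input layer.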
Considering the ANN parameters and the variables that define the ANN input structure shown in Table 1 as the set of decision variables (i.e., design parameters) for the problem at hand, the Evolutionary Strategy described in Schwefel (1981) is used here for automatically finding the set of decision variables that minimises the ANN model prediction error on the test set (i.e., a randomly chosen subset - e.g., 30% - of the de-noised NOP data set). The ANN model prediction error on the test set is computed by using the Nash-Sutcliffe index (Nash and Sutcliffe 1970). Note that the ranges of values used in the optimisation for the ANN parameters and the variables that define the ANN input structure shown in Table 1 were selected after carrying out a number of preliminary tests aimed at defining the size of the search space that is likely to enable finding an optimal solution for the problem at hand.

Table 1. To appear here.

As shown in Figure 1, the above optimised set of decision variables is then passed on to the ANN training & testing module where it is used to build the signal-specific ANN prediction model. Note that Figure 1 also shows that the ANN parameters & input structure selection module is not used when the ERS is updated. The ANN parameters and input structure selected at ERS (re)initialisation continue to be used at each ERS updating time. The rationale is that only relatively minor changes are expected to affect the NOP of a DMA signal in the interval between two updates. Thus, in principle, the possible decline in the ANN forecasting performance does not justify the added computational burden of using the Evolutionary Algorithm optimisation strategy.
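The search loop described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the ERS: a ridge-regularised linear autoregression stands in for the ANN, only two decision variables (LagSize and the weight decay coefficient) are searched, their ranges are hypothetical rather than those of Table 1, and a simple (mu + lambda) Evolution Strategy with fixed mutation step sizes replaces the self-adaptive scheme of Schwefel (1981). The quantity minimised is 1 minus the Nash-Sutcliffe index on a randomly chosen 30% test subset.

```python
import numpy as np

def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe index: 1 is a perfect fit, 0 is no better than the observed mean."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 1.0 - np.sum((observed - predicted) ** 2) / np.sum((observed - observed.mean()) ** 2)

def evaluate(candidate, signal):
    """Return 1 - NSE on a random 30% test subset for one candidate solution.

    candidate = (lag size, log10 of the weight decay coefficient).
    A ridge-regularised linear autoregression is an assumed stand-in for the ANN.
    """
    lag_size, log_decay = int(candidate[0]), candidate[1]
    # Row t holds signal[t : t + lag_size]; the target is the following value.
    X = np.column_stack([signal[i:len(signal) - lag_size + i] for i in range(lag_size)])
    y = signal[lag_size:]
    idx = np.random.default_rng(0).permutation(len(y))  # fixed seed: repeatable split
    n_test = int(0.3 * len(y))
    test, train = idx[:n_test], idx[n_test:]
    A = X[train].T @ X[train] + (10.0 ** log_decay) * np.eye(lag_size)
    w = np.linalg.solve(A, X[train].T @ y[train])
    return 1.0 - nash_sutcliffe(y[test], X[test] @ w)

def evolve(signal, mu=4, lam=12, generations=15, seed=0):
    """(mu + lambda) Evolution Strategy with fixed Gaussian mutation step sizes."""
    rng = np.random.default_rng(seed)
    low, high = np.array([1.0, -4.0]), np.array([12.0, 1.0])  # hypothetical ranges
    sigma = 0.2 * (high - low)
    parents = rng.uniform(low, high, size=(mu, 2))
    scores = np.array([evaluate(p, signal) for p in parents])
    for _ in range(generations):
        picks = parents[rng.integers(mu, size=lam)]            # choose parents at random
        children = np.clip(picks + rng.normal(0.0, sigma, size=(lam, 2)), low, high)
        child_scores = np.array([evaluate(c, signal) for c in children])
        pool = np.vstack([parents, children])                   # parents compete with children
        pool_scores = np.concatenate([scores, child_scores])
        best = np.argsort(pool_scores)[:mu]                     # keep the mu lowest errors
        parents, scores = pool[best], pool_scores[best]
    return parents[0], scores[0]

# Synthetic two-week signal with a daily cycle, sampled every 15 minutes.
noise = np.random.default_rng(1)
t = np.arange(96 * 14)
signal = 10 + 3 * np.sin(2 * np.pi * t / 96) + 0.2 * noise.normal(size=t.size)
best, err = evolve(signal)
print("lag size:", int(best[0]), "log10 decay:", round(best[1], 2), "NSE:", round(1 - err, 3))
```

Because parents compete with their offspring, the best error found never gets worse from one generation to the next, and each generation costs only lambda model fits rather than a full enumeration of the search space.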
BAYESIAN INFERENCE SYSTEM (RE)CALIBRATION

Various (i.e., one for each DMA signal) Signal level BISs and one DMA level BIS are used in the ERS for inferring, at each time step during a data communication interval, the probability that an event has occurred in the DMA being studied. Each of these BISs consists of a Bayesian Network (Edwards 2000; Jensen 2001). This Bayesian Network combines all the evidence of an event occurrence resulting from the three ERS analysis subsystems and, in the case of the DMA level BIS only, coming simultaneously from all the DMA signals (see Figure 1). Bayesian Networks are used here because they allow reasoning under uncertainty and updating the probability that an event has occurred as evidence accumulates. With enough evidence, this probability should become very high or very low.

A Bayesian Network is a directed graph consisting of nodes and arcs. Each node represents a random variable or a group of random variables whilst the arcs express a probabilistic relationship between these variables. A Conditional Probability Table (CPT) is associated with each node. The CPTs contain the prior probabilities of all root nodes (i.e., nodes with no predecessors) and the conditional probabilities of all non-root nodes given all the possible combinations of their direct predecessors. These prior/conditional probabilities encode the strengths of the dependencies among the nodes. According to Jensen (2009), there are two knowledge sources for selecting the parameters (i.e., prior/conditional probabilities) in the CPTs of a Bayesian Network. These are: (i) domain experts, and (ii) databases.

The CPT parameters ‘normally’ used (i.e., when no information about past events is available/considered) by the ERS in its Signal and DMA level BISs are the same for each signal and for every DMA.
These CPT parameters have been selected according to the former knowledge source (i.e., domain experts). Specifically, this has been done by carrying out a number of preliminary tests (results not shown here) and by incorporating domain knowledge obtained from the theoretical research framework, the literature, and observational experience. An example of these parameters is given in Figure 3.

Figure 3. To appear here.

Alternatively, when information about past events is available, the CPT parameters could be (re)calibrated directly from these data (i.e., ‘event cases’). This is commonly known as the Bayesian Network parameter learning problem (Heckerman 1995). In this scenario, if the structure of a Bayesian Network (or equivalently BIS) has no hidden nodes (i.e., nodes for which there are no observed data), the estimation of the parameters is simple and can be done just by calculating (counting and dividing) the prior or conditional probabilities. However, as the BISs used in the ERS include such nodes (see for example the ‘alert’ nodes in Figure 3), they are only partially observable. Therefore, an algorithm capable of estimating the CPT parameters from incomplete data must be used.

The Expectation Maximisation algorithm (Dempster et al. 1977; Lauritzen 1995) is the most commonly employed algorithm for estimating CPT parameters from incomplete data (Jensen 2009). In theory, other numerical optimisation techniques, such as Gradient Descent or Newton-Raphson, could be used instead of Expectation Maximisation. In practice, however, the Expectation Maximisation algorithm has the advantage of being simple, robust and easy to implement (Do and Batzoglou 2008). This algorithm was developed in the statistics community by Dempster et al.
(1977) and adapted for use with Bayesian Networks by Lauritzen (1995). Expectation Maximisation is an algorithm that, given a set of training data, determines estimates of the CPT parameters that are optimal within a neighbouring set of solutions (i.e., locally optimal). It starts with initial values (e.g., chosen at random) for all the parameters in the CPTs of a Bayesian Network, and then iteratively refines them. Each iteration is guaranteed not to decrease the likelihood function, so that the procedure eventually converges to a local maximum. The iteration process consists of two steps, namely the Expectation and Maximisation steps, which are performed in an alternating manner until convergence.

In the light of the above, it is important to note that both the CPT parameters of the various Signal level BISs and the CPT parameters of the DMA level BIS could be (re)calibrated by using information about the confirmed past events. However, in the ERS such information is used for (re)calibrating the CPT parameters of the DMA level BIS only. This is because, in the ERS, only the DMA level BIS is used for raising the detection alarms. Furthermore, as a particular past event may have occurred anywhere in the DMA, it may not have affected the measurements from a sensor located farther away from it. In this scenario, “forcing” changes (through learning) to the CPT parameters of that sensor’s Signal level BIS(s) in order to obtain high Signal level event occurrence probabilities could be counterproductive.

CASE STUDY

Description

The data analyses performed to evaluate the benefits of the Evolutionary Algorithm and Expectation Maximisation optimisation strategies are presented here. In order to evaluate the benefits of the Evolutionary Algorithm strategy for selecting optimal ANN input structure and parameter sets, the ERS was tested and verified on real-life events.
In order to evaluate the benefits of the Expectation Maximisation strategy for (re)calibrating the parameters in the CPTs of the DMA level BIS, the ERS was tested and verified: (i) on a series of Engineered Events (EEs), whereby fire hydrants were opened to simulate different pipe burst events, and (ii) on synthetic pipe burst events, whereby fictitious “burst flows” were arbitrarily added to an actual flow signal.

The evaluation of the Evolutionary Algorithm strategy’s benefits was carried out as follows. Five UK DMAs were selected. These DMAs have different characteristics and varying sizes. As an ensemble, they contain light industrial, urban, semi-rural, and rural areas. The number of domestic properties in each of them varies between 409 and 3,493. The number of commercial users varies between 11 and 231. Their individual total mains length varies between 6.3 and 30 km. These DMAs are equipped with a flow sensor at the DMA inlet and a pressure sensor at the critical point in the DMA (i.e., the one located either at the point of highest elevation or, alternatively, at a location farthest away from the inlet) and, in one case only, with a flow and pressure sensor at the DMA inlet only. The pressure and flow data recorded by all these sensors during the eleven-month period from July 2009 to June 2010 were used.
This said, it has to be stressed that, although the ERS simultaneously processes the data coming from the relevant pressure/flow sensors deployed in a single DMA to raise detection alarms for that particular DMA (i.e., the ERS is applied to each DMA independently), the results of the data analyses aimed at evaluating the Evolutionary Algorithm strategy’s benefits will be presented in the ANN optimisation results section in an aggregated form (i.e., results for all the five selected DMAs).

The evaluation of the Expectation Maximisation strategy’s benefits was carried out as follows. On the one hand, a sixth UK DMA was considered when the ERS was tested and verified on the EEs. This DMA is predominantly urban and has a total of 2,640 domestic properties and 500 commercial users. The total mains length is 24 km. It is equipped with a pressure and flow sensor at the DMA inlet as well as one pressure sensor at the critical DMA point. The pressure and flow data recorded by all these sensors during July-August 2009 and March 2010 (as the EEs were conducted in these 3 months) were used for raising detection alarms for this particular DMA. On the other hand, a seventh UK DMA was considered when the ERS was tested and verified on the synthetic pipe burst events. This DMA is also predominantly urban and has a total of 1,916 domestic properties and 234 commercial users. The total mains length is about 27 km. This DMA is equipped with eight pressure sensors within the DMA and two flow sensors at the DMA inlets.
However, only the data recorded during February-March 2012 by the flow sensor deployed at one of the DMA inlets were used for raising detection alarms for this particular DMA.

In the aforementioned evaluations, the considered DMAs are gravity fed and have no storage. The sensors recorded data at 15 minute intervals. The flow data were averaged values during the 15 minute sampling interval, whilst the pressure data were 15 minute instantaneous values. Note that although historical data were used for the data analyses performed here, the pressure and/or flow measurements were fed to the ERS in a simulated ‘on-line’ fashion (i.e., as the ERS would have been used in real-life). The user-defined ERS parameters used for the case study analyses are as in Romano et al. (2012). The ERS was not reinitialised.

Artificial Neural Network Optimisation Results

In order to evaluate the benefits of the Evolutionary Algorithm optimisation strategy, the ERS was tested with and without making use of the ANN parameters & input structure selection module (see Figure 1). When this module was not used, the ANN parameters and input structure employed for all the ANN models (i.e., for all the signals coming from the DMAs being studied) were the same and were chosen as follows. The number of hidden neurons was computed by using the Neuroshell2 (Neuroshell2 manual 1996) rule of thumb. The coefficient of Weight Decay Regularisation was set to 0.01. The number of training cycles was set to 400. The ANN input structure included 4 past pressure/flow values, and the TofD and DofW explanatory variables.
Note that the use of these particular ANN parameters and input structure was found, after a number of preliminary tests, to ensure that the resulting ANN prediction models were able to closely approximate the training sets whilst allowing good generalisation performance for all the considered signals.

In the data analyses performed here, the value of the ‘alarm inactivity time’ parameter was set to 1 week. With this setting, when the Evolutionary Algorithm optimisation strategy was used, the ERS raised a total of 37 alarms. In the opposite case (i.e., when the Evolutionary Algorithm optimisation strategy was not used), it raised a total of 38 alarms. In both cases, the raised alarms were later compared against a set of events identified by means of a careful visual inspection of the signal trends supported by data on Main Repair (MR) works carried out in the network and on recorded Customer Contacts (CCs) (i.e., customer complaints about potential problems related to water supply). This set of events included: (i) 17 burst events (i.e., related to the MR records), (ii) 7 pressure/flow anomalies whose exact cause is uncertain (e.g., illegal water usage, unusual system activity, operational DMA changes, etc.) but for which a record existed (i.e., CC records), (iii) 1 sensor failure event, and (iv) 5 other visible pressure/flow anomalies which were, however, not accompanied by any CC or MR records (i.e., did not impact the customers). Further information about the performed visual inspection of the signal trends can be found in Romano et al. (2012). The aforementioned comparison made it possible to check whether each alarm was correlated with an event (i.e., genuine alarm) or not (i.e., false alarm) and, in turn, to evaluate the ERS performance (i.e., success rate and reliability).
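The effect of the ‘alarm inactivity time’ parameter on the number of raised alarms can be illustrated with a short Python sketch. The suppression rule used here (no new alarm until the inactivity time has elapsed since the last raised alarm) and the function name `raise_alarms` are assumptions made for illustration; the actual ERS logic is described in Romano et al. (2012). The 0.5 detection threshold and the 15 minute time step follow the values quoted in this paper, whereas the probability values are invented.

```python
from datetime import datetime, timedelta


def raise_alarms(timestamps, probabilities, threshold=0.5,
                 inactivity=timedelta(weeks=1)):
    """Return alarm start times, suppressing new alarms during the inactivity window."""
    alarms = []
    for t, p in zip(timestamps, probabilities):
        if p <= threshold:
            continue  # event occurrence probability not above the detection threshold
        if not alarms or t - alarms[-1] >= inactivity:
            alarms.append(t)
    return alarms


# Hypothetical DMA level event occurrence probabilities at 15 minute time steps.
start = datetime(2010, 3, 1, 8, 0)
times = [start + timedelta(minutes=15 * i) for i in range(8)]
probs = [0.2, 0.6, 0.7, 0.3, 0.8, 0.9, 0.1, 0.55]

hourly = raise_alarms(times, probs, inactivity=timedelta(hours=1))
every_step = raise_alarms(times, probs, inactivity=timedelta(minutes=15))
print(len(hourly), len(every_step))
```

With an inactivity time equal to the 15 minute time step, every probability above the threshold raises an alarm, whereas longer inactivity times merge successive exceedances into a single alarm.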
In the case of ANN optimisation, of the 37 raised alarms: (i) 21 alarms were correlated to the 17 burst events, (ii) 7 alarms were correlated to the 7 pressure/flow anomalies that generated several CCs, (iii) 3 alarms were correlated to the sensor failure event, (iv) 5 alarms were correlated to the 5 visible pressure/flow anomalies that did not affect the customers, and (v) 1 was a false alarm. On the other hand, when the ANN optimisation was not carried out, of the 38 raised alarms: (i) 21 alarms were correlated to the 17 burst events, (ii) 6 alarms were correlated to the pressure/flow anomalies that generated several CCs, (iii) 3 alarms were correlated to the sensor failure event, (iv) 5 alarms were correlated to the 5 visible pressure/flow anomalies that did not affect the customers, and (v) 3 were false alarms. Note that, because of the ‘alarm inactivity time’ parameter used, multiple alarms were sometimes related to the same event. Furthermore, some events caused alarms in different DMAs. Figure 4 summarises the above results. In the light of the results obtained, it is possible to state that the use of the Evolutionary Algorithm optimisation strategy improved both the ERS event detection reliability and effectiveness. In fact, the number of false alarms was reduced three-fold (from 3 to 1), and one additional event that resulted in several CCs being received was detected.

Figure 4. To appear here.

The improved event detection reliability and effectiveness obtained in this case study, however, are not the only benefits yielded by the use of the Evolutionary Algorithm optimisation strategy. In fact, the ERS detection speed was also enhanced. As an example, Figure 5 shows the results obtained when a large (i.e., about 30 l/s) burst event occurred.
Six CCs (shown as vertical lines with a circle) and one MR (not shown as it did not have an associated time) were recorded during this event. When the ANN optimisation was used, the ERS raised an alarm 30 minutes before the first CC was received. On the other hand, when it was not used, the ERS raised an alarm 45 minutes later, that is to say, 15 minutes after the first CC. The fact that, by making use of the Evolutionary Algorithm optimisation strategy, the ERS could have detected this burst event ahead of the customers is very important. This is because the early information may have enabled the water company to react more quickly and decrease the potential damage to the infrastructure and to third parties. Furthermore, it may have also helped the company to improve its customer service by: (i) reducing the time of the supply interruption, (ii) allowing more time to be allocated for planning and implementing mitigation measures (if the supply interruption could not have been avoided), and (iii) allowing proactive and/or more informed communications with the customers.

Figure 5. To appear here.

Bayesian Inference System (Re)calibration Results

Tests on Engineered Events

In order to evaluate the benefits of the Expectation Maximisation strategy, the capabilities of the ERS with the parameters in the CPTs of the DMA level BIS based on domain expert knowledge were compared with those of the ERS with the parameters in the CPTs of the DMA level BIS calibrated by using information about some of the EEs carried out. Here, the ERS capabilities were evaluated based on comparisons between the ERS detection times for the two relevant cases considered and the corresponding actual hydrant opening times. In addition to this, Receiver Operating Characteristics (ROC) graphs (Egan 1975) were also used.
Note that a ROC graph is a technique for visualising and selecting classifiers based on their performance. ROC graphs have long been used in signal detection theory to depict the trade-off between true and false alarm rates of classifiers.

A total of 9 EEs were carried out as follows: 3 in July 2009, 3 in August 2009, and 3 in March 2010. For the purposes of the analysis that evaluated the ERS capabilities based on comparisons between the ERS detection times and the corresponding actual hydrant opening times, the CPT parameters were calibrated by making use of the information about the start and end times of the 6 EEs carried out in July 2009 and August 2009. Note that, in this analysis, the ‘alarm inactivity time’ parameter was set to 1 day. This is because any single EE lasted one day at most and different EEs were sometimes carried out during the same week.

Table 2 shows the ERS detection times obtained for the two relevant cases considered and the corresponding actual hydrant opening and closing times for all the EEs. The underlined alarm start times refer to those events that were detected at the best possible time (within the 15-minute sampling interval). Alarm start times in normal text font refer to those events that were detected with a delay not greater than 1 hour, whilst alarm start times in bold refer to those events that were detected with delays longer than 1 hour.

As can be seen from Table 2, in both cases all the EEs were successfully detected. However, when the calibration procedure was used, the detection speed improved significantly. Given that the CPT parameters were calibrated by using the July-August 2009 EEs’ start and end times information, it is not unexpected that the EEs carried out during these two months were detected in a timely manner (i.e., EE3 with a 15 minute delay only, and the other EEs at the best possible time).
This said, the significant detection speed improvement is evident when the EEs carried out in March 2010 are considered (i.e., the set of EEs not used for calibration). The first of these EEs, in fact, was detected 11 hours and 30 minutes earlier than in the case where the calibration procedure was not used. Additionally, the third of these EEs was detected 15 minutes earlier (resulting in detecting it at the best possible time). Finally, a “detection speed improvement” can also be observed for the second of the March 2010 EEs. The additional time gained would have made a substantial difference in real life in terms of repairing the pipe burst and reducing the negative impact on the nearby customers.

With regard to the second of the March 2010 EEs, however, the following consideration applies to both cases studied. The close proximity in time between EEs (i.e., 10 minutes separated the first and the second EEs carried out in March 2010), together with the chosen value for the ‘alarm inactivity time’ parameter, made it possible to raise the relevant alarm only 1 day after the alarm for the first EE was raised. Bearing this in mind, on the 2nd of March 2010 at 08:00 the ERS generated a detection probability of 0.68 and 0.74 in the case of CPT parameters based on domain expert knowledge and calibrated by using the EEs information, respectively. As both generated detection probabilities were above the 0.5 user-defined detection threshold for raising the alarms, the ERS would have raised an alarm for that particular EE at the best possible time if it had not been carried out the day immediately after the first EE.

Table 2. To appear here.

The above analysis shows that the Expectation Maximisation strategy is beneficial for improving the detection speed of the ERS.
That analysis alone, however, does not allow conclusions about the calibrated DMA level BIS superiority to be made. Therefore, ROC graphs were used in the analysis carried out as shown below in order to compare the classification performance of the DMA level BIS with CPT parameters based on domain expert knowledge with that of the DMA level BISs with CPT parameters calibrated by using the EEs information.

In this analysis, the value of the ‘alarm inactivity time’ parameter was set to 15 minutes (i.e., all the DMA level event occurrence probabilities greater than the detection threshold raised detection alarms). This is because the ROC graph’s true and false alarm rates are obtained by comparison, at every time step, between the status of a hydrant (i.e., opened/closed) and the output of the relevant DMA level BIS (i.e., DMA level event occurrence probability greater/smaller than the detection threshold). Additionally, given the limited availability of “event cases”, a four step procedure involving the use of a cross-validation technique was used. The first step of this procedure involved separately evaluating the classification performance of the DMA level BIS with the CPT parameters based on domain expert knowledge on each of the three months studied. The second step involved calibrating the CPT parameters by using, in turn, information about the EEs carried out during two of the three months considered (i.e., August 2009 and March 2010, July 2009 and March 2010, and July 2009 and August 2009) and evaluating the resulting DMA level BIS classification performance on the remaining month (i.e., July 2009, August 2009, and March 2010, respectively). As a result of these first two steps, 6 ROC curves representing the classification performances of the relevant DMA level BISs were obtained (i.e., 1 for each month and for each relevant case considered).
The third step involved using the Vertical Averaging technique (Fawcett 2006) for averaging the 3 ROC curves obtained for each of the two relevant cases considered. Finally, the fourth step involved deriving a measure of variance for visualising the classification performance variability across the three months studied (i.e., 3-fold cross-validation runs).

Figure 6 shows the results obtained after applying the above procedure. The two ROC curves represent the ‘average’ classification performance (across the three months studied) of the relevant DMA level BISs for the two cases considered. The box plots show the classification performance variability. It is possible to observe from this figure that performance is similar for low false positive rates (i.e., 0.05). That is, in both cases, reliable positive classifications (i.e., event occurrence) are made with strong evidence. However, for higher false positive rates the DMA level BISs with calibrated CPT parameters perform better than the DMA level BIS with CPT parameters based on domain expert knowledge and also show less variability. The results of this analysis show clearly that the detection reliability and effectiveness of the ERS can be improved if information about past events is used for calibrating the parameters in the CPTs of the DMA level BIS.

Figure 6. To appear here.

Tests on synthetic pipe burst events

A cross-validation technique was used in the analysis performed as outlined above. However, it is important to stress that when the ERS is used for the on-line monitoring of a DMA, the past events information could be more efficiently exploited by using a procedure that enables the semi-automatic (re)calibration of the CPT parameters as knowledge about the events that occurred in the DMA being monitored becomes available. This said, the following ‘cumulative learning’ procedure is proposed here.
Once information about a certain number of ‘event cases’ becomes available for the first time, it is used for calibrating the CPT parameters. Subsequently, when information about new ‘event cases’ becomes available, it is used together with the previously used information (‘cumulative learning’) for recalibrating the CPT parameters.

To this end, it is also important to note that the past events information is very difficult to obtain if the current water company’s historical records are used (because of the nature of underground pipe bursts and due to the fact that the MR and CC records do not provide reliable information about the exact start date/time and duration of these events). However, when the ERS is used for the on-line monitoring of a DMA, the past events information could be more easily obtained by performing periodic reviews of the alarms raised by the ERS. Indeed, assuming that the ERS alarms are timely, such reviews have the potential to enable an operator to simply flag the genuine alarms (which have an associated alarm start time) as confirmed and check the associated event durations. In this way, over time, the Alarms database will be populated with ‘cases’ of events of different type and size that occurred in different areas of the DMA being monitored. In this scenario, the use of the ‘cumulative learning’ procedure outlined above has the potential of enabling the ERS to learn to recognise the features of a large variety of events, thereby continuously improving its generalisation and detection capabilities.

In view of the above, the main objective of the data analyses performed here was to demonstrate the benefit of the proposed ‘cumulative learning’ procedure. This was achieved by testing the ERS on a series of synthetic pipe burst events.
As shown in Tables 3 and 4, 56 synthetic pipe burst events occurring at different times during the day and with variable durations were simulated by adding fictitious “burst flows” (from 1 to 70 l/s) to the flow time series recorded during the period from the 1st of February 2012 to the 31st of March 2012. This resulted in a ‘modified flow time series’. Next, the data in the ‘modified flow time series’ referring to the period from the 1st of February 2012 to the 15th of February 2012 were used to initialise the data-driven ERS. Note that this time interval included a total of 10 synthetic pipe burst events, which only served the purpose of simulating the presence of abnormal measurements in the raw data that have to be used for the ERS initialisation. Once this was done, the ERS detection results obtained during the period between the 1st and the 31st of March (which included a total of 28 synthetic pipe burst events) were evaluated for the following three cases: (1) ERS with the parameters in the CPTs of the DMA level BIS based on domain expert knowledge, (2) ERS with the parameters in the CPTs of the DMA level BIS calibrated by using information about the 9 synthetic pipe burst events simulated during the period between the 15th and the 21st of February 2012, and (3) ERS with the parameters in the CPTs of the DMA level BIS recalibrated by using information about the 9 synthetic pipe burst events simulated during the period between the 15th and the 21st of February 2012 and the 9 synthetic pipe burst events simulated during the period between the 22nd and the 29th of February 2012 together.

The detection results obtained for each of the aforementioned cases are reported in the last three columns of Table 4.
It can be observed that, by calibrating the CPTs of the DMA level BIS using information about the 9 synthetic pipe burst events simulated during the period between the 15th and the 21st of February 2012, the number of synthetic events that were not detected by the ERS decreased from 11 to 7. Additionally, when the CPTs of the DMA level BIS were recalibrated by using information about the 9 synthetic pipe burst events simulated during the period between the 15th and the 21st of February 2012 and the 9 synthetic pipe burst events simulated during the period between the 22nd and the 29th of February 2012 together, the number of events that were not detected by the ERS was further reduced to 3. This said, it also has to be stressed that in all three cases considered no false positive alarms were raised. All this demonstrates how the use of the ‘cumulative learning’ procedure improves the ERS detection capabilities.

Figure 7 shows an example of the results obtained from the ERS tests on the synthetic pipe burst events. The figure is divided into two parts. The top part shows the result obtained when CPT parameters based on domain expert knowledge were used. The bottom part shows the result obtained when recalibrated parameters were used. In each part of the figure, 6 synthetic events are shown together with the resulting DMA level event occurrence probabilities (i.e., Pglobal) at every time step (every 15 minutes); these probabilities are shown only if greater than the 0.5 detection threshold used for raising alarms. From this figure, it can be observed that, when the ‘cumulative learning’ procedure was used, not only were more events detected (6 rather than 3) but some of the events were also detected in a more timely manner.

CONCLUSIONS

An automated methodology for the near real-time detection of pipe bursts and other events at the DMA level from observed pressure/flow signals has been developed recently (Romano et al.
2012). This methodology is implemented in a computer-based ERS which is readily transferable to practice. To enable the data-driven (re)calibration of the ERS and to enhance the ERS detection performance, an Evolutionary Algorithm optimisation strategy for automatically selecting the parameters and input structures of the ANN pressure/flow signal prediction models, and an Expectation Maximisation strategy for semi-automatically (re)calibrating the parameters in the CPTs of the DMA level BIS have been developed, presented and tested here.

The developed data-driven (re)calibration methodology further extends the self-learning capabilities of the ERS and its ability to work in an online context. Not only is the ERS able to adapt to changes in the DMA operating conditions but also to evolve as knowledge about past events in the DMA is acquired. Furthermore, by automatically developing ANN models that are signal-specific, the ERS is able to tailor itself to the particular DMA being monitored. All of the above also makes the ERS more generically applicable to different DMAs.

The tests performed here to evaluate the performance of the new (re)calibration methodology have involved real-life pipe burst/other events as well as simulated and synthetic pipe burst events in several real-life UK DMAs. The ERS was used with pressure and flow measurements fed in an “on-line” fashion (i.e., as it would have been used in real-life). Several sets of ERS runs were performed: with and without making use of the Evolutionary Algorithm and Expectation Maximisation optimisation strategies. The results obtained have shown that the use of these strategies improved the overall ERS performance in terms of event detection reliability and speed.
Reliable and timely detections may enable the water companies to gain confidence in the raised alarms and, in turn, minimise the negative impacts of burst/other events, thereby improving the water companies’ operational efficiency and customer service.

Note that the ERS and the novel (re)calibration methodology presented in this paper have been tested and verified, so far, on UK DMAs only. This said, their application to pipe networks in other countries (where DMAs may not have been established) would require further tests.

Future work will involve on-line testing of the ERS on a much larger number of DMAs. The aim is to gather further evidence of the benefits yielded by the data-driven (re)calibration methodology presented here and of their actual extent. Particular attention will be paid to the task of verifying that the proposed “cumulative learning” procedure leads to continuous improvements of the ERS detection performance. On the other hand, bearing in mind that the water companies are starting to recognise that the near real-time monitoring of their WDSs by means of pressure and flow devices not only provides a potentially useful source of information for quickly and economically detecting pipe burst events but also yields several other important benefits (e.g., improved network visibility and management, higher compliance with regulatory targets, etc.), an increase in the density of coverage of monitoring locations is expected in the near future. In this scenario, future work will involve developing a methodology for determining the approximate location of a burst within a DMA.

ACKNOWLEDGEMENTS

This work is part of the first author’s PhD sponsored by the University of Exeter.
The DMA data used in the paper were collected as part of the Neptune project funded by the UK Engineering and Physical Sciences Research Council (EP/E003192/1) and provided by Mr Ridwan Patel from Yorkshire Water, which is gratefully acknowledged. The work presented in this paper has been patented (Publication No. WO/2010/131001; Patent No. GB0908184.5).

REFERENCES

Adamowski, J. F. (2008). “Peak daily water demand forecast modeling using artificial neural networks.” Journal of Water Resources Planning and Management, 134(2), 119-128.
Bishop, C. M. (1995). “Neural networks for pattern recognition.” Oxford University Press.
Brunone, B. (1999). “Transient test-based technique for leak detection in outfall pipes.” Journal of Water Resources Planning and Management, 125(5), 302-306.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). “Maximum likelihood from incomplete data via the EM algorithm.” Journal of the Royal Statistical Society. Series B (Methodological), 39(1), 1-38.
Do, C. B., and Batzoglou, S. (2008). “What is the expectation maximization algorithm?” Nature Publishing Group, http://www.nature.com/naturebiotechnology (Accessed 21 Mar 2011).
Edwards, D. (2000). “Introduction to graphical modelling.” 2nd edition, Springer.
Egan, J. P. (1975). “Signal detection theory and ROC analysis.” Series in Cognition and Perception, Academic Press, New York.
Fawcett, T. (2006). “An introduction to ROC analysis.” Pattern Recognition Letters, 27(8), 861-874.
Fenner, R. A., and Ye, G. (2011). “Kalman filtering of hydraulic measurements for burst detection in water distribution systems.” Journal of Pipeline Systems Engineering and Practice, 2(1), 14-22.
Field, D. B., and Ratcliffe, B. (1978). “Location of leaks in pressurised pipelines using sulphur hexafluoride as a tracer.” Water Research Centre, Technical Report 80.
Giustolisi, O., and Simeone, V. (2006).
“Optimal design of artificial neural networks by a multi- 702 objective strategy: groundwater level predictions.” Hydrological Sciences Journal, 51(3), pp. 502- 703 523. 704 Grunmwell, D., and Ratcliffe, B. (1981). “Location of underground leaks using the leak noise 705 correlator.” Water Research Centre, Technical Report 157. 22 706 Heckerman, D. (1995). “A tutorial on learning with Bayesian networks. Learning in graphical 707 models.” http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.112.8434 (Accessed 29 May 708 2011). 709 Holland J. H., (1975). “Adaptation in natural and artificial systems.” University of Michigan Press. 710 Jensen, F. V. (2001). “Bayesian networks and decision graphs.” Springer-Verlag. 711 Jensen, F. V. (2009). “Bayesian networks.” Wiley Interdisciplinary Reviews: Computational 712 Statistics, 1(3), 307-315. 713 Kim, S. H. (2005). “Extensive development of leak detection algorithm by impulse response method.” 714 Journal of Hydraulic Engineering, 131(3), 201-208. 715 Lauritzen, S. L. (1995). “The EM algorithm for graphical association models with missing data.” 716 Computational Statistics and Data Analysis, 19(2), 191-201. 717 Liggett, J. A., and Chen, L.-C. (1994). “Inverse transient analysis in pipe networks.” Journal of 718 Hydraulic Engineering, 120(8), 934-955. 719 Mergelas, B., and Henrich, G. (2005). “Leak locating method for precommissioned transmission 720 pipelines: North American case studies.” Proc. Leakage 2005 conference, Halifax, Canada. 721 Misiunas, D., Lambert, M. F., Simpson, A. R., and Olsson, G. (2005). “Burst detection and location in 722 water distribution networks.” Water Science and Technology: Water Supply, 5(3-4), 71-80. 723 Misiunas, D., Vítkovský, J. P., Olsson, G., Lambert, M. F., and Simpson, A. R. (2006). “Failure 724 monitoring in water distribution networks.” Water Science and Technology, 53, (4-5), 503-511. 725 Moody, J. E. (1992). 
“The effective number of parameters: an analysis of generalization and 726 regularization in nonlinear learning systems.” in “Advances in neural information processing systems 727 4.” 847-854. 728 Mounce, S. R., Boxall, J. B., and Machell, J. (2010). “Development and verification of an online 729 artificial intelligence system for burst detection in water distribution systems.” Journal of Water 730 Resources Planning and Management, 136(3), 309-318. 23 731 Mounce, S. R., Mounce, R. B., and Boxall, J. B. (2011). “Novelty detection for time series data 732 analysis in water distribution systems using support vector machines.” Journal of Hydroinformatics, 733 13(4), 672-686 734 Mpesha, W., Gassman, S. L., and Chaudhry, M. H. (2001). “Leak detection in pipes by frequency 735 response method.” Journal of Hydraulic Engineering, 127(2), 134-147. 736 Nash, J. E., and Sutcliffe, J. V. (1970). “River flow forecasting through conceptual models, Part I - A 737 discussion of principles.” Journal of Hydrology, 10(3), 282-290. 738 Nelson, M. C., and Illingworth, W. T. (1991). “A practical guide to neural nets.” Addison-Wesley, 739 Reading. 740 Neuroshell2 manual (1996). Ward Systems Group Inc. 741 Nicklow, J., Reed P., Savić, D. A., Dessalegne T., Harrell L., Chan-Hilton A., Karamouz M., Minsker 742 B., Ostfeld A., Singh A. and Zechman E. (2010). “State of the art for genetic algorithms and beyond 743 in water resources planning and management.” Journal of Water Resources Planning and 744 Management, 136(4), 412-432. 745 Palau, C. V., Arregui, F. J., and Carlos, M. (2011). (Accepted). “Burst detection in water networks 746 using principal component analysis.” Journal of Water Resources Planning and Management. 747 Pudar, R. S., and Ligget, J. A. (1992). “Leaks in pipe networks.” Journal of Hydraulic Engineering, 748 118(7), 1031-1046 749 Puust, R., Kapelan, Z., Savic, D. A., and Koppel, T. (2006). 
“Probabilistic leak detection in pipe 750 networks using the SCEM-UA algorithm.” Proc. 8th Annual International Symposium on Water 751 Distribution Systems Analysis, Cincinnati, USA. 752 Puust, R., Kapelan, Z., Savić, D. A., and Koppel, T. (2010). “A review of methods for leakage 753 management in pipe networks.” Urban Water Journal, 7(1), 25-45. 754 Romano M., Kapelan Z., and Savić, D. A. (2012). (Submitted). “Automated detection of pipe bursts 755 and other events in water distribution systems.” Journal of Water Resources Planning and 756 Management. 24 757 Shewhart, W. (1931). “Economic control of quality of manufactured product.” Van Nostrand 758 Reinhold. 759 Schwefel, H.-P. (1981). “Numerical optimization of computer models.” Wiley, New York. 760 Srirangarajan, S., Allen, M., Preis, A., Iqbal, M., Lim, H. B., and Whittle, A. J. (2011). (Submitted). 761 “Wavelet-based burst event detection and localization in water distribution systems.” Journal of 762 Signal Processing Systems. 763 Wang, X. J., Lambert, M. F., Simpson, A. R., Liggett, J. A., and Vítkovský, J. P. (2002). “Leak 764 detection in pipelines using damping of fluid transients.” Journal of Hydraulic Engineering, 128(7), 765 697-711. 766 Wu, Z. Y., Sage, P., and Turtle, D. (2010). “Pressure-dependent leak detection model and its 767 application to a district water system.” Journal of Water Resources Planning and Management, 768 136(1), 116-128. 25 769 LIST OF FIGURES 770 Figure 1. Diagrammatic representation of the Event Recognition System components, subsystems and 771 modules. 772 Figure 2. Artificial Neural Network for the one step ahead prediction of flow/pressure values showing 773 a generic example of a suitable input structure. 774 Figure 3. 
Simplified structure of the District Metered Area level Bayesian Inference System with examples of the parameters used, when no information about past events is available/considered, in the Conditional Probability Tables associated with some of its nodes.
Figure 4. Number of alarms correlated with the real-life events identified by visual inspection of the data with (a), and without (b) using the Evolutionary Algorithm optimisation strategy.
Figure 5. Detection of a large burst event with (a), and without (b) using the Evolutionary Algorithm optimisation strategy.
Figure 6. Performance comparison between the District Metered Area level Bayesian Inference System with Conditional Probability Table parameters based on domain experts knowledge and with Conditional Probability Table parameters calibrated by using the Engineered Events information.
Figure 7. Example of the result obtained from the ERS tests on the synthetic pipe burst events. CPT parameters based on domain experts knowledge (a), and recalibrated CPT parameters (b).
LIST OF TABLES
Table 1. Decision variables and associated ranges of variability.
Table 2. Engineered Events time schedule and detection times for the two relevant cases considered.
Table 3. Synthetic pipe burst events used for the initialisation of the Event Recognition System and the (re)calibration of the District Metered Area level Bayesian Inference System.
Table 4. Synthetic pipe burst events used for testing the Event Recognition System, and test results for the three relevant cases considered.

Table 1. Decision variables and associated ranges of variability.
Decision variable | Range of values used in optimisation
Number of hidden neurons | 10 - 100
Number of training cycles | 50 - 500
Coefficient of Weight Decay Regularisation | 10^-5 - 10^3
Lag Size | 4 - 72
Time of the Day | use/do not use
Day of the Week | use/do not use

Table 2. Engineered Events time schedule and detection times for the two relevant cases considered.
Month | Hydrant opened time | Hydrant closed time | Alarm start time (domain experts based CPT parameters) | Alarm start time (calibrated CPT parameters)
July | 20/07/2009 - 08:00 | 21/07/2009 - 08:00 | 20/07/2009 - 08:00 | 20/07/2009 - 08:00
July | 22/07/2009 - 06:05 | 23/07/2009 - 08:00 | 22/07/2009 - 06:15 | 22/07/2009 - 06:15
July | 24/07/2009 - 16:00 | 25/07/2009 - 16:00 | 24/07/2009 - 16:15 | 24/07/2009 - 16:15
August | 17/08/2009 - 08:20 | 18/08/2009 - 07:05 | 17/08/2009 - 08:30 | 17/08/2009 - 08:30
August | 21/08/2009 - 07:20 | 22/08/2009 - 08:05 | 21/08/2009 - 08:00 | 21/08/2009 - 07:30
August | 26/08/2009 - 07:55 | 27/08/2009 - 07:00 | 26/08/2009 - 08:00 | 26/08/2009 - 08:00
March | 01/03/2010 - 09:10 | 02/03/2010 - 07:50 | 01/03/2010 - 21:00 | 01/03/2010 - 09:30
March | 02/03/2010 - 08:00 | 03/03/2010 - 07:10 | 02/03/2010 - 21:00 | 02/03/2010 - 09:30
March | 15/03/2010 - 07:20 | 16/03/2010 - 07:15 | 15/03/2010 - 07:45 | 15/03/2010 - 07:30

Table 3. Synthetic pipe burst events used for the initialisation of the Event Recognition System and the (re)calibration of the District Metered Area level Bayesian Inference System.
Row-group labels: ERS initialisation, DMA level BIS calibration, DMA level BIS recalibration.
Event started | Event ended | Duration [hours] | Flow [l/s] | Flow [% average DMA inflow]
01/02/2012 02:00 | 01/02/2012 03:45 | 2.00 | 2 | 4.8
02/02/2012 06:00 | 02/02/2012 15:45 | 10.00 | 5 | 12.0
03/02/2012 09:00 | 03/02/2012 12:45 | 4.00 | 20 | 48.0
04/02/2012 18:00 | 04/02/2012 18:45 | 1.00 | 10 | 24.0
06/02/2012 07:00 | 06/02/2012 10:45 | 4.00 | 50 | 119.9
09/02/2012 05:00 | 09/02/2012 09:45 | 5.00 | 5 | 12.0
10/02/2012 19:00 | 10/02/2012 23:45 | 5.00 | 10 | 24.0
12/02/2012 08:00 | 12/02/2012 08:45 | 1.00 | 1 | 2.4
12/02/2012 13:00 | 12/02/2012 13:45 | 1.00 | 5 | 12.0
13/02/2012 22:00 | 13/02/2012 23:45 | 2.00 | 15 | 36.0
15/02/2012 02:00 | 15/02/2012 07:45 | 6.00 | 1 | 2.4
16/02/2012 00:00 | - | instantaneous | 20 | 48.0
16/02/2012 12:00 | 16/02/2012 14:45 | 3.00 | 30 | 71.9
17/02/2012 03:00 | 17/02/2012 03:45 | 1.00 | 1 | 2.4
18/02/2012 14:00 | 18/02/2012 16:45 | 3.00 | 2 | 4.8
19/02/2012 07:00 | 19/02/2012 19:45 | 13.00 | 15 | 36.0
20/02/2012 23:00 | 21/02/2012 01:45 | 3.00 | 5 | 12.0
21/02/2012 17:00 | 21/02/2012 19:45 | 3.00 | 8 | 19.2
21/02/2012 22:00 | 21/02/2012 23:45 | 2.00 | 2 | 4.8
22/02/2012 05:00 | 22/02/2012 05:45 | 1.00 | 20 | 48.0
23/02/2012 18:00 | - | instantaneous | 40 | 95.9
24/02/2012 01:00 | 24/02/2012 04:45 | 4.00 | 2 | 4.8
25/02/2012 10:00 | 25/02/2012 10:45 | 1.00 | 10 | 24.0
25/02/2012 19:00 | 25/02/2012 20:45 | 2.00 | 2 | 4.8
26/02/2012 17:00 | 26/02/2012 17:45 | 1.00 | 1 | 2.4
27/02/2012 15:00 | 28/02/2012 04:45 | 14.00 | 10 | 24.0
28/02/2012 15:00 | 28/02/2012 16:45 | 2.00 | 2 | 4.8
29/02/2012 10:00 | 29/02/2012 13:45 | 4.00 | 1 | 2.4

Table 4. Synthetic pipe burst events used for testing the Event Recognition System, and test results for the three relevant cases considered.
Event started | Event ended | Event duration [hours] | Flow [l/s] | Flow [% average DMA inflow] | Event detected? (domain experts based CPT parameters) | Event detected? (calibrated CPT parameters) | Event detected? (recalibrated CPT parameters)
01/03/2012 00:00 | 01/03/2012 07:45 | 8.00 | 40 | 95.9 | yes | yes | yes
01/03/2012 10:00 | 01/03/2012 10:45 | 1.00 | 1 | 2.4 | no | no | no
02/03/2012 20:00 | 02/03/2012 22:45 | 3.00 | 3 | 7.2 | yes | yes | yes
04/03/2012 01:00 | 04/03/2012 02:45 | 2.00 | 1 | 2.4 | no | no | yes
04/03/2012 16:00 | 04/03/2012 20:45 | 5.00 | 15 | 36.0 | yes | yes | yes
05/03/2012 18:00 | 05/03/2012 19:45 | 2.00 | 20 | 48.0 | yes | yes | yes
06/03/2012 00:00 | 06/03/2012 03:45 | 4.00 | 1 | 2.4 | yes | yes | yes
07/03/2012 07:00 | 07/03/2012 10:45 | 4.00 | 3 | 7.2 | yes | yes | yes
08/03/2012 15:00 | 08/03/2012 17:45 | 3.00 | 2 | 4.8 | no | no | yes
10/03/2012 00:00 | 10/03/2012 09:45 | 10.00 | 1-40 | 2.4-95.9 | yes | yes | yes
11/03/2012 02:00 | 11/03/2012 11:45 | 10.00 | 2 | 4.8 | yes | yes | yes
12/03/2012 11:00 | 12/03/2012 11:45 | 1.00 | 5 | 12.0 | no | yes | yes
13/03/2012 09:00 | 13/03/2012 09:45 | 1.00 | 10 | 24.0 | yes | yes | yes
13/03/2012 14:00 | 13/03/2012 15:45 | 2.00 | 5 | 12.0 | yes | yes | yes
13/03/2012 22:00 | 13/03/2012 22:45 | 1.00 | 2 | 4.8 | no | no | yes
15/03/2012 11:00 | 15/03/2012 11:45 | 1.00 | 2 | 4.8 | no | no | yes
15/03/2012 19:00 | 16/03/2012 00:45 | 6.00 | 5 | 12.0 | yes | yes | yes
16/03/2012 18:00 | 16/03/2012 18:45 | 1.00 | 1 | 2.4 | no | no | no
17/03/2012 15:00 | 17/03/2012 19:45 | 5.00 | 15 | 36.0 | yes | yes | yes
19/03/2012 01:00 | 19/03/2012 08:45 | 8.00 | 1 | 2.4 | yes | yes | yes
19/03/2012 17:00 | 21/03/2012 05:00 | 36.00 | 3 | 7.2 | yes | yes | yes
26/03/2012 16:15 | 26/03/2012 16:45 | 0.75 | 3 | 7.2 | no | no | no
27/03/2012 01:00 | 27/03/2012 06:45 | 6.00 | 10 | 24.0 | yes | yes | yes
28/03/2012 05:00 | 28/03/2012 05:45 | 1.00 | 5 | 12.0 | no | yes | yes
29/03/2012 00:00 | 29/03/2012 00:45 | 1.00 | 70 | 167.9 | yes | yes | yes
30/03/2012 01:00 | 30/03/2012 03:45 | 3.00 | 2 | 4.8 | no | yes | yes
30/03/2012 23:00 | 31/03/2012 00:45 | 2.00 | 5 | 12.0 | no | yes | yes
31/03/2012 15:00 | 31/03/2012 21:45 | 7.00 | 8 | 19.2 | yes | yes | yes
Missed Events (domain experts based / calibrated / recalibrated CPT parameters): 11 / 7 / 3
Missed % (domain experts based / calibrated / recalibrated CPT parameters): 39 / 25 / 11
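The "Missed Events" and "Missed %" summary rows of Table 4 follow directly from the per-event detection outcomes. The short Python sketch below shows that arithmetic. It is an illustration only and is not part of the ERS. The names `FLAGS`, `CASES` and `missed_summary` are hypothetical, and the detection outcomes are transcribed by hand from Table 4 in table order.

```python
# Illustrative sketch only: reproduces the summary rows of Table 4.
# Each string encodes one synthetic test event, in table order, as three
# characters: domain experts based, calibrated, recalibrated CPT parameters
# ("y" = event detected, "n" = event missed).
FLAGS = [
    "yyy", "nnn", "yyy", "nny", "yyy", "yyy", "yyy", "yyy", "nny", "yyy",
    "yyy", "nyy", "yyy", "yyy", "nny", "nny", "yyy", "nnn", "yyy", "yyy",
    "yyy", "nnn", "yyy", "nyy", "yyy", "nyy", "nyy", "yyy",
]

# Hypothetical labels for the three cases compared in Table 4.
CASES = ("domain experts", "calibrated", "recalibrated")


def missed_summary(flags):
    """Return {case: (missed count, missed percentage rounded to an integer)}."""
    total = len(flags)
    summary = {}
    for column, case in enumerate(CASES):
        missed = sum(1 for event in flags if event[column] == "n")
        summary[case] = (missed, round(100 * missed / total))
    return summary


if __name__ == "__main__":
    for case, (missed, percent) in missed_summary(FLAGS).items():
        print(f"{case}: {missed} of {len(FLAGS)} events missed ({percent}%)")
```

With the 28 test events of Table 4, the sketch gives 11, 7 and 3 missed events (39%, 25% and 11%) for the domain experts based, calibrated and recalibrated CPT parameters respectively, matching the summary rows of the table.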