Leak detection in water distribution network using machine learning techniques

ISH Journal of Hydraulic Engineering
ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/tish20
Nishant Sourabh, P.V. Timbadiya & P. L. Patel
To cite this article: Nishant Sourabh, P.V. Timbadiya & P. L. Patel (2023): Leak detection in water
distribution network using machine learning techniques, ISH Journal of Hydraulic Engineering,
DOI: 10.1080/09715010.2023.2198988
To link to this article: https://doi.org/10.1080/09715010.2023.2198988
Published online: 12 Apr 2023.
Department of Civil Engineering, Sardar Vallabhbhai National Institute of Technology-Surat, Surat, Gujarat, India
ABSTRACT
Leakage in water distribution systems (WDS) and its control have long challenged the water resources community in managing scarce water supplies. This study examines an inverse engineering technique to find leaks in water supply pipelines. The main objective of the study is to identify the patterns of deviation in pressure/flow caused by a single leak in the network by solving classification and regression problems using artificial neural networks (ANNs) and support vector machines (SVMs). Leak detection was solved for two scenarios, in which (a) only pressure measurements and (b) only flow measurements are available in the system. A multi-layered perceptron (MLP) model and multi-label, multi-class SVM classification and regression models were developed and trained using the pressure and flow signals separately. The ANN model performed better than the SVM model in pressure- and flow-based leak detection in both the classification and regression problems. Model performance could also be improved by optimizing the number of inputs to the model during the training phase. The present study would be useful to water supply managers applying these techniques to minimize losses due to leakage in water supply networks.
ARTICLE HISTORY
Received 10 May 2022
Accepted 31 March 2023
1. Introduction
Every drop of clean water is precious. Leakage in water distribution systems is an important issue that affects customers worldwide. Leakage in a water supply system is essentially the loss of water through cracks or fissures in pipes, tanks or reservoirs. Many factors, including poor pipe connections, internal or exterior pipe corrosion, and mechanical damage brought on by increased pipe load, can result in leakage in a WDS.
Kumar et al. (2005) state that India has 16% of the world’s population but just 4% of its freshwater resources. According to the National Commission for Irrigated Water Resource Development of India, the water shortage faced by the country has arisen due to wastage and poor management. According to the Food and Agriculture Organization, 92% of available fresh water is used in the farming sector, 5% for domestic use and the remaining 3% in the industrial sector. Due to leaks and inefficiencies in the water management system, the nation wastes close to 50% of its fresh water. Given this scenario of leakage in Indian water distribution systems, it is of utmost importance to develop and demonstrate low-cost engineering solutions that identify the location of leaks and thereby help public-health engineers to reduce ‘unaccounted-for water (UFW)’. Water audits record the difference between the total quantity of water produced and the total quantity used; this difference provides an estimate of the leakage. This approach takes a lot of time and does not pinpoint the leak’s location.
Hamilton (2009) divided leak detection into three primary phases: localise, locate, and pinpoint. After a leak has occurred, localization involves narrowing the leak down to a particular district metered area (DMA) or network segment. Locating the leak within the DMA is the second stage. The third and last stage, pinpointing, involves finding the leak’s specific position within a 20 cm radius. El-Zahab and Zayed (2019) state that the challenge is to distinguish leak signals from those caused by pumps or open fire hydrants, which confuse vibration-based leak detection instruments or sensors and generate false alarms (El-Zahab et al. 2016; Khulief et al. 2012; Stoianov et al. 2007). Puust et al. (2010) and El-Zahab and Zayed (2019) summarize the various methods used for leak detection.
KEYWORDS: Leak detection; artificial neural network; support vector machines; EPANET software; MATLAB programming
Puust et al. (2010) broadly categorised leak detection approaches into three categories, viz., equipment-based methods, numerical or hydraulic modelling, and their combinations. The equipment-based methods mainly use installed or portable sensors to detect leaks along the pipelines. These methods include leak noise correlators, ground penetrating radar (Hunaidi 1998; Lockwood et al. 2003; O’Brien et al. 2003), acoustic logging (Moyer et al. 1983; Hough 1988; Rajtar and Muthiah 1997), step-testing (Farley and Trow 2003; Pilcher et al. 2007), etc. These approaches are labour-intensive and expensive, and small leaks are more challenging to find with them, especially when acoustic logging is used on plastic pipes.
Numerical/hydraulic modelling methods for leak detection are mostly based on analysis of data from the water supply system. Billmann and Isermann (1987) proposed transient modelling; Zhang (1993) analysed statistical methods; Lambert (2002) proposed analysis through the water balance method; Silva et al. (1996) used the negative pressure wave; and Alkasseh et al. (2013) modelled leakage using the minimum night flow method. These strategies often use calibration and optimization methods to examine various network segments. The effectiveness of these techniques depends on the calibre of the monitoring system and how frequently water is used. Mashhadi et al. (2021) demonstrated that, owing to the intrinsic complexity of urban water distribution systems, the wide range of existing methodologies underlines the tremendous difficulty in identifying and localising water leaks. These methods are useful in locating leakage but fail to do so in real time. Such issues can be addressed using machine learning techniques.
CONTACT P.V. Timbadiya pvtimbadiya@ced.svnit.ac.in
This article has been corrected with minor changes. These changes do not impact the academic content of the article.
© 2023 Indian Society for Hydraulics
Machine learning-based techniques have recently drawn a lot of interest. An artificial neural network (ANN) model working on steady-state process parameters was developed by Belsito et al. (1998) for locating leaks in liquefied gas pipeline networks. The system could find leaks as small as 1% of the flow rate. The ANNs performed very well in locating large leaks where noise was not present, but misclassified small leaks. Caputo and Pelagagge (2003) described a method using a multi-layered perceptron (MLP) ANN trained by backpropagation to detect leaks in pipeline networks; it identified the leaking pipe/branch correctly and predicted the leak size with an error of 3%. They could not, however, account for noise and measurement errors. Shinozuka et al. (2005) described a neural network method that monitors online water pressure at selected locations, using supervisory control and data acquisition, to determine the location and severity of damage in the water supply system. Their results showed that the number of monitoring stations can be less than one-tenth of the number of nodes in the water distribution system. Izquierdo et al. (2007) described a fuzzy ANN system for detecting problems in water supply systems. Fuzzy estimated states were produced by the fuzzy model and then used to train ANNs on multi-dimensional units. The system was found to offer good classification accuracy for big leaks. The modelling requirements and, to some extent, the computer processing needs can be partially met by employing ANNs to monitor the state of the pipeline network. Aksela et al. (2009) found that obtaining reasonable leak predictions with an ANN requires a large amount of historical data to train the network, which needs to be updated every month. The efficiency of these methods is therefore usually lower. Also, the methods cannot detect leakage quickly because the training time is usually relatively long, which delays the alarm.
To detect anomalies in water distribution time series data using a pattern-based approach, Mounce et al. (2011) presented an ANN method based on a similarity study between new events and profiles derived from past occurrences. This aided in categorising recent occurrences to find anomalies that might be related to leaks. Jin et al. (2014) used a neural network to detect leaks from de-noised sound signals emitted by the pipeline network. The relative error of the proposed method was found to be 1.1%. Zhang et al. (2016) used a multiclass SVM and applied the K-means clustering method to subdivide a large-scale water distribution network into leakage zones. Monte Carlo simulations were used to generate the leakage data. It was found that, using flow and pressure data, the multiclass SVM could locate the leakage zone. Chan et al. (2018) noted a substantial difficulty in estimating the number of clusters and a significant influence of the randomised first cluster on the clustering procedure. Rojek and Studzinski (2019) proposed that the ANN method could correctly identify and locate leaks in water distribution systems. Shravani et al. (2019) employed an MLP to detect the leak and predict its location based on the deviation in flows due to leaks in the network. The results showed that, amongst the machine learning based models, the MLP performed best, with an accuracy of 94.47%.
A leakage detection technique combined with GIS-based spatial flow data analysis was proposed by Cantos et al. (2020). A possible leak in a DMA was identified through continuous evaluation of the real-time distributed volume and the consumption for the DMA, including when meter readings were absent. To prevent false alarms, deploying such a machine learning model for leak detection requires reliable data quality control and real-time monitoring of the flow parameters in a complicated sensor network. For reliable integration into existing smart water networks, such models also need further training under realistic settings, forecasting abilities for pattern recognition of the effects of externalities, and data quality control. Using density-based spatial clustering of applications with noise (DBSCAN) and multiscale fully convolutional networks (MFCN), Hu et al. (2021) suggested a novel leakage detection model. The accuracy of the suggested method was found to be 78%, 72%, and 28% higher than that of the support vector machine (SVM), naive Bayes classifier (NBC), and k-nearest neighbour (KNN), respectively.
Tariq et al. (2022) used a data-driven application of MEMS-based accelerometers to measure linear motion, whether movement, shock or vibration, due to leakage in the system. For precise classification of leak and no-leak scenarios from the extracted features, the authors used machine learning models based on Random Forest and Decision Trees. Random Forest performed better than the other machine learning models, and the overall accuracy reached 100% for metal pipes and 94.93% for non-metal pipes. However, the accelerometers can only be used for temporary monitoring, and because each accelerometer has a 30-minute power backup, personnel must change them every 30 minutes. Due to restrictions on data collection, two types of models, metal and non-metal-based, were developed and tested on various types of pipes and materials. Moreover, the accelerometers cannot be placed very far from the gateway without a long-distance transmission antenna, which may be expensive. In their novel CtL-SSL (Clustering-then-Localization Semi-supervised Learning) machine learning framework, Fan and Yu (2022) proposed using the topographical link between the water distribution network (WDN) and its leaking characteristics for the location of the WDN’s sensors, and the monitoring data for leakage detection and localization. The method’s limitations are the calculation of the ideal number of leakage zones for various types of WDNs and the optimization of the final detection and localization accuracy.
For the leakage detection problem, the best approach is to install a sufficiently dense monitoring system that measures the flow and pressure characteristics in the network and compares the measured values with the values expected when there is no leak. A discrepancy between the measured value and the expected value indicates an abnormal state, such as a potential leak in the vicinity of a certain measurement site. This strategy is expensive because numerous measuring instruments must be installed in the monitoring system to provide satisfactory detection of emergency conditions. Therefore, it is crucial to optimise the network’s sensor placement and density. According to Hart and Murray (2010), the issue of where to place water quality and contamination sensors within water distribution networks in order to improve monitoring and security capability has been thoroughly researched over the past ten years by hydro-informatics researchers and the community working on water distribution systems. The Threat Ensemble Vulnerability Assessment and Sensor Placement Optimization Tool (TEVA-SPOT v2.5), created by Murray et al. (2010), is the most advanced software for placing sensors in water distribution networks. It is based on the EPANET software engine and simulates various contamination scenarios using the hydraulic and quality solver. Eliades et al. (2016) presented a software tool, the Sensor Placement (S-PLACE) Toolkit, to compute the locations of water quality sensors on the basis of the impact of contamination in the water distribution network. This toolkit is programmed in MATLAB using the EPANET software library.
The literature reviewed in the preceding paragraphs indicates that previous researchers applied machine learning algorithms either to identify leaks or to localise them in the pipe network. This article proposes to fill the gap by applying machine learning techniques to locate and pinpoint the leak and find the leakage rate in a complex water distribution system, using network characteristics obtained through an optimised network of pressure and flow sensors. Detecting leakage in a water distribution network requires three things, viz., detection of the leaking pipe (leak detection and leak localization), the location of the leak in the detected pipe (leakage pinpointing), and the leak rate. The proposed methodology covers all three components. This article describes a method that uses machine learning techniques to interpret data from a network of pressure sensors and/or flow-measuring devices monitoring a pipe network in order to determine the position and magnitude of network leaks.
2. Materials and methods
The present study was carried out on the Hanoi water distribution network. The main objective is to detect a single leak in the water supply network. Classification and regression models were developed using machine learning techniques, namely MLP ANN and SVM models, based on nodal pressure and pipe flow measurements. The flowchart of the methodology is shown in Figure 1. The main steps of the methodology are:
● Development of a mathematical model of the network for the pressure/flow computations, according to the boundary conditions.
● Simulation of the leaks and computation of the resulting pressure/flow in the network under the various leak conditions.
● Correlation of the leak patterns to the pressure/flow data using the machine learning techniques.
● Testing and predicting the leaking pipe from the sensed pressure/flow data.
2.1. Problem formulation
The problem of leak detection requires three outputs, viz., the leaking pipe in the network, the leakage location in that pipe, and the leakage rate. The analysis has been divided into two categories: a classification problem and a regression problem. The flows and pressure heads in the network due to leakage were used to develop the classification problem, which detects the leaking pipe in the water distribution network using pattern recognition in ANN and SVM. Both ANN and SVM have their own model parameters, such as activation and kernel functions, learning algorithm, scaling of training data, etc., which need to be optimized for better predictions. The classification problem is formulated so that the output of the model is the leaking pipe. The solution of the regression problem provides the location of the leak in the leaking pipe and the leak rate.
A neural network is described as an interconnected assembly of basic processing elements, units, or nodes (Thirumalaiah and Deo 1998). Like the human brain, neural networks are built to spot hidden patterns in data. The majority of ANNs apply a non-linear transfer function to a collection of input variables. ANNs can solve basically two types of problems: classification and regression. The goal of pattern classification is to assign input patterns to one of a finite number of classes. To apply it effectively, features must be chosen that carry the information necessary to distinguish between classes, are insensitive to irrelevant input variability, and are few enough to allow efficient computation of discriminant functions and to reduce the amount of training data needed (Dehuri and Cho 2010). The processing ability of the network is stored in the inter-unit connection strengths, or weights, obtained by a process of adaptation to, or learning from, a set of training patterns (Huelss 2020). An example of a neural network architecture is shown in Figure 2.
Caputo and Pelagagge (2003) stated that ANNs have proven successful in approximating non-linear multivariable functions and in classification problems. By monitoring water pressures online at a few chosen points in the system, Shinozuka et al. (2005) described an approach for determining the location and extent of damage in a water delivery system. Mounce and Machell (2006) presented an application of ANNs to the analysis of data from sensors measuring hydraulic parameters (flow and pressure) in a water distribution system and were able to locate leakage in the network with an accuracy of almost 98.33%.
SVM is a supervised machine learning model that uses classification algorithms for binary classes. The SVM performs classification by constructing a hyperplane that optimally separates the data into two categories. In geometry, a hyperplane is a subspace whose dimension is one less than that of its ambient space, i.e. the space surrounding an object (Suthaharan 2016). The model, together with its learning algorithm, analyses the data and finds the equation of the hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks such as outlier detection (Samudrala 2018). The training data may not always be linearly separable in the original feature space, in which case additional dimensions are induced in the feature space. The kernel trick is used to achieve this higher dimensionality without altering the data in the original feature space. Figure 3 shows the typical SVM architecture.
Figure 1. Proposed methodology in the present study.
Figure 2. Typical neural network architecture.
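The kernel trick described above can be illustrated with a minimal Python sketch. It assumes a degree-2 polynomial kernel and two-dimensional inputs, neither of which is stated in the study, and the function names are hypothetical. The kernel value computed in the original input space equals the inner product in an explicitly constructed three-dimensional feature space, so the higher-dimensional mapping never has to be formed.

```python
import math


def poly2_kernel(x, z):
    """Degree-2 homogeneous polynomial kernel: k(x, z) = (x . z)^2."""
    dot = sum(a * b for a, b in zip(x, z))
    return dot ** 2


def explicit_map(x):
    """Explicit 3-D feature map for a 2-D input; its inner product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def inner(u, v):
    """Plain Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


# Two made-up sample points (illustrative values only).
x = (1.0, 2.0)
z = (3.0, 0.5)

k_implicit = poly2_kernel(x, z)                        # evaluated in the 2-D input space
k_explicit = inner(explicit_map(x), explicit_map(z))   # evaluated in the 3-D feature space
print(k_implicit, k_explicit)
```

Both routes give the same number up to floating-point rounding, which is why an SVM can work with the kernel alone.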
There are two ways in which the inputs are trained against the target values in multi-class SVM modelling: (i) One Against One (O-A-O) and (ii) One Against All (O-A-A). In O-A-O, a separate binary classifier is trained for every pair of classes (Kreßel 1999). According to Hsu and Lin (2002), the total number of models created is n(n–1)/2, where n is the number of classes. Each binary classifier predicts one class label, and the class receiving the most predictions is the output of the one-against-one strategy. O-A-A, in contrast, is the earliest implemented method of SVM multi-class classification, in which each class is trained against all the other classes, so the total number of classifier models created equals the number of classes. The one-against-one strategy for training the multi-class SVM classification model is considered superior to the one-against-all strategy.
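To make the model counts concrete, the following sketch evaluates both formulas for the 34 pipe classes used later in this study and shows the majority vote used by the one-against-one strategy. The function names and the toy vote list are illustrative assumptions, not part of the authors' MATLAB implementation.

```python
from collections import Counter


def n_one_against_one(n_classes: int) -> int:
    """Number of pairwise binary classifiers: n(n-1)/2."""
    return n_classes * (n_classes - 1) // 2


def n_one_against_all(n_classes: int) -> int:
    """One binary classifier per class."""
    return n_classes


def one_against_one_vote(pairwise_winners):
    """Return the class label that wins the most pairwise contests."""
    return Counter(pairwise_winners).most_common(1)[0][0]


N_PIPES = 34  # number of classes in the classification problem of this study
ovo_models = n_one_against_one(N_PIPES)
ova_models = n_one_against_all(N_PIPES)
print(ovo_models, ova_models)

# Toy 3-class example: winners of the contests (1 vs 2), (1 vs 3), (2 vs 3).
toy_prediction = one_against_one_vote([1, 1, 3])
```

For 34 classes the one-against-one strategy therefore trains many more, but individually smaller, binary problems than the one-against-all strategy.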
de Silva et al. (2010) explored SVMs as pattern recognisers to detect leaks in pipe networks. To solve the issue of erroneous leak detection, Mandal et al. (2012) suggested a novel leak detection system based on rough set theory and SVM. For the computational training of the SVM, they used a swarm intelligence technique, the artificial bee colony algorithm, which imitates the intelligent food-searching behaviour of honey bees. Mashford et al. (2012) developed a method for interpreting data from a network of pressure or flow-measuring devices monitoring a pipe network in order to determine the location and size of leaks in the network using SVM analysis. Abdulla et al. (2013) proposed a method to detect leaks in a pipeline using an ANN model with only three inputs. They investigated a neural network based probabilistic decision support system for detecting leakage in a pipeline system. Their model correlates measurements of inlet and outlet pressures and flow to leak status. Van der Walt et al. (2018) compared machine learning and statistical techniques for pipe network leak detection. In their study, they used a Bayesian probabilistic framework and compared it with two machine learning techniques (ANN and SVM), with application to three different networks, viz., a single pipe network, a benchmark problem from Poulakis et al. (2003), and an experimental network.
The modelling and computer processing needs can be mitigated in part by employing ANN and SVM to monitor the state of piping networks, with a particular emphasis on leaks and losses. These machine learning techniques can be used to correlate the effect of leakage or water losses with network characteristics such as pressure and flow. The correlation helps in finding leaks and supports decision making by water infrastructure authorities.
Figure 3. Typical classification in SVM. (Reproduced after receiving permission from Baeldung Team).
2.2. Network simulation
The Hanoi water distribution network, first introduced by Fujiwara and Khang (1990), was modelled in the EPANET software (see Figure 4) and validated using the optimised solution of Eusuff and Lansey (2003), who optimised the diameters in the network using the Genetic Algorithm and the Shuffled Frog Leaping Algorithm. The network comprises 3 loops, 34 pipes, and 31 demand nodes, with one reservoir having 100 m of head, as shown in Figure 4. The pipe lengths range from 100 to 3500 m, the total length of the network is 39.42 km, and the total demand is 5538.9 litres per second (lps). The source is a fixed-head reservoir with a water surface elevation of 100 m above mean sea level (MSL). The minimum head required at all junctions is 30 m above MSL.
Figure 4. Layout of Hanoi Water distribution network (Fujiwara and Khang 1990) (Figure reproduced after receiving permission).
Figure 5. Modified layout of the Hanoi water distribution network. (Original Nodes: J-1 to J-31, Extra Nodes: J-32 to J-65)
Table 1. Details of the randomly generated data for leak size.
Properties              Values    Units
Number of leak rates    5000      -
Mean leak size          181.60    LPS
Maximum leak size       349.93    LPS
Minimum leak size       10.07     LPS
Standard deviation      97.74     LPS
2.3. Data generation and feature selection
The leak rates were generated randomly from a uniform distribution using MATLAB programming (MATLAB 2014a). The details of the data set are given in Table 1. The code for the simulations was also developed in MATLAB (2014a). To model leakage in the network, an extra node was added at the centre of each pipe, and the leak rates were added as an extra demand at the newly added nodes. Each extra node represents a leak in the respective pipe. The modified EPANET model of the Hanoi water distribution network can be seen in Figure 5. The network’s characteristics for 5000 different leak rates in the range of 10–350 lps were computed using the EPANET model coupled with MATLAB programming and the EPANET toolkit.
The pressure heads and pipe flows from these simulations were recorded and used to develop the machine learning models using the MLP neural network and SVM. The head loss in the network was calculated using the Hazen-Williams equation. The maximum head loss per km of length was found in the first pipe (R1-P1), with a unit head loss of 28.59 m/km. The velocity in the network ranges from 0.01 to 6.83 m/s. The distribution of diameters, demand, pressure at the nodes, pipe lengths, and flow in the network is provided in APPENDIX A (Tables A1, A2, A3).
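The random leak-rate generation can be sketched as follows. The study used MATLAB; this Python version is only an illustration, and the seed and variable names are assumptions. For a uniform distribution on [10, 350] lps the theoretical mean is (10 + 350)/2 = 180 lps and the theoretical standard deviation is (350 − 10)/√12 ≈ 98.15 lps, which is close to the sample values reported in Table 1 (181.60 and 97.74 lps).

```python
import math
import random
import statistics

LEAK_MIN_LPS = 10.0   # lower bound of the leak-rate range (from the text)
LEAK_MAX_LPS = 350.0  # upper bound of the leak-rate range (from the text)
N_LEAKS = 5000        # number of leak rates (from Table 1)

rng = random.Random(0)  # the seed is an arbitrary assumption for repeatability
leak_rates = [rng.uniform(LEAK_MIN_LPS, LEAK_MAX_LPS) for _ in range(N_LEAKS)]

# Sample statistics of the generated leak rates.
sample_mean = statistics.fmean(leak_rates)
sample_std = statistics.stdev(leak_rates)

# Theoretical values for a uniform distribution on [a, b].
theoretical_mean = (LEAK_MIN_LPS + LEAK_MAX_LPS) / 2.0
theoretical_std = (LEAK_MAX_LPS - LEAK_MIN_LPS) / math.sqrt(12.0)

print(f"sample mean {sample_mean:.2f} lps, theoretical {theoretical_mean:.2f} lps")
print(f"sample std  {sample_std:.2f} lps, theoretical {theoretical_std:.2f} lps")
```

A different seed will give slightly different sample statistics, so exact agreement with Table 1 should not be expected.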
2.3.1. Case A – model based on pressure measurements
The hydraulic model developed in the EPANET software was solved for a single leak in the Hanoi water distribution network at a time. The locations of the pressure sensors were optimised using the S-PLACE toolkit (Eliades et al. 2014) in MATLAB and were found to be at seven nodes (J-2, J-4, J-6, J-9, J-17, J-22 and J-24), as shown in Figure 3. The numbers of pressure and flow sensors were taken as per Van der Walt et al. (2018). These could be optimized further on the basis of the available budget, which is left as future scope of the present work. The pressures at the sensor locations were recorded after simulating the EPANET model with 5000 leak rates applied as a base demand at each extra node. The pressures at these nodes were used as the inputs for training the ANN and SVM models against the pipe number as the target in the classification problem. The same inputs were used again to train the ANN and SVM models against the targets of leak location and leak rate in the regression problem.
2.3.2. Case B – model based on flow measurements
The hydraulic model developed in the EPANET software was solved for a single leak in the network at a time. The locations of the flow sensors were optimised using the S-PLACE toolkit in MATLAB. Six flow sensors were located at pipes P-2, P-9, P-20, P-23, P-25 and P-28, and the flows in these pipes were recorded after simulating the EPANET model with 5000 leak rates applied as a base demand at each extra node. The recorded flows were used as the inputs for training the ANN and SVM models against the pipe number as the target in the classification problem. The same inputs were used again to train the ANN and SVM models against the targets of leak location and leak rate in the regression problem.
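The layout of the training data for Cases A and B can be sketched as below. This is a hedged illustration only: `placeholder_simulator` is a hypothetical stand-in that returns dummy numbers and does not perform any hydraulic computation, whereas the study obtained the sensor readings from EPANET runs driven from MATLAB. The sketch shows how one sample is produced per (pipe, leak rate) pair, with the pipe index as the classification target and the leak rate as a regression target.

```python
def placeholder_simulator(pipe_index, leak_rate_lps, n_sensors):
    """HYPOTHETICAL stand-in for a hydraulic simulation; returns dummy readings only."""
    return [100.0 - 0.01 * leak_rate_lps * (s + 1) / (pipe_index + 1)
            for s in range(n_sensors)]


def build_dataset(n_pipes, leak_rates, n_sensors, simulate):
    """Create one sample per (pipe, leak rate) pair."""
    features, pipe_targets, rate_targets = [], [], []
    for pipe_index in range(n_pipes):
        for q_leak in leak_rates:
            features.append(simulate(pipe_index, q_leak, n_sensors))
            pipe_targets.append(pipe_index)  # classification target: leaking pipe
            rate_targets.append(q_leak)      # regression target: leak rate
    return features, pipe_targets, rate_targets


N_PIPES = 34               # pipes in the Hanoi network (from the text)
N_PRESSURE_SENSORS = 7     # Case A; Case B would use 6 flow sensors instead
demo_rates = [10.0, 180.0, 350.0]  # small illustrative subset of leak rates

X, y_pipe, y_rate = build_dataset(N_PIPES, demo_rates, N_PRESSURE_SENSORS,
                                  placeholder_simulator)

# With the full set of 5000 leak rates the number of samples would be:
full_sample_count = N_PIPES * 5000
print(len(X), len(X[0]), full_sample_count)
```

The full sample count of 34 × 5000 matches the input size of 7 × 1,70,000 listed for the pressure-based model in Table 2.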
Modelling leakage depends on understanding the hydraulics of leaks and how to incorporate those hydraulics into existing models of the water distribution system (Mutikanga et al. 2011). The leak has been modelled in the network as an extra demand at the leakage location. Consider a distribution network comprising ‘m’ demand nodes and ‘n’ pipes. Let the total demand in the network be $Q_{Total}$, and let $Q_{L}$ represent the amount of leakage in the ith pipe of the network. The new total demand in the network is then given by Eq. (1):
\[ Q'_{Total} = Q_{Total} + Q_{L} \tag{1} \]
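As an illustrative check of Eq. (1), take the total network demand of 5538.9 lps given in Section 2.2 and, purely as an assumed example, a leak equal to the mean leak size of 181.60 lps from Table 1. The new total demand is then:

\[ Q'_{Total} = 5538.9 + 181.60 = 5720.5 \ \text{lps} \]

so a leak of this size raises the demand seen by the source by about 3.3%.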
3. Results and discussion
Leak detection is an inverse engineering problem: if the pressures and/or flows in the network are recorded through sensors in real time, it becomes possible to detect the leak and its location in the network.
3.1. Classification problem
3.1.1. Solutions from pressure-based ANN model
The ANN recognizes patterns in the data and classifies them to identify and locate the leak. An ANN model usually consists of four main components: (a) input and output variables, (b) type of network, (c) transfer function, and (d) training and learning function. A feed-forward network with gradient descent backpropagation (Amari 1993) as the learning function was used to develop the pattern recognition neural network. An MLP neural network classification model was developed to detect a single leak in the network based on the pressure head at the nodes. The network was solved for the randomly generated leak rates, and the resulting pressures were recorded. In this case, the pressure heads at a total of seven nodes (J-2, J-4, J-6, J-9, J-17, J-22, and J-24) were taken as the inputs, and the pipe numbers were taken as the output, as 34 different classes. The complete ANN configuration is given in Table 2. To guard against overfitting, the ANN model was validated using 10% of the data set before testing. The structure of the developed MLP neural network is 7-30-30-20-34. The optimal network architecture was obtained by manually tuning the number of hidden layers, the number of neurons, and the activation functions of the hidden layers to increase the accuracy of the model, keeping the maximum number of epochs and the other training parameters constant in each case. The mean squared error (MSE) versus epochs for the trained optimal model is shown in Figure 6. The figure indicates that the minimum MSE is observed at 2000 epochs.
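The 7-30-30-20-34 structure can be sketched as a plain forward pass. This is not the authors' trained MATLAB model: the weights are random, the input vector is made up, and the softmax output layer is an assumption chosen because it pairs naturally with a cross-entropy objective. The hidden-layer transfer functions follow Table 2 (logsig is the logistic sigmoid, poslin the positive-linear or ReLU function, purelin the identity).

```python
import math
import random

LAYER_SIZES = [7, 30, 30, 20, 34]  # inputs, three hidden layers, 34 pipe classes


def logsig(v):
    """Logistic sigmoid applied element-wise."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def poslin(v):
    """Positive linear (ReLU) applied element-wise."""
    return [max(0.0, a) for a in v]


def purelin(v):
    """Linear (identity) transfer function."""
    return list(v)


def softmax(v):
    """Numerically stable softmax (assumed output transfer function)."""
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


ACTIVATIONS = [logsig, poslin, purelin, softmax]


def init_params(sizes, rng):
    """Random weights and zero biases for each layer (untrained)."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


def forward(x, params):
    """Propagate one input vector through all layers."""
    a = list(x)
    for (weights, biases), act in zip(params, ACTIVATIONS):
        z = [sum(w * ai for w, ai in zip(row, a)) + b for row, b in zip(weights, biases)]
        a = act(z)
    return a


def count_params(sizes):
    """Total number of weights and biases."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


rng = random.Random(0)
params = init_params(LAYER_SIZES, rng)
x = [0.5] * 7                      # made-up, normalised pressure heads at 7 sensors
probs = forward(x, params)         # one score per pipe class
predicted_pipe = probs.index(max(probs)) + 1
n_params = count_params(LAYER_SIZES)
print(predicted_pipe, n_params)
```

With untrained weights the predicted pipe is meaningless; the sketch only shows the shape of the computation and the number of trainable parameters implied by the architecture.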
Boyce et al. (2002) state that the confusion matrix, used especially
in classification problems, compares the predictions of the model with
the corresponding actual or observed values. The comparison of the
predicted and actual leaking pipes is shown as a confusion matrix in
Figure 7. It indicates that the accuracy of the ANN model is 91.2%,
i.e. the leak is correctly predicted in 31 out of 34 pipes.
Cross-entropy, a measure of the difference between two probability
distributions, is commonly used in machine learning as a loss function
when optimizing classification models such as logistic regression, ANN
and SVM. It has been used as the objective function in testing and
determining the accuracy of the ANN and SVM classification models,
whereas the sum of squared errors is used in the case of regression
models.
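The link between the confusion matrix, the reported accuracy and the cross-entropy loss can be sketched as follows. This is a simplified illustration, assuming one test case per pipe and three arbitrarily chosen misclassified pipes; the actual misclassified pipes are those shown in Figure 7, and the function names are hypothetical.

```python
import math


def confusion_matrix(actual, predicted, n_classes):
    """Count how often each actual class (row) is predicted as each class (column)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the main diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def cross_entropy(true_class, probabilities):
    """Cross-entropy loss for one sample with a one-hot target."""
    return -math.log(probabilities[true_class])


N_PIPES = 34
actual = list(range(N_PIPES))   # one test case per pipe (assumption)
predicted = list(actual)
predicted[4] = 5                # three hypothetical misclassifications
predicted[17] = 16
predicted[30] = 31

cm = confusion_matrix(actual, predicted, N_PIPES)
acc = accuracy(cm)              # 31 correct out of 34
```

Here `acc` evaluates to 31/34, which rounds to the 91.2% quoted in the text.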
3.1.2. Solutions from flow-based ANN model
The EPANET model developed for the Hanoi water distribution network
was solved for 5000 different leak rates at the mid location of the
different pipes in the network. The flows at six flow sensors (P-2,
P-9, P-20, P-23, P-25 and P-28), located as per Figure 2, were taken
as the inputs, and the pipe numbers were taken as the output in the
form of 34 different classes. A supervised MLP network model was
developed; its complete configuration is given in Table 3. The
structure of the developed MLP neural network is 6-10-25-25-34. The
confusion matrix for the developed neural network is shown in
Figure 8. The accuracy of the model in identifying the leaking pipe is
91.2%, as shown in Table 3 and Figure 8.
3.1.3. Solutions from pressure-based SVM model
The EPANET model developed for the Hanoi water distribution network
was solved for 5000 different leak rates at the mid location of the
different pipes. The resulting pressure heads due to the leak rates
were recorded using the
Table 2. Details of developed MLP neural network model (Case A).

| S. No. | ANN parameters | Observation |
|---|---|---|
| 1 | Size of input data | 7 × 5000 × 34, i.e. 7 × 1,70,000 |
| 2 | Size of output data | 1 × 34 × 5000, i.e. 1 × 1,70,000 |
| 3 | Network type | Feed forward pattern recognition |
| 4 | Number of hidden layers, with transfer functions and sizes | 1. Logsig − 30; 2. Poslin − 30; 3. Purelin − 20 |
| 5 | Objective function | Cross-entropy |
| 6 | Learning rate | 0.1 (default value) |
| 7 | Training:validation:test | 60:10:30 (in %) |
| 8 | Accuracy for testing data | 91.2% (misclassification for 3 pipes out of 34 pipes) |
Figure 6. MSE vs number of epochs for the trained optimal architecture of pressure-based MLP ANN used in the study.
Figure 7. Confusion matrix for the ANN classification on tested data based on pressure measurements.
Table 3. Details of the MLP neural network model developed.

| S. No. | ANN parameters | Observation |
|---|---|---|
| 1 | Size of input data | 6 × 5000 × 34, i.e. 6 × 1,70,000 |
| 2 | Size of output data | 1 × 34 × 5000, i.e. 1 × 1,70,000 |
| 3 | Network type | Feed forward pattern recognition |
| 4 | Number of hidden layers, with transfer functions and sizes | 1. Logsig − 10; 2. Radbas − 25; 3. Poslin − 25 |
| 5 | Objective function | Cross-entropy |
| 6 | Learning rate | 0.1 (default value) |
| 7 | Training:validation:test | 60:10:30 (in %) |
| 8 | Accuracy for testing data | 91.2% (misclassification for 3 pipes out of 34 pipes) |
Figure 8. Confusion matrix for the ANN classification on tested data for flow measurements.
Table 4. Details of the SVM classification model developed.

| S. No. | SVM parameters | Observation |
|---|---|---|
| 1 | Size of input data | 7 × 5000 × 34, i.e. 7 × 1,70,000 |
| 2 | Size of output data | 1 × 34 × 5000, i.e. 1 × 1,70,000 |
| 3 | Kernel function | Radial Basis Function (RBF) |
| 4 | Objective function | Cross-entropy |
| 6 | Training:validation:test | 60:10:30 (in %) |
| 7 | Number of binary classifiers | 561 (n = 34) |
| 8 | Scaling factor | 0.1 |
| 9 | Bias | −0.196 |
| 10 | Accuracy against tested data | 91.2% (misclassification for 3 pipes out of 34 pipes) |

seven pressure sensors distributed in the Hanoi Network.
The supervised multi-class SVM classification model was
developed using the kernel function as ‘Gaussian’ function.
The SVM configurations used in present study are given in
Table 4.
From Table 4, it is apparent that the accuracy achieved on the test
data was 91.2% when the pressure sensor records were used as input to
the SVM model. The confusion matrix for the pressure-based model is
shown in Figure 9.
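For readers unfamiliar with the Gaussian (RBF) kernel, a minimal Python sketch is given below. It assumes that the scaling factor listed in Table 4 divides each input before the squared distance is taken, which is one common convention; how the scaling factor enters the authors' model is not stated, and the sample vectors are hypothetical.

```python
import math


def gaussian_kernel(x, z, kernel_scale=1.0):
    """Gaussian kernel exp(-||x/s - z/s||^2), with s the assumed kernel scale."""
    squared_distance = sum(((a - b) / kernel_scale) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance)


# Two hypothetical (scaled) sensor readings, compared with scale 0.1.
reading_a = [0.50, 0.40]
reading_b = [0.55, 0.38]
similarity = gaussian_kernel(reading_a, reading_b, kernel_scale=0.1)
```

The kernel returns 1 for identical readings and decays towards 0 as the readings move apart, so a smaller scale makes the classifier more sensitive to small pressure deviations.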
Figure 9. Confusion matrix for the SVM classification on tested data based on pressure measurements.
3.1.4. Solutions from flow-based SVM model
The pipe flows due to the leak rates were recorded using the six flow
sensors distributed in the Hanoi network. The supervised multi-class
SVM classification model was developed using the ‘Gaussian’ function
as the kernel function. The SVM configuration used in the present
study is given in Table 5, and the confusion matrix for this
classification is shown in Figure 10.
From Table 5 and Figure 10, it is seen that the accuracy of the SVM
model on the test data set is 61.8% when the flow sensor records are
used as inputs to the model.
As per Hsu and Lin (2002), the one-vs-one approach is superior to the
one-vs-all approach, and hence the one-vs-one approach is used in the
present study. The model based on flow values showed the lower
accuracy (61.8%), which may be improved by optimizing the number of
sensors and their locations in the network.
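The number of binary classifiers listed in Tables 4 and 5 follows from the one-vs-one strategy, which trains one classifier for every pair of classes, i.e. n(n − 1)/2 classifiers for n classes. The short Python sketch below enumerates the pairs for n = 34 and shows a majority vote over pairwise winners; the voting helper and the toy winners are hypothetical illustrations, not the authors' implementation.

```python
from itertools import combinations

N_CLASSES = 34  # one class per pipe

# One binary classifier per unordered pair of pipes.
pairs = list(combinations(range(1, N_CLASSES + 1), 2))
n_binary_classifiers = len(pairs)  # n(n - 1)/2


def one_vs_one_vote(pairwise_winners, n_classes):
    """Return the class that wins the most pairwise contests."""
    votes = [0] * (n_classes + 1)  # index 0 unused, classes are 1..n
    for winner in pairwise_winners:
        votes[winner] += 1
    return votes.index(max(votes))


# Toy example: pipe 12 wins every contest it takes part in,
# all other contests go to the lower-numbered pipe.
winners = [12 if 12 in pair else pair[0] for pair in pairs]
voted_pipe = one_vs_one_vote(winners, N_CLASSES)
```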
3.2. Regression problem
The Hanoi water distribution network consists of pipes with lengths
varying from 100 to 3500 m. The pipes were divided into segments at
equal intervals of 20 m, giving a total of 1963 probable leak
locations in the network. Leak rates at all these locations were
simulated using the EPANET model through MATLAB programming. The
number of probable leak locations for each pipe is tabulated in
Table 6.
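The counts in Table 6 appear consistent with taking the whole number of 20 m intervals that fit in each pipe. The following Python sketch reproduces them from the tabulated pipe lengths; the rounding-down rule is inferred from the tabulated values and is not stated explicitly in the text.

```python
# Pipe lengths in metres for pipes 1..34, as listed in Table 6.
PIPE_LENGTHS_M = [
    100, 1350, 900, 1150, 1450, 450, 850, 850, 800, 950, 1200, 3500,
    800, 500, 550, 2730, 1750, 800, 400, 2200, 1500, 500, 2650, 1230,
    1300, 850, 300, 750, 1500, 2000, 1600, 150, 860, 950,
]
INTERVAL_M = 20

# Whole number of 20 m intervals per pipe (assumed rounding rule).
locations_per_pipe = [length // INTERVAL_M for length in PIPE_LENGTHS_M]
total_locations = sum(locations_per_pipe)
```

Under this rule the total comes to 1963, matching the number of probable leak locations reported above.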
3.2.1. Solutions from pressure- and flow-based ANN model
Supervised MLP neural networks were developed for each pipe using the
pressure and flow records separately. The developed models were tested
against an independent data set, and the results can be seen in
Appendix B. The test data set was generated using a known leak size of
300 lps in each pipe, with the midpoint of each pipe as the leak
location. The model parameters, such as the learning rate, the number
of hidden layers, the size of the hidden layers and the transfer
functions, were optimized manually through MATLAB programming
according to the prediction accuracy of the models. The performance
indices for the models using pressure and flow in the network are
tabulated in Table 7.
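The performance indices can be computed as in the minimal Python sketch below. The normalisation of RMSE by the mean observed value is an assumption; it is consistent with the leak-size entries of Table 7 (for example, 57.96 lps / 300 lps ≈ 0.193), but the definition is not given in this section. The sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, n-RMSE, MAE and R2 for paired observations and predictions."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # R2 is undefined when every observed value is identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {
        "MSE": mse,
        "RMSE": rmse,
        "n-RMSE": rmse / mean_obs,  # normalised by the mean observation (assumption)
        "MAE": mae,
        "R2": r2,
    }


# Hypothetical leak locations (m from pipe start): observed midpoints vs predictions.
location = regression_metrics([50, 450, 675, 1750], [60, 430, 700, 1745])

# Hypothetical leak sizes: the observed value is the same 300 lps in every test case.
size = regression_metrics([300, 300], [310, 280])
```

Because the observed leak size is constant in the test set, its variance is zero and R2 cannot be evaluated, which may be why no R2 is reported for leak size.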
Table 5. Details of the SVM classification model developed.

| S. No. | SVM parameters | Observation |
|---|---|---|
| 1 | Size of input data | 6 × 5000 × 34, i.e. 6 × 1,70,000 |
| 2 | Size of output data | 1 × 34 × 5000, i.e. 1 × 1,70,000 |
| 3 | Kernel function | Radial basis function (RBF) |
| 4 | Objective function | Cross-entropy |
| 6 | Training:validation:test | 60:10:30 (in %) |
| 7 | Number of binary classifiers | 561 (n = 34) |
| 8 | Scaling factor | 0.1 |
| 9 | Bias | −0.291 |
| 10 | Accuracy against tested data | 61.8% (misclassification for 13 pipes out of 34 pipes) |
Figure 10. Confusion matrix for the SVM classification on tested data based on flow measurements.
Table 6. Number of probable leak locations in each pipe of Hanoi network.

| Pipe number | Pipe length (m) | No. of probable leak locations = number of models |
|---|---|---|
| 1 | 100 | 5 |
| 2 | 1350 | 67 |
| 3 | 900 | 45 |
| 4 | 1150 | 57 |
| 5 | 1450 | 72 |
| 6 | 450 | 22 |
| 7 | 850 | 42 |
| 8 | 850 | 42 |
| 9 | 800 | 40 |
| 10 | 950 | 47 |
| 11 | 1200 | 60 |
| 12 | 3500 | 175 |
| 13 | 800 | 40 |
| 14 | 500 | 25 |
| 15 | 550 | 27 |
| 16 | 2730 | 136 |
| 17 | 1750 | 87 |
| 18 | 800 | 40 |
| 19 | 400 | 20 |
| 20 | 2200 | 110 |
| 21 | 1500 | 75 |
| 22 | 500 | 25 |
| 23 | 2650 | 132 |
| 24 | 1230 | 61 |
| 25 | 1300 | 65 |
| 26 | 850 | 42 |
| 27 | 300 | 15 |
| 28 | 750 | 37 |
| 29 | 1500 | 75 |
| 30 | 2000 | 100 |
| 31 | 1600 | 80 |
| 32 | 150 | 7 |
| 33 | 860 | 43 |
| 34 | 950 | 47 |
Table 7. Performance of ANN regression models using pressure and flow data for leak location and size detection.

| Statistical parameters | Case A (pressure models): Leak location | Case A (pressure models): Leak size | Case B (flow models): Leak location | Case B (flow models): Leak size |
|---|---|---|---|---|
| RMSE | 54.18 m | 57.96 lps | 54.02 m | 29.56 lps |
| n-RMSE | 0.093 | 0.193 | 0.093 | 0.10 |
| R2 | 0.98 | - | 0.98 | - |
| MAE | 42.35 | 32.19 | 40.59 | 19.36 |
| MSE | 2935.29 | 3358.82 | 2917.65 | 873.69 |
3.2.2. Solutions from pressure- and flow-based SVM model
The recorded pressure and flow values from the respective sensors were
used to train the SVM models against the leak location and the leak
rate. The resulting models were tested against the testing data set
prepared by simulating a leak size of 300 lps at the centre of every
pipe in the network. The performance indices for the models using
pressure and flow in the network are tabulated in Table 8.
The results from ANN and SVM regression modelling can be found in
Appendix B (Tables B1 and B2).

Table 8. Performance of the SVM regression analysis using pressure and flow data for leak location and leak sizes.

| Statistical parameters | Case A (pressure models): Leak location | Case A (pressure models): Leak size | Case B (flow models): Leak location | Case B (flow models): Leak size |
|---|---|---|---|---|
| RMSE | 50.06 | 13.20 | 45.44 | 31.85 |
| n-RMSE | 0.086 | 0.044 | 0.078 | 0.11 |
| R2 | 0.986 | - | 0.986 | - |
| MAE | 20.59 | 12.88 | 21.47 | 20.74 |
| MSE | 2505.88 | 174.37 | 2064.71 | 1014.24 |

4. Discussions
The MLP neural network model has been developed to detect the leaking
pipe in the network using pressure values. The model used three hidden
layers of 30, 30 and 20 neurons, with logsig (log sigmoidal), poslin
(positive linear) and purelin (pure linear) as the respective transfer
functions. The model classified the leaking pipe with 91.2% accuracy.
The MLP neural network model for the flow values was also developed
with three hidden layers, of 10, 25 and 25 neurons, with logsig,
radbas (radial basis) and poslin as the respective transfer functions.
This model also predicts the leaking pipe in the network with 91.2%
accuracy.
The supervised multi-class SVM classification model was developed
using the pressure values from the pressure sensors, with the
‘Gaussian’ function as the kernel function. The model predicted the
leaking pipe with 91.2% accuracy when trained using the
one-against-one strategy. The SVM multi-class classification model was
also developed using the flow values from the flow sensors, again with
the ‘Gaussian’ function as the kernel function. This model predicted
the leaking pipe with 61.8% accuracy.
The number of sensors installed in the network is one of the major
factors determining the performance of the classification and
regression models developed in the present study. The number of
sensors can be further optimized using existing algorithms. In the
present study, seven pressure sensors and six flow sensors were
selected, which corresponds to one pressure sensor per 5.63 km and one
flow sensor per 6.57 km of pipe, respectively.
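These sensor densities can be checked against Table 6: summing the 34 pipe lengths listed there gives a total network length of 39,420 m, so that, as a worked check (the symbols below are introduced here for illustration only):

```latex
\frac{L_{\text{total}}}{N_{\text{pressure}}}
  = \frac{39.42\ \text{km}}{7} \approx 5.63\ \text{km per sensor},
\qquad
\frac{L_{\text{total}}}{N_{\text{flow}}}
  = \frac{39.42\ \text{km}}{6} = 6.57\ \text{km per sensor}
```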
Salam et al. (2014) used the emitter coefficient to model leakage in
the network through the orifice equation. They selected emitter
coefficients from 0.005 to 0.3 with an average system pressure of
3.74 m, which in turn gives leak rates of 0.01–0.6 lps. Rojek and
Studzinski (2019) considered leak rates in the range of 15–35 lps. The
range of leakage rates has been chosen according to the maximum flow
(5538.91 lps) in the Hanoi water distribution network. In previous
studies, researchers considered leak rates as percentages (2%, 4%, 6%,
8% and 10%) of the flow rates in the pipes (Caputo and Pelagagge
2003), and as 0.7% to 3.3% of the maximum flow in the network (Van der
Walt et al. 2018). The range of leakage rates taken in the present
study is 10–350 lps, which is approximately 0.18%–6.32% of the maximum
flow in the network without leakage and is in line with existing
practice in the literature. The network has been analyzed as a
demand-driven network.
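The figures quoted above can be reproduced with a short Python sketch. It assumes the usual emitter form of the orifice equation, q = C · p^γ, with a pressure exponent γ = 0.5 (the EPANET default); the exponent used by Salam et al. (2014) is not stated in the text, and the function names are hypothetical.

```python
def emitter_flow(coefficient, pressure_m, exponent=0.5):
    """Leak rate from the emitter (orifice) equation q = C * p**exponent."""
    return coefficient * pressure_m ** exponent


def percent_of_max_flow(leak_lps, max_flow_lps=5538.91):
    """Leak rate expressed as a percentage of the maximum network flow."""
    return 100.0 * leak_lps / max_flow_lps


# Emitter coefficients 0.005 and 0.3 at an average system pressure of 3.74 m.
low_leak = emitter_flow(0.005, 3.74)    # about 0.01 lps
high_leak = emitter_flow(0.3, 3.74)     # about 0.6 lps

# Leak-rate range of the present study relative to the maximum flow of 5538.91 lps.
low_share = percent_of_max_flow(10)     # about 0.18 %
high_share = percent_of_max_flow(350)   # about 6.32 %
```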
The leak detection techniques presented in the current study can be
combined with online monitoring systems to allow quick and accurate
detection of leaks. The study has been carried out to detect the
leaking pipe, the leak location and the leak rate in the water
distribution network. The classification and regression models
developed through the ANN and SVM techniques require optimization of
their model parameters. The performance of the models depends upon the
size of the data used and the number of features selected for
training.
In the present study, the models developed from the pressure sensor
measurements performed better than those developed from the flow
sensor measurements. This is because the leakages are modelled as base
demands at extra nodes added to the network. A basic assumption in the
analysis of a water distribution network is that, whatever the demand,
enough water is available in the network to fulfil the total demand;
that is, the network is never deficient in terms of demand and supply.
In actual field conditions, however, the network will always be in
deficit after leakages and the corresponding losses. So, when the leak
rates are added to the network as extra demand, the average system
pressure increases, making pressure sensors more sensitive to the
deviations in nodal pressures.
Carreno-Alvarado et al. (2017) compared machine learning classifiers
for leak detection and isolation in water distribution networks. They
used principal component analysis (PCA), SVM and relevance vector
machine (RVM), and found that RVM is suitable for leak detection as it
has almost the same accuracy as SVM but requires a smaller number of
vectors. Quinones-Grueiro et al. (2018a) compared k-nearest neighbour,
Bayes classifier, ANN and SVM for leak location in the Hanoi
distribution network. The results showed that SVM outperforms all the
other techniques. Quinones-Grueiro et al. (2018b) also considered an
unsupervised approach to leak detection and localization in water
distribution networks and tested their methodology on the Hanoi
benchmark problem. The leak rates they considered were in the range of
18–40 lps (less than 2.5%). They found that periodic dynamic PCA with
three pressure sensors gives 72% accuracy in detecting the leak and
85.25% accuracy in leak location. Akinsete and Oshingbesan (2019)
investigated five intelligent models, namely gradient boosting (GB),
decision tree (DT), random forest (RF), SVM and ANN, in natural gas
pipelines. The results showed that the RF and DT models are the most
sensitive, as they can detect a leak of 0.1% of nominal flow in about
2 h. However, the ANN and SVM showed the best performance. Lučin et
al. (2021) used an RF classifier for data-driven leak localization in
urban water distribution networks using big data and found that RF
showed a maximum accuracy of 82% for smaller networks and 62% for
larger networks.
The data generation, the training of the models and the evaluation of
the trained models were performed on a PC with the following
configuration:
Core i5 7th Gen
RAM − 8 GB
Graphics Card − 4 GB NVIDIA GeForce GTX 1050Ti
Number of physical cores = 4
Multithreading – Available
Maximum threads − 2
The data generation took the longest, up to 7–8 days. Training the ANN
and SVM classification and regression models took at most 5–6 h, while
testing a model required less than 0.5 s.
5. Summary and conclusions
The present study has been carried out to detect, locate and pinpoint
leakage in a water distribution network using machine learning
techniques, viz., ANN and SVM. The benchmark problem of the Hanoi
water distribution network has been used, and the leakage detection
has been carried out for leak rates ranging from 10 to 350 lps
(0.18–6.32% of the maximum flow in the network). The network is
assumed to have seven pressure sensors (J-2, J-4, J-6, J-9, J-17, J-22
and J-24) and six flow sensors (P-2, P-9, P-20, P-23, P-25 and P-28).
The problem of leak detection was divided into two problems,
classification and regression. The classification problem was solved
to find the leaking pipe in the network, whereas the regression
problem was solved to find the leak location within the particular
pipe and the leak rate. The ANN and SVM models were developed to
detect a single leak in the network. The leak simulated for testing
was 300 lps.
The following are the key findings of the present study:
● ANN classification performed better than SVM multi-class
classification, achieving 91.2% accuracy in both pressure- and
flow-based models, in contrast to the SVM models, which gave
accuracies of 91.2% and only 61.8% in the pressure- and flow-based
models, respectively.
● In the regression problem, all the pipes were divided into several
parts at intervals of 20 m. For each location, a model was developed
using ANN and SVM. Since there are a total of 1963 different locations
in the network at 20 m intervals, the number of models developed is
also 1963 for each case in ANN and SVM.
● The pressure-based ANN regression model yielded normalised RMSE
(n-RMSE) values of 0.093 and 0.193, whereas the pressure-based SVM
regression model yielded n-RMSE values of 0.086 and 0.044, for the
detection of leak location and leak rate, respectively.
● The flow-based ANN regression model performed with n-RMSE values of
0.093 and 0.10, whereas for the flow-based SVM regression model the
n-RMSE values were 0.078 and 0.11, for the detection of leak location
and leak rate, respectively.
● For pinpointing the leak and detecting the leak rate in the
regression problem, the SVM regression model performed better than the
ANN regression model in both pressure- and flow-based models.
6. Limitations and future scope
The present study was carried out to detect leakage in the water
distribution network using machine learning techniques and a network
of sensors. The techniques applied require the sensors to capture the
maximum effect of leakage on the flow characteristics of the network,
because the machine learning techniques are used to recognise the
pattern of changes in the network due to leakages. The number and
location of the sensors can therefore be further optimized using
existing algorithms, which gives rise to the problem of optimising the
number of sensors and their locations in the water distribution
network. Also, the study has been carried out considering the pressure
and flow measurements separately. The present study compares the ANN
and SVM techniques in leak detection and their performance accuracy.
Leak detection using more advanced methods such as CNN can be the
future scope of the present study. However, Yamashita et al. (2018)
state that methods like CNN require a large amount of training data
because of the estimation of their numerous learnable parameters,
making them more computationally expensive. They also require
graphical processing units (GPUs) for model training.
The scope for future work includes the development of classification
and regression models that combine the pressure and flow measurements
using machine learning techniques.
Acknowledgements
The authors would like to acknowledge the Centre of Excellence on
“Water Resources and Flood Management” at Department of Civil
Engineering, Sardar Vallabhbhai National Institute of Technology –
Surat, Gujarat, India established under TEQIP-II grant of Ministry of
Education for providing the required facilities and infrastructural sup­
port. Authors are thankful to the Editor, Associate Editor and Reviewers
for their comments which helped in improvement of readability of the
present paper.
Disclosure statement
No potential conflict of interest was reported by the authors.
ORCID
P.V. Timbadiya
http://orcid.org/0000-0001-8472-3318
Data availability statement
The distribution network data used and results from regression analysis
in this study are available in the Appendix A & B after the references.
Any other data related to study will be available based on the request for
academic purposes only. Interested readers may directly contact the
corresponding author for any other data requirements.
References
Abdulla, M.B., Herzallah, R.O., and Hammad, M.A. (2013). “Pipeline
leak detection using artificial neural network: Experimental study”.
Proceedings of International Conference on Modelling, Identification
and Control (ICMIC), Cairo, Egypt, 328–332.
Akinsete, O., and Oshingbesan, A. (2019). “Leak detection in natural
gas pipelines using intelligent models”. Proceedings of SPE Nigeria
Annual International Conference and Exhibition, Nigeria: OnePetro,
10.2118/198738-MS.
Aksela, K., Aksela, M., and Vahala, R. (2009). “Leakage detection in
a real distribution network using a SOM.” Urban Water J., 6(4),
279–289. 10.1080/15730620802673079.
Alkasseh, J., Adlan, M.N., Abustan, I., Aziz, H.A., and Hanif, A.B.M.
(2013). “Applying minimum night flow to estimate water loss using
statistical modelling: A case study in Kinta Valley, Malaysia.” Water
Resour. Manage., 27(5), 1439–1455. 10.1007/s11269-012-0247-2.
Amari, S.I. (1993). “Backpropagation and stochastic gradient descent
method.” Neurocomputing, 5(4–5), 185–196. 10.1016/0925-2312(93)
90006-O.
Belsito, S., Lombardi, P., Andreussi, P., and Banerjee, S. (1998). “Leak
detection in liquefied gas pipelines by artificial neural networks.”
Process Systems Engineering, AIChE Journal, 44(12), 2675–2688.10.
1002/aic.690441209.
Billmann, L., and Isermann, R. (1987). “Leak detection methods for
pipelines.” Automatica, 23(3), 381–385.10.1016/0005-1098(87)
90011-2.
Boyce, M.S., Vernier, P.R., Nielsen, S.E., and Schmiegelow, F.K. (2002).
“Evaluating resource selection functions.” Ecol. Modell, 157(2–3),
281–300. 10.1016/S0304-3800(02)00200-4.
Cantos, W.P., Juran, I., and Tinelli, S. (2020). “Machine-learning–based
risk assessment method for leak detection and geolocation in a water
distribution system.” J. Infrastruct. Syst., 26(1), 04019039.10.1061/
(ASCE)IS.1943-555X.0000517.
Caputo, A.C., and Pelagagge, P.M. (2003). “Using neural networks to
monitor piping systems.” Process Safety Progress AIChE Journal, 22
(2), 119–127. 10.1002/prs.680220208.
Carreno-Alvarado, E.P., Reynoso-Meza, G., Montalvo, I., and
Izquierdo, J. (2017). “A comparison of machine learning classifiers
for leak detection and isolation in urban networks.” In: Proceedings
of Congress on numerical methods in engineering, SEMNI 2017,
Valencia, Spain: International Center for Numerical Methods in
Engineering (CIMNE), 1545–1552. http://hdl.handle.net/10251/
160954 .
Chan, T.K., Chin, C.S., and Zhong, X. (2018). “Review of current
technologies and proposed intelligent methodologies for water dis­
tributed network leakage detection.” IEEE Access, 6, 78846–78867.
10.1109/ACCESS.2018.2885444.
Dehuri, S., and Cho, S.B. (2010). “A hybrid genetic based functional link
artificial neural network with a statistical comparison of classifiers
over multiple datasets.” Neural Comput. Appl., 19(2), 317–328. 10.
1007/s00521-009-0310-y.
de Silva, D., Mashford, J., and Burn, S. (2010). “Computer aided leak
location and sizing in pipe network.” St Lucia, Queensland,
Australia: Urban Water Security Research Alliance Technical
Report No 17.
Eliades, D.G., Kyriakou, M., and Polycarpou, M.M. (2014). “Sensor
placement in water distribution systems using the S-PLACE
Toolkit.” Procedia Engineering, 70, 602–611. 10.1016/j.proeng.2014.
02.066.
Eliades, D.G., Kyriakou, M., Vrachimis, S., and Polycarpou, M.M.
(2016). “EPANET-MATLAB toolkit: An open-source software for
interfacing EPANET with MATLAB”. Proceedings of the 14th
International Conference on Computing and Control for the Water
Industry, Computer Control for Water Industry (CCWI 2016),
Amsterdam, Netherlands, 10.5281/zenodo.437751.
El-Zahab, S., and Zayed, T. (2019). “Leak detection in water distribution
networks: An introductory overview.” Smart Water, 4(5), 1–23. 10.
1186/s40713-019-0017-x
Eusuff, M.M., and Lansey, K.E. (2003). “Optimization of water dis­
tribution network design using the shuffled frog leaping algorithm.”
J. Water Resour. Plann. Manage. (ASCE), 129(3), 210–225. 10.1061/
(ASCE)0733-9496(2003)129:3(210).
Fan, X., and Yu, X. (2022). “An innovative machine learning based
framework for water distribution network leakage detection and
localization.” Struct. Health Monit., 21(4), 1626–1644.10.1177/
14759217211040269.
Farley, M., and Trow, S. (2003). Losses in water distribution networks.
IWA Publishing, London, UK.
Fujiwara, O., and Khang, D.B. (1990). “A two‐phase decomposition
method for optimal design of looped water distribution networks.”
Water Resour. Res., 26(4), 539–549.10.1029/WR026i004p00539.
Hamilton, S. (2009). “ALC in low pressure areas—it can be done“.
Proceedings of 5th IWA Water Loss Reduction Specialist Conference,
Cape Town, South Africa, 131–137.
Hart, W.E., and Murray, R. (2010). “Review of sensor placement stra­
tegies for contamination warning systems in drinking water distri­
bution systems.” J. Water Resour. Plann. Manage. (ASCE), 136(6),
611–619.10.1061/(ASCE)WR.1943-5452.0000081.
Hough, J.E. (1988). “Leak testing of pipelines uses pressures and acous­
tic velocity.” Oil and Gas Journal, 86(47), 35–41.
Hsu, C.W., and Lin, C.J. (2002). “A comparison of methods for multi­
class support vector machines.” IEEE Trans. Neural Netw., 13(2),
415–425. 10.1109/72.991427.
Huelss, H. (2020). “Norms are what machines make of them:
Autonomous Weapons Systems and the normative implications of
human-machine interactions.” International Political Sociology, 14
(2), 111–128.10.1093/ips/olz023.
Hu, X., Han, Y., Yu, B., Geng, Z., and Fan, J. (2021). “Novel leakage
detection and water loss management of urban water supply network
using multiscale neural networks.” J. Clean. Prod., 278, 123611. 10.
1016/j.jclepro.2020.123611.
Hunaidi, O. (1998). “Ground-penetrating radar for detection of leaks in
buried plastic water distribution pipes”. Proceedings of the 7th
International conference on Ground Penetrating Radar, GPR ’98,
Lawrence, Kansas, USA.
Izquierdo, J., López, P.A., Martínez, F.J., and Pérez, R. (2007). “Fault
detection in water supply systems using hybrid (theory and
data-driven) modelling.” Math Comput. Model, 46(3–4),
341–350.10.1016/j.mcm.2006.11.013.
Jin, H., Zhang, L., Liang, W., and Ding, Q. (2014). “Integrated leakage
detection and localization model for gas pipelines based on the
acoustic wave method.” Journal of Loss Prevention in the Process
Industries, 27, 74–88. 10.1016/j.jlp.2013.11.006.
Khulief, Y.A., Khalifa, A., Mansour, R.B., and Habib, M.A. (2012).
“Acoustic detection of leaks in water pipelines using measurements
inside pipe.” J. Pipeline Syst. Eng. Pract., 3(2), 47–54.10.1061/(ASCE)
PS.1949-1204.0000089.
Kreßel, U.H.G. (1999). “Pairwise classification and support vector
machines.” Advances in Kernel Methods: Support Vector Learning,
255–268. Cambridge, Massachusetts, United States: The MIT Press.
https://www.researchgate.net/publication/2346087_Advances_in_
Kernel_Methods_-_Support_Vector_Learning .
Kumar, R., Singh, R.D., and Sharma, K.D. (2005). “Water resources of
India.” Curr. Sci., 89(5), 794–811. https://www.jstor.org/stable/
24111024
Lambert, A.O. (2002). “International report: Water losses management
and techniques.” Water Sci. Technol.: Water Supply, 2(4), 1–20.10.
2166/ws.2002.0115.
Lockwood, A., Murray, T., Stuart, G., and Scudder, L. (2003). “A study
of geophysical methods for water leak location”. Proceedings of PEDS
2003 (Pumps, Electromechanical Devices and Systems Applied to
Urban Water Management). Valencia, Spain.
Lučin, I., Lučin, B., Čarija, Z., and Sikirica, A. (2021). “Data-driven leak
localization in urban water distribution networks using big data for
random forest classifier.” Mathematics, 9(6), 672. 10.3390/math9060672.
Mandal, S.K., Chan, F.T., and Tiwari, M.K. (2012). “Leak detection of
pipeline: An integrated approach of rough set theory and artificial
bee colony trained SVM.” Expert Syst. Appl., 39(3), 3071–3080. 10.
1016/j.eswa.2011.08.170.
Mashford, J., de Silva, D., Burn, S., and Marney, D. (2012). “Leak
detection in simulated water pipe networks using SVM.” Appl.
Artif. Intell., 26(5), 429–444.10.1080/08839514.2012.670974.
Mashhadi, N., Shahrour, I., Attoue, N., El Khattabi, J., and Aljer, A.
(2021). “Use of machine learning for leak detection and localization
in water distribution systems.” Smart Cities, 4(4), 1293–1315.10.
3390/smartcities4040069.
MATLAB. (2014). The MathWorks, Natick, MA.
Mounce, S.R., and Machell, J. (2006). “Burst detection using hydraulic
data from water distribution systems with artificial neural networks.”
Urban Water J, 3(1), 21–31.10.1080/15730620600578538.
Mounce, S.R., Mounce, R.B., and Boxall, J.B. (2011). “Novelty detection
for time series data analysis in water distribution systems using
support vector machines.” J. Hydro-Informatics, 13(4), 672–686.
10.2166/hydro.2010.144.
Moyer, E., Male, J.W., Moore, C., and Hock, G. (1983). “The economics
of leak detection and repair – a case study.” Journal of American
Water Works Association, 75(1), 29–35. 10.1002/j.1551-8833.1983.
tb05054.x.
Murray, R., Haxton, T., Janke, R., Hart, W.E., Berry, J., and Phillips, C.
(2010). Sensor network design for drinking water contamination
warning systems: A compendium of research results and case studies
using the TEVA-SPOT software-Report, Cincinnati, OH: US
Environmental Protection Agency, EPA Number: EPA/600/R-09/
141.
Mutikanga, H.E., Vairavamoorthy, K., Sharma, S.K., and Akita, C.S.
(2011). “Operational tool for decision support in leakage control.”
Water Practice & Technology, 6(3), wpt2011057. 10.2166/wpt.2011.
057.
O’Brien, E., Murray, T., and McDonald, A., (2003). “Detecting leaks
from water pipes at a test facility using ground penetrating radar”.
Proceedings of PEDS 2003 (Pumps, Electromechanical Devices and
Systems Applied to Urban Water Management), April 22-25, 2003, 1,
Valencia, Spain, CORDIS, 395–404.
Pilcher, R., Hamilton, S., Chapman, H., Ristovski, B., and Strapely, S.,
(2007). “Leak location and repair guidance notes”. Proceedings of
International Water Association. Water Loss Task Forces: Specialist
Group Efficient Operation and Management, Bucharest, Romania,
IWA.
Poulakis, Z., Valougeorgis, D., and Papadimitriou, C. (2003). “Leakage
detection in water pipe networks using a Bayesian probabilistic
framework.” Probabilistic Engineering Mechanics, 18(4),
315–327.10.1016/S0266-8920(03)00045-6.
Puust, R., Kapelan, Z., Savic, D.A., and Koppel, T. (2010). “A review of
methods for leakage management in pipe networks.” Urban Water J.,
7(1), 25–45.10.1080/15730621003610878.
Quiñones-Grueiro, M., Bernal-de Lázaro, J.M., Verde, C.,
Prieto-Moreno, A., and Llanes-Santiago, O. (2018a). “Comparison of
classifiers for leak location in water distribution networks.”
IFAC-PapersOnLine, 51(24), 407–413. 10.1016/j.ifacol.2018.09.609.
Quiñones-Grueiro, M., Verde, C., Prieto-Moreno, A., and
Llanes-Santiago, O. (2018b). “An unsupervised approach to leak
detection and location in water distribution networks.” International
Journal of Applied Mathematics and Computer Science, 28(2), 283–295.
10.2478/amcs-2018-0020.
Rajtar, J., and Muthiah, R. (1997). “Pipeline leak detection system for oil
and gas flowlines.” J. Manuf. Sci. Eng. (ASME), 19(1), 105–109. 10.
1115/1.2836545.
Rojek, I., and Studzinski, J. (2019). “Detection and localization of water
leaks in water nets supported by an ICT system with artificial
intelligence methods as a way forward for smart cities.”
Sustainability, 11(2), 518. 10.3390/su11020518.
Salam, A.E.U., Tola, M., Selintung, M., and Maricar, F. (2014). “On-line
monitoring system of water leakage detection in pipe networks with
artificial intelligence.” ARPN J. Eng. Appl. Sci., 9(10), 1817–1822.
Samudrala, S. (2018). Machine Intelligence: Demystifying machine learn­
ing, neural networks and deep learning. Notion Press Chennai, India.
https://books.google.co.id/books?id=LC2DDwAAQBAJ.
Shinozuka, M., Liang, J., and Feng, M.Q. (2005). “Use of supervisory
control and data acquisition for damage location of water delivery
systems.” J. Eng. Mech., 131(3), 225–230.10.1061/(ASCE)0733-9399
(2005)131:3(225).
Shravani, D., Prajwal, Y.R., Prapulla, S.B., Salanke, N.G.R., Shobha, G.,
and Ahmad, S.F. (2019). “A machine learning approach to water leak
localization”. Proceedings of 4th International Conference on
Computational Systems and Information Technology for Sustainable
Solution (CSITSS), Florida International University, Miami, USA, 4,
1–6. 10.1109/CSITSS47250.2019.9031010.
Silva, R.A., Buiatti, C.M., Cruz, S.L., and Pereira, J.A. (1996). “Pressure
wave behaviour and leak detection in pipelines.” Comput. Chem.
Eng., 20(1), S491–S496. 10.1016/0098-1354(96)00091-9
Stoianov, I., Nachman, L., Madden, S., and Tokmouline, T. (2007).
“Pipenet: A wireless sensor network for pipeline monitoring”.
Proceedings of the 6th international conference on Information
processing in sensor networks, 264–273, April 25 - 27, 2007, New York,
United States: Association for Computing Machinery, Cambridge,
Massachusetts, USA. 10.1145/1236360.1236396.
Suthaharan, S. (2016). “Support Vector Machine In machine learning
models and algorithms for big data classification.” Integrated series in
information systems, 36, 207–235. Springer: Boston, MA.
10.1007/978-1-4899-7641-3_9.
Tariq, S., Bakhtawar, B., and Zayed, T. (2022). “Data-driven application
of MEMS-based accelerometers for leak detection in water distribution
networks.” Sci. Total Environ., 809, 151110.
10.1016/j.scitotenv.2021.151110.
Thirumalaiah, K., and Deo, M.C. (1998). “Real‐time flood forecasting
using neural networks.” Computer‐aided Civil and Infrastructure
Engineering, 13(2), 101–111. 10.1111/0885-9507.00090.
Van der Walt, J.C., Heyns, P.S., and Wilke, D.N. (2018). “Pipe network
leak detection: Comparison between statistical and machine learning
techniques.” Urban Water J., 15(10), 953–960.
10.1080/1573062X.2019.1597375.
Yamashita, R., Nishio, M., Do, R.K.G., and Togashi, K. (2018).
“Convolutional neural networks: An overview and application
in radiology.” Insights Imaging, 9(4), 611–629.
10.1007/s13244-018-0639-9.
Zahab, S.E., Mosleh, F., and Zayed, T. (2016). “An accelerometer-based
real-time monitoring and leak detection system for pressurized
water pipelines.” Pipelines 2016, Kansas City, Missouri:
American Society of Civil Engineers, 257–268.
10.1061/9780784479957.025.
Zhang, X.J. (1993). “Statistical leak detection in gas and liquid
pipelines.” Pipes and Pipelines International, 38(4), 26–29.
Zhang, Q., Wu, Z.Y., Zhao, M., Qi, J., Huang, Y., and Zhao, H. (2016).
“Leakage zone identification in large-scale water distribution systems
using multiclass support vector machines.” J. Water Resour. Plann.
Manage. (ASCE), 142(11), 4016042.
10.1061/(ASCE)WR.1943-5452.0000661.
APPENDICES
Appendix A. Details of the network simulation
Table A1. Details of the nodes in the network.

| Node ID | Demand (lps) | Head (m) |
|---|---|---|
| 1 (Source) | — | 100 |
| 2 | 247.22 | 97.14 |
| 3 | 236.11 | 61.67 |
| 4 | 36.11 | 57.25 |
| 5 | 201.39 | 51.77 |
| 6 | 279.17 | 46.03 |
| 7 | 375 | 44.71 |
| 8 | 152.78 | 43.17 |
| 9 | 145.83 | 41.96 |
| 10 | 145.83 | 41.08 |
| 11 | 138.89 | 39.52 |
| 12 | 155.56 | 38.37 |
| 13 | 261.11 | 34.16 |
| 14 | 170.83 | 34.72 |
| 15 | 77.78 | 34.26 |
| 16 | 86.11 | 34.26 |
| 17 | 240.28 | 41.31 |
| 18 | 373.61 | 51.36 |
| 19 | 16.67 | 58.14 |
| 20 | 354.17 | 50.78 |
| 21 | 258.33 | 41.43 |
| 22 | 134.72 | 36.27 |
| 23 | 290.28 | 44.84 |
| 24 | 227.78 | 39.88 |
| 25 | 47.22 | 36.82 |
| 26 | 250 | 33.55 |
| 27 | 102.78 | 33.01 |
| 28 | 80.56 | 36.31 |
| 29 | 100 | 31.72 |
| 30 | 100 | 30.85 |
| 31 | 29.17 | 31.34 |
| 32 | 223.61 | 32.65 |
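As a quick consistency check on Table A1, the nodal demands can be summed to give the total outflow the source has to supply when no leak is present. The sketch below is illustrative only: the variable names are hypothetical and the values are transcribed from the table. The total can be compared with the flow listed for pipe P-1 (5538.9 lps) in Table A3.

```python
# Nodal demands in lps for nodes 2..32, transcribed from Table A1.
# Node 1 is the source and carries no demand.
demands_lps = [
    247.22, 236.11, 36.11, 201.39, 279.17, 375, 152.78, 145.83,
    145.83, 138.89, 155.56, 261.11, 170.83, 77.78, 86.11, 240.28,
    373.61, 16.67, 354.17, 258.33, 134.72, 290.28, 227.78, 47.22,
    250, 102.78, 80.56, 100, 100, 29.17, 223.61,
]

# Total demand the source must deliver in the no-leak steady state.
total_demand_lps = round(sum(demands_lps), 2)
print(f"Total nodal demand: {total_demand_lps} lps")
```

Any simulated leak adds an extra outflow on top of this total, which is what shifts the pressures and flows used as model inputs.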
Table A2. Lengths of the different pipe diameters in the network.

| S. No. | Diameter (mm) | No. of pipes | Length of the network (m) |
|---|---|---|---|
| 1 | 1016 | 11 | 12750 |
| 2 | 762 | 4 | 4680 |
| 3 | 609.6 | 3 | 4700 |
| 4 | 508 | 4 | 5050 |
| 5 | 406.4 | 6 | 8390 |
| 6 | 304.8 | 6 | 3850 |
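The totals in Table A2 follow from grouping the per-pipe lengths and diameters of Table A3 by diameter. A minimal sketch of that aggregation is given below; the list and dictionary names are hypothetical, and the values are transcribed from Table A3 in pipe order P-1 to P-34.

```python
from collections import defaultdict

# Pipe lengths (m) and diameters (mm) for P-1..P-34, from Table A3.
lengths_m = [
    100, 1350, 900, 1150, 1450, 450, 850, 850, 800, 950, 1200, 3500,
    800, 500, 550, 2730, 1750, 800, 400, 2200, 1500, 500, 2650, 1230,
    1300, 850, 300, 750, 1500, 2000, 1600, 150, 860, 950,
]
diameters_mm = [
    1016, 1016, 1016, 1016, 1016, 1016, 1016, 1016, 1016, 762, 762,
    609.6, 406.4, 406.4, 304.8, 406.4, 508, 609.6, 609.6, 1016, 508,
    304.8, 1016, 762, 762, 508, 304.8, 304.8, 406.4, 406.4, 304.8,
    304.8, 406.4, 508,
]

# Accumulate pipe count and total length per diameter class.
pipe_count = defaultdict(int)
total_length_m = defaultdict(int)
for length, diameter in zip(lengths_m, diameters_mm):
    pipe_count[diameter] += 1
    total_length_m[diameter] += length

# Print classes from largest to smallest diameter, as in Table A2.
for diameter in sorted(total_length_m, reverse=True):
    print(diameter, pipe_count[diameter], total_length_m[diameter])
```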
Table A3. Details of the piping network (pipe length, optimal diameter and the flow in the pipes).

| Pipe No. | From node | To node | Length (m) | Diameter (mm) | Flow (lps) | Velocity (m/s) | Unit headloss (m/km) |
|---|---|---|---|---|---|---|---|
| P-1 | R-1 | J-1 | 100 | 1016 | 5538.9 | 6.83 | 28.59 |
| P-2 | J-1 | J-2 | 1350 | 1016 | 5291.68 | 6.53 | 26.27 |
| P-3 | J-2 | J-3 | 900 | 1016 | 2140.84 | 2.64 | 4.92 |
| P-4 | J-3 | J-4 | 1150 | 1016 | 2104.73 | 2.6 | 4.76 |
| P-5 | J-4 | J-5 | 1450 | 1016 | 1903.34 | 2.35 | 3.95 |
| P-6 | J-5 | J-6 | 450 | 1016 | 1624.17 | 2 | 2.95 |
| P-7 | J-6 | J-7 | 850 | 1016 | 1249.17 | 1.54 | 1.81 |
| P-8 | J-7 | J-8 | 850 | 1016 | 1096.39 | 1.35 | 1.42 |
| P-9 | J-8 | J-9 | 800 | 1016 | 950.56 | 1.17 | 1.09 |
| P-10 | J-9 | J-10 | 950 | 762 | 555.56 | 1.22 | 1.64 |
| P-11 | J-10 | J-11 | 1200 | 762 | 416.67 | 0.91 | 0.96 |
| P-12 | J-11 | J-12 | 3500 | 609.6 | 261.11 | 0.89 | 1.2 |
| P-13 | J-9 | J-13 | 800 | 406.4 | 249.17 | 1.92 | 7.95 |
| P-14 | J-13 | J-14 | 500 | 406.4 | 78.34 | 0.6 | 0.93 |
| P-15 | J-14 | J-15 | 550 | 304.8 | 0.56 | 0.01 | 0 |
| P-16 | J-15 | J-16 | 2730 | 406.4 | 135.79 | 1.05 | 2.58 |
| P-17 | J-17 | J-16 | 1750 | 508 | 376.07 | 1.86 | 5.74 |
| P-18 | J-18 | J-17 | 800 | 609.6 | 749.68 | 2.57 | 8.48 |
| P-19 | J-2 | J-18 | 400 | 609.6 | 766.35 | 2.63 | 8.83 |
| P-20 | J-2 | J-19 | 2200 | 1016 | 2148.38 | 2.65 | 4.95 |
| P-21 | J-19 | J-20 | 1500 | 508 | 393.05 | 1.94 | 6.23 |
| P-22 | J-20 | J-21 | 500 | 304.8 | 134.72 | 1.85 | 10.33 |
| P-23 | J-19 | J-22 | 2650 | 1016 | 1401.16 | 1.73 | 2.24 |
| P-24 | J-22 | J-23 | 1230 | 762 | 902.88 | 1.98 | 4.03 |
| P-25 | J-23 | J-24 | 1300 | 762 | 675.1 | 1.48 | 2.36 |
| P-26 | J-25 | J-24 | 850 | 508 | 302.54 | 1.49 | 3.84 |
| P-27 | J-26 | J-25 | 300 | 304.8 | 52.54 | 0.72 | 1.81 |
| P-28 | J-26 | J-15 | 750 | 304.8 | 50.24 | 0.69 | 1.66 |
| P-29 | J-22 | J-27 | 1500 | 406.4 | 208 | 1.6 | 5.69 |
| P-30 | J-27 | J-28 | 2000 | 406.4 | 127.44 | 0.98 | 2.3 |
| P-31 | J-28 | J-29 | 1600 | 304.8 | 27.44 | 0.38 | 0.54 |
| P-32 | J-30 | J-29 | 150 | 304.8 | 72.56 | 0.99 | 3.28 |
| P-33 | J-30 | J-31 | 860 | 406.4 | 101.73 | 0.78 | 1.51 |
| P-34 | J-31 | J-24 | 950 | 508 | 325.34 | 1.61 | 4.39 |
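The velocity column of Table A3 is consistent with the continuity relation v = Q / A for a full circular pipe, where A = πD²/4. The sketch below recomputes the velocity for three pipes as a spot check. The function name and the choice of pipes are illustrative assumptions, not part of the original study.

```python
import math


def velocity_m_per_s(flow_lps: float, diameter_mm: float) -> float:
    """Mean velocity in a full circular pipe from continuity, v = Q / A."""
    area_m2 = math.pi * (diameter_mm / 1000.0) ** 2 / 4.0
    return (flow_lps / 1000.0) / area_m2


# (flow in lps, diameter in mm, tabulated velocity in m/s) from Table A3.
samples = {
    "P-1": (5538.9, 1016, 6.83),
    "P-10": (555.56, 762, 1.22),
    "P-22": (134.72, 304.8, 1.85),
}

for pipe, (flow, diameter, tabulated) in samples.items():
    computed = round(velocity_m_per_s(flow, diameter), 2)
    print(f"{pipe}: computed {computed} m/s, tabulated {tabulated} m/s")
```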
Appendix B. Results from regression modelling
Table B1. Regression results for Case A and Case B (ANN).

| Pipe No. | Length (m) | Observed leak location, distance from u/s node (m) | Case A (pressure models): leak location predicted (m) | Case A (pressure models): leak size predicted (lps) | Case B (flow models): leak location predicted (m) | Case B (flow models): leak size predicted (lps) |
|---|---|---|---|---|---|---|
| P-01 | 100 | 50 | 50 | 14.7 | 50 | 181.6 |
| P-02 | 1350 | 675 | 770 | 229.5 | 670 | 349.89 |
| P-03 | 900 | 450 | 470 | 267.77 | 350 | 309.07 |
| P-04 | 1150 | 575 | 550 | 266.31 | 530 | 319.16 |
| P-05 | 1450 | 725 | 670 | 327.25 | 770 | 290.25 |
| P-06 | 450 | 225 | 190 | 260.56 | 250 | 275.41 |
| P-07 | 850 | 425 | 450 | 284.22 | 390 | 299.97 |
| P-08 | 850 | 425 | 530 | 274.56 | 470 | 297.68 |
| P-09 | 800 | 400 | 430 | 284.55 | 370 | 317.09 |
| P-10 | 950 | 475 | 470 | 299.5 | 510 | 295.71 |
| P-11 | 1200 | 600 | 670 | 307.54 | 570 | 301.97 |
| P-12 | 3500 | 1750 | 1710 | 296.25 | 1690 | 294.77 |
| P-13 | 800 | 400 | 390 | 254.27 | 390 | 322.93 |
| P-14 | 500 | 250 | 250 | 295.12 | 270 | 283.67 |
| P-15 | 550 | 275 | 230 | 271.52 | 290 | 279.21 |
| P-16 | 2730 | 1365 | 1370 | 282.31 | 1350 | 291.36 |
| P-17 | 1750 | 875 | 770 | 309.98 | 850 | 312.61 |
| P-18 | 800 | 400 | 430 | 303.78 | 370 | 290.37 |
| P-19 | 400 | 200 | 250 | 251.46 | 250 | 310.09 |
| P-20 | 2200 | 1100 | 1010 | 306.21 | 1190 | 285.55 |
| P-21 | 1500 | 750 | 850 | 265.7 | 770 | 240.94 |
| P-22 | 500 | 250 | 250 | 228.06 | 270 | 284.17 |
| P-23 | 2650 | 1325 | 1350 | 306 | 1250 | 309.5 |
| P-24 | 1230 | 615 | 690 | 310.79 | 490 | 291.79 |
| P-25 | 1300 | 650 | 710 | 254.24 | 750 | 263.93 |
| P-26 | 850 | 425 | 490 | 288.6 | 450 | 306.67 |
| P-27 | 300 | 150 | 150 | 238.14 | 150 | 266.02 |
| P-28 | 750 | 375 | 350 | 258.07 | 390 | 304.06 |
| P-29 | 1500 | 750 | 810 | 296.88 | 750 | 312.2 |
| P-30 | 2000 | 1000 | 930 | 287.71 | 850 | 343.25 |
| P-31 | 1600 | 800 | 890 | 286.82 | 750 | 305.35 |
| P-32 | 150 | 75 | 70 | 299.35 | 90 | 263.69 |
| P-33 | 860 | 430 | 410 | 268.22 | 450 | 303.34 |
| P-34 | 950 | 475 | 470 | 272.61 | 530 | 307.02 |
Table B2. Regression results for Case A and Case B (SVM).

| Pipe No. | Length (m) | Observed leak location, distance from u/s node (m) | Case A (pressure models): leak location predicted (m) | Case A (pressure models): leak size predicted (lps) | Case B (flow models): leak location predicted (m) | Case B (flow models): leak size predicted (lps) |
|---|---|---|---|---|---|---|
| P-1 | 100 | 50 | 50 | 287.5 | 50 | 182.7 |
| P-2 | 1350 | 675 | 630 | 297.9 | 630 | 287.5 |
| P-3 | 900 | 450 | 450 | 287.5 | 450 | 287.5 |
| P-4 | 1150 | 575 | 570 | 287.5 | 570 | 286.7 |
| P-5 | 1450 | 725 | 730 | 287.3 | 730 | 286.6 |
| P-6 | 450 | 225 | 230 | 287.3 | 230 | 286.9 |
| P-7 | 850 | 425 | 430 | 287.4 | 430 | 287.1 |
| P-8 | 850 | 425 | 430 | 287.4 | 430 | 287.3 |
| P-9 | 800 | 400 | 390 | 287.5 | 390 | 287.3 |
| P-10 | 950 | 475 | 470 | 287.5 | 430 | 287.5 |
| P-11 | 1200 | 600 | 470 | 287.5 | 770 | 287.5 |
| P-12 | 3500 | 1750 | 1930 | 287.5 | 1630 | 287.5 |
| P-13 | 800 | 400 | 390 | 283.4 | 390 | 240.8 |
| P-14 | 500 | 250 | 250 | 287.5 | 250 | 287.5 |
| P-15 | 550 | 275 | 270 | 286.6 | 270 | 285.3 |
| P-16 | 2730 | 1365 | 1370 | 287.4 | 1370 | 287.6 |
| P-17 | 1750 | 875 | 870 | 287.2 | 870 | 285.8 |
| P-18 | 800 | 400 | 410 | 284.8 | 410 | 273 |
| P-19 | 400 | 200 | 190 | 275 | 210 | 194.3 |
| P-20 | 2200 | 1100 | 1090 | 286.6 | 1090 | 277.8 |
| P-21 | 1500 | 750 | 930 | 287.5 | 610 | 287.5 |
| P-22 | 500 | 250 | 250 | 287.5 | 210 | 287.5 |
| P-23 | 2650 | 1325 | 1310 | 287.4 | 1330 | 287.1 |
| P-24 | 1230 | 615 | 610 | 287.3 | 610 | 287.3 |
| P-25 | 1300 | 650 | 650 | 287.5 | 650 | 287.5 |
| P-26 | 850 | 425 | 430 | 287.3 | 430 | 286.3 |
| P-27 | 300 | 150 | 150 | 287.5 | 150 | 287.5 |
| P-28 | 750 | 375 | 370 | 286.7 | 370 | 285.1 |
| P-29 | 1500 | 750 | 750 | 287.5 | 750 | 287.5 |
| P-30 | 2000 | 1000 | 990 | 287.5 | 990 | 287.5 |
| P-31 | 1600 | 800 | 790 | 287.5 | 790 | 287.5 |
| P-32 | 150 | 75 | 70 | 287.5 | 70 | 287.5 |
| P-33 | 860 | 430 | 430 | 287.5 | 450 | 287.5 |
| P-34 | 950 | 475 | 490 | 287.5 | 490 | 287.5 |
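One way to condense the leak-location columns of Tables B1 and B2 into a single figure per model is the mean absolute error (MAE) between predicted and observed distances. The sketch below is illustrative only: MAE is an assumed summary metric rather than one taken from the tables, and all names are hypothetical. Values are transcribed from Tables B1 and B2 in pipe order P-1 to P-34.

```python
# Observed leak location (m from the upstream node), Tables B1 and B2.
observed_m = [
    50, 675, 450, 575, 725, 225, 425, 425, 400, 475, 600, 1750, 400,
    250, 275, 1365, 875, 400, 200, 1100, 750, 250, 1325, 615, 650,
    425, 150, 375, 750, 1000, 800, 75, 430, 475,
]

# Predicted leak location (m) per model, same pipe order.
predicted_m = {
    "ANN pressure": [
        50, 770, 470, 550, 670, 190, 450, 530, 430, 470, 670, 1710,
        390, 250, 230, 1370, 770, 430, 250, 1010, 850, 250, 1350, 690,
        710, 490, 150, 350, 810, 930, 890, 70, 410, 470,
    ],
    "ANN flow": [
        50, 670, 350, 530, 770, 250, 390, 470, 370, 510, 570, 1690,
        390, 270, 290, 1350, 850, 370, 250, 1190, 770, 270, 1250, 490,
        750, 450, 150, 390, 750, 850, 750, 90, 450, 530,
    ],
    "SVM pressure": [
        50, 630, 450, 570, 730, 230, 430, 430, 390, 470, 470, 1930,
        390, 250, 270, 1370, 870, 410, 190, 1090, 930, 250, 1310, 610,
        650, 430, 150, 370, 750, 990, 790, 70, 430, 490,
    ],
    "SVM flow": [
        50, 630, 450, 570, 730, 230, 430, 430, 390, 430, 770, 1630,
        390, 250, 270, 1370, 870, 410, 210, 1090, 610, 210, 1330, 610,
        650, 430, 150, 370, 750, 990, 790, 70, 450, 490,
    ],
}


def mean_abs_error(predicted, observed):
    """Average absolute difference between paired predictions and observations."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


mae_m = {name: mean_abs_error(values, observed_m)
         for name, values in predicted_m.items()}

for name, value in mae_m.items():
    print(f"{name}: MAE = {value:.1f} m")
```

With the tabulated values this gives roughly 42 m and 41 m for the ANN pressure and flow models and about 21 m for each of the SVM models. A handful of pipes (for example P-11, P-12 and P-21) account for most of the SVM error.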