RAINFALL-RUNOFF MODELLING USING ARTIFICIAL NEURAL NETWORK METHOD NOR IRWAN BIN AHMAT NOR

RAINFALL-RUNOFF MODELLING USING
ARTIFICIAL NEURAL NETWORK METHOD
NOR IRWAN BIN AHMAT NOR
A thesis submitted in fulfilment of the
requirements for the award of the degree of
Doctor of Philosophy
Faculty of Civil Engineering
Universiti Teknologi Malaysia
AUGUST 2005
PSZ 19:16 (Pind. 1/97)
UNIVERSITI TEKNOLOGI MALAYSIA
THESIS STATUS VALIDATION FORM υ
TITLE: RAINFALL-RUNOFF MODELLING USING ARTIFICIAL NEURAL NETWORK METHOD
ACADEMIC SESSION: 2001/02
I, NOR IRWAN BIN AHMAT NOR, agree that this thesis (PSM/Master's/Doctor of Philosophy)* may be
kept at the Library of Universiti Teknologi Malaysia, subject to the following conditions of use:
1. The thesis is the property of Universiti Teknologi Malaysia.
2. The Library of Universiti Teknologi Malaysia is permitted to make copies for study purposes only.
3. The Library is permitted to make copies of this thesis as exchange material between
institutions of higher learning.
4. **Please tick (√)
√
SULIT (CONFIDENTIAL)
(Contains information of security classification or of importance to Malaysia as set out in
the OFFICIAL SECRETS ACT 1972)
TERHAD (RESTRICTED)
(Contains RESTRICTED information as determined by the organisation/body where the research
was carried out)
TIDAK TERHAD (UNRESTRICTED)
[ Signed ] (AUTHOR'S SIGNATURE)
Certified by: [ Signed ] (SUPERVISOR'S SIGNATURE)
Permanent address: NO. 46, LOT 11545, TAMAN SRI PERKASA, 93050 JALAN MATANG, KUCHING, SARAWAK
Name of Supervisor: ASSOC. PROF. DR. SOBRI BIN HARUN
Date: 19 AUGUST 2005
Date: 19 AUGUST 2005
NOTES: * Delete as applicable.
** If this thesis is SULIT or TERHAD, please attach a letter from the relevant
authority/organisation stating the reasons and the period for which this thesis must be
classified as SULIT or TERHAD.
υ "Thesis" refers to a thesis for the degree of Doctor of Philosophy or Master's by research,
a dissertation for studies by coursework and research, or a Bachelor's Project Report (PSM).
SUPERVISOR’S DECLARATION
“We hereby declare that we have read this thesis and in our
opinion this thesis is sufficient in terms of scope and quality for the
award of the degree of Doctor of Philosophy”
Signature : [ Signed ]
Name of Supervisor I : ASSOC. PROF. DR. SOBRI BIN HARUN
Date : AUGUST, 2005
Signature : [ Signed ]
Name of Supervisor II : PROF. IR. DR. AMIR HASHIM BIN MOHD. KASSIM
Date : AUGUST, 2005
DECLARATION
“I declare that this thesis entitled “Rainfall-runoff modelling using artificial neural
network method” is the result of my research except as cited in references. The thesis has
not been accepted for any degree and is not concurrently submitted in candidature of any
degree”
Signature : [ Signed ]
Name of Candidate : NOR IRWAN BIN AHMAT NOR
Date : AUGUST, 2005
DEDICATION
“And that a person shall have nothing except what he has striven for”
(Al-Najm: 39)
I pay my most humble gratitude to Allah Subhanahuwataala for blessing me with good
health and spirit to undertake and complete this study.
To my beloved mother and father
ACKNOWLEDGEMENT
I would like to express my sincerest and deepest appreciation to my supervisors,
Assoc. Prof. Dr. Sobri Bin Harun (UTM) and Prof. Ir. Dr. Amir Hashim Bin Mohd. Kassim
(KUiTTHO), for their guidance and kind encouragement throughout this research.
I am very grateful to the authorities of the Universiti Teknologi Malaysia, Skudai,
Johor Darul Takzim. I would also like to express my sincere thanks to the Ministry of
Science, Technology and the Environment, which provided financial support during my study
at the Universiti Teknologi Malaysia. I would like to thank the office staff of Sekolah
Pengajian Siswazah (SPS) and the Graduate Studies Committee, Faculty of Civil Engineering,
for their support and their good management of the students. My thanks also go to the
office staff of the Hydrology Division, Department of Irrigation and Drainage (DID)
Malaysia, for providing the data for my study and good advice, and to the colleagues and
friends who gave me invaluable assistance throughout my research work.
Most important of all, I am deeply indebted to my parents for providing me with the
peace of mind to pursue knowledge while being close at hand to render love, comfort, and
support. My family has been the source of my perseverance with the research at times when
all seemed lost.
ABSTRACT
Rainfall and surface runoff are the driving forces behind all stormwater studies
and designs. The relationship between them is known to be highly non-linear and complex,
and it depends on numerous factors. To overcome the problems of non-linearity and lack of
information in rainfall-runoff modelling, this study introduced the Artificial Neural
Network (ANN) approach to model the dynamics of rainfall-runoff processes. The ANN method
behaves as a black-box model and has proven able to handle non-linear processes in complex
systems. Numerous structures of ANN models were designed to determine the relationship
between daily and hourly rainfall and the corresponding runoff. The desired runoff could
therefore be predicted from the rainfall data, based on the relationship established by
the ANN training computation. The ANN architecture is simple and considers only the
rainfall and runoff data as variables. The internal processes that control the
rainfall-to-runoff transformation are translated into ANN weights. Once the architecture
of the network is defined, the weights are calculated so as to represent the desired output
through a learning process in which the ANN is trained to obtain the expected results. Two
types of ANN architecture are recommended, namely the multilayer perceptron (MLP) and
radial basis function (RBF) networks. Four catchments, Sungai Bekok, Sungai Ketil, Sungai
Klang and Sungai Slim, were selected to test the methodology. The model performance was
evaluated by comparison with the observed flow series. Further, the ANN results were
compared against the results produced by HEC-HMS, XP-SWMM and multiple linear regression
(MLR). The ANN was found to predict runoff accurately, with good correlation between the
observed and predicted values relative to the MLR, XP-SWMM and HEC-HMS models. The
application of ANN to model the daily and hourly streamflow hydrograph was successful.
ABSTRAK
Rainfall and surface runoff are the driving forces behind all stormwater-related studies
and designs. It is generally known that the relationship between the two is non-linear and
complex and depends on many factors. To solve the problems arising from the lack of
information and the non-linearity of the relationship between rainfall and runoff, this
study introduces the artificial neural network (ANN) method to model the dynamic process of
that relationship. The ANN method has the character of a 'black box' model and has been
shown to be able to deal with non-linear processes in this complex system. Various
structures of ANN models were designed to obtain the daily and hourly relationships between
rainfall and runoff. The actual runoff can thus be predicted from rainfall data, based on
the relationship identified through the training process in the ANN. The ANN architecture
is simple because it takes only rainfall and runoff data as variables. The internal
processes that control the transformation of rainfall into runoff can be translated into the
ANN weights. Once the architecture of the ANN is identified and the weights are determined,
it can reproduce the actual output through a learning process in which the ANN has been
trained to obtain the expected results. Two types of ANN architecture were proposed, namely
the multilayer perceptron (MLP) network and the radial basis function (RBF). Several
catchments, namely the Sungai Bekok, Sungai Ketil, Sungai Klang and Sungai Slim catchments,
were selected to test this methodology. Model capability was assessed by comparison with
the actual observed flow series. The ANN results were then compared with the results
obtained from the application of HEC-HMS, SWMM and multiple linear regression (MLR). It was
found that the ANN can predict runoff accurately, with good correlation between the
observed and predicted values compared with the MLR, XP-SWMM and HEC-HMS models. Clearly,
the application of ANN to modelling river flow hydrographs at daily and hourly time
intervals was carried out successfully.
TABLE OF CONTENTS
CHAPTER TITLE
DECLARATION
ACKNOWLEDGEMENTS
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
LIST OF APPENDICES
1 INTRODUCTION
1.1 Background of Study
1.2 Statement of the Problem
1.3 Study Objectives
1.4 Research Approach and Scope of Work
1.5 Significance of the Study
1.6 Structure of the Thesis
2 LITERATURE REVIEW
2.1 General
2.2 Rainfall-Runoff Process and Relationship
2.3 Review of Hydrologic Modelling
2.4 Rainfall-Runoff Models
2.5 Artificial Neural Network
2.5.1 Basic Structure
2.5.2 Transfer Function
2.5.3 Back-propagation Algorithm
2.5.4 Learning or Training
2.6 Neural Network Application
2.7 Neural Network Modelling in Hydrology and Water Resources
2.7.1 Versatility of Neural Network Method
2.8 Bivariate Linear Regression and Correlation in Hydrology
2.8.1 Fitting Regression Equations
2.9 Review on HEC-HMS Model
2.10 Review on XP-SWMM Model
2.11 Summary of Literature Review
3 RESEARCH METHODOLOGY
3.1 Introduction
3.2 Multilayer Perceptron (MLP) Model
3.2.1 Training of ANN
3.2.2 Selection of Network Structures
3.3 Radial Basis Function (RBF) Model
3.3.1 Training RBF Networks
3.4 Multiple Linear Regression (MLR) Model
3.5 HEC-HMS Model
3.5.1 Evaporation and Transpiration
3.5.2 Computing of Runoff Volumes
3.5.3 Modelling of Direct Runoff
3.6 XP-SWMM Model
3.7 Calibration of Distributed Models
3.8 Evaluation of the Model
3.8.1 Goodness of Fit Tests
3.8.2 Missing Data and the Outliers
3.9 The Study Area
3.9.1 Selection of Training and Testing Data
3.9.2 The Sungai Bekok Catchment
3.9.3 The Sungai Ketil Catchment
3.9.4 The Sungai Klang Catchment
3.9.5 The Sungai Slim Catchment
3.10 Computer Packages
4 RESULTS AND DISCUSSIONS
4.1 General
4.2 Results of the Multilayer Perceptron (MLP) Model
4.2.1 Results of Daily MLP Model
4.2.2 Results of Hourly MLP Model
4.2.3 Training and Validation
4.2.4 Testing
4.2.5 Robustness Test
4.3 Results of the Radial Basis Function (RBF) Model
4.3.1 Results of Daily RBF Model
4.3.2 Results of Hourly RBF Model
4.3.3 Training and Validation
4.3.4 Testing
4.3.5 Robustness Test
4.4 Results of the Multiple Linear Regression (MLR) Model
4.4.1 Calibration
4.4.2 Results of Daily MLR Model
4.4.3 Verification
4.4.4 Robustness Test
4.5 Results of the HEC-HMS Model
4.5.1 Calibration
4.5.2 Results of Daily HEC-HMS Model
4.5.3 Results of Hourly HEC-HMS Model
4.5.4 Verification
4.5.5 Robustness Test
4.6 Results of the SWMM Model
4.6.1 Calibration
4.6.2 Results of Daily SWMM Model
4.6.3 Results of Hourly SWMM Model
4.6.4 Verification
4.6.5 Robustness Test
4.7 Discussions on the Rainfall-Runoff Modelling
4.7.1 Basic Model Structure
4.7.2 Model Performance
4.7.3 Transfer Function and Algorithm
4.7.4 Robustness and Model Limitation
4.7.5 River Basin Characteristics
4.7.6 Time Interval
5 CONCLUSIONS AND RECOMMENDATIONS
5.1 General
5.2 Conclusions
5.3 Recommendations for future work
REFERENCES
Appendices A-J
LIST OF TABLES
TABLE NO. TITLE
3.1 Infiltration rates by the soil groups
3.2 Rain Gauges used in calibration and verification of the models for Sg. Bekok catchment
3.3 Rain Gauges used in calibration and verification of the models for Sg. Ketil catchment
3.4 Rain Gauges used in calibration and verification of the models for Sg. Klang catchment
3.5 Rain Gauges used in calibration and verification of the models for Sg. Slim catchment
4.1(a) Results of 3 Layer neural networks for Sg. Bekok catchment - using 100% of data sets in training phase
4.1(b) Results of 3 Layer neural networks for Sg. Bekok catchment - using 50% of data sets in training phase
4.1(c) Results of 3 Layer neural networks for Sg. Bekok catchment - using 25% of data sets in training phase
4.2(a) Results of 4 Layer neural networks for Sg. Bekok catchment - using 100% of data sets in training phase
4.2(b) Results of 4 Layer neural networks for Sg. Bekok catchment - using 50% of data sets in training phase
4.2(c) Results of 4 Layer neural networks for Sg. Bekok catchment - using 25% of data sets in training phase
4.9(a) Results of 3 Layer neural networks for Sg. Bekok catchment - using 100% of available data sets in training phase
4.9(b) Results of 3 Layer neural networks for Sg. Bekok catchment - using 65% of available data sets in training phase
4.9(c) Results of 3 Layer neural networks for Sg. Bekok catchment - using 25% of available data sets in training phase
4.10(a) Results of 4 Layer neural networks for Sg. Bekok catchment - using 100% of available data sets in training phase
4.10(b) Results of 4 Layer neural networks for Sg. Bekok catchment - using 65% of available data sets in training phase
4.10(c) Results of 4 Layer neural networks for Sg. Bekok catchment - using 25% of available data sets in training phase
4.17(a) Results of RBF networks for Sg. Bekok catchment - using 100% of data sets in training phase
4.17(b) Results of RBF networks for Sg. Bekok catchment - using 50% of data sets in training phase
4.17(c) Results of RBF networks for Sg. Bekok catchment - using 25% of data sets in training phase
4.21(a) Results of RBF networks for Sg. Bekok catchment - using 25% of available data sets in training phase
4.21(b) Results of RBF networks for Sg. Bekok catchment - using minimum data sets in training phase
4.25(a) Results of MLR Model for Sg. Bekok catchment - using 100% of data sets in training phase
4.25(b) Results of MLR Model for Sg. Bekok catchment - using 50% of data sets in training phase
4.25(c) Results of MLR Model for Sg. Bekok catchment - using 25% of data sets in training phase
4.29(a) Calibration Coefficients of Sg. Bekok catchment - using 100% of data
4.29(b) Calibration Coefficients of Sg. Bekok catchment - using 50% of data
4.29(c) Calibration Coefficients of Sg. Bekok catchment - using 25% of data
4.33(a) Calibration Coefficients of Sg. Bekok catchment - using 25% of data
4.33(b) Calibration Coefficients of Sg. Bekok catchment - using minimum data
4.37(a) Results of HEC-HMS Model for Sg. Bekok catchment - using 100% of data sets in training phase
4.37(b) Results of HEC-HMS Model for Sg. Bekok catchment - using 50% of data sets in training phase
4.37(c) Results of HEC-HMS Model for Sg. Bekok catchment - using 25% of data sets in training phase
4.41(a) Results of HEC-HMS Model for Sg. Bekok catchment - using 25% of data sets in training phase
4.41(b) Results of HEC-HMS Model for Sg. Bekok catchment - using minimum data sets in training phase
4.45(a) Calibration Coefficients of Sg. Bekok catchment - using 100% of data
4.45(b) Calibration Coefficients of Sg. Bekok catchment - using 50% of data
4.45(c) Calibration Coefficients of Sg. Bekok catchment - using 25% of data
4.49(a) Calibration Coefficients of Sg. Bekok catchment - using 25% of data
4.49(b) Calibration Coefficients of Sg. Bekok catchment - using minimum data
4.53(a) Results of SWMM Model for Sg. Bekok catchment - using 100% of data sets in training phase
4.53(b) Results of SWMM Model for Sg. Bekok catchment - using 50% of data sets in training phase
4.53(c) Results of SWMM Model for Sg. Bekok catchment - using 25% of data sets in training phase
4.57(a) Results of SWMM Model for Sg. Bekok catchment - using 25% of data sets in training phase
4.57(b) Results of SWMM Model for Sg. Bekok catchment - using minimum data sets in training phase
LIST OF FIGURES
FIGURE NO. TITLE
2.1 A schematic outline of the different steps in the modelling process
2.2 Simple mathematical model of a neuron
2.3 A three-layer neural network with i inputs and k outputs
2.4 A threshold-logic transfer function
2.5 A hard-limit transfer function
2.6 Continuous transfer function: (a) the sigmoid, (b) the hyperbolic tangent
2.7 The gaussian function
2.8 Steps in training and testing
2.9 Typical HEC-HMS representation of watershed runoff
3.1 Structure of a MLP rainfall-runoff model with one hidden layer
3.2 Hyperbolic-tangent (tansig) activation function
3.3 The structure of RBF Model
3.4 The Sungai Bekok catchment area
3.5 The Sungai Ketil catchment area
3.6 The Sungai Klang catchment area
3.7 The Sungai Slim catchment area
4.1(a) Daily results of 3-layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
4.1(b) Daily results of 3-layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase
4.1(c) Daily results of 3-layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
4.2(a) Daily results of 4-layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
4.2(b) Daily results of 4-layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase
4.2(c) Daily results of 4-layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
4.9(a) Hourly results of 3-layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
4.9(b) Hourly results of 3-layer neural networks for Sg. Bekok catchment using 65% of data sets in training phase
4.9(c) Hourly results of 3-layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
4.10(a) Hourly results of 4-layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
4.10(b) Hourly results of 4-layer neural networks for Sg. Bekok catchment using 65% of data sets in training phase
4.10(c) Hourly results of 4-layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
4.17(a) Daily results of RBF networks for Sg. Bekok catchment using 100% of data sets in training phase
4.17(b) Daily results of RBF networks for Sg. Bekok catchment using 50% of data sets in training phase
4.17(c) Daily results of RBF networks for Sg. Bekok catchment using 25% of data sets in training phase
4.21(a) Hourly results of RBF networks for Sg. Bekok catchment using 25% of data sets in training phase
4.21(b) Hourly results of RBF networks for Sg. Bekok catchment using min of available data sets in training phase
LIST OF SYMBOLS
net_j - the summation of weighted inputs for the j-th neuron
W_ij - a weight from the i-th neuron in the previous layer to the j-th neuron in the current layer
X_i - the input from the i-th to the j-th neuron
x, y - the variables for their population linear regressions
b_1, b_2 - the tangents of slope angles of the two regression lines
a_1, a_2 - the intercepts
α - learning rate parameter
μ - momentum parameter
x_i - input rainfall variables
y_j - output signal from rainfall
y_in_j - sum of weighted input signals
w_0j - weight for the bias
w_ij - weight between input layer and hidden layer
f(t) - hyperbolic-tangent function
x_in_k - weighted input signals
c0(k) - weight for the bias
c(kj) - weight between second layer and third layer
f(z_in_j) - output signal from rainfall
z_j - input signal or rainfall
δ_k - error information term
Δc(kj) - weight correction term
Δc0(k) - bias correction term
t_k - target neural network output
y(k) - neural network output
δ_in_j - delta inputs
Δw_0j - bias correction term
w_ij(new) - updates bias and weights
Δc(jk)(t + 1) - update weight for bias with momentum
Δw_ij(t + 1) - update weight for backpropagation with momentum
η - learning rate
E_min - minimum error
H - Hessian matrix
J - Jacobian matrix
E - sum of squares function
g - gradient
J^T - transposition of J
e - vector of network errors
w_k - vector of current weights and biases
g_k - current gradient
y(t) - runoff at the present time
x(t) - rainfall at present time
x(t − i) - rainfall at previous time
y(x) - output with input vector x
c - centre
ℜ - metric
r_j - Euclidean length
φ - transfer function
T - transposition
I - interposes
y - datum vector
Y - radial centre
y(k) - output layer with linear combination of φ(r_j)
y' - prediction of the actual output
x - input vector
y_i - actual output
n - length of input vector
p - set of input pattern stored
y_ij - desired output
y'_j - predicted output component
x_i - stored pattern
W(x, x_i) - the weight
D - distance function
σ_k - sigma value
N_j - the summation units computes
y - dependent variable
x_i - independent variables
a, b - constants
e - random variable
x_ki - value of independent variable x_k
n - number of observations
α, β - coefficients
S - summation of square function
PMAP - total storm mean areal precipitation
p_i(t) - precipitation depth at time t at gage i
f_c - rate of precipitation loss
pe_t - the excess precipitation at time t
I_a - initial loss
P_e - accumulated precipitation
P - accumulated rainfall depth
S - potential maximum retention
A_i - the drainage area of subdivision i
Q_n - storm hydrograph ordinate
P_m - depth
U_(n−m+1) - dimensions of flow rate per unit depth
U_p - UH peak discharge
T_p - the time to UH peak
C - the conversion constant
t_c - time of concentration
I_t - average inflow to storage
O_t - outflow from storage
S_t - storage at time t
R - constant linear
C_A, C_B - routing coefficients
O_t - average outflow
A - the drainage area
L - the distance from the upper end of the plane to the point of interest
n - the Manning resistance coefficient
S - dimensionless slope of the surface
N - basin roughness
Q_p - the peak discharge
t_p - the time to peak
C - constant
R2 - correlation coefficient
Q_0 - actual observed streamflow
Q_s - model simulated streamflow
n - the number of observed streamflow values
LIST OF APPENDICES
APPENDIX TITLE
A Daily and hourly results of MLP model
B Daily and hourly results of RBF model
C Results of application of MLR model
D Daily and hourly results of the HEC-HMS model calibration
E Daily and hourly results of application of HEC-HMS model
F Daily and hourly results of the SWMM model calibration
G Daily and hourly results of application of SWMM model
H Daily and hourly results of PBIAS
I Figures illustrate the daily and hourly result of ANN models
J The architecture of MLP network structures
CHAPTER 1
INTRODUCTION
1.1 Background of Study
Hydrologists are often confronted with problems of prediction and estimation of
runoff, precipitation, contaminant concentrations, water stages, and so on (ASCE, 2000).
Moreover, engineers often face real situations where little or no information is
available. A good understanding of the processes and relationship between rainfall and
surface runoff in a catchment is a necessary prerequisite for preparing satisfactory
drainage and stormwater management projects. In the hydrological cycle, rainfall reaching
the ground may collect to form surface runoff or may infiltrate into the ground. Surface
runoff and groundwater flow join together in surface streams and rivers, which finally
flow into the ocean. Most hydrologic processes have a high degree of temporal and spatial
variability and are further plagued by issues of non-linearity of physical processes,
conflicting spatial and temporal scales, and uncertainty in parameter estimates. That is
why our understanding in many areas, especially of hydrologic processes, is far from
perfect, and why empiricism plays an important role in modelling studies. Hydrologists
strive to provide rational answers to problems that arise in the design and management of
water resources projects. As modern computers become ever more powerful, researchers
continue to test and evaluate new approaches to solving problems efficiently.
A problem commonly encountered in stormwater design projects is the determination of
the design flood. Design flood estimation using established methodology is relatively
simple when records of streamflow or runoff and rainfall are available for the catchment
concerned. The quantity of runoff resulting from a given rainfall event depends on a
number of factors such as initial moisture, land use, and slope of the catchment, as well
as the intensity, distribution, and duration of the rainfall. Knowledge of the
characteristics of the rainfall-runoff relationship is essential for risk and reliability
analysis of water resources projects. Since the 1930s, numerous rainfall-runoff models
have been developed to forecast streamflow. For example, conceptual models provide daily,
monthly, or seasonal estimates of streamflow for long-term forecasting on a continuous
basis. Since Sherman (1932) defined the unit graph, linear systems analysis has played an
important role in relating input-output components in rainfall-runoff modelling and in
the development of stochastic models of single hydrological sequences (Singh, 1982). The
performance of a rainfall-runoff model depends heavily on choosing suitable model
parameters, which are normally calibrated by using an objective function (Yu and Yang,
2000). The entire physical process in the hydrologic cycle is mathematically formulated
in conceptual models, which are composed of a large number of parameters (Tokar and
Johnson, 1999).
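The linear systems view behind the unit graph can be illustrated with a short sketch. It assumes the standard discrete convolution Q_n = Σ P_m U_(n−m+1), using the symbols Q_n, P_m and U_(n−m+1) from the List of Symbols; the function name and the numbers are hypothetical and are not taken from this study.

```python
import numpy as np

def direct_runoff(excess_rain, unit_hydrograph):
    """Discrete convolution Q_n = sum over m of P_m * U_(n-m+1).

    excess_rain     : rainfall excess depths P_m, one per time step
    unit_hydrograph : ordinates U of the unit hydrograph (flow per unit depth)
    """
    p = np.asarray(excess_rain, dtype=float)
    u = np.asarray(unit_hydrograph, dtype=float)
    # np.convolve performs exactly this sum for every output ordinate n
    return np.convolve(p, u)

# Illustrative values only (hypothetical storm and unit hydrograph)
storm_hydrograph = direct_runoff([1.0, 2.0], [0.0, 3.0, 1.0])
```

Because the operation is linear, doubling every rainfall excess depth doubles every hydrograph ordinate, which is the property that non-linear methods such as ANN are intended to relax.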
The modelling approach used in the present study is based on artificial neural
network methods for modelling hydrologic input-output relationships. The rainfall-runoff
models are developed to predict or forecast runoff, with rainfall as the input to the
models. The observed streamflow was treated as equivalent to runoff. Past data were used
in the test set to illustrate the capability of the model in predicting future
occurrences of runoff, without directly including the catchment characteristics. Tokar
and Markus (2000) argued that the accuracy of model predictions is very subjective and
highly dependent on the user’s ability, knowledge, and understanding of the model and the
watershed characteristics. Artificial intelligence (AI) techniques have given rise to a
set of ‘knowledge engineering’ methods constituting a new approach to the design of
high-performance software systems. This new approach represents an evolutionary change
with revolutionary consequences (Forsyth, 1984). The systems are based on an extensive
body of knowledge about a specific problem area. Characteristically, this knowledge is
organized as a collection of rules, which allow the system to draw conclusions from given
data or premises.
Neural networks are applied in an extremely wide range of fields, such as science,
engineering, automotive, aerospace, banking, medicine, business, transportation, defence,
industry, telecommunications, insurance, and economics. In the last few years, the
subject of artificial neural networks, or neural computing, has generated a lot of
interest and received a lot of coverage in articles and magazines. Artificial neural
network (ANN) methods are gaining popularity, as evidenced by the increasing number of
papers on this topic appearing in engineering and hydrology journals, conferences,
seminars, and so on. This modelling tool is still in its nascent stage in terms of
hydrologic applications (ASCE, 2000). An increasing number of works attempt to apply the
neural network method to solve various problems in different branches of science and
engineering. The highly interconnected multiprocessor architecture of ANN is described as
parallel distributed processing and has solved many difficult computer science problems
(Blum, 1992). Electrical engineers find numerous applications in signal processing and
control theory. Computer engineers and computer scientists find that neural networks can
potentially be implemented efficiently and applied to robotics, and that they show promise
for difficult problems in areas such as pattern recognition, feature detection,
handwritten digit recognition, and image recognition. Manufacturers use neural networks
to provide sophisticated machines or instruments that benefit consumers and make modern
life more comfortable and productive. In medicine, neural networks are used to diagnose
and prescribe the treatment corresponding to symptoms they have encountered before. They
are also a tool that provides hydraulic and environmental engineers with sufficient
detail for design purposes and management practices (Nagy et al., 2002). In other words,
neural network models are able to treat problems in many different disciplines.
The main function of all artificial neural network paradigms is to map a set of
inputs to a set of outputs. However, there is a wide variety of ANN algorithms. An
attractive feature of ANNs is their ability to extract the relation between the inputs
and outputs of a process without the physics being explicitly provided to them. They are
able to provide a mapping from one multivariate space to another, given a set of data
representing that mapping. Even if the data are noisy and contaminated with errors, ANNs
have been known to identify the underlying rule (ASCE, 2000). Neural networks can learn
from experience, generalize from previous examples to new ones, and abstract essential
characteristics from inputs containing irrelevant data (Fausett, 1994; Wasserman, 2000).
The natural behaviour of hydrological processes is therefore well suited to the
application of ANN methods.
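The input-to-output mapping described above can be sketched as a single forward pass through a small network. This is a minimal illustration only: it assumes one hidden layer with the hyperbolic-tangent transfer function listed in the List of Symbols, a linear output neuron, and lagged rainfall x(t), x(t − 1), ... as inputs. The function names, layer sizes, rainfall values and random weights are hypothetical, and no training step is shown.

```python
import numpy as np

def lagged_inputs(rainfall, n_lags):
    """Stack x(t), x(t-1), ..., x(t-n_lags) as one input row per time step."""
    x = np.asarray(rainfall, dtype=float)
    rows = [x[t - n_lags:t + 1][::-1] for t in range(n_lags, len(x))]
    return np.array(rows)

def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One hidden tanh layer followed by a linear output neuron (assumed form)."""
    net = inputs @ w_hidden + b_hidden   # weighted sum of inputs per hidden neuron
    hidden = np.tanh(net)                # hyperbolic-tangent transfer function
    return hidden @ w_out + b_out        # estimated runoff y(t)

rng = np.random.default_rng(0)
rain = [0.0, 5.0, 12.0, 3.0, 0.0, 0.0, 8.0]   # hypothetical rainfall series
X = lagged_inputs(rain, n_lags=2)             # 5 time steps, 3 inputs each
w_h, b_h = rng.normal(size=(3, 4)), np.zeros(4)
w_o, b_o = rng.normal(size=4), 0.0
runoff = mlp_forward(X, w_h, b_h, w_o, b_o)   # untrained weights, arbitrary values
```

Training would adjust the weights so that the output matches the observed runoff; with the random weights used here the output only demonstrates the shape of the mapping.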
In this study, artificial neural network (ANN) methods were applied to model the
hourly and daily rainfall-runoff relationship. The available rainfall and runoff data are
from four catchments: Sungai Bekok, Sungai Ketil, Sungai Klang, and Sungai Slim. An
attractive feature of ANN methods is their ability to extract the relation between the
inputs and outputs of a process without the physics being explicitly provided to them.
The networks were trained and tested using data that represent different characteristics
of the catchment areas and rainfall patterns. The sensitivity of the network performance
to the content and length of the calibration data was examined using various training
data sets. The existing commercially available models used in this study were HEC-HMS and
XP-SWMM. The performance of the ANN models for the selected catchments was investigated
and compared against the XP-SWMM, HEC-HMS and linear regression models. The performance
of the proposed models and the existing models was evaluated using the correlation
coefficient, root mean square error, relative root mean square error, mean absolute
percentage error and percentage bias.
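The evaluation statistics named above can be sketched as follows. The formulas are commonly used definitions and are an assumption here, since this section does not state them: in particular, the relative RMSE is taken as RMSE divided by the mean observed flow, and a positive percentage bias is taken to mean overestimation. The function name and the sample flows are hypothetical.

```python
import numpy as np

def goodness_of_fit(observed, simulated):
    """Return common goodness-of-fit statistics for two flow series."""
    q_obs = np.asarray(observed, dtype=float)    # observed streamflow
    q_sim = np.asarray(simulated, dtype=float)   # model simulated streamflow
    error = q_sim - q_obs
    rmse = float(np.sqrt(np.mean(error ** 2)))
    return {
        "r": float(np.corrcoef(q_obs, q_sim)[0, 1]),           # correlation coefficient
        "rmse": rmse,                                          # root mean square error
        "rrmse": rmse / float(np.mean(q_obs)),                 # assumed: RMSE / mean observed
        "mape": float(100.0 * np.mean(np.abs(error / q_obs))), # needs non-zero observations
        "pbias": float(100.0 * np.sum(error) / np.sum(q_obs)), # assumed sign convention
    }

# Illustrative values only (hypothetical flows)
stats = goodness_of_fit([10.0, 20.0, 30.0, 40.0], [11.0, 19.0, 33.0, 38.0])
```

Reporting several statistics together is useful because a model can have a high correlation and still be biased, which the percentage bias would reveal.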
1.2 Statement of the Problem
In many parts of the world, rapid population growth, urbanization, and
industrialization have increased the demand for water. These same pressures have altered
watersheds and river systems, which has contributed to greater loss of life and property
damage due to flooding. It is becoming increasingly critical to plan, design, and manage
water resources systems carefully and intelligently. Understanding the dynamics of the
rainfall-runoff process constitutes one of the most important problems in hydrology, in
order to predict or forecast streamflow for purposes such as water supply, power
generation, flood control, water quality, irrigation, drainage, recreation, and fish and
wildlife propagation. During the past decades, a wide variety of approaches, such as
conceptual models, have been developed to model the rainfall-runoff process. However, an
important limitation of such approaches is that treating the rainfall-runoff process as a
realization of a stochastic and statistical process means that only some statistical
features of the parameters are involved. Therefore, what is required is an approach that
seeks to understand the complete dynamics of the hydrologic process, capturing not only
the overall appearance but also the intricate details.
The rainfall-runoff relationships are among the most complex hydrologic
phenomena to comprehend due to the tremendous spatial and temporal variability of
watershed characteristics, snow pack, and precipitation patterns, as well as a number of
variables involved in modelling the physical processes (Tokar and Johnson, 1999). The
modelling of the rainfall-runoff relationship is very important in hydraulic and
hydrological studies for new development areas. The transformation of rainfall to runoff
involves many highly complex components, such as interception, infiltration, overland
flow, interflow, evaporation, and transpiration; it is also non-linear and cannot easily be
calculated using a simple equation. Runoff is critical to many activities such as
designing flood protection works for urban areas and agricultural land and assessing how
much water may be extracted from a river for water supply or irrigation. Despite the
complex nature of the rainfall-runoff process, the practice of estimating runoff as a fixed
percentage of rainfall is the most commonly used method in the design of urban storm
drainage facilities, highway culverts, and many small hydraulic structures. The quantity
of runoff resulting from a given rainfall event depends on a number of factors such as
initial moisture, land use, and slope of the catchment, as well as the intensity, distribution,
and duration of the rainfall. Various well-known rainfall-runoff
models have been successfully applied to many problems and catchments. Numerous
papers on the subject have been published and many computer simulation models have
been developed. All these models, however, require detailed knowledge of a number of
factors and of the initial and boundary conditions in a catchment area, which in most cases
are not readily available. Moreover, the existing popular rainfall-runoff models are
not flexible and require many parameters for calibration.
Beven (2001) reported that the ungauged catchment problem is one of the real
challenges for hydrological modellers in the twenty-first century.
Furthermore, the
traditional method of investigation and the collection of data in the field, involving the
installation and maintenance of a network of instruments, tend to be costly. In addition,
some of these models are expensive and of limited applicability. The availability of
rainfall-runoff data is important for the model calibration process. Rainfall-runoff
modelling for sites where there are no discharge data is a much more difficult
problem. It is considered that the main limitation in the development of a
design flood hydrograph estimation procedure lies in the availability of rainfall and
streamflow data, rather than any inherent limitations in the techniques used to develop the
procedure. Discharge data are available at only a small number of sites in any
region. In this respect the problem is that there are very few major floods for which
reliable rainfall and streamflow data are available, particularly on small catchments. Any
relationships developed are therefore based on data from relatively small storms, and
hence the flood estimates are made from extrapolated relationships. More often still,
physical measurements of the pertinent quantities are very difficult and expensive,
especially in an undeveloped rural area. This is why many catchments in many countries
are not equipped with measurement instruments. These difficulties lead us to
explore the use of neural networks as a way of obtaining models based on experimental
measurements. In terms of hydrologic applications, this modelling tool is still in its
nascent stages. An attractive feature of these models is their ability to extract the
relationship between the inputs and outputs of a process, without the physics being
explicitly provided to them. The goal is to create a model for predicting runoff from a
gauged or ungauged catchment. For long-term runoff modelling, a continuous model is used
rather than a single-event model.
Rainfall-runoff modelling software and guidelines from the USA, Australia and the
United Kingdom are required as references for the understanding and development of
hydrologic models in Malaysia. Those models and guidelines are used to study modelling
techniques, hydrologic problems, and the management and design of urban or rural watershed
systems. Since the present software and guidelines are based on a compilation of the
urban stormwater management practice of the USA, the United Kingdom and Australia,
it is important for us to develop our own. Furthermore, various well-known currently
available rainfall-runoff models such as HEC-HMS, MIKE-11, SWMM, etc. have been
successfully applied to many problems and watersheds. However, the existing popular
rainfall-runoff models are not flexible and require too many
parameters for calibration. The models have their own weaknesses, especially
in the calibration process and in their ability to accommodate the non-linearity of the processes.
There are also many areas where today’s tools lack the features and
functions needed to build these applications effectively (Wasserman, 2000).
Furthermore, the software packages are not robust and rely on selective calibration. With the
rapid development of modern Malaysia, the demand for water resources has also
increased, and the time has therefore come to develop new techniques to overcome
the problems of hydrology and water resources design and management. In
this context, one of the main potential areas of application of rainfall-runoff models is the
prediction and forecasting of streamflow. An alternative approach to prediction
suggested in recent years is the neural network method, inspired by the functioning of the
human brain and nervous system. Artificial neural networks are able to determine the
relationship between input data and corresponding output data. When presented with
simultaneous input-output observations, artificial neural networks adjust their connection
weights (model parameters), and discover the rules governing the association between
input and output variables.
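The idea of adjusting connection weights from paired input-output observations can be sketched with a single artificial neuron. The example below is a minimal illustration only, not the network used in this study: the hyperbolic tangent activation, the learning rate, the number of epochs and the synthetic data are all assumptions chosen for the sketch.

```python
import math


def train_neuron(samples, lr=0.1, epochs=2000):
    """Fit y ~ tanh(w*x + b) by per-sample gradient descent on squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            out = math.tanh(w * x + b)
            err = out - y
            # Derivative of 0.5*err**2 with respect to the pre-activation (w*x + b).
            grad = err * (1.0 - out ** 2)
            w -= lr * grad * x
            b -= lr * grad
    return w, b


def mse(samples, w, b):
    """Mean squared error of the neuron over the samples."""
    return sum((math.tanh(w * x + b) - y) ** 2 for x, y in samples) / len(samples)


# Hypothetical scaled input-output pairs generated from tanh(0.8*x - 0.2).
samples = [(x / 4.0, math.tanh(0.8 * x / 4.0 - 0.2)) for x in range(-8, 9)]
w, b = train_neuron(samples)
```

Starting from zero weights, the repeated small corrections move `w` and `b` towards the values that generated the data, which is the sense in which the network "discovers" the input-output rule.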
1.3
Study Objectives
The research is focused on the application of the neural network method to
rainfall-runoff modelling. A comparison between neural networks and other methods is
made.
The overall objective of the present study is to develop mathematical models that
are able to provide accurate and reliable runoff estimates from the historical
rainfall-runoff data of selected catchment areas. To assess the performance of various
rainfall-runoff models applied in the Malaysian environment, the following specific
objectives are set:
(i) To develop rainfall-runoff models using artificial neural network (ANN)
methods, based on the Multilayer Perceptron (MLP) model and Radial
Basis Function (RBF) computation techniques.
(ii) To examine and quantify the predictive accuracy of neural network
models using multiple input and output series.
(iii) To evaluate and compare the neural network and multiple linear
regression (MLR) models for daily flow prediction only.
(iv) To compare and evaluate the performance of the neural network models
against the XP-SWMM and HEC-HMS models for daily and hourly
predictions.
1.4
Research Approach and Scope of Work
The present study was undertaken to develop daily and hourly rainfall-runoff
models using ANN methods that can possibly be used to provide reliable and accurate
estimates of runoff with rainfall as the input variable. The ANN models used are the
MLP and RBF. It is believed that the ANN is able to capture the non-linear
relationship between rainfall and runoff. The calibration method (algorithm) applied to the
MLP is back-propagation, and
the transfer function used is the tangent sigmoid (tansig). The calibration method
applied to the RBF is the Generalized Regression Neural Network (GRNN), and the transfer
function used for the hidden units is Gaussian.
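To make the GRNN idea concrete, the sketch below shows its usual form: the prediction is a Gaussian-weighted average of the training outputs, with one hidden unit per training pattern. The smoothing parameter `sigma`, the input layout (rainfall on the current and previous day) and the data values are hypothetical and chosen only for illustration; this is not the configuration used in this study.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=0.5):
    """GRNN-style estimate: Gaussian-kernel weighted mean of training outputs."""
    weights = []
    for xi in train_x:
        # Squared Euclidean distance between the query and a stored pattern.
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Hypothetical patterns: (rainfall today, rainfall yesterday) in mm -> runoff.
train_x = [(0.0, 0.0), (5.0, 2.0), (10.0, 5.0), (20.0, 10.0)]
train_y = [1.0, 2.5, 6.0, 14.0]

estimate = grnn_predict((7.0, 3.0), train_x, train_y, sigma=5.0)
```

No iterative training is involved; the only quantity to be calibrated is `sigma`. A small value reproduces the nearest stored pattern, while a large value smooths towards the mean of all outputs. In this simple form, a query far from every stored pattern with a very small `sigma` can make all weights underflow to zero, so inputs are normally scaled first.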
The modelling work was carried out using a five-year period of daily data and a ten-year
period of hourly data consisting of the rainfall and runoff records from selected
catchments in Peninsular Malaysia. Four catchments were selected for
analysis and modelling. These stations have records of sufficient length and fairly good
data quality. They are the Sungai Bekok (Johor, Malaysia), Sungai Ketil (Kedah,
Malaysia), Sungai Klang (Kuala Lumpur, Malaysia), and Sungai Slim (Perak, Malaysia)
catchments. These sites were selected to demonstrate the development and application of
the ANN, multiple linear regression (MLR), XP-SWMM and HEC-HMS models. It is
emphasized that the MLR model is only applied to model the daily rainfall-runoff
relationship for
these catchments. The data required to carry out this study are catchment physical data,
rainfall data and river data (at the catchment outlet). The data from all these gauges are
recorded and
maintained by the Department of Drainage and Irrigation (DID) Malaysia.
This study is subjected to the following limitations:
(i) Analyses treat each catchment as a single catchment. No sub-division of
the catchment is carried out.
(ii) It is assumed that HEC-HMS and XP-SWMM can be applied to a large
catchment without sub-division.
(iii) The available observed data for analysis are rainfall, runoff or streamflow,
evapotranspiration, and size of the catchment area. Other data or
parameters such as time of concentration, runoff coefficient and
infiltration loss coefficient in HEC-HMS and XP-SWMM are
estimated.
1.5
Significance of the Study
The relationship, or the operation of transforming the input (rainfall) into the
output (runoff), is implied uniquely by any corresponding input-output pair. This
relationship can be abstracted and used to find the output for any arbitrary input or the
input corresponding to any given output. In practice, however, in analysing systems which
are not exactly linear and time-invariant, or where the data are subject to errors, problems may
arise both in identifying the operation and in computing an input corresponding to a given
output function of time (Singh, 1982). Overton and Meadows (1976) defined a
mathematical model as “a quantitative expression of a process or phenomenon one is
observing, analyzing, or predicting”. Meanwhile, Woolhiser and Brakensiek (1982)
defined a mathematical model as “a symbolic, usually mathematical representation of an
idealized situation that has the important structural properties of the real system”.
Unlike mathematical models that require precise knowledge of all the contributing variables, a
trained artificial intelligence technique such as a neural network can estimate process behaviour
even with incomplete information. Neural networks have been reported to have a strong
generalization ability, which means that once they have been properly trained, they are
able to provide accurate results even for cases they have never seen before
(Hecht-Nielsen, 1991; Haykin, 1994). This generalization capability provides an understanding
of how the runoff hydrograph system can respond under different rainfall and catchment
characteristics.
Most synthetic procedures for estimating design flood hydrographs are
deterministic in that the design flood is derived from a hypothetical design storm. A
review of some of the more widely used procedures for estimating design flood
hydrographs has been made by Cordery et al. (1970). Three basic steps are common to
this methodology of flood estimation: (1) the specification of the design storm of which
the important characteristics are usually the recurrence interval, the total rainfall volume,
the areal distribution of rainfall over the catchment, the temporal distribution of rainfall,
and the duration of rainfall; (2) the estimation of the runoff volume resulting from the
design storm; and (3) the estimation of the time distribution of runoff from the catchment.
In recent years, numerous and diverse techniques have been developed for
estimating all of the above components. Today, most urban drainage systems in the
tropical regions still rely upon the ‘old concept’ of rapid stormwater disposal
determined from the traditional rainfall-runoff modelling approach. The obvious negative
impacts of urbanization on the water balance are increased stormwater runoff,
degradation of water quality, recession of the water table, and reduction of roughness and
thus of the time of concentration. Therefore, in view of the importance of the
rainfall-runoff relationship, the present study was undertaken in order to develop predictive
models that can be used to provide reliable and accurate estimates of runoff.
1.6
Structure of the Thesis
This thesis consists of five chapters. The first chapter presents the introduction to
this study and outlines the objectives and scope of this research. A review of the
relevant literature is presented in Chapter 2. The proposed models for rainfall-runoff
modelling are described in Chapter 3. The fundamentals and concepts of the rainfall-runoff
relationship, and the concepts of hydrological modelling, are also discussed in detail in
Chapter 3. The description of the selected catchment areas, as well as the current catchment
management practice and related problems, is also given in this chapter. Meanwhile,
results and discussions are presented in Chapter 4. Results of the Multilayer Perceptron
(MLP) model are discussed in sub-topic 4.2 and results of the Radial Basis Function
(RBF) model are discussed in sub-topic 4.3. Results of the Multiple Linear
Regression (MLR), HEC-HMS and XP-SWMM models are discussed in sub-topics 4.4, 4.5 and
4.6 respectively. The application and performance
of the proposed models, the robustness and limitations of the models, river basin
characteristics, etc. are discussed in detail in sub-topic 4.7. Finally, in the last chapter,
conclusions from the present study are summarized and recommendations for future
studies are outlined.
CHAPTER 2
LITERATURE REVIEW
2.1
General
Determining the relationship between rainfall and runoff for a catchment area is
one of the most important problems faced by hydrologists and engineers, because
information about rainfall and runoff is needed for hydrologic and hydraulic
engineering design and management purposes. This relationship is known to be highly
non-linear and complex. In addition to rainfall, runoff is dependent on numerous factors
such as initial soil moisture, land use, catchment geomorphology, evaporation,
infiltration, and the distribution and duration of the rainfall. Although many catchments
have been gauged to provide continuous records of stream flow, engineers are often faced
with situations where little or no information is available. Todini (1988) reviewed the
historical development of mathematical methods used in rainfall-runoff modelling and
classified the models based on a priori knowledge and problem requirements. The review
pointed to the increasing role of distributed models, satellite, and radar technology in watershed
hydrology and noted that techniques for model calibration and verification remained less
than robust. However, efforts have been made to compare models of some component
processes. Also, developers of some models have compared their models with one or a
few other models. Over the years, various rainfall-runoff
studies have been carried out by researchers. The ability to transform the rainfall to the
flow or runoff accurately is required for further investigation. The neural network
approach can possibly lead to new insights in problem solving and give better solutions
for prediction or forecasting via mathematical modelling.
Watershed models are fundamental to water resources assessment, development,
and management. They are, for example, used to analyze the quantity and quality of
streamflow, reservoir system operations, groundwater development and protection,
surface water and groundwater conjunctive use management, water distribution systems,
water use, and a range of water resources management activities (Wurbs, 1998).
The development of computer and neural network techniques provides
hydrologists and researchers with enhanced computational power to solve complicated
engineering problems. This power increased the
possibilities of applying search algorithms, and the ability to simulate models of
cognitive processes. These developments stimulated neural network research. In
particular, several models based on time series analysis methods or on regression
techniques have been commonly used to describe the random behaviour of various
physical phenomena in hydrology. With the advent of computational and simulation
capability using computers, new techniques such as neural networks have been developed
to represent complex non-linear stochastic processes more accurately (Harun, 1999). The
accuracy of model predictions is very subjective and highly dependent on the user’s
ability, knowledge, and understanding of the model and of the watershed characteristics
(Tokar and Johnson, 1999).
2.2
Rainfall-Runoff Process and Relationship
Hydrology is the scientific study of water and its properties, distribution, and effects on
the earth’s surface, soil, and atmosphere (McCuen, 1997). Most hydrologic
processes are non-linear, such as the relationship between rainfall and runoff.
The most important process in the hydrologic cycle is the stage where rainfall occurs
and results in runoff. The rainfall-runoff relationship describes the time distribution of
direct runoff as a function of excess rainfall. A varying portion of the precipitation or
rainfall becomes runoff, moving via overland flow into stream channels. Runoff is
generated by rainstorms and its occurrence and quantity are dependent on the
characteristics of the rainfall event such as the intensity, duration, and distribution.
Rainfall intensity is defined as the ratio of the total amount of rain (rainfall depth) falling
during a given period to the duration of the period. It is expressed in depth units per unit
time, usually as millimetres per hour (mm/hr).
When rain falls, the first drops of water are intercepted by the leaves and stems of
the vegetation. This is usually referred to as interception storage. As the rain continues,
water reaching the ground surface infiltrates into the soil until it reaches a stage where the
rate of rainfall (intensity) exceeds the infiltration capacity of the soil. Thereafter, surface
puddles, ditches, and other depressions are filled (depression storage), after which runoff
is generated. The infiltration capacity of the soil depends on its texture and structure, as
well as on the antecedent soil moisture content (previous rainfall or dry season). The
initial capacity of a dry soil is high but, as the storm continues, it decreases until it
reaches a steady value termed the final infiltration rate. Apart from rainfall characteristics
such as intensity, duration and distribution, there are a number of site or catchment
specific factors which have a direct bearing on the occurrence and volume of runoff. The
major factors which influence the rainfall-runoff process are as follows:
(a) Soil type
The infiltration capacity depends, among other things, on the porosity of a
soil, which determines the water storage capacity and affects the resistance
of water to flow into deeper layers. Porosity differs from one soil type to
another. The highest infiltration capacities are observed in loose, sandy
soils, while heavy clay or loamy soils have considerably smaller
infiltration capacities. The infiltration capacity also depends on
the moisture content prevailing in a soil at the onset of a rainstorm. This,
however, is only valid when the soil surface remains undisturbed.
(b) Vegetation
The amount of rain lost to interception storage on the foliage depends on
the kind of vegetation and its growth stage. A cereal crop has a smaller
storage capacity than a dense grass cover. More significant is the effect
the vegetation has on the infiltration capacity of the soil. A dense
vegetation cover shields the soil from raindrop impact and reduces the
crusting effect. The root systems as well as organic matter in the soil
increase the soil porosity, thus allowing more water to infiltrate.
Vegetation also retards the surface flow, particularly on gentle slopes,
giving the water more time to infiltrate and to evaporate. In conclusion,
an area densely covered with vegetation yields less runoff than bare
ground.
(c) Slope and catchment size
It has been observed that the quantity of runoff decreases with increasing slope
length. This is mainly due to lower flow velocities and consequently a
longer time of concentration. This means that the water is exposed for a
longer duration to infiltration and evaporation before it reaches the
measuring point. The same applies when catchment areas of different
sizes are compared. The runoff efficiency (volume of runoff per unit of
area) increases with decreasing catchment size. In other words,
the larger the catchment, the longer the time of concentration
and the smaller the runoff efficiency.
Apart from the above-mentioned site-specific factors which strongly influence the
rainfall-runoff process, it should also be considered that the physical conditions of a
catchment area are not homogeneous. Even at the micro level, there are a variety of
different slopes, soil types, vegetation covers, etc. Each catchment therefore has its own
runoff response and will respond differently to different rainstorm events. The design of
water harvesting schemes requires knowledge of the quantity of runoff to be produced
by rainstorms in a given catchment area. It is commonly assumed that the quantity or
volume of runoff is a proportion (percentage) of the rainfall depth, where runoff (mm) is
equal to C × rainfall depth. In rural catchments where no or only small parts of the area
are impervious, the coefficient C, which describes the percentage of runoff resulting
from a rainstorm, is however not a constant factor. Instead its value is highly variable
and depends on the above-described catchment-specific factors and on the rainstorm
characteristics. An analysis of the rainfall-runoff relationship and subsequently an
assessment of relevant runoff coefficients should best be based on actual, simultaneous
measurements of both rainfall and runoff in the study area.
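The relation runoff = C × rainfall depth, and the way C can be back-calculated from paired measurements, may be sketched as follows. The event values are hypothetical and serve only to show that C need not be constant from one storm to the next.

```python
def runoff_coefficient(rainfall_mm, runoff_mm):
    """Back-calculate C from simultaneous rainfall and runoff depths (mm)."""
    if rainfall_mm <= 0:
        raise ValueError("rainfall depth must be positive")
    return runoff_mm / rainfall_mm


def estimate_runoff(rainfall_mm, c):
    """Runoff depth (mm) as a fixed proportion C of the rainfall depth."""
    return c * rainfall_mm


# Hypothetical (rainfall, runoff) depths in mm for three storms on one catchment.
events = [(12.0, 1.2), (25.0, 5.0), (60.0, 21.0)]
coefficients = [runoff_coefficient(p, q) for p, q in events]
```

For these illustrative storms the coefficient rises from about 0.10 to 0.35 as the rainfall depth increases, which is why a single design value of C should be treated with caution.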
The runoff from a watershed is governed by local combinations of several factors
such as hydraulic conductivity and porosity, topography, land use, etc. Discontinuities
encompass the boundaries separating soil types, geologic formations, or land covers.
Physical properties control interception, surface retention, infiltration, overland flow, and
evapotranspiration at different scales, and these processes control runoff. It has been
observed empirically that the form of hydrologic response changes with the spatial scale
of heterogeneities, and is usually considered to become simpler and more linear with
increasing watershed size (Dooge, 1981).
The discharge hydrograph is a graph of instantaneous discharge at the catchment
outlet versus time. The hydrograph is separated into two components, direct runoff and
base flow. In flood studies the portion of the discharge hydrograph of major interest is
the direct runoff hydrograph, and the portion of the rainfall hyetograph of interest is the
storm rainfall. The rainfall hyetograph is a plot in discrete form, of the rainfall intensity
over the catchment versus time. The point rainfall intensity was extracted at hourly
intervals from the recording raingauge data and then adjusted linearly so that the area
under the hyetograph equaled the catchment mean rainfall over the same period. The
storm rainfall is that portion of the rainfall hyetograph for which the intensity exceeds the
phi-index. The phi-index or loss rate is defined as the rainfall intensity above which the
volume of rainfall or rainfall excess equals the volume of direct runoff. The volume of
rainfall below this intensity goes to catchment recharge. It is assumed that rainfall of
intensity less than the phi-index does not contribute to direct runoff. These concepts are
largely empirical and obviously oversimplify the very complex relationship between
rainfall and runoff.
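The definition of the phi-index can be turned into a short calculation: the loss rate is adjusted until the rainfall above it equals the direct runoff volume. The sketch below uses the usual trial procedure of discarding intervals whose intensity does not exceed the current estimate; the function names and the hyetograph values are hypothetical.

```python
def phi_index(intensities, dt_hours, direct_runoff_mm):
    """Loss rate (mm/hr) above which rainfall excess equals direct runoff."""
    total_rain = sum(intensities) * dt_hours
    if not 0 < direct_runoff_mm <= total_rain:
        raise ValueError("direct runoff must be positive and not exceed rainfall")
    contributing = list(intensities)
    while True:
        rain = sum(contributing) * dt_hours
        phi = (rain - direct_runoff_mm) / (len(contributing) * dt_hours)
        # Intervals at or below phi produce no excess and are dropped.
        kept = [i for i in contributing if i > phi]
        if len(kept) == len(contributing):
            return phi
        contributing = kept


def rainfall_excess(intensities, phi, dt_hours):
    """Depth of rainfall (mm) falling at intensities above phi."""
    return sum(max(i - phi, 0.0) for i in intensities) * dt_hours


# Hypothetical hourly hyetograph (mm/hr) and direct runoff depth (mm).
hyetograph = [2.0, 6.0, 10.0, 4.0, 1.0]
phi = phi_index(hyetograph, dt_hours=1.0, direct_runoff_mm=8.0)
```

For this hyetograph the procedure settles on a phi-index of 4 mm/hr: only the 6 and 10 mm/hr intervals contribute, giving (6 − 4) + (10 − 4) = 8 mm of rainfall excess, equal to the direct runoff.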
In practice, there are two methods of deriving the volume of runoff from the
volume of rainfall, that is, of establishing the rainfall-runoff relationship. The first
method is the
loss rate approach in which an initial loss prior to the onset of direct runoff and a
continuing loss during the storm are abstracted from the design storm. The selection of a
continuing loss or phi-index approach is usually the most important factor for the design
storm. In the second approach, a rainfall-runoff relationship is developed from which the
volume of runoff can be estimated for any design storm volume. This approach is usually
adopted for design storms which are outside the range of the observed data.
For practical purposes it is useful to express the rainfall-runoff relationship as a
mathematical equation rather than graphically. Several forms of equation have been
reported by Chow (1964), who describes an empirical equation developed by the U.S. Soil
Conservation Service of the form $Q = P_e^2 / (P_e + I)$, where $Q$ is the direct runoff, $P_e$ is
the total rainfall during the storm minus the initial loss, and $I$ is the potential infiltration.
The value of $I$ is dependent on the soil type, ground cover and the antecedent moisture
conditions. The above equation is attractive in that the proportion of estimated
runoff relative to rainfall increases as the storm rainfall increases. This is logical since
the catchment recharge is high in the early stages of a storm and decreases to a more or
less constant rate as the storm duration increases.
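The behaviour described above can be checked numerically. The following sketch evaluates the U.S. Soil Conservation Service form, direct runoff equal to the square of the rainfall after initial loss divided by the sum of that rainfall and the potential infiltration, for a hypothetical potential infiltration of 50 mm; the function name and the storm depths are assumptions made for illustration.

```python
def scs_direct_runoff(pe_mm, potential_infiltration_mm):
    """Direct runoff Q = Pe**2 / (Pe + I), all depths in mm."""
    if pe_mm <= 0:
        return 0.0  # no rainfall remains after the initial loss
    return pe_mm ** 2 / (pe_mm + potential_infiltration_mm)


# Hypothetical potential infiltration I and storm depths after initial loss.
I_MM = 50.0
storms_mm = [10.0, 50.0, 150.0]
runoff_ratios = [scs_direct_runoff(pe, I_MM) / pe for pe in storms_mm]
```

The runoff ratio simplifies to Pe / (Pe + I), so for these storms it rises from roughly 0.17 to 0.50 and then 0.75, consistent with a larger proportion of rainfall becoming runoff as the storm rainfall increases.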
2.3
Review on Hydrologic Modelling
Beven (2001) defined the rainfall-runoff model as a model to predict the
hydrograph peaks correctly at least to within the magnitude of the errors associated with
the observations, to predict the timing of the hydrograph peak correctly, and to give a
good representation of the form of the recession curve to set up the initial conditions prior
to the next event. Meanwhile, Loague and Freeze (1985) defined hydrologic modelling
as,
‘In many ways, hydrologic modelling is more an art than a science, and it
is likely to remain so. Predictive hydrologic modelling is normally carried
out on a given catchment using a specific model under the supervision of
an individual hydrologist. The usefulness of the results depends in large
measure on the talents and experience of the hydrologist and...
understanding of the mathematical nuances of the particular model and
the hydrologic nuances of the particular catchment. It is unlikely that the
results of an objective analysis of modelling methods…can ever be
substituted for the subjective talents of an experienced modeler’.
Hydrologic modelling is employed to address a wide spectrum of environmental
and water resources problems, and to understand dynamic interactions between climate
and land surface hydrology. Hydrological models are fundamental to water resources
assessment, development and management. They are, for example, used to analyze the
quantity and quality of streamflow, reservoir system operations, surface water and
groundwater conjunctive use management, water distribution systems, water use, and a
range of water resources management activities (Wurbs, 1998). Terms such as
deterministic, stochastic, conceptual, black box, empirical, lumped and distributed input,
lumped and distributed parameter, and linear and nonlinear systems abound, and their use
is very subjective, complex and hard to understand. In this study, the history of catchment
modelling over the last ninety years is described. The field of catchment modelling today
presents a bewildering array of models and model terminology. One is likely to
receive the impression that this is a very complex and hard-to-understand subject. To
understand the present, it usually helps to understand how it came about, and to this end
the history of catchment modelling must be considered. This history can be divided into
five phases, or eras (Torno, 1985).
2.3.1
First era (1910-1930)
This is known as the “crude” era. In the decades, or perhaps centuries, prior to this time,
people had some interest in hydrologic processes and had methods by which they could
infer conclusions about the response of rivers to meteorological phenomena. It was in this
period that the first attempts were made to actually analyse and quantify those processes.
One such approach was to observe, for a number of events, the quantity of rainfall at
some point in the catchment, usually the outlet, and the amount by which the stage of the
river increased, and then to plot precipitation versus rise in stage. Another very early and
very simple approach is called the Rational Method. In it, the maximum rate of outflow
from a catchment is equated to the product of three quantities: the rainfall intensity, the
catchment area, and a coefficient. It is assumed that the rain continues at the stated
intensity until the maximum flow is reached, that is, an equilibrium condition. Obviously, if a
rain storm maintains a constant intensity until equilibrium is reached and if the correct
value of the coefficient is known, the method will yield a correct value of peak discharge.
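The Rational Method product can be written as a one-line calculation. The sketch below assumes metric units (intensity in mm/hr, area in km²) and returns the peak discharge in m³/s; the division by 3.6 is only the unit conversion, since 1 mm/hr over 1 km² equals 1/3.6 m³/s. The function name and the sample values are hypothetical.

```python
def rational_peak_discharge(c, intensity_mm_per_hr, area_km2):
    """Peak discharge (m^3/s) as coefficient x intensity x area."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("the coefficient is expected to lie between 0 and 1")
    # (mm/hr * km^2) -> m^3/s: 1e-3 m * 1e6 m^2 / 3600 s = 1 / 3.6
    return c * intensity_mm_per_hr * area_km2 / 3.6


# Hypothetical catchment: C = 0.5, 36 mm/hr design intensity, 2 km^2 area.
peak_m3s = rational_peak_discharge(0.5, 36.0, 2.0)
```

With these assumed values the estimate is 10 m³/s. The result is only as good as the choice of coefficient and the assumption that the storm lasts long enough for the whole catchment to contribute.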
2.3.2
Second era (1930-1945)
This is known as the “theoretical” era. After the crude methods had been thoroughly
exploited and after it was realized that the physical process is really much more
complicated than they assumed, investigators began to turn to what they felt were more
scientific
approaches. Work done during the theoretical era was characterized by attempts to
understand the processes and to represent them mathematically. Typical of the period
was the work of Horton (1933) and others in analyzing the infiltration process. The
solution to the problem seemed obvious: determine how much water soaked into the
ground, subtract this from the total rainfall, and the remainder was what went to the river.
The key to the whole thing was the infiltration process.
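As an illustration of this kind of mathematical representation, the sketch below uses the exponential-decay infiltration curve commonly attributed to Horton, in which the infiltration capacity falls from an initial value towards a final steady rate. This particular form is not given in the text above and is introduced here as an assumption; the parameter values are hypothetical.

```python
import math


def horton_infiltration_capacity(t_hours, f0, fc, k):
    """Infiltration capacity (mm/hr): f(t) = fc + (f0 - fc) * exp(-k * t)."""
    return fc + (f0 - fc) * math.exp(-k * t_hours)


# Hypothetical parameters: initial 75 mm/hr, final 10 mm/hr, decay 2 per hour.
F0, FC, K = 75.0, 10.0, 2.0
capacities = [horton_infiltration_capacity(t, F0, FC, K) for t in (0.0, 0.5, 1.0, 2.0)]
```

In such a scheme, rainfall falling at an intensity above the current capacity is the part treated as excess available for runoff.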
2.3.3 Third era (1945-1965)
This is known as the “empirical” era. The word empirical is defined as “relying upon or
derived from observation or experiment”. The investigators of the empirical era decided
that the reason the theoretical approach did not work well was that no one really
understood the rainfall-runoff process and that the mathematical relationships based on
this false understanding were wrong. Consequently, any attempt to fit observed data to
such formulations was doomed to failure. The right way to handle the problem, then, was
to observe, for many events, the magnitude of rainfall, of river response, and of any other
factors which were logically thought to be involved in the process, and then to experiment
with different types of mathematical or graphical relationships until one was found which
caused everything to “fit together”. That is, let the data, rather than theory, dictate the
form of relationship to be used. The most notable example of this philosophy is the work
of Kohler et al. in the late 1940’s with a coaxial graphical rainfall-runoff relationship
employing the now well-known “Antecedent Precipitation Index” or “API”. This method
soon proved to be quite suitable for various types of hydrologic studies and has been very
popular for the last thirty years.
2.3.4 Fourth era (1965-1975)
This period is known as the “conceptual” era. Computers became available to hydrologists in
the late 1950s and one of the first uses made of them was the automation of existing
hydrologic techniques. The conceptual era of catchment modelling began with the
publication by Linsley and Crawford of the Stanford Watershed Model in 1966. Their
approach was new at the time and, to the best of the author’s knowledge, original with
them. It consists of writing mathematical formulations which represent the modeller’s
concept of all of the processes. Linsley and Crawford saw the opportunity to develop
new modelling techniques, to do things which cannot be done without a computer simply
because those things involve a tremendous volume of computation. Since 1965, other
hydrologists have independently produced a number of other conceptual catchment
models.
2.3.5 Fifth era (1975-present)
Some observers feel that “black box” modelling, which relies on modern and powerful
computing equipment, is a new era in hydrology. The black box approach to catchment
modelling was introduced by Wallis and Todini in 1973 in the form of the “Constrained
Linear System”, or “CLS”, model. It involves the use of a computer to correlate great
masses of input and output data using mathematical formulations that have little or nothing
to do with hydrological processes but which result in outputs that resemble a
catchment’s output, a hydrograph. A claimed advantage of black box models is the ability,
when being calibrated on poor quality historical data, to “filter out” random errors in the
data, thereby producing a set of parameters which accurately represents the hydrological
characteristics of the catchment.
2.4 Rainfall-Runoff Models
Clarke (1973) described rainfall-runoff models in terms of one or more of the following
characteristics. Firstly, models are deterministic or probabilistic, depending on the
characteristics of the model parameters and variables. If the model parameters or variables
are considered random variables with probability distributions, the model is called
probabilistic. If the model parameters or variables are free from any random variation,
the model is deterministic. Secondly, models are lumped or distributed, depending on
their treatment of space. If the model parameters or variables vary spatially within
the watershed, the model is a distributed model; otherwise, it is a lumped model. Thirdly,
models are linear or nonlinear, in either the system-theory sense or the statistical regression
sense. If the principle of superposition is not violated, the model is linear in the system-theory
sense. If the model is linear in its parameters, it is linear in the statistical regression sense.
Otherwise, the model is nonlinear. Fourthly, models are continuous or discrete. If the
model uses continuous functions in the formulation of the physical phenomena, it is
continuous. Otherwise, it is discrete. Fifthly, models are black-box or process models. A
black-box model is based on analyses of rainfall input, runoff output, and a transfer function
which simulates the relationship between rainfall and runoff in a watershed, such as the unit
hydrograph and time-area methods (McCuen, 1997). Conceptual models are combinations of
process and black-box models. In these models, the physical processes are defined using process
models and model parameters are optimized employing a black-box approach (Singh,
1988). Examples of conceptual models include the Clark, Nash and
Stanford Watershed models. Lastly, models are event-driven or continuous-process models,
depending on the simulation period. If the model is designed to simulate a single event, it
is called an event-driven model. The focus of event-driven models is on the evaluation of
surface runoff and direct infiltration. The most commonly used event-driven models include
HEC-HMS, developed by the US Army Corps of Engineers, and the Storm Water Management
Model (SWMM), developed by the US Environmental Protection Agency, among others.
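The system-theory notion of linearity mentioned above can be checked numerically. The sketch below routes rainfall through a unit-hydrograph-style transfer function (a discrete convolution) and tests the principle of superposition. The hydrograph ordinates, the rainfall series and the helper name `route` are made-up illustrative assumptions.

```python
import numpy as np

# Hypothetical unit hydrograph ordinates (response to one unit of effective rainfall).
unit_hydrograph = np.array([0.1, 0.4, 0.3, 0.2])

def route(rain):
    """Linear transfer function: discrete convolution of rainfall with the unit hydrograph."""
    return np.convolve(rain, unit_hydrograph)

rain_a = np.array([5.0, 0.0, 2.0])
rain_b = np.array([1.0, 3.0, 0.0])

# Superposition: the response to a sum of inputs equals the sum of the separate responses.
combined = route(rain_a + rain_b)
separate = route(rain_a) + route(rain_b)
print(np.allclose(combined, separate))
```

A model whose response fails this check for some pair of inputs would be nonlinear in the system-theory sense.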
Beven (2001) presented a very basic classification of hydrologic models, described as
follows. Lumped models treat the catchment as a single unit, with state variables that
represent averages over the catchment area, such as average storage in the saturated zone.
Distributed models make predictions that are distributed in space, with state variables that
represent local averages of storage, flow depths or hydraulic potential, by discretizing the
catchment into a large number of elements or grid squares and solving the equations for
the state variables associated with every element or grid square. Parameter values must also
be specified for every element in a distributed model. Meanwhile, deterministic
models permit only one outcome from a simulation with one set of inputs and
parameter values. Stochastic models allow for some randomness or uncertainty in the
possible outcomes due to uncertainty in input variables, boundary conditions or model
parameters. The vast majority of models used in rainfall-runoff modelling are used in a
deterministic way, although again the distinction is not clear-cut, since there are examples
of models which add a stochastic error model to the deterministic predictions of the
hydrological model, and there are models that use a probability distribution function of
state variables but make predictions in a deterministic way.
Rainfall-runoff modelling is the essence of much engineering hydrology. Because flow
data are rarely available, design events are usually determined by a combination of rainfall
information and rainfall-runoff relationships. As mentioned before, determining the
relationship between rainfall and runoff for a catchment area is one of the most important
problems faced by hydrologists and engineers. Although many catchments have been
gauged to provide continuous records of runoff, hydrologists and engineers are often
faced with situations where little or no information is available. In such instances,
simulation models are often used to generate synthetic flows. In the hydrologic context,
another approach to modelling is emerging which uses the data themselves to evolve the
model. The data are seen as the primary source of information and the control area or
volume is considered as a system. This process is essentially a system identification
process. The traditional methods of system identification based on statistical analysis of
input and output variables, such as correlation analysis, spectral analysis, dynamic linear
models or state-space models, seem to be inadequate to explain non-linear system
behaviour. Therefore, a new modelling paradigm based on artificial intelligence
techniques such as artificial neural networks, fuzzy logic and genetic algorithms is increasingly
offering a promising alternative. In recent years, there has been growing interest in using
artificial intelligence models, especially in the field of hydrology. The main advantage of
these soft modelling techniques is that they can be set up in considerably less time
and the model response can also be obtained quickly, thus reducing cost. Moreover, these
techniques can be used for modelling systems on a real-time basis and can even be
updated continuously.
Figure 2.1 shows a schematic outline of the different steps in the modelling
process, designed on the basis of experience and observation. It shows the
perceptual model of the rainfall-runoff processes in a catchment. The perceptual model is
the summary of our perceptions of how the catchment responds to rainfall under different
conditions. These perceptions are shaped by previous studies, by the data sets that have
been analysed and, particularly, by experience of field sites in different environments.
The perceptual model is not constrained by mathematical theory
(Beven, 2001). However, a mathematical description is traditionally the first stage in the
formulation of a model that will make quantitative predictions.
Figure 2.1 A schematic outline of the different steps in the modelling process
(flowchart steps: selection of the data (quality and quantity); deciding on the processes or
model structures; the conceptual model, deciding on the equations; model calibration, getting
values of parameters; model validation/testing; declare success? If no, revise the parameter
values, the equations, the perceptions/model structures, or the quality and quantity of the
data; if yes, OK)
The steps involved in the identification of the perceptual model of a system are the
selection of input-output data suitable for calibration and validation, followed by the selection
of a model structure and the estimation of its parameters. Before we can apply the model, it is
generally necessary to go through a stage of parameter calibration. This is because all the
models used in hydrology have equations that involve a variety of different input and
state variables. There are variables that define the time-variable boundary conditions
during simulation, such as the rainfall and other meteorological variables at a given time
step, water table depth, initial values of the state variables, etc. Finally, there are the
model parameters, which define the characteristics of the catchment area or flow domain.
The most commonly used method of parameter calibration is to adjust
the values of the parameters to achieve the best match between the model
predictions and any observations of the actual catchment response that may be available.
The next stage is validation or evaluation of those predictions. Most model structures
have a sufficient number of parameters that can be varied to allow reasonable fits to the
data. It is usually much more difficult to find a model that is totally acceptable.
Differences between the predictions and the observations may lead to a revision of the
parameter values being used, to a reassessment of the conceptual model being used, or to a
revision of the perceptual model of the catchment as understanding is gained from the attempt
to model the hydrological processes. Bronstert (1999) drew the following conclusions from
experience in using distributed models:
(a) There is still a lack of knowledge about how to represent some processes
at or near the soil surface, including surface crusting and preferential
flows.
(b) In many practical applications the data available on soil characteristics and
boundary conditions, with all their spatial and temporal variability, may
not be adequate to support the use of a fully distributed model.
(c) In some cases experience suggests that the predicted responses are highly
sensitive to small changes in parameters, initial and boundary conditions.
It may be necessary to allow for this sensitivity by considering ranges of
parameter values and the propagation of the resulting uncertainty through
the model.
When applying a neural network to the rainfall-runoff problem, the stimulus is
obviously the rainfall, and the response is the runoff at the catchment outlet or
downstream. Since the flow at any time point is effectively composed of contributions
from different sub-areas whose time of travel to the outlet covers a range of values, both
the concurrent and antecedent rainfalls should be considered as stimuli (see Dawson and
Wilby, 1998). The number of antecedent rainfall ordinates required is broadly related to
the lag time of the drainage area. In addition, previous output values
(previous runoff values) are also used in the input pattern in some cases, and the resulting
network is referred to as a recurrent network. Dawson and Wilby (1998) point out the
importance of careful data preparation and the choice of the input variable set. In this study,
the writer has placed emphasis on artificial neural network (ANN) methods. The objective of the
applications approach is to create ANN models that can be used to tackle real-life
problems.
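One way to assemble such input patterns is sketched below. Each pattern holds the concurrent rainfall, a chosen number of antecedent rainfall ordinates and, optionally, previous runoff values, with the current runoff as the target. The function name `build_patterns`, the lag counts and the sample series are illustrative assumptions, not the configuration used in this study.

```python
import numpy as np

def build_patterns(rain, runoff, n_rain_lags=2, n_flow_lags=1):
    """Return (inputs, targets). Each input row is
    [P(t), P(t-1), ..., P(t-n_rain_lags), Q(t-1), ..., Q(t-n_flow_lags)]
    and the matching target is Q(t)."""
    rain = np.asarray(rain, dtype=float)
    runoff = np.asarray(runoff, dtype=float)
    start = max(n_rain_lags, n_flow_lags)  # first time step with a full history
    rows, targets = [], []
    for t in range(start, len(rain)):
        rain_part = [rain[t - k] for k in range(n_rain_lags + 1)]
        flow_part = [runoff[t - k] for k in range(1, n_flow_lags + 1)]
        rows.append(rain_part + flow_part)
        targets.append(runoff[t])
    return np.array(rows), np.array(targets)

# Made-up rainfall and runoff series of equal length.
rain = [0.0, 5.0, 12.0, 3.0, 0.0, 0.0]
runoff = [1.0, 1.2, 3.5, 6.0, 4.0, 2.5]
X, y = build_patterns(rain, runoff)
print(X.shape, y.shape)
```

In practice the number of rainfall lags would be chosen with the lag time of the drainage area in mind, as noted above.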
2.5 Artificial Neural Network
The development of ANN began with the work of McCulloch and Pitts in 1943,
inspired by a desire to understand the human brain and emulate its
functioning. Although the idea of ANN was proposed by McCulloch and Pitts over fifty
years ago, the ANN technique has experienced a renaissance only since
Hopfield’s (1982) effort, because the current algorithms overcome the
limitations of early networks. A tremendous growth in interest in this computational
mechanism has occurred since Rumelhart et al. (1986) rediscovered a mathematically
rigorous theoretical framework for neural networks, i.e., the backpropagation algorithm
(ASCE, 2000).
Artificial Intelligence (AI) can be broadly defined as computer processes that
attempt to emulate the human thought processes associated with activities that
require the use of intelligence (Tsoukalas and Uhrig, 1997). The term AI, in its broadest
sense, encompasses a number of technologies that include, but are not limited to, expert
systems, neural networks, genetic algorithms, fuzzy logic systems, cellular automata,
chaotic systems, and anticipatory systems. An interesting aspect of artificial intelligence is
the use of computers to model the behavioural aspects of human reasoning and learning. In
problem solving, one must proceed from a beginning (the initial state) to the end (the goal
state) via a limited number of steps (Encyclopaedia, 2000). The method used to construct
expert systems, knowledge engineering, extracts a set of rules and data from an expert or
experts through extensive questioning. This material is then organized in a format
suitable for representation in a computer, and a set of tools for inquiry, manipulation, and
response is applied. While such systems do not often replace the human experts, they can
serve as useful adjuncts or assistants.
Meanwhile, an artificial neural network (ANN) can be defined as ‘a data
processing system consisting of a large number of simple, highly interconnected
processing elements (artificial neurons) in an architecture inspired by the structure of the
cerebral cortex of the brain’ (Tsoukalas and Uhrig, 1997). Work on artificial neural
network models has a long history. The development of artificial neural networks began
more than 40 years ago (Lippmann, 1987), motivated by a desire both
to understand the brain and to emulate some of its strengths (Fausett, 1994). The brain
can be considered as a highly complex, nonlinear, and parallel computer. It has the
capability of organizing neurons so as to perform certain computations many times faster
than the fastest digital computer in existence today (Haykin, 1994). Neural nets use a
number of simple computational units called “neurons”, each of which tries to imitate the
behaviour of a single human brain cell.
Altrock (1995) refers to the brain as a “biological neural net” and to implementations on
computers as “neural nets”. ANN are modelled on the functions of the
mammalian nervous system. The network structure is formed by two main units, neurons
and the interconnections between them. Various mathematical models are based on the neuron
concept. Figure 2.2 shows a simple mathematical model of a neuron. A three-layer
neural network architecture is a universal function approximator. The main control
parameters of a neural network model are the interneuron connection strengths, also known as
weights, and the biases. The optimum values of these connection weights and biases are
determined by minimising an objective function, usually the mean square error. The weights
$W_{ij}$ are “learned” to fit any function. First, the so-called propagation function combines all
inputs $X_i$ that stem from the sending neurons. The means of combination is a weighted
sum, where the weights $W_i$ represent the synaptic strength. Exciting synapses have
positive weights; inhibiting synapses have negative weights. To express a background
activation level of the neuron, an offset (bias) $\Theta$ is added to the weighted sum. The
so-called activation function then computes the output signal $Y$ of the neuron from the
activation level $f$; it is of the sigmoid type, as plotted in the lower right box of Figure 2.2.
Figure 2.2 Simple mathematical model of a neuron
(panels: biological neuron; artificial neuron with inputs X1, ..., Xn, weights W1, ..., Wn,
propagation function $f = \sum_{i=0}^{n} W_i X_i + \Theta$, activation function, and output Y)
During learning, a neuron receives inputs from the input layer or the previous layer,
weights each input with a pre-assigned value, and combines these weighted inputs. All
inputs are combined by a weighted sum (the propagation function). The combination of the
weighted inputs is represented as,
$$net_j = \sum_{i} W_{ij} X_i \qquad (2.1)$$
where $net_j$ is the summation of weighted inputs for the $j$th neuron; $W_{ij}$ is the weight from
the $i$th neuron in the previous layer to the $j$th neuron in the current layer; and $X_i$ is the
input from the $i$th to the $j$th neuron. The $net_j$ is either compared to a threshold or
passed through a transfer function to determine the level of activation. If the activation of
a neuron is strong enough, it produces an output that is sent as an input to other neurons
in the successive layer.
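A minimal sketch of Equation (2.1) followed by a sigmoid transfer function is given below. The bias argument corresponds to the offset Θ described for Figure 2.2. The function name and the numerical weights and inputs are arbitrary illustrative assumptions.

```python
import math

def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs (net) passed through a sigmoid transfer function."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    out = 1.0 / (1.0 + math.exp(-net))
    return net, out

# Positive weights act as exciting synapses, negative weights as inhibiting ones.
net, out = neuron_output(inputs=[1.0, 0.5, 2.0], weights=[0.4, -0.2, 0.1], bias=-0.5)
print(net, out)
```

With these particular values the weighted inputs and the bias cancel, so the net is essentially zero and the sigmoid output sits at its midpoint.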
2.5.1 Basic Structure
A network can have one or more layers. As shown in Figure 2.3, the basic
structure of a network usually consists of three layers: the input layer, where the data are
introduced to the network; the hidden layer or layers, where the data are processed; and
the output layer, where the results for given inputs are produced. The layers between the
input and output layers are called hidden layers, and there can be one or more of them.
Determining the structure of the hidden layers and the number of neurons is important in
multilayer perceptron modelling. There is no hard and fast rule for defining network parameters.
Masters (1993) gave three simple guidelines to follow. The first is to use one
hidden layer; the second is to use very few hidden neurons; and the third is to train until we
can’t stand it anymore. He suggested that using one hidden layer for a multilayer perceptron
was sufficient because in most problems a second hidden layer will not produce a large
improvement in performance. The use of more than one hidden layer substantially
increases the number of parameters to be estimated. Such an increase in the number of
parameters may slow the calibration process without substantially improving the
efficiency of the network (Masters, 1993). It has been reported that for the vast majority of
practical problems, there is no reason to use more than one hidden layer. Additional
hidden layers should be added only when a single hidden layer has been found to be
inadequate (Tsoukalas and Uhrig, 1997).
Figure 2.3 A three-layer neural network with i inputs and k outputs
(layers: input layer with inputs xi, i = 1, ..., m; hidden layer, connected through weights wij;
output layer with outputs y(k))
The architecture of a neural network is defined by the weights between neurons, a
transfer function that controls the generation of output in a neuron, and learning laws that
define the relative importance of weights for input to a neuron (Caudill, 1987). In
practice, however, neural networks cannot provide a solution by working
alone. Rather, they need to be integrated into a consistent system engineering approach.
Neural nets operate analogously. The most distinctive characteristic of an ANN is its
ability to learn from examples. A net must be trained by being repeatedly fed input data
together with corresponding target outcomes. Learning (or training) is defined as
self-adjustment of the network weights as a response to changes in the information
environment. When a set of inputs is presented, a network adjusts its weights in order to
approximate the target output (observed or measured output) based on a certain
algorithm. Learning in ANN consists of three elements: weights between neurons that
define the relative importance of the inputs, a transfer function that controls the
generation of the output from a neuron, and learning laws that describe how the
adjustments of the weights are made during training (Caudill, 1987). The net of a neuron
is passed through an activation or transfer function to produce the output of the neuron. In
backpropagation networks, the modification of the network weights is accomplished
with the derivative of the activation function. Therefore, continuous transfer functions
are desirable. The sigmoid and hyperbolic tangent functions are the most commonly
used continuous transfer functions in backpropagation networks (see Lippmann, 1987;
ASCE, 2000; Tokar and Johnson, 1999; Tokar and Markus, 2000).
2.5.2 Transfer Function
The transfer function determines the level of activation of a neuron by scaling the
net. If the activation of a neuron is strong enough, it will produce an output and send the
result to other neurons. The most commonly used transfer functions are hard-limit,
threshold logic, continuous functions such as sigmoid and hyperbolic tangent, and
Gaussian functions. Some examples of typical transfer functions are illustrated in Figures
2.4 to 2.7. The simplest transfer functions are the threshold-logic function and the hard-limit
function. The difference between them is that the activation of a neuron between the upper
and lower limits of the threshold is zero for the threshold-logic function, whereas this value
varies linearly with the value of net for the hard-limit function. In early neural network
architectures, these two types of transfer function were used in the training process, but
proved to be inefficient. In advanced neural network structures, a learning process using the
derivative of the transfer function is usually employed. A transfer function that is
continuous and differentiable everywhere is required in these types of network.
Continuous functions have therefore been adopted for the advanced forms of network. The
accuracy of the network trained using the hyperbolic tangent transfer function was slightly
better than that of the one trained using the sigmoid transfer function (Tokar, 1996). The
hyperbolic tangent function shown in Figure 2.6 is one of the most commonly used
continuous functions. As shown in Figure 2.6, the difference between the sigmoid and
the hyperbolic tangent functions is that the latter is bipolar. Meanwhile, Figure 2.7 shows
the Gaussian function. Compared to the other types of transfer function, the Gaussian
function is the most commonly used for radial basis function (RBF) networks.
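The continuous transfer functions discussed above can be sketched as follows. The sigmoid and hyperbolic tangent follow their standard definitions. The particular Gaussian form, its `width` parameter and the sample values of net are assumptions for illustration, since the text does not give explicit formulas.

```python
import numpy as np

def sigmoid(net):
    """Unipolar continuous transfer function with outputs in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-net))

def hyperbolic_tangent(net):
    """Bipolar continuous transfer function with outputs in (-1, 1)."""
    return np.tanh(net)

def gaussian(net, width=1.0):
    """Bell-shaped function, largest at net = 0 (assumed form exp(-(net/width)^2))."""
    return np.exp(-(net / width) ** 2)

def sigmoid_derivative(net):
    """Derivative of the sigmoid, the quantity a derivative-based learning rule needs."""
    s = sigmoid(net)
    return s * (1.0 - s)

net = np.array([-2.0, 0.0, 2.0])
print(sigmoid(net))
print(hyperbolic_tangent(net))
print(gaussian(net))
```

The bipolar character of the hyperbolic tangent shows up as negative outputs for negative net, whereas the sigmoid output stays positive.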
Figure 2.4 A threshold-logic transfer function (axes: net, out)
Figure 2.5 A hard-limit transfer function (axes: net, out)
Figure 2.6 Continuous transfer function: (a) the sigmoid, (b) the hyperbolic tangent (axes: net, out)
Figure 2.7 The Gaussian function (axes: net, out)
2.5.3 Back-propagation Algorithm
Several neural network training algorithms exist and are widely used by
researchers. In this study, the training of ANN was accomplished by a backpropagation
algorithm. Backpropagation is the most commonly used supervised training algorithm in
multilayer feed-forward networks (Tokar and Johnson, 1999). Backpropagation,
a supervised learning algorithm that performs a gradient descent search in
weight space using the generalized delta rule, is often reported in applications (Minns and
Halls, 1996). Backpropagation is a systematic method for training multilayer (three or
more layers) ANN (Tsoukalas and Uhrig, 1997). The least mean square error method,
along with the generalized delta rule, is used to optimize the network weights in a
backpropagation network. The objective of a backpropagation network is to find the
weights that approximate the target values of the output with a selected accuracy. The
backpropagation algorithm involves two steps. The first step is a forward pass, in which
the effect of the input is passed forward through the network to reach the output layer.
The network output is compared with the desired target output, and an error is computed.
After the error is computed, the second step starts backward through the network. The
errors at the output layer are propagated back toward the input layer, with the weights
being modified. In this way, the backpropagation algorithm modifies the network
weights by minimizing the error between the target and computed outputs.
The momentum factor can speed up training in very flat regions of the error
surface and help prevent oscillations in the weights. A learning rate is used to increase
the chance of avoiding the training process being trapped in local minima instead of
global minima (ASCE, 2000). Such a network, with learning governed by a generalized
delta rule, is typically called a backpropagation neural network. This algorithm was
developed earlier by Werbos (1974) as part of his Ph.D. dissertation at Harvard
University (Tsoukalas and Uhrig, 1997). Nevertheless, its power was not
recognized and appreciated for many years. The elucidation of this training algorithm in
1986 by Rumelhart, Hinton, and Williams was the key step in making neural networks
practical in many real-world situations. Today, it is estimated that 80% of all
applications utilize this backpropagation algorithm in one form or another (Tsoukalas and
Uhrig, 1997).
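The two-step procedure and the role of the learning rate and momentum factor can be sketched for a network with one hidden layer as follows. This is an illustrative NumPy sketch under assumed settings (sigmoid units in both layers, three hidden neurons, arbitrary learning rate and momentum, a synthetic data set), not the implementation used in this study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_backprop(X, T, n_hidden=3, learning_rate=0.5, momentum=0.8,
                   epochs=2000, seed=0):
    """Train a one-hidden-layer network by gradient descent with momentum.
    Returns the parameters and the mean square error recorded at each epoch."""
    rng = np.random.default_rng(seed)
    W1 = rng.uniform(-0.5, 0.5, (X.shape[1], n_hidden))  # small random initial weights
    b1 = np.zeros(n_hidden)
    W2 = rng.uniform(-0.5, 0.5, (n_hidden, T.shape[1]))
    b2 = np.zeros(T.shape[1])
    params = [W1, b1, W2, b2]
    steps = [np.zeros_like(p) for p in params]            # previous weight changes
    history = []
    for _ in range(epochs):
        # Forward pass: propagate the inputs to the output layer.
        H = sigmoid(X @ W1 + b1)
        Y = sigmoid(H @ W2 + b2)
        error = Y - T
        history.append(float(np.mean(error ** 2)))
        # Backward pass: propagate the output error back toward the input layer,
        # using the derivative of the sigmoid, y * (1 - y).
        delta_out = error * Y * (1.0 - Y)
        delta_hid = (delta_out @ W2.T) * H * (1.0 - H)
        grads = [X.T @ delta_hid, delta_hid.sum(axis=0),
                 H.T @ delta_out, delta_out.sum(axis=0)]
        # Weight change = momentum * previous change - learning_rate * gradient.
        for p, s, g in zip(params, steps, grads):
            s *= momentum
            s -= learning_rate * g / len(X)
            p += s                                        # in-place weight update
    return params, history

# Synthetic example: one input scaled to [0, 1], targets inside the sigmoid range.
X = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
T = 0.2 + 0.6 * X ** 2
params, history = train_backprop(X, T)
print(history[0], history[-1])
```

Comparing the first and last entries of `history` shows how far the mean square error has been reduced over the training epochs.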
2.5.4 Learning or Training
The objective of a neural network is to process information in the way for which it has
previously been trained. Neural networks can learn from experience, generalize from previous
examples to new ones, and abstract essential characteristics from inputs containing
irrelevant data (Fausett, 1994). Calibration or training uses sample data sets of inputs and
corresponding outputs to improve the performance of the neural network. The increased
knowledge of the biological background of the brain and the rapidly developing knowledge of
electronics and automatic computation initiated the development of artificial neurons and neural
networks.
According to Tokar (1996), learning can be categorized into three fundamental
groups based on whether outputs are provided or not, namely supervised learning, graded
or reinforcement learning, and unsupervised learning or self-organization. The most
popular and easiest way to train the network is by supervised
learning. In supervised learning, a pair of input-output vectors, called a training pair or set,
is introduced to a network. The network computes the outputs for a given set of inputs and
compares these outputs with the target outputs. The difference between the computed
outputs and the target outputs is used to modify the weights in the network using a method
that minimizes the difference or error. This procedure is repeated until an acceptable level
of accuracy is reached (Hecht-Nielsen, 1991). In unsupervised learning, the training set
is composed of inputs only; no outputs are provided. The network does not have any
knowledge of what the correct answer is. Therefore, the network modifies the weights in
such a way that similar inputs yield similar outputs. During the training process, statistical
characteristics of the given data are extracted and similar inputs are arranged into similar
classes (Wassermann, 2000). Meanwhile, reinforcement learning lies somewhere
between supervised and unsupervised learning. In reinforcement learning, a score
or grade is given to a network based on its performance over a series of
multiple training trials, instead of a target output for each trial (Hecht-Nielsen, 1991).
Using the historical data, the calibration (training) and verification
(testing) of the model are carried out for the rainfall-runoff data series at selected
catchments. Figure 2.8 shows the steps in training and testing of the neural network
model. The model is trained to calibrate its parameters, starting with some small
randomly generated weights. After the predefined target error
for all patterns is reached, the training process is stopped and the weights are saved.
These weights and the same model architecture are utilized in the verification
(testing) phase.
In the development of the rainfall-runoff relationship models, the networks were
trained using a backpropagation algorithm and one or more hidden layers. The weights
are updated or modified iteratively using the generalized delta rule or the steepest-gradient
descent principle. Training of the networks was accomplished using the MATLAB
software developed by MathWorks Inc. (2000). After a sufficient number of training
iterations, the network learns to recognize patterns in the data and creates an internal
model of the process governing the data. The training process is stopped when no
appreciable change is observed in the values associated with the connection links or when some
termination criterion is satisfied. The network can then use this internal model to make
predictions for new input conditions. Evaluation of the model performance will be based
on the generated errors and the model robustness.
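The calibration and verification workflow described above can be sketched as follows: start from small random weights, update them iteratively until the sum of square error reaches a target, then apply the saved weights to data not used in calibration. To keep the sketch short, a single linear neuron trained with the delta rule stands in for the full network, and it is written in Python rather than MATLAB. The synthetic data, the 40/20 split, the target error and the learning rate are all illustrative assumptions.

```python
import numpy as np

def calibrate(X, t, target_sse=1e-3, learning_rate=0.5, max_epochs=5000, seed=1):
    """Adjust the weights iteratively until the sum of square error falls below
    target_sse or max_epochs is reached. Returns (weights, bias, sse, epochs)."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-0.1, 0.1, X.shape[1])   # small random initial weights
    b = 0.0
    for epoch in range(1, max_epochs + 1):
        err = X @ w + b - t
        sse = float(np.sum(err ** 2))
        if sse <= target_sse:                # termination criterion satisfied
            break
        w -= learning_rate * (X.T @ err) / len(X)
        b -= learning_rate * err.mean()
    return w, b, sse, epoch

# Synthetic series: the target is a fixed linear combination of two inputs.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, (60, 2))
t = 0.6 * X[:, 0] + 0.3 * X[:, 1] + 0.05

X_cal, t_cal = X[:40], t[:40]   # calibration (training) set
X_ver, t_ver = X[40:], t[40:]   # verification (testing) set

w, b, sse_cal, epochs = calibrate(X_cal, t_cal)
# Verification: the saved weights are applied unchanged to the unseen data.
sse_ver = float(np.sum((X_ver @ w + b - t_ver) ** 2))
print(epochs, sse_cal, sse_ver)
```

Comparing `sse_cal` with `sse_ver` gives a first indication of how well the calibrated weights carry over to new input conditions.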
The multilayer perceptron (MLP), a multilayer feed-forward neural network, is the model most
commonly used in hydrological modelling so far for input-output function approximation.
MLPs have been applied successfully to solve some difficult and diverse problems by
training them in a supervised manner. The MLP is a supervised neural network that
learns the mapping between input data and target data. Meanwhile, RBF networks may
require more neurons than standard feed-forward backpropagation networks, but
they can often be designed in a fraction of the time it takes to train standard feed-forward
networks. They work best when many training vectors are available (SPSS Inc., 1995).
Figure 2.8 Steps in training and testing
(flowchart steps: given model inputs; determination of model architecture and parameters;
parameter estimation; total sum of square error between computed outputs and target
values ≤ allowable? (No/Yes); verification)
2.6 Neural Network Application
Several studies indicate that neural networks have proven to be potentially useful
tools in hydrological modelling, such as for modelling of rainfall-runoff processes (Buch
et al., 1993; Anamala et al., 1995; Smith and Eli, 1995; Shamseldin, 1997; Abrahart et
al., 1999); flow prediction (Ichiyanagi, 1993; Karunithi et al., 1994; Dibike and
Solomatine, 1999; Tingsanchali, 2000; Elshorbagy et al., 2000); water quality
prediction (Maier and Dandy, 1996); operation of reservoir systems (Sakakima et al.,
1992; Raman and Chandramouli, 1996; Harun, 1999); water demand forecasting (Grino,
1992; Zhang et al., 1994); groundwater reclamation problems (Ranjithan and Eheart,
1993); and modelling of rating curves (Tawfik et al., 1997). A case study of runoff
simulation of a Himalayan glacier basin using an artificial neural network was presented
by Buch et al. (1993). It was found that the neural network performance was superior
to that of the energy balance and multiple regression models. In addition, it was
observed that the neural network was faster in learning and exhibited excellent system
generalization characteristics. Smith and Eli (1995) used a back-propagation neural network
model to predict only the peak discharge and the time to peak resulting from a single
rainfall pattern. Tokar and Johnson (1999) employed an artificial neural network
(ANN) to forecast daily runoff as a function of daily precipitation, temperature, and
snowmelt for the Little Patuxent River watershed in Maryland. It was found that the
ANN model provides a more systematic approach, reduces the length of calibration data,
and shortens the time spent in calibration of the models. Further, Tokar and Markus
(2000) applied the ANN technique to model watershed runoff in three basins with
different climatic and physiographic characteristics. The ANN technique was applied to
model monthly streamflow. It was found that ANN models could be powerful tools in
modelling the precipitation-runoff process for various time scales, topography, and
climate patterns. At the same time, they represent an improvement upon the prediction
accuracy and flexibility of current methods.
2.7
Neural Network Modelling in Hydrology and Water Resources
In hydrology, the problems are often not clearly understood or are too ill-defined for a
meaningful analysis using physically-based methods. Even when such models are
available, they have to rely on assumptions that make neural networks seem more
attractive. Moreover, neural networks routinely model the non-linearity of the underlying
process without having to solve complex partial differential equations. The presence of
noise in the inputs and outputs is handled by neural networks without severe loss of
accuracy because of distributed processing within the network. Many articles report that
the neural network method can produce good models that accurately represent
nonlinearities in the data (Bishop, 1995). Therefore, the natural behaviour of
hydrological processes is appropriate for the application of the neural network method.
The non-linearity and the noise component in hydrological processes demand a solution
that can promise a reliable result.
The neural networks technique is fairly new to water resources research.
However, there is rapidly growing interest among water scientists to apply neural
networks to various water resources problems. Hydrologists are often faced with problems
of prediction and estimation of runoff, precipitation (especially in ungauged
catchments), contaminant concentrations, water stages, and so on. Most hydrologic
processes exhibit a high degree of temporal and spatial variability and are further
plagued by issues of nonlinearity of physical processes, conflicting spatial and temporal
scales, and uncertainty in parameter estimates. Hydrologists attempt to provide rational
answers to problems that arise in the design and management of water resources. An
attractive feature of ANN is their ability to extract the relation between the inputs and
outputs of a process, without the physics being explicitly provided to them. They are
able to provide a mapping from one multivariate space to another, given a set of data
representing that mapping. Even if the data are noisy and contaminated with errors, ANN
have been known to identify the underlying rule. These properties suggest that ANN may
be well suited to the problems of estimation and prediction in hydrology.
ANN has been applied in rainfall-runoff modelling. The relationship between
rainfall and runoff is known to be highly non-linear and complex. Hydrologists and
engineers often face situations where little or no information is available.
Information about rainfall and runoff is important in designing and managing watersheds
and other projects related to water resources. Runoff is dependent on numerous factors
such as initial soil moisture, land use, evaporation, transpiration, infiltration,
distribution and so on. A watershed should therefore be gauged to provide continuous
records of runoff, rainfall and so on, because these data are very useful for
hydrologists and engineers in carrying out research and development, managing water
resources projects, etc. Where no data are available for a catchment, simulation models
are often used to generate synthetic flows. Several researchers have
investigated the potential of neural networks in modelling watershed runoff based on
rainfall inputs. Hsu et al. (1995) used a three-layer feedforward ANN to model the daily
rainfall-runoff relationship. They concluded that the feedforward ANN needed a trial and
error procedure to find the appropriate number of time-delayed input variables to the
model. ANN was able to provide a representation of the dynamic internal feedback loops
in the system, eliminating the need for lagged inputs and resulting in a compact weight
space. They found that ANN performed well at runoff prediction. Tokar and Johnson
(1999) employed ANN to forecast daily runoff as a function of daily precipitation,
temperature and snowmelt for the Little Patuxent River watershed in Maryland. The ANN
model provided a more systematic approach, reduced the length of calibration data, and
shortened the time spent in calibration of the models. They used a three-layer
feedforward ANN and applied a trial and error procedure to find the appropriate number
of input nodes to the model. They reported that ANN provides reasonably good solutions
for circumstances where complex systems may be poorly defined or understood using
mathematical equations, for problems that deal with noise or involve pattern
recognition, and for situations where input data are incomplete and ambiguous by nature.
ANN models provided higher training and testing accuracy when compared with
regression and simple conceptual models.
Smith and Eli (1995) applied a back-propagation ANN model to predict peak discharge and
time to peak over a hypothetical watershed. The output was either the watershed runoff
alone or the runoff and the time to peak. The number of nodes in the hidden layer was
determined by trial and error for each case. They found that, for single storm events,
the peak discharge and the time to peak were predicted well by the neural network, both
during training and testing. However, the ANN was less successful for multiple storm
events. Shamseldin (1997) compared ANN with a simple linear model, a season-based linear
perturbation model, and a nearest neighbor linear perturbation model. The study used a
three-layer neural network and adopted the conjugate gradient method for training the
model. The network output consisted of the runoff time series. The study found that
neural networks generally performed better than the other models during training and
testing. Minns and Hall (1996) adopted a three-layer neural network with the
back-propagation algorithm. Data for network training consisted of
model results from one storm sequence, and two such sequences were generated for
testing. Each storm sequence was generated using a Monte Carlo procedure that preserved
predetermined storm characteristics. Network inputs consisted of concurrent and 14
antecedent rainfall depths and 3 antecedent runoff values, and the network output was
current runoff. They found that ANN performance was hardly influenced by the level of
nonlinearity, with performance deteriorating only slightly for high levels of nonlinearity.
Haykin (1994) showed that the design of a supervised neural network might be pursued in
a number of different ways. While the back-propagation algorithm for the design of a
multilayer perceptron (under supervision) may be viewed as an application of stochastic
approximation, radial basis function (RBF) networks can be viewed as a curve-fitting
problem in a high-dimensional space. Therefore, the learning for such networks is
equivalent to finding a surface in a multidimensional space that provides a best fit to the
training data, with the criterion for ‘best-fit’ being expressed in a statistical sense.
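The input arrangement described above for Minns and Hall (1996), with concurrent and 14 antecedent rainfall depths plus 3 antecedent runoff values mapped to current runoff, can be illustrated with a small helper that builds the input patterns from two time series. This is a sketch only; the function name `build_lagged_inputs`, its argument names and the toy series are hypothetical and are not taken from the cited study.

```python
import numpy as np

def build_lagged_inputs(rain, runoff, n_rain_lags=14, n_flow_lags=3):
    """Return (X, y) where each row of X holds
    R(t), R(t-1), ..., R(t-n_rain_lags), Q(t-1), ..., Q(t-n_flow_lags)
    and y holds the current runoff Q(t)."""
    rain = np.asarray(rain, dtype=float)
    runoff = np.asarray(runoff, dtype=float)
    if rain.shape != runoff.shape:
        raise ValueError("rain and runoff must have the same length")
    start = max(n_rain_lags, n_flow_lags)   # first time step with a full history
    rows, targets = [], []
    for t in range(start, len(rain)):
        rain_part = rain[t - n_rain_lags:t + 1][::-1]   # concurrent + antecedent rainfall
        flow_part = runoff[t - n_flow_lags:t][::-1]     # antecedent runoff only
        rows.append(np.concatenate([rain_part, flow_part]))
        targets.append(runoff[t])
    return np.array(rows), np.array(targets)

# Toy series (hypothetical) so that the lag positions are easy to read off.
rain = np.arange(20, dtype=float)
runoff = 100.0 + np.arange(20, dtype=float)
X, y = build_lagged_inputs(rain, runoff)
print(X.shape, y.shape)
```

With the default lags each pattern has 15 rainfall inputs and 3 runoff inputs, so the input layer of the network would have 18 nodes.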
ANN has been applied in streamflow estimation. Streamflows are often treated as
estimates of runoff from a watershed. Karunanithi et al. (1994) found lag time to be
important in predicting streamflows or runoffs. This reflects the longer memory
associated with streamflows. They claimed that ANN are likely to be more robust when
noisy data are present in the inputs. Markus et al. (1997) used ANN with the
back-propagation algorithm to predict monthly streamflows at the Del Norte gauging
station in the Rio Grande Basin in Southern Colorado. The inputs used were snow water
equivalent and temperature. They looked at forecast bias and root mean square error for
assessing model performance. They found that ANN did a good job of predicting
streamflows. The studies of Karunanithi et al. (1994) and Thirumalaiah and Deo (1998)
directed network training to better replicate low streamflow events, whereas Poff et al.
(1996) concentrated on high flow events to generate improved statistics for floods. They
used ANN to evaluate the changes in stream hydrograph from hypothetical climate
change scenarios based on precipitation and temperature changes. The synthetic daily
hydrograph was generated based on historic precipitation and temperature as inputs. They
presented a procedure developed using current technical literature, heuristics and the
experience of experts in artificial intelligence. They applied the back-propagation
algorithm to cases where information was scarce.
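The two performance measures mentioned above, forecast bias and root mean square error, can be computed as in the following sketch. The definition of bias used here (mean of predicted minus observed) is an assumption, since the cited study's exact formula is not reproduced in this review, and the sample values are hypothetical.

```python
import numpy as np

def forecast_bias(observed, predicted):
    """Mean error; positive values indicate over-prediction on average."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(predicted - observed))

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

# Hypothetical monthly streamflows.
obs = np.array([10.0, 12.0, 8.0, 15.0])
pred = np.array([11.0, 11.0, 9.0, 17.0])
bias = forecast_bias(obs, pred)
error = rmse(obs, pred)
print(bias, error)
```

Bias shows whether a model systematically over- or under-predicts, while RMSE penalises large individual errors, so the two are usually reported together.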
ANN has been applied in water quality modelling. Several studies have applied ANN to
address water quality related issues. Water quality is influenced by many factors such
as flow rate, contaminant load, water levels and other site-specific parameters. Maier
and Dandy (1996) investigated the utility of ANN for estimating salinity at Murray
Bridge on the River Murray in South Australia. The inputs to the ANN model were daily
salinity values, water levels, and flows at upstream stations and antecedent times.
Network output was the 14-day advance forecast of river salinity. They used two hidden
layers and back-propagation algorithms for training. They concluded that the ANN model
was able to replicate salinity levels fairly accurately based on the 14-day forecast.
Meanwhile, Starrett et al. (1996) employed an ANN to predict pesticide leaching through
turfgrass-covered soil. The input variables were pesticide solubility, rate of pesticide
application, time since pesticide application, and type of irrigation practice. The ANN
output was the percentage of pesticide that leached through 50 cm of turfgrass-covered
soil. They found that the application of the ANN model was successful. Nagy et al. (2002)
proposed a feed-forward multilayer neural network approach to sediment transport using
the back-propagation algorithm. The objective of their study was to estimate total
sediment discharge concentration in streams and natural rivers. Meanwhile, Zou et al.
(2002) proposed a neural network embedded Monte Carlo (NNMC) approach to account for
uncertainty in water quality modelling. The framework of their proposed method has three
major parts: a numerical water quality model, a neural network technique and Monte Carlo
simulation.
ANN has been applied in groundwater modelling. For example, Yang et al.
(1997) applied an ANN to predict water table elevations in subsurface drained farmlands.
The input nodes were daily rainfall, potential evapotranspiration, and previous water
table locations. The output was the current location of the water table. They found that
a three-layer feedforward ANN could predict water table elevations satisfactorily. Aziz
and Wong (1992) used ANN to determine aquifer parameter values from normalized
drawdown data obtained from pumping tests. The input layer consisted of confined
aquifer data and leaky confined aquifer data. The values of aquifer parameters
predicted by the ANN compared well with results using traditional methods.
Other ANN applications to ground water problems were investigated by
Ranjithan et al. (1993). They concluded that the pattern recognition strengths of ANN
are particularly useful for identifying the more critical realizations, so that the
problem of identifying optimal pumping strategies to control hydraulic gradient becomes
simplified. Rogers and Dowla (1994) employed an ANN to perform optimization studies in
ground water remediation. They investigated hypothetical scenarios of one or several
contaminant plumes moving through a ground water region with a number of pumping
wells. The optimization arises in trying to minimize the total volume of pumping. A
multilayer feedforward ANN was applied together with the back-propagation algorithm. The
input nodes were possible pumping cases, with each well being assigned a value of 1,
indicating that the well was pumping at maximum capacity, or zero, indicating that
the well was off. The ANN output represented whether or not the realization of pumping
was successful, with a value of 1 if successful and 0 if not. They found
that ANN models are robust and flexible tools that can be used for planning effective
strategies in ground water remediation.
ANN has been applied for estimating precipitation. Hydrologists often face the
problem of ungauged catchments, because precipitation information is very important in
water resources design and management. Precipitation serves as the driving force for
most hydrologic processes. It is difficult to predict because it exhibits a large degree
of spatial and temporal variability. French et al. (1992) used a three-layer feedforward
ANN with the back-propagation algorithm to forecast rainfall intensity fields. They also
studied the impact of different numbers of hidden nodes, using 15, 30, 45, 60, and 100
hidden nodes. They compared the ANN with other forecasting models. They found that the
ANN model performed slightly better than these models during the training and testing
stages after a suitable architecture had been identified.
Zhang et al. (1997) proposed that ANN need to be employed in groups when the
transformation from the input to the output space is complex. This group theory treats
the input-output relationship mapping as being piecewise continuous. They found that
ANN were successful in making half-hourly rainfall estimates. Meanwhile, Hsu et al.
(1998) developed a modified counter-propagation ANN for transforming satellite infrared
images to rainfall rates over a specified area. They trained separately the connection
weights between the input and hidden layers, and between the hidden and output layers.
The training algorithm used was back-propagation. They used an unsupervised
self-organizing clustering procedure based on the principle of competition. They
presented comparisons between observed and predicted rainfall rates. They found that ANN
provided a good estimation of rainfall and yielded some insights into the functional
relationships between the input variables and the rainfall rate.
2.7.1
Versatility of Neural Network Method
Very often in hydrology, the problems are not clearly understood or are too ill-defined
for a meaningful analysis using physically-based methods. Even when such
models are available, they have to rely on assumptions that make ANN seem more
attractive. Moreover, ANN routinely models the nonlinearity of the underlying process
without having to solve complex partial differential equations. Unlike regression-based
techniques, there is no need to make assumptions about the mathematical form of the
relationship between input and output. The presence of noise in the inputs and outputs is
handled by an ANN without severe loss of accuracy because of distributed processing
within the network. This, along with the nonlinear nature of the activation function,
enhances the generalizing capabilities of ANN and makes them desirable for a large class
of problems in hydrology. Markus (1997) noted that the versatility of neural
networks will allow them to be applied to an increasing range of hydrologic
problems in the future.
The neural network model, amongst the black box models, has gained wider
applicability, as the functional form between the input variables and the output is not
required to be defined a priori. ASCE (2000) outlined the following reasons why ANN
have become an attractive computational tool:
(i) They are able to recognize the relation between the input and output
variables without explicit physical consideration.
(ii) They work well even when the training sets contain noise and
measurement errors.
(iii) They are able to adapt to solutions over time to compensate for changing
circumstances.
(iv) They possess other inherent information-processing characteristics and
once trained are easy to use.
(v) They can evolve on their own a nonlinear relationship between input and output.
(vi) The model is highly flexible in structure.
Neural networks have a built-in capability to adapt their synaptic weights to changes in
the surrounding environment. In particular, a neural network trained to operate in a
specific environment can be easily retrained to deal with minor changes in the operating
environmental conditions.
Moreover, when it is operating in a non-stationary
environment, for example one whose statistics change with time, a neural network can be
designed to change its synaptic weights in real time.
2.8
Bivariate Linear Regression and Correlation in Hydrology
Correlation and regression procedures are widely used in hydrology, water
resources engineering and other sciences. Their use has been greatly enhanced in recent
decades by the availability of computers and an abundant number of computer programs
for various aspects of the analyses. The premise of the methods is that one variable is
often conditioned by the value of another, or of several others, or that the
distribution of one may be conditioned by the value of another. Regression analyses are
among the oldest statistical techniques used in hydrology. They were used primarily in
the past to transfer information between points at which the same variable was observed,
or among several variables observed simultaneously. This included the estimation of
missing data in hydrologic series, and the prediction of a variable from an observed
variable or several other variables. The explicit determination of the regression
equation is, in this sense, the final product of the analysis. The following briefly
outlines analyses by some researchers who used this technique for prediction.
Laursen (1958) proposed a relationship that gives both the quantity and quality of
total, suspended and bed loads as functions of stream and sediment characteristics. Colby
(1964) developed graphical solutions for total load based on laboratory and field data.
Meanwhile, Brownlie (1983) succeeded in obtaining an improved solution of the
one-dimensional equation that is a fairly simple regression equation. Kitamura and
Nakayama (1985) proposed a multiple linear regression model to describe the
relationships between monthly rainfalls and monthly net inflows for the Pedu-Muda
reservoir system. Karim and Kennedy (1990) derived a relation between flow velocities,
sediment discharge, bed-form geometry and friction factor of alluvial rivers using a
non-linear form of multiple regression analysis. Harun (1999) proposed multiple linear
regression for modelling stochastic reservoir net inflow processes.
Regression represents a mathematical equation expressing one random variable as
being correlatively related to another random variable, or to several random variables.
The regression equation may be any function that can be fitted to a set of points of
observed variables. The selection of the function to be fitted to points determines the
type and the degree of correlative association. Determining mathematical models of
correlative association of two or more variables, so that the best prediction of one
variable can be obtained from the other variable or variables, is referred to as regression
analysis, and the models are called regression functions.
The correlation and regression techniques provide a powerful means to identify
the mathematical dependence between observed values of physically related variables
and can account for the additional information contained in correlated sequences of
events. Sampling errors are reduced and the reliability of estimates is improved. In
addition to predicting the mean or expected value of a hydrologic variable such as
rainfall, runoff, or peak flows, the technique can be used to predict the expected value
of other statistical parameters, for example, standard deviation, skewness, or
autocorrelation. The correlation is determined between the desired statistical parameter
as dependent variable, and the appropriate physical and climatic variables within the
basin or region as the independent variables. The procedures are significantly better
than using relatively short historical sequences and point-frequency analysis. Not only
does the method reduce the inherently large sampling errors but it furnishes a means
to estimate parameters at ungauged locations.
There are limitations to the techniques of fitting regression. First, the analyst
assumes the form of the model that can express only linear, or logarithmically linear,
dependence. Second, the independent variables to be included in the regression analysis are selected. And, third, the theory assumes that the independent variables are
indeed independent and are observed or determined without error. Advanced statistical methods that are beyond the scope of this text offer means to overcome some of
these limitations but in practice it may be impossible to satisfy them. Therefore, care
must be exercised in selecting the model and in interpreting results.
Accidental or casual correlation may exist between variables that are not
functionally correlated. For this reason, correlation should be determined between
hydrologic variables only when a physical relation can be presumed. Because of the
natural dependence between many factors treated as independent variables in
hydrologic studies, the correlation between the dependent variable and each of the
independent variables is different from the relative effect of the same independent
variables when analyzed together in a multivariate model. One way to guard against
this effect is by screening the variables initially by graphical methods. Another is to
examine the results of the final regression equation to determine physical relevance.
Alternatively, regression techniques themselves aid in screening significant
variables. When electronic computation is available, a procedure can be followed in
which successive independent variables are added to the multiple regression model,
and the relative effect of each is judged by the increase in the multiple correlation
coefficient. Although statistical tests can be employed to judge significance, it is
useful otherwise to specify that any variable remain in the regression equation if it
contributes or explains, say, 1 or 5 percent of the total variance, or of $R^2$. A
frequently used method is to compute the partial correlation coefficients for each
variable. This statistic represents the relative decrease in the variance remaining
($1 - R^2$) by the addition of the variable in question.
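Read as a squared partial correlation, the "relative decrease in the variance remaining" can be written out explicitly. The notation below is an assumption introduced for illustration: $R_k^2$ is the squared multiple correlation of the model with $k$ variables, and $R_{k+1}^2$ is its value after the variable in question is added.

```latex
\[
r^2_{\text{partial}}
  = \frac{(1 - R_k^2) - (1 - R_{k+1}^2)}{1 - R_k^2}
  = \frac{R_{k+1}^2 - R_k^2}{1 - R_k^2}
\]
```

As a worked example with hypothetical values, if adding a variable raises $R^2$ from 0.80 to 0.85, the remaining variance falls from 0.20 to 0.15, a relative decrease of $0.05 / 0.20 = 0.25$.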
Most PC spreadsheet software packages have statistical routines for all the
analyses described here and many more. Most are extremely flexible, requiring
minimal instructions and input data other than raw data. Special manipulations can
effect an interchange of dependent and independent variables, bring one variable at a
time into the regression equation, rearrange the independent variables in order of
significance, and perform various statistical tests.
2.8.1
Fitting Regression Equations
Regression analysis has been used in various scientific and engineering
disciplines. There are many problems in hydrology that may be solved by multiple
regression procedures. These types of analysis have been used in flood studies, flow
projection, catchment modelling, etc. It was first applied to filling in missing data and
extending short records at one hydrologic station by relating the available data at this
station with those at adjacent stations. For example, the two variables x and y for their
population linear regressions give two equations,
\[ y = a_1 + b_1 x, \quad \text{and} \quad x = a_2 + b_2 y \tag{2.2} \]
where $b_1$ and $b_2$ are tangents of slope angles of the two regression lines, and $a_1$
and $a_2$ are their intercepts. The regression line y versus x is different from the regression line
x versus y . The variable on the right side of equation 2.2 is called the independent
variable, and on the left side it is called the dependent variable.
However, the
explanatory variables and the predictive variables as used by some hydrologists are better
terms than the independent variables and the dependent variable, as generally used by
statisticians. Equation 2.2 is used for predictive purposes, namely when one needs the
best value of y for a given value of x , or the converse. One is free to choose which
variable shall be taken as explanatory (independent) and predictive (dependent) in filling
in the missing data.
The fitting technique is the method of least squares, which
minimizes the sum of the residuals squared.
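The two regression lines of equation 2.2 can be illustrated with a short least-squares sketch. The helper name `fit_line` and the five data points are hypothetical; the point of the example is that the line of y on x and the line of x on y are different, and that the product of their slopes equals the squared correlation coefficient.

```python
import numpy as np

def fit_line(u, v):
    """Least-squares line v = a + b*u; returns (a, b)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    du = u - u.mean()
    b = float(du @ (v - v.mean()) / (du @ du))   # slope minimising squared residuals in v
    a = float(v.mean() - b * u.mean())           # line passes through the means
    return a, b

# Hypothetical paired observations.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

a1, b1 = fit_line(x, y)   # y = a1 + b1*x
a2, b2 = fit_line(y, x)   # x = a2 + b2*y
r = float(np.corrcoef(x, y)[0, 1])
print(a1, b1, a2, b2, r ** 2)
```

For these points the slope of y on x is 0.6, whereas inverting the x-on-y line gives a slope of 1.0, so the choice of which variable is treated as explanatory matters when filling in missing data.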
Multiple regression is used when one dependent variable and several
independent variables are available and it is desired to find a linear model for predicting
unobserved values of the dependent variable. The model that is developed does not
necessarily have to contain all of the independent variables. In selecting a multiple
regression model, several regressions on a given set of data are performed using
different combinations of the independent variables. The regression that ‘best fits’ the
data is then selected. A commonly used criterion for the ‘best fit’ is to select the equation
yielding the largest value of the squared correlation coefficient ($R^2$). One of the most
commonly used procedures for selecting the best regression equation is stepwise
regression. The stepwise regression technique will be employed for determining the number
of significant independent variables to be included in the model.
2.8.1.1 Stepwise regression
This procedure consists of building the regression equation one variable at a time
by adding at each step the variable that explains the largest amount of the remaining
unexplained variation. After each step all the variables in the equation are examined for
significance and discarded if they are no longer explaining a significant amount of the
variation. Thus the first variable added is the one with the highest simple correlation with
the dependent variable. The second variable added is the one explaining the largest
variation in the dependent variable that remains unexplained by the first variable added.
At this point the first variable is tested for significance and retained or discarded
depending on the results of this test. The third variable added is the one that explains the
largest portion of the variation that is not explained by the variables already in the
equation. The variables in the equation are then tested for significance. This procedure
is continued until all of the variables not in the equation are found to be insignificant and
all of the variables in the equation are significant. The real test of how good the resulting
regression model is depends on the ability of the model to predict the dependent variable
for observations on the independent variables that were not used in estimating the
regression coefficients. To make a comparison of this nature, it is necessary to divide the
data into two parts. One part of the data is then used to develop the model and the other
part to test the model.
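The procedure described above can be sketched as follows. This is a minimal illustration rather than the implementation used in this study: significance is judged with a partial F statistic against fixed thresholds (`f_in = 4.0` to enter, `f_out = 3.9` to stay), which are assumed values, and the function names and synthetic data are hypothetical. The data are split into one part for developing the model and another for testing it.

```python
import numpy as np

def design(X, cols):
    """Design matrix: an intercept column followed by the chosen predictors."""
    return np.column_stack([np.ones(X.shape[0])] + [X[:, c] for c in cols])

def fit(X, y, cols):
    coef, *_ = np.linalg.lstsq(design(X, cols), y, rcond=None)
    return coef

def rss(X, y, cols):
    """Residual (unexplained) sum of squares of the least-squares fit."""
    resid = y - design(X, cols) @ fit(X, y, cols)
    return float(resid @ resid)

def partial_f(X, y, cols, c):
    """Partial F statistic of variable c given the other variables in cols."""
    reduced = [k for k in cols if k != c]
    rss_full = rss(X, y, cols)
    dof = len(y) - len(cols) - 1
    return (rss(X, y, reduced) - rss_full) / (rss_full / dof)

def stepwise(X, y, f_in=4.0, f_out=3.9, max_steps=50):
    selected = []
    for _ in range(max_steps):
        candidates = [c for c in range(X.shape[1]) if c not in selected]
        if not candidates:
            break
        # Add the variable explaining most of the remaining unexplained variation.
        best = min(candidates, key=lambda c: rss(X, y, selected + [c]))
        if partial_f(X, y, selected + [best], best) < f_in:
            break   # no remaining variable is significant
        selected.append(best)
        # Re-test the variables already in the equation and discard weak ones.
        while len(selected) > 1:
            f_vals = {c: partial_f(X, y, selected, c) for c in selected}
            worst = min(f_vals, key=f_vals.get)
            if f_vals[worst] >= f_out:
                break
            selected.remove(worst)
    return selected

# Synthetic data (hypothetical): only columns 0 and 2 influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.1, size=80)
X_dev, y_dev = X[:60], y[:60]      # part used to develop the model
X_test, y_test = X[60:], y[60:]    # part held back to test the model

selected = stepwise(X_dev, y_dev)
coef = fit(X_dev, y_dev, selected)
test_rmse = float(np.sqrt(np.mean((y_test - design(X_test, selected) @ coef) ** 2)))
print(sorted(selected), test_rmse)
```

Choosing the candidate with the smallest residual sum of squares at the first step is equivalent to choosing the variable with the highest simple correlation with the dependent variable.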
2.9
Review on HEC-HMS Model
The HEC-HMS program was developed at the Hydrologic Engineering Centre
(HEC) of the US Army Corps of Engineers. It utilizes a graphical user interface to build
a watershed model and to set up the precipitation and control variables for simulation.
HEC-HMS is considered the standard model in the private sector in the United States for
the design of drainage systems, quantifying the effect of land-use change on flooding, etc.
(Singh and Woolhiser, 2002). The HEC-HMS program simulates rainfall-runoff and
routing processes, both natural and controlled. By referring to the HEC-HMS Technical
Reference Manual (2000), for the rainfall-runoff simulation, HEC-HMS provides the
following components:
(i) Rainfall specification options which can describe an observed (historical)
rainfall event, a frequency-based hypothetical rainfall event, or an event
that represents the upper limit of rainfall possible at a given location.
(ii) Loss models which can estimate the volume of runoff, given the rainfall
and properties of the watershed.
(iii) Direct runoff models that can account for overland flow, storage and
energy losses as water runs off a watershed and into the stream channels.
(iv) Hydrologic routing models that account for storage and energy flux as
water moves through stream channels.
(v) Models of naturally occurring confluences and bifurcations.
(vi) Models of water-control measures, including diversions and storage
facilities.
(vii) A distributed runoff model for use with distributed precipitation data, such
as the data available from weather radar.
(viii) A continuous soil moisture accounting model used to simulate the
long-term response of a watershed to wetting and drying.
HEC-HMS is a deterministic model, where all the input, parameters, and
processes in a model are considered free of random variation and known with certainty.
HEC-HMS models can be used to model event or continuous rainfall-runoff
processes. An event model simulates a single storm. The duration of the storm may
range from a few hours to a few days. HEC-HMS is a measured-parameter and
fitted-parameter model. A measured-parameter model is one in which model parameters can be
determined from system properties, either by direct measurement or by indirect methods
that are based upon the measurements. A fitted-parameter model, on the other hand,
includes parameters that cannot be measured. Instead, the parameters must be found by
fitting the model with observed values of the input and the output.
A continuous model simulates a longer period, predicting watershed response
both during and between rainfall events. Most of the models included in HEC-HMS are
event models. HEC-HMS is a lumped model, in which spatial (geographic) variations of
characteristics and processes are averaged or ignored. HEC-HMS includes both
empirical and conceptual models. A conceptual model is built upon a base of knowledge
of the pertinent physical, chemical, and biological processes that act on the input to
produce the output. An empirical model, on the other hand, is built upon observation of
input and output, without seeking to represent explicitly the process of conversion. The
model is fitted with observed rainfall and runoff, and it is based upon fundamental
principles of surface flow.
HEC-HMS uses a separate model to represent each component of the runoff
process, including models that compute runoff volume; models of direct runoff; models
of base flow; and models of channel flow. Figure 2.9 is a systems diagram of the
watershed runoff process, at a scale that is consistent with the scale modelled well with
HEC-HMS. The processes illustrated begin with rainfall. The current HEC-HMS is limited
to analysis of runoff from rainfall. In the simple conceptualisation shown, the rainfall can
fall on the watershed’s vegetation, land surface, and water bodies (streams and lakes). In
the natural hydrologic system, much of the water that falls as rainfall returns to the
atmosphere through evaporation from vegetation, land surfaces, and water bodies and
through transpiration from vegetation.
During a storm event, this evaporation and
transpiration is limited. Water that does not pond or infiltrate moves by overland flow to
a stream channel. The stream channel is the combination point for the overland flow, the
rainfall that falls directly on water bodies in the watershed, and the interflow and base
flow. The resultant streamflow is the total watershed outflow.
The appropriate representation of the system depends upon the information needs
of a hydrologic engineering study. For some analyses, a detailed accounting of the
movement and storage of water through all components of the system is required. For
example, to estimate changes due to watershed land use changes, it may be appropriate to
use a long record of rainfall to construct a corresponding long record of runoff, which can
be statistically analyzed. For other analyses, such a detailed accounting is not necessary;
instead, the model need only compute and report the peak, the volume, or the hydrograph
of watershed runoff. In these cases, the HEC-HMS view of the hydrologic process can be
somewhat simpler: only those components necessary to
predict runoff are represented in detail, and the other components are omitted or lumped.
HEC-HMS includes models of infiltration from the land surface, but it does not model
storage and movement of water vertically within the soil layer. It implicitly combines the
near surface flow and overland flow and models this as direct runoff. It does not include
a detailed model of interflow or flow in the groundwater aquifer, instead representing
only the combined outflow as base flow.
Figure 2.9 Typical HEC-HMS representation of watershed runoff (components shown: precipitation, evapotranspiration, land surface, water body, soil, stream channel, groundwater aquifer, watershed discharge)
HEC-HMS considers that all land and water in a watershed can be categorized as
either directly-connected impervious surface or pervious surface. Directly-connected
impervious surface in a watershed is that portion of the watershed for which all
contributing precipitation runs off, with no infiltration, evaporation, or other volume losses.
HEC-HMS computes runoff volume by computing the volume of water that is
intercepted, infiltrated, stored, evaporated, or transpired and subtracting it from the
precipitation. Singh and Woolhiser (2002) concluded that most models perform little to
no error analysis. Thus, it is not clear what the model errors are and how different errors
propagate through different model components and parameters. This is one of the major
limitations of most current watershed hydrology models. Thus, from the standpoint of a
user, it is not clear how reliable a particular model is.
Anderson et al. (2002) employed the HEC-HMS model for runoff prediction for the
Calaveras river basin. The calibration period for the HEC-HMS model of the Calaveras
basin was a 48-hour period from February 8 to 9, 1999. Meanwhile, a 48-hour forecast
period from January 19 to 21, 1999 was selected. The authors found that the HEC-HMS
model with distributed precipitation is necessary for forecasts of reservoir inflows.
HEC-HMS can transform spatial representations of rainfall data into runoff at the
subwatershed outlet. Furthermore, future work is required in order to provide
quantitative measures for this type of forecast accuracy, in terms of matching the timing
and magnitude of the peak inflow and total volume of runoff. Kavvas and Chen (1998)
employed the HEC-HMS model as a meteorologic model interface. The authors reported that
several options exist for the parameterization of moist convection and boundary
layer processes for the simulation of atmospheric phenomena at different scales and
different characteristics. Woolhiser and Goodrich (1988) investigated the importance of
time varying rainfall in a model of a small watershed and found that disaggregating total
rainfall amounts into simple, constant, and triangular distributions caused significant
distortion in the peak rate distributions for Hortonian runoff.
Mazion and Yen (1994) investigated the effect of computational spatial size on
watershed runoff simulated by HEC-HMS, RORB and a linear system. They found that
the computational grid size had a significant effect on the model results if the physical
scale was not finer, although the effects decreased with increasing rainfall duration. Wu
et al. (1982) examined the effects of spatial variability of roughness on runoff
hydrographs from an experimental watershed facility and found that under certain
conditions an equivalent uniform roughness could be used for a watershed with nonuniform roughness. Sargent (1981, 1982) determined the effects of storm direction and
speed on runoff peak, flood volume, and hydrograph shape. Surkan (1974) observed that
peak flow rates and average flow rates were most sensitive to changes in the direction
and speed of the rainstorms.
Meanwhile, Singh (1998) evaluated the effect of the
direction of storm movement on planar flow and showed that the direction of storm
movement exercised a significant influence on the peak flow, time to the peak flow, and
the shape of the overland flow hydrograph. Kasmin (2003) applied the HEC-HMS model
to examine the effects of selective logging on stormflow parameters at Berembun
catchment, Negeri Sembilan. HEC-HMS was calibrated and validated against the observed
data in order to predict stormflow parameters. The author reported that the stormflow
volume could not be satisfactorily predicted, especially after logging activities. This may be due
to larger storms or peak flows that increased with storm duration.
2.10 Reviews on XP-SWMM Model
Development of new models or improvement of previously developed models
continues today. The power of computers has increased exponentially and, as a result,
advances in hydrology have occurred at an unprecedented pace during the past 35 years.
During the decades of the 1970s and 1980s, a number of mathematical models were
developed not only for simulation of watershed hydrology but also for their applications
in other areas, such as environmental and ecological system management (Singh and
Woolhiser, 2002).
The XP-SWMM is an interactive runoff and streamflow routing program
developed by the US Environmental Protection Agency (XP-SWMM users manual, 2000). It
provides a comprehensive environment to design urban drainage systems utilizing
sophisticated graphical tools together with associated Geographical Information Systems
(GIS) and Computer Aided Design (CAD). It includes Australian, US and South African
storm patterns. Indeed, there has been a proliferation of watershed hydrology models with
emphasis on physically based models. One example of such watershed hydrology models
is SWMM (Metcalf and Eddy, 1971). This model has since been significantly
improved: it has been extended to contain increased catchment
information, more physically based processes and improved parameter estimation.
It calculates the catchment losses and streamflow hydrographs resulting from
rainfall events and/or other forms of inflow to channel networks. It is mainly used for
flood estimation, flood routing and hydraulic structure design. In flood estimation
applications, the program may be used on urban, partly rural, or partly urban
catchments. It is mostly used for design flood forecasting and prediction. In flood routing
applications, single and multiple reaches, networks of streams, and lateral inflow and
outflow can be modelled. The model is areally distributed, nonlinear, and based upon a
storage routing procedure. The program provides both continuous and event-type
modelling procedures. The rainfall is operated on by a loss model to produce rainfall
excess, and the rainfall excess is operated on by a catchment storage model representing
the effects of overland flow storage and channel storage to produce the surface runoff
hydrograph.
The model serves three functions: calibration, testing and design. The
sequence of operations used to model a particular catchment or stream is defined by a
series of numerical codes. The data relevant to each code is stored with that code in a
data file. If rainfall and runoff data are available for an event, the program may be used
to calibrate the model through an interactive, trial and error fitting procedure and also to
provide some testing of the model. At the conclusion of each run there is an option to
change the parameters and run again with new values without re-reading and checking
the data and with no unnecessary re-computation or output. A re-run may also be made
with a new data file or the same data file, with or without changing the parameters or output.
A data file must first be prepared in order to run the program.
Zakaria et al. (2003) employed the XP-SWMM model to study the behaviour of
the Bio-Ecological Drainage System (BIOECODS). The simulation emphasized the
impact of minor flood events on the drainage system. They found that the
features and characteristics of BIOECODS were
satisfactorily represented in the XP-SWMM model. The results generated from the
XP-SWMM modelling confirmed that the BIOECODS, which consists of storage, flow
retarding and infiltration engineering, is capable of attenuating flood discharge and
managing stormwater at source. Using a length scale based on surface characteristics and
excess rainfall duration, Julien and Moglen (1990) found that the influence of spatial
variability of slope, roughness, width, and excess rainfall intensity on watershed runoff
varied with the length scale. By using the unit hydrograph model, Hromadka et al.
(1988) found on 12 watersheds that the variance of model-simulated discharge decreased
significantly with the level of discretization, but this decrease reflected a departure of the
model results from the true watershed behavior. El-Kady (1989) reviewed numerous
watershed models and concluded that the surface water-groundwater linkage needed
improvement, while ensuring an integrated treatment of the complexity and scale of
individual component processes.
2.11 Summary of Literature Review
Numerous applications of commercially available hydrologic models (HEC-HMS,
XP-SWMM, RORB, etc.) can be found in several reports and in the literature. Most of them
are successful. However, the application of models such as HEC-HMS and
XP-SWMM has some constraints. One of the main problems of optimization
methods is the difficulty of finding a unique ‘best’ parameter set. Another difficulty is
the inadequacy of these methods for multi input-output hydrology models (Gupta and
Sorooshian, 1994). The calibration process requires more data in order to yield a well-fitted
calibrated model. The rainfall and flow records are normally available. Other data, such
as evapotranspiration, land use, soil characteristics, etc., are sometimes not available, and this
possibly reduces the degree of accuracy and reliability of the calibration processes in
commercially available rainfall-runoff models.
In the literature review, the neural networks methodology has been reported to
provide reasonably good solutions for circumstances having complex systems that may
be poorly defined or understood using mathematical equations, problems that deal with
noise or involve pattern recognition, and input data that are incomplete and ambiguous by
nature. Neural networks can identify and learn correlative patterns between sets of input
data and corresponding target values. Once trained, such nets can be used predictively to
forecast outcomes from new input data. Because of these characteristics, it was believed
that neural networks could be applied to model rainfall-runoff relationships.
It is apparent that an ANN method derives its computing power through its
massively parallel distributed structure and its ability to learn and therefore generalize,
that is, to produce reasonable outputs for inputs not encountered during training (learning).
These two information processing capabilities make it possible for neural networks to
solve complex or large-scale problems that are currently intractable. In practice,
however, neural networks cannot provide the solution by working alone.
Rather, they need to be integrated into a consistent system engineering approach.
Generally, a neural network system must be capable of doing three things: (1) store
knowledge; (2) apply the stored knowledge to solve problems; and (3) acquire new
knowledge through experience.
Mathematically, an ANN may be treated as a universal approximator. The ability
to learn and generalize knowledge from sufficient data pairs makes it possible for an ANN
to solve large-scale complex problems such as pattern recognition, non-linear modelling,
classification, prediction, forecasting, control, and others, which find
application in hydrology today. Nowadays, ANNs have found increasing use in diverse
disciplines ranging over perhaps all branches of engineering and science. In this study,
the writer has placed emphasis on neural network methods.
CHAPTER 3
RESEARCH METHODOLOGY
3.1 Introduction
This research evaluates the applicability of neural networks to runoff prediction
and data generation by comparing neural networks with a variety of other methods.
To accomplish this task, the models are tested on basins of various sizes and over various
time periods. The ultimate aim of this research is to determine the performance of
rainfall-runoff models based on the artificial neural network (ANN) method, using
multilayer perceptron (MLP) and radial basis function (RBF) models. The study was
accomplished through quantitative analysis of the relationship between rainfall and
river discharge at hourly and daily intervals. Further, the performance of the ANN models
was compared against multiple linear regression (MLR), HEC-HMS and
XP-SWMM models. To address the above-mentioned objectives, the following study
approach was planned. The historical records of rainfall and flow data of several
catchments were obtained from the Department of Irrigation and Drainage (DID) and used in the
study. The rainfall-runoff models were developed based on the relationship between
rainfall and flow. For the HEC-HMS and XP-SWMM models, additional parameters
were added to calibrate the rainfall-runoff model. Those parameters are the
imperviousness, loss rate, time of concentration and lag time. Both models are considered
among the most popular and well-known software packages on the market.
This chapter will provide a description of the research methodology and the
selected study sites. The proposed methodology consists of the MLP and RBF modelling
approaches, MLR method, HEC-HMS and XP-SWMM commercial rainfall-runoff
packages. To develop the ANN model, the MATLAB computer package (The Math
Works Inc., 2000) was utilized. Several programs had to be written in the
MATLAB environment in order to model the rainfall-runoff relationship.
3.2 Multilayer Perceptron (MLP) Model
The MLP is a supervised, feedforward neural network with one or more layers
of nodes between the input and output nodes. It is one of the most commonly used neural computing
techniques. Each node, called a neuron, is the basic element of a neural network. The
proposed multilayer perceptron model consists of an input layer linked to the input
rainfall variables $x_i$, a hidden layer, and an output layer that connects to the output
variables $y^{(k)}$. Figure 3.1 illustrates the architecture of the proposed rainfall-runoff
prediction model. The decisions that affect the performance of the multilayer perceptron
models during training include the number of input nodes, the number of hidden nodes,
learning rate, momentum constant and the transfer function. Input layer constitutes the
input nodes or neurons. The number of input nodes in the input layer should be selected
carefully in order to construct a good neural network model. The accuracy of model
depends on the selection of input nodes derived from the characteristics of data series.
As mentioned before, the rainfall-runoff process is very complex, highly nonlinear,
time varying, and stochastic in behaviour. Hence, the application of
neural network methods may be able to describe this process accurately. In
rainfall-runoff modelling, the input nodes constitute the rainfall series and the output node
consists of the runoff data. The target value of the neural network computation is the runoff.
The data are divided into two subsets. The first comprises the training and validation data,
which together form the model construction stage: the training data set is the data which the
neural network uses to learn the solution to the problem, and the validation data set is used
to monitor model performance during training and to establish when to stop the algorithm,
that is, to choose the best solution. The second is the test data set, which is used to
measure the model performance.
Figure 3.1 Structure of a MLP Rainfall-Runoff model with one hidden layer (input layer $x_i$, $i = 1, \ldots, m$; hidden layer $y_j$, $j = 1, \ldots, n$, with weights $w_{ij}$; output layer $y^{(k)}$ with weights $c_j^{(k)}$)
After every training iteration the validation
data set is passed through the network, and the error over the data set is calculated. The
best set of weights is defined as those that produce the lowest error over the validation
data set. Meanwhile, the test data set is used in the stage of applying the model for forecasting and
performance testing.
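The validation procedure described above can be sketched as follows. This is only a minimal illustration, not the MATLAB code used in the study: the function `train_with_validation`, the one-weight model $q = w\,p$, the toy data and the learning rate are all hypothetical and chosen purely so that the example is self-contained.

```python
import copy

def train_with_validation(weights, train_step, validation_error, max_iterations):
    """Train iteratively and keep the weights with the lowest validation error."""
    best_weights = copy.deepcopy(weights)
    best_error = validation_error(weights)
    history = [best_error]
    for _ in range(max_iterations):
        weights = train_step(weights)        # one training iteration
        error = validation_error(weights)    # pass the validation set through the model
        history.append(error)
        if error < best_error:               # remember the best set of weights so far
            best_error = error
            best_weights = copy.deepcopy(weights)
    return best_weights, best_error, history

# Hypothetical toy data for a one-weight model q = w * p (not data from the study).
training_set = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
validation_set = [(1.5, 3.0), (2.5, 5.0)]

def mean_squared_error(weights, data):
    return sum((weights[0] * p - q) ** 2 for p, q in data) / len(data)

def gradient_step(weights, rate=0.05):
    """One gradient-descent step on the training-set mean squared error."""
    grad = sum(2.0 * (weights[0] * p - q) * p for p, q in training_set) / len(training_set)
    return [weights[0] - rate * grad]

best_weights, best_error, history = train_with_validation(
    [0.0], gradient_step, lambda w: mean_squared_error(w, validation_set), 50)
```

In this sketch the weights returned are those of the iteration with the lowest validation error, which need not be the final iteration; the test data set would only be used afterwards, on `best_weights`.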
The present study employs the supervised training neural network models. Once
the neural network models have been created, their suitability for the application needs to
be investigated. This task involves training the models and then testing the performance
of the neural system. Before the training process begins, the data are first normalized. Data
normalization ensures that each input contributes equally to the decision or prediction
made by the network. The normalization technique used is known as zero-mean, unit
standard deviation normalization. The mean and the standard deviation are determined
for each field. Each field is then normalized such that the mean value for the field
becomes zero and the values at plus and minus one standard deviation are mapped onto
plus and minus one.
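As an illustration of this normalization, the sketch below computes the mean and standard deviation of one field and maps the field to zero mean and unit standard deviation. The function names and the rainfall values are hypothetical, and the population form of the standard deviation (division by the number of values) is an assumption, since the text does not state which form was used.

```python
import math

def zscore_fit(values):
    """Return (mean, std) of one data field; std is the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        raise ValueError("field is constant; it cannot be scaled to unit standard deviation")
    return mean, std

def zscore_apply(values, mean, std):
    """Map a field to zero mean and unit standard deviation."""
    return [(v - mean) / std for v in values]

def zscore_invert(normalized, mean, std):
    """Map normalized values (e.g. network outputs) back to the original units."""
    return [z * std + mean for z in normalized]

# Hypothetical hourly rainfall depths in mm (not data from the study).
rainfall_mm = [0.0, 2.5, 10.0, 4.0, 0.5, 7.0]
mean, std = zscore_fit(rainfall_mm)
normalized = zscore_apply(rainfall_mm, mean, std)
restored = zscore_invert(normalized, mean, std)
```

The mean and standard deviation obtained from the training data would normally be reused, unchanged, to normalize the validation and test data.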
The training or learning phase is critical to the success of the neural networks. In
this study, back-propagation algorithm is used to correct errors. Back-propagation is a
numerically intensive technique, and there are many different ways to perform backpropagation to teach the neural nets how to respond. Any back-propagation network is
based on a supervised learning technique that compares the actual output from output
units to the target or specified output and then readjusts the weights backward in the
network (Tingsanchali, 2000).
In order to improve the usefulness of the steepest descent method, two parameters
can be altered: the momentum coefficient and the learning coefficient (Harun, 1999).
The learning rate is used to control the amount of weight adjustment at each step of training.
The smaller the learning rate parameter $\alpha$, the smaller the changes to the synaptic
weights in the network from one iteration to the next and the smoother the
trajectory in weight space. However, with a smaller learning rate the network takes longer to reach
the minimum. If, on the other hand, the learning rate parameter $\alpha$ is made too large so
as to speed up the rate of learning, the resulting large changes in the synaptic weights
may make the network unstable. In standard
backpropagation, the learning rate may be constant (Fausett, 1994). There is no hard and
fast rule about what value the learning coefficient should have. A simple method of
increasing the rate of learning while avoiding the danger of instability is to include a
momentum term, $\mu$. The momentum term is usually a positive number. The
incorporation of momentum in the backpropagation algorithm represents a minor
modification to the weight update. The momentum term has the benefit of preventing
the learning process from terminating in a shallow local minimum on the error
surface.
Basically, the proposed neural network model consists of an input layer linked to
the input rainfall variables $x_i$, a hidden layer, and an output layer that connects to the
output runoff variable, $y$. The input layer receives the rainfall data and the output layer
produces the runoff data. The information in the input layer is transferred to the next
consecutive layers in the feedforward network. The selected activation
function processes the signal sent by the input data as it passes through each node.
Associated with each incoming input signal is a weight. Each input node
($i = 1, \ldots, m$) in the input layer broadcasts the input signal to the hidden layer. Each hidden
node ($j = 1, \ldots, n$) sums its weighted input signals,

$$y\_in_j = w_{0j} + \sum_{i=1}^{m} x_i w_{ij} \tag{3.1}$$

applies its activation function to compute its output signal from rainfall,

$$y_j = f(y\_in_j) \tag{3.2}$$

and sends this signal to all units in the output layer, where $w_{ij}$ is the weight between the
input layer and hidden layer, $w_{0j}$ is the weight for the bias, and $x_i$ is the input rainfall
signal.
In this study, the activation function used is the hyperbolic-tangent sigmoid (tansig)
function, as shown in Figure 3.2. For the hyperbolic-tangent function,

$$f(t) = \frac{2}{1 + e^{-2t}} - 1 \tag{3.3}$$

Figure 3.2 Hyperbolic-tangent (tansig) activation function (transfer function $f(t)$ plotted against its input, with output ranging from -1 to +1)

The hyperbolic-tangent activation function processes the signal that passes from each
node:

$$f(y\_in_j) = \frac{2}{1 + e^{-2\,y\_in_j}} - 1 \tag{3.4}$$
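The form of Equation 3.3 is algebraically identical to the hyperbolic tangent, $f(t) = \tanh(t)$, and its derivative is $f'(t) = 1 - [f(t)]^2$. The short sketch below is an illustration only; the function names `tansig` and `tansig_derivative` are chosen here for readability and simply evaluate Equation 3.3 and this derivative.

```python
import math

def tansig(t):
    """Hyperbolic-tangent sigmoid of Equation 3.3: f(t) = 2 / (1 + exp(-2t)) - 1."""
    # Note: math.exp overflows for very large negative t; math.tanh(t) avoids this.
    return 2.0 / (1.0 + math.exp(-2.0 * t)) - 1.0

def tansig_derivative(t):
    """Derivative of the hyperbolic tangent: f'(t) = 1 - f(t)^2."""
    f = tansig(t)
    return 1.0 - f * f

sample_points = [-3.0, -0.5, 0.0, 0.5, 3.0]
values = [tansig(t) for t in sample_points]
slopes = [tansig_derivative(t) for t in sample_points]
```

The outputs lie strictly between -1 and +1 and the slope is largest (equal to 1) at the origin, which is why inputs are normalized before training.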
Then the signal is transmitted from the second layer to the third layer. The output unit ($k = 1$)
sums its weighted input signals,

$$x\_in_k = c_0^{(k)} + \sum_{j=1}^{n} y_j c_j^{(k)} \tag{3.5}$$

and applies its activation function to compute its output signal,

$$y^{(k)} = f(x\_in_k) \tag{3.6}$$

where $c_j^{(k)}$ is the weight between the second layer and third layer, and $c_0^{(k)}$ is the weight for
the bias. The output node ($k = 1$) receives a target pattern corresponding to the input
training pattern, computes its error information term,

$$\delta_k = (t_k - y^{(k)}) f'(x\_in_k) \tag{3.7}$$

calculates its weight correction term (used to update $c_j^{(k)}$ later),

$$\Delta c_j^{(k)} = \alpha \delta_k y_j \tag{3.8}$$

and calculates its bias correction term (used to update $c_0^{(k)}$ later),

$$\Delta c_0^{(k)} = \alpha \delta_k \tag{3.9}$$
where $\alpha$ is the learning rate; $t_k$ is the target neural network output; $y^{(k)}$ is the neural
network output as net inflow variable; and, for the hyperbolic-tangent function of Equation 3.3,
$f'(x) = 1 - [f(x)]^2$. The error information
is transferred from the output layer back to the earlier layers. This is known as the
backpropagation of the output error to the input nodes to correct the weights. This
method uses partial derivatives of the error with respect to the weights to update the weights of
the connections. Each hidden unit ($j = 1, \ldots, n$) sums its delta inputs (from the $p$ units in the
layer above; here $p = 1$),

$$\delta\_in_j = \sum_{k=1}^{p} \delta_k c_j^{(k)} \tag{3.10}$$

multiplies by the derivative of its activation function to calculate its error information
term,

$$\delta_j = \delta\_in_j f'(y\_in_j) \tag{3.11}$$

calculates its weight correction term (used to update $w_{ij}$ later),

$$\Delta w_{ij} = \alpha \delta_j x_i \tag{3.12}$$

and calculates its bias correction term (used to update $w_{0j}$ later),

$$\Delta w_{0j} = \alpha \delta_j \tag{3.13}$$
The output unit ($k = 1$) updates its bias and weights ($j = 0, \ldots, n$),

$$c_j^{(k)}(\text{new}) = c_j^{(k)}(\text{old}) + \Delta c_j^{(k)} \tag{3.14}$$

and each hidden unit ($j = 1, \ldots, n$) updates its bias and weights ($i = 0, \ldots, m$),

$$w_{ij}(\text{new}) = w_{ij}(\text{old}) + \Delta w_{ij} \tag{3.15}$$

The weight update formulas for backpropagation with momentum are,

$$\Delta c_j^{(k)}(t+1) = \alpha \delta_k y_j + \mu \Delta c_j^{(k)}(t) \tag{3.16}$$

and,

$$\Delta w_{ij}(t+1) = \alpha \delta_j x_i + \mu \Delta w_{ij}(t) \tag{3.17}$$

The process is terminated when the difference between the target and the network output
reaches a specified value. The
training phase needs to produce an ANN that is both stable and convergent, so that it produces
accurate input-output relations. After this, the network can be tested using data that were not
presented during learning.
In general, the procedure of the backpropagation algorithm can be
summarized as follows:
(i) Obtain a set of training patterns.
(ii) Set up the neural network model (number of input neurons, hidden neurons, and
output neurons).
(iii) Set the model parameters (learning rate $\alpha$, momentum term $\mu$, etc.).
(iv) Initialize all connection weights $w_{ij}$ and biases $b$ to random values.
(v) Set the minimum error, $E_{min}$.
(vi) Start training by applying the inputs and desired outputs, propagate through
the layers, then calculate the total error.
(vii) Backpropagate the error through the output and hidden layers and adapt the weights.
(viii) Backpropagate the error through the hidden and input layers and adapt the weights.
(ix) Check whether error < $E_{min}$. If not, repeat steps (vi) to (ix). If yes, stop training.
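The steps above can be sketched in a few lines of code. The sketch below is a minimal illustration under stated assumptions and is not the MATLAB implementation used in the study: it assumes one hidden layer, a single output unit, the hyperbolic-tangent function at both the hidden and output nodes (so that $f' = 1 - f^2$), pattern-by-pattern updating, and hypothetical values for the learning rate, momentum term, minimum error and training patterns. The function name `train_mlp` is likewise hypothetical.

```python
import math
import random

def train_mlp(patterns, n_hidden, alpha=0.1, mu=0.8, e_min=1e-3, max_epochs=2000, seed=0):
    """One-hidden-layer MLP trained by backpropagation with momentum (Eqs. 3.1 to 3.17)."""
    rng = random.Random(seed)
    m = len(patterns[0][0])
    # w[j][0] is the bias w_0j and w[j][i + 1] is w_ij; c[0] is the bias c_0 and c[j + 1] is c_j.
    w = [[rng.uniform(-0.5, 0.5) for _ in range(m + 1)] for _ in range(n_hidden)]
    c = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    dw_prev = [[0.0] * (m + 1) for _ in range(n_hidden)]
    dc_prev = [0.0] * (n_hidden + 1)
    errors = []
    for _ in range(max_epochs):
        total_error = 0.0
        for x, target in patterns:
            # Forward pass: Eqs. 3.1, 3.2, 3.5 and 3.6.
            y_in = [w[j][0] + sum(w[j][i + 1] * x[i] for i in range(m)) for j in range(n_hidden)]
            y_hid = [math.tanh(v) for v in y_in]
            x_in = c[0] + sum(c[j + 1] * y_hid[j] for j in range(n_hidden))
            y_out = math.tanh(x_in)
            total_error += (target - y_out) ** 2
            # Error information terms: Eqs. 3.7, 3.10 and 3.11 (computed before any update).
            delta_k = (target - y_out) * (1.0 - y_out ** 2)
            delta_j = [delta_k * c[j + 1] * (1.0 - y_hid[j] ** 2) for j in range(n_hidden)]
            # Output-layer update with momentum: Eqs. 3.14 and 3.16.
            for j in range(n_hidden + 1):
                signal = 1.0 if j == 0 else y_hid[j - 1]
                dc = alpha * delta_k * signal + mu * dc_prev[j]
                c[j] += dc
                dc_prev[j] = dc
            # Hidden-layer update with momentum: Eqs. 3.15 and 3.17.
            for j in range(n_hidden):
                for i in range(m + 1):
                    signal = 1.0 if i == 0 else x[i - 1]
                    dw = alpha * delta_j[j] * signal + mu * dw_prev[j][i]
                    w[j][i] += dw
                    dw_prev[j][i] = dw
        errors.append(total_error)
        if total_error < e_min:   # step (ix): stop once the total error is below E_min
            break
    return w, c, errors

# Hypothetical normalized patterns: two lagged rainfall inputs and one runoff target.
patterns = [([-1.0, -1.0], -0.8), ([-1.0, 1.0], -0.2), ([1.0, -1.0], 0.2),
            ([1.0, 1.0], 0.8), ([0.0, 0.0], 0.0), ([0.5, -0.5], 0.1)]
w, c, errors = train_mlp(patterns, n_hidden=3)
```

The list `errors` holds the total squared error of each epoch, so plotting it gives the training curve; a linear output node, as mentioned later in the text, would replace `math.tanh(x_in)` by `x_in` and the factor `(1.0 - y_out ** 2)` by 1.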
Tokar and Johnson (1999) reported that the most commonly used activation
function in the backpropagation network is the hyperbolic-tangent (tansig) function. It
is a continuous transfer function that enables the modification of the network
weights. This function is nonlinear and differentiable, is antisymmetric with
respect to the origin, and has an output amplitude that lies between –1 and +1.
The transfer function also introduces a nonlinearity that further enhances the network’s
ability to model complex functions. Its functional form determines the nonlinear
response of a node to the total input signal it receives to produce a continuous value.
Normally, a linear activation function is chosen for the output layer. The hyperbolic tangent
and sigmoid are often employed as transfer functions in the training of networks (Tokar
and Johnson, 1999). One of the basic requirements of backpropagation training is
that the transfer function be continuous and differentiable.
3.2.1 Training of ANN
There are two approaches to training an ANN, namely supervised
learning and unsupervised learning. Supervised learning is the most common type of
learning used in ANNs. Adapting the values of the weights and thresholds by presenting
the input and output data is known as learning or training. Training is the actual process
of adjusting weight factors based on trial and error. Supervised training requires target
patterns or signals to guide the training process. The objective is to find the weights that
minimize the error between the target and actual output. To train the ANN, the weight
factors are adjusted until the calculated output pattern based on the given input matches
the desired output, that is, until the network
outputs are equal or close to the targets. This training procedure
involves the iterative adjustment and optimization of connection weights and threshold
values for each node in the network. Tsoukalas and Uhrig (1997) reported that
an estimated 80% of all applications utilize the backpropagation algorithm in one
form or another. It is a gradient (derivative) technique that is simple to compute locally,
and it performs stochastic gradient descent in weight space (for pattern-by-pattern
updating of synaptic weights).
In the current study, the Levenberg-Marquardt (LM) algorithm is used. The LM
algorithm is an approximation to Newton’s method (Hagen and Menhaj, 1994).
Newton’s method is an alternative to the conjugate gradient methods for fast optimization
and often converges faster than conjugate gradient methods (see Demuth and Beale,
1994). The LM algorithm was designed to approach second-order training speed without
having to compute the Hessian matrix. When the performance function has the form of a
sum of squares (as is typical in training feedforward networks), then the Hessian matrix
can be approximated as,

$$H = J^T J \tag{3.18}$$

If $E$ is assumed to be a sum of squares function,

$$E = \sum_{i=1}^{N} e_i^2 \tag{3.19}$$

then, differentiating Eq. 3.19 with respect to the weights, the gradient can be computed (up to a constant factor) as,

$$g = J^T e \tag{3.20}$$
where $J$ is the Jacobian matrix that contains the first derivatives of the network errors with
respect to the weights and biases, $J^T$ is the transpose of $J$, and $e$ is a vector of
network errors. Let $W$ denote the total number of free parameters, such as the weights and
biases of a multilayer perceptron, which are ordered in the manner described to form the
weight vector $w$. Let $N$ denote the total number of training patterns used to train the
network. Backpropagation is used to compute a set of $W$ partial derivatives of the
approximating function $F[w : x(n)]$ with respect to the elements of the weight vector $w$
for a specific training pattern $x(n)$ in the training set. Repeating these computations for
$n = 1, 2, \ldots, N$ yields an $N$-by-$W$ matrix of partial derivatives. This
matrix is called the Jacobian $J$ of the multilayer perceptron. The Jacobian matrix can be
computed through a standard backpropagation technique that is much less complex than
computing the Hessian matrix. The LM algorithm uses this approximation to the Hessian
matrix in the following Newton-like update,

$$w_{k+1} = w_k - H_k^{-1} g_k \tag{3.21}$$

$$= w_k - \left[J^T J + \mu I\right]_k^{-1} J^T e \tag{3.22}$$

$$w_{k+1} = w_k + \Delta w \tag{3.23}$$
where $w_k$ is a vector of current weights and biases, $g_k$ is the current gradient and
$\Delta w = -H_k^{-1} g_k$. This equation is applied iteratively, with the computed value of $w_{k+1}$
being used repeatedly as the ‘new’ $w_k$. When the scalar $\mu$ is zero, this is just Newton’s
method, using the approximate Hessian matrix. When $\mu$ is large, this becomes gradient
descent with a small step size. Newton’s method is faster and more accurate near an
error minimum, so the aim is to shift towards Newton’s method as quickly as possible.
Thus, $\mu$ is decreased after each successful step (reduction in performance function) and
is increased only when a tentative step would increase the performance function. In this
way, the performance function is always reduced at each iteration of the algorithm.
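The update of Equations 3.21 to 3.23 and the adjustment of $\mu$ can be illustrated on a deliberately small problem. The sketch below is an assumption-laden example rather than the training routine used in the study: it fits a single hyperbolic-tangent node with two free parameters (one weight and one bias), so that $J^T J + \mu I$ is a 2-by-2 matrix that can be inverted by hand. The data, the starting point, and the values of `mu`, `mu_dec` and `mu_inc` are hypothetical.

```python
import math

def residuals(params, data):
    """Network errors e_n = t_n - tanh(w * x_n + b) for all training patterns."""
    w, b = params
    return [t - math.tanh(w * x + b) for x, t in data]

def jacobian(params, data):
    """N-by-2 Jacobian of the errors with respect to (w, b)."""
    w, b = params
    rows = []
    for x, _ in data:
        slope = 1.0 - math.tanh(w * x + b) ** 2
        rows.append((-slope * x, -slope))
    return rows

def sum_of_squares(params, data):
    return sum(e * e for e in residuals(params, data))

def lm_step(params, data, mu):
    """One tentative LM step: w_new = w - [J^T J + mu I]^-1 J^T e (Eq. 3.22)."""
    e = residuals(params, data)
    J = jacobian(params, data)
    a11 = sum(r[0] * r[0] for r in J) + mu
    a12 = sum(r[0] * r[1] for r in J)
    a22 = sum(r[1] * r[1] for r in J) + mu
    g1 = sum(r[0] * ei for r, ei in zip(J, e))
    g2 = sum(r[1] * ei for r, ei in zip(J, e))
    det = a11 * a22 - a12 * a12          # positive whenever mu > 0
    d1 = -(a22 * g1 - a12 * g2) / det
    d2 = -(a11 * g2 - a12 * g1) / det
    return (params[0] + d1, params[1] + d2)

def levenberg_marquardt(params, data, mu=0.01, mu_dec=0.1, mu_inc=10.0,
                        max_iter=100, tol=1e-12):
    error = sum_of_squares(params, data)
    history = [error]
    for _ in range(max_iter):
        if error < tol or mu > 1e10:
            break
        trial = lm_step(params, data, mu)
        trial_error = sum_of_squares(trial, data)
        if trial_error < error:          # successful step: accept it and decrease mu
            params, error = trial, trial_error
            mu = max(mu * mu_dec, 1e-20)
        else:                            # unsuccessful step: reject it and increase mu
            mu *= mu_inc
        history.append(error)
    return params, history

# Hypothetical noise-free data generated with w = 1.5 and b = -0.5.
data = [(x, math.tanh(1.5 * x - 0.5)) for x in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0)]
fitted, history = levenberg_marquardt((0.1, 0.0), data)
```

Because a tentative step is only accepted when it lowers the sum of squares, `history` never increases, which mirrors the statement that the performance function is reduced at each iteration.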
3.2.2 Selection of Network Structures
Networks defined by various combinations of rainfall and runoff at present and
previous time periods were trained and tested to predict runoff using different ANN
configurations. For all combinations of the input variables, networks were first
trained using one hidden layer. The best-fit combination of
input variables for the one-hidden-layer MLP networks was then used to train the
two-hidden-layer networks. The one-hidden-layer and two-hidden-layer networks
were then compared.
In this particular study, the structure of the ANN model is designed by a trial-and-error
procedure to find the appropriate number of time-delayed input variables to the
model. Hsu et al. (1998), Tokar and Johnson (1999) and Dibike and Solomatine (2001)
treat the rainfall as directly related to runoff at the present time $t$ by using the following
equation,

$$y(t) = f\{x(t), x(t-1), x(t-2), \ldots, x(t-n), y(t-1), \ldots, y(t-n)\} \tag{3.24}$$

where $x$ denotes rainfall and $y$ denotes runoff. The
goodness-of-fit statistics are computed for both training and testing for each ANN
architecture. In the first step, the rainfall at time $t$ was added to the model. The
goodness-of-fit statistics for the present model were computed for the training and testing
procedures. Then rainfall at time ($t - 1$) was added as an additional input variable to the
model, and the goodness-of-fit statistics were computed. This procedure is repeated by
adding rainfall at previous time periods as input variables until there is no significant
change in model training and testing accuracy. After the first step was completed,
another input variable, the runoff at previous time periods, ($t - 2$), is added to the best-fit
model obtained from the first step. Then, the goodness-of-fit statistics for the present
model were computed for the training and testing procedures. This procedure is repeated by
adding runoff at previous time periods as input variables until there is no significant
change in model training and testing accuracy.
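To make the input structure of Equation 3.24 concrete, the sketch below assembles time-delayed input patterns from a rainfall series and a runoff series. It is an illustration only: the function `build_lagged_patterns`, its arguments and the short series are hypothetical, and the number of rainfall and runoff lags is left as two separate arguments so that each can be increased step by step as in the trial-and-error procedure.

```python
def build_lagged_patterns(rainfall, runoff, n_rain_lags, n_runoff_lags):
    """Build (inputs, target) pairs following Eq. 3.24.

    inputs = [x(t), x(t-1), ..., x(t-n_rain_lags), y(t-1), ..., y(t-n_runoff_lags)]
    target = y(t)
    """
    if len(rainfall) != len(runoff):
        raise ValueError("rainfall and runoff series must have the same length")
    start = max(n_rain_lags, n_runoff_lags)   # first time step with all lags available
    patterns = []
    for t in range(start, len(rainfall)):
        inputs = [rainfall[t - k] for k in range(n_rain_lags + 1)]
        inputs += [runoff[t - k] for k in range(1, n_runoff_lags + 1)]
        patterns.append((inputs, runoff[t]))
    return patterns

# Hypothetical hourly series (not data from the study).
rainfall = [0.0, 5.0, 12.0, 3.0, 0.0, 0.0]
runoff = [1.0, 1.2, 3.5, 4.1, 2.6, 1.8]

# Step 1 of the procedure: rainfall only, x(t) and x(t-1).
rain_only = build_lagged_patterns(rainfall, runoff, n_rain_lags=1, n_runoff_lags=0)
# A later step: x(t), x(t-1), x(t-2) plus the previous runoff y(t-1).
with_runoff = build_lagged_patterns(rainfall, runoff, n_rain_lags=2, n_runoff_lags=1)
```

Each candidate structure would then be trained and its goodness-of-fit statistics compared, stopping when adding a further lag gives no significant change.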
3.3 Radial Basis Function (RBF) Model
A well-known non-linear modelling approach is the RBF network. An RBF network has
only three layers (one input, one hidden, and one output). Figure 3.3 illustrates the
designed architecture of the RBF. In an RBF network, the number of RBF ‘centres’ (also
called weight vectors) is as many as the data points, and the performance of the network
depends upon the chosen centres. The problem that arises, however, is how to select the
RBF centres, especially when the number of parameters is large. The most general formula
for any RBF is,
\[ y(x) = \phi\left( (x - c)^{T} \Re^{-1} (x - c) \right) \tag{3.25} \]
where φ is the function used, c is the centre and ℜ is the metric. The term
(x − c)^T ℜ^{−1} (x − c) is the distance between the input x and the centre c in the metric
defined by ℜ. Often the metric is Euclidean. In this case, ℜ = r²I for some scalar
radius r and equation 3.25 simplifies to
\[ y(x) = \phi\left( \frac{(x - c)^{T} (x - c)}{r^{2}} \right) \tag{3.26} \]
According to Fausett (1994), the Euclidean length r_j, which measures the radial
distance between the datum vector y = (y_1, y_2, ..., y_m) and the radial centre
Y^(j) = (w_1j, w_2j, ..., w_mj), can be written as
\[ r_j = \left\| y - Y^{(j)} \right\| = \left[ \sum_{i=1}^{m} (y_i - w_{ij})^{2} \right]^{1/2} \tag{3.27} \]
A suitable transfer function is then applied to r_j to give,
\[ \phi(r_j) = \phi\left( \left\| y - Y^{(j)} \right\| \right) \tag{3.28} \]
Finally the output layer (k = 1) receives a weighted linear combination of φ(r_j),
\[ y^{(k)} = \sum_{j=1}^{n} c_j^{(k)} \phi(r_j) = \sum_{j=1}^{n} c_j^{(k)} \phi\left( \left\| y - Y^{(j)} \right\| \right) \tag{3.29} \]
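The forward pass described by equations 3.27 to 3.29 can be sketched in a few lines of Python. The text does not fix the transfer function φ, so a Gaussian φ(r) = exp(−(r/width)²) is assumed here purely for illustration; the function name `rbf_output` and the `width` argument are likewise hypothetical.

```python
import numpy as np

def rbf_output(y, centres, c, width=1.0):
    """Single-output RBF forward pass.
    y: datum vector (length m); centres: array (n, m), one row per centre Y^(j);
    c: output weights c_j (length n); width: assumed Gaussian width."""
    y = np.asarray(y, dtype=float)
    centres = np.asarray(centres, dtype=float)
    # Equation 3.27: Euclidean distance from the datum vector to every centre
    r = np.sqrt(((y - centres) ** 2).sum(axis=1))
    # Equation 3.28 with an assumed Gaussian transfer function
    phi = np.exp(-(r / width) ** 2)
    # Equation 3.29: weighted linear combination at the output layer
    return float(np.dot(c, phi))

out = rbf_output([0.0, 0.0], [[0.0, 0.0], [3.0, 4.0]], [2.0, 1.0], width=5.0)
```

In this example the second centre lies at distance 5 from the input, so it contributes exp(−1) times its weight, while the first centre coincides with the input and contributes its full weight.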
Figure 3.3 The structure of RBF Model (inputs x_1, x_2, ..., x_m; weights w_1j, w_2j, ..., w_mj; hidden layer computing r_j = [Σ_{i=1}^{m} (x_i − w_ij)²]^{1/2} and φ(r_j); output layer computing y^(k) = Σ_{j=1}^{n} c_j^(k) φ(r_j))
3.3.1 Training RBF Networks
Adapting the values of the weights and centres of a network by presenting the
input and output data is known as learning or training. Training in an RBF network may
be done in two stages: first, calculating the parameters of the RBF, including the centres
and the scaling parameters; and second, calculating the weights between the hidden and
output layers. Training is the actual process of adjusting weight factors by trial and error.
Supervised training requires target patterns or signals to guide the training process.
The objective is to find the weights that minimize the error between the target and actual
output. To train the RBF network, the weight factors were adjusted until the output
pattern calculated from the given input matched the desired output.
Several types of learning algorithm can be used in an RBF network, such
as Orthogonal Least Squares (OLS), Generalized Regression Neural Network (GRNN),
K-means Clustering, and Probability Density Function (PDF). The emphasis of this study
is on adopting the Generalized Regression Neural Network (GRNN) routine in selecting a
parsimonious RBF model. Specht (1991) popularized ‘kernel regression’, which he
calls a Generalized Regression Neural Network (GRNN). The GRNN algorithm is a kind
of radial basis network that is often used for function approximation. The GRNN was
introduced as a memory-based neural network that would store all the independent and
dependent training data available for a particular mapping (Heimes and Heuveln, 1998).
GRNN is particularly advantageous with sparse data in a real-time environment,
because the regression surface is instantly defined everywhere, even with just one sample
(Specht, 1991). A GRNN can be designed very quickly, learns fast, and effectively uses
historical data to estimate values for continuous dependent variables. The learning
process is equivalent to finding a surface in a multidimensional space that provides a best
fit to the training data, with the criterion for the ‘best fit’ being measured in some
statistical sense. By generalizing from the previous examples learned during training,
the RBF network can provide a basis for system modelling and forecasting.
The GRNN predicts the value of one or more dependent variables, given the value
of one or more independent variables. According to Heimes and Heuveln (1998) and
Specht (1991), the GRNN takes as input a vector x of length n and generates an
output vector (or scalar) y' of length m, where y' is the prediction of the actual output
y. The GRNN does this by comparing a new input pattern x with a set of p stored
patterns x^i (pattern nodes) for which the actual output y_i is known. The predicted
output y' is the weighted average of all these associated stored outputs y_ij. Equation 3.30
expresses how each predicted output component y'_j is a function of the corresponding
output components y_j associated with each stored pattern x^i. The weight W(x, x^i)
reflects the contribution of each known output y_i to the predicted output. It is a measure
of the similarity of each pattern node with the input pattern,
\[ y'_j = \frac{N_j}{D} = \frac{\sum_{i=1}^{p} y_{ij}\, W(x, x^{i})}{\sum_{i=1}^{p} W(x, x^{i})}, \qquad j = 1, 2, \ldots, m \tag{3.30} \]
It is clear from equation 3.30 that the predicted output magnitude will always lie between
the minimum and maximum magnitude of the desired outputs y_ij associated with the
stored patterns (since 0 ≤ W ≤ 1). In the GRNN algorithm, the output weights are set to
the desired outputs. The GRNN is best seen as an interpolator, which interpolates
between the desired outputs of pattern layer nodes that are located near the input vector
(or scalar) in the input space.
A standard way to define the similarity function W is to base it on a distance
function, D(x_1, x_2), that gives a measure of the distance or dissimilarity between two
patterns x_1 and x_2. The desired property of the weight function W(x, x^i) is that its
magnitude for a stored pattern x^i decreases as the distance of that pattern from the input
pattern x increases (if the distance is zero the weight is a maximum of unity). The standard
weight and distance functions are given by the following equations, respectively:
\[ W(x, x^{i}) = e^{-D(x, x^{i})} \tag{3.31} \]
\[ D(x_1, x_2) = \sum_{k=1}^{n} \left( \frac{x_{1k} - x_{2k}}{\sigma_k} \right)^{2} \tag{3.32} \]
In equation 3.32, each input variable has its own sigma value, σ_k, where σ_k is the
normalization constant that controls the width of the basis function. The procedure of the
GRNN algorithm can be summarized as follows:
(i) The input unit stores an input vector x.
(ii) The pattern units compute the distances D(x, x^i) between the incoming pattern x and the stored patterns x^i. The pattern nodes output the quantities W(x, x^i).
(iii) The summation units compute N_j, the sums of the products of W(x, x^i) and the associated known output components y_ij. This unit also has a node to compute D, the sum of all W(x, x^i).
(iv) Finally, the output unit divides N_j by D to produce the estimated output component y'_j, which is a localized average of the stored output patterns.
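The four steps above can be sketched in Python as follows. This is a minimal illustration of equations 3.30 to 3.32 rather than the routine used in the study; the function name `grnn_predict` and its argument names are hypothetical.

```python
import numpy as np

def grnn_predict(x, stored_x, stored_y, sigma):
    """GRNN estimate for one input pattern.
    x: input vector (length n); stored_x: stored patterns, array (p, n);
    stored_y: known outputs, array (p,) or (p, m); sigma: scalar or length-n widths."""
    x = np.asarray(x, dtype=float)
    stored_x = np.asarray(stored_x, dtype=float)
    stored_y = np.asarray(stored_y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Step (ii): distance of x to every stored pattern (equation 3.32)
    d = (((x - stored_x) / sigma) ** 2).sum(axis=1)
    # Step (ii): pattern-node outputs, the weights W (equation 3.31)
    w = np.exp(-d)
    # Step (iii): summation units N_j and D
    numerator = w @ stored_y
    denominator = w.sum()
    # Step (iv): localized weighted average (equation 3.30)
    return numerator / denominator

# An input midway between two stored patterns receives the mean of their outputs
estimate = grnn_predict([0.5], [[0.0], [1.0]], [0.0, 10.0], sigma=1.0)
```

Note that if the input lies very far from every stored pattern relative to σ_k, all weights may underflow to zero; a practical implementation would need to guard against that division.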
3.4 Multiple Linear Regression (MLR) Model
For the MLR model, the input nodes are selected using the MLR method
proposed by Harun (1999). For both the training and testing processes, applying the
input node selection of Harun (1999) can considerably reduce the time taken to find
the optimum number of inputs to the models. Multiple linear regression applies to
problems in which records have been kept of one variable, y, the dependent variable,
and several other variables x_1, x_2, x_3, ..., x_k, the independent variables, and in which the
objective requires the relationship between the variable y and the variables
x_1, x_2, x_3, ..., x_k to be investigated. The basic multiple regression model is given by
(Holder, 1985),
\[ y = a + b_1 x_1 + b_2 x_2 + \ldots + b_k x_k + e \tag{3.33} \]
where a, b_1, b_2, …, b_k are constants and e is a random variable. Thus, it is assumed
that y is linearly related to each of the independent variables and that each independent
variable has an additive effect on y. For any i-th set of observations, the model can be
written more conveniently as,
\[ y_i = \alpha + \beta_1 (x_{1i} - \bar{x}_1) + \beta_2 (x_{2i} - \bar{x}_2) + \ldots + \beta_k (x_{ki} - \bar{x}_k) + e_i \tag{3.34} \]
where x_ki is the value of independent variable x_k at the i-th set of observations totalling n;
and
\[ \bar{x}_1 = \frac{1}{n} \sum_{i=1}^{n} x_{1i}, \; \ldots, \; \bar{x}_k = \frac{1}{n} \sum_{i=1}^{n} x_{ki} \tag{3.35} \]
By comparing models 3.33 and 3.34, we can see that b_j = β_j (for j = 1, 2, ..., k) and
a = α − β_1 \bar{x}_1 − β_2 \bar{x}_2 − ... − β_k \bar{x}_k. In the general case of n observations and k variables, the
method of least squares is used to choose values of a, b_1, b_2, ..., b_k (or α, β_1, β_2, ..., β_k) to
minimise,
\[ S^{2} = \sum_{i=1}^{n} e_i^{2} = \sum_{i=1}^{n} \left( y_i - a - b_1 x_{1i} - b_2 x_{2i} - \ldots - b_k x_{ki} \right)^{2} \tag{3.36} \]
\[ S^{2} = \sum_{i=1}^{n} \left( y_i - \alpha - \beta_1 (x_{1i} - \bar{x}_1) - \beta_2 (x_{2i} - \bar{x}_2) - \ldots - \beta_k (x_{ki} - \bar{x}_k) \right)^{2} \tag{3.37} \]
Solving ∂S²/∂α = 0, ∂S²/∂β_1 = 0, …, ∂S²/∂β_k = 0 gives the following equations for the
values of α, β_1, β_2, ..., β_k which minimise S² (denoted by \hat{α}, \hat{β}_1, \hat{β}_2, ..., \hat{β}_k):
\[ \hat{\alpha} = \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i \tag{3.38} \]
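\[
\begin{aligned}
\sum_{i=1}^{n} (y_i - \bar{y})(x_{1i} - \bar{x}_1) &= \hat{\beta}_1 \sum_{i=1}^{n} (x_{1i} - \bar{x}_1)^{2} + \hat{\beta}_2 \sum_{i=1}^{n} (x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2) + \ldots + \hat{\beta}_k \sum_{i=1}^{n} (x_{1i} - \bar{x}_1)(x_{ki} - \bar{x}_k) \\
\sum_{i=1}^{n} (y_i - \bar{y})(x_{2i} - \bar{x}_2) &= \hat{\beta}_1 \sum_{i=1}^{n} (x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2) + \hat{\beta}_2 \sum_{i=1}^{n} (x_{2i} - \bar{x}_2)^{2} + \ldots + \hat{\beta}_k \sum_{i=1}^{n} (x_{2i} - \bar{x}_2)(x_{ki} - \bar{x}_k) \\
&\;\;\vdots \\
\sum_{i=1}^{n} (y_i - \bar{y})(x_{ki} - \bar{x}_k) &= \hat{\beta}_1 \sum_{i=1}^{n} (x_{1i} - \bar{x}_1)(x_{ki} - \bar{x}_k) + \hat{\beta}_2 \sum_{i=1}^{n} (x_{2i} - \bar{x}_2)(x_{ki} - \bar{x}_k) + \ldots + \hat{\beta}_k \sum_{i=1}^{n} (x_{ki} - \bar{x}_k)^{2}
\end{aligned}
\tag{3.39}
\]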
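Thus, the estimates \hat{β}_1, \hat{β}_2, ..., \hat{β}_k are given by,
\[ \hat{\beta} = S_{xx}^{-1} S_{xy} \tag{3.40} \]
where,
\[ \hat{\beta} = \begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \vdots \\ \hat{\beta}_k \end{bmatrix} \tag{3.41} \]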
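The equations may now be condensed into,
\[
\begin{bmatrix} S_{x_1 y} \\ S_{x_2 y} \\ \vdots \\ S_{x_k y} \end{bmatrix}
=
\begin{bmatrix}
S_{x_1 x_1} & S_{x_1 x_2} & \cdots & S_{x_1 x_k} \\
S_{x_1 x_2} & S_{x_2 x_2} & \cdots & S_{x_2 x_k} \\
\vdots & \vdots & & \vdots \\
S_{x_1 x_k} & \cdots & \cdots & S_{x_k x_k}
\end{bmatrix}
\begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \vdots \\ \hat{\beta}_k \end{bmatrix}
\tag{3.42}
\]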
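or,
\[ S_{xy} = S_{xx} \hat{\beta} \tag{3.43} \]
where,
\[ S_{x_j y} = \sum_{i=1}^{n} (x_{ji} - \bar{x}_j)(y_i - \bar{y}) \quad (\text{for } j = 1, 2, \ldots, k) \tag{3.44} \]
\[ S_{x_j x_l} = \sum_{i=1}^{n} (x_{ji} - \bar{x}_j)(x_{li} - \bar{x}_l) \quad (\text{for } j = 1, 2, \ldots, k \text{ and } l = 1, 2, \ldots, k) \tag{3.45} \]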
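The least-squares solution of equations 3.38 to 3.45 can be checked numerically with a short Python sketch. The function name `fit_mlr` is hypothetical, and solving the system of equation 3.43 with `numpy.linalg.solve` (instead of forming the inverse in equation 3.40 explicitly) is an implementation choice made here, not something stated in the text.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares estimates (a, b_1..b_k) for y = a + b_1 x_1 + ... + b_k x_k + e.
    X: array (n, k) of independent variables; y: array (n,) of the dependent variable."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar = X.mean(axis=0)           # equation 3.35
    y_bar = y.mean()                 # equation 3.38 (alpha-hat)
    Xc = X - x_bar
    yc = y - y_bar
    S_xx = Xc.T @ Xc                 # equation 3.45
    S_xy = Xc.T @ yc                 # equation 3.44
    beta = np.linalg.solve(S_xx, S_xy)   # solves S_xy = S_xx beta (equation 3.43)
    a = y_bar - beta @ x_bar         # a = alpha - sum(beta_j * x_bar_j)
    return a, beta

# Synthetic noise-free data generated from y = 1 + 2 x_1 - 3 x_2
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 7]], dtype=float)
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
a, beta = fit_mlr(X, y)
```

With noise-free data the recovered intercept and slopes should match the generating values up to rounding error, which gives a quick check on the algebra above.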
In this study, rainfall-runoff equations derived using multiple regression
procedures have been developed and used to estimate flows or runoff from rainfall
data. The MLR method is suitable for daily rainfall-runoff modelling. Harun (1999)
reported that the MLR model was able to yield the best results in modelling monthly
and daily input-output relationships.
3.5 HEC-HMS Model
A model relates something unknown (the output) to something known (the input).
In the case of the models that are included in HEC-HMS, the known input is rainfall and
the unknown output is runoff. The rainfall may be observed from a historical event, it
may be a frequency-based hypothetical rainfall event, or it may be an event that
represents the upper limit of rainfall possible at a given location. Historical rainfall data
are useful for calibration and verification of model parameters, and for evaluating the
performance of proposed designs or regulations. Similarly, the evapotranspiration data
used may be observed values from a historical record, or they may be hypothetical
values.
The required watershed precipitation depth can be inferred from the depths at
gages using an averaging scheme. Thus,
\[ P_{MAP} = \frac{\sum_{i} \left( w_i \sum_{t} p_i(t) \right)}{\sum_{i} w_i} \tag{3.46} \]
where P_MAP is the total storm mean areal precipitation (MAP) depth over the watershed;
p_i(t) is the precipitation depth measured at time t at gage i; and w_i is the weighting factor
assigned to gage/observation i. If gage i is not a recording device, only the
quantity Σ p_i(t), the total storm precipitation at gage i, will be available and used in the
computation.
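Equation 3.46 amounts to a weighted average of the storm totals at each gage, as the following minimal Python sketch shows. The function name `mean_areal_precipitation` is hypothetical; a non-recording gage is represented here simply by a one-element list holding its storm total.

```python
def mean_areal_precipitation(gage_depths, weights):
    """Total-storm MAP depth (equation 3.46).
    gage_depths: one list of depths p_i(t) per gage; weights: one weight w_i per gage."""
    # Inner sum over time gives the storm total at each gage
    totals = [sum(series) for series in gage_depths]
    # Weighted average of the gage totals
    return sum(w * p for w, p in zip(weights, totals)) / sum(weights)

# Gage 1 records three intervals; gage 2 is non-recording and reports a total only
p_map = mean_areal_precipitation([[1.0, 2.0, 3.0], [4.0]], [1.0, 3.0])
```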
3.5.1 Evaporation and Transpiration
In common application, HEC-HMS omits any detailed accounting of evaporation
and transpiration, as these are insignificant during a flood. In the case of shorter storms,
it may be appropriate to omit this accounting. Evaporation, as modelled in HEC-HMS,
includes vaporization of water directly from the soil and vegetative surface, and
transpiration through plant leaves. This volume of evaporation and transpiration
combined is estimated as an average volume. The evaporation and transpiration are
combined and collectively referred to as evapotranspiration (ET) in the meteorological
input to the program. In this input, monthly varying ET values are specified, along with
an ET coefficient. The potential ET rate for all time periods within the month is
computed as the product of the monthly value and the coefficient.
3.5.2 Computing of Runoff Volumes
HEC-HMS includes several alternative models to account for the cumulative
losses, such as the initial and constant-rate loss model, the deficit and constant-rate model,
the SCS curve number (CN) loss model, and the Green and Ampt loss model. In this
study, the models used are the initial and constant-rate loss model, the deficit and
constant-rate model, and the SCS curve number (CN) loss model. For each model,
precipitation loss is found for each computation time interval, and is subtracted from the
MAP depth for that interval. The remaining depth is referred to as precipitation excess.
This depth is considered uniformly distributed over a watershed area, so it represents a
volume of runoff.
3.5.2.1 The Initial and Constant-rate and Deficit and Constant-rate Loss Models
The underlying concept of the initial and constant-rate loss model is that the
maximum potential rate of precipitation loss, f_c, is constant throughout an event. Thus,
if p_t is the MAP depth during a time interval t to t + Δt, the excess, pe_t, during the
interval is given by,
\[ pe_t = \begin{cases} p_t - f_c & \text{if } p_t > f_c \\ 0 & \text{otherwise} \end{cases} \tag{3.47} \]
An initial loss, I a , is added to the model to represent interception and depression
storage. Interception storage is a consequence of absorption of precipitation by surface
cover, including plants in the watershed. Depression storage is a consequence of
depressions in the watershed topography; water is stored in these and eventually
infiltrates or evaporates. This loss occurs prior to the onset of runoff.
Until the accumulated precipitation on the pervious area exceeds the initial loss
volume, no runoff occurs. Thus, the excess is given by:
\[ pe_t = \begin{cases} 0 & \text{if } \sum p_i < I_a \\ p_t - f_c & \text{if } \sum p_i > I_a \text{ and } p_t > f_c \\ 0 & \text{if } \sum p_i > I_a \text{ and } p_t < f_c \end{cases} \tag{3.48} \]
The initial and constant-rate model, in fact, includes one parameter (the constant rate) and
one initial condition (the initial loss). Respectively, these represent physical properties of
the watershed soils and land use, and the antecedent condition. If the watershed is in a
saturated condition, I_a will approach zero. If the watershed is dry, then I_a will increase
to represent the maximum precipitation depth that can fall on the watershed with no
runoff; this will depend on the watershed terrain, land use, soil types, and soil treatment.
The constant loss rate can be viewed as the ultimate infiltration capacity of the soils.
Skaggs and Khaleel (1982) have published estimates of infiltration rates for those soils,
as shown in Table 3.1. Because the model parameter is not a measured parameter, it
and the initial condition are best determined by calibration.
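A literal reading of equation 3.48 can be sketched in Python as below. The function name `initial_constant_excess` is hypothetical. Two assumptions are made for the illustration: the constant rate f_c is supplied as a depth per computation interval (that is, already multiplied by Δt), and the boundary cases that equation 3.48 leaves open (accumulated precipitation exactly equal to I_a, or p_t exactly equal to f_c) are resolved as shown in the comments.

```python
def initial_constant_excess(precip, initial_loss, constant_loss):
    """Precipitation excess per interval under the initial and constant-rate loss model.
    precip: MAP depths p_t per interval; initial_loss: I_a (depth);
    constant_loss: f_c expressed as a depth per interval (assumption)."""
    excess = []
    cumulative = 0.0
    for p in precip:
        cumulative += p
        if cumulative < initial_loss:
            # Initial loss not yet satisfied: no runoff
            excess.append(0.0)
        elif p > constant_loss:
            # Initial loss satisfied and rainfall exceeds the constant loss
            excess.append(p - constant_loss)
        else:
            # Initial loss satisfied but rainfall does not exceed the constant loss
            excess.append(0.0)
    return excess

pe = initial_constant_excess([2.0, 3.0, 5.0, 1.0], initial_loss=4.0, constant_loss=2.0)
# pe == [0.0, 1.0, 3.0, 0.0]
```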
Table 3.1: Infiltration rates by the soil groups
Soil group | Description | Range of loss rates (in/hr)
A | Deep sand, deep loess, aggregated silts | 0.30 – 0.45
B | Shallow loess, sandy loam | 0.15 – 0.30
C | Clay loams, shallow sandy loam, soils low in organic content, and soils usually high in clay | 0.05 – 0.15
D | Soils that swell significantly when wet, heavy plastic clays, and certain saline soils | 0.00 – 0.05
3.5.2.2 SCS Curve Number Loss Model
The Soil Conservation Service (SCS) Curve Number (CN) model estimates
precipitation excess as a function of cumulative precipitation, soil cover, land use, and
antecedent moisture, using the following equation:
\[ P_e = \frac{(P - I_a)^{2}}{P - I_a + S} \tag{3.49} \]
where P_e is the accumulated precipitation excess at time t; P is the accumulated rainfall depth
at time t; I_a is the initial loss; and S is the potential maximum retention, a measure of the
ability of a watershed to abstract and retain storm precipitation. Until the accumulated
rainfall exceeds the initial abstraction, the precipitation excess, and hence the runoff, will
be zero. From analysis of results from many small experimental watersheds, the SCS
developed an empirical relationship of I_a and S:
\[ I_a = 0.2\,S \tag{3.50} \]
Therefore, the cumulative excess at time t is:
\[ P_e = \frac{(P - 0.2\,S)^{2}}{P + 0.8\,S} \tag{3.51} \]
Incremental excess for a time interval is computed as the difference between the
accumulated excess at the end of and beginning of the period. The maximum retention,
S, and watershed characteristics are related through an intermediate parameter, the curve
number (commonly abbreviated CN) as:
\[ S = \begin{cases} \dfrac{1000 - 10\,CN}{CN} & \text{(foot-pound system)} \\[2ex] \dfrac{25400 - 254\,CN}{CN} & \text{(SI)} \end{cases} \tag{3.52} \]
The CN for a watershed can be estimated as a function of land use, soil type, and
antecedent watershed moisture, using tables published by the SCS. This CN is entered
directly in the appropriate HEC-HMS input form. For a watershed that consists of
several soil types and land uses, a composite CN is calculated as:
\[ CN_{composite} = \frac{\sum_{i} A_i \, CN_i}{\sum_{i} A_i} \tag{3.53} \]
in which CN_composite is the composite CN used for runoff volume computations with
HEC-HMS; i is an index of watershed subdivisions of uniform land use and soil type;
CN_i is the CN for subdivision i; and A_i is the drainage area of subdivision i. Tables
in Appendix A include composite CN for urban districts, residential districts, and
newly graded areas. That is, the CN shown are composite values for directly-connected
impervious area and open space. If CN for these land uses are selected, no
further accounting of directly-connected impervious area is required in HEC-HMS.
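Equations 3.49 to 3.53 can be combined into a short Python sketch. The function names are hypothetical, and the SI form of equation 3.52 (S in millimetres) is assumed throughout, so rainfall depths must also be given in millimetres.

```python
def scs_retention_mm(cn):
    """Potential maximum retention S in mm (SI form of equation 3.52)."""
    return (25400.0 - 254.0 * cn) / cn

def scs_cumulative_excess(p, cn):
    """Accumulated excess P_e for accumulated rainfall p in mm (equations 3.49 to 3.51)."""
    s = scs_retention_mm(cn)
    ia = 0.2 * s                       # equation 3.50
    if p <= ia:
        return 0.0                     # no excess until the initial abstraction is met
    return (p - ia) ** 2 / (p - ia + s)

def scs_incremental_excess(precip, cn):
    """Excess per interval: difference of accumulated excess at end and start."""
    increments, cumulative_p, previous = [], 0.0, 0.0
    for p in precip:
        cumulative_p += p
        current = scs_cumulative_excess(cumulative_p, cn)
        increments.append(current - previous)
        previous = current
    return increments

def composite_cn(areas, cns):
    """Area-weighted composite curve number (equation 3.53)."""
    return sum(a * c for a, c in zip(areas, cns)) / sum(areas)

s_80 = scs_retention_mm(80)                      # 63.5 mm
pe_50 = scs_cumulative_excess(50.0, 80)          # excess after 50 mm of rain
cn_mix = composite_cn([1.0, 3.0], [60.0, 80.0])  # 75.0
```

For CN = 80 the retention is 63.5 mm and the initial abstraction 12.7 mm, so a 10 mm storm yields no excess under this model.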
3.5.3 Modelling of Direct Runoff
This section describes the models that simulate the process of direct runoff of
excess precipitation on a watershed. This refers to the ‘transformation’ of
precipitation excess into point runoff. There are two options for these transform methods,
namely empirical models and a conceptual model. The empirical models are the
traditional unit hydrograph (UH) models. These system theoretic models attempt to
establish a causal linkage between runoff and excess precipitation without detailed
consideration of the internal processes. Meanwhile, the conceptual model included is a
kinematic-wave model of overland flow. It represents, to the extent possible, all physical
mechanisms that govern the movement of the excess precipitation over the watershed
land surface. There are several types of empirical models, such as the user-specified UH,
parametric and synthetic UH, Snyder’s UH, SCS UH, Clark’s UH, and ModClark UH. In
this study, the SCS UH and Clark UH models are applied. These models are simple and
require only a small number of parameters compared to the others. In other words, these
models are suitable when little information is available for the study area.
The unit hydrograph is a well-known, commonly used empirical model of the
relationship of direct runoff to excess precipitation. Sherman (1932) defined UH as
“…the basin outflow resulting from one unit of direct runoff generated uniformly over
the drainage area at a uniform rainfall rate during a specified period of rainfall duration”.
The underlying concept of the UH is that the runoff process is linear, so the runoff from
greater or less than one unit is simply a multiple of the unit runoff hydrograph. To
compute the direct runoff hydrograph with a UH, HEC-HMS uses a discrete
representation of excess precipitation, in which a ‘pulse’ of excess precipitation is known
for each time interval. It then solves the discrete convolution equation for a linear
system:
\[ Q_n = \sum_{m=1}^{n \le M} P_m U_{n-m+1} \tag{3.54} \]
where Q_n is the storm hydrograph ordinate at time nΔt; P_m is the rainfall excess depth in time
interval mΔt to (m + 1)Δt; M is the total number of discrete rainfall pulses; and U_{n−m+1} is the
UH ordinate at time (n − m + 1)Δt. Q_n and P_m are expressed as flow rate and depth
respectively, and U_{n−m+1} has dimensions of flow rate per unit depth. Use of this equation
requires the implicit assumptions:
(i) The excess precipitation is distributed uniformly spatially and is of constant intensity throughout a time interval, Δt.
(ii) The ordinates of a direct-runoff hydrograph corresponding to excess precipitation of a given duration are directly proportional to the volume of excess. Thus, twice the excess produces a doubling of runoff hydrograph ordinates and half the excess produces a halving. This is the so-called assumption of linearity.
(iii) The direct runoff hydrograph resulting from a given increment of excess is independent of the time of occurrence of the excess and of the antecedent precipitation. This is the assumption of time-invariance.
(iv) Precipitation excesses of equal duration are assumed to produce hydrographs with equivalent time bases regardless of the intensity of the precipitation.
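The discrete convolution of equation 3.54 can be written out directly, as in the following Python sketch. The function name `direct_runoff` is hypothetical, and the lists are indexed from zero, so `excess[m]` corresponds to P_{m+1} and `uh[j]` to U_{j+1} in the notation of the equation.

```python
def direct_runoff(excess, uh):
    """Direct-runoff hydrograph by discrete convolution (equation 3.54).
    excess: rainfall excess pulses P_m (depth); uh: UH ordinates (flow per unit depth)."""
    q = [0.0] * (len(excess) + len(uh) - 1)
    for m, p in enumerate(excess):
        for j, u in enumerate(uh):
            # Each pulse produces a scaled copy of the UH, lagged by m intervals
            q[m + j] += p * u
    return q

q = direct_runoff([1.0, 2.0], [0.5, 1.0, 0.25])
# q == [0.5, 2.0, 2.25, 0.5]
```

Doubling every excess pulse doubles every ordinate of `q`, which is the linearity assumption (ii) stated above.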
3.5.3.1 SCS UH Model
The Soil Conservation Service (SCS) proposed a parametric UH model; this
model is included in HEC-HMS. The model is based upon averages of UH derived
from gauged rainfall and runoff for a large number of small agricultural watersheds
throughout the US. At the heart of the SCS UH model is a dimensionless UH. This
dimensionless UH expresses the UH discharge, U_t, as a ratio to the UH peak
discharge, U_p, for any time t, a fraction of T_p, the time to UH peak. Research by the
SCS suggests that the UH peak and time of UH peak are related by:
\[ U_p = C \frac{A}{T_p} \tag{3.55} \]
in which A is watershed area; and C is conversion constant (2.08 in SI and 484 in
foot-pound system). The time of peak (also known as the time of rise) is related to the
duration of the unit of excess precipitation as:
\[ T_p = \frac{\Delta t}{2} + t_{lag} \tag{3.56} \]
in which Δt is the excess precipitation duration (which is also the computational
interval in HEC-HMS); and t_lag is the basin lag, defined as the time difference between
the centre of mass of rainfall excess and the peak of the UH. When the lag time is
specified, HEC-HMS solves equation 3.56 to find the time of UH peak, and equation
3.55 to find the UH peak. With U_p and T_p known, the UH can be found from the
dimensionless form, which is included in HEC-HMS, by multiplication. The SCS UH
lag can be estimated via calibration. For ungauged watersheds, the SCS suggests that
the UH lag time may be related to the time of concentration, t_c, as t_lag = 0.6 t_c.
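The chain of calculations from time of concentration to UH peak can be sketched as follows. The function names are hypothetical. The default constant 2.08 is the SI value quoted above; the units of area, time and discharge are not restated here and must be consistent with whichever conversion constant is used.

```python
def scs_lag_from_tc(t_c):
    """Basin lag estimated from the time of concentration: t_lag = 0.6 t_c."""
    return 0.6 * t_c

def scs_time_to_peak(dt, t_lag):
    """Time of UH peak (equation 3.56)."""
    return dt / 2.0 + t_lag

def scs_uh_peak(area, t_p, c=2.08):
    """UH peak discharge (equation 3.55); c is the conversion constant."""
    return c * area / t_p

t_lag = scs_lag_from_tc(5.0)          # 3.0
t_p = scs_time_to_peak(1.0, t_lag)    # 3.5
u_p = scs_uh_peak(70.0, t_p)          # 2.08 * 70 / 3.5
```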
3.5.3.2 Clark’s UH Model
Clark’s model derives a watershed UH by explicitly representing two critical
processes in the transformation of excess precipitation to runoff. The first is the translation
or movement of the excess from its origin throughout the drainage to the watershed
outlet; the second is the attenuation or reduction of the magnitude of the discharge as the
excess is stored throughout the watershed.
Short-term storage of water throughout a watershed in the soil, on the surface,
and in the channels plays an important role in the transformation of precipitation excess
to runoff. The linear reservoir model is a common representation of the effects of this
storage. That model begins with the continuity equation:
\[ \frac{dS}{dt} = I_t - O_t \tag{3.57} \]
in which dS/dt is the time rate of change of water in storage at time t; I_t is the average inflow
to storage at time t; and O_t is the outflow from storage at time t. With the linear reservoir
model, storage at time t is related to outflow (S_t = R O_t), where R is a constant linear
reservoir parameter. Combining and solving the equations using a simple finite
difference approximation yields:
\[ O_t = C_A I_t + C_B O_{t-1} \tag{3.58} \]
where C_A and C_B are the routing coefficients. The coefficients are calculated from:
\[ C_A = \frac{\Delta t}{R + 0.5\,\Delta t} \tag{3.59} \]
\[ C_B = 1 - C_A \tag{3.60} \]
The average outflow during period t is:
\[ \bar{O}_t = \frac{O_{t-1} + O_t}{2} \tag{3.61} \]
With Clark's model, the linear reservoir represents the aggregated impacts of all
watershed storage. Thus, conceptually, the reservoir may be considered to be located at
the watershed outlet.
In addition to this lumped model of storage, the Clark model accounts for the
time required for water to move to the watershed outlet. It does that with a linear
channel model (Dooge, 1959), in which water is routed from remote points to the linear
reservoir at the outlet with delay (translation), but without attenuation. This delay is
represented implicitly with a so-called time-area histogram. That specifies the
watershed area contributing to flow at the outlet as a function of time. If the area is
multiplied by unit depth and divided by Δt , the computation time step, the result is
inflow, I t , to the linear reservoir. Solving equation 3.58 and equation 3.61 recursively,
with the inflow thus defined, yields values of O . However, if the inflow ordinates in
equation 3.58 are runoff from a unit of excess, these reservoir outflow ordinates are, in
fact, U t , the UH.
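The recursive solution of equations 3.58 to 3.61 can be sketched in Python as follows. The function name `route_linear_reservoir` and the `initial_outflow` argument are hypothetical; an initial outflow of zero is assumed by default.

```python
def route_linear_reservoir(inflow, r, dt, initial_outflow=0.0):
    """Route inflow ordinates through a linear reservoir.
    Returns (end-of-period outflows O_t, period-average outflows)."""
    c_a = dt / (r + 0.5 * dt)     # equation 3.59
    c_b = 1.0 - c_a               # equation 3.60
    outflow, average = [], []
    previous = initial_outflow
    for i_t in inflow:
        o_t = c_a * i_t + c_b * previous         # equation 3.58
        outflow.append(o_t)
        average.append(0.5 * (previous + o_t))   # equation 3.61
        previous = o_t
    return outflow, average

# With R = 1.5 and dt = 1, both routing coefficients equal 0.5
o, o_bar = route_linear_reservoir([4.0, 0.0, 0.0], r=1.5, dt=1.0)
# o == [2.0, 1.0, 0.5]; o_bar == [1.0, 1.5, 0.75]
```

If the inflow ordinates are the translated runoff from one unit of excess, the averaged outflows returned here would play the role of the UH ordinates described above.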
Application of the Clark model requires properties of the time-area histogram;
and the storage coefficient, R . As noted, the linear routing model properties are
defined implicitly by a time area histogram. Studies at HEC have shown that, even
though a watershed specific relationship can be developed, a smooth function fitted to a
typical time area relationship represents the temporal distribution adequately for UH
derivation for most watersheds. That typical time area relationship, which is included in
HEC-HMS is:
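\[ \frac{A_t}{A} = \begin{cases} 1.414 \left( \dfrac{t}{t_c} \right)^{1.5} & \text{for } t \le \dfrac{t_c}{2} \\[2ex] 1 - 1.414 \left( 1 - \dfrac{t}{t_c} \right)^{1.5} & \text{for } t \ge \dfrac{t_c}{2} \end{cases} \tag{3.62} \]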
where A_t is the cumulative watershed area contributing at time t; A is the total watershed
area; and t_c is the time of concentration of the watershed. For application in HEC-HMS, only
the parameter t_c, the time of concentration, is necessary. This can be estimated via
calibration. The basin storage coefficient, R, is an index of the temporary storage of
precipitation excess in the watershed as it drains to the outlet point. It can be estimated
via calibration if gauged precipitation and streamflow data are available. Though R has
units of time, there is only a qualitative meaning for it in the physical sense. Clark
(1945) indicated that R can be computed as the flow at the inflection point on the
falling limb of the hydrograph divided by the time derivative of flow.
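The time-area relationship of equation 3.62, and the inflow to the linear reservoir derived from it, can be sketched as follows. The function names are hypothetical. Clamping the fraction to 0 before the start and to 1 after the time of concentration is an assumption added for the illustration; equation 3.62 itself is stated only for times up to t_c.

```python
def time_area_fraction(t, t_c):
    """Cumulative contributing-area fraction A_t / A (equation 3.62)."""
    if t <= 0.0:
        return 0.0          # assumption: nothing contributes before the start
    if t >= t_c:
        return 1.0          # assumption: whole watershed contributes after t_c
    ratio = t / t_c
    if ratio <= 0.5:
        return 1.414 * ratio ** 1.5
    return 1.0 - 1.414 * (1.0 - ratio) ** 1.5

def unit_inflow(area, t_c, dt, n_steps):
    """Inflow ordinates to the linear reservoir for a unit depth of excess:
    incremental contributing area times unit depth, divided by dt."""
    fractions = [time_area_fraction(k * dt, t_c) for k in range(n_steps + 1)]
    return [area * (fractions[k + 1] - fractions[k]) / dt for k in range(n_steps)]

inflow = unit_inflow(area=10.0, t_c=4.0, dt=1.0, n_steps=4)
```

Because the constant 1.414 is a rounded value of the square root of 2, the two branches of equation 3.62 agree at t = t_c / 2 only to about four decimal places.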
3.6 XP-SWMM Model
According to the XP-SWMM Technical Reference Manual (2000), there are
five major types of hydrograph generation techniques available in Runoff. These are:
(a) SWMM runoff nonlinear reservoir method
Subcatchments are modelled as idealized rectangular areas with the slope
of the catchment perpendicular to the width.
(b) Kinematic wave method
The kinematic wave method for overland flow applies only the kinematic
wave component of the St. Venant shallow flow equations for momentum
and continuity.
(c) Laurenson nonlinear method/Rafts
When using Laurenson hydrology, the catchment width is by default not
used. The catchment roughness utilized when calculating the storage
delay parameter, B, is taken from the pervious Manning's n value for the
sub-catchment included with the infiltration information.
(d) SCS unit hydrograph method
As an alternative to the nonlinear runoff routing method employed
by Runoff, hydrographs may optionally be generated by the SCS method.
Typical values of the pervious area curve number vary from 20 for regions
with high infiltration and interception capacities to 98 for impervious
areas. The SCS has determined the hydrograph shape factor to be
484 for most watersheds. This was the result of analyzing many
watersheds of various sizes and geographic locations. It is used in the
formulation of the peak of the unit hydrograph,
Q_p = 484 A / t_p, in which Q_p is the peak discharge, A is the drainage
area, and t_p is the time to peak. The time of concentration, T_c, can be
estimated from several formulae, such as the kinematic wave formula. For a constant
excess rainfall, it can be described as,
\[ T_c = C \left( \frac{L^{0.6}\, n^{0.6}}{i^{0.4}\, S^{0.3}} \right) \tag{3.63} \]
where L is the distance from the upper end of the plane to the point of
interest, n is the Manning resistance coefficient, i is the excess rainfall
rate, S is the dimensionless slope of the surface, and C is a constant that
depends on the units of the other variables. T_p = (2/3) T_c for shape factor 484.
The initial abstraction from the precipitation may be represented as an
absolute number, that is, the total depth of precipitation that is lost, or as a
fraction of the amount of precipitation (between 0 and 1).
(e) Rational Formula
The rational method as applied in this application provides a unit
hydrograph approach applying a deterministic form of the Rational
Formula, Q = CIA. The three items of data needed for this procedure are
the runoff coefficient, C, the rainfall intensity, I, and the size of the
catchment area, A. The method utilizes the rainfall concurrent with the time
step being computed. The unit hydrograph incorporates a base length
equal to two times the time of concentration.
(f) Time area method
Time area methods utilize a convolution of the rainfall excess hyetograph
with a time area diagram representing the progressive area contributions
within a catchment in set time increments. The time area procedure
assumes a linear time area relationship for the sub-area and is based on an
input time of concentration. The only input necessary for this procedure is
the time of concentration for the sub-area. The runoff generated
from individual sub-catchments is determined using the following equation:
q_i = I_i A_1 + I_{i−1} A_2 + ... + I_1 A_i, where q_i is the hydrograph ordinate, I_i is the
effective rainfall intensity, A_i is the contributing sub-catchment area at a
particular time, and i is the number of isochrone areas contributing to the
outlet.
(g) Other unit hydrograph methods: Nash, Snyder, Snyder (Alameda)
and Santa Barbara urban hydrograph
(i) Nash unit hydrograph
Nash in 1957 proposed a conceptual catchment model that
represents a drainage basin as a series of identical linear
reservoirs. Two items of data are required to apply this
method: an exponent and a time of concentration.
(ii) Santa Barbara urban hydrograph (SBUH)
The SBUH method was developed by Santa Barbara County Flood
and Water Conservation District, California. The SBUH method
directly computes a runoff hydrograph without going through an
intermediate process as the SCS method does. The SBUH method
uses two steps to synthesize the runoff hydrograph: the first step
computes the instantaneous hydrograph, and the second step
computes the runoff hydrograph.
(iii) Snyder unit hydrograph
Snyder (1938) was the first to develop a synthetic unit hydrograph,
based on a study of watersheds in the Appalachian Highlands.
Snyder’s relationships are T_p = C_t (L L_c)^0.3, where T_p is the basin
lag, L is the length of the main stream from the outlet, L_c is the
length along the main stream to a point nearest the watershed
centroid, and C_t is a coefficient usually ranging from 1.8 to 2.2;
the peak discharge of the unit hydrograph, Q_p = 640 C_p A / T_p, where
A is the drainage area and C_p is the storage coefficient ranging
from 0.4 to 0.8; and the time base of the hydrograph, T_b = 3 + T_p / 8.
(iv) Snyder (Alameda) unit hydrograph
The Snyder (Alameda) method is the Snyder procedure as applied
by Alameda County in California, where the individual parameters
more suited to the Alameda region are computed from catchment
characteristics. This method requires four items of data: stream
length (L), centroid length (L_c), stream slope (S) and basin
roughness (N).
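As a worked illustration of the Snyder relationships listed under item (iii), the three formulas can be evaluated with a short Python sketch. The function name `snyder_uh` and the default coefficient values (mid-range values of the quoted intervals) are hypothetical choices. The constant 640 belongs to the US customary form of Snyder's equations, so lengths in miles, area in square miles, lag in hours and time base in days are assumed here.

```python
def snyder_uh(length, centroid_length, area, c_t=2.0, c_p=0.6):
    """Snyder synthetic UH parameters.
    Returns (basin lag T_p, peak discharge Q_p, time base T_b)."""
    t_p = c_t * (length * centroid_length) ** 0.3   # basin lag
    q_p = 640.0 * c_p * area / t_p                  # peak discharge of the UH
    t_b = 3.0 + t_p / 8.0                           # time base of the hydrograph
    return t_p, q_p, t_b

t_p, q_p, t_b = snyder_uh(length=10.0, centroid_length=10.0, area=100.0)
```

For L = L_c = 10 the basin lag evaluates to about 7.96, showing how strongly the lag depends on the stream length through the 0.3 exponent.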
In this study, the models used for modelling the rainfall-runoff relationship
are Horton infiltration/loss and SCS, the Time Area unit hydrograph and the Rational
Formula unit hydrograph. These models were selected because they are simple and require
only a small number of parameters. They were therefore easy to set up and manage with
the limited information available for the study area.
3.7 Calibration of Distributed Models
The proper choice of calibration data may mitigate difficulties encountered in
model calibration. Critical issues pertaining to calibration data are the amount of data
necessary and sufficient for calibration and the quality of data resulting in the best
parameter estimates. However, our ability to address such issues is less than
complete (Singh, 2002). For distributed models, calibration is one of the
important steps in predicting discharge. Suppose that the parameters of a distributed
model have initially been calibrated only on the basis of prior information about soil and
vegetation type, with some adjustment of values being made to improve the simulation of
measured discharges. Suppose further that, after the initial calibration, a decision is made
to collect more spatially distributed information about the catchment response. Because of
the lack of information available about the internal responses of the catchment, the initial
calibration would probably use effective values of model parameters over wide regions of
the flow domain. But if the new data are used to improve the local calibration, more data
will be needed to evaluate the model, without necessarily having any impact on the variables
of greatest interest in predicting discharge in rainfall-runoff modelling.
For each catchment, with its own soil, geological conditions, etc., a set of
parameters needs to be established so that the HEC-HMS and SWMM
models can simulate the hydrological processes. The procedure for determining parameter
values for a particular catchment is called parameter calibration or parameter
optimization. Before undertaking parameter calibration, a system performance
check was carried out to ensure continuity of mass during simulations. As
with many other complex models, parameter calibration involved multiple trial-and-error
runs, with professional judgement being used to decide which parameters to adjust and to
what extent. The strategy for determining the model parameters can be summarized as
follows:
(a) Pick initial values for the parameters that are as rational as possible by doing some
simple calculations.
(b) Pick a low-flow period that is followed by a peak flow at the beginning of
a simulation and determine the initial type of parameter first.
(c) Test-run the model and evaluate the results.
Other parameters can also be adjusted, such as the Horton parameters, channel
geometric parameters, evapotranspiration, etc. Some of these parameters may have
little effect on short-term hydrological processes and thus may not need to be adjusted.
3.8
Evaluation of the Model
Most models perform little to no error analysis. Thus, it is not clear what the
model errors are and how different errors propagate through different model components
and parameters (Singh and Woolhiser, 2002). This is one of the major limitations of
most current catchment hydrology models. The issue of forecasting accuracy arises
both at the model development stage and when the model is used for the forecasting
task (Harun, 1999). During the development stage, model performance is monitored
based on the agreement of the proposed model with the actual data. Following the model
development stage, a suitable model will be selected and tested for its capability to perform
the prediction task. Basically, the performance of each model will be compared based on
the estimated errors between the computed and the actual data. It is often useful to
calculate more than one criterion for comparison because different criteria
may occasionally give different indications.
3.8.1
Goodness of Fit Tests
There are many measures that can be used to evaluate the results of a model
simulation. The choice will depend on what observational data are available to evaluate each
model. Most measures of goodness of fit used in hydrograph simulation in the past have
been based on the sum of squared errors, or error variance (Beven, 2001). For this study,
quantitative tests will be applied to assess the capability of the models in describing
the process investigated. The quantitative tests will be performed using the several
indicators described below. It is often more useful to calculate more than one
criterion for comparison, as different measures may give different indications.
The performance of each candidate model will be compared based on the
estimated errors between the computed values or predictions and the actual data. The present
study therefore employs five different measures of accuracy to assess the performance of
each model. The formulas for the five measures are:
3.8.1.1 Correlation of coefficient ($R^2$)
The criterion used for assessing the performance of the different models is the
established $R^2$ criterion of Nash and Sutcliffe (1970), which is expressed, with overbars
denoting means over the $n$ time periods, as:
$$R^2 = \frac{\sum_{t=1}^{n}\left[(Q_{o(t)} - \bar{Q}_{o})(Q_{s(t)} - \bar{Q}_{s})\right]}{\left[\sum_{t=1}^{n}(Q_{o(t)} - \bar{Q}_{o})^{2}\,\sum_{t=1}^{n}(Q_{s(t)} - \bar{Q}_{s})^{2}\right]^{1/2}} \qquad (3.64)$$
In which $Q_o$ is the actual observed streamflow value, $Q_s$ is the model-simulated
streamflow value, and $n$ is the number of time periods of observed streamflow over which
the errors are computed. A value of $R^2$ of 90% or more indicates a very satisfactory model
performance, while a value in the range 80-90% indicates a fairly good model. Values of
$R^2$ in the range 60-80% indicate an unsatisfactory model fit (Kachroo, 1986). For a
perfect match, the value of $R^2$ should be 1.0 (Thirumalaiah and Deo, 2000; Yu and
Yang, 2000).
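As an illustration of Equation (3.64) and the rating bands attributed to Kachroo (1986), the following minimal sketch computes the coefficient as the equation is written above and maps a value to the quoted descriptions. The function names and the sample flows are hypothetical, and the handling of values exactly on a band boundary or below 60% is an assumption, since the text does not specify it.

```python
import math


def correlation_coefficient(observed, simulated):
    """Evaluate Equation (3.64) as written: cross-product sum over the
    square root of the product of the two sums of squared deviations."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cross = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_s = sum((s - mean_s) ** 2 for s in simulated)
    return cross / math.sqrt(ss_o * ss_s)


def kachroo_rating(value):
    """Map a coefficient (as a fraction) to the bands quoted in the text.
    Boundary handling and the label below 60% are assumptions."""
    pct = 100.0 * value
    if pct >= 90.0:
        return "very satisfactory"
    if pct >= 80.0:
        return "fairly good"
    if pct >= 60.0:
        return "unsatisfactory"
    return "not classified in the text"


# Hypothetical daily flows in cumecs
obs = [4.2, 4.8, 6.9, 5.5, 4.4]
sim = [4.3, 4.6, 6.5, 5.7, 4.5]
r = correlation_coefficient(obs, sim)
print(round(r, 4), kachroo_rating(r))
```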
3.8.1.2 The root mean square error (RMSE)
$$\mathrm{RMSE} = \left[\frac{1}{n}\sum_{t=1}^{n}\left(Q_{o(t)} - Q_{s(t)}\right)^{2}\right]^{1/2} \qquad (3.65)$$
Generally, these formulas evaluate the models based on a comparison of the estimated
errors between the actual observations and the fitted model. A model with the minimum
error is considered the best choice.
3.8.1.3 The relative root mean square error (RRMSE)
$$\mathrm{RRMSE} = \left[\frac{1}{n}\sum_{t=1}^{n}\left[\frac{Q_{o(t)} - Q_{s(t)}}{Q_{o(t)}}\right]^{2}\right]^{1/2} \qquad (3.66)$$
To compare the accuracy of the model used in estimating the runoff, the at-site estimates
(using the actual data from the site being estimated) were compared with those found
using the model.
3.8.1.4 The mean absolute percentage error (MAPE)
$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\frac{\left|Q_{o(t)} - Q_{s(t)}\right|}{Q_{o(t)}} \times 100\% \qquad (3.67)$$
Johnson and King (1988) suggested that a MAPE of around 30% is considered a reasonable
prediction. Further, the analysis will be considered very accurate when the MAPE is in
the range of 5% to 10%.
3.8.1.5 The percentage bias (PBIAS)
$$\mathrm{PBIAS} = \frac{\sum_{t=1}^{n}\left(Q_{s(t)} - Q_{o(t)}\right)}{\sum_{t=1}^{n} Q_{o(t)}} \times 100\% \qquad (3.68)$$
The performance of models calibrated using various objective functions is evaluated using
the percentage bias (PBIAS) proposed by Yapo et al. (1996). PBIAS measures the
bias of model performance. The optimal value is 0.0, which means that the model gives an
unbiased flow simulation. Positive values indicate a tendency towards overestimation and
negative values indicate a tendency towards underestimation.
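The four error measures of Sections 3.8.1.2 to 3.8.1.5 can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the code used in the study: the function names and sample flows are hypothetical, all observed flows are assumed to be non-zero, and the PBIAS denominator is taken as the sum of the observed flows, which is the form usually attributed to Yapo et al. (1996) and is consistent with the sign convention described above.

```python
import math


def rmse(obs, sim):
    """Root mean square error, in the units of the flows."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def rrmse(obs, sim):
    """Relative RMSE: each error is scaled by the observed flow."""
    return math.sqrt(sum(((o - s) / o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mape(obs, sim):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(o - s) / o for o, s in zip(obs, sim)) / len(obs)


def pbias(obs, sim):
    """Percentage bias; positive means overestimation (assumed denominator:
    sum of observed flows)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


# Hypothetical observed and simulated flows in cumecs
obs = [4.0, 5.0, 8.0]
sim = [5.0, 5.0, 6.0]
print(rmse(obs, sim), rrmse(obs, sim), mape(obs, sim), pbias(obs, sim))
```

In this small example the simulation overestimates the first flow and underestimates the third, so MAPE is positive while PBIAS is negative, which shows why more than one criterion is worth reporting.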
All these measures are aimed at providing a relative measure of model
performance. The measure should reflect the aims of a particular application in an
appropriate way. Beven (2001) noted that there is no universal performance measure
and that, whatever choice is made, there will be an effect on the relative goodness-of-fit
estimates for different models and parameter sets, particularly if an optimum parameter
set is sought. It is often useful to calculate more than one criterion for comparison
because different criteria may occasionally give different indications.
In general, the model with the minimum error is chosen as the best model for prediction.
The root mean square error (RMSE) is one of the most commonly used performance measures
in hydrological modelling. Many researchers have used RMSE as an accuracy measure
(Hsu et al., 1995; Shamseldin, 1997; Harun, 1999; Tokar and Markus, 2000; Elshorbagy
et al., 2000; Yu and Yang, 2000). Harun (1999) reported that the interpretation of the
MAPE is quite straightforward, because most people can readily appreciate the significance
of forecasts being within a certain percentage of the true value.
Therefore, trained networks were compared with each other using the goodness-of-fit
statistics discussed above, and the best-fit network was selected for each training and
testing set.
3.8.2
Missing Data and the Outliers
Missing data are very common in hydrological data collection and need to be
estimated or treated. A number of methods can be used for this
purpose, such as the arithmetic mean method, normal ratio method, modified normal ratio
method, inverse distance method, quadrant method, modified inverse distance method,
isohyetal method, Beard’s method, rank matching method and linear programming method.
The rainfall data from the rain gauges were used in calibration of the models. Some data
were found to be missing at different stations for different events. The linear
regression method was used to fill in the missing rainfall data. Goodness of fit between
the data sets of the various stations was judged by the resulting values of the Index of Fit ($R^2$).
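A minimal sketch of filling gaps by linear regression against a neighbouring station is given below. It is an illustration only: the thesis does not state which stations were paired or how the regression was set up, so the function names, the single-neighbour form, the use of `None` to mark gaps and the clamping of estimates at zero are all assumptions.

```python
def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, plus the index of fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


def fill_missing(target, neighbour):
    """Fill None entries of `target` from a complete `neighbour` record.

    The regression is fitted on the periods where both stations have data.
    Estimates are clamped at zero because rainfall cannot be negative
    (an assumption of this sketch).
    """
    pairs = [(nb, tg) for nb, tg in zip(neighbour, target) if tg is not None]
    slope, intercept, r2 = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    filled = [
        tg if tg is not None else max(0.0, intercept + slope * nb)
        for nb, tg in zip(neighbour, target)
    ]
    return filled, r2


# Hypothetical event rainfall (mm) at two gauges; one gap at the target gauge
neighbour = [0.0, 10.0, 20.0, 30.0]
target = [1.0, 21.0, None, 61.0]
filled, r2 = fill_missing(target, neighbour)
print(filled, r2)
```

A low index of fit between two stations would suggest that the neighbour is a poor predictor and that another station or method should be considered.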
Outliers are another aspect that should be addressed in preparing the data series.
There are two types of outliers, namely high outliers and low outliers. Maidment (1993)
described outliers as data points which depart significantly from the trend of
the remaining data. In experimental statistics, an outlier is often a rogue observation
which may result from unusual conditions or from observational or recording error; such
observations are often discarded. High outliers are retained unless historical information
is identified showing that such floods are the largest in an extended period.
If no cause can be determined, Grubbs (1969) proposed one of the following
actions for treating the outliers:
(i) The outliers could be eliminated from the sample, with only the remaining
observations used in further analysis.
(ii) The outliers could be replaced with the next closest values in the sample.
(iii) The outliers could be retained, and the median used in the test statistic
rather than the mean.
(iv) The outliers could be removed, and truncated sample theory applied.
3.9
The Study Area
Automatic gauging stations were chosen from selected catchments in
Peninsular Malaysia. These stations were chosen based on several criteria. First and
foremost, the quality and quantity of data for each station were individually screened; the
initial screening covered all the available automatic gauging stations in the
country. Some stations had numerous days of missing data in a year, which makes their annual
maximum values questionable. These stations are excluded from the study. The length
of record available is of primary importance. The amount of data available for
each station varies from 23 to 29 years, as the earliest date of operation for
automatic rain gauges in Peninsular Malaysia was 1971.
In this study, mathematical programming based on neural network methods
was applied to model the rainfall-runoff relationship in the selected catchment areas. A
total of four regions have been defined in Peninsular Malaysia, within which reasonably
consistent regional rainfall-runoff relationships have been established. The precision with
which the areal extent of these regions can be specified is largely dependent upon the
areal density of the gauging stations within each region.
Each region consists of the
catchment area of all the station records judged to be homogeneous and used in the
study. These regions are the Sungai Bekok, Sungai Ketil, Sungai Klang and Sungai Slim
catchment areas, shown in Figure 3.4, Figure 3.5, Figure 3.6 and Figure 3.7
respectively, together with the locations of the gauging stations used in the analysis. A
rainfall station register is to be completed for every rainfall station within each state.
This includes both manual and automatic recording stations in the primary and secondary
networks. Tables 3.2 to 3.5 show the characteristics of the rainfall and runoff stations
that are available in the selected catchments.
3.9.1
Selection of training and testing data
The selection of training data that represent the characteristics of a watershed and
its meteorological patterns is extremely important in modelling (Yapo et al., 1996). The input
variable (rainfall) is selected to describe the physical phenomena of the rainfall-runoff
process in order to forecast runoff. The steps involved in the identification of a dynamic
model of a system are the selection of input-output data suitable for calibration and
verification; the selection of a model structure and estimation of its parameters; and
validation of the identified models (Hsu et al., 1995).
For the recorded data to be generally useful, the exposure of the instruments and the
recording practices must remain as constant as possible for a lengthy period of at least a
decade (Hecht-Nielsen, 1991). The most current data were used in testing in order to
examine the capability of the model in predicting future runoff, without directly including
changes in land-use characteristics, such as urbanization. It is very difficult to assign
absolute error values to historic rainfall and runoff data. Some estimate of the likely
errors can be made by examining the station record, the physical features of the site,
actual chart record of water level where available, and the precision with which the
station rating curve is defined at flood stages. In many cases, for historic reasons, very
little of this type of information is available. A subjective classification of the value of
the flood peak information used in this study has been made. In considering the station
records within a proposed flood frequency region, the effects of possible data errors on
the results can be considered to be, (1) errors in the record likely to make the inclusion of
the station within the flood frequency region suspect, (2) errors in the record likely to
influence the mean annual flood-catchment area relationship, and (3) errors in the record
likely to influence the regional flood frequency relationship. The data from each station
record used in this study have been examined and estimates made in respect of these
three considerations. The errors in the data finally used to define the areal extent and the
flood frequency relationships for each region are considered to have negligible effect on
the results presented.
For daily rainfall-runoff modelling, five-year records of daily rainfall-runoff
series for the Sungai Bekok catchment (1984-1988), Sungai Ketil catchment (1986-1990),
Sungai Klang catchment (1996-2000) and Sungai Slim catchment (1996-2000) were
selected to evaluate the performance of the neural network model. The data consist
of two sets: the first four years of data are used for model calibration (training) and
validation in the case of the ANN, and the remaining one year of data is used for model
verification (testing). The data used for the calibration process can cover more than four
years in order to represent the characteristics of the catchment area. Two data sets
representing the wet and dry seasons were selected for testing in order to identify the
rainfall-runoff behaviour in the selected catchments. Increasing the number of training data
in the training phase, with no change in neural network structure, is expected to improve
performance in the training and testing phases. Performance thus depends on providing
training data of adequate number and quality.
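The chronological split described above (the first four years for training, the final year for testing) can be sketched as follows. The record layout, a list of `(year, rainfall, runoff)` tuples, and the function name are hypothetical and serve only to illustrate the idea.

```python
def split_by_year(records, n_train_years=4):
    """Split records chronologically: earliest years train, the rest test.

    `records` is a list of (year, rainfall, runoff) tuples (assumed layout).
    """
    years = sorted({rec[0] for rec in records})
    train_years = set(years[:n_train_years])
    train = [rec for rec in records if rec[0] in train_years]
    test = [rec for rec in records if rec[0] not in train_years]
    return train, test


# Hypothetical five-year record with three values per year
records = [(year, 1.0, 2.0) for year in range(1984, 1989) for _ in range(3)]
train, test = split_by_year(records)
print(len(train), len(test))
```

Keeping the most recent year for testing means the model is evaluated on data it has not seen and that follow the training period in time.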
Meanwhile, for hourly rainfall-runoff modelling, ten-year records of hourly
rainfall-runoff series for the Sungai Bekok catchment (1991-2000), Sungai Ketil catchment
(1983-1992), Sungai Klang catchment (1991-2000) and Sungai Slim catchment (1988-1997)
are used. Good-quality input-output pairs of data were selected to develop
and evaluate the neural network models. The neural network was trained under two
sets of conditions. In this study, 55 sets of data were selected from the records. The
first 50 sets of data are used for model calibration (training), and the remaining 5 sets of
data are used for model verification (testing). The data used for the calibration process can
comprise more than 50 sets in order to represent the characteristics of the catchment area. It is
anticipated that increasing the number of training data in the training phase, with no
change in neural network structure, will improve performance in the training and
testing phases. Performance thus depends on providing training data of adequate number
and quality. In addition, the capability of a network to predict peak flow was examined
using the ratio of the predicted to the observed peak flow values.
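The peak flow ratio mentioned above is simple to compute. The sketch below assumes that the ratio is taken between the largest predicted and the largest observed value within one event, without requiring the two peaks to occur at the same time step; the function name and the sample flows are hypothetical.

```python
def peak_flow_ratio(observed, predicted):
    """Ratio of the predicted peak flow to the observed peak flow."""
    return max(predicted) / max(observed)


# Hypothetical hourly flows (cumecs) for one event
ratio = peak_flow_ratio([4.0, 8.0, 5.0], [4.5, 6.0, 5.0])
print(ratio)
```

A ratio of 1.0 would indicate that the peak is reproduced exactly, while values below or above 1.0 indicate under- or overestimation of the peak.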
3.9.2 The Sungai Bekok Catchment
Figure 3.4 shows a schematic of the Sungai Bekok catchment area. It is located in
the state of Johor, Malaysia. The Sg. Bekok is a natural catchment with an area of 350 km².
In terms of land use, the Sg. Bekok catchment consists of 85% open space
(agriculture, fields, roads, utility reserve, etc.) and 15% domestic area. In future,
there is a high possibility of further changes in the land-use pattern. The catchment is
located in the southwestern part of Johor, at latitude 02° 07’ 15” and longitude 103° 02’ 30”.
Table 3.2 shows the latitude and longitude of the water level station and the raingauge
stations for the Sungai Bekok catchment. Two raingauge stations (No. 2130068 and 2031069)
located within the catchment boundary have been used for analysis purposes. Rainfall
is recorded at intervals down to 1 minute by electronic data loggers and retrieved once
fortnightly. An automatic water level recorder is provided downstream at Sg. Bekok
At Bt. 77 Jalan Yong Peng/Labis (2130422). The water level recorder can record
readings at intervals down to 1 minute using an electronic logger. The data are retrieved once
fortnightly. According to the Soil Map of Malaya (1962), Ministry of Agriculture, Malaysia,
this catchment area consists of two types of soil: 35% red and
yellow latosols and red and yellow podzolic soils on gently to strongly sloping land of
variable fertility derived from a variety of sedimentary rocks; and 65% organic soils,
principally peats, with some mucks, developed over mineral alluvial soils in poorly
drained situations, of limited suitability for agricultural development.
Table 3.2: Raingauges used in calibration and verification of the models for Sg. Bekok catchment

| Station | Latitude | Longitude |
|---|---|---|
| (a) Water Level Station | | |
| At Sg. Bekok At Bt. 77 Jalan Yong Peng/Labis (2130422) | 02° 07’ 15” | 103° 02’ 30” |
| (b) Raingauge | | |
| At Ladang Union, Yong Peng (2130068) | 02° 07’ 50” | 103° 03’ 00” |
| At Ladang Yong Peng, Batu Pahat (2031069) | 02° 04’ 15” | 103° 09’ 10” |
Figure 3.4
The Sungai Bekok catchment area
3.9.3 The Sungai Ketil Catchment
Figure 3.5 shows a schematic of the Sungai Ketil catchment area. It is located in
the state of Kedah, Malaysia. The Sg. Ketil catchment is a semi-developed area of
704 km². It consists of 53% open space (agriculture, fields, roads, utility
reserve, etc.), 22% domestic area and 25% commercial area. It is located in the southern
part of Kedah, at latitude 05° 38’ 20” and longitude 100° 48’ 45”. Table 3.3 shows
the latitude and longitude of the water level station and the raingauge stations for the Sg.
Ketil catchment. Three raingauge stations (No. 5608074, 5609072 and
5708071) located within the catchment boundary have been used for analysis purposes.
Rainfall is recorded at intervals down to 1 minute by electronic data loggers and retrieved
once fortnightly. An automatic water level recorder is provided downstream at Sg.
Ketil at Kuala Pegang (5608418). The water level recorder can record readings at intervals
down to 1 minute using an electronic logger. The data are retrieved once fortnightly.
According to the Soil Map of Malaya (1962), Ministry of Agriculture, Malaysia, this
catchment area consists of three types of soil: 45% red and
yellow latosols and red and yellow podzolic soils on gently to strongly sloping land,
mostly of average fertility, derived from acid igneous rocks; 45% laterite soils on
gently to strongly sloping land, mostly of average to below average fertility; and 10%
freely drained coarse textured grey brown podzols of below average fertility, developed
over recently accumulated coast deposits with associated swamps.
Figure 3.5
The Sungai Ketil catchment area
Table 3.3: Raingauges used in calibration and verification of the models for Sg. Ketil catchment

| Station | Latitude | Longitude |
|---|---|---|
| (a) Water Level Station | | |
| At Sg. Ketil At Kuala Pegang (5608418) | 05° 38’ 20” | 100° 48’ 45” |
| (b) Raingauge | | |
| At Pulai (5608074) | 05° 39’ 25” | 100° 53’ 55” |
| At Hospital Baling (5609072) | 05° 40’ 50” | 100° 55’ 00” |
| At Kg. Terabak (5708071) | 05° 45’ 05” | 100° 53’ 35” |
3.9.4 The Sungai Klang Catchment
Figure 3.6 shows a schematic of the Sungai Klang catchment area. It is located in
Kuala Lumpur, Malaysia. The Sg. Klang catchment is a fully developed, urbanized
area of 468 km². Urbanization is known to have a significant effect on
rainfall-runoff relationships. The catchment consists of 65% commercial area, 25% domestic
area and 10% open space (agriculture, fields, roads, utility reserve, etc.). It is located
in the northwestern part of Kuala Lumpur, at latitude 03° 08’ 20” and longitude 101°
41’ 50”. Table 3.4 shows the latitude and longitude of the water level station and the
raingauge stations for the Sg. Klang catchment. Eight raingauge stations (No.
3116003, 3116005, 3116006, 3117070, 3118069, 3216001, 3217001 and 3116006)
located within the catchment boundary have been used for analysis purposes. Rainfall
is recorded at intervals down to 1 minute by electronic data loggers and retrieved once
fortnightly. An automatic water level recorder is provided downstream at Sg. Klang at
Jambatan Sulaiman (3116430). The water level recorder can record readings at intervals
down to 1 minute using an electronic logger. The data are retrieved once fortnightly.
According to the Soil Map of Malaya (1962), Ministry of Agriculture, Malaysia, this
catchment area consists of three types of soil: 45% lithosols and
shallow latosols on steep mountainous and hilly land considered unsuitable for extensive
agricultural development; 40% red and yellow latosols and red and yellow podzolic
soils on gently to strongly sloping land, mostly of average fertility, derived from acid
igneous rocks; and 15% disturbed land, chiefly tin tailings, of limited suitability for
agriculture.
Figure 3.6
The Sungai Klang catchment area
Table 3.4: Raingauges used in calibration and verification of the models for Sg. Klang catchment

| Station | Latitude | Longitude |
|---|---|---|
| (a) Water Level Station | | |
| At Sg. Klang At Jambatan Sulaiman (3116430) | 03° 08’ 20” | 101° 41’ 50” |
| (b) Raingauge | | |
| At Ibu Pejabat JPS, Malaysia (3116003) | 03° 09’ 05” | 101° 41’ 05” |
| At Sekolah Rendah Taman Maluri (3116005) | 03° 11’ 50” | 101° 38’ 10” |
| At Ladang Edinburgh Site 2 (3116006) | 03° 11’ 00” | 101° 38’ 00” |
| At Pusat Penyelidikan JPS, Ampang (3117070) | 03° 09’ 20” | 101° 45’ 00” |
| At Pemasokan Ampang (3118069) | 03° 09’ 30” | 101° 48’ 05” |
| At Kg. Sg. Tua (3216001) | 03° 16’ 20” | 101° 41’ 10” |
| At Ibu Bekalan KM. 16, Gombak (3217001) | 03° 16’ 05” | 101° 43’ 45” |
| At Ibu Bekalan KM. 11, Gombak (3116006) | 03° 14’ 10” | 101° 45’ 10” |
3.9.5 The Sungai Slim Catchment
Figure 3.7 shows a schematic of the Sungai Slim catchment area. It is located in
the state of Perak, Malaysia. The Sg. Slim is a semi-developed area of 455 km². It
consists of 65% open space (agriculture, fields, roads, utility reserve, etc.), 25%
domestic area and 10% commercial area. It is located in the middle part of Perak, at
latitude 03° 49’ 35” and longitude 101° 24’ 40”. Table 3.5 shows the latitude and
longitude of the water level station and the raingauge stations for the Sg. Slim catchment.
Two raingauge stations (No. 3814156 and 3814157) located within the
catchment boundary have been used for analysis purposes. Rainfall is recorded at intervals
down to 1 minute by electronic data loggers and retrieved once fortnightly.
An automatic water level recorder is provided downstream at Sg. Slim At Slim River
(3814416). The water level recorder can record readings at intervals down to 1 minute using
an electronic logger. The data are retrieved once fortnightly. According to the Soil Map of
Malaya (1962), Ministry of Agriculture, Malaysia, this catchment area consists of three
types of soil: 10% lithosols and shallow latosols on steep
mountainous and hilly land considered unsuitable for extensive agricultural development;
55% red and yellow latosols and red and yellow podzolic soils on gently to strongly
sloping land of variable fertility derived from a variety of sedimentary rocks; and 35%
low humic gley soils, being moderately and poorly drained soils developed over coastal
plains and in the valleys and flood plains of the larger rivers, of very variable fertility.
Table 3.5: Raingauges used in calibration and verification of the models for Sg. Slim catchment

| Station | Latitude | Longitude |
|---|---|---|
| (a) Water Level Station | | |
| At Sg. Slim At Slim River (3814416) | 03° 49’ 35” | 101° 24’ 40” |
| (b) Raingauge | | |
| At Ladang Bedford, Slim River (3814156) | 03° 51’ 40” | 101° 26’ 40” |
| At Ladang Baba Bakala (3814157) | 03° 49’ 55” | 101° 24’ 30” |
Figure 3.7
The Sungai Slim catchment area
The most current data were used in testing in order to examine the capability of
the model in predicting future runoff, without directly including changes in land-use
characteristics, such as urbanization.
Based on the daily and hourly rainfall-runoff
relationships, five data sets (set 1, set 2, set 3, set 4 and set 5) were selected
for training and testing in order to identify the rainfall-runoff behaviour in the four
selected catchments in Peninsular Malaysia.
3.10
Computer Packages
In this study, the computer programs were written in MATLAB, copyrighted by
MathWorks Inc. (2000), as a tool to develop model structures for prediction of the
rainfall-runoff relationship. The modelling approach used in the present study is based
on the artificial neural network method for modelling hydrologic input-output relationships.
MATLAB can be used effectively to enhance understanding and to enable the
researcher to put theory into practice. The software is known to be user-friendly
and flexible, with a high capability for the analysis and design of hydrologic processes. The
performance of these models was compared with that of existing models available on
the market. The results of this study will permit the identification of the best models for
rainfall-runoff modelling.
CHAPTER 4
RESULTS AND DISCUSSION
4.1
General
This study focuses on the application of the Multilayer Perceptron (MLP) and Radial
Basis Function (RBF) methods in modelling the rainfall-runoff relationship. Results of the
MLP and RBF models are presented in Sections 4.2 and 4.3 respectively. Results of the
multiple linear regression (MLR), HEC-HMS and XP-SWMM models are described in
Sections 4.4, 4.5 and 4.6 respectively. The term ‘training’ used for the MLP and RBF models
is equivalent to the term ‘calibration’ used for the other comparison models.
Likewise, the term ‘testing’ used for the MLP and RBF models is equivalent to the term
‘verification’ used for the other models. Training is carried out to
obtain the best parameters or weight coefficients before the model is ready to be used
for testing, prediction and evaluation. The evaluation is based on the correlation
of coefficient ($R^2$), root mean square error (RMSE), relative root mean square error
(RRMSE), mean absolute percentage error (MAPE) and percentage bias (PBIAS).
Appendix J shows the programs for the daily (part A) and hourly (part B) rainfall-runoff
relationships that were developed using MATLAB.
4.2
Results of the Multilayer Perceptron (MLP) Model
To evaluate the performance of the model, previous records of daily and hourly
rainfall-runoff data for the Sungai Bekok, Sungai Ketil, Sungai Klang and Sungai Slim
catchments are used. Five selected sets of data are used as the prediction sets.
In the case of daily rainfall-runoff modelling, testing data sets 1, 2 and 3 represent normal
conditions, comprising one year, six months and another six months of data respectively.
Meanwhile, testing data sets 4 and 5 represent the dry and wet
seasons. For the neural network training process, the best number of hidden nodes is chosen
based on the minimum root mean square error (RMSE) computed for the training data. For the
training or calibration phase, 100%, 50% and 25% (the minimum) of the training
data sets are used. For example, in the case of daily rainfall-runoff modelling, 100% of the
data consists of 4 years or 1460 data sets. Meanwhile, 50% and 25% of the data means that the
calibration data were randomly selected from the total of 1460 data sets. Even though the data
sets for the training process are randomly selected, they must cover the maximum and minimum
values of the testing data sets. The same approach was used for hourly
rainfall-runoff modelling. The objectives are, first, to evaluate the effect of the
quantity of data on model performance and the accuracy of the results and, second, to
evaluate the robustness of the model.
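One possible way to draw a random 50% or 25% subset while still covering the range of the testing data is sketched below. The thesis does not state how coverage was ensured, so forcing the smallest and largest training values into the subset is an assumption of this sketch, as are the function name, the fixed random seed and the synthetic runoff series.

```python
import random


def sample_training_subset(runoff, fraction, test_min, test_max, seed=0):
    """Return sorted indices of a random training subset.

    The subset always contains the smallest and largest training values so
    that its range spans [test_min, test_max] (assumed strategy).
    """
    if min(runoff) > test_min or max(runoff) < test_max:
        raise ValueError("full training record does not span the testing range")
    k = max(2, round(fraction * len(runoff)))
    forced = {runoff.index(min(runoff)), runoff.index(max(runoff))}
    others = [i for i in range(len(runoff)) if i not in forced]
    rng = random.Random(seed)
    chosen = forced | set(rng.sample(others, k - len(forced)))
    return sorted(chosen)


# Synthetic stand-in for 1460 daily runoff values
runoff = [float(i % 37) for i in range(1460)]
idx = sample_training_subset(runoff, 0.25, test_min=1.0, test_max=30.0)
print(len(idx))
```

Repeating the draw with different seeds would give some indication of how sensitive the trained network is to the particular subset chosen.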
4.2.1
Results of Daily MLP Model
Two types of MLP model structure have been developed for modelling the
daily rainfall-runoff relationship. The first model is the 3-layer MLP structure with one
hidden layer. The second model is the 4-layer MLP structure with two hidden layers. Results
of the 3-layer and 4-layer MLP models are described below.
Tables 4.1(a) to 4.1(c) present the correlation of coefficient ($R^2$), RMSE,
RRMSE and MAPE resulting from the 3-layer MLP model of the daily rainfall-runoff
relationship for the Sungai Bekok catchment.
Meanwhile, Figures 4.1(a) to 4.1(c)
illustrate the graphical results of the 3-layer MLP during training and testing.
For the Sungai Bekok catchment, the 3-layer MLP has 16 input nodes and 13 hidden
nodes. The total number of parameters is 250, calculated as
(16x13)+(13x1)+16+13=250, which represents
the total number of weights in the network structure. Using 100% of the training data sets
(Table 4.1(a)) gives more consistent and reliable results in the prediction phase
than using 50% or 25% of the training data sets. It can be concluded
that a large number of training data sets is required for successful training. Most
values of $R^2$ lie in the range of 80% to 97%. This outcome indicates that the MLP
model consistently shows good performance in rainfall-runoff modelling. Kachroo
(1986) reported that the model is very satisfactory when $R^2$ is more than 90% and fairly
good when $R^2$ is in the range of 80% to 90%. Johnson and King (1988) judged model
accuracy on the basis of the MAPE value: a prediction model is considered reasonable with a
MAPE below 30% and very accurate with a MAPE of less than 10%. The modelling results
for Sungai Bekok, with MAPE less than 10%, can therefore be considered very accurate.
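The parameter count quoted above can be reproduced with a short helper that follows the arithmetic given in the text term by term. The function name is hypothetical, and reading the last two terms as one additional parameter per input node and per hidden node is an assumption; other texts count the bias terms of a multilayer perceptron differently.

```python
def mlp_parameter_count(n_input, n_hidden, n_output=1):
    """Parameter count following the arithmetic quoted in the text:
    (inputs x hidden) + (hidden x outputs) + inputs + hidden."""
    input_to_hidden = n_input * n_hidden
    hidden_to_output = n_hidden * n_output
    return input_to_hidden + hidden_to_output + n_input + n_hidden


# The 16-13-1 structure used for Sungai Bekok
print(mlp_parameter_count(16, 13, 1))  # 208 + 13 + 16 + 13 = 250
```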
Table 4.1(a): Results of 3-Layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase

| Model / Data Set | Model Structure | No. of Parameters | COC ($R^2$) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-13-1* | 250 | 0.9350 | 0.1660 | 0.0328 | 2.010 |
| MLP-TEST Set 1 | 16-13-1* | 250 | 0.9310 | 0.1220 | 0.0242 | 1.8143 |
| MLP-TEST Set 2 | 16-13-1* | 250 | 0.9580 | 0.0893 | 0.0183 | 1.6232 |
| MLP-TEST Set 3 | 16-13-1* | 250 | 0.9102 | 0.1264 | 0.0253 | 1.9461 |
| MLP-TEST Set 4 | 16-13-1* | 250 | 0.9706 | 0.1254 | 0.0262 | 2.3390 |
| MLP-TEST Set 5 | 16-13-1* | 250 | 0.8441 | 0.1345 | 0.0276 | 2.1922 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient
Table 4.1(b): Results of 3-Layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase

Model / Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING       16-13-1*          250                 0.9660     0.1258          0.0254   1.8522
MLP-TEST Set 1     16-13-1*          250                 0.9144     0.1709          0.0337   2.7322
MLP-TEST Set 2     16-13-1*          250                 0.9571     0.1050          0.0206   1.6073
MLP-TEST Set 3     16-13-1*          250                 0.8860     0.1696          0.0338   2.7705
MLP-TEST Set 4     16-13-1*          250                 0.9718     0.0936          0.0193   1.7228
MLP-TEST Set 5     16-13-1*          250                 0.8155     0.1612          0.0328   2.6859

(*) input nodes-hidden nodes-output nodes; cumecs = cubic metres per second; COC = correlation of coefficient
Table 4.1(c): Results of 3-Layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase

Model / Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING       16-13-1*          250                 0.9380     0.1327          0.0270   1.8472
MLP-TEST Set 1     16-13-1*          250                 0.8860     0.1248          0.0249   1.7972
MLP-TEST Set 2     16-13-1*          250                 0.9451     0.0981          0.0203   1.7598
MLP-TEST Set 3     16-13-1*          250                 0.8507     0.1307          0.0266   1.9147
MLP-TEST Set 4     16-13-1*          250                 0.9655     0.1357          0.0282   2.6268
MLP-TEST Set 5     16-13-1*          250                 0.7680     0.1358          0.0282   2.1433

(*) input nodes-hidden nodes-output nodes; cumecs = cubic metres per second; COC = correlation of coefficient
During the training phase, the RMSE for Sungai Bekok is consistently less than 0.17
cumecs for the 3-layer (16-13-1) model structure, and the RRMSE stays below 0.033.
During testing, the RMSE is at most about 0.17 cumecs and the RRMSE is close to zero.
The application of the MLP method to model the rainfall-runoff relationship of Sungai
Bekok is therefore successful. The Sungai Bekok has observed flows between 4.13 cumecs
and 7.10 cumecs and a catchment area of 350 km². Meanwhile, the results of R², RMSE,
RRMSE, and MAPE show that the networks calibrated using the dry-season data set (set 4)
approximate the rainfall-runoff process in the catchment more closely than the networks
calibrated using the wet-season data set (set 5).
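The four performance measures reported in the tables can be computed as in the sketch below. The definitions used here are common textbook forms and are an assumption: the thesis defines its measures in an earlier chapter, and its exact formulas (in particular for RRMSE and R²) may differ. All function names are illustrative.

```python
import math


def rmse(obs, sim):
    """Root mean square error, in the units of the flow (e.g. cumecs)."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def rrmse(obs, sim):
    """Relative RMSE (assumed form): RMS of errors relative to observed flow."""
    return math.sqrt(sum(((o - s) / o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mape(obs, sim):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs)


def r_squared(obs, sim):
    """R² (assumed form): 1 - residual sum of squares / total sum of squares."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Made-up flows (cumecs) purely for illustration, not thesis data
observed = [4.5, 5.0, 6.2, 7.0]
simulated = [4.4, 5.1, 6.0, 7.1]
print(rmse(observed, simulated), rrmse(observed, simulated))
print(mape(observed, simulated), r_squared(observed, simulated))
```

RMSE carries the units of flow, while RRMSE, MAPE, and R² are dimensionless, which is why they are more convenient for comparing catchments with very different flow magnitudes.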
Tables 4.2(a) to 4.2(c) present the R², RMSE, RRMSE, and MAPE resulting from the 4-layer
MLP model of the daily rainfall-runoff relationship for the Sungai Bekok catchment.
Figures 4.2(a) to 4.2(c) illustrate the graphical results of the 4-layer MLP during
training and testing.
Table 4.2(a): Results of 4-Layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase

Model / Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING       16-13-11-1*       402                 0.9310     0.1710          0.0339   2.1600
MLP-TEST Set 1     16-13-11-1*       402                 0.9340     0.1184          0.0234   1.7441
MLP-TEST Set 2     16-13-11-1*       402                 0.9604     0.0920          0.0189   1.7046
MLP-TEST Set 3     16-13-11-1*       402                 0.9148     0.1226          0.0246   1.8785
MLP-TEST Set 4     16-13-11-1*       402                 0.9732     0.1309          0.0274   2.4430
MLP-TEST Set 5     16-13-11-1*       402                 0.8568     0.1296          0.0266   2.1094

(*) input nodes-hidden nodes-output nodes; cumecs = cubic metres per second; COC = correlation of coefficient
Table 4.2(b): Results of 4-Layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase

Model / Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING       16-13-11-1*       402                 0.9624     0.1268          0.0250   1.6902
MLP-TEST Set 1     16-13-11-1*       402                 0.9245     0.1802          0.0356   2.9553
MLP-TEST Set 2     16-13-11-1*       402                 0.9580     0.1102          0.0214   1.6143
MLP-TEST Set 3     16-13-11-1*       402                 0.8987     0.1810          0.0360   3.0342
MLP-TEST Set 4     16-13-11-1*       402                 0.9679     0.0946          0.0192   1.7231
MLP-TEST Set 5     16-13-11-1*       402                 0.8378     0.1701          0.0346   2.9212

(*) input nodes-hidden nodes-output nodes; cumecs = cubic metres per second; COC = correlation of coefficient
Table 4.2(c): Results of 4-Layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase

Model / Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING       16-13-11-1*       402                 0.9155     0.1530          0.0304   2.0354
MLP-TEST Set 1     16-13-11-1*       402                 0.9233     0.1170          0.0231   1.6629
MLP-TEST Set 2     16-13-11-1*       402                 0.9614     0.0983          0.0204   1.8458
MLP-TEST Set 3     16-13-11-1*       402                 0.8966     0.1234          0.0247   1.8017
MLP-TEST Set 4     16-13-11-1*       402                 0.9662     0.1388          0.0292   2.6270
MLP-TEST Set 5     16-13-11-1*       402                 0.8449     0.1278          0.0263   2.0734

(*) input nodes-hidden nodes-output nodes; cumecs = cubic metres per second; COC = correlation of coefficient
For the Sungai Bekok catchment, the 4-layer MLP has 16 input nodes, with 13 and 11
hidden nodes in the first and second hidden layers respectively. Most values of R² in
Tables 4.2(a) to 4.2(c) are in the range of 80% to 97%. According to Kachroo (1986),
this is a fairly good and satisfactory model. Johnson and King (1988) judged model
accuracy by the MAPE value; by their criterion, this prediction model is very accurate,
with MAPE below 10%. Increasing the number of hidden layers to two does not
significantly change the results of rainfall-runoff modelling for the Sungai Bekok
catchment: R² remains in the range of 80% to 97%.
During the training phase, the RMSE for Sungai Bekok is consistently less than 0.18
cumecs for the 4-layer (16-13-11-1) model structure, and the RRMSE stays at or below
0.034. During testing, the RMSE is consistently less than 0.182 cumecs and the RRMSE is
less than 0.04, which is close to zero. The application of the 4-layer MLP structure to
model the rainfall-runoff relationship of Sungai Bekok is therefore very successful. It
is observed that the 4-layer networks perform well during calibration and testing with
the dry-season data set (set 4), approximating the rainfall-runoff process in the
catchment more closely than the networks calibrated using the wet-season data set
(set 5).
Tables A4.3(a) to A4.3(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE
resulting from the 3-layer MLP model for the Sungai Ketil catchment. Figures I4.3(a) to
I4.3(c) in Appendix I (Part A) illustrate the graphical results of the 3-layer MLP
during training and testing. For the Sungai Ketil catchment, the 3-layer MLP also has 16
input nodes, with 14 hidden nodes. With 100% of the training data sets, model
performance is more consistent and reliable for prediction than with 50% and 25% of the
training data sets. It is apparent that a large number of training data sets is required
for successful training. Most values of R² in Table A4.3(a) are in the range of 80% to
90%, which indicates that the MLP model consistently performs well in rainfall-runoff
modelling. According to Kachroo (1986), this is a fairly good model. According to
Johnson and King (1988), this prediction model is very accurate, with MAPE below 10%.
During the training phase, the RMSE for Sungai Ketil is consistently less than 0.41
cumecs for the 3-layer (16-14-1) model structure, and the RRMSE stays below 0.01. During
testing, the RMSE is less than 0.85 cumecs and the RRMSE is less than 0.027, which is
close to zero. The application of the MLP method to model the rainfall-runoff
relationship of Sungai Ketil is therefore also very successful. The Sungai Ketil has
observed flows between 29.02 cumecs and 33.15 cumecs, a maximum rainfall of 110.3 mm,
and a catchment area of 704 km².
Tables A4.4(a) to A4.4(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE
resulting from the 4-layer MLP model for the Sungai Ketil catchment. Figures I4.4(a) to
I4.4(c) in Appendix I (Part A) illustrate the graphical results of the 4-layer MLP
during training and testing. For the Sungai Ketil catchment, the 4-layer MLP also has 16
input nodes, with 14 and 10 hidden nodes in the first and second hidden layers
respectively. Most values of R² in Tables A4.4(a) to A4.4(c) are around 80% to 88% for
the 100% and 50% training data sets, so the model yields good results when calibrated
with these data sets. According to Kachroo (1986), this is fairly good. According to
Johnson and King (1988), this prediction model is very accurate, with MAPE below 10%.
Increasing the number of hidden layers to two does not significantly change the results
of rainfall-runoff modelling for the Sungai Ketil catchment. For the 100% and 50%
training data sets, R² remains in the range of 80% to 90%. During the training phase,
the RMSE for Sungai Ketil is consistently less than 0.35 cumecs for the 4-layer
(16-14-10-1) model structure, and the RRMSE stays at about 0.0095. During testing, the
RMSE is consistently less than 0.67 cumecs and the RRMSE is less than 0.022, which is
close to zero. The application of the 4-layer MLP structure to model the rainfall-runoff
relationship of Sungai Ketil is therefore very successful.
Tables A4.5(a) to A4.5(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE
resulting from the 3-layer MLP model for the Sungai Klang catchment. Figures I4.5(a) to
I4.5(c) in Appendix I (Part A) illustrate the graphical results during training and
testing. For the Sungai Klang catchment, the 3-layer MLP has 17 input nodes and 13
hidden nodes. With 100% of the training data sets, the model performance is moderate in
terms of consistency and reliability for prediction. Most values of R² in Table A4.5(a)
are in the range of 70% to 85% for the 100% and 50% training data sets. According to
Kachroo (1986), this is a fairly good model. According to Johnson and King (1988), this
prediction model is reasonable, with MAPE below 30%. The results using 100%, 50% and 25%
of the training data sets exhibit MAPE below 30%, except for data set 5, which
represents the wet season and gives MAPE above 30%. This confirms that a large number of
training data sets is required for successful training and good results. The outcome
indicates that the MLP model consistently shows a fairly good performance in
rainfall-runoff modelling for the Sungai Klang catchment. Some of the poorer performance
might be due to missing records, which would affect parameter estimation in the
calibration phase. During the training phase, the RMSE for Sungai Klang is consistently
less than 6.4 cumecs for the 3-layer (17-13-1) model structure, and the RRMSE stays
below 0.4. During the testing phase, the RMSE is consistently less than 13.03 cumecs and
the RRMSE stays below 0.65 for this 3-layer model structure. The application of the MLP
method to model the rainfall-runoff relationship of Sungai Klang is therefore
successful. The Sungai Klang has observed flows between 6.0 cumecs and 89.0 cumecs, a
maximum rainfall of 114 mm, and a catchment area of 468 km².
Tables A4.6(a) to A4.6(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE
resulting from the 4-layer MLP model for the Sungai Klang catchment. Figures I4.6(a) to
I4.6(c) in Appendix I (Part A) illustrate the graphical results of the 4-layer MLP
during training and testing. The Sungai Klang receives runoff from a fairly large
catchment (468 km²), and the observed flow ranges from 6 cumecs to 89 cumecs. The RMSE
during training is around 6 cumecs for the 4-layer (17-13-9-1) model structure, which
has 13 and 9 hidden nodes in the first and second hidden layers respectively. The RRMSE
is approximately 0.37. The testing phase yields RMSE between 4.5 cumecs and 14 cumecs
and RRMSE of approximately 0.3 to 0.4 for most of the data sets. Most values of R² in
Tables A4.6(a) to A4.6(c) are in the range of 70% to 90%, which shows consistent and
reliable results in the prediction phase. According to Kachroo (1986), this is fairly
good. According to Johnson and King (1988), this prediction model is reasonable, with
MAPE below 30%. Most of the results display MAPE below 30%; however, data set 5, which
represents the wet season, gives MAPE above 30%. Increasing the number of hidden layers
to two does not significantly change the results of rainfall-runoff modelling for the
Sungai Klang catchment. The application of the 4-layer MLP structure to model the
rainfall-runoff relationship of Sungai Klang is successful.
Tables A4.7(a) to A4.7(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE
resulting from the 3-layer MLP model for the Sungai Slim catchment. Figures I4.7(a) to
I4.7(c) in Appendix I (Part A) illustrate the graphical results of the 3-layer MLP
during training and testing. For the Sungai Slim catchment, the 3-layer MLP has 16 input
nodes and 14 hidden nodes. With 100% of the training data sets, the model is moderate in
terms of consistency and reliability for prediction. Most values of R² in Table A4.7(a)
are in the range of 73% to 90%. According to Kachroo (1986), this is a fairly good
model. According to Johnson and King (1988), this prediction model is very accurate,
with MAPE below 10%. This confirms that a large number of training data sets is required
for successful training and good results. The outcome indicates that the MLP model
consistently shows a good performance in rainfall-runoff modelling for the Sungai Slim
catchment. During the training phase, the RMSE for Sungai Slim is consistently less than
0.06 cumecs for the 3-layer (16-14-1) model structure, and the RRMSE stays below 0.001.
During the testing phase, the RMSE is consistently less than 0.04 cumecs and the RRMSE
stays below 0.0004 for this 3-layer model structure. The application of the MLP method
to model the rainfall-runoff relationship of Sungai Slim is therefore very successful.
The Sungai Slim has observed flows between 65.5 cumecs and 66.27 cumecs, a maximum
rainfall of 1310 mm, and a catchment area of 455 km².
Tables A4.8(a) to A4.8(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE
resulting from the 4-layer MLP model for the Sungai Slim catchment. Figures I4.8 in
Appendix I (Part A) illustrate the graphical results of the 4-layer MLP during training
and testing. For the Sungai Slim catchment, the 4-layer MLP has 16 input nodes, with 14
and 11 hidden nodes in the first and second hidden layers respectively. Most values of
R² in Table A4.8(a) are in the range of 70% to 80%. According to Kachroo (1986), this is
a fairly good model. Johnson and King (1988) judged model accuracy by the MAPE value; by
their criterion, this prediction model is very accurate, with MAPE below 10%. Increasing
the number of hidden layers to two does not significantly change the results of
rainfall-runoff modelling for the Sungai Slim catchment. During the training phase, the
RMSE for Sungai Slim is consistently less than 0.06 cumecs for the 4-layer (16-14-11-1)
model structure, and the RRMSE stays below 0.001. During testing, the RMSE is
consistently less than 0.026 cumecs and the RRMSE is less than 0.0005, which is close to
zero. The application of the 4-layer MLP structure to model the rainfall-runoff
relationship of Sungai Slim is therefore successful.
4.2.2 Results of Hourly MLP Model
Two types of MLP model structure were developed for modelling the hourly streamflow
hydrograph. The first is the 3-layer MLP structure with one hidden layer; the second is
the 4-layer MLP structure with two hidden layers. The results of the 3-layer and 4-layer
MLP models are described as follows.
Tables 4.9(a) to 4.9(c) present the R², RMSE, RRMSE, and MAPE resulting from the 3-layer
MLP model of the hourly rainfall-runoff relationship for the Sungai Bekok catchment.
Figures 4.9(a) to 4.9(c) illustrate the graphical results of the 3-layer MLP model
during training and testing.
The performance of each model is measured by R², RMSE, RRMSE, and MAPE. The MLP for
Sungai Bekok has 7 input nodes. In the neural network training process, the best number
of hidden nodes is chosen as the one giving the minimum root mean square error (RMSE)
on the training data; here, 6 hidden nodes are used. Fifty selected data sets are used
for training and 5 selected data sets are used as the prediction sets. Most values of R²
approach 1.0, which indicates that the MLP model consistently performs well in
rainfall-runoff modelling. Kachroo (1986) reported that a model is very satisfactory
when R² is more than 90% and fairly good when R² is in the range of 80% to 90%. Johnson
and King (1988) judged model accuracy by the MAPE value: a prediction model is
considered reasonable with MAPE below 30% and very accurate with MAPE below 10%. The
modelling results for Sungai Bekok, with MAPE below 10%, can therefore be considered
very accurate.
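The two rating schemes cited above can be written as simple threshold checks. This is only a sketch: the function names are illustrative, the handling of values exactly on a boundary is an assumption, and the text gives no label for R² below 80% or MAPE of 30% and above, so those cases are returned as "unrated".

```python
def rate_r2(r2):
    """Rating attributed in the text to Kachroo (1986); r2 is a fraction."""
    if r2 > 0.90:
        return "very satisfactory"
    if r2 >= 0.80:
        return "fairly good"
    return "unrated"  # no label is given in the text below 80%


def rate_mape(mape_percent):
    """Rating attributed in the text to Johnson and King (1988); MAPE in %."""
    if mape_percent < 10.0:
        return "very accurate"
    if mape_percent < 30.0:
        return "reasonable"
    return "unrated"  # no label is given in the text at 30% and above


# Training row of Table 4.9(a): R² = 0.9927, MAPE = 1.8910 %
print(rate_r2(0.9927), "/", rate_mape(1.8910))
```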
Table 4.9(a): Results of 3-Layer neural networks for Sg. Bekok catchment using 100% of available data sets in training phase

Model / Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING       7-6-1*            61                  0.9927     0.0477          0.0105   1.8910
MLP-TEST Set 1     7-6-1*            61                  0.9976     0.0082          0.0016   1.0771
MLP-TEST Set 2     7-6-1*            61                  0.9968     0.0091          0.0018   1.1717
MLP-TEST Set 3     7-6-1*            61                  0.9896     0.0066          0.0013   0.8792
MLP-TEST Set 4     7-6-1*            61                  0.9875     0.0082          0.0019   1.4223
MLP-TEST Set 5     7-6-1*            61                  0.9861     0.0033          0.0008   0.6940

(*) input nodes-hidden nodes-output nodes; cumecs = cubic metres per second; COC = correlation of coefficient
Table 4.9(b): Results of 3-Layer neural networks for Sg. Bekok catchment using 65% of available data sets in training phase

Model / Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING       7-6-1*            61                  0.9907     0.0559          0.0123   1.9101
MLP-TEST Set 1     7-6-1*            61                  0.9968     0.0094          0.0018   1.2512
MLP-TEST Set 2     7-6-1*            61                  0.9967     0.0102          0.0020   1.4880
MLP-TEST Set 3     7-6-1*            61                  0.9869     0.0101          0.0020   1.6788
MLP-TEST Set 4     7-6-1*            61                  0.9777     0.0104          0.0024   1.9654
MLP-TEST Set 5     7-6-1*            61                  0.9528     0.0108          0.0027   2.5093

(*) input nodes-hidden nodes-output nodes; cumecs = cubic metres per second; COC = correlation of coefficient
Table 4.9(c): Results of 3-Layer neural networks for Sg. Bekok catchment using 25% of available data sets in training phase

Model / Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING       7-6-1*            61                  0.9902     0.0484          0.0100   1.9422
MLP-TEST Set 1     7-6-1*            61                  0.9962     0.0104          0.0020   1.3314
MLP-TEST Set 2     7-6-1*            61                  0.9954     0.0107          0.0021   1.3631
MLP-TEST Set 3     7-6-1*            61                  0.9928     0.0058          0.0012   0.7490
MLP-TEST Set 4     7-6-1*            61                  0.9890     0.0076          0.0017   1.2240
MLP-TEST Set 5     7-6-1*            61                  0.8978     0.0204          0.0052   4.8408

(*) input nodes-hidden nodes-output nodes; cumecs = cubic metres per second; COC = correlation of coefficient
Tables 4.10(a) to 4.10(c) present the R², RMSE, RRMSE, and MAPE resulting from the
4-layer MLP model of the hourly rainfall-runoff relationship for the Sungai Bekok
catchment. Figures 4.10(a) to 4.10(c) illustrate the graphical results of the 4-layer
MLP model during training and testing. For the 4-layer networks, the performance of each
model is measured by R², RMSE, RRMSE, and MAPE. The MLP for Sungai Bekok has 7 input
nodes, with 6 and 8 hidden nodes in the first and second hidden layers respectively.
Fifty selected data sets are used for training and 5 selected data sets are used as the
prediction sets.
For the Sungai Bekok catchment, most values of R² approach 1.0, which indicates that the
MLP model consistently performs well in rainfall-runoff modelling. Kachroo (1986)
reported that a model is very satisfactory when R² is more than 90%. Johnson and King
(1988) judged model accuracy by the MAPE value. The modelling results for Sungai Bekok,
with MAPE below 10%, can be considered very accurate.
Table 4.10(a): Results of 4-Layer neural networks for Sg. Bekok catchment using 100% of available data sets in training phase

Model / Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING       7-6-8-1*          119                 0.9927     0.0477          0.0105   1.9012
MLP-TEST Set 1     7-6-8-1*          119                 0.9972     0.0088          0.0017   1.1251
MLP-TEST Set 2     7-6-8-1*          119                 0.9976     0.0076          0.0015   0.9720
MLP-TEST Set 3     7-6-8-1*          119                 0.9923     0.0056          0.0011   0.7509
MLP-TEST Set 4     7-6-8-1*          119                 0.9908     0.0077          0.0018   1.3467
MLP-TEST Set 5     7-6-8-1*          119                 0.9745     0.0036          0.0009   0.7852

(*) input nodes-hidden nodes-output nodes; cumecs = cubic metres per second; COC = correlation of coefficient
Table 4.10(b): Results of 4-Layer neural networks for Sg. Bekok catchment using 65% of available data sets in training phase

Model / Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING       7-6-8-1*          119                 0.9907     0.0558          0.0123   2.1302
MLP-TEST Set 1     7-6-8-1*          119                 0.9973     0.0085          0.0017   1.1198
MLP-TEST Set 2     7-6-8-1*          119                 0.9959     0.0099          0.0019   1.1864
MLP-TEST Set 3     7-6-8-1*          119                 0.9675     0.0120          0.0024   1.3771
MLP-TEST Set 4     7-6-8-1*          119                 0.9829     0.0100          0.0023   1.4903
MLP-TEST Set 5     7-6-8-1*          119                 0.9643     0.0048          0.0012   0.7130

(*) input nodes-hidden nodes-output nodes; cumecs = cubic metres per second; COC = correlation of coefficient
Table 4.10(c): Results of 4-Layer neural networks for Sg. Bekok catchment using 25% of available data sets in training phase

Model / Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING       7-6-8-1*          119                 0.9352     0.1216          0.0247   1.2130
MLP-TEST Set 1     7-6-8-1*          119                 0.8650     0.0673          0.0131   7.5251
MLP-TEST Set 2     7-6-8-1*          119                 0.9965     0.0092          0.0018   1.2206
MLP-TEST Set 3     7-6-8-1*          119                 0.9849     0.0079          0.0016   1.0253
MLP-TEST Set 4     7-6-8-1*          119                 0.8503     0.0276          0.0065   3.0417
MLP-TEST Set 5     7-6-8-1*          119                 0.7833     0.0124          0.0031   1.2463

(*) input nodes-hidden nodes-output nodes; cumecs = cubic metres per second; COC = correlation of coefficient
Tables A4.11(a) to A4.11(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE
resulting from the 3-layer MLP model for the Sungai Ketil catchment. Figures I4.11(a) to
I4.11(c) in Appendix I (Part B) illustrate the graphical results of the 3-layer MLP
model during training and testing. The MLP for Sungai Ketil has 6 input nodes, and the
best number of hidden nodes for the network structure is 4. For Sungai Ketil, most
values of R² also approach 1.0, which indicates that the MLP model consistently performs
well in rainfall-runoff modelling. According to Kachroo (1986), the model is very
satisfactory. According to Johnson and King (1988), the prediction model is very
accurate, with MAPE below 10%.
Tables A4.12(a) to A4.12(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE
resulting from the 4-layer MLP model for the Sungai Ketil catchment. Figures I4.12(a) to
I4.12(c) in Appendix I (Part B) illustrate the graphical results of the 4-layer MLP
model during training and testing. For the Sungai Ketil catchment, the MLP has 6 input
nodes, with 4 and 8 hidden nodes in the first and second hidden layers respectively.
Most values of R² approach 1.0, which indicates that the MLP model consistently performs
well in rainfall-runoff modelling. Kachroo (1986) reported that a model is very
satisfactory when R² is more than 90%. Johnson and King (1988) judged model accuracy by
the MAPE value. The modelling results for Sungai Ketil, with MAPE below 10%, can be
considered very accurate.
Tables A4.13(a) to A4.13(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE
resulting from the 3-layer MLP model for the Sungai Klang catchment. Figures I4.13(a) to
I4.13(c) in Appendix I (Part B) illustrate the graphical results of the 3-layer MLP
model during training and testing. For the Sungai Klang catchment, the MLP has 8 input
nodes, and the best number of hidden nodes for the network structure is 6. Most values
of R² are in the range of 80% to 90%. According to Kachroo (1986), this is a fairly good
model. According to Johnson and King (1988), the modelling results for Sungai Klang,
with MAPE below 30%, are considered a reasonable prediction. This outcome indicates that
the MLP consistently gives a fairly good rainfall-runoff model.
Tables A4.14(a) to A4.14(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE
resulting from the 4-layer MLP model for the Sungai Klang catchment. Figures I4.14(a) to
I4.14(c) in Appendix I (Part B) illustrate the graphical results of the 4-layer MLP
model during training and testing. For the Sungai Klang catchment, the MLP has 8 input
nodes, with 6 and 7 hidden nodes in the first and second hidden layers respectively.
Most values of R² are in the range of 70% to 90%, which indicates that the MLP
consistently gives a fairly good model. Johnson and King (1988) judged model accuracy by
the MAPE value. The modelling results for Sungai Klang, with MAPE below 30%, can be
considered reasonable.
Tables A4.15(a) to A4.15(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE
resulting from the 3-layer MLP model for the Sungai Slim catchment. Figures I4.15(a) to
I4.15(c) in Appendix I (Part B) illustrate the graphical results of the 3-layer MLP
model during training and testing. The MLP for the Sungai Slim catchment has 7 input
nodes, and the best number of hidden nodes is 5. For Sungai Slim, most values of R² also
approach 1.0, which indicates that the MLP model consistently performs well in
rainfall-runoff modelling. According to Kachroo (1986), the model is very satisfactory.
According to Johnson and King (1988), the prediction model is very accurate, with MAPE
below 10%.
Tables A4.16(a) to A4.16(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE
resulting from the 4-layer MLP model for the Sungai Slim catchment. Figures I4.16(a) to
I4.16(c) in Appendix I (Part B) illustrate the graphical results of the 4-layer MLP
model during training and testing. For the Sungai Slim catchment, the MLP has 7 input
nodes, with 5 and 9 hidden nodes in the first and second hidden layers respectively.
Most values of R² approach 1.0, which indicates that the MLP model consistently performs
well in rainfall-runoff modelling. Kachroo (1986) reported that a model is very
satisfactory when R² is more than 90%. Johnson and King (1988) judged model accuracy by
the MAPE value. The modelling results for Sungai Slim, with MAPE below 10%, can be
considered very accurate.
During the training phase, the RMSE for Sungai Bekok is consistently less than 0.1
cumecs for the 3-layer (7-6-1) and 4-layer (7-6-8-1) model structures. The RRMSE also
stays below 0.0105 for both model structures. During testing, the RMSE is less than 0.01
cumecs and the RRMSE is less than 0.002, which is close to zero. The application of the
MLP method to model the hourly rainfall-runoff relationship of Sungai Bekok is therefore
very successful. The Sungai Bekok has observed flows between 3.61 cumecs and 6.95 cumecs
and a catchment area of 350 km².
For Sungai Ketil, during the training phase, the RMSE is consistently less than 0.1
cumecs for the 3-layer (6-4-1) and 4-layer (6-4-8-1) model structures. The RRMSE stays
below 0.03 for both model structures. During testing, the RMSE is less than 0.01 cumecs
and the RRMSE is less than 0.02, which is close to zero. The application of the MLP
method to model the hourly rainfall-runoff relationship of Sungai Ketil is therefore
moderate and satisfactory. The Sungai Ketil has observed flows between 28.87 cumecs and
34.22 cumecs and a catchment area of 704 km².
The Sungai Klang receives runoff from a fairly large catchment (468 km²), and the
observed flow ranges from 3.6 cumecs to 424.7 cumecs. The RMSE during training is around
10 cumecs for the 3-layer (8-6-1) and 4-layer (8-6-7-1) model structures, and the RRMSE
is approximately 0.25. The testing phase yields RMSE between 4.0 cumecs and 13.5 cumecs
and RRMSE of approximately 0.2 for most of the data sets.
Meanwhile, the Sungai Slim catchment is a semi-developed area of 455 km², with observed
flows between 23.77 cumecs and 26.38 cumecs. The RMSE during training is less than 0.06
cumecs for the 3-layer (7-5-1) and 4-layer (7-5-9-1) model structures, and the RRMSE is
less than 0.005. The testing phase yields RMSE below 0.5 cumecs and RRMSE below 0.02 for
most of the data sets.
4.2.3 Training and Validation
The training process is time consuming. If the network architecture or the training
algorithm is not suitable, the accuracy of predictions and the network's learning
ability are affected. The number of hidden nodes significantly influences the
performance of a network and the time taken to train the model. The number of nodes in
the hidden layer can be as small or as large as required; there are no fixed rules. It
is related to the complexity of the system being modelled and to the resolution of the
data fit. In this study, the number of nodes in the hidden layer was determined by trial
and error for each case. If the number of hidden nodes is too small, the network can
underfit the data and may not achieve the desired level of accuracy, while with too many
nodes it takes a long time to train adequately and may sometimes overfit the data.
French et al. (1992) proposed that neural networks normally be developed using 15, 30,
45, 60 and 100 hidden nodes, a procedure intended to examine the performance of a neural
network model with different numbers of hidden nodes and hidden layers. In this study,
that procedure is not relevant, because every additional node yields different results
and may contribute to the consistency and accuracy of the model. It is therefore
proposed to increase the number of nodes in the hidden layer one by one.
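The one-by-one search over hidden nodes can be sketched as follows. This is a minimal illustration rather than the procedure actually coded for the thesis: `training_rmse` is a hypothetical callable that trains a network with the given number of hidden nodes and returns its training RMSE, and `fake_training_rmse` is a made-up error curve standing in for real training runs.

```python
def select_hidden_nodes(training_rmse, max_nodes):
    """Try 1, 2, ..., max_nodes hidden nodes and keep the lowest training RMSE.

    training_rmse: hypothetical callable, n_hidden -> RMSE on the training data.
    Returns (best number of hidden nodes, its RMSE).
    """
    best_n, best_err = None, float("inf")
    for n in range(1, max_nodes + 1):
        err = training_rmse(n)
        if err < best_err:
            best_n, best_err = n, err
    return best_n, best_err


def fake_training_rmse(n_hidden):
    # Made-up error curve with its minimum at 13 nodes (illustration only)
    return 0.1 + 0.001 * (n_hidden - 13) ** 2


best_n, best_err = select_hidden_nodes(fake_training_rmse, max_nodes=20)
print(best_n, best_err)
```

Stepping by one node costs more training runs than the coarse 15, 30, 45, 60, 100 grid, but it does not skip over a small network that happens to fit best.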
In the case of the MLP model, it is observed that a large number of training data sets
is required for successful training. The model is then properly trained and is capable
of yielding better results, so the accuracy of the MLP increases as more input data are
made available. However, it must be kept in mind not to use an excessive number of
neurons or layers without considering the limitations imposed by the number of
observations in the training set. The validation process is important because it
signals to the system when overtraining occurs. Overtraining leads to poor future
predictions because it forces the network to fit the noise in the training data rather
than generalizing the patterns in the training set. The main objective of the validation
process is to stop the simulation or calibration process when the error starts to
increase. This shortens the time taken for calibration and gives the best results with
high accuracy.
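The stopping rule described above can be sketched as a check on the sequence of validation errors. This is an assumed, simplified form of early stopping; the `patience` parameter (how many non-improving epochs are tolerated) is an illustrative addition and is not mentioned in the thesis.

```python
def early_stopping_epoch(validation_errors, patience=1):
    """Return (best epoch index, best validation error).

    Scanning stops once the validation error has failed to improve for
    `patience` consecutive epochs (patience is an assumed parameter).
    """
    best_epoch, best_err = 0, float("inf")
    waited = 0
    for epoch, err in enumerate(validation_errors):
        if err < best_err:
            best_epoch, best_err = epoch, err
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                break  # error is rising: stop calibration here
    return best_epoch, best_err


# Made-up validation errors: they fall, then start to rise at epoch 3
errors = [0.50, 0.40, 0.30, 0.35, 0.20]
print(early_stopping_epoch(errors, patience=1))
```

With `patience=1` the scan stops at the first increase and never sees the later, lower error; a larger patience trades longer calibration time for a lower risk of stopping on a temporary rise.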
Tables H4.61 to H4.68 in Appendix H show the daily and hourly results of percentage bias
(PBIAS) of calibration or training for the 3-layer and 4-layer MLP models. According to
Yapo et al. (1996), the optimal value is zero, which means that the model has an unbiased
flow simulation. Positive values indicate a tendency towards overestimation and negative
values indicate a tendency towards underestimation. Using 100% of the data, the PBIAS
values are in the range of -0.1 to +0.1, which means that the models were robust and
unbiased and were ready to be used for prediction. The models calibrated using 50% and
25% of the data were also satisfactory, but their PBIAS was in the range of -2.0 to +4.0.
This may be caused by insufficient data in the calibration or training process. Therefore,
these models have a tendency to overestimate or underestimate.
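For illustration, a percentage bias with the sign convention used above (positive for overestimation, negative for underestimation) can be computed as in the sketch below. The formula, the sum of simulated minus observed flows divided by the sum of observed flows, is an assumption chosen to match that convention; the exact expression used in this study is defined elsewhere in the thesis, and the flow values are invented.

```python
from typing import Sequence


def pbias(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Percentage bias; positive means the model overestimates on average.

    Assumed form: 100 * sum(simulated - observed) / sum(observed).
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated series must have equal length")
    total_observed = sum(observed)
    if total_observed == 0:
        raise ValueError("sum of observed flows must be non-zero")
    residual = sum(s - o for o, s in zip(observed, simulated))
    return 100.0 * residual / total_observed


# Invented flows: every simulated value is 10% too high.
obs = [2.0, 4.0, 6.0, 8.0]
sim = [2.2, 4.4, 6.6, 8.8]
print(round(pbias(obs, sim), 6))
```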
In general, the training results are lower than the testing results in terms of the
coefficient of correlation (R^2). However, some of the results show an improvement in
terms of RMSE, RRMSE and MAPE and remain in the acceptable range. This shows that, with
proper and sufficient training, the model is consistent and robust and is able to yield
good results.
4.2.4 Testing
Testing of the model was carried out using five different data sets covering different
periods of time. The objective is to evaluate the performance and robustness of the model
using several sets of data. These new data sets are presented to the model that has been
calibrated.
One of the problems that occurs during neural network testing is over-fitting. The error
on the training set is driven to a very small value, but when new or testing data are
presented to the network the error is large. The network has memorized the training
examples, but it has not learned to generalize to new situations. So, in this study a
network that is just large enough to provide an adequate fit was used. The larger the
network, the more complex the functions it can create, whereas a sufficiently small
network does not have enough power to overfit the data. One method implemented in this
study to estimate how large a network should be for a specific application is early
stopping. Early stopping can be used with any of the training functions.
The number of nodes in the hidden layer can be as small or as large as required. The
number of nodes in the hidden layer was determined by trial and error for each case.
Using the trial and error method, the optimum number of hidden nodes that produces the
best-fit results can be decided. It is related to the complexity of the system being
modelled and the quality of the data sets. Using 100% of the training data sets produced
more accurate, consistent and reliable results than the models that used 50% or 25% of
the training data sets. The accuracy of the MLP increases as more input data are made
available to it. The number of hidden layer neurons significantly influences the
performance of a network. If this number is small, the network may not achieve the
desired level of accuracy, while with too many nodes it takes a long time to train and
may sometimes overfit the data. The use of two hidden layers appears to be an advantage
for a larger catchment such as Sungai Ketil, whereas a single hidden layer is sufficient
for a smaller catchment such as Sungai Bekok. In general, the 3-layer and 4-layer MLP
networks show slightly better performance in both the training and testing periods.
Overall, increasing the number of hidden layers and hidden nodes in the model increases
the complexity of the system, and it may slow the training and testing process without
substantially improving the efficiency of the network.
4.2.5 Robustness Test
The robustness tests were carried out for each model using different sets of data
covering different periods of time. For daily and hourly rainfall-runoff modelling, five
different sets of data, namely set 1, set 2, set 3, set 4 and set 5, were used to study
the capability of the models to transform rainfall into runoff under any condition. The
performance of each model is measured by R^2, RMSE, RRMSE, and MAPE.
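As a reminder of how such measures are typically computed, the sketch below evaluates all four from a pair of observed and simulated series. The definitions are common textbook forms and are assumptions here: R^2 is taken as the squared Pearson correlation, and RRMSE as the RMSE divided by the mean observed flow. The definitions adopted in this study are given in the methodology and may differ. The sample values are invented.

```python
import math
from typing import Dict, Sequence


def performance_measures(observed: Sequence[float],
                         simulated: Sequence[float]) -> Dict[str, float]:
    """Return R2, RMSE, RRMSE and MAPE (%) for two equal-length series.

    Requires non-constant series and non-zero observed values,
    otherwise a division by zero occurs.
    """
    n = len(observed)
    if n == 0 or n != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    r2 = cov ** 2 / (var_o * var_s)          # squared Pearson correlation (assumed)
    rmse = math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)
    rrmse = rmse / mean_o                    # RMSE relative to mean observed (assumed)
    mape = 100.0 / n * sum(abs((o - s) / o) for o, s in zip(observed, simulated))
    return {"R2": r2, "RMSE": rmse, "RRMSE": rrmse, "MAPE": mape}


# Invented series, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.1, 1.9, 3.2, 3.8]
for name, value in performance_measures(obs, sim).items():
    print(f"{name}: {value:.4f}")
```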
The MLP model is capable and robust in modelling the continuous daily rainfall-runoff
relationship. It is also robust and demonstrates remarkable performance in modelling the
hourly streamflow hydrograph. Even though the MLP model has a large number of parameters
to be estimated, it shows a good and reasonably robust calibration. This shows that the
MLP model is consistent, reliable and robust enough to cope with any condition or problem
in the input data presented to the model.
4.3 Results of the Radial Basis Function (RBF) Model
The results of daily and hourly rainfall-runoff modelling are discussed in Sections 4.3.1
and 4.3.2 respectively. The RBF model differs from the MLP model in the number of layers
in the network structure, which is only three.
4.3.1 Results of Daily RBF Model
Tables 4.17(a) to 4.17(c) present the R^2, RMSE, RRMSE, and MAPE resulting from the RBF
model of the daily rainfall-runoff relationship for the Sungai Bekok catchment. Figures
4.17(a) to 4.17(c) illustrate the graphical results of the RBF model during training and
testing.
For the RBF training process, the best input nodes are chosen based on the RMSE computed
for the training data. The number of input nodes considered for RBF is 16 for Sungai
Bekok. Results of modelling for Sungai Bekok with MAPE less than 10% can be considered
very accurate. The RBF model gives R^2 of more than 90% in training, which shows that the
model performance is very satisfactory. During the training phase, the RMSE for Sungai
Bekok is consistently less than 0.18 cumecs and the RRMSE stays below 0.035 for this
model structure. During testing, the RMSE is consistently less than 0.29 cumecs and the
RRMSE is less than 0.06, which is close to zero.
Table 4.17(a): Results of RBF networks for Sg. Bekok catchment using 100% of data sets in training phase
Model / Data Set    Model Structure   No. of Parameters   COC (R^2)   RMSE (cumecs)   RRMSE    MAPE (%)
RBF TRAINING        16 input nodes    52                  0.9410      0.1776          0.0344   2.5298
RBF-TEST Set 1      16 input nodes    52                  0.9060      0.2048          0.0401   3.2268
RBF-TEST Set 2      16 input nodes    52                  0.9299      0.1612          0.0313   2.4287
RBF-TEST Set 3      16 input nodes    52                  0.8946      0.2091          0.0411   3.3006
RBF-TEST Set 4      16 input nodes    52                  0.9708      0.1424          0.0275   2.2149
RBF-TEST Set 5      16 input nodes    52                  0.8196      0.1837          0.0369   2.9040
cumecs: cubic metres per second; COC: coefficient of correlation
Table 4.17(b): Results of RBF networks for Sg. Bekok catchment using 50% of data sets in training phase
Model / Data Set    Model Structure   No. of Parameters   COC (R^2)   RMSE (cumecs)   RRMSE    MAPE (%)
RBF TRAINING        16 input nodes    52                  0.9770      0.1097          0.0220   1.5198
RBF-TEST Set 1      16 input nodes    52                  0.8490      0.2508          0.0493   4.0637
RBF-TEST Set 2      16 input nodes    52                  0.8706      0.1899          0.0367   2.6731
RBF-TEST Set 3      16 input nodes    52                  0.8625      0.2518          0.0496   4.1023
RBF-TEST Set 4      16 input nodes    52                  0.9578      0.1469          0.0282   2.1788
RBF-TEST Set 5      16 input nodes    52                  0.7342      0.2244          0.0449   3.5356
cumecs: cubic metres per second; COC: coefficient of correlation
Table 4.17(c): Results of RBF networks for Sg. Bekok catchment using 25% of data sets in training phase
Model / Data Set    Model Structure   No. of Parameters   COC (R^2)   RMSE (cumecs)   RRMSE    MAPE (%)
RBF TRAINING        16 input nodes    52                  0.9920      0.0747          0.0152   1.1121
RBF-TEST Set 1      16 input nodes    52                  0.8336      0.2861          0.0559   4.6072
RBF-TEST Set 2      16 input nodes    52                  0.8337      0.2078          0.0400   2.8824
RBF-TEST Set 3      16 input nodes    52                  0.8279      0.2788          0.0548   4.5339
RBF-TEST Set 4      16 input nodes    52                  0.9550      0.0896          0.0178   1.4919
RBF-TEST Set 5      16 input nodes    52                  0.7414      0.2303          0.0462   3.6924
cumecs: cubic metres per second; COC: coefficient of correlation
The RBF networks show successful calibration and testing using the dry-season data set
(set 4). These networks approximate the rainfall-runoff process in the catchment more
closely than the networks calibrated using the wet-season data set (set 5).
Tables A4.18(a) to A4.18(c) in Appendix B (Part A) present the R^2, RMSE, RRMSE, and MAPE
resulting from the RBF model for the Sungai Ketil catchment. Figures I4.18(a) to I4.18(c)
in Appendix I (Part A) illustrate the graphical results of the RBF model during training
and testing. For the Sungai Ketil catchment, the number of input nodes considered for RBF
is also 16. According to Johnson and King (1988), the modelling results for Sungai Ketil,
with MAPE less than 10%, can be considered very accurate. The RBF model gives R^2 in the
range of 80% to 90%, which indicates a fairly good model. During the training phase, the
RMSE for Sungai Ketil is consistently less than 0.172 cumecs and the RRMSE is
consistently less than 0.02. During testing, the RMSE is consistently less than 0.56
cumecs and the RRMSE is less than 0.019, which is close to zero.
Tables A4.19(a) to A4.19(c) in Appendix B (Part A) present the R^2, RMSE, RRMSE, and MAPE
resulting from the RBF model for the Sungai Klang catchment. Figures I4.19(a) to I4.19(c)
in Appendix I (Part A) illustrate the graphical results of the RBF model during training
and testing. For the Sungai Klang catchment, the number of input nodes considered for RBF
is 17. According to Johnson and King (1988), the modelling results for Sungai Klang, with
MAPE less than 30%, can be considered a reasonable model. The RBF model gives R^2 in the
range of 70% to 80%, so the model performance can be classified as moderate. During the
training phase, the RMSE for Sungai Klang is consistently less than 6.3 cumecs and the
RRMSE is consistently less than 0.34. During testing, the RMSE is consistently less than
12.8 cumecs and the RRMSE is less than 0.59.
Tables A4.20(a) to A4.20(c) in Appendix B (Part A) present the R^2, RMSE, RRMSE, and MAPE
resulting from the RBF model for the Sungai Slim catchment. Figures I4.20(a) to I4.20(c)
in Appendix I (Part A) illustrate the graphical results of the RBF model during training
and testing. For the Sungai Slim catchment, the best number of input nodes chosen is 16.
Results of modelling for Sungai Slim with MAPE less than 10% can be considered very
accurate. The RBF model gives R^2 in the range of 60% to 80%, so it can be classified as
a moderate model. During the training phase, the RMSE for Sungai Slim is consistently
less than 0.05 cumecs and the RRMSE stays at about 0.0008. During testing, the RMSE is
consistently less than 0.034 cumecs and the RRMSE is less than 0.0005, which is close to
zero.
4.3.2 Results of Hourly RBF Model
Tables 4.21(a) and 4.21(b) present the R^2, RMSE, RRMSE, and MAPE resulting from the RBF
model of the hourly rainfall-runoff relationship for the Sungai Bekok catchment. Figures
4.21(a) to 4.21(c) illustrate the graphical results of the RBF model during training and
testing. For the RBF training process, the best input nodes are chosen based on the
minimum RMSE computed for the training data. The number of input nodes considered for RBF
is 7 for the Sungai Bekok catchment. The RBF model learns faster than the MLP model and
can produce results rapidly in the testing phase. A clear limitation of the RBF model is
that it is unable to handle a large number of data sets. The RBF model is stable and
yields consistent and reliable results when the optimum data sets are used. During the
training phase, the RMSE for Sungai Bekok is consistently less than 0.1 cumecs and the
RRMSE stays below 0.02. During testing, the RMSE is less than 0.004 cumecs and the RRMSE
is less than 0.001, which is close to zero. The application of the RBF method to model
the rainfall-runoff relationship of Sungai Bekok is very successful. Results of modelling
for Sungai Bekok with MAPE less than 10% can be considered very accurate.
Table 4.21(a): Results of RBF networks for Sg. Bekok catchment, using 25% of available data sets in training phase
Model / Data Set    Model Structure   No. of Parameters   COC (R^2)   RMSE (cumecs)   RRMSE    MAPE (%)
RBF TRAINING        7 input nodes     25                  0.9942      0.0372          0.0074   0.7893
RBF-TEST Set 1      7 input nodes     25                  0.9995      0.0039          0.0008   0.3151
RBF-TEST Set 2      7 input nodes     25                  0.9994      0.0038          0.0008   0.3430
RBF-TEST Set 3      7 input nodes     25                  0.9976      0.0031          0.0006   0.2307
RBF-TEST Set 4      7 input nodes     25                  0.9998      0.0011          0.0002   0.1108
RBF-TEST Set 5      7 input nodes     25                  0.9997      0.0004          0.0001   0.0369
cumecs: cubic metres per second; COC: coefficient of correlation
Table 4.21(b): Results of RBF networks for Sg. Bekok catchment, using minimum data sets in training phase
Model / Data Set    Model Structure   No. of Parameters   COC (R^2)   RMSE (cumecs)   RRMSE    MAPE (%)
RBF TRAINING        7 input nodes     25                  0.9709      0.0975          0.0181   2.8852
RBF-TEST Set 1      7 input nodes     25                  1.000       0.0004          0.0000   0.0197
RBF-TEST Set 2      7 input nodes     25                  0.9998      0.0020          0.0004   0.0838
RBF-TEST Set 3      7 input nodes     25                  0.9991      0.0019          0.0004   0.0862
RBF-TEST Set 4      7 input nodes     25                  1.000       0.0003          0.0000   0.0108
RBF-TEST Set 5      7 input nodes     25                  0.9997      0.0002          0.0000   0.0154
cumecs: cubic metres per second; COC: coefficient of correlation
Tables B4.22(a) and B4.22(b) in Appendix B (Part B) present the R^2, RMSE, RRMSE, and
MAPE resulting from the RBF model for the Sungai Ketil catchment. Figures I4.22(a) to
I4.22(c) in Appendix I (Part B) illustrate the graphical results of the RBF model during
training and testing. The number of input nodes considered for RBF is 6 for the Sungai
Ketil catchment. The RBF model learns faster than the MLP model and can produce results
rapidly in the testing phase. During the training phase, the RMSE for Sungai Ketil is
consistently less than 0.03 cumecs and the RRMSE stays below 0.001. During testing, the
RMSE is less than 0.04 cumecs and the RRMSE is less than 0.015, which is close to zero.
The application of the RBF method to model the rainfall-runoff relationship of Sungai
Ketil is also very successful. Results of modelling for Sungai Ketil with MAPE less than
10% can be considered very accurate.
Tables B4.23(a) and B4.23(b) in Appendix B (Part B) present the R^2, RMSE, RRMSE, and
MAPE resulting from the RBF model for the Sungai Klang catchment. Figures I4.23(a) to
I4.23(c) in Appendix I (Part B) illustrate the graphical results of the RBF model during
training and testing. The number of input nodes considered for RBF is 8 for the Sungai
Klang catchment. The RBF model learns faster than the MLP model and can produce results
rapidly in the testing phase. During the training phase, the RMSE for Sungai Klang is
consistently less than 0.21 cumecs and the RRMSE stays below 0.02, which is close to
zero. During testing, the RMSE is less than 0.025 cumecs and the RRMSE is less than
0.002, which is also close to zero. The application of the RBF method to model the
rainfall-runoff relationship of Sungai Klang is very successful and performs well.
According to Johnson and King (1988), the modelling results for Sungai Klang, with MAPE
less than 10%, can be considered very accurate.
Tables B4.24(a) and B4.24(b) in Appendix B (Part B) present the R^2, RMSE, RRMSE, and
MAPE resulting from the RBF model for the Sungai Slim catchment. Figures I4.24(a) to
I4.24(c) in Appendix I (Part B) illustrate the graphical results of the RBF model during
training and testing. The number of input nodes considered for RBF is 7 for the Sungai
Slim catchment. The RBF model learns faster than the MLP model and can produce results
rapidly in the testing phase. During the training phase, the RMSE for Sungai Slim is
consistently less than 0.031 cumecs and the RRMSE stays below 0.0015, which is close to
zero. During testing, the RMSE is less than 0.025 cumecs and the RRMSE is less than
0.001, which is also close to zero. The application of the RBF method to model the
rainfall-runoff relationship of Sungai Slim is very successful and yields a good
performance. According to Johnson and King (1988), the modelling results for Sungai Slim,
with MAPE less than 10%, can be considered very accurate.
4.3.3 Training and Validation
The training and validation of the RBF model were also carried out in parallel. This was
important for controlling the time consumed by the calibration process. As mentioned
before, training is stopped if the error increases. In general, the RBF network can be
described as a universal function approximator that uses combinations of basis functions
centred around weight vectors to provide spatial estimates. The best results were
achieved for the network with a Gaussian activation function, the GRNN algorithm, and an
appropriate number of input nodes. If the network architecture or training algorithm is
not suitable, it will affect the accuracy of predictions and the network's learning
ability. The number of input nodes significantly influences the performance of a network
and the time taken to train the model. It is related to the complexity of the system
being modelled and to the resolution of the data fit. The number of input nodes in the
input layer was determined by trial and error for each case. If the number of input nodes
is too small, the network may underfit the data and fail to achieve the desired level of
accuracy, while with too many nodes it takes a long time to train adequately and may
sometimes overfit the data.
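To make the idea of Gaussian basis functions centred on stored vectors concrete, the sketch below gives the textbook form of a GRNN prediction: a Gaussian-weighted average of the training targets, with one basis function centred on each training pattern. It is not the implementation used in this study; the smoothing width `sigma` and the sample patterns are hypothetical.

```python
import math
from typing import Sequence


def grnn_predict(x_new: Sequence[float],
                 train_inputs: Sequence[Sequence[float]],
                 train_targets: Sequence[float],
                 sigma: float = 1.0) -> float:
    """Textbook GRNN output for one query vector.

    Each training pattern acts as the centre of a Gaussian basis function;
    the prediction is the basis-weighted average of the training targets.
    """
    weights = []
    for pattern in train_inputs:
        dist_sq = sum((a - b) ** 2 for a, b in zip(x_new, pattern))
        weights.append(math.exp(-dist_sq / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all patterns for this sigma")
    return sum(w * y for w, y in zip(weights, train_targets)) / total


# Invented one-dimensional example: two stored patterns.
patterns = [[0.0], [1.0]]
targets = [0.0, 10.0]
print(grnn_predict([0.5], patterns, targets, sigma=0.5))
```

Because the stored patterns are the centres, there is no iterative weight fitting, which is consistent with the fast training reported for the RBF model; the cost is that every prediction visits every stored pattern.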
The trial and error method determines the optimum number of input nodes in the input
layer that produces the best-fit results. It is related to the complexity of the system
being modelled and the quality of the data sets. Daily rainfall-runoff modelling used
100%, 50% and 25% of the training data sets to evaluate the accuracy, consistency and
reliability of the results. For hourly rainfall-runoff modelling, only 25% and the
minimum training data sets were used, because the very large number of data sets within
the period would increase complexity. The study confirms that the accuracy of RBF is not
affected by the amount of input data available to it.
Tables H4.61 to H4.68 in Appendix H show the daily and hourly results of percentage bias
(PBIAS) of calibration or training for the RBF model. In general, the RBF model shows
consistent values of PBIAS when modelling the hourly rainfall-runoff relationship, and
the PBIAS approaches zero. The PBIAS for modelling the daily rainfall-runoff relationship
was in the range of -6.0 to -0.01. This shows that the RBF model has a small tendency to
underestimate.
4.3.4 Testing
Testing of the RBF model was also carried out using five different data sets covering
different periods of time. The objective is to evaluate the performance and robustness of
the model using several sets of data. These new data sets are presented to the model that
has been calibrated. The RBF model differs from the MLP model in that its network
structure has only three layers.
As with the MLP model, the problem that occurs during RBF network testing is overfitting.
The error on the training set is driven to a very small value, but when new or testing
data are presented to the network the error is large. So, in this study a network that is
just large enough to provide an adequate fit was used. The larger the network, the more
complex the functions it can create. If a sufficiently small network is used, it will not
have enough power to overfit the data.
The number of nodes in the hidden layer can be as small or as large as required. The
number of nodes in the input layer and the hidden layer was determined by trial and error
for each case. Using the trial and error method, the optimum number of hidden nodes that
produces the best-fit results can be found. It is related to the complexity of the system
being modelled and the quality of the data sets. The accuracy of RBF is not affected by
the amount of input data available to it. The number of hidden layer neurons
significantly influences the performance of a network. If this number is small, the
network may not achieve the desired level of accuracy, while with too many nodes it takes
a long time to train and may sometimes overfit the data.
4.3.5 Robustness Test
As with the MLP model, the robustness tests for the RBF model were carried out using
different sets of data covering different periods of time. For daily and hourly
rainfall-runoff modelling, five different sets of data, namely set 1, set 2, set 3, set 4
and set 5, were used to study the capability of the models to transform rainfall into
runoff under any condition. The performance of each model is measured by R^2, RMSE,
RRMSE, MAPE and PBIAS.
The study confirmed that the RBF model was capable and robust in modelling the daily and
hourly rainfall-runoff relationships. The R^2 results and the error analysis indicate
that the RBF model is very consistent and robust and shows good agreement when modelling
the hourly rainfall-runoff relationship. The RBF model is more reliable and gives a
higher accuracy when modelling short-duration events or storm hydrographs.
4.4 Results of the Multiple Linear Regression (MLR) Model
The results of daily rainfall-runoff modelling using the MLR model are discussed below.
The MLR model is an alternative tool that applies regression and statistical concepts to
develop the relationship between the input-output data pairs. This model is applied only
to continuous daily rainfall-runoff modelling.
4.4.1 Calibration
The calibration of the model parameters is usually accomplished by a trial and error
process. Thus, the calibration accuracy of the models is very subjective and highly
dependent on the modeller's knowledge, experience, and understanding of the components of
the model and the catchment characteristics.
Most calibration studies in the past have involved some form of optimization of the
parameter values by comparing the results of repeated simulations with whatever
observations of the catchment response are available. The parameter values are adjusted
between each run of the model, either manually or by some computerized optimization
algorithm, until some 'best fit' parameter set has been found. Methods of model
calibration that assume an optimum parameter set and ignore the estimation of predictive
uncertainty include the simple trial and error method, in which parameter values are
adjusted by hand, and a variety of automatic optimization methods.
Once a model has been chosen for consideration in a project, it is necessary to address
the problem of parameter calibration. In general, it is possible to estimate the
parameters of a model by either measurement or prior estimation. It is generally
necessary to go through a stage of parameter calibration before applying the model to
make quantitative predictions for a particular catchment. All the models used in
hydrology have equations that involve a variety of different input and state variables.
Finally, there are the model parameters, which define the characteristics of the
catchment area or flow domain. Once the model parameter values have been specified, a
simulation may be made and quantitative predictions about the response obtained.
The MLR model can be applied to daily rainfall-runoff modelling, but the theory does not
fully capture the daily situation, because the model is very sensitive to inconsistency
in the data, especially continuous data containing some extreme maximum values. In the
calibration, the data are arranged in stages as functions of time t, t-1, t-2, and so
on, until the optimum number of inputs that are highly correlated with, and contribute
to, the output is determined. Using the best structure and parameters selected, the model
is then tested with the test data sets to evaluate its robustness.
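The arrangement of rainfall into inputs at times t, t-1, t-2, and so on can be sketched as below. The function name `lagged_inputs` and the sample series are hypothetical; the sketch only shows how one input row per time step is assembled from a chosen set of lags.

```python
from typing import List, Sequence, Tuple


def lagged_inputs(rainfall: Sequence[float],
                  lags: Sequence[int]) -> Tuple[List[List[float]], List[int]]:
    """Build one row [x_{t-lag} for lag in lags] for every usable time step t.

    A time step is usable only when the full rainfall history required by
    the largest lag is available. Returns the rows and their time indices.
    """
    start = max(lags)
    times = list(range(start, len(rainfall)))
    rows = [[rainfall[t - lag] for lag in lags] for t in times]
    return rows, times


# Invented five-day rainfall series with lags t, t-1 and t-2.
rows, times = lagged_inputs([0.0, 1.0, 2.0, 3.0, 4.0], lags=[0, 1, 2])
for t, row in zip(times, rows):
    print(t, row)
```

Passing a list with gaps, for example lags 0 to 6 and 9 to 13, gives the reduced input sets that arise when some lags do not contribute significantly.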
The following multiple linear regression (MLR) models are proposed for the prediction of
runoff using available rainfall measurements from rain gauges located at the site. The
models describe the relationships between daily rainfall and daily runoff using the
available data series for the five-year period. The proposed models for each catchment
are described in Sections 4.4.1.1, 4.4.1.2 and 4.4.1.3 respectively.
4.4.1.1 Model developed using 100% of calibration data
This section shows the model structures that have been developed using 100 percent of the
calibration data. The MLR model structures for each catchment are described as follows:
(i) Sungai Bekok catchment area
y(t) = 0.056 x_{t} + 0.080 x_{t-1} + 0.168 x_{t-2} + 0.235 x_{t-3} + 0.228 x_{t-4} + 0.218 x_{t-5}
       + 0.210 x_{t-6} + 0.187 x_{t-7} + 0.169 x_{t-8} + 0.143 x_{t-9} + 0.134 x_{t-10} + 0.138 x_{t-11}
       + 0.146 x_{t-12} + 0.137 x_{t-13} + 0.144 x_{t-14} + 0.145 x_{t-15}    (4.1)
(ii) Sungai Ketil catchment area
y(t) = 0.042 x_{t} + 0.048 x_{t-1} + 0.222 x_{t-2} + 0.334 x_{t-3} + 0.170 x_{t-4} + 0.145 x_{t-5}
       + 0.123 x_{t-6} + 0.135 x_{t-7} + 0.080 x_{t-8} + 0.085 x_{t-9} + 0.109 x_{t-10} + 0.101 x_{t-11}
       + 0.101 x_{t-12} + 0.086 x_{t-13} + 0.111 x_{t-14} + 0.089 x_{t-15}    (4.2)
(iii) Sungai Klang catchment area
y(t) = 0.561 x_{t} + 0.287 x_{t-1} + 0.136 x_{t-2} + 0.109 x_{t-3} + 0.065 x_{t-4} + 0.065 x_{t-5}
       + 0.052 x_{t-6} + 0.040 x_{t-9} + 0.040 x_{t-10} + 0.072 x_{t-11} + 0.049 x_{t-12} + 0.047 x_{t-13}
       + 0.061 x_{t-15} + 0.054 x_{t-16}    (4.3)
(iv) Sungai Slim catchment area
y(t) = 0.252 x_{t} + 0.306 x_{t-1} + 0.184 x_{t-2} + 0.156 x_{t-3} + 0.160 x_{t-4} + 0.172 x_{t-5}
       + 0.107 x_{t-6} + 0.106 x_{t-7} + 0.107 x_{t-8} + 0.082 x_{t-9} + 0.084 x_{t-10} + 0.083 x_{t-11}
       + 0.049 x_{t-12} + 0.079 x_{t-13} + 0.101 x_{t-14} + 0.107 x_{t-15}    (4.4)
where y(t) is the predicted runoff and x_{t}, x_{t-1}, ..., x_{t-n} are the rainfall
inputs at the corresponding times. It was observed that 16 inputs give significant
contributions to the model structures for the Sungai Bekok, Sungai Ketil and Sungai Slim
catchments. For the Sungai Klang catchment, only 14 inputs give a significant
contribution to the model structure; the rainfall at times t-7 and t-8 was not taken
into consideration.
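As a worked illustration, Equation (4.1) can be evaluated directly as a weighted sum of lagged rainfall. The coefficients below are copied from Equation (4.1); the function name `mlr_runoff`, the dictionary layout and the rainfall series are hypothetical and serve only to show the arithmetic.

```python
from typing import Dict, Sequence

# Coefficients of Equation (4.1), keyed by lag in days
# (Sungai Bekok, 100% of the calibration data).
BEKOK_COEFFS: Dict[int, float] = {
    0: 0.056, 1: 0.080, 2: 0.168, 3: 0.235, 4: 0.228, 5: 0.218,
    6: 0.210, 7: 0.187, 8: 0.169, 9: 0.143, 10: 0.134, 11: 0.138,
    12: 0.146, 13: 0.137, 14: 0.144, 15: 0.145,
}


def mlr_runoff(rainfall: Sequence[float], t: int,
               coeffs: Dict[int, float]) -> float:
    """Evaluate y(t) = sum over lags k of a_k * x_{t-k}."""
    if t - max(coeffs) < 0 or t >= len(rainfall):
        raise IndexError("not enough rainfall history for this time step")
    return sum(a * rainfall[t - lag] for lag, a in coeffs.items())


# Invented series: a single rainfall pulse of 10 units on day 15.
rain = [0.0] * 15 + [10.0] + [0.0] * 5
print(round(mlr_runoff(rain, 15, BEKOK_COEFFS), 3))  # pulse weighted by the lag-0 term
print(round(mlr_runoff(rain, 18, BEKOK_COEFFS), 3))  # pulse weighted by the lag-3 term
```

A model with omitted lags, such as Equation (4.3), is represented simply by leaving those lags out of the coefficient dictionary.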
4.4.1.2 Model developed using 50% of calibration data
This section shows the model structures that have been developed using 50 percent of the
calibration data. The MLR model structures for each catchment are described as follows:
(i) Sungai Bekok catchment area
y(t) = 0.054 x_{t} + 0.096 x_{t-1} + 0.167 x_{t-2} + 0.219 x_{t-3} + 0.213 x_{t-4} + 0.201 x_{t-5}
       + 0.197 x_{t-6} + 0.183 x_{t-7} + 0.167 x_{t-8} + 0.143 x_{t-9} + 0.126 x_{t-10} + 0.139 x_{t-11}
       + 0.149 x_{t-12} + 0.152 x_{t-13} + 0.159 x_{t-14} + 0.159 x_{t-15}    (4.5)
(ii) Sungai Ketil catchment area
y(t) = 0.085 x_{t} + 0.059 x_{t-1} + 0.210 x_{t-2} + 0.260 x_{t-3} + 0.142 x_{t-4} + 0.165 x_{t-5}
       + 0.141 x_{t-6} + 0.132 x_{t-7} + 0.095 x_{t-8} + 0.109 x_{t-9} + 0.130 x_{t-10} + 0.120 x_{t-11}
       + 0.100 x_{t-12} + 0.098 x_{t-13} + 0.105 x_{t-14} + 0.086 x_{t-15}    (4.6)
(iii) Sungai Klang catchment area
y(t) = 0.581 x_{t} + 0.286 x_{t-1} + 0.141 x_{t-2} + 0.102 x_{t-3} + 0.064 x_{t-4} + 0.085 x_{t-5}
       + 0.082 x_{t-6} + 0.050 x_{t-9} + 0.046 x_{t-10} + 0.061 x_{t-11} + 0.079 x_{t-12} + 0.067 x_{t-13}
       + 0.096 x_{t-15} + 0.048 x_{t-16}    (4.7)
(iv) Sungai Slim catchment area
y(t) = 0.305 x_{t} + 0.443 x_{t-1} + 0.252 x_{t-2} + 0.220 x_{t-3} + 0.217 x_{t-4} + 0.217 x_{t-5}
       + 0.143 x_{t-6} + 0.148 x_{t-7} + 0.142 x_{t-8} + 0.121 x_{t-9} + 0.142 x_{t-10} + 0.122 x_{t-11}
       + 0.070 x_{t-12} + 0.100 x_{t-13} + 0.121 x_{t-14} + 0.125 x_{t-15}    (4.8)
where y(t) is the predicted runoff and x_{t}, x_{t-1}, ..., x_{t-n} are the rainfall
inputs at the corresponding times. It was also found that 16 inputs give significant
contributions to the model structures for the Sungai Bekok, Sungai Ketil and Sungai Slim
catchments, and only 14 inputs for the Sungai Klang catchment.
4.4.1.3 Model developed using 25% of calibration data
This section shows the model structures that have been developed using 25 percent of the
calibration data. The MLR model structures for each catchment are described as follows:
(i) Sungai Bekok catchment area
y(t) = 0.126 x_{t-2} + 0.201 x_{t-3} + 0.227 x_{t-4} + 0.224 x_{t-5} + 0.208 x_{t-6} + 0.175 x_{t-7}
       + 0.154 x_{t-8} + 0.140 x_{t-9} + 0.125 x_{t-10} + 0.135 x_{t-11} + 0.153 x_{t-12} + 0.140 x_{t-13}
       + 0.127 x_{t-14} + 0.116 x_{t-15}    (4.9)
(ii) Sungai Ketil catchment area
y(t) = 0.160 x_{t} + 0.139 x_{t-1} + 0.210 x_{t-2} + 0.242 x_{t-3} + 0.168 x_{t-4} + 0.237 x_{t-5}
       + 0.167 x_{t-6} + 0.132 x_{t-7} + 0.126 x_{t-8} + 0.155 x_{t-9} + 0.161 x_{t-10} + 0.077 x_{t-11}
       + 0.070 x_{t-13} + 0.070 x_{t-14}    (4.10)
(iii) Sungai Klang catchment area
y(t) = 0.619 x_{t} + 0.286 x_{t-1} + 0.143 x_{t-2} + 0.104 x_{t-3} + 0.091 x_{t-4} + 0.079 x_{t-5}
       + 0.111 x_{t-6} + 0.063 x_{t-9} + 0.062 x_{t-10} + 0.060 x_{t-11} + 0.056 x_{t-12} + 0.093 x_{t-13}
       + 0.082 x_{t-15} + 0.075 x_{t-16}    (4.11)
(iv) Sungai Slim catchment area
y(t) = 0.316 x_{t} + 0.498 x_{t-1} + 0.246 x_{t-2} + 0.204 x_{t-3} + 0.194 x_{t-4} + 0.219 x_{t-5}
       + 0.125 x_{t-6} + 0.156 x_{t-7} + 0.132 x_{t-8} + 0.120 x_{t-9} + 0.124 x_{t-10} + 0.093 x_{t-11}
       + 0.088 x_{t-13} + 0.104 x_{t-14} + 0.127 x_{t-15}    (4.12)
where y(t) is the predicted runoff and x_{t}, x_{t-1}, ..., x_{t-n} are the rainfall
inputs at the corresponding times. It was observed that, using 25 percent of the data for
calibration, only 14 inputs give significant contributions to the model structures for
the Sungai Bekok, Sungai Ketil and Sungai Klang catchments. For the Sungai Slim
catchment, 15 inputs give a significant contribution to the model structure.
Tables H4.61 to H4.68 in Appendix H show the daily and hourly results of percentage bias
(PBIAS) of calibration or training for the MLR model. By the criterion of Yapo et al.
(1996), the MLR model can be classified as a biased model. This study indicates that the
PBIAS values are in the range of -50.0 to +300.0. Therefore, the models have a tendency
to overestimate or underestimate.
4.4.2 Results of Daily MLR Model
Tables 4.25(a) to 4.25(c) present the R^2, RMSE, RRMSE, and MAPE resulting from the MLR
model of the daily rainfall-runoff relationship for the Sungai Bekok catchment.
For the Sungai Bekok catchment, the number of inputs considered for MLR is 16 after
calibration using both 100% and 50% of the data sets, and 14 after calibration using 25%
of the data sets. The MLR model gives R^2 in the range of 70% to 90%, so it can be
classified as a fairly good model. According to Johnson and King (1988), the modelling
results for Sungai Bekok, with MAPE around 30%, can be considered a reasonable
prediction. During the calibration phase, the RMSE is consistently less than 186 cumecs
and the RRMSE is consistently less than 33.7 for this model. During testing, the RMSE is
consistently less than 183.3 cumecs and the RRMSE is less than 35.9.
Table 4.25(a): Results of MLR Model for Sg. Bekok catchment, using 100% of data sets in training phase
Model / Data Set    Model Structure   No. of Parameters   COC (R^2)   RMSE (cumecs)   RRMSE     MAPE (%)
MLR TRAINING        16 input          6                   0.8255      180.6978        32.6649   27.2412
MLR-TEST Set 1      16 input          6                   0.8803      164.8479        32.4936   28.0611
MLR-TEST Set 2      16 input          6                   0.8567      166.2273        32.6982   28.9360
MLR-TEST Set 3      16 input          6                   0.7568      164.1098        32.5462   27.1136
MLR-TEST Set 4      16 input          6                   0.8106      183.2238        35.8822   32.7099
MLR-TEST Set 5      16 input          6                   0.6682      142.3990        28.5501   21.6171
cumecs: cubic metres per second; COC: coefficient of correlation
Table 4.25(b): Results of MLR Model for Sg. Bekok catchment, using 50% of data sets in training phase
Model / Data Set    Model Structure   No. of Parameters   COC (R^2)   RMSE (cumecs)   RRMSE     MAPE (%)
MLR TRAINING        16 input          6                   0.8754      185.6333        33.6433   28.6339
MLR-TEST Set 1      16 input          6                   0.8708      163.3746        32.2331   27.9174
MLR-TEST Set 2      16 input          6                   0.9353      164.9307        32.3937   28.7739
MLR-TEST Set 3      16 input          6                   0.8258      161.9409        32.0716   26.7573
MLR-TEST Set 4      16 input          6                   0.9542      182.7206        35.7068   32.6845
MLR-TEST Set 5      16 input          6                   0.7409      140.9284        28.2122   21.4292
cumecs: cubic metres per second; COC: coefficient of correlation
Table 4.25(c): Results of MLR Model for Sg. Bekok catchment using 25% of data sets in training phase

Model      Data Set   Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE     MAPE (%)
MLR        Training   14 input          6                   0.8267     149.2337        27.8447   22.7327
MLR-TEST   Set 1      14 input          6                   0.8494     147.8456        29.1103   24.7764
MLR-TEST   Set 2      14 input          6                   0.9121     149.1146        29.2196   25.5280
MLR-TEST   Set 3      14 input          6                   0.8103     147.3607        29.1291   23.8458
MLR-TEST   Set 4      14 input          6                   0.9051     165.8000        32.2783   28.8100
MLR-TEST   Set 5      14 input          6                   0.7340     127.7982        25.5352   19.0643

cumecs = cubic metres per second; COC = correlation coefficient
Tables C4.26(a) to C4.26(c) in Appendix C present the R², RMSE, RRMSE, and MAPE
obtained from the MLR model for the Sungai Ketil catchment. For this catchment, the MLR
model uses 16 inputs when calibrated with 100% and 50% of the data sets, and 14 inputs
when calibrated with 25% of the data sets. The MLR model gives R² in the range of 60% to
80%. According to the correlation coefficient, this can be classified as a moderate
model. According to Johnson and King (1988), modelling results for Sungai Ketil with
MAPE of more than 30% cannot be considered a reasonable model. During the calibration
phase, the RMSE is consistently less than 19.6 cumecs and the RRMSE is consistently less
than 0.66. During testing, the RMSE is consistently less than 21.8 cumecs and the RRMSE
is less than 0.73.
Tables C4.27(a) to C4.27(c) in Appendix C present the R², RMSE, RRMSE, and MAPE
obtained from the MLR model for the Sungai Klang catchment. The MLR model uses 14
inputs when calibrated with 100%, 50% and 25% of the data sets. The MLR model gives R²
in the range of 60% to 80%. According to the correlation coefficient, this can be
classified as a moderate model. According to Johnson and King (1988), modelling results
for Sungai Klang with MAPE of more than 30% cannot be considered a reasonable model.
During the calibration phase, the RMSE is consistently less than 9.1 cumecs and the
RRMSE is consistently less than 0.54. During verification, the RMSE is consistently less
than 13.1 cumecs and the RRMSE is less than 0.61.
Tables C4.28(a) to C4.28(c) in Appendix C present the R², RMSE, RRMSE, and MAPE
obtained from the MLR model for the Sungai Slim catchment. The MLR model uses 16 inputs
when calibrated with 100% and 50% of the data sets, and 15 inputs when calibrated with
25% of the data sets. The MLR model gives R² in the range of 50% to 80%. According to
the correlation coefficient, this can be classified as a moderate model. According to
Johnson and King (1988), modelling results for Sungai Slim with MAPE around 30% can be
considered a reasonable prediction model. During the calibration phase, the RMSE is
consistently less than 144.2 cumecs and the RRMSE is consistently less than 2.2. During
verification, the RMSE is consistently less than 276 cumecs and the RRMSE is less than
4.2.
4.4.3 Verification
The only verification test widely used in rainfall-runoff modelling is the split-sample
test, in which one period of observations is used for calibration and a separate period
is used to check that the model predictions are satisfactory. As with the ANN model, the
MLR model was verified using different sets of data from different periods of time. For
daily rainfall-runoff modelling, five different sets of data (set 1 to set 5) were used
to study the capability of the models to transform rainfall into runoff under any
condition. The performance of each model is measured by R², RMSE, RRMSE, and MAPE.
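The split-sample procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the placeholder model is a one-parameter runoff-coefficient regression through the origin, not the MLR model of this study, and the function names and the `split_index` argument are hypothetical.

```python
def fit_runoff_coefficient(rain, flow):
    """Least-squares slope through the origin, so that flow ~ c * rain."""
    denom = sum(r * r for r in rain)
    if denom == 0:
        raise ValueError("rainfall series is all zero")
    return sum(r * q for r, q in zip(rain, flow)) / denom

def rmse(obs, sim):
    """Root mean square error between two equal-length series."""
    return (sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs)) ** 0.5

def split_sample_test(rain, flow, split_index):
    """Calibrate on records before split_index and verify on the rest."""
    if not 0 < split_index < len(rain):
        raise ValueError("split_index must leave data on both sides")
    # Calibration period: estimate the single model parameter.
    c = fit_runoff_coefficient(rain[:split_index], flow[:split_index])
    cal_rmse = rmse(flow[:split_index], [c * r for r in rain[:split_index]])
    # Verification period: parameter is held fixed, only errors are computed.
    ver_rmse = rmse(flow[split_index:], [c * r for r in rain[split_index:]])
    return c, cal_rmse, ver_rmse
```

A verification error much larger than the calibration error would suggest that the calibrated parameter does not transfer well to the independent period.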
The MLR model shows unsatisfactory performance, with poor agreement between the RMSE,
RRMSE and MAPE results of the calibration and validation phases. The main limitation of
the MLR model is that it is unable to handle a large number of data sets, so it may not
be properly calibrated. A further disadvantage is that a trial-and-error method must be
used to find the parameters that produce the best-fit results. Calibration therefore
takes longer and may not achieve the desired level of accuracy, which is a source of
uncertainty in the modelling.
This study found that, even with an intensive series of measurements of parameter
values, the results were not entirely satisfactory.
4.4.4 Robustness Test
As with the MLP and RBF models, the robustness tests for the MLR model were carried out
using the same sets of data from different periods of time. For daily and hourly
rainfall-runoff modelling, five different sets of data (set 1 to set 5) were used to
study the capability of the models to transform rainfall into runoff under any
condition. The performance of each model is measured by R², RMSE, RRMSE, and MAPE.
According to the coefficient of efficiency, R², the performance of the neural network
model is better than that of the MLR model in the training and testing phases. The
performance of the MLR model can be classified as moderate. For both calibration and
testing, the MLR model takes less time to calibrate than the ANN models. The input node
selection method of Harun (1999) can considerably reduce the time taken to find the
number of input nodes for the neural network model structures.
This study confirms that the MLR model is neither capable nor robust enough for
modelling the continuous daily rainfall-runoff relationship and the hourly streamflow
hydrograph. The results of R² and the error analysis show that the MLR model is slightly
better than the HEC-HMS and SWMM models. The results of the MLR model are not
consistently robust. This confirms that the MLR model shows poorer agreement between the
input and output of the rainfall-runoff relationship than the MLP and RBF models.
The limitation of the MLR model is that it is unable to handle a large number of data
sets, and its parameters cannot be adjusted in the calibration process. Its advantage is
that it takes a shorter time for training and validation than the MLP and RBF networks,
because the MLR model has fewer parameters than the MLP and RBF networks.
4.5 Results of the HEC-HMS Model
HEC-HMS is designed to simulate the rainfall-runoff processes of watershed systems. It
is intended to be applicable in a wide range of geographic areas and to the widest
possible range of problems. It uses a graphical user interface to build a watershed
model and to set up the rainfall and control variables for simulation. The program
features a completely integrated work environment including a database, data entry
utilities, a computation engine, and results reporting tools. The application of the
HEC-HMS model involves two steps. First, the model was calibrated using previous data
sets to determine the best parameters. Second, the model was verified using new sets of
data. HEC-HMS was run with the previous hourly rainfall-runoff data in order to provide
hourly predictions of runoff entering the selected catchments.
4.5.1 Calibration
The calibration of the HEC-HMS model parameters is accomplished by a trial-and-error
process, because calibration accuracy is very subjective and highly dependent on the
modeller's knowledge, experience, and understanding of the model components and the
catchment characteristics. All model calibrations and subsequent predictions are subject
to uncertainty. This uncertainty arises because no rainfall-runoff model is a true
reflection of the processes involved, because it is impossible to specify the initial
and boundary conditions required by the model with complete accuracy, and because the
observational data available for model calibration are not error-free. It is observed
that the HEC-HMS model takes longer to calibrate than the ANN and MLR models.
It is found that, if a model is calibrated using data that are in error, the effective
parameter values will be affected, and so will the predictions for other periods, which
depend on the calibrated parameter values. This is a source of uncertainty in modelling
with the HEC-HMS model. It is therefore worth stressing that, before any model is
applied, the rainfall-runoff data should be checked for consistency, even though some
errors may not be obvious.
In the calibration process, some initial values of the parameters are chosen and the
model is run with a calibration data set. The resulting predictions are compared with
the observed variables and a measure of goodness of fit is calculated. The measure is
scaled so that a perfect fit has a value of 1.0 and a very poor fit has a value of zero.
It is then a relatively simple matter to change the values of the parameters, make
another run, and recalculate the goodness of fit. Not all of those runs will result in
models that fit the data well. A lot of computer time can therefore be saved by avoiding
model runs that give poor fits when searching for an optimum parameter set.
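The calibration loop described above can be sketched as follows. The sketch rests on several assumptions that are not taken from the thesis: the goodness of fit is a Nash-Sutcliffe efficiency floored at zero (so that it lies between 0 and 1), the model is a toy single linear reservoir standing in for HEC-HMS, and the candidate parameter values are simply tried in turn. All function names are hypothetical.

```python
def nse_clipped(obs, sim):
    """Nash-Sutcliffe efficiency floored at 0, so the score lies in [0, 1]."""
    mean_obs = sum(obs) / len(obs)
    sst = sum((o - mean_obs) ** 2 for o in obs)
    if sst == 0:
        raise ValueError("observed series has no variance")
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return max(0.0, 1.0 - sse / sst)

def linear_reservoir(rain, k):
    """Toy model: storage gains rainfall and releases a fraction k each step."""
    storage, flow = 0.0, []
    for r in rain:
        storage += r
        q = k * storage
        storage -= q
        flow.append(q)
    return flow

def trial_and_error(rain, obs, candidates):
    """Run the model for each candidate k and keep the best-scoring value."""
    best_k, best_score = None, -1.0
    for k in candidates:
        score = nse_clipped(obs, linear_reservoir(rain, k))
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score
```

In practice the candidate list would be narrowed as the search proceeds, which is the saving in computer time that the paragraph above refers to.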
In the calibration process, it was found that the best methods for modelling the daily
rainfall-runoff relationship for the Sungai Bekok, Sungai Ketil, Sungai Klang and Sungai
Slim catchments are the Initial-Constant infiltration/loss parameterisation, the SCS
hydrograph transformation routine, and a recession baseflow component. The best methods
for modelling the hourly rainfall-runoff relationship for the same catchments are the
Initial-Constant infiltration/loss parameterisation, the Clark hydrograph transformation
routine, and a recession baseflow component, except for the Sungai Ketil catchment,
where the SCS hydrograph is used as the transformation routine. The initial loss and
initial flow are treated as initial conditions and vary from simulation to simulation.
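As an illustration of the Initial-Constant loss idea named above, the following is a simplified sketch and not the HEC-HMS code: rainfall on the pervious area first fills an initial loss, after which a constant loss rate is deducted each time step, while rainfall on the impervious fraction is assumed to become excess with no loss. The function name, the handling of the time step in which the initial loss is filled, and the example values are assumptions.

```python
def initial_constant_excess(rain_mm, initial_loss_mm, constant_rate_mm_per_hr,
                            impervious_fraction=0.0, dt_hr=1.0):
    """Return rainfall excess (mm per step) under an initial-and-constant loss."""
    remaining_initial = initial_loss_mm
    excess = []
    for p in rain_mm:
        # Rainfall first satisfies whatever is left of the initial loss.
        absorbed = min(p, remaining_initial)
        remaining_initial -= absorbed
        available = p - absorbed
        # The constant loss is then deducted from the remainder.
        pervious_excess = max(0.0, available - constant_rate_mm_per_hr * dt_hr)
        # Assumption: the impervious fraction sheds all rainfall as excess.
        excess.append(impervious_fraction * p
                      + (1.0 - impervious_fraction) * pervious_excess)
    return excess
```

For example, with an illustrative initial loss of 8 mm and a constant rate of 2 mm/hr, hourly rainfall of 5, 10, 3 and 0 mm on a fully pervious catchment gives excess of 0, 5, 1 and 0 mm.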
Tables H4.61 to H4.68 in Appendix H show the daily and hourly results of percentage
bias (PBIAS) for calibration (training) of the HEC-HMS model. The same input and output
data used for the previous models were used to calibrate this distributed or lumped
model. The trial-and-error method was used to obtain the best parameters for this model,
and the optimization facilities of the model were used to check whether the best
solutions had been found. According to the criteria of Yapo et al. (1996), the HEC-HMS
model shows unsatisfactory results. For the daily rainfall-runoff model, the PBIAS is in
the range of -39.0 to +9.0. For the hourly rainfall-runoff model, the PBIAS is in the
range of -7.0 to +60.0. This means that the models were biased and not robust, with a
tendency to overestimate or underestimate.
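A percentage bias of the kind reported above can be computed as in the sketch below. The sign convention is an assumption: here PBIAS is taken as the sum of (observed minus simulated) over the sum of observed, times 100, so a negative value indicates overestimation. The convention used in this study may be the reverse, and the function name `pbias` is hypothetical.

```python
def pbias(observed, simulated):
    """Percentage bias: 100 * sum(obs - sim) / sum(obs)."""
    if len(observed) != len(simulated):
        raise ValueError("series must be of equal length")
    total_obs = sum(observed)
    if total_obs == 0:
        raise ValueError("sum of observed flows is zero")
    # Assumed sign convention: negative means the model overestimates.
    residual = sum(o - s for o, s in zip(observed, simulated))
    return 100.0 * residual / total_obs
```

An unbiased simulation gives a value of zero under either convention.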
Calibration parameters for the daily rainfall-runoff modelling of Sungai Bekok are
shown in Tables 4.29(a) to 4.29(c). Calibration parameters for the daily rainfall-runoff
modelling of the Sungai Ketil, Sungai Klang, and Sungai Slim catchments are shown in
Tables D4.30(a) to D4.30(c), Tables D4.31(a) to D4.31(c), and Tables D4.32(a) to
D4.32(c) respectively, in Appendix D (Part A). Calibration parameters for the hourly
rainfall-runoff modelling of Sungai Bekok are shown in Tables 4.33(a) and 4.33(b).
Calibration parameters for the hourly rainfall-runoff modelling of the Sungai Ketil,
Sungai Klang, and Sungai Slim catchments are shown in Tables D4.34(a) to D4.34(b),
Tables D4.35(a) to D4.35(b), and Tables D4.36(a) to D4.36(b) respectively, in Appendix D
(Part B).
4.5.1.1 Results of the Daily HEC-HMS Model Calibration
Tables 4.29(a) to 4.29(c) show the calibration coefficients for the Sungai Bekok
catchment, obtained using 100, 50 and 25 percent of the historical data respectively.
Table 4.29(a): Calibration Coefficients of Sungai Bekok catchment (using 100% of data)

Model parameter               Calibrated value
Constant Loss Rate (mm/hr)    2.0
Imperviousness (%)            42
SCS Lag (minutes)             3000.00
Recession Constant            1
Threshold Flow (cumecs)       0.995

*Another 2 parameters (catchment size & baseflow) are fixed in the model
Table 4.29(b): Calibration Coefficients of Sungai Bekok catchment (using 50% of data)

Model parameter               Calibrated value
Constant Loss Rate (mm/hr)    1.62
Imperviousness (%)            42
SCS Lag (minutes)             2350.00
Recession Constant            1
Threshold Flow (cumecs)       0.995

*Another 2 parameters (catchment size & baseflow) are fixed in the model
Table 4.29(c): Calibration Coefficients of Sungai Bekok catchment (using 25% of data)

Model parameter               Calibrated value
Constant Loss Rate (mm/hr)    1.0
Imperviousness (%)            42
SCS Lag (minutes)             1700.00
Recession Constant            1
Threshold Flow (cumecs)       0.995

*Another 2 parameters (catchment size & baseflow) are fixed in the model
4.5.1.2 Results of the Hourly HEC-HMS Model Calibration
Tables 4.33(a) and 4.33(b) show the calibration coefficients for the Sungai Bekok
catchment, obtained using 25 percent and the minimum amount of historical data
respectively.
Table 4.33(a): Calibration Coefficients of Sungai Bekok catchment (using 25% of data)

Model parameter               Calibrated value
Constant Rate (mm/hr)         50
Imperviousness (%)            48
Time of Concentration (hr)    25.8
Storage Coefficient (hr)      180
Recession Constant            1
Threshold Flow (cumecs)       0.99

*Another 2 parameters (catchment size & baseflow) are fixed in the model
Table 4.33(b): Calibration Coefficients of Sungai Bekok catchment (using minimum data)

Model parameter               Calibrated value
Constant Rate (mm/hr)         3
Imperviousness (%)            48
Time of Concentration (hr)    18.25
Storage Coefficient (hr)      18
Recession Constant            1
Threshold Flow (cumecs)       0.99

*Another 2 parameters (catchment size & baseflow) are fixed in the model
4.5.2 Results of Daily HEC-HMS Model
Tables 4.37(a) to 4.37(c) present the R², RMSE, RRMSE, and MAPE obtained from the
HEC-HMS model for the Sungai Bekok catchment.
The HEC-HMS model yields R² below 70%, which indicates poor and unsatisfactory
performance. The MAPE results for Sungai Bekok are more than 30%, which cannot be
considered a reasonable prediction. This may be caused by the huge amount of data
introduced to the model. In general, the performance of the HEC-HMS model is
unsatisfactory during the calibration and verification phases. During the calibration
phase, the RMSE for Sungai Bekok is consistently in the range of 0.5 to 0.8 cumecs and
the RRMSE remains below 0.165. During verification, the RMSE is less than 0.84 cumecs
and the RRMSE is less than 0.19. Further, the RMSE and RRMSE results show that the model
approximates the rainfall-runoff process in the catchment more closely for the dry set
of data (set 4) than for the wet set of data (set 5). The application of the HEC-HMS
method to model the rainfall-runoff relationship of Sungai Bekok is not very successful
because of the poor correlation between the observed and computed results.
Table 4.37(a): Results of HEC-HMS Model for Sg. Bekok catchment using 100% of data sets in training phase

Model      Data Set   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
HEC        Training   7                   0.1120     0.7984          0.1630   125.3162
HEC-TEST   Set 1      7                   0.0486     0.5511          0.1156   91.6911
HEC-TEST   Set 2      7                   0.2014     0.5392          0.1124   87.3058
HEC-TEST   Set 3      7                   0.3436     0.6037          0.1261   100.1061
HEC-TEST   Set 4      7                   0.1231     0.6255          0.1312   99.4574
HEC-TEST   Set 5      7                   0.2680     0.8351          0.1818   139.0932

cumecs = cubic metres per second; COC = correlation coefficient
Table 4.37(b): Results of HEC-HMS Model for Sg. Bekok catchment using 50% of data sets in training phase

Model      Data Set   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
HEC        Training   7                   0.0320     0.6600          0.1310   100.5741
HEC-TEST   Set 1      7                   0.0560     0.5038          0.1054   84.2492
HEC-TEST   Set 2      7                   0.2005     0.5170          0.1081   84.7230
HEC-TEST   Set 3      7                   0.2692     0.5428          0.1116   91.8308
HEC-TEST   Set 4      7                   0.0745     0.6335          0.1316   101.9193
HEC-TEST   Set 5      7                   0.0755     0.7434          0.1571   138.9936

cumecs = cubic metres per second; COC = correlation coefficient
Table 4.37(c): Results of HEC-HMS Model for Sg. Bekok catchment using 25% of data sets in training phase

Model      Data Set   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
HEC        Training   7                   0.4242     0.5933          0.1185   93.8401
HEC-TEST   Set 1      7                   0.3465     0.5992          0.1225   95.0859
HEC-TEST   Set 2      7                   0.2997     0.5478          0.1100   85.0862
HEC-TEST   Set 3      7                   0.3979     0.6203          0.1272   102.4020
HEC-TEST   Set 4      7                   0.2195     0.6135          0.1236   92.2602
HEC-TEST   Set 5      7                   0.2146     0.6888          0.1451   120.4417

cumecs = cubic metres per second; COC = correlation coefficient
Tables E4.38(a) to E4.38(c) in Appendix E (Part A) present the R², RMSE, RRMSE, and
MAPE obtained from the HEC-HMS model for the Sungai Ketil catchment. For this catchment,
the HEC-HMS model also yields R² below 70%, which indicates poor and unsatisfactory
performance. In general, the performance of the HEC-HMS model is unsatisfactory during
the calibration and verification phases. Most of the MAPE results for Sungai Ketil are
above 30%, which is not considered a reasonable prediction. However, some simulations
show a reasonable prediction, especially when 25% of the data sets are used in the
training phase; according to Johnson and King (1988), these are considered very
accurate. During the calibration phase, the RMSE for Sungai Ketil is consistently below
0.44 cumecs and the RRMSE remains below 0.015. During verification, the RMSE is less
than 0.75 cumecs and the RRMSE is less than 0.03, approaching zero. Overall, the
application of the HEC-HMS method to model the rainfall-runoff relationship of Sungai
Ketil is moderate.
Tables E4.39(a) to E4.39(c) in Appendix E (Part A) present the R², RMSE, RRMSE, and
MAPE obtained from the HEC-HMS model for the Sungai Klang catchment. For this catchment,
the HEC-HMS model yields R² below 50%, which indicates poor and unsatisfactory
performance. The MAPE results for Sungai Klang are above 30%, which, according to
Johnson and King (1988), is not considered reasonable. In general, the performance of
the HEC-HMS model is unsatisfactory during the calibration and verification phases.
During the calibration phase, the RMSE for Sungai Klang is consistently below 16.6
cumecs and the RRMSE is below 0.68. During verification, the RMSE is less than 29 cumecs
and the RRMSE is less than 0.88. The application of the HEC-HMS method to model the
rainfall-runoff relationship of Sungai Klang is not satisfactory.
Tables E4.40(a) to E4.40(c) in Appendix E (Part A) present the R², RMSE, RRMSE, and
MAPE obtained from the HEC-HMS model for the Sungai Slim catchment. For this catchment,
the HEC-HMS model also yields R² below 50%, which indicates poor and unsatisfactory
performance. The MAPE results for Sungai Slim are also more than 30%, which, according
to Johnson and King (1988), is not considered a reasonable prediction model. In general,
the performance of the HEC-HMS model is moderate during the calibration and verification
phases. During the calibration phase, the RMSE for Sungai Slim is consistently below 0.1
cumecs and the RRMSE is below 0.002. During verification, the RMSE is less than 0.15
cumecs and the RRMSE is less than 0.0025. Overall, the application of the HEC-HMS method
to model the rainfall-runoff relationship of Sungai Slim is moderate.
The study demonstrates that the neural network models based on MLP and RBF networks
are more suitable for modelling the rainfall-runoff relationship than the HEC-HMS model.
The HEC-HMS model gives a higher error and a worse degree of efficiency than the MLP and
RBF models in terms of correlation coefficient, RMSE and RRMSE. Calibration also takes
longer for the HEC-HMS model, because a trial-and-error procedure is applied to find the
best parameters for the model structure. The MLP and RBF models show better performance
than the HEC-HMS model, with good agreement of the correlation coefficient, RMSE, RRMSE
and MAPE results in the calibration and verification phases, indicating the best fit to
the data.
4.5.3 Results of Hourly HEC-HMS Model
Tables 4.41(a) and 4.41(b) present the R², RMSE, RRMSE, and MAPE obtained from the
HEC-HMS model for the Sungai Bekok catchment.
For this catchment, the HEC-HMS model yields R² below 75%, which indicates poor and
unsatisfactory performance. Modelling results for Sungai Bekok with MAPE of less than
30% are considered a reasonable prediction. In general, the performance of the MLP and
RBF models is better than that of the HEC-HMS model in the calibration and verification
phases.
Table 4.41(a): Results of HEC-HMS Model for Sg. Bekok catchment using 25% of data sets in training phase

Model      Data Set   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
HEC        Training   8                   0.4867     0.7105          0.1367   30.8068
HEC-TEST   Set 1      8                   0.0209     0.2165          0.0427   116.6770
HEC-TEST   Set 2      8                   0.1480     0.3832          0.0488   62.5096
HEC-TEST   Set 3      8                   0.0440     0.4732          0.0905   88.5970
HEC-TEST   Set 4      8                   0.0798     0.2328          0.1312   42.9340
HEC-TEST   Set 5      8                   0.7069     0.2967          0.0084   22.5021

cumecs = cubic metres per second; COC = correlation coefficient
Table 4.41(b): Results of HEC-HMS Model for Sg. Bekok catchment using minimum data sets in training phase

Model      Data Set   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
HEC        Training   8                   0.6097     0.5329          0.0945   22.1940
HEC-TEST   Set 1      8                   0.6128     0.3508          0.0677   21.4041
HEC-TEST   Set 2      8                   0.37326    0.8719          0.1092   40.5906
HEC-TEST   Set 3      8                   0.3241     0.8794          0.1759   37.8493
HEC-TEST   Set 4      8                   0.1518     1.1288          0.3922   68.0408
HEC-TEST   Set 5      8                   0.2066     0.6248          0.0362   72.6862

cumecs = cubic metres per second; COC = correlation coefficient
Tables E4.42(a) and E4.42(b) in Appendix E (Part B) present the R², RMSE, RRMSE, and
MAPE obtained from the HEC-HMS model for the Sungai Ketil catchment. For this catchment,
the HEC-HMS model yields R² below 70%, which also indicates poor and unsatisfactory
performance. Modelling results for Sungai Ketil with MAPE of less than 30% are
considered a reasonable prediction. In general, the performance of the MLP and RBF
models is better than that of the HEC-HMS model in the calibration and verification
phases.
Tables E4.43(a) and E4.43(b) in Appendix E (Part B) present the R², RMSE, RRMSE, and
MAPE obtained from the HEC-HMS model for the Sungai Klang catchment. For this catchment,
the HEC-HMS model yields R² in the range of 10% to 80%, which shows that the model is
unsatisfactory and inconsistent. Modelling results for Sungai Klang with MAPE of more
than 30% are not considered a reasonable prediction. In general, the performance of the
MLP and RBF models is better than that of the HEC-HMS model in the calibration and
verification phases.
Tables E4.44(a) and E4.44(b) in Appendix E (Part B) present the R², RMSE, RRMSE, and
MAPE obtained from the HEC-HMS model for the Sungai Slim catchment. For this catchment,
the HEC-HMS model yields R² in the range of 10% to 60%, which shows that the model is
unsatisfactory, inconsistent and unreliable. Modelling results for Sungai Slim with MAPE
of more than 30% are not considered a reasonable prediction. In general, the performance
of the MLP and RBF models is better than that of the HEC-HMS model in the calibration
and verification phases.
4.5.4 Verification
Verification of distributed models is an issue that has received a great deal of recent
attention in rainfall-runoff modelling, following a number of studies carried out by
hydrologists. In fact, verification is not an appropriate term to use, since no model
approximation can be expected to be a valid representation of a complex real process.
Because distributed models make distributed predictions, there is considerable potential
for evaluating not only the discharge predictions at a catchment outlet but also the
internal state variables such as water table levels and soil moisture levels. The lack
of such evaluation in distributed models is due to the expense of collecting widespread
measurements of these internal variables, and to the difficulty of measuring quantities
at a scale that may differ significantly from the model element scale at which the
predictions are made. Freeze (1972) reported that, because of uncertainties in the
boundary conditions, initial conditions and parameter values of a distributed model, it
is unlikely that a true model validation will ever be possible, since the errors in
representing the system and specifying the inputs will surely induce unavoidable errors
in the simulations, however well a model appears to have been calibrated.
This study has found that, even with an intensive series of measurements of parameter
values, the results of the model verification process were not entirely satisfactory.
This may be because there is plenty of scope for the runoff to be simulated by a variety
of different mechanisms or parameters.
The performance of the HEC-HMS model can be classified as moderate. We observed that
its performance is unsatisfactory in that it lacks consistency and is not robust.
According to the coefficient of efficiency, the performance of the ANN models is better
than that of the HEC-HMS model in the training and validation phases. The HEC-HMS model
also takes a long time to calibrate compared with the other models.
4.5.5 Robustness Test
As with the other models, the robustness tests for the HEC-HMS model were carried out
using the same sets of data from different periods of time for daily and hourly
rainfall-runoff modelling. The performance of each model is again measured by R², RMSE,
RRMSE, and MAPE. The HEC-HMS model has the lowest calibration and verification accuracy
compared with the best-fit MLP, RBF and MLR models. For the HEC-HMS model, R² values
range from 0.1 to 0.6 for calibration and verification. In addition, the MAPE values are
comparatively high, at more than 30 percent of the observed values.
The HEC-HMS model takes a longer time to calibrate its parameters, because a
trial-and-error procedure is applied to find the best parameters. The advantage of this
model is that it has an optimization function for approximating the best-fit parameter
values. Although this function is simple and flexible, it was not consistent or robust.
The function involves many parameters that are related to each other and that are
expected to affect the overall process of the model. Furthermore, the main limitation of
the HEC-HMS model compared with the MLP and RBF models is that it is unable to handle a
large number of data sets. The MLP and RBF methods have been shown to handle the
non-linear processes within the catchment more easily than the HEC-HMS model.
This study confirms that the HEC-HMS model is not capable and shows unsatisfactory
results compared with the MLR and ANN models. The HEC-HMS model is not robust in
modelling the continuous daily rainfall-runoff relationship, for which it shows poor
agreement between the input and output. It shows moderate results in modelling the
hourly streamflow hydrograph. From the results of R² and the error analysis, we found
that the HEC-HMS model is neither consistent nor robust. In other words, this model
cannot work consistently and is unstable with long periods or large amounts of data.
4.6 Results of the SWMM Model
The application of the SWMM model also involves two steps. First, the model was
calibrated using previous data sets to determine the best parameters. Second, the model
was verified using new sets of data. The program features a completely integrated work
environment including a database, data entry utilities, a computation engine, and
results reporting tools. SWMM was run with the previous hourly rainfall-runoff data in
order to provide hourly predictions of runoff entering the selected catchments.
4.6.1 Calibration
The calibration of the SWMM model parameters is also accomplished by a trial-and-error
process. In this study, it is found that the calibration of the SWMM model can also be
affected by data that are in error. The effective parameter values are then affected,
and so are the predictions for other periods, which depend on the calibrated parameter
values. As with the HEC-HMS model, the SWMM model is also subject to uncertainty,
because it is impossible to specify the initial and boundary conditions required by the
model with complete accuracy and because the observational data available for model
calibration are not error-free. Uncertainty and sensitivity may also depend on the
period of data used. As in regression, these uncertainties normally grow as the model
predicts responses for conditions that are more and more extreme relative to the data
used in calibration.
Calibration was carried out separately for each catchment to determine its parameter
values. For each catchment, a set of parameters needs to be established so that the SWMM
model can simulate the rainfall-runoff processes. For calibration, the same rainfall
(input) and runoff (output) data used for the previous models were used. The
trial-and-error method was used to obtain the best parameters for this model. It was
found that the methods used for modelling the daily rainfall-runoff relationship for the
Sungai Bekok, Sungai Klang and Sungai Slim catchments are Horton infiltration/loss and
SCS Hydrology. For the Sungai Ketil catchment, the methods used are Horton
infiltration/loss and the Time Area unit hydrograph. For hourly rainfall-runoff
modelling, the methods used for the Sungai Bekok and Sungai Slim catchments are Horton
infiltration/loss and SCS Hydrology. For the Sungai Klang and Sungai Ketil catchments,
the methods used are Horton infiltration/loss and the Rational Formula unit hydrograph.
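For readers unfamiliar with the Horton infiltration/loss method named above, the following sketch evaluates the standard Horton equation, in which infiltration capacity decays exponentially from an initial rate f0 towards a final rate fc with decay constant k. It is an illustration only and not the SWMM implementation. The values of f0 and fc in the usage note are made up, and the time unit of k is assumed to match that of t.

```python
import math

def horton_capacity(t, f0, fc, k):
    """Infiltration capacity at time t: fc + (f0 - fc) * exp(-k * t)."""
    if t < 0 or k < 0:
        raise ValueError("t and k must be non-negative")
    return fc + (f0 - fc) * math.exp(-k * t)

def horton_series(duration, dt, f0, fc, k):
    """Capacity at t = 0, dt, 2*dt, ... up to and including duration."""
    steps = int(duration / dt) + 1
    return [horton_capacity(i * dt, f0, fc, k) for i in range(steps)]
```

For example, `horton_series(3600, 600, 75.0, 10.0, 0.0012)` returns seven values that fall from 75.0 towards 10.0. The decay value 0.0012 is the one listed for Sungai Bekok in Table 4.45, but its units are not stated in this section, so the time axis here is hypothetical.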
Calibration parameters for the daily rainfall-runoff modelling of Sungai Bekok are
shown in Tables 4.45(a) to 4.45(c). Calibration parameters for the daily rainfall-runoff
modelling of the Sungai Ketil, Sungai Klang, and Sungai Slim catchments are shown in
Tables F4.46(a) to F4.46(c), Tables F4.47(a) to F4.47(c), and Tables F4.48(a) to
F4.48(c) respectively, in Appendix F (Part A). Calibration parameters for the hourly
rainfall-runoff modelling of Sungai Bekok are shown in Tables 4.49(a) and 4.49(b).
Calibration parameters for the hourly rainfall-runoff modelling of the Sungai Ketil,
Sungai Klang, and Sungai Slim catchments are shown in Tables F4.50(a) to F4.50(b),
Tables F4.51(a) to F4.51(b), and Tables F4.52(a) to F4.52(b) respectively, in Appendix F
(Part B).
Tables H4.61 to H4.68 in Appendix H show the daily and hourly results of percentage
bias (PBIAS) for calibration (training) of the SWMM model. According to the criteria of
Yapo et al. (1996), the SWMM model also shows unsatisfactory results. For the daily
rainfall-runoff model, the PBIAS is in the range of -29.0 to +7.0. For the hourly
rainfall-runoff model, the PBIAS is in the range of -6.0 to +55.0. This means that the
models were biased and not robust. As with the HEC-HMS model, the SWMM model also has a
tendency to overestimate or underestimate, owing to the lower performance of the model
compared with the MLP and RBF models.
In rainfall-runoff modelling, the hydrological concepts are given priority over the
problems of parameter calibration, particularly in physically based models. For models
that have many parameters, there will likely be several sets of parameters that give a
good fit to the hydrograph using the required mechanisms. There are particular problems
in assessing the response surface and the sensitivity of parameters in distributed
models, because of the very large number of parameter values involved and the
possibilities for parameter interaction in specifying distributed fields of parameters.
This will remain a difficulty for the foreseeable future. The only sensible strategy in
calibrating distributed models would appear to be to insist that most of the parameters
are either fixed or calibrated with respect to some distributed observations, and not
against catchment discharge alone.
4.6.1.1 Results of the Daily SWMM Model Calibration
Tables 4.45(a) to 4.45(c) show the calibration coefficients for the Sungai
Bekok catchment, obtained using 100, 50 and 25 percent of the historical data
respectively.
Table 4.45(a): Calibration Coefficients of Sungai Bekok catchment (using 100% of data)
Model parameter               Calibrated value
Imperviousness (%)            34
Pervious Area CN              66
Time of Concentration (hr)    50
Initial Abstraction           0.2
Decay rate of infiltration    0.0012
*Another 2 parameters (catchment size & baseflow) are fixed in the model
Table 4.45(b): Calibration Coefficients of Sungai Bekok catchment (using 50% of data)
Model parameter               Calibrated value
Imperviousness (%)            34
Pervious Area CN              66
Time of Concentration (hr)    39.17
Initial Abstraction           0.15
Decay rate of infiltration    0.0012
*Another 2 parameters (catchment size & baseflow) are fixed in the model
Table 4.45(c): Calibration Coefficients of Sungai Bekok catchment (using 25% of data)
Model parameter               Calibrated value
Imperviousness (%)            34
Pervious Area CN              66
Time of Concentration (hr)    28.33
Initial Abstraction           0.15
Decay rate of infiltration    0.0012
*Another 2 parameters (catchment size & baseflow) are fixed in the model
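To show how a curve number and an initial abstraction value of the kind listed in these tables enter a loss calculation, the standard SCS curve number runoff equation can be sketched as below. This is a minimal sketch, assuming that "Initial Abstraction" denotes the ratio of initial abstraction to potential retention (Ia = 0.2 S) and that depths are in millimetres; it covers the pervious area only and is not the SWMM routine itself. The function name and storm depths are hypothetical.

```python
def scs_runoff_depth(rain_mm, curve_number, ia_ratio=0.2):
    """Direct runoff depth (mm) from the SCS curve number method."""
    # Potential maximum retention S in mm (metric form of the SCS relation).
    retention = 25400.0 / curve_number - 254.0
    initial_abstraction = ia_ratio * retention
    if rain_mm <= initial_abstraction:
        return 0.0  # all rainfall is abstracted before runoff begins
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention)

# CN = 66 and a ratio of 0.2 echo Table 4.45(a); the storm depths are hypothetical.
for rain in (10.0, 50.0, 100.0):
    print(rain, round(scs_runoff_depth(rain, 66), 2))
```

Lowering the ratio from 0.2 to 0.15, as in Tables 4.45(b) and 4.45(c), reduces the initial abstraction and therefore increases the computed runoff for the same storm.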
Calibration is not the only problem in finding an optimum parameter set.
Optimization generally assumes that the observations with which the simulations are
compared are error-free and that the model is a true representation of the data. Thus, the
optimum parameter set found for a particular model structure may be sensitive to
small changes in the observations, to the period of observations considered in the
calibration, and possibly to changes in the model structure. In the calibration process
for all three models, it has been found that the parameter values determined by
calibration are effectively valid only inside the model structure used in the calibration.
It may not be appropriate to use those values in different models or in different
catchments. Furthermore, the concept of an optimum parameter set may be ill-founded in
hydrological modelling. Given a number of parameter sets that give reasonable fits to the
data, it is most unlikely that the ranking of those sets in terms of the objective function
will be the same for different periods of calibration data.
4.6.1.2 Results of the Hourly SWMM Model Calibration
Tables 4.49(a) and 4.49(b) show the calibration coefficients for the Sungai
Bekok catchment, obtained using 25 percent and the minimum amount of historical data
respectively.
Table 4.49(a): Calibration Coefficients of Sungai Bekok catchment (using 25% of data)
Model parameter               Calibrated value
Imperviousness (%)            40
Pervious Area CN              60
Time of Concentration (hr)    52.8
Initial Abstraction           0.2
Decay rate of infiltration    0.00115
*Another 2 parameters (catchment size & baseflow) are fixed in the model
Table 4.49(b): Calibration Coefficients of Sungai Bekok catchment (using minimum data)
Model parameter               Calibrated value
Imperviousness (%)            40
Pervious Area CN              60
Time of Concentration (hr)    38.25
Initial Abstraction           0.2
Decay rate of infiltration    0.00115
*Another 2 parameters (catchment size & baseflow) are fixed in the model
4.6.2 Results of Daily SWMM Model
Tables 4.53(a) to 4.53(c) present the R², RMSE, RRMSE, and MAPE resulting
from the SWMM model of the daily rainfall-runoff relationship for the Sungai Bekok catchment.
Table 4.53(a): Results of SWMM Model for Sg. Bekok catchment using 100% of data sets in training phase
Model   Data Set     No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
SWMM    TRAINING     7                  0.2643     0.8421          0.1003   113.2260
SWMM    TEST Set 1   7                  0.1221     0.4373          0.2013   102.1902
SWMM    TEST Set 2   7                  0.4220     0.6821          0.2011   46.5643
SWMM    TEST Set 3   7                  0.5336     0.6001          0.1092   34.3310
SWMM    TEST Set 4   7                  0.2173     0.7565          0.1237   97.5860
SWMM    TEST Set 5   7                  0.4381     0.8330          0.1902   103.4751
cumecs = cubic metres per second; COC = coefficient of correlation
Table 4.53(b): Results of SWMM Model for Sg. Bekok catchment using 50% of data sets in training phase
Model   Data Set     No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
SWMM    TRAINING     7                  0.2635     0.5644          0.1020   107.8650
SWMM    TEST Set 1   7                  0.3433     0.5253          0.2011   76.4595
SWMM    TEST Set 2   7                  0.4929     0.4535          0.1104   83.2752
SWMM    TEST Set 3   7                  0.3001     0.4768          0.1007   67.8901
SWMM    TEST Set 4   7                  0.1021     0.6218          0.1467   90.6321
SWMM    TEST Set 5   7                  0.1388     0.8764          0.1623   145.2320
cumecs = cubic metres per second; COC = coefficient of correlation
Table 4.53(c): Results of SWMM Model for Sg. Bekok catchment using 25% of data sets in training phase
Model   Data Set     No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
SWMM    TRAINING     7                  0.5320     0.4532          0.2231   35.6701
SWMM    TEST Set 1   7                  0.3515     0.4012          0.4519   74.5109
SWMM    TEST Set 2   7                  0.4310     0.6421          0.2350   76.6534
SWMM    TEST Set 3   7                  0.3001     0.7509          0.2754   97.6503
SWMM    TEST Set 4   7                  0.2054     0.8431          0.1176   100.2210
SWMM    TEST Set 5   7                  0.4012     0.7210          0.1324   55.4341
cumecs = cubic metres per second; COC = coefficient of correlation
The SWMM model performs slightly better than the HEC-HMS model, with a small
improvement in terms of the coefficient of correlation and the error analysis. For the
Sungai Bekok catchment, the SWMM model yields R² values below 60%, which is
unsatisfactory. The MAPE values for Sungai Bekok are above 30%. According to Johnson
and King (1988), this is not considered a reasonable prediction model. In general, the
performance of the SWMM model is moderate during the calibration and verification
phases. During the calibration phase, the RMSE for Sungai Bekok is consistently below
0.85 cumecs and the RRMSE values are below 0.23. During verification, the RMSE is less
than 0.88 cumecs and the RRMSE is less than 0.46. The RMSE and RRMSE results show that
the model calibrated using the dry data set (set 4) approximates the rainfall-runoff
process in the catchment more accurately than the model calibrated using the wet data
set (set 5).
Tables G4.54(a) to G4.54(c) in Appendix G (Part A) present the R², RMSE,
RRMSE, and MAPE resulting from the SWMM model for the Sungai Ketil catchment. For the
Sungai Ketil catchment, the SWMM model yields R² values below 80%, which is
considered moderate. Most of the MAPE values for Sungai Ketil are around 30%.
According to Johnson and King (1988), this is considered a reasonable prediction model.
In general, the performance of the SWMM model is moderate during the calibration and
verification phases. During the calibration phase, the RMSE for Sungai Ketil is
consistently below 0.45 cumecs and the RRMSE values are below 0.06. During
verification, the RMSE is less than 0.68 cumecs and the RRMSE is less than 0.035,
approaching zero.
Tables G4.55(a) to G4.55(c) in Appendix G (Part A) present the R², RMSE,
RRMSE, and MAPE resulting from the SWMM model for the Sungai Klang catchment. For the
Sungai Klang catchment, the SWMM model yields R² values below 50%, which indicates
poor and unsatisfactory performance. The MAPE values for Sungai Klang are
more than 30%. According to Johnson and King (1988), this is considered an unreasonable
prediction. In general, the performance of the SWMM model is unsatisfactory for
modelling the rainfall-runoff relationship in the Sungai Klang catchment. During the
calibration phase, the RMSE for Sungai Klang is consistently below 15.5 cumecs and the
RRMSE values are below 0.66. During verification, the RMSE is less than 28 cumecs and
the RRMSE is less than 0.91.
Tables G4.56(a) to G4.56(c) in Appendix G (Part A) present the R², RMSE,
RRMSE, and MAPE resulting from the SWMM model for the Sungai Slim catchment. For the
Sungai Slim catchment, the SWMM model yields R² values below 60%, which is
unsatisfactory. The MAPE values for Sungai Slim are also more than 30% and,
according to Johnson and King (1988), this is not considered a reasonable prediction
model. In general, the performance of the SWMM model is unsatisfactory for modelling
the rainfall-runoff relationship in the Sungai Slim catchment. During the calibration
phase, the RMSE for Sungai Slim is consistently below 0.09 cumecs and the RRMSE values
are below 0.002. During verification, the RMSE is less than 0.16 cumecs and the RRMSE is
less than 0.12.
4.6.3 Results of Hourly SWMM Model
Tables 4.57(a) and 4.57(b) present the R², RMSE, RRMSE, and MAPE resulting
from the SWMM model of the hourly rainfall-runoff relationship for the Sungai Bekok
catchment.
For the Sungai Bekok catchment, the SWMM model yields R² values below 71%, which
indicates poor and unsatisfactory performance. According to Johnson
and King (1988), modelling results for Sungai Bekok with MAPE around 30% are
considered a reasonable prediction. In general, the performance of the SWMM model is
better than that of the HEC-HMS model. The MLP and RBF models are clearly better than
the SWMM model in the calibration and verification phases.
Table 4.57(a): Results of SWMM Model for Sg. Bekok catchment using 25% of data sets in training phase
Model   Data Set     No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
SWMM    TRAINING     7                  0.5221     0.6574          0.0937   21.0921
SWMM    TEST Set 1   7                  0.3049     0.2029          0.0492   27.8877
SWMM    TEST Set 2   7                  0.1854     0.3430          0.0323   58.7964
SWMM    TEST Set 3   7                  0.1012     0.4708          0.1090   81.0303
SWMM    TEST Set 4   7                  0.1102     0.2029          0.1093   39.8821
SWMM    TEST Set 5   7                  0.7071     0.3029          0.0080   30.8991
cumecs = cubic metres per second; COC = coefficient of correlation
Table 4.57(b): Results of SWMM Model for Sg. Bekok catchment using minimum data sets in training phase
Model   Data Set     No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
SWMM    TRAINING     7                  0.6983     0.4292          0.0494   28.9690
SWMM    TEST Set 1   7                  0.6303     0.3605          0.0129   30.5830
SWMM    TEST Set 2   7                  0.3626     0.8332          0.1103   120.1891
SWMM    TEST Set 3   7                  0.2949     0.8574          0.1563   116.7541
SWMM    TEST Set 4   7                  0.3829     1.3292          0.4039   149.7655
SWMM    TEST Set 5   7                  0.3029     0.5079          0.0293   76.2327
cumecs = cubic metres per second; COC = coefficient of correlation
Tables G4.58(a) and G4.58(b) in Appendix G (Part B) present the R², RMSE,
RRMSE, and MAPE resulting from the SWMM model for the Sungai Ketil catchment. For the
Sungai Ketil catchment, the SWMM model yields R² values below 70%, which
also indicates poor and unsatisfactory performance. According to Johnson and King
(1988), modelling results for Sungai Ketil with MAPE of more than 30% are not considered
a reasonable prediction. In general, the MLP and RBF models are better than the
SWMM model in the calibration and verification phases.
Tables G4.59(a) and G4.59(b) in Appendix G (Part B) present the R², RMSE,
RRMSE, and MAPE resulting from the SWMM model for the Sungai Klang catchment. For the
Sungai Klang catchment, the SWMM model yields R² values in the range of 10% to 85%,
which indicates poor and unreliable performance. According to Johnson and
King (1988), modelling results for Sungai Klang with MAPE of more than 30% are
not considered a reasonable prediction. In general, the MLP and RBF models are better
than the SWMM model in the calibration and verification phases.
Tables G4.60(a) and G4.60(b) in Appendix G (Part B) present the R², RMSE,
RRMSE, and MAPE resulting from the SWMM model for the Sungai Slim catchment. For the
Sungai Slim catchment, the SWMM model yields R² values below 75%, which
indicates poor and unsatisfactory performance. According to Johnson and King (1988),
modelling results for Sungai Slim with MAPE of more than 30% are not considered a
reasonable prediction model. However, some of the simulations show a reasonable
prediction with MAPE around 30%. In general, the performance of the model is moderate.
The MLP and RBF models are better than the SWMM model in the calibration and
verification phases.
4.6.4 Verification
Model verification is the process of verifying the correctness of the model parameters
for a catchment. In general, the procedure can be summarized as follows:
(i) Pick a data series for a period of time which has not been used by the
model for parameter calibration.
(ii) Simulate the runoff for the rainfall events of that period.
(iii) Compare the results of the simulation with the observed ones.
(iv) If the results are within a specific error range, then the model is verified
for that catchment.
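The four verification steps above can be sketched as a short routine. This is a minimal illustration under stated assumptions: `simulate` is a hypothetical stand-in for a calibrated model that maps one rainfall value to one runoff value, RMSE is used as the error measure, and `max_rmse` is an arbitrary tolerance chosen by the modeller.

```python
import math

def rmse(observed, simulated):
    """Root mean square error between two equal-length series."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

def verify_model(simulate, rainfall, observed_runoff, calibration_idx, max_rmse):
    """Return (verified, error) using only data not used for calibration."""
    used = set(calibration_idx)
    holdout = [i for i in range(len(rainfall)) if i not in used]   # step (i)
    simulated = [simulate(rainfall[i]) for i in holdout]            # step (ii)
    observed = [observed_runoff[i] for i in holdout]
    error = rmse(observed, simulated)                               # step (iii)
    return error <= max_rmse, error                                 # step (iv)

# Hypothetical data and a toy "calibrated" model (runoff = 0.5 * rainfall).
rainfall = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
observed_runoff = [0.0, 1.0, 2.0, 3.2, 4.0, 5.0]
verified, error = verify_model(lambda p: 0.5 * p, rainfall, observed_runoff,
                               calibration_idx=[0, 1, 2], max_rmse=0.2)
print(verified, round(error, 4))
```

Keeping the calibration indices separate from the verification indices is the essential point; any error measure from the study could replace RMSE in step (iii).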
Many different parameter sets may give good fits to the data, and it may be very
difficult to decide whether one is better than another. Furthermore, the parameter set
chosen for one period of observations may not be the optimum set for another period.
In approaching this problem of model calibration and testing, the SWMM model has more
useful features than the HEC-HMS model. The graphical features in the SWMM model make
the modelling process clearer and easier, so the parameter sets can be adjusted more
easily.
This study has found that, even using intensive series of measurements of
parameter values, the results have not been entirely satisfactory. If the concept of an
optimum parameter set must be superseded by the idea that many possible parameter sets,
and perhaps models, may provide acceptable simulations of the response of a particular
catchment, then it follows that validation of those models may be equally difficult. In
fact, rejecting some of the acceptable models given additional data may be much
more practical than suggesting that the models might be validated.
4.6.5 Robustness Test
As with the other models, the robustness tests for the SWMM model were carried out
using the same sets of data, covering different periods of time, for daily and hourly
rainfall-runoff modelling. The performance of each model is again measured
by the coefficient of correlation (R²), root mean square error (RMSE), relative root mean
square error (RRMSE), and mean absolute percentage error (MAPE). The results show
that the SWMM model has lower calibration and verification accuracy than the
best-fit networks. For the SWMM model, R² values vary in the range from 0.1 to 0.7 for
calibration and verification. In addition, the MAPE values are comparatively high, at
more than 30 percent of the observed values for this model.
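The four performance measures named above can be sketched as follows. This is a minimal sketch under stated assumptions: R² is computed as the squared Pearson correlation between observed and simulated flows, and RRMSE as the root mean square of the relative errors. The exact definitions used in the study may differ, and the function name and sample values are hypothetical.

```python
import math

def goodness_of_fit(observed, simulated):
    """Return R2, RMSE, RRMSE and MAPE; observed flows must be non-zero."""
    n = len(observed)
    pairs = list(zip(observed, simulated))
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in pairs)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    r2 = cov ** 2 / (var_o * var_s)                       # squared correlation (assumed)
    rmse = math.sqrt(sum((o - s) ** 2 for o, s in pairs) / n)
    rrmse = math.sqrt(sum(((o - s) / o) ** 2 for o, s in pairs) / n)  # assumed form
    mape = 100.0 / n * sum(abs((o - s) / o) for o, s in pairs)
    return {"R2": r2, "RMSE": rmse, "RRMSE": rrmse, "MAPE": mape}

# Hypothetical flows in cumecs, for illustration only.
print(goodness_of_fit([2.0, 4.0], [1.0, 5.0]))
```

The toy series has a perfect correlation but a MAPE of 37.5%, which illustrates why a single statistic is not sufficient to judge a model.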
Clearly, the main limitation of the SWMM model is that it is unable to handle a large
number of data sets. It also involves many parameters to be approximated in model
calibration, so it may not be properly calibrated. A further disadvantage of these
models is that a trial and error method has to be used to find the parameters that
produce the best-fit results. The model therefore takes a longer time to be adequately
calibrated and may not achieve the desired level of accuracy.
In general, the SWMM model demonstrated less capability in rainfall-runoff
modelling. In modelling the continuous daily and hourly rainfall-runoff relationship,
the SWMM model displays unsatisfactory performance. Based on the R² results and
the error analysis, it is clear that the SWMM model is not robust compared to the MLP
and RBF models.
4.7 Discussions on the Rainfall-Runoff Modelling
The nonlinear nature of rainfall-runoff processes makes them well suited to
the application of ANN methods. The proper use of a neural network requires not
only a physical understanding of the hydrological process under consideration but also
knowledge of neural networks and their operation. Trying to extract rules from a
network, or to give it some explanation capability, will entail extra computational
effort. These fundamental aspects lead to the construction of good training and
validation data sets, the selection and inclusion of relevant input variables, the
development of proper neural network architectures, and the selection of training
algorithms. The results of the ANN models show that the neural network models perform
better than the MLR, HEC-HMS and SWMM models in modelling the rainfall-runoff
relationship. The neural network is able to predict runoff accurately using the
rainfall data as the input variable.
This study models the rainfall-runoff relationship with rainfall as the input and
runoff as the output. In general, the ANN models, namely the MLP and RBF
networks, are successful in modelling the daily and hourly rainfall-runoff process. The
HEC-HMS and SWMM models are considered not capable: their results show
unsatisfactory performance in modelling the daily and hourly rainfall-runoff process,
without good agreement between the input and output of the rainfall-runoff relationship.
The variation of the peak flow predictions is not significant. Meanwhile, the MLR model
is considered moderate for estimating runoff, with moderate training and
validation accuracy. However, based on the R² results and the error analysis, the MLR
model is found to be inconsistent and not robust compared to the MLP and
RBF models. This is because the peak flows for calibration and validation are
significantly low for this model. This study revealed that the MLR, HEC-HMS,
and SWMM models are not very sensitive to the number of observations in the
calibration or training data. In fact, when the models become more complex with the
addition of more input variables, the variation in calibration and validation accuracy
becomes slightly lower. Meanwhile, for the MLP and RBF networks, when the complexity
of the model increases, the amount of data required increases. Thus, the variation in
training and testing accuracy becomes more pronounced.
Based on the results of training or calibration of the models, the MLP and RBF
networks were selected as the best-fit networks to model the rainfall-runoff relationship
for the selected catchments because of their high training and testing accuracy. For the
hourly rainfall-runoff modelling, the networks estimate the peak runoffs or
high flows close to their observed values. Overall, the networks provide fairly
accurate predictions of low and high flow conditions for training and testing. Meanwhile,
for the daily rainfall-runoff modelling, some of the low flows and high flows are over-
and under-predicted for the training and testing average-year data. The flow predictions
are nevertheless considered fairly accurate, with good agreement in the goodness of fit
tests.
It is observed that the MLP model performed better than the RBF model in
modelling the continuous daily rainfall-runoff relationship. This is because the MLP
model used a back-propagation algorithm to yield best-fit results and required a long
period of data for model calibration. Meanwhile, the RBF model performed better than the
MLP model in modelling hourly rainfall-runoff hydrographs or short event hydrographs.
In general, the MLP and RBF networks show slightly better performance in both
the training and testing periods, with good agreement of the RMSE and RRMSE results,
indicating the best fit to the data.
In general, for the four selected catchments, the ANN models provided higher
training and testing accuracy than the MLR model. Based on the goodness
of fit statistics, the accuracy of the ANN models compared favourably with that of the
existing technique. The MLP and RBF networks show slightly better performance in both
the training and validation phases compared to the MLR model. The results show that the
MLP and RBF models give a lower error than the MLR model, with good agreement of the
RMSE and RRMSE results, indicating the best fit to the data. However, it is observed
that the MLR model is consistent and robust compared to the HEC-HMS and SWMM models.
Although the R² values for training and testing increased slightly, the results of RMSE,
RRMSE and MAPE are still fairly poor compared to the HEC-HMS and SWMM models.
For the four selected catchments, the ANN models provided higher training and
testing accuracy than the HEC-HMS model. Based on the goodness of fit
statistics, the accuracy of the ANN models compared favourably with that of the HEC-HMS
model. The study demonstrates that neural network models based on MLP and RBF
networks are more suitable for modelling the rainfall-runoff relationship than the
HEC-HMS model. The MLP and RBF models show better performance than the
HEC-HMS model, with good agreement of the RMSE and RRMSE results in the training and
validation phases, indicating the best fit to the data. The HEC-HMS model gives a higher
error than the MLP and RBF models and a lower degree of efficiency in terms of the
coefficient of correlation, RMSE and RRMSE.
The study demonstrates that neural network models based on MLP and RBF
networks are more reliable and suitable for modelling the daily and hourly rainfall-runoff
relationship than the SWMM model. For the four selected catchments, the ANN
models provided higher training and testing accuracy than the SWMM
model, as verified by the goodness of fit results. The SWMM model gives
a higher error than the MLP and RBF models and a lower degree of efficiency. However, it
shows an improvement over the HEC-HMS model. Like the HEC-HMS model, it
also takes a longer time to calibrate, because a trial and error procedure is applied
to find the best parameters for the SWMM model. The MLP and RBF
models show better performance than the SWMM model, with good agreement of the
goodness of fit results in the training and validation phases, indicating the best fit to
the data. Furthermore, the MLP and RBF methods have been shown to handle the
non-linear processes within the catchment more easily than the XP-SWMM models.
4.7.1 Basic Model Structure
In this study, the first important step was to develop the model structure for the
various models. This was carried out using a trial and error method and some
professional judgement to define the optimal number of input nodes. It
is emphasized that this process is important for the neural network models, because
there are no methods or formulae that can be used to model the structure of watersheds,
owing to their complexity and nonlinearity. The only way to develop the network
structure is by using appropriate procedures or rules.
The number of nodes in the hidden layer was determined by trial and error for
each case. If the number of hidden nodes is too small, the network can under-fit
the data and may not achieve the desired level of accuracy, while with too many nodes
it will take a long time to be adequately trained and may sometimes over-fit the data.
French et al. (1992) proposed that neural networks normally be developed using 15,
30, 45, 60 and 100 hidden nodes. This procedure is also used to examine the
performance of neural network models with different numbers of hidden nodes and hidden
layers. In this study, this procedure is not relevant, because each
additional node yields different results and may contribute to the consistency
and accuracy of the model. For example, networks using 5, 6, 7, 8 hidden nodes and
so on were developed until the optimum number of nodes was found that produced
highly accurate and reliable results. Overall, greatly increasing the number of hidden
layers and hidden nodes in the model increases the complexity of the
system and may slow the calibration process without substantially improving the
efficiency of the network. It is therefore proposed to increase the number of nodes in
the hidden layer one by one.
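The one-by-one search over hidden nodes described above can be sketched as follows. This is a minimal sketch, not the procedure used in the study: `validation_error` is a hypothetical callable standing in for "train a network with n hidden nodes and return its testing error", and the `patience` stopping rule is an assumption added to keep the search finite.

```python
def select_hidden_nodes(validation_error, start=1, max_nodes=30, patience=3):
    """Grow the hidden layer one node at a time and return the best size found.

    validation_error: callable mapping a node count to a testing error.
    The search stops after `patience` consecutive sizes without improvement.
    """
    best_n, best_err = None, float("inf")
    stale = 0
    for n in range(start, max_nodes + 1):
        err = validation_error(n)
        if err < best_err:
            best_n, best_err, stale = n, err, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_n, best_err

# Hypothetical error curve standing in for real network training;
# it has its minimum at 7 hidden nodes.
def toy_error(n):
    return (n - 7) ** 2 / 100.0 + 0.3

print(select_hidden_nodes(toy_error))
```

Because every candidate size is evaluated in turn, the search cannot skip over a good configuration in the way a coarse grid such as 15, 30, 45, 60 and 100 nodes might.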
In general, there is no significant difference between the results of 3-layer (one
hidden layer) and 4-layer (two hidden layers) MLP networks in modelling the
daily and hourly rainfall-runoff relationship for the selected catchments. Although
two-hidden layer networks provide slightly better training accuracy than one-hidden
layer networks, the latter have better testing accuracy. The increase in training
accuracy in two-hidden layer networks is due to over-fitting. As a result, the one-hidden
layer networks provide better testing accuracy. There is no significant difference in the
goodness of fit values between one- and two-hidden layer networks for either the training
or the testing phase. In the literature, researchers generally hold that a network with
one hidden layer is sufficient. This statement is true for situations where the data
contain enough information on the system of interest. The results of this study indicate
that two-hidden layer networks do not lead to an increase in prediction accuracy.
Master (1993) suggested that using one hidden layer for an MLP is
sufficient, because in most problems a second hidden layer will not produce a large
improvement in performance. The use of more than one hidden layer substantially
increases the number of parameters to be estimated. In general, increasing the number
of hidden layers and hidden nodes in the model increases the complexity
of the system and may slow the calibration process without substantially improving the
efficiency of the network. The accuracy of the MLP increases as more input data
are made available to it.
Harun (1999) proposed the stepwise regression technique to determine the
number of input nodes for the input layer of a neural network. This technique selects
only certain variables that contribute to the model. This study has determined that the
trial and error method is more relevant and practical. It was found that every
single input contributed to the model and substantially improved the efficiency of
the network. This process is repeated until the optimal number of input nodes
for the network is found. Harun (1999) also proposed using a multilayer
perceptron with only one type of transfer function, namely the binary sigmoid. Several
other activation functions are more flexible, such as the hyperbolic tangent function
(tansig), which is used with weights and biases initialized to random values between +1
and -1. Many previous studies have used this kind of activation function. The
hyperbolic tangent and sigmoid are often employed as transfer functions in the training
of networks (Tokar and Johnson, 1999).
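The two transfer functions mentioned above can be compared with a short sketch. The function names are illustrative. The binary sigmoid maps its input to the interval (0, 1), while the hyperbolic tangent maps it to (−1, 1), and the two are related by tanh(x) = 2·sigmoid(2x) − 1.

```python
import math

def binary_sigmoid(x):
    """Binary (logistic) sigmoid, with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tansig(x):
    """Hyperbolic tangent sigmoid, with outputs in (-1, 1)."""
    return math.tanh(x)

for x in (-2.0, 0.0, 2.0):
    # Check the identity tanh(x) = 2 * sigmoid(2x) - 1 at each sample point.
    assert abs(tansig(x) - (2.0 * binary_sigmoid(2.0 * x) - 1.0)) < 1e-12
    print(x, round(binary_sigmoid(x), 4), round(tansig(x), 4))
```

Since the hyperbolic tangent is symmetric about zero, it is commonly paired with inputs and targets scaled to a range centred on zero, whereas the binary sigmoid suits targets scaled to between 0 and 1.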
Tokar (1996) modelled daily runoff as a function of
daily precipitation, temperature and snowmelt for two watersheds, the Little Patuxent
River, Maryland and the Independence River, New York. The neural network model was
compared with regression and simple conceptual models. Tokar
concluded that the ANN model provided higher training and testing accuracy than
the regression and simple conceptual models. Tokar also found that the
networks trained using two years of data provided slightly better training and testing
accuracy, and higher percentage predictions of peak discharges, than the
networks trained using one year of data. The present work has found that the neural
network model can provide good and accurate performance in the training and testing
phases when the networks are trained using a longer period of data. The networks were
trained using 100%, 50% and 25% of 4 years of data. It was found that the networks
trained using 100% of the data produce more accurate and consistent results than
the networks trained using 50% of the data. Tokar
also concluded that the accuracy of the network trained using the hyperbolic tangent
activation function was slightly better than that of the one trained using the sigmoid
transfer function. To evaluate network reliability for future predictions, Tokar proposed
using a goodness of fit statistic, namely the root mean square error (RMSE). Several
types of goodness of fit statistics can be applied, and it is not adequate to use only
one statistic to measure accuracy. More than
one method should be tried before drawing any conclusion, to avoid bias. Tokar also
concluded that ANN models are not very sensitive to the selection of the number of
neurons and layers in the network. This study has found that the neural network model is
sensitive to the selection of the number of neurons in the network. Once the number of
hidden nodes is increased, the number of parameters increases too. It can be concluded
that the number of hidden nodes is sensitive to the selection of the number of input
variables.
In this study, trial and error procedures are adopted to develop the network
structures for the selected watersheds. The structures may differ from each other
depending on the time interval and on the quality and quantity of the input-output data
pairs. The rainfall at the current and previous times, t, t − i (where i = 1, 2, …),
appears to contain the information needed to model rainfall-runoff processes, owing to
its high intercorrelation with the runoff data at time t. In most commonly used
rainfall-runoff models, such as the Martinec model, runoff for the previous time period
is included as an input variable to the model (Hawley, 1979). This study likewise found
that including the runoff at the previous time (t − i) makes the models capable of
producing good results. This was observed especially in
developed areas such as the Sungai Klang catchment, where the catchment characteristics
are very complex and the data contain many highly extreme events. For the Sungai Klang
catchment, it was found that the runoff at the previous times (t − 1) and (t − 2) made a
significant contribution to the capability of the hourly ANN model to produce more
accurate predictions.
The following ANN model structures are proposed for the prediction of runoff
using available rainfall measurements from rain gauges located at the site. The models
are proposed to describe the relationships between daily and hourly rainfall and runoff
using the available data series. Section 4.7.1.1 and Section 4.7.1.2 describe the optimal
numbers of input nodes for the daily and hourly rainfall-runoff models respectively.
4.7.1.1 Optimal numbers of input nodes for daily rainfall-runoff models
This section presents the model structure of the daily rainfall-runoff relationship for
each catchment, determined using a trial and error method. The model structures
proposed for each catchment are described as follows:
(i)
Sungai Bekok catchment
y (t ) = f {x(t ), x (t − 1), ... , x (t − 15)}
(4.13)
For the Sungai Bekok catchment, it was observed that the current and
previous time of rainfalls at time t to ( t − 15 ) gave significant
contributions to the model accuracy of ANN network structure. In overall,
this particular area consists of sixteen inputs data in their model structure.
Figure 4.25(a) and Figure 4.25(b) shows the architecture of 3-layer and 4layer of the MLP network structures for Sg. Bekok catchment
respectively.
(ii)
Sungai Ketil catchment
y (t ) = f {x(t ), x (t − 1), ... , x (t − 15)}
(4.14)
For the Sungai Ketil catchment, it was observed that the current and
previous time of rainfalls at time t to ( t − 15 ) gave significant
contributions and most accurate to the ANN network structure. Thus, this
particular area is also consists of sixteen inputs data in their model
structure. Figure J4.26(a) and Figure J4.26(b) in Appendix J shows the
architecture of 3-layer and 4-layer of the MLP network structures for Sg.
Ketil catchment respectively.
(iii) Sungai Klang catchment
y(t) = f{x(t), x(t − 1), ..., x(t − 15), x(t − 16)}    (4.15)
For the Sungai Klang catchment, it was observed that the current and previous
rainfalls, from time t to (t − 16), contributed significantly to the accuracy of the
ANN model. Overall, the model structure has seventeen inputs, which reflects a more
complex and highly non-linear rainfall-runoff relationship in this catchment.
Figure J4.27(a) and Figure J4.27(b) in Appendix J show the architecture of the
3-layer and 4-layer MLP network structures for the Sg. Klang catchment respectively.
(iv) Sungai Slim catchment
y(t) = f{x(t), x(t − 1), ..., x(t − 15)}    (4.16)
Meanwhile, for the Sungai Slim catchment, it was observed that the current and
previous rainfalls, from time t to (t − 15), contributed significantly to the accuracy
of the ANN model. Thus, the model structure for this catchment also has sixteen inputs.
Figure J4.28(a) and Figure J4.28(b) in Appendix J show the architecture of the
3-layer and 4-layer MLP network structures for the Sg. Slim catchment respectively.
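The daily model structures above can be illustrated with a minimal sketch that assembles the
lagged rainfall inputs for each time step. The function name, the variable names and the
rainfall values are assumptions made for illustration only; they are not taken from the
software used in this study.

```python
def lagged_rainfall_inputs(rain, n_lags):
    """Build one input row [x(t), x(t-1), ..., x(t-n_lags)] for every
    time step t that has a full rainfall history behind it."""
    rows = []
    times = []
    for t in range(n_lags, len(rain)):
        rows.append([rain[t - k] for k in range(n_lags + 1)])
        times.append(t)
    return rows, times

# Hypothetical daily rainfall series (mm), 20 days long.
rain = [float(d % 7) for d in range(20)]

# n_lags=15 corresponds to x(t) ... x(t-15), i.e. sixteen input nodes.
rows, times = lagged_rainfall_inputs(rain, n_lags=15)
print(len(rows[0]))   # 16
print(times[0])       # 15, the first time step with a complete history
```

With n_lags=16 the same function gives the seventeen inputs of the Sungai Klang structure.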
4.7.1.2 Optimal numbers of input nodes for hourly rainfall-runoff models
This section presents the model structure of the hourly rainfall-runoff relationship for
each catchment. The structures were determined by trial and error. The model structure
proposed for each catchment is described as follows:
(i) Sungai Bekok catchment
y(t) = f{x(t), x(t − 1), x(t − 2), x(t − 3), x(t − 4), x(t − 5), y(t − 1)}    (4.17)
For the Sungai Bekok catchment, it was observed that the current and previous
rainfalls, from time t to (t − 5), and the flow at the previous time (t − 1) contributed
significantly to the accuracy of the hourly ANN model. Thus, the model structure for
this catchment has seven inputs. Figure 4.29(a) and Figure 4.29(b) show the
architecture of the 3-layer and 4-layer MLP network structures for the Sg. Bekok
catchment respectively.
(ii) Sungai Ketil catchment
y(t) = f{x(t), x(t − 1), x(t − 2), x(t − 3), x(t − 4), y(t − 1)}    (4.18)
For the Sungai Ketil catchment, it was observed that the current and previous
rainfalls, from time t to (t − 4), and the flow at the previous time (t − 1) contributed
significantly to the accuracy of the ANN model. Overall, the model structure for this
catchment has six inputs. Figure J4.30(a) and Figure J4.30(b) in Appendix J show the
architecture of the 3-layer and 4-layer MLP network structures for the Sg. Ketil
catchment respectively.
(iii) Sungai Klang catchment
y(t) = f{x(t), x(t − 1), x(t − 2), x(t − 3), x(t − 4), x(t − 5), y(t − 1), y(t − 2)}    (4.19)
For the Sungai Klang catchment, it was observed that the current and previous
rainfalls, from time t to (t − 5), and the flows at the previous times (t − 1) and (t − 2)
contributed significantly to the accuracy of the ANN model. Thus, the model structure
for this catchment has eight inputs. Again, this reflects that the rainfall-runoff
relationship in this catchment is relatively complex and non-linear. Figure J4.31(a)
and Figure J4.31(b) in Appendix J show the architecture of the 3-layer and 4-layer MLP
network structures for the Sg. Klang catchment respectively.
(iv) Sungai Slim catchment
y(t) = f{x(t), x(t − 1), x(t − 2), x(t − 3), x(t − 4), x(t − 5), y(t − 1)}    (4.20)
Meanwhile, for the Sungai Slim catchment, it was observed that the current and
previous rainfalls, from time t to (t − 5), and the flow at the previous time (t − 1)
contributed significantly to the accuracy of the ANN model. The model structure for
this catchment also has seven inputs. Figure J4.32(a) and Figure J4.32(b) in Appendix J
show the architecture of the 3-layer and 4-layer MLP network structures for the
Sg. Slim catchment respectively.
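As a check on the input counts stated for Equations (4.17) to (4.20), the following sketch
assembles the hourly input vector from lagged rainfall and lagged runoff. The function name,
the dictionary of lag settings and the short rainfall and runoff series are hypothetical and
serve only to illustrate the structure of the inputs.

```python
def hourly_input_vector(rain, runoff, t, rain_lags, runoff_lags):
    """Inputs for y(t): x(t) ... x(t-rain_lags), then y(t-1) ... y(t-runoff_lags)."""
    if t < max(rain_lags, runoff_lags):
        raise ValueError("not enough history before time t")
    x_part = [rain[t - k] for k in range(rain_lags + 1)]
    y_part = [runoff[t - k] for k in range(1, runoff_lags + 1)]
    return x_part + y_part

# (rain_lags, runoff_lags) as read off Equations (4.17) to (4.20).
HOURLY_STRUCTURES = {
    "Sg. Bekok": (5, 1),
    "Sg. Ketil": (4, 1),
    "Sg. Klang": (5, 2),
    "Sg. Slim": (5, 1),
}

rain = [0.0, 1.2, 5.0, 3.1, 0.4, 0.0, 0.0, 2.2]     # hypothetical values, mm
runoff = [1.0, 1.1, 1.9, 2.8, 2.5, 2.0, 1.7, 1.8]   # hypothetical values, m3/s

for name, (p, q) in HOURLY_STRUCTURES.items():
    vec = hourly_input_vector(rain, runoff, t=7, rain_lags=p, runoff_lags=q)
    print(name, len(vec))   # 7, 6, 8 and 7 input nodes respectively
```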
The selection of training data that represent the characteristics of a catchment
and its rainfall patterns is extremely important in modelling. The value of y(t − 1) in the
model structures can probably represent the soil moisture condition or the water table
level in the study area. The training data should be large enough to contain the
characteristics of the catchment and to accommodate the requirements of the ANN
architecture. Although the network architecture depends highly on the complexity of the
mapping between the input and the output variables, it is constrained by the availability
of data, especially in applications where data are limited. Using a large number of neurons
or extra layers in a network cannot help to detect patterns in the phenomenon that are not
included in the training data. The selection of the number of nodes or neurons in a layer
and the number of layers in a network has a significant effect on the performance of the
neural network models and on the time spent to train the networks. A network with a large
number of nodes and layers is usually very slow and requires large training sets. However,
it has the ability to learn more complex patterns. Meanwhile, if the number of parameters in
the network is much smaller than the total number of points in the training set, then there
is little or no chance of over-fitting. If more data are used to increase the size of the
training set, over-fitting becomes less of a concern. So, it can be concluded that the
selection of the size of the network is problem specific and can be accomplished by
experience and experimentation.
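The comparison between the number of network parameters and the number of training points
can be made concrete with a short sketch. It counts the weights and biases of a fully
connected feed-forward network; the hidden-layer sizes and the training-set length used
below are hypothetical and are not the values adopted in this study.

```python
def mlp_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected feed-forward network.
    layer_sizes lists the nodes per layer, input layer first."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out   # weights plus one bias per receiving node
    return total

# Sixteen input nodes as in the daily models; ten hidden nodes is an assumption.
three_layer = mlp_parameter_count([16, 10, 1])
four_layer = mlp_parameter_count([16, 10, 10, 1])

n_training_points = 3 * 365   # assumed: about three years of daily records
print(three_layer, four_layer)            # 181 291
print(n_training_points / three_layer)    # training points per parameter
```

The ratio of training points to parameters falls as hidden nodes or layers are added, which
is one reason why larger networks require larger training sets.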
Figure 4.25(a) The 3-layer MLP network structures of the daily model for Sg. Bekok catchment.
Figure 4.25(b) The 4-layer MLP network structures of the daily model for Sg. Bekok catchment.
Figure 4.29(a) The 3-layer MLP network structures of the hourly model for Sg. Bekok catchment.
Figure 4.29(b) The 4-layer MLP network structures of the hourly model for Sg. Bekok catchment.
4.7.2 Model Performance
The study also demonstrates that neural network models based on MLP and RBF are
suitable for modelling the rainfall-runoff relationship. With a good training process and
suitable algorithms and numbers of nodes, the prediction is more accurate. The GRNN
algorithms are ‘fast learners’, and the RBF network could predict runoff accurately, with
good agreement between the observed and predicted values. The RBF model proved to be
robust in modelling the rainfall-runoff relationship. It was successful for both
single-storm and multiple-storm events. The MLP and RBF methods are therefore highly
recommended for rainfall-runoff modelling problems.
In the literature review, the ANN methodology has been reported to provide
reasonably good solutions in circumstances where complex systems may be poorly defined
or understood using mathematical equations, in problems that deal with noise or involve
pattern recognition, and in situations where the input-output data are incomplete or
ambiguous. Because of these characteristics, it was believed that ANN could be applied to
model the daily and hourly rainfall-runoff relationship. In the previous chapter, it was
demonstrated that the ANN rainfall-runoff models are able to extract patterns from the
training data. The training and testing accuracy, which is satisfactory according to the
goodness of fit tests, supports this conclusion.
The study demonstrated that the neural network model based on MLP is suitable
for modelling the rainfall-runoff relationship. It is able to learn from rainfall-runoff
data recorded at spatially different locations. The MLP has been identified as a robust
model for the rainfall-runoff relationship. It can accurately model the storm hydrograph
for single-storm and multiple-storm events. The predicted peak discharge and time to
peak are in close agreement with the actual values. The application of MLP and RBF to
model the hourly streamflow hydrograph was therefore successful.
The results of rainfall-runoff modelling indicate that the MLP and RBF methods
are more accurate for the Sungai Bekok catchment than for the Sungai Klang catchment.
Normally, for a large catchment, the river flow is highly nonlinear and influenced by the
storage effect. Furthermore, the river flow can also be influenced by the land use of the
catchment, especially in a fully developed area. In addition, the effect of spatial
rainfall and control structures may contribute to the complexity of the system.
The rainfall-runoff models that were developed are site-specific and can only be
applied for prediction of daily and hourly runoff in the original catchments. If there are
any major changes in the catchment characteristics such as heavy development or
urbanization, deforestation, or changes in the river characteristics, the models should be
retrained using additional data that accounts for these changes.
Overall, the application of the neural network method in modelling the daily and
hourly rainfall-runoff relationship for Sungai Bekok, Sungai Ketil, Sungai Klang, and
Sungai Slim is satisfactory. The results show that the performance of the neural network
model is satisfactory and that it is feasible for rainfall-runoff modelling in Malaysian
catchments. The inaccuracy of the model could be reduced by using a longer period of
training data containing many peak discharge events, especially for urban catchments that
suffer from missing data, such as the Sungai Klang catchment.
Using the ANN methodology, various combinations of rainfall and runoff at present
and previous time periods were trained and tested in order to develop a runoff model for
the selected catchments. Based on the discussion provided for the catchments, one- and
two-hidden-layer networks were used to train the MLP networks. Based on the training
results for the selected catchments, the sensitivity of the runoff to rainfall at present
and previous time periods was examined using the goodness of fit tests. The results of
R², RMSE, RRMSE, and MAPE for the Sungai Bekok, Sungai Ketil, Sungai Klang and Sungai Slim
catchments show that the MLP and RBF models consistently perform better than the MLR,
HEC-HMS and XP-SWMM models. In daily and hourly rainfall-runoff modelling, the training
and testing accuracy of the MLP and RBF models improved slightly with the appropriate
number of input values at the previous time periods. For the daily rainfall-runoff
modelling of the Sg. Bekok, Sg. Ketil and Sg. Slim catchments, the addition of rainfall at
time t − 16 did not have a large impact on the prediction accuracy, whereas including
rainfall inputs up to time t − 15 was appropriate to produce good results. Meanwhile, for
the Sg. Klang catchment, the addition of rainfall at time t − 17 did not have a large
impact on the prediction accuracy. This result was expected since the Sg. Klang catchment
is mainly driven by rainfall, with a very wide range of minimum, maximum and average
rainfall values recorded. Overall, it can be concluded that, for the daily ANN models,
rainfall data alone were adequate to represent the runoff processes in these catchments.
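For reference, the goodness of fit measures named above can be sketched as follows. The
formulas are commonly used textbook forms and are an assumption here: definitions of RRMSE
and R² in particular vary between authors, so the definitions given in chapter three take
precedence. The observed and simulated values are hypothetical.

```python
import math

def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def rrmse(obs, sim):
    """Relative RMSE: each error is scaled by its observation (obs must be non-zero)."""
    return math.sqrt(sum(((o - s) / o) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mape(obs, sim):
    """Mean absolute percentage error, in percent (obs must be non-zero)."""
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs)

def r_squared(obs, sim):
    """Coefficient of determination, 1 - SSE/SST."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

# Hypothetical observed and simulated flows (m3/s), for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [2.0, 5.0, 6.0, 7.0]
print(rmse(observed, simulated), rrmse(observed, simulated))
print(mape(observed, simulated), r_squared(observed, simulated))
```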
For the hourly rainfall-runoff modelling of the Sg. Bekok, Sg. Klang and Sg. Slim
catchments, the addition of rainfall at time t − 6 did not have a large impact on the
prediction accuracy. Meanwhile, for the Sg. Ketil catchment, the addition of rainfall at
time t − 5 did not have a large impact on the prediction accuracy. For the Sg. Bekok,
Sg. Ketil and Sg. Slim catchments, with the addition of runoff data at time t − 1, the
percent flow estimates for testing improved and gave good results. Meanwhile, with the
inclusion of runoff data at times t − 1 and t − 2, the percent flow estimates for testing
improved for the Sg. Klang catchment. Furthermore, an advantage of the RBF model is that
it can be trained much faster than the MLP model. It was also found that ANN performance
was strongly influenced by the level of non-linearity and the selection of training data.
A large number of training data sets are required to perform successful training,
especially for the MLP model. With a good training process and suitable algorithms and
numbers of nodes, the prediction is more reliable and accurate. In other words, the MLP
and RBF models are highly recommended for daily and hourly rainfall-runoff modelling
problems.
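The trial and error reasoning above, in which a further lag is rejected when it does not
have a large impact on accuracy, can be expressed as a simple selection rule. The rule, the
tolerance and the error values below are assumptions made for illustration; the study does
not state a numerical threshold.

```python
def smallest_adequate_lag(errors_by_lag, tolerance):
    """Return the smallest lag whose testing error is within `tolerance`
    (as a relative fraction) of the best error over all candidate lags."""
    best = min(errors_by_lag.values())
    for lag in sorted(errors_by_lag):
        if errors_by_lag[lag] <= best * (1.0 + tolerance):
            return lag

# Hypothetical testing RMSE for candidate maximum rainfall lags (not study results).
errors = {13: 4.10, 14: 3.60, 15: 3.02, 16: 3.00, 17: 3.01}
print(smallest_adequate_lag(errors, tolerance=0.02))   # 15
```

In this made-up example the lag t − 16 gives a marginally lower error, but the gain is
within the tolerance, so the smaller structure ending at t − 15 is retained.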
The training process for the MLP network is time consuming. If the architecture or
the training algorithm is not suitable, it will affect the accuracy of predictions and the
network’s learning ability. The number of hidden layers and the number of hidden nodes
significantly influence the performance of a network and the time taken to train the
model. The time taken to calibrate or train the 3-layer MLP network is less than that for
the 4-layer network. For the MLP model, the computed errors are quite close even when the
number of hidden layers is increased from one to two. This shows that increasing the
number of hidden layers did not improve the performance of the model. It indicates that
the 3-layer network structure is appropriate to model the nonlinearity of the
rainfall-runoff relationship in urban or rural catchments. The validation process runs in
parallel with the calibration process. The method used for validation is the early
stopping method. The objective is to monitor the progress of the calibration process, as
described in chapter three. It can make the model more efficient and accurate.
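The early stopping idea can be sketched as follows: the validation error is monitored while
training proceeds, and training stops once it has not improved for a set number of epochs.
The `patience` parameter and the error curve are hypothetical; the exact stopping criterion
used in this study is the one described in chapter three.

```python
def early_stopping_epoch(validation_errors, patience):
    """Return (stop_epoch, best_epoch) for a recorded validation-error curve.
    Training stops once the error has not improved for `patience` epochs."""
    best_error = float("inf")
    best_epoch = -1
    for epoch, err in enumerate(validation_errors):
        if err < best_error:
            best_error, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(validation_errors) - 1, best_epoch

# Hypothetical validation errors per epoch: they fall, then rise as over-fitting sets in.
curve = [0.90, 0.60, 0.45, 0.40, 0.41, 0.43, 0.42, 0.47, 0.50]
print(early_stopping_epoch(curve, patience=3))   # (6, 3)
```

The weights saved at the best epoch, rather than those at the stopping epoch, would then be
kept for testing.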
In this study, it was also found that the performance of the model is affected by
the number of nodes in the hidden layer. It is therefore important to determine the
optimal and sufficient number of hidden nodes so that the network structure is fast,
robust and stable. If the number of nodes is too large or too small, the model will not
perform well and may not achieve the desired target. Furthermore, with too many nodes in
the hidden layer, the model will take a long time to perform the modelling process. So, a
suitable and appropriate number of nodes in the hidden layer will produce the best fit
results.
4.7.3 Transfer Function and Algorithm
In general, neural network structures are trained by a learning process that uses
the derivative of the transfer function. A transfer function that is differentiable and
continuous everywhere is therefore required in these types of network. The transfer
function, or activation function, is usually chosen from the commonly used continuous
functions for the advanced forms of networks. In order to select a transfer function, the
training and testing accuracy of a network that uses the hyperbolic tangent function was
compared with that of a network that uses another activation function, the sigmoid
function. The networks that use the hyperbolic tangent function provide slightly better
training and testing accuracy than the networks that use the sigmoid function. In
addition, the peak flows were predicted more closely to their observed values by the
networks that use the hyperbolic tangent transfer function. This result indicates that the
hyperbolic tangent function is more flexible and reliable than the other functions in
modelling the rainfall-runoff process for the selected catchments.
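The two transfer functions compared above, together with the derivatives used during
training, can be sketched as follows. It is assumed here that the ‘sigmoid function’ refers
to the logistic (log-sigmoid) function; the function names are chosen for illustration.

```python
import math

def logistic(x):
    """Logistic (log-sigmoid) function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tansig(x):
    """Hyperbolic tangent function, output in (-1, 1)."""
    return math.tanh(x)

def logistic_derivative(x):
    s = logistic(x)
    return s * (1.0 - s)

def tansig_derivative(x):
    return 1.0 - math.tanh(x) ** 2

for x in (-2.0, 0.0, 2.0):
    print(x, logistic(x), tansig(x), logistic_derivative(x), tansig_derivative(x))
```

The two functions are related by tanh(x) = 2·logistic(2x) − 1, so the hyperbolic tangent is
a rescaled logistic function that is centred on zero and has a steeper slope at the origin
(1 compared with 0.25).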
For the MLP model, the best results were achieved for the network with the
hyperbolic tangent (tansig) activation function, the Levenberg-Marquardt (LM) algorithm,
and an appropriate number of input nodes. The accuracy of the MLP network trained using
the tangent sigmoid function was slightly better. Therefore, the hyperbolic tangent
function was chosen as the transfer function to train the networks developed in this
study. This combination trained faster and was more accurate near an error minimum.
Meanwhile, the back-propagation algorithm used in the training phase operates on a
feed-forward network. It is a good and popular algorithm because it works forward and
backward repeatedly until it produces the best output with minimum error, since the
process depends on the current inputs and the previous outputs. Therefore, the
rainfall-runoff process can be properly modelled with a procedure using the
back-propagation algorithm.
Meanwhile, the RBF network can be described as a universal function approximator
that uses combinations of basis functions centred around weight vectors to provide
spatial estimates. The best results were achieved for the network with the Gaussian
activation function, the GRNN algorithm, and an appropriate number of input nodes.
Therefore, the Gaussian function was chosen as the transfer function to train the RBF
networks developed in this study. If the architecture or the training algorithm is not
suitable, it will affect the accuracy of predictions and the network’s learning ability.
The accuracy of the RBF network trained using the Gaussian function was slightly better.
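A minimal sketch of the standard GRNN estimate with a Gaussian basis function is given
below: the prediction is a weighted average of the training targets, with weights that decay
with the distance between the query and each training input. The `spread` parameter and the
small data set are assumptions for illustration and are not the settings used in this study.

```python
import math

def grnn_predict(train_inputs, train_targets, query, spread):
    """Gaussian-kernel weighted average of the training targets."""
    num = 0.0
    den = 0.0
    for x, y in zip(train_inputs, train_targets):
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, query))
        w = math.exp(-dist_sq / (2.0 * spread ** 2))   # Gaussian basis function
        num += w * y
        den += w
    # Note: den can underflow to zero if the query is far from every training
    # input relative to the spread; a practical implementation would guard this.
    return num / den

# Hypothetical training pairs: two inputs per pattern, one target.
inputs = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
targets = [1.0, 2.0, 4.0]

print(grnn_predict(inputs, targets, [1.0, 0.0], spread=0.1))   # close to 2.0
print(grnn_predict(inputs, targets, [1.0, 0.0], spread=10.0))  # close to the mean target
```

No iterative weight adjustment is involved, which is consistent with the fast training
reported for the RBF model; the spread controls how smooth the estimate is.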
In both the MLP and RBF models, the number of input nodes significantly influences
the performance of a network and the time taken to train the model. It is related to the
complexity of the system being modelled and to the resolution of the data fit. The number
of input nodes in the input layer was determined by trial and error for each case. If the
number of input nodes is small, the network can under-fit the data and may not achieve the
desired level of accuracy, while with too many nodes it will take a long time to be
adequately trained and may sometimes over-fit the data. The RBF model can be trained and
tested much faster than the MLP model. With an appropriate architecture and training
algorithm, it produces the best fit results even when only an optimum number of data sets
is used for the calibration of the model.
The ANN algorithms compute weights through training to fit the objective function
without encompassing or indicating the hydrological characteristics of the catchment. The
HEC-HMS and SWMM algorithms try to encompass the physical or hydrological characteristics
of a system, such as infiltration, overland flow and time of concentration. With the
hydrological characteristics fixed, these models are calibrated to fit the simulated
hydrograph to the actual hydrograph.
4.7.4 Robustness and Model Limitation
The limitations of the model structures and of the data available on parameter
values, initial conditions and boundary conditions will generally make it difficult to
apply a hydrological model without some form of training process or calibration. In
previous studies and in the present findings, it was found that in the majority of cases
the parameter values are adjusted to get a better fit to some observed data. The question
of how to assess whether one model or set of parameter values is better than another is
open to a variety of approaches, from a visual inspection of plots of observed and
predicted variables to a number of different quantitative measures of goodness of fit,
performance measures, fitness measures and likelihood measures.
A robustness test on five different data sets with different time periods can
reveal the consistency of the model. The robustness test is carried out to evaluate the
performance of the model using different input-output data sets. A robust model will
consistently yield the lowest RMSE, RRMSE, and MAPE errors. In this study, the MLP and RBF
networks proved to be robust in modelling the rainfall-runoff relationship compared to
the MLR, HEC-HMS and SWMM models. The MLP and RBF methods are therefore suitable, and
highly recommended, for daily and hourly rainfall-runoff modelling problems.
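The notion of a model that consistently yields the lowest error across the test data sets
can be sketched as a simple check. The model names and the error values below are
hypothetical placeholders and are not results from this study.

```python
def consistently_best(errors_by_model):
    """Return the model with the lowest error on every data set, else None.
    errors_by_model maps a model name to a list of errors, one per data set."""
    names = list(errors_by_model)
    n_sets = len(errors_by_model[names[0]])
    winners = {min(names, key=lambda m: errors_by_model[m][i]) for i in range(n_sets)}
    return winners.pop() if len(winners) == 1 else None

# Hypothetical RMSE values on five test data sets (placeholders only).
errors = {
    "model_A": [2.1, 2.4, 2.0, 1.8, 2.6],
    "model_B": [2.9, 2.5, 3.1, 2.2, 3.0],
}
print(consistently_best(errors))   # model_A
```

A model that wins on some data sets but not on others returns None under this rule, which
signals that its performance depends on the period selected.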
It is suggested that even a simple model with only four or five parameter values to
be estimated requires at least 20 to 25 hydrographs for a reasonably robust calibration.
For more complex parameter sets, many more data and different types of data may be
required for a robust optimization, unless it is possible to fix many of the parameters.
In practice, however, this has proven to be very difficult to achieve. Thus, the optimum
parameter set found for a particular model structure may be sensitive both to small
changes in the observations or in the period of observations considered in the
calibration, and possibly to changes in the model structure, such as a change in the
element discretization for a distributed model.
Given the limitations of both the model structures and the observed data, there may
be many representations of a catchment that are equally valid in terms of their ability to
produce acceptable simulations of the available data. Hence, different model structures,
and different parameter sets within a model structure, compete to be considered acceptable
as simulators. Such parameter sets will give a range of predictions. This may actually be
an advantage, since it allows the uncertainty in predictions to be assessed, conditioned
on the calibration data, and then used as part of the decision-making process arising
from a modelling project. The objectives of this project and the data available for
calibrating the different models will all limit the potential range of simulators. The
important point is that choices between models and between parameter sets must be made in
a logical and scientifically defensible way to provide good predictions.
The ANN performance is influenced by the level of non-linearity and the
selection of training data. A large number of training data sets are required to perform
successful training. The number of hidden layer neurons significantly influences the
performance of a network. If this number is small, the network can under-fit the data and
may not achieve the desired level of accuracy, while with too many nodes it will take a
long time to be adequately trained and may sometimes over-fit the data. Although these
models give little insight into the physical processes, they provide a good enough and
low-cost solution.
Although several studies indicate that ANNs have proven to be potentially useful
tools in hydrology, their disadvantages should not be ignored. The success of an ANN
application depends both on the quality and the quantity of the data available. This
requirement cannot be easily met, as many hydrologic records do not go back far enough.
Quite often, the requisite data are not available and have to be generated by other means,
such as another well-tested model. Even when long historic records are available, it is
not certain that conditions remained homogeneous over this time span. Therefore, data sets
recorded over a system that is relatively stable and unaffected by human activities are
desirable. The major limitation of ANN is the lack of physical concepts and relations.
This has been one of the primary reasons for the sceptical attitude towards this
methodology. Another limitation is that there is no standardized way of selecting the
network architecture. The choice of network architecture, training algorithm, and
definition of error is usually determined by past experience and preference, and also by
trial and error.
The limitation of the HEC-HMS and SWMM models is that they require more physical
data to obtain more accurate results. These models require the soil types, catchment
characteristics, base flow, infiltration, losses, etc. Furthermore, they cannot be built
from rainfall and runoff data only. Thus, suitable model parameters have to be obtained by
trial and error in the calibration process, which takes a long time.
4.7.5 River Basin Characteristics
Several independent catchments in Peninsular Malaysia were selected to simulate
runoff from rainfall using the ANN methodology. The characteristics of the selected
catchments are discussed in chapter 3. The daily and hourly rainfall, runoff, and
evapotranspiration values were measured from 1980 to 2000. The locations of the rain
gauges and water level stations are shown in Figures 3.1 to 3.4. The whole of Peninsular
Malaysia experiences a typical Rainy Tropical or Tropical Wet climate, giving rise to a
climax vegetation of tropical rain forest. There are no clearly definable seasons, it
being warm or hot throughout the year. Although the peninsula is subject to a monsoon
regime of winds, there is no distinct dry season of regular occurrence and significant
duration that is liable to affect rainfall-runoff relationships. The uniformity of the
seasonal pattern of rainfall and potential evapotranspiration allows the flood-producing
or runoff aspects of climate to be isolated and identified in the nature of the rainfalls
that produce runoff. The maximum rainfall records for the Sg. Bekok, Sg. Ketil, Sg. Klang
and Sg. Slim catchments are 1030 mm, 110.3 mm, 114 mm and 1310 mm respectively. The
maximum runoff records for the Sg. Bekok, Sg. Ketil, Sg. Klang and Sg. Slim catchments are
7.1 m³/s, 33.15 m³/s, 89.0 m³/s and 66.27 m³/s respectively. Meanwhile, the annual daily
average flows for the catchments are 4.83 m³/s, 29.95 m³/s, 22.28 m³/s and 65.70 m³/s
respectively.
Regional rainfall-runoff relationships of the type proposed in this study can
generally only be established for hydrologically ‘large’ catchments. Such catchments
within a homogeneous region are those in which the catchment storage processes control
the magnitude of the annual maximum floods, as distinct from hydrologically ‘small’
catchments, in which the soil and vegetation conditions and the temporal characteristics
of the flood-producing rainstorm are of major importance. The distinction between the
two cannot be made solely on the basis of catchment area, although in the extreme cases
of ‘very small’ and ‘very large’ the difference is obvious. Certain climatic, topographic
and land use combinations can make catchment areas of considerable size exhibit the
flood characteristics of hydrologically ‘small’ catchments, and vice versa.
The results of MLP modelling show that the neural network performance is
influenced by the level of nonlinearity, the selection of training data, the quality of
the data, and the characteristics of the catchment. For example, the results of
rainfall-runoff modelling indicate that the MLP method is more accurate for the Sungai
Bekok catchment than for the Sungai Klang catchment. Normally, for a smaller catchment
size and for a developed area, the river flow is highly nonlinear and influenced by the
storage effect. In addition, the effect of spatial rainfall and control structures may
contribute to the complexity of the system.
In general, there are two families of model (MLP and RBF) with three types of
neural models. The first is the MLP with one hidden layer, the second is the MLP with two
hidden layers, and the third is the RBF model. The development of the neural network
model structures adopts the method of Tokar and Johnson (1999). The results are shown in
the previous tables. The Sungai Ketil catchment (704 km²) is about twice as large as the
Sungai Bekok catchment (350 km²). Meanwhile, the Sungai Klang catchment (468 km²) and the
Sungai Slim catchment (455 km²) have catchment areas of roughly the same magnitude. It can
be seen that the levels of efficiency for the four catchments improved in the testing
stage when the models were trained properly. Furthermore, the correlation coefficient for
Sungai Bekok is better than that for Sungai Ketil. The size of the catchment probably
contributes to the inaccuracy of neural modelling. A large, fully developed catchment such
as Sungai Klang generates a considerably higher peak flood discharge. So, the neural
network model requires a sufficient amount of data with large peak discharges during
training and generalization to produce the best fit results.
In view of the foregoing, and because of the sensible relationships established
between rainfall and runoff, it is evident that within a particular flood frequency region
the controlling flood-producing characteristic is catchment area. The difference in the
flood frequency relationship between regions for hydrologically ‘large’ catchments is the
result of a number of factors, amongst these being climate, vegetation and land use,
surficial geology and topography. It is important to appreciate the variation in the
flood-producing characteristics of the regions in a qualitative sense, so that the
procedure can be sensibly and usefully applied.
This study revealed that the characteristics of the catchment can affect the
capability of the models to produce better results. For the ANN models, the important
characteristics to be considered are the size of the catchment, the land use, and the
quality and quantity of the input-output data. For example, the Sg. Ketil catchment is
considered a semi-developed or natural area, with a size of 704 km². The ANN model
developed for this study area yields very good results compared to those for the
Sg. Klang catchment, which is considered a fully developed area, with a size of 468 km².
This shows that the hydrologic cycle in the Sg. Ketil area is consistent and not
extremely disturbed by infrastructure and highway developments.
4.7.6 Time Interval
In this study, two time intervals of rainfall-runoff data were used for modelling
the rainfall-runoff relationship: a daily time interval and an hourly time interval. It
was found that the neural network models are suitable for modelling the rainfall-runoff
relationship at both time intervals, showing good performance and accurate results. The
MLP model requires a longer period of training data to be properly trained, after which
it can produce much better results. The RBF model, meanwhile, requires only an optimum
number of training data sets to be properly trained.
If a very long record, say 1000 years of data, were available, it would be expected
that the estimates of the flood peaks of a particular return period would be quite
reliable. If, say, 10 records, each 100 years long, were selected from the 1000-year
record, and the flood estimates examined for particular return periods for each record, it
would be found that they were not all the same. A larger variation would be expected if
50 records, each 20 years long, were examined. What has been done in this study is to
examine 50 records drawn from the 10-year records.
Good-quality input-output data pairs were selected to develop and evaluate the
neural network models. The data were taken from the selected catchments, which have a
large number of input-output pairs. Records of 10 years of hourly rainfall-runoff series
for the Sungai Bekok catchment (1991-2000), Sungai Ketil catchment (1983-1992), Sungai
Klang catchment (1991-2000) and Sungai Slim catchment (1988-1997) were used. In this
study, 55 hourly sets of data were selected from the records. The neural network was
trained under two sets of conditions. The first 50 sets of data were used for model
calibration or training, and the remaining five sets were used for model testing and
validation, and also to test the robustness and the consistency of the model.
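The division of the 55 hourly data sets into calibration and testing subsets can be
sketched as an ordered split. The function name and the placeholder identifiers are
assumptions for illustration; the actual data sets are those selected from the records
described above.

```python
def split_event_sets(event_sets, n_train):
    """Ordered split: the first n_train sets are used for calibration (training),
    and the remaining sets are held back for testing and validation."""
    if not 0 < n_train < len(event_sets):
        raise ValueError("n_train must leave at least one set on each side")
    return event_sets[:n_train], event_sets[n_train:]

# 55 placeholder identifiers standing in for the hourly data sets.
event_ids = list(range(1, 56))
train_sets, test_sets = split_event_sets(event_ids, n_train=50)
print(len(train_sets), len(test_sets))   # 50 5
```

Keeping the test sets entirely separate from the training sets is what allows them to be
used as an independent check on robustness.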
Meanwhile, records of five years of continuous daily rainfall-runoff series from
the four selected catchments were used to evaluate the performance of the neural network
model. The data consist of two sets: the first three years of data are used for model
calibration (training) of the ANN, and the remaining two years of data are used for model
validation and testing. Increasing the number of training data in the training phase, with
no change in the neural network structure, will improve performance in the training and
testing phases. Thus, performance depends on providing an adequate number of training
data. Good-quality input-output data pairs were again selected to develop and evaluate
the neural network models. In this study, 1-year, 6-month, dry-season and wet-season data
were selected for testing, based on the annual average rainfall. The data used for testing
or verifying the models are divided into five sets as follows:
(i) Data set 1: 1-year data (Jan – Dec)
(ii) Data set 2: 6-month data (Jan – Jun)
(iii) Data set 3: 6-month data (Jul – Dec)
(iv) Data set 4: 3-month data (Mar – May) – Dry season
(v) Data set 5: 3-month data (Oct – Dec) – Wet season
In selecting the data, particular attention was given to land use change in the
catchment, since it was assumed that land use data were not available. Recent data were
used whenever possible since they reflect the current land use conditions in the
catchment. The most recent data were used in the test set in order to illustrate the
capability of the model to predict future occurrences of runoff without directly including
the land use characteristics of the catchments.
In this study, the effect of the length of training data on network accuracy was observed
to be highly significant, especially for the MLP networks. Networks trained using data
from a long period provided slightly better training and testing accuracy, and also a
higher percentage of correctly predicted peak flows, than the existing models. The length
of training data had an insignificant impact on the accuracy of the RBF model. The length
of data required is therefore related to the complexity of the model.
The flood data available for studies are not amenable to rigorous treatment. The
analyses and results are based on about 5 years of daily and 10 years of hourly past data,
which is a very small sample of the long-term flood peak population. By combining the
flood records within regions that appear to be homogeneous, a process of averaging is
carried out, which in the long run should provide results consistent with accumulated
flood experience. This past record has been used to develop relationships that are
assumed to hold true in the future. This necessarily restricts the study to data that are
the result of natural, as distinct from man-made, influences. Despite the limitations
imposed by the above considerations, it is possible to examine the effects of such aspects
as errors in the basic data and the effect of the length of record on the flood estimation
procedure.
When an optimum number of rainfall values from previous time periods was added to the
models, the performance of the neural network models increased slightly in training and
testing compared with the MLR, HEC-HMS and SWMM models. The most significant impact of
including previous rainfall, however, was the rise in the percentage estimation accuracy
during training and testing for hourly rainfall-runoff modelling.
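A minimal sketch of how rainfall at previous time periods can be supplied as network
inputs is given below. The function `build_lagged_inputs`, the number of lags and the
sample values are illustrative assumptions; the sketch does not reproduce the input
configuration actually used in this study.

```python
def build_lagged_inputs(rainfall, runoff, n_lags):
    """Pair each runoff value with the current and n_lags previous rainfall values."""
    if len(rainfall) != len(runoff):
        raise ValueError("rainfall and runoff must have the same length")
    inputs, targets = [], []
    for t in range(n_lags, len(rainfall)):
        # Input order: P(t), P(t-1), ..., P(t-n_lags)
        inputs.append([rainfall[t - k] for k in range(n_lags + 1)])
        targets.append(runoff[t])
    return inputs, targets


# Illustrative values only (not observed data)
rain = [0.0, 5.0, 12.0, 3.0, 0.0, 0.0]
flow = [1.0, 1.2, 2.5, 4.0, 3.1, 2.0]
X, y = build_lagged_inputs(rain, flow, n_lags=2)
for row, target in zip(X, y):
    print(row, "->", target)
```

The first `n_lags` time steps are dropped because their rainfall history is incomplete, so
a larger number of lags leaves fewer input-output pairs for training.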
Harun (1999) observed that the MLR model is suitable for modelling the continuous daily
rainfall-runoff relationship. This study likewise found that the MLR model yields
satisfactory results for the daily rainfall-runoff relationship. The HEC-HMS and SWMM
models, in contrast, give more satisfactory results for the hourly rainfall-runoff
relationship than for the daily one. Both models are therefore suitable for modelling
hourly event hydrographs with peak discharge, and unsuitable for modelling continuous
daily rainfall-runoff data sets with many events.
Figure 4.1(a) Daily results of 3-Layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
Figure 4.1(b) Daily results of 3-Layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase
Figure 4.1(c) Daily results of 3-Layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
Figure 4.2(a) Daily results of 4-Layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
Figure 4.2(b) Daily results of 4-Layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase
Figure 4.2(c) Daily results of 4-Layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
Figure 4.9(a) Hourly results of 3-Layer neural networks for Sg. Bekok catchment using 100% of available data sets in training phase
Figure 4.9(b) Hourly results of 3-Layer neural networks for Sg. Bekok catchment using 65% of available data sets in training phase
Figure 4.9(c) Hourly results of 3-Layer neural networks for Sg. Bekok catchment using 25% of available data sets in training phase
Figure 4.10(a) Hourly results of 4-Layer neural networks for Sg. Bekok catchment using 100% of available data sets in training phase
Figure 4.10(b) Hourly results of 4-Layer neural networks for Sg. Bekok catchment using 65% of available data sets in training phase
Figure 4.10(c) Hourly results of 4-Layer neural networks for Sg. Bekok catchment using 25% of available data sets in training phase
Figure 4.17(a) Daily results of RBF networks for Sg. Bekok catchment using 100% of data sets in training phase
Figure 4.17(b) Daily results of RBF networks for Sg. Bekok catchment using 50% of data sets in training phase
Figure 4.17(c) Daily results of RBF networks for Sg. Bekok catchment using 25% of data sets in training phase
Figure 4.21(a) Hourly results of RBF networks for Sg. Bekok catchment using 25% of available data sets in training phase
Figure 4.21(b) Hourly results of RBF networks for Sg. Bekok catchment using min of available data sets in training phase
CHAPTER 5
CONCLUSIONS AND RECOMMENDATIONS
5.1 General
There are clearly implications for other studies that depend on models of rainfall-runoff
processes. Predictions of catchment hydrogeochemistry, sediment production and transport,
the dispersion of contaminants, hydroecology and, in general, integrated catchment
decision support systems depend crucially on good predictions of water flow processes.
All these components depend on the prediction of water flows and are therefore subject to
the same types of uncertainty in predictive capability. Since the 1930s, numerous
rainfall-runoff models have been developed to predict or forecast runoff. The continuous
rainfall-runoff models currently available describe very complex, non-linear
relationships. These models are very difficult to apply because of the large number of
model parameters and equations that define the components of the hydrologic cycle.
The potential of artificial neural network models for predicting runoff has been
presented in this thesis. The non-linear nature of the rainfall-runoff relationship makes
it appropriate for the application of ANN methods. The results of the ANN models indicate
that neural network methods are feasible for modelling the rainfall-runoff relationship in
the Malaysian region. The neural network is able to predict runoff accurately using
rainfall data as the input variable.
5.2 Conclusions
The main reason for modelling the rainfall-runoff processes of hydrology is the
limitation of hydrological measurement techniques. Only a limited range of measurement
techniques is available, giving a limited range of measurements in space and time, so it
is not possible to measure everything that we would like to know about hydrological
systems. A means is therefore needed of extrapolating from the available measurements in
both time and space, particularly to ungauged catchments, where measurements are not
available, and into the future, where measurements are not possible, in order to assess
the likely impact of future hydrological change. A new approach such as ANN models can
provide good predictions that will hopefully be helpful in decision-making about
hydrological problems. With increasing demands on water resources throughout the world,
improved decision-making within a context of weather patterns that fluctuate from year to
year requires improved models. That is what this study is about.
In order to evaluate how well a model can approximate the relationship between rainfall
and runoff, it is necessary to compare the predictive capabilities of the model with those
of existing approaches. Models are usually compared by testing all the models of interest
on a data set from the same catchment. As discussed in Chapter 2, the calibration of
existing models (HEC-HMS and SWMM) is very complex and involves a lengthy calibration
procedure. Within the time frame available, it was not possible to achieve this goal.
Therefore, the ANN models developed in this study were compared with rainfall-runoff
models such as MLR, HEC-HMS and SWMM.
The conclusions of the study are as follows:
(a) One hidden layer is sufficient in MLP networks. Such a network trains much faster
than a network with two hidden layers. Using more than one hidden layer substantially
increases the number of parameters to be estimated and the time needed to train the
network. Such an increase in the number of parameters may slow the calibration process
without substantially improving the efficiency of the network.
(b) Potential errors in the discharge data tend to be forgotten once the data are made
available as a computer file for use in rainfall-runoff modelling. There is always a
tendency for the modeller to take the values as perfect estimates of the discharges. To
some extent this is justified: the data are the only indication of the true discharges
and the best data available for calibrating the model parameters. However, if a model is
calibrated using data that are in error, the effective parameter values will be affected,
and so will the predictions for other periods, which depend on the calibrated parameter
values. It is therefore worth stressing that the rainfall-runoff data should be checked
for consistency before any model is applied.
(c) The performance of the MLR, HEC-HMS and SWMM models is moderate. Judged by the
coefficient of efficiency, the performance of these models was found to be
unsatisfactory. For both calibration and validation, the MLR, HEC-HMS and SWMM models
also take longer than the ANN models. Furthermore, this evaluation is subject to several
limitations: (1) the catchment area was not subdivided, and (2) no observed data on
infiltration, abstraction and moisture content were available.
(d) The RBF networks take a shorter time to train and test than the MLP networks,
because the RBF networks have fewer parameters. For example, for the Sungai Bekok
catchment, the total numbers of parameters involved are 250 and 52 for MLP and RBF
respectively. Nevertheless, both models produced good fits and good results. We can
conclude that the RBF network is an alternative to MLP networks for modelling the
rainfall-runoff relationship, and that it yields fits as good as those of the MLP
networks. Both the MLP and RBF methods have been shown to handle non-linear processes
within the catchment easily.
(e) Compared to other models, the ANN models are relatively easy to use and their
calibration is more systematic. The study also demonstrates that the ANN models are not
very sensitive to the selection of the number of neurons and hidden layer(s) in a network
or to the transfer functions of the neurons. It has been proved that a single hidden
layer network containing a sufficient number of nodes can approximate any measurable
functional relationship between the input and output variables to any desired accuracy.
In other words, once the architecture of the network is defined, the weights are
calculated through a learning process in which the ANN is trained to produce the desired
output.
(f) The selection of training or calibration data has a very large impact on model
prediction accuracy. If the training or calibration data do not represent the
characteristics of the catchment and the climate, the model will not provide reliable
predictions. When the input data included the low and high flow extremes, the networks
were able to recognize the patterns in test data containing low, high and average flow
conditions. For the hourly event hydrograph models, the networks provided the highest
accuracy for future predictions and estimated peak discharges closer to their observed
values. Since the data include information on both high and low flow conditions, the
networks were able to distinguish patterns in the test data that differ from those in the
training data. Similar trends were also observed for the daily rainfall-runoff models.
Based on the goodness-of-fit tests, ANN networks trained using data that include low and
high flows had good prediction accuracy compared with the other models.
(g) The rainfall-runoff relationship is highly non-linear. Normally, for a small
catchment, the river flow is highly non-linear and influenced by storage effects, which
can affect the quality of the data. For a large catchment, the non-linearity of river
flow is more consistent. In addition, the effect of spatial rainfall and control
structures may contribute to the complexity of the system. However, the MLP and RBF
methods have been shown to handle non-linear processes within the catchment more easily
than the MLR, HEC-HMS and SWMM models.
(h) The ANN models have been identified as robust models of the rainfall-runoff
relationship. They can accurately model the storm hydrograph for single-storm and
multiple-storm events. The application of ANN to modelling the daily and hourly
streamflow hydrographs was successful.
(i) Modelling the hourly event flow hydrograph yields better prediction accuracy than
the daily model. It is time for ANN to become a priority tool for overcoming the problem
of flow hydrograph prediction.
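Conclusion (c) judges the models by their coefficient of efficiency. Assuming this refers
to the efficiency criterion of Nash and Sutcliffe (1970), E = 1 - Σ(Qo - Qs)² / Σ(Qo - Q̄o)²,
where Qo is the observed flow, Qs the simulated flow and Q̄o the mean observed flow, it can
be computed as in the sketch below. The function name and the sample values are
illustrative only.

```python
def coefficient_of_efficiency(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared errors of the simulation
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed flows are constant; efficiency is undefined")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not results from this study)
q_obs = [1.0, 2.0, 3.0]
print(coefficient_of_efficiency(q_obs, q_obs))            # perfect simulation
print(coefficient_of_efficiency(q_obs, [2.0, 2.0, 2.0]))  # simulation equal to the mean
```

A value close to 1 indicates a close fit, while a value at or below 0 means the model
predicts no better than the mean of the observed flows.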
5.3 Recommendations for future work
Although this research was conducted with the aim of being as thorough and exhaustive as
possible, it has opened up possibilities for further research and refinement. Some
suggestions for further research are listed below:
(a) This study has shown that the Artificial Neural Network (ANN) model is capable of
modelling the complex relationship between rainfall and runoff. As another member of the
Artificial Intelligence (AI) family, the Fuzzy Logic (FL) model has good characteristics
and the capability to model this relationship. An FL model can be designed from the
experience of experts. Fuzzy Logic should therefore be studied in further research.
Another type of AI technique is the Neuro-Fuzzy model, which combines both methods:
neural networks and fuzzy logic. Several studies have found that the combination of both
methods is more powerful and effective.
(b) Further work on regionalization needs to be taken up by incorporating more rainfall
stations. This would also further refine the estimation of quantiles for ungauged sites.
The study should also be extended to all catchments in Peninsular Malaysia and in East
Malaysia to reach a more comprehensive conclusion for the whole country.
(c) In this study, the rainfall-runoff relationship was modelled at daily and hourly
time intervals. Further research should extend the study to shorter time intervals such
as 5, 10 and 15 minutes. In addition, the model predictions improved slightly using 2
years of data rather than 1 year of data. However, the improvement was insignificant in
networks trained using 4 years of data. To reach a conclusion on how model accuracy
changes with the length of training data, more investigation is needed using training
data over a longer period. ANN can be trained using 6, 10 or 15 years of training data
and compared with the model accuracy of the previous study.
(d) In this study, the ANN was trained using the same transfer function for all neurons
and layers in a network. Using different transfer functions in different layers may help
to obtain better prediction accuracy. For example, a network trained using a linear
transfer function in the first hidden layer and a hyperbolic tangent transfer function in
the output layer would differ significantly from a network trained using the same
transfer function in all layers.
(e) In watersheds around the world, it is common for either the rainfall or the runoff
records to be incomplete for certain periods of time. This difficulty is usually overcome
by throwing out the incomplete part of the record. This becomes difficult especially in
the calibration of the model. Traditional methods such as statistical analysis have their
own weaknesses. With advances in technology and computing, it is also recommended to use
radar systems, remote sensing, satellite technology, database management systems, error
analysis, etc. The ability to observe data over large and inaccessible areas and to map
these areas spatially is vastly improved, making it possible to develop truly distributed
models for both gauged and ungauged catchments.
(f) In predicting or forecasting runoff, it is very important to be able to update the
model without re-training or re-calibrating it. This would be very advantageous where
changes in a catchment can be continuously included, and would help hydrologists and
engineers to plan, design and manage future water resources systems with more
confidence. Theory that incorporates new information without re-calibration might be
applied to model rainfall-runoff processes.
(g) The modelling techniques used in this study (HEC-HMS and XP-SWMM) are lower-ranking
models compared with other popular models such as Hydrologic Simulation Package Fortran
IV (HSPF), Catchment Model (CM), U.S. Geological Survey (USGS) Model, Utah State
University (USU) Model, and so on. These popular models should therefore be applied to
see how accurate and reliable the models developed in this study are.
(h) Another aspect to consider is constraining the uncertainty in model predictions
over both the short and long term through data assimilation. If more remote sensing and
other spatially distributed data become available, it will be possible to incorporate
other types of data into assimilation algorithms. For the long term, where predictions of
the future behaviour of a hydrological system are important but highly uncertain, there
may be some justification for implementing a measurement program to monitor the impacts
of change, including future climate change, within the context of the natural variability
of hydrological systems.
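Recommendation (d), on using different transfer functions in different layers, can be
illustrated with a small forward pass. The sketch below is only an illustration: the
weights, biases and inputs are arbitrary untrained values, and the helper names are
hypothetical. It shows how the same network structure gives different outputs when the
hidden layer uses a linear rather than a hyperbolic tangent transfer function.

```python
import math


def linear(x):
    """Linear (identity) transfer function."""
    return x


def layer_output(inputs, weights, biases, transfer):
    """Output of one fully connected layer with the given transfer function."""
    return [transfer(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, layers):
    """Propagate inputs through a list of (weights, biases, transfer) layers."""
    signal = inputs
    for weights, biases, transfer in layers:
        signal = layer_output(signal, weights, biases, transfer)
    return signal


# Arbitrary illustrative weights (untrained): 2 inputs, 2 hidden nodes, 1 output
hidden_w = [[0.5, -0.2], [0.1, 0.3]]
hidden_b = [0.0, 0.1]
output_w = [[0.7, -0.4]]
output_b = [0.05]
x = [0.6, 0.2]  # e.g. scaled rainfall at two time steps (assumed values)

same = forward(x, [(hidden_w, hidden_b, math.tanh),
                   (output_w, output_b, math.tanh)])
mixed = forward(x, [(hidden_w, hidden_b, linear),
                    (output_w, output_b, math.tanh)])
print("tanh in both layers:      ", same[0])
print("linear hidden, tanh output:", mixed[0])
```

Whether a mixed configuration actually improves prediction accuracy would have to be
established by training and testing both networks on the same data.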
REFERENCES
Abrahart, R.J. and Kneale, P.E. (1997). Exploring Neural Network Rainfall-Runoff
Modelling. Proceedings of the 6th British Hydrological Society Symposium.
Salford University. 9.35-9.44.
Abrahart, R.J. (1999). Neurohydrology: Implementation Options and Research Agenda.
Area: 31(2). 141-149.
Abrahart, R. J., See, L. and Kneale, P. E. (1999). Using Pruning Algorithms and Genetic
Algorithms to Optimise Network Architectures and Forecasting Inputs in a Neural
Network Rainfall-Runoff Model. Journal of Hydroinformatics. 1(2). 103-114.
Altrock, C. V. (1995). Fuzzy Logic & Neurofuzzy Applications Explained. Englewood
Cliffs, N.J.: Prentice Hall Inc.
Anderson, M. G. and Burt, T. P. (Eds.) (1985). Hydrological Forecasting. Chichester,
U.K.: John Wiley & Sons Ltd.
Anderson, M. L. et. al. (2002). Coupling HEC-HMS with Atmospheric Models for
Prediction of Watershed Runoff. Journal of Hydrology Engineering. 7(4). 312-318.
Angelo, M., Eddie, T. and Jamshidi, M. (1994). Fuzzy Logic Based Collision Avoidance
for a Mobile Robot. Robotica. 12(6). 521-527.
Angsorn, S. (1995). Fuzzy Logic in Polder Flood Control Operations in Bangkok.
University of British Columbia: Ph.D. Thesis.
Anjum, M. M. (2000). Rainfall-Runoff Modelling of Taman Mayang Catchment.
Universiti Teknologi Malaysia: Master Thesis.
ASCE (2000). Artificial Neural Networks In Hydrology, Part I: Preliminary
Concepts. Journal of Hydrology Engineering. 2. 115-123.
ASCE (2000). Artificial Neural Networks In Hydrology, Part II: Hydrologic
Applications. Journal of Hydrology Engineering. 2. 124-135.
Aziz, A. R. A. and Wong, K. F. V. (1992). Neural Network Approach To The
Determination of Aquifer Parameters. Ground Water. 30(2). 164-166.
Beven, K. J. (2001). Rainfall-Runoff Modeling: The Primer. Chichester: John Wiley
& Sons, Ltd.
Bishop, C. M. (1995). Neural Networks for Pattern Recognition. Great Clarendon
Street, Oxford: Oxford University Press.
Blum, A. (1992). Neural Networks in C++: An Object-Oriented Framework for
Building Connectionist Systems. USA: John Wiley and Sons Inc.
Bojadziev, G. and Bojadziev, M. (1995). Fuzzy Sets, Fuzzy Logic, Applications.
Singapore: World Scientific Publishing Co.
Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis Forecasting and Control.
California: Holden-Day Inc.
Bronstert, A. (1999). Capabilities and Limitations of Distributed Hillslope Hydrological
Modelling. Hydrological Processes. Vol. 13. 21-48.
Brooks, K. N. et. al. (1991). Hydrology and the Management of Watersheds. IOWA:
State University Press.
Broomhead, D. S. and Lowe, D. (1988). Multivariate Functional Interpolation and
Adaptive Networks. Complex Systems. Vol. 2. 321-355.
Brownlie, W. R. (1983). Flow Depth in Sand-bed Channels. Journal of Hydrology
Engineering. Vol. 109(7). 950-990.
Buch, A. M., Mazumdar, H. S., and Pandey, P. C. (1993). A Case Study of Runoff
Simulation of a Himalayan Glacier Basin. Proceeding of International Joint
Conference on Neural Networks. Vol. 1. 971-974.
Campbell, P. F. (1993). Application of Fuzzy Sets Theory in Reservoir Operation.
University of British Columbia: Master Thesis.
Caudill, M. (1987). Neural Networks Primer, Part I. AI Expert. December. 46-52.
Carriere, P., Mohaghegh, S. and Gaskari, R. (1996). Performance of a Virtual Runoff
Hydrograph System. Journal of Water Resources Planning and Management. Vol.
122(6). 1-7.
Chen, S., Cowan, C. F. N. and Grant, P. M. (1991). Orthogonal least squares learning
for radial basis function networks. IEEE Transactions on Neural Networks.
Vol. 2(2). 302-309.
Chow, V. T. (1964). Handbook of Applied Hydrology. New York: McGraw-Hill.
Clarke, R. T. (1973). A Review of Some Mathematical Models Used in Hydrology With
Observation on Their Calibration and Use. Journal of Hydrology. Vol. 19. 1-20.
Colby, B. R. (1964). Practical Computations of Bed Material Discharge. Proceeding
ASCE. Vol. 90(2).
Cox, E. (1999). The Fuzzy Systems Handbook. San Diego: AP Professional.
Dandy, G. and Maier, H. (1996). Use of Artificial Neural Networks for Real Time
Forecasting of Water Quality. Proceeding of the International Conference on Water
Resources and Environmental Research. Japan. Vol. 2. 55-64.
Daniel, M. and Paul, F. (1993). Fuzzy Logic for Automatic Control. New York: Simon
& Schuster.
Da, R. (1998). Fuzzy Logic and Intelligent Technologies for Nuclear Science and
Industry. Singapore: World Scientific Publishing Co.
Dawson, C. and Wilby, R. (1998). An Artificial Neural Network Approach to Rainfall-Runoff
Modeling. Journal of Hydrology Science. Vol. 43. 47-66.
Demuth, H. and Beale, M. (1994). Neural Network Toolbox User’s Guide. Prime Park
Way. Natick: The MathWorks Inc.
Dibike, Y. B. and Solomatine, D. P. (1999). River Flow Forecasting Using Artificial
Neural Networks. Netherlands: International Institute for Infrastructural,
Hydraulic, and Environmental Engineering.
Dibike, Y.B. and Abbott, M.B. (1999). Application of Artificial Neural Networks to the
Simulation of A Two Dimensional Flow. Journal of Hydraulic Research. Vol. 37(4).
435-446.
Dibike, Y.B. (2000). Machine Learning Paradigms for Rainfall-Runoff Modeling.
Proceeding of the Hydroinformatics. USA: IOWA Conference.
Dibike, Y.B. and Solomatine, D. (2001). River Flow Forecasting Using Artificial Neural
Networks. Journal of Physics and Chemistry of the Earth. Part B: Hydrology,
Oceans and Atmosphere. Vol. 26(1). 1-8.
Dooge, J. C. I. (1959). A General Theory of the Unit Hydrograph. Journal of
Geophysical Research. Vol. 64. 241-256.
Dooge, J. C. I. (1981). Parameterization of Hydrologic Processes. Conference on land
surface processes in atmospheric general circulation models. 243-284.
El-kady, A. I. (1989). Watershed Models and Their Applicability to Conjunctive Use
Management. Journal of American Water Resources Association. Vol. 25(1). 25-137.
Elshorbagy, A., Simonovic, S. P., and Panu, U. S. (2000). Performance Evaluation of
Artificial Neural Networks for Runoff Prediction. Journal of Hydrology Engineering.
Vol. 5. 424-427.
Encyclopaedia (2000). The Columbia Electronic Encyclopaedia: 6th. ed. Columbia:
University Press.
Fausett, L. (1994). Fundamentals of Neural Networks. New Jersey: Prentice Hall,
Englewood Cliffs.
Fei, J. (1991). A Fuzzy Knowledge-based Learning Control System for a Mobile
Robot. Syracuse University: Ph.D. Thesis.
Fernando, D.A.K. and Jayawardena, A.W. (1998). Runoff Forecasting Using RBF
Networks With OLS Algorithm. Journal of Hydrology Engineering. Vol. 3(3).
203-209.
Forsyth, R. (ed.) (1984). Expert Systems: Principles and Case Studies. London:
Chapman and Hall Ltd.
Freeze, R. A. (1972). Role of Subsurface Flow in Generating Surface Runoff: Upstream
Source Areas. Water Resources Research. Vol. 8(5). 1272-1283.
French, M.N., Krajewski, W.F. and Cuykendall, R.R. (1992). Rainfall Forecasting in
Space and Time Using a Neural Network. Journal of Hydrology. Vol. 137. 1-31.
George, B. (1997). Fuzzy Logic for Business, Finance, and Management. Singapore:
World Scientific Publishing Co.
Grubbs, (1969). Procedures for Detecting Outlying Observations in Samples.
Technometrics. Vol. 11(1). 1-21.
Gupta, H. V. and Sorooshian, S. (1994). A New Optimization Strategy for Global Inverse
Solution of Hydrologic Models. Numerical methods in water resources. Boston:
Kluwer Academic.
Haan, C. T. (1977). Statistical Methods in Hydrology. Iowa: State University Press.
Hagan, M.T. and Menhaj, M.B. (1994). Training Feedforward Networks with the
Marquardt Algorithm. IEEE Transactions on Neural Networks. Vol. 5(6).
Hall, M.J. and Minns, A.W. (1993). Rainfall-Runoff Modelling as a Problem in
Artificial Intelligence: Experience with a Neural Network. Proceedings of the 4th
British Hydrological Society Symposium. 5.51-5.57.
Hall, M.J. and Minns, A.W. (1999). The Classification of Hydrologically Homogeneous
Regions. Journal of Hydrology Sciences. Vol. 44(5). 693-704.
Haubold, V. B. (1993). Fuzzy Logic: A Clear Choice for Temperature Control.
I and SC. Vol. 66(6). 39-41.
Harun, S. (1999). Forecasting and Simulation of Net Inflows for Reservoir Operation
and Management. Universiti Teknologi Malaysia: Ph.D. Thesis.
Hawley, M.E. (1979). A Comparative Evaluation of Snowmelt Models. University of
Maryland, College Park: Master Thesis.
Haykin, S. (1994). Neural Networks: A Comprehensive Foundation. New York:
World Scientific Publishing.
Henk, B. V. (1999). Fuzzy Logic Control: Advances in Applications. New York:
Addison-Wesley Publication Co.
Hecht-Nielsen, R. (1991). Neurocomputing. New York: Addison-Wesley
Publication Co.
Heimes, F. and Heuveln, B. V. (1998). The Normalized Radial Basis Function Neural
Network. IEEE Transactions on Neural Networks. Vol. 1. 1609-1614.
Heshmaty, B. and Kandel, A. (1985). Fuzzy Linear Regressions and Its Applications to
Forecasting in Uncertain Environments. Fuzzy Sets and Systems. Vol. 2. 159-191.
Holder, R. L. (1985). Multiple Regression in Hydrology. Wallingford, England:
Institute of Hydrology.
Hopfield, J. J. (1982). Neural Networks and Physical Systems with Emergent Collective
Computational Abilities. Proceeding of National Academy of Scientists. Vol. 79.
2554-2558.
Hopfield, J. J and Tank, D. W. (1986). Computing with Neural Circuits: A Model.
Science. Vol. 233. 625-633.
Horton, R. E. (1933). The Role of Infiltration in the Hydrologic Cycle. Trans. Am.
Geophys. Union. Vol. 145. 446-460.
Hromadka, T. V., McCuen, R. H. and Yen, C. C. (1988). Effect of Watershed
Subdivision on Prediction Accuracy of Hydrologic Models. Hydrosoft. Vol. 1. 19-28.
Hsu, K., Gupta, H.V. and Sorooshian, S. (1995). Artificial Neural Network Modelling of
the Rainfall-Runoff Process. Water Resources Research. Vol. 31(10). 2517-2530.
Hsu, K., Gupta, H.V. and Sorooshian, S. (1998). Streamflow forecasting using artificial
neural networks. Water Resources Engineering. Proceeding of ASCE Conference.
Tennessee: Memphis.
Hydrologic Engineering Center (HEC) (2000). Hydrologic Modelling System HEC-HMS
User’s Manual, version 2.0. Engineering. US Army Corps of Engineers,
California: Davis.
James, C. B. (1992). Fuzzy Logic and Neural Networks for Pattern Recognition: Visual
Materials. Piscataway: IEEE Educational Activities Board.
James, C. B. (1999). Fuzzy Models and Algorithms for Pattern Recognition and Image
Processing. Boston: Kluwer Academic Publishers.
Jang, J-S, Sun C-T, and Mizutani (1997). Neuro-fuzzy and Soft Computing. New
Jersey: Prentice Hall.
Jindrich, L. (1994). Fuzzy Logic System Based Modelling and Control of Complex
Chemical Processes. Clemson University: Ph.D. Thesis.
Johnson, D. and King, M. (1988). Basic Forecasting Techniques. Great Britain:
Butterworth & Co. (Publishers) Ltd.
Julien, P. Y. and Moglen, G. E. (1990). Similarity and Length Scale for Spatially Varied
Overland Flow. Water Resources Research. Vol. 26(8). 1819-1832.
Kasmin, H. (2003). Kesan Pembalakan ke Atas Aliran Ribut. Universiti Teknologi
Malaysia: Master Thesis.
Kavvas, M. and Chen, Z. (1998). Meteorologic Model Interface for HEC-HMS NCEP
Eta Atmospheric Model and HEC Hydrologic Modelling System.
Kachroo, R.K. (1986). HOMS Workshop on River Flow Forecasting: Nanjing, China.
Unpublished internal report, Dept. of Engineering Hydrology. Ireland: University
College Galway.
Karim, M. F. and Kennedy, J. F. (1990). Menu of Coupled Velocity and Sediment
Discharge Relations for Rivers. Journal Hydrology Engineering. Vol. 116(8).
978-996.
Karunanithi, et. al. (1994). Neural Networks for River Flow Prediction. Journal of
Computing in Civil Engineering. Vol. 8(12). 201-220.
Kitamura, Y. and Nakayama, H. (1985). Rainfall-runoff in the Catchment Area of Muda
and Pedu Dams. Quarterly report no. 18, TARC, Alor Setar.
Laursen, E. M. (1958). The Total Sediment Load of Streams. Journal of the hydraulics
Division. Vol. 84(1). 1-36.
Lippmann, R. P. (1987). An Introduction to Computing with Neural Nets. IEEE ASSP
Magazine. Vol. 4. 4-22.
Loague, K. M. and Freeze, A. (1985). A Comparison of Rainfall-Runoff Modelling
Techniques on Small Upland Catchments. Water Resources Research. Vol. 21(2).
229-248.
Lu, B. and Evans, B. L. (1999). Channel Equalization by Feedforward Neural Networks.
IEEE Trans. Neural Networks. Vol. 10. 587-590.
Lucks, M. B. and Oki, N. (1999). A Radial Basis Function for Function Approximation.
IEEE Transactions on Neural Networks. Vol. 5. 1099-1101.
Madan, M. G. and Yamakawa, T. (1988). Fuzzy Logic in Knowledge-based System,
Decision and Control. Amsterdam: Elsevier Science.
Maidment, D. R. (ed.) (1993). Handbook of Hydrology. New York: McGraw-Hill.
Maier, H.R. and Dandy, G.C. (1996). The Use of Artificial Neural Networks for the
Prediction of Water Quality Parameters. Water Resources Research. Vol. 32(4).
Mann, I. And McLaughlin, S. (2000). Dynamical system modelling using Radial Basis
Function. IEEE Transactions on Neural Networks. Vol. 7. 461-465.
Markus, M. (1997). Application of Neural Networks in Streamflow Forecasting.
Colorado State University: Ph.D. Dissertation.
Masters, T. (1993). Practical Neural Network Recipes in C++. San Diego, California:
Academic Press, Inc.
MATLAB (2000). Getting Started with MATLAB. 6 th. ed. Natick, M.A: The Math
Works Inc.
Mays, L. W. and Tung, Y. K. (1992). Hydrosystems Engineering and Management.
USA: McGraw-Hill, Inc.
Mazion, E. and Yen, B. C. (1994). Computational Discretization Effect on RainfallRunoff Simulation. Journal of Water Resources Planning & Management.
Vol. 120(5). 715-734.
233
McCuen, R. H. (1997). Hydrologic Analysis and Design. 2nd Edition. New Jersey:
Prentice Hall Englewood Cliffs.
Metcalf and Eddy (1971). Stormwater management model. Final report, EPA Rep.
No. 11024DOC07/71. University of Florida, Washington, D. C.
Minns, A. W. and Hall, M. J. (1996). Artificial Neural Networks as Rainfall-Runoff
Models. Journal of Hydrology Science. Vol. 41(3). 399-417.
Nagy, L., Kocbach, L., Pora, K. and Hansen, J. P. (1970). Inference Effects in the
Ionization of H2 by Fast Changed Projectiles. Journal of Physics. Vol. 35. 453-459.
Nash, J. E. and Sutcliffe, J. V. (1970). River Flow Forecasting Through Conceptual
Models: A Discussion of Principles. Journal of Hydrology. Vol. 10. 282-290.
Overton, D. E. and Meadows, M. E. (1976). Stormwater Modelling. New York:
Academic Press.
Pallavicini, I. (1999). Giving Simple Tools to Decision Makers-The Fuzzy Approach
To Decision Support Systems. Centre for Ecology & Hydrology. Vol. 11. 7-53.
Park, M. et. al. (1999). A New Approach to the Identification of a Fuzzy Model. Fuzzy
Sets and Systems. Vol. 104. 169-181.
Poff, L. N., Tokar, A. S. and Johnson, P. A. (1996). Steam Hydrological and Ecological
Responses to Climate Change Assessed with an Artificial Neural Network.
Limnology and Oceanography. Vol. 41(5). 857-863.
Poggio, T. and Girosi, F. (1990). Networks for Approximation and Learning. Proc. of the
IEEE. 78, 1481-1497.
234
Raman, H. and Chandramouli, V. (1996). Deriving a General Operating Policy for
Reservoirs Using Neural Network. Journal of Water Resources Planning and
Management. Vol. 122(5).
Ranjithan, S. and Eheart, J. W. (1993). Neural Network-based Screening for
Groundwater Reclamation Under Uncertainty. Water Resources Research.
Vol. 29(3). 563-574.
Rogers, L. L. and Dowla, F. U. (1994). Optimization of Groundwater Remediation Using
Artificial Neural Networks with Parallel Solute Transport Modelling. Water
Resources Researsh. Vol. 30(2). 457-481.
Rumelhart, D. E., McClelland, J. L. and the PDP Research Group (1986). Parallel
Distributed Processing. Massachusetts: The MIT Press. Vol. 1. 547.
Russell, S. O. and Campbell, P. F. (1996). Reservoir Operating Rules with Fuzzy
Programming. Journal of Water Resources Planning and Management. Vol. 122(3).
1-9.
Salas, J. D. et. al. (1980). Applied Modelling of Hydrologic Time Series. Littleton,
Colorado: Water Resources Publication.
Sargent, D. M. (1981). An Investigation Into the Effect of Storm Movement on the
Design of Urban Drainage System: Part I. Public Health Engineering. Vol. 9.
201-207.
Sargent, D. M. (1982). An Investigation Into the Effect of Storm Movement on the Design
of Urban Drainage System: Part II. Public Health Engineering. 111-117.
Shamseldin, A.Y. (1997). Application of a Neural Network Technique to Rainfall-Runoff
Modelling. Journal of Hydrology. Vol. 199. 272-294.
235
Shamseldin, A.Y. O’Connor, K.M. and Liang, G.C. (1997). Methods for Combining the
Outputs of Different Rainfall-Runoff Models. Journal of Hydrology. Vol. 197.
203-229.
Shamseldin, A.Y. and O’Connor, K.M. (1999). A Real-Time Combination Method for
the Outputs of Different Rainfall-Runoff Models. Journal of Hydrology Sciences.
Vol. 44(6). 895-912.
Sherman, L. K. (1932). Streamflow From Rainfall by the Unit Graph Method.
Engineering News-Rec.. Vol. 108. 501-505
Simpson, P. K. (1989). Artificial Neural Systems: Foundations, Paradigms, Applications
and Implementations. USA: Pergamon Press.
Singh, V. P. (1988). Hydrologic Systems-Rainfall-runoff Modelling. New Jersey:
Prentice Hall Englewood Cliffs. Vol. 1. 480.
Singh, V. P. (ed.) (1998). Effect of the Direction of Storm Movement on Planar Flow.
Hydrologic Processes, 12, 147-170
Singh, V. P. (ed.) (1982). Rainfall-Runoff Relationship. Proceeding of the
International Symposium on Rainfall-Runoff Modelling. Littleton, Colorado: Water
Resources Publications.
Singh, V. P. and Woolhiser, D. A. (2002). Mathematical Modelling of Watershed
Hydrology. Journal of Hydrology. Vol. 7(4). 270-292.
Skaggs, R. W., Tabrizi, A. N. and Foster, G. R. (1982). Subsurface Drainage Effects on
Erosion. Journal Soil Water Cons. Vol. 37. 167-172.
236
Smith, J. and Eli, R. N. (1995). Neural Network Models of Rainfall-Runoff Processes.
Journal of Water Resources Planning and Management. Vol. 121(6). 499-508.
Sorooshian, S. (1991). Parameter Estimation, Model Identification, and Model
Validation: Conceptual-Type Models. In Bowles, D.S. and O'Connell, P.E. (Eds.).
Proceedings of the NATO Advanced Study Institute on Recent Advances in the
Modelling of Hydrologic Systems. Portugal: Kluwer Academic Publishers. 10-23.
Specht, D. F. (1991). A General Regression Neural Network. IEEE Transactions on
Neural Networks. Vol. 2. 568-576.
SPSS Inc. (1995). SPSS Software User’s Guide: Release 6.0. Chicago: North
Michigan Avenue.
Starrett, S. K., Najjar, Y. M. and Hill, J. C. (1996). Neural Networks Predict Pesticide
Leaching. Proc. Am. Water and Envir. New York. 1693-1698.
Surkan, A. J. (1974). Simulation of Storm Velocity Effect of Flow From Distributed
Channel Networks. Water Resources Research. Vol. 10. 1149-1160.
Svanidze, G. G. (1980). Mathematical Modelling of Hydrologic Series. Littleton,
Colorado: Water Resources Publications.
Tawfik, M., Ibrahim, A., and Fahmy, H. (1997). Hysteresis Sensitive Neural Network for
Modelling Rating Curve. Journal of Computing in Civil Engineering. Vol. 11(3).
206-211.
The MathWorks Inc. (1992). The Student Edition of MATLAB: Student User Guide.
New Jersey: Prentice-Hall Inc.
237
Thirumalaiah, K. and Deo, M.C. (1998a). Real-Time Flood Forecasting Using Neural
Networks. Computer-Aided Civil and Infrastructure Engineering. Vol. 13(2).
101-111.
Thirumalaiah, K. and Deo, M.C. (1998b). River Stage Forecasting Using Artificial
Neural Networks. Journal of Hydrology Engineering. Vol. 3(1). 26-32.
Thiumalaiah, K. and Deo, M.C. (2000). Hydrological Forecasting Using Neural
Networks. Journal of Hydrology Engineering. Vol. 5(2). 180-189.
Tingsanchali, T. (2000). Forecasting Model of Chao Phraya River Flood Levels at
Bangkok. Thailand: Research Report, Asian Institute of Technology.
Tokar, A. S. and Markus, M. (2000). Precipitation-Runoff Modelling Using Artificial
Neural Networks and Conceptual Models. Journal of Hydrology Engineering. Vol. 2.
156-161.
Tokar, A. S. (1996). Rainfall-Runoff Modelling in an Uncertain Environment.
University of Maryland: Ph.D. Dissertation.
Tokar, A. S. and Johnson, P. A. (1999). Rainfall-Runoff Modelling Using Artificial
Neural Networks. Journal of Hydrology Engineering. Vol. 3. 232-239.
Todini, E. (1988). Rainfall-Runoff Modelling: Past, Present and Future. Journal of
Hydrology. Vol. 100. 341-352.
Torno, (1985). Computer Application in Water Resources. Proceedings of the specialty
conference. New York: Buffalo.
Tsoukalas, L. H. and Uhrig, R. E. (1997). Fuzzy and Neural Approaches in
Engineering. New York: John Wiley & Sons Inc..
238
Turksen, I. B. (1999). Type I and Type II Fuzzy System Modelling. Fuzzy Sets and
Systems. Vol. 106. 11-34.
U.S. Army Corps of Engineers Hydrologic Engineering Center (USACE-HEC) (2000).
HEC-HMS Hydrologic Modelling System Users Manual. California: USACE-HEC.
Wasserman, P. D. (1989). Neural Computing, Theory and Practice. New York: Van
Nostrand Reinhold.
Wasserman, A.I. (2000). Software Tools: Past, Present, and Future. IEEE Transactions
on Software Engineering. Vol. 9(3). 3-6.
Werbos, P. J. (1974). Beyond Regression: New Tools for Prediction and Analysis in the
Behavioural Science. , Havard University, Cambridge: Ph.D. Thesis
Wilby, R. L., Hassan, H. and Hanaki, K. (1998). Statistical Downscaling of
Hydrometeorological Variables Using General Circulation Model Output. Journal of
Hydrology. Vol. 205. 1-19.
Woolhiser, D. A. and Brakensiek, D. L. (1982). Hydrologic System Synthesis in
Hydrologic Modelling of Small Watersheds. St. Joseph: ASAE Monograph No. 5.
3-16.
Woolhiser, D. A. and Goodrich, D. C. (1988). Effect of Storm Rainfall Intensity Patterns
on Surface Runoff. Journal of Hydrology. Vol. 102. 29-47.
Wu, et. al. (1982). Effects of Spatial Variability of Hydraulic Roughness on Runoff
Hydrographs. Agriculture Forest Meteorologic. Vol. 59. 231-248.
Wurbs, R. A. (1998). Dissemination of Generalized Water Resources Models in the
United States. Water Int. Vol. 23. 190-198.
239
XP-SWMM (2000). Expert Stormwater and Wastewater Management Model.
Version 8.05 (42-805-0546), USA.
XP-SWMM (2000). Stormwater Management Users Manual-Version 4. Athens: US
Environmental Protection Agency.
Yager, R. R. (1977). Multiple Objective Decision-Making Using Fuzzy Sets. Intl.
Journal of Man-Machine Studies. Vol. 9. 375-382.
Yang, C. C., Prasher, S. O., Lacroix, R., Sreekanth S., Patni N. K. and Masse L. (1997).
Artifical Neural Networks Model for Subsurface Drained Farmland. Journal of Irr.
and Drain. Engrg. Vol. 123(4). 285-292.
Yang, S. and Tseng, C. (1988). An Orthogonal Neural Network for Function
Approximation. IEEE Transactions on Systems. Vol. 26(5). 779-785.
Yapo, P.O., Gupta, H.V., and Sorooshian, S. (1996). Automatic Calibration of
Conceptual Rainfall-Runoff Models: Sensitivity to Calibration Data. Journal of
Hydrology. Vol. 181. 23-48.
Yu, P-S and Yang, T-C (2000). Fuzzy Multi-Objective Function for Rainfall-Runoff
Model Calibration. Journal of Hydrology. Vol. 238. 1-14.
Yu, P-S, Chen C-J, and Chen, S-J (2000). Application of Gray and Fuzzy Methods for
Rainfall Forecasting. Journal of Hydrology Engineering. Vol. 4. 339-345.
Zakaria, et. al. (2003). Bio-Ecological Drainage System for Water Quantity and Quality
Control. Intl. River Basin Management. Vol. 1(3). 237-251.
Zadeh, L. A. (1973). Outline of A New Approach to the Analysis of Complex Systems
and Decision Processes. IEEE Trans. Systems, Man and Cybernetics. Vol. 3. 28-44.
240
Zadeh, L. A. and Kacprzyk, J. (eds.) (1992). Fuzzy Logic for the Management of
Uncertainty. New York: John Wiley & Sons.
Zhang, M., Fulcher, J. and Scofield, R. A. (1997). Rainfall Estimation Using Artificial
Neural Network Group. Neurocomputing. Vol. 16. 97-115.
Zimmermann, H.-J. (1994). Fuzzy Sets Theory and Its Applications. 2nd. ed. Boston:
Kluwer Academic Publishers.
Zou, et. al. (2002). Combining Time Series Model for Forecasting. Journal of
Forecasting. Vol. 24.
241
APPENDIX A
Daily and hourly results of MLP model
A. Daily results
Table A4.3(a): Results of 3-layer neural networks for Sg. Ketil catchment using 100% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-1*         268                0.8370    0.2924         0.0096  7.2512
MLP-TEST Set 1    16-14-1*         268                0.8590    0.2995         0.0099  7.3304
MLP-TEST Set 2    16-14-1*         268                0.8912    0.3543         0.0116  8.6508
MLP-TEST Set 3    16-14-1*         268                0.8448    0.3914         0.0131  9.8265
MLP-TEST Set 4    16-14-1*         268                0.8422    0.4765         0.0155  12.7560
MLP-TEST Set 5    16-14-1*         268                0.8930    0.3928         0.0131  9.8143
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
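The parameter counts reported in these tables are consistent with a simple rule: the number of connection weights between successive layers plus one term for every node in the input and hidden layers (for example, 16-14-1 gives 16 x 14 + 14 x 1 + 16 + 14 = 268). The Python sketch below reproduces the tabulated counts under that assumption. The rule is inferred from the tabulated values rather than stated in this appendix, and the function name count_parameters is hypothetical.

```python
def count_parameters(structure):
    """Count network parameters from a structure label such as '16-14-1*'.

    Assumed rule (inferred from the tables, not stated in the appendix):
    weights between consecutive layers plus one term per node in every
    layer except the output layer.
    """
    # Strip the footnote marker and split the label into layer sizes.
    layers = [int(n) for n in structure.rstrip("*").split("-")]
    # Fully connected weights between each pair of consecutive layers.
    weights = sum(a * b for a, b in zip(layers, layers[1:]))
    # One additional term per input and hidden node.
    node_terms = sum(layers[:-1])
    return weights + node_terms


if __name__ == "__main__":
    for label in ("16-14-1*", "16-14-10-1*", "17-13-1*", "6-4-8-1*"):
        print(label, count_parameters(label))
```

Under this rule a second hidden layer adds its incoming and outgoing weights plus its own node terms, which is why the 4-layer networks carry roughly one and a half to two times as many parameters as their 3-layer counterparts.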
Table A4.3(b): Results of 3-layer neural networks for Sg. Ketil catchment using 50% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-1*         268                0.7890    0.4068         0.0133  10.8220
MLP-TEST Set 1    16-14-1*         268                0.8226    0.4122         0.0136  11.4731
MLP-TEST Set 2    16-14-1*         268                0.8698    0.3807         0.0125  10.8917
MLP-TEST Set 3    16-14-1*         268                0.8295    0.4319         0.0143  12.1284
MLP-TEST Set 4    16-14-1*         268                0.8151    0.4460         0.0146  12.1461
MLP-TEST Set 5    16-14-1*         268                0.8631    0.5147         0.0170  15.3070
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
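The error measures named in the column headings can be computed as sketched below. The formulas are commonly used definitions and are assumptions here, since this appendix does not restate them: RMSE as the root mean square error, RRMSE as the root mean square of the relative errors, MAPE as the mean absolute percentage error, and COC (R2) as the squared Pearson correlation between observed and simulated flows. All function and variable names are hypothetical.

```python
import math


def rmse(observed, simulated):
    """Root mean square error, in the units of the data (cumecs here)."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def rrmse(observed, simulated):
    """Assumed definition: root mean square of the relative errors (o - s) / o."""
    n = len(observed)
    return math.sqrt(
        sum(((o - s) / o) ** 2 for o, s in zip(observed, simulated)) / n
    )


def mape(observed, simulated):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    n = len(observed)
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(observed, simulated)) / n


def r_squared(observed, simulated):
    """Assumed definition: squared Pearson correlation of observed and simulated."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov ** 2 / (var_o * var_s)


if __name__ == "__main__":
    # Illustrative flows only; these are not values from the thesis data.
    obs = [2.0, 4.0, 6.0]
    sim = [1.0, 5.0, 6.0]
    print(rmse(obs, sim), rrmse(obs, sim), mape(obs, sim), r_squared(obs, sim))
```

RMSE keeps the units of the flow series, so its magnitude differs between catchments with different discharge ranges, whereas the relative measures and R2 are dimensionless and are easier to compare across catchments.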
Table A4.3(c): Results of 3-layer neural networks for Sg. Ketil catchment using 25% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-1*         268                0.8788    0.2871         0.0095  7.9621
MLP-TEST Set 1    16-14-1*         268                0.7350    0.6404         0.0212  13.5025
MLP-TEST Set 2    16-14-1*         268                0.7194    0.3385         0.0112  7.3006
MLP-TEST Set 3    16-14-1*         268                0.7536    0.7617         0.0252  17.1820
MLP-TEST Set 4    16-14-1*         268                0.6188    0.4652         0.0153  11.6471
MLP-TEST Set 5    16-14-1*         268                0.8131    0.8110         0.0266  16.8580
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.4(a): Results of 4-layer neural networks for Sg. Ketil catchment using 100% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-10-1*      414                0.8370    0.2871         0.0095  7.1466
MLP-TEST Set 1    16-14-10-1*      414                0.8430    0.3195         0.0106  7.6262
MLP-TEST Set 2    16-14-10-1*      414                0.8797    0.3391         0.0111  8.2829
MLP-TEST Set 3    16-14-10-1*      414                0.8279    0.4130         0.0138  9.9779
MLP-TEST Set 4    16-14-10-1*      414                0.8305    0.4536         0.0148  12.0876
MLP-TEST Set 5    16-14-10-1*      414                0.8830    0.4069         0.0135  9.4420
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.4(b): Results of 4-layer neural networks for Sg. Ketil catchment using 50% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-10-1*      414                0.8139    0.3424         0.0113  9.2320
MLP-TEST Set 1    16-14-10-1*      414                0.8479    0.3483         0.0116  8.5922
MLP-TEST Set 2    16-14-10-1*      414                0.8654    0.2870         0.0094  6.4820
MLP-TEST Set 3    16-14-10-1*      414                0.8547    0.4336         0.0145  11.3320
MLP-TEST Set 4    16-14-10-1*      414                0.8117    0.3903         0.0127  9.9875
MLP-TEST Set 5    16-14-10-1*      414                0.8802    0.4341         0.0144  9.6348
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.4(c): Results of 4-layer neural networks for Sg. Ketil catchment using 25% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-10-1*      414                0.9198    0.2231         0.0074  5.9640
MLP-TEST Set 1    16-14-10-1*      414                0.7545    0.5803         0.0191  13.0057
MLP-TEST Set 2    16-14-10-1*      414                0.6800    0.3675         0.0121  8.3800
MLP-TEST Set 3    16-14-10-1*      414                0.8241    0.5906         0.0195  14.5750
MLP-TEST Set 4    16-14-10-1*      414                0.5409    0.5000         0.0164  12.5944
MLP-TEST Set 5    16-14-10-1*      414                0.8699    0.6682         0.0218  15.7563
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.5(a): Results of 3-layer neural networks for Sg. Klang catchment using 100% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      17-13-1*         264                0.8203    6.3820         0.3693  25.7937
MLP-TEST Set 1    17-13-1*         264                0.8045    8.0723         0.3845  27.2468
MLP-TEST Set 2    17-13-1*         264                0.8562    7.4985         0.3011  23.3682
MLP-TEST Set 3    17-13-1*         264                0.7974    7.6160         0.3977  28.7316
MLP-TEST Set 4    17-13-1*         264                0.8465    8.9923         0.2582  21.3788
MLP-TEST Set 5    17-13-1*         264                0.7966    7.6309         0.5251  39.6970
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.5(b): Results of 3-layer neural networks for Sg. Klang catchment using 50% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      17-13-1*         264                0.8817    5.0103         0.3553  25.6967
MLP-TEST Set 1    17-13-1*         264                0.7967    8.8230         0.3670  27.4527
MLP-TEST Set 2    17-13-1*         264                0.8594    7.3548         0.2601  19.7716
MLP-TEST Set 3    17-13-1*         264                0.7766    8.5181         0.3872  29.4335
MLP-TEST Set 4    17-13-1*         264                0.8493    9.0087         0.2343  19.3385
MLP-TEST Set 5    17-13-1*         264                0.7606    8.0567         0.4854  37.2270
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.5(c): Results of 3-layer neural networks for Sg. Klang catchment using 25% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      17-13-1*         264                0.9384    3.7962         0.2548  17.6230
MLP-TEST Set 1    17-13-1*         264                0.7568    9.3368         0.4664  31.8791
MLP-TEST Set 2    17-13-1*         264                0.8176    10.1187        0.3905  28.4961
MLP-TEST Set 3    17-13-1*         264                0.7201    8.9428         0.4771  32.3693
MLP-TEST Set 4    17-13-1*         264                0.7844    13.0202        0.4101  30.1309
MLP-TEST Set 5    17-13-1*         264                0.6352    10.6431        0.6465  46.5411
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.6(a): Results of 4-layer neural networks for Sg. Klang catchment using 100% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      17-13-9-1*       386                0.8077    6.5784         0.3739  26.1677
MLP-TEST Set 1    17-13-9-1*       386                0.8045    8.0770         0.3805  26.6066
MLP-TEST Set 2    17-13-9-1*       386                0.8704    7.2422         0.3007  23.1680
MLP-TEST Set 3    17-13-9-1*       386                0.7989    7.5862         0.3948  27.9979
MLP-TEST Set 4    17-13-9-1*       386                0.8677    8.3630         0.2342  18.2616
MLP-TEST Set 5    17-13-9-1*       386                0.8143    7.3068         0.5199  38.0870
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.6(b): Results of 4-layer neural networks for Sg. Klang catchment using 50% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      17-13-9-1*       386                0.8654    5.3202         0.3584  26.4449
MLP-TEST Set 1    17-13-9-1*       386                0.8020    8.7398         0.3542  26.3884
MLP-TEST Set 2    17-13-9-1*       386                0.8687    7.1217         0.2509  19.4906
MLP-TEST Set 3    17-13-9-1*       386                0.7921    8.3018         0.3712  27.9545
MLP-TEST Set 4    17-13-9-1*       386                0.8661    8.5737         0.2217  17.4775
MLP-TEST Set 5    17-13-9-1*       386                0.7977    7.4976         0.4623  35.0263
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.6(c): Results of 4-layer neural networks for Sg. Klang catchment using 25% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      17-13-9-1*       386                0.9213    4.3142         0.3159  21.1549
MLP-TEST Set 1    17-13-9-1*       386                0.7914    8.5172         0.4737  31.4957
MLP-TEST Set 2    17-13-9-1*       386                0.8119    11.0415        0.4383  32.1990
MLP-TEST Set 3    17-13-9-1*       386                0.7855    7.7986         0.4846  32.7049
MLP-TEST Set 4    17-13-9-1*       386                0.7767    14.2847        0.4515  30.7377
MLP-TEST Set 5    17-13-9-1*       386                0.7820    8.7751         0.6667  49.0089
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.7(a): Results of 3-layer neural networks for Sg. Slim catchment using 100% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-1*         268                0.8128    0.0564         0.0008  5.6743
MLP-TEST Set 1    16-14-1*         268                0.8838    0.0169         0.0002  1.6532
MLP-TEST Set 2    16-14-1*         268                0.7395    0.0249         0.0004  2.9419
MLP-TEST Set 3    16-14-1*         268                0.8714    0.0213         0.0003  2.1097
MLP-TEST Set 4    16-14-1*         268                0.8596    0.0118         0.0002  1.4608
MLP-TEST Set 5    16-14-1*         268                0.8205    0.0269         0.0004  8.8920
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.7(b): Results of 3-layer neural networks for Sg. Slim catchment using 50% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-1*         268                0.9114    0.0292         0.0004  3.0765
MLP-TEST Set 1    16-14-1*         268                0.8530    0.0157         0.0002  1.7289
MLP-TEST Set 2    16-14-1*         268                0.8258    0.0142         0.0002  1.6508
MLP-TEST Set 3    16-14-1*         268                0.8365    0.0190         0.0003  2.0100
MLP-TEST Set 4    16-14-1*         268                0.5837    0.0135         0.0002  1.6424
MLP-TEST Set 5    16-14-1*         268                0.8204    0.0220         0.0003  2.3775
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.7(c): Results of 3-layer neural networks for Sg. Slim catchment using 25% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-1*         268                0.8816    0.0403         0.0006  5.0678
MLP-TEST Set 1    16-14-1*         268                0.8055    0.0196         0.0003  2.3508
MLP-TEST Set 2    16-14-1*         268                0.6247    0.0160         0.0002  1.9067
MLP-TEST Set 3    16-14-1*         268                0.7954    0.0221         0.0003  2.5255
MLP-TEST Set 4    16-14-1*         268                0.6377    0.0131         0.0002  1.6342
MLP-TEST Set 5    16-14-1*         268                0.6855    0.0260         0.0004  2.9890
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.8(a): Results of 4-layer neural networks for Sg. Slim catchment using 100% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-11-1*      430                0.8026    0.0577         0.0009  5.9499
MLP-TEST Set 1    16-14-11-1*      430                0.9072    0.0157         0.0002  1.5960
MLP-TEST Set 2    16-14-11-1*      430                0.6880    0.0145         0.0002  1.5723
MLP-TEST Set 3    16-14-11-1*      430                0.9054    0.0195         0.0003  1.9809
MLP-TEST Set 4    16-14-11-1*      430                0.6037    0.0141         0.0002  1.6277
MLP-TEST Set 5    16-14-11-1*      430                0.8756    0.0251         0.0004  2.7608
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.8(b): Results of 4-layer neural networks for Sg. Slim catchment using 50% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-11-1*      430                0.8919    0.0315         0.0005  2.9798
MLP-TEST Set 1    16-14-11-1*      430                0.8983    0.0145         0.0002  1.5466
MLP-TEST Set 2    16-14-11-1*      430                0.7397    0.0141         0.0002  1.6155
MLP-TEST Set 3    16-14-11-1*      430                0.8941    0.0176         0.0003  1.7630
MLP-TEST Set 4    16-14-11-1*      430                0.7087    0.0117         0.0002  1.4508
MLP-TEST Set 5    16-14-11-1*      430                0.8730    0.0222         0.0003  2.4443
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.8(c): Results of 4-layer neural networks for Sg. Slim catchment using 25% of data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      16-14-11-1*      430                0.8747    0.0313         0.0005  2.9290
MLP-TEST Set 1    16-14-11-1*      430                0.8874    0.0159         0.0002  1.6534
MLP-TEST Set 2    16-14-11-1*      430                0.7297    0.0140         0.0002  1.5222
MLP-TEST Set 3    16-14-11-1*      430                0.8857    0.0195         0.0003  2.0200
MLP-TEST Set 4    16-14-11-1*      430                0.6976    0.0121         0.0002  1.4887
MLP-TEST Set 5    16-14-11-1*      430                0.8635    0.0238         0.0004  2.6554
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
B. Hourly results
Table A4.11(a): Results of 3-layer neural networks for Sg. Ketil catchment using 100% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      6-4-1*           38                 0.9904    0.0659         0.0022  9.4885
MLP-TEST Set 1    6-4-1*           38                 0.9930    0.0260         0.0008  5.6790
MLP-TEST Set 2    6-4-1*           38                 0.9932    0.0288         0.0009  4.1704
MLP-TEST Set 3    6-4-1*           38                 0.9762    0.0554         0.0018  11.8142
MLP-TEST Set 4    6-4-1*           38                 0.9852    0.0487         0.0016  9.9654
MLP-TEST Set 5    6-4-1*           38                 0.9830    0.0895         0.0029  1.4510
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.11(b): Results of 3-layer neural networks for Sg. Ketil catchment using 65% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      6-4-1*           38                 0.9914    0.0600         0.0020  0.9080
MLP-TEST Set 1    6-4-1*           38                 0.9932    0.0262         0.0008  0.5915
MLP-TEST Set 2    6-4-1*           38                 0.9933    0.0292         0.0010  0.3853
MLP-TEST Set 3    6-4-1*           38                 0.9765    0.0559         0.0018  1.0960
MLP-TEST Set 4    6-4-1*           38                 0.9851    0.0487         0.0016  1.0435
MLP-TEST Set 5    6-4-1*           38                 0.9831    0.0899         0.0029  1.4042
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.11(c): Results of 3-layer neural networks for Sg. Ketil catchment using 25% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      6-4-1*           38                 0.9812    0.0624         0.0021  1.1621
MLP-TEST Set 1    6-4-1*           38                 0.9915    0.0420         0.0014  1.2127
MLP-TEST Set 2    6-4-1*           38                 0.9932    0.0295         0.0010  0.4285
MLP-TEST Set 3    6-4-1*           38                 0.9769    0.0550         0.0018  1.0609
MLP-TEST Set 4    6-4-1*           38                 0.9849    0.0484         0.0016  1.0070
MLP-TEST Set 5    6-4-1*           38                 0.9830    0.0895         0.0029  1.4190
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.12(a): Results of 4-layer neural networks for Sg. Ketil catchment using 100% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      6-4-8-1*         82                 0.9901    0.0671         0.0022  0.9828
MLP-TEST Set 1    6-4-8-1*         82                 0.9926    0.0265         0.0008  0.6166
MLP-TEST Set 2    6-4-8-1*         82                 0.9928    0.0294         0.0010  0.4564
MLP-TEST Set 3    6-4-8-1*         82                 0.9764    0.0551         0.0018  1.1440
MLP-TEST Set 4    6-4-8-1*         82                 0.9842    0.0499         0.0016  1.1090
MLP-TEST Set 5    6-4-8-1*         82                 0.9806    0.0979         0.0032  1.9667
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.12(b): Results of 4-layer neural networks for Sg. Ketil catchment using 65% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      6-4-8-1*         82                 0.9915    0.0595         0.0020  0.9018
MLP-TEST Set 1    6-4-8-1*         82                 0.9922    0.0272         0.0009  0.6474
MLP-TEST Set 2    6-4-8-1*         82                 0.9914    0.0324         0.0011  0.5117
MLP-TEST Set 3    6-4-8-1*         82                 0.9020    0.1294         0.0043  8.9520
MLP-TEST Set 4    6-4-8-1*         82                 0.8337    0.1663         0.0560  2.6960
MLP-TEST Set 5    6-4-8-1*         82                 0.9592    0.1346         0.0045  3.1456
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.12(c): Results of 4-layer neural networks for Sg. Ketil catchment using 25% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      6-4-8-1*         82                 0.9843    0.0564         0.0019  0.8250
MLP-TEST Set 1    6-4-8-1*         82                 0.9919    0.0292         0.0009  0.6806
MLP-TEST Set 2    6-4-8-1*         82                 0.9775    0.0576         0.0019  1.1043
MLP-TEST Set 3    6-4-8-1*         82                 0.9443    0.0819         0.0027  1.4669
MLP-TEST Set 4    6-4-8-1*         82                 0.9844    0.0504         0.0017  1.0686
MLP-TEST Set 5    6-4-8-1*         82                 0.9828    0.0897         0.0029  1.5360
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.13(a): Results of 3-layer neural networks for Sg. Klang catchment using 100% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      8-6-1*           68                 0.8638    9.8268         0.2862  14.5678
MLP-TEST Set 1    8-6-1*           68                 0.8588    9.8049         0.2134  11.2398
MLP-TEST Set 2    8-6-1*           68                 0.9166    4.6828         0.0929  3.7779
MLP-TEST Set 3    8-6-1*           68                 0.8743    9.0456         0.2424  12.5075
MLP-TEST Set 4    8-6-1*           68                 0.8863    13.4652        0.2068  9.4064
MLP-TEST Set 5    8-6-1*           68                 0.8235    6.9646         0.1164  7.0568
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.13(b): Results of 3-layer neural networks for Sg. Klang catchment using 70% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      8-6-1*           68                 0.8586    10.9196        0.3045  14.2968
MLP-TEST Set 1    8-6-1*           68                 0.8619    9.5673         0.2125  13.3512
MLP-TEST Set 2    8-6-1*           68                 0.9085    4.8491         0.0991  5.3344
MLP-TEST Set 3    8-6-1*           68                 0.8550    9.6491         0.2966  14.2024
MLP-TEST Set 4    8-6-1*           68                 0.8862    13.4952        0.2180  11.3032
MLP-TEST Set 5    8-6-1*           68                 0.7922    8.8336         0.2154  18.0782
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.13(c): Results of 3-layer neural networks for Sg. Klang catchment using 40% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      8-6-1*           68                 0.8682    7.4827         0.2561  10.2560
MLP-TEST Set 1    8-6-1*           68                 0.8638    9.2846         0.2055  11.5943
MLP-TEST Set 2    8-6-1*           68                 0.9217    4.7954         0.0967  4.8928
MLP-TEST Set 3    8-6-1*           68                 0.8909    8.2057         0.3064  10.5765
MLP-TEST Set 4    8-6-1*           68                 0.8855    14.5339        0.1537  8.5149
MLP-TEST Set 5    8-6-1*           68                 0.7672    8.1017         0.1388  8.6668
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.14(a): Results of 4-layer neural networks for Sg. Klang catchment using 100% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      8-6-7-1*         118                0.8636    9.8162         0.2611  10.5408
MLP-TEST Set 1    8-6-7-1*         118                0.8587    9.6108         0.2144  11.9465
MLP-TEST Set 2    8-6-7-1*         118                0.9355    4.1073         0.0819  4.1601
MLP-TEST Set 3    8-6-7-1*         118                0.8987    7.9272         0.2554  11.5848
MLP-TEST Set 4    8-6-7-1*         118                0.8781    13.9002        0.2246  10.0978
MLP-TEST Set 5    8-6-7-1*         118                0.8098    7.6145         0.1237  5.9622
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.14(b): Results of 4-layer neural networks for Sg. Klang catchment using 65% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      8-6-7-1*         118                0.8456    11.3374        0.2954  12.3124
MLP-TEST Set 1    8-6-7-1*         118                0.8546    9.5754         0.2258  13.1625
MLP-TEST Set 2    8-6-7-1*         118                0.7653    10.5593        0.2551  14.9596
MLP-TEST Set 3    8-6-7-1*         118                0.6741    21.3165        0.7847  53.7492
MLP-TEST Set 4    8-6-7-1*         118                0.7882    35.8792        0.4956  26.8303
MLP-TEST Set 5    8-6-7-1*         118                0.4959    12.2470        0.3496  24.8221
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.14(c): Results of 4-layer neural networks for Sg. Klang catchment using 30% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      8-6-7-1*         118                0.8837    6.9320         0.2127  8.9768
MLP-TEST Set 1    8-6-7-1*         118                0.9166    7.5062         0.1908  9.7557
MLP-TEST Set 2    8-6-7-1*         118                0.9132    4.8488         0.1019  7.6912
MLP-TEST Set 3    8-6-7-1*         118                0.7226    11.8962        0.3431  14.7521
MLP-TEST Set 4    8-6-7-1*         118                0.8238    15.7230        0.2648  14.0124
MLP-TEST Set 5    8-6-7-1*         118                0.8410    6.7002         0.1132  5.3757
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.15(a): Results of 3-layer neural networks for Sg. Slim catchment using 100% of available data sets in training phase
Model Data Set    Model Structure  No. of Parameters  COC (R2)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING      7-5-1*           52                 0.9894    0.0498         0.0020  0.9618
MLP-TEST Set 1    7-5-1*           52                 0.9542    0.0367         0.0015  0.7475
MLP-TEST Set 2    7-5-1*           52                 0.9712    0.0563         0.0023  1.2782
MLP-TEST Set 3    7-5-1*           52                 0.9893    0.0556         0.0022  1.3100
MLP-TEST Set 4    7-5-1*           52                 0.9921    0.0426         0.0017  0.7290
MLP-TEST Set 5    7-5-1*           52                 0.9877    0.0382         0.0015  0.9378
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation
Table A4.15(b): Results of 3-layer neural networks for Sg. Slim catchment using 65% of available data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING    7-5-1*           52                 0.9889    0.0502         0.0020  0.8868
MLP-TEST Set 1  7-5-1*           52                 0.9547    0.0364         0.0015  0.7134
MLP-TEST Set 2  7-5-1*           52                 0.9712    0.0563         0.0023  1.2612
MLP-TEST Set 3  7-5-1*           52                 0.9902    0.0534         0.0021  1.2589
MLP-TEST Set 4  7-5-1*           52                 0.9914    0.0441         0.0018  0.7479
MLP-TEST Set 5  7-5-1*           52                 0.9299    0.0973         0.0039  2.5545
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation coefficient
Table A4.15(c): Results of 3-layer neural networks for Sg. Slim catchment using 30% of available data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING    7-5-1*           52                 0.9859    0.0466         0.0019  0.7567
MLP-TEST Set 1  7-5-1*           52                 0.9537    0.0364         0.0015  0.6813
MLP-TEST Set 2  7-5-1*           52                 0.9656    0.0657         0.0026  1.8349
MLP-TEST Set 3  7-5-1*           52                 0.9906    0.0516         0.0020  1.2090
MLP-TEST Set 4  7-5-1*           52                 0.9904    0.0460         0.0018  0.8350
MLP-TEST Set 5  7-5-1*           52                 0.9880    0.0378         0.0015  0.9184
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation coefficient
Table A4.16(a): Results of 4-layer neural networks for Sg. Slim catchment using 100% of available data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING    7-5-9-1*         110                0.9846    0.0597         0.0024  1.3368
MLP-TEST Set 1  7-5-9-1*         110                0.9408    0.0452         0.0018  1.1728
MLP-TEST Set 2  7-5-9-1*         110                0.9703    0.0597         0.0024  1.3872
MLP-TEST Set 3  7-5-9-1*         110                0.9884    0.0594         0.0024  1.4710
MLP-TEST Set 4  7-5-9-1*         110                0.9846    0.0663         0.0027  1.8754
MLP-TEST Set 5  7-5-9-1*         110                0.9844    0.0457         0.0018  1.2596
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation coefficient
Table A4.16(b): Results of 4-layer neural networks for Sg. Slim catchment using 65% of available data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING    7-5-9-1*         110                0.9864    0.0554         0.0022  1.0898
MLP-TEST Set 1  7-5-9-1*         110                0.9456    0.0406         0.0016  0.9034
MLP-TEST Set 2  7-5-9-1*         110                0.7163    0.1790         0.0072  4.0341
MLP-TEST Set 3  7-5-9-1*         110                0.9882    0.0589         0.0023  1.4146
MLP-TEST Set 4  7-5-9-1*         110                0.6500    0.2898         0.0115  6.8653
MLP-TEST Set 5  7-5-9-1*         110                0.6837    0.2103         0.0083  4.2920
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation coefficient
Table A4.16(c): Results of 4-layer neural networks for Sg. Slim catchment using 30% of available data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLP TRAINING    7-5-9-1*         110                0.8881    0.1249         0.0051  1.7640
MLP-TEST Set 1  7-5-9-1*         110                0.7520    0.0906         0.0037  1.2845
MLP-TEST Set 2  7-5-9-1*         110                0.6588    0.1909         0.0077  4.6483
MLP-TEST Set 3  7-5-9-1*         110                0.4226    0.5048         0.0198  13.9979
MLP-TEST Set 4  7-5-9-1*         110                0.0128    0.4924         0.0195  12.4315
MLP-TEST Set 5  7-5-9-1*         110                0.7045    0.2010         0.0080  3.3840
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation coefficient
APPENDIX B
Daily and hourly results of RBF model
Part A: Daily results
Table B4.18(a): Results of RBF networks for Sg. Ketil catchment using 100% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    16 input nodes   52                 0.9460    0.1717         0.0057  3.6483
RBF-TEST Set 1  16 input nodes   52                 0.8645    0.2987         0.0099  7.0205
RBF-TEST Set 2  16 input nodes   52                 0.8225    0.4076         0.0133  9.6987
RBF-TEST Set 3  16 input nodes   52                 0.8764    0.3430         0.0115  7.7610
RBF-TEST Set 4  16 input nodes   52                 0.7593    0.5432         0.0176  14.2859
RBF-TEST Set 5  16 input nodes   52                 0.9367    0.3015         0.0101  6.4010
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.18(b): Results of RBF networks for Sg. Ketil catchment using 50% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    16 input nodes   52                 0.9689    0.1383         0.0046  2.6770
RBF-TEST Set 1  16 input nodes   52                 0.8246    0.3445         0.0114  8.3736
RBF-TEST Set 2  16 input nodes   52                 0.8083    0.4171         0.0136  10.0984
RBF-TEST Set 3  16 input nodes   52                 0.8297    0.3971         0.0132  9.1910
RBF-TEST Set 4  16 input nodes   52                 0.7462    0.5499         0.0179  14.3318
RBF-TEST Set 5  16 input nodes   52                 0.9330    0.3177         0.0106  7.6435
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.18(c): Results of RBF networks for Sg. Ketil catchment using 25% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    16 input nodes   52                 0.9722    0.1315         0.0044  2.2279
RBF-TEST Set 1  16 input nodes   52                 0.7749    0.4742         0.0157  12.3495
RBF-TEST Set 2  16 input nodes   52                 0.7915    0.3816         0.0125  10.0193
RBF-TEST Set 3  16 input nodes   52                 0.7670    0.5593         0.0186  14.0947
RBF-TEST Set 4  16 input nodes   52                 0.7318    0.4632         0.0151  12.1967
RBF-TEST Set 5  16 input nodes   52                 0.8915    0.5012         0.0165  14.2769
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.19(a): Results of RBF networks for Sg. Klang catchment using 100% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    17 input nodes   55                 0.8462    6.2737         0.3323  24.0405
RBF-TEST Set 1  17 input nodes   55                 0.6623    10.6638        0.4359  30.5000
RBF-TEST Set 2  17 input nodes   55                 0.7948    10.1066        0.3342  27.9063
RBF-TEST Set 3  17 input nodes   55                 0.7501    8.7609         0.4416  29.5988
RBF-TEST Set 4  17 input nodes   55                 0.7565    12.8041        0.2990  25.2004
RBF-TEST Set 5  17 input nodes   55                 0.6938    8.9530         0.5847  39.4501
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.19(b): Results of RBF networks for Sg. Klang catchment using 50% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    17 input nodes   55                 0.9205    4.3089         0.2776  20.1004
RBF-TEST Set 1  17 input nodes   55                 0.6776    11.3620        0.3484  27.2397
RBF-TEST Set 2  17 input nodes   55                 0.7751    10.5567        0.2968  24.3221
RBF-TEST Set 3  17 input nodes   55                 0.7559    9.5698         0.3414  26.7941
RBF-TEST Set 4  17 input nodes   55                 0.7274    13.4357        0.2948  24.2662
RBF-TEST Set 5  17 input nodes   55                 0.7401    10.1590        0.4054  33.1348
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.19(c): Results of RBF networks for Sg. Klang catchment using 25% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    17 input nodes   55                 0.9392    3.9291         0.2449  15.8175
RBF-TEST Set 1  17 input nodes   55                 0.7112    9.7425         0.3685  26.7373
RBF-TEST Set 2  17 input nodes   55                 0.7400    9.6423         0.3417  27.7973
RBF-TEST Set 3  17 input nodes   55                 0.7772    8.0970         0.3640  25.0004
RBF-TEST Set 4  17 input nodes   55                 0.6616    12.4606        0.3506  28.0038
RBF-TEST Set 5  17 input nodes   55                 0.7205    8.9022         0.4771  34.8004
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.20(a): Results of RBF networks for Sg. Slim catchment using 100% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    16 input nodes   52                 0.8689    0.0509         0.0008  5.3167
RBF-TEST Set 1  16 input nodes   52                 0.7816    0.0217         0.0003  2.4099
RBF-TEST Set 2  16 input nodes   52                 0.6302    0.0186         0.0003  2.0171
RBF-TEST Set 3  16 input nodes   52                 0.7702    0.0259         0.0004  2.8813
RBF-TEST Set 4  16 input nodes   52                 0.5478    0.0196         0.0003  2.2808
RBF-TEST Set 5  16 input nodes   52                 0.7538    0.0312         0.0005  3.7190
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.20(b): Results of RBF networks for Sg. Slim catchment using 50% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    16 input nodes   52                 0.9479    0.0229         0.0003  2.1907
RBF-TEST Set 1  16 input nodes   52                 0.8073    0.0246         0.0004  2.6206
RBF-TEST Set 2  16 input nodes   52                 0.5945    0.0176         0.0003  1.9221
RBF-TEST Set 3  16 input nodes   52                 0.7911    0.0315         0.0005  3.4534
RBF-TEST Set 4  16 input nodes   52                 0.4092    0.0183         0.0003  2.2090
RBF-TEST Set 5  16 input nodes   52                 0.8046    0.0332         0.0005  3.9468
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.20(c): Results of RBF networks for Sg. Slim catchment using 25% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    16 input nodes   52                 0.9521    0.0204         0.0003  1.6890
RBF-TEST Set 1  16 input nodes   52                 0.7915    0.0264         0.0004  2.8865
RBF-TEST Set 2  16 input nodes   52                 0.5654    0.0190         0.0003  2.1240
RBF-TEST Set 3  16 input nodes   52                 0.8168    0.0332         0.0005  3.8255
RBF-TEST Set 4  16 input nodes   52                 0.3463    0.0192         0.0003  2.2851
RBF-TEST Set 5  16 input nodes   52                 0.8052    0.0334         0.0005  4.0113
cumecs: cubic metres per second; COC: correlation coefficient
Part B: Hourly results
Table B4.22(a): Results of RBF networks for Sg. Ketil catchment using 20% of available data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    6 input nodes    22                 0.9970    0.0249         0.0008  0.1749
RBF-TEST Set 1  6 input nodes    22                 0.9987    0.0112         0.0004  0.1064
RBF-TEST Set 2  6 input nodes    22                 0.9982    0.0150         0.0005  0.1608
RBF-TEST Set 3  6 input nodes    22                 0.9965    0.0215         0.0007  0.3010
RBF-TEST Set 4  6 input nodes    22                 0.9972    0.0210         0.0007  0.2574
RBF-TEST Set 5  6 input nodes    22                 0.9968    0.0387         0.0013  0.3535
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.22(b): Results of RBF networks for Sg. Ketil catchment using minimum data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    6 input nodes    22                 0.9921    0.0181         0.0006  0.8890
RBF-TEST Set 1  6 input nodes    22                 0.9999    0.0032         0.0001  0.0226
RBF-TEST Set 2  6 input nodes    22                 0.9998    0.0051         0.0002  0.0425
RBF-TEST Set 3  6 input nodes    22                 0.9975    0.0182         0.0006  0.1678
RBF-TEST Set 4  6 input nodes    22                 0.9992    0.0114         0.0004  0.1159
RBF-TEST Set 5  6 input nodes    22                 0.9997    0.0117         0.0004  0.1094
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.23(a): Results of RBF networks for Sg. Klang catchment using 40% of available data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    8 input nodes    28                 0.9999    0.2077         0.0172  0.6618
RBF-TEST Set 1  8 input nodes    28                 1.000     0.0005         0.000   0.0004
RBF-TEST Set 2  8 input nodes    28                 1.000     0.0005         0.0006  0.0007
RBF-TEST Set 3  8 input nodes    28                 1.000     0.0221         0.0015  0.1698
RBF-TEST Set 4  8 input nodes    28                 1.000     0.0004         0.0004  0.0006
RBF-TEST Set 5  8 input nodes    28                 1.000     0.0002         0.0002  0.0008
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.23(b): Results of RBF networks for Sg. Klang catchment using minimum data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    8 input nodes    28                 1.000     0.0003         0.0002  0.0002
RBF-TEST Set 1  8 input nodes    28                 1.000     0.0004         0.0001  0.0004
RBF-TEST Set 2  8 input nodes    28                 1.000     0.0007         0.0002  0.0007
RBF-TEST Set 3  8 input nodes    28                 1.000     0.0006         0.0003  0.0002
RBF-TEST Set 4  8 input nodes    28                 1.000     0.0004         0.0001  0.0009
RBF-TEST Set 5  8 input nodes    28                 1.000     0.0005         0.0002  0.0004
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.24(a): Results of RBF networks for Sg. Slim catchment using 30% of available data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    7 input nodes    25                 0.9938    0.0309         0.0013  0.3309
RBF-TEST Set 1  7 input nodes    25                 0.9814    0.0231         0.0009  0.3906
RBF-TEST Set 2  7 input nodes    25                 0.9972    0.0176         0.0007  0.2884
RBF-TEST Set 3  7 input nodes    25                 0.9981    0.0234         0.0009  0.2536
RBF-TEST Set 4  7 input nodes    25                 0.9995    0.0110         0.0004  0.2091
RBF-TEST Set 5  7 input nodes    25                 0.9974    0.0177         0.0007  0.3210
cumecs: cubic metres per second; COC: correlation coefficient
Table B4.24(b): Results of RBF networks for Sg. Slim catchment using minimum data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
RBF TRAINING    7 input nodes    25                 0.9976    0.0171         0.0007  0.2018
RBF-TEST Set 1  7 input nodes    25                 0.9889    0.0179         0.0007  0.2455
RBF-TEST Set 2  7 input nodes    25                 0.9992    0.0094         0.0004  0.1299
RBF-TEST Set 3  7 input nodes    25                 0.9989    0.0179         0.0007  0.1604
RBF-TEST Set 4  7 input nodes    25                 0.9998    0.0066         0.0003  0.0793
RBF-TEST Set 5  7 input nodes    25                 0.9994    0.0083         0.0003  0.1339
cumecs: cubic metres per second; COC: correlation coefficient
APPENDIX C
Results of application of MLR model
Table C4.26(a): Results of MLR model for Sg. Ketil catchment using 100% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLR TRAINING    16 input         6                  0.6016    19.5476        0.6562  59.1316
MLR-TEST Set 1  16 input         6                  0.6783    19.9696        0.6719  60.4182
MLR-TEST Set 2  16 input         6                  0.7681    20.4595        0.6871  63.1152
MLR-TEST Set 3  16 input         6                  0.6936    19.9418        0.6725  59.3665
MLR-TEST Set 4  16 input         6                  0.6742    16.0532        0.5334  45.4869
MLR-TEST Set 5  16 input         6                  0.7686    21.4246        0.7204  63.6091
cumecs: cubic metres per second; COC: correlation coefficient
Table C4.26(b): Results of MLR model for Sg. Ketil catchment using 50% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLR TRAINING    16 input         6                  0.6207    18.8502        0.6305  55.5948
MLR-TEST Set 1  16 input         6                  0.6682    19.6808        0.6622  59.3300
MLR-TEST Set 2  16 input         6                  0.7376    20.2302        0.6794  62.6161
MLR-TEST Set 3  16 input         6                  0.7012    19.6150        0.6616  57.7664
MLR-TEST Set 4  16 input         6                  0.6411    15.7738        0.5242  45.1279
MLR-TEST Set 5  16 input         6                  0.7783    21.2494        0.7144  62.7848
cumecs: cubic metres per second; COC: correlation coefficient
Table C4.26(c): Results of MLR model for Sg. Ketil catchment using 25% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLR TRAINING    14 input         6                  0.6737    18.4978        0.6204  53.3926
MLR-TEST Set 1  14 input         6                  0.6045    19.9485        0.6703  60.1075
MLR-TEST Set 2  14 input         6                  0.6970    20.6432        0.6922  64.3279
MLR-TEST Set 3  14 input         6                  0.6297    19.8242        0.6677  58.2319
MLR-TEST Set 4  14 input         6                  0.5933    17.3075        0.5732  50.4693
MLR-TEST Set 5  14 input         6                  0.6932    21.7474        0.7295  64.9861
cumecs: cubic metres per second; COC: correlation coefficient
Table C4.27(a): Results of MLR model for Sg. Klang catchment using 100% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLR TRAINING    14 input         6                  0.6223    9.01273        0.5319  44.6148
MLR-TEST Set 1  14 input         6                  0.6561    10.4614        0.5480  46.4188
MLR-TEST Set 2  14 input         6                  0.7899    11.6684        0.5791  50.0855
MLR-TEST Set 3  14 input         6                  0.6478    9.4858         0.5104  42.2342
MLR-TEST Set 4  14 input         6                  0.8068    13.0131        0.4795  42.1463
MLR-TEST Set 5  14 input         6                  0.6886    9.0047         0.4984  38.5141
cumecs: cubic metres per second; COC: correlation coefficient
Table C4.27(b): Results of MLR model for Sg. Klang catchment using 50% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLR TRAINING    14 input         6                  0.7000    7.5302         0.5251  42.0505
MLR-TEST Set 1  14 input         6                  0.6526    10.5988        0.5496  46.1864
MLR-TEST Set 2  14 input         6                  0.7831    11.5517        0.5719  49.6184
MLR-TEST Set 3  14 input         6                  0.6092    9.9586         0.5281  42.5334
MLR-TEST Set 4  14 input         6                  0.8106    12.6366        0.4618  40.7592
MLR-TEST Set 5  14 input         6                  0.6732    9.3088         0.5379  39.3836
cumecs: cubic metres per second; COC: correlation coefficient
Table C4.27(c): Results of MLR model for Sg. Klang catchment using 25% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLR TRAINING    14 input         6                  0.7460    7.4099         0.5079  38.2439
MLR-TEST Set 1  14 input         6                  0.6565    10.2078        0.5453  44.4387
MLR-TEST Set 2  14 input         6                  0.7852    10.7336        0.5473  46.2175
MLR-TEST Set 3  14 input         6                  0.6175    10.0458        0.5480  42.6807
MLR-TEST Set 4  14 input         6                  0.8071    11.5182        0.4312  36.7242
MLR-TEST Set 5  14 input         6                  0.6581    10.5492        0.6039  42.7946
cumecs: cubic metres per second; COC: correlation coefficient
Table C4.28(a): Results of MLR model for Sg. Slim catchment using 100% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLR TRAINING    16 input         6                  0.5849    127.1532       1.9319  13.2305
MLR-TEST Set 1  16 input         6                  0.8808    201.4210       3.0635  23.0220
MLR-TEST Set 2  16 input         6                  0.5501    136.4133       2.0759  16.0203
MLR-TEST Set 3  16 input         6                  0.8852    255.2275       3.8814  30.7050
MLR-TEST Set 4  16 input         6                  0.4277    160.5863       2.4439  19.1346
MLR-TEST Set 5  16 input         6                  0.8525    253.1943       3.8507  31.0617
cumecs: cubic metres per second; COC: correlation coefficient
Table C4.28(b): Results of MLR model for Sg. Slim catchment using 50% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLR TRAINING    16 input         6                  0.7849    144.1439       2.1923  16.0407
MLR-TEST Set 1  16 input         6                  0.8140    217.8754       3.3139  25.0923
MLR-TEST Set 2  16 input         6                  0.5289    149.0955       2.2689  17.7358
MLR-TEST Set 3  16 input         6                  0.8131    275.4002       4.1882  33.1587
MLR-TEST Set 4  16 input         6                  0.4527    174.0586       2.6489  21.0313
MLR-TEST Set 5  16 input         6                  0.7794    272.6427       4.1467  33.7076
cumecs: cubic metres per second; COC: correlation coefficient
Table C4.28(c): Results of MLR model for Sg. Slim catchment using 25% of data sets in training phase
Data set        Model structure  No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
MLR TRAINING    15 input         6                  0.7536    141.3654       2.1502  15.0923
MLR-TEST Set 1  15 input         6                  0.7805    208.8311       3.1763  23.6218
MLR-TEST Set 2  15 input         6                  0.5076    143.1880       2.1790  16.6663
MLR-TEST Set 3  15 input         6                  0.7807    263.7675       4.0113  31.1748
MLR-TEST Set 4  15 input         6                  0.4346    165.1244       2.5130  19.5123
MLR-TEST Set 5  15 input         6                  0.7500    262.0962       3.9862  31.8572
cumecs: cubic metres per second; COC: correlation coefficient
APPENDIX D
Daily and hourly results of the HEC-HMS model calibration
Part A: Daily results
Table D4.30(a): Calibration Coefficients of Sungai Ketil catchment (using 100% of data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       10
Imperviousness (%)          54
SCS Lag (minutes)           9985.16
Recession Constant          1
Threshold Flow (cumecs)     0.99
Table D4.30(b): Calibration Coefficients of Sungai Ketil catchment (using 50% data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       2
Imperviousness (%)          54
SCS Lag (minutes)           7785.20
Recession Constant          1
Threshold Flow (cumecs)     0.99
Table D4.30(c): Calibration Coefficients of Sungai Ketil catchment (using 25% data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       2
Imperviousness (%)          54
SCS Lag (minutes)           5420.80
Recession Constant          1
Threshold Flow (cumecs)     0.99
*Note: Two further parameters (catchment size and baseflow) are fixed in the model.
Table D4.31(a): Calibration Coefficients of Sungai Klang catchment (using 100% of data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       1.95
Imperviousness (%)          78
SCS Lag (minutes)           2842.12
Recession Constant          1
Threshold Flow (cumecs)     0.998
Table D4.31(b): Calibration Coefficients of Sungai Klang catchment (using 50% data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       2.12
Imperviousness (%)          78
SCS Lag (minutes)           1807.50
Recession Constant          1
Threshold Flow (cumecs)     0.998
Table D4.31(c): Calibration Coefficients of Sungai Klang catchment (using 25% data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       1.7
Imperviousness (%)          78
SCS Lag (minutes)           1500.00
Recession Constant          1
Threshold Flow (cumecs)     0.998
*Note: Two further parameters (catchment size and baseflow) are fixed in the model.
Table D4.32(a): Calibration Coefficients of Sungai Slim catchment (using 100% of data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       30
Imperviousness (%)          35
SCS Lag (minutes)           16362.26
Recession Constant          1
Threshold Flow (cumecs)     0.98
Table D4.32(b): Calibration Coefficients of Sungai Slim catchment (using 50% data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       20
Imperviousness (%)          35
SCS Lag (minutes)           11445.30
Recession Constant          1
Threshold Flow (cumecs)     0.98
Table D4.32(c): Calibration Coefficients of Sungai Slim catchment (using 25% data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       20
Imperviousness (%)          35
SCS Lag (minutes)           7995.95
Recession Constant          1
Threshold Flow (cumecs)     0.98
*Note: Two further parameters (catchment size and baseflow) are fixed in the model.
Part B: Hourly results
Table D4.33(a): Calibration Coefficients of Sungai Ketil catchment (using 25% of data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       34
Imperviousness (%)          52
SCS Lag (minutes)           2261.5
Recession Constant          1
Threshold Flow (cumecs)     0.992
Table D4.33(b): Calibration Coefficients of Sungai Ketil catchment (using minimum data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       34
Imperviousness (%)          56
SCS Lag (minutes)           2093.5515
Recession Constant          1
Threshold Flow (cumecs)     0.992
Table D4.34(a): Calibration Coefficients of Sungai Klang catchment (using 25% of data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       82
Imperviousness (%)          82
Time of Concentration (hr)  5
Storage Coefficient (hr)    17
Recession Constant          1
Threshold Flow (cumecs)     0.995
*Note: Two further parameters (catchment size and baseflow) are fixed in the model.
Table D4.34(b): Calibration Coefficients of Sungai Klang catchment (using minimum data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       81
Imperviousness (%)          82
Time of Concentration (hr)  7
Storage Coefficient (hr)    10
Recession Constant          1
Threshold Flow (cumecs)     0.995
Table D4.35(a): Calibration Coefficients of Sungai Slim catchment (using 25% of data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       55
Imperviousness (%)          38
Time of Concentration (hr)  35
Storage Coefficient (hr)    70
Recession Constant          1
Threshold Flow (cumecs)     0.985
Table D4.35(b): Calibration Coefficients of Sungai Slim catchment (using minimum data)
Model parameter             Calibrated value
Constant Rate (mm/hr)       40
Imperviousness (%)          38
Time of Concentration (hr)  22.6
Storage Coefficient (hr)    55
Recession Constant          1
Threshold Flow (cumecs)     0.985
*Note: Two further parameters (catchment size and baseflow) are fixed in the model.
APPENDIX E
Daily and hourly results of application of HEC-HMS model
Part A: Daily Results
Table E4.38(a): Results of HEC-HMS Model for Sg. Ketil catchment using 100% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    7                  0.4390    0.3963         0.0131  69.6432
HEC-TEST Set 1  7                  0.5513    0.4298         0.0142  29.2748
HEC-TEST Set 2  7                  0.5776    0.4479         0.0148  24.7868
HEC-TEST Set 3  7                  0.5277    0.4222         0.0140  31.7811
HEC-TEST Set 4  7                  0.4448    0.6328         0.0209  28.3676
HEC-TEST Set 5  7                  0.6474    0.4462         0.0146  36.8108
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.38(b): Results of HEC-HMS Model for Sg. Ketil catchment using 50% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    7                  0.3455    0.4099         0.0134  96.9587
HEC-TEST Set 1  7                  0.3141    0.4791         0.0158  67.3545
HEC-TEST Set 2  7                  0.2948    0.5070         0.0169  58.7408
HEC-TEST Set 3  7                  0.3306    0.4645         0.0152  40.2240
HEC-TEST Set 4  7                  0.2268    0.7075         0.0235  55.9711
HEC-TEST Set 5  7                  0.3425    0.5208         0.0166  89.7145
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.38(c): Results of HEC-HMS Model for Sg. Ketil catchment using 25% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    7                  0.5696    0.4309         0.0142  30.2807
HEC-TEST Set 1  7                  0.5509    0.5112         0.0169  21.6564
HEC-TEST Set 2  7                  0.6148    0.5168         0.0169  20.8150
HEC-TEST Set 3  7                  0.4700    0.5174         0.0172  62.6365
HEC-TEST Set 4  7                  0.5304    0.7440         0.0243  37.9182
HEC-TEST Set 5  7                  0.4938    0.6069         0.0202  44.7081
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.39(a): Results of HEC-HMS Model for Sg. Klang catchment using 100% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    7                  0.1095    13.6513        0.5740  44.2057
HEC-TEST Set 1  7                  0.2095    16.9564        0.5803  47.8109
HEC-TEST Set 2  7                  0.2628    17.3662        0.5397  50.1142
HEC-TEST Set 3  7                  0.2723    17.3328        0.6410  48.4294
HEC-TEST Set 4  7                  0.2483    21.7856        0.6089  57.1594
HEC-TEST Set 5  7                  0.3109    21.2287        0.7834  53.2633
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.39(b): Results of HEC-HMS Model for Sg. Klang catchment using 50% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    7                  0.2559    12.7938        0.5346  44.0716
HEC-TEST Set 1  7                  0.2479    17.8388        0.5997  53.1483
HEC-TEST Set 2  7                  0.2903    18.5725        0.6022  57.6016
HEC-TEST Set 3  7                  0.3090    17.9154        0.6194  51.2026
HEC-TEST Set 4  7                  0.2888    23.0966        0.6649  63.7140
HEC-TEST Set 5  7                  0.4383    20.4854        0.6630  49.0098
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.39(c): Results of HEC-HMS Model for Sg. Klang catchment using 25% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    7                  0.3418    16.5412        0.6718  50.7731
HEC-TEST Set 1  7                  0.2855    20.6104        0.6797  56.0819
HEC-TEST Set 2  7                  0.3238    18.4099        0.6095  56.4563
HEC-TEST Set 3  7                  0.3383    23.4747        0.7681  58.5394
HEC-TEST Set 4  7                  0.3253    23.2594        0.6818  63.1373
HEC-TEST Set 5  7                  0.4727    28.8749        0.8762  61.2457
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.40(a): Results of HEC-HMS Model for Sg. Slim catchment using 100% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    7                  0.1560    0.0962         0.0015  103.6129
HEC-TEST Set 1  7                  0.2736    0.0928         0.0014  110.6630
HEC-TEST Set 2  7                  0.1002    0.0689         0.0010  87.0855
HEC-TEST Set 3  7                  0.2636    0.1125         0.0017  137.4980
HEC-TEST Set 4  7                  0.1321    0.0696         0.0011  87.4498
HEC-TEST Set 5  7                  0.1812    0.1087         0.0016  139.3122
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.40(b): Results of HEC-HMS Model for Sg. Slim catchment using 50% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    7                  0.4214    0.0709         0.0011  80.4609
HEC-TEST Set 1  7                  0.3030    0.1085         0.0016  126.0659
HEC-TEST Set 2  7                  0.1775    0.0839         0.0013  102.1349
HEC-TEST Set 3  7                  0.3001    0.1336         0.0020  160.5670
HEC-TEST Set 4  7                  0.1876    0.0834         0.0013  97.0609
HEC-TEST Set 5  7                  0.2768    0.1287         0.0020  159.2668
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.40(c): Results of HEC-HMS Model for Sg. Slim catchment using 25% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    7                  0.3832    0.0816         0.0012  89.0290
HEC-TEST Set 1  7                  0.2962    0.1172         0.0018  135.4532
HEC-TEST Set 2  7                  0.1818    0.0933         0.0014  114.7980
HEC-TEST Set 3  7                  0.2955    0.1440         0.0022  169.9900
HEC-TEST Set 4  7                  0.1753    0.0939         0.0014  109.3430
HEC-TEST Set 5  7                  0.2785    0.1410         0.0021  170.1134
cumecs: cubic metres per second; COC: correlation coefficient
Part B: Hourly Results
Table E4.42(a): Results of HEC-HMS Model for Sg. Ketil catchment using 25% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    7                  0.1001    0.9106         0.0300  26.6170
HEC-TEST Set 1  7                  0.4194    0.2831         0.0090  60.5143
HEC-TEST Set 2  7                  0.2104    0.2147         0.0122  62.5150
HEC-TEST Set 3  7                  0.6183    0.1742         0.0135  49.8880
HEC-TEST Set 4  7                  0.3647    0.3924         0.0129  113.1124
HEC-TEST Set 5  7                  0.5767    0.2959         0.0242  77.1620
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.42(b): Results of HEC-HMS Model for Sg. Ketil catchment using minimum data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    7                  0.6105    0.1276         0.0042  30.1550
HEC-TEST Set 1  7                  0.3376    0.2914         0.0093  63.1680
HEC-TEST Set 2  7                  0.1519    0.2121         0.0124  61.8832
HEC-TEST Set 3  7                  0.5820    0.1767         0.0135  30.3545
HEC-TEST Set 4  7                  0.2874    0.3940         0.0132  112.4270
HEC-TEST Set 5  7                  0.5086    0.2997         0.0242  79.4885
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.43(a): Results of HEC-HMS Model for Sg. Klang catchment using 40% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    8                  0.0914    35.2076        2.1584  91.8729
HEC-TEST Set 1  8                  0.4036    33.8449        1.3703  50.8431
HEC-TEST Set 2  8                  0.8496    6.8700         0.1902  13.9516
HEC-TEST Set 3  8                  0.1021    23.9458        0.8421  62.6154
HEC-TEST Set 4  8                  0.2208    35.4623        0.6088  32.6375
HEC-TEST Set 5  8                  0.4024    14.5522        0.2324  40.3196
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.43(b): Results of HEC-HMS Model for Sg. Klang catchment using minimum data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    8                  0.1830    21.5490        0.8719  88.7204
HEC-TEST Set 1  8                  0.5340    20.0460        0.3592  29.6470
HEC-TEST Set 2  8                  0.8865    12.4918        0.2156  13.9121
HEC-TEST Set 3  8                  0.2895    18.2621        0.9586  56.5022
HEC-TEST Set 4  8                  0.3169    34.0235        0.6801  42.1594
HEC-TEST Set 5  8                  0.2542    16.0454        0.2305  97.0262
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.44(a): Results of HEC-HMS Model for Sg. Slim catchment using 30% of data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    8                  0.2128    0.4141         0.0166  12.8902
HEC-TEST Set 1  8                  0.3872    0.3399         0.0088  10.7243
HEC-TEST Set 2  8                  0.1097    0.2178         0.0173  70.3790
HEC-TEST Set 3  8                  0.0070    0.4703         0.0262  157.2870
HEC-TEST Set 4  8                  0.1677    0.2545         0.0236  84.0278
HEC-TEST Set 5  8                  0.1443    0.3269         0.0131  98.5790
cumecs: cubic metres per second; COC: correlation coefficient
Table E4.44(b): Results of HEC-HMS Model for Sg. Slim catchment using minimum data sets in training phase
Data set        No. of parameters  COC (R²)  RMSE (cumecs)  RRMSE   MAPE (%)
HEC TRAINING    8                  0.5154    0.3566         0.0145  23.1082
HEC-TEST Set 1  8                  0.3450    0.4218         0.0110  114.3003
HEC-TEST Set 2  8                  0.1347    0.3156         0.0227  95.2197
HEC-TEST Set 3  8                  0.0285    0.5998         0.0285  120.5626
HEC-TEST Set 4  8                  0.4791    0.4081         0.0216  30.2660
HEC-TEST Set 5  8                  0.4922    0.3948         0.0100  32.3102
cumecs: cubic metres per second; COC: correlation coefficient
APPENDIX F
Daily and hourly results of the SWMM model calibration
Part A: Daily results
Table F4.46(a): Calibration Coefficients of Sungai Ketil catchment (using 100% of data)
Model parameter             Calibrated value
Imperviousness (%)          45
Time of Concentration (hr)  66.42
Decay rate of infiltration  0.0025
Table F4.46(b): Calibration Coefficients of Sungai Ketil catchment (using 50% data)
Model parameter             Calibrated value
Imperviousness (%)          45
Time of Concentration (hr)  29.75
Decay rate of infiltration  0.002
Table F4.46(c): Calibration Coefficients of Sungai Ketil catchment (using 25% data)
Model parameter             Calibrated value
Imperviousness (%)          45
Time of Concentration (hr)  30.35
Decay rate of infiltration  0.002
Table F4.47(a): Calibration Coefficients of Sungai Klang catchment (using 100% of data)
Model parameter               Calibrated value
Imperviousness (%)            78
Pervious Area CN              30
Time of Concentration (hr)    7.37
Initial Abstraction           0.14
Decay rate of infiltration    0.001

Table F4.47(b): Calibration Coefficients of Sungai Klang catchment (using 50% data)
Model parameter               Calibrated value
Imperviousness (%)            75
Pervious Area CN              30
Time of Concentration (hr)    3.13
Initial Abstraction           0.14
Decay rate of infiltration    0.001

Table F4.47(c): Calibration Coefficients of Sungai Klang catchment (using 25% data)
Model parameter               Calibrated value
Imperviousness (%)            75
Pervious Area CN              30
Time of Concentration (hr)    2.5
Initial Abstraction           0.12
Decay rate of infiltration    0.001
Table F4.48(a): Calibration Coefficients of Sungai Slim catchment (using 100% of data)
Model parameter               Calibrated value
Imperviousness (%)            35
Pervious Area CN              95
Time of Concentration (hr)    72.7
Initial Abstraction           0.16
Decay rate of infiltration    0.0013

Table F4.48(b): Calibration Coefficients of Sungai Slim catchment (using 50% data)
Model parameter               Calibrated value
Imperviousness (%)            35
Pervious Area CN              95
Time of Concentration (hr)    90.76
Initial Abstraction           0.15
Decay rate of infiltration    0.0013

Table F4.48(c): Calibration Coefficients of Sungai Slim catchment (using 25% data)
Model parameter               Calibrated value
Imperviousness (%)            35
Pervious Area CN              95
Time of Concentration (hr)    33.26
Initial Abstraction           0.15
Decay rate of infiltration    0.0012
Part B: Hourly results
Table F4.50(a): Calibration Coefficients of Sungai Ketil catchment (using 25% of data)
Model parameter               Calibrated value
Imperviousness (%)            46
Pervious Area CN              84
Time of Concentration (hr)    37.69
Decay rate of infiltration    0.00125

Table F4.50(b): Calibration Coefficients of Sungai Ketil catchment (using minimum data)
Model parameter               Calibrated value
Imperviousness (%)            46
Pervious Area CN              84
Time of Concentration (hr)    34.89
Decay rate of infiltration    0.00125

Table F4.51(a): Calibration Coefficients of Sungai Klang catchment (using 25% of data)
Model parameter               Calibrated value
Imperviousness (%)            82
Pervious Area CN              98
Time of Concentration (hr)    5
Decay rate of infiltration    0.001
Table F4.51(b): Calibration Coefficients of Sungai Klang catchment (using minimum data)
Model parameter               Calibrated value
Imperviousness (%)            82
Pervious Area CN              98
Time of Concentration (hr)    6
Decay rate of infiltration    0.001

Table F4.52(a): Calibration Coefficients of Sungai Slim catchment (using 25% of data)
Model parameter               Calibrated value
Imperviousness (%)            38
Pervious Area CN              92
Time of Concentration (hr)    50
Initial Abstraction           0.15
Decay rate of infiltration    0.0012

Table F4.52(b): Calibration Coefficients of Sungai Slim catchment (using minimum data)
Model parameter               Calibrated value
Imperviousness (%)            38
Pervious Area CN              92
Time of Concentration (hr)    51.5
Initial Abstraction           0.15
Decay rate of infiltration    0.0012
APPENDIX G
Daily and hourly results of application of SWMM model
Part A: Daily Results
Table G4.54(a): Results of SWMM Model for Sg. Ketil catchment using 100% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       5                   0.5848      0.2378           0.0595    28.78643
SWMM TEST Set 1     5                   0.6421      0.3412           0.0210    29.07223
SWMM TEST Set 2     5                   0.4532      0.5060           0.0105    37.01045
SWMM TEST Set 3     5                   0.6210      0.5239           0.0102    29.79555
SWMM TEST Set 4     5                   0.4543      0.3456           0.0247    54.5238
SWMM TEST Set 5     5                   0.7432      0.3499           0.0122    28.9544
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.54(b): Results of SWMM Model for Sg. Ketil catchment using 50% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       5                   0.4023      0.4087           0.0110    98.7632
SWMM TEST Set 1     5                   0.3027      0.5010           0.0201    62.1008
SWMM TEST Set 2     5                   0.3075      0.5021           0.0154    117.651
SWMM TEST Set 3     5                   0.4033      0.4354           0.0144    91.0868
SWMM TEST Set 4     5                   0.3202      0.6678           0.0231    83.5644
SWMM TEST Set 5     5                   0.3356      0.6020           0.0187    100.5443
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.54(c): Results of SWMM Model for Sg. Ketil catchment using 25% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       5                   0.5645      0.4452           0.0188    30.3644
SWMM TEST Set 1     5                   0.5643      0.5336           0.0232    32.0901
SWMM TEST Set 2     5                   0.5874      0.6100           0.0166    30.7375
SWMM TEST Set 3     5                   0.4986      0.5121           0.0194    32.6852
SWMM TEST Set 4     5                   0.5412      0.6786           0.0343    28.8043
SWMM TEST Set 5     5                   0.5547      0.5611           0.0233    27.9174
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.55(a): Results of SWMM Model for Sg. Klang catchment using 100% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       7                   0.2102      12.7491          0.4675    42.2178
SWMM TEST Set 1     7                   0.2033      15.9874          0.5732    45.9753
SWMM TEST Set 2     7                   0.3209      17.4323          0.5258    50.6578
SWMM TEST Set 3     7                   0.3754      17.0427          0.6361    47.9233
SWMM TEST Set 4     7                   0.4211      19.6473          0.7006    55.8722
SWMM TEST Set 5     7                   0.4039      20.0037          0.6984    51.8643
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.55(b): Results of SWMM Model for Sg. Klang catchment using 50% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       7                   0.2627      11.6734          0.5022    42.7446
SWMM TEST Set 1     7                   0.2986      15.7943          0.5031    52.9911
SWMM TEST Set 2     7                   0.3002      18.2109          0.6012    57.2412
SWMM TEST Set 3     7                   0.3077      16.7822          0.5988    49.0133
SWMM TEST Set 4     7                   0.3106      20.0109          0.6808    60.5186
SWMM TEST Set 5     7                   0.4354      19.0574          0.6701    49.0076
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.55(c): Results of SWMM Model for Sg. Klang catchment using 25% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       7                   0.4020      15.4677          0.6521    49.0921
SWMM TEST Set 1     7                   0.3928      18.5466          0.6753    56.2100
SWMM TEST Set 2     7                   0.3205      18.0912          0.6974    55.4325
SWMM TEST Set 3     7                   0.3487      21.0231          0.7987    57.9862
SWMM TEST Set 4     7                   0.3243      23.7544          0.6328    64.0881
SWMM TEST Set 5     7                   0.4733      27.8754          0.9001    60.6781
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.56(a): Results of SWMM Model for Sg. Slim catchment using 100% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       7                   0.2160      0.0875           0.0015    110.3453
SWMM TEST Set 1     7                   0.2768      0.0732           0.0018    91.20311
SWMM TEST Set 2     7                   0.2069      0.0659           0.0020    80.6877
SWMM TEST Set 3     7                   0.3103      0.1204           0.0029    148.3300
SWMM TEST Set 4     7                   0.2059      0.0593           0.0010    107.5987
SWMM TEST Set 5     7                   0.1805      0.0968           0.0013    113.0632
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.56(b): Results of SWMM Model for Sg. Slim catchment using 50% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       7                   0.5402      0.0695           0.0018    30.8320
SWMM TEST Set 1     7                   0.3402      0.1203           0.0017    61.3043
SWMM TEST Set 2     7                   0.2103      0.0783           0.0019    51.0115
SWMM TEST Set 3     7                   0.2960      0.1269           0.0012    51.5927
SWMM TEST Set 4     7                   0.1975      0.0782           0.0011    80.8323
SWMM TEST Set 5     7                   0.3029      0.1302           0.0021    101.4110
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.56(c): Results of SWMM Model for Sg. Slim catchment using 25% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       7                   0.3938      0.0683           0.0010    80.7394
SWMM TEST Set 1     7                   0.3018      0.1110           0.0016    61.4202
SWMM TEST Set 2     7                   0.2019      0.0843           0.0014    91.0500
SWMM TEST Set 3     7                   0.2899      0.1200           0.0022    111.7020
SWMM TEST Set 4     7                   0.1900      0.0684           0.1106    91.0903
SWMM TEST Set 5     7                   0.3905      0.1520           0.0018    51.0997
cumecs: cubic metres per second; COC: coefficient of correlation
Part B: Hourly Results
Table G4.58(a): Results of SWMM Model for Sg. Ketil catchment using 25% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       6                   0.1867      0.8102           0.0293    82.0927
SWMM TEST Set 1     6                   0.4540      0.1403           0.102     50.4392
SWMM TEST Set 2     6                   0.3828      0.2010           0.0192    50.6442
SWMM TEST Set 3     6                   0.6685      0.1948           0.0110    30.5390
SWMM TEST Set 4     6                   0.4102      0.3868           0.0200    51.0173
SWMM TEST Set 5     6                   0.5983      0.2344           0.0194    30.6322
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.58(b): Results of SWMM Model for Sg. Ketil catchment using minimum data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       6                   0.6039      0.1109           0.0184    34.0837
SWMM TEST Set 1     6                   0.4292      0.3029           0.0103    36.2025
SWMM TEST Set 2     6                   0.2203      0.3039           0.0108    55.0239
SWMM TEST Set 3     6                   0.6054      0.1463           0.0192    35.0252
SWMM TEST Set 4     6                   0.2982      0.2929           0.0206    71.0491
SWMM TEST Set 5     6                   0.5022      0.1948           0.0215    37.8588
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.59(a): Results of SWMM Model for Sg. Klang catchment using 40% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       6                   0.1932      35.8969          2.0594    88.8810
SWMM TEST Set 1     6                   0.4029      32.7570          1.3029    50.0010
SWMM TEST Set 2     6                   0.8506      6.2733           0.0948    11.6454
SWMM TEST Set 3     6                   0.1896      21.7890          0.7574    55.1276
SWMM TEST Set 4     6                   0.2985      33.2093          0.6932    30.0553
SWMM TEST Set 5     6                   0.4838      14.0026          0.3292    17.4222
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.59(b): Results of SWMM Model for Sg. Klang catchment using minimum data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       6                   0.1803      20.2293          0.9302    112.0912
SWMM TEST Set 1     6                   0.6828      17.9695          0.2002    27.9887
SWMM TEST Set 2     6                   0.7904      12.0092          0.1095    13.0123
SWMM TEST Set 3     6                   0.3069      15.8282          0.8604    54.6571
SWMM TEST Set 4     6                   0.4082      32.9920          0.5532    38.1953
SWMM TEST Set 5     6                   0.3029      15.0290          0.2010    34.1208
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.60(a): Results of SWMM Model for Sg. Slim catchment using 30% of data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       7                   0.3059      0.3312           0.0249    51.0921
SWMM TEST Set 1     7                   0.4502      0.2090           0.0083    51.0322
SWMM TEST Set 2     7                   0.2097      0.1110           0.0129    80.5021
SWMM TEST Set 3     7                   0.0193      0.6022           0.0194    102.019
SWMM TEST Set 4     7                   0.3020      0.3002           0.0294    30.2933
SWMM TEST Set 5     7                   0.1099      0.2001           0.0122    60.4059
cumecs: cubic metres per second; COC: coefficient of correlation
Table G4.60(b): Results of SWMM Model for Sg. Slim catchment using minimum data sets in training phase
MODEL / Data Set    No. of parameters   COC (R²)    RMSE (cumecs)    RRMSE     MAPE (%)
SWMM TRAINING       7                   0.7039      0.2010           0.0105    21.0192
SWMM TEST Set 1     7                   0.3594      0.3302           0.0193    51.1030
SWMM TEST Set 2     7                   0.2019      0.2793           0.0110    70.5066
SWMM TEST Set 3     7                   0.0295      0.4029           0.0392    111.7868
SWMM TEST Set 4     7                   0.5094      0.4029           0.0102    11.0022
SWMM TEST Set 5     7                   0.5731      0.2192           0.0090    10.9403
cumecs: cubic metres per second; COC: coefficient of correlation
APPENDIX H
Daily and hourly results of calibration/training process
Part A: Daily Results
Table H4.61: The percentage bias (PBIAS) of model calibration/training of Sungai Bekok catchment
Model        PBIAS (%)
Calibration  (a) 100%    (b) 50%     (c) 25%
3-MLP        +0.072      +0.602      +0.289
4-MLP        +0.072      +0.078      +0.099
RBF          -1.596      -0.913      -0.682
MLR          +280.61     +300.04     +290.68
HEC-HMS      +8.555      +0.947      +0.629
XP-SWMM      +6.286      +1.203      +0.678
Table H4.62: The percentage bias (PBIAS) of model calibration/training of Sungai Ketil catchment
Model        PBIAS (%)
Calibration  (a) 100%    (b) 50%     (c) 25%
3-MLP        -0.206      -0.734      +0.325
4-MLP        -0.116      +0.410      +0.153
RBF          -0.098      -0.101      -0.089
MLR          -53.39      -47.62      -45.18
HEC-HMS      +0.064      +0.008      +0.008
XP-SWMM      +0.082      -0.011      -0.070
Table H4.63: The percentage bias (PBIAS) of model calibration/training of Sungai Klang catchment
Model        PBIAS (%)
Calibration  (a) 100%    (b) 50%     (c) 25%
3-MLP        +0.278      -0.099      -1.842
4-MLP        -0.075      +0.111      +3.903
RBF          -5.439      -5.515      -5.824
MLR          -27.41      -11.50      -8.55
HEC-HMS      -35.01      -38.40      -26.43
XP-SWMM      -28.93      -25.03      -17.62
Table H4.64: The percentage bias (PBIAS) of model calibration/training of Sungai Slim catchment
Model        PBIAS (%)
Calibration  (a) 100%    (b) 50%     (c) 25%
3-MLP        -0.003      +0.009      +0.036
4-MLP        +0.004      +0.001      +0.000
RBF          -0.023      -0.010      -0.007
MLR          +111.24     +147.14     +138.56
HEC-HMS      -0.019      -0.007      +0.006
XP-SWMM      +0.016      -0.008      -0.008
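For readers who wish to reproduce a percentage bias figure from their own series, a minimal Python sketch is given here. It assumes one widely used form, PBIAS = 100 x (sum of simulated minus sum of observed) / (sum of observed), under which a positive value means the simulated total exceeds the observed total. The sign convention behind the tables in this appendix is not restated here, so both the formula and the function name `pbias` should be read as assumptions, and the sample flows are hypothetical.

```python
def pbias(observed, simulated):
    """Percentage bias of simulated flows relative to observed flows.

    Assumed convention: positive when the simulated total is larger
    than the observed total. Other references use the opposite sign.
    """
    total_observed = sum(observed)
    return 100.0 * (sum(simulated) - total_observed) / total_observed


# Hypothetical flows in cumecs, for illustration only (not thesis data).
observed = [10.0, 20.0, 30.0]
simulated = [11.0, 22.0, 33.0]
print(pbias(observed, simulated))  # roughly 10 (%)
```

Because PBIAS compares totals, large positive and negative errors can cancel, which is one reason it is usually read alongside RMSE and MAPE rather than on its own.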
Part B: Hourly Results
Table H4.65: The percentage bias (PBIAS) of model calibration/training of Sungai Bekok catchment
Model        PBIAS (%)
Calibration  (a) 100%    (b) 50%     (c) 25%
3-MLP        +0.002      +0.002      +0.010
4-MLP        +0.002      +0.002      +0.167
RBF          +0.003 (opt); -0.054 (min)
HEC-HMS      +0.532; -6.490
XP-SWMM      +0.455; -5.332
Table H4.66: The percentage bias (PBIAS) of model calibration/training of Sungai Ketil catchment
Model        PBIAS (%)
Calibration  (a) 100%    (b) 50%     (c) 25%
3-MLP        +0.003      -0.000      +0.041
4-MLP        +0.002      +0.000      +0.001
RBF          +0.000 (opt); +0.007 (min)
HEC-HMS      -2.679; -0.010
XP-SWMM      -2.875; -0.011
Table H4.67: The percentage bias (PBIAS) of model calibration/training of Sungai Klang catchment
Model        PBIAS (%)
Calibration  (a) 100%    (b) 50%     (c) 25%
3-MLP        -0.004      +0.101      +0.476
4-MLP        -0.018      -0.306      +0.242
RBF          -0.000 (opt); +0.000 (min)
HEC-HMS      +57.19; +62.29
XP-SWMM      +55.01; +48.22
Table H4.68: The percentage bias (PBIAS) of model calibration/training of Sungai Slim catchment
Model        PBIAS (%)
Calibration  (a) 100%    (b) 50%     (c) 25%
3-MLP        +0.000      +0.001      +0.001
4-MLP        +0.003      -0.000      -0.004
RBF          +0.000 (opt); +0.000 (min)
HEC-HMS      -0.849; +1.055
XP-SWMM      -0.445; +0.992
APPENDIX I
Figures illustrate the daily and hourly result of ANN models
Figure I4.3(a): Daily results of 3-Layer neural networks for Sg. Ketil catchment using 100% of data sets in training phase
Figure I4.3(b): Daily results of 3-Layer neural networks for Sg. Ketil catchment using 50% of data sets in training phase
Figure I4.3(c): Daily results of 3-Layer neural networks for Sg. Ketil catchment using 25% of data sets in training phase
Figure I4.4(a): Daily results of 4-Layer neural networks for Sg. Ketil catchment using 100% of data sets in training phase
Figure I4.4(b): Daily results of 4-Layer neural networks for Sg. Ketil catchment using 50% of data sets in training phase
Figure I4.4(c): Daily results of 4-Layer neural networks for Sg. Ketil catchment using 25% of data sets in training phase
Figure I4.18(a): Daily results of RBF networks for Sg. Ketil catchment using 100% of data sets in training phase
Figure I4.18(b): Daily results of RBF networks for Sg. Ketil catchment using 50% of data sets in training phase
Figure I4.18(c): Daily results of RBF networks for Sg. Ketil catchment using 25% of data sets in training phase
Figure I4.5(a): Daily results of 3-Layer neural networks for Sg. Klang catchment using 100% of data sets in training phase
Figure I4.5(b): Daily results of 3-Layer neural networks for Sg. Klang catchment using 50% of data sets in training phase
Figure I4.5(c): Daily results of 3-Layer neural networks for Sg. Klang catchment using 25% of data sets in training phase
Figure I4.6(a): Daily results of 4-Layer neural networks for Sg. Klang catchment using 100% of data sets in training phase
Figure I4.6(b): Daily results of 4-Layer neural networks for Sg. Klang catchment using 50% of data sets in training phase
Figure I4.6(c): Daily results of 4-Layer neural networks for Sg. Klang catchment using 25% of data sets in training phase
Figure I4.19(a): Daily results of RBF networks for Sg. Klang catchment using 100% of data sets in training phase
Figure I4.19(b): Daily results of RBF networks for Sg. Klang catchment using 50% of data sets in training phase
Figure I4.19(c): Daily results of RBF networks for Sg. Klang catchment using 25% of data sets in training phase
Figure I4.7(a): Daily results of 3-Layer neural networks for Sg. Slim catchment using 100% of data sets in training phase
Figure I4.7(b): Daily results of 3-Layer neural networks for Sg. Slim catchment using 50% of data sets in training phase
Figure I4.7(c): Daily results of 3-Layer neural networks for Sg. Slim catchment using 25% of data sets in training phase
Figure I4.8(a): Daily results of 4-Layer neural networks for Sg. Slim catchment using 100% of data sets in training phase
Figure I4.8(b): Daily results of 4-Layer neural networks for Sg. Slim catchment using 50% of data sets in training phase
Figure I4.8(c): Daily results of 4-Layer neural networks for Sg. Slim catchment using 25% of data sets in training phase
Figure I4.20(a): Daily results of RBF networks for Sg. Slim catchment using 100% of data sets in training phase
Figure I4.20(b): Daily results of RBF networks for Sg. Slim catchment using 50% of data sets in training phase
Figure I4.20(c): Daily results of RBF networks for Sg. Slim catchment using 25% of data sets in training phase
Figure I4.11(a): Hourly results of 3-Layer neural networks for Sg. Ketil catchment using 100% of available data sets in training phase
Figure I4.11(b): Hourly results of 3-Layer neural networks for Sg. Ketil catchment using 65% of available data sets in training phase
Figure I4.11(c): Hourly results of 3-Layer neural networks for Sg. Ketil catchment using 25% of available data sets in training phase
Figure I4.12(a): Hourly results of 4-Layer neural networks for Sg. Ketil catchment using 100% of available data sets in training phase
Figure I4.12(b): Hourly results of 4-Layer neural networks for Sg. Ketil catchment using 65% of available data sets in training phase
Figure I4.12(c): Hourly results of 4-Layer neural networks for Sg. Ketil catchment using 25% of available data sets in training phase
Figure I4.22(a): Hourly results of RBF networks for Sg. Ketil catchment using 25% of available data sets in training phase
Figure I4.22(b): Hourly results of RBF networks for Sg. Ketil catchment using minimum of available data sets in training phase
Figure I4.13(a): Hourly results of 3-Layer neural networks for Sg. Klang catchment using 100% of available data sets in training phase
Figure I4.13(b): Hourly results of 3-Layer neural networks for Sg. Klang catchment using 65% of available data sets in training phase
Figure I4.13(c): Hourly results of 3-Layer neural networks for Sg. Klang catchment using 25% of available data sets in training phase
Figure I4.14(a): Hourly results of 4-Layer neural networks for Sg. Klang catchment using 100% of available data sets in training phase
Figure I4.14(b): Hourly results of 4-Layer neural networks for Sg. Klang catchment using 65% of available data sets in training phase
Figure I4.14(c): Hourly results of 4-Layer neural networks for Sg. Klang catchment using 25% of available data sets in training phase
Figure I4.23(a): Hourly results of RBF networks for Sg. Klang catchment using 25% of available data sets in training phase
Figure I4.23(b): Hourly results of RBF networks for Sg. Klang catchment using minimum of available data sets in training phase
Figure I4.15(a): Hourly results of 3-Layer neural networks for Sg. Slim catchment using 100% of available data sets in training phase
Figure I4.15(b): Hourly results of 3-Layer neural networks for Sg. Slim catchment using 65% of available data sets in training phase
Figure I4.15(c): Hourly results of 3-Layer neural networks for Sg. Slim catchment using 25% of available data sets in training phase
Figure I4.16(a): Hourly results of 4-Layer neural networks for Sg. Slim catchment using 100% of available data sets in training phase
Figure I4.16(b): Hourly results of 4-Layer neural networks for Sg. Slim catchment using 65% of available data sets in training phase
Figure I4.16(c): Hourly results of 4-Layer neural networks for Sg. Slim catchment using 25% of available data sets in training phase
Figure I4.24(a): Hourly results of RBF networks for Sg. Slim catchment using 25% of available data sets in training phase
Figure I4.24(b): Hourly results of RBF networks for Sg. Slim catchment using minimum of available data sets in training phase
APPENDIX J
Architecture of daily and hourly MLP network structures
Figure J4.26(a): The 3-layer MLP network structures of the daily model for Sg. Ketil catchment.
Figure J4.26(b): The 4-layer MLP network structures of the daily model for Sg. Ketil catchment.
Figure J4.27(a): The 3-layer MLP network structures of the daily model for Sg. Klang catchment.
Figure J4.27(b): The 4-layer MLP network structures of the daily model for Sg. Klang catchment.
Figure J4.28(a): The 3-layer MLP network structures of the daily model for Sg. Slim catchment.
Figure J4.28(b): The 4-layer MLP network structures of the daily model for Sg. Slim catchment.
Figure J4.30(a): The 3-layer MLP network structures of the hourly model for Sg. Ketil catchment.
Figure J4.30(b): The 4-layer MLP network structures of the hourly model for Sg. Ketil catchment.
Figure J4.31(a): The 3-layer MLP network structures of the hourly model for Sg. Klang catchment.
Figure J4.31(b): The 4-layer MLP network structures of the hourly model for Sg. Klang catchment.
Figure J4.32(a): The 3-layer MLP network structures of the hourly model for Sg. Slim catchment.
Figure J4.32(b): The 4-layer MLP network structures of the hourly model for Sg. Slim catchment.