RAINFALL-RUNOFF MODELLING USING ARTIFICIAL NEURAL NETWORK METHOD

NOR IRWAN BIN AHMAT NOR

A thesis submitted in fulfilment of the requirements for the award of the degree of Doctor of Philosophy

Faculty of Civil Engineering
Universiti Teknologi Malaysia

AUGUST 2005

PSZ 19:16 (Pind. 1/97)

UNIVERSITI TEKNOLOGI MALAYSIA
THESIS STATUS DECLARATION FORM (BORANG PENGESAHAN STATUS TESIS)†

TITLE: RAINFALL-RUNOFF MODELLING USING ARTIFICIAL NEURAL NETWORK METHOD
ACADEMIC SESSION: 2001/02

I, NOR IRWAN BIN AHMAT NOR, agree to allow this thesis (Bachelor's project/Master's/Doctor of Philosophy)* to be kept at the Library of Universiti Teknologi Malaysia under the following conditions of use:
1. The thesis is the property of Universiti Teknologi Malaysia.
2. The Library of Universiti Teknologi Malaysia is permitted to make copies for study purposes only.
3. The Library is permitted to make copies of this thesis as exchange material between institutions of higher learning.
4. **Please tick (√):
   SULIT (confidential: contains information of security classification or of importance to Malaysia as stipulated in the OFFICIAL SECRETS ACT 1972)
   TERHAD (restricted: contains RESTRICTED information as determined by the organisation/body where the research was conducted)
   TIDAK TERHAD (unrestricted)

Certified by:
[ Signed ] (AUTHOR'S SIGNATURE)
[ Signed ] (SUPERVISOR'S SIGNATURE)
Permanent address: NO. 46, LOT 11545, TAMAN SRI PERKASA, 93050 JALAN MATANG, KUCHING, SARAWAK
Supervisor's name: PROF. MADYA DR. SOBRI BIN HARUN
Date: 19 AUGUST 2005 (author); 19 AUGUST 2005 (supervisor)

NOTES:
* Delete as appropriate.
** If this thesis is SULIT or TERHAD, please attach a letter from the relevant authority/organisation stating the reason and the period for which the thesis must be classified as SULIT or TERHAD.
† "Thesis" means a thesis for the degree of Doctor of Philosophy or Master's by research, a dissertation for study by coursework and research, or a Bachelor's Project Report (PSM).
SUPERVISOR’S DECLARATION

“We hereby declare that we have read this thesis and in our opinion this thesis is sufficient in terms of scope and quality for the award of the degree of Doctor of Philosophy”

Signature : [ Signed ]
Name of Supervisor I : ASSOC. PROF. DR. SOBRI BIN HARUN
Date : AUGUST, 2005

Signature : [ Signed ]
Name of Supervisor II : PROF. IR. DR. AMIR HASHIM BIN MOHD. KASSIM
Date : AUGUST, 2005

DECLARATION

“I declare that this thesis entitled ‘Rainfall-runoff modelling using artificial neural network method’ is the result of my research except as cited in references. The thesis has not been accepted for any degree and is not concurrently submitted in candidature of any degree”

Signature : [ Signed ]
Name of Candidate : NOR IRWAN BIN AHMAT NOR
Date : AUGUST, 2005

DEDICATION

“And that man shall have nothing but what he has striven for” (Al-Najm: 39)

I pay my most humble gratitude to Allah Subhanahuwataala for blessing me with good health and spirit to undertake and complete this study.

To my beloved mother and father

ACKNOWLEDGEMENT

I would like to express my sincerest and deepest appreciation and thanks to my supervisors, Assoc. Prof. Dr. Sobri Bin Harun (UTM) and Prof. Ir. Dr. Amir Hashim Bin Mohd. Kassim (KUiTTHO), for their guidance and kind encouragement throughout this research. I extend my deep gratitude to the authorities of the Universiti Teknologi Malaysia, Skudai, Johor Darul Takzim. I would also like to express my gratitude and sincere thanks to the Ministry of Science, Technology and the Environment, which provided financial support during my study at the Universiti Teknologi Malaysia. I would like to thank the office staff of Sekolah Pengajian Siswazah (SPS) and the Graduate Studies Committee, Faculty of Civil Engineering, for their support and their good management of the students.
My thanks also go to the office staff of the Hydrology Division, Department of Irrigation and Drainage (DID) Malaysia, for providing the data for my study and for their good advice, and to the colleagues and friends who have given me invaluable assistance throughout my research work. Most important of all, I am deeply indebted to my parents for providing me with the peace of mind to pursue knowledge while being close at hand to render love, comfort, and support. My family has been the source of my perseverance with the research at times when all seemed lost.

ABSTRACT

Rainfall and surface runoff are the driving forces behind all stormwater studies and designs. The relationship between them is known to be highly non-linear and complex, and it depends on numerous factors. To overcome the problems of non-linearity and lack of information in rainfall-runoff modelling, this study introduces the Artificial Neural Network (ANN) approach to model the dynamics of rainfall-runoff processes. The ANN method behaves as a black-box model and has proven capable of handling non-linear processes in complex systems. Numerous ANN model structures were designed to determine the relationship between daily and hourly rainfall and the corresponding runoff. The desired runoff can therefore be predicted from rainfall data, based on the relationship established by the ANN training computation. The ANN architecture is simple, and it considers only rainfall and runoff data as variables. The internal processes that control the transformation of rainfall to runoff are translated into ANN weights. Once the architecture of the network is defined, the weights are calculated so as to represent the desired output through a learning process in which the ANN is trained to obtain the expected results. Two types of ANN architecture are recommended, namely the multilayer perceptron (MLP) and radial basis function (RBF) networks.
Four catchments, namely Sungai Bekok, Sungai Ketil, Sungai Klang and Sungai Slim, were selected to test the methodology. Model performance was evaluated by comparison with the actual observed flow series. The ANN results were further compared against the results produced by HEC-HMS, XP-SWMM and multiple linear regression (MLR). The ANN was found to predict runoff accurately, with good correlation between the observed and predicted values compared with the MLR, XP-SWMM and HEC-HMS models. The application of the ANN to modelling the daily and hourly streamflow hydrograph was successful.

ABSTRAK (Malay-language abstract, in translation)

Rainfall and surface runoff are the driving forces behind all stormwater studies and designs. It is well known that the relationship between the two is non-linear and complex and depends on many factors. To address the problems caused by the lack of information and by the non-linearity of the rainfall-runoff relationship, this study introduces the artificial neural network (ANN) method to model the dynamic process of that relationship. The ANN method has the character of a ‘black box’ model and has been shown to cope with non-linear processes in this complex system. Various ANN model structures were designed to obtain the daily and hourly relationships between rainfall and runoff. The actual runoff can thus be predicted from rainfall data, based on the relationship identified through the ANN training process. The ANN architecture is simple because it takes only rainfall and runoff data as variables. The internal processes that control the transformation of rainfall to runoff are translated into the ANN weights. Once the network architecture has been identified and the weights determined, the network can reproduce the actual output through a learning process in which the ANN is trained to obtain the expected results. Two types of ANN architecture were proposed: the multilayer perceptron (MLP) and the radial basis function (RBF) network. Several catchments, namely Sungai Bekok, Sungai Ketil, Sungai Klang and Sungai Slim, were selected to test the methodology. Model performance was assessed by comparison with the actual observed flow series. The ANN results were then compared with results obtained from HEC-HMS, SWMM and multiple linear regression (MLR). It was found that the ANN can predict runoff accurately, with good correlation between observed and predicted values, compared with the MLR, XP-SWMM and HEC-HMS models. The application of the ANN to modelling streamflow hydrographs at daily and hourly time intervals was carried out successfully.
TABLE OF CONTENTS

DECLARATION
ACKNOWLEDGEMENTS
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
LIST OF APPENDICES

1 INTRODUCTION
  1.1 Background of Study
  1.2 Statement of the Problem
  1.3 Study Objectives
  1.4 Research Approach and Scope of Work
  1.5 Significance of the Study
  1.6 Structure of the Thesis

2 LITERATURE REVIEW
  2.1 General
  2.2 Rainfall-Runoff Process and Relationship
  2.3 Review of Hydrologic Modelling
  2.4 Rainfall-Runoff Models
  2.5 Artificial Neural Network
    2.5.1 Basic Structure
    2.5.2 Transfer Function
    2.5.3 Back-propagation Algorithm
    2.5.4 Learning or Training
  2.6 Neural Network Application
  2.7 Neural Network Modelling in Hydrology and Water Resources
    2.7.1 Versatility of Neural Network Method
  2.8 Bivariate Linear Regression and Correlation in Hydrology
    2.8.1 Fitting Regression Equations
  2.9 Review on HEC-HMS Model
  2.10 Review on XP-SWMM Model
  2.11 Summary of Literature Review

3 RESEARCH METHODOLOGY
  3.1 Introduction
  3.2 Multilayer Perceptron (MLP) Model
    3.2.1 Training of ANN
    3.2.2 Selection of Network Structures
  3.3 Radial Basis Function (RBF) Model
    3.3.1 Training RBF Networks
  3.4 Multiple Linear Regression (MLR) Model
  3.5 HEC-HMS Model
    3.5.1 Evaporation and Transpiration
    3.5.2 Computing of Runoff Volumes
    3.5.3 Modelling of Direct Runoff
  3.6 XP-SWMM Model
  3.7 Calibration of Distributed Models
  3.8 Evaluation of the Model
    3.8.1 Goodness of Fit Tests
    3.8.2 Missing Data and the Outliers
  3.9 The Study Area
    3.9.1 Selection of Training and Testing Data
    3.9.2 The Sungai Bekok Catchment
    3.9.3 The Sungai Ketil Catchment
    3.9.4 The Sungai Klang Catchment
    3.9.5 The Sungai Slim Catchment
  3.10 Computer Packages

4 RESULTS AND DISCUSSIONS
  4.1 General
  4.2 Results of the Multilayer Perceptron (MLP) Model
    4.2.1 Results of Daily MLP Model
    4.2.2 Results of Hourly MLP Model
    4.2.3 Training and Validation
    4.2.4 Testing
    4.2.5 Robustness Test
  4.3 Results of the Radial Basis Function (RBF) Model
    4.3.1 Results of Daily RBF Model
    4.3.2 Results of Hourly RBF Model
    4.3.3 Training and Validation
    4.3.4 Testing
    4.3.5 Robustness Test
  4.4 Results of the Multiple Linear Regression (MLR) Model
    4.4.1 Calibration
    4.4.2 Results of Daily MLR Model
    4.4.3 Verification
    4.4.4 Robustness Test
  4.5 Results of the HEC-HMS Model
    4.5.1 Calibration
    4.5.2 Results of Daily HEC-HMS Model
    4.5.3 Results of Hourly HEC-HMS Model
    4.5.4 Verification
    4.5.5 Robustness Test
  4.6 Results of the SWMM Model
    4.6.1 Calibration
    4.6.2 Results of Daily SWMM Model
    4.6.3 Results of Hourly SWMM Model
    4.6.4 Verification
    4.6.5 Robustness Test
  4.7 Discussions on the Rainfall-Runoff Modelling
    4.7.1 Basic Model Structure
    4.7.2 Model Performance
    4.7.3 Transfer Function and Algorithm
    4.7.4 Robustness and Model Limitation
    4.7.5 River Basin Characteristics
    4.7.6 Time Interval

5 CONCLUSIONS AND RECOMMENDATIONS
  5.1 General
  5.2 Conclusions
  5.3 Recommendations for future work

REFERENCES
Appendices A-J

LIST OF TABLES

3.1 Infiltration rates by the soil groups
3.2 Rain gauges used in calibration and verification of the models for Sg. Bekok catchment
3.3 Rain gauges used in calibration and verification of the models for Sg. Ketil catchment
3.4 Rain gauges used in calibration and verification of the models for Sg. Klang catchment
3.5 Rain gauges used in calibration and verification of the models for Sg. Slim catchment
4.1(a) Results of 3-layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
4.1(b) Results of 3-layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase
4.1(c) Results of 3-layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
4.2(a) Results of 4-layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
4.2(b) Results of 4-layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase
4.2(c) Results of 4-layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
4.9(a) Results of 3-layer neural networks for Sg. Bekok catchment using 100% of available data sets in training phase
4.9(b) Results of 3-layer neural networks for Sg. Bekok catchment using 65% of available data sets in training phase
4.9(c) Results of 3-layer neural networks for Sg. Bekok catchment using 25% of available data sets in training phase
4.10(a) Results of 4-layer neural networks for Sg. Bekok catchment using 100% of available data sets in training phase
4.10(b) Results of 4-layer neural networks for Sg. Bekok catchment using 65% of available data sets in training phase
4.10(c) Results of 4-layer neural networks for Sg. Bekok catchment using 25% of available data sets in training phase
4.17(a) Results of RBF networks for Sg. Bekok catchment using 100% of data sets in training phase
4.17(b) Results of RBF networks for Sg. Bekok catchment using 50% of data sets in training phase
4.17(c) Results of RBF networks for Sg. Bekok catchment using 25% of data sets in training phase
4.21(a) Results of RBF networks for Sg. Bekok catchment using 25% of available data sets in training phase
4.21(b) Results of RBF networks for Sg. Bekok catchment using minimum data sets in training phase
4.25(a) Results of MLR model for Sg. Bekok catchment using 100% of data sets in training phase
4.25(b) Results of MLR model for Sg. Bekok catchment using 50% of data sets in training phase
4.25(c) Results of MLR model for Sg. Bekok catchment using 25% of data sets in training phase
4.29(a) Calibration coefficients of Sg. Bekok catchment using 100% of data
4.29(b) Calibration coefficients of Sg. Bekok catchment using 50% of data
4.29(c) Calibration coefficients of Sg. Bekok catchment using 25% of data
4.33(a) Calibration coefficients of Sg. Bekok catchment using 25% of data
4.33(b) Calibration coefficients of Sg. Bekok catchment using minimum data
4.37(a) Results of HEC-HMS model for Sg. Bekok catchment using 100% of data sets in training phase
4.37(b) Results of HEC-HMS model for Sg. Bekok catchment using 50% of data sets in training phase
4.37(c) Results of HEC-HMS model for Sg. Bekok catchment using 25% of data sets in training phase
4.41(a) Results of HEC-HMS model for Sg. Bekok catchment using 25% of data sets in training phase
4.41(b) Results of HEC-HMS model for Sg. Bekok catchment using minimum data sets in training phase
4.45(a) Calibration coefficients of Sg. Bekok catchment using 100% of data
4.45(b) Calibration coefficients of Sg. Bekok catchment using 50% of data
4.45(c) Calibration coefficients of Sg. Bekok catchment using 25% of data
4.49(a) Calibration coefficients of Sg. Bekok catchment using 25% of data
4.49(b) Calibration coefficients of Sg. Bekok catchment using minimum data
4.53(a) Results of SWMM model for Sg. Bekok catchment using 100% of data sets in training phase
4.53(b) Results of SWMM model for Sg. Bekok catchment using 50% of data sets in training phase
4.53(c) Results of SWMM model for Sg. Bekok catchment using 25% of data sets in training phase
4.57(a) Results of SWMM model for Sg. Bekok catchment using 25% of data sets in training phase
4.57(b) Results of SWMM model for Sg. Bekok catchment using minimum data sets in training phase

LIST OF FIGURES
2.1 A schematic outline of the different steps in the modelling process
2.2 Simple mathematical model of a neuron
2.3 A three-layer neural network with i inputs and k outputs
2.4 A threshold-logic transfer function
2.5 A hard-limit transfer function
2.6 Continuous transfer function: (a) the sigmoid, (b) the hyperbolic tangent
2.7 The Gaussian function
2.8 Steps in training and testing
2.9 Typical HEC-HMS representation of watershed runoff
3.1 Structure of an MLP rainfall-runoff model with one hidden layer
3.2 Hyperbolic-tangent (tansig) activation function
3.3 The structure of the RBF model
3.4 The Sungai Bekok catchment area
3.5 The Sungai Ketil catchment area
3.6 The Sungai Klang catchment area
3.7 The Sungai Slim catchment area
4.1(a) Daily results of 3-layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
4.1(b) Daily results of 3-layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase
4.1(c) Daily results of 3-layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
4.2(a) Daily results of 4-layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
4.2(b) Daily results of 4-layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase
4.2(c) Daily results of 4-layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
4.9(a) Hourly results of 3-layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
4.9(b) Hourly results of 3-layer neural networks for Sg. Bekok catchment using 65% of data sets in training phase
4.9(c) Hourly results of 3-layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
4.10(a) Hourly results of 4-layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
4.10(b) Hourly results of 4-layer neural networks for Sg. Bekok catchment using 65% of data sets in training phase
4.10(c) Hourly results of 4-layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
4.17(a) Daily results of RBF networks for Sg. Bekok catchment using 100% of data sets in training phase
4.17(b) Daily results of RBF networks for Sg. Bekok catchment using 50% of data sets in training phase
4.17(c) Daily results of RBF networks for Sg. Bekok catchment using 25% of data sets in training phase
4.21(a) Hourly results of RBF networks for Sg. Bekok catchment using 25% of data sets in training phase
4.21(b) Hourly results of RBF networks for Sg. Bekok catchment using minimum of available data sets in training phase

LIST OF SYMBOLS

$net_j$ - summation of weighted inputs for the $j$th neuron
$W_{ij}$ - weight from the $i$th neuron in the previous layer to the $j$th neuron in the current layer
$X_i$ - input from the $i$th to the $j$th neuron
$x, y$ - the variables for their population linear regressions
$b_1, b_2$ - the tangents of the slope angles of the two regression lines
$a_1, a_2$ - the intercepts
$\alpha$ - learning rate parameter
$\mu$ - momentum parameter
$x_i$ - input rainfall variables
$y_j$ - output signal from rainfall
$y\_in_j$ - sum of weighted input signals
$w_{0j}$ - weight for the bias
$w_{ij}$ - weight between input layer and hidden layer
$f(t)$ - hyperbolic-tangent function
$x\_in_k$ - weighted input signals
$c_0^{(k)}$ - weight for the bias
$c_j^{(k)}$ - weight between second layer and third layer
$f(z\_in_j)$ - output signal from rainfall
$z_j$ - input signal or rainfall
$\delta_k$ - error information term
$\Delta c_j^{(k)}$ - weight correction term
$\Delta c_0^{(k)}$ - bias correction term
$t_k$ - target neural network output
$\vec{y}(k)$ - neural network output
$\delta\_in_j$ - delta inputs
$\Delta w_{0j}$ - bias correction term
$w_{ij}(\mathrm{new})$ - updated bias and weights
$\Delta c_j^{(k)}(t+1)$ - weight update for bias with momentum
$\Delta w_{ij}(t+1)$ - weight update for backpropagation with momentum
$\eta$ - learning rate
$E_{min}$ - minimum error
$H$ - Hessian matrix
$J$ - Jacobian matrix
$E$ - sum of squares function
$g$ - gradient
$J^T$ - transpose of $J$
$e$ - vector of network errors
$w_k$ - vector of current weights and biases
$g_k$ - current gradient
$y(t)$ - runoff at the present time
$x(t)$ - rainfall at the present time
$x(t-i)$ - rainfall at a previous time
$y(x)$ - output with input vector $x$
$c$ - centre
$\Re$ - metric
$r_j$ - Euclidean length
$\phi$ - transfer function
$T$ - transposition
$I$ - interposes
$y$ - datum vector
$Y$ - radial centre
$\vec{y}(k)$ - output layer with linear combination of $\phi(r_j)$
$y'$ - prediction of the actual output
$x$ - input vector
$y_i$ - actual output
$n$ - length of input vector
$p$ - set of input patterns stored
$y_{ij}$ - desired output
$y'_j$ - predicted output component
$x_i$ - stored pattern
$W(x, x_i)$ - the weight
$D$ - distance function
$\sigma_k$ - sigma value
$N_j$ - quantity computed by the summation units
$y$ - dependent variable
$x_i$ - independent variables
$a, b$ - constants
$e$ - random variable
$x_{ki}$ - value of independent variable $x_k$
$n$ - number of observations
$\alpha, \beta$ - coefficients
$S$ - summation of square function
$P_{MAP}$ - total storm mean areal precipitation
$p_i(t)$ - precipitation depth at time $t$ at gauge $i$
$f_c$ - rate of precipitation loss
$pe_t$ - the excess precipitation at time $t$
$I_a$ - initial loss
$P_e$ - accumulated precipitation
$P$ - accumulated rainfall depth
$S$ - potential maximum retention
$A_i$ - the drainage area of subdivision $i$
$Q_n$ - storm hydrograph ordinate
$P_m$ - depth
$U_{n-m+1}$ - dimensions of flow rate per unit depth
$U_p$ - UH peak discharge
$T_p$ - the time to UH peak
$C$ - the conversion constant
$t_c$ - time of concentration
$I_t$ - average inflow to storage
$O_t$ - outflow from storage
$S_t$ - storage at time $t$
$R$ - constant linear
$C_A, C_B$ - routing coefficients
$O_t$ - average outflow
$A$ - the drainage area
$L$ - the distance from the upper end of the plane to the point of interest
$n$ - the Manning resistance coefficient
$S$ - dimensionless slope of the surface
$N$ - basin roughness
$Q_p$ - the peak discharge
$t_p$ - the time to peak
$C$ - constant
$R^2$ - coefficient of correlation
$Q_0$ - actual observed streamflow
$Q_s$ - model simulated streamflow
$n$ - the number of observed streamflow values

LIST OF APPENDICES

A Daily and hourly results of MLP model
B Daily and hourly results of RBF model
C Results of application of MLR model
D Daily and hourly results of the HEC-HMS model calibration
E Daily and hourly results of application of HEC-HMS model
F Daily and hourly results of the SWMM model calibration
G Daily and hourly results of application of SWMM model
H Daily and hourly results of PBIAS
I Figures illustrating the daily and hourly results of ANN models
J The architecture of MLP network structures

CHAPTER 1

INTRODUCTION

1.1 Background of Study

Hydrologists are often confronted with problems of prediction and estimation of runoff, precipitation, contaminant concentrations, water stages, and so on (ASCE, 2000). Moreover, engineers often face real situations in which little or no information is available. A good understanding of the processes and relationship between rainfall and surface runoff in a catchment is a necessary prerequisite for preparing satisfactory drainage and stormwater management projects. In the hydrological cycle, rainfall that reaches the ground may collect to form surface runoff or may infiltrate into the ground. Surface runoff and groundwater flow join in streams and rivers, which finally flow into the ocean. Most hydrologic processes have a high degree of temporal and spatial variability and are further plagued by the non-linearity of physical processes, conflicting spatial and temporal scales, and uncertainty in parameter estimates. That is why our understanding of many areas, especially hydrologic processes, is far from perfect.
Empiricism therefore plays an important role in modelling studies. Hydrologists strive to provide rational answers to problems that arise in the design and management of water resources projects. As modern computers become ever more powerful, researchers continue to test and evaluate new approaches to solving problems efficiently.

A problem commonly encountered in stormwater design projects is the determination of the design flood. Design flood estimation using established methodology is relatively simple when records of streamflow or runoff and rainfall are available for the catchment concerned. The quantity of runoff resulting from a given rainfall event depends on a number of factors such as the initial moisture, land use, and slope of the catchment, as well as the intensity, distribution, and duration of the rainfall. Knowledge of the characteristics of the rainfall-runoff relationship is essential for risk and reliability analysis of water resources projects. Since the 1930s, numerous rainfall-runoff models have been developed to forecast streamflow. For example, conceptual models provide daily, monthly, or seasonal estimates of streamflow for long-term forecasting on a continuous basis. Since Sherman (1932) defined the unit graph, linear systems analysis has played an important role in relating input-output components in rainfall-runoff modelling and in the development of stochastic models of single hydrological sequences (Singh, 1982). The performance of a rainfall-runoff model depends heavily on choosing suitable model parameters, which are normally calibrated using an objective function (Yu and Yang, 2000). In conceptual models, the entire physical process in the hydrologic cycle is formulated mathematically, and such models are composed of a large number of parameters (Tokar and Johnson, 1999). The modelling approach used in the present study is based on artificial neural network methods for modelling hydrologic input-output relationships.
The rainfall-runoff models are developed to predict or forecast runoff, with rainfall as the model input. The observed streamflow was treated as equivalent to runoff. Previous data were used in the test set to illustrate the capability of the model to predict future occurrences of runoff without directly including the catchment characteristics. Tokar and Markus (2000) believed that the accuracy of model predictions is very subjective and highly dependent on the user’s ability, knowledge, and understanding of the model and the watershed characteristics.

Artificial intelligence (AI) techniques have given rise to a set of ‘knowledge engineering’ methods constituting a new approach to the design of high-performance software systems. This new approach represents an evolutionary change with revolutionary consequences (Forsyth, 1984). The systems are based on an extensive body of knowledge about a specific problem area. Characteristically, this knowledge is organized as a collection of rules, which allow the system to draw conclusions from given data or premises. Neural networks are applied across an extremely wide range of fields, including science, engineering, automotive, aerospace, banking, medicine, business, transportation, defense, industry, telecommunications, insurance, and economics. In the last few years, the subject of artificial neural networks, or neural computing, has generated a lot of interest and received a lot of coverage in articles and magazines. Artificial neural network (ANN) methods are now gaining popularity, as is evidenced by the increasing number of papers on this topic appearing in engineering and hydrology journals, conferences, seminars, and so on. This modelling tool is still in its nascent stage in terms of hydrologic applications (ASCE, 2000). Recently, an increasing number of works have attempted to apply the neural network method to various problems in different branches of science and engineering.
The highly interconnected multiprocessor architecture of an ANN is described as parallel distributed processing and has solved many difficult computer science problems (Blum, 1992). Electrical engineers find numerous applications in signal processing and control theory. Computer engineers and computer scientists are drawn by the potential to implement neural networks efficiently and by their applications to robotics; the networks show promise for difficult problems in areas such as pattern recognition, feature detection, handwritten digit recognition, and image recognition. Manufacturers use neural networks to build sophisticated machines and instruments that benefit consumers and make modern life more comfortable and productive. In medicine, neural networks are used to diagnose and to prescribe treatment for symptoms similar to those they have encountered before. Neural networks are also a tool that provides hydraulic and environmental engineers with sufficient detail for design purposes and management practices (Nagy et al., 2002). In other words, neural network models are apparently able to treat problems from many different disciplines.

The main function of all artificial neural network paradigms is to map a set of inputs to a set of outputs. However, there is a wide variety of ANN algorithms. An attractive feature of ANNs is their ability to extract the relation between the inputs and outputs of a process without the physics being explicitly provided to them. They are able to provide a mapping from one multivariate space to another, given a set of data representing that mapping. Even if the data are noisy and contaminated with errors, ANNs have been known to identify the underlying rule (ASCE, 2000). A neural network can learn from experience, generalize from previous examples to new ones, and abstract essential characteristics from inputs containing irrelevant data (Fausett, 1994; Wasserman, 2000).
Therefore, the natural behaviour of hydrological processes is appropriate for the application of ANN methods.

In this study, artificial neural network (ANN) methods were applied to model the hourly and daily rainfall-runoff relationship. The available rainfall and runoff data are from four catchments: Sungai Bekok, Sungai Ketil, Sungai Klang, and Sungai Slim. An attractive feature of ANN methods is their ability to extract the relation between the inputs and outputs of a process without the physics being explicitly provided to them. The networks were trained and tested using data that represent different catchment characteristics and rainfall patterns. The sensitivity of the network performance to the content and length of the calibration data was examined using various training data sets. The existing commercially available models used in this modelling study were HEC-HMS and XP-SWMM. The performance of the ANN model for the selected catchments was investigated and compared against the XP-SWMM, HEC-HMS and linear regression models. The performance of the proposed models and the existing models was evaluated using the coefficient of correlation, root mean square error, relative root mean square error, mean absolute percentage error and percentage bias.

1.2 Statement of the Problem

In many parts of the world, rapid population growth, urbanization, and industrialization have increased the demand for water. These same pressures have altered watersheds and river systems, which has contributed to greater loss of life and property damage due to flooding. It is becoming increasingly critical to plan, design, and manage water resources systems carefully and intelligently.
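The goodness-of-fit measures named at the close of Section 1.1 can be illustrated with a minimal sketch. The formulas below are common textbook forms and are assumptions here, since this chapter does not state the definitions used in the thesis; in particular, the sign convention chosen for percentage bias (positive when the model overestimates) and the relative form of the RMSE are illustrative choices. All function names are hypothetical.

```python
import math

def rmse(observed, simulated):
    """Root mean square error between observed and simulated flows."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

def relative_rmse(observed, simulated):
    """Assumed form: each error is divided by the observed value before squaring."""
    n = len(observed)
    return math.sqrt(sum(((o - s) / o) ** 2 for o, s in zip(observed, simulated)) / n)

def mape(observed, simulated):
    """Mean absolute percentage error, in percent."""
    n = len(observed)
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(observed, simulated)) / n

def pbias(observed, simulated):
    """Percentage bias; assumed convention: positive means overestimation."""
    return 100.0 * sum(s - o for o, s in zip(observed, simulated)) / sum(observed)

def correlation(observed, simulated):
    """Pearson correlation coefficient between the two series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov / math.sqrt(var_o * var_s)

# Illustrative values only; these are not data from the study catchments.
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 18.0, 33.0, 36.0]
for name, metric in [("RMSE", rmse), ("RRMSE", relative_rmse),
                     ("MAPE", mape), ("PBIAS", pbias), ("r", correlation)]:
    print(name, metric(obs, sim))
```

The relative measures divide by the observed flow, so they are undefined for zero-flow records; how such records are handled would need to follow the treatment of missing data and outliers adopted in the methodology.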
Understanding the dynamics of the rainfall-runoff process constitutes one of the most important problems in hydrology, because streamflow must be predicted or forecast for purposes such as water supply, power generation, flood control, water quality, irrigation, drainage, recreation, and fish and wildlife propagation. During the past decades, a wide variety of approaches, such as conceptual models, have been developed to model the rainfall-runoff process. However, an important limitation of such approaches is that treating the rainfall-runoff process as a realization of a stochastic and statistical process means that only some statistical features of the parameters are involved. What is required, therefore, is an approach that seeks to understand the complete dynamics of the hydrologic process, capturing not only the overall appearance but also the intricate details.

Rainfall-runoff relationships are among the most complex hydrologic phenomena to comprehend, owing to the tremendous spatial and temporal variability of watershed characteristics, snow pack, and precipitation patterns, as well as the number of variables involved in modelling the physical processes (Tokar and Johnson, 1999). Modelling the rainfall-runoff relationship is very important in hydraulic and hydrologic studies for new development areas. The transformation of rainfall to runoff involves many highly complex components, such as interception, infiltration, overland flow, interflow, evaporation, and transpiration; it is also non-linear and cannot easily be calculated using a simple equation. Runoff is critical to many activities, such as designing flood protection works for urban areas and agricultural land and assessing how much water may be extracted from a river for water supply or irrigation.
Despite the complex nature of the rainfall-runoff process, estimating runoff as a fixed percentage of rainfall remains the most commonly used method in the design of urban storm drainage facilities, highway culverts, and many small hydraulic structures. The quantity of runoff resulting from a given rainfall event depends on a number of factors such as the initial moisture, land use, and slope of the catchment, as well as the intensity, distribution, and duration of the rainfall. Various well-known rainfall-runoff models have been successfully applied to many problems and catchments. Numerous papers on the subject have been published and many computer simulation models have been developed. All these models, however, require detailed knowledge of a number of factors and of initial and boundary conditions in a catchment, which in most cases are not readily available. Moreover, the existing popular rainfall-runoff models are inflexible and require many parameters for calibration. Beven (2001) reported that the ungauged catchment problem is one of the real challenges for hydrological modellers in the twenty-first century. The traditional method of investigation, in which data are collected in the field by installing and maintaining a network of instruments, tends to be costly. Some of these models are also expensive and of limited applicability.

The availability of rainfall-runoff data is important for the model calibration process. Rainfall-runoff modelling for sites where there are no discharge data is a much more difficult problem. It is considered that the main limitation in developing a design flood hydrograph estimation procedure lies in the availability of rainfall and streamflow data, rather than in any inherent limitation of the techniques used to develop the procedure. Discharge data, however, are available at only a small number of sites in any region.
In this respect the problem is that there are very few major floods for which reliable rainfall and streamflow data are available, particularly on small catchments. Any relationships developed are therefore based on data from relatively small storms, and hence the flood estimates are made from extrapolated relationships. More often still, physical measurements of the pertinent quantities are very difficult and expensive, especially in a virgin rural area. That is why many catchments in many countries are not equipped with measuring instruments.

These difficulties lead us to explore the use of neural networks as a way of obtaining models based on experimental measurements. In terms of hydrologic applications, this modelling tool is still in its nascent stages. An attractive feature of these models is their ability to extract the relationship between the inputs and outputs of a process without the physics being explicitly provided to them. The goal is to create a model for predicting runoff from a gauged or ungauged catchment. For long-term runoff modelling, a continuous model should be used rather than a single-event model.

Rainfall-runoff modelling software and guidelines from the USA, Australia and the United Kingdom are needed as references for understanding and developing hydrologic models in Malaysia. Those models and guidelines are used to study modelling techniques, hydrologic problems, and the management and design of urban or rural watershed systems. Since the present software and guidelines are based on a compilation of urban stormwater management practice in the USA, the United Kingdom and Australia, it is important for us to develop our own. Furthermore, various well-known rainfall-runoff models such as HEC-HMS, MIKE-11, SWMM, etc. have been successfully applied to many problems and watersheds. However, the existing popular rainfall-runoff models are inflexible and require too many parameters for calibration.
Clearly, these models have their own weaknesses, especially in the calibration process and in their ability to accommodate the non-linearity of the processes. There are also many areas where today's tools lack the features and functions needed to build these applications effectively (Wasserman, 2000). Furthermore, the software packages are not robust and depend on selective calibration. With the rapid development of modern Malaysia, the demand for water resources has also increased, and the time has therefore come to develop new techniques to overcome the problems of hydrology and water resources design and management. In this context, one of the main potential areas of application of rainfall-runoff models is the prediction and forecasting of streamflow. An alternative approach to prediction suggested in recent years is the neural network method, inspired by the functioning of the human brain and nervous system. Artificial neural networks are able to determine the relationship between input data and the corresponding output data. When presented with simultaneous input-output observations, artificial neural networks adjust their connection weights (model parameters) and discover the rules governing the association between input and output variables.

1.3 Study Objectives

The research focuses on the application of the neural network method to rainfall-runoff modelling, and a comparison between neural networks and other methods is made. The overall objective of the present study is to develop mathematical models that are able to provide accurate and reliable runoff estimates from the historical rainfall-runoff data of selected catchments.
To address the performance of various rainfall-runoff models applied in the Malaysian environment, the following specific objectives are set:

(i) To develop rainfall-runoff models using artificial neural network (ANN) methods, based on the Multilayer Perceptron (MLP) model and Radial Basis Function (RBF) computation techniques.
(ii) To examine and quantify the prediction accuracy of neural network models using multiple input and output series.
(iii) To evaluate and compare the neural network and multiple linear regression (MLR) models for daily flow prediction only.
(iv) To compare and evaluate the performance of the neural network models against the XP-SWMM and HEC-HMS models for daily and hourly predictions.

1.4 Research Approach and Scope of Work

The present study was undertaken to develop daily and hourly rainfall-runoff models using the ANN method that can possibly be used to provide reliable and accurate estimates of runoff based on rainfall as the input variable. It is believed that the ANN is able to handle the non-linear relationship between rainfall and runoff. The ANN models used are the MLP and the RBF. The calibration method (algorithm) applied to the MLP is back-propagation, and the transfer function used is the tangent sigmoid (tansig). The calibration method applied to the RBF is the Generalized Regression Neural Network (GRNN), and the transfer function used for the hidden units is Gaussian.

The modelling work was carried out using a five-year period of daily data and a ten-year period of hourly data, consisting of rainfall and runoff records from selected catchments in Peninsular Malaysia. Four catchments were selected for analysis and modelling. Their stations have records of sufficient length and fairly good data quality. They are the Sungai Bekok (Johor, Malaysia), Sungai Ketil (Kedah, Malaysia), Sungai Klang (Kuala Lumpur, Malaysia), and Sungai Slim (Perak, Malaysia) catchments.
Those sites were selected to demonstrate the development and application of the ANN, multiple linear regression (MLR), XP-SWMM and HEC-HMS models. It is emphasized that the MLR model is applied only to the daily rainfall-runoff data for those catchments. The data required to carry out this study are catchment physical data, rainfall data and river flow data (at the catchment outlet). The data from all these gauges are recorded and maintained by the Department of Irrigation and Drainage (DID) Malaysia. This study is subject to the following limitations:

(i) Each catchment is treated as a single unit. No sub-division of the catchment is carried out.
(ii) It is assumed that HEC-HMS and XP-SWMM can be applied to a large catchment without sub-division.
(iii) The observed data available for analysis are rainfall, runoff or streamflow, evapotranspiration, and the size of the catchment area. Other data or parameters in HEC-HMS and XP-SWMM, such as the time of concentration, runoff coefficient and infiltration loss coefficient, will be estimated.

1.5 Significance of the Study

The relationship, or the operation of transforming the input (rainfall) into the output (runoff), is implied uniquely by any corresponding input-output pair. This relationship can be abstracted and used to find the output for any arbitrary input, or the input corresponding to any given output. In practice, however, when analysing systems which are not exactly linear time variant, or where the data are subject to errors, problems may arise both in identifying the operation and in computing an input corresponding to a given output function of time (Singh, 1982).

Overton and Meadows (1976) defined a mathematical model as "a quantitative expression of a process or phenomenon one is observing, analyzing, or predicting". Meanwhile, Woolhiser and Brakensiek (1982) defined a mathematical model as "a symbolic, usually mathematical representation of an idealized situation that has the important structural properties of the real system".
Unlike mathematical models that require precise knowledge of all the contributing variables, a trained artificial intelligence system such as a neural network can estimate process behaviour even with incomplete information. It is well established that neural networks have a strong generalization ability, which means that once they have been properly trained, they are able to provide accurate results even for cases they have never seen before (Hecht-Nielsen, 1991; Haykin, 1994). This generalization capability provides an understanding of how the runoff hydrograph can respond under different rainfall and catchment characteristics.

Most synthetic procedures for estimating design flood hydrographs are deterministic in that the design flood is derived from a hypothetical design storm. A review of some of the more widely used procedures for estimating design flood hydrographs has been made by Cordery et al. (1970). Three basic steps are common to this methodology of flood estimation: (1) the specification of the design storm, of which the important characteristics are usually the recurrence interval, the total rainfall volume, the areal distribution of rainfall over the catchment, the temporal distribution of rainfall, and the duration of rainfall; (2) the estimation of the runoff volume resulting from the design storm; and (3) the estimation of the time distribution of runoff from the catchment. Over recent years, numerous and diverse techniques have been developed for estimating all of the above components.

Today, most urban drainage systems in tropical regions rely upon the "old concept" of rapid stormwater disposal, determined from the traditional rainfall-runoff modelling approach. The obvious negative impacts of urbanization on the water balance are increased stormwater runoff, degradation of water quality, recession of the water table, and a reduction of roughness and thus of the time of concentration.
Therefore, in view of the importance of the rainfall-runoff relationship, the present study was undertaken to develop predictive models that can be used to provide reliable and accurate estimates of runoff.

1.6 Structure of the Thesis

This thesis consists of five chapters. The first chapter introduces the study and outlines the objectives and scope of the research. A review of the relevant literature is presented in Chapter 2. The proposed models for rainfall-runoff modelling are described in Chapter 3. The fundamentals and concepts of the rainfall-runoff relationship, and the concepts of hydrological modelling, are also discussed in detail in Chapter 3. The description of the selected catchments, as well as current catchment management practice and related problems, is also given in that chapter. Results and discussions are presented in Chapter 4. Results of the Multilayer Perceptron (MLP) model are discussed in Section 4.2 and results of the Radial Basis Function (RBF) model in Section 4.3. Results of the Multiple Linear Regression (MLR), HEC-HMS and XP-SWMM models are discussed in Sections 4.4, 4.5 and 4.6 respectively. The application and performance of the proposed models, the robustness and limitations of the models, river basin characteristics, etc. are discussed in detail in Section 4.7. Finally, in the last chapter, conclusions from the present study are summarized and recommendations for future studies are outlined.

CHAPTER 2

LITERATURE REVIEW

2.1 General

Determining the relationship between rainfall and runoff for a catchment is one of the most important problems faced by hydrologists and engineers. This is because information about rainfall and runoff is needed for hydrologic and hydraulic engineering design and management purposes. This relationship is known to be highly non-linear and complex.
In addition to rainfall, runoff is dependent on numerous factors such as initial soil moisture, land use, catchment geomorphology, evaporation, infiltration, and the distribution and duration of the rainfall. Although many catchments have been gauged to provide continuous records of streamflow, engineers are often faced with situations where little or no information is available.

Todini (1988) reviewed the historical development of mathematical methods used in rainfall-runoff modelling and classified the models based on a priori knowledge and problem requirements. He found an increasing role for distributed models, satellite, and radar technology in watershed hydrology, and noted that techniques for model calibration and verification remained less than robust. However, efforts have been made to compare models of some component processes, and developers of some models have compared their models with one or a few others.

Over the years, various rainfall-runoff studies have been carried out by researchers. The ability to transform rainfall into flow or runoff accurately is required for further investigation. The neural network approach can possibly lead to new insights in problem solving and give better solutions for predicting or forecasting via mathematical modelling.

Watershed models are fundamental to water resources assessment, development, and management. They are, for example, used to analyze the quantity and quality of streamflow, reservoir system operations, groundwater development and protection, surface water and groundwater conjunctive use management, water distribution systems, water use, and a range of water resources management activities (Wurbs, 1998). The development of computers and neural network techniques has, over time, provided hydrologists and researchers with enhanced computational power to solve complicated engineering problems.
This power increased the possibilities of applying search algorithms and the ability to simulate models of cognitive processes. These developments stimulated neural network research. In particular, several models based on time series analysis or on regression techniques have been commonly used to describe the random behaviour of various physical phenomena in hydrology. With the advent of computational and simulation capability using computers, new techniques such as neural networks have been developed to represent complex non-linear stochastic processes more accurately (Harun, 1999). The accuracy of model predictions is very subjective and highly dependent on the user's ability, knowledge, and understanding of the model and of the watershed characteristics (Tokar and Johnson, 1999).

2.2 Rainfall-Runoff Process and Relationship

Hydrology is the scientific study of water and its properties, distribution, and effects on the earth's surface, soil, and atmosphere (McCuen, 1997). Most hydrologic processes are non-linear, such as the relationship between rainfall and runoff. The most important part of the hydrologic cycle is the stage at which rainfall occurs and results in runoff. The rainfall-runoff relationship describes the time distribution of direct runoff as a function of excess rainfall. A varying portion of the precipitation or rainfall becomes runoff, moving via overland flow into stream channels.

Runoff is generated by rainstorms, and its occurrence and quantity depend on the characteristics of the rainfall event, such as its intensity, duration, and distribution. Rainfall intensity is defined as the ratio of the total amount of rain (rainfall depth) falling during a given period to the duration of the period. It is expressed in depth units per unit time, usually millimetres per hour (mm/hr). When rain falls, the first drops of water are intercepted by the leaves and stems of the vegetation.
This is usually referred to as interception storage. As the rain continues, water reaching the ground surface infiltrates into the soil until a stage is reached where the rate of rainfall (intensity) exceeds the infiltration capacity of the soil. Thereafter, surface puddles, ditches, and other depressions are filled (depression storage), after which runoff is generated. The infiltration capacity of the soil depends on its texture and structure, as well as on the antecedent soil moisture content (previous rainfall or dry season). The initial capacity of a dry soil is high but, as the storm continues, it decreases until it reaches a steady value termed the final infiltration rate.

Apart from rainfall characteristics such as intensity, duration and distribution, there are a number of site-specific or catchment-specific factors which have a direct bearing on the occurrence and volume of runoff. The major factors which influence the rainfall-runoff process are as follows:

(a) Soil type

The infiltration capacity depends, among other things, on the porosity of the soil, which determines the water storage capacity and affects the resistance of water to flow into deeper layers. Porosity differs from one soil type to another. The highest infiltration capacities are observed in loose, sandy soils, while heavy clay or loamy soils have considerably smaller infiltration capacities. The infiltration capacity also depends on the moisture content prevailing in the soil at the onset of a rainstorm. This, however, is only valid when the soil surface remains undisturbed.

(b) Vegetation

The amount of rain lost to interception storage on the foliage depends on the kind of vegetation and its growth stage. A cereal crop has a smaller storage capacity than a dense grass cover. More significant is the effect the vegetation has on the infiltration capacity of the soil. A dense vegetation cover shields the soil from raindrop impact and reduces the crusting effect.
The root systems, as well as organic matter in the soil, increase the soil porosity and thus allow more water to infiltrate. Vegetation also retards the surface flow, particularly on gentle slopes, giving the water more time to infiltrate and to evaporate. In conclusion, an area densely covered with vegetation yields less runoff than bare ground.

(c) Slope and catchment size

It was observed that the quantity of runoff decreased with increasing slope length. This is mainly due to lower flow velocities and, consequently, a longer time of concentration. This means that the water is exposed to infiltration and evaporation for a longer duration before it reaches the measuring point. The same applies when catchment areas of different sizes are compared. The runoff efficiency (volume of runoff per unit of area) increases as the size of the catchment decreases. In other words, the larger the catchment, the longer the time of concentration and the smaller the runoff efficiency.

Apart from the site-specific factors mentioned above, which strongly influence the rainfall-runoff process, it should also be considered that the physical conditions of a catchment area are not homogeneous. Even at the micro level, there is a variety of different slopes, soil types, vegetation covers, etc. Each catchment therefore has its own runoff response and will respond differently to different rainstorm events. The design of water harvesting schemes requires knowledge of the quantity of runoff that will be produced by rainstorms in a given catchment area. It is commonly assumed that the quantity or volume of runoff is a proportion (percentage) of the rainfall depth, that is, runoff (mm) is equal to C × rainfall depth (mm). In rural catchments, where none or only small parts of the area are impervious, the coefficient C, which describes the percentage of runoff resulting from a rainstorm, is however not a constant factor.
Instead, its value is highly variable and depends on the catchment-specific factors described above and on the rainstorm characteristics. An analysis of the rainfall-runoff relationship, and subsequently an assessment of relevant runoff coefficients, should preferably be based on actual, simultaneous measurements of both rainfall and runoff in the study area.

The runoff from a watershed is governed by local combinations of several factors such as hydraulic conductivity and porosity, topography, land use, etc. Discontinuities include the boundaries separating soil types, geologic formations, or land covers. Physical properties control interception, surface retention, infiltration, overland flow, and evapotranspiration at different scales, and these processes control runoff. It has been observed empirically that the form of the hydrologic response changes with the spatial scale of heterogeneities, and is usually considered to become simpler and more linear with increasing watershed size (Dooge, 1981).

The discharge hydrograph is a graph of instantaneous discharge at the catchment outlet versus time. The hydrograph is separated into two components, direct runoff and base flow. In flood studies, the portion of the discharge hydrograph of major interest is the direct runoff hydrograph, and the portion of the rainfall hyetograph of interest is the storm rainfall. The rainfall hyetograph is a plot, in discrete form, of the rainfall intensity over the catchment versus time. The point rainfall intensity was extracted at hourly intervals from the recording raingauge data and then adjusted linearly so that the area under the hyetograph equalled the catchment mean rainfall over the same period. The storm rainfall is that portion of the rainfall hyetograph for which the intensity exceeds the phi-index. The phi-index, or loss rate, is defined as the rainfall intensity above which the volume of rainfall, or rainfall excess, equals the volume of direct runoff.
The volume of rainfall below this intensity goes to catchment recharge. It is assumed that rainfall of intensity less than the phi-index does not contribute to direct runoff. These concepts are largely empirical and obviously oversimplify the very complex relationship between rainfall and runoff.

In practice, there are two methods of deriving the volume of runoff from the volume of rainfall, that is, of expressing the rainfall-runoff relationship. The first method is the loss rate approach, in which an initial loss prior to the onset of direct runoff and a continuing loss during the storm are abstracted from the design storm. The selection of a continuing loss or phi-index is usually the most important factor for the design storm. In the second approach, a rainfall-runoff relationship is developed from which the volume of runoff can be estimated for any design storm volume. This approach is usually adopted for design storms which are outside the range of the observed data.

For practical purposes it is useful to express the rainfall-runoff relationship as a mathematical equation rather than graphically. Several forms of equation have been reported by Chow (1964), who describes an empirical equation developed by the U.S. Soil Conservation Service of the form $Q = P_e^{2} / (P_e + I)$, where $Q$ is the direct runoff, $P_e$ is the total rainfall during the storm minus the initial loss, and $I$ is the potential infiltration. The value of $I$ depends on the soil type, ground cover and antecedent moisture conditions. The above equation is attractive in that the proportion of estimated runoff relative to rainfall increases as the storm rainfall increases. This is logical, since the catchment recharge is high in the early stages of a storm and decreases to a more or less constant rate as the storm duration increases.
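As an illustration only, the two loss calculations described in this section can be sketched in Python. The first function evaluates the Soil Conservation Service form quoted from Chow (1964); the second searches for the phi-index that makes the rainfall excess equal to a given direct runoff volume. The function names, the use of a bisection search and the sample storm values are assumptions introduced here for illustration and are not taken from this study.

```python
from typing import Sequence


def scs_direct_runoff(pe: float, potential_infiltration: float) -> float:
    """Direct runoff Q = Pe^2 / (Pe + I), with all depths in the same unit (e.g. mm).

    pe is the storm rainfall minus the initial loss; potential_infiltration is I.
    """
    if pe <= 0.0:
        return 0.0  # no rainfall left after the initial loss, so no runoff
    return pe * pe / (pe + potential_infiltration)


def phi_index(intensities: Sequence[float], dt_hours: float,
              direct_runoff: float) -> float:
    """Phi-index (mm/hr) for a hyetograph given as intensities (mm/hr) per interval.

    Finds phi such that the rainfall above phi, summed over all intervals of
    length dt_hours, equals the direct runoff depth (mm). Bisection is an
    assumed solution method, chosen because the excess decreases steadily
    as phi increases.
    """
    def excess(phi: float) -> float:
        return sum(max(i - phi, 0.0) for i in intensities) * dt_hours

    if not 0.0 <= direct_runoff <= excess(0.0):
        raise ValueError("direct runoff must lie between 0 and the total rainfall")

    lo, hi = 0.0, max(intensities)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if excess(mid) > direct_runoff:
            lo = mid  # too much excess: the loss rate must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical values: I = 50 mm, and two storms of different size.
    for pe in (50.0, 100.0):
        q = scs_direct_runoff(pe, 50.0)
        print(f"Pe = {pe:.0f} mm, Q = {q:.1f} mm, Q/Pe = {q / pe:.2f}")

    # Hypothetical hourly hyetograph (mm/hr) and direct runoff depth (mm).
    print(f"phi-index = {phi_index([2.0, 10.0, 6.0, 4.0], 1.0, 7.0):.2f} mm/hr")
```

With the assumed value I = 50 mm, the runoff proportion Q/Pe rises from one half at Pe = 50 mm to about two thirds at Pe = 100 mm, which is the behaviour of the equation noted above.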
2.3 Review on Hydrologic Modelling

Beven (2001) defined the rainfall-runoff model as a model that predicts the hydrograph peaks correctly, at least to within the magnitude of the errors associated with the observations, predicts the timing of the hydrograph peak correctly, and gives a good representation of the form of the recession curve in order to set up the initial conditions prior to the next event. Meanwhile, Loague and Freeze (1985) described hydrologic modelling as follows: "In many ways, hydrologic modelling is more an art than a science, and it is likely to remain so. Predictive hydrologic modelling is normally carried out on a given catchment using a specific model under the supervision of an individual hydrologist. The usefulness of the results depends in large measure on the talents and experience of the hydrologist and... understanding of the mathematical nuances of the particular model and the hydrologic nuances of the particular catchment. It is unlikely that the results of an objective analysis of modelling methods…can ever be substituted for the subjective talents of an experienced modeler".

Hydrologic modelling is employed to address a wide spectrum of environmental and water resources problems, and to understand dynamic interactions between climate and land surface hydrology. Hydrological models are fundamental to water resources assessment, development and management. They are, for example, used to analyze the quantity and quality of streamflow, reservoir system operations, surface water and groundwater conjunctive use management, water distribution systems, water use, and a range of water resources management activities (Wurbs, 1998). Terms such as deterministic, stochastic, conceptual, black box, empirical, lumped and distributed input, lumped and distributed parameter, and linear and nonlinear systems abound, which makes the subject appear subjective, complex and hard to understand. In this section, the history of catchment modelling over the last ninety years is described.
The field of catchment modelling today presents a bewildering array of models and model terminology. One is likely to receive the impression that this is a very complex subject that is hard to understand. To understand the present, it usually helps to understand how it came about, and to this end the history of catchment modelling must be considered. This history can be divided into five phases, or eras (Torno, 1985).

2.3.1 First era (1910-1930)

This is known as the "crude" era. In the decades, or perhaps centuries, prior to this time, people had some interest in hydrologic processes and had methods by which they could infer conclusions about the response of rivers to meteorological phenomena. It was in this period that the first attempts were made to actually analyse and quantify those processes. One such approach was to observe, for a number of events, the quantity of rainfall at some point in the catchment, usually the outlet, and the amount by which the stage of the river increased, and then to plot precipitation versus rise in stage. Another very early and very simple approach is called the Rational Method. In it, the maximum rate of outflow from a catchment is equated to the product of three quantities: the rainfall intensity, the catchment area, and a coefficient. It is assumed that the rain continues at the stated intensity until the maximum flow, an equilibrium condition, is reached. Obviously, if a rainstorm maintains a constant intensity until equilibrium is reached, and if the correct value of the coefficient is known, the method will yield a correct value of peak discharge.

2.3.2 Second era (1930-1945)

This is known as the "theoretical" era. After the crude methods had been thoroughly exploited, and after it was realized that the physical process is really much more complicated than these methods assume, investigators began to turn to what they felt were more scientific approaches.
Work done during the theoretical era was characterized by attempts to understand the processes and to represent them mathematically. Typical of the period was the work of Horton (1933) and others in analyzing the infiltration process. The solution to the problem seemed obvious: determine how much water soaked into the ground, subtract this from the total rainfall, and the remainder was what went to the river. The key to the whole thing was the infiltration process.

2.3.3 Third era (1945-1965)

This is known as the "empirical" era. The word empirical is defined as "relying upon or derived from observation or experiment". The investigators of the empirical era decided that the reason the theoretical approach did not work well was that no one really understood the rainfall-runoff process, and that the mathematical relationships based on this false understanding were wrong. Consequently, any attempt to fit observed data to such formulations was doomed to failure. The right way to handle the problem, then, was to observe, for many events, the magnitude of rainfall, of river response, and of any other factors which were logically thought to be involved in the process, and then to experiment with different types of mathematical or graphical relationships until one was found which caused everything to "fit together". That is, let the data, rather than theory, dictate the form of relationship to be used. The most notable example of this philosophy is the work of Kohler et al. in the late 1940s on a coaxial graphical rainfall-runoff relationship employing the now well-known "Antecedent Precipitation Index" or "API". This method soon proved to be quite suitable for various types of hydrologic studies and has been very popular for the last thirty years.

2.3.4 Fourth era (1965-1975)

This is known as the "conceptual" era. Computers became available to hydrologists in the late 1950s, and one of the first uses made of them was the automation of existing hydrologic techniques.
The conceptual era of catchment modelling began with the publication by Linsley and Crawford of the Stanford Watershed Model in 1966. Their approach was new at the time and, to the best of the author's knowledge, original with them. It consists of writing mathematical formulations which represent the modeller's concept of all of the processes. Linsley and Crawford saw the opportunity to develop new modelling techniques and to do things which cannot be done without a computer, simply because those things involve a tremendous volume of computation. Since 1965, other hydrologists have independently produced a number of other conceptual catchment models.

2.3.5 Fifth era (1975-present)

Some observers feel that "black box" modelling, which uses modern and powerful computing equipment, is a new era in hydrology. The black box approach to catchment modelling was introduced by Wallis and Todini in 1973 in the form of the "Constrained Linear System", or "CLS", model. It involves the use of a computer to correlate great masses of input and output data using mathematical formulations which have little or nothing to do with hydrological processes but which produce outputs that resemble a catchment's output, a hydrograph. A claimed advantage of black box models is their ability, when calibrated on poor-quality historical data, to "filter out" random errors in the data and thereby produce a set of parameters which accurately represents the hydrological characteristics of the catchment.

2.4 Rainfall-Runoff Models

Clarke (1973) described rainfall-runoff models by one or more of the following characteristics. The first distinction is between deterministic and probabilistic models, and is based on the character of the model parameters and variables. If the model parameters or variables are considered random variables with probability distributions, the model is called probabilistic. If the model parameters or variables are free from any random variation, the model is deterministic.
The second distinction is between lumped and distributed models, in either the geometric or the probabilistic sense. If the model parameters or variables vary spatially within the watershed, the model is a distributed model; otherwise, it is a lumped model. The third distinction is between linear and nonlinear models, in either the system-theory sense or the statistical regression sense. If the principle of superposition is not violated, the model is linear in the system-theory sense. If the model is linear in its parameters, it is linear in the statistical regression sense. Otherwise, the model is nonlinear. The fourth distinction is between continuous and discrete models. If the model uses continuous functions in the formulation of the physical phenomena, it is continuous. Otherwise, it is discrete. The fifth distinction is between black-box and process models. Black-box models are based on analyses of rainfall input, runoff output, and a transfer function which simulates the relationship between rainfall and runoff in a watershed, such as the unit hydrograph and time-area methods (McCuen, 1997). Conceptual models are combinations of process and black-box models. In these models, the physical processes are defined using process models and the model parameters are optimized employing a black-box approach (Singh, 1988). Examples of conceptual models include the Clark, Nash and Stanford Watershed models. The last distinction is between event-driven and continuous process models, and is based on the simulation period. If the model is designed to simulate a single event, it is called an event-driven model. The focus of event-driven models is on the evaluation of surface runoff and direct infiltration. The most commonly used event-driven models are HEC-HMS, developed by the US Army Corps of Engineers; the Storm Water Management Model (SWMM), developed by the US Environmental Protection Agency; and others. Beven (2001) gave a very basic classification of hydrologic models, described as follows.
Lumped models treat the catchment as a single unit, with state variables that represent averages over the catchment area, such as average storage in the saturated zone. Distributed models make predictions that are distributed in space, with state variables that represent local averages of storage, flow depths or hydraulic potential, by discretizing the catchment into a large number of elements or grid squares and solving the equations for the state variables associated with every element or grid square. Parameter values must also be specified for every element in a distributed model. Meanwhile, deterministic models permit only one outcome from a simulation with one set of inputs and parameter values. Stochastic models allow for some randomness or uncertainty in the possible outcomes due to uncertainty in input variables, boundary conditions or model parameters. The vast majority of models used in rainfall-runoff modelling are used in a deterministic way, although again the distinction is not clear-cut: there are models which add a stochastic error model to the deterministic predictions of the hydrological model, and there are models that use a probability distribution function of state variables but make predictions in a deterministic way.

Rainfall-runoff modelling is the essence of much engineering hydrology. Because flow data are rarely available, design events are usually determined by a combination of rainfall information and rainfall-runoff relationships. As mentioned before, determining the relationship between rainfall and runoff for a catchment area is one of the most important problems faced by hydrologists and engineers. Although many catchments have been gauged to provide continuous records of runoff, hydrologists and engineers are often faced with situations where little or no information is available. In such instances, simulation models are often used to generate synthetic flows.
In the hydrologic context, another approach to modelling is emerging which uses the data themselves to evolve the model. The data are seen as the primary source of information, and the control area or volume is considered as a system. This process is essentially one of system identification. The traditional methods of system identification, based on statistical analysis of input and output variables (such as correlation analysis, spectral analysis, dynamic linear models or state-space models), seem to be inadequate to explain non-linear system behaviour. Therefore, a new modelling paradigm based on artificial intelligence techniques such as artificial neural networks, fuzzy logic and genetic algorithms is increasingly offering a promising alternative. In recent years, there has been growing interest in using artificial intelligence models, especially in the field of hydrology. The main advantage of these soft modelling techniques is that they can be set up in considerably less time and the model response can also be obtained quickly, thus reducing cost. Moreover, these techniques can be used for modelling systems on a real-time basis and can even be updated continuously.

Figure 2.1 shows a schematic outline of the different steps in the modelling process, designed on the basis of experience and observation. The starting point is the perceptual model of the rainfall-runoff processes in a catchment. The perceptual model is the summary of our perceptions of how the catchment responds to rainfall under different conditions. It reflects our views, shaped by previous studies, the data sets that have been analysed and, particularly, experience of field sites in different environments. The perceptual model is not constrained by mathematical theory (Beven, 2001).
However, a mathematical description is traditionally the first stage in the formulation of a model that will make quantitative predictions.

[Flow chart: selection of the data (quality and quantity); deciding on the processes or model structures; the conceptual model (deciding on the equations); model calibration (getting values of parameters); model validation/testing; declare success? If yes, OK. If no, revise the parameter values, the equations, the perceptions/model structures, or the quality and quantity of the data.]

Figure 2.1 A schematic outline of the different steps in the modelling process

The steps involved in the identification of the perceptual model of a system are the selection of input-output data suitable for calibration and validation, followed by the selection of a model structure and the estimation of its parameters. Before we can apply the model, it is generally necessary to go through a stage of parameter calibration. This is because all the models used in hydrology have equations that involve a variety of different input and state variables. There are variables that define the time-variable boundary conditions during simulation, such as the rainfalls and other meteorological variables at a given time step, and there are the initial values of the state variables, such as water table depth. Finally, there are the model parameters, which define the characteristics of the catchment area or flow domain. The most commonly used method of parameter calibration is to adjust the values of the parameters to achieve the best match between the model predictions and any available observations of the actual catchment response. The next stage is validation or evaluation of those predictions. Most model structures have a sufficient number of parameters that can be varied to allow reasonable fits to the data. It is usually much more difficult to find a model that is totally acceptable.
The differences may lead to a revision of the parameter values being used, to a reassessment of the conceptual model being used, or to a revision of the perceptual model of the catchment as understanding is gained from the attempt to model the hydrological processes. Bronstert (1999) reported the following lessons from experience in using distributed models:

(a) There is still a lack of knowledge about how to represent some processes at or near the soil surface, including surface crusting and preferential flows.

(b) In many practical applications the data available on soil characteristics and boundary conditions, with all their spatial and temporal variability, may not be adequate to support the use of a fully distributed model.

(c) In some cases experience suggests that the predicted responses are highly sensitive to small changes in parameters and in initial and boundary conditions. It may be necessary to allow for this sensitivity by considering ranges of parameter values and the propagation of the resulting uncertainty through the model.

When applying a neural network to the rainfall-runoff problem, the stimulus is obviously the rainfall, and the response is the runoff at the catchment outlet or downstream. Since the flow at any point in time is effectively composed of contributions from different sub-areas whose times of travel to the outlet cover a range of values, both the concurrent and the antecedent rainfalls should be considered as stimuli (see Dawson and Wilby, 1998). The number of antecedent rainfall ordinates required is broadly related to the lag time of the drainage area. In addition, the use of previous output variables (previous runoff values) in the input pattern is also encountered in some cases and is referred to as a recurrent network. Dawson and Wilby (1998) point out the importance of careful data preparation and of the choice of the set of input variables. In this study, the writer has placed emphasis on artificial neural network (ANN) methods.
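As an illustration of how concurrent rainfall, antecedent rainfall and previous runoff values can be arranged into input patterns, a minimal Python sketch is given below. It is not the code used in this study; the function name `build_patterns` and the parameters `n_rain_lags` and `n_flow_lags` are hypothetical, and the numbers of lags are assumed to be chosen by the modeller (for example, in relation to the lag time of the drainage area).

```python
def build_patterns(rainfall, runoff, n_rain_lags, n_flow_lags=0):
    """Pair each runoff value with its input pattern.

    The pattern for time t holds the concurrent rainfall P(t), the
    antecedent rainfalls P(t-1) ... P(t-n_rain_lags) and, optionally,
    the previous runoff values Q(t-1) ... Q(t-n_flow_lags).
    All names here are illustrative assumptions.
    """
    if len(rainfall) != len(runoff):
        raise ValueError("rainfall and runoff series must have equal length")
    if n_rain_lags < 0 or n_flow_lags < 0:
        raise ValueError("lag counts must be non-negative")

    # The first usable time step is the one with a full history behind it.
    start = max(n_rain_lags, n_flow_lags)
    inputs, targets = [], []
    for t in range(start, len(rainfall)):
        rain = [rainfall[t - k] for k in range(n_rain_lags + 1)]
        flow = [runoff[t - k] for k in range(1, n_flow_lags + 1)]
        inputs.append(rain + flow)
        targets.append(runoff[t])
    return inputs, targets


if __name__ == "__main__":
    # Made-up numbers, used only to show the layout of the patterns.
    P = [0.0, 5.0, 10.0, 2.0, 0.0]
    Q = [1.0, 2.0, 6.0, 8.0, 4.0]
    X, y = build_patterns(P, Q, n_rain_lags=2, n_flow_lags=1)
    for pattern, target in zip(X, y):
        print(pattern, "->", target)
```

Each row of the returned input list would be presented to the input layer of the network, with the corresponding runoff value as the target output.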
The objective of the applications approach is to create ANN models that can be used to tackle real-life problems.

2.5 Artificial Neural Network

The development of ANN was begun by McCulloch and Pitts in 1943, inspired by a desire to understand the human brain and emulate its functioning. Although the idea of ANN was proposed by McCulloch and Pitts over fifty years ago, the development of ANN techniques has experienced a renaissance only in the last decade due to Hopfield’s (1982) effort, because the current algorithms overcome the limitations of early networks. A tremendous growth in interest in this computational mechanism has occurred since Rumelhart et al. (1986) rediscovered a mathematically rigorous theoretical framework for neural networks, i.e., the backpropagation algorithm (ASCE, 2000).

Artificial Intelligence (AI) can be broadly defined as computer processes that attempt to emulate the human thought processes associated with activities that require the use of intelligence (Tsoukalas and Uhrig, 1997). The term AI, in its broadest sense, encompasses a number of technologies that include, but are not limited to, expert systems, neural networks, genetic algorithms, fuzzy logic systems, cellular automata, chaotic systems, and anticipatory systems. An interesting aspect of artificial intelligence is the use of computers to model the behavioural aspects of human reasoning and learning. In problem solving, one must proceed from a beginning (the initial state) to the end (the goal state) via a limited number of steps (Encyclopaedia, 2000). The method used to construct expert systems, knowledge engineering, extracts a set of rules and data from an expert or experts through extensive questioning. This material is then organized in a format suitable for representation in a computer, and a set of tools for inquiry, manipulation, and response is applied.
While such systems do not often replace the human experts, they can serve as useful adjuncts or assistants.

Meanwhile, an artificial neural network (ANN) can be defined as “a data processing system consisting of a large number of simple, highly interconnected processing elements (artificial neurons) in an architecture inspired by the structure of the cerebral cortex of the brain” (Tsoukalas and Uhrig, 1997). Work on artificial neural network models has a long history. The development of artificial neural networks began more than 40 years ago (Lippmann, 1987), motivated by a desire both to understand the brain and to emulate some of its strengths (Fausett, 1994). The brain can be considered as a highly complex, nonlinear, and parallel computer. It has the capability of organizing neurons so as to perform certain computations many times faster than the fastest digital computer in existence today (Haykin, 1994). Neural nets use a number of simple computational units called “neurons”, each of which tries to imitate the behaviour of a single human brain cell. Altrock (1995) refers to the brain as a “biological neural net” and to implementations on computers as “neural nets”. ANN were modelled on the functions of the mammalian nervous system. The network structure is formed by two main units: neurons and the interconnections between them. Various mathematical models are based on the neuron concept. Figure 2.2 shows a simple mathematical model of a neuron. A three-layer neural network architecture built from such neurons is a universal function approximator. The main control parameters of a neural network model are the interneuron connection strengths, also known as weights, and the biases. The optimum values of these connection weights and biases are determined by minimising an objective function, usually the mean square error. The weights $W_{ij}$ are “learned” to fit any function. First, the so-called propagation function combines all inputs $X_i$ that stem from the sending neurons.
The means of combination is a weighted sum, where the weights $W_i$ represent the synaptic strength. Exciting synapses have positive weights; inhibiting synapses have negative weights. To express a background activation level of the neuron, an offset (bias) $\Theta$ is added to the weighted sum. The so-called activation function then computes the output signal $Y$ of the neuron from the activation level $f$. The activation function is of the sigmoid type, as plotted in the lower right box of Figure 2.2.

[Diagram: a biological neuron beside an artificial neuron with inputs $X_1, \ldots, X_n$, weights $W_1, \ldots, W_n$, the propagation function $f = \sum_{i=0}^{n} W_i \cdot X_i + \Theta$, the activation function, and the output $Y$.]

Figure 2.2 Simple mathematical model of a neuron

During learning, a neuron receives inputs from the input layer or the previous layer, weights each input with a pre-assigned value, and combines these weighted inputs. All inputs are combined by a weighted sum (the propagation function). The combination of the weighted inputs is represented as

$$\mathrm{net}_j = \sum_{i} W_{ij} X_i \qquad (2.1)$$

where $\mathrm{net}_j$ is the summation of the weighted inputs for the $j$-th neuron; $W_{ij}$ is the weight from the $i$-th neuron in the previous layer to the $j$-th neuron in the current layer; and $X_i$ is the input from the $i$-th neuron to the $j$-th neuron. The $\mathrm{net}_j$ is either compared to a threshold or passed through a transfer function to determine the level of activation. If the activation of a neuron is strong enough, it produces an output that is sent as an input to other neurons in the successive layer.

2.5.1 Basic Structure

A network can have one or more layers. As shown in Figure 2.3, the basic structure of a network usually consists of three layers: the input layer, where the data are introduced to the network; the hidden layer or layers, where the data are processed; and the output layer, where the results for given inputs are produced. The layers between the input and output layers are called hidden layers, and there can be one or more of them.
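The weighted-sum neuron of Equation 2.1 and the three-layer structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: a sigmoid activation is assumed for every neuron, the bias plays the role of the offset $\Theta$, and the function names (`neuron_output`, `layer_output`, `forward`) are hypothetical.

```python
import math


def sigmoid(net):
    """Logistic sigmoid, mapping any activation level into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))


def neuron_output(inputs, weights, bias):
    """Propagation function (weighted sum plus bias) followed by the
    sigmoid activation function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(net)


def layer_output(inputs, weight_rows, biases):
    """Outputs of one layer; each row of weight_rows belongs to one neuron."""
    return [neuron_output(inputs, row, b) for row, b in zip(weight_rows, biases)]


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """Forward pass through a three-layer network: input -> hidden -> output."""
    hidden = layer_output(inputs, hidden_weights, hidden_biases)
    return layer_output(hidden, output_weights, output_biases)


if __name__ == "__main__":
    # Arbitrary weights chosen for illustration: 2 inputs, 2 hidden neurons, 1 output.
    y = forward([1.0, 2.0],
                hidden_weights=[[0.5, -0.25], [0.1, 0.3]],
                hidden_biases=[0.0, -0.2],
                output_weights=[[0.7, -0.4]],
                output_biases=[0.1])
    print(y)
```

In this sketch the input layer performs no computation; it only passes the data on to the hidden layer, which is consistent with the description of the input layer as the place where the data are introduced to the network.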
Determining the structure of the hidden layers and the number of neurons is important in multilayer perceptron modelling. There is no hard and fast rule for defining network parameters. Masters (1993) gave three simple guidelines to follow. The first is to use one hidden layer; the second is to use very few hidden neurons; and the third is to train until we cannot stand it any more. He suggested that one hidden layer was sufficient for a multilayer perceptron because in most problems a second hidden layer will not produce a large improvement in performance. The use of more than one hidden layer substantially increases the number of parameters to be estimated. Such an increase in the number of parameters may slow the calibration process without substantially improving the efficiency of the network (Masters, 1993). It has been reported that, for the vast majority of practical problems, there is no reason to use more than one hidden layer. Additional hidden layers should be added only when a single hidden layer has been found to be inadequate (Tsoukalas and Uhrig, 1997).

[Diagram: input layer with inputs $x_i$ ($i = 1, \ldots, m$), hidden layer with weights $w_{ij}$, and output layer with outputs $y(k)$.]

Figure 2.3 A three-layer neural network with i inputs and k outputs

The architecture of a neural network is defined by the weights between neurons, a transfer function that controls the generation of output in a neuron, and learning laws that define the relative importance of weights for input to a neuron (Caudill, 1987). In practice, however, neural networks cannot provide a solution by working alone. Rather, they need to be integrated into a consistent system engineering approach. Neural nets operate analogously. The most distinctive characteristic of an ANN is its ability to learn from examples. A net must be trained by being repeatedly fed input data together with the corresponding target outcomes.
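The remark above that additional hidden layers increase the number of parameters to be estimated can be made concrete with a simple count. The sketch below assumes a fully connected feed-forward network in which every neuron outside the input layer has one bias; the function name `count_parameters` and the example layer sizes are illustrative assumptions only.

```python
def count_parameters(layer_sizes):
    """Count the weights and biases of a fully connected feed-forward network.

    layer_sizes lists the number of neurons per layer, from the input
    layer to the output layer, e.g. [5, 4, 1]. Each neuron in a
    non-input layer has one weight per neuron in the previous layer
    plus one bias.
    """
    if len(layer_sizes) < 2 or any(n < 1 for n in layer_sizes):
        raise ValueError("need at least two layers, each with one or more neurons")
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


if __name__ == "__main__":
    print(count_parameters([5, 4, 1]))     # one hidden layer of 4 neurons
    print(count_parameters([5, 4, 4, 1]))  # a second hidden layer of 4 neurons
```

For these assumed sizes, the single-hidden-layer network has 29 parameters and the two-hidden-layer network has 49, so the second hidden layer adds 20 parameters that must be estimated from the same calibration data.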
Learning (or training) is defined as self-adjustment of the network weights in response to changes in the information environment. When a set of inputs is presented, a network adjusts its weights, based on a certain algorithm, in order to approximate the target output (the observed or measured output). Learning in ANN involves three elements: the weights between neurons, which define the relative importance of the inputs; a transfer function, which controls the generation of the output from a neuron; and learning laws, which describe how the weights are adjusted during training (Caudill, 1987). The net of a neuron is passed through an activation or transfer function to produce the output of the neuron. In backpropagation networks, the modification of the network weights is accomplished using the derivative of the activation function. Therefore, continuous transfer functions are desirable. The sigmoid and hyperbolic tangent functions are the most commonly used continuous transfer functions in backpropagation networks (see Lippmann, 1987; ASCE, 2000; Tokar and Johnson, 1999; Tokar and Markus, 2000).

2.5.2 Transfer Function

The transfer function determines the level of activation of a neuron by scaling the net. If the activation of a neuron is strong enough, it will produce an output and send the result to other neurons. The most commonly used transfer functions are the hard-limit and threshold-logic functions, continuous functions such as the sigmoid and hyperbolic tangent, and Gaussian functions. Some examples of typical transfer functions are illustrated in Figures 2.4 to 2.7. The simplest transfer functions are the threshold-logic function and the hard-limit function. The difference between them is that the activation of a neuron between the upper and lower limits of the threshold is zero for the threshold-logic function, whereas it varies linearly with the value of net for the hard-limit function.
In early neural network architectures, these two types of transfer function were used in the training process, but they proved to be inefficient. In advanced neural network structures, a learning process using the derivative of the transfer function is usually employed. These types of network require a transfer function that is continuous and differentiable everywhere. Continuous functions have therefore been adopted for the advanced forms of network. The accuracy of a network trained using the hyperbolic tangent transfer function was slightly better than that of one trained using the sigmoid transfer function (Tokar, 1996). The hyperbolic tangent function shown in Figure 2.6 is one of the most commonly used continuous functions. As shown in Figure 2.6, the difference between the sigmoid and the hyperbolic tangent functions is that the latter is bipolar. Meanwhile, Figure 2.7 shows the Gaussian function. Compared to the other types of transfer function, the Gaussian function is the one most commonly used for radial basis function (RBF) networks.

[Plots of out against net]

Figure 2.4 A threshold-logic transfer function

Figure 2.5 A hard-limit transfer function

Figure 2.6 Continuous transfer function: (a) the sigmoid, (b) the hyperbolic tangent

Figure 2.7 The Gaussian function

2.5.3 Back-propagation Algorithm

Several neural network training algorithms exist and are widely used by researchers. In this study, the training of the ANN was accomplished by a backpropagation algorithm. Backpropagation is the most commonly used supervised training algorithm for multilayer feed-forward networks (Tokar and Johnson, 1999). Backpropagation, a supervised learning algorithm that performs a gradient descent search in weight space using the generalized delta rule, is often reported in applications (Minns and Hall, 1996).
Backpropagation is a systematic method for training multilayer (three or more layers) ANN (Tsoukalas and Uhrig, 1997). The least mean square error method, along with the generalized delta rule, is used to optimize the network weights in a backpropagation network. The objective of a backpropagation network is to find the weights that approximate the target output values with a selected accuracy. The backpropagation algorithm involves two steps. The first step is a forward pass, in which the effect of the input is passed forward through the network to reach the output layer. The network output is compared with the desired target output, and an error is computed. After the error is computed, the second step starts backward through the network. The errors at the output layer are propagated back toward the input layer, with the weights being modified along the way. In the backpropagation algorithm, the network weights are thus modified so as to minimize the error between the target and computed outputs. The momentum factor can speed up training in very flat regions of the error surface and help prevent oscillations in the weights. A learning rate is used to increase the chance of avoiding the training process being trapped in local minima instead of the global minimum (ASCE, 2000). Such a network, with learning governed by the generalized delta rule, is typically called a backpropagation neural network. This algorithm was developed earlier by Werbos (1974) as part of his Ph.D. dissertation at Harvard University (Tsoukalas and Uhrig, 1997). Nevertheless, its power was not recognized and appreciated for many years. The elucidation of this training algorithm in 1986 by Rumelhart, Hinton, and Williams was the key step in making neural networks practical in many real-world situations. Today, it is estimated that 80% of all applications utilize this backpropagation algorithm in one form or another (Tsoukalas and Uhrig, 1997).
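The two steps described above (forward pass, then backward propagation of the error with weight updates governed by a learning rate and a momentum factor) can be sketched as follows. This is a minimal illustration and not the MATLAB implementation used in this study. It assumes one hidden layer, sigmoid transfer functions throughout, a single output neuron, pattern-by-pattern updates, and the weight change rule $\Delta w(n) = \eta\,\delta\,x + \alpha\,\Delta w(n-1)$; the function name `train` and all parameter values are hypothetical.

```python
import math
import random


def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))


def train(patterns, n_hidden=3, rate=0.3, momentum=0.5,
          epochs=2000, target_sse=1e-4, seed=0):
    """Train a one-hidden-layer network with a single sigmoid output.

    patterns is a list of (inputs, target) pairs with targets in (0, 1).
    Returns the hidden weights, the output weights and the per-epoch
    sum of squared errors (accumulated while the weights are changing).
    """
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    # Small random initial weights; the last entry of each row is the bias.
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    dw_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    dw_out = [0.0] * (n_hidden + 1)
    history = []

    for _ in range(epochs):
        sse = 0.0
        for x, target in patterns:
            # Step 1: forward pass.
            xb = list(x) + [1.0]
            hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                      for row in w_hid]
            hb = hidden + [1.0]
            out = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
            err = target - out
            sse += err * err

            # Step 2: backward pass, using the sigmoid derivative y(1 - y).
            delta_out = err * out * (1.0 - out)
            delta_hid = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                         for j in range(n_hidden)]

            # Weight changes: learning-rate term plus momentum term.
            for j in range(n_hidden + 1):
                dw_out[j] = rate * delta_out * hb[j] + momentum * dw_out[j]
                w_out[j] += dw_out[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    dw_hid[j][i] = (rate * delta_hid[j] * xb[i]
                                    + momentum * dw_hid[j][i])
                    w_hid[j][i] += dw_hid[j][i]

        history.append(sse)
        if sse <= target_sse:  # stop once the allowable error is reached
            break
    return w_hid, w_out, history


if __name__ == "__main__":
    # Synthetic data for illustration only: target = 0.2 + 0.6 * input.
    data = [([k / 4.0], 0.2 + 0.6 * k / 4.0) for k in range(5)]
    _, _, sse_history = train(data)
    print("first epoch SSE:", sse_history[0])
    print("last epoch SSE:", sse_history[-1])
```

The hidden-layer deltas are computed before the output weights are updated, so that the error is propagated back through the same weights that produced the forward pass.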
2.5.4 Learning or Training

The objective of a neural network is to process information in the way it has previously been trained to do. A neural network can learn from experience, generalize from previous examples to new ones, and abstract essential characteristics from inputs containing irrelevant data (Fausett, 1994). Calibration, or training, uses sample data sets of inputs and corresponding outputs to fit the neural network. The increased knowledge of the biological background of the brain and the rapidly developing knowledge of electronics and automatic computation initiated the development of artificial neurons and neural networks. According to Tokar (1996), learning can be categorized into three fundamental groups based on whether outputs are provided or not, namely supervised learning, graded or reinforcement learning, and unsupervised learning or self-organization. The most popular and easiest way to train a network is supervised learning. In supervised learning, a pair of input and output vectors, called a training pair or set, is introduced to the network. The network computes the outputs for a given set of inputs and compares these outputs with the target outputs. The difference between the computed outputs and the target outputs is used to modify the weights in the network using a method that minimizes the difference or error. This procedure is repeated until an acceptable level of accuracy is reached (Hecht-Nielsen, 1991). In unsupervised learning, the training set is composed of inputs only; no outputs are provided. The network does not have any knowledge of what the correct answer is. Therefore, the network modifies the weights in such a way that similar inputs yield similar outputs. During the training process, statistical characteristics of the given data are extracted and similar inputs are arranged into similar classes (Wassermann, 2000). Meanwhile, reinforcement learning lies somewhere between supervised and unsupervised learning.
In reinforcement learning, a score or grade is given to the network based on its performance over a series of training trials, instead of a target output for each trial (Hecht-Nielsen, 1991).

Using the historical data, the calibration (training) and verification (testing) of the model are carried out for the rainfall-runoff data series at selected catchments. Figure 2.8 shows the steps in training and testing the neural network model. The models are trained in order to calibrate their parameters. Training starts with small, randomly generated weights. After the predefined target error for all patterns is reached, the training process is stopped and the weights are saved. These weights and the same model architecture are then used in the verification (testing) phase. In the development of the rainfall-runoff relationship models, the networks were trained using a backpropagation algorithm and one or more hidden layers. The weights are updated iteratively using the generalized delta rule, or steepest gradient descent principle. Training of the networks was accomplished using the MATLAB software developed by MathWorks Inc. (2000). After a sufficient number of training iterations, the network learns to recognize patterns in the data and creates an internal model of the process governing the data. The training process is stopped when no appreciable change is observed in the values associated with the connection links or when some termination criterion is satisfied. The network can then use this internal model to make predictions for new input conditions. Evaluation of the model performance will be based on the generated errors and the model robustness. The MLP, a multilayer feedforward neural network, is so far the model most commonly used in hydrological modelling for input-output function approximation.
MLP networks have been applied successfully to solve some difficult and diverse problems by training them in a supervised manner. The MLP is a supervised neural network that learns the mapping between input data and target data. Meanwhile, RBF networks may require more neurons than standard feed-forward backpropagation networks, but they can often be designed in a fraction of the time it takes to train standard feed-forward networks. They work best when many training vectors are available (SPSS Inc., 1995).

[Flow chart: given model inputs; determination of model architecture and parameters; parameter estimation; is the total sum of square error between computed outputs and target values ≤ allowable? If no, return to parameter estimation; if yes, proceed to verification.]

Figure 2.8 Steps in training and testing

2.6 Neural Network Application

Several studies indicate that neural networks have proven to be potentially useful tools in hydrological modelling, such as for modelling of rainfall-runoff processes (Buch et al., 1993; Anamala et al., 1995; Smith and Eli, 1995; Shamseldin, 1997; Abrahart et al., 1999); flow prediction (Ichiyanagi, 1993; Karunithi et al., 1994; Dibike and Solomatine, 1999; Tingsanchali, 2000; Elshorbagy et al., 2000); water quality prediction (Maier and Dandy, 1996); operation of reservoir systems (Sakakima et al., 1992; Raman and Chandramouli, 1996; Harun, 1999); water demand forecasting (Grino, 1992; Zhang et al., 1994); groundwater reclamation problems (Ranjithan and Eheart, 1993); and modelling of rating curves (Tawfik et al., 1997). A case study of runoff simulation of a Himalayan glacier basin using an artificial neural network was presented by Buch et al. (1993). It was found that the neural network performance was superior to that of the energy balance and multiple regression models. In addition, it was observed that the neural network was faster in learning and exhibited excellent system generalization characteristics.
Smith and Eli (1995) used a back-propagation neural network model to predict only the peak discharge and the time to peak resulting from a single rainfall pattern. Tokar and Johnson (1999) employed an artificial neural network (ANN) to forecast daily runoff as a function of daily precipitation, temperature, and snowmelt for the Little Patuxent River watershed in Maryland. It was found that the ANN model provides a more systematic approach, reduces the length of calibration data required, and shortens the time spent in calibration of the models. Further, Tokar and Markus (2000) applied the ANN technique to model watershed runoff in three basins with different climatic and physiographic characteristics. The ANN technique was applied to model monthly streamflow. It was found that ANN models could be powerful tools in modelling the precipitation-runoff process for various time scales, topographies, and climate patterns. At the same time, the technique represents an improvement upon the prediction accuracy and flexibility of current methods.

2.7 Neural Network Modelling in Hydrology and Water Resources

In hydrology, many problems are not clearly understood or are too ill-defined for a meaningful analysis using physically-based methods. Even when such models are available, they have to rely on assumptions that make neural networks seem more attractive. Moreover, neural networks routinely model the non-linearity of the underlying process without having to solve complex partial differential equations. The presence of noise in the inputs and outputs is handled by neural networks without severe loss of accuracy because of the distributed processing within the network. Many articles report that neural network methods can produce good models that accurately represent nonlinearities in the data (Bishop, 1995). Therefore, the natural behaviour of hydrological processes is appropriate for the application of neural network methods.
The non-linearity of hydrological processes and the existence of a noise component in them demand a solution that can promise a reliable result. The neural network technique is fairly new to water resources research. However, there is rapidly growing interest among water scientists in applying neural networks to various water resources problems. Hydrologists often face problems of predicting and estimating runoff, precipitation, contaminant concentrations, water stages, and so on, especially in ungauged catchments. Most hydrologic processes exhibit a high degree of temporal and spatial variability and are further plagued by issues of nonlinearity of physical processes, conflicting spatial and temporal scales, and uncertainty in parameter estimates. Hydrologists attempt to provide rational answers to problems that arise in the design and management of water resources. An attractive feature of ANN is their ability to extract the relation between the inputs and outputs of a process without the physics being explicitly provided to them. They are able to provide a mapping from one multivariate space to another, given a set of data representing that mapping. Even if the data are noisy and contaminated with errors, ANN have been known to identify the underlying rule. These properties suggest that ANN may be well suited to the problems of estimation and prediction in hydrology.

ANN have been applied in rainfall-runoff modelling. The relationship between rainfall and runoff is known to be highly non-linear and complex. Hydrologists and engineers are often faced with situations where little or no information is available. Information about rainfall and runoff is important in designing and managing watersheds and other projects related to water resources. Runoff is dependent on numerous factors such as initial soil moisture, land use, evaporation, transpiration, infiltration, distribution and so on.
It is important that a watershed be gauged to provide continuous records of runoff, rainfall, and so on, because these data are very useful to hydrologists and engineers for carrying out research and development, managing water resources projects, etc. Where no data are available for a catchment, simulation models are often used to generate synthetic flows. Several researchers have investigated the potential of neural networks in modelling watershed runoff based on rainfall inputs. Hsu et al. (1995) used a three-layer feedforward ANN to model the daily rainfall-runoff relationship. They concluded that the feedforward ANN needed a trial-and-error procedure to find the appropriate number of time-delayed input variables to the model. ANN was able to provide a representation of the dynamic internal feedback loops in the system, eliminating the need for lagged inputs and resulting in a compact weight space. They found that ANN performed well at runoff prediction.
Tokar and Johnson (1999) employed ANN to forecast daily runoff as a function of daily precipitation, temperature, and snowmelt for the Little Patuxent River watershed in Maryland. The ANN model provided a more systematic approach, reduced the length of calibration data required, and shortened the time spent calibrating the models. They used a three-layer feedforward ANN and applied a trial-and-error procedure to find the appropriate number of input nodes to the model. They reported that ANN provides reasonably good solutions for complex systems that may be poorly defined or understood using mathematical equations, for problems that deal with noise or involve pattern recognition, and for situations where input data are incomplete and ambiguous by nature. ANN models provided higher training and testing accuracy when compared with regression and simple conceptual models.
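The studies above share a common setup: a three-layer feedforward network whose inputs are rainfall values at the current and earlier time steps, trained by back-propagation. The following is a minimal sketch of that idea, not the implementation used in any of the cited studies. The synthetic rainfall series, the made-up runoff relation, the number of lags, the hidden-layer size, and the learning rate are all assumptions chosen for illustration.

```python
import math
import random

def make_lagged_samples(rain, runoff, n_lags):
    """Pair rainfall at times t, t-1, ..., t-n_lags+1 with runoff at time t."""
    samples = []
    for t in range(n_lags - 1, len(rain)):
        inputs = [rain[t - k] for k in range(n_lags)]
        samples.append((inputs, runoff[t]))
    return samples

class ThreeLayerNet:
    """Input layer, one sigmoid hidden layer, and one linear output node."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = []
        for row, bias in zip(self.w1, self.b1):
            z = sum(w * xi for w, xi in zip(row, x)) + bias
            hidden.append(1.0 / (1.0 + math.exp(-z)))
        output = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, output

    def train_step(self, x, target, lr):
        """One back-propagation update on the squared error of one sample."""
        hidden, output = self.forward(x)
        err = output - target
        for j, h in enumerate(hidden):
            # Gradient at hidden node j, computed before w2[j] is changed.
            grad_h = err * self.w2[j] * h * (1.0 - h)
            self.w2[j] -= lr * err * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
            self.b1[j] -= lr * grad_h
        self.b2 -= lr * err

def mean_squared_error(net, samples):
    return sum((net.forward(x)[1] - y) ** 2 for x, y in samples) / len(samples)

# Synthetic data (assumption): runoff depends non-linearly on current rainfall
# and linearly on the previous step's rainfall.
rng = random.Random(1)
rain = [rng.random() for _ in range(200)]
runoff = [0.0] + [0.6 * rain[t] ** 2 + 0.3 * rain[t - 1] for t in range(1, 200)]

samples = make_lagged_samples(rain, runoff, n_lags=2)
train, test = samples[:150], samples[150:]   # calibration and testing parts

net = ThreeLayerNet(n_in=2, n_hidden=4)
mse_before = mean_squared_error(net, test)
for epoch in range(200):
    for x, y in train:
        net.train_step(x, y, lr=0.1)
mse_after = mean_squared_error(net, test)
print(f"test MSE before training: {mse_before:.5f}, after: {mse_after:.5f}")
```

In practice the number of lags and hidden nodes would be varied and compared on the testing part, which is the trial-and-error procedure the cited authors describe.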
Smith and Eli (1995) applied a back-propagation ANN model to predict peak discharge and time to peak over a hypothetical watershed. The output was either the watershed runoff alone or the runoff and the time to peak. The number of nodes in the hidden layer was determined by trial and error for each case. They found that, for single storm events, the peak discharge and the time to peak were predicted well by the neural network, both during training and testing. However, the ANN was less successful for multiple storm events. Shamseldin (1997) compared ANN with a simple linear model, a season-based linear perturbation model, and a nearest neighbour linear perturbation model. The study used a three-layer neural network and adopted the conjugate gradient method for training the model. The network output consisted of the runoff time series. The study found that neural networks generally performed better than the other models during training and testing. Minns and Hall (1996) adopted a three-layer neural network with the back-propagation algorithm. Data for network training consisted of model results from one storm sequence, and two such sequences were generated for testing. Each storm sequence was generated using a Monte Carlo procedure that preserved predetermined storm characteristics. Network inputs consisted of concurrent and 14 antecedent rainfall depths and 3 antecedent runoff values, and the network output was the current runoff. They found that ANN performance was hardly influenced by the level of nonlinearity, with performance deteriorating only slightly for high levels of nonlinearity.
Haykin (1994) showed that the design of a supervised neural network might be pursued in a number of different ways. While the back-propagation algorithm for the design of a multilayer perceptron (under supervision) may be viewed as an application of stochastic approximation, the design of radial basis function (RBF) networks can be viewed as a curve-fitting problem in a high-dimensional space.
Therefore, learning for such networks is equivalent to finding a surface in a multidimensional space that provides a best fit to the training data, with the criterion for ‘best-fit’ being expressed in a statistical sense.
ANN has been applied in streamflow estimation. Streamflows are often treated as estimates of runoff from a watershed. Karunanithi et al. (1994) found lag time to be important in predicting streamflows or runoff. This reflects the longer memory associated with streamflows. They claimed that ANN are likely to be more robust when noisy data are present in the inputs. Markus et al. (1997) used ANN with the back-propagation algorithm to predict monthly streamflows at the Del Norte gauging station in the Rio Grande Basin in southern Colorado. The inputs used were snow water equivalent and temperature. They assessed model performance using forecast bias and root mean square error, and found that ANN did a good job of predicting streamflows. The studies of Karunanithi et al. (1994) and Thirumalaiah and Deo (1998) directed network training to better replicate low streamflow events, whereas Poff et al. (1996) concentrated on high flow events to generate improved statistics for floods. They used ANN to evaluate the changes in the stream hydrograph under hypothetical climate change scenarios based on precipitation and temperature changes. The synthetic daily hydrograph was generated with historic precipitation and temperature as inputs. They presented a procedure developed using current technical literature, heuristics, and the experience of experts in artificial intelligence. They applied the back-propagation algorithm to cases where information was scarce.
ANN has been applied in water quality modelling. Several studies have applied ANN to address water quality related issues. Water quality is influenced by many factors such as flow rate, contaminant load, water levels, and other site-specific parameters.
Maier and Dandy (1996) examined the utility of ANN for estimating salinity at Murray Bridge on the River Murray in South Australia. The inputs to the ANN model were daily salinity values, water levels, and flows at upstream stations and at antecedent times. The network output was the 14-day advance forecast of river salinity. They used two hidden layers and the back-propagation algorithm for training. They concluded that the ANN model was able to replicate salinity levels fairly accurately based on the 14-day forecast. Meanwhile, Starrett et al. (1996) employed an ANN to predict pesticide leaching through turfgrass-covered soil. The input variables were pesticide solubility, rate of pesticide application, time since pesticide application, and type of irrigation practice. The ANN output was the percentage of pesticide that leached through 50 cm of turfgrass-covered soil. They found that the application of the ANN model was successful. Nagy et al. (2002) proposed a feedforward multilayer neural network approach to sediment transport using the back-propagation algorithm. The objective of their study was to estimate total sediment discharge concentration in streams and natural rivers. Meanwhile, Zou et al. (2002) proposed a neural network embedded Monte Carlo (NNMC) approach to account for uncertainty in water quality modelling. The framework of their proposed method has three major parts: a numerical water quality model, a neural network technique, and Monte Carlo simulation.
ANN has been applied in groundwater modelling. For example, Yang et al. (1997) applied an ANN to predict water table elevations in subsurface-drained farmlands. The inputs were daily rainfall, potential evapotranspiration, and previous water table locations. The output was the current location of the water table. They found that a three-layer feedforward ANN could predict water table elevations satisfactorily.
Aziz and Wong (1992) used ANN to determine aquifer parameter values from normalized drawdown data obtained from pumping tests. The input layer consisted of confined aquifer data and leaky confined aquifer data. The values of aquifer parameters predicted by the ANN compared well with results obtained using traditional methods. Other ANN applications to groundwater problems were investigated by Ranjithan et al. (1993). They concluded that the pattern recognition strengths of ANN are particularly useful for identifying the more critical realizations, which simplifies the problem of identifying optimal pumping strategies to control the hydraulic gradient. Rogers and Dowla (1994) employed an ANN to perform optimization studies in groundwater remediation. They investigated hypothetical scenarios of one or several contaminant plumes moving through a groundwater region with a number of pumping wells. The optimization arises in trying to minimize the total volume of pumping. A multilayer feedforward ANN was applied together with the back-propagation algorithm. The input nodes represented possible pumping cases, with each well assigned a value of 1, indicating that the well was pumping at maximum capacity, or 0, indicating that the well was off. The ANN output indicated whether or not the pumping realization was successful, taking the value 1 if successful and 0 if not. They found that ANN models are robust and flexible tools that can be used for planning effective strategies in groundwater remediation.
ANN has been applied for estimating precipitation. Hydrologists frequently face the problem of ungauged catchments. This matters because precipitation information is very important in water resources design and management. Precipitation serves as the driving force for most hydrologic processes. It is difficult to predict because it exhibits a large degree of spatial and temporal variability. French et al.
(1992) used a three-layer feedforward ANN with the back-propagation algorithm to forecast rainfall intensity fields. They also studied the impact of the number of hidden nodes, using 15, 30, 45, 60, and 100 hidden nodes. They compared the ANN with other forecasting models and found that the ANN model performed slightly better than these models during the training and testing stages once a suitable architecture had been identified. Zhang et al. (1997) proposed that ANN need to be employed in groups when the transformation from the input to the output space is complex. This group theory treats the input-output mapping as being piecewise continuous. They found that ANN were successful in making half-hourly rainfall estimates. Meanwhile, Hsu et al. (1998) developed a modified counter-propagation ANN for transforming satellite infrared images to rainfall rates over a specified area. They trained the connection weights between the input and hidden layers separately from those between the hidden and output layers. The training algorithm used was back-propagation. They used an unsupervised self-organizing clustering procedure based on the principle of competition. They showed comparisons between observed and predicted rainfall rates. They found that ANN provided a good estimation of rainfall and yielded some insights into the functional relationships between the input variables and the rainfall rate.
2.7.1 Versatility of Neural Network Method
Very often in hydrology, problems are not clearly understood or are too ill-defined for a meaningful analysis using physically-based methods. Even when such models are available, they have to rely on assumptions, which makes ANN seem more attractive. Moreover, ANN routinely model the nonlinearity of the underlying process without having to solve complex partial differential equations.
Unlike regression-based techniques, there is no need to make assumptions about the mathematical form of the relationship between input and output. The presence of noise in the inputs and outputs is handled by an ANN without severe loss of accuracy because of distributed processing within the network. This, along with the nonlinear nature of the activation function, enhances the generalizing capabilities of ANN and makes them desirable for a large class of problems in hydrology. Markus (1997) noted that the versatility of neural networks will allow them to be applied to an increasing range of hydrologic problems in the future. The neural network model, amongst the black box models, has gained wider applicability because the functional form between the input variables and the output need not be defined a priori. ASCE (2000) outlined the following reasons why ANN have become an attractive computational tool:
(i) They are able to recognize the relation between the input and output variables without explicit physical consideration.
(ii) They work well even when the training sets contain noise and measurement errors.
(iii) They are able to adapt to solutions over time to compensate for changing circumstances.
(iv) They possess other inherent information-processing characteristics and, once trained, are easy to use.
(v) They can evolve on their own a nonlinear relationship between input and output.
(vi) The model is highly flexible in structure.
Neural networks have a built-in capability to adapt their synaptic weights to changes in the surrounding environment. In particular, a neural network trained to operate in a specific environment can be easily retrained to deal with minor changes in the operating environmental conditions. Moreover, when it is operating in a non-stationary environment, for example one whose statistics change with time, a neural network can be designed to change its synaptic weights in real time.
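The idea of weights that adapt in real time to a non-stationary environment can be illustrated with a single linear neuron. The sketch below uses the least-mean-squares (delta rule) update, which is chosen here for illustration and is not taken from the sources cited above. The input range, the learning rate, and the sudden change of the true gain from 2.0 to 0.5 are all assumptions.

```python
import random

def lms_track(inputs, targets, lr=0.1):
    """Update one weight after every observation and record its path."""
    weight = 0.0
    history = []
    for x, y in zip(inputs, targets):
        error = y - weight * x        # prediction error on the new sample
        weight += lr * error * x      # delta-rule correction
        history.append(weight)
    return history

# Hypothetical non-stationary process: the true gain switches halfway through.
rng = random.Random(0)
xs = [rng.uniform(0.5, 1.5) for _ in range(400)]
ys = [(2.0 if t < 200 else 0.5) * x for t, x in enumerate(xs)]

weights = lms_track(xs, ys)
print(f"weight before the change: {weights[199]:.3f}")
print(f"weight at the end:        {weights[399]:.3f}")
```

Because each observation updates the weight immediately, the neuron follows the change in the underlying relation without being retrained from scratch.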
2.8 Bivariate Linear Regression and Correlation in Hydrology
Correlation and regression procedures are widely used in hydrology, water resources engineering, and other sciences. Their use has been greatly enhanced in recent decades by the availability of computers and of numerous computer programs for various aspects of the analyses. The premise of the methods is that one variable is often conditioned by the value of another, or of several others, or that the distribution of one may be conditioned by the value of another. Regression analyses are among the oldest statistical techniques used in hydrology. In the past they were used primarily to transfer information between points at which the same variable was observed, or among several variables observed simultaneously. This included the estimation of missing data in hydrologic series and the prediction of a variable from one or several other observed variables. The explicit determination of the regression equation is, in a sense, the final product of the analysis. The following briefly outlines analyses by researchers who used this technique for prediction.
Laursen (1958) proposed a relationship that gives both the quantity and quality of total, suspended, and bed loads as functions of stream and sediment characteristics. Colby (1964) developed graphical solutions for total load based on laboratory and field data. Meanwhile, Brownlie (1983) succeeded in obtaining an improved solution of the one-dimensional equation in the form of a fairly simple regression equation. Kitamura and Nakayama (1985) proposed a multiple linear regression model to describe the relationships between monthly rainfall and monthly net inflows for the Pedu-Muda reservoir system. Karim and Kennedy (1990) derived a relation between flow velocity, sediment discharge, bed-form geometry, and friction factor of alluvial rivers using a non-linear form of multiple regression analysis.
Harun (1999) proposed multiple linear regression for modelling stochastic reservoir net inflow processes.
Regression represents a mathematical equation expressing one random variable as being correlatively related to another random variable, or to several random variables. The regression equation may be any function that can be fitted to a set of points of observed variables. The selection of the function to be fitted to the points determines the type and the degree of correlative association. Determining mathematical models of correlative association of two or more variables, so that the best prediction of one variable can be obtained from the other variable or variables, is referred to as regression analysis, and the models are called regression functions.
Correlation and regression techniques provide a powerful means to identify the mathematical dependence between observed values of physically related variables and can account for the additional information contained in correlated sequences of events. Sampling errors are reduced and the reliability of estimates is improved. In addition to predicting the mean or expected value of a hydrologic variable such as rainfall, runoff, or peak flow, the technique can be used to predict the expected value of other statistical parameters, for example, standard deviation, skewness, or autocorrelation. The correlation is determined between the desired statistical parameter as the dependent variable and the appropriate physical and climatic variables within the basin or region as the independent variables. The procedures are significantly better than using relatively short historical sequences and point-frequency analysis. Not only does the method reduce the inherently large sampling errors, but it also furnishes a means to estimate parameters at ungauged locations. There are limitations to the techniques of fitting regression.
First, the analyst assumes the form of the model, which can express only linear, or logarithmically linear, dependence. Second, the analyst selects the independent variables to be included in the regression analysis. Third, the theory assumes that the independent variables are indeed independent and are observed or determined without error. Advanced statistical methods that are beyond the scope of this text offer means to overcome some of these limitations, but in practice it may be impossible to satisfy them. Therefore, care must be exercised in selecting the model and in interpreting results.
Accidental or casual correlation may exist between variables that are not functionally correlated. For this reason, correlation should be determined between hydrologic variables only when a physical relation can be presumed. Because of the natural dependence between many factors treated as independent variables in hydrologic studies, the correlation between the dependent variable and each of the independent variables is different from the relative effect of the same independent variables when analyzed together in a multivariate model. One way to guard against this effect is to screen the variables initially by graphical methods. Another is to examine the results of the final regression equation to determine physical relevance.
Alternatively, regression techniques themselves aid in screening significant variables. When electronic computation is available, a procedure can be followed in which successive independent variables are added to the multiple regression model, and the relative effect of each is judged by the increase in the multiple correlation coefficient. Although statistical tests can be employed to judge significance, it is otherwise useful to specify that any variable remain in the regression equation if it contributes or explains, say, 1 or 5 percent of the total variance, or of $R^2$.
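The screening procedure just described, in which variables are added one at a time and kept only if they raise $R^2$ by a set amount, can be sketched as follows. This is an illustrative sketch rather than a reference implementation: the helper names, the synthetic data, and the 1 percent threshold (`min_gain=0.01`) are assumptions, and the solver assumes the candidate variables are not perfectly collinear.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def forward_select(candidates, y, min_gain=0.01):
    """Add variables one at a time while the best one raises R^2 by min_gain."""
    chosen, current_r2 = [], 0.0
    remaining = dict(candidates)
    while remaining:
        gains = {}
        for name, col in remaining.items():
            trial = [candidates[c] for c in chosen] + [col]
            gains[name] = r_squared(trial, y) - current_r2
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:
            break
        chosen.append(best)
        current_r2 += gains[best]
        del remaining[best]
    return chosen, current_r2

# Synthetic example (assumption): runoff depends on current and previous
# rainfall; the third candidate has no relation to runoff.
rng = random.Random(42)
rain = [rng.random() for _ in range(101)]
candidates = {
    "rain": rain[1:],
    "prev_rain": rain[:-1],
    "unrelated": [rng.random() for _ in range(100)],
}
runoff = [0.7 * r + 0.2 * p + rng.gauss(0.0, 0.02)
          for r, p in zip(candidates["rain"], candidates["prev_rain"])]

chosen, r2 = forward_select(candidates, runoff, min_gain=0.01)
print(chosen, round(r2, 3))
```

A gain threshold on $R^2$ is only one possible stopping rule; a formal significance test on each added variable could be substituted without changing the overall structure.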
A frequently used method is to compute the partial correlation coefficient for each variable. This statistic represents the relative decrease in the remaining variance $(1 - R^2)$ brought about by the addition of the variable in question.
Most PC spreadsheet software packages have statistical routines for all the analyses described here and many more. Most are extremely flexible, requiring minimal instructions and input other than the raw data. Special manipulations can effect an interchange of dependent and independent variables, bring one variable at a time into the regression equation, rearrange the independent variables in order of significance, and perform various statistical tests.
2.8.1 Fitting Regression Equations
Regression analysis has been used in various scientific and engineering disciplines. There are many problems in hydrology that may be solved by multiple regression procedures. These types of analysis have been used in flood studies, flow projection, catchment modelling, etc. Regression was first applied to filling in missing data and extending short records at one hydrologic station by relating the available data at this station to those at adjacent stations. For example, the population linear regressions of the two variables $x$ and $y$ give two equations,
$y = a_1 + b_1 x$, and $x = a_2 + b_2 y$ (2.2)
where $b_1$ and $b_2$ are the tangents of the slope angles of the two regression lines, and $a_1$ and $a_2$ are their intercepts. The regression line of $y$ versus $x$ is different from the regression line of $x$ versus $y$. The variable on the right side of each equation in 2.2 is called the independent variable, and the variable on the left side is called the dependent variable. However, the terms explanatory variables and predictive variable, as used by some hydrologists, are better than the terms independent variables and dependent variable generally used by statisticians. Equation 2.2 is used for predictive purposes, namely when one needs the best value of $y$ for a given value of $x$, or the converse.
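A small numerical sketch may help show why the two lines in Equation 2.2 differ. The five data points below are invented for illustration, and `fit_line` is a hypothetical helper that applies the standard least-squares formulas for a single explanatory variable.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Invented paired observations, e.g. records at two neighbouring stations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a1, b1 = fit_line(x, y)   # y = a1 + b1 * x
a2, b2 = fit_line(y, x)   # x = a2 + b2 * y

# The product of the two slopes equals the squared correlation coefficient,
# so the lines coincide (b1 == 1 / b2) only when the correlation is perfect.
r_squared = b1 * b2
print(f"y on x: a1={a1:.4f}, b1={b1:.4f}")
print(f"x on y: a2={a2:.4f}, b2={b2:.4f}")
print(f"b1 * b2 = {r_squared:.4f}, 1 / b2 = {1 / b2:.4f}")
```

Each line minimizes squared residuals in the direction of its own dependent variable, which is why the choice of dependent variable should follow the quantity that is to be predicted.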
One is free to choose which variable shall be taken as explanatory (independent) and which as predictive (dependent) in filling in the missing data. The fitting technique is the method of least squares, which minimizes the sum of the squared residuals.
Multiple regression is used when one dependent variable and several independent variables are available and it is desired to find a linear model for predicting unobserved values of the dependent variable. The model that is developed need not contain all of the independent variables. In selecting a multiple regression model, several regressions on a given set of data are performed using different combinations of the independent variables. The regression that ‘best fits’ the data is then selected. A commonly used criterion for the ‘best fit’ is to select the equation yielding the largest value of the squared multiple correlation coefficient ($R^2$). One of the most commonly used procedures for selecting the best regression equation is stepwise regression. The stepwise regression technique will be employed for determining the number of significant independent variables to be included in the model.
2.8.1.1 Stepwise regression
This procedure consists of building the regression equation one variable at a time by adding at each step the variable that explains the largest amount of the remaining unexplained variation. After each step, all the variables in the equation are examined for significance and discarded if they no longer explain a significant amount of the variation. Thus the first variable added is the one with the highest simple correlation with the dependent variable. The second variable added is the one explaining the largest variation in the dependent variable that remains unexplained by the first variable added. At this point the first variable is tested for significance and retained or discarded depending on the results of this test.
The third variable added is the one that explains the largest portion of the variation that is not explained by the variables already in the equation. The variables in the equation are then tested for significance. This procedure is continued until all of the variables not in the equation are found to be insignificant and all of the variables in the equation are significant.
The real test of the resulting regression model is its ability to predict the dependent variable for observations on the independent variables that were not used in estimating the regression coefficients. To make a comparison of this nature, it is necessary to divide the data into two parts. One part of the data is then used to develop the model and the other part to test the model.
2.9 Review on HEC-HMS Model
The HEC-HMS program was developed at the Hydrologic Engineering Center (HEC) of the US Army Corps of Engineers. It utilizes a graphical user interface to build a watershed model and to set up the precipitation and control variables for simulation. HEC-HMS is considered the standard model in the private sector in the United States for the design of drainage systems, quantifying the effect of land-use change on flooding, etc. (Singh and Woolhiser, 2002). The HEC-HMS program simulates rainfall-runoff and routing processes, both natural and controlled. According to the HEC-HMS Technical Reference Manual (2000), HEC-HMS provides the following components for rainfall-runoff simulation:
(i) Rainfall specification options which can describe an observed (historical) rainfall event, a frequency-based hypothetical rainfall event, or an event that represents the upper limit of rainfall possible at a given location.
(ii) Loss models which can estimate the volume of runoff, given the rainfall and properties of the watershed.
(iii) Direct runoff models that can account for overland flow, storage, and energy losses as water runs off a watershed and into the stream channels.
(iv) Hydrologic routing models that account for storage and energy flux as water moves through stream channels.
(v) Models of naturally occurring confluences and bifurcations.
(vi) Models of water-control measures, including diversions and storage facilities.
(vii) A distributed runoff model for use with distributed precipitation data, such as the data available from weather radar.
(viii) A continuous soil moisture accounting model used to simulate the long-term response of a watershed to wetting and drying.
HEC-HMS is a deterministic model, in which all the inputs, parameters, and processes are considered free of random variation and known with certainty. HEC-HMS models can be used to model event or continuous rainfall-runoff processes. An event model simulates a single storm. The duration of the storm may range from a few hours to a few days. HEC-HMS is both a measured-parameter and a fitted-parameter model. A measured-parameter model is one in which model parameters can be determined from system properties, either by direct measurement or by indirect methods that are based upon the measurements. A fitted-parameter model, on the other hand, includes parameters that cannot be measured. Instead, the parameters must be found by fitting the model to observed values of the input and the output. A continuous model simulates a longer period, predicting watershed response both during and between rainfall events. Most of the models included in HEC-HMS are event models. HEC-HMS is a lumped model: spatial (geographic) variations of characteristics and processes are averaged or ignored. HEC-HMS includes both empirical and conceptual models. A conceptual model is built upon a base of knowledge of the pertinent physical, chemical, and biological processes that act on the input to produce the output.
An empirical model, on the other hand, is built upon observation of input and output, without seeking to represent explicitly the process of conversion. The model is fitted with observed rainfall and runoff, and it is based upon fundamental principles of surface flow. HEC-HMS uses a separate model to represent each component of the runoff process, including models that compute runoff volume, models of direct runoff, models of base flow, and models of channel flow.
Figure 2.9 is a systems diagram of the watershed runoff process, at a scale consistent with the scale that is modelled well by HEC-HMS. The processes illustrated begin with rainfall. The current HEC-HMS is limited to analysis of runoff from rainfall. In the simple conceptualisation shown, the rainfall can fall on the watershed’s vegetation, land surface, and water bodies (streams and lakes). In the natural hydrologic system, much of the water that falls as rainfall returns to the atmosphere through evaporation from vegetation, land surfaces, and water bodies and through transpiration from vegetation. During a storm event, this evaporation and transpiration is limited. Water that does not pond or infiltrate moves by overland flow to a stream channel. The stream channel is the combination point for the overland flow, the rainfall that falls directly on water bodies in the watershed, and the interflow and base flow. The resultant streamflow is the total watershed outflow.
The appropriate representation of the system depends upon the information needs of a hydrologic engineering study. For some analyses, a detailed accounting of the movement and storage of water through all components of the system is required. For example, to estimate changes due to watershed land use changes, it may be appropriate to use a long record of rainfall to construct a corresponding long record of runoff, which can be statistically analyzed.
For many studies, however, such detailed accounting is unnecessary, and the model need only compute and report the peak, the volume, or the hydrograph of watershed runoff. In such cases the HEC-HMS view of the hydrologic process can be somewhat simpler: only those components necessary to predict runoff are represented in detail, and the other components are omitted or lumped. HEC-HMS includes models of infiltration from the land surface, but it does not model storage and movement of water vertically within the soil layer. It implicitly combines the near-surface flow and overland flow and models this as direct runoff. It does not include a detailed model of interflow or flow in the groundwater aquifer, instead representing only the combined outflow as base flow.
[Figure 2.9 Typical HEC-HMS representation of watershed runoff. Diagram components: precipitation, evapotranspiration, land surface, water body, soil, stream channel, groundwater aquifer, watershed discharge.]
HEC-HMS considers that all land and water in a watershed can be categorized as either directly-connected impervious surface or pervious surface. Directly-connected impervious surface is that portion of the watershed for which all contributing precipitation runs off, with no infiltration, evaporation, or other volume losses. HEC-HMS computes runoff volume by computing the volume of water that is intercepted, infiltrated, stored, evaporated, or transpired and subtracting it from the precipitation.
Singh and Woolhiser (2002) concluded that most models perform little to no error analysis. Thus, it is not clear what the model errors are and how different errors propagate through different model components and parameters. This is one of the major limitations of most current watershed hydrology models, and from the standpoint of a user it is not clear how reliable a particular model is. Anderson et al. (2002) employed the HEC-HMS model for runoff prediction for the Calaveras River basin.
The calibration period for the HEC-HMS model of the Calaveras basin was a 48-hour period from February 8 to 9, 1999, and a 48-hour forecast period from January 19 to 21, 1999 was selected. The authors found that the HEC-HMS model with distributed precipitation is necessary for forecasts of reservoir inflows. HEC-HMS can transform spatial representations of rainfall data into runoff at the subwatershed outlet. Furthermore, future work is required to provide quantitative measures of this type of forecast accuracy, in terms of matching the timing and magnitude of the peak inflow and the total volume of runoff. Kavvas and Chen (1998) employed the HEC-HMS model as a meteorologic model interface. The authors reported that several options exist for the parameterization of moist convection and boundary layer processes for the simulation of atmospheric phenomena at different scales and with different characteristics.
Woolhiser and Goodrich (1988) investigated the importance of time-varying rainfall in a model of a small watershed and found that disaggregating total rainfall amounts into simple, constant, and triangular distributions caused significant distortion in the peak rate distributions for Hortonian runoff. Mazion and Yen (1994) investigated the effect of computational spatial size on watershed runoff simulated by HEC-HMS, RORB, and a linear system. They found that the computational grid size had a significant effect on the model results if the physical scale was not finer, although the effects decreased with increasing rainfall duration. Wu et al. (1982) examined the effects of spatial variability of roughness on runoff hydrographs from an experimental watershed facility and found that under certain conditions an equivalent uniform roughness could be used for a watershed with non-uniform roughness. Sargent (1981, 1982) determined the effects of storm direction and speed on runoff peak, flood volume, and hydrograph shape.
Surkan (1974) observed that peak flow rates and average flow rates were most sensitive to changes in the direction and speed of the rainstorms. Meanwhile, Singh (1998) evaluated the effect of the direction of storm movement on planar flow and showed that the direction of storm movement exercised a significant influence on the peak flow, the time to peak flow, and the shape of the overland flow hydrograph.

Kasmin (2003) applied the HEC-HMS model to examine the effects of selective logging on stormflow parameters at the Berembun catchment, Negeri Sembilan. HEC-HMS was calibrated and validated against the observed data in order to predict stormflow parameters. Kasmin found that the stormflow volume could not be satisfactorily predicted, especially after logging activities. This may be due to larger storms or to peak flows that increased with storm duration.

2.10 Reviews on XP-SWMM Model

Development of new models and improvement of previously developed models continues today. The power of computers has increased exponentially and, as a result, advances in hydrology have occurred at an unprecedented pace during the past 35 years. During the 1970s and 1980s, a number of mathematical models were developed not only for simulation of watershed hydrology but also for applications in other areas, such as environmental and ecological system management (Singh and Woolhiser, 2002).

XP-SWMM is an interactive runoff and streamflow routing program developed by the US Environmental Protection Agency (XP-SWMM users manual, 2000). It provides a comprehensive environment for designing urban drainage systems, utilizing sophisticated graphical tools together with associated Geographical Information Systems (GIS) and Computer Aided Design (CAD). It includes Australian, US and South African storm patterns. Indeed, there has been a proliferation of watershed hydrology models with an emphasis on physically based models. An example of such a watershed hydrology model is SWMM (Metcalf and Eddy, 1971).
This model has since been significantly improved. The SWMM model has been extended to contain increased catchment information, more physically based processes and improved parameter estimation. It calculates the catchment losses and streamflow hydrographs resulting from rainfall events and/or other forms of inflow to channel networks. It is mainly used for flood estimation, flood routing and the design of hydraulic structures. In flood estimation applications, the program may be used on urban, partly rural or partly urban catchments. It is mostly used for design flood forecasting and prediction. In flood routing applications, single and multiple reaches, networks of streams, and lateral inflow and outflow can be modelled. The model is distributed, nonlinear, and based upon a storage routing procedure. The program provides both continuous and event-type modelling procedures. The rainfall is operated on by a loss model to produce rainfall excess, and the rainfall excess is operated on by a catchment storage model, representing the effects of overland flow storage and channel storage, to produce the surface runoff hydrograph.

The model has three functions: calibration, testing and design. The sequence of operations used to model a particular catchment or stream is defined by a series of numerical codes. The data relevant to each code is stored with that code in a data file. If rainfall and runoff data are available for an event, the program may be used to calibrate the model through an interactive, trial-and-error fitting procedure and also to provide some testing of the model. At the conclusion of each run there is an option to change the parameters and run again with new values, without re-reading and checking the data and with no unnecessary re-computation or output. A re-run may also be made with a new data file or the same data file, with or without changing the parameters or output. A data file must first be prepared in order to run the program.

Zakaria et al.
(2003) employed the XP-SWMM model to study the behaviour of the Bio-Ecological Drainage System (BIOECODS). The simulation emphasized the impact of minor flood events on the drainage system. They found that the features and characteristics of BIOECODS were satisfactorily represented in the XP-SWMM model. The results generated from the XP-SWMM modelling confirmed that BIOECODS, which consists of storage, flow-retarding and infiltration engineering, is capable of attenuating flood discharge and managing stormwater at source.

Using a length scale based on surface characteristics and excess rainfall duration, Julien and Moglen (1990) found that the influence of spatial variability of slope, roughness, width, and excess rainfall intensity on watershed runoff varied with the length scale. By using the unit hydrograph model, Hromadka et al. (1988) found on 12 watersheds that the variance of model-simulated discharge decreased significantly with the level of discretization, but that this decrease reflected a departure of the model results from the true watershed behaviour. El-Kady (1989) reviewed numerous watershed models and concluded that the surface water-groundwater linkage needed improvement, while ensuring an integrated treatment of the complexity and scale of individual component processes.

2.11 Summary of Literature Review

Numerous applications of commercially available hydrologic models (HEC-HMS, XP-SWMM, RORB, etc.) can be found in reports and the literature. Most of them are successful. However, the application of such models, for example HEC-HMS and XP-SWMM, has some constraints. One of the main problems of optimization methods is the difficulty of finding a unique ‘best’ parameter set. Another difficulty is the inadequacy of these methods for multi input-output hydrology models (Gupta and Sorooshian, 1994). The calibration process requires more data in order to yield a well-fitted calibrated model.
Rainfall and flow records are normally available. Other data such as evapotranspiration, land use, soil characteristics, etc. are sometimes not available, and this may reduce the accuracy and reliability of the calibration process in commercially available rainfall-runoff models.

In the literature reviewed, the neural network methodology has been reported to provide reasonably good solutions for complex systems that may be poorly defined or understood using mathematical equations, for problems that deal with noise or involve pattern recognition, and for input data that are incomplete and ambiguous by nature. Neural networks can identify and learn correlative patterns between sets of input data and corresponding target values. Once trained, such nets can be used predictively to forecast outcomes from new input data. Because of these characteristics, it was believed that neural networks could be applied to model rainfall-runoff relationships.

An ANN derives its computing power from its massively parallel distributed structure and from its ability to learn and therefore generalize, that is, to produce reasonable outputs for inputs not encountered during training (learning). These two information processing capabilities make it possible for neural networks to solve complex or large-scale problems that are currently intractable. In practice, however, neural networks cannot provide the solution by working alone. Rather, they need to be integrated into a consistent system engineering approach. Generally, a neural network system must be capable of doing three things: (1) store knowledge; (2) apply the stored knowledge to solve problems; and (3) acquire new knowledge through experience. Mathematically, an ANN may be treated as a universal approximator.
The ability to learn and generalize knowledge from sufficient data pairs makes it possible for ANNs to solve large-scale complex problems such as pattern recognition, non-linear modelling, classification, prediction, forecasting, control, and others that find application in hydrology today. ANNs have found increasing use in diverse disciplines ranging over perhaps all branches of engineering and science. In this study, the writer has placed emphasis on neural network methods.

CHAPTER 3

RESEARCH METHODOLOGY

3.1 Introduction

This research evaluates the applicability of neural networks to runoff prediction and data generation by comparing neural networks with a variety of other methods. To accomplish this, the models are tested on basins of various sizes and at various time periods. The ultimate aim of this research is to determine the performance of rainfall-runoff models based on the artificial neural network (ANN) method, using multilayer perceptron (MLP) and radial basis function (RBF) models. The study was accomplished through quantitative analysis of the relationship between rainfall and river discharge at hourly and daily intervals. Further, the performance of the ANN models was compared against multiple linear regression (MLR), HEC-HMS and XP-SWMM models.

To address the above objectives, the following study approach was planned. Historical records of rainfall and flow data for several catchments were taken from the Department of Irrigation and Drainage (DID) and used in the study. The rainfall-runoff models were developed based on the relationship between rainfall and flow. For the HEC-HMS and XP-SWMM models, additional parameters were added to calibrate the rainfall-runoff model: the imperviousness, loss rate, time of concentration and lag time. Both models are considered among the most popular and well-known software packages on the market.
This chapter provides a description of the research methodology and the selected study sites. The proposed methodology consists of the MLP and RBF modelling approaches, the MLR method, and the HEC-HMS and XP-SWMM commercial rainfall-runoff packages. To develop the ANN models, the MATLAB computer package (The Math Works Inc., 2000) was utilized. Various computer programs had to be written in the MATLAB environment in order to model the rainfall-runoff relationship.

3.2 Multilayer Perceptron (MLP) Model

The MLP is a supervised, feedforward neural network with one or more layers of nodes between the input and output nodes. It is one of the most commonly used neural computing techniques. Each node, called a neuron, is the basic element of a neural network. The proposed multilayer perceptron model consists of an input layer linked to the input rainfall variables $x_i$, a hidden layer, and an output layer that connects to the output variables $y^{(k)}$. Figure 3.1 illustrates the architecture of the proposed rainfall-runoff prediction model. The decisions that affect the performance of the multilayer perceptron models during training include the number of input nodes, the number of hidden nodes, the learning rate, the momentum constant and the transfer function.

The input layer constitutes the input nodes or neurons. The number of input nodes in the input layer should be selected carefully in order to construct a good neural network model. The accuracy of the model depends on the selection of input nodes derived from the characteristics of the data series. As mentioned before, the rainfall-runoff process is very complex, highly nonlinear, time-varying, and stochastic. Hence, neural network methods may be able to describe this process accurately. In rainfall-runoff modelling, the input nodes constitute the rainfall series and the output node consists of the runoff data. The target value of the neural network computation is the runoff. The data is divided into two subsets.
The first is the training data set, which is used to train the model and also supplies the validation data. Validation is carried out to monitor the neural network model performance during training. Meanwhile, the test data set is used to measure the model performance. The training data set is the data which the neural network uses to learn the solution to the problem. The validation data set is used to establish when to stop the algorithm, that is, to choose the best solution. Training and validation together are considered the model construction stage.

Figure 3.1 Structure of an MLP rainfall-runoff model with one hidden layer (input layer nodes $x_i$, $i = 1, \ldots, m$; hidden layer nodes $y_j$, $j = 1, \ldots, n$, reached through weights $w_{ij}$; output layer node $y^{(k)}$, reached through weights $c_j^{(k)}$)

After every training iteration the validation data set is passed through the network, and the error over the data set is calculated. The best set of weights is defined as the set that produces the lowest error over the validation data set. Meanwhile, the test data set is used in the stage of applying the model for forecasting and performance testing.

The present study employs supervised training of neural network models. Once the neural network models have been created, their suitability for the application needs to be investigated. This task involves training the models and then testing the performance of the neural system. Before the training process begins, the data is first normalized. Data normalization ensures that each input contributes equally to the decision or prediction made by the network. The normalization technique used is known as zero mean unit standard deviation normalization.
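As a minimal sketch of zero mean unit standard deviation normalization, each field can be shifted by its mean and divided by its standard deviation. The function names, the use of the population standard deviation and the sample rainfall values are assumptions made for illustration; the thesis itself used MATLAB and does not give this code.

```python
import math


def zscore_fit(values):
    """Return (mean, std) of one data field.

    Assumption: the population standard deviation (divide by n) is used.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)


def zscore_apply(values, mean, std):
    """Map the field mean to 0 and mean +/- one std to +/-1."""
    if std == 0.0:
        # A constant field carries no information; map it to zero.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical hourly rainfall depths (mm), for illustration only.
rainfall = [0.0, 2.5, 10.0, 4.0, 0.5]
rain_mean, rain_std = zscore_fit(rainfall)
rainfall_scaled = zscore_apply(rainfall, rain_mean, rain_std)
```

In such a scheme the mean and standard deviation would normally be estimated from the training data only and then reused unchanged for the validation and test data.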
This technique determines the mean and the standard deviation for each field. Each field is then normalized such that the mean value for the field becomes zero and the values at plus and minus one standard deviation are mapped onto plus and minus one.

The training or learning phase is critical to the success of the neural networks. In this study, the back-propagation algorithm is used to correct errors. Back-propagation is a numerically intensive technique, and there are many different ways to perform back-propagation to teach the neural nets how to respond. Any back-propagation network is based on a supervised learning technique that compares the actual output from the output units with the target or specified output and then readjusts the weights backward in the network (Tingsanchali, 2000). To improve the usefulness of the steepest descent method, two parameters can be altered: the momentum coefficient and the learning coefficient (Harun, 1999).

The learning rate is used to control the amount of weight adjustment at each step of training. The smaller the learning rate parameter $\alpha$, the smaller will be the changes to the synaptic weights in the network from one iteration to the next and the smoother will be the trajectory in weight space. However, a smaller learning rate takes longer to reach the minimum. If, on the other hand, the learning rate parameter $\alpha$ is made too large so as to speed up the rate of learning, the resulting large changes in the synaptic weights may make the network unstable. In standard backpropagation, the learning rate may be constant (Fausett, 1994). There is no hard and fast rule about what value the learning coefficient should have. A simple method of increasing the rate of learning while avoiding the danger of instability is to include a momentum term, $\mu$. The momentum term is usually a positive number.
The incorporation of momentum in the backpropagation algorithm represents a minor modification to the weight update. The momentum term has the benefit of preventing the learning process from terminating in a shallow local minimum on the error surface.

Basically, the proposed neural network model consists of an input layer linked to the input rainfall variables $x_i$, a hidden layer, and an output layer that connects to the output runoff variables $y$. The input layer receives the rainfall data and the output layer constitutes the runoff data. The information in the input layer is transferred to the next consecutive layers in the feedforward network. The selected activation function processes the signal sent by the input data as it passes through each node. Associated with each incoming input signal is a weight. Each input node ($i = 1, \ldots, m$) in the input layer broadcasts the input signal to the hidden layer. Each hidden node ($j = 1, \ldots, n$) sums its weighted input signals,

$$ y\_in_j = w_{0j} + \sum_{i=1}^{m} x_i w_{ij} \tag{3.1} $$

applies its activation function to compute its output signal from rainfall,

$$ y_j = f(y\_in_j) \tag{3.2} $$

and sends this signal to all units in the output layer, where $w_{ij}$ is the weight between the input layer and the hidden layer, $w_{0j}$ is the weight for the bias, and $x_i$ is the input rainfall signal. In this study, the activation function used is the hyperbolic-tangent sigmoid (tansig) function shown in Figure 3.2,

$$ f(t) = \frac{2}{1 + e^{-2t}} - 1 \tag{3.3} $$

Figure 3.2 Hyperbolic-tangent (tansig) activation function (transfer function $f(t)$, with output between $-1$ and $+1$)

The hyperbolic-tangent activation function processes the signal that passes from each node:

$$ f(y\_in_j) = \frac{2}{1 + e^{-2\,y\_in_j}} - 1 \tag{3.4} $$

The signal is then transmitted from the second layer to the third layer.
The output unit ($k = 1$) sums its weighted input signals,

$$ x\_in_k = c_0^{(k)} + \sum_{j=1}^{n} y_j c_j^{(k)} \tag{3.5} $$

and applies its activation function to compute its output signal,

$$ y^{(k)} = f(x\_in_k) \tag{3.6} $$

where $c_j^{(k)}$ is the weight between the second layer and the third layer, and $c_0^{(k)}$ is the weight for the bias. The output node ($k = 1$) receives a target pattern corresponding to the input training pattern, computes its error information term,

$$ \delta_k = (t_k - y^{(k)}) f'(x\_in_k) \tag{3.7} $$

calculates its weight correction term (used to update $c_j^{(k)}$ later),

$$ \Delta c_j^{(k)} = \alpha \delta_k y_j \tag{3.8} $$

and calculates its bias correction term (used to update $c_0^{(k)}$ later),

$$ \Delta c_0^{(k)} = \alpha \delta_k \tag{3.9} $$

where $\alpha$ is the learning rate; $t_k$ is the target neural network output; $y^{(k)}$ is the neural network output as net inflow variable; and, for the tansig function of Eq. 3.3, $f'(x) = 1 - f(x)^2$.

The error information is transferred from the output layer back to the earlier layers. This is known as backpropagation of the output error toward the input nodes to correct the weights. This method uses partial derivatives of the error with respect to the weights to update the weights of the connections.
Each hidden unit ($j = 1, \ldots, n$) sums its delta inputs (from units in the layer above),

$$ \delta\_in_j = \sum_{k=1}^{p} \delta_k c_j^{(k)} \tag{3.10} $$

where $p$ is the number of output units (here $p = 1$), multiplies by the derivative of its activation function to calculate its error information term,

$$ \delta_j = \delta\_in_j \, f'(y\_in_j) \tag{3.11} $$

calculates its weight correction term (used to update $w_{ij}$ later),

$$ \Delta w_{ij} = \alpha \delta_j x_i \tag{3.12} $$

and calculates its bias correction term (used to update $w_{0j}$ later),

$$ \Delta w_{0j} = \alpha \delta_j \tag{3.13} $$

The output unit ($k = 1$) updates its bias and weights ($j = 0, \ldots, n$),

$$ c_j^{(k)}(\text{new}) = c_j^{(k)}(\text{old}) + \Delta c_j^{(k)} \tag{3.14} $$

and each hidden unit ($j = 1, \ldots, n$) updates its bias and weights ($i = 0, \ldots, m$),

$$ w_{ij}(\text{new}) = w_{ij}(\text{old}) + \Delta w_{ij} \tag{3.15} $$

The weight update formulas for backpropagation with momentum are,

$$ \Delta c_j^{(k)}(t+1) = \alpha \delta_k y_j + \mu \Delta c_j^{(k)}(t) \tag{3.16} $$

and,

$$ \Delta w_{ij}(t+1) = \alpha \delta_j x_i + \mu \Delta w_{ij}(t) \tag{3.17} $$

Training is terminated when the error between the network output and the target reaches a specified value. The training phase needs to produce an ANN that is both stable and convergent, so as to produce accurate input-output relations. After this, the network can be tested using data that were not used during learning.

In general, the procedure of the backpropagation algorithm can be summarized as follows:
(i) Obtain a set of training patterns.
(ii) Set up the neural network model (number of input neurons, hidden neurons, and output neurons).
(iii) Set the model parameters (learning rate $\alpha$, momentum rate $\mu$, etc.).
(iv) Initialize all connection weights $w_{ij}$ and biases $b$ to random values.
(v) Set the minimum error, $E_{min}$.
(vi) Start training by applying the inputs and desired outputs, propagate through the layers, then calculate the total error.
(vii) Backpropagate the error through the output and hidden layers and adapt the weights.
(viii) Backpropagate the error through the hidden and input layers and adapt the weights.
(ix) Check whether error < $E_{min}$. If not, repeat steps (vi) to (ix). If yes, stop training.
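The steps above can be illustrated with a small sketch of pattern-by-pattern backpropagation with momentum for a network with one hidden layer and a single output unit. This is not the thesis code, which was written in MATLAB. The function `train_mlp`, the default parameter values, the use of the tansig function on the output unit (following Eq. 3.6), the derivative $1 - f^2$ of the hyperbolic tangent, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def train_mlp(patterns, n_hidden, alpha=0.1, mu=0.5,
              epochs=200, e_min=1e-3, seed=0):
    """Train a one-hidden-layer MLP with momentum (after Eqs. 3.1 to 3.17).

    patterns: list of (inputs, target) pairs; index 0 of each weight
    list holds the bias weight. Returns (w, c, error history).
    """
    rng = random.Random(seed)
    m = len(patterns[0][0])
    # Step (iv): random initial weights. w[j][i] links input i to hidden j.
    w = [[rng.uniform(-0.5, 0.5) for _ in range(m + 1)]
         for _ in range(n_hidden)]
    c = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    dw_prev = [[0.0] * (m + 1) for _ in range(n_hidden)]
    dc_prev = [0.0] * (n_hidden + 1)
    history = []
    for _ in range(epochs):
        sse = 0.0
        for x, t in patterns:
            # Step (vi): forward pass, Eqs. 3.1, 3.2, 3.5 and 3.6.
            xs = [1.0] + list(x)
            y_hid = [math.tanh(sum(wj[i] * xs[i] for i in range(m + 1)))
                     for wj in w]
            ys = [1.0] + y_hid
            y_out = math.tanh(sum(c[j] * ys[j] for j in range(n_hidden + 1)))
            sse += (t - y_out) ** 2
            # Step (vii): output error term, Eq. 3.7 with f' = 1 - f^2.
            delta_k = (t - y_out) * (1.0 - y_out ** 2)
            # Step (viii): hidden error terms, Eqs. 3.10 and 3.11,
            # computed before the output weights are changed.
            delta_j = [delta_k * c[j + 1] * (1.0 - y_hid[j] ** 2)
                       for j in range(n_hidden)]
            # Momentum updates, Eqs. 3.16 and 3.17.
            for j in range(n_hidden + 1):
                dc = alpha * delta_k * ys[j] + mu * dc_prev[j]
                c[j] += dc
                dc_prev[j] = dc
            for j in range(n_hidden):
                for i in range(m + 1):
                    dw = alpha * delta_j[j] * xs[i] + mu * dw_prev[j][i]
                    w[j][i] += dw
                    dw_prev[j][i] = dw
        history.append(sse)
        # Step (ix): stop once the total error falls below E_min.
        if sse < e_min:
            break
    return w, c, history


# Synthetic, already normalized data (hypothetical, for illustration only).
data_rng = random.Random(1)
patterns = []
for _ in range(20):
    x1, x2 = data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)
    patterns.append(((x1, x2), 0.6 * x1 - 0.4 * x2))

w, c, history = train_mlp(patterns, n_hidden=3)
```

The list `history` holds the sum of squared errors per pass through the training patterns, so plotting it is one simple way to judge whether a chosen learning rate and momentum give stable convergence.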
Tokar and Johnson (1999) reported that the most commonly used activation function in the backpropagation network is the hyperbolic-tangent (tansig) function. It is a continuous transfer function that permits the modification of the network weights. This function is nonlinear and differentiable, is antisymmetric with respect to the origin, and has an output amplitude that lies between –1 and +1. The transfer function also introduces a nonlinearity that further enhances the network’s ability to model complex functions. Its functional form determines the nonlinear response of a node to the total input signal it receives to produce a continuous value. Normally, a linear activation function is chosen for the output layer. The hyperbolic tangent and sigmoid are often employed as transfer functions in the training of networks (Tokar and Johnson, 1999). One of the basic requirements of backpropagation training is that the transfer function be continuous and differentiable.

3.2.1 Training of ANN

There are two approaches to training ANNs, namely supervised learning and unsupervised learning. Supervised learning is the most common type of learning used in ANNs. Adapting the values of the weights and thresholds by presenting the input and output data is known as learning or training. Training is the actual process of adjusting weight factors based on trial and error. Supervised training requires target patterns or signals to guide the training process. The objective is to find the weights that minimize the error between the target and the actual output. To train the ANN, the weight factors were adjusted until the output pattern calculated from the given input matched the desired output. These weights are modified until the network outputs are equal or close to the targets.
This training procedure involves the iterative adjustment and optimization of connection weights and threshold values for each node in the network. Tsoukalas and Uhrig (1997) reported an estimate that 80% of all applications utilize the backpropagation algorithm in one form or another. It is a gradient (derivative) technique that is simple to compute locally, and it performs stochastic gradient descent in weight space (for pattern-by-pattern updating of synaptic weights).

In the current study, the Levenberg-Marquardt (LM) algorithm is used. The LM algorithm is an approximation to Newton’s method (Hagen and Menhaj, 1994). Newton’s method is an alternative to the conjugate gradient methods for fast optimization and often converges faster than conjugate gradient methods (see Demuth and Beale, 1994). The LM algorithm was designed to approach second-order training speed without having to compute the Hessian matrix. When the performance function has the form of a sum of squares (as is typical in training feedforward networks), the Hessian matrix can be approximated as,

$$ H = J^T J \tag{3.18} $$

If $E$ is assumed to be a sum-of-squares function,

$$ E = \sum_{i=1}^{N} e_i^2 \tag{3.19} $$

then, differentiating Eq. 3.19 with respect to the weights, the gradient can be computed as,

$$ g = J^T e \tag{3.20} $$

where $J$ is the Jacobian matrix that contains the first derivatives of the network errors with respect to the weights and biases, $J^T$ is the transpose of $J$, and $e$ is a vector of network errors. Let $W$ denote the total number of free parameters (weights and biases) of a multilayer perceptron, ordered to form the weight vector $w$. Let $N$ denote the total number of training patterns used to train the network. Backpropagation is used to compute a set of $W$ partial derivatives of the approximating function $F[w; x(n)]$ with respect to the elements of the weight vector $w$ for a specific training pattern $x(n)$ in the training set.
Repeating these computations for $n = 1, 2, \ldots, N$ yields an $N$-by-$W$ matrix of partial derivatives. This matrix is called the Jacobian $J$ of the multilayer perceptron. The Jacobian matrix can be computed through a standard backpropagation technique that is much less complex than computing the Hessian matrix. The LM algorithm uses this approximation to the Hessian matrix in the following Newton-like update,

$$ w_{k+1} = w_k - H_k^{-1} g_k \tag{3.21} $$

$$ \phantom{w_{k+1}} = w_k - \left[ J^T J + \mu I \right]_k^{-1} J^T e \tag{3.22} $$

$$ w_{k+1} = w_k + \Delta w \tag{3.23} $$

where $w_k$ is a vector of current weights and biases, $g_k$ is the current gradient and $\Delta w = -H_k^{-1} g_k$. This equation is applied iteratively, with the computed value of $w_{k+1}$ being used repeatedly as the ‘new’ $w_k$. When the scalar $\mu$ is zero, this is just Newton’s method, using the approximate Hessian matrix. When $\mu$ is large, this becomes gradient descent with a small step size. Newton’s method is faster and more accurate near an error minimum, so the aim is to shift towards Newton’s method as quickly as possible. Thus, $\mu$ is decreased after each successful step (reduction in performance function) and is increased only when a tentative step would increase the performance function. In this way, the performance function will always be reduced at each iteration of the algorithm.

3.2.2 Selection of Network Structures

Networks defined by various combinations of rainfall and runoff at present and previous time periods were trained and tested to predict runoff using different ANN configurations. For all the different combinations of input variables, networks were first trained using one hidden layer. For the MLP, the best-fit combination of these input variables for the one-hidden-layer networks was then used to train the two-hidden-layer networks. The one-hidden-layer and two-hidden-layer networks were then compared.
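To make the idea of input combinations concrete, the sketch below assembles training patterns from present and previous rainfall and from previous runoff for a chosen number of lags. The function `build_patterns`, its argument names and the short series are hypothetical and serve only as an illustration; the actual input combinations tested in the study are not reproduced here.

```python
def build_patterns(rain, flow, n_rain_lags, n_flow_lags):
    """Build (inputs, targets) for a lagged rainfall-runoff model.

    Each input row is
        [x(t), x(t-1), ..., x(t-n_rain_lags), y(t-1), ..., y(t-n_flow_lags)]
    and the matching target is y(t), where x is rainfall and y is runoff.
    """
    if len(rain) != len(flow):
        raise ValueError("rain and flow must have the same length")
    start = max(n_rain_lags, n_flow_lags)
    inputs, targets = [], []
    for t in range(start, len(flow)):
        row = [rain[t - k] for k in range(0, n_rain_lags + 1)]
        row += [flow[t - k] for k in range(1, n_flow_lags + 1)]
        inputs.append(row)
        targets.append(flow[t])
    return inputs, targets


# Hypothetical short series, for illustration only.
rain = [1.0, 2.0, 3.0, 4.0, 5.0]
flow = [10.0, 20.0, 30.0, 40.0, 50.0]
inputs, targets = build_patterns(rain, flow, n_rain_lags=1, n_flow_lags=2)
```

Calling the function repeatedly with increasing lag counts gives one way to generate the candidate input sets that a trial-and-error search would compare.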
In this particular study, the structure of the ANN model is designed by a trial-and-error procedure to find the appropriate number of time-delayed input variables to the model. Hsu et al. (1998), Tokar and Johnson (1999) and Dibike and Solomatine (2001) treat the rainfall as directly related to runoff at the present time $t$ by using the following equation,

$$ y(t) = f\{x(t), x(t-1), x(t-2), \ldots, x(t-n), y(t-1), \ldots, y(t-n)\} \tag{3.24} $$

where $x$ denotes rainfall and $y$ denotes runoff. The goodness-of-fit statistics are computed for both training and testing for each ANN architecture. In the first step, the rainfall at time $t$ was added to the model, and the goodness-of-fit statistics for this model were computed for the training and testing procedures. Then rainfall at time ($t - 1$) was added as an additional input variable to the model, and the goodness-of-fit statistics were computed again. This procedure is repeated, adding rainfall at previous time periods as input variables, until there is no significant change in model training and testing accuracy. After the first step was completed, another input variable, the runoff at a previous time period ($t - 2$), is added to the best-fit model obtained from the first step. The goodness-of-fit statistics for this model were then computed for the training and testing procedures. This procedure is repeated, adding runoff at previous time periods as input variables, until there is no significant change in model training and testing accuracy.

3.3 Radial Basis Function (RBF) Model

A well-known non-linear modelling approach is the RBF network. RBF networks have only three layers (one input, one hidden, and one output). Figure 3.3 illustrates the designed architecture of the RBF network. In an RBF network, the number of RBF ‘centres’ (also called weight vectors) can be as many as the data points, and the performance of an RBF network depends upon the chosen centres.
However, the problem that arises is how to select the RBF centres, especially for a large number of parameters. The most general formula for any RBF is,

$$ y(x) = \phi\left( (x - c)^T \Re^{-1} (x - c) \right) \tag{3.25} $$

where $\phi$ is the function used, $c$ is the centre and $\Re$ is the metric. The term $(x - c)^T \Re^{-1} (x - c)$ is the distance between the input $x$ and the centre $c$ in the metric defined by $\Re$. Often the metric is Euclidean. In this case, $\Re = r^2 I$ for some scalar radius $r$ and Eq. 3.25 simplifies to

$$ y(x) = \phi\left( \frac{(x - c)^T (x - c)}{r^2} \right) \tag{3.26} $$

According to Fausett (1994), the Euclidean length $r_j$, which measures the radial distance between the datum vector $y = (y_1, y_2, \ldots, y_m)$ and the radial centre $Y^{(j)} = (w_{1j}, w_{2j}, \ldots, w_{mj})$, can be written as

$$ r_j = \left\| y - Y^{(j)} \right\| = \left[ \sum_{i=1}^{m} (y_i - w_{ij})^2 \right]^{1/2} \tag{3.27} $$

A suitable transfer function is then applied to $r_j$ to give,

$$ \phi(r_j) = \phi\left( \left\| y - Y^{(j)} \right\| \right) \tag{3.28} $$

Finally, the output layer ($k = 1$) receives a weighted linear combination of $\phi(r_j)$,

$$ y^{(k)} = \sum_{j=1}^{n} c_j^{(k)} \phi(r_j) = \sum_{j=1}^{n} c_j^{(k)} \phi\left( \left\| y - Y^{(j)} \right\| \right) \tag{3.29} $$

Figure 3.3 The structure of RBF model (inputs $x_1, x_2, \ldots, x_m$ with weights $w_{1j}, w_{2j}, \ldots, w_{mj}$; hidden layer computing $r_j = \left[ \sum_{i=1}^{m} (x_i - w_{ij})^2 \right]^{1/2}$ and $\phi(r_j)$; output layer computing $y^{(k)} = \sum_{j=1}^{n} c_j^{(k)} \phi(r_j)$)

3.3.1 Training RBF Networks

Adapting the values of the weights and centres of a network by presenting the input and output data is known as learning or training. Training in an RBF network may be done in two stages: first, calculating the parameters of the RBF, including the centres and the scaling parameters; second, calculating the weights between the hidden and output layers. Training is the actual process of adjusting weight factors based on trial and error. Supervised training requires target patterns or signals to guide the training process. The objective is to find the weights that minimize the error between the target and the actual output.
To train the RBF network, the weight factors were adjusted until the output pattern calculated from the given input matched the desired output. Several types of learning algorithm can be used in RBF networks, such as Orthogonal Least Squares (OLS), Generalized Regression Neural Network (GRNN), K-means clustering, and Probability Density Function (PDF). The emphasis of this work is on adopting the GRNN routine to select a parsimonious RBF model. Specht (1991) has popularized ‘kernel regression’, which he calls a Generalized Regression Neural Network (GRNN). The GRNN algorithm is a kind of radial basis network that is often used for function approximation. The GRNN was introduced as a memory-based neural network that stores all the independent and dependent training data available for a particular mapping (Heimes and Heuveln, 1998).

GRNN is particularly advantageous with sparse data in a real-time environment, because the regression surface is instantly defined everywhere, even with just one sample (Specht, 1991). A GRNN can be designed very quickly, learns fast, and effectively uses historical data to estimate values for continuous dependent variables. The learning process is equivalent to finding a surface in a multidimensional space that provides a best fit to the training data, with the criterion for the ‘best fit’ being measured in some statistical sense. By learning from past experience, that is, by generalizing from previous examples, the RBF network can provide a basis for system modelling and forecasting.

The GRNN predicts the value of one or more dependent variables, given the value of one or more independent variables. According to Heimes and Heuveln (1998) and Specht (1991), the GRNN takes as input a vector $x$ of length $n$ and generates an output vector (or scalar) $y'$ of length $m$, where $y'$ is the prediction of the actual output $y$.
The GRNN does this by comparing a new input pattern $x$ with a set of $p$ stored patterns $x^{i}$ (pattern nodes) for which the actual output $y_i$ is known. The predicted output $y'$ is the weighted average of all these associated stored outputs $y_{ij}$. Equation 3.30 expresses how each predicted output component $y'_j$ is a function of the corresponding output components $y_j$ associated with each stored pattern $x^{i}$. The weight $W(x, x^{i})$ reflects the contribution of each known output $y_i$ to the predicted output. It is a measure of the similarity of each pattern node with the input pattern,

$$y'_j = \frac{N_j}{D} = \frac{\sum_{i=1}^{p} y_{ij}\,W(x, x^{i})}{\sum_{i=1}^{p} W(x, x^{i})}, \qquad j = 1, 2, \ldots, m \tag{3.30}$$

It is clear from equation 3.30 that the predicted output magnitude will always lie between the minimum and maximum magnitudes of the desired outputs $y_{ij}$ associated with the stored patterns (since $0 \le W \le 1$). In the GRNN algorithm, the output weights are set to the desired outputs. The GRNN is best seen as an interpolator, which interpolates between the desired outputs of the pattern layer nodes that are located near the input vector (or scalar) in the input space.

A standard way to define the similarity function $W$ is to base it on a distance function, $D(x_1, x_2)$, that gives a measure of the distance or dissimilarity between two patterns $x_1$ and $x_2$. The desired property of the weight function $W(x, x^{i})$ is that its magnitude for a stored pattern $x^{i}$ decreases as the distance of that pattern from the input pattern $x$ increases (if the distance is zero, the weight takes its maximum value of unity). The standard weight and distance functions are given by the following equations, respectively:

$$W(x, x^{i}) = e^{-D(x,\,x^{i})} \tag{3.31}$$

$$D(x_1, x_2) = \sum_{k=1}^{n} \left(\frac{x_{1k} - x_{2k}}{\sigma_k}\right)^{2} \tag{3.32}$$

In equation 3.32, each input variable has its own sigma value, $\sigma_k$, where $\sigma_k$ is the normalization constant that controls the width of the basis function.
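Equations 3.30 to 3.32 can be sketched in a few lines of Python. This is only an illustrative sketch, not the routine used in the study: the function names `grnn_distance` and `grnn_predict`, the list-based data layout, and the sample values are assumptions made for the example.

```python
import math

def grnn_distance(x1, x2, sigmas):
    """Equation 3.32: sum over inputs of the squared, sigma-scaled difference."""
    return sum(((a - b) / s) ** 2 for a, b, s in zip(x1, x2, sigmas))

def grnn_predict(x, stored_inputs, stored_outputs, sigmas):
    """Equation 3.30: weighted average of the stored outputs.

    stored_inputs  : list of p stored patterns, each of length n
    stored_outputs : list of p known outputs, each of length m
    sigmas         : one width per input variable (length n)
    """
    # Equation 3.31: one weight per stored pattern, always in (0, 1].
    weights = [math.exp(-grnn_distance(x, xi, sigmas)) for xi in stored_inputs]
    denominator = sum(weights)  # D in equation 3.30
    # Assumption: x is not so far from every stored pattern that all
    # weights underflow to zero, which would make the division fail.
    m = len(stored_outputs[0])
    numerators = [
        sum(w * yi[j] for w, yi in zip(weights, stored_outputs))  # N_j
        for j in range(m)
    ]
    return [n_j / denominator for n_j in numerators]

# Hypothetical one-input, one-output example.
stored_x = [[0.0], [1.0], [2.0]]
stored_y = [[0.0], [10.0], [20.0]]
print(grnn_predict([0.3], stored_x, stored_y, sigmas=[0.5]))
```

Because the prediction is a weighted average with positive weights, the printed value necessarily falls between the smallest and largest stored outputs, which is the interpolation property noted above.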
The procedure of the GRNN algorithm can be summarized as follows:

(i) The input unit stores an input vector $x$.
(ii) The pattern units compute the distances $D(x, x^{i})$ between the incoming pattern $x$ and the stored patterns $x^{i}$. The pattern nodes output the quantities $W(x, x^{i})$.
(iii) The summation units compute $N_j$, the sums of the products of $W(x, x^{i})$ and the associated known output components $y_{ij}$. This unit also has a node to compute $D$, the sum of all $W(x, x^{i})$.
(iv) Finally, the output unit divides $N_j$ by $D$ to produce the estimated output component $y'_j$, which is a localized average of the stored output patterns.

3.4 Multiple Linear Regression (MLR) Model

For the MLR model, the input nodes are selected using the MLR method proposed by Harun (1999). For both the training and testing processes, applying the input node selection of Harun (1999) can considerably reduce the time taken to find the optimum number of inputs to the models.

Multiple linear regression applies to problems in which records have been kept of one variable, $y$, the dependent variable, and several other variables $x_1, x_2, x_3, \ldots, x_k$, the independent variables, and in which the objective requires the relationship between $y$ and $x_1, x_2, x_3, \ldots, x_k$ to be investigated. The basic multiple regression model is given by (Holder, 1985),

$$y = a + b_1 x_1 + b_2 x_2 + \ldots + b_k x_k + e \tag{3.33}$$

where $a, b_1, b_2, \ldots, b_k$ are constants and $e$ is a random variable. Thus, it is assumed that $y$ is linearly related to each of the independent variables and that each independent variable has an additive effect on $y$. For the $i$th set of observations, the model can be written more conveniently as

$$y_i = \alpha + \beta_1 (x_{1i} - \bar{x}_1) + \beta_2 (x_{2i} - \bar{x}_2) + \ldots + \beta_k (x_{ki} - \bar{x}_k) + e_i \tag{3.34}$$

where $x_{ki}$ is the value of the independent variable $x_k$ in the $i$th of $n$ sets of observations, and

$$\bar{x}_1 = \frac{1}{n}\sum_{i=1}^{n} x_{1i}, \quad \ldots, \quad \bar{x}_k = \frac{1}{n}\sum_{i=1}^{n} x_{ki} \tag{3.35}$$

By comparing models 3.33 and 3.34, we can see that $b_j = \beta_j$ (for $j = 1, 2, \ldots, k$) and $a = \alpha - \beta_1 \bar{x}_1 - \beta_2 \bar{x}_2 - \ldots - \beta_k \bar{x}_k$. In the general case of $n$ observations and $k$ variables, the method of least squares is used to choose values of $a, b_1, b_2, \ldots, b_k$ (or $\alpha, \beta_1, \beta_2, \ldots, \beta_k$) to minimise

$$S^{2} = \sum_{i=1}^{n} e_i^{2} = \sum_{i=1}^{n} \left(y_i - a - b_1 x_{1i} - b_2 x_{2i} - \ldots - b_k x_{ki}\right)^{2} \tag{3.36}$$

$$\phantom{S^{2}} = \sum_{i=1}^{n} \left(y_i - \alpha - \beta_1 (x_{1i} - \bar{x}_1) - \beta_2 (x_{2i} - \bar{x}_2) - \ldots - \beta_k (x_{ki} - \bar{x}_k)\right)^{2} \tag{3.37}$$

Solving $\partial S^{2}/\partial\alpha = 0$, $\partial S^{2}/\partial\beta_1 = 0$, …, $\partial S^{2}/\partial\beta_k = 0$ gives the following equations for the values of $\alpha, \beta_1, \beta_2, \ldots, \beta_k$ which minimise $S^{2}$ (denoted by $\hat{\alpha}, \hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_k$):

$$\hat{\alpha} = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \tag{3.38}$$

$$\begin{aligned}
\sum_{i=1}^{n} (y_i - \bar{y})(x_{1i} - \bar{x}_1) &= \hat{\beta}_1 \sum_{i=1}^{n} (x_{1i} - \bar{x}_1)^{2} + \hat{\beta}_2 \sum_{i=1}^{n} (x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2) + \ldots + \hat{\beta}_k \sum_{i=1}^{n} (x_{1i} - \bar{x}_1)(x_{ki} - \bar{x}_k) \\
\sum_{i=1}^{n} (y_i - \bar{y})(x_{2i} - \bar{x}_2) &= \hat{\beta}_1 \sum_{i=1}^{n} (x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2) + \hat{\beta}_2 \sum_{i=1}^{n} (x_{2i} - \bar{x}_2)^{2} + \ldots + \hat{\beta}_k \sum_{i=1}^{n} (x_{2i} - \bar{x}_2)(x_{ki} - \bar{x}_k) \\
&\;\;\vdots \\
\sum_{i=1}^{n} (y_i - \bar{y})(x_{ki} - \bar{x}_k) &= \hat{\beta}_1 \sum_{i=1}^{n} (x_{1i} - \bar{x}_1)(x_{ki} - \bar{x}_k) + \hat{\beta}_2 \sum_{i=1}^{n} (x_{2i} - \bar{x}_2)(x_{ki} - \bar{x}_k) + \ldots + \hat{\beta}_k \sum_{i=1}^{n} (x_{ki} - \bar{x}_k)^{2}
\end{aligned} \tag{3.39}$$

Thus, the estimates $\hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_k$ are given by

$$\hat{\beta} = S_{xx}^{-1} S_{xy} \tag{3.40}$$

where

$$\hat{\beta} = \begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \vdots \\ \hat{\beta}_k \end{bmatrix} \tag{3.41}$$

The equations may now be condensed into

$$\begin{bmatrix} S_{x_1 y} \\ S_{x_2 y} \\ \vdots \\ S_{x_k y} \end{bmatrix} = \begin{bmatrix} S_{x_1 x_1} & S_{x_1 x_2} & \cdots & S_{x_1 x_k} \\ S_{x_1 x_2} & S_{x_2 x_2} & \cdots & S_{x_2 x_k} \\ \vdots & \vdots & & \vdots \\ S_{x_1 x_k} & \cdots & \cdots & S_{x_k x_k} \end{bmatrix} \begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \vdots \\ \hat{\beta}_k \end{bmatrix} \tag{3.42}$$

or

$$S_{xy} = S_{xx}\,\hat{\beta} \tag{3.43}$$

where

$$S_{x_j y} = \sum_{i=1}^{n} (x_{ji} - \bar{x}_j)(y_i - \bar{y}) \qquad (\text{for } j = 1, 2, \ldots, k) \tag{3.44}$$

$$S_{x_j x_l} = \sum_{i=1}^{n} (x_{ji} - \bar{x}_j)(x_{li} - \bar{x}_l) \qquad (\text{for } j = 1, 2, \ldots, k \text{ and } l = 1, 2, \ldots, k) \tag{3.45}$$

In this study, rainfall-runoff equations derived using multiple regression procedures have been developed and used to estimate flows or runoff from rainfall data. The MLR method is suitable for daily rainfall-runoff modelling.
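The least-squares solution in equations 3.38 to 3.45 can be illustrated with a short, self-contained Python sketch. It is not the procedure of Harun (1999); the helper names `solve_linear` and `fit_mlr`, the use of Gaussian elimination to solve equation 3.43, and the sample data are assumptions made for the example.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumption: the matrix is non-singular (predictors are not collinear)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_mlr(xs, y):
    """xs: list of k predictor series, each of length n; y: series of length n.
    Returns (a, [b_1, ..., b_k]) of equation 3.33."""
    n, k = len(y), len(xs)
    x_bar = [sum(col) / n for col in xs]          # equation 3.35
    y_bar = sum(y) / n                            # equation 3.38 (alpha hat)
    s_xx = [[sum((xs[j][i] - x_bar[j]) * (xs[l][i] - x_bar[l]) for i in range(n))
             for l in range(k)] for j in range(k)]                  # equation 3.45
    s_xy = [sum((xs[j][i] - x_bar[j]) * (y[i] - y_bar) for i in range(n))
            for j in range(k)]                                      # equation 3.44
    beta = solve_linear(s_xx, s_xy)               # equation 3.43 solved for beta hat
    a = y_bar - sum(b_j * xb for b_j, xb in zip(beta, x_bar))  # a = alpha - sum(beta_j * x_bar_j)
    return a, beta

# Hypothetical data generated from y = 2 + 3*x1 - x2 with no noise.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [3.0, 7.0, 7.0, 11.0, 11.0]
print(fit_mlr([x1, x2], y))
```

With noise-free data such as this, the fitted constants reproduce the generating coefficients up to rounding; with real rainfall and runoff records the residuals $e_i$ would be non-zero.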
Harun (1999) reported that the MLR model was able to yield the best results in modelling monthly and daily input-output relationships.

3.5 HEC-HMS Model

A model relates something unknown (the output) to something known (the input). In the case of the models that are included in HEC-HMS, the known input is rainfall and the unknown output is runoff. The rainfall may be observed from a historical event, it may be a frequency-based hypothetical rainfall event, or it may be an event that represents the upper limit of rainfall possible at a given location. Historical rainfall data are useful for calibration and verification of model parameters, and for evaluating the performance of proposed designs or regulations. Similarly, the evapotranspiration data used may be observed values from a historical record, or they may be hypothetical values. The required watershed precipitation depth can be inferred from the depths at gages using an averaging scheme. Thus,

$$P_{MAP} = \frac{\sum_{i}\left(w_i \sum_{t} p_i(t)\right)}{\sum_{i} w_i} \tag{3.46}$$

where $P_{MAP}$ is the total storm mean areal precipitation (MAP) depth over the watershed; $p_i(t)$ is the precipitation depth measured at time $t$ at gage $i$; and $w_i$ is the weighting factor assigned to gage/observation $i$. If gage $i$ is not a recording device, only the quantity $\sum_t p_i(t)$, the total storm precipitation at gage $i$, will be available and used in the computation.

3.5.1 Evaporation and Transpiration

In common application, HEC-HMS omits any detailed accounting of evaporation and transpiration, as these are insignificant during a flood. In the case of shorter storms, it may be appropriate to omit this accounting. Evaporation, as modelled in HEC-HMS, includes vaporization of water directly from the soil and vegetative surface, and transpiration through plant leaves. This volume of evaporation and transpiration combined is estimated as an average volume.
Evaporation and transpiration are combined and collectively referred to as evapotranspiration (ET) in the meteorological input to the program. In this input, monthly varying ET values are specified, along with an ET coefficient. The potential ET rate for all time periods within the month is computed as the product of the monthly value and the coefficient.

3.5.2 Computing of Runoff Volumes

HEC-HMS includes several alternative models to account for the cumulative losses, such as the initial and constant-rate loss model, the deficit and constant-rate model, the SCS curve number (CN) loss model, and the Green and Ampt loss model. In this study, the models found to perform best, and therefore used, are the initial and constant-rate loss model, the deficit and constant-rate model, and the SCS curve number (CN) loss model. For each model, the precipitation loss is found for each computation time interval and is subtracted from the MAP depth for that interval. The remaining depth is referred to as precipitation excess. This depth is considered uniformly distributed over the watershed area, so it represents a volume of runoff.

3.5.2.1 The Initial and Constant-rate and Deficit and Constant-rate Loss Models

The underlying concept of the initial and constant-rate loss model is that the maximum potential rate of precipitation loss, $f_c$, is constant throughout an event. Thus, if $p_t$ is the MAP depth during a time interval $t$ to $t + \Delta t$, the excess, $pe_t$, during the interval is given by

$$pe_t = \begin{cases} p_t - f_c & \text{if } p_t > f_c \\ 0 & \text{otherwise} \end{cases} \tag{3.47}$$

An initial loss, $I_a$, is added to the model to represent interception and depression storage. Interception storage is a consequence of absorption of precipitation by surface cover, including plants in the watershed. Depression storage is a consequence of depressions in the watershed topography; water is stored in these and eventually infiltrates or evaporates. This loss occurs prior to the onset of runoff.
Until the accumulated precipitation on the pervious area exceeds the initial loss volume, no runoff occurs. Thus, the excess is given by:

$$pe_t = \begin{cases} 0 & \text{if } \sum p_i < I_a \\ p_t - f_c & \text{if } \sum p_i > I_a \text{ and } p_t > f_c \\ 0 & \text{if } \sum p_i > I_a \text{ and } p_t < f_c \end{cases} \tag{3.48}$$

The initial and constant-rate model, in fact, includes one parameter (the constant rate) and one initial condition (the initial loss). Respectively, these represent the physical properties of the watershed soils and land use, and the antecedent condition. If the watershed is in a saturated condition, $I_a$ will approach zero. If the watershed is dry, then $I_a$ will increase to represent the maximum precipitation depth that can fall on the watershed with no runoff; this will depend on the watershed terrain, land use, soil types, and soil treatment. The constant loss rate can be viewed as the ultimate infiltration capacity of the soils. Skaggs and Khaleel (1982) have published estimates of infiltration rates for those soils, as shown in Table 3.1. Because the model parameter is not a measured parameter, it and the initial condition are best determined by calibration.
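As a rough illustration of equations 3.47 and 3.48 as written, the following Python sketch steps through a storm one interval at a time. It is not the HEC-HMS implementation. The function name and the sample depths are hypothetical, and two assumptions are made where the equations are silent: the running total $\sum p_i$ includes the current interval, and a total exactly equal to $I_a$ is treated as not yet exceeding it.

```python
def initial_constant_excess(precip, initial_loss, constant_loss):
    """Precipitation excess per interval following equation 3.48.

    precip        : MAP depth in each time interval (p_t)
    initial_loss  : I_a, in the same depth units
    constant_loss : f_c expressed as a depth per interval
    """
    excess = []
    accumulated = 0.0
    for p_t in precip:
        accumulated += p_t  # assumption: running total includes this interval
        if accumulated <= initial_loss:
            # Initial loss not yet satisfied: no runoff.
            excess.append(0.0)
        elif p_t > constant_loss:
            # Initial loss satisfied and rainfall exceeds the loss rate.
            excess.append(p_t - constant_loss)
        else:
            # Initial loss satisfied but all rainfall is lost.
            excess.append(0.0)
    return excess

# Hypothetical storm: depths in mm per interval, I_a = 12 mm, f_c = 4 mm per interval.
print(initial_constant_excess([5.0, 10.0, 20.0, 3.0], 12.0, 4.0))
# -> [0.0, 6.0, 16.0, 0.0]
```

In the second interval the accumulated depth (15 mm) exceeds the initial loss, so equation 3.48 switches from the first case to the second and the excess becomes $p_t - f_c$.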
Table 3.1: Infiltration rates by the soil groups

Soil group | Description | Range of loss rates (in/hr)
A | Deep sand, deep loess, aggregated silts | 0.30 – 0.45
B | Shallow loess, sandy loam | 0.15 – 0.30
C | Clay loams, shallow sandy loam, soils low in organic content, and soils usually high in clay | 0.05 – 0.15
D | Soils that swell significantly when wet, heavy plastic clays, and certain saline soils | 0.00 – 0.05

3.5.2.2 SCS Curve Number Loss Model

The Soil Conservation Service (SCS) Curve Number (CN) model estimates precipitation excess as a function of cumulative precipitation, soil cover, land use, and antecedent moisture, using the following equation:

$$P_e = \frac{(P - I_a)^{2}}{P - I_a + S} \tag{3.49}$$

where $P_e$ is the accumulated precipitation excess at time $t$; $P$ is the accumulated rainfall depth at time $t$; $I_a$ is the initial abstraction (initial loss); and $S$ is the potential maximum retention, a measure of the ability of a watershed to abstract and retain storm precipitation. Until the accumulated rainfall exceeds the initial abstraction, the precipitation excess, and hence the runoff, will be zero. From analysis of results from many small experimental watersheds, the SCS developed an empirical relationship between $I_a$ and $S$:

$$I_a = 0.2\,S \tag{3.50}$$

Therefore, the cumulative excess at time $t$ is:

$$P_e = \frac{(P - 0.2\,S)^{2}}{P + 0.8\,S} \tag{3.51}$$

Incremental excess for a time interval is computed as the difference between the accumulated excess at the end and at the beginning of the period. The maximum retention, $S$, and the watershed characteristics are related through an intermediate parameter, the curve number (commonly abbreviated $CN$), as:

$$S = \begin{cases} \dfrac{1000 - 10\,CN}{CN} & \text{(foot-pound system)} \\[2ex] \dfrac{25400 - 254\,CN}{CN} & \text{(SI)} \end{cases} \tag{3.52}$$

The $CN$ for a watershed can be estimated as a function of land use, soil type, and antecedent watershed moisture, using tables published by the SCS. This $CN$ is entered directly in the appropriate HEC-HMS input form.
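Equations 3.50 to 3.52 can be chained into a small Python sketch that converts a curve number and a rainfall series into incremental excess. This is an illustration under stated assumptions, not the HEC-HMS code: the function names, the `si` flag, and the sample storm are hypothetical.

```python
def retention_from_cn(cn, si=True):
    """Equation 3.52: potential maximum retention S (mm if si, else inches)."""
    if si:
        return (25400.0 - 254.0 * cn) / cn
    return (1000.0 - 10.0 * cn) / cn

def cumulative_excess(p, s):
    """Equation 3.51 with I_a = 0.2 S (equation 3.50)."""
    initial_abstraction = 0.2 * s
    if p <= initial_abstraction:
        return 0.0  # no excess until rainfall exceeds the initial abstraction
    return (p - initial_abstraction) ** 2 / (p + 0.8 * s)

def incremental_excess(precip, cn, si=True):
    """Excess per interval: accumulated excess at the end of the interval
    minus accumulated excess at its beginning."""
    s = retention_from_cn(cn, si)
    accumulated_p = 0.0
    previous = 0.0
    increments = []
    for p_t in precip:
        accumulated_p += p_t
        current = cumulative_excess(accumulated_p, s)
        increments.append(current - previous)
        previous = current
    return increments

# Hypothetical storm: 10, 20 and 20 mm falling on a watershed with CN = 80.
s = retention_from_cn(80.0)            # 63.5 mm, so I_a = 12.7 mm
print(s, incremental_excess([10.0, 20.0, 20.0], 80.0))
```

The first interval produces no excess because the accumulated 10 mm is still below the initial abstraction of 12.7 mm; the increments then sum to the cumulative excess for the 50 mm total.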
For a watershed that consists of several soil types and land uses, a composite $CN$ is calculated as:

$$CN_{composite} = \frac{\sum A_i\,CN_i}{\sum A_i} \tag{3.53}$$

in which $CN_{composite}$ is the composite $CN$ used for runoff volume computations with HEC-HMS; $i$ is an index of watershed subdivisions of uniform land use and soil type; $CN_i$ is the $CN$ for subdivision $i$; and $A_i$ is the drainage area of subdivision $i$. The tables in Appendix A include composite $CN$ values for urban districts, residential districts, and newly graded areas. That is, the $CN$ values shown are composite values for directly connected impervious area and open space. If the $CN$ values for these land uses are selected, no further accounting of directly connected impervious area is required in HEC-HMS.

3.5.3 Modelling of Direct Runoff

This section describes the models that simulate the process of direct runoff of excess precipitation on a watershed. It refers to the ‘transformation’ of precipitation excess into point runoff. There are two options for these transform methods, namely empirical models and a conceptual model. The empirical models are the traditional unit hydrograph (UH) models. These system theoretic models attempt to establish a causal linkage between runoff and excess precipitation without detailed consideration of the internal processes. Meanwhile, the conceptual model included is a kinematic-wave model of overland flow. It represents, to the extent possible, all physical mechanisms that govern the movement of the excess precipitation over the watershed land surface.

There are several types of empirical model, such as the user-specified UH, parametric and synthetic UH, Snyder’s UH, SCS UH, Clark’s UH, and ModClark UH. In this study, the SCS UH and Clark UH models are applied. These models are simple and require only a small number of parameters compared with the others. In other words, these models are suitable when little information is available for the study area.
The unit hydrograph is a well-known, commonly used empirical model of the relationship of direct runoff to excess precipitation. Sherman (1932) defined the UH as “…the basin outflow resulting from one unit of direct runoff generated uniformly over the drainage area at a uniform rainfall rate during a specified period of rainfall duration”. The underlying concept of the UH is that the runoff process is linear, so the runoff from greater or less than one unit is simply a multiple of the unit runoff hydrograph.

To compute the direct runoff hydrograph with a UH, HEC-HMS uses a discrete representation of excess precipitation, in which a ‘pulse’ of excess precipitation is known for each time interval. It then solves the discrete convolution equation for a linear system:

$$Q_n = \sum_{m=1}^{n \le M} P_m\,U_{n-m+1} \tag{3.54}$$

where $Q_n$ is the storm hydrograph ordinate at time $n\Delta t$; $P_m$ is the rainfall excess depth in the time interval $m\Delta t$ to $(m+1)\Delta t$; $M$ is the total number of discrete rainfall pulses; and $U_{n-m+1}$ is the UH ordinate at time $(n-m+1)\Delta t$. $Q_n$ and $P_m$ are expressed as flow rate and depth respectively, and $U_{n-m+1}$ has dimensions of flow rate per unit depth. Use of this equation requires the following implicit assumptions:

(i) The excess precipitation is distributed uniformly in space and is of constant intensity throughout a time interval, $\Delta t$.
(ii) The ordinates of a direct-runoff hydrograph corresponding to excess precipitation of a given duration are directly proportional to the volume of excess. Thus, twice the excess produces a doubling of the runoff hydrograph ordinates and half the excess produces a halving. This is the so-called assumption of linearity.
(iii) The direct runoff hydrograph resulting from a given increment of excess is independent of the time of occurrence of the excess and of the antecedent precipitation. This is the assumption of time-invariance.
(iv) Precipitation excesses of equal duration are assumed to produce hydrographs with equivalent time bases regardless of the intensity of the precipitation.

3.5.3.1 SCS UH Model

The Soil Conservation Service (SCS) proposed a parametric UH model; this model is included in HEC-HMS. The model is based upon averages of UH derived from gauged rainfall and runoff for a large number of small agricultural watersheds throughout the US. At the heart of the SCS UH model is a dimensionless UH. This dimensionless UH expresses the UH discharge, $U_t$, as a ratio to the UH peak discharge, $U_p$, for any time $t$, a fraction of $T_p$, the time to UH peak. Research by the SCS suggests that the UH peak and the time of UH peak are related by:

$$U_p = C\,\frac{A}{T_p} \tag{3.55}$$

in which $A$ is the watershed area and $C$ is a conversion constant (2.08 in SI and 484 in the foot-pound system). The time of peak (also known as the time of rise) is related to the duration of the unit of excess precipitation as:

$$T_p = \frac{\Delta t}{2} + t_{lag} \tag{3.56}$$

in which $\Delta t$ is the excess precipitation duration (which is also the computational interval in HEC-HMS), and $t_{lag}$ is the basin lag, defined as the time difference between the centre of mass of rainfall excess and the peak of the UH. When the lag time is specified, HEC-HMS solves equation 3.56 to find the time of UH peak, and equation 3.55 to find the UH peak. With $U_p$ and $T_p$ known, the UH can be found by multiplication from the dimensionless form, which is included in HEC-HMS.

The SCS UH lag can be estimated via calibration. For ungauged watersheds, the SCS suggests that the UH lag time may be related to the time of concentration, $t_c$, as $t_{lag} = 0.6\,t_c$.

3.5.3.2 Clark’s UH Model

Clark’s model derives a watershed UH by explicitly representing two critical processes in the transformation of excess precipitation to runoff.
The first is the translation, or movement, of the excess from its origin throughout the drainage to the watershed outlet; the second is the attenuation, or reduction, of the magnitude of the discharge as the excess is stored throughout the watershed.

Short-term storage of water throughout a watershed, in the soil, on the surface, and in the channels, plays an important role in the transformation of precipitation excess to runoff. The linear reservoir model is a common representation of the effects of this storage. That model begins with the continuity equation:

$$\frac{dS}{dt} = I_t - O_t \tag{3.57}$$

in which $dS/dt$ is the time rate of change of water in storage at time $t$; $I_t$ is the average inflow to storage at time $t$; and $O_t$ is the outflow from storage at time $t$. With the linear reservoir model, storage at time $t$ is related to outflow as $S_t = R\,O_t$, where $R$ is a constant linear reservoir parameter. Combining and solving the equations using a simple finite difference approximation yields:

$$O_t = C_A I_t + C_B O_{t-1} \tag{3.58}$$

where $C_A$ and $C_B$ are the routing coefficients. The coefficients are calculated from:

$$C_A = \frac{\Delta t}{R + 0.5\,\Delta t} \tag{3.59}$$

$$C_B = 1 - C_A \tag{3.60}$$

The average outflow during period $t$ is:

$$\bar{O}_t = \frac{O_{t-1} + O_t}{2} \tag{3.61}$$

With Clark's model, the linear reservoir represents the aggregated impacts of all watershed storage. Thus, conceptually, the reservoir may be considered to be located at the watershed outlet.

In addition to this lumped model of storage, the Clark model accounts for the time required for water to move to the watershed outlet. It does that with a linear channel model (Dooge, 1959), in which water is routed from remote points to the linear reservoir at the outlet with delay (translation), but without attenuation. This delay is represented implicitly with a so-called time-area histogram, which specifies the watershed area contributing to flow at the outlet as a function of time.
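The linear-reservoir recursion of equations 3.58 to 3.61 can be sketched in Python as follows. This is only an illustration, not the HEC-HMS routine: the function names, the zero initial outflow, and the sample inflow series are assumptions made for the example.

```python
def route_linear_reservoir(inflow, r, dt, initial_outflow=0.0):
    """Route an inflow series through a linear reservoir (equation 3.58).

    inflow : average inflow I_t for each period
    r      : storage coefficient R, in the same time units as dt
    dt     : computation time step
    """
    c_a = dt / (r + 0.5 * dt)   # equation 3.59
    c_b = 1.0 - c_a             # equation 3.60
    outflow = []
    previous = initial_outflow  # assumption: reservoir starts with this outflow
    for i_t in inflow:
        current = c_a * i_t + c_b * previous
        outflow.append(current)
        previous = current
    return outflow

def period_average(outflow, initial_outflow=0.0):
    """Average outflow during each period (equation 3.61)."""
    averages = []
    previous = initial_outflow
    for o_t in outflow:
        averages.append(0.5 * (previous + o_t))
        previous = o_t
    return averages

# Hypothetical single pulse of inflow with R = 2 h and a 1 h time step,
# so C_A = 0.4 and C_B = 0.6.
routed = route_linear_reservoir([10.0, 0.0, 0.0], r=2.0, dt=1.0)
print(routed)                  # approximately [4.0, 2.4, 1.44]
print(period_average(routed))  # approximately [2.0, 3.2, 1.92]
```

The pulse is attenuated and spread over the following periods, which is the storage effect the linear reservoir is meant to represent; a larger $R$ gives a smaller $C_A$ and therefore a flatter response.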
If the area is multiplied by unit depth and divided by $\Delta t$, the computation time step, the result is the inflow, $I_t$, to the linear reservoir. Solving equation 3.58 and equation 3.61 recursively, with the inflow thus defined, yields values of $\bar{O}_t$. However, if the inflow ordinates in equation 3.58 are runoff from a unit of excess, these reservoir outflow ordinates are, in fact, $U_t$, the UH.

Application of the Clark model requires the properties of the time-area histogram and the storage coefficient, $R$. As noted, the linear routing model properties are defined implicitly by a time-area histogram. Studies at HEC have shown that, even though a watershed-specific relationship can be developed, a smooth function fitted to a typical time-area relationship represents the temporal distribution adequately for UH derivation for most watersheds. That typical time-area relationship, which is included in HEC-HMS, is:

$$\frac{A_t}{A} = \begin{cases} 1.414\left(\dfrac{t}{t_c}\right)^{1.5} & \text{for } t \le \dfrac{t_c}{2} \\[2ex] 1 - 1.414\left(1 - \dfrac{t}{t_c}\right)^{1.5} & \text{for } t \ge \dfrac{t_c}{2} \end{cases} \tag{3.62}$$

where $A_t$ is the cumulative watershed area contributing at time $t$; $A$ is the total watershed area; and $t_c$ is the time of concentration of the watershed. For application in HEC-HMS, only the parameter $t_c$, the time of concentration, is necessary. This can be estimated via calibration.

The basin storage coefficient, $R$, is an index of the temporary storage of precipitation excess in the watershed as it drains to the outlet point. It can be estimated via calibration if gauged precipitation and streamflow data are available. Though $R$ has units of time, there is only a qualitative meaning for it in the physical sense. Clark (1945) indicated that $R$ can be computed as the flow at the inflection point on the falling limb of the hydrograph divided by the time derivative of flow.

3.6 XP-SWMM Model

According to the XP-SWMM Technical Reference Manual (2000), there are five major types of hydrograph generation techniques available in Runoff.
These are:

(a) SWMM runoff nonlinear reservoir method
Subcatchments are modelled as idealized rectangular areas with the slope of the catchment perpendicular to the width.

(b) Kinematic wave method
The kinematic wave method for overland flow applies only the kinematic wave component of the St. Venant shallow flow equations for momentum and continuity.

(c) Laurenson nonlinear method/RAFTS
When using Laurenson hydrology, the catchment width is by default not used. The catchment roughness utilized when calculating the storage delay parameter, $B$, is taken from the pervious Manning's $n$ value for the sub-catchment included with the infiltration information.

(d) SCS unit hydrograph method
As an alternative to the nonlinear runoff routing method employed by Runoff, hydrographs may optionally be generated by the SCS method. Typical values of the pervious area curve number vary from 20 for regions with high infiltration and interception capacities to 98 for impervious areas. The SCS has determined the hydrograph shape factor to be 484 for most watersheds. This was the result of analysing many watersheds of various sizes and geographic locations. It is used in the formulation of the peak discharge of the unit hydrograph, $Q_p = 484\,A / t_p$, in which $Q_p$ is the peak discharge, $A$ is the drainage area, and $t_p$ is the time to peak. The time of concentration, $T_c$, can be estimated from several formulas, such as the kinematic wave formula, which for a constant excess rainfall rate can be written as

$$T_c = C\left(\frac{L^{0.6}\,n^{0.6}}{i^{0.4}\,S^{0.3}}\right) \tag{3.63}$$

where $L$ is the distance from the upper end of the plane to the point of interest, $n$ is the Manning resistance coefficient, $i$ is the excess rainfall rate, $S$ is the dimensionless slope of the surface, and $C$ is a constant that depends on the units of the other variables. For a shape factor of 484, $T_p = \tfrac{2}{3}\,T_c$.
The initial abstraction from the precipitation may be represented either as an absolute number, that is, the total depth of precipitation that is lost, or as a fraction of the amount of precipitation (between 0 and 1).

(e) Rational Formula
The rational method as applied in this application provides a unit hydrograph approach applying a deterministic form of the Rational Formula, $Q = CIA$. The three items of data needed for this procedure are the runoff coefficient, $C$, the rainfall intensity, $I$, and the size of the catchment area, $A$. The method utilizes the rainfall concurrent with the time step being computed. The unit hydrograph incorporates a base length equal to two times the time of concentration.

(f) Time area method
Time area methods utilize a convolution of the rainfall excess hyetograph with a time-area diagram representing the progressive area contributions within a catchment in set time increments. The time area procedure assumes a linear time-area relationship for the sub-area and is based on an input time of concentration. The only input necessary for this procedure is the time of concentration for the sub-area. The runoff generated from individual sub-catchments is determined using the equation $q_i = I_i A_1 + I_{i-1} A_2 + \ldots + I_1 A_i$, where $q_i$ is the hydrograph ordinate, $I_i$ is the effective rainfall intensity, $A_i$ is the contributing sub-catchment area at a particular time, and $i$ is the number of isochrone areas contributing to the outlet.

(g) Other unit hydrograph methods are the Nash, Snyder, Snyder (Alameda) and Santa Barbara urban hydrographs.

(i) Nash unit hydrograph
Nash in 1957 proposed a conceptual catchment model that treats a drainage basin as a series of identical linear reservoirs. Two items of data are required to apply this method: an exponent and a time of concentration.

(ii) Santa Barbara urban hydrograph (SBUH)
The SBUH method was developed by the Santa Barbara County Flood and Water Conservation District, California.
The SBUH method directly computes a runoff hydrograph without going through an intermediate process as the SCS method does. The SBUH method uses two steps to synthesize the runoff hydrograph: the first step computes the instantaneous hydrograph, and the second step computes the runoff hydrograph.

(iii) Snyder unit hydrograph
Snyder (1938) was the first to develop a synthetic unit hydrograph, based on a study of watersheds in the Appalachian Highlands. Snyder’s relationship for the basin lag is $T_p = C_t (L L_c)^{0.3}$, where $T_p$ is the basin lag, $L$ is the length of the main stream from the outlet, $L_c$ is the length along the main stream to a point nearest the watershed centroid, and $C_t$ is a coefficient usually ranging from 1.8 to 2.2. The peak discharge of the unit hydrograph is $Q_p = 640\,C_p A / T_p$, where $A$ is the drainage area and $C_p$ is the storage coefficient, ranging from 0.4 to 0.8. The time base of the hydrograph is $T_b = 3 + T_p / 8$.

(iv) Snyder (Alameda) unit hydrograph
The Snyder (Alameda) method is the Snyder procedure as applied by Alameda County in California, where the individual parameters more suited to the Alameda region are computed from catchment characteristics. This method requires four items of data: stream length ($L$), centroid length ($L_c$), stream slope ($S$) and basin roughness ($N$).

In this study, the model options used for modelling the rainfall-runoff relationship are the Horton infiltration/loss method together with the SCS, Time Area and Rational Formula unit hydrographs. These models have been selected because they are simple and require only a small number of parameters. Therefore, they are easy to design and manage when little information is available for the study area.

3.7 Calibration of Distributed Models

The proper choice of calibration data may mitigate difficulties encountered in model calibration.
Critical issues pertaining to calibration data are the amount of data necessary and sufficient for calibration and the quality of data resulting in the best parameter estimates. However, our understanding of how to address such issues is less than complete (Singh, 2002).

Model calibration is one of the important processes in predicting discharge with distributed models. Suppose that the parameters of a distributed model have initially been calibrated only on the basis of prior information about soil and vegetation type, with some adjustment of values being made to improve the simulation of measured discharges. Suppose further that, after the initial calibration, a decision is made to collect more spatially distributed information about the catchment response. Because of the lack of information available about the internal responses of the catchment, the model would probably use effective values of the model parameters over wide regions of the flow domain. But if the new data are used to improve the local calibration, more data will be needed to evaluate the model, without necessarily having any impact on the variables of greatest interest in predicting discharge in rainfall-runoff modelling.

For each different catchment, especially where there are differences in soil, geological condition, etc., a set of parameters needs to be established so that the HEC-HMS and SWMM models can simulate the hydrological processes. The procedure for determining parameter values for a particular catchment is called parameter calibration or parameter optimization. However, before undertaking parameter calibration, a system performance check was implemented to ensure continuity of mass during simulations. In general, as with many other complex models, parameter calibration involved multiple trial-and-error runs, with professional judgement being used to decide which parameters to adjust and to what extent.
The strategy for determining the model parameters can be summarized as follows:

(a) Pick initial values for the parameters that are as rational as possible by doing some simple calculations.
(b) Pick a low-flow period that is followed by a peak flow at the beginning of a simulation and determine the initial type of parameter first.
(c) Test-run the model and evaluate the results.

Other parameters can also be adjusted, such as the Horton parameters, the channel geometric parameters, and evapotranspiration. Some of these parameters may have little effect on short-term hydrological processes and thus may not need to be adjusted.

3.8 Evaluation of the Model

Most models perform little to no error analysis. Thus, it is not clear what the model errors are and how different errors propagate through different model components and parameters (Singh and Woolhiser, 2002). This is one of the major limitations of most current catchment hydrology models. The issue of forecasting accuracy arises both at the model development stage and when the model is used to handle the forecasting task (Harun, 1999). During the development stage, the model performance is monitored based on the agreement of the proposed model with the actual data. Following the model development stage, the suitable model is selected for testing its capability to perform the prediction task. Basically, the performance of each model is compared based on the estimated errors between the computed and the actual data. It is often useful to calculate more than one criterion for comparison because occasionally different criteria may give different indications.

3.8.1 Goodness of Fit Tests

There are many measures that can be used to evaluate the results of a model simulation. These will depend on what observational data are available to evaluate each model. Most measures of goodness of fit used in hydrograph simulation in the past have been based on the sum of squared errors, or error variance (Beven, 2001).
For this study, quantitative tests are applied to assess the capability of the models in describing the process investigated. The quantitative tests are performed using the several indicators described below. It is often more effective to calculate more than one criterion for comparison, as different measures may give different indications. The performance of each candidate model is compared based on the estimated errors between the computed values, or predictions, and the actual data. The present study therefore employs five different measures of accuracy to assess the performance of each model. The formulas for the five measures are as follows.

3.8.1.1 Coefficient of correlation ($R^{2}$)

The criterion used for assessing the performance of the different models is the established $R^{2}$ criterion of Nash and Sutcliffe (1970), which was expressed as:

$$R^{2} = \frac{\sum_{t=1}^{n}\left[\left(Q_{o(t)} - \bar{Q}_{o}\right)\left(Q_{s(t)} - \bar{Q}_{s}\right)\right]}{\left[\sum_{t=1}^{n}\left(Q_{o(t)} - \bar{Q}_{o}\right)^{2}\,\sum_{t=1}^{n}\left(Q_{s(t)} - \bar{Q}_{s}\right)^{2}\right]^{1/2}} \tag{3.64}$$

in which $Q_{o(t)}$ is the actual observed streamflow value; $Q_{s(t)}$ is the model simulated streamflow value; $\bar{Q}_{o}$ and $\bar{Q}_{s}$ are the corresponding mean values; and $n$ is the number of time periods of observed streamflow over which the errors are computed. A value of $R^{2}$ of 90% indicates a very satisfactory model performance, while a value in the range 80-90% indicates a fairly good model. Values of $R^{2}$ in the range 60-80% would indicate an unsatisfactory model fit (Kachroo, 1986). For a perfect match, the value of $R^{2}$ should be 1.0 (Thirumalaiah and Deo, 2000; Yu and Yang, 2000).

3.8.1.2 The root mean square error (RMSE)

$$RMSE = \left[\frac{1}{n}\sum_{t=1}^{n}\left(Q_{o(t)} - Q_{s(t)}\right)^{2}\right]^{1/2} \tag{3.65}$$

Generally, these formulas evaluate the models based on a comparison of the estimated errors between the actual observations and the fitted model. A model with the minimum error is considered the best choice.
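The two measures defined in equations 3.64 and 3.65 can be computed with a few lines of Python. The sketch below is illustrative only: the function names and the sample flow series are hypothetical, and equation 3.64 is implemented exactly as written above, where it has the form of a product-moment correlation between the observed and simulated series.

```python
import math

def correlation_criterion(observed, simulated):
    """Equation 3.64 as written: cross-product of deviations from the means,
    divided by the square root of the product of the squared deviations."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    numerator = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    denominator = math.sqrt(
        sum((o - mean_o) ** 2 for o in observed)
        * sum((s - mean_s) ** 2 for s in simulated)
    )
    # Assumption: neither series is constant, otherwise the denominator is zero.
    return numerator / denominator

def rmse(observed, simulated):
    """Equation 3.65: root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

# Hypothetical flows: the simulation is the observation shifted up by 1 unit.
q_obs = [1.0, 2.0, 3.0, 4.0]
q_sim = [2.0, 3.0, 4.0, 5.0]
print(correlation_criterion(q_obs, q_sim), rmse(q_obs, q_sim))
```

In this example the constant offset leaves the criterion of equation 3.64 at 1.0 while the RMSE is 1.0, which illustrates why it is useful to calculate more than one criterion.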
3.8.1.3 The relative root mean square error (RRMSE)

$$ \mathrm{RRMSE} = \left[\frac{1}{n}\sum_{t=1}^{n}\left(\frac{Q_{o(t)}-Q_{s(t)}}{Q_{o(t)}}\right)^{2}\right]^{1/2} \tag{3.66} $$

To compare the accuracy of the model used in estimating the runoff, the at-site estimates (using the actual data from the site being estimated) were compared with those found using the model.

3.8.1.4 The mean absolute percentage error (MAPE)

$$ \mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{Q_{o(t)}-Q_{s(t)}}{Q_{o(t)}}\right| \times 100\% \tag{3.67} $$

Johnson and King (1988) suggested that a MAPE of around 30% is considered a reasonable prediction. Further, the analysis is considered very accurate when the MAPE is in the range of 5% to 10%.

3.8.1.5 The percentage bias (PBIAS)

$$ \mathrm{PBIAS} = \frac{\sum_{t=1}^{n}\left(Q_{s(t)}-Q_{o(t)}\right)}{\sum_{t=1}^{n}\left(Q_{o(t)}-\bar{Q}_{o}\right)^{2}} \times 100\% \tag{3.68} $$

The model performances calibrated by various objective functions are evaluated using the percentage bias (PBIAS) as proposed by Yapo et al. (1996). PBIAS measures the bias of model performance. The optimal value is 0.0, which means that the model has an unbiased flow simulation. Positive values indicate a tendency towards overestimation and negative values indicate a tendency towards underestimation.

All these measures are aimed at providing a relative measure of model performance. That measure should reflect the aims of a particular application in an appropriate way. Beven (2001) noted that there is no universal performance measure and that, whatever choice is made, there will be an effect on the relative goodness of fit estimates for different models and parameter sets, particularly if an optimum parameter set is sought. It is often useful to calculate more than one criterion for comparison because different criteria may occasionally give different indications. In general, the model with the minimum error is chosen as the best model for prediction. The root mean square error (RMSE) is one of the most commonly used performance measures in hydrological modelling. Many researchers have used RMSE as an accuracy measure (Hsu et
al., 1995; Shamseldin, 1997; Harun, 1999; Tokar and Markus, 2000; Elshorbagy et al., 2000; Yu and Yang, 2000). Harun (1999) reported that the interpretation of the MAPE is quite straightforward, because most people can readily appreciate the significance of forecasts being within a certain percentage of the true value. Therefore, the trained networks were compared with each other using the goodness of fit statistics discussed above, and the best-fit network was selected for each training and testing set.

3.8.2 Missing Data and the Outliers

Missing data are very common in hydrological data collection and need to be estimated or treated. A number of methods can be used for this purpose, such as the arithmetic mean method, normal ratio method, modified normal ratio method, inverse distance method, quadrant method, modified inverse distance method, isohyetal method, Beard’s method, rank matching method and linear programming method. The rainfall data from the rain gauges were used in the calibration of the models. Some data were found to be missing at different stations for different events. The linear regression method was used to fill in the missing rainfall data. Goodness of fit between the data sets of the various stations was judged by the resulting values of the Index of Fit ( R² ).

Outliers are another aspect that should be tackled in preparing the data series. There are two types of outliers, namely high outliers and low outliers. Maidment (1993) described outliers as data points which depart significantly from the trend of the remaining data. In experimental statistics, an outlier is often a rogue observation which may result from unusual conditions or from observational or recording error; such observations are often discarded. High outliers are retained unless historical information is identified showing that such floods are the largest in an extended period.
If no cause can be determined, Grubbs (1969) proposed one of the following actions for treating the outliers:
(i) The outliers could be eliminated from the sample, with only the remaining observations used in further analysis.
(ii) The outliers could be replaced with the next closest values in the sample.
(iii) The outliers could be retained, and the median used in the test statistic rather than the mean.
(iv) The outliers could be removed, and truncated sample theory applied.

3.9 The Study Area

Automatic gauging stations were chosen from selected catchments over Peninsular Malaysia. These stations were chosen based on several criteria. First and foremost, the quality and quantity of data for each station were individually screened; the initial screening was done on all the available automatic gauging stations in the country. Some stations had numerous days of missing data in a year, which makes the annual maximum values questionable. These stations were excluded from the study. The length of record or data available is of primary importance. The amount of data available for each station varies between 23 and 29 years, as the earliest date of operation for automatic rain gauges in Peninsular Malaysia was 1971.

In this study, mathematical programming based on neural network methods was applied to model the rainfall-runoff relationship in the selected catchment areas. A total of four regions have been defined in Peninsular Malaysia, within which reasonably consistent regional rainfall-runoff relationships have been established. The precision with which the areal extent of these regions can be specified is largely dependent upon the areal density of the gauging stations within each region. Each region consists of the catchment areas of all the station records judged to be homogeneous and used in the study.
These regions are the Sungai Bekok, Sungai Ketil, Sungai Klang and Sungai Slim catchment areas, shown in Figures 3.4, 3.5, 3.6 and 3.7 respectively, together with the location of the gauging stations used in the analysis. A rainfall station register is to be completed for every rainfall station within each state. This includes both manual and automatic recording stations in the primary and secondary networks. Tables 3.2 to 3.5 show the characteristics of the rainfall and runoff stations that are available in the selected catchments.

3.9.1 Selection of training and testing data

The selection of training data that represent the characteristics of a watershed and its meteorological patterns is extremely important in modelling (Yapo et al., 1996). The input variable (rainfall) is selected to describe the physical phenomena of the rainfall-runoff process, in order to forecast runoff. The steps involved in the identification of a dynamic model of a system are: selection of input-output data suitable for calibration and verification; selection of a model structure and estimation of its parameters; and validation of the identified models (Hsu et al., 1995). For the recorded data to be generally useful, the exposure of instruments and the recording practices must remain as constant as possible for a lengthy period of at least a decade (Hecht-Nielsen, 1991). The most current data were used in testing in order to examine the capability of the model in predicting future runoff, without directly including the change in land-use characteristics, such as urbanization. It is very difficult to assign absolute error values to historic rainfall and runoff data.
Some estimate of the likely errors can be made by examining the station record, the physical features of the site, the actual chart record of water level where available, and the precision with which the station rating curve is defined at flood stages. In many cases, for historic reasons, very little of this type of information is available. A subjective classification of the value of the flood peak information used in this study has been made. In considering the station records within a proposed flood frequency region, the effects of possible data errors on the results can be considered to be: (1) errors in the record likely to make the inclusion of the station within the flood frequency region suspect; (2) errors in the record likely to influence the mean annual flood-catchment area relationship; and (3) errors in the record likely to influence the regional flood frequency relationship. The data from each station record used in this study have been examined and estimates made in respect of these three considerations. The errors in the data finally used to define the areal extent and the flood frequency relationships for each region are considered to have a negligible effect on the results presented.

For the daily rainfall-runoff modelling, five-year records of daily rainfall-runoff series for the Sungai Bekok catchment (1984-1988), Sungai Ketil catchment (1986-1990), Sungai Klang catchment (1996-2000) and Sungai Slim catchment (1996-2000) were selected to evaluate the performance of the neural network model. The data consist of two sets: the first four years of data are used for model calibration (training) and validation in the case of ANN, and the remaining one year of data is used for model verification (testing). The data used for the calibration process can cover more than four years in order to represent the characteristics of the catchment area.
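The chronological split described above (first four years for training, final year for testing) can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: the helper name `split_by_year`, the `(date, rainfall, runoff)` record layout and the zero-valued placeholder series are all hypothetical and are not taken from the study's MATLAB programs.

```python
from datetime import date, timedelta


def split_by_year(records, n_train_years=4):
    """Split (date, rainfall, runoff) records chronologically by calendar year.

    The earliest n_train_years calendar years go to the training set and
    every later year goes to the testing set.
    """
    years = sorted({d.year for d, _, _ in records})
    train_years = set(years[:n_train_years])
    train = [r for r in records if r[0].year in train_years]
    test = [r for r in records if r[0].year not in train_years]
    return train, test


# Placeholder daily series covering 1984-1988 (the Sungai Bekok daily period).
# The rainfall and runoff values are dummies, used only to show the mechanics.
start = date(1984, 1, 1)
n_days = (date(1989, 1, 1) - start).days
records = [(start + timedelta(days=i), 0.0, 0.0) for i in range(n_days)]

train, test = split_by_year(records)
print(len(train), len(test))
```

Keeping the most recent year for testing mirrors the intent stated in this section, namely that the most current data are reserved to examine how the model predicts future runoff.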
Two data sets representing the wet and dry seasons were selected for testing in order to identify the rainfall-runoff behaviour in the selected catchments. Increasing the number of training data in the training phase, with no change in neural network structure, will improve performance in the training and testing phases. Thus, performance depends on providing an adequate number of good-quality training data.

Meanwhile, for the hourly rainfall-runoff modelling, ten-year records of hourly rainfall-runoff series for the Sungai Bekok catchment (1991-2000), Sungai Ketil catchment (1983-1992), Sungai Klang catchment (1991-2000) and Sungai Slim catchment (1988-1997) are used. Good-quality input-output pairs of data sets were selected to develop and evaluate the neural network models. The neural network was trained under two sets of conditions. In this study, 55 sets of data have been selected from the records. The first 50 sets of data are used for model calibration (training), and the remaining 5 sets of data are used for model verification (testing). The data used for the calibration process can exceed 50 sets in order to represent the characteristics of the catchment area. It is anticipated that increasing the number of training data in the training phase, with no change in neural network structure, will improve performance in the training and testing phases. Thus, performance depends on providing an adequate number and quality of training data. In addition, the capability of a network in predicting peak flow was examined using the ratio of the predicted to the observed peak flow values.

3.9.2 The Sungai Bekok Catchment

Figure 3.4 shows the schematics of the Sungai Bekok catchment area. It is located in the state of Johor, Malaysia. The Sg. Bekok is a natural catchment with a size of 350 km². In terms of land use, the Sg. Bekok catchment consists of 85% open space (agriculture, fields, roads, utility reserve, etc.) and 15% domestic area.
In future, there is a high possibility of further changes in the land use pattern. It is located in the southwestern part of Johor, at latitude 02° 07’ 15” and longitude 103° 02’ 30”. Table 3.2 shows the latitude and longitude of the water level station and the raingauge stations for the Sungai Bekok catchment. Two raingauge stations (No. 2130068 and 2031069) located within the catchment boundary have been used for analysis purposes. Rainfall is recorded at intervals as short as 1 minute by electronic data loggers and retrieved once fortnightly. An automatic water level recorder is provided downstream at Sg. Bekok at Bt. 77 Jalan Yong Peng/Labis (2130422). The water level recorder can record readings at intervals as short as 1 minute using an electronic logger. The data are retrieved once fortnightly.

Referring to the Soil Map of Malaya (1962), Ministry of Agriculture, Malaysia, this catchment area consists of two types of soil: 35% red and yellow latosols and red and yellow podzolic soils on gently to strongly sloping land of variable fertility, derived from a variety of sedimentary rocks; and 65% organic soils, principally peats, with some mucks, developed over mineral alluvial soils in poorly drained situations, of limited suitability for agricultural development.

Table 3.2: Raingauges used in calibration and verification of the models for Sg. Bekok catchment

(a) Water Level Station                                      Latitude       Longitude
At Sg. Bekok At Bt. 77 Jalan Yong Peng/Labis (2130422)       02° 07’ 15”    103° 02’ 30”

(b) Raingauge                                                Latitude       Longitude
At Ladang Union, Yong Peng (2130068)                         02° 07’ 50”    103° 03’ 00”
At Ladang Yong Peng, Batu Pahat (2031069)                    02° 04’ 15”    103° 09’ 10”

Figure 3.4 The Sungai Bekok catchment area

3.9.3 The Sungai Ketil Catchment

Figure 3.5 shows the schematics of the Sungai Ketil catchment area. It is located in the state of Kedah, Malaysia. The Sg. Ketil catchment is a semi-developed area with a size of 704 km².
It consists of 53% open space (agriculture, fields, roads, utility reserve, etc.), 22% domestic area and 25% commercial area. It is located in the southern part of Kedah, at latitude 05° 38’ 20” and longitude 100° 48’ 45”. Table 3.3 shows the latitude and longitude of the water level station and the raingauge stations for the Sg. Ketil catchment. Three raingauge stations (No. 5608074, 5609072 and 5708071) located within the catchment boundary have been used for analysis purposes. Rainfall is recorded at intervals as short as 1 minute by electronic data loggers and retrieved once fortnightly. An automatic water level recorder is provided downstream at Sg. Ketil at Kuala Pegang (5608418). The water level recorder can record readings at intervals as short as 1 minute using an electronic logger. The data are retrieved once fortnightly.

Referring to the Soil Map of Malaya (1962), Ministry of Agriculture, Malaysia, this catchment area consists of three types of soil: 45% red and yellow latosols and red and yellow podzolic soils on gently to strongly sloping land, mostly of average fertility, derived from acid igneous rocks; 45% laterite soils on gently to strongly sloping land, mostly of average to below average fertility; and 10% freely drained coarse-textured grey brown podzols of below average fertility, developed over recently accumulated coast deposits with associated swamps.

Figure 3.5 The Sungai Ketil catchment area

Table 3.3: Raingauges used in calibration and verification of the models for Sg. Ketil catchment

(a) Water Level Station                      Latitude       Longitude
At Sg. Ketil At Kuala Pegang (5608418)       05° 38’ 20”    100° 48’ 45”

(b) Raingauge                                Latitude       Longitude
At Pulai (5608074)                           05° 39’ 25”    100° 53’ 55”
At Hospital Baling (5609072)                 05° 40’ 50”    100° 55’ 00”
At Kg. Terabak (5708071)                     05° 45’ 05”    100° 53’ 35”

3.9.4 The Sungai Klang Catchment

Figure 3.6 shows the schematics of the Sungai Klang catchment area.
It is located in Kuala Lumpur, Malaysia. The Sg. Klang catchment is a fully developed (urbanized) area with a size of 468 km². Urbanization is known to have a significant effect on rainfall-runoff relationships. The catchment consists of 65% commercial area, 25% domestic area and 10% open space (agriculture, fields, roads, utility reserve, etc.). It is located in the northwestern part of Kuala Lumpur, at latitude 03° 08’ 20” and longitude 101° 41’ 50”. Table 3.4 shows the latitude and longitude of the water level station and the raingauge stations for the Sg. Klang catchment. Eight raingauge stations (No. 3116003, 3116005, 3116006, 3117070, 3118069, 3216001, 3217001 and 3116006) located within the catchment boundary have been used for analysis purposes. Rainfall is recorded at intervals as short as 1 minute by electronic data loggers and retrieved once fortnightly. An automatic water level recorder is provided downstream at Sg. Klang at Jambatan Sulaiman (3116430). The water level recorder can record readings at intervals as short as 1 minute using an electronic logger. The data are retrieved once fortnightly.

Referring to the Soil Map of Malaya (1962), Ministry of Agriculture, Malaysia, this catchment area consists of three types of soil: 45% lithosols and shallow latosols on steep mountainous and hilly land considered unsuitable for extensive agricultural development; 40% red and yellow latosols and red and yellow podzolic soils on gently to strongly sloping land, mostly of average fertility, derived from acid igneous rocks; and 15% disturbed land, chiefly tin tailings, of limited suitability for agriculture.

Figure 3.6 The Sungai Klang catchment area

Table 3.4: Raingauges used in calibration and verification of the models for Sg. Klang catchment

(a) Water Level Station                            Latitude       Longitude
At Sg. Klang At Jambatan Sulaiman (3116430)        03° 08’ 20”    101° 41’ 50”

(b) Raingauge                                      Latitude       Longitude
At Ibu Pejabat JPS, Malaysia (3116003)             03° 09’ 05”    101° 41’ 05”
At Sekolah Rendah Taman Maluri (3116005)           03° 11’ 50”    101° 38’ 10”
At Ladang Edinburgh Site 2 (3116006)               03° 11’ 00”    101° 38’ 00”
At Pusat Penyelidikan JPS, Ampang (3117070)        03° 09’ 20”    101° 45’ 00”
At Pemasokan Ampang (3118069)                      03° 09’ 30”    101° 48’ 05”
At Kg. Sg. Tua (3216001)                           03° 16’ 20”    101° 41’ 10”
At Ibu Bekalan KM. 16, Gombak (3217001)            03° 16’ 05”    101° 43’ 45”
At Ibu Bekalan KM. 11, Gombak (3116006)            03° 14’ 10”    101° 45’ 10”

3.9.5 The Sungai Slim Catchment

Figure 3.7 shows the schematics of the Sungai Slim catchment area. It is located in the state of Perak, Malaysia. The Sg. Slim is a semi-developed area with a size of 455 km². It consists of 65% open space (agriculture, fields, roads, utility reserve, etc.), 25% domestic area and 10% commercial area. It is located in the middle part of Perak, at latitude 03° 49’ 35” and longitude 101° 24’ 40”. Table 3.5 shows the latitude and longitude of the water level station and the raingauge stations for the Sg. Slim catchment. Two raingauge stations (No. 3814156 and 3814157) located within the catchment boundary have been used for analysis purposes. Rainfall is recorded at intervals as short as 1 minute by electronic data loggers and retrieved once fortnightly. An automatic water level recorder is provided downstream at Sg. Slim at Slim River (3814416). The water level recorder can record readings at intervals as short as 1 minute using an electronic logger. The data are retrieved once fortnightly.

Referring to the Soil Map of Malaya (1962), Ministry of Agriculture, Malaysia, this catchment area consists of three types of soil.
These are 10% lithosols and shallow latosols on steep mountainous and hilly land considered unsuitable for extensive agricultural development; 55% red and yellow latosols and red and yellow podzolic soils on gently to strongly sloping land of variable fertility, derived from a variety of sedimentary rocks; and 35% low humic gley soils, being moderately and poorly drained soils developed over coastal plains and in the valleys and flood plains of the larger rivers, of very variable fertility.

Table 3.5: Raingauges used in calibration and verification of the models for Sg. Slim catchment

(a) Water Level Station                          Latitude       Longitude
At Sg. Slim At Slim River (3814416)              03° 49’ 35”    101° 24’ 40”

(b) Raingauge                                    Latitude       Longitude
At Ladang Bedford, Slim River (3814156)          03° 51’ 40”    101° 26’ 40”
At Ladang Baba Bakala (3814157)                  03° 49’ 55”    101° 24’ 30”

Figure 3.7 The Sungai Slim catchment area

The most current data were used in testing in order to examine the capability of the model in predicting future runoff, without directly including the change in land-use characteristics, such as urbanization. Based on the daily and hourly rainfall-runoff relationships, five data sets (set 1, set 2, set 3, set 4 and set 5) were selected for training and testing in order to identify the rainfall-runoff behaviour in the four selected catchments in Peninsular Malaysia.

3.10 Computer Packages

In this study, the computer programs were written in MATLAB (MathWorks Inc., 2000), which was used as a tool to develop model structures for prediction of the rainfall-runoff relationship. The modelling approach used in the present study is based on the artificial neural network method for modelling the hydrologic input-output relationship. MATLAB can be incorporated effectively to enhance understanding and to enable the researcher to put theory into practice.
This software is known to be user-friendly and flexible, with a high capability for the analysis and design of hydrologic processes. The performances of these models were compared with those of existing models available in the market. Results of this study will permit the identification of the best models for rainfall-runoff modelling.

CHAPTER 4

RESULTS AND DISCUSSION

4.1 General

This study focuses on the application of the Multilayer Perceptron (MLP) and Radial Basis Function (RBF) methods in modelling the rainfall-runoff relationship. Results of the MLP and RBF models are presented in Sections 4.2 and 4.3 respectively. Results of the multiple linear regression (MLR), HEC-HMS and XP-SWMM models are described in Sections 4.4, 4.5 and 4.6 respectively. The term ‘training’ used for the MLP and RBF models is equivalent to the term ‘calibration’ used for the other comparison models. Similarly, the term ‘testing’ used for the MLP and RBF models is equivalent to the term ‘verification’ used for the other models. This process was carried out to obtain the best parameters or weight coefficients before the model is ready to be used for testing, prediction and evaluation. The evaluation is carried out based on the correlation of coefficient ( R² ), root mean square error (RMSE), relative root mean square error (RRMSE), mean absolute percentage error (MAPE) and percentage bias (PBIAS). Appendix J shows the programs for the daily (Part A) and hourly (Part B) rainfall-runoff relationships that have been developed using MATLAB.

4.2 Results of the Multilayer Perceptron (MLP) Model

To evaluate the performance of the model, previous records of daily and hourly rainfall-runoff data for the Sungai Bekok, Sungai Ketil, Sungai Klang and Sungai Slim catchments are used. Five selected sets of data are used as the prediction sets.
Testing data sets 1, 2 and 3 represent normal conditions of 1 year, 6 months and the other 6 months respectively in the case of daily rainfall-runoff modelling. Meanwhile, testing data sets 4 and 5 represent the dry and wet seasons. For the neural network training process, the best number of hidden nodes is chosen based on the minimum root mean square error (RMSE) computed for the training data. For the training or calibration phase, the data used are 100%, 50% and 25% (the minimum) of the training data sets. For example, in the case of daily rainfall-runoff modelling, 100% of the data consists of 4 years or 1460 data sets. Meanwhile, 50% and 25% of the data means that the calibration data were randomly selected from the total of 1460 data sets. Even though the data sets for the training process are randomly selected, they must cover the maximum and minimum values of the testing data sets. The same approach was used for hourly rainfall-runoff modelling. The objectives are, first, to evaluate the effect of the quantity of data on model performance and on the accuracy of the results and, second, to evaluate the robustness of the model.

4.2.1 Results of Daily MLP Model

Two types of MLP model structure have been developed for modelling the daily rainfall-runoff relationship. The first model is the 3-layer MLP structure with one hidden layer. The second model is the 4-layer MLP structure with two hidden layers. Results of the 3-layer and 4-layer MLP models are described below.

Tables 4.1(a) to 4.1(c) present the correlation of coefficient ( R² ), RMSE, RRMSE and MAPE resulting from the 3-layer MLP model of the daily rainfall-runoff relationship for the Sungai Bekok catchment. Meanwhile, Figures 4.1(a) to 4.1(c) illustrate the graphical results of the 3-layer MLP during training and testing. For the Sungai Bekok catchment, the 3-layer MLP has 16 input nodes and 13 hidden nodes.
The total number of parameters is 250. The calculation is (16 × 13) + (13 × 1) + 16 + 13 = 250, which represents the total number of weights in the network structure. Using 100% of the training data sets (Table 4.1(a)) gives more consistent and reliable results in the prediction phase than using 50% or 25% of the training data sets. It can be concluded that a large number of training data sets is required to perform successful training. Most values of R² are in the range of 80% to 97%. This outcome indicates that the MLP model consistently shows a good performance in rainfall-runoff modelling. Kachroo (1986) reported that when R² is more than 90%, the model is very satisfactory, and that it is fairly good with R² in the range of 80% to 90%. Johnson and King (1988) judged model accuracy based on the MAPE value: the prediction model is considered reasonable with a MAPE below 30% and very accurate with a MAPE less than 10%. The results of modelling for Sungai Bekok, with MAPE less than 10%, can therefore be considered very accurate.

Table 4.1(a): Results of 3-Layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase

Data Set          Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      16-13-1*          250                 0.9350     0.1660          0.0328   2.010
MLP-TEST Set 1    16-13-1*          250                 0.9310     0.1220          0.0242   1.8143
MLP-TEST Set 2    16-13-1*          250                 0.9580     0.0893          0.0183   1.6232
MLP-TEST Set 3    16-13-1*          250                 0.9102     0.1264          0.0253   1.9461
MLP-TEST Set 4    16-13-1*          250                 0.9706     0.1254          0.0262   2.3390
MLP-TEST Set 5    16-13-1*          250                 0.8441     0.1345          0.0276   2.1922
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table 4.1(b): Results of 3-Layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase

Data Set          Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      16-13-1*          250                 0.9660     0.1258          0.0254   1.8522
MLP-TEST Set 1    16-13-1*          250                 0.9144     0.1709          0.0337   2.7322
MLP-TEST Set 2    16-13-1*          250                 0.9571     0.1050          0.0206   1.6073
MLP-TEST Set 3    16-13-1*          250                 0.8860     0.1696          0.0338   2.7705
MLP-TEST Set 4    16-13-1*          250                 0.9718     0.0936          0.0193   1.7228
MLP-TEST Set 5    16-13-1*          250                 0.8155     0.1612          0.0328   2.6859
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table 4.1(c): Results of 3-Layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase

Data Set          Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      16-13-1*          250                 0.9380     0.1327          0.0270   1.8472
MLP-TEST Set 1    16-13-1*          250                 0.8860     0.1248          0.0249   1.7972
MLP-TEST Set 2    16-13-1*          250                 0.9451     0.0981          0.0203   1.7598
MLP-TEST Set 3    16-13-1*          250                 0.8507     0.1307          0.0266   1.9147
MLP-TEST Set 4    16-13-1*          250                 0.9655     0.1357          0.0282   2.6268
MLP-TEST Set 5    16-13-1*          250                 0.7680     0.1358          0.0282   2.1433
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

During the training phase, the RMSE for Sungai Bekok is consistently less than 0.17 cumecs for the 3-layer (16-13-1) model structure. The RRMSE also remains below 0.033 for this model structure. During testing, the RMSE is no more than about 0.17 cumecs and the RRMSE is close to zero. The application of the MLP method to model the rainfall-runoff relationship of Sungai Bekok is therefore successful. The Sungai Bekok has observed flows between 4.13 and 7.10 cumecs and a catchment size of 350 km². Meanwhile, the results for R², RMSE, RRMSE and MAPE show that the networks calibrated using the dry-season data set (set 4) approximate the rainfall-runoff process in the catchment more closely than the networks calibrated using the wet-season data set (set 5).

Tables 4.2(a) to 4.2(c) present the R², RMSE, RRMSE and MAPE resulting from the 4-layer MLP model of the daily rainfall-runoff relationship for the Sungai Bekok catchment.
Meanwhile, Figures 4.2(a) to 4.2(c) illustrate the graphical results of the 4-layer MLP during training and testing.

Table 4.2(a): Results of 4-Layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase

Data Set          Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      16-13-11-1*       402                 0.9310     0.1710          0.0339   2.1600
MLP-TEST Set 1    16-13-11-1*       402                 0.9340     0.1184          0.0234   1.7441
MLP-TEST Set 2    16-13-11-1*       402                 0.9604     0.0920          0.0189   1.7046
MLP-TEST Set 3    16-13-11-1*       402                 0.9148     0.1226          0.0246   1.8785
MLP-TEST Set 4    16-13-11-1*       402                 0.9732     0.1309          0.0274   2.4430
MLP-TEST Set 5    16-13-11-1*       402                 0.8568     0.1296          0.0266   2.1094
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table 4.2(b): Results of 4-Layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase

Data Set          Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      16-13-11-1*       402                 0.9624     0.1268          0.0250   1.6902
MLP-TEST Set 1    16-13-11-1*       402                 0.9245     0.1802          0.0356   2.9553
MLP-TEST Set 2    16-13-11-1*       402                 0.9580     0.1102          0.0214   1.6143
MLP-TEST Set 3    16-13-11-1*       402                 0.8987     0.1810          0.0360   3.0342
MLP-TEST Set 4    16-13-11-1*       402                 0.9679     0.0946          0.0192   1.7231
MLP-TEST Set 5    16-13-11-1*       402                 0.8378     0.1701          0.0346   2.9212
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table 4.2(c): Results of 4-Layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase

Data Set          Model Structure   No. of Parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      16-13-11-1*       402                 0.9155     0.1530          0.0304   2.0354
MLP-TEST Set 1    16-13-11-1*       402                 0.9233     0.1170          0.0231   1.6629
MLP-TEST Set 2    16-13-11-1*       402                 0.9614     0.0983          0.0204   1.8458
MLP-TEST Set 3    16-13-11-1*       402                 0.8966     0.1234          0.0247   1.8017
MLP-TEST Set 4    16-13-11-1*       402                 0.9662     0.1388          0.0292   2.6270
MLP-TEST Set 5    16-13-11-1*       402                 0.8449     0.1278          0.0263   2.0734
(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

For the Sungai Bekok catchment, the 4-layer MLP has 16 input nodes. The numbers of hidden nodes are 13 and 11 for the first and second hidden layers respectively. Most values of R² in Tables 4.2(a) to 4.2(c) are in the range of 80% to 97%. According to Kachroo (1986), this is a fairly good and satisfactory model. Johnson and King (1988) judged model accuracy based on the MAPE value; by their criterion, this prediction model is considered very accurate, with MAPE below 10%. Increasing the number of hidden layers to two does not significantly change the results of rainfall-runoff modelling in the Sungai Bekok catchment. The R² is maintained in the range of 80% to 97%. During the training phase, the RMSE for Sungai Bekok is consistently less than 0.18 cumecs for the 4-layer (16-13-11-1) model structure. The RRMSE also remains at or below 0.034 for this model structure. During testing, the RMSE is consistently less than 0.182 cumecs and the RRMSE is less than 0.04, which is close to zero. The application of the 4-layer MLP structure to model the rainfall-runoff relationship of Sungai Bekok is therefore very successful. It is observed that the 4-layer networks calibrated and tested using the dry-season data (set 4) approximate the rainfall-runoff process in the catchment more closely than the networks calibrated using the wet-season data set (set 5).
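The parameter counts reported in Tables 4.1 and 4.2 can be reproduced with a short Python sketch. The text gives the calculation only for the 16-13-1 network, (16 × 13) + (13 × 1) + 16 + 13 = 250; extending the same pattern to the 16-13-11-1 network is an assumption made here, checked only by the fact that it returns the tabulated value of 402. The function name `count_parameters` is hypothetical.

```python
def count_parameters(layers):
    """Count parameters following the pattern of the 16-13-1 calculation.

    layers lists the node count of each layer, e.g. [16, 13, 1].
    The count is the sum of the connection weights between adjacent layers
    plus one additional term per node in every layer except the output layer
    (the "+ 16 + 13" in the text). How those additional terms are used in
    the network is not stated in the text, so no interpretation is assumed.
    """
    if len(layers) < 2:
        raise ValueError("a network needs at least an input and an output layer")
    connection_weights = sum(a * b for a, b in zip(layers, layers[1:]))
    per_node_terms = sum(layers[:-1])
    return connection_weights + per_node_terms


print(count_parameters([16, 13, 1]))      # 250, as in Table 4.1
print(count_parameters([16, 13, 11, 1]))  # 402, as in Table 4.2
```

The comparison makes the cost of the second hidden layer explicit: the 4-layer structure carries 152 more parameters than the 3-layer structure while, as noted above, giving results that are not significantly different.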
Tables A4.3(a) to A4.3(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE resulting from the 3-layer MLP model for the Sungai Ketil catchment. Meanwhile, Figures I4.3(a) to I4.3(c) in Appendix I (Part A) illustrate the graphical results of the 3-layer MLP during training and testing. For the Sungai Ketil catchment, the 3-layer MLP also uses 16 input nodes, with 14 hidden nodes. With 100% of the training data sets, the model performance is more consistent and reliable for prediction than with 50% or 25% of the training data sets. It is apparent that a large number of training data sets is required for successful training. Most values of R² in Table A4.3(a) are in the range of 80% to 90%, which indicates that the MLP model consistently performs well in rainfall-runoff modelling. According to Kachroo (1986), this is a fairly good model. According to Johnson and King (1988), the prediction model is considered very accurate, with MAPE less than 10%. During the training phase, the RMSE for Sungai Ketil is consistently less than 0.41 cumecs for the 3-layer (16-14-1) model structure, and the RRMSE stays below 0.01. During testing, the RMSE is less than 0.85 cumecs and the RRMSE is less than 0.027, which is close to zero. The application of the MLP method to model the rainfall-runoff relationship of Sungai Ketil is therefore also very successful. Sungai Ketil has observed flows between 29.02 cumecs and 33.15 cumecs, a maximum rainfall of 110.3 mm, and a catchment area of 704 km².

Tables A4.4(a) to A4.4(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE resulting from the 4-layer MLP model for the Sungai Ketil catchment. Meanwhile, Figures I4.4(a) to I4.4(c) in Appendix I (Part A) illustrate the graphical results of the 4-layer MLP during training and testing.
For the Sungai Ketil catchment, the 4-layer MLP also uses 16 input nodes, with two hidden layers of 14 and 10 nodes (first and second layer respectively). Most values of R² in Tables A4.4(a) to A4.4(c) are around 80% to 88% for the 100% and 50% training data sets; the model yields good results when calibrated with 100% and 50% of the training data. According to Kachroo (1986), this is fairly good. According to Johnson and King (1988), the prediction model is considered very accurate, with MAPE below 10%. Increasing the number of hidden layers to two does not change the results of rainfall-runoff modelling for the Sungai Ketil catchment significantly. For the 100% and 50% training data sets, R² remains in the range of 80% to 90%. During the training phase, the RMSE for Sungai Ketil is consistently less than 0.35 cumecs for the 4-layer (16-14-10-1) model structure, and the RRMSE stays at about 0.0095. During testing, the RMSE is consistently less than 0.67 cumecs and the RRMSE is less than 0.022, which is close to zero. The application of the 4-layer MLP structure to model the rainfall-runoff relationship of Sungai Ketil is therefore very successful.

Tables A4.5(a) to A4.5(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE resulting from the 3-layer MLP model for the Sungai Klang catchment. Figures I4.5(a) to I4.5(c) in Appendix I (Part A) illustrate the graphical results during training and testing. For the Sungai Klang catchment, the 3-layer MLP uses 17 input nodes, with 13 hidden nodes. With 100% of the training data sets, the model is moderately consistent and reliable for prediction. Most values of R² in Table A4.5(a) are in the range of 70% to 85% for the 100% and 50% training data sets. According to Kachroo (1986), this is a fairly good model.
According to Johnson and King (1988), the prediction model is considered reasonable, with MAPE below 30%. Most results using the 100%, 50% and 25% training data sets exhibit MAPE less than 30%; the exception is data set 5, which represents the wet season and gives MAPE of more than 30%. This confirms that a large number of training data sets is required for successful training and good results. The outcome indicates that the MLP model consistently shows a fairly good performance in rainfall-runoff modelling for the Sungai Klang catchment. Some of the poorer performance might be due to missing records, which would affect parameter estimation in the calibration phase. During the training phase, the RMSE for Sungai Klang is consistently less than 6.4 cumecs for the 3-layer (17-13-1) model structure, and the RRMSE stays below 0.4. During the testing phase, the RMSE is consistently less than 13.03 cumecs and the RRMSE stays below 0.65 for this 3-layer model structure. The application of the MLP method to model the rainfall-runoff relationship of Sungai Klang is therefore successful. Sungai Klang has observed flows between 6.0 cumecs and 89.0 cumecs, a maximum rainfall of 114 mm, and a catchment area of 468 km².

Tables A4.6(a) to A4.6(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE resulting from the 4-layer MLP model for the Sungai Klang catchment. Meanwhile, Figures I4.6(a) to I4.6(c) in Appendix I (Part A) illustrate the graphical results of the 4-layer MLP during training and testing. Sungai Klang receives runoff from a fairly large catchment (468 km²), and the observed flow ranges between 6 cumecs and 89 cumecs. The RMSE during training is around 6 cumecs for the 4-layer (17-13-9-1) model structure, which has 13 and 9 hidden nodes in the first and second hidden layers respectively. The RRMSE is approximately 0.37.
The testing phase yields RMSE between 4.5 cumecs and 14 cumecs, and the RRMSE is approximately 0.3 to 0.4 for most of the data sets. Most values of R² in Tables A4.6(a) to A4.6(c) are in the range of 70% to 90%, which shows consistent and reliable results in the prediction phase. According to Kachroo (1986), this is fairly good. According to Johnson and King (1988), the prediction model is considered reasonable, with MAPE below 30%. Most results display MAPE less than 30%; however, data set 5, which represents the wet season, gives MAPE of more than 30%. Increasing the number of hidden layers to two does not change the results of rainfall-runoff modelling for the Sungai Klang catchment significantly. The application of the 4-layer MLP structure to model the rainfall-runoff relationship of Sungai Klang is successful.

Tables A4.7(a) to A4.7(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE resulting from the 3-layer MLP model for the Sungai Slim catchment. Figures I4.7(a) to I4.7(c) in Appendix I (Part A) illustrate the graphical results of the 3-layer MLP during training and testing. For the Sungai Slim catchment, the 3-layer MLP uses 16 input nodes, with 14 hidden nodes. With 100% of the training data sets, the model is moderately consistent and reliable for prediction. Most values of R² in Table A4.7(a) are in the range of 73% to 90%. According to Kachroo (1986), this is a fairly good model. According to Johnson and King (1988), the prediction model is considered very accurate, with MAPE less than 10%. This confirms that a large number of training data sets is required for successful training and good results. The outcome indicates that the MLP model consistently performs well in rainfall-runoff modelling for the Sungai Slim catchment. During the training phase, the RMSE for Sungai Slim is consistently less than 0.06 cumecs for the 3-layer (16-14-1) model structure.
The RRMSE also stays below 0.001 for this model structure. During the testing phase, the RMSE is consistently less than 0.04 cumecs and the RRMSE stays below 0.0004 for this 3-layer model structure. The application of the MLP method to model the rainfall-runoff relationship of Sungai Slim is therefore very successful. Sungai Slim has observed flows between 65.5 cumecs and 66.27 cumecs, a maximum rainfall of 1310 mm, and a catchment area of 455 km².

Tables A4.8(a) to A4.8(c) in Appendix A (Part A) present the R², RMSE, RRMSE, and MAPE resulting from the 4-layer MLP model for the Sungai Slim catchment. Meanwhile, Figures I4.8 in Appendix I (Part A) illustrate the graphical results of the 4-layer MLP during training and testing. For the Sungai Slim catchment, the 4-layer MLP uses 16 input nodes, with 14 and 11 hidden nodes in the first and second hidden layers respectively. Most values of R² in Table A4.8(a) are in the range of 70% to 80%. According to Kachroo (1986), this is a fairly good model. Johnson and King (1988) judged model accuracy on the MAPE value, and this prediction model is considered very accurate, with MAPE below 10%. Increasing the number of hidden layers to two does not change the results of rainfall-runoff modelling for the Sungai Slim catchment significantly. During the training phase, the RMSE for Sungai Slim is consistently less than 0.06 cumecs for the 4-layer (16-14-11-1) model structure, and the RRMSE stays below 0.001. During testing, the RMSE is consistently less than 0.026 cumecs and the RRMSE is less than 0.0005, which is close to zero. The application of the 4-layer MLP structure to model the rainfall-runoff relationship of Sungai Slim is therefore successful.

4.2.2 Results of Hourly MLP Model

Two types of MLP model structure have been developed for modelling the hourly streamflow hydrograph.
The first model is the 3-layer MLP structure with one hidden layer. The second model is the 4-layer MLP structure with two hidden layers. Results of the 3-layer and 4-layer MLP models are described as follows.

Tables 4.9(a) to 4.9(c) present the R², RMSE, RRMSE, and MAPE resulting from the 3-layer MLP model of the hourly rainfall-runoff relationship for the Sungai Bekok catchment. Figures 4.9(a) to 4.9(c) illustrate the graphical results of the 3-layer MLP model during training and testing.

The performance of each model is measured by R², RMSE, RRMSE, and MAPE. The MLP for Sungai Bekok uses 7 input nodes. For the neural network training process, the number of hidden nodes is chosen on the basis of the minimum root mean square error (RMSE) computed for the training data; 6 hidden nodes are used. Fifty selected data sets are used for training and five selected data sets are used as the prediction set. Most values of R² approach 1.0, which indicates that the MLP model consistently performs well in rainfall-runoff modelling. Kachroo (1986) reported that a model is very satisfactory when R² is more than 90%, and fairly good when R² is in the range of 80% to 90%. Johnson and King (1988) judged model accuracy on the MAPE value: a prediction model is considered reasonable with MAPE below 30% and very accurate with MAPE less than 10%. The modelling results for Sungai Bekok, with MAPE less than 10%, can therefore be considered very accurate.

Table 4.9(a): Results of 3-layer neural networks for Sg. Bekok catchment using 100% of available data sets in training phase

Data set          Model structure*   No. of parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      7-6-1              61                  0.9927     0.0477          0.0105   1.8910
MLP-TEST Set 1    7-6-1              61                  0.9976     0.0082          0.0016   1.0771
MLP-TEST Set 2    7-6-1              61                  0.9968     0.0091          0.0018   1.1717
MLP-TEST Set 3    7-6-1              61                  0.9896     0.0066          0.0013   0.8792
MLP-TEST Set 4    7-6-1              61                  0.9875     0.0082          0.0019   1.4223
MLP-TEST Set 5    7-6-1              61                  0.9861     0.0033          0.0008   0.6940

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.9(b): Results of 3-layer neural networks for Sg. Bekok catchment using 65% of available data sets in training phase

Data set          Model structure*   No. of parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      7-6-1              61                  0.9907     0.0559          0.0123   1.9101
MLP-TEST Set 1    7-6-1              61                  0.9968     0.0094          0.0018   1.2512
MLP-TEST Set 2    7-6-1              61                  0.9967     0.0102          0.0020   1.4880
MLP-TEST Set 3    7-6-1              61                  0.9869     0.0101          0.0020   1.6788
MLP-TEST Set 4    7-6-1              61                  0.9777     0.0104          0.0024   1.9654
MLP-TEST Set 5    7-6-1              61                  0.9528     0.0108          0.0027   2.5093

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.9(c): Results of 3-layer neural networks for Sg. Bekok catchment using 25% of available data sets in training phase

Data set          Model structure*   No. of parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      7-6-1              61                  0.9902     0.0484          0.0100   1.9422
MLP-TEST Set 1    7-6-1              61                  0.9962     0.0104          0.0020   1.3314
MLP-TEST Set 2    7-6-1              61                  0.9954     0.0107          0.0021   1.3631
MLP-TEST Set 3    7-6-1              61                  0.9928     0.0058          0.0012   0.7490
MLP-TEST Set 4    7-6-1              61                  0.9890     0.0076          0.0017   1.2240
MLP-TEST Set 5    7-6-1              61                  0.8978     0.0204          0.0052   4.8408

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation

Tables 4.10(a) to 4.10(c) present the R², RMSE, RRMSE, and MAPE resulting from the 4-layer MLP model of the hourly rainfall-runoff relationship for the Sungai Bekok catchment. Figures 4.10(a) to 4.10(c) illustrate the graphical results of the 4-layer MLP model during training and testing.
For the 4-layer networks, the performance of each model is measured by R², RMSE, RRMSE, and MAPE. The MLP for Sungai Bekok uses 7 input nodes, with 6 and 8 hidden nodes in the first and second hidden layers respectively. Fifty selected data sets are used for training and five selected data sets are used as the prediction set. For the Sungai Bekok catchment, most values of R² approach 1.0, which indicates that the MLP model consistently performs well in rainfall-runoff modelling. Kachroo (1986) reported that a model is very satisfactory when R² is more than 90%. Johnson and King (1988) judged model accuracy on the MAPE value; the modelling results for Sungai Bekok, with MAPE less than 10%, can be considered very accurate.

Table 4.10(a): Results of 4-layer neural networks for Sg. Bekok catchment using 100% of available data sets in training phase

Data set          Model structure*   No. of parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      7-6-8-1            119                 0.9927     0.0477          0.0105   1.9012
MLP-TEST Set 1    7-6-8-1            119                 0.9972     0.0088          0.0017   1.1251
MLP-TEST Set 2    7-6-8-1            119                 0.9976     0.0076          0.0015   0.9720
MLP-TEST Set 3    7-6-8-1            119                 0.9923     0.0056          0.0011   0.7509
MLP-TEST Set 4    7-6-8-1            119                 0.9908     0.0077          0.0018   1.3467
MLP-TEST Set 5    7-6-8-1            119                 0.9745     0.0036          0.0009   0.7852

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.10(b): Results of 4-layer neural networks for Sg. Bekok catchment using 65% of available data sets in training phase

Data set          Model structure*   No. of parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      7-6-8-1            119                 0.9907     0.0558          0.0123   2.1302
MLP-TEST Set 1    7-6-8-1            119                 0.9973     0.0085          0.0017   1.1198
MLP-TEST Set 2    7-6-8-1            119                 0.9959     0.0099          0.0019   1.1864
MLP-TEST Set 3    7-6-8-1            119                 0.9675     0.0120          0.0024   1.3771
MLP-TEST Set 4    7-6-8-1            119                 0.9829     0.0100          0.0023   1.4903
MLP-TEST Set 5    7-6-8-1            119                 0.9643     0.0048          0.0012   0.7130

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.10(c): Results of 4-layer neural networks for Sg. Bekok catchment using 25% of available data sets in training phase

Data set          Model structure*   No. of parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
MLP TRAINING      7-6-8-1            119                 0.9352     0.1216          0.0247   1.2130
MLP-TEST Set 1    7-6-8-1            119                 0.8650     0.0673          0.0131   7.5251
MLP-TEST Set 2    7-6-8-1            119                 0.9965     0.0092          0.0018   1.2206
MLP-TEST Set 3    7-6-8-1            119                 0.9849     0.0079          0.0016   1.0253
MLP-TEST Set 4    7-6-8-1            119                 0.8503     0.0276          0.0065   3.0417
MLP-TEST Set 5    7-6-8-1            119                 0.7833     0.0124          0.0031   1.2463

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: coefficient of correlation

Tables A4.11(a) to A4.11(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the 3-layer MLP model for the Sungai Ketil catchment. Figures I4.11(a) to I4.11(c) in Appendix I (Part B) illustrate the graphical results of the 3-layer MLP model during training and testing. The MLP for Sungai Ketil uses 6 input nodes, and the best number of hidden nodes for the network structure is 4. For Sungai Ketil, most values of R² also approach 1.0, which indicates that the MLP model consistently performs well in rainfall-runoff modelling. According to Kachroo (1986), the model is very satisfactory. According to Johnson and King (1988), the prediction model is considered very accurate, with MAPE less than 10%.
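The rating bands quoted from Kachroo (1986) and Johnson and King (1988) can be written as two small lookup functions. This is only a sketch of the thresholds as they are stated in this chapter; the function names are hypothetical, and the handling of values exactly on a boundary, or outside the quoted bands, is an assumption.

```python
def rate_r2(r2):
    """Rate R2 (as a fraction) using the Kachroo (1986) bands quoted in the text."""
    if r2 > 0.90:
        return "very satisfactory"
    if r2 >= 0.80:
        return "fairly good"
    return "below 80% (no band quoted)"  # assumption: left unrated here


def rate_mape(mape_percent):
    """Rate MAPE (in percent) using the Johnson and King (1988) bands quoted in the text."""
    if mape_percent < 10.0:
        return "very accurate"
    if mape_percent < 30.0:
        return "reasonable"
    return "above 30% (no band quoted)"  # assumption: left unrated here


# Training row of Table 4.10(a) and test set 1 of Table 4.10(c).
training_rating = (rate_r2(0.9927), rate_mape(1.9012))
test_rating = (rate_r2(0.8650), rate_mape(7.5251))
```

With these bands, the training row of Table 4.10(a) rates as very satisfactory and very accurate, while test set 1 of Table 4.10(c) drops to fairly good on R² but remains very accurate on MAPE.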
Tables A4.12(a) to A4.12(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the 4-layer MLP model for the Sungai Ketil catchment. Figures I4.12(a) to I4.12(c) in Appendix I (Part B) illustrate the graphical results of the 4-layer MLP model during training and testing. For the Sungai Ketil catchment, the MLP uses 6 input nodes, with 4 and 8 hidden nodes in the first and second hidden layers respectively. Most values of R² approach 1.0, which indicates that the MLP model consistently performs well in rainfall-runoff modelling. Kachroo (1986) reported that a model is very satisfactory when R² is more than 90%. Johnson and King (1988) judged model accuracy on the MAPE value; the modelling results for Sungai Ketil, with MAPE less than 10%, can be considered very accurate.

Tables A4.13(a) to A4.13(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the 3-layer MLP model for the Sungai Klang catchment. Figures I4.13(a) to I4.13(c) in Appendix I (Part B) illustrate the graphical results of the 3-layer MLP model during training and testing. For the Sungai Klang catchment, the MLP uses 8 input nodes, and the best number of hidden nodes for the network structure is 6. For Sungai Klang, most values of R² are in the range of 80% to 90%. According to Kachroo (1986), this is a fairly good model. According to Johnson and King (1988), the modelling results for Sungai Klang, with MAPE less than 30%, are considered a reasonable prediction. This outcome indicates that the MLP model is consistently a fairly good rainfall-runoff model.

Tables A4.14(a) to A4.14(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the 4-layer MLP model for the Sungai Klang catchment.
Figures I4.14(a) to I4.14(c) in Appendix I (Part B) illustrate the graphical results of the 4-layer MLP model during training and testing. For the Sungai Klang catchment, the MLP uses 8 input nodes, with 6 and 7 hidden nodes in the first and second hidden layers respectively. Most values of R² are in the range of 70% to 90%, which indicates that the MLP model is consistently a fairly good model. Meanwhile, Johnson and King (1988) judged model accuracy on the MAPE value; the modelling results for Sungai Klang, with MAPE less than 30%, can be considered reasonable.

Tables A4.15(a) to A4.15(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the 3-layer MLP model for the Sungai Slim catchment. Figures I4.15(a) to I4.15(c) in Appendix I (Part B) illustrate the graphical results of the 3-layer MLP model during training and testing. The MLP for the Sungai Slim catchment uses 7 input nodes, and the best number of hidden nodes is 5. For Sungai Slim, most values of R² also approach 1.0, which indicates that the MLP model consistently performs well in rainfall-runoff modelling. According to Kachroo (1986), the model is very satisfactory. According to Johnson and King (1988), the prediction model is considered very accurate, with MAPE less than 10%.

Tables A4.16(a) to A4.16(c) in Appendix A (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the 4-layer MLP model for the Sungai Slim catchment. Figures I4.16(a) to I4.16(c) in Appendix I (Part B) illustrate the graphical results of the 4-layer MLP model during training and testing. For the Sungai Slim catchment, the MLP uses 7 input nodes, with 5 and 9 hidden nodes in the first and second hidden layers respectively.
For the Sungai Slim catchment, most values of R² approach 1.0, which indicates that the MLP model consistently performs well in rainfall-runoff modelling. Kachroo (1986) reported that a model is very satisfactory when R² is more than 90%. Johnson and King (1988) judged model accuracy on the MAPE value; the modelling results for Sungai Slim, with MAPE less than 10%, can be considered very accurate.

During the training phase, the RMSE for Sungai Bekok is consistently less than 0.1 cumecs for the 3-layer (7-6-1) and 4-layer (7-6-8-1) model structures, and the RRMSE stays below 0.0105 for both model structures. During testing, the RMSE is less than 0.01 cumecs and the RRMSE is less than 0.002, which is close to zero. The application of the MLP method to model the hourly rainfall-runoff relationship of Sungai Bekok is therefore very successful. Sungai Bekok has observed flows between 3.61 cumecs and 6.95 cumecs and a catchment area of 350 km².

For Sungai Ketil, the RMSE during the training phase is consistently less than 0.1 cumecs for the 3-layer (6-4-1) and 4-layer (6-4-8-1) model structures, and the RRMSE stays below 0.03 for both model structures. During testing, the RMSE is less than 0.01 cumecs and the RRMSE is less than 0.02, which is close to zero. The application of the MLP method to model the hourly rainfall-runoff relationship of Sungai Ketil is moderate and satisfactory. Sungai Ketil has observed flows between 28.87 cumecs and 34.22 cumecs and a catchment area of 704 km².

Sungai Klang receives runoff from a fairly large catchment (468 km²), and the observed flow ranges between 3.6 cumecs and 424.7 cumecs. The RMSE during training is around 10 cumecs for the 3-layer (8-6-1) and 4-layer (8-6-7-1) model structures, and the RRMSE is approximately 0.25. The testing phase yields RMSE between 4.0 cumecs and 13.5 cumecs, and the RRMSE is approximately 0.2 for most of the data sets.
Meanwhile, the Sungai Slim catchment is a semi-developed area of 455 km². Its observed flow ranges between 23.77 cumecs and 26.38 cumecs. The RMSE during training is less than 0.06 cumecs for the 3-layer (7-5-1) and 4-layer (7-5-9-1) model structures, and the RRMSE is less than 0.005. The testing phase yields RMSE below 0.5 cumecs and RRMSE below 0.02 for most of the data sets.

4.2.3 Training and Validation

The training process is time consuming. If the architecture or the training algorithm is not suitable, the accuracy of predictions and the learning ability of the network are affected. The number of hidden nodes significantly influences the performance of a network and the time taken to train the model. The number of nodes in the hidden layer can be as small or as large as required; there are no fixed rules for choosing it. It is related to the complexity of the system being modelled and to the resolution of the data fit. In this study, the number of nodes in the hidden layer was determined by trial and error for each case. If the number of hidden nodes is too small, the network can underfit the data and may not achieve the desired level of accuracy; with too many nodes, it takes a long time to train adequately and may sometimes overfit the data.

French et al. (1992) noted that neural networks are normally developed using 15, 30, 45, 60 and 100 hidden nodes. This procedure can also be used to examine the performance of a neural network model with different numbers of hidden nodes and hidden layers. In this study, however, the procedure is not appropriate, because every additional node yields different results and may contribute to the consistency and accuracy of the model. The number of nodes in the hidden layer was therefore increased one at a time.
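The one-node-at-a-time search described above can be sketched as a simple loop that keeps the hidden-node count with the smallest training RMSE. The function name and the `train_rmse_for` callable are hypothetical stand-ins for the actual network training, which is not reproduced here; the error curve in the example is made up purely to show the selection step.

```python
def select_hidden_nodes(train_rmse_for, max_nodes):
    """Try 1..max_nodes hidden nodes and return (best_count, best_rmse).

    train_rmse_for is a hypothetical callable: it trains a network with the
    given number of hidden nodes and returns its training RMSE.
    """
    if max_nodes < 1:
        raise ValueError("max_nodes must be at least 1")
    best_count, best_rmse = None, float("inf")
    for count in range(1, max_nodes + 1):  # add one hidden node at a time
        rmse = train_rmse_for(count)
        if rmse < best_rmse:               # on ties, keep the smaller network
            best_count, best_rmse = count, rmse
    return best_count, best_rmse


# Stand-in for real training: a made-up error curve with its minimum at 6 nodes.
def toy_rmse(count):
    return 0.05 + 0.01 * (count - 6) ** 2


best_count, best_rmse = select_hidden_nodes(toy_rmse, max_nodes=12)
```

Keeping the smaller network when two counts give the same RMSE is a design choice of this sketch; it favours fewer parameters and shorter training time.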
In the case of the MLP model, it is observed that a large number of training data sets is required for successful training. With enough data, the model is properly trained and is capable of yielding better results, so the accuracy of the MLP increases as more input data are made available. However, it must be kept in mind not to use an excessive number of neurons or layers without considering the limitations imposed by the number of observations in the training set. The validation process is important because it signals to the system when overtraining occurs. Overtraining leads to poor future predictions because it forces the network to fit the noise in the training data rather than generalizing the patterns in the training set. The main objective of the validation process is to stop the calibration when the error starts to increase, which shortens the calibration time and gives the best results with high accuracy.

Tables H4.61 to H4.68 in Appendix H show the daily and hourly results of percentage bias (PBIAS) in calibration (training) for the 3-layer and 4-layer MLP models. According to Yapo et al. (1996), the optimal value is zero, which means that the model has an unbiased flow simulation. Positive values indicate a tendency to overestimate and negative values indicate a tendency to underestimate. With 100% of the data, the PBIAS values are in the range of -0.1 to +0.1, which means that the models are robust and unbiased and are ready to be used for prediction. Models calibrated using 50% and 25% of the data are also satisfactory, but their PBIAS values are in the range of -2.0 to +4.0. This may be caused by insufficient data in the calibration or training process, so these models have a tendency to overestimate or underestimate.
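A minimal sketch of the percentage bias calculation follows. The formula is not restated in this section, so the form below is an assumption: the sum of simulated minus observed flows, divided by the sum of observed flows, in percent. The sign is chosen so that positive values mean overestimation, as described in the text. The function name and sample flows are hypothetical.

```python
def pbias(observed, simulated):
    """Percentage bias: positive means the model overestimates total flow."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    total_observed = sum(observed)
    if total_observed == 0:
        raise ValueError("sum of observed flows must not be zero")
    # Assumed sign convention: simulated minus observed.
    return 100.0 * sum(s - o for o, s in zip(observed, simulated)) / total_observed


# Made-up flows in cumecs, for illustration only.
observed_flow = [10, 20, 30]
over = pbias(observed_flow, [11, 21, 31])   # simulated too high
under = pbias(observed_flow, [9, 20, 28])   # simulated too low
exact = pbias(observed_flow, [10, 20, 30])  # unbiased simulation
```

Because the errors are summed before dividing, large positive and negative errors can cancel; a PBIAS near zero shows an unbiased volume, not necessarily a close fit at every time step.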
In general, the training results are lower than the testing results in terms of the coefficient of correlation (R²). However, some of the results show an improvement in terms of RMSE, RRMSE and MAPE and remain within the acceptable range. This shows that, with proper and sufficient training, the model is consistent and robust and is able to yield good results.

4.2.4 Testing

Testing of the model was carried out using five different data sets covering different periods of time. The objective is to evaluate the performance and robustness of the model using several sets of data. These new data sets are introduced to the model after it has been calibrated. One of the problems that occurs during neural network testing is overfitting: the error on the training set is driven to a very small value, but when new (testing) data are presented to the network the error is large. The network has memorized the training examples, but it has not learned to generalize to new situations. In this study, therefore, a network that is just large enough to provide an adequate fit was used. The larger the network, the more complex the functions it can create; a sufficiently small network does not have enough power to overfit the data. Since it is difficult to know in advance how large a network should be for a specific application, the early stopping method was implemented in this study. Early stopping can be used with any of the training functions. The number of nodes in the hidden layer can be as small or as large as required, and it was determined by trial and error for each case. With the trial and error method, the number of hidden nodes that produces the best-fit results can be decided. It is related to the complexity of the system being modelled and the quality of the data sets.
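The early stopping idea can be sketched on a recorded sequence of validation errors: training is halted once the validation error has not improved for a given number of epochs, and the network from the best epoch is kept. The function name, the `patience` setting and the error values are hypothetical; the stopping rule used in the study may differ in detail.

```python
def early_stopping_epoch(validation_errors, patience=3):
    """Return (stop_epoch, best_epoch) for a sequence of validation errors.

    Training stops once the validation error has failed to improve for
    `patience` consecutive epochs (a hypothetical setting).
    """
    if not validation_errors:
        raise ValueError("validation_errors must not be empty")
    if patience < 1:
        raise ValueError("patience must be at least 1")
    best_epoch, best_error = 0, validation_errors[0]
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error  # new best network so far
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch               # error kept rising: stop
    return len(validation_errors) - 1, best_epoch  # never triggered


# Made-up validation errors: improving up to epoch 3, then rising.
errors = [0.30, 0.22, 0.18, 0.17, 0.19, 0.21, 0.24, 0.30]
stop_epoch, best_epoch = early_stopping_epoch(errors, patience=3)
```

In this example training stops at epoch 6 and the weights from epoch 3 would be retained, so the later epochs, where the network starts to fit noise, are never used for prediction.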
Using 100% of the training data sets produces more accurate, consistent and reliable results than using 50% or 25% of the training data sets. It is observed that the accuracy of the MLP increases as more input data are made available. The number of hidden-layer neurons significantly influences the performance of a network. If this number is too small, the network may not achieve the desired level of accuracy; with too many nodes, it takes a long time to train and may sometimes overfit the data. The use of two hidden layers appears to be an advantage for a larger catchment such as Sungai Ketil, whereas a single hidden layer is sufficient for a smaller catchment such as Sungai Bekok. In general, the 3-layer and 4-layer MLP networks show slightly better performance in both the training and testing periods. Overall, increasing the number of hidden layers and hidden nodes in the model increases the complexity of the system and may slow the training and testing process without substantially improving the efficiency of the network.

4.2.5 Robustness Test

The robustness tests were carried out for each model using different data sets covering different periods of time. For daily and hourly rainfall-runoff modelling, five different data sets (set 1, set 2, set 3, set 4 and set 5) were used to study the capability of the models in transforming rainfall into runoff under any condition. The performance of each model is measured by R², RMSE, RRMSE, and MAPE. The MLP model is capable and robust in modelling the continuous daily rainfall-runoff relationship. It is also robust and demonstrates remarkable performance in modelling the hourly streamflow hydrograph. Even though the MLP model has a large number of parameters to be estimated, it shows a good and reasonably robust calibration.
The results show that the MLP model is consistent, reliable and robust enough to cope with any condition or problem in the input data introduced to the model.

4.3 Results of the Radial Basis Function (RBF) Model

Results of daily and hourly rainfall-runoff modelling are discussed in sections 4.3.1 and 4.3.2 respectively. The RBF model differs from the MLP model in the number of layers in the network structure: it has only three layers.

4.3.1 Results of Daily RBF Model

Tables 4.17(a) to 4.17(c) present the R², RMSE, RRMSE, and MAPE resulting from the RBF model of the daily rainfall-runoff relationship for the Sungai Bekok catchment. Meanwhile, Figures 4.17(a) to 4.17(c) illustrate the graphical results of the RBF model during training and testing. For the RBF training process, the best input nodes are chosen on the basis of the RMSE computed for the training data. The RBF for Sungai Bekok uses 16 input nodes. The modelling results for Sungai Bekok, with MAPE less than 10%, can be considered very accurate. Meanwhile, the RBF model gives R² of more than 90%, which shows that the model performance is very satisfactory. During the training phase, the RMSE for Sungai Bekok is consistently less than 0.18 cumecs, and the RRMSE stays at or below 0.035 for this model structure. During testing, the RMSE is consistently less than 0.29 cumecs and the RRMSE is less than 0.06, which is close to zero.

Table 4.17(a): Results of RBF networks for Sg. Bekok catchment using 100% of data sets in training phase

Data set          Model structure   No. of parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
RBF TRAINING      16 input nodes    52                  0.9410     0.1776          0.0344   2.5298
RBF-TEST Set 1    16 input nodes    52                  0.9060     0.2048          0.0401   3.2268
RBF-TEST Set 2    16 input nodes    52                  0.9299     0.1612          0.0313   2.4287
RBF-TEST Set 3    16 input nodes    52                  0.8946     0.2091          0.0411   3.3006
RBF-TEST Set 4    16 input nodes    52                  0.9708     0.1424          0.0275   2.2149
RBF-TEST Set 5    16 input nodes    52                  0.8196     0.1837          0.0369   2.9040

cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.17(b): Results of RBF networks for Sg. Bekok catchment using 50% of data sets in training phase

Data set          Model structure   No. of parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
RBF TRAINING      16 input nodes    52                  0.9770     0.1097          0.0220   1.5198
RBF-TEST Set 1    16 input nodes    52                  0.8490     0.2508          0.0493   4.0637
RBF-TEST Set 2    16 input nodes    52                  0.8706     0.1899          0.0367   2.6731
RBF-TEST Set 3    16 input nodes    52                  0.8625     0.2518          0.0496   4.1023
RBF-TEST Set 4    16 input nodes    52                  0.9578     0.1469          0.0282   2.1788
RBF-TEST Set 5    16 input nodes    52                  0.7342     0.2244          0.0449   3.5356

cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.17(c): Results of RBF networks for Sg. Bekok catchment using 25% of data sets in training phase

Data set          Model structure   No. of parameters   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
RBF TRAINING      16 input nodes    52                  0.9920     0.0747          0.0152   1.1121
RBF-TEST Set 1    16 input nodes    52                  0.8336     0.2861          0.0559   4.6072
RBF-TEST Set 2    16 input nodes    52                  0.8337     0.2078          0.0400   2.8824
RBF-TEST Set 3    16 input nodes    52                  0.8279     0.2788          0.0548   4.5339
RBF-TEST Set 4    16 input nodes    52                  0.9550     0.0896          0.0178   1.4919
RBF-TEST Set 5    16 input nodes    52                  0.7414     0.2303          0.0462   3.6924

cumecs: cubic metres per second; COC: coefficient of correlation

The RBF networks show a successful calibration and testing process using the dry-season data set (set 4). They approximate the rainfall-runoff process in the catchment more closely than the networks calibrated using the wet-season data set (set 5).

Tables A4.18(a) to A4.18(c) in Appendix B (Part A) present the R², RMSE, RRMSE, and MAPE resulting from the RBF model for the Sungai Ketil catchment.
Figures I4.18(a) to I4.18(c) in Appendix I (Part A) illustrate the graphical results of the RBF model during training and testing. For the Sungai Ketil catchment, the number of input nodes considered for the RBF model is also 16. According to Johnson and King (1988), results of modelling for Sungai Ketil with MAPE less than 10% can be considered very accurate. The RBF model gives R² in the range of 80% to 90%, which shows that the model performance is fairly good. During the training phase, the RMSE for Sungai Ketil is consistently less than 0.172 cumecs, and the RRMSE is consistently less than 0.02 for this model. During testing, the RMSE is consistently less than 0.56 cumecs and the RRMSE is less than 0.019, which comes close to zero.

Tables A4.19(a) to A4.19(c) in Appendix B (Part A) present the R², RMSE, RRMSE, and MAPE resulting from the RBF model for the Sungai Klang catchment. Figures I4.19(a) to I4.19(c) in Appendix I (Part A) illustrate the graphical results of the RBF model during training and testing. For the Sungai Klang catchment, the number of input nodes considered for the RBF model is 17. According to Johnson and King (1988), results of modelling for Sungai Klang with MAPE less than 30% can be considered a reasonable model. The RBF model gives R² in the range of 70% to 80%, so the model performance can be classified as moderate. During the training phase, the RMSE for Sungai Klang is consistently less than 6.3 cumecs, and the RRMSE is consistently less than 0.34 for this model. During testing, the RMSE is consistently less than 12.8 cumecs and the RRMSE is less than 0.59.

Tables A4.20(a) to A4.20(c) in Appendix B (Part A) present the R², RMSE, RRMSE, and MAPE resulting from the RBF model for the Sungai Slim catchment.
Figures I4.20(a) to I4.20(c) in Appendix I (Part A) illustrate the graphical results of the RBF model during training and testing. For the Sungai Slim catchment, the best number of input nodes chosen is 16. Results of modelling for Sungai Slim with MAPE less than 10% can be considered very accurate. The RBF model gives R² in the range of 60% to 80%, so it can be classified as a moderate model. During the training phase, the RMSE for Sungai Slim is consistently less than 0.05 cumecs, and the RRMSE remains at about 0.0008 for this model. During testing, the RMSE is consistently less than 0.034 cumecs and the RRMSE is less than 0.0005, which comes close to zero.

4.3.2 Results of Hourly RBF Model

Tables 4.21(a) to 4.21(b) present the R², RMSE, RRMSE, and MAPE resulting from the RBF model of the hourly rainfall-runoff relationship for the Sungai Bekok catchment. Figures 4.21(a) to 4.21(c) illustrate the graphical results of the RBF model during training and testing. For the RBF training process, the best input nodes are chosen based on the minimum RMSE computed for the training data. The number of input nodes considered for the RBF model is 7 for the Sungai Bekok catchment. The RBF model learns faster than the MLP model, and it can produce results rapidly in the testing phase. A clear limitation of the RBF model is that it is unable to handle a large number of data sets. The RBF model is stable and yields consistent and reliable results when the optimum data sets are used. During the training phase, the RMSE for Sungai Bekok is consistently less than 0.1 cumecs and the RRMSE remains below 0.02. During testing, the RMSE is less than 0.004 cumecs and the RRMSE is less than 0.001, which comes close to zero. The application of the RBF method to model the rainfall-runoff relationship of Sungai Bekok is very successful. Results of modelling for Sungai Bekok with MAPE less than 10% can be considered very accurate.
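The error measures quoted throughout this section (R², RMSE, RRMSE and MAPE) can be computed along the lines of the following sketch. It is an illustration only: the exact formulas used in this study are defined elsewhere in the thesis, so the sketch assumes that RRMSE is the RMSE divided by the mean observed flow and that R² is the squared Pearson correlation. The function names `error_measures` and `classify_mape` are hypothetical, and the MAPE classes follow the thresholds attributed to Johnson and King (1988) in the text above.

```python
import numpy as np


def error_measures(observed, simulated):
    """Return R2, RMSE, RRMSE and MAPE for paired flow series.

    Assumptions (not confirmed by the text): RRMSE = RMSE / mean(observed),
    and R2 is the squared Pearson correlation coefficient.
    """
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    residual = sim - obs

    rmse = float(np.sqrt(np.mean(residual ** 2)))
    rrmse = rmse / float(np.mean(obs))
    # MAPE in percent; the observed flows must be non-zero.
    mape = float(np.mean(np.abs(residual / obs)) * 100.0)
    r = float(np.corrcoef(obs, sim)[0, 1])
    return {"R2": r ** 2, "RMSE": rmse, "RRMSE": rrmse, "MAPE": mape}


def classify_mape(mape_percent):
    """Classify a MAPE value using the thresholds quoted in the text."""
    if mape_percent < 10.0:
        return "very accurate"
    if mape_percent <= 30.0:
        return "reasonable"
    return "not reasonable"


if __name__ == "__main__":
    # Made-up flows, for illustration only.
    measures = error_measures([2.0, 4.0, 6.0, 8.0], [2.2, 3.8, 6.3, 7.6])
    print(measures, classify_mape(measures["MAPE"]))
```

Keeping the classification separate from the calculation makes it easy to apply the same thresholds to the tabulated MAPE values.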
Table 4.21(a): Results of RBF networks for Sg. Bekok catchment, using 25% of available data sets in training phase

| Data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF training | 7 input nodes | 25 | 0.9942 | 0.0372 | 0.0074 | 0.7893 |
| RBF test, set 1 | 7 input nodes | 25 | 0.9995 | 0.0039 | 0.0008 | 0.3151 |
| RBF test, set 2 | 7 input nodes | 25 | 0.9994 | 0.0038 | 0.0008 | 0.3430 |
| RBF test, set 3 | 7 input nodes | 25 | 0.9976 | 0.0031 | 0.0006 | 0.2307 |
| RBF test, set 4 | 7 input nodes | 25 | 0.9998 | 0.0011 | 0.0002 | 0.1108 |
| RBF test, set 5 | 7 input nodes | 25 | 0.9997 | 0.0004 | 0.0001 | 0.0369 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.21(b): Results of RBF networks for Sg. Bekok catchment, using minimum data sets in training phase

| Data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF training | 7 input nodes | 25 | 0.9709 | 0.0975 | 0.0181 | 2.8852 |
| RBF test, set 1 | 7 input nodes | 25 | 1.000 | 0.0004 | 0.0000 | 0.0197 |
| RBF test, set 2 | 7 input nodes | 25 | 0.9998 | 0.0020 | 0.0004 | 0.0838 |
| RBF test, set 3 | 7 input nodes | 25 | 0.9991 | 0.0019 | 0.0004 | 0.0862 |
| RBF test, set 4 | 7 input nodes | 25 | 1.000 | 0.0003 | 0.0000 | 0.0108 |
| RBF test, set 5 | 7 input nodes | 25 | 0.9997 | 0.0002 | 0.0000 | 0.0154 |

cumecs: cubic metres per second; COC: coefficient of correlation

Tables B4.22(a) to B4.22(b) in Appendix B (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the RBF model for the Sungai Ketil catchment. Figures I4.22(a) to I4.22(c) in Appendix I (Part B) illustrate the graphical results of the RBF model during training and testing. The number of input nodes considered for the RBF model is 6 for the Sungai Ketil catchment. The RBF model learns faster than the MLP model, and it can produce results rapidly in the testing phase. During the training phase, the RMSE for Sungai Ketil is consistently less than 0.03 cumecs and the RRMSE remains below 0.001. During testing, the RMSE is less than 0.04 cumecs and the RRMSE is less than 0.015, which comes close to zero.
The application of the RBF method to model the rainfall-runoff relationship of Sungai Ketil is also very successful. Results of modelling for Sungai Ketil with MAPE less than 10% can be considered very accurate.

Tables B4.23(a) to B4.23(b) in Appendix B (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the RBF model for the Sungai Klang catchment. Figures I4.23(a) to I4.23(c) in Appendix I (Part B) illustrate the graphical results of the RBF model during training and testing. The number of input nodes considered for the RBF model is 8 for the Sungai Klang catchment. The RBF model learns faster than the MLP model, and it can produce results rapidly in the testing phase. During the training phase, the RMSE for Sungai Klang is consistently less than 0.21 cumecs and the RRMSE remains below 0.02, which comes close to zero. During testing, the RMSE is less than 0.025 cumecs and the RRMSE is less than 0.002, which also comes close to zero. The application of the RBF method to model the rainfall-runoff relationship of Sungai Klang is very successful and performs well. According to Johnson and King (1988), results of modelling for Sungai Klang with MAPE less than 10% can be considered very accurate.

Tables B4.24(a) to B4.24(b) in Appendix B (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the RBF model for the Sungai Slim catchment. Figures I4.24(a) to I4.24(c) in Appendix I (Part B) illustrate the graphical results of the RBF model during training and testing. The number of input nodes considered for the RBF model is 7 for the Sungai Slim catchment. The RBF model learns faster than the MLP model, and it can produce results rapidly in the testing phase. During the training phase, the RMSE for Sungai Slim is consistently less than 0.031 cumecs and the RRMSE remains below 0.0015, which comes close to zero.
During testing, the RMSE is less than 0.025 cumecs and the RRMSE is less than 0.001, which also comes close to zero. The application of the RBF method to model the rainfall-runoff relationship of Sungai Slim is very successful and yields a good performance. According to Johnson and King (1988), results of modelling for Sungai Slim with MAPE less than 10% can be considered very accurate.

4.3.3 Training and Validation

The training and validation of the RBF model were also carried out in parallel. It was important to control the time consumed in the calibration process. As mentioned before, training was stopped if the errors became higher. In general, the RBF network can be described as a universal function approximator that uses combinations of basis functions centred around weight vectors to provide spatial estimates. The best results were achieved for the network with a Gaussian activation function, the GRNN algorithm, and an appropriate number of input nodes. If the architecture of the training algorithm is not suitable, it will affect the accuracy of predictions and the network's learning ability.

The number of input nodes significantly influences the performance of a network and the time taken to train the model. It is related to the complexity of the system being modelled and to the resolution of the data fit. The number of input nodes in the input layer was determined by trial and error for each case. If the number of input nodes is too small, the network can underfit the data and may not achieve the desired level of accuracy, while with too many nodes it will take a long time to be adequately trained and may sometimes overfit the data. The trial and error method determines the optimum number of input nodes in the input layer, that is, the number that produces the best fit. This number is related to the complexity of the system being modelled and to the quality of the data sets.
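The Gaussian basis functions and the GRNN algorithm mentioned above can be illustrated with a minimal sketch. In a general regression neural network, every training pattern acts as the centre of a Gaussian basis function, and the prediction is the Gaussian-weighted average of the training outputs. The function name `grnn_predict` and the `spread` parameter are assumptions made for this illustration; the software and settings actually used in this study are not reproduced here.

```python
import numpy as np


def grnn_predict(train_x, train_y, query_x, spread=1.0):
    """Predict outputs with a general regression neural network (GRNN).

    train_x: array of shape (n_patterns, n_inputs), e.g. lagged rainfall.
    train_y: array of shape (n_patterns,), e.g. observed runoff.
    query_x: array of shape (n_queries, n_inputs).
    spread:  width of the Gaussian basis functions (hypothetical setting).
    """
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y, dtype=float)
    query_x = np.asarray(query_x, dtype=float)

    # Squared Euclidean distance between each query and each training pattern.
    diff = query_x[:, None, :] - train_x[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)

    # Gaussian activation of each pattern unit.
    weights = np.exp(-sq_dist / (2.0 * spread ** 2))

    # Weighted average of the training outputs. If a query is very far from
    # all patterns relative to the spread, the weights can underflow to zero.
    return weights @ train_y / np.sum(weights, axis=1)


if __name__ == "__main__":
    # Made-up patterns, for illustration only.
    x = [[0.0], [1.0], [2.0]]
    y = [0.0, 1.0, 2.0]
    print(grnn_predict(x, y, [[1.0], [1.5]], spread=0.5))
```

A small spread makes the network reproduce the training patterns closely, while a large spread smooths the response, which is one reason the fit depends on both the number of input nodes and the quality of the data.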
The daily rainfall-runoff modelling used 100%, 50% and 25% of the training data sets to evaluate the accuracy, consistency and reliability of the results. For hourly rainfall-runoff modelling, only 25% and the minimum of the training data sets were used, because the large number of data points within the period increases the complexity. The study confirms that the accuracy of the RBF model is not affected by the number of input data available to it.

Tables H4.61 to H4.68 in Appendix H show the daily and hourly results of percentage bias (PBIAS) of calibration or training for the RBF model. In general, the RBF model shows consistent values of PBIAS when modelling the hourly rainfall-runoff relationship, and the PBIAS approaches zero. The PBIAS for the daily rainfall-runoff relationship is in the range of -0.01 to -6.0, which shows that the RBF model has a small tendency to underestimate.

4.3.4 Testing

Testing of the RBF model was also carried out using five different data sets from different periods of time. The objective is to evaluate the performance and robustness of the model using several sets of data. These new sets of data are introduced to the model that has been calibrated. The RBF model differs from the MLP model in the number of layers in the network structure, which is only three.

As with the MLP model, the problem that occurs during RBF network testing is overfitting. The error on the training set is driven to a very small value, but when new or testing data are presented to the network the error is large. In this study, therefore, a network that is just large enough to provide an adequate fit was used. The larger the network, the more complex the functions it can create. If a small enough network is used, it will not have enough power to overfit the data.

The number of nodes in the hidden layer can be as small or as large as required.
The number of nodes in the input layer and the hidden layer was determined by trial and error for each case. The trial and error method finds the optimum number of hidden nodes in the hidden layer, that is, the number that produces the best fit. This number is related to the complexity of the system being modelled and to the quality of the data sets. The accuracy of the RBF model is not affected by the number of input data available to it. The number of hidden layer neurons significantly influences the performance of a network. If this number is too small, the network may not achieve the desired level of accuracy, while with too many nodes it takes a long time to train and may sometimes overfit the data.

4.3.5 Robustness Test

As with the MLP model, the robustness tests for the RBF model were carried out using different sets of data from different periods of time. For daily and hourly rainfall-runoff modelling, five different sets of data, namely set 1, set 2, set 3, set 4 and set 5, were used to study the capability of the models in transforming rainfall into runoff under any condition. The performance of each model is measured by R², RMSE, RRMSE, MAPE and PBIAS.

The study confirmed that the RBF model is capable and robust in modelling the daily and hourly rainfall-runoff relationships. The results of R² and the error analysis indicate that the RBF model is very consistent and robust and gives good agreement when modelling the hourly rainfall-runoff relationship. The RBF model is more reliable and gives higher accuracy for modelling short-duration events or storm hydrographs.

4.4 Results of the Multiple Linear Regression (MLR) Model

The results of daily rainfall-runoff modelling using the MLR model are discussed below. The MLR model is an alternative tool that applies regression and statistical concepts to develop the relationship between the input-output data pairs. This model is applied only to continuous daily rainfall-runoff modelling.
4.4.1 Calibration

The calibration of the model parameters is usually accomplished by a trial and error process. The calibration accuracy of the models is therefore very subjective and highly dependent on the knowledge, experience, and understanding of the components of the model and the catchment characteristics. Most calibration studies in the past have involved some form of optimization of the parameter values by comparing the results of repeated simulations with whatever observations of the catchment response are available. The parameter values are adjusted between each run of the model, either manually or by some computerized optimization algorithm, until some 'best fit' parameter set has been found. Methods of model calibration that assume an optimum parameter set and ignore the estimation of predictive uncertainty include the simple trial and error method, in which parameter values are adjusted between runs, and a variety of automatic optimization methods.

Once a model has been chosen for consideration in a project, it is necessary to address the problem of parameter calibration. In general, it is possible to estimate the parameters of a model by either measurement or prior estimation. It is generally necessary to go through a stage of parameter calibration before applying the model to make quantitative predictions for a particular catchment. All the models used in hydrology have equations that involve a variety of different input and state variables. Finally, there are the model parameters, which define the characteristics of the catchment area or flow domain. Once the model parameter values have been specified, a simulation may be made and quantitative predictions about the response obtained.

The MLR model is suitable in principle for daily rainfall-runoff modelling, but in practice the theory did not represent daily conditions well. This is because the model is very sensitive to inconsistency in the data, especially continuous data containing some extreme maximum values.
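The trial and error calibration described above can be sketched as a loop that runs the model for each candidate parameter set and keeps the set with the smallest error. The sketch below is an illustration only: `simulate_runoff` is a hypothetical two-parameter toy model (a single linear store), not one of the models used in this study, and the candidate values are made up.

```python
import numpy as np


def rmse(observed, simulated):
    """Root mean square error between two series."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def simulate_runoff(rainfall, coefficient, recession):
    """Hypothetical toy model: a single linear store fed by rainfall."""
    flow = np.zeros(len(rainfall))
    storage = 0.0
    for i, rain in enumerate(rainfall):
        storage = recession * storage + coefficient * rain
        flow[i] = storage
    return flow


def calibrate_by_trial(rainfall, observed, coefficients, recessions):
    """Try every candidate parameter pair and keep the best fit.

    Returns (best_rmse, best_coefficient, best_recession).
    """
    best = None
    for coefficient in coefficients:
        for recession in recessions:
            error = rmse(observed, simulate_runoff(rainfall, coefficient, recession))
            if best is None or error < best[0]:
                best = (error, coefficient, recession)
    return best


if __name__ == "__main__":
    rain = [0.0, 10.0, 0.0, 0.0, 5.0, 0.0]
    # Synthetic "observations" generated with known parameters.
    obs = simulate_runoff(rain, 0.3, 0.5)
    print(calibrate_by_trial(rain, obs, [0.1, 0.2, 0.3, 0.4], [0.25, 0.5, 0.75]))
```

An exhaustive search like this is only practical for a few parameters; with more parameters the number of runs grows quickly, which is why automatic optimization methods are often used instead.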
In the calibration, the data were arranged in stages as functions of time $t$, $t-1$, $t-2$, and so on, until the optimum number of input nodes that are highly correlated with, and contribute to, the output was determined. Using the best structure and parameters selected, the model was then tested with the test sets of data to evaluate its robustness.

The following multiple linear regression (MLR) models are proposed for the prediction of runoff using available rainfall measurements from rain gauges located at the site. The models describe the relationships between daily rainfall and daily runoff using the available data series for the five-year period. The proposed models for each catchment are described in Sections 4.4.1.1, 4.4.1.2 and 4.4.1.3 respectively.

4.4.1.1 Model developed using 100% of calibration data

This section shows the model structures developed using 100 percent of the calibration data. The MLR model structures for each catchment are as follows:

(i) Sungai Bekok catchment area

$$
\begin{aligned}
y(t) ={} & 0.056x_t + 0.080x_{t-1} + 0.168x_{t-2} + 0.235x_{t-3} + 0.228x_{t-4} + 0.218x_{t-5} \\
& + 0.210x_{t-6} + 0.187x_{t-7} + 0.169x_{t-8} + 0.143x_{t-9} + 0.134x_{t-10} + 0.138x_{t-11} \\
& + 0.146x_{t-12} + 0.137x_{t-13} + 0.144x_{t-14} + 0.145x_{t-15}
\end{aligned}
\tag{4.1}
$$

(ii) Sungai Ketil catchment area

$$
\begin{aligned}
y(t) ={} & 0.042x_t + 0.048x_{t-1} + 0.222x_{t-2} + 0.334x_{t-3} + 0.170x_{t-4} + 0.145x_{t-5} \\
& + 0.123x_{t-6} + 0.135x_{t-7} + 0.080x_{t-8} + 0.085x_{t-9} + 0.109x_{t-10} + 0.101x_{t-11} \\
& + 0.101x_{t-12} + 0.086x_{t-13} + 0.111x_{t-14} + 0.089x_{t-15}
\end{aligned}
\tag{4.2}
$$

(iii) Sungai Klang catchment area

$$
\begin{aligned}
y(t) ={} & 0.561x_t + 0.287x_{t-1} + 0.136x_{t-2} + 0.109x_{t-3} + 0.065x_{t-4} + 0.065x_{t-5} \\
& + 0.052x_{t-6} + 0.040x_{t-9} + 0.040x_{t-10} + 0.072x_{t-11} + 0.049x_{t-12} + 0.047x_{t-13} \\
& + 0.061x_{t-15} + 0.054x_{t-16}
\end{aligned}
\tag{4.3}
$$

(iv) Sungai Slim catchment area

$$
\begin{aligned}
y(t) ={} & 0.252x_t + 0.306x_{t-1} + 0.184x_{t-2} + 0.156x_{t-3} + 0.160x_{t-4} + 0.172x_{t-5} \\
& + 0.107x_{t-6} + 0.106x_{t-7} + 0.107x_{t-8} + 0.082x_{t-9} + 0.084x_{t-10} + 0.083x_{t-11} \\
& + 0.049x_{t-12} + 0.079x_{t-13} + 0.101x_{t-14} + 0.107x_{t-15}
\end{aligned}
\tag{4.4}
$$

where $y(t)$ is the predicted runoff and $x_t, x_{t-1}, \ldots, x_{t-n}$ are the rainfall inputs for the corresponding times.

It was observed that 16 inputs give significant contributions to the model structures for the Sungai Bekok, Sungai Ketil and Sungai Slim catchments. For the Sungai Klang catchment, only 14 inputs give a significant contribution to the model structure, and the rainfall at times $t-7$ and $t-8$ was not taken into consideration.

4.4.1.2 Model developed using 50% of calibration data

This section shows the model structures developed using 50 percent of the calibration data. The MLR model structures for each catchment are as follows:

(i) Sungai Bekok catchment area

$$
\begin{aligned}
y(t) ={} & 0.054x_t + 0.096x_{t-1} + 0.167x_{t-2} + 0.219x_{t-3} + 0.213x_{t-4} + 0.201x_{t-5} \\
& + 0.197x_{t-6} + 0.183x_{t-7} + 0.167x_{t-8} + 0.143x_{t-9} + 0.126x_{t-10} + 0.139x_{t-11} \\
& + 0.149x_{t-12} + 0.152x_{t-13} + 0.159x_{t-14} + 0.159x_{t-15}
\end{aligned}
\tag{4.5}
$$

(ii) Sungai Ketil catchment area

$$
\begin{aligned}
y(t) ={} & 0.085x_t + 0.059x_{t-1} + 0.210x_{t-2} + 0.260x_{t-3} + 0.142x_{t-4} + 0.165x_{t-5} \\
& + 0.141x_{t-6} + 0.132x_{t-7} + 0.095x_{t-8} + 0.109x_{t-9} + 0.130x_{t-10} + 0.120x_{t-11} \\
& + 0.100x_{t-12} + 0.098x_{t-13} + 0.105x_{t-14} + 0.086x_{t-15}
\end{aligned}
\tag{4.6}
$$

(iii) Sungai Klang catchment area

$$
\begin{aligned}
y(t) ={} & 0.581x_t + 0.286x_{t-1} + 0.141x_{t-2} + 0.102x_{t-3} + 0.064x_{t-4} + 0.085x_{t-5} \\
& + 0.082x_{t-6} + 0.050x_{t-9} + 0.046x_{t-10} + 0.061x_{t-11} + 0.079x_{t-12} + 0.067x_{t-13} \\
& + 0.096x_{t-15} + 0.048x_{t-16}
\end{aligned}
\tag{4.7}
$$

(iv) Sungai Slim catchment area

$$
\begin{aligned}
y(t) ={} & 0.305x_t + 0.443x_{t-1} + 0.252x_{t-2} + 0.220x_{t-3} + 0.217x_{t-4} + 0.217x_{t-5} \\
& + 0.143x_{t-6} + 0.148x_{t-7} + 0.142x_{t-8} + 0.121x_{t-9} + 0.142x_{t-10} + 0.122x_{t-11} \\
& + 0.070x_{t-12} + 0.100x_{t-13} + 0.121x_{t-14} + 0.125x_{t-15}
\end{aligned}
\tag{4.8}
$$

where $y(t)$ is the predicted runoff and $x_t, x_{t-1}, \ldots, x_{t-n}$ are the rainfall inputs for the corresponding times.
It was also found that 16 inputs give significant contributions to the model structures for the Sungai Bekok, Sungai Ketil and Sungai Slim catchments, and only 14 inputs for the Sungai Klang catchment.

4.4.1.3 Model developed using 25% of calibration data

This section shows the model structures developed using 25 percent of the calibration data. The MLR model structures for each catchment are as follows:

(i) Sungai Bekok catchment area

$$
\begin{aligned}
y(t) ={} & 0.126x_{t-2} + 0.201x_{t-3} + 0.227x_{t-4} + 0.224x_{t-5} + 0.208x_{t-6} + 0.175x_{t-7} \\
& + 0.154x_{t-8} + 0.140x_{t-9} + 0.125x_{t-10} + 0.135x_{t-11} + 0.153x_{t-12} + 0.140x_{t-13} \\
& + 0.127x_{t-14} + 0.116x_{t-15}
\end{aligned}
\tag{4.9}
$$

(ii) Sungai Ketil catchment area

$$
\begin{aligned}
y(t) ={} & 0.160x_t + 0.139x_{t-1} + 0.210x_{t-2} + 0.242x_{t-3} + 0.168x_{t-4} + 0.237x_{t-5} \\
& + 0.167x_{t-6} + 0.132x_{t-7} + 0.126x_{t-8} + 0.155x_{t-9} + 0.161x_{t-10} + 0.077x_{t-11} \\
& + 0.070x_{t-13} + 0.070x_{t-14}
\end{aligned}
\tag{4.10}
$$

(iii) Sungai Klang catchment area

$$
\begin{aligned}
y(t) ={} & 0.619x_t + 0.286x_{t-1} + 0.143x_{t-2} + 0.104x_{t-3} + 0.091x_{t-4} + 0.079x_{t-5} \\
& + 0.111x_{t-6} + 0.063x_{t-9} + 0.062x_{t-10} + 0.060x_{t-11} + 0.056x_{t-12} + 0.093x_{t-13} \\
& + 0.082x_{t-15} + 0.075x_{t-16}
\end{aligned}
\tag{4.11}
$$

(iv) Sungai Slim catchment area

$$
\begin{aligned}
y(t) ={} & 0.316x_t + 0.498x_{t-1} + 0.246x_{t-2} + 0.204x_{t-3} + 0.194x_{t-4} + 0.219x_{t-5} \\
& + 0.125x_{t-6} + 0.156x_{t-7} + 0.132x_{t-8} + 0.120x_{t-9} + 0.124x_{t-10} + 0.093x_{t-11} \\
& + 0.088x_{t-13} + 0.104x_{t-14} + 0.127x_{t-15}
\end{aligned}
\tag{4.12}
$$

where $y(t)$ is the predicted runoff and $x_t, x_{t-1}, \ldots, x_{t-n}$ are the rainfall inputs for the corresponding times.

It was observed that, using 25 percent of the data for calibration, only 14 inputs give significant contributions to the model structures for the Sungai Bekok, Sungai Ketil and Sungai Klang catchments. For the Sungai Slim catchment, 15 inputs give a significant contribution to the model structure.
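As an illustration of how lagged-rainfall models of this form can be evaluated and fitted, the following sketch builds a matrix of lagged rainfall inputs, applies the coefficients of Equation (4.1), and estimates coefficients by ordinary least squares. It is a sketch under stated assumptions, not the procedure used in this study: the function names are hypothetical, the fit has no intercept because the equations above have none, and the statistical selection of significant lags is not reproduced.

```python
import numpy as np

# Coefficients of Equation (4.1), Sungai Bekok, for lags 0 to 15.
BEKOK_100 = [0.056, 0.080, 0.168, 0.235, 0.228, 0.218, 0.210, 0.187,
             0.169, 0.143, 0.134, 0.138, 0.146, 0.137, 0.144, 0.145]


def lagged_matrix(rainfall, lags):
    """Build a matrix whose columns are rainfall at t - lag for each lag.

    Rows start at t = max(lags), the first time step for which every
    lagged value is available.
    """
    rain = np.asarray(rainfall, dtype=float)
    max_lag = max(lags)
    rows = len(rain) - max_lag
    columns = [rain[max_lag - lag: max_lag - lag + rows] for lag in lags]
    return np.column_stack(columns)


def predict_runoff(rainfall, coefficients, lags):
    """Evaluate y(t) = sum over lags of coefficient * x(t - lag)."""
    return lagged_matrix(rainfall, lags) @ np.asarray(coefficients, dtype=float)


def fit_coefficients(rainfall, runoff, lags):
    """Estimate the coefficients by least squares (no intercept)."""
    x = lagged_matrix(rainfall, lags)
    y = np.asarray(runoff, dtype=float)[max(lags):]
    coefficients, *_ = np.linalg.lstsq(x, y, rcond=None)
    return coefficients


if __name__ == "__main__":
    # Sixteen days of constant unit rainfall, made up for illustration.
    rain = np.ones(16)
    print(predict_runoff(rain, BEKOK_100, list(range(16))))
```

Because the lags are passed as a list, structures with omitted lags, such as the Sungai Klang model of Equation (4.3), can be represented by listing only the lags that appear in the equation.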
Tables H4.61 to H4.68 in Appendix H show the daily and hourly results of percentage bias (PBIAS) of calibration or training for the MLR model. According to Yapo et al. (1996), the MLR model can be classified as a biased model. This study indicates that the PBIAS is in the range of -50.0 to +300.0. The models therefore have a tendency to overestimate or underestimate.

4.4.2 Results of Daily MLR Model

Tables 4.25(a) to 4.25(c) present the R², RMSE, RRMSE, and MAPE resulting from the MLR model of the daily rainfall-runoff relationship for the Sungai Bekok catchment. For the Sungai Bekok catchment, 16 inputs are considered for the MLR model after calibration using both 100% and 50% of the data sets, and 14 inputs after calibration using 25% of the data sets. The MLR model gives R² in the range of 70% to 90%, so it can be classified as a fairly good model. According to Johnson and King (1988), results of modelling for Sungai Bekok with MAPE around 30% can be considered a reasonable prediction. During the calibration phase, the RMSE is consistently less than 186 cumecs and the RRMSE is consistently less than 33.7 for this model. During testing, the RMSE is consistently less than 183.3 cumecs and the RRMSE is less than 35.9.

Table 4.25(a): Results of MLR model for Sg. Bekok catchment, using 100% of data sets in training phase

| Data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR training | 16 input | 6 | 0.8255 | 180.6978 | 32.6649 | 27.2412 |
| MLR test, set 1 | 16 input | 6 | 0.8803 | 164.8479 | 32.4936 | 28.0611 |
| MLR test, set 2 | 16 input | 6 | 0.8567 | 166.2273 | 32.6982 | 28.9360 |
| MLR test, set 3 | 16 input | 6 | 0.7568 | 164.1098 | 32.5462 | 27.1136 |
| MLR test, set 4 | 16 input | 6 | 0.8106 | 183.2238 | 35.8822 | 32.7099 |
| MLR test, set 5 | 16 input | 6 | 0.6682 | 142.3990 | 28.5501 | 21.6171 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.25(b): Results of MLR model for Sg. Bekok catchment, using 50% of data sets in training phase

| Data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR training | 16 input | 6 | 0.8754 | 185.6333 | 33.6433 | 28.6339 |
| MLR test, set 1 | 16 input | 6 | 0.8708 | 163.3746 | 32.2331 | 27.9174 |
| MLR test, set 2 | 16 input | 6 | 0.9353 | 164.9307 | 32.3937 | 28.7739 |
| MLR test, set 3 | 16 input | 6 | 0.8258 | 161.9409 | 32.0716 | 26.7573 |
| MLR test, set 4 | 16 input | 6 | 0.9542 | 182.7206 | 35.7068 | 32.6845 |
| MLR test, set 5 | 16 input | 6 | 0.7409 | 140.9284 | 28.2122 | 21.4292 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.25(c): Results of MLR model for Sg. Bekok catchment, using 25% of data sets in training phase

| Data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR training | 14 input | 6 | 0.8267 | 149.2337 | 27.8447 | 22.7327 |
| MLR test, set 1 | 14 input | 6 | 0.8494 | 147.8456 | 29.1103 | 24.7764 |
| MLR test, set 2 | 14 input | 6 | 0.9121 | 149.1146 | 29.2196 | 25.5280 |
| MLR test, set 3 | 14 input | 6 | 0.8103 | 147.3607 | 29.1291 | 23.8458 |
| MLR test, set 4 | 14 input | 6 | 0.9051 | 165.8000 | 32.2783 | 28.8100 |
| MLR test, set 5 | 14 input | 6 | 0.7340 | 127.7982 | 25.5352 | 19.0643 |

cumecs: cubic metres per second; COC: coefficient of correlation

Tables C4.26(a) to C4.26(c) in Appendix C present the R², RMSE, RRMSE, and MAPE resulting from the MLR model for the Sungai Ketil catchment. For the Sungai Ketil catchment, 16 inputs are considered for the MLR model after calibration using 100% and 50% of the data sets, and 14 inputs after calibration using 25% of the data sets. The MLR model gives R² in the range of 60% to 80%. According to the coefficient of correlation, it can be classified as a moderate model. According to Johnson and King (1988), results of modelling for Sungai Ketil with MAPE more than 30% cannot be considered a reasonable model. During the calibration phase, the RMSE is consistently less than 19.6 cumecs and the RRMSE is consistently less than 0.66 for this model.
During testing, the RMSE is consistently less than 21.8 cumecs and the RRMSE is less than 0.73.

Tables C4.27(a) to C4.27(c) in Appendix C present the R², RMSE, RRMSE, and MAPE resulting from the MLR model for the Sungai Klang catchment. The number of inputs considered for the MLR model is 14 after calibration using 100%, 50% and 25% of the data sets. The MLR model gives R² in the range of 60% to 80%. According to the coefficient of correlation, it can be classified as a moderate model. According to Johnson and King (1988), results of modelling for Sungai Klang with MAPE more than 30% cannot be considered a reasonable model. During the calibration phase, the RMSE is consistently less than 9.1 cumecs and the RRMSE is consistently less than 0.54 for this model. During verification, the RMSE is consistently less than 13.1 cumecs and the RRMSE is less than 0.61.

Tables C4.28(a) to C4.28(c) in Appendix C present the R², RMSE, RRMSE, and MAPE resulting from the MLR model for the Sungai Slim catchment. The number of inputs considered for the MLR model is 16 after calibration using 100% and 50% of the data sets, and 15 after calibration using 25% of the data sets. The MLR model gives R² in the range of 50% to 80%. According to the coefficient of correlation, it can be classified as a moderate model. According to Johnson and King (1988), results of modelling for Sungai Slim with MAPE around 30% can be considered a reasonable prediction model. During the calibration phase, the RMSE is consistently less than 144.2 cumecs and the RRMSE is consistently less than 2.2 for this model. During verification, the RMSE is consistently less than 276 cumecs and the RRMSE is less than 4.2.

4.4.3 Verification

The only verification test that is widely used in rainfall-runoff modelling is the split sample test, in which one period of observations is used in calibration and another, separate period is used to check that the model predictions are satisfactory.
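A split sample test of this kind can be sketched as follows. The helper name `split_sample` and the made-up values are assumptions for illustration; the actual calibration and verification periods used in this study are those described in the text and appendices.

```python
def split_sample(rainfall, runoff, calibration_fraction):
    """Split paired series into a calibration period and a later,
    separate verification period, preserving the time order.
    """
    if len(rainfall) != len(runoff):
        raise ValueError("rainfall and runoff must have the same length")
    n_cal = int(len(rainfall) * calibration_fraction)
    if n_cal <= 0 or n_cal >= len(rainfall):
        raise ValueError("both periods must contain at least one record")
    calibration = (rainfall[:n_cal], runoff[:n_cal])
    verification = (rainfall[n_cal:], runoff[n_cal:])
    return calibration, verification


if __name__ == "__main__":
    # Made-up daily values, for illustration only.
    rain = [0.0, 5.0, 12.0, 3.0, 0.0, 0.0, 8.0, 1.0, 0.0, 4.0]
    flow = [1.0, 1.2, 2.5, 2.1, 1.6, 1.3, 1.9, 1.7, 1.4, 1.6]
    (cal_rain, cal_flow), (ver_rain, ver_flow) = split_sample(rain, flow, 0.5)
    print(len(cal_rain), len(ver_rain))
```

The split is made by position rather than at random so that the verification data come from a period the model has not seen during calibration.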
As with the ANN models, the verification of the MLR model was carried out using different sets of data from different periods of time. For daily rainfall-runoff modelling, five different sets of data, namely set 1, set 2, set 3, set 4 and set 5, were used to study the capability of the models in transforming rainfall into runoff under any condition. The performance of each model is measured by R², RMSE, RRMSE, and MAPE.

The MLR model shows unsatisfactory performance, with poor RMSE, RRMSE and MAPE results in the calibration and validation phases. The main limitation of the MLR model is that it is unable to handle a large number of data sets, so it may not be properly calibrated. A further disadvantage of these models is that the trial and error method has to be used to find the parameters that produce the best fit. It therefore takes longer to calibrate the model adequately, and the desired level of accuracy may not be achieved. This is a source of uncertainty in the modelling. This study found that, even with an intensive series of measurements of parameter values, the results were not entirely satisfactory.

4.4.4 Robustness Test

As with the MLP and RBF models, the robustness tests for the MLR model were also carried out using the same different sets of data from different periods of time. For daily and hourly rainfall-runoff modelling, five different sets of data, namely set 1, set 2, set 3, set 4 and set 5, were used to study the capability of the models in transforming rainfall into runoff under any condition. The performance of each model is measured by R², RMSE, RRMSE, and MAPE.

According to the coefficient of efficiency, R², the performance of the neural network model is better than that of the MLR model in the training and testing phases. The performance of the MLR model can be classified as moderate.
For both the calibration and testing processes, the MLR model also offers a reduction in the time taken for calibration compared to the ANN models. The input node selection of Harun (1999) can considerably reduce the time taken to find the number of input nodes for the neural network model structures.

This study confirms that the MLR model is not capable or robust enough in modelling the continuous daily rainfall-runoff relationship and the hourly streamflow hydrograph. The results of R² and the error analysis show that the MLR model is slightly better than the HEC-HMS and SWMM models. The results of the MLR model are not considered consistently robust. The MLR model shows poorer agreement between the input and output of the rainfall-runoff relationship than the MLP and RBF models. The limitation of the MLR model is that it is unable to handle a large number of data sets, and the parameters cannot be adjusted in the calibration process. The advantage of this model is that it takes a shorter time in the training and validation process than the MLP and RBF networks, because the MLR model has fewer parameters than the MLP and RBF networks.

4.5 Results of the HEC-HMS Model

The HEC-HMS is designed to simulate the rainfall-runoff processes of watershed systems. It is designed to be applicable in a wide range of geographic areas for solving the widest possible range of problems. It utilizes a graphical user interface to build a watershed model and to set up the rainfall and control variables for simulation. The program features a completely integrated work environment including a database, data entry utilities, computation engine, and results reporting tools. The application of the HEC-HMS model involves two steps. First, the model was calibrated using previous data sets to determine the best parameters. Second, the model was verified using new sets of data.
HEC-HMS was run with the previous hourly rainfall-runoff data in order to provide hourly predictions of runoff entering the selected catchments.

4.5.1 Calibration

The HEC-HMS model parameters were calibrated by a trial and error process, because the calibration accuracy of such models is very subjective and highly dependent on the modeller's knowledge, experience, and understanding of the model components and the catchment characteristics. All model calibrations and subsequent predictions are subject to uncertainty. This uncertainty arises because no rainfall-runoff model is a true reflection of the processes involved, because it is impossible to specify the initial and boundary conditions required by the model with complete accuracy, and because the observational data available for model calibration are not error-free. It was observed that, for both the calibration and validation processes, the HEC-HMS model takes longer to calibrate than the ANN and MLR models. If a model is calibrated using data that are in error, the effective parameter values will be affected, and so will the predictions for other periods, which depend on the calibrated parameter values. This is a source of uncertainty in modelling with HEC-HMS. It is therefore worth stressing that, prior to applying any model, the rainfall-runoff data should be checked for consistency, even though some errors may not be obvious. In the calibration process, some initial parameter values are chosen and the model is run with a calibration data set. The resulting predictions are compared with observed variables, and a measure of goodness of fit is calculated and scaled so that a perfect fit gives a value of 1.0 (a good fit approaches 1.0) and a very poor fit gives a value of zero.
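The calibration procedure described above can be sketched in Python. This is a minimal illustration only, not the HEC-HMS procedure itself: the one-parameter `toy_model` (runoff as a fixed fraction of rainfall), the candidate parameter values, and the data are all hypothetical, and the goodness-of-fit measure is assumed here to be the Nash-Sutcliffe efficiency clipped at zero, since the text does not name the measure that was used.

```python
def toy_model(rainfall, runoff_coeff):
    """Hypothetical one-parameter model: runoff is a fixed fraction of rainfall."""
    return [runoff_coeff * p for p in rainfall]


def goodness_of_fit(observed, simulated):
    """Assumed measure: Nash-Sutcliffe efficiency clipped to the range [0, 1]."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return max(0.0, 1.0 - sse / sst)


def calibrate(rainfall, observed, candidates):
    """Trial and error: run the model for each candidate value, keep the best fit."""
    best_param, best_fit = None, -1.0
    for c in candidates:
        fit = goodness_of_fit(observed, toy_model(rainfall, c))
        if fit > best_fit:
            best_param, best_fit = c, fit
    return best_param, best_fit


# Made-up calibration data, for illustration only.
rainfall = [0.0, 10.0, 20.0, 5.0, 0.0]
observed = [0.0, 4.0, 8.0, 2.0, 0.0]
best_param, best_fit = calibrate(rainfall, observed, [0.2, 0.3, 0.4, 0.5])
```

Each pass through the loop corresponds to one trial: change the parameter, run the model, and recompute the goodness of fit. A real model has several interacting parameters, so an exhaustive loop of this kind quickly becomes expensive.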
It is a relatively simple matter to set up the model to change the parameter values, make another run, and recalculate the goodness of fit. Not all of those runs will result in models that fit the data well. A lot of computer time can therefore be saved by avoiding model runs that give poor fits when searching for an optimum parameter set. In the calibration process, it was found that the best configuration for modelling the daily rainfall-runoff relationship for the Sungai Bekok, Sungai Ketil, Sungai Klang and Sungai Slim catchments is an Initial-Constant infiltration/loss parameterisation, the SCS hydrograph transformation routine, and a recession baseflow component. Meanwhile, the best configuration for modelling the hourly rainfall-runoff relationship for the same catchments is an Initial-Constant infiltration/loss parameterisation, the Clark hydrograph transformation routine, and a recession baseflow component. For the Sungai Ketil catchment, the transformation routine used is the SCS hydrograph. The initial loss and initial flow are treated as initial conditions and vary from simulation to simulation.

Tables H4.61 to H4.68 in Appendix H show the daily and hourly percentage bias (PBIAS) results of calibration (training) for the HEC-HMS model. The same input and output data used for the previous models were used to calibrate this distributed or lumped model. The trial and error method was used to obtain the best parameters for this model, and the optimization facilities of the model were used to check whether the solution obtained was the best one. According to the criteria of Yapo et al. (1996), the HEC-HMS model shows unsatisfactory results. For the daily rainfall-runoff model, the PBIAS is in the range of -39.0 to +9.0. Meanwhile, for the hourly rainfall-runoff model, the PBIAS is in the range of -7.0 to +60.0. This means that the models are biased and not robust, and therefore tend to overestimate or underestimate the runoff.
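As an illustration of how the percentage bias is obtained, the sketch below computes PBIAS as 100 times the sum of the residuals (observed minus simulated) divided by the sum of the observed flows. Under this assumed sign convention a negative value indicates overestimation and a positive value underestimation; conventions differ between authors, so the sign used in the Appendix H tables may be the opposite. The function name and the flow values are hypothetical.

```python
def pbias(observed, simulated):
    """Percentage bias: 100 * sum(obs - sim) / sum(obs).

    Assumed sign convention: positive means the model underestimates,
    negative means it overestimates.
    """
    total_obs = sum(observed)
    if total_obs == 0:
        raise ValueError("sum of observed flows must be non-zero")
    residual = sum(o - s for o, s in zip(observed, simulated))
    return 100.0 * residual / total_obs


# Made-up flows in cumecs, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [3.0, 5.0, 6.0, 8.0]
bias = pbias(observed, simulated)  # negative here: simulated volume exceeds observed
```

A PBIAS of zero means that the total simulated volume matches the total observed volume, although individual time steps may still be poorly predicted.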
Calibration parameters for the daily rainfall-runoff modelling of Sungai Bekok are shown in Tables 4.29(a) to 4.29(c). Calibration parameters for the daily rainfall-runoff modelling of the Sungai Ketil, Sungai Klang, and Sungai Slim catchments are shown in Tables D4.30(a) to D4.30(c), Tables D4.31(a) to D4.31(c), and Tables D4.32(a) to D4.32(c) respectively, in Appendix D (Part A). Calibration parameters for the hourly rainfall-runoff modelling of Sungai Bekok are shown in Tables 4.33(a) and 4.33(b). Calibration parameters for the hourly rainfall-runoff modelling of the Sungai Ketil, Sungai Klang, and Sungai Slim catchments are shown in Tables D4.34(a) and D4.34(b), Tables D4.35(a) and D4.35(b), and Tables D4.36(a) and D4.36(b) respectively, in Appendix D (Part B).

4.5.1.1 Results of the Daily HEC-HMS Model Calibration

Tables 4.29(a) to 4.29(c) show the calibration coefficients for the Sungai Bekok catchment, obtained using 100, 50 and 25 percent of the historical data respectively.
Table 4.29(a): Calibration Coefficients of Sungai Bekok catchment (using 100% of data)

Model parameter               Calibrated value
Constant Loss Rate (mm/hr)    2.0
Imperviousness (%)            42
SCS Lag (minutes)             3000.00
Recession Constant            1
Threshold Flow (cumecs)       0.995
*Another 2 parameters (catchment size & baseflow) are fixed in the model

Table 4.29(b): Calibration Coefficients of Sungai Bekok catchment (using 50% data)

Model parameter               Calibrated value
Constant Loss Rate (mm/hr)    1.62
Imperviousness (%)            42
SCS Lag (minutes)             2350.00
Recession Constant            1
Threshold Flow (cumecs)       0.995
*Another 2 parameters (catchment size & baseflow) are fixed in the model

Table 4.29(c): Calibration Coefficients of Sungai Bekok catchment (using 25% data)

Model parameter               Calibrated value
Constant Loss Rate (mm/hr)    1.0
Imperviousness (%)            42
SCS Lag (minutes)             1700.00
Recession Constant            1
Threshold Flow (cumecs)       0.995
*Another 2 parameters (catchment size & baseflow) are fixed in the model

4.5.1.2 Results of the Hourly HEC-HMS Model Calibration

Tables 4.33(a) and 4.33(b) show the calibration coefficients for the Sungai Bekok catchment, obtained using 25 percent and the minimum amount of historical data respectively.
Table 4.33(a): Calibration Coefficients of Sungai Bekok catchment (using 25% of data)

Model parameter               Calibrated value
Constant Rate (mm/hr)         50
Imperviousness (%)            48
Time of Concentration (hr)    25.8
Storage Coefficient (hr)      180
Recession Constant            1
Threshold Flow (cumecs)       0.99
*Another 2 parameters (catchment size & baseflow) are fixed in the model

Table 4.33(b): Calibration Coefficients of Sungai Bekok catchment (using minimum data)

Model parameter               Calibrated value
Constant Rate (mm/hr)         3
Imperviousness (%)            48
Time of Concentration (hr)    18.25
Storage Coefficient (hr)      18
Recession Constant            1
Threshold Flow (cumecs)       0.99
*Another 2 parameters (catchment size & baseflow) are fixed in the model

4.5.2 Results of Daily HEC-HMS Model

Tables 4.37(a) to 4.37(c) present the R², RMSE, RRMSE, and MAPE obtained from the HEC-HMS model for the Sungai Bekok catchment. The HEC-HMS model yields R² values below 70%, which indicates poor and unsatisfactory performance. The MAPE values for Sungai Bekok exceed 30%, which cannot be considered a reasonable prediction. This may be caused by the large amount of data introduced to the model. In general, the performance of the HEC-HMS model is unsatisfactory during the calibration and verification phases. During the calibration phase, the RMSE for Sungai Bekok is consistently in the range of 0.5 to 0.8 cumecs, and the RRMSE remains below 0.165. During verification, the RMSE is less than 0.84 cumecs and the RRMSE is less than 0.19. Further, the RMSE and RRMSE results show that the model calibrated using the dry set of data (set 4) approximates the rainfall-runoff process in the catchment more closely than the model calibrated using the wet set of data (set 5). Clearly, the application of the HEC-HMS method to model the rainfall-runoff relationship of Sungai Bekok is not very successful because of the poor correlation between the observed and computed results.

Table 4.37(a): Results of HEC-HMS Model for Sg. Bekok catchment using 100% of data sets in training phase

Model / Data Set   No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
HEC TRAINING       7                  0.1120     0.7984          0.1630   125.3162
HEC-TEST Set 1     7                  0.0486     0.5511          0.1156   91.6911
HEC-TEST Set 2     7                  0.2014     0.5392          0.1124   87.3058
HEC-TEST Set 3     7                  0.3436     0.6037          0.1261   100.1061
HEC-TEST Set 4     7                  0.1231     0.6255          0.1312   99.4574
HEC-TEST Set 5     7                  0.2680     0.8351          0.1818   139.0932
cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.37(b): Results of HEC-HMS Model for Sg. Bekok catchment using 50% of data sets in training phase

Model / Data Set   No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
HEC TRAINING       7                  0.0320     0.6600          0.1310   100.5741
HEC-TEST Set 1     7                  0.0560     0.5038          0.1054   84.2492
HEC-TEST Set 2     7                  0.2005     0.5170          0.1081   84.7230
HEC-TEST Set 3     7                  0.2692     0.5428          0.1116   91.8308
HEC-TEST Set 4     7                  0.0745     0.6335          0.1316   101.9193
HEC-TEST Set 5     7                  0.0755     0.7434          0.1571   138.9936
cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.37(c): Results of HEC-HMS Model for Sg. Bekok catchment using 25% of data sets in training phase

Model / Data Set   No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
HEC TRAINING       7                  0.4242     0.5933          0.1185   93.8401
HEC-TEST Set 1     7                  0.3465     0.5992          0.1225   95.0859
HEC-TEST Set 2     7                  0.2997     0.5478          0.1100   85.0862
HEC-TEST Set 3     7                  0.3979     0.6203          0.1272   102.4020
HEC-TEST Set 4     7                  0.2195     0.6135          0.1236   92.2602
HEC-TEST Set 5     7                  0.2146     0.6888          0.1451   120.4417
cumecs: cubic metres per second; COC: coefficient of correlation

Tables E4.38(a) to E4.38(c) in Appendix E (Part A) present the R², RMSE, RRMSE, and MAPE obtained from the HEC-HMS model for the Sungai Ketil catchment. For the Sungai Ketil catchment, the HEC-HMS model also yields R² values below 70%, which indicates poor and unsatisfactory performance. In general, the performance of the HEC-HMS model is unsatisfactory during the calibration and verification phases. Most of the MAPE results for Sungai Ketil are above 30%.
These are not considered reasonable predictions. However, some simulations show a reasonable prediction, especially when 25% of the data sets were used in the training phase. According to Johnson and King (1988), such results are considered very accurate. During the calibration phase, the RMSE for Sungai Ketil is consistently below 0.44 cumecs, and the RRMSE remains below 0.015. During verification, the RMSE is less than 0.75 cumecs and the RRMSE is less than 0.03, approaching zero. Overall, the application of the HEC-HMS method to model the rainfall-runoff relationship of Sungai Ketil is moderate.

Tables E4.39(a) to E4.39(c) in Appendix E (Part A) present the R², RMSE, RRMSE, and MAPE obtained from the HEC-HMS model for the Sungai Klang catchment. For the Sungai Klang catchment, the HEC-HMS model yields R² values below 50%, which indicates poor and unsatisfactory performance. The MAPE results for Sungai Klang are above 30%. According to Johnson and King (1988), this is not considered reasonable. In general, the performance of the HEC-HMS model is unsatisfactory during the calibration and verification phases. During the calibration phase, the RMSE for Sungai Klang is consistently below 16.6 cumecs, and the RRMSE is below 0.68. Meanwhile, during verification, the RMSE is less than 29 cumecs and the RRMSE is less than 0.88. Clearly, the application of the HEC-HMS method to model the rainfall-runoff relationship of Sungai Klang is not satisfactory.

Tables E4.40(a) to E4.40(c) in Appendix E (Part A) present the R², RMSE, RRMSE, and MAPE obtained from the HEC-HMS model for the Sungai Slim catchment. For the Sungai Slim catchment, the HEC-HMS model also yields R² values below 50%, which indicates poor and unsatisfactory performance. The MAPE results for Sungai Slim are also more than 30%. According to Johnson and King (1988), this is not considered a reasonable prediction model. In general, the performance of the HEC-HMS model is moderate during the calibration and verification phases.
During the calibration phase, the RMSE for Sungai Slim is consistently below 0.1 cumecs, and the RRMSE is below 0.002. During verification, the RMSE is less than 0.15 cumecs and the RRMSE is less than 0.0025. Overall, the application of the HEC-HMS method to model the rainfall-runoff relationship of Sungai Slim is moderate.

The study demonstrates that neural network models based on the MLP and RBF networks are more suitable for modelling the rainfall-runoff relationship than the HEC-HMS model. The HEC-HMS model gives a higher error and a lower degree of efficiency than the MLP and RBF models in terms of the coefficient of correlation, RMSE and RRMSE. The HEC-HMS model also takes longer to calibrate, because a trial and error procedure is applied to find the best parameters for the model structure. The MLP and RBF models show a better performance than the HEC-HMS model, with good agreement in the coefficient of correlation, RMSE, RRMSE and MAPE results in the calibration and verification phases, revealing the best fit to the data.

4.5.3 Results of Hourly HEC-HMS Model

Tables 4.41(a) and 4.41(b) present the R², RMSE, RRMSE, and MAPE obtained from the HEC-HMS model for the Sungai Bekok catchment. For the Sungai Bekok catchment, the HEC-HMS model yields R² values below 75%, which indicates poor and unsatisfactory performance. Meanwhile, the modelling results for Sungai Bekok with a MAPE of less than 30% are considered reasonable predictions. In general, the performance of the MLP and RBF models is better than that of the HEC-HMS model in the calibration and verification phases.

Table 4.41(a): Results of HEC-HMS Model for Sg. Bekok catchment using 25% of data sets in training phase

Model / Data Set   No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
HEC TRAINING       8                  0.4867     0.7105          0.1367   30.8068
HEC-TEST Set 1     8                  0.0209     0.2165          0.0427   116.6770
HEC-TEST Set 2     8                  0.1480     0.3832          0.0488   62.5096
HEC-TEST Set 3     8                  0.0440     0.4732          0.0905   88.5970
HEC-TEST Set 4     8                  0.0798     0.2328          0.1312   42.9340
HEC-TEST Set 5     8                  0.7069     0.2967          0.0084   22.5021
cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.41(b): Results of HEC-HMS Model for Sg. Bekok catchment using minimum data sets in training phase

Model / Data Set   No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
HEC TRAINING       8                  0.6097     0.5329          0.0945   22.1940
HEC-TEST Set 1     8                  0.6128     0.3508          0.0677   21.4041
HEC-TEST Set 2     8                  0.37326    0.8719          0.1092   40.5906
HEC-TEST Set 3     8                  0.3241     0.8794          0.1759   37.8493
HEC-TEST Set 4     8                  0.1518     1.1288          0.3922   68.0408
HEC-TEST Set 5     8                  0.2066     0.6248          0.0362   72.6862
cumecs: cubic metres per second; COC: coefficient of correlation

Tables E4.42(a) and E4.42(b) in Appendix E (Part B) present the R², RMSE, RRMSE, and MAPE obtained from the HEC-HMS model for the Sungai Ketil catchment. For the Sungai Ketil catchment, the HEC-HMS model yields R² values below 70%, which also indicates poor and unsatisfactory performance. Meanwhile, the modelling results for Sungai Ketil with a MAPE of less than 30% are considered reasonable predictions. In general, the performance of the MLP and RBF models is better than that of the HEC-HMS model in the calibration and verification phases.

Tables E4.43(a) and E4.43(b) in Appendix E (Part B) present the R², RMSE, RRMSE, and MAPE obtained from the HEC-HMS model for the Sungai Klang catchment. For the Sungai Klang catchment, the HEC-HMS model yields R² values in the range of 10% to 80%, which shows that the model is unsatisfactory and inconsistent. Meanwhile, the modelling results for Sungai Klang with a MAPE of more than 30% are not considered reasonable predictions. In general, the performance of the MLP and RBF models is better than that of the HEC-HMS model in the calibration and verification phases.
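The four performance measures reported in these tables can be computed as sketched below. The thesis defines them in an earlier chapter that is not reproduced here, so the formulas are assumptions: R² is taken as the squared Pearson correlation, RRMSE as the RMSE divided by the mean observed flow, and MAPE as the mean absolute error relative to each observed value, in percent. The 30% MAPE limit follows the Johnson and King (1988) criterion quoted in the text; the function names and flow values are hypothetical.

```python
import math


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated flows (assumed)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov ** 2 / (var_o * var_s)


def rmse(observed, simulated):
    """Root mean square error, in the units of the flows."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def rrmse(observed, simulated):
    """Relative RMSE: RMSE divided by the mean observed flow (assumed definition)."""
    return rmse(observed, simulated) / (sum(observed) / len(observed))


def mape(observed, simulated):
    """Mean absolute percentage error; observed values must be non-zero."""
    n = len(observed)
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(observed, simulated)) / n


def is_reasonable(mape_percent, limit=30.0):
    """True when the MAPE is below the 30% limit quoted in the text."""
    return mape_percent < limit


# Made-up flows in cumecs, for illustration only.
observed = [1.0, 2.0, 4.0, 5.0]
simulated = [1.5, 2.0, 3.0, 5.0]
scores = {
    "R2": r_squared(observed, simulated),
    "RMSE": rmse(observed, simulated),
    "RRMSE": rrmse(observed, simulated),
    "MAPE": mape(observed, simulated),
}
```

Because MAPE divides by each observed value, it is dominated by errors at low flows, which is one possible reason why a model can show a small RMSE and still exceed the 30% limit.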
Tables E4.44(a) and E4.44(b) in Appendix E (Part B) present the R², RMSE, RRMSE, and MAPE obtained from the HEC-HMS model for the Sungai Slim catchment. For the Sungai Slim catchment, the HEC-HMS model yields R² values in the range of 10% to 60%, which shows that the model is unsatisfactory, inconsistent and unreliable. Meanwhile, the modelling results for Sungai Slim with a MAPE of more than 30% are not considered reasonable predictions. In general, the performance of the MLP and RBF models is better than that of the HEC-HMS model in the calibration and verification phases.

4.5.4 Verification

Verification of distributed models is an issue that has received a great deal of recent attention in the field of rainfall-runoff modelling, following a number of studies carried out by hydrologists. In fact, verification is not an appropriate term to use, since no model approximation can be expected to be a valid representation of a complex real process. Because distributed models make distributed predictions, there is a lot of potential for evaluating not only the discharge predictions at a catchment outlet but also the internal state variables, such as water table levels and soil moisture levels. The lack of such evaluation in distributed models is due to the expense of collecting widespread measurements of these internal variables, and to the difficulty of measuring quantities at scales that can differ significantly from the model element scale at which the predictions are made. Freeze (1972) reported that, because of uncertainties in the boundary conditions, initial conditions and parameter values of a distributed model, it is unlikely that a true model validation will ever be possible, since the errors in representing the system and specifying the inputs will surely induce unavoidable errors in the simulations, however well a model appears to have been calibrated.
This study has found that, even with an intensive series of measurements of parameter values, the results of the model verification process have not been entirely satisfactory. This may be because there is plenty of scope for the runoff to be simulated by a variety of different mechanisms or parameters. The overall performance of the HEC-HMS model can be classified as moderate. However, we observed that its performance is unsatisfactory in that it lacks consistency and robustness. The coefficient of efficiency shows that the performance of the ANN models is better than that of the HEC-HMS model in the training and validation phases. For both the calibration and validation processes, the HEC-HMS model takes a long time to calibrate compared to the other models.

4.5.5 Robustness Test

As with the other models, the robustness tests for the HEC-HMS model were carried out using the same sets of data, covering different periods of time, for daily and hourly rainfall-runoff modelling. The performance of each model is again measured by R², RMSE, RRMSE, and MAPE. The HEC-HMS model has the lowest calibration and verification accuracy when compared with the best-fit MLP, RBF and MLR models. For the HEC-HMS model, the R² values vary in the range from 0.1 to 0.6 for calibration and verification. In addition, the MAPE values are comparatively high, at more than 30 percent of the observed values. The HEC-HMS model takes longer to calibrate, because a trial and error procedure is applied to find the best parameters. An advantage of this model is that it has an optimization function for approximating the best-fit parameter values. Although this function is simple and flexible, it is not consistent or robust. The function involves many parameters that are related to each other and that affect the overall process of the model.
Furthermore, the main limitation of the HEC-HMS model compared to the MLP and RBF models is that it is unable to handle a large number of data sets. The MLP and RBF methods have been shown to handle the non-linear processes within the catchment more easily than the HEC-HMS model. This study confirms that the HEC-HMS model is not capable and shows unsatisfactory results compared to the MLR and ANN models. The HEC-HMS model is not robust in modelling the continuous daily rainfall-runoff relationship, for which it shows poor agreement between the input and output. Meanwhile, it shows moderate results in modelling the hourly streamflow hydrograph. Based on the R² results and the error analysis, we found that the HEC-HMS model is neither consistent nor robust. In other words, the model does not work consistently, or becomes unstable, with long periods or large amounts of data.

4.6 Results of the SWMM Model

The application of the SWMM model also involves two steps. First, the model was calibrated using previous data sets to determine the best parameters. Second, the model was verified using new sets of data. The program features a completely integrated work environment including a database, data entry utilities, a computation engine, and results reporting tools. SWMM was run with the previous hourly rainfall-runoff data in order to provide hourly predictions of runoff entering the selected catchments.

4.6.1 Calibration

The SWMM model parameters were also calibrated by a trial and error process. In this study, it was found that the calibration of the SWMM model can also be affected by data that are in error: the effective parameter values are affected, and so are the predictions for other periods, which depend on the calibrated parameter values. As with the HEC-HMS model, the SWMM model processes are also subject to uncertainty.
This is because it is impossible to specify the initial and boundary conditions required by the model with complete accuracy, and because the observational data available for model calibration are not error-free. Uncertainties and sensitivity may also depend on the period of data used. As in regression, these uncertainties will normally grow as the model predicts the responses for more and more extreme conditions relative to the data used in calibration. Calibration was carried out separately for each catchment in order to determine its parameter values: for each catchment, a set of parameters needs to be established so that the SWMM model can simulate the rainfall-runoff processes. For calibration purposes, the same rainfall (input) and runoff (output) data used for the previous models were used to calibrate this model. The trial and error method was used to obtain the best parameters for this model. It was found that the model components used for modelling the daily rainfall-runoff relationship for the Sungai Bekok, Sungai Klang and Sungai Slim catchments are Horton infiltration/loss and SCS Hydrology. Meanwhile, for the Sungai Ketil catchment, the model components used are Horton infiltration/loss and the Time Area unit hydrograph. For modelling the hourly rainfall-runoff relationship, the model components used for the Sungai Bekok and Sungai Slim catchments are Horton infiltration/loss and SCS Hydrology. Meanwhile, for the Sungai Klang and Sungai Ketil catchments, the model components used are Horton infiltration/loss and the Rational Formula unit hydrograph. Calibration parameters for the daily rainfall-runoff modelling of Sungai Bekok are shown in Tables 4.45(a) to 4.45(c).
Meanwhile, calibration parameters for the daily rainfall-runoff modelling of the Sungai Ketil, Sungai Klang, and Sungai Slim catchments are shown in Tables F4.46(a) to F4.46(c), Tables F4.47(a) to F4.47(c), and Tables F4.48(a) to F4.48(c) respectively, in Appendix F (Part A). Calibration parameters for the hourly rainfall-runoff modelling of Sungai Bekok are shown in Tables 4.49(a) and 4.49(b). Meanwhile, calibration parameters for the hourly rainfall-runoff modelling of the Sungai Ketil, Sungai Klang, and Sungai Slim catchments are shown in Tables F4.50(a) and F4.50(b), Tables F4.51(a) and F4.51(b), and Tables F4.52(a) and F4.52(b) respectively, in Appendix F (Part B).

Tables H4.61 to H4.68 in Appendix H show the daily and hourly percentage bias (PBIAS) results of calibration (training) for the SWMM model. According to the criteria of Yapo et al. (1996), the SWMM model also shows unsatisfactory results. For the daily rainfall-runoff model, the PBIAS is in the range of -29.0 to +7.0. Meanwhile, for the hourly rainfall-runoff model, the PBIAS is in the range of -6.0 to +55.0. This means that the models are biased and not robust. As with the HEC-HMS model, the SWMM model tends to overestimate or underestimate because its performance is lower than that of the MLP and RBF models.

In rainfall-runoff modelling, the hydrological concepts are given priority over the problems of parameter calibration, particularly in physically based models. For models that have many parameters, there are likely to be several sets of parameters that give a good fit to the hydrograph using the required mechanisms. There are particular problems in assessing the response surface and the sensitivity of parameters in distributed models, because of the very large number of parameter values involved and the possibilities for parameter interaction in specifying distributed fields of parameters.
This will remain a difficulty for the foreseeable future, and the only sensible strategy in calibrating distributed models would appear to be to insist that most of the parameters are either fixed or calibrated with respect to some distributed observations, and not catchment discharge alone.

4.6.1.1 Results of the Daily SWMM Model Calibration

Tables 4.45(a) to 4.45(c) show the calibration coefficients for the Sungai Bekok catchment, obtained using 100, 50 and 25 percent of the historical data respectively.

Table 4.45(a): Calibration Coefficients of Sungai Bekok catchment (using 100% of data)

Model parameter               Calibrated value
Imperviousness (%)            34
Pervious Area CN              66
Time of Concentration (hr)    50
Initial Abstraction           0.2
Decay rate of infiltration    0.0012
*Another 2 parameters (catchment size & baseflow) are fixed in the model

Table 4.45(b): Calibration Coefficients of Sungai Bekok catchment (using 50% data)

Model parameter               Calibrated value
Imperviousness (%)            34
Pervious Area CN              66
Time of Concentration (hr)    39.17
Initial Abstraction           0.15
Decay rate of infiltration    0.0012
*Another 2 parameters (catchment size & baseflow) are fixed in the model

Table 4.45(c): Calibration Coefficients of Sungai Bekok catchment (using 25% data)

Model parameter               Calibrated value
Imperviousness (%)            34
Pervious Area CN              66
Time of Concentration (hr)    28.33
Initial Abstraction           0.15
Decay rate of infiltration    0.0012
*Another 2 parameters (catchment size & baseflow) are fixed in the model

Calibration is not the only problem in finding an optimum parameter set. Optimization generally assumes that the observations with which the simulations are compared are error-free and that the model is a true representation of the data. Thus, the optimum parameter set found for a particular model structure may be sensitive to small changes in the observations, to the period of observations considered in the calibration, and possibly to changes in the model structure.
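The dependence of the optimum on the calibration period can be illustrated with a deliberately simple sketch. It assumes a hypothetical one-parameter model, runoff = c × rainfall, whose least-squares optimum has a closed form; the two "periods" of data are made up. It is not a representation of HEC-HMS or SWMM.

```python
def best_runoff_coeff(rainfall, observed):
    """Least-squares optimum of c in the hypothetical model runoff = c * rainfall."""
    numerator = sum(p * q for p, q in zip(rainfall, observed))
    denominator = sum(p * p for p in rainfall)
    return numerator / denominator


# Two made-up calibration periods with the same rainfall but different response.
rain_a, flow_a = [10.0, 20.0, 30.0], [4.0, 8.0, 12.0]
rain_b, flow_b = [10.0, 20.0, 30.0], [5.0, 10.0, 15.0]

c_a = best_runoff_coeff(rain_a, flow_a)                      # optimum for period A
c_b = best_runoff_coeff(rain_b, flow_b)                      # optimum for period B
c_all = best_runoff_coeff(rain_a + rain_b, flow_a + flow_b)  # both periods pooled
```

The three calls return three different "optimum" values for the same model structure, which is the sense in which a calibrated parameter set is tied to the data period used to obtain it.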
In the calibration process for all three of these models, it has been found that the parameter values determined by calibration are effectively valid only inside the model structure used in the calibration. It may not be appropriate to use those values in different models or in different catchments. Furthermore, the concept of an optimum parameter set may be ill-founded in hydrological modelling. Given a number of parameter sets that give reasonable fits to the data, it is most unlikely that the ranking of those sets in terms of the objective function will be the same for different periods of calibration data.

4.6.1.2 Results of the Hourly SWMM Model Calibration

Tables 4.49(a) and 4.49(b) show the calibration coefficients for the Sungai Bekok catchment, obtained using 25 percent and the minimum amount of historical data respectively.

Table 4.49(a): Calibration Coefficients of Sungai Bekok catchment (using 25% of data)

Model parameter               Calibrated value
Imperviousness (%)            40
Pervious Area CN              60
Time of Concentration (hr)    52.8
Initial Abstraction           0.2
Decay rate of infiltration    0.00115
*Another 2 parameters (catchment size & baseflow) are fixed in the model

Table 4.49(b): Calibration Coefficients of Sungai Bekok catchment (using minimum data)

Model parameter               Calibrated value
Imperviousness (%)            40
Pervious Area CN              60
Time of Concentration (hr)    38.25
Initial Abstraction           0.2
Decay rate of infiltration    0.00115
*Another 2 parameters (catchment size & baseflow) are fixed in the model

4.6.2 Results of Daily SWMM Model

Tables 4.53(a) to 4.53(c) present the R², RMSE, RRMSE, and MAPE obtained from the SWMM model of the daily rainfall-runoff relationship for the Sungai Bekok catchment.

Table 4.53(a): Results of SWMM Model for Sg. Bekok catchment using 100% of data sets in training phase

Model / Data Set   No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
SWMM TRAINING      7                  0.2643     0.8421          0.1003   113.2260
SWMM TEST Set 1    7                  0.1221     0.4373          0.2013   102.1902
SWMM TEST Set 2    7                  0.4220     0.6821          0.2011   46.5643
SWMM TEST Set 3    7                  0.5336     0.6001          0.1092   34.3310
SWMM TEST Set 4    7                  0.2173     0.7565          0.1237   97.5860
SWMM TEST Set 5    7                  0.4381     0.8330          0.1902   103.4751
cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.53(b): Results of SWMM Model for Sg. Bekok catchment using 50% of data sets in training phase

Model / Data Set   No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
SWMM TRAINING      7                  0.2635     0.5644          0.1020   107.8650
SWMM TEST Set 1    7                  0.3433     0.5253          0.2011   76.4595
SWMM TEST Set 2    7                  0.4929     0.4535          0.1104   83.2752
SWMM TEST Set 3    7                  0.3001     0.4768          0.1007   67.8901
SWMM TEST Set 4    7                  0.1021     0.6218          0.1467   90.6321
SWMM TEST Set 5    7                  0.1388     0.8764          0.1623   145.2320
cumecs: cubic metres per second; COC: coefficient of correlation

Table 4.53(c): Results of SWMM Model for Sg. Bekok catchment using 25% of data sets in training phase

Model / Data Set   No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
SWMM TRAINING      7                  0.5320     0.4532          0.2231   35.6701
SWMM TEST Set 1    7                  0.3515     0.4012          0.4519   74.5109
SWMM TEST Set 2    7                  0.4310     0.6421          0.2350   76.6534
SWMM TEST Set 3    7                  0.3001     0.7509          0.2754   97.6503
SWMM TEST Set 4    7                  0.2054     0.8431          0.1176   100.2210
SWMM TEST Set 5    7                  0.4012     0.7210          0.1324   55.4341
cumecs: cubic metres per second; COC: coefficient of correlation

The SWMM model performs slightly better than the HEC-HMS model, with a small improvement in terms of the coefficient of correlation and the error analysis. For the Sungai Bekok catchment, the SWMM model yields R² values below 60%, which is unsatisfactory. The MAPE results for Sungai Bekok are above 30%. According to Johnson and King (1988), this is not considered a reasonable prediction model. In general, the performance of the SWMM model is moderate during the calibration and verification phases. During the calibration phase, the RMSE for Sungai Bekok is consistently below 0.85 cumecs. The RRMSE values are below 0.23 for this model.
During verification, the RMSE is less than 0.88 cumecs and the RRMSE is less than 0.46. For the SWMM model, the RMSE and RRMSE results show that the model calibrated using the dry set of data (set 4) approximates the rainfall-runoff process in the catchment more accurately than the model calibrated using the wet set of data (set 5).

Tables G4.54(a) to G4.54(c) in Appendix G (Part A) present the R², RMSE, RRMSE, and MAPE obtained from the SWMM model for the Sungai Ketil catchment. For the Sungai Ketil catchment, the SWMM model yields R² values below 80%, which is considered moderate. Most of the MAPE results for Sungai Ketil are around 30%. According to Johnson and King (1988), this is considered a reasonable prediction model. In general, the performance of the SWMM model is moderate during the calibration and verification phases. During the calibration phase, the RMSE for Sungai Ketil is consistently below 0.45 cumecs, and the RRMSE is below 0.06. During verification, the RMSE is less than 0.68 cumecs and the RRMSE is less than 0.035, approaching zero.

Tables G4.55(a) to G4.55(c) in Appendix G (Part A) present the R², RMSE, RRMSE, and MAPE obtained from the SWMM model for the Sungai Klang catchment. For the Sungai Klang catchment, the SWMM model yields R² values below 50%, which indicates poor and unsatisfactory performance. The MAPE results for Sungai Klang are more than 30%. According to Johnson and King (1988), this is considered an unreasonable prediction. In general, the performance of the SWMM model is unsatisfactory for modelling the rainfall-runoff relationship in the Sungai Klang catchment. During the calibration phase, the RMSE for Sungai Klang is consistently below 15.5 cumecs, and the RRMSE is below 0.66. During verification, the RMSE is less than 28 cumecs and the RRMSE is less than 0.91.

Tables G4.56(a) to G4.56(c) in Appendix G (Part A) present the R², RMSE, RRMSE, and MAPE obtained from the SWMM model for the Sungai Slim catchment.
For the Sungai Slim catchment, the SWMM model yields R² values below 60%, which is unsatisfactory. The MAPE values for Sungai Slim are also more than 30%, which, according to Johnson and King (1988), does not qualify as a reasonable prediction. In general, the performance of the SWMM model is unsatisfactory for modelling the rainfall-runoff relationship in the Sungai Slim catchment. During the calibration phase, the RMSE for Sungai Slim is consistently below 0.09 cumecs and the RRMSE values are below 0.002. During verification, the RMSE is less than 0.16 cumecs and the RRMSE is less than 0.12.

4.6.3 Results of Hourly SWMM Model

Tables 4.57(a) to 4.57(b) present the R², RMSE, RRMSE, and MAPE resulting from the SWMM model of the hourly rainfall-runoff relationship for the Sungai Bekok catchment. For this catchment, the SWMM model yields R² values below 71%, which indicates poor and unsatisfactory performance. According to Johnson and King (1988), the modelling results for Sungai Bekok with MAPE around 30% are considered a reasonable prediction. In general, the performance of the SWMM model is better than that of the HEC-HMS model. The MLP and RBF models are clearly better than the SWMM model in the calibration and verification phases.

Table 4.57(a): Results of SWMM Model for Sg. Bekok catchment using 25% of data sets in training phase

MODEL / Data Set     No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
SWMM TRAINING        7                  0.5221     0.6574          0.0937    21.0921
SWMM TEST Set 1      7                  0.3049     0.2029          0.0492    27.8877
SWMM TEST Set 2      7                  0.1854     0.3430          0.0323    58.7964
SWMM TEST Set 3      7                  0.1012     0.4708          0.1090    81.0303
SWMM TEST Set 4      7                  0.1102     0.2029          0.1093    39.8821
SWMM TEST Set 5      7                  0.7071     0.3029          0.0080    30.8991
cumecs = cubic metres per second; COC = coefficient of correlation

Table 4.57(b): Results of SWMM Model for Sg. Bekok catchment using minimum data sets in training phase

MODEL / Data Set     No. of Parameter   COC (R²)   RMSE (cumecs)   RRMSE    MAPE (%)
SWMM TRAINING        7                  0.6983     0.4292          0.0494    28.9690
SWMM TEST Set 1      7                  0.6303     0.3605          0.0129    30.5830
SWMM TEST Set 2      7                  0.3626     0.8332          0.1103   120.1891
SWMM TEST Set 3      7                  0.2949     0.8574          0.1563   116.7541
SWMM TEST Set 4      7                  0.3829     1.3292          0.4039   149.7655
SWMM TEST Set 5      7                  0.3029     0.5079          0.0293    76.2327
cumecs = cubic metres per second; COC = coefficient of correlation

Tables G4.58(a) to G4.58(b) in Appendix G (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the SWMM model for the Sungai Ketil catchment. For this catchment, the SWMM model yields R² values below 70%, which also indicates poor and unsatisfactory performance. According to Johnson and King (1988), the modelling results for Sungai Ketil with MAPE of more than 30% are not considered a reasonable prediction. In general, the MLP and RBF models are better than the SWMM model in the calibration and verification phases.

Tables G4.59(a) to G4.59(b) in Appendix G (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the SWMM model for the Sungai Klang catchment. For this catchment, the SWMM model yields R² values in the range of 10% to 85%, which indicates poor and unreliable performance. According to Johnson and King (1988), the modelling results for Sungai Klang with MAPE of more than 30% are not considered a reasonable prediction. In general, the MLP and RBF models are better than the SWMM model in the calibration and verification phases.

Tables G4.60(a) to G4.60(b) in Appendix G (Part B) present the R², RMSE, RRMSE, and MAPE resulting from the SWMM model for the Sungai Slim catchment. For this catchment, the SWMM model yields R² values below 75%, which indicates poor and unsatisfactory performance. According to Johnson and King (1988), the modelling results for Sungai Slim with MAPE of more than 30% are not considered a reasonable prediction.
However, some of the simulations show a reasonable prediction, with MAPE around 30%. In general, the performance of the model is moderate. The MLP and RBF models are better than the SWMM model in the calibration and verification phases.

4.6.4 Verification

Model verification is the process of verifying the correctness of the model parameters for a catchment. In general, the procedure can be summarized as follows:
(i) Pick a data series for a period of time that has not been used by the model for parameter calibration.
(ii) Simulate the runoff for the rainfall events of that period.
(iii) Compare the results of the simulation with the observed values.
(iv) If the results are within a specified error range, the model is verified for that catchment.

Many different parameter sets may give good fits to the data, and it may be very difficult to decide whether one is better than another. Furthermore, a parameter set chosen for one period of observations may not be the optimum set for another period. In dealing with this problem during model calibration and testing, the SWMM model offers more useful features than the HEC-HMS model. The graphical features of the SWMM model make the modelling process clearer and easier, so the parameter sets can be adjusted more easily. This study has found that, even with an intensive series of measurements of parameter values, the results have not been entirely satisfactory. If the concept of an optimum parameter set must be superseded by the idea that many possible parameter sets, and perhaps models, may provide acceptable simulations of the response of a particular catchment, then it follows that validation of those models may be equally difficult. In fact, rejecting some of the acceptable models as additional data become available may be much more practical than suggesting that the models might be validated.
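The verification steps (i) to (iv) of Section 4.6.4 can be illustrated with a minimal Python sketch. It is not the procedure coded in this study: the function names, the toy model and the numbers are hypothetical, R² is assumed to be the squared Pearson correlation, RRMSE is assumed to be the RMSE divided by the mean observed flow, and the 30% MAPE limit is taken from the Johnson and King (1988) criterion quoted above.

```python
import math


def fit_statistics(observed, simulated):
    """Return R2, RMSE, RRMSE and MAPE for two equal-length flow series."""
    n = len(observed)
    if n == 0 or n != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed flow is zero")
    pairs = list(zip(observed, simulated))
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in pairs)
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    # Assumption: R2 is the squared Pearson correlation coefficient.
    if var_obs > 0 and var_sim > 0:
        r2 = cov ** 2 / (var_obs * var_sim)
    else:
        r2 = float("nan")
    rmse = math.sqrt(sum((o - s) ** 2 for o, s in pairs) / n)
    # Assumption: RRMSE is the RMSE divided by the mean observed flow.
    rrmse = rmse / mean_obs
    mape = 100.0 / n * sum(abs((o - s) / o) for o, s in pairs)
    return {"r2": r2, "rmse": rmse, "rrmse": rrmse, "mape": mape}


def verify(simulate, rainfall, observed_flow, max_mape=30.0):
    """Steps (ii) to (iv): simulate, compare, and test against an error range."""
    simulated = simulate(rainfall)
    stats = fit_statistics(observed_flow, simulated)
    return stats["mape"] <= max_mape, stats


def toy_model(rainfall):
    """Hypothetical calibrated model: runoff is a fixed fraction of rainfall."""
    return [0.5 * r for r in rainfall]


# Step (i): a period that was not used for calibration (made-up values).
verification_rain = [2.0, 4.0, 8.0, 6.0]
verification_flow = [1.1, 1.9, 4.2, 2.8]

accepted, stats = verify(toy_model, verification_rain, verification_flow)
print("verified:", accepted)
print({name: round(value, 4) for name, value in stats.items()})
```

Reporting all four statistics together, rather than a single one, follows the practice used for the tables in this chapter; only the acceptance threshold is reduced to MAPE here to keep the sketch short.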
4.6.5 Robustness Test

As with the other models, the robustness tests for the SWMM model were carried out using the same different sets of data, covering different periods of time, for daily and hourly rainfall-runoff modelling. The performance of each model is again measured by the coefficient of correlation (R²), root mean square error (RMSE), relative root mean square error (RRMSE), and mean absolute percentage error (MAPE). The results show that the SWMM model has lower calibration and verification accuracy than the best-fit networks. For the SWMM model, the R² values range from 0.1 to 0.7 for calibration and verification. In addition, the MAPE estimates for this model are comparatively high, at more than 30 percent of the observed values.

The main limitation of the SWMM model is that it is unable to handle a large number of data sets. It also involves many parameters that must be approximated during model calibration, so it may not be properly calibrated. A further disadvantage is that a trial and error method has to be used to find the parameters that produce the best-fit results. The model therefore takes longer to be adequately calibrated and may not achieve the desired level of accuracy.

In general, the SWMM model demonstrates less capability in rainfall-runoff modelling. In modelling the continuous daily and hourly rainfall-runoff relationship, the SWMM model displays unsatisfactory performance. The results of R² and the error analysis make it clear that the SWMM model is not robust compared with the MLP and RBF models.

4.7 Discussions on the Rainfall-Runoff Modelling

The nonlinear nature of the rainfall-runoff relationship makes it appropriate for the application of ANN methods. The proper use of a neural network requires not only a physical understanding of the hydrological process under consideration but also knowledge of neural networks and their system operation.
Trying to extract rules from a network, or to give it some explanation capability, will entail extra computing effort. These fundamental aspects lead to the construction of good training and validation data sets, the selection and inclusion of relevant input variables, the development of proper neural network architectures, and the selection of training algorithms.

The results of the ANN models show that the neural network model performs better than the MLR, HEC-HMS and SWMM models in modelling the rainfall-runoff relationship. The neural network is able to predict runoff accurately using rainfall data as the input variable. This study models the rainfall-runoff relationship with rainfall as the input and runoff as the output. In general, the ANN models, namely the MLP and RBF networks, are successful in modelling the daily and hourly rainfall-runoff process. The HEC-HMS and SWMM models are considered not capable: their results show unsatisfactory performance in modelling the daily and hourly rainfall-runoff process, without good agreement between the input and output of the rainfall-runoff relationship. The variation of the peak flow predictions is not significant. The MLR model, meanwhile, is considered moderate for estimating runoff, with moderate training and validation accuracy. However, the results of R² and the error analysis show that the MLR model is not consistent and not robust compared with the MLP and RBF models. This is because the peak flows for calibration and validation are significantly low for this model.

This study revealed that the MLR, HEC-HMS, and SWMM models are not very sensitive to the number of observations in the calibration or training data. In fact, when the models become more complex with the addition of more input variables, the variation in calibration and validation accuracy becomes slightly lower.
Meanwhile, for the MLP and RBF networks, the amount of data required increases as the complexity of the model increases, so the variation in training and testing accuracy becomes more pronounced. Based on the training (calibration) results, the MLP and RBF networks were selected as the best-fit networks for modelling the rainfall-runoff relationship in the selected catchments because of their high training and testing accuracy. For hourly rainfall-runoff modelling, the networks estimate the peak runoffs, or high flows, close to their observed values. Overall, the networks provide fairly accurate predictions of low and high flow conditions for training and testing. For daily rainfall-runoff modelling, some of the low flows and high flows are over- or under-predicted for the training and testing average-year data. The flow predictions are nevertheless considered fairly accurate, with good agreement in the goodness-of-fit tests.

The MLP model performed better than the RBF model in modelling the continuous daily rainfall-runoff relationship. This is because the MLP model uses a back-propagation algorithm to yield best-fit results and requires a long period of data for model calibration. The RBF model, in turn, performed better than the MLP model in modelling hourly rainfall-runoff hydrographs, or short event hydrographs. In general, the MLP and RBF networks show slightly better performance in both the training and testing periods, with good agreement of the RMSE and RRMSE results, indicating the best fit to the data.

In general, for the four selected catchments, the ANN model provided higher training and testing accuracy than the MLR model. Based on the goodness-of-fit statistics, the accuracy of the ANN compared favourably with that of the existing technique.
The MLP and RBF networks show slightly better performance than the MLR model in both the training and validation phases. The results show that the MLP and RBF models give a lower error than the MLR model, with good agreement of the RMSE and RRMSE results, indicating the best fit to the data. However, the MLR model is observed to be consistent and robust compared with the HEC-HMS and SWMM models. Although the R² values for training and testing increased slightly, the results of RMSE, RRMSE and MAPE are still fairly poor compared to the HEC-HMS and SWMM models.

For the four selected catchments, the ANN model provided higher training and testing accuracy than the HEC-HMS model. Based on the goodness-of-fit statistics, the accuracy of the ANN compared favourably with that of the HEC-HMS model. The study demonstrates that neural network models based on MLP and RBF networks are more suitable for modelling the rainfall-runoff relationship than the HEC-HMS model. The MLP and RBF models perform better than the HEC-HMS model, with good agreement of the RMSE and RRMSE results in the training and validation phases, indicating the best fit to the data. The HEC-HMS model gives a higher error than the MLP and RBF models and a lower degree of efficiency in terms of the coefficient of correlation, RMSE and RRMSE.

The study also demonstrates that neural network models based on MLP and RBF networks are more reliable and suitable for modelling the daily and hourly rainfall-runoff relationship than the SWMM model. For the four selected catchments, the ANN model provided higher training and testing accuracy than the SWMM model, as verified by the goodness-of-fit results. The SWMM model gives a higher error than the MLP and RBF models and a lower degree of efficiency, but it shows an improvement over the HEC-HMS model. Like the HEC-HMS model, it also takes a longer time to calibrate.
A trial and error procedure is applied to find the best parameters for the SWMM model. The MLP and RBF models perform better than the SWMM model, with good agreement of the goodness-of-fit results in the training and validation phases, indicating the best fit to the data. Furthermore, the MLP and RBF methods have been shown to handle non-linear processes within the catchment more easily than the XP-SWMM model.

4.7.1 Basic Model Structure

In this study, the first important step was to develop the model structure for the various models. This was carried out using a trial and error method, together with some professional judgement, to define the optimal number of input nodes. This step is particularly important for the neural network models, because no method or formula can be used to model the structure of watersheds, given their complexity and nonlinearity. The only way to develop the network structure is by using appropriate procedures or rules.

The number of nodes in the hidden layer was determined by trial and error for each case. If the number of hidden nodes is too small, the network can under-fit the data and may not achieve the desired level of accuracy; with too many nodes, it takes a long time to be adequately trained and may sometimes over-fit the data. French et al. (1992) proposed that neural networks normally be developed using 15, 30, 45, 60 and 100 hidden nodes. This procedure is also intended to examine the performance of a neural network model with different numbers of hidden nodes and hidden layers. In this study, this procedure was not considered relevant, because every additional node yields different results and may contribute to the consistency and accuracy of the model.
For example, networks were developed using 5, 6, 7, 8 and so on hidden nodes, until the optimum number of nodes that produced highly accurate and reliable results was reached. Overall, greatly increasing the number of hidden layers and hidden nodes in the model increases the complexity of the system and may slow the calibration process without substantially improving the efficiency of the network. It is therefore proposed to increase the number of nodes in the hidden layer one by one.

In general, there is no significant difference between the results of the 3-layer MLP (with one hidden layer) and the 4-layer MLP (with two hidden layers) in modelling the daily and hourly rainfall-runoff relationship for the selected catchments. Although the two-hidden-layer networks provide slightly better training accuracy than the one-hidden-layer networks, the latter have significantly better testing accuracy. The increase in training accuracy in the two-hidden-layer networks is due to over-fitting. As a result, the one-hidden-layer networks provide better testing accuracy. There is no significant difference in the goodness-of-fit values between the one- and two-hidden-layer networks for either the training or the testing phase.

In the literature, researchers hold that a network with one hidden layer is enough. This is true for situations where the data contain enough information on the system of interest. The results of this study indicate that two-hidden-layer networks do not lead to an increase in prediction accuracy. Master (1993) likewise suggested that one hidden layer is sufficient for an MLP, because in most problems a second hidden layer does not produce a large improvement in performance. The use of more than one hidden layer substantially increases the number of parameters to be estimated.
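The growth in the number of parameters can be made concrete with a short Python sketch. It assumes a fully connected MLP with one bias per node and a single output node; the function name and the layer sizes (sixteen inputs, eight nodes per hidden layer) are illustrative values only and are not taken from the networks trained in this study.

```python
def mlp_parameter_count(n_inputs, hidden_layers, n_outputs=1):
    """Count weights and biases in a fully connected MLP.

    Assumption: every node in a layer is connected to every node in the
    previous layer and carries one bias term.
    """
    sizes = [n_inputs, *hidden_layers, n_outputs]
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))


# Illustrative sizes only: 16 inputs, 8 nodes per hidden layer, 1 output.
one_hidden = mlp_parameter_count(16, [8])      # 3-layer MLP -> 145
two_hidden = mlp_parameter_count(16, [8, 8])   # 4-layer MLP -> 217
print(one_hidden, two_hidden)
```

With these illustrative sizes, the second hidden layer adds 72 parameters to be estimated from the same training data, which is consistent with the over-fitting observed for the two-hidden-layer networks.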
In general, increasing the number of hidden layers and hidden nodes in the model increases the complexity of the system and may slow the calibration process without substantially improving the efficiency of the network. The accuracy of the MLP increases as more input data are made available to it.

Harun (1999) proposed the stepwise regression technique to determine the number of input nodes for the input layer of a neural network. This technique selects only certain variables that contribute to the model. This study has determined that the trial and error method is more relevant and practical. It was found that every single input contributed to the model and substantially improved the efficiency of the network. The process is repeated until the optimal number of input nodes for the network is found.

Harun (1999) proposed using a multilayer perceptron with only one type of transfer function, namely the binary sigmoid. Other types of activation function are more dynamic and allow the weights and biases to be initialized to random values between -1 and +1, such as the hyperbolic tangent function (tansig). Many previous studies have used this kind of activation function. The hyperbolic tangent and sigmoid functions are often employed as transfer functions in network training (Tokar and Johnson, 1999).

Tokar (1996) modelled daily runoff as a function of daily precipitation, temperature and snowmelt for two watersheds, the Little Patuxent River, Maryland and the Independence River, New York, and compared the neural network model with regression and simple conceptual models. Tokar concluded that the ANN model provided higher training and testing accuracy than the regression model and a simple conceptual model.
Tokar found that networks trained using two years of data provided slightly better training and testing accuracy, and also higher percent predictions of peak discharges, than networks trained using one year of data. This work has found that the neural network model can perform well and accurately in the training and testing phases when the networks are trained using a longer period of data. The networks were trained using 100%, 50% and 25% of four years of data. The networks trained using 100% of the data produced more accurate and consistent results than the networks trained using 50% of the data. Tokar also concluded that the accuracy of the network trained using the hyperbolic tangent activation function was slightly better than that of the network trained using sigmoid transfer functions.

To evaluate network reliability for future predictions, Tokar proposed using one goodness-of-fit statistic, namely the root mean square error (RMSE). Several goodness-of-fit statistics can be applied, however, and relying on a single statistic to measure accuracy is not adequate. More than one measure should be tried before drawing any conclusion, in order to avoid bias. Tokar also concluded that ANN models are not very sensitive to the selection of the number of neurons and layers in the network. This study has found that the neural network model is sensitive to the selection of the number of neurons in the network. When the number of hidden nodes is increased, the number of parameters increases too. It can also be concluded that the number of hidden nodes is sensitive to the selection of the number of input variables. In this study, trial and error procedures are adopted to develop the network structures for the selected watersheds. The structures may differ from each other depending on the time interval and on the quality and quantity of the input-output data pairs.
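The one-by-one trial and error search for the number of hidden nodes can be sketched in Python as follows. This is only an outline under stated assumptions: `testing_error_for` stands for any user-supplied routine that trains a network with the given number of hidden nodes and returns its testing error, the stopping rule (`patience` consecutive sizes without improvement) is an assumption added for the sketch, and the error curve used in the example is synthetic rather than a result of this study.

```python
def search_hidden_nodes(testing_error_for, start=1, max_nodes=30,
                        patience=3, min_gain=1e-3):
    """Increase the number of hidden nodes one at a time.

    testing_error_for(n) must return the testing error of a network with
    n hidden nodes. The search stops once `patience` consecutive sizes
    fail to improve the best error by more than `min_gain`.
    """
    best_nodes, best_error = None, float("inf")
    stale = 0
    for n_nodes in range(start, max_nodes + 1):
        error = testing_error_for(n_nodes)
        if error < best_error - min_gain:
            best_nodes, best_error, stale = n_nodes, error, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_nodes, best_error


def synthetic_error(n_nodes):
    """Made-up error curve: under-fit below 7 nodes, over-fit above."""
    return (n_nodes - 7) ** 2 / 100 + 0.2


best_nodes, best_error = search_hidden_nodes(synthetic_error)
print(best_nodes, round(best_error, 4))
```

Scoring each candidate on testing data rather than training data is a deliberate choice in this sketch, since training accuracy alone keeps improving as nodes are added and would hide over-fitting.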
The rainfall at the current and previous times, t and t − i (where i = 1, 2, …), appears to contain the information needed to model rainfall-runoff processes, owing to its high intercorrelation with the runoff data at time t. In most commonly used rainfall-runoff models, such as the Martinec model, runoff for the previous time period is included as an input variable to the model (Hawley, 1979). Accordingly, this study found that including the runoff at previous times (t − i) enables the models to produce good results. This was observed especially in developed areas such as the Sungai Klang catchment, where the characteristics of the catchment are very complex and the data contain many highly extreme events. For the Sungai Klang catchment, the runoff at previous times (t − 1) and (t − 2) made a significant contribution to the ability of the hourly ANN model to produce more accurate predictions.

The following ANN model structures are proposed for the prediction of runoff using the available rainfall measurements from rain gauges located at the site. The models describe the relationships between daily and hourly rainfall and runoff using the available data series. Sections 4.7.1.1 and 4.7.1.2 describe the optimal numbers of input nodes for the daily and hourly rainfall-runoff models respectively.

4.7.1.1 Optimal numbers of input nodes for daily rainfall-runoff models

This section presents the model structure of the daily rainfall-runoff relationship for each catchment, obtained using the trial and error method. The model structures proposed for each catchment are as follows:

(i) Sungai Bekok catchment

$$y(t) = f\{x(t),\ x(t-1),\ \ldots,\ x(t-15)\} \qquad (4.13)$$

For the Sungai Bekok catchment, it was observed that the rainfalls at the current and previous times, from t to (t − 15), made significant contributions to the accuracy of the ANN network structure.
Overall, the model structure for this catchment has sixteen inputs. Figure 4.25(a) and Figure 4.25(b) show the architecture of the 3-layer and 4-layer MLP network structures for the Sg. Bekok catchment respectively.

(ii) Sungai Ketil catchment

$$y(t) = f\{x(t),\ x(t-1),\ \ldots,\ x(t-15)\} \qquad (4.14)$$

For the Sungai Ketil catchment, it was observed that the rainfalls at the current and previous times, from t to (t − 15), made significant contributions and gave the most accurate ANN network structure. The model structure for this catchment therefore also has sixteen inputs. Figure J4.26(a) and Figure J4.26(b) in Appendix J show the architecture of the 3-layer and 4-layer MLP network structures for the Sg. Ketil catchment respectively.

(iii) Sungai Klang catchment

$$y(t) = f\{x(t),\ x(t-1),\ \ldots,\ x(t-15),\ x(t-16)\} \qquad (4.15)$$

For the Sungai Klang catchment, it was observed that the rainfalls at the current and previous times, from t to (t − 16), made significant contributions and gave the most accurate ANN network structure. Overall, the model structure has seventeen inputs, which reflects a more complex and highly non-linear rainfall-runoff relationship in this catchment. Figure J4.27(a) and Figure J4.27(b) in Appendix J show the architecture of the 3-layer and 4-layer MLP network structures for the Sg. Klang catchment respectively.

(iv) Sungai Slim catchment

$$y(t) = f\{x(t),\ x(t-1),\ \ldots,\ x(t-15)\} \qquad (4.16)$$

For the Sungai Slim catchment, it was observed that the rainfalls at the current and previous times, from t to (t − 15), made significant contributions and gave the most accurate ANN network structure. The model structure for this catchment therefore also has sixteen inputs. Figure J4.28(a) and Figure J4.28(b) in Appendix J show the architecture of the 3-layer and 4-layer MLP network structures for the Sg. Slim catchment respectively.
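The daily model structures above amount to turning a rainfall series into rows of lagged inputs. The Python sketch below shows one way this could be done; the function name, the argument names and the short made-up series are hypothetical, and the optional `runoff_lags` argument is included only to show how previous runoff values, discussed earlier in Section 4.7.1, could be appended in the same way.

```python
def build_patterns(rainfall, runoff, rain_lags, runoff_lags=0):
    """Build input-output pairs for a lagged rainfall-runoff model.

    Each input row is
        [x(t), x(t-1), ..., x(t-rain_lags), y(t-1), ..., y(t-runoff_lags)]
    and the matching target is y(t).
    """
    if len(rainfall) != len(runoff):
        raise ValueError("rainfall and runoff must have the same length")
    first = max(rain_lags, runoff_lags)  # earliest t with all lags available
    inputs, targets = [], []
    for t in range(first, len(rainfall)):
        row = [rainfall[t - i] for i in range(rain_lags + 1)]
        row += [runoff[t - j] for j in range(1, runoff_lags + 1)]
        inputs.append(row)
        targets.append(runoff[t])
    return inputs, targets


# Made-up series of 20 daily values, for illustration only.
rain = [float(day) for day in range(20)]
flow = [10.0 * r for r in rain]

# Structure of equation (4.13): x(t) ... x(t-15), giving sixteen inputs.
daily_inputs, daily_targets = build_patterns(rain, flow, rain_lags=15)
print(len(daily_inputs), len(daily_inputs[0]))
```

The first fifteen time steps are dropped because their lagged rainfall values are not available, so a series of N days yields N − 15 training patterns for the sixteen-input structure.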
4.7.1.2 Optimal numbers of input nodes for hourly rainfall-runoff models

This section presents the model structure of the hourly rainfall-runoff relationship for each catchment, obtained using the trial and error method. The model structures proposed for each catchment are as follows:

(i) Sungai Bekok catchment

$$y(t) = f\{x(t),\ x(t-1),\ x(t-2),\ x(t-3),\ x(t-4),\ x(t-5),\ y(t-1)\} \qquad (4.17)$$

For the Sungai Bekok catchment, it was observed that the rainfalls at the current and previous times, from t to (t − 5), and the flow at the previous time (t − 1) made significant contributions and gave the most accurate hourly ANN network structure. The model structure for this catchment therefore has seven inputs. Figure 4.29(a) and Figure 4.29(b) show the architecture of the 3-layer and 4-layer MLP network structures for the Sg. Bekok catchment respectively.

(ii) Sungai Ketil catchment

$$y(t) = f\{x(t),\ x(t-1),\ x(t-2),\ x(t-3),\ x(t-4),\ y(t-1)\} \qquad (4.18)$$

For the Sungai Ketil catchment, it was observed that the rainfalls at the current and previous times, from t to (t − 4), and the flow at the previous time (t − 1) made significant contributions and gave the most accurate ANN network structure. Overall, the model structure for this catchment has six inputs. Figure J4.30(a) and Figure J4.30(b) in Appendix J show the architecture of the 3-layer and 4-layer MLP network structures for the Sg. Ketil catchment respectively.

(iii) Sungai Klang catchment

$$y(t) = f\{x(t),\ x(t-1),\ x(t-2),\ x(t-3),\ x(t-4),\ x(t-5),\ y(t-1),\ y(t-2)\} \qquad (4.19)$$

For the Sungai Klang catchment, it was observed that the rainfalls at the current and previous times, from t to (t − 5), and the flows at the previous times (t − 1) and (t − 2) made significant contributions and gave the most accurate ANN network structure. The model structure for this catchment therefore has eight inputs.
Again, this reflects that the rainfall-runoff relationship in this catchment is relatively complex and non-linear. Figure J4.31(a) and Figure J4.31(b) in Appendix J show the architecture of the 3-layer and 4-layer MLP network structures for the Sg. Klang catchment respectively.

(iv) Sungai Slim catchment

$$y(t) = f\{x(t),\ x(t-1),\ x(t-2),\ x(t-3),\ x(t-4),\ x(t-5),\ y(t-1)\} \qquad (4.20)$$

For the Sungai Slim catchment, it was observed that the rainfalls at the current and previous times, from t to (t − 5), and the flow at the previous time (t − 1) made significant contributions and gave the most accurate ANN network structure. The model structure for this catchment also has seven inputs. Figure J4.32(a) and Figure J4.32(b) in Appendix J show the architecture of the 3-layer and 4-layer MLP network structures for the Sg. Slim catchment respectively.

The selection of training data that represent the characteristics of a catchment and its rainfall patterns is extremely important in modelling. The value of y(t − 1) in the model structures can probably represent the soil moisture content or the water table condition in the study area. The training data should be large enough to contain the characteristics of the catchment and to accommodate the requirements of the ANN architecture. Although the network architecture depends highly on the complexity of the mapping between the input and output variables, it is controlled by the availability of data, especially in applications where data are limited. Using a large number of neurons or extra layers in a network cannot help to detect patterns in the phenomenon that are not included in the training data. The selection of the number of nodes or neurons in a layer and the number of layers in a network has a significant effect on the performance of the neural network models and on the time spent training the networks.
A network with a large number of nodes and layers is usually very slow and requires large training sets. However, it has the ability to learn more complex patterns. Meanwhile, if the number of parameters in the network is much smaller than the total number of points in the training set, there is little or no chance of over-fitting. If more data are used and the size of the training set is increased, there is no need to worry about the problem of over-fitting. It can therefore be concluded that the selection of the size of the network is problem specific and can be accomplished by experience and experimentation.

Figure 4.25(a) The 3-layer MLP network structures of the daily model for Sg. Bekok catchment.

Figure 4.25(b) The 4-layer MLP network structures of the daily model for Sg. Bekok catchment.

Figure 4.29(a) The 3-layer MLP network structures of the hourly model for Sg. Bekok catchment.

Figure 4.29(b) The 4-layer MLP network structures of the hourly model for Sg. Bekok catchment.

4.7.2 Model Performance

The study also demonstrates that neural network models based on the MLP and RBF networks are suitable for modelling the rainfall-runoff relationship. With a good training process and suitable algorithms and nodes, the prediction is more accurate. The GRNN algorithms are 'fast learners', and the RBF network could predict runoff accurately, with good agreement between the observed and predicted values. The RBF model has proven to be a robust model for the rainfall-runoff relationship. It was successful for single-storm events and multiple-storm events. The MLP and RBF methods are highly recommended for rainfall-runoff modelling problems.
In the literature, the ANN methodology has been reported to provide reasonably good solutions for complex systems that may be poorly defined or understood using mathematical equations, for problems that deal with noise or involve pattern recognition, and for situations where the input-output data are incomplete and ambiguous. Because of these characteristics, it was believed that ANN could be applied to model the daily and hourly rainfall-runoff relationship. In the previous chapter, it was demonstrated that the ANN rainfall-runoff models are able to extract patterns from the training data. The training and testing accuracy, which is satisfactory according to the goodness-of-fit tests, supports this conclusion.

The study demonstrated that the neural network model based on the MLP is suitable for modelling the rainfall-runoff relationship. It is able to learn rainfall-runoff data from spatially different locations. The MLP has been identified as a robust model for the rainfall-runoff relationship. It can accurately model the storm hydrograph for single-storm and multiple-storm events. The predicted peak discharge and time to peak are in close agreement with the actual values. The application of the MLP and RBF networks to model the hourly streamflow hydrograph was successful.

The results of rainfall-runoff modelling indicate that the MLP and RBF methods are more accurate for the Sungai Bekok catchment than for the Sungai Klang catchment. Normally, for a large catchment, the river flow is highly nonlinear and influenced by storage effects. The river flow can also be influenced by the land use of the catchment, especially in fully developed areas. In addition, the effects of spatial rainfall and control structures may contribute to the complexity of the system. The rainfall-runoff models that were developed are site-specific and can only be applied for prediction of daily and hourly runoff in the original catchments.
If there are any major changes in the catchment characteristics, such as heavy development or urbanization, deforestation, or changes in the river characteristics, the models should be retrained using additional data that account for these changes.

The application of the neural network method to modelling the daily and hourly rainfall-runoff relationship for Sungai Bekok, Sungai Ketil, Sungai Klang, and Sungai Slim is satisfactory. The results indicate that the performance of the neural network model is satisfactory and that the method is feasible for rainfall-runoff modelling in Malaysian catchments. The inaccuracy of the model could be reduced by using a longer period of training data containing many peak-discharge events, especially for urban catchments that suffer from missing data, such as the Sungai Klang catchment.

Using the ANN methodology, various combinations of rainfall and runoff at the present and previous time periods were trained and tested in order to develop a runoff model for the selected catchments. Based on the discussion provided for the catchments, one- and two-hidden-layer networks were used to train the MLP networks. Based on the training results for the selected catchments, the sensitivity of the runoff to rainfall at the present and previous time periods was examined using the goodness-of-fit tests. Results of R², RMSE, RRMSE, and MAPE for the Sungai Bekok, Sungai Ketil, Sungai Klang and Sungai Slim catchments show that the MLP and RBF models consistently perform better than the MLR, HEC-HMS and XP-SWMM models.

In daily and hourly rainfall-runoff modelling, the training and testing accuracy of the MLP and RBF models improved slightly when an appropriate number of input values at previous time periods was included. For the daily rainfall-runoff modelling of the Sg. Bekok, Sg. Ketil and Sg. Slim catchments, the addition of rainfall at time t − 16 did not have a large impact on the prediction accuracy.
However, including rainfall inputs up to time t − 15 was appropriate for producing good results. Meanwhile, for the Sg. Klang catchment, the addition of rainfall at time t − 17 did not have a large impact on the prediction accuracy. This result was expected, since the Sg. Klang catchment is mainly driven by rainfall, which has a very wide range of recorded minimum, maximum and average values. Overall, it can be concluded that, for the daily ANN models, rainfall data alone were adequate to represent the runoff processes in these particular catchments.

For the hourly rainfall-runoff modelling of the Sg. Bekok, Sg. Klang and Sg. Slim catchments, the addition of rainfall at time t − 6 did not have a large impact on the prediction accuracy. Meanwhile, for the Sg. Ketil catchment, the addition of rainfall at time t − 5 did not have a large impact on the prediction accuracy. For the Sg. Bekok, Sg. Ketil and Sg. Slim catchments, adding runoff data at time t − 1 improved the percent flow estimates for testing and yielded good results. For the Sg. Klang catchment, the percent flow estimates for testing improved with the inclusion of runoff data at times t − 1 and t − 2.

Furthermore, an advantage of the RBF model is that it can be trained much faster than the MLP model. It was also found that ANN performance was strongly influenced by the level of non-linearity and the selection of training data. A large number of training data sets is required for successful training, especially for the MLP model. With a good training process and a suitable choice of algorithm and number of nodes, the prediction is more reliable and accurate. In other words, the MLP and RBF models are highly recommended for daily and hourly rainfall-runoff modelling problems.

The training process for the MLP network is time consuming.
If the architecture and the training algorithm are not suitable, the accuracy of the predictions and the network’s learning ability will be affected. The number of hidden layers and the number of hidden nodes significantly influence the performance of a network and the time taken to train the model. The time taken to calibrate (train) the 3-layer MLP network is less than that for the 4-layer network. For the MLP model, the computed errors are quite close even though the number of hidden layers was increased from one to two. This shows that increasing the number of hidden layers did not improve the performance of the model, and indicates that the 3-layer network structure is adequate to model the nonlinearity of the rainfall-runoff relationship in urban or rural catchments.

The validation process runs in parallel with the calibration process. The method used for validation is the early stopping method. Its objective is to monitor the progress of the calibration process, as described in Chapter 3, which makes the model more efficient and accurate.

In this study, it was also found that the performance of the model is affected by the number of nodes in the hidden layer. It is therefore important to determine an optimal and sufficient number of hidden nodes so that the network is fast, robust and stable. If the number of nodes is too large or too small, the model will not perform well and may not achieve the desired target. Furthermore, with too many nodes in the hidden layer, the model will take a long time to train. A suitable number of nodes in the hidden layer will therefore produce the best-fit results.

4.7.3 Transfer Function and Algorithm

In general, neural network structures usually employ a learning process that uses the derivative of the transfer function.
A transfer function that is differentiable (and hence continuous) everywhere is therefore required in these types of network. The transfer function, or activation function, used in the more advanced forms of network is most commonly a continuous function. In order to select a transfer function, the training and testing accuracy of a network using the hyperbolic tangent function was compared with that of a network using another activation function, the sigmoid function. The networks that use a hyperbolic tangent function provided slightly better training and testing accuracy than those that use a sigmoid function. In addition, the peak flows were predicted more closely to their observed values by the networks using the hyperbolic tangent transfer function. This result indicates that the hyperbolic tangent function is more flexible and reliable than the other functions in modelling the rainfall-runoff process for the selected catchments.

For the MLP model, the best results were achieved for the network with the hyperbolic tangent (tansig) activation function, the Levenberg-Marquardt (LM) algorithm, and an appropriate number of input nodes. The accuracy of the MLP network trained using the tangent sigmoid function was slightly better. Therefore, the hyperbolic tangent function was chosen as the transfer function for the networks developed in this study. This combination performed faster and more accurately near an error minimum.

Meanwhile, the back-propagation algorithm used in the training phase operates on a feedforward network. It is a good and popular algorithm because it works forward and backward: the inputs are passed forward to compute the output, and the output error is passed backward to adjust the weights, until the output is produced with minimum error. Each update depends on the current inputs and on the outputs of the previous pass. Therefore, the rainfall-runoff process can be properly modelled with a procedure using the back-propagation algorithm.
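The two transfer functions compared in this section can be sketched as follows. This is a minimal illustration only, not the code used in the study: the function names `logsig` and `tansig` follow the naming quoted above for the hyperbolic tangent, while the derivative helpers and the sample inputs are assumptions introduced here.

```python
import math

def logsig(x):
    """Logistic sigmoid transfer function; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tansig(x):
    """Hyperbolic tangent transfer function; output lies in (-1, 1)."""
    return math.tanh(x)

def logsig_deriv(x):
    """Derivative of the logistic sigmoid, as used in gradient-based training."""
    s = logsig(x)
    return s * (1.0 - s)

def tansig_deriv(x):
    """Derivative of the hyperbolic tangent."""
    t = math.tanh(x)
    return 1.0 - t * t

# Hypothetical sample inputs, chosen only to show the shape of each function.
for x in (-2.0, 0.0, 2.0):
    print(x, logsig(x), tansig(x), logsig_deriv(x), tansig_deriv(x))
```

Both functions are differentiable everywhere, which is the property required above. The hyperbolic tangent is a rescaled logistic sigmoid whose output is centred on zero and whose slope at the origin is four times larger. This may be one reason for the slightly better accuracy reported, although the study does not attribute the difference to any particular cause.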
Meanwhile, the RBF network can be described as a universal function approximator that uses combinations of basis functions centred on weight vectors to provide spatial estimates. The best results were achieved for the network with the Gaussian activation function, the GRNN algorithm, and an appropriate number of input nodes. Therefore, the Gaussian function was chosen as the transfer function for the RBF networks developed in this study. If the architecture and the training algorithm are not suitable, the accuracy of the predictions and the network’s learning ability will be affected. The accuracy of the RBF network trained using the Gaussian function was slightly better.

In both the MLP and RBF models, the number of input nodes significantly influences the performance of a network and the time taken to train the model. It is related to the complexity of the system being modelled and to the resolution of the data fit. The number of input nodes in the input layer was determined by trial and error for each case. If the number of input nodes is too small, the network can under-fit the data and may not achieve the desired level of accuracy, while with too many nodes it will take a long time to train adequately and may sometimes over-fit the data. The RBF model can be trained and tested much faster than the MLP model. With an appropriate architecture and training algorithm, it produces the best-fit results even when only an optimum number of data sets is used for calibration.

The ANN algorithms compute weights through training to fit the objective function, without representing the hydrological characteristics of the catchment. The HEC-HMS and SWMM algorithms, by contrast, try to represent the physical or hydrological characteristics of the system, such as infiltration, overland flow and time of concentration. With these hydrological characteristics fixed, the models are calibrated to fit the simulated hydrograph to the observed one.
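The idea behind a GRNN-type RBF network with a Gaussian transfer function can be sketched as a kernel-weighted average of the training targets. The sketch below follows the standard textbook form of the GRNN and is not the implementation used in the study. The function name `grnn_predict`, the spread parameter `sigma` and the toy data are all assumptions made for illustration.

```python
import math

def grnn_predict(x, train_x, train_y, sigma=1.0):
    """Estimate the output for input vector x as a Gaussian-weighted
    average of the training targets (standard GRNN form)."""
    weights = []
    for xi in train_x:
        # Squared Euclidean distance between x and one training pattern.
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed to zero: fall back to the plain mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Hypothetical toy data: inputs are [rain(t), rain(t-1)], target is runoff(t).
train_x = [[0.0, 0.0], [10.0, 0.0], [10.0, 10.0]]
train_y = [1.0, 3.0, 6.0]
print(grnn_predict([10.0, 0.0], train_x, train_y, sigma=2.0))
```

No iterative weight adjustment is involved: the training patterns are simply stored, which is consistent with the short training times reported for the RBF model. The spread `sigma` is the only tuning quantity in this sketch. A small value reproduces the nearest training target, and a very large value tends towards the mean of all targets.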
4.7.4 Robustness and Model Limitation

The limitations of the model structures and of the data available on parameter values, initial conditions and boundary conditions generally make it difficult to apply a hydrological model without some form of training or calibration. Previous studies and the present findings show that in the majority of cases the parameter values are adjusted to obtain a better fit to some observed data. The question of how to assess whether one model or set of parameter values is better than another is open to a variety of approaches, from visual inspection of plots of observed and predicted variables to a number of different quantitative measures of goodness of fit, performance, fitness and likelihood.

A robustness test on five different data sets covering different time periods can reveal the consistency of the model. The robustness test is carried out to evaluate the performance of the model using different input-output data sets. A robust model will consistently yield the lowest RMSE, RRMSE, and MAPE errors. The neural network has been proven a robust model in modelling the rainfall-runoff relationship. The MLP and RBF methods are highly recommended for daily and hourly rainfall-runoff modelling problems. Furthermore, the MLP and RBF networks proved more robust in modelling the rainfall-runoff relationship than the MLR, HEC-HMS and SWMM models. The study demonstrates that the neural network model based on MLP and RBF is more suitable for modelling the rainfall-runoff relationship than the MLR, HEC-HMS and SWMM models.

It is suggested that even a simple model with only four or five parameter values to be estimated requires at least 20 to 25 hydrographs for a reasonably robust calibration.
For more complex parameter sets, many more data and different types of data may be required for a robust optimization, unless it is possible to fix many of the parameters. In practice, however, this has proven very difficult to achieve. Thus, the optimum parameter set found for a particular model structure may be sensitive to small changes in the observations, to the period of observations considered in the calibration, and possibly to changes in the model structure, such as a change in the element discretization for a distributed model.

Given the limitations of both the model structures and the observed data, there may be many representations of a catchment that are equally valid in terms of their ability to produce acceptable simulations of the available data. Hence, different model structures, and different parameter sets within a model structure, compete to be considered acceptable as simulators. Such parameter sets will give a range of predictions. This may actually be an advantage, since it allows the uncertainty in the predictions, conditioned on the calibration data, to be assessed and then used as part of the decision-making process arising from a modelling project. The objectives of the project and the data available for calibrating the different models will all limit the potential range of simulators. The important point is that choices between models and between parameter sets must be made in a logical and scientifically defensible way in order to provide good predictions.

The ANN performance is influenced by the level of non-linearity and the selection of training data. A large number of training data sets is required for successful training. The number of hidden layer neurons significantly influences the performance of a network.
If this number is too small, the network can under-fit the data and may not achieve the desired level of accuracy, while with too many nodes it will take a long time to train adequately and may sometimes over-fit the data. Although these models give little insight into the physical processes, they provide a good-enough, low-cost solution.

Although several studies indicate that ANNs are potentially useful tools in hydrology, their disadvantages should not be ignored. The success of an ANN application depends on both the quality and the quantity of the data available. This requirement cannot easily be met, as many hydrologic records do not go back far enough. Quite often, the requisite data are not available and have to be generated by other means, such as another well-tested model. Even when long historic records are available, it is not certain that conditions remained homogeneous over the time span. Therefore, data sets recorded over a system that is relatively stable and unaffected by human activities are desirable.

The major limitation of ANN is the lack of physical concepts and relations. This has been one of the primary reasons for the skeptical attitude towards the methodology. Another limitation is that there is no standardized way of selecting the network architecture. The choice of network architecture, training algorithm, and definition of error is usually determined by past experience and preference, and by trial and error.

The limitation of the HEC-HMS and SWMM models is that they require more physical data to obtain accurate results. These models require the soil types, catchment characteristics, base flow, infiltration, losses, etc. Furthermore, they cannot be built from rainfall and runoff data alone. The suitable model parameters must therefore be obtained by trial and error in the calibration process, which takes a long time.
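The goodness-of-fit measures named in the robustness test of this section can be sketched as below. The formulas are common textbook forms and are assumed here, so the definitions given in Chapter 3 take precedence where they differ. In particular, RRMSE is taken as the root mean square of the relative errors and R² as one minus the ratio of the error sum of squares to the total sum of squares. The function names and the toy values are illustrative only.

```python
import math

def rmse(obs, pred):
    """Root mean square error, in the units of the observed flow."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def rrmse(obs, pred):
    """Relative RMSE (assumed form): RMS of errors divided by observed values.
    Observed values must be non-zero."""
    return math.sqrt(sum(((o - p) / o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mape(obs, pred):
    """Mean absolute percentage error, in percent. Observed values must be non-zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

def r_squared(obs, pred):
    """Coefficient of determination (assumed form): 1 - SSE / SST."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

# Hypothetical observed and predicted flows (m3/s), for illustration only.
obs = [2.0, 4.0, 8.0]
pred = [2.0, 5.0, 6.0]
print(rmse(obs, pred), rrmse(obs, pred), mape(obs, pred), r_squared(obs, pred))
```

Applying the same four functions to each of the five test data sets gives a simple way to compare the consistency of different models.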
4.7.5 River Basin Characteristics

Several independent catchments in Peninsular Malaysia were selected to simulate runoff from rainfall using the ANN methodology. The characteristics of the selected catchments are discussed in Chapter 3. The daily and hourly rainfall, runoff, and evapotranspiration values were measured from 1980 to 2000. The locations of the raingauges and water level stations are shown in Figures 3.1 to 3.4.

The whole of Peninsular Malaysia experiences a typical Rainy Tropical or Tropical Wet climate, giving rise to a climax vegetation of tropical rain forest. There are no clearly definable seasons, it being warm or hot throughout the year. Although the peninsula is subject to a monsoon regime of winds, there is no distinct dry season of regular occurrence and significant duration that is liable to affect rainfall-runoff relationships. The uniformity of the seasonal pattern of rainfall and potential evapotranspiration allows the flood-producing or runoff aspects of climate to be isolated and identified in the nature of the rainfalls that produce runoff.

The maximum rainfall records for the Sg. Bekok, Sg. Ketil, Sg. Klang and Sg. Slim catchments are 1030 mm, 110.3 mm, 114 mm and 1310 mm respectively. The maximum runoff records for the Sg. Bekok, Sg. Ketil, Sg. Klang and Sg. Slim catchments are 7.1 m³/s, 33.15 m³/s, 89.0 m³/s and 66.27 m³/s respectively. Meanwhile, the annual daily average flow for the catchments is 4.83 m³/s, 29.95 m³/s, 22.28 m³/s and 65.70 m³/s respectively.

Regional rainfall-runoff relationships of the type proposed in this study can generally only be established for hydrologically ‘large’ catchments.
Such catchments within a homogeneous region are those in which the catchment storage processes control the magnitude of the annual maximum floods, as distinct from hydrologically ‘small’ catchments, in which the soil and vegetation conditions and the temporal characteristics of the flood-producing rainstorm are of major importance. The distinction between the two cannot be made solely on the basis of catchment area, although in the extreme cases of ‘very small’ and ‘very large’ the difference is obvious. Certain combinations of climate, topography and land use can make catchments of considerable size exhibit the flood characteristics of hydrologically ‘small’ catchments, and vice versa.

The results of MLP modelling show that neural network performance is influenced by the level of nonlinearity, the selection and quality of the training data, and the characteristics of the catchment. For example, the results of rainfall-runoff modelling indicate that the MLP method is more accurate for the Sungai Bekok catchment than for the Sungai Klang catchment. Normally, for a smaller catchment and for a developed area, the river flow is highly nonlinear and influenced by storage effects. In addition, the effects of spatial rainfall and control structures may contribute to the complexity of the system.

In general, there are two families of model (MLP and RBF) with three types of neural models: the first is the MLP with one hidden layer, the second is the MLP with two hidden layers, and the third is the RBF model. The development of the neural network model structures follows the method of Tokar and Johnson (1999). The results are shown in the previous tables.

The Sungai Ketil catchment (704 km²) is about twice the size of the Sungai Bekok catchment (350 km²). Meanwhile, the Sungai Klang catchment (468 km²) and the Sungai Slim catchment (455 km²) are of similar size.
It can be seen that the efficiency levels of the four catchments improved in the testing stage when the models were trained properly. Furthermore, the correlation coefficient for Sungai Bekok is better than that for Sungai Ketil. The size of the catchment probably contributes to the inaccuracy of neural modelling. A large, fully developed catchment such as Sungai Klang generates a considerably higher peak flood discharge. The neural network model therefore requires a sufficient amount of data containing large peak discharges during training and generalization in order to produce the best-fit results.

In view of the foregoing, and because of the sensible relationships established between rainfall and runoff, it is evident that within a particular flood frequency region the controlling flood-producing characteristic is catchment area. The difference in the flood frequency relationship between regions for hydrologically ‘large’ catchments is the result of a number of factors, among them climate, vegetation and land use, surficial geology and topography. It is important to appreciate the variation in the flood-producing characteristics of the regions in a qualitative sense, so that the procedure can be sensibly and usefully applied.

This study revealed that the characteristics of the catchment affect the capability of the models to produce good results. For the ANN models, the important characteristics to be considered are the size of the catchment, the land use, and the quality and quantity of the input-output data. For example, the Sg. Ketil catchment is considered a semi-developed or natural area, with an area of 704 km². The ANN model developed for this catchment yields very good results compared with those for the Sg. Klang catchment, which is considered a fully developed area with an area of 468 km². This suggests that the hydrologic cycle in the Sg. Ketil catchment is consistent and not severely disturbed by infrastructure and highway development.
4.7.6 Time Interval

In this study, rainfall-runoff data at two time intervals, daily and hourly, were used for modelling the rainfall-runoff relationship. It was found that the neural network models are suitable for modelling the rainfall-runoff relationship at both time intervals, with good performance and accurate results. The MLP model requires a longer period of training data in order to be properly trained and to produce good results. The RBF model, by contrast, requires only an optimum number of training data sets to be properly trained.

If a very long record, say 1000 years of data, were available, the estimates of flood peaks for a particular return period would be expected to be quite reliable. If, say, 10 records each 100 years long were selected from the 1000-year record, and the flood estimates for particular return periods were examined for each record, they would not all be the same. A larger variation would be expected if 50 records each 20 years long were examined. What has been done in this study is to examine 50 records drawn from the 10-year records.

Good-quality input-output data sets were selected to develop and evaluate the neural network models. The data were taken from the selected catchments, which have a large number of input-output pairs. Ten-year records of hourly rainfall-runoff series from the Sungai Bekok catchment (1991-2000), Sungai Ketil catchment (1983-1992), Sungai Klang catchment (1991-2000) and Sungai Slim catchment (1988-1997) were used. In this study, 55 hourly data sets were selected from the records. The neural network was trained under two sets of conditions.
The first 50 data sets were used for model calibration (training), and the remaining five were used for model testing and validation, and also to test the robustness and consistency of the model. Meanwhile, five-year records of continuous daily rainfall-runoff series from the four selected catchments were used to evaluate the performance of the neural network model. These data consist of two sets: the first three years are used for model calibration (training) in the case of ANN, and the remaining two years are used for model validation and testing.

Increasing the number of training data in the training phase, with no change in the neural network structure, improves performance in both the training and testing phases. Performance thus depends on providing an adequate number of training data. Good-quality input-output data sets were selected to develop and evaluate the neural network models. In this study, 1-year, 6-month, dry-season and wet-season data were selected for testing, based on the annual average rainfall. The data used for testing (verifying) the models are divided into five sets as follows:

(i) Data set 1: 1-year data (Jan – Dec)
(ii) Data set 2: 6-month data (Jan – Jun)
(iii) Data set 3: 6-month data (Jul – Dec)
(iv) Data set 4: 3-month data (Mar – May) – Dry season
(v) Data set 5: 3-month data (Oct – Dec) – Wet season

In selecting the data, particular attention was given to land use change in the catchment, since it was assumed that land use data were not available. Recent data were used whenever possible since they reflect the current land use conditions in the catchment. The most recent data were used in the test set in order to illustrate the capability of the model to predict future occurrences of runoff without directly including the land use characteristics of the catchments.
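The way input-output pairs are formed from rainfall and runoff at present and previous time periods, and then split chronologically into calibration and testing sets, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in the study: the function names, the lag counts and the toy series are hypothetical, and the training fraction of 0.6 simply mirrors the three-of-five-year split described above.

```python
def make_lagged_samples(rain, runoff, n_rain_lags, n_flow_lags=0):
    """Build (inputs, target) pairs. Inputs are rain(t), ..., rain(t - n_rain_lags)
    followed by runoff(t - 1), ..., runoff(t - n_flow_lags); target is runoff(t)."""
    start = max(n_rain_lags, n_flow_lags)
    samples = []
    for t in range(start, len(rain)):
        x = [rain[t - k] for k in range(n_rain_lags + 1)]
        x += [runoff[t - k] for k in range(1, n_flow_lags + 1)]
        samples.append((x, runoff[t]))
    return samples

def chronological_split(samples, train_fraction):
    """Use the earlier records for training and the most recent for testing.
    The order is preserved; nothing is shuffled."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

# Hypothetical toy series (rainfall in mm, runoff in m3/s).
rain = [0.0, 5.0, 12.0, 3.0, 0.0, 8.0, 1.0, 0.0, 4.0, 9.0]
runoff = [1.0, 1.2, 2.5, 2.0, 1.4, 2.1, 1.6, 1.2, 1.5, 2.3]

samples = make_lagged_samples(rain, runoff, n_rain_lags=2, n_flow_lags=1)
train, test = chronological_split(samples, train_fraction=0.6)
print(len(train), len(test))
```

Keeping the most recent records in the test set matches the intention stated above of testing the model on the conditions closest to the present land use.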
In this study, it was observed that the effect of the length of training data on network accuracy is highly significant, especially for the MLP networks. Networks trained using long-period data provided slightly better training and testing accuracy, and also higher percent predictions of peak flows, than the existing models. Meanwhile, the length of training data had an insignificant effect on the accuracy of the RBF model. Therefore, the length of data required is correlated with the complexity of the models.

The flood data available for studies are not amenable to rigorous treatment. The analyses and results are based on about 5 years of daily and 10 years of hourly past data, which is a very small sample of the long-term flood peak population. By combining the flood records within regions that appear to be homogeneous, a process of averaging is carried out, which in the long run should provide results consistent with accumulated flood experience. This past record has been used to develop relationships that are assumed to hold true in the future. This necessarily restricts the study to data that are the result of natural, as distinct from man-made, influences. Despite the limitations imposed by the above considerations, it is possible to examine the effects of such aspects as errors in the basic data and the effect of the length of record on the flood estimation procedure.

With the addition of an optimum number of rainfall inputs at previous time periods, the performance of the neural network models increased slightly in training and testing when compared with the MLR, HEC-HMS and SWMM models. The most significant impact of including previous rainfall, however, was the rise in the percent estimation accuracy during training and testing for hourly rainfall-runoff modelling.

Harun (1999) observed that the MLR model is suitable for modelling the continuous daily rainfall-runoff relationship.
In this study, it was found that the MLR model yields satisfactory results for modelling the daily rainfall-runoff relationship. Meanwhile, both the HEC-HMS and SWMM models show more satisfactory results in modelling the hourly rainfall-runoff relationship than in the daily case. This means that both models are suitable for modelling hourly event hydrographs with a peak discharge, and unsuitable for modelling continuous daily rainfall-runoff data sets containing many events.

Figure 4.1(a) Daily results of 3-Layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
Figure 4.1(b) Daily results of 3-Layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase
Figure 4.1(c) Daily results of 3-Layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
Figure 4.2(a) Daily results of 4-Layer neural networks for Sg. Bekok catchment using 100% of data sets in training phase
Figure 4.2(b) Daily results of 4-Layer neural networks for Sg. Bekok catchment using 50% of data sets in training phase
Figure 4.2(c) Daily results of 4-Layer neural networks for Sg. Bekok catchment using 25% of data sets in training phase
Figure 4.9(a) Hourly results of 3-Layer neural networks for Sg. Bekok catchment using 100% of available data sets in training phase
Figure 4.9(b) Hourly results of 3-Layer neural networks for Sg. Bekok catchment using 65% of available data sets in training phase
Figure 4.9(c) Hourly results of 3-Layer neural networks for Sg. Bekok catchment using 25% of available data sets in training phase
Figure 4.10(a) Hourly results of 4-Layer neural networks for Sg. Bekok catchment using 100% of available data sets in training phase
Figure 4.10(b) Hourly results of 4-Layer neural networks for Sg. Bekok catchment using 65% of available data sets in training phase
Figure 4.10(c) Hourly results of 4-Layer neural networks for Sg. Bekok catchment using 25% of available data sets in training phase
Figure 4.17(a) Daily results of RBF networks for Sg. Bekok catchment using 100% of data sets in training phase
Figure 4.17(b) Daily results of RBF networks for Sg. Bekok catchment using 50% of data sets in training phase
Figure 4.17(c) Daily results of RBF networks for Sg. Bekok catchment using 25% of data sets in training phase
Figure 4.21(a) Hourly results of RBF networks for Sg. Bekok catchment using 25% of available data sets in training phase
Figure 4.21(b) Hourly results of RBF networks for Sg. Bekok catchment using min of available data sets in training phase

CHAPTER 5

CONCLUSIONS AND RECOMMENDATIONS

5.1 General

There are clearly implications for other studies that depend on models of rainfall-runoff processes. Predictions of catchment hydrogeochemistry, sediment production and transport, the dispersion of contaminants, hydroecology, and integrated catchment decision support systems in general depend crucially on good predictions of water flow processes. All of these components depend on predictions of water flows that are subject to the same types of uncertainty in predictive capability.

Since the 1930s, numerous rainfall-runoff models have been developed to predict or forecast runoff. The continuous rainfall-runoff models that are currently available represent a very complex and non-linear relationship. These models are very difficult to apply because of the large number of model parameters and equations that define the components of the hydrologic cycle.

The potential of artificial neural network models for predicting runoff has been presented in this study. The non-linear nature of the rainfall-runoff relationship makes it appropriate for the application of ANN methods. Results of the ANN models show that the application of neural network methods is feasible for modelling the rainfall-runoff relationship in the Malaysian region.
The neural network has the ability to predict runoff accurately using rainfall data as the input variable.

5.2 Conclusions

The main reason for modelling the rainfall-runoff processes of hydrology is the limitations of hydrological measurement techniques. Only a limited range of measurement techniques, and a limited range of measurements in space and time, is available, so it is not possible to measure everything that we would like to know about hydrological systems. A means is therefore needed of extrapolating from the available measurements in both space and time, particularly to ungauged catchments, where measurements are not available, and into the future, where measurements are not possible, in order to assess the likely impact of future hydrological change. A new approach such as ANN modelling can provide good predictions that will hopefully be helpful in decision-making about hydrological problems. With increasing demands on water resources throughout the world, improved decision-making within a context of weather patterns that fluctuate from year to year requires improved models. That is what this study is about.

In order to evaluate how well a model can approximate the relationship between rainfall and runoff, it is necessary to compare its predictive capabilities with those of existing approaches. The comparison is usually accomplished by testing all the models of interest on a data set from the same catchment. As discussed in Chapter 2, the calibration of existing models (HEC-HMS and SWMM) is very complex and involves a lengthy procedure. Within the time frame available, it was not possible to achieve this goal. Therefore, the ANN models developed in this study were compared with rainfall-runoff models such as MLR, HEC-HMS and SWMM.

The conclusions of the study are as follows:

(a) One hidden layer is sufficient in MLP networks.
Such a network trains much faster than a network with two hidden layers. Using more than one hidden layer substantially increases the number of parameters to be estimated and the time taken to train the network. Such an increase in the number of parameters may slow the calibration process without substantially improving the efficiency of the network.

(b) Potential errors in the discharge data tend to be forgotten when the data are made available as a computer file for use in rainfall-runoff modelling. There is always a tendency for the modeller to take the values as perfect estimates of the discharges. To some extent this is justified: the data are the only indication of the true discharges and the best data available for calibrating the model parameters. However, if a model is calibrated using data that are in error, the effective parameter values will be affected, and so will the predictions for other periods, which depend on the calibrated parameter values. It is therefore worth stressing that, prior to applying any model, the rainfall-runoff data should be checked for consistency.

(c) The performance of the MLR, HEC-HMS and SWMM models is moderate. Judged by the coefficient of efficiency, the performance of those models is unsatisfactory. For both calibration and validation, the MLR, HEC-HMS and SWMM models also take longer than the ANN models. Furthermore, this evaluation is subject to several limitations: (1) the catchment area was not sub-divided and (2) no observed data on infiltration, abstraction and moisture content were available.

(d) The RBF networks take a shorter time in training and testing than the MLP networks, because the RBF networks have fewer parameters than the MLP networks. For example, for the Sungai Bekok catchment, the total numbers of parameters involved are 250 and 52 for MLP and RBF respectively.
However, both models produced good fits and good results. We can conclude that the RBF network is an alternative to MLP networks for modelling the rainfall-runoff relationship. The RBF method yields fits as good as those of the MLP networks. Both the MLP and RBF methods have been shown to handle the non-linear processes within the catchment easily.

(e) Compared to other models, the ANN models are relatively easy to use and their calibration is more systematic. The study also demonstrates that the ANN models are not very sensitive to the selection of the number of neurons and hidden layers in a network or to the transfer functions of the neurons. It has been proved that a single hidden layer network containing a sufficient number of nodes can approximate any measurable functional relationship between the input and output variables to any desired accuracy. In other words, once the architecture of the network is defined, the weights are calculated through a learning process in which the ANN is trained to reproduce the desired output.

(f) The selection of training or calibration data has a very large impact on the model prediction accuracy. If the training or calibration data do not represent the characteristics of the catchment and the climate, the model will not provide reliable future predictions. When the input data included the low and high flow extremes, the networks were able to recognize the patterns in test data that contain low, high and average flow conditions. For the hourly event hydrograph models, the networks provided the highest network accuracy for future predictions and estimated the peak discharges closer to their observed values. Since the data include information on both high and low flow conditions, the networks were able to distinguish patterns in the test data that differ from the training data.
Similar trends were also observed for the daily rainfall-runoff models. Based on the goodness-of-fit tests, ANN networks trained using data that include low and high flows had good prediction accuracy compared to the other models.

(g) The rainfall-runoff relationship is highly non-linear. Normally, for a small catchment, the river flow is highly non-linear and influenced by storage effects, which can affect the quality of the data. Meanwhile, the non-linearity of river flow for a large catchment is more consistent. In addition, the effect of spatial rainfall variability and control structures may contribute to the complexity of the system. However, the MLP and RBF methods have been shown to handle the non-linear processes within the catchment more easily than the MLR, HEC-HMS and SWMM models.

(h) The ANN models have been identified as robust models of the rainfall-runoff relationship. They can accurately model the storm hydrograph for single-storm and multiple-storm events. The application of ANNs to model the daily and hourly streamflow hydrographs was successful.

(i) The modelling of the hourly event flow hydrograph yields better prediction accuracy than the daily model. It is time for the ANN to become a priority tool for overcoming the problem of flow hydrograph prediction.

5.3 Recommendations for future work

Even though this research was conducted with the aim of being as thorough and exhaustive as possible, it has opened up possibilities for further research and refinement. Some suggestions for further research are listed below:

(a) This study has shown that the Artificial Neural Network (ANN) model is capable of modelling the complex relationship between rainfall and runoff. The Fuzzy Logic (FL) model, another member of the Artificial Intelligence (AI) family, also has good characteristics and the capability to model this relationship. An FL model can be designed from the experience of experts.
Fuzzy Logic should therefore be studied in further research. Another type of AI technique is the Neuro-Fuzzy model, which combines neural networks and fuzzy logic. Several studies have found that the combination of both methods is more powerful and effective.

(b) Further work on regionalization needs to be taken up by incorporating more rainfall stations. This would also further refine the estimation of quantiles for ungauged sites. The study should also be extended to all catchments in Peninsular Malaysia and in East Malaysia to reach a more comprehensive conclusion for the whole country.

(c) In this study, the rainfall-runoff relationship was modelled at daily and hourly time intervals. Further research should extend the study to shorter time intervals such as 5, 10 and 15 minutes. In addition, the model predictions were slightly improved by using 2 years of data rather than 1 year of data. However, this improvement was insignificant in networks trained using 4 years of data. In order to reach a conclusion on how model accuracy changes with the length of the training data, more investigation is needed using training data covering a longer period. ANNs can be trained using 6, 10 or 15 years of training data and their accuracy compared with that of the present study.

(d) In this study, the ANNs were trained using the same transfer function for all neurons and layers in a network. The use of different transfer functions in each layer should help to obtain better prediction accuracy. For example, a network trained using a linear transfer function in the first hidden layer and a hyperbolic tangent transfer function in the output layer would differ significantly from a network trained using the same transfer function in all layers.
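The idea in recommendation (d) can be illustrated with a minimal sketch of a feed-forward pass in which each layer is given its own transfer function. This is not the implementation used in this study: the function names (linear, forward), the 2-3-1 network size and all weight and bias values are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def linear(x: float) -> float:
    """Identity (linear) transfer function."""
    return x


def forward(inputs: Sequence[float],
            weights: Sequence[Matrix],
            biases: Sequence[Vector],
            transfers: Sequence[Callable[[float], float]]) -> Vector:
    """Propagate one input pattern through a feed-forward network.

    weights[k][j][i] connects node i of layer k to node j of layer k + 1,
    and transfers[k] is applied to every node of layer k + 1.
    """
    if not (len(weights) == len(biases) == len(transfers)):
        raise ValueError("one weight matrix, bias vector and transfer "
                         "function is required per layer")
    activations = list(inputs)
    for w, b, f in zip(weights, biases, transfers):
        # Weighted sum of the previous layer plus bias, then the transfer function.
        activations = [
            f(sum(w_ji * a_i for w_ji, a_i in zip(row, activations)) + b_j)
            for row, b_j in zip(w, b)
        ]
    return activations


# Hypothetical 2-3-1 network with made-up weights (illustration only).
W = [
    [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]],  # input -> hidden (3 x 2)
    [[0.7, -0.5, 0.2]],                       # hidden -> output (1 x 3)
]
B = [[0.0, 0.1, -0.1], [0.05]]
x = [0.6, 0.3]  # e.g. two scaled rainfall inputs

# Same transfer function in all layers (the set-up used in this study).
same = forward(x, W, B, [math.tanh, math.tanh])
# Linear hidden layer with a hyperbolic tangent output layer, as in (d).
mixed = forward(x, W, B, [linear, math.tanh])
print(same, mixed)
```

With identical weights, the two configurations give different outputs for the same input, so a network with mixed transfer functions would have to be trained and evaluated separately before any claim about its accuracy could be made.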
(e) In watersheds around the world, it is common that either the rainfall or the runoff records are incomplete for certain periods of time. This difficulty is usually overcome by discarding the incomplete part of the record. Incomplete records cause difficulty especially in the calibration of the model. The traditional methods such as statistical analysis have their own weaknesses. With advances in technology and computing, the ability to observe data over large and inaccessible areas and to map these areas spatially is vastly improved. It is therefore also recommended that radar systems, remote sensing, satellite technology, database management systems, error analysis and similar tools be used, making it possible to develop truly distributed models for both gauged and ungauged catchments.

(f) In predicting or forecasting runoff, it is very important to be able to update the model without re-training or re-calibrating it. This would be very advantageous where changes in a catchment can be continuously included. It would help hydrologists and engineers to plan, design and manage future water resources systems with more confidence. Theory that incorporates new information without re-calibration might be applied to model rainfall-runoff processes.

(g) The modelling techniques used in this study (HEC-HMS and XP-SWMM) rank lower than other popular models such as the Hydrologic Simulation Package Fortran IV (HSPF), the Catchment Model (CM), the U.S. Geological Survey (USGS) Model, the Utah State University (USU) Model, and so on. These popular models should therefore be applied to see how accurate and reliable the models developed in this study are.

(h) Another aspect to consider is constraining the uncertainty in model predictions over both the short and long term through data assimilation.
As more remote sensing and other spatially distributed data become available, it will be possible to incorporate other types of data into assimilation algorithms. For the long term, where predictions of the future behaviour of a hydrological system are important but highly uncertain, there may be some justification for implementing a measurement programme to monitor the impacts of change, including future climate change, within the context of the natural variability of hydrological systems.
APPENDIX A

Daily and hourly results of MLP model

A. Daily results

Table A4.3(a): Results of 3 Layer neural networks for Sg. Ketil catchment using 100% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-1* | 268 | 0.8370 | 0.2924 | 0.0096 | 7.2512 |
| MLP-TEST Set 1 | 16-14-1* | 268 | 0.8590 | 0.2995 | 0.0099 | 7.3304 |
| MLP-TEST Set 2 | 16-14-1* | 268 | 0.8912 | 0.3543 | 0.0116 | 8.6508 |
| MLP-TEST Set 3 | 16-14-1* | 268 | 0.8448 | 0.3914 | 0.0131 | 9.8265 |
| MLP-TEST Set 4 | 16-14-1* | 268 | 0.8422 | 0.4765 | 0.0155 | 12.7560 |
| MLP-TEST Set 5 | 16-14-1* | 268 | 0.8930 | 0.3928 | 0.0131 | 9.8143 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.3(b): Results of 3 Layer neural networks for Sg. Ketil catchment using 50% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-1* | 268 | 0.7890 | 0.4068 | 0.0133 | 10.8220 |
| MLP-TEST Set 1 | 16-14-1* | 268 | 0.8226 | 0.4122 | 0.0136 | 11.4731 |
| MLP-TEST Set 2 | 16-14-1* | 268 | 0.8698 | 0.3807 | 0.0125 | 10.8917 |
| MLP-TEST Set 1 | 16-14-1* | 268 | 0.8295 | 0.4319 | 0.0143 | 12.1284 |
| MLP-TEST Set 2 | 16-14-1* | 268 | 0.8151 | 0.4460 | 0.0146 | 12.1461 |
| MLP-TEST Set 3 | 16-14-1* | 268 | 0.8631 | 0.5147 | 0.0170 | 15.3070 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.3(c): Results of 3 Layer neural networks for Sg.
Ketil catchment using 25% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-1* | 268 | 0.8788 | 0.2871 | 0.0095 | 7.9621 |
| MLP-TEST Set 1 | 16-14-1* | 268 | 0.7350 | 0.6404 | 0.0212 | 13.5025 |
| MLP-TEST Set 2 | 16-14-1* | 268 | 0.7194 | 0.3385 | 0.0112 | 7.3006 |
| MLP-TEST Set 3 | 16-14-1* | 268 | 0.7536 | 0.7617 | 0.0252 | 17.1820 |
| MLP-TEST Set 4 | 16-14-1* | 268 | 0.6188 | 0.4652 | 0.0153 | 11.6471 |
| MLP-TEST Set 5 | 16-14-1* | 268 | 0.8131 | 0.8110 | 0.0266 | 16.8580 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.4(a): Results of 4 Layer neural networks for Sg. Ketil catchment using 100% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-10-1* | 414 | 0.8370 | 0.2871 | 0.0095 | 7.1466 |
| MLP-TEST Set 1 | 16-14-10-1* | 414 | 0.8430 | 0.3195 | 0.0106 | 7.6262 |
| MLP-TEST Set 2 | 16-14-10-1* | 414 | 0.8797 | 0.3391 | 0.0111 | 8.2829 |
| MLP-TEST Set 3 | 16-14-10-1* | 414 | 0.8279 | 0.4130 | 0.0138 | 9.9779 |
| MLP-TEST Set 4 | 16-14-10-1* | 414 | 0.8305 | 0.4536 | 0.0148 | 12.0876 |
| MLP-TEST Set 5 | 16-14-10-1* | 414 | 0.8830 | 0.4069 | 0.0135 | 9.4420 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.4(b): Results of 4 Layer neural networks for Sg. Ketil catchment using 50% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-10-1* | 414 | 0.8139 | 0.3424 | 0.0113 | 9.2320 |
| MLP-TEST Set 1 | 16-14-10-1* | 414 | 0.8479 | 0.3483 | 0.0116 | 8.5922 |
| MLP-TEST Set 2 | 16-14-10-1* | 414 | 0.8654 | 0.2870 | 0.0094 | 6.4820 |
| MLP-TEST Set 3 | 16-14-10-1* | 414 | 0.8547 | 0.4336 | 0.0145 | 11.3320 |
| MLP-TEST Set 4 | 16-14-10-1* | 414 | 0.8117 | 0.3903 | 0.0127 | 9.9875 |
| MLP-TEST Set 5 | 16-14-10-1* | 414 | 0.8802 | 0.4341 | 0.0144 | 9.6348 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.4(c): Results of 4 Layer neural networks for Sg. Ketil catchment using 25% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-10-1* | 414 | 0.9198 | 0.2231 | 0.0074 | 5.9640 |
| MLP-TEST Set 1 | 16-14-10-1* | 414 | 0.7545 | 0.5803 | 0.0191 | 13.0057 |
| MLP-TEST Set 2 | 16-14-10-1* | 414 | 0.6800 | 0.3675 | 0.0121 | 8.3800 |
| MLP-TEST Set 3 | 16-14-10-1* | 414 | 0.8241 | 0.5906 | 0.0195 | 14.5750 |
| MLP-TEST Set 4 | 16-14-10-1* | 414 | 0.5409 | 0.5000 | 0.0164 | 12.5944 |
| MLP-TEST Set 5 | 16-14-10-1* | 414 | 0.8699 | 0.6682 | 0.0218 | 15.7563 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.5(a): Results of 3 Layer neural networks for Sg. Klang catchment using 100% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 17-13-1* | 264 | 0.8203 | 6.3820 | 0.3693 | 25.7937 |
| MLP-TEST Set 1 | 17-13-1* | 264 | 0.8045 | 8.0723 | 0.3845 | 27.2468 |
| MLP-TEST Set 2 | 17-13-1* | 264 | 0.8562 | 7.4985 | 0.3011 | 23.3682 |
| MLP-TEST Set 3 | 17-13-1* | 264 | 0.7974 | 7.6160 | 0.3977 | 28.7316 |
| MLP-TEST Set 4 | 17-13-1* | 264 | 0.8465 | 8.9923 | 0.2582 | 21.3788 |
| MLP-TEST Set 5 | 17-13-1* | 264 | 0.7966 | 7.6309 | 0.5251 | 39.6970 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.5(b): Results of 3 Layer neural networks for Sg.
Klang catchment using 50% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 17-13-1* | 264 | 0.8817 | 5.0103 | 0.3553 | 25.6967 |
| MLP-TEST Set 1 | 17-13-1* | 264 | 0.7967 | 8.8230 | 0.3670 | 27.4527 |
| MLP-TEST Set 2 | 17-13-1* | 264 | 0.8594 | 7.3548 | 0.2601 | 19.7716 |
| MLP-TEST Set 3 | 17-13-1* | 264 | 0.7766 | 8.5181 | 0.3872 | 29.4335 |
| MLP-TEST Set 4 | 17-13-1* | 264 | 0.8493 | 9.0087 | 0.2343 | 19.3385 |
| MLP-TEST Set 5 | 17-13-1* | 264 | 0.7606 | 8.0567 | 0.4854 | 37.2270 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.5(c): Results of 3 Layer neural networks for Sg. Klang catchment using 25% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 17-13-1* | 264 | 0.9384 | 3.7962 | 0.2548 | 17.6230 |
| MLP-TEST Set 1 | 17-13-1* | 264 | 0.7568 | 9.3368 | 0.4664 | 31.8791 |
| MLP-TEST Set 2 | 17-13-1* | 264 | 0.8176 | 10.1187 | 0.3905 | 28.4961 |
| MLP-TEST Set 3 | 17-13-1* | 264 | 0.7201 | 8.9428 | 0.4771 | 32.3693 |
| MLP-TEST Set 4 | 17-13-1* | 264 | 0.7844 | 13.0202 | 0.4101 | 30.1309 |
| MLP-TEST Set 5 | 17-13-1* | 264 | 0.6352 | 10.6431 | 0.6465 | 46.5411 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.6(a): Results of 4 Layer neural networks for Sg. Klang catchment using 100% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 17-13-9-1* | 386 | 0.8077 | 6.5784 | 0.3739 | 26.1677 |
| MLP-TEST Set 1 | 17-13-9-1* | 386 | 0.8045 | 8.0770 | 0.3805 | 26.6066 |
| MLP-TEST Set 2 | 17-13-9-1* | 386 | 0.8704 | 7.2422 | 0.3007 | 23.1680 |
| MLP-TEST Set 3 | 17-13-9-1* | 386 | 0.7989 | 7.5862 | 0.3948 | 27.9979 |
| MLP-TEST Set 4 | 17-13-9-1* | 386 | 0.8677 | 8.3630 | 0.2342 | 18.2616 |
| MLP-TEST Set 5 | 17-13-9-1* | 386 | 0.8143 | 7.3068 | 0.5199 | 38.0870 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.6(b): Results of 4 Layer neural networks for Sg. Klang catchment using 50% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 17-13-9-1* | 386 | 0.8654 | 5.3202 | 0.3584 | 26.4449 |
| MLP-TEST Set 1 | 17-13-9-1* | 386 | 0.8020 | 8.7398 | 0.3542 | 26.3884 |
| MLP-TEST Set 2 | 17-13-9-1* | 386 | 0.8687 | 7.1217 | 0.2509 | 19.4906 |
| MLP-TEST Set 3 | 17-13-9-1* | 386 | 0.7921 | 8.3018 | 0.3712 | 27.9545 |
| MLP-TEST Set 4 | 17-13-9-1* | 386 | 0.8661 | 8.5737 | 0.2217 | 17.4775 |
| MLP-TEST Set 5 | 17-13-9-1* | 386 | 0.7977 | 7.4976 | 0.4623 | 35.0263 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.6(c): Results of 4 Layer neural networks for Sg. Klang catchment using 25% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 17-13-9-1* | 386 | 0.9213 | 4.3142 | 0.3159 | 21.1549 |
| MLP-TEST Set 1 | 17-13-9-1* | 386 | 0.7914 | 8.5172 | 0.4737 | 31.4957 |
| MLP-TEST Set 2 | 17-13-9-1* | 386 | 0.8119 | 11.0415 | 0.4383 | 32.1990 |
| MLP-TEST Set 3 | 17-13-9-1* | 386 | 0.7855 | 7.7986 | 0.4846 | 32.7049 |
| MLP-TEST Set 4 | 17-13-9-1* | 386 | 0.7767 | 14.2847 | 0.4515 | 30.7377 |
| MLP-TEST Set 5 | 17-13-9-1* | 386 | 0.7820 | 8.7751 | 0.6667 | 49.0089 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.7(a): Results of 3 Layer neural networks for Sg.
Slim catchment using 100% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-1* | 268 | 0.8128 | 0.0564 | 0.0008 | 5.6743 |
| MLP-TEST Set 1 | 16-14-1* | 268 | 0.8838 | 0.0169 | 0.0002 | 1.6532 |
| MLP-TEST Set 2 | 16-14-1* | 268 | 0.7395 | 0.0249 | 0.0004 | 2.9419 |
| MLP-TEST Set 3 | 16-14-1* | 268 | 0.8714 | 0.0213 | 0.0003 | 2.1097 |
| MLP-TEST Set 4 | 16-14-1* | 268 | 0.8596 | 0.0118 | 0.0002 | 1.4608 |
| MLP-TEST Set 5 | 16-14-1* | 268 | 0.8205 | 0.0269 | 0.0004 | 8.8920 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.7(b): Results of 3 Layer neural networks for Sg. Slim catchment using 50% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-1* | 268 | 0.9114 | 0.0292 | 0.0004 | 3.0765 |
| MLP-TEST Set 1 | 16-14-1* | 268 | 0.8530 | 0.0157 | 0.0002 | 1.7289 |
| MLP-TEST Set 2 | 16-14-1* | 268 | 0.8258 | 0.0142 | 0.0002 | 1.6508 |
| MLP-TEST Set 3 | 16-14-1* | 268 | 0.8365 | 0.0190 | 0.0003 | 2.0100 |
| MLP-TEST Set 4 | 16-14-1* | 268 | 0.5837 | 0.0135 | 0.0002 | 1.6424 |
| MLP-TEST Set 5 | 16-14-1* | 268 | 0.8204 | 0.0220 | 0.0003 | 2.3775 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.7(c): Results of 3 Layer neural networks for Sg. Slim catchment using 25% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-1* | 268 | 0.8816 | 0.0403 | 0.0006 | 5.0678 |
| MLP-TEST Set 1 | 16-14-1* | 268 | 0.8055 | 0.0196 | 0.0003 | 2.3508 |
| MLP-TEST Set 2 | 16-14-1* | 268 | 0.6247 | 0.0160 | 0.0002 | 1.9067 |
| MLP-TEST Set 3 | 16-14-1* | 268 | 0.7954 | 0.0221 | 0.0003 | 2.5255 |
| MLP-TEST Set 4 | 16-14-1* | 268 | 0.6377 | 0.0131 | 0.0002 | 1.6342 |
| MLP-TEST Set 5 | 16-14-1* | 268 | 0.6855 | 0.0260 | 0.0004 | 2.9890 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.8(a): Results of 4 Layer neural networks for Sg.
Slim catchment using 100% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-11-1* | 430 | 0.8026 | 0.0577 | 0.0009 | 5.9499 |
| MLP-TEST Set 1 | 16-14-11-1* | 430 | 0.9072 | 0.0157 | 0.0002 | 1.5960 |
| MLP-TEST Set 2 | 16-14-11-1* | 430 | 0.6880 | 0.0145 | 0.0002 | 1.5723 |
| MLP-TEST Set 3 | 16-14-11-1* | 430 | 0.9054 | 0.0195 | 0.0003 | 1.9809 |
| MLP-TEST Set 4 | 16-14-11-1* | 430 | 0.6037 | 0.0141 | 0.0002 | 1.6277 |
| MLP-TEST Set 5 | 16-14-11-1* | 430 | 0.8756 | 0.0251 | 0.0004 | 2.7608 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.8(b): Results of 4 Layer neural networks for Sg. Slim catchment using 50% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-11-1* | 430 | 0.8919 | 0.0315 | 0.0005 | 2.9798 |
| MLP-TEST Set 1 | 16-14-11-1* | 430 | 0.8983 | 0.0145 | 0.0002 | 1.5466 |
| MLP-TEST Set 2 | 16-14-11-1* | 430 | 0.7397 | 0.0141 | 0.0002 | 1.6155 |
| MLP-TEST Set 3 | 16-14-11-1* | 430 | 0.8941 | 0.0176 | 0.0003 | 1.7630 |
| MLP-TEST Set 4 | 16-14-11-1* | 430 | 0.7087 | 0.0117 | 0.0002 | 1.4508 |
| MLP-TEST Set 5 | 16-14-11-1* | 430 | 0.8730 | 0.0222 | 0.0003 | 2.4443 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.8(c): Results of 4 Layer neural networks for Sg. Slim catchment using 25% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 16-14-11-1* | 430 | 0.8747 | 0.0313 | 0.0005 | 2.9290 |
| MLP-TEST Set 1 | 16-14-11-1* | 430 | 0.8874 | 0.0159 | 0.0002 | 1.6534 |
| MLP-TEST Set 2 | 16-14-11-1* | 430 | 0.7297 | 0.0140 | 0.0002 | 1.5222 |
| MLP-TEST Set 3 | 16-14-11-1* | 430 | 0.8857 | 0.0195 | 0.0003 | 2.0200 |
| MLP-TEST Set 4 | 16-14-11-1* | 430 | 0.6976 | 0.0121 | 0.0002 | 1.4887 |
| MLP-TEST Set 5 | 16-14-11-1* | 430 | 0.8635 | 0.0238 | 0.0004 | 2.6554 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

B. Hourly results

Table A4.11(a): Results of 3 Layer neural networks for Sg. Ketil catchment – using 100% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 6-4-1* | 38 | 0.9904 | 0.0659 | 0.0022 | 9.4885 |
| MLP-TEST Set 1 | 6-4-1* | 38 | 0.9930 | 0.0260 | 0.0008 | 5.6790 |
| MLP-TEST Set 2 | 6-4-1* | 38 | 0.9932 | 0.0288 | 0.0009 | 4.1704 |
| MLP-TEST Set 3 | 6-4-1* | 38 | 0.9762 | 0.0554 | 0.0018 | 11.8142 |
| MLP-TEST Set 4 | 6-4-1* | 38 | 0.9852 | 0.0487 | 0.0016 | 9.9654 |
| MLP-TEST Set 5 | 6-4-1* | 38 | 0.9830 | 0.0895 | 0.0029 | 1.4510 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.11(b): Results of 3 Layer neural networks for Sg. Ketil catchment – using 65% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 6-4-1* | 38 | 0.9914 | 0.0600 | 0.0020 | 0.9080 |
| MLP-TEST Set 1 | 6-4-1* | 38 | 0.9932 | 0.0262 | 0.0008 | 0.5915 |
| MLP-TEST Set 2 | 6-4-1* | 38 | 0.9933 | 0.0292 | 0.0010 | 0.3853 |
| MLP-TEST Set 3 | 6-4-1* | 38 | 0.9765 | 0.0559 | 0.0018 | 1.0960 |
| MLP-TEST Set 4 | 6-4-1* | 38 | 0.9851 | 0.0487 | 0.0016 | 1.0435 |
| MLP-TEST Set 5 | 6-4-1* | 38 | 0.9831 | 0.0899 | 0.0029 | 1.4042 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.11(c): Results of 3 Layer neural networks for Sg.
Ketil catchment – using 25% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 6-4-1* | 38 | 0.9812 | 0.0624 | 0.0021 | 1.1621 |
| MLP-TEST Set 1 | 6-4-1* | 38 | 0.9915 | 0.0420 | 0.0014 | 1.2127 |
| MLP-TEST Set 2 | 6-4-1* | 38 | 0.9932 | 0.0295 | 0.0010 | 0.4285 |
| MLP-TEST Set 3 | 6-4-1* | 38 | 0.9769 | 0.0550 | 0.0018 | 1.0609 |
| MLP-TEST Set 4 | 6-4-1* | 38 | 0.9849 | 0.0484 | 0.0016 | 1.0070 |
| MLP-TEST Set 5 | 6-4-1* | 38 | 0.9830 | 0.0895 | 0.0029 | 1.4190 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.12(a): Results of 4 Layer neural networks for Sg. Ketil catchment – using 100% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 6-4-8-1* | 82 | 0.9901 | 0.0671 | 0.0022 | 0.9828 |
| MLP-TEST Set 1 | 6-4-8-1* | 82 | 0.9926 | 0.0265 | 0.0008 | 0.6166 |
| MLP-TEST Set 2 | 6-4-8-1* | 82 | 0.9928 | 0.0294 | 0.0010 | 0.4564 |
| MLP-TEST Set 3 | 6-4-8-1* | 82 | 0.9764 | 0.0551 | 0.0018 | 1.1440 |
| MLP-TEST Set 4 | 6-4-8-1* | 82 | 0.9842 | 0.0499 | 0.0016 | 1.1090 |
| MLP-TEST Set 5 | 6-4-8-1* | 82 | 0.9806 | 0.0979 | 0.0032 | 1.9667 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.12(b): Results of 4 Layer neural networks for Sg. Ketil catchment – using 65% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 6-4-8-1* | 82 | 0.9915 | 0.0595 | 0.0020 | 0.9018 |
| MLP-TEST Set 1 | 6-4-8-1* | 82 | 0.9922 | 0.0272 | 0.0009 | 0.6474 |
| MLP-TEST Set 2 | 6-4-8-1* | 82 | 0.9914 | 0.0324 | 0.0011 | 0.5117 |
| MLP-TEST Set 3 | 6-4-8-1* | 82 | 0.9020 | 0.1294 | 0.0043 | 8.9520 |
| MLP-TEST Set 4 | 6-4-8-1* | 82 | 0.8337 | 0.1663 | 0.0560 | 2.6960 |
| MLP-TEST Set 5 | 6-4-8-1* | 82 | 0.9592 | 0.1346 | 0.0045 | 3.1456 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.12(c): Results of 4 Layer neural networks for Sg.
Ketil catchment – using 25% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 6-4-8-1* | 82 | 0.9843 | 0.0564 | 0.0019 | 0.8250 |
| MLP-TEST Set 1 | 6-4-8-1* | 82 | 0.9919 | 0.0292 | 0.0009 | 0.6806 |
| MLP-TEST Set 2 | 6-4-8-1* | 82 | 0.9775 | 0.0576 | 0.0019 | 1.1043 |
| MLP-TEST Set 3 | 6-4-8-1* | 82 | 0.9443 | 0.0819 | 0.0027 | 1.4669 |
| MLP-TEST Set 4 | 6-4-8-1* | 82 | 0.9844 | 0.0504 | 0.0017 | 1.0686 |
| MLP-TEST Set 5 | 6-4-8-1* | 82 | 0.9828 | 0.0897 | 0.0029 | 1.5360 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.13(a): Results of 3 Layer neural networks for Sg. Klang catchment – using 100% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 8-6-1* | 68 | 0.8638 | 9.8268 | 0.2862 | 14.5678 |
| MLP-TEST Set 1 | 8-6-1* | 68 | 0.8588 | 9.8049 | 0.2134 | 11.2398 |
| MLP-TEST Set 2 | 8-6-1* | 68 | 0.9166 | 4.6828 | 0.0929 | 3.7779 |
| MLP-TEST Set 3 | 8-6-1* | 68 | 0.8743 | 9.0456 | 0.2424 | 12.5075 |
| MLP-TEST Set 4 | 8-6-1* | 68 | 0.8863 | 13.4652 | 0.2068 | 9.4064 |
| MLP-TEST Set 5 | 8-6-1* | 68 | 0.8235 | 6.9646 | 0.1164 | 7.0568 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.13(b): Results of 3 Layer neural networks for Sg. Klang catchment – using 70% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 8-6-1* | 68 | 0.8586 | 10.9196 | 0.3045 | 14.2968 |
| MLP-TEST Set 1 | 8-6-1* | 68 | 0.8619 | 9.5673 | 0.2125 | 13.3512 |
| MLP-TEST Set 2 | 8-6-1* | 68 | 0.9085 | 4.8491 | 0.0991 | 5.3344 |
| MLP-TEST Set 3 | 8-6-1* | 68 | 0.8550 | 9.6491 | 0.2966 | 14.2024 |
| MLP-TEST Set 4 | 8-6-1* | 68 | 0.8862 | 13.4952 | 0.2180 | 11.3032 |
| MLP-TEST Set 5 | 8-6-1* | 68 | 0.7922 | 8.8336 | 0.2154 | 18.0782 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.13(c): Results of 3 Layer neural networks for Sg. Klang catchment – using 40% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 8-6-1* | 68 | 0.8682 | 7.4827 | 0.2561 | 10.2560 |
| MLP-TEST Set 1 | 8-6-1* | 68 | 0.8638 | 9.2846 | 0.2055 | 11.5943 |
| MLP-TEST Set 2 | 8-6-1* | 68 | 0.9217 | 4.7954 | 0.0967 | 4.8928 |
| MLP-TEST Set 3 | 8-6-1* | 68 | 0.8909 | 8.2057 | 0.3064 | 10.5765 |
| MLP-TEST Set 4 | 8-6-1* | 68 | 0.8855 | 14.5339 | 0.1537 | 8.5149 |
| MLP-TEST Set 5 | 8-6-1* | 68 | 0.7672 | 8.1017 | 0.1388 | 8.6668 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.14(a): Results of 4 Layer neural networks for Sg. Klang catchment – using 100% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 8-6-7-1* | 118 | 0.8636 | 9.8162 | 0.2611 | 10.5408 |
| MLP-TEST Set 1 | 8-6-7-1* | 118 | 0.8587 | 9.6108 | 0.2144 | 11.9465 |
| MLP-TEST Set 2 | 8-6-7-1* | 118 | 0.9355 | 4.1073 | 0.0819 | 4.1601 |
| MLP-TEST Set 3 | 8-6-7-1* | 118 | 0.8987 | 7.9272 | 0.2554 | 11.5848 |
| MLP-TEST Set 4 | 8-6-7-1* | 118 | 0.8781 | 13.9002 | 0.2246 | 10.0978 |
| MLP-TEST Set 5 | 8-6-7-1* | 118 | 0.8098 | 7.6145 | 0.1237 | 5.9622 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.14(b): Results of 4 Layer neural networks for Sg.
Klang catchment – using 65% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 8-6-7-1* | 118 | 0.8456 | 11.3374 | 0.2954 | 12.3124 |
| MLP-TEST Set 1 | 8-6-7-1* | 118 | 0.8546 | 9.5754 | 0.2258 | 13.1625 |
| MLP-TEST Set 2 | 8-6-7-1* | 118 | 0.7653 | 10.5593 | 0.2551 | 14.9596 |
| MLP-TEST Set 3 | 8-6-7-1* | 118 | 0.6741 | 21.3165 | 0.7847 | 53.7492 |
| MLP-TEST Set 4 | 8-6-7-1* | 118 | 0.7882 | 35.8792 | 0.4956 | 26.8303 |
| MLP-TEST Set 5 | 8-6-7-1* | 118 | 0.4959 | 12.2470 | 0.3496 | 24.8221 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.14(c): Results of 4 Layer neural networks for Sg. Klang catchment – using 30% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 8-6-7-1* | 118 | 0.8837 | 6.9320 | 0.2127 | 8.9768 |
| MLP-TEST Set 1 | 8-6-7-1* | 118 | 0.9166 | 7.5062 | 0.1908 | 9.7557 |
| MLP-TEST Set 2 | 8-6-7-1* | 118 | 0.9132 | 4.8488 | 0.1019 | 7.6912 |
| MLP-TEST Set 3 | 8-6-7-1* | 118 | 0.7226 | 11.8962 | 0.3431 | 14.7521 |
| MLP-TEST Set 4 | 8-6-7-1* | 118 | 0.8238 | 15.7230 | 0.2648 | 14.0124 |
| MLP-TEST Set 5 | 8-6-7-1* | 118 | 0.8410 | 6.7002 | 0.1132 | 5.3757 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.15(a): Results of 3 Layer neural networks for Sg. Slim catchment – using 100% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 7-5-1* | 52 | 0.9894 | 0.0498 | 0.0020 | 0.9618 |
| MLP-TEST Set 1 | 7-5-1* | 52 | 0.9542 | 0.0367 | 0.0015 | 0.7475 |
| MLP-TEST Set 2 | 7-5-1* | 52 | 0.9712 | 0.0563 | 0.0023 | 1.2782 |
| MLP-TEST Set 3 | 7-5-1* | 52 | 0.9893 | 0.0556 | 0.0022 | 1.3100 |
| MLP-TEST Set 4 | 7-5-1* | 52 | 0.9921 | 0.0426 | 0.0017 | 0.7290 |
| MLP-TEST Set 5 | 7-5-1* | 52 | 0.9877 | 0.0382 | 0.0015 | 0.9378 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.15(b): Results of 3 Layer neural networks for Sg. Slim catchment – using 65% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 7-5-1* | 52 | 0.9889 | 0.0502 | 0.0020 | 0.8868 |
| MLP-TEST Set 1 | 7-5-1* | 52 | 0.9547 | 0.0364 | 0.0015 | 0.7134 |
| MLP-TEST Set 2 | 7-5-1* | 52 | 0.9712 | 0.0563 | 0.0023 | 1.2612 |
| MLP-TEST Set 3 | 7-5-1* | 52 | 0.9902 | 0.0534 | 0.0021 | 1.2589 |
| MLP-TEST Set 4 | 7-5-1* | 52 | 0.9914 | 0.0441 | 0.0018 | 0.7479 |
| MLP-TEST Set 5 | 7-5-1* | 52 | 0.9299 | 0.0973 | 0.0039 | 2.5545 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.15(c): Results of 3 Layer neural networks for Sg. Slim catchment – using 30% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 7-5-1* | 52 | 0.9859 | 0.0466 | 0.0019 | 0.7567 |
| MLP-TEST Set 1 | 7-5-1* | 52 | 0.9537 | 0.0364 | 0.0015 | 0.6813 |
| MLP-TEST Set 2 | 7-5-1* | 52 | 0.9656 | 0.0657 | 0.0026 | 1.8349 |
| MLP-TEST Set 3 | 7-5-1* | 52 | 0.9906 | 0.0516 | 0.0020 | 1.2090 |
| MLP-TEST Set 4 | 7-5-1* | 52 | 0.9904 | 0.0460 | 0.0018 | 0.8350 |
| MLP-TEST Set 5 | 7-5-1* | 52 | 0.9880 | 0.0378 | 0.0015 | 0.9184 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.16(a): Results of 4 Layer neural networks for Sg. Slim catchment – using 100% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 7-5-9-1* | 110 | 0.9846 | 0.0597 | 0.0024 | 1.3368 |
| MLP-TEST Set 1 | 7-5-9-1* | 110 | 0.9408 | 0.0452 | 0.0018 | 1.1728 |
| MLP-TEST Set 2 | 7-5-9-1* | 110 | 0.9703 | 0.0597 | 0.0024 | 1.3872 |
| MLP-TEST Set 3 | 7-5-9-1* | 110 | 0.9884 | 0.0594 | 0.0024 | 1.4710 |
| MLP-TEST Set 4 | 7-5-9-1* | 110 | 0.9846 | 0.0663 | 0.0027 | 1.8754 |
| MLP-TEST Set 5 | 7-5-9-1* | 110 | 0.9844 | 0.0457 | 0.0018 | 1.2596 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.16(b): Results of 4 Layer neural networks for Sg. Slim catchment – using 65% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 7-5-9-1* | 110 | 0.9864 | 0.0554 | 0.0022 | 1.0898 |
| MLP-TEST Set 1 | 7-5-9-1* | 110 | 0.9456 | 0.0406 | 0.0016 | 0.9034 |
| MLP-TEST Set 2 | 7-5-9-1* | 110 | 0.7163 | 0.1790 | 0.0072 | 4.0341 |
| MLP-TEST Set 3 | 7-5-9-1* | 110 | 0.9882 | 0.0589 | 0.0023 | 1.4146 |
| MLP-TEST Set 4 | 7-5-9-1* | 110 | 0.6500 | 0.2898 | 0.0115 | 6.8653 |
| MLP-TEST Set 5 | 7-5-9-1* | 110 | 0.6837 | 0.2103 | 0.0083 | 4.2920 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

Table A4.16(c): Results of 4 Layer neural networks for Sg. Slim catchment – using 30% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLP TRAINING | 7-5-9-1* | 110 | 0.8881 | 0.1249 | 0.0051 | 1.7640 |
| MLP-TEST Set 1 | 7-5-9-1* | 110 | 0.7520 | 0.0906 | 0.0037 | 1.2845 |
| MLP-TEST Set 2 | 7-5-9-1* | 110 | 0.6588 | 0.1909 | 0.0077 | 4.6483 |
| MLP-TEST Set 3 | 7-5-9-1* | 110 | 0.4226 | 0.5048 | 0.0198 | 13.9979 |
| MLP-TEST Set 4 | 7-5-9-1* | 110 | 0.0128 | 0.4924 | 0.0195 | 12.4315 |
| MLP-TEST Set 5 | 7-5-9-1* | 110 | 0.7045 | 0.2010 | 0.0080 | 3.3840 |

(*) input nodes-hidden nodes-output nodes; cumecs: cubic metres per second; COC: correlation of coefficient

APPENDIX B

Daily and hourly results of RBF model

Part A: Daily results

Table B4.18(a): Results of RBF networks for Sg.
Ketil catchment using 100% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 16 input nodes | 52 | 0.9460 | 0.1717 | 0.0057 | 3.6483 |
| RBF-TEST Set 1 | 16 input nodes | 52 | 0.8645 | 0.2987 | 0.0099 | 7.0205 |
| RBF-TEST Set 2 | 16 input nodes | 52 | 0.8225 | 0.4076 | 0.0133 | 9.6987 |
| RBF-TEST Set 3 | 16 input nodes | 52 | 0.8764 | 0.3430 | 0.0115 | 7.7610 |
| RBF-TEST Set 4 | 16 input nodes | 52 | 0.7593 | 0.5432 | 0.0176 | 14.2859 |
| RBF-TEST Set 5 | 16 input nodes | 52 | 0.9367 | 0.3015 | 0.0101 | 6.4010 |

cumecs: cubic metres per second; COC: correlation of coefficient

Table B4.18(b): Results of RBF networks for Sg. Ketil catchment using 50% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 16 input nodes | 52 | 0.9689 | 0.1383 | 0.0046 | 2.6770 |
| RBF-TEST Set 1 | 16 input nodes | 52 | 0.8246 | 0.3445 | 0.0114 | 8.3736 |
| RBF-TEST Set 2 | 16 input nodes | 52 | 0.8083 | 0.4171 | 0.0136 | 10.0984 |
| RBF-TEST Set 3 | 16 input nodes | 52 | 0.8297 | 0.3971 | 0.0132 | 9.1910 |
| RBF-TEST Set 4 | 16 input nodes | 52 | 0.7462 | 0.5499 | 0.0179 | 14.3318 |
| RBF-TEST Set 5 | 16 input nodes | 52 | 0.9330 | 0.3177 | 0.0106 | 7.6435 |

cumecs: cubic metres per second; COC: correlation of coefficient

Table B4.18(c): Results of RBF networks for Sg. Ketil catchment using 25% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 16 input nodes | 52 | 0.9722 | 0.1315 | 0.0044 | 2.2279 |
| RBF-TEST Set 1 | 16 input nodes | 52 | 0.7749 | 0.4742 | 0.0157 | 12.3495 |
| RBF-TEST Set 2 | 16 input nodes | 52 | 0.7915 | 0.3816 | 0.0125 | 10.0193 |
| RBF-TEST Set 1 | 16 input nodes | 52 | 0.7670 | 0.5593 | 0.0186 | 14.0947 |
| RBF-TEST Set 2 | 16 input nodes | 52 | 0.7318 | 0.4632 | 0.0151 | 12.1967 |
| RBF-TEST Set 3 | 16 input nodes | 52 | 0.8915 | 0.5012 | 0.0165 | 14.2769 |

cumecs: cubic metres per second; COC: correlation of coefficient

Table B4.19(a): Results of RBF networks for Sg.
Klang catchment using 100% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 17 input nodes | 55 | 0.8462 | 6.2737 | 0.3323 | 24.0405 |
| RBF-TEST Set 1 | 17 input nodes | 55 | 0.6623 | 10.6638 | 0.4359 | 30.5000 |
| RBF-TEST Set 2 | 17 input nodes | 55 | 0.7948 | 10.1066 | 0.3342 | 27.9063 |
| RBF-TEST Set 3 | 17 input nodes | 55 | 0.7501 | 8.7609 | 0.4416 | 29.5988 |
| RBF-TEST Set 4 | 17 input nodes | 55 | 0.7565 | 12.8041 | 0.2990 | 25.2004 |
| RBF-TEST Set 5 | 17 input nodes | 55 | 0.6938 | 8.9530 | 0.5847 | 39.4501 |

cumecs: cubic metres per second; COC: correlation of coefficient

Table B4.19(b): Results of RBF networks for Sg. Klang catchment using 50% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 17 input nodes | 55 | 0.9205 | 4.3089 | 0.2776 | 20.1004 |
| RBF-TEST Set 1 | 17 input nodes | 55 | 0.6776 | 11.3620 | 0.3484 | 27.2397 |
| RBF-TEST Set 2 | 17 input nodes | 55 | 0.7751 | 10.5567 | 0.2968 | 24.3221 |
| RBF-TEST Set 3 | 17 input nodes | 55 | 0.7559 | 9.5698 | 0.3414 | 26.7941 |
| RBF-TEST Set 4 | 17 input nodes | 55 | 0.7274 | 13.4357 | 0.2948 | 24.2662 |
| RBF-TEST Set 5 | 17 input nodes | 55 | 0.7401 | 10.1590 | 0.4054 | 33.1348 |

cumecs: cubic metres per second; COC: correlation of coefficient

Table B4.19(c): Results of RBF networks for Sg. Klang catchment using 25% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 17 input nodes | 55 | 0.9392 | 3.9291 | 0.2449 | 15.8175 |
| RBF-TEST Set 1 | 17 input nodes | 55 | 0.7112 | 9.7425 | 0.3685 | 26.7373 |
| RBF-TEST Set 2 | 17 input nodes | 55 | 0.7400 | 9.6423 | 0.3417 | 27.7973 |
| RBF-TEST Set 3 | 17 input nodes | 55 | 0.7772 | 8.0970 | 0.3640 | 25.0004 |
| RBF-TEST Set 4 | 17 input nodes | 55 | 0.6616 | 12.4606 | 0.3506 | 28.0038 |
| RBF-TEST Set 5 | 17 input nodes | 55 | 0.7205 | 8.9022 | 0.4771 | 34.8004 |

cumecs: cubic metres per second; COC: correlation of coefficient

Table B4.20(a): Results of RBF networks for Sg.
Slim catchment using 100% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 16 input nodes | 52 | 0.8689 | 0.0509 | 0.0008 | 5.3167 |
| RBF-TEST Set 1 | 16 input nodes | 52 | 0.7816 | 0.0217 | 0.0003 | 2.4099 |
| RBF-TEST Set 2 | 16 input nodes | 52 | 0.6302 | 0.0186 | 0.0003 | 2.0171 |
| RBF-TEST Set 3 | 16 input nodes | 52 | 0.7702 | 0.0259 | 0.0004 | 2.8813 |
| RBF-TEST Set 4 | 16 input nodes | 52 | 0.5478 | 0.0196 | 0.0003 | 2.2808 |
| RBF-TEST Set 5 | 16 input nodes | 52 | 0.7538 | 0.0312 | 0.0005 | 3.7190 |

cumecs: cubic metres per second; COC: correlation of coefficient

Table B4.20(b): Results of RBF networks for Sg. Slim catchment using 50% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 16 input nodes | 52 | 0.9479 | 0.0229 | 0.0003 | 2.1907 |
| RBF-TEST Set 1 | 16 input nodes | 52 | 0.8073 | 0.0246 | 0.0004 | 2.6206 |
| RBF-TEST Set 2 | 16 input nodes | 52 | 0.5945 | 0.0176 | 0.0003 | 1.9221 |
| RBF-TEST Set 3 | 16 input nodes | 52 | 0.7911 | 0.0315 | 0.0005 | 3.4534 |
| RBF-TEST Set 4 | 16 input nodes | 52 | 0.4092 | 0.0183 | 0.0003 | 2.2090 |
| RBF-TEST Set 5 | 16 input nodes | 52 | 0.8046 | 0.0332 | 0.0005 | 3.9468 |

cumecs: cubic metres per second; COC: correlation of coefficient

Table B4.20(c): Results of RBF networks for Sg. Slim catchment using 25% of data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 16 input nodes | 52 | 0.9521 | 0.0204 | 0.0003 | 1.6890 |
| RBF-TEST Set 1 | 16 input nodes | 52 | 0.7915 | 0.0264 | 0.0004 | 2.8865 |
| RBF-TEST Set 2 | 16 input nodes | 52 | 0.5654 | 0.0190 | 0.0003 | 2.1240 |
| RBF-TEST Set 3 | 16 input nodes | 52 | 0.8168 | 0.0332 | 0.0005 | 3.8255 |
| RBF-TEST Set 4 | 16 input nodes | 52 | 0.3463 | 0.0192 | 0.0003 | 2.2851 |
| RBF-TEST Set 5 | 16 input nodes | 52 | 0.8052 | 0.0334 | 0.0005 | 4.0113 |

cumecs: cubic metres per second; COC: correlation of coefficient

Part B: Hourly results

Table B4.22(a): Results of RBF networks for Sg.
Ketil catchment – using 20% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 6 input nodes | 22 | 0.9970 | 0.0249 | 0.0008 | 0.1749 |
| RBF-TEST Set 1 | 6 input nodes | 22 | 0.9987 | 0.0112 | 0.0004 | 0.1064 |
| RBF-TEST Set 2 | 6 input nodes | 22 | 0.9982 | 0.0150 | 0.0005 | 0.1608 |
| RBF-TEST Set 3 | 6 input nodes | 22 | 0.9965 | 0.0215 | 0.0007 | 0.3010 |
| RBF-TEST Set 4 | 6 input nodes | 22 | 0.9972 | 0.0210 | 0.0007 | 0.2574 |
| RBF-TEST Set 5 | 6 input nodes | 22 | 0.9968 | 0.0387 | 0.0013 | 0.3535 |

cumecs: cubic metres per second; COC: correlation of coefficient

Table B4.22(b): Results of RBF networks for Sg. Ketil catchment – using minimum data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 6 input nodes | 22 | 0.9921 | 0.0181 | 0.0006 | 0.8890 |
| RBF-TEST Set 1 | 6 input nodes | 22 | 0.9999 | 0.0032 | 0.0001 | 0.0226 |
| RBF-TEST Set 2 | 6 input nodes | 22 | 0.9998 | 0.0051 | 0.0002 | 0.0425 |
| RBF-TEST Set 3 | 6 input nodes | 22 | 0.9975 | 0.0182 | 0.0006 | 0.1678 |
| RBF-TEST Set 4 | 6 input nodes | 22 | 0.9992 | 0.0114 | 0.0004 | 0.1159 |
| RBF-TEST Set 5 | 6 input nodes | 22 | 0.9997 | 0.0117 | 0.0004 | 0.1094 |

cumecs: cubic metres per second; COC: correlation of coefficient

Table B4.23(a): Results of RBF networks for Sg. Klang catchment – using 40% of available data sets in training phase

| Model Data Set | Model Structure | No. of Parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 8 input nodes | 28 | 0.9999 | 0.2077 | 0.0172 | 0.6618 |
| RBF-TEST Set 1 | 8 input nodes | 28 | 1.000 | 0.0005 | 0.000 | 0.0004 |
| RBF-TEST Set 2 | 8 input nodes | 28 | 1.000 | 0.0005 | 0.0006 | 0.0007 |
| RBF-TEST Set 3 | 8 input nodes | 28 | 1.000 | 0.0221 | 0.0015 | 0.1698 |
| RBF-TEST Set 4 | 8 input nodes | 28 | 1.000 | 0.0004 | 0.0004 | 0.0006 |
| RBF-TEST Set 5 | 8 input nodes | 28 | 1.000 | 0.0002 | 0.0002 | 0.0008 |

cumecs: cubic metres per second; COC: correlation of coefficient

Table B4.23(b): Results of RBF networks for Sg.
Klang catchment – using minimum data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 8 input nodes | 28 | 1.000 | 0.0003 | 0.0002 | 0.0002 |
| RBF-TEST Set 1 | 8 input nodes | 28 | 1.000 | 0.0004 | 0.0001 | 0.0004 |
| RBF-TEST Set 2 | 8 input nodes | 28 | 1.000 | 0.0007 | 0.0002 | 0.0007 |
| RBF-TEST Set 3 | 8 input nodes | 28 | 1.000 | 0.0006 | 0.0003 | 0.0002 |
| RBF-TEST Set 4 | 8 input nodes | 28 | 1.000 | 0.0004 | 0.0001 | 0.0009 |
| RBF-TEST Set 5 | 8 input nodes | 28 | 1.000 | 0.0005 | 0.0002 | 0.0004 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table B4.24(a): Results of RBF networks for Sg. Slim catchment – using 30% of available data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 7 input nodes | 25 | 0.9938 | 0.0309 | 0.0013 | 0.3309 |
| RBF-TEST Set 1 | 7 input nodes | 25 | 0.9814 | 0.0231 | 0.0009 | 0.3906 |
| RBF-TEST Set 2 | 7 input nodes | 25 | 0.9972 | 0.0176 | 0.0007 | 0.2884 |
| RBF-TEST Set 3 | 7 input nodes | 25 | 0.9981 | 0.0234 | 0.0009 | 0.2536 |
| RBF-TEST Set 4 | 7 input nodes | 25 | 0.9995 | 0.0110 | 0.0004 | 0.2091 |
| RBF-TEST Set 5 | 7 input nodes | 25 | 0.9974 | 0.0177 | 0.0007 | 0.3210 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table B4.24(b): Results of RBF networks for Sg. Slim catchment – using minimum data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| RBF TRAINING | 7 input nodes | 25 | 0.9976 | 0.0171 | 0.0007 | 0.2018 |
| RBF-TEST Set 1 | 7 input nodes | 25 | 0.9889 | 0.0179 | 0.0007 | 0.2455 |
| RBF-TEST Set 2 | 7 input nodes | 25 | 0.9992 | 0.0094 | 0.0004 | 0.1299 |
| RBF-TEST Set 3 | 7 input nodes | 25 | 0.9989 | 0.0179 | 0.0007 | 0.1604 |
| RBF-TEST Set 4 | 7 input nodes | 25 | 0.9998 | 0.0066 | 0.0003 | 0.0793 |
| RBF-TEST Set 5 | 7 input nodes | 25 | 0.9994 | 0.0083 | 0.0003 | 0.1339 |

cumecs: cubic metres per second; COC: coefficient of correlation

APPENDIX C Results of application of MLR model

Table C4.26(a): Results of MLR Model for Sg.
Ketil catchment – using 100% of data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR TRAINING | 16 input | 6 | 0.6016 | 19.5476 | 0.6562 | 59.1316 |
| MLR-TEST Set 1 | 16 input | 6 | 0.6783 | 19.9696 | 0.6719 | 60.4182 |
| MLR-TEST Set 2 | 16 input | 6 | 0.7681 | 20.4595 | 0.6871 | 63.1152 |
| MLR-TEST Set 3 | 16 input | 6 | 0.6936 | 19.9418 | 0.6725 | 59.3665 |
| MLR-TEST Set 4 | 16 input | 6 | 0.6742 | 16.0532 | 0.5334 | 45.4869 |
| MLR-TEST Set 5 | 16 input | 6 | 0.7686 | 21.4246 | 0.7204 | 63.6091 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table C4.26(b): Results of MLR Model for Sg. Ketil catchment – using 50% of data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR TRAINING | 16 input | 6 | 0.6207 | 18.8502 | 0.6305 | 55.5948 |
| MLR-TEST Set 1 | 16 input | 6 | 0.6682 | 19.6808 | 0.6622 | 59.3300 |
| MLR-TEST Set 2 | 16 input | 6 | 0.7376 | 20.2302 | 0.6794 | 62.6161 |
| MLR-TEST Set 3 | 16 input | 6 | 0.7012 | 19.6150 | 0.6616 | 57.7664 |
| MLR-TEST Set 4 | 16 input | 6 | 0.6411 | 15.7738 | 0.5242 | 45.1279 |
| MLR-TEST Set 5 | 16 input | 6 | 0.7783 | 21.2494 | 0.7144 | 62.7848 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table C4.26(c): Results of MLR Model for Sg. Ketil catchment – using 25% of data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR TRAINING | 14 input | 6 | 0.6737 | 18.4978 | 0.6204 | 53.3926 |
| MLR-TEST Set 1 | 14 input | 6 | 0.6045 | 19.9485 | 0.6703 | 60.1075 |
| MLR-TEST Set 2 | 14 input | 6 | 0.6970 | 20.6432 | 0.6922 | 64.3279 |
| MLR-TEST Set 3 | 14 input | 6 | 0.6297 | 19.8242 | 0.6677 | 58.2319 |
| MLR-TEST Set 4 | 14 input | 6 | 0.5933 | 17.3075 | 0.5732 | 50.4693 |
| MLR-TEST Set 5 | 14 input | 6 | 0.6932 | 21.7474 | 0.7295 | 64.9861 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table C4.27(a): Results of MLR Model for Sg. Klang catchment – using 100% of data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR TRAINING | 14 input | 6 | 0.6223 | 9.01273 | 0.5319 | 44.6148 |
| MLR-TEST Set 1 | 14 input | 6 | 0.6561 | 10.4614 | 0.5480 | 46.4188 |
| MLR-TEST Set 2 | 14 input | 6 | 0.7899 | 11.6684 | 0.5791 | 50.0855 |
| MLR-TEST Set 3 | 14 input | 6 | 0.6478 | 9.4858 | 0.5104 | 42.2342 |
| MLR-TEST Set 4 | 14 input | 6 | 0.8068 | 13.0131 | 0.4795 | 42.1463 |
| MLR-TEST Set 5 | 14 input | 6 | 0.6886 | 9.0047 | 0.4984 | 38.5141 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table C4.27(b): Results of MLR Model for Sg. Klang catchment – using 50% of data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR TRAINING | 14 input | 6 | 0.7000 | 7.5302 | 0.5251 | 42.0505 |
| MLR-TEST Set 1 | 14 input | 6 | 0.6526 | 10.5988 | 0.5496 | 46.1864 |
| MLR-TEST Set 2 | 14 input | 6 | 0.7831 | 11.5517 | 0.5719 | 49.6184 |
| MLR-TEST Set 3 | 14 input | 6 | 0.6092 | 9.9586 | 0.5281 | 42.5334 |
| MLR-TEST Set 4 | 14 input | 6 | 0.8106 | 12.6366 | 0.4618 | 40.7592 |
| MLR-TEST Set 5 | 14 input | 6 | 0.6732 | 9.3088 | 0.5379 | 39.3836 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table C4.27(c): Results of MLR Model for Sg. Klang catchment – using 25% of data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR TRAINING | 14 input | 6 | 0.7460 | 7.4099 | 0.5079 | 38.2439 |
| MLR-TEST Set 1 | 14 input | 6 | 0.6565 | 10.2078 | 0.5453 | 44.4387 |
| MLR-TEST Set 2 | 14 input | 6 | 0.7852 | 10.7336 | 0.5473 | 46.2175 |
| MLR-TEST Set 3 | 14 input | 6 | 0.6175 | 10.0458 | 0.5480 | 42.6807 |
| MLR-TEST Set 4 | 14 input | 6 | 0.8071 | 11.5182 | 0.4312 | 36.7242 |
| MLR-TEST Set 5 | 14 input | 6 | 0.6581 | 10.5492 | 0.6039 | 42.7946 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table C4.28(a): Results of MLR Model for Sg. Slim catchment – using 100% of data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR TRAINING | 16 input | 6 | 0.5849 | 127.1532 | 1.9319 | 13.2305 |
| MLR-TEST Set 1 | 16 input | 6 | 0.8808 | 201.4210 | 3.0635 | 23.0220 |
| MLR-TEST Set 2 | 16 input | 6 | 0.5501 | 136.4133 | 2.0759 | 16.0203 |
| MLR-TEST Set 3 | 16 input | 6 | 0.8852 | 255.2275 | 3.8814 | 30.7050 |
| MLR-TEST Set 4 | 16 input | 6 | 0.4277 | 160.5863 | 2.4439 | 19.1346 |
| MLR-TEST Set 5 | 16 input | 6 | 0.8525 | 253.1943 | 3.8507 | 31.0617 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table C4.28(b): Results of MLR Model for Sg. Slim catchment – using 50% of data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR TRAINING | 16 input | 6 | 0.7849 | 144.1439 | 2.1923 | 16.0407 |
| MLR-TEST Set 1 | 16 input | 6 | 0.8140 | 217.8754 | 3.3139 | 25.0923 |
| MLR-TEST Set 2 | 16 input | 6 | 0.5289 | 149.0955 | 2.2689 | 17.7358 |
| MLR-TEST Set 3 | 16 input | 6 | 0.8131 | 275.4002 | 4.1882 | 33.1587 |
| MLR-TEST Set 4 | 16 input | 6 | 0.4527 | 174.0586 | 2.6489 | 21.0313 |
| MLR-TEST Set 5 | 16 input | 6 | 0.7794 | 272.6427 | 4.1467 | 33.7076 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table C4.28(c): Results of MLR Model for Sg. Slim catchment – using 25% of data sets in training phase

| Model data set | Model structure | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR TRAINING | 15 input | 6 | 0.7536 | 141.3654 | 2.1502 | 15.0923 |
| MLR-TEST Set 1 | 15 input | 6 | 0.7805 | 208.8311 | 3.1763 | 23.6218 |
| MLR-TEST Set 2 | 15 input | 6 | 0.5076 | 143.1880 | 2.1790 | 16.6663 |
| MLR-TEST Set 3 | 15 input | 6 | 0.7807 | 263.7675 | 4.0113 | 31.1748 |
| MLR-TEST Set 4 | 15 input | 6 | 0.4346 | 165.1244 | 2.5130 | 19.5123 |
| MLR-TEST Set 5 | 15 input | 6 | 0.7500 | 262.0962 | 3.9862 | 31.8572 |

cumecs: cubic metres per second; COC: coefficient of correlation

APPENDIX D Daily and hourly results of the HEC-HMS model calibration

Part A: Daily results

Table D4.30(a): Calibration Coefficients of Sungai Ketil catchment (using 100% of data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 10 |
| Imperviousness (%) | 54 |
| SCS Lag (minutes) | 9985.16 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.99 |

Table D4.30(b): Calibration Coefficients of Sungai Ketil catchment (using 50% data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 2 |
| Imperviousness (%) | 54 |
| SCS Lag (minutes) | 7785.20 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.99 |

Table D4.30(c): Calibration Coefficients of Sungai Ketil catchment (using 25% data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 2 |
| Imperviousness (%) | 54 |
| SCS Lag (minutes) | 5420.80 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.99 |

*Note: Another 2 parameters (catchment size & baseflow) are fixed in the model

Table D4.31(a): Calibration Coefficients of Sungai Klang catchment (using 100% of data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 1.95 |
| Imperviousness (%) | 78 |
| SCS Lag (minutes) | 2842.12 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.998 |

Table D4.31(b): Calibration Coefficients of Sungai Klang catchment (using 50% data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 2.12 |
| Imperviousness (%) | 78 |
| SCS Lag (minutes) | 1807.50 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.998 |

Table D4.31(c): Calibration Coefficients of Sungai Klang catchment
(using 25% data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 1.7 |
| Imperviousness (%) | 78 |
| SCS Lag (minutes) | 1500.00 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.998 |

*Note: Another 2 parameters (catchment size & baseflow) are fixed in the model

Table D4.32(a): Calibration Coefficients of Sungai Slim catchment (using 100% of data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 30 |
| Imperviousness (%) | 35 |
| SCS Lag (minutes) | 16362.26 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.98 |

Table D4.32(b): Calibration Coefficients of Sungai Slim catchment (using 50% data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 20 |
| Imperviousness (%) | 35 |
| SCS Lag (minutes) | 11445.30 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.98 |

Table D4.32(c): Calibration Coefficients of Sungai Slim catchment (using 25% data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 20 |
| Imperviousness (%) | 35 |
| SCS Lag (minutes) | 7995.95 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.98 |

*Note: Another 2 parameters (catchment size & baseflow) are fixed in the model

Part B: Hourly results

Table D4.33(a): Calibration Coefficients of Sungai Ketil catchment (using 25% of data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 34 |
| Imperviousness (%) | 52 |
| SCS Lag (minutes) | 2261.5 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.992 |

Table D4.33(b): Calibration Coefficients of Sungai Ketil catchment (using minimum data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 34 |
| Imperviousness (%) | 56 |
| SCS Lag (minutes) | 2093.5515 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.992 |

Table D4.34(a): Calibration Coefficients of Sungai Klang catchment (using 25% of data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 82 |
| Imperviousness (%) | 82 |
| Time of Concentration (hr) | 5 |
| Storage Coefficient (hr) | 17 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.995 |

*Note:
Another 2 parameters (catchment size & baseflow) are fixed in the model

Table D4.34(b): Calibration Coefficients of Sungai Klang catchment (using minimum data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 81 |
| Imperviousness (%) | 82 |
| Time of Concentration (hr) | 7 |
| Storage Coefficient (hr) | 10 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.995 |

Table D4.35(a): Calibration Coefficients of Sungai Slim catchment (using 25% of data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 55 |
| Imperviousness (%) | 38 |
| Time of Concentration (hr) | 35 |
| Storage Coefficient (hr) | 70 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.985 |

Table D4.35(b): Calibration Coefficients of Sungai Slim catchment (using minimum data)

| Model parameter | Calibrated value |
|---|---|
| Constant Rate (mm/hr) | 40 |
| Imperviousness (%) | 38 |
| Time of Concentration (hr) | 22.6 |
| Storage Coefficient (hr) | 55 |
| Recession Constant | 1 |
| Threshold Flow (cumecs) | 0.985 |

*Note: Another 2 parameters (catchment size & baseflow) are fixed in the model

APPENDIX E Daily and hourly results of application of HEC-HMS model

Part A: Daily Results

Table E4.38(a): Results of HEC-HMS Model for Sg. Ketil catchment using 100% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 7 | 0.4390 | 0.3963 | 0.0131 | 69.6432 |
| HEC-TEST Set 1 | 7 | 0.5513 | 0.4298 | 0.0142 | 29.2748 |
| HEC-TEST Set 2 | 7 | 0.5776 | 0.4479 | 0.0148 | 24.7868 |
| HEC-TEST Set 3 | 7 | 0.5277 | 0.4222 | 0.0140 | 31.7811 |
| HEC-TEST Set 4 | 7 | 0.4448 | 0.6328 | 0.0209 | 28.3676 |
| HEC-TEST Set 5 | 7 | 0.6474 | 0.4462 | 0.0146 | 36.8108 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.38(b): Results of HEC-HMS Model for Sg. Ketil catchment using 50% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 7 | 0.3455 | 0.4099 | 0.0134 | 96.9587 |
| HEC-TEST Set 1 | 7 | 0.3141 | 0.4791 | 0.0158 | 67.3545 |
| HEC-TEST Set 2 | 7 | 0.2948 | 0.5070 | 0.0169 | 58.7408 |
| HEC-TEST Set 3 | 7 | 0.3306 | 0.4645 | 0.0152 | 40.2240 |
| HEC-TEST Set 4 | 7 | 0.2268 | 0.7075 | 0.0235 | 55.9711 |
| HEC-TEST Set 5 | 7 | 0.3425 | 0.5208 | 0.0166 | 89.7145 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.38(c): Results of HEC-HMS Model for Sg. Ketil catchment using 25% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 7 | 0.5696 | 0.4309 | 0.0142 | 30.2807 |
| HEC-TEST Set 1 | 7 | 0.5509 | 0.5112 | 0.0169 | 21.6564 |
| HEC-TEST Set 2 | 7 | 0.6148 | 0.5168 | 0.0169 | 20.8150 |
| HEC-TEST Set 3 | 7 | 0.4700 | 0.5174 | 0.0172 | 62.6365 |
| HEC-TEST Set 4 | 7 | 0.5304 | 0.7440 | 0.0243 | 37.9182 |
| HEC-TEST Set 5 | 7 | 0.4938 | 0.6069 | 0.0202 | 44.7081 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.39(a): Results of HEC-HMS Model for Sg. Klang catchment using 100% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 7 | 0.1095 | 13.6513 | 0.5740 | 44.2057 |
| HEC-TEST Set 1 | 7 | 0.2095 | 16.9564 | 0.5803 | 47.8109 |
| HEC-TEST Set 2 | 7 | 0.2628 | 17.3662 | 0.5397 | 50.1142 |
| HEC-TEST Set 3 | 7 | 0.2723 | 17.3328 | 0.6410 | 48.4294 |
| HEC-TEST Set 4 | 7 | 0.2483 | 21.7856 | 0.6089 | 57.1594 |
| HEC-TEST Set 5 | 7 | 0.3109 | 21.2287 | 0.7834 | 53.2633 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.39(b): Results of HEC-HMS Model for Sg. Klang catchment using 50% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 7 | 0.2559 | 12.7938 | 0.5346 | 44.0716 |
| HEC-TEST Set 1 | 7 | 0.2479 | 17.8388 | 0.5997 | 53.1483 |
| HEC-TEST Set 2 | 7 | 0.2903 | 18.5725 | 0.6022 | 57.6016 |
| HEC-TEST Set 3 | 7 | 0.3090 | 17.9154 | 0.6194 | 51.2026 |
| HEC-TEST Set 4 | 7 | 0.2888 | 23.0966 | 0.6649 | 63.7140 |
| HEC-TEST Set 5 | 7 | 0.4383 | 20.4854 | 0.6630 | 49.0098 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.39(c): Results of HEC-HMS Model for Sg. Klang catchment using 25% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 7 | 0.3418 | 16.5412 | 0.6718 | 50.7731 |
| HEC-TEST Set 1 | 7 | 0.2855 | 20.6104 | 0.6797 | 56.0819 |
| HEC-TEST Set 2 | 7 | 0.3238 | 18.4099 | 0.6095 | 56.4563 |
| HEC-TEST Set 3 | 7 | 0.3383 | 23.4747 | 0.7681 | 58.5394 |
| HEC-TEST Set 4 | 7 | 0.3253 | 23.2594 | 0.6818 | 63.1373 |
| HEC-TEST Set 5 | 7 | 0.4727 | 28.8749 | 0.8762 | 61.2457 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.40(a): Results of HEC-HMS Model for Sg. Slim catchment using 100% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 7 | 0.1560 | 0.0962 | 0.0015 | 103.6129 |
| HEC-TEST Set 1 | 7 | 0.2736 | 0.0928 | 0.0014 | 110.6630 |
| HEC-TEST Set 2 | 7 | 0.1002 | 0.0689 | 0.0010 | 87.0855 |
| HEC-TEST Set 3 | 7 | 0.2636 | 0.1125 | 0.0017 | 137.4980 |
| HEC-TEST Set 4 | 7 | 0.1321 | 0.0696 | 0.0011 | 87.4498 |
| HEC-TEST Set 5 | 7 | 0.1812 | 0.1087 | 0.0016 | 139.3122 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.40(b): Results of HEC-HMS Model for Sg. Slim catchment using 50% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 7 | 0.4214 | 0.0709 | 0.0011 | 80.4609 |
| HEC-TEST Set 1 | 7 | 0.3030 | 0.1085 | 0.0016 | 126.0659 |
| HEC-TEST Set 2 | 7 | 0.1775 | 0.0839 | 0.0013 | 102.1349 |
| HEC-TEST Set 3 | 7 | 0.3001 | 0.1336 | 0.0020 | 160.5670 |
| HEC-TEST Set 4 | 7 | 0.1876 | 0.0834 | 0.0013 | 97.0609 |
| HEC-TEST Set 5 | 7 | 0.2768 | 0.1287 | 0.0020 | 159.2668 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.40(c): Results of HEC-HMS Model for Sg. Slim catchment using 25% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 7 | 0.3832 | 0.0816 | 0.0012 | 89.0290 |
| HEC-TEST Set 1 | 7 | 0.2962 | 0.1172 | 0.0018 | 135.4532 |
| HEC-TEST Set 2 | 7 | 0.1818 | 0.0933 | 0.0014 | 114.7980 |
| HEC-TEST Set 3 | 7 | 0.2955 | 0.1440 | 0.0022 | 169.9900 |
| HEC-TEST Set 4 | 7 | 0.1753 | 0.0939 | 0.0014 | 109.3430 |
| HEC-TEST Set 5 | 7 | 0.2785 | 0.1410 | 0.0021 | 170.1134 |

cumecs: cubic metres per second; COC: coefficient of correlation

Part B: Hourly Results

Table E4.42(a): Results of HEC-HMS Model for Sg. Ketil catchment using 25% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 7 | 0.1001 | 0.9106 | 0.0300 | 26.6170 |
| HEC-TEST Set 1 | 7 | 0.4194 | 0.2831 | 0.0090 | 60.5143 |
| HEC-TEST Set 2 | 7 | 0.2104 | 0.2147 | 0.0122 | 62.5150 |
| HEC-TEST Set 3 | 7 | 0.6183 | 0.1742 | 0.0135 | 49.8880 |
| HEC-TEST Set 4 | 7 | 0.3647 | 0.3924 | 0.0129 | 113.1124 |
| HEC-TEST Set 5 | 7 | 0.5767 | 0.2959 | 0.0242 | 77.1620 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.42(b): Results of HEC-HMS Model for Sg. Ketil catchment using minimum data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 7 | 0.6105 | 0.1276 | 0.0042 | 30.1550 |
| HEC-TEST Set 1 | 7 | 0.3376 | 0.2914 | 0.0093 | 63.1680 |
| HEC-TEST Set 2 | 7 | 0.1519 | 0.2121 | 0.0124 | 61.8832 |
| HEC-TEST Set 3 | 7 | 0.5820 | 0.1767 | 0.0135 | 30.3545 |
| HEC-TEST Set 4 | 7 | 0.2874 | 0.3940 | 0.0132 | 112.4270 |
| HEC-TEST Set 5 | 7 | 0.5086 | 0.2997 | 0.0242 | 79.4885 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.43(a): Results of HEC-HMS Model for Sg. Klang catchment using 40% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 8 | 0.0914 | 35.2076 | 2.1584 | 91.8729 |
| HEC-TEST Set 1 | 8 | 0.4036 | 33.8449 | 1.3703 | 50.8431 |
| HEC-TEST Set 2 | 8 | 0.8496 | 6.8700 | 0.1902 | 13.9516 |
| HEC-TEST Set 3 | 8 | 0.1021 | 23.9458 | 0.8421 | 62.6154 |
| HEC-TEST Set 4 | 8 | 0.2208 | 35.4623 | 0.6088 | 32.6375 |
| HEC-TEST Set 5 | 8 | 0.4024 | 14.5522 | 0.2324 | 40.3196 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.43(b): Results of HEC-HMS Model for Sg. Klang catchment using minimum data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 8 | 0.1830 | 21.5490 | 0.8719 | 88.7204 |
| HEC-TEST Set 1 | 8 | 0.5340 | 20.0460 | 0.3592 | 29.6470 |
| HEC-TEST Set 2 | 8 | 0.8865 | 12.4918 | 0.2156 | 13.9121 |
| HEC-TEST Set 3 | 8 | 0.2895 | 18.2621 | 0.9586 | 56.5022 |
| HEC-TEST Set 4 | 8 | 0.3169 | 34.0235 | 0.6801 | 42.1594 |
| HEC-TEST Set 5 | 8 | 0.2542 | 16.0454 | 0.2305 | 97.0262 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.44(a): Results of HEC-HMS Model for Sg. Slim catchment using 30% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 8 | 0.2128 | 0.4141 | 0.0166 | 12.8902 |
| HEC-TEST Set 1 | 8 | 0.3872 | 0.3399 | 0.0088 | 10.7243 |
| HEC-TEST Set 2 | 8 | 0.1097 | 0.2178 | 0.0173 | 70.3790 |
| HEC-TEST Set 3 | 8 | 0.0070 | 0.4703 | 0.0262 | 157.2870 |
| HEC-TEST Set 4 | 8 | 0.1677 | 0.2545 | 0.0236 | 84.0278 |
| HEC-TEST Set 5 | 8 | 0.1443 | 0.3269 | 0.0131 | 98.5790 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table E4.44(b): Results of HEC-HMS Model for Sg. Slim catchment using minimum data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| HEC TRAINING | 8 | 0.5154 | 0.3566 | 0.0145 | 23.1082 |
| HEC-TEST Set 1 | 8 | 0.3450 | 0.4218 | 0.0110 | 114.3003 |
| HEC-TEST Set 2 | 8 | 0.1347 | 0.3156 | 0.0227 | 95.2197 |
| HEC-TEST Set 3 | 8 | 0.0285 | 0.5998 | 0.0285 | 120.5626 |
| HEC-TEST Set 4 | 8 | 0.4791 | 0.4081 | 0.0216 | 30.2660 |
| HEC-TEST Set 5 | 8 | 0.4922 | 0.3948 | 0.0100 | 32.3102 |

cumecs: cubic metres per second; COC: coefficient of correlation

APPENDIX F Daily and hourly results of the SWMM model calibration

Part A: Daily results

Table F4.46(a): Calibration Coefficients of Sungai Ketil catchment (using 100% of data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 45 |
| Time of Concentration (hr) | 66.42 |
| Decay rate of infiltration | 0.0025 |

Table F4.46(b): Calibration Coefficients of Sungai Ketil catchment (using 50% data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 45 |
| Time of Concentration (hr) | 29.75 |
| Decay rate of infiltration | 0.002 |

Table F4.46(c): Calibration Coefficients of Sungai Ketil catchment (using 25% data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 45 |
| Time of Concentration (hr) | 30.35 |
| Decay rate of infiltration | 0.002 |

Table F4.47(a): Calibration Coefficients of Sungai Klang catchment (using 100% of data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 78 |
| Pervious Area CN | 30 |
| Time of Concentration (hr) | 7.37 |
| Initial Abstraction | 0.14 |
| Decay rate of infiltration | 0.001 |

Table F4.47(b): Calibration Coefficients of Sungai Klang catchment (using 50% data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 75 |
| Pervious Area CN | 30 |
| Time of Concentration (hr) | 3.13 |
| Initial Abstraction | 0.14 |
| Decay rate of infiltration | 0.001 |

Table F4.47(c): Calibration Coefficients of Sungai Klang catchment (using 25% data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 75 |
| Pervious Area CN | 30 |
| Time of Concentration (hr) | 2.5 |
| Initial Abstraction | 0.12 |
| Decay rate of infiltration | 0.001 |

Table F4.48(a): Calibration Coefficients of Sungai Slim catchment (using 100% of data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 35 |
| Pervious Area CN | 95 |
| Time of Concentration (hr) | 72.7 |
| Initial Abstraction | 0.16 |
| Decay rate of infiltration | 0.0013 |

Table F4.48(b): Calibration Coefficients of Sungai Slim catchment (using 50% data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 35 |
| Pervious Area CN | 95 |
| Time of Concentration (hr) | 90.76 |
| Initial Abstraction | 0.15 |
| Decay rate of infiltration | 0.0013 |

Table F4.48(c): Calibration Coefficients of Sungai Slim catchment (using 25% data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 35 |
| Pervious Area CN | 95 |
| Time of Concentration (hr) | 33.26 |
| Initial Abstraction | 0.15 |
| Decay rate of infiltration | 0.0012 |

Part B: Hourly results

Table F4.50(a): Calibration Coefficients of Sungai Ketil catchment (using 25% of data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 46 |
| Pervious Area CN | 84 |
| Time of Concentration (hr) | 37.69 |
| Decay rate of infiltration | 0.00125 |

Table F4.50(b): Calibration Coefficients of Sungai Ketil catchment (using minimum data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 46 |
| Pervious Area CN | 84 |
| Time of Concentration (hr) | 34.89 |
| Decay rate of infiltration | 0.00125 |

Table F4.51(a): Calibration Coefficients of Sungai Klang catchment (using 25% of data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 82 |
| Pervious Area CN | 98 |
| Time of Concentration (hr) | 5 |
| Decay rate of infiltration | 0.001 |

Table F4.51(b):
Calibration Coefficients of Sungai Klang catchment (using minimum data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 82 |
| Pervious Area CN | 98 |
| Time of Concentration (hr) | 6 |
| Decay rate of infiltration | 0.001 |

Table F4.52(a): Calibration Coefficients of Sungai Slim catchment (using 25% of data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 38 |
| Pervious Area CN | 92 |
| Time of Concentration (hr) | 50 |
| Initial Abstraction | 0.15 |
| Decay rate of infiltration | 0.0012 |

Table F4.52(b): Calibration Coefficients of Sungai Slim catchment (using minimum data)

| Model parameter | Calibrated value |
|---|---|
| Imperviousness (%) | 38 |
| Pervious Area CN | 92 |
| Time of Concentration (hr) | 51.5 |
| Initial Abstraction | 0.15 |
| Decay rate of infiltration | 0.0012 |

APPENDIX G Daily and hourly results of application of SWMM model

Part A: Daily Results

Table G4.54(a): Results of SWMM Model for Sg. Ketil catchment using 100% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 5 | 0.5848 | 0.2378 | 0.0595 | 28.78643 |
| SWMM TEST Set 1 | 5 | 0.6421 | 0.3412 | 0.0210 | 29.07223 |
| SWMM TEST Set 2 | 5 | 0.4532 | 0.5060 | 0.0105 | 37.01045 |
| SWMM TEST Set 3 | 5 | 0.6210 | 0.5239 | 0.0102 | 29.79555 |
| SWMM TEST Set 4 | 5 | 0.4543 | 0.3456 | 0.0247 | 54.5238 |
| SWMM TEST Set 5 | 5 | 0.7432 | 0.3499 | 0.0122 | 28.9544 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.54(b): Results of SWMM Model for Sg. Ketil catchment using 50% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 5 | 0.4023 | 0.4087 | 0.0110 | 98.7632 |
| SWMM TEST Set 1 | 5 | 0.3027 | 0.5010 | 0.0201 | 62.1008 |
| SWMM TEST Set 2 | 5 | 0.3075 | 0.5021 | 0.0154 | 117.651 |
| SWMM TEST Set 3 | 5 | 0.4033 | 0.4354 | 0.0144 | 91.0868 |
| SWMM TEST Set 4 | 5 | 0.3202 | 0.6678 | 0.0231 | 83.5644 |
| SWMM TEST Set 5 | 5 | 0.3356 | 0.6020 | 0.0187 | 100.5443 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.54(c): Results of SWMM Model for Sg.
Ketil catchment using 25% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 5 | 0.5645 | 0.4452 | 0.0188 | 30.3644 |
| SWMM TEST Set 1 | 5 | 0.5643 | 0.5336 | 0.0232 | 32.0901 |
| SWMM TEST Set 2 | 5 | 0.5874 | 0.6100 | 0.0166 | 30.7375 |
| SWMM TEST Set 3 | 5 | 0.4986 | 0.5121 | 0.0194 | 32.6852 |
| SWMM TEST Set 4 | 5 | 0.5412 | 0.6786 | 0.0343 | 28.8043 |
| SWMM TEST Set 5 | 5 | 0.5547 | 0.5611 | 0.0233 | 27.9174 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.55(a): Results of SWMM Model for Sg. Klang catchment using 100% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 7 | 0.2102 | 12.7491 | 0.4675 | 42.2178 |
| SWMM TEST Set 1 | 7 | 0.2033 | 15.9874 | 0.5732 | 45.9753 |
| SWMM TEST Set 2 | 7 | 0.3209 | 17.4323 | 0.5258 | 50.6578 |
| SWMM TEST Set 3 | 7 | 0.3754 | 17.0427 | 0.6361 | 47.9233 |
| SWMM TEST Set 4 | 7 | 0.4211 | 19.6473 | 0.7006 | 55.8722 |
| SWMM TEST Set 5 | 7 | 0.4039 | 20.0037 | 0.6984 | 51.8643 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.55(b): Results of SWMM Model for Sg. Klang catchment using 50% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 7 | 0.2627 | 11.6734 | 0.5022 | 42.7446 |
| SWMM TEST Set 1 | 7 | 0.2986 | 15.7943 | 0.5031 | 52.9911 |
| SWMM TEST Set 2 | 7 | 0.3002 | 18.2109 | 0.6012 | 57.2412 |
| SWMM TEST Set 3 | 7 | 0.3077 | 16.7822 | 0.5988 | 49.0133 |
| SWMM TEST Set 4 | 7 | 0.3106 | 20.0109 | 0.6808 | 60.5186 |
| SWMM TEST Set 5 | 7 | 0.4354 | 19.0574 | 0.6701 | 49.0076 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.55(c): Results of SWMM Model for Sg. Klang catchment using 25% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 7 | 0.4020 | 15.4677 | 0.6521 | 49.0921 |
| SWMM TEST Set 1 | 7 | 0.3928 | 18.5466 | 0.6753 | 56.2100 |
| SWMM TEST Set 2 | 7 | 0.3205 | 18.0912 | 0.6974 | 55.4325 |
| SWMM TEST Set 3 | 7 | 0.3487 | 21.0231 | 0.7987 | 57.9862 |
| SWMM TEST Set 4 | 7 | 0.3243 | 23.7544 | 0.6328 | 64.0881 |
| SWMM TEST Set 5 | 7 | 0.4733 | 27.8754 | 0.9001 | 60.6781 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.56(a): Results of SWMM Model for Sg. Slim catchment using 100% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 7 | 0.2160 | 0.0875 | 0.0015 | 110.3453 |
| SWMM TEST Set 1 | 7 | 0.2768 | 0.0732 | 0.0018 | 91.20311 |
| SWMM TEST Set 2 | 7 | 0.2069 | 0.0659 | 0.0020 | 80.6877 |
| SWMM TEST Set 3 | 7 | 0.3103 | 0.1204 | 0.0029 | 148.3300 |
| SWMM TEST Set 4 | 7 | 0.2059 | 0.0593 | 0.0010 | 107.5987 |
| SWMM TEST Set 5 | 7 | 0.1805 | 0.0968 | 0.0013 | 113.0632 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.56(b): Results of SWMM Model for Sg. Slim catchment using 50% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 7 | 0.5402 | 0.0695 | 0.0018 | 30.8320 |
| SWMM TEST Set 1 | 7 | 0.3402 | 0.1203 | 0.0017 | 61.3043 |
| SWMM TEST Set 2 | 7 | 0.2103 | 0.0783 | 0.0019 | 51.0115 |
| SWMM TEST Set 3 | 7 | 0.2960 | 0.1269 | 0.0012 | 51.5927 |
| SWMM TEST Set 4 | 7 | 0.1975 | 0.0782 | 0.0011 | 80.8323 |
| SWMM TEST Set 5 | 7 | 0.3029 | 0.1302 | 0.0021 | 101.4110 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.56(c): Results of SWMM Model for Sg. Slim catchment using 25% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 7 | 0.3938 | 0.0683 | 0.0010 | 80.7394 |
| SWMM TEST Set 1 | 7 | 0.3018 | 0.1110 | 0.0016 | 61.4202 |
| SWMM TEST Set 2 | 7 | 0.2019 | 0.0843 | 0.0014 | 91.0500 |
| SWMM TEST Set 3 | 7 | 0.2899 | 0.1200 | 0.0022 | 111.7020 |
| SWMM TEST Set 4 | 7 | 0.1900 | 0.0684 | 0.1106 | 91.0903 |
| SWMM TEST Set 5 | 7 | 0.3905 | 0.1520 | 0.0018 | 51.0997 |

cumecs: cubic metres per second; COC: coefficient of correlation

Part B: Hourly Results

Table G4.58(a): Results of SWMM Model for Sg. Ketil catchment using 25% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 6 | 0.1867 | 0.8102 | 0.0293 | 82.0927 |
| SWMM TEST Set 1 | 6 | 0.4540 | 0.1403 | 0.102 | 50.4392 |
| SWMM TEST Set 2 | 6 | 0.3828 | 0.2010 | 0.0192 | 50.6442 |
| SWMM TEST Set 3 | 6 | 0.6685 | 0.1948 | 0.0110 | 30.5390 |
| SWMM TEST Set 4 | 6 | 0.4102 | 0.3868 | 0.0200 | 51.0173 |
| SWMM TEST Set 5 | 6 | 0.5983 | 0.2344 | 0.0194 | 30.6322 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.58(b): Results of SWMM Model for Sg. Ketil catchment using minimum data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 6 | 0.6039 | 0.1109 | 0.0184 | 34.0837 |
| SWMM TEST Set 1 | 6 | 0.4292 | 0.3029 | 0.0103 | 36.2025 |
| SWMM TEST Set 2 | 6 | 0.2203 | 0.3039 | 0.0108 | 55.0239 |
| SWMM TEST Set 3 | 6 | 0.6054 | 0.1463 | 0.0192 | 35.0252 |
| SWMM TEST Set 4 | 6 | 0.2982 | 0.2929 | 0.0206 | 71.0491 |
| SWMM TEST Set 5 | 6 | 0.5022 | 0.1948 | 0.0215 | 37.8588 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.59(a): Results of SWMM Model for Sg. Klang catchment using 40% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 6 | 0.1932 | 35.8969 | 2.0594 | 88.8810 |
| SWMM TEST Set 1 | 6 | 0.4029 | 32.7570 | 1.3029 | 50.0010 |
| SWMM TEST Set 2 | 6 | 0.8506 | 6.2733 | 0.0948 | 11.6454 |
| SWMM TEST Set 3 | 6 | 0.1896 | 21.7890 | 0.7574 | 55.1276 |
| SWMM TEST Set 4 | 6 | 0.2985 | 33.2093 | 0.6932 | 30.0553 |
| SWMM TEST Set 5 | 6 | 0.4838 | 14.0026 | 0.3292 | 17.4222 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.59(b): Results of SWMM Model for Sg. Klang catchment using minimum data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 6 | 0.1803 | 20.2293 | 0.9302 | 112.0912 |
| SWMM TEST Set 1 | 6 | 0.6828 | 17.9695 | 0.2002 | 27.9887 |
| SWMM TEST Set 2 | 6 | 0.7904 | 12.0092 | 0.1095 | 13.0123 |
| SWMM TEST Set 3 | 6 | 0.3069 | 15.8282 | 0.8604 | 54.6571 |
| SWMM TEST Set 4 | 6 | 0.4082 | 32.9920 | 0.5532 | 38.1953 |
| SWMM TEST Set 5 | 6 | 0.3029 | 15.0290 | 0.2010 | 34.1208 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.60(a): Results of SWMM Model for Sg. Slim catchment using 30% of data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 7 | 0.3059 | 0.3312 | 0.0249 | 51.0921 |
| SWMM TEST Set 1 | 7 | 0.4502 | 0.2090 | 0.0083 | 51.0322 |
| SWMM TEST Set 2 | 7 | 0.2097 | 0.1110 | 0.0129 | 80.5021 |
| SWMM TEST Set 3 | 7 | 0.0193 | 0.6022 | 0.0194 | 102.019 |
| SWMM TEST Set 4 | 7 | 0.3020 | 0.3002 | 0.0294 | 30.2933 |
| SWMM TEST Set 5 | 7 | 0.1099 | 0.2001 | 0.0122 | 60.4059 |

cumecs: cubic metres per second; COC: coefficient of correlation

Table G4.60(b): Results of SWMM Model for Sg. Slim catchment using minimum data sets in training phase

| Model data set | No. of parameters | COC (R²) | RMSE (cumecs) | RRMSE | MAPE (%) |
|---|---|---|---|---|---|
| SWMM TRAINING | 7 | 0.7039 | 0.2010 | 0.0105 | 21.0192 |
| SWMM TEST Set 1 | 7 | 0.3594 | 0.3302 | 0.0193 | 51.1030 |
| SWMM TEST Set 2 | 7 | 0.2019 | 0.2793 | 0.0110 | 70.5066 |
| SWMM TEST Set 3 | 7 | 0.0295 | 0.4029 | 0.0392 | 111.7868 |
| SWMM TEST Set 4 | 7 | 0.5094 | 0.4029 | 0.0102 | 11.0022 |
| SWMM TEST Set 5 | 7 | 0.5731 | 0.2192 | 0.0090 | 10.9403 |

cumecs: cubic metres per second; COC: coefficient of correlation

APPENDIX H Daily and hourly results of calibration/training process

Part A: Daily Results

Table H4.61: The percentage bias (PBIAS) of model calibration/training of Sungai Bekok catchment

| Model | PBIAS (%), calibration (a) 100% | PBIAS (%), calibration (b) 50% | PBIAS (%), calibration (c) 25% |
|---|---|---|---|
| 3-MLP | +0.072 | +0.602 | +0.289 |
| 4-MLP | +0.072 | +0.078 | +0.099 |
| RBF | -1.596 | -0.913 | -0.682 |
| MLR | +280.61 | +300.04 | +290.68 |
| HEC-HMS | +8.555 | +0.947 | +0.629 |
| XP-SWMM | +6.286 | +1.203 | +0.678 |

Table H4.62: The percentage bias (PBIAS) of model calibration/training of Sungai Ketil catchment

| Model | PBIAS (%), calibration (a) 100% | PBIAS (%), calibration (b) 50% | PBIAS (%), calibration (c) 25% |
|---|---|---|---|
| 3-MLP | -0.206 | -0.734 | +0.325 |
| 4-MLP | -0.116 | +0.410 | +0.153 |
| RBF | -0.098 | -0.101 | -0.089 |
| MLR | -53.39 | -47.62 | -45.18 |
| HEC-HMS | +0.064 | +0.008 | +0.008 |
| XP-SWMM | +0.082 | -0.011 | -0.070 |

Table H4.63: The percentage bias (PBIAS) of model calibration/training of Sungai Klang catchment

| Model | PBIAS (%), calibration (a) 100% | PBIAS (%), calibration (b) 50% | PBIAS (%), calibration (c) 25% |
|---|---|---|---|
| 3-MLP | +0.278 | -0.099 | -1.842 |
| 4-MLP | -0.075 | +0.111 | +3.903 |
| RBF | -5.439 | -5.515 | -5.824 |
| MLR | -27.41 | -11.50 | -8.55 |
| HEC-HMS | -35.01 | -38.40 | -26.43 |
| XP-SWMM | -28.93 | -25.03 | -17.62 |

Table H4.64: The percentage bias (PBIAS) of model calibration/training of Sungai Slim catchment

| Model | PBIAS (%), calibration (a) 100% | PBIAS (%), calibration (b) 50% | PBIAS (%), calibration (c) 25% |
|---|---|---|---|
| 3-MLP | -0.003 | +0.009 | +0.036 |
| 4-MLP | +0.004 | +0.001 | +0.000 |
| RBF | -0.023 | -0.010 | -0.007 |
| MLR | +111.24 | +147.14 | +138.56 |
| HEC-HMS | -0.019 | -0.007 | +0.006 |
| XP-SWMM | +0.016 | -0.008 | -0.008 |

Part B: Hourly Results

Table H4.65: The percentage bias (PBIAS) of model calibration/training of Sungai Bekok catchment
PBIAS (%) Model Calibration (a) 100% (b) 50% (c) 25%
3-MLP +0.002 +0.002 +0.010
4-MLP +0.002 +0.002 +0.167
RBF +0.003 (opt) -0.054 (min)
HEC-HMS +0.532 -6.490
XP-SWMM +0.455 -5.332
Table H4.66: The
percentage bias (PBIAS) of model calibration/training of Sungai Ketil catchment Model PBIAS (%) Calibration (a) 100% (b) 50% (c) 25% 3-MLP +0.003 -0.000 +0.041 4-MLP +0.002 +0.000 +0.001 RBF +0.000 (opt) +0.007 (min) HEC-HMS -2.679 -0.010 XP-SWMM -2.875 -0.011 299 APPENDIX H Daily and hourly results of calibration/training process Table H4.67: The percentage bias (PBIAS) of model calibration/training of Sungai Klang catchment Model PBIAS (%) Calibration (a) 100% (b) 50% (c) 25% 3-MLP -0.004 +0.101 +0.476 4-MLP -0.018 -0.306 +0.242 RBF -0.000 (opt) +0.000 (min) HEC-HMS +57.19 +62.29 XP-SWMM +55.01 +48.22 Table H4.68: The percentage bias (PBIAS) of model calibration/training of Sungai Slim catchment Model PBIAS (%) Calibration (a) 100% (b) 50% (c) 25% 3-MLP +0.000 +0.001 +0.001 4-MLP +0.003 -0.000 -0.004 RBF +0.000 (opt) +0.000 (min) HEC-HMS -0.849 +1.055 XP-SWMM -0.445 +0.992 300 APPENDIX I 301 Figures illustrate the daily and hourly result of ANN models Figure I4.3(a) Daily results of 3-Layer neural networks for Sg. Ketil catchment using 100% of data sets in training phase APPENDIX I 302 Figures illustrate the daily and hourly result of ANN models Figure I4.3(b) Daily results of 3-Layer neural networks for Sg. Ketil catchment using 50% of data sets in training phase APPENDIX I 303 Figures illustrate the daily and hourly result of ANN models Figure I4.3(c) Daily results of 3-Layer neural networks for Sg. Ketil catchment using 25% of data sets in training phase APPENDIX I 304 Figures illustrate the daily and hourly result of ANN models Figure I4.4(a) Daily results of 4-Layer neural networks for Sg. Ketil catchment using 100% of data sets in training phase APPENDIX I 305 Figures illustrate the daily and hourly result of ANN models Figure I4.4(b) Daily results of 4-Layer neural networks for Sg. 
Ketil catchment using 50% of data sets in training phase
Figure I4.4(c) Daily results of 4-Layer neural networks for Sg. Ketil catchment using 25% of data sets in training phase
Figure I4.18(a) Daily results of RBF networks for Sg. Ketil catchment using 100% of data sets in training phase
Figure I4.18(b) Daily results of RBF networks for Sg. Ketil catchment using 50% of data sets in training phase
Figure I4.18(c) Daily results of RBF networks for Sg. Ketil catchment using 25% of data sets in training phase
Figure I4.5(a) Daily results of 3-Layer neural networks for Sg. Klang catchment using 100% of data sets in training phase
Figure I4.5(b) Daily results of 3-Layer neural networks for Sg. Klang catchment using 50% of data sets in training phase
Figure I4.5(c) Daily results of 3-Layer neural networks for Sg. Klang catchment using 25% of data sets in training phase
Figure I4.6(a) Daily results of 4-Layer neural networks for Sg. Klang catchment using 100% of data sets in training phase
Figure I4.6(b) Daily results of 4-Layer neural networks for Sg. Klang catchment using 50% of data sets in training phase
Figure I4.6(c) Daily results of 4-Layer neural networks for Sg. Klang catchment using 25% of data sets in training phase
Figure I4.19(a) Daily results of RBF networks for Sg. Klang catchment using 100% of data sets in training phase
Figure I4.19(b) Daily results of RBF networks for Sg. Klang catchment using 50% of data sets in training phase
Figure I4.19(c) Daily results of RBF networks for Sg. Klang catchment using 25% of data sets in training phase
Figure I4.7(a) Daily results of 3-Layer neural networks for Sg. Slim catchment using 100% of data sets in training phase
Figure I4.7(b) Daily results of 3-Layer neural networks for Sg. Slim catchment using 50% of data sets in training phase
Figure I4.7(c) Daily results of 3-Layer neural networks for Sg. Slim catchment using 25% of data sets in training phase
Figure I4.8(a) Daily results of 4-Layer neural networks for Sg. Slim catchment using 100% of data sets in training phase
Figure I4.8(b) Daily results of 4-Layer neural networks for Sg. Slim catchment using 50% of data sets in training phase
Figure I4.8(c) Daily results of 4-Layer neural networks for Sg. Slim catchment using 25% of data sets in training phase
Figure I4.20(a) Daily results of RBF networks for Sg. Slim catchment using 100% of data sets in training phase
Figure I4.20(b) Daily results of RBF networks for Sg. Slim catchment using 50% of data sets in training phase
Figure I4.20(c) Daily results of RBF networks for Sg. Slim catchment using 25% of data sets in training phase
Figure I4.11(a) Hourly results of 3-Layer neural networks for Sg. Ketil catchment using 100% of available data sets in training phase
Figure I4.11(b) Hourly results of 3-Layer neural networks for Sg. Ketil catchment using 65% of available data sets in training phase
Figure I4.11(c) Hourly results of 3-Layer neural networks for Sg. Ketil catchment using 25% of available data sets in training phase
Figure I4.12(a) Hourly results of 4-Layer neural networks for Sg. Ketil catchment using 100% of available data sets in training phase
Figure I4.12(b) Hourly results of 4-Layer neural networks for Sg. Ketil catchment using 65% of available data sets in training phase
Figure I4.12(c) Hourly results of 4-Layer neural networks for Sg. Ketil catchment using 25% of available data sets in training phase
Figure I4.22(a) Hourly results of RBF networks for Sg. Ketil catchment using 25% of available data sets in training phase
Figure I4.22(b) Hourly results of RBF networks for Sg. Ketil catchment using the minimum of available data sets in training phase
Figure I4.13(a) Hourly results of 3-Layer neural networks for Sg. Klang catchment using 100% of available data sets in training phase
Figure I4.13(b) Hourly results of 3-Layer neural networks for Sg. Klang catchment using 65% of available data sets in training phase
Figure I4.13(c) Hourly results of 3-Layer neural networks for Sg. Klang catchment using 25% of available data sets in training phase
Figure I4.14(a) Hourly results of 4-Layer neural networks for Sg. Klang catchment using 100% of available data sets in training phase
Figure I4.14(b) Hourly results of 4-Layer neural networks for Sg. Klang catchment using 65% of available data sets in training phase
Figure I4.14(c) Hourly results of 4-Layer neural networks for Sg. Klang catchment using 25% of available data sets in training phase
Figure I4.23(a) Hourly results of RBF networks for Sg. Klang catchment using 25% of available data sets in training phase
Figure I4.23(b) Hourly results of RBF networks for Sg. Klang catchment using the minimum of available data sets in training phase
Figure I4.15(a) Hourly results of 3-Layer neural networks for Sg. Slim catchment using 100% of available data sets in training phase
Figure I4.15(b) Hourly results of 3-Layer neural networks for Sg. Slim catchment using 65% of available data sets in training phase
Figure I4.15(c) Hourly results of 3-Layer neural networks for Sg. Slim catchment using 25% of available data sets in training phase
Figure I4.16(a) Hourly results of 4-Layer neural networks for Sg. Slim catchment using 100% of available data sets in training phase
Figure I4.16(b) Hourly results of 4-Layer neural networks for Sg. Slim catchment using 65% of available data sets in training phase
Figure I4.16(c) Hourly results of 4-Layer neural networks for Sg. Slim catchment using 25% of available data sets in training phase
Figure I4.24(a) Hourly results of RBF networks for Sg. Slim catchment using 25% of available data sets in training phase
Figure I4.24(b) Hourly results of RBF networks for Sg. Slim catchment using the minimum of available data sets in training phase

APPENDIX J
Architecture of daily and hourly MLP network structures

Figure J4.26(a): The 3-layer MLP network structures of the daily model for Sg. Ketil catchment.
Figure J4.26(b): The 4-layer MLP network structures of the daily model for Sg. Ketil catchment.
Figure J4.27(a): The 3-layer MLP network structures of the daily model for Sg. Klang catchment.
Figure J4.27(b): The 4-layer MLP network structures of the daily model for Sg. Klang catchment.
Figure J4.28(a): The 3-layer MLP network structures of the daily model for Sg. Slim catchment.
Figure J4.28(b): The 4-layer MLP network structures of the daily model for Sg. Slim catchment.
Figure J4.30(a): The 3-layer MLP network structures of the hourly model for Sg. Ketil catchment.
Figure J4.30(b): The 4-layer MLP network structures of the hourly model for Sg. Ketil catchment.
Figure J4.31(a): The 3-layer MLP network structures of the hourly model for Sg. Klang catchment.
Figure J4.31(b): The 4-layer MLP network structures of the hourly model for Sg. Klang catchment.
Figure J4.32(a): The 3-layer MLP network structures of the hourly model for Sg. Slim catchment.
Figure J4.32(b): The 4-layer MLP network structures of the hourly model for Sg. Slim catchment.
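
As a minimal sketch of how the 3-layer and 4-layer structures named in this appendix may differ, the Python example below assumes that "3-layer" counts the input layer, one hidden layer, and the output layer, and that "4-layer" adds a second hidden layer. The node counts, the sigmoid hidden activation, the linear output node, the random weights, and the function names build_mlp and forward are illustrative assumptions only; they are not the configurations drawn in Figures J4.26 to J4.32.

```python
import math
import random


def sigmoid(x):
    """Logistic activation, assumed here for the hidden nodes."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def build_mlp(sizes, seed=0):
    """Build an untrained MLP from a list of layer sizes, e.g. [4, 6, 1]."""
    rng = random.Random(seed)
    return [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]


def forward(layers, inputs):
    """Propagate one input vector through the network.

    Hidden layers apply the sigmoid; the final (output) layer is linear.
    """
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        sums = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        is_output = index == len(layers) - 1
        activations = sums if is_output else [sigmoid(s) for s in sums]
    return activations


# Hypothetical node counts, chosen only for illustration.
mlp_3_layer = build_mlp([4, 6, 1])     # input, one hidden layer, output
mlp_4_layer = build_mlp([4, 6, 3, 1])  # input, two hidden layers, output

# Made-up, already-scaled inputs standing in for lagged rainfall/runoff values.
sample = [0.2, 0.5, 0.1, 0.7]
runoff_3 = forward(mlp_3_layer, sample)
runoff_4 = forward(mlp_4_layer, sample)
```

Under these assumptions the 3-layer network holds two weight matrices and the 4-layer network holds three; both return a single output value per input vector, which would stand for the simulated runoff after training.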