Mulungushi river flow modelling in the headstreams of the Zambezi basin, Zambia J.M. Kampata1,2, B.P. Parida1, D.B. Moalafhi1, D. Mazvimavi3 and S. Ringrose4 1 University of Botswana, Faculty of Science, Department of Environmental Science, P/Bag UB 00704, Gaborone, Botswana. 2 Ministry of Energy and Water Development, Department of Water Affairs, P.O Box 50288, Lusaka. Zambia. 3 4 University of Western Cape, South Africa. University of Botswana, HOORC, Maun, Botswana. Abstract The Mulungushi river basin in central Zambia is the major source of water for demand centres and/or water users including Kabwe mining township, the Mulungushi University, subsistence and commercial agricultural enterprises, local communities and the Lunsemfwa Hydropower Company further downstream. The water managers overseeing the water provision services over the area experience difficulties particularly during the dry season in rationing water allocations amongst the various users as the temporal variations in available water (river flows) for the competing uses is influenced by largely ‘unexplained’ complex interplay of factors such as climatic variables which subsequently affect the river flow. Since Artificial Neural Network (ANN) models are capable of mimicking complex processes, an attempt has therefore been made to develop an appropriate network which can be used for predicting the observed river flows based on rainfall and evaporation. This then formed the basis from which short term forecasts of river flow were made to help for onward planning. For this, readily available data in terms of. two major input variables viz: the monthly rainfall and monthly evaporation and the output variables which are the concurrent monthly discharges have been considered. A two layer feed-forward network with one hidden layer with 30 neurons through which information gets transferred with the help of the sigmoidal transfer function resulted as the best network with a coefficient of determination (r2) of 0.831 and Mean squared Error (MSE) of 1.55. The training alogorithm used was the Levenberg-Marquardt back propagation with early stopping in which data was divided into three groups for training, validation and testing. The same network using rainfall as the only input resulted in the least MSE of 2.07 and with r2 of 0.804, while the optimum network using evaporation as the only input gave MSE and r2 of 14 and 0.630 respectively. Key words: Artificial neural networks, Levenberg-Marquardt back propagation, Mean Squared Error, Neurons, Sigmoidal transfer function. Corresponding Author email: jkampata@yahoo.com 1 1. Introduction The inadequate knowledge and information on water resources required for decision-making and long-term planning continues to be a major constraint for effective water resources management (World Water Assessment Programme, 2009). Modelling of rainfall- runoff or the prediction of river flows provides water managers with information to quantify the water resources of a river basin. This is particularly critical for equitable water allocation amidst the increasing demands from various users. The deterministic/ conceptual models (that consider the physics of the underlying process) and systems theoretic/black-box models (that do not consider the underlying physics of the process) are the two major types of models that have been used for modelling the rainfallrunoff or prediction of river flow (Dawson and Wilby, 2001; Srinivasulu and Jain, 2006; Mutlu et al., 2008; Akhtar et al., 2009). The deterministic/conceptual models are more suitable for modelling virgin flows, and are not able to perform adequately in catchments altered due to manmade or other related activity (e.g. many storage structures), which essentially affect the true dynamic nature of the rainfall–runoff relationship in a catchment. The systems theoretic models, on the other hand attempt to develop relationships among input and output variables involved in a physical process without considering the underlying physical process (Srinivasulu and Jain, 2006). Practitioners in water resources have primarily used simple linear regression or time series models to obtain approximations of the relationships between variables (Maier and Dandy, 2000). One reason for this is that the rules governing sophisticated statistical models have generally been considered to be too restrictive and to make it too difficult to utilise them for real-life applications (Maier and Dandy, 2000). Rainfall- runoff process exhibit complex nonlinear dynamics (Abrahart and See, 2007; Aqil et al., 2007). Considering the complexity of this process, there is a strong need to explore solutions, which can extract the complexity of the rainfall – runoff process even without having the complete physical understanding of the system (Aqil et al., 2007). The use of linear regression however raises a fundamental violation of the characteristics of water resources variables which are highly non-linear. In addition water resources variables are often not normally distributed, and their nature is such that it is extremely difficult, if not impossible, to find suitable transformations to normality (Maier and Dandy, 2001). To overcome this constraint, ANNs offer an easier solution to solve non-linear problems such as rainfall-runoff modeling (Maier and Dandy, 2000; Aqil et al, 2007; Mutlu et al. 2008). In modelling rainfall-runoff relationship in arid and semiarid regions in which the rainfall and runoff are very irregular, Riad and Mania (2004) found that ANN approach gives much better prediction than the traditional multiple linear regression (MLR) method. Artificial Neural Networks (ANNs) fall in the category of systems theoretic models (Dawson and Wilby, 2001; Srinivasulu and Jain, 2006; Mutlu et al., 2008; Akhtar et al., 2009) and they have been widely used in the field of hydrology and water resources modelling. Maier and Dandy (2000) have undertaken a comprehensive review of the use of ANNs in hydrology and water resources management while Dawson and Wilby (2001) specifically reviewed the use of 2 ANNs in rainfall–runoff modelling and flood forecasting. They indicate that ANNs are well suited to the challenging tasks of rainfall–runoff and flood forecasting. This is further ascertained by Abrahart and See (2007) who demonstrated that neural networks are capable of modelling nonlinear hydrological processes and are therefore appropriate tools for hydrological modelling. In addition, comparison of the performance of ANN against traditional multi linear regression models for modelling non-linear hydrology process such as rain-fall and runoff has found that ANNs perform well (Dawson and Wilby (2001); Abrahart and See, 2007) ANNs have therefore been chosen over the use of conceptual models as in these headwater catchments, as is the case in most such catchments in Africa, knowledge of the internal hydrologic processes is not well known. ANNs therefore offer an advantage as they are capable of modelling complex nonlinear relationships between input and output data sets (Dawson and Wilby 2001). This paper is an attempt to contribute to the practical use of ANNs especially for the semi-arid environment with limited data. Continued research and development in this field will provide a stronger understanding and appreciation of the hydrological modelling opportunities that are on offer such that good scholarship and greater awareness might encourage the wider acceptance of neural solutions (Abrahart and See, 2007). Artificial Neural Networks An Artificial Neural networks (ANN) is a highly interconnected network of many simple processing units called neurons, which are analogous to the biological neurons in the human brain (Srinivasulu and Jain, 2006). They are invaluable for applications where formal analysis would be difficult or impossible, such as pattern recognition and nonlinear system identification and control (Demuth et al., 2008). ANNs have been used to solve Classification, Function Approximation, Prediction and Clustering problems (NeuroDimension, Inc., 2007). An ANN normally consists of two layers, a hidden layer and an output layer. The structure of a two layer feed-forward ANN is shown in figure 1. Source: (Demuth et al., 2008). Figure 1: A two-layer feed-forward neural network 3 More detailed description on ANNs may be found in Maier and Dandy, 2000; Dawson and Wilby, 2001 and Demuth et al., 2008). 2. Methodology Based on the problem, a network that is able to predict (and forecast) the runoff based on climate and derived information is needed. Such a problem falls under the category known in the literature by various terms namely fitting problem, function approximation, regression learning, nonlinear regression (Demuth et al., 2008). In this study, the main objective was to develop an ANN and train it to learn the relationships between climatic variables (precipitation, evaporation and temperature) as model predictor variables on one hand and river flow as the dependent variable on the other hand. After learning the relationships, the model would be used to simulate the river flow patterns. The study used the feedforward neural network using the backpropagation algorithm for training as these have been found to be the most suitable and are widely used for the prediction and forecasting of water resources variables (Sivakumar et al., 2002; Maier and Dandy, 2001; Maier and Dandy, 2000). The steps of using ANNs for rainfall-runoff modelling as described by Maier and Dandy (2001; 2000) and adopted in this study were followed. These are discussed below: i. Division and pre-processing of the available data The available data was spilt into three sub-sets; a training set, a validation set and a testing set which is implemented as a means of improving generalization under early stopping. In order to ensure that all variables receive equal attention during the training process, they were standardised. In addition, the variables were scaled to between 0 and 1 in such a way as to be commensurate with the limits of the activation functions used in the output layer. Outputs of the logistic transfer function are between 0 and 1. This is usually taken as a precautionary measure to avoid too slow or too fast convergence of the performance function and thus this aids realization of optimum solution. ii. Determination of appropriate model inputs The input variables based on a priori knowledge of available causal variables were rainfall, temperature and evaporation. Analysis of the influence of the input variables (independent variables) on the output (dependant variable) was undertaken by determination of the individual correlations between each input variables and the target variable. This provided the means of confirming which combination of input variables had significant influence. This avoided the need to undertake trial and error approach of training separate networks for each input variable which is computationally intensive. The input variable with the highest correlation are retained for use in the model. Matignon (2005) indicates that the draw back to this approach is that it ignores the partial correlation 4 between the other inputs in the model that might result in erroneous inputs added or removed from the model. To over come this, Matignon (2005) suggests dropping the non-significant inputs are dropped one at a time. This was done by dropping each input which has its p-value above the 0.05 alpha- level which means only the input variables that have significant influence on the output variable are retained. This avoids the need to undertake trail and error approach of training separate networks for each input variable which is computationally intensive. iii. Network architecture The number of nodes in the input layer is equal to the number of model inputs, whereas the number of nodes in the output layer are equal the number of model outputs. A two-layer feedforward network with sigmoid transfer function in the hidden neurons and linear transfer function output neurons has been used. One hidden layer is used as it has been found to be adequate in applications of forecasting and predicting water resources variables. The number of neurons in the hidden layer of the neural network were changed from time to time to investigate the best results. The network was trained with Levenberg-Marquardt backpropagation algorithm. The LM algorithm has been found to be very efficient for training small to medium-size networks (Aqil et al. 2007). The architecture used was a multi-layer feed-forward neural network (FNN as shown in Figure 1. The hidden layer nodes allow the network to detect and capture the relevant pattern(s) in the data, and to perform complex nonlinear mapping between the input and the output variables. The sole role of the input layer of nodes is to relay the external inputs to the neurons of the hidden layer. The outputs of the hidden layer are passed to the last (or output) layer which provides the final output of the network. The results from various combinations of models was analyzed for the best performing model in order to recommend an appropriate ANN model that best represents the rainfall-run off process. The 72 input samples are randomly selected, 50 training, 11 validation and 11 testing. This approach ensures that the independence assumption of the inputs is retained. iv. Optimisation of the connection weights (training) The input from each node in the previous layer (xi) are multiplied by a connection weight (wji). These connection weights are adjustable and may be likened to the coefficients in statistical models. The process of optimising the connection weights is known as ‘training’ or ‘learning’. This is equivalent to the parameter estimation phase in conventional statistical models. The connection weights (w) will be adjusted using the gradient descent rule of optimization. E wt wt 1 , (1) w s 1 5 where s is the training sample presented to the network, η is the learning rate, and μ is the momentum value. The number of training samples presented to the network between weight updates is called the epoch size (ε). At each node, the weighted input signals are summed and a threshold value (θj) is added. This combined input (Ij) is then passed through a nonlinear transfer function (f(.)) to produce the output of the node (yj). The output of one node provides the input to the nodes in the next layer (Minasny et al., 2004; Moalafhi, 2004; Maier and Dandy, 2001). This process is summarised in equations (2) and (3) and illustrated in Figure 2. I j w ji xi j , summation (2) y i f I j , transfer (3) f(I j) X0 X1 Ij w j0 w j1 Yi X2 Sum w j2 X3 Transfer Output path w j3 Source: (Moalafhi, 2004; Maier and Dandy, 2001) Figure 2: Operation of hidden layer neuron of a feedforward Artificial Neural Network The transformation of input into a hidden layer node to an output was done using the logistic function which is of the sigmoidal type transfer function. Abrahart and See (2007), mention that “models incorporating sigmoid transfer functions can support improved generalizations and superior learning characteristics”. The backpropagation algorithm which is based on the method of steepest descent was used for optimizing the feedforward ANN. It is extensively used in feedforward networks as such networks with biases, a sigmoid layer, and a linear output layer are capable of approximating any function with a finite number of discontinuities (Demuth et al., 2008). The error function, E, most commonly used is the Mean Squared Error (MSE) function. The preference of using the MSE as the performance function is mostly because it is calculated easily, it penalises large errors and its partial derivative with respect to the weights can be calculated easily. As each input is applied to the network, the network output (modelled value) 6 is compared to the target (observed value). The error is calculated as the difference between the target output and the network output. The goal is to minimize the average of the sum of these errors. It is given as: 1 E MSE N 2 N obs i 1 i cali (11) Where obsi is the observed data and cali is the calculated output (modelled) predicted by network. As MSE grows, the accuracy of model reduces. v. Model validation and testing. Data sets not used in training were used for model validation and testing. These are the 20% of the data that are each kept for validation and testing. The validation data set are used to measure network generalization, and to halt training when generalization stops improving while the testing data set has no effect on training and so provides an independent measure of network performance during and after training. These are checked to see how well the model performs. MATLAB 7.6.0.324 (R2008a) (Maths Works Inc., 2008) was used to develop, train and simulate the ANN. 3. The Study area The Mulungushi river sub-basin is located between longitude 28.17º to 28.83º East and latitude 14º to 14.5º South on the western part of the Lunsemfwa river basin. It covers an area of 1448 km2 measured upstream of the hydrological station on the Mulungushi river at Great North Road Bridge (Figure 3). 7 28°9'59" 28°19'58" 28°29'57" 28°39'56" 28°49'55" 28°59'54" N W E Kapiri Moshi # 14°10'05" 14°10'05" Great North Road 14°00'06" 14°00'06" S 14°20'04" Mulungushi River 28°9'59" 0 28°19'58" 30 28°29'57" 28°39'56" 28°49'55" 60 Kilometers 28°59'54" 14°30'03" 14°30'03" Kabwe # 14°20'04" M ulu ng us hi Figure 3: Mulungushi river sub-basin The town of Kabwe (Provincial Capital of Central Province), the Mulungushi University, various subsistence and commercial farmers, local communities and Lunsemfwa hydropower Company (down stream) depend on the Mulungushi river to meet their water needs. Water demand is high during the dry season when flows are low. Conflicts currently occur between municipal water, agriculture and hydropower water use. Modelling flows would provide a better understanding on the flow characteristics. This information can be used by water managers to improve on water apportionment and monitoring water availability thus mitigate conflicts among various water users. Hydro-meteorological data for the period October 1971 to September 1977 was used (72 months). This period was used as it has data with no gaps. Rainfall data was obtained from the Zambia Meteological Department station at Kabwe and river discharge data of station Mulungushi River at Great North Road from the Zambia Department of Water Affairs. 8 4. Results and Discussion Determination of appropriate model inputs Table 1 shows the correlations between each of the variables and Table 2 the p-values Table 1: Lower Correlation coefficients matrix between the variables Discharge Rainfall Temperature Evaporation Discharge 1.0000 0.4894 0.1245 -0.3726 Rainfall Temperature Evaporation 1.0000 0.4135 -0.2965 1.0000 0.2975 1.0000 As expected the influence of rainfall on the discharge is high which is reflected by the higher correlation coefficient between them. This means that the higher the rainfall, the higher the discharge. The significance of these inputs is then tested to select the inputs that significantly influence the discharge. The hypothesis of no correlation is tested by calculating the p-values. If p is small, say less than 0.05, then the correlation r is significant. Table 2: p-values between variables Discharge Rainfall Temperature Evaporation Discharge 1.0000 0.0000 0.2973 0.0013 Rainfall Temperature Evaporation 1.0000 0.0003 0.0115 1.0000 0.2975 1.0000 From table 2 the significant correlations are between Discharge-Rainfall, Discharge – Evaporation, Rainfall-Temperature, Rainfall-Evaporation. Since Temperature is not significantly correlated to the discharge it does not contribute to explaining the dependant variable and may thus be dropped from the input variables. The selected inputs to the model are thus rainfall and evaporation. ANN model results Table 3 shows the results of the ANN model using various inputs and with increasing the number of neurons in its hidden layer. Model number 3 with 30 neurons in its hidden layer with input rainfall & evaporation gave the best model results with MSE of 1.55 and Coefficient of determination (r2) of 0.831. Table 3: Model results Model no Neurons in its hidden Input rainfall only Input evaporation only 9 Input rainfall & evaporation layer 10 20 30 1 2 3 r2 0.767 0.748 0.804 MSE 9.82 4.44 2.07 MSE 14.7 14 14.4 r2 0.625 0.622 0.621 MSE 3.20 2.88 1.55 r2 0.778 0.791 0.831 The plot of the modelled discharge resulting from application of rainfall and evaporation as only inputs and with rainfall and evaporation is shown in Figure 4. It is seen that the models with just one input of rainfall or evaporation do not simulate the discharge as well as the model with the two inputs of rainfall and evaporation. This reinforces the importance of evaporation in influencing the rainfall-runoff process. Q-obs Q-input rain Q-Input evap Q-Input rain and evap Plot of observed and modelled discharge - Mulungushi River at Great North road 30 Discharge (m3/s) 25 20 15 10 5 0 -5 0 10 20 30 40 50 60 70 -10 Time (months) Figure 4: Plot of observed and modelled discharge for various inputs- Mulungushi River at Great North road 10 80 Figure 5: Performance and regression plots for model 3 (input monthly rainfall and evaporation) 11 Plot of observed and modelled discharge - Mulungushi River at Great North road Q-obs Q-Input rain and evap 30 Discharge (m3/s) 25 20 15 10 5 0 0 10 20 30 40 50 60 70 80 Time (months) Figure 6: Plot of observed and modelled discharge for model 3- Mulungushi River at Great North road Evaluating the performance of model Apart from the Coefficient of determination and the MSE additional tests were done to check on the performance of the model. An assumption was made that if the modelled (output) values fall within 95% confidence internal then the model can be considered to have performed satisfactory. To check this, two plots were evaluated. Figure 1 shows the robust regression line and Figure 2: Plot of residuals both with 95% confidence internal. 25 Output (modelled discharge) m3/s 20 15 10 Data Robust Regression fit 95% Confidence Interval 5 0 0 5 10 15 Target (Observed discharge) m3/s 12 20 Figure 7: Robust regression Figure 8 shows the plot of the data residuals which are visually examined to gain an insight into the "goodness" of the model fit. The residuals are calculated as the difference between observed and modelled discharge. For a good fit the residuals should be close to zero and have a random scatter. If the residual plot has a pattern which does not appear to have a random scatter, this indicates that the model does not properly fit the data (Demuth et al., 2008). This then means there is need for a different model structure or input variable. Residual Case Order Plot 15 Residuals (modelled-Taget) 10 5 0 -5 -10 -15 10 20 30 40 50 60 70 Time period Figure 8: Plot of residuals with 95% confidence internal. The residuals are seen to be random and mostly around zeros as well as within the 95% confidence internal around zero. The model can thus be concluded to be performing satisfactorily. There are however, three points that have residuals that do not contain zero (points in red). This indicates that the residual is larger than expected in 95% of new observations. This may suggest the data points may be outliers (Montgomery et al., 2001). 13 5. Conclusion In this study the absence of continues long term river discharge, water level data and meteorological data of precipitation, temperature, evaporation etc was a constraint. In view of this constrain the inputs available for use were limited to the monthly precipitation and evaporation. This inevitably affected the length of available data as well as the types of inputs to use which would have improved the accuracy of the modelling. Even with short term/ insufficient data the ANN approach has been found suitable to be used to model the rainfall- runoff. It is better to have some information than none at all to make the necessary water resources management decisions. Improving the ANN model should be a continuous process of updating it as new data in the basin becomes available. This will ultimately improve its forecasting capabilities. There is need to reemphasize the importance of hydro-meteorological data management if water resources are to be effectively managed. Acknowledgements Thanks to the Zambia Meteorological Department and Water Affairs Department for providing the rainfall and the river discharge data respectively; Gratitude to UNESCO who are supporting the SIMDAS project through which this study has been conducted as input into the ongoing research. References Abrahart, R.J, and See, L.M. 2007. Neural network modeling of non-linerar hydrological relationships. Hydrology and Earth System Sciences, 11, 1563-1579 Akhtar, M. K., Corzo, G. A., van Andel, S. J., and Jonoski, A., 2009 River flow forecasting with Artificial Neural Networks using satellite observed precipitation pre-processed with flow length and travel time information: case study of the Ganges river basin, Hydrol. Earth Syst. Sci. Discussion., 6, 3385-3416. Aqil, M, Kita I., Yano A., Nishiyama, S., 2007. Neural Networks for Real Time Catchment Flow Modeling and Prediction. Dawson C.W. and Wilby R.L., 2001. Hydrological modelling using artificial neural networks. Progress in Physical Geography 25,1 (2001) pp. 80–108 Demuth, H., Beale, M, and Hagan, M. 2008. MATHLAB Network Neural Tool box 6 Users guide for Version 7.6.0.324 (R2008a). Maths works Inc. Maths Works Inc., 2008, MATHLAB Version 7.6.0.324 (R2008a) Maier and Dandy, 2001, Neural Network Based Modelling of Environmental Variables: A Systematic Approach. Mathematical and Computer Modelling 33 (2661) 669-682 Minasny, B., Hopmans, J.W., Harter, T., Eching, S. O., Tuli, A. and Denton M. A., 2004. Neural Networks Prediction of Soil Hydraulic Functions for Alluvial Soils Using Multistep Outflow Data. Soil Science Society of America Journal 68, pp 417–429 Moalafhi, D.B, 2004. Effect of Urbanisation on the Flow Regime of Notwane Catchment, Botswana. MPhill Thesis submitted to the University Of Botswana. PP 103 Montgomery D.C, Runger G.C., and Hubele N. F., 2001. Engineering Statistics. 2 nd Edition. John Wiley and Sons 14 Mutlu, I. Chaubey, H. Hexmoor, S. G. Bajwa, 2008. Comparison of artificial neural network models for hydrologic predictions at multiple gauging stations in an agricultural watershed Hydrological Processes Volume 22, Issue 26, Pages: 5097-5106. NeuroDimension, Inc., 2007. NeuroSolutions Program version 5.05 Help Riad S. And Mania J., 2004. Rainfall-Runoff Model Using an Artificial Neural Network Approach. Mathematical and Computer Modelling 40 (2004) 839-846 Sivakumar, B., Jayawardena, A.W., Fernando, T.M.K.G., 2002. River flow forecasting: use of phase-space reconstruction and artificial neural networks approaches. Journal of Hydrology 265, pp 225–245 Srinivasulu, S. and Jain, A. 2006. A comparative analysis of training methods for artificial neural network rainfall–runoff models. Applied Soft Computing 6 (2006) 295–306 World Water Assessment Programme. 2009. The United Nations World Water Development Report 3: Water in a Changing World. Paris: UNESCO, and London: Earthscan. 15