PROBABILISTIC FLOOD FORECASTING USING A DISTRIBUTED RAINFALL-RUNOFF MODEL

PAUL JAMES SMITH

2005

ACKNOWLEDGEMENTS

This thesis has been made possible through the help of a number of people whom I would like to thank.

It is difficult to overstate my gratitude to my research supervisor, Professor Toshiharu Kojiri, who has continually provided encouragement, sound advice, and good company, and who, throughout my thesis-writing period, seemed to be constantly thinking several steps ahead of me. Without his assistance and openness in inviting me to Japan and in providing a comfortable research environment, my work with Kyoto University would not have been possible.

I am grateful to the members of the Water Resources Research Center, Disaster Prevention Research Institute, Kyoto University, for their support with my research. Special thanks to Professor Shuichi Ikebuchi and Professor Kunio Tomosugi, who provided encouragement throughout the research period, and to Dr. Toshio Hamaguchi, who thoroughly read the first draft of this thesis and offered many valuable suggestions. I am thankful for the generous time and invaluable guidance regarding nowcasting of rainfall patterns provided by Professor Eiichi Nakakita. Thanks are extended to Mr. Katsuyoshi Sekii, who made an excellent research partner and was of great assistance in helping prepare this thesis.

Sincere thanks are extended to Tomoya Kawaguchi and staff of the Environmental Engineering Department, Engineering Division, Nihon Suido Consultants Co. Ltd., and to Yoji Sakamoto of Mitsui Consultants Co. Ltd., for their assistance regarding rainfall-runoff modeling, and to Yoshiyuki Zushi of the Foundation of River and Basin Integrated Communications, Japan, and to the staff of the Kiso River Upstream River Construction Management Division, Chubu Regional Bureau, Ministry of Land, Infrastructure and Transport, Japan, for providing a range of useful hydrological data.

I am indebted to Professor Graeme Dandy, Mr.
Trevor Daniell, and Dr. David Walker, of the University of Adelaide’s Department of Civil and Environmental Engineering, for their help in suggesting Professor Kojiri and Kyoto University for my graduate studies, and for the four years of excellent preparation they gave me during my time as an undergraduate student in their care.

I would also like to thank my fellow students at the Water Resources Research Center, DPRI, Kyoto University, for the invaluable conversations we’ve shared regarding a wide range of research and non-research related topics, for their support over the years in helping me to make the most of my time in Japan, and for their friendship.

Finally, I would like to deeply thank my parents for all their love, encouragement and understanding during my long stay abroad.

CONTENTS

ACKNOWLEDGEMENTS
CONTENTS
LIST OF FIGURES
LIST OF TABLES
1. INTRODUCTION
2. DISTRIBUTED RAINFALL-RUNOFF MODELING
  2.1. Rainfall-runoff modeling strategies
  2.2. Hydro-BEAM
  2.3. Hydro-BEAM structure
    2.3.1. Mesh cell model
    2.3.2. Watershed model
  2.4. Hydro-BEAM flow routing theory and equations
    2.4.1. Finite difference approximation of the kinematic wave equation
    2.4.2. Linear approximation method
    2.4.3. Surface flow in mountain regions
    2.4.4. Surface flow in city and water regions
    2.4.5. Channel flow
  2.5. Target watershed
    2.5.1. Model calibration
  2.6. Conclusions
3. PROBABILISTIC DISTRIBUTED RAINFALL FORECASTING
  3.1. Study location and rainfall data
  3.2. Modeling translation and rotation of rainfall fields
    3.2.1. Identification of translation vector parameters from rainfall patterns
    3.2.2. Extrapolation of rainfall patterns
  3.3. Time series analysis of observed translation vector parameters
    3.3.1. ARMA time series analysis
    3.3.2. ARMA model identification
    3.3.3. Generation of vector parameters
  3.4. Time series analysis of growth-decay of rainfall fields
  3.5. Statistical analysis of growth-decay of rainfall fields
    3.5.1. Statistical analysis
    3.5.2. Estimating spatial correlation
    3.5.3. Conditional generation of spatially-correlated noise
  3.6. Radar observation error
  3.7. Monte Carlo simulation
  3.8. Application
  3.9. Conclusion
4. ADAPTIVE UPDATING OF A DISTRIBUTED RAINFALL-RUNOFF MODEL
  4.1. Overview of methodology
  4.2. Adaptive updating algorithm
  4.3. Prediction and error estimation
  4.4. Distributed updating for an entire watershed
    4.4.1. Definitions
    4.4.2. Gain calculation for non-observation point mesh cells
    4.4.3. Updating factor calculation – inverse distance weighting interpolation
    4.4.4. Updating factor calculation – linear variation method
  4.5. Partial distributed updating
  4.6. Application
  4.7. Conclusion
5. AI-BASED ERROR CORRECTION FOR RAINFALL-RUNOFF MODELING
  5.1. Procedure of AI-based error correction approach
  5.2. Genetic programming for error correction
    5.2.1. Genetic programming calculation procedure
  5.3. Feedforward artificial neural network for error correction
  5.4. Self-organizing map for data clustering
  5.5. Application
    5.5.1. Problem formulation
    5.5.2. Training, cross-check and verification data
    5.5.3. Results
  5.6. Conclusion
6. INTERPOLATION OF RUNOFF PREDICTIONS FOR DISTRIBUTED FLOOD FORECASTING
  6.1. Proposed interpolation strategy
  6.2. Local linear modeling
    6.2.1. Introduction
    6.2.2. Nearest neighbors search
  6.3. Global regression
  6.4. Choice of query vector form
    6.4.1. Temporal correlation between elements
    6.4.2. Additional elements
  6.5. Application
  6.6. Results and discussion
  6.7. Conclusion
7. PROBABILISTIC FLOOD FORECASTING
  7.1. Modeling uncertainty in flood forecasts
  7.2. Probabilistic flood forecast formulation
    7.2.1. Precipitation uncertainty
    7.2.2. Hydrologic uncertainty
    7.2.3. Combining precipitation uncertainty and hydrologic uncertainty
  7.3. Application
  7.4. Conclusion
8. EVACUATION DECISION
  8.1. Decision model
    8.1.1. Estimating potential costs
    8.1.2. Estimating inundation probability and severity
    8.1.3. Evacuation decision formulation and timing of the evacuation
    8.1.4. Objective function formulation
    8.1.5. Risk aversion
  8.2. Demonstration of the evacuation decision framework
  8.3. Evacuation path planning using probabilistic information
  8.4. Conclusion
9. CONCLUSION
REFERENCES

LIST OF FIGURES

Figure 2-1: Mesh cell model
Figure 2-2: Arrangement of land use categories on a mesh cell surface
Figure 2-3: Discharge routing within a mesh cell
Figure 2-4: Flow routing map structure
Figure 2-5: 8-direction flow routing scheme
Figure 2-6: Finite difference mesh used in Beven kinematic wave routing model
Figure 2-7: Combination of surface and layer A flows
Figure 2-8: Rectangular and trapezoidal channel cross-sections
Figure 2-9: Nagara River basin
Figure 2-10: (a) Coverage of Gozaisho and Jyatoge weather radar facilities, (b) topographical map of the Nagara River watershed
Figure 2-11: (a) Flow routing map for the Nagara River watershed, (b) map of dominant land use within each mesh cell
Figure 2-12: Percentage cover within each mesh cell of (a) forest, (b) field, (c) urban area, (d) paddy field, (e) water body
Figure 2-13: Example simulation results for (a) Chusetsu, and (b) Akutami
Figure 3-1: Observed rainfall intensity pattern described as a 3-dimensional surface
Figure 3-2: Vector fields for (a) Case 1, and (b) Case 2 (11/9/2000 20:55 - 21:00)
Figure 3-3: Time series of translation vector parameters c1 ~ c6, 10/9/2001
Figure 3-4: Sample autocorrelation functions of c1 ~ c6, 10/9/2001
Figure 3-5: Time series of c1 ~ c6 , 10/9/2001
Figure 3-6: Sample autocorrelation functions of c1 ~ c6 , 10/9/2001
Figure 3-7: Selected sample cross-correlation functions of c1 ~ c6 , 10/9/2001
Figure 3-8: (a) Optimal calculated rainfall pattern (mm/hr) for 11 September 2000 21:00, (b) corresponding calculated residual growth-decay field (mm/hr/5-min)
Figure 3-9: Distributions of observed growth-decay rate frequencies for various values of RA
Figure 3-10: Variation of (a) scale and (b) shape parameters with RA for lognormal distribution
Figure 3-11: Variation of (a) scale and (b) shape parameters with RA for Weibull distribution
Figure 3-12: Series of I(h) for residual growth-decay field (11/9/2000 21:00)
Figure 3-13: Time series of I(1) and I(2) for residual growth-decay fields for 11/9/2000 15:00 ~ 21:00 (No radar observation available for 18:55)
Figure 3-14: Three-stage process for stochastic rainfall generation
Figure 3-15: Observed rainfall field at 11/9/2000 21:00 (t = 0)
Figure 3-16: Simulated time series of parameters (a) c3, and (b) c6
Figure 3-17: Simulated time series of parameters (a) c7, (b) c8, and (c) c9
Figure 3-18: Simulated rainfall fields, 11/9/2000, (a) 21:05 (t = 1), (b) 21:30 (t = 6)
Figure 3-19: Simulated rainfall fields, 11/9/2000, (a) 22:00 (t = 12), (b) 03:00 (t = 72)
Figure 3-20: Simulated results, 21:05: (a) white noise field, (b) growth-decay field
Figure 3-21: Simulated rainfall fields, 11/9/2000, (a) 21:05 (t = 1), (b) 21:30 (t = 6)
Figure 3-22: Simulated rainfall fields, 11/9/2000, (a) 22:00 (t = 12), (b) 03:00 (t = 72)
Figure 3-23: Series of I(h) for simulated growth-decay field (11/9/2000 21:05)
Figure 4-1: (a) Channel updating model (left), and (b) surface and layer A runoff updating model (right)
Figure 4-2: Recursive filtering algorithm for estimation of adaptive gain parameter and updating of a distributed rainfall-runoff model’s discharge
Figure 4-3: Basin mesh cell categories applied to the Nagara River watershed
Figure 4-4: Influence of observation points on each mesh cell of the Nagara River watershed: (a) Inari, 1 , (b) Shimohorado, 2 , (c) Mino, 3 , (d) Chusetsu, 4
Figure 4-5: Updating results for Akutami (Event 1)
Figure 4-6: Prediction results for Chusetsu (Event 2)
Figure 4-7: Prediction results for Chusetsu (Event 3)
Figure 5-1: Schematic of proposed AI-based discharge forecasting approach for river basin locations with real-time discharge observation data
Figure 5-2: Parse tree representation
Figure 5-3: Crossover
Figure 5-4: Mutation
Figure 5-5: GP procedure flowchart
Figure 5-6: Basic structure of a self-organizing map
Figure 5-7: Runoff predictions for Chusetsu (16-17/7/2001)
Figure 5-8: Runoff predictions for Chusetsu (14-16/9/2001)
Figure 5-9: 3-hour ahead runoff predictions for Chusetsu (14-16/9/2001)
Figure 6-1: Nagara River flow routing map and discharge observation stations
Figure 6-2: Observed discharge, Event 1: 23-28/4/2003
Figure 6-3: Observed discharge, Event 2: 11-13/7/2003
Figure 6-4: Interpolation for Mino, (23-28/4/2003)
Figure 6-5: Interpolation for Mino, (11-13/7/2003)
Figure 6-6: Extrapolation for Shimohorado, (23-28/4/2003)
Figure 6-7: Extrapolation for Shimohorado, (11-13/7/2003)
Figure 7-1: Ensemble forecast for Chusetsu made at 21:00 11 September 2000
Figure 7-2: Probabilistic forecast of discharge considering precipitation uncertainty, 21:00 11 September 2000, Chusetsu
Figure 8-1: (a) PDF of inundation levels, and (b) severity curve
Figure 8-2: (a) PDF of discharge rates for a location under analysis for a given future point in time, (b) Severity curve for location under analysis
Figure 8-3: Evacuation progress index
Figure 8-4: Multi-stage decision model
Figure 8-5: Utility function
Figure 8-6: Mino flood hazard map
Figure 8-7: Conceptual flood risk maps for real-time evacuation path planning

LIST OF TABLES

Table 2-1: Rainfall-runoff modeling strategies and application to flood prediction
Table 2-2: Land use groupings for Hydro-BEAM
Table 2-3: Land use regions and sub-cell structure
Table 2-4: Infiltration and roughness coefficients for each land use type and channels
Table 3-1: Parameter combinations and corresponding modeled phenomena
Table 3-2: Parameterization of the lognormal and Weibull distribution
Table 3-3: Cumulative probabilities for growth-decay rates for values of 1 mm/hr ≤ RA < 5 mm/hr
Table 3-4: Cumulative probabilities for growth-decay rates for values of RA = 0 mm/hr
Table 3-5: Distribution parameters as functions of RA, for RA ≥ 5 mm/hr
Table 4-1: Watershed mesh cell categorization
Table 4-2: Mesh cell types for partial distributed updating
Table 4-3: Updating parameters
Table 4-4: Storm events used in application
Table 5-1: Prediction error comparison
Table 6-1: Global regression results for Mino and Shimohorado
Table 8-1: Severity curve parameters
Table 8-2: Evacuation curve parameters
Table 8-3: Probabilistic flood forecast data

1. INTRODUCTION

It has long been the goal of flood forecasting to provide timely and accurate estimates of future discharge conditions at specific watershed locations. The objective of this research is to develop a flood forecasting system that not only provides accurate flood level forecasts, but is also capable of providing probabilistic forecasts at all locations within a river network.

In order to achieve a shift away from the traditional flood prediction framework, which focuses primarily on using point rainfall observations and lumped parameter or statistical models to make deterministic best-guess predictions of runoff rates for only a handful of locations within a river basin, a distributed rainfall-runoff model is chosen to simulate rainfall-runoff dynamics. Distributed rainfall-runoff models have been used in recent years for a range of different water quantity and quality simulations; however, little attention has been given to the task of short-term flood forecasting. The distributed nature of such models provides the potential for simulations of superior accuracy to purely data-driven or lumped parameter forecasts, and allows flood forecasts to be made at all locations within a watershed.

As a distributed rainfall-runoff model is being used for real-time flood simulation and forecasting, a compatible technique for assimilating real-time discharge observations is required. An adaptive updating procedure and an alternative artificial intelligence-based error correction model are developed and shown to be effective in improving the performance of the distributed rainfall-runoff model.
While much attention has been devoted to increasing the accuracy of flood forecasts, there is also a largely unfilled need to provide a measure of the confidence that can be placed in a given forecast. No forecast of hydrological conditions can be perfect, and it is often the case that too much faith is placed in a ‘best’ prediction of future conditions, which can potentially lead to non-optimal decisions being made during the period leading up to a flood.

It is recognized that the inability to accurately predict short-term rainfall conditions is a major source of error in discharge predictions. For this reason, a Monte Carlo simulation approach is used to generate a range of possible future rainfall conditions based on recent observations of rainfall dynamics in the considered region. These patterns are input into the distributed rainfall-runoff model to generate simulated forecast hydrographs and to allow the future discharge of a river network to be described in a probabilistic sense.

The proposed framework comprises the following system components:
A distributed rainfall-runoff model capable of describing a watershed in terms of the distributed geographical properties of the watershed, and capable of converting rainfall patterns into discharge at each location within the watershed (Chapter 2).
A rainfall simulation model capable of analyzing radar-observed rainfall patterns and stochastically generating future rainfall patterns (Chapter 3).
An adaptive updating scheme for a distributed rainfall-runoff model capable of utilizing real-time river discharge observations to reduce forecast error (Chapter 4).
An AI-based error forecasting model for use with a rainfall-runoff model and an interpolation scheme for making predictions of distributed runoff conditions (Chapters 5 and 6).
A Monte Carlo simulation strategy for combining the rainfall simulation model and the distributed rainfall-runoff model to provide a probabilistic forecast of future watershed discharge conditions (Chapter 7).
A strategy for producing a probabilistic flood forecast considering the combined effects of all input and model uncertainties (Chapter 7).
A decision support tool for making optimal evacuation decisions for residents within the target watershed (Chapter 8).

Example applications are provided for the Nagara River, located in Gifu Prefecture, Japan. A 6-hour-ahead forecast is desired so as to provide sufficient time for the issuing of evacuation orders and appropriate operation of flood mitigation structures and machinery.

2. DISTRIBUTED RAINFALL-RUNOFF MODELING

A rainfall-runoff model is a fundamental component of a short-term flood forecasting system. In this chapter a number of rainfall-runoff modeling strategies are discussed, followed by an introduction of the distributed rainfall-runoff model Hydro-BEAM, which has been chosen for use within the probabilistic distributed flood forecasting system developed in this thesis.

2.1. Rainfall-runoff modeling strategies

Rainfall-runoff modeling is the process of transforming a rainfall hyetograph into a runoff hydrograph. This can be achieved through the use of data-driven or statistical mathematical techniques, through developing physical descriptions of the rainfall-runoff process, or through various combinations of these approaches.

Data-driven statistical techniques have received much attention in recent years for their ability to infer relationships between observed hydrological time series and future watershed conditions. The autoregressive moving average (ARMA) and related time series models of Box and Jenkins (1976) have been used in modeling various water resources systems in the past. It was noted by Hsu et al.
(1995) that linear time series models such as these do not attempt to represent the nonlinear dynamics inherent in the transformation of rainfall to runoff and that, as a result, they could not always be relied on to perform well. A number of alternatives to these time series models that are capable of modeling nonlinear processes have been explored in recent years, including artificial neural networks (ANN), genetic programming (GP), and support vector machines (SVM).

A number of studies into the application of ANN to rainfall-runoff modeling and flood forecasting have been carried out (Karunanithi et al., 1994; Lorrai and Sechi, 1995; Campolo et al., 1999). Hsu et al. (1995) compared ANN models with traditional black box models, concluding that an ANN model is capable of giving superior performance over a linear ARMAX (autoregressive moving average with exogenous inputs) time series approach when observed time series of flow rate and rainfall are used as input. In general, ANN have been found to perform well in predicting short-term flood stage for flood events that closely resemble, in magnitude, the flood events used for training the networks. ANN models, however, tend to perform poorly during extreme events, and for this reason Elshorbagy and Simonovic (2000) warn against using ANN models as the sole runoff prediction strategy. Also, it is difficult to determine the optimal ANN architecture for a given watershed, and in most cases a trial-and-error approach is still used.

GP is an evolutionary algorithm technique based on Darwin’s theory of natural selection. Unlike the associated Genetic Algorithm techniques that have found widespread use in parameter optimization of water resources systems (Cai et al., 2001; Cheng et al., 2002), GP is used to find a function that best fits a given data set, by searching a domain of all possible solutions. A study was conducted by Liong et al.
(2002) applying GP to the problem of determining the relationship between future runoff and recently observed rainfall and runoff data at the outlet of the Upper Bukit Timah watershed in Singapore. It was concluded that the functional relationships determined using GP could be used to give a reasonable short-term forecast superior to the naïve persistence forecasting technique.

SVM, from the field of statistical learning theory and based on the principle of structural risk minimization, have been shown to have excellent time series prediction capabilities (Vapnik, 1995). SVM have been demonstrated to be a robust flood stage forecasting tool (Liong and Sivapragasam, 2002) and, unlike ANN, have the advantage that the model architecture need not be defined a priori; however, the problem of optimal parameter identification requires further research.

In contrast to the data-driven techniques described above, conceptual and physical models require a considerable amount of effort to develop and use, requiring a set of equations that can adequately describe the hydrological processes being modeled, as well as calibration of the large number of parameters involved. Such models, however, have the benefit that they attempt to incorporate an understanding of the internal subprocesses of the rainfall-runoff process and are therefore less likely than the data-driven techniques to give wild or unrealistic predictions, especially when modeling rare or extreme flood events.

One of the early pioneering lumped methods for forecasting runoff based on an observed rainfall hyetograph was the unit hydrograph method (Sherman, 1932), which is based on the assumption that a watershed is a linear system, and that rainfall intensity is uniform over a watershed. Lumped models, however, do not account explicitly for the spatial variability of hydrologic processes, using averages to represent spatially distributed properties (Ramírez, 2000).
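The linearity assumption behind the unit hydrograph method described above can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not part of the forecasting system developed in this thesis: the function name `unit_hydrograph_runoff`, the hourly time step, and all numerical values are hypothetical. Direct runoff is computed as the discrete convolution of effective rainfall depths with the unit hydrograph ordinates.

```python
def unit_hydrograph_runoff(excess_rainfall, unit_hydrograph):
    """Discrete convolution of effective rainfall with unit hydrograph ordinates.

    excess_rainfall: effective rainfall depth in each time step (mm)
    unit_hydrograph: runoff response per unit depth of rainfall (m^3/s per mm)
    Returns the direct runoff hydrograph (m^3/s) for each time step.
    """
    n_steps = len(excess_rainfall) + len(unit_hydrograph) - 1
    runoff = [0.0] * n_steps
    for i, depth in enumerate(excess_rainfall):
        for j, ordinate in enumerate(unit_hydrograph):
            # Each rainfall pulse contributes a scaled, time-shifted copy
            # of the unit hydrograph (the linear-system assumption).
            runoff[i + j] += depth * ordinate
    return runoff


# Hypothetical illustration values (not data from this study).
uh = [0.0, 2.0, 5.0, 3.0, 1.0]   # m^3/s per mm, hourly ordinates
rain = [10.0, 20.0, 5.0]         # mm of effective rainfall per hour

hydrograph = unit_hydrograph_runoff(rain, uh)
print(hydrograph)  # [0.0, 20.0, 90.0, 140.0, 95.0, 35.0, 5.0]
```

Because the watershed is treated as a linear system, doubling every rainfall depth doubles every ordinate of the resulting hydrograph. The sketch also makes the limitation noted above visible: a single rainfall series is applied to the whole watershed, so spatial variability in rainfall cannot be represented.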
An alternative to the lumped parameter model is the distributed rainfall-runoff model, which attempts to describe all surface and subsurface flow phenomena. Many distributed rainfall-runoff models, including Hydro-BEAM which is introduced in the next section, employ variations of the kinematic wave model for flow routing (e.g. Ishihara and Takasao, 1963). The effort required to set up such a model is greater still, as a detailed description of the watershed and grid-based hydrological inputs such as rainfall are required.

Distributed rainfall-runoff models have long been unsuitable for application to real-time flood forecasting, due to the large number of calculations that must be performed to run a simulation for an entire watershed, and due to the unavailability of a filtering technique for improving a forecast based on real-time observation of flood stage. Computers have roughly doubled in speed every two years over the past four decades, broadly in line with Moore’s Law, which grew out of Gordon Moore’s 1965 observation that the number of components on an integrated circuit was doubling at regular intervals and which is popularly quoted as a doubling every 18 months (Moore, 1965). As a result, it is becoming increasingly possible to simulate flood events using computationally intensive distributed rainfall-runoff models quickly enough that the results may be used for real-time forecasting of flood events. Similarly, developments in remote sensing have produced an abundance of detailed and accurate geographical survey data relevant to distributed modeling of watersheds, such as digital elevation maps (DEM) and land use information, for many regions throughout the world. As access to satellite and radar observation technology improves, and with Moore’s Law likely to remain valid for a number of years to come, further increases in simulation detail and speed will be realized.
Also, considering that generic software tools can be readily developed to automatically fit distributed models to watersheds when provided with the necessary geographical data, there will likely be a move in the near future away from the simplistic black box or lumped forecasting models to potentially more accurate distributed rainfall-runoff models.

The use of the distributed rainfall-runoff model Hydro-BEAM for short-term flood stage forecasting is investigated in this research, due to its potential for highly accurate flood modeling and its ability to provide runoff simulation results for all areas of a watershed. A summary of the various rainfall-runoff modeling strategies currently in use is given in Table 2-1.

Table 2-1: Rainfall-runoff modeling strategies and application to flood prediction

Modeling approach: Black box / regression
  Examples: GP, ANN, SVM, ARMA
  Advantages: Rapid calculation; flood stage prediction capability documented; easy setup; no detailed description of hydrological system required
  Disadvantages: Prone to giving unrealistic predictions during extreme events

Modeling approach: Lumped-parameter
  Examples: Tank model, unit hydrograph
  Advantages: Rapid calculation; flood stage prediction capability documented

Modeling approach: Distributed
  Examples: Kinematic wave, dynamic wave, physical
  Advantages: Potential for highly accurate simulations; calculation of distributed watershed discharge
  Disadvantages: Requires detailed input data set and lengthy setup period; computer resource-intensive; few examples available of real-time use for flood prediction

2.2. Hydro-BEAM

Hydro-BEAM (Hydrological Basin Environmental Assessment Model) is being developed by the Water Resources Research Center of Kyoto University for the purpose of distributed rainfall-runoff simulation.
It is noted that a large number of other distributed rainfall-runoff models have also been developed, such as the Topography Based Hydrological Model (TOPMODEL) (Beven et al., 1995), the Système Hydrologique Européen (SHE) model (Bathurst, 1986), and TOPOG_IRM (Zhang et al., 1999).

Hydro-BEAM was first developed by Kojiri et al. (1998) as a tool to assist in simulating long-term fluctuations in water quantity and quality in rivers through an understanding of the hydrological processes that occur within a watershed. It has since been used in a pioneering work on comparative hydrology, where a methodology for assessing the similarity between watersheds was proposed (Park et al., 2000); to investigate sediment transport processes in the large watershed of the Yellow River, China (Tamura and Kojiri, 2002); and to investigate pesticide levels in rivers and their effects on hormone levels in fish (Tokai et al., 2002). Hydro-BEAM is used for the first time in this study for real-time flood stage forecasting, in collaboration with Mitsui Consultants Co., Ltd. of Japan.

The use of a distributed rainfall-runoff model allows simulation and prediction of discharge levels at every point within a watershed’s channel network, rather than just at a handful of specified locations as with lumped-parameter hydrological models. It is reasoned that the spatially and temporally detailed input data used by Hydro-BEAM will enable flow routing to be modeled with an accuracy higher than that currently achieved by the lumped models commonly in use today.

2.3. Hydro-BEAM structure

The watershed is modeled as a uniform array of multi-layered mesh cells, each containing information regarding surface land use characteristics, ground surface slope and runoff direction, and the presence/absence of a channel.
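The information held by each mesh cell could be represented along the following lines. This is only a sketch of one possible data structure; the class and field names are hypothetical and do not reflect the actual Hydro-BEAM source code.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class MeshCell:
    """One unit mesh cell (all field names are illustrative assumptions)."""
    land_use_fraction: Dict[str, float]          # fraction of cell area per land use type
    slope: float                                 # ground surface slope (rad)
    flow_direction: Optional[Tuple[int, int]]    # (row, col) offset to the downstream cell,
                                                 # None for the cell at the watershed mouth
    has_channel: bool                            # presence/absence of a channel

    def __post_init__(self):
        # Land use types are stored as fractional cover, so they must sum to one.
        total = sum(self.land_use_fraction.values())
        if abs(total - 1.0) > 1e-6:
            raise ValueError(f"land use fractions sum to {total}, expected 1.0")


# Hypothetical cell: mostly forest, draining to the cell one row below.
example_cell = MeshCell(
    land_use_fraction={"forest": 0.8, "urban": 0.2},
    slope=0.05,
    flow_direction=(1, 0),
    has_channel=False,
)
```

Validating the fractions at construction time catches inconsistent land use inputs before they reach the runoff calculation.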
The original Hydro-BEAM model, which uses four shallow subsurface layers, can be configured to include only two subsurface layers as described in the following sections, to allow for faster real-time calculation where necessary. Evaporation losses are ignored in this study, as their magnitude is considered negligible during a flood event. The water quality modeling functions of Hydro-BEAM are also removed.

2.3.1. Mesh cell model

The watershed to be investigated is divided into an array of unit mesh cells. A mesh cell can be arranged as a combination of a surface layer and several subsurface layers. The following description considers Hydro-BEAM configured with only two subsurface layers, labeled A and B. Both subsurface layers are assumed to have a slope equal to the slope of the ground surface. The mesh cell model is depicted in Figure 2-1.

Figure 2-1: Mesh cell model

Land use data available from satellite survey is grouped into five standard categories as given in Table 2-2. If necessary, these categories and definitions can easily be adjusted to better suit the peculiarities of the particular watershed being modeled. The ground surface land use characteristics of a mesh cell are modeled as demonstrated in Figure 2-2, with land use types grouped and represented as a percentage land cover of the total area of the mesh cell.

Table 2-2: Land use groupings for Hydro-BEAM

Category | Description
Forest | Densely-vegetated regions
Field | Agricultural regions including farms and orchards
Urban area | Paved or otherwise impervious urban regions
Paddy field | Regions composed of paddy fields
Water body | Bodies of water including inland waters and the sea

Figure 2-2: Arrangement of land use categories on a mesh cell surface (panels: satellite image, modeled surface)

Land use information is used to specify the structure of a sub-cell, and its infiltration and runoff characteristics.
Three different sub-cell structures are created as given in Table 2-3 to suit the various land use types.

Table 2-3: Land use regions and sub-cell structure

Sub-cell type | Category | Layer A | Infiltration
Mountain | Forest, field | Yes | Yes
City | Urban area | No | No
Water | Paddy field, water body | No | Yes

Discharge from each mesh cell is calculated based on the discharge from each land use type sub-cell. A precipitation input for each sub-cell is determined relative to the percentage cover of the land use type, resulting in surface runoff to an adjacent river channel or downstream mesh cell, or infiltration to a subsurface layer. It is assumed that 100% of the runoff that infiltrates to layer B is lost to groundwater recharge, with discharge from layer A and surface discharge being routed to the nearest downstream river channel. Discharge routing within a mesh cell that does not contain a river channel is described in Figure 2-3.

Figure 2-3: Discharge routing within a mesh cell (diagram labels: rainfall; forest and field over layer A; paddy field, water body, and urban area; lateral flow; infiltration; groundwater; downstream mesh cell)

2.3.2. Watershed model

In order to route the discharge in a watershed resulting from a precipitation event to the watershed mouth, it is necessary to connect the mesh cells that comprise the watershed through the use of a flow routing map. The function of a flow routing map is to define a downstream destination for the discharge resulting from every cell in the watershed, with the exception of the furthest downstream mesh cell located at the watershed mouth. A procedure to determine a flow routing map for a generic watershed was developed in this research as part of a semi-automated model calibration tool to be used with Hydro-BEAM. The procedure for determining a flow routing map is outlined here.
A combination of a digital elevation map and a printed watershed map can be used to achieve the following:
- Determination of the watershed boundary location.
- Division of the watershed into a regular grid of mesh cells.
- Determination of a flow routing network based on mesh cell elevation as given by a DEM and checked against a printed map.

Flow direction from any given mesh cell can be estimated using a simple algorithm where information is not otherwise available from ground surveys of channel positions. The algorithm chooses the runoff path in the direction of greatest slope as determined by a DEM, with alterations made to remove areas where ponding would occur because a mesh cell has an elevation below that of each of its surrounding mesh cells, or where a complete path from every mesh cell to the furthest downstream point of the watershed is not achieved.

An example flow routing map is shown in Figure 2-4. As can be seen, when regular-shaped mesh cells are used it is difficult to match the modeled surface boundary exactly with the boundary of the watershed; however, the error associated with this difference is generally small, especially for watersheds with a surface area three orders of magnitude greater than that of the unit mesh cell.

Figure 2-4: Flow routing map structure (legend: flow direction, mesh cell, actual watershed boundary, watershed mouth, mesh with channel, mesh without channel)

Hydro-BEAM was originally developed to use a 4-direction flow routing map. For this research, however, Hydro-BEAM has been modified to allow 8-direction flow routing. This has introduced the need to consider how to model runoff within mesh cells that slope toward a diagonally adjacent mesh cell, and to reconsider how to route flow between mesh cells. The 8-direction flow routing scheme is described in Figure 2-5.
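The steepest-slope rule for choosing a flow direction among eight neighbours can be sketched as follows. This is an illustrative sketch under stated assumptions, not the calibration tool developed in this research: the function name, the nested-list DEM layout, and the default cell length of 1000 m are all hypothetical, and cells with no lower neighbour are simply reported so that they can be altered separately.

```python
import math

# (row, col) offsets to the 8 neighbouring mesh cells
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                     (0, -1),           (0, 1),
                     (1, -1),  (1, 0),  (1, 1)]


def steepest_descent_direction(dem, row, col, cell_length=1000.0):
    """Return the (d_row, d_col) offset of the steepest downhill neighbour.

    dem is a nested list of elevations (m). Returns None when no neighbour
    is lower than the cell, i.e. a ponding cell that needs manual alteration.
    """
    best_offset = None
    best_slope = 0.0
    for d_row, d_col in NEIGHBOUR_OFFSETS:
        r, c = row + d_row, col + d_col
        if not (0 <= r < len(dem) and 0 <= c < len(dem[0])):
            continue  # neighbour lies outside the grid
        # Distance is l for adjacent cells and sqrt(2)*l for diagonal cells.
        distance = cell_length * math.hypot(d_row, d_col)
        slope = (dem[row][col] - dem[r][c]) / distance
        if slope > best_slope:
            best_slope = slope
            best_offset = (d_row, d_col)
    return best_offset


# Hypothetical 3 x 3 DEM falling toward the lower-right corner.
example_dem = [[5.0, 4.0, 3.0],
               [4.0, 3.0, 2.0],
               [3.0, 2.0, 1.0]]
example_direction = steepest_descent_direction(example_dem, 1, 1)
```

Dividing the elevation drop by the centre-to-centre distance is what keeps diagonal neighbours from being favoured simply because they are further away.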
The major changes involve the need to distinguish between flow from adjacent mesh cells that enters laterally and flow that enters from above a mesh cell. Also, the surface of a diagonally sloping mesh cell is approximately represented as a rectangle with dimensions $\sqrt{2}\,l$ by $l/\sqrt{2}$ for the sake of surface runoff calculations. The channel length is increased to $\sqrt{2}\,l$, where $l$ is the length of a mesh cell. Mesh cells are classified as above or laterally positioned, depending on their position relative to the downstream receiving mesh cell. The two possible cases are shown in the upper-left corner of Figure 2-5, with mesh cells positioned above the receiving mesh cell colored green and laterally positioned mesh cells colored blue. As with Figure 2-4, mesh cells containing a river channel contain a navy-colored arrow.

For the case of 8-directional flow to a downstream cell (colored orange) containing a river channel, surface flow entering from laterally positioned upstream mesh cells is distributed along the length of the downstream mesh cell’s boundary, and surface flow entering from above is routed directly to the upstream boundary of the adjacent downstream channel. Channel flow is routed from all upstream mesh cells, whether positioned above or laterally, directly to the upstream boundary of the downstream mesh cell’s channel.

Figure 2-5: 8-direction flow routing scheme

2.4. Hydro-BEAM flow routing theory and equations

A finite difference approximation of the kinematic wave model can be used to model watershed runoff on the surface and in layer A.
The kinematic wave model is described using the following equations:

$$\frac{\partial A}{\partial t} + \frac{\partial q}{\partial x} = r(t, x) \tag{2-1}$$

$$q = f(x, A) \tag{2-2}$$

where A is cross-sectional flow area [m2], q is discharge [m3/s], t is time [s], x is longitudinal distance along a channel or surface [m], and r is lateral inflow per unit length of flow [m3/(m·s)], equal to the sum of precipitation and inflow from adjacent areas minus losses due to infiltration. The kinematic wave model is based on the Saint-Venant equations and ignores inertial and pressure forces. The main assumptions for the Saint-Venant equations are as follows:
- Flow is one-dimensional
- The slope of the channel bottom is small
- The vertical pressure distribution is hydrostatic and wave lengths are large compared with water depth
- The fluid being modeled is incompressible

The necessary initial and boundary conditions for the kinematic wave model used in a distributed rainfall-runoff simulation model are as follows:

$$A(0, x) = A_0(x), \quad 0 \le x \le L \tag{2-3}$$

$$A(t, 0) = A_B(t), \quad 0 \le t \tag{2-4}$$

where L is the length [m] of the channel or slope being modeled, $A_0$ is the cross-sectional area of flow at time t = 0, and $A_B$ is the cross-sectional area at the upstream mesh cell boundary.

2.4.1. Finite difference approximation of the kinematic wave equation

The kinematic wave equations presented here do not have an explicit analytical solution; however, a range of finite difference numerical solutions may be used, which involve solution of the partial differential equations on an x-t plane divided into a grid. One finite difference approximation of the kinematic wave model described in Equation 2-1 and Equation 2-2 is the scheme developed by Beven (1979), given below.
$$\frac{q_i^{t+1} - q_i^{t}}{\Delta t} + \alpha\, c_{i+1/2}^{t+1} \left( \frac{q_{i+1}^{t+1} - q_i^{t+1}}{\Delta x} - r \right) + (1 - \alpha)\, c_{i+1/2}^{t} \left( \frac{q_{i+1}^{t} - q_i^{t}}{\Delta x} - r \right) = 0 \tag{2-5}$$

Here the subscript refers to the space coordinate and the superscript refers to the time coordinate, $\alpha$ is a time weighting parameter, and:

$$c_{i+1/2} = 0.5\,(c_i + c_{i+1}) \tag{2-6}$$

where c is the kinematic wave velocity:

$$c = \frac{dq}{dA} \tag{2-7}$$

The solution is calculated along a time line from upstream to downstream as shown in Figure 2-6.

Figure 2-6: Finite difference mesh used in Beven kinematic wave routing model (axes: distance x with nodes $x_i$, $x_{i+1}$ spaced $\Delta x$ apart; time t with levels t, t+1 spaced $\Delta t$ apart; known and unknown points marked)

2.4.2. Linear approximation method

The above finite difference approximation of the kinematic wave equation can suffer from instability, especially when larger temporal and spatial step sizes are used. To some extent, this may be reduced by closely observing the Courant condition, which states that a wave or hydrograph should not be allowed to travel through a subreach $\Delta x$ in a time less than the computational interval $\Delta t$, such that

$$c \le \frac{\Delta x}{\Delta t} \tag{2-8}$$

A linear approximation method developed by Shiiba (1993) has been found to be a stable alternative to the kinematic wave approximation method given above, and is used in this research. For a given cell, whether considering surface flow or flow in a channel, the following notation is used:
- $i_i^t$, $i = 1, \ldots, M$: Inflow from upstream cells (m3/s)
- $Q^t$: Discharge at the downstream end of the cell (m3/s)
- $q(t, x)$: Flow rate at a given point (m3/s)
- $r^t$: Lateral inflow (m3/s)
- $A^t$: Flow area at a position in the cell (m2)
- $S^t$: Storage within a sub-length of the channel or surface (m3)
- $\Delta t$: Calculation time step (s)

Upstream boundary: $q(t, 0) = \sum_{i=1}^{M} i_i^t$
Downstream boundary: $q(t, N) = Q^t$

where L is the length of the cell, divided into N segments, with dividing points along the cell labeled j = 0 (upstream) through to j = N (downstream).
Calculation proceeds by setting the two conditions that:
- The relationship between flow rate and flow area is defined: q = f(A)
- Flow rate q varies linearly from upstream to downstream within a cell

The following two equations describe the flow routing within a sub-length of the cell based on the above conditions:

$$q_j = f(A_j) \tag{2-9}$$

$$q_j = q_0 + \frac{j}{N}\,(Q - q_0) \tag{2-10}$$

The change in storage within the whole mesh cell over time step $\Delta t$ is

$$\Delta S = \Delta t \left( \sum_{i=1}^{M} i_i^t + r^t - Q_n \right) \tag{2-11}$$

where storage S can be calculated using trapezoidal interpolation such that

$$S^t = \int_0^L A\,dx \approx \frac{L}{N} \sum_{j=1}^{N} \frac{A_{j-1} + A_j}{2} \tag{2-12}$$

An approximation $Q_n$ of Q can then be calculated by reducing the following evaluation function $F_n$ to approximately zero using the Newton-Raphson method, making use of Equation 2-9.

$$F_n = \Delta t \left( \sum_{i=1}^{M} i_i^t + r^t - Q_n \right) - \Delta S = \Delta t \left( \sum_{i=1}^{M} i_i^t + r^t - Q_n \right) + S_{n-1} - S_n = \Delta t \left( \sum_{i=1}^{M} i_i^t + r^t - Q_n \right) + S_{n-1} - \frac{L}{N} \sum_{j=1}^{N} \frac{A_{j-1} + A_j}{2} \tag{2-13}$$

2.4.3. Surface flow in mountain regions

On mountain-type slopes, rainfall can easily penetrate the ground surface and infiltrate into layer A. For this reason, the rainfall that initially falls on the surface is permitted to flow within layer A, with surface flow only occurring once layer A is full, as depicted in Figure 2-7. Only surface flow is considered for city and water-type slopes, since layer A is impervious.

Figure 2-7: Combination of surface and layer A flows ($\theta$: average surface slope (rad), D: layer A thickness (m), w: slope width, HS: surface flow depth, HA: layer A flow depth)

The equation of momentum equivalent to Equation 2-2 for a mountain-type region is based on the Manning formula:

$$q = \begin{cases} \dfrac{w h k \sin\theta}{\gamma} + \dfrac{\sqrt{\sin\theta}}{n}\, w\, (h - \gamma D)^{5/3}, & h \ge \gamma D \\[1.5ex] \dfrac{w h k \sin\theta}{\gamma}, & h < \gamma D \end{cases} \tag{2-14}$$

where h = HA + HS, where it is assumed that the width w of the water body is much greater than the height h, where $\gamma$ and k are the effective porosity and the permeability, respectively, of layer A, and where n is surface roughness.

2.4.4. Surface flow in city and water regions

It is assumed that the rainfall input to city and water regions will develop into surface flow without first requiring the ground surface to become saturated. The sheet flow model can be described using the momentum equation given in Equation 2-15.

$$q = \frac{\sqrt{\sin\theta}}{n}\, w^{-2/3} A^{5/3} \tag{2-15}$$

2.4.5. Channel flow

In Hydro-BEAM, the mesh cells containing channels are specified, with flow occurring within these channels as a result of upstream inputs and lateral inflow from surface runoff. The equation of momentum for flow within a channel with dimensions as shown in Figure 2-8 is given in Equation 2-16. This simplifies to Equation 2-17 for the case of a rectangular channel, where the side slope of the channel becomes $\pi/2$.

Figure 2-8: Rectangular and trapezoidal channel cross-sections

$$q = \frac{\sqrt{\sin\theta}}{n}\, A^{5/3} \left( w + 2h\sqrt{1 + z^2} \right)^{-2/3} \tag{2-16}$$

$$q = \frac{\sqrt{\sin\theta}}{n}\, A^{5/3} \left( w + 2h \right)^{-2/3} \tag{2-17}$$

Here w is the width of the base of the channel, and $z = \cot\phi$ where $\phi$ is the side slope of the channel walls. A rectangular channel shape is assumed in this research.

2.5. Target watershed

The Nagara River (Figure 2-9) is a southward flowing river located in the Gifu and Mie prefectures of Japan, and has a total catchment area of 1985 km2 (Ministry of Construction, 2000). The river’s water is mainly used for irrigation and hydroelectricity generation, and provides water for the Tokai region. The Nagara River has undergone a number of improvements over the past centuries, commencing with the flood mitigation works of the Dutch engineer Johannis de Rijke in the 1870s. Presently the majority of the 54 km length of the Nagara River’s main stream is lined by concrete banks. No large dams, reservoirs, or weirs obstruct flow upstream of the Chusetsu observation station.
This makes the upper and middle sections of the watershed ideal for testing the forecasting method, as there is no need to consider artificially stored water bodies.

The Nagara River watershed is also a good choice for application of the forecast as it has a long history of flooding. Typhoon No. 17 of September 1976 caused a heavy storm, which resulted in a breach of the bank of the Nagara River, causing an estimated 1000 million yen in damage (Ministry of Construction, 1980). More recently, a flood event of considerable magnitude resulted from Typhoon No. 14 of September 2000 for the Nagara River and the neighboring watersheds of the Kiso and Ibi rivers. Presently, the design high water discharge and high-water flood stage for the Nagara River at Chusetsu are 7500 m3/s (Ministry of Construction, 2000) and 6.68 m (Ministry of Construction, 1995), respectively.

Radar observations of precipitation conditions over the entire landmass of Japan are made available in real time by both the Japanese Ministry of Land, Infrastructure and Transport, and the Japan Meteorological Agency. Radar data is currently provided by the former in the vicinity of the Nagara River watershed at five-minute intervals at a spatial resolution of 1 km by weather radars located at Gozaisho and Jyatoge, as shown in Figure 2-10(a).

Figure 2-9: Nagara River basin

2.5.1. Model calibration

Hydro-BEAM is fitted to the Nagara River watershed using a combination of digital survey data and printed maps to generate flow routing and land use maps, and using rainfall and discharge observations for calibration of runoff and infiltration parameters. The topography of the Nagara River watershed is shown in Figure 2-10(b) in the form of a digital elevation map.
A corresponding 8-direction flow routing map (Figure 2-11(a)) is calculated based on the topography and with reference to a map of the actual location of channels. Each mesh cell within the flow routing map is color-coded according to the number of mesh cells that lie upstream of the cell. This allows for easy cross-checking of the positions of major and minor river channels against a printed map of the watershed. A total of 1556 mesh cells, each approximately 1 km2 in area, are used to describe the upper and middle catchment areas located upstream of Chusetsu. The land use of this area is divided into five categories, and the percentage cover of each land use for each mesh cell is extracted from data sets obtained from satellite images. The dominant land use type of each mesh cell is shown in Figure 2-11(b), and the percentage cover within each mesh cell of each land use type is shown in Figure 2-12. The model parameters are calibrated through trial and error using observations from typhoon events that occurred between 1992 and 1999, and the infiltration and roughness coefficients for each land use type and for channels are given in Table 2-4. Example simulation results for a rainfall event that occurred over the period 19-20/6/2001 are given in Figure 2-13 for Chusetsu and Akutami. Hourly rainfall averaged over every mesh cell in the watershed is included in each figure for reference.
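The upstream cell count used to color-code the flow routing map can be derived directly from the map itself. The sketch below assumes the flow routing map is stored as a dictionary that gives each cell's downstream neighbour; this representation and the function name are hypothetical and chosen only for illustration.

```python
def count_upstream_cells(downstream):
    """Count, for every cell, how many cells lie upstream of it.

    downstream maps each cell identifier to the identifier of the cell that
    receives its discharge, or to None for the cell at the watershed mouth.
    """
    counts = {cell: 0 for cell in downstream}
    for cell in downstream:
        receiver = downstream[cell]
        steps = 0
        # Walk the flow path to the mouth, crediting every cell passed through.
        while receiver is not None:
            counts[receiver] += 1
            receiver = downstream[receiver]
            steps += 1
            if steps > len(downstream):
                raise ValueError("flow routing map contains a loop")
    return counts


# Hypothetical four-cell watershed: a and b drain to c, c drains to the mouth d.
example_map = {"a": "c", "b": "c", "c": "d", "d": None}
example_counts = count_upstream_cells(example_map)
```

Cells with large counts trace out the main channel, which is what makes the cross-check against a printed map straightforward. The loop check also flags maps in which a complete path to the watershed mouth has not been achieved.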
Figure 2-10: (a) Coverage of Gozaisho and Jyatoge weather radar facilities, (b) topographical map of the Nagara River watershed (color scale: elevation (m))

Figure 2-11: (a) Flow routing map for the Nagara River watershed, (b) map of dominant land use within each mesh cell

Figure 2-12: Percentage cover within each mesh cell of (a) forest, (b) field, (c) urban area, (d) paddy field, (e) water body

Table 2-4: Infiltration and roughness coefficients for each land use type and channels

Type | Infiltration coefficient (mm/hr) | Roughness coefficient
Forest | 0.2 | 0.3
Field | 0.2 | 0.2
Urban area | 0 | 0.05
Paddy field | 0.1 | 2
Water body | 0.1 | 0.01
Channel | 0 | 0.05

Figure 2-13: Example simulation results for (a) Chusetsu, and (b) Akutami (each panel plots rainfall (mm/hr) together with observed and calculated discharge (m3/s) against time, 19/6/2001 to 20/6/2001)

2.6. Conclusions

A distributed rainfall-runoff model is chosen for use in this research to provide simulation results for each point within a watershed, rather than at only a limited number of specific locations. Hydro-BEAM has been chosen as a suitable distributed model and its structure and underlying equations have been summarized. Modifications to Hydro-BEAM have been made to allow for real-time flood routing, through removal of water quality and evapotranspiration components, introduction of an 8-direction flow routing scheme, and provision of the option to reduce the number of subsurface layers. Calibration for the Nagara River watershed has been demonstrated.
3. PROBABILISTIC DISTRIBUTED RAINFALL FORECASTING

A procedure for probabilistically forecasting short-term distributed rainfall conditions is developed in this chapter. This procedure involves the analysis of historical and current stochastic properties of the translation, rotation and growth-decay characteristics of rainfall so as to allow for the stochastic generation of the future development of currently observed rainfall fields. The approach presented here is developed from the stochastic rainfall pattern simulator proposed by Smith (2003) and Smith and Kojiri (2004). A Monte Carlo simulation procedure based on a translation vector model is employed for modeling the temporal and spatial dynamics of rainfall patterns in terms of their horizontal translation and growth-decay properties. The procedure is based on a time series and statistical analysis of the vector series that describe the radar-observed rainfall patterns. This procedure is capable of generating distributions of future rainfall field time series sufficient for use in forecasting short-term rainfall-runoff dynamics during periods of heavy rainfall. The stochastically generated rainfall patterns can subsequently be input into the distributed rainfall-runoff model Hydro-BEAM to produce an ensemble forecast of future discharge conditions at all locations within a watershed.

3.1. Study location and rainfall data

An approximately 240 km × 240 km region encompassing the entire Nagara River watershed, located between 34°40’ and 36°40’N and 135°00’ and 138°00’E, is used as the study location. A forecast horizon of up to 6 hours is considered in this study so as to allow ample time for evacuation warning, and as such this region is chosen to encompass not only the target watershed, but also a buffer region surrounding the Nagara River that allows for a distance traversable by a rainfall field over a 6-hour period.
Rainfall intensity within the target region is modeled as a surface on a Cartesian plane divided into a rectangular mesh of dimension $\Delta x \times \Delta y$, where both $\Delta x$ and $\Delta y$ are approximately 1 km. Rainfall intensity at time t within a mesh cell with center coordinates (x, y) is described as z(x,y,t). The coordinates of this system are as follows:

$$x_i = (i - 0.5)\,\Delta x, \quad i = 1, \ldots, M; \qquad y_j = (j - 0.5)\,\Delta y, \quad j = 1, \ldots, N \tag{3-1}$$

Here M and N are the number of mesh cells in the x and y directions, respectively. An example rainfall intensity pattern converted from a rainfall echo observed using weather radar is depicted in Figure 3-1.

Figure 3-1: Observed rainfall intensity pattern described as a 3-dimensional surface

3.2. Modeling translation and rotation of rainfall fields

A rainfall translation model (Shiiba et al., 1984; Takasao et al., 1994) is used as the basis for modeling and forecasting the movement of rainfall patterns. The translation model describes the dynamics of the rainfall strength z at each mesh cell (x, y) as:

$$\frac{\partial z}{\partial t} + u \frac{\partial z}{\partial x} + v \frac{\partial z}{\partial y} = w \tag{3-2}$$

where u and v are the elements of the translation vector and w is the growth-decay head. These variables are further described by the following first-order (linear) functions of x and y:

$$u = c_1 x + c_2 y + c_3, \qquad v = c_4 x + c_5 y + c_6, \qquad w = c_7 x + c_8 y + c_9 \tag{3-3}$$

Here, c1~c9 are parameters to be estimated through analysis of past rainfall patterns.

3.2.1. Identification of translation vector parameters from rainfall patterns

In order to analyze the translation, rotation and growth-decay behavior of observed rainfall pattern series, the translation vector parameters must first be determined. In order to identify the parameters,

$$J_c = \sum_{k=-K}^{-1} \sum_{i=2}^{M-1} \sum_{j=2}^{N-1} v_{ijk}^2 \tag{3-4}$$

is minimized, where

$$t_k = k\,\Delta t, \quad k = -K, -K+1, \ldots, 0 \tag{3-5}$$

and (K+1) is the number of observations used over time period $K \Delta t$.
Here, $v_{ijk}$ is defined as follows:

$$v_{ijk} = \left[ \frac{\partial z}{\partial t} \right]_{ijk} + (c_1 x_i + c_2 y_j + c_3) \left[ \frac{\partial z}{\partial x} \right]_{ijk} + (c_4 x_i + c_5 y_j + c_6) \left[ \frac{\partial z}{\partial y} \right]_{ijk} - (c_7 x_i + c_8 y_j + c_9) \tag{3-6}$$

Traditionally, translation vector parameters and the corresponding translation vectors and growth-decay head have been used to extrapolate the precipitation pattern series into the future, in effect providing a simple short-term precipitation prediction based on the assumption that the identified parameters remain constant for a short period of time (e.g. 1 to 2 hours).

3.2.2. Extrapolation of rainfall patterns

Extrapolation of a rainfall pattern forward in time over time step $\Delta t$ proceeds through tracing the pattern movement backwards along a characteristic curve defined for that time step by the translation vector parameters, using the following differential expressions:

$$\frac{dx(t)}{dt} = c_1 x(t) + c_2 y(t) + c_3, \qquad \frac{dy(t)}{dt} = c_4 x(t) + c_5 y(t) + c_6, \qquad \frac{dz(t)}{dt} = c_7 x(t) + c_8 y(t) + c_9 \tag{3-7}$$

The expressions in Equation 3-7 can be rearranged to determine an extrapolated pattern for a time step of $\tau$ into the future:

$$z(x, y, t_0 + \tau) = z\big(x(t_0), y(t_0), t_0\big) + S(\tau; c_1, \ldots, c_9) \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}, \qquad \begin{pmatrix} x(t_0) \\ y(t_0) \end{pmatrix} = R(\tau; c_1, \ldots, c_6) \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \tag{3-8}$$

where S and R are 3 × 3 and 2 × 3 matrices, respectively. By repeating the above process for each time step using the corresponding translation vector parameters, a series of future rainfall patterns can be generated.

Depending on the phenomena that need to be modeled, some of the parameters can be omitted from the analysis (i.e. set to a value of zero). Table 3-1 gives examples of the sets of parameters that can be used to model different combinations of translation, rotation and growth-decay phenomena, and Figure 3-2(a) and Figure 3-2(b) give example vector fields for parallel translation, and translation and rotation, respectively.
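For the simplest parameter set, parallel translation only (c3 and c6, with all other parameters set to zero), the minimization of Equation 3-4 reduces to a two-unknown least-squares problem. The sketch below illustrates this case under stated assumptions: central finite differences are used for all derivatives, which is a choice made here and not confirmed by the source, and the function name and nested-list data layout are hypothetical.

```python
def identify_parallel_translation(frames, dx, dy, dt):
    """Least-squares estimate of (c3, c6) for parallel translation only.

    frames[k][j][i] is rainfall intensity at time index k, row j (y direction)
    and column i (x direction). Derivatives use central differences, so only
    interior cells and interior time levels contribute (an assumption of this
    sketch).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    n_t, n_y, n_x = len(frames), len(frames[0]), len(frames[0][0])
    for k in range(1, n_t - 1):
        for j in range(1, n_y - 1):
            for i in range(1, n_x - 1):
                z_t = (frames[k + 1][j][i] - frames[k - 1][j][i]) / (2.0 * dt)
                z_x = (frames[k][j][i + 1] - frames[k][j][i - 1]) / (2.0 * dx)
                z_y = (frames[k][j + 1][i] - frames[k][j - 1][i]) / (2.0 * dy)
                sxx += z_x * z_x
                sxy += z_x * z_y
                syy += z_y * z_y
                sxt += z_x * z_t
                syt += z_y * z_t
    # Normal equations for minimizing sum (z_t + c3*z_x + c6*z_y)^2
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("rainfall gradients do not determine a unique vector")
    c3 = (-sxt * syy + syt * sxy) / det
    c6 = (-syt * sxx + sxt * sxy) / det
    return c3, c6


def _synthetic_field(x, y):
    # Hypothetical smooth rainfall surface used only for the demonstration.
    return x * x + 2.0 * y * y + x * y


# Three frames of the synthetic surface moving at u = 0.5, v = -0.25 cells per step.
example_frames = [[[_synthetic_field(i - 0.5 * t, j + 0.25 * t) for i in range(8)]
                   for j in range(6)] for t in range(3)]
example_c3, example_c6 = identify_parallel_translation(example_frames, 1.0, 1.0, 1.0)
```

The error raised for a vanishing determinant corresponds to rain fields whose gradients all point the same way, where motion along the contours cannot be observed.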
Table 3-1: Parameter combinations and corresponding modeled phenomena (✓: parameter estimated, -: parameter set to zero)

Phenomena | c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8 | c9
Case 1: Parallel translation only | - | - | ✓ | - | - | ✓ | - | - | -
Case 2: Translation and rotation only | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - | - | -
Case 3: Parallel translation, growth-decay | - | - | ✓ | - | - | ✓ | ✓ | ✓ | ✓
Case 4: Translation, rotation, growth-decay | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓

Figure 3-2: Vector fields for (a) Case 1, and (b) Case 2 (11/9/2000 20:55 - 21:00)

3.3. Time series analysis of observed translation vector parameters

The translation model described above can be used to analyze the temporal variation of rain fields through conversion of observations into series of vector parameters. Once observed patterns have been converted to their corresponding translation vector parameters, they can be analyzed using the autoregressive moving average (ARMA) methodology (Box and Jenkins, 1976). ARMA processes have been used previously to model the temporal variation of the translation vector parameters c3, c6, c7, c8, and c9, which describe parallel translation and growth-decay phenomena (Smith and Kojiri, 2004). The ARMA model is extended in this research to include the parameters that describe rotation, and is modeled as a multivariate time series, recognizing that the translation vector parameters may have not only serial dependence but also interdependence between each parameter series.

3.3.1. ARMA time series analysis

The time series of vector parameters are analyzed here using autoregressive moving average (ARMA) analysis. ARMA processes are defined by linear difference equations with constant coefficients and can be used to model and forecast stationary time series. $\{X_t\}$ is an ARMA(p,q) process if $\{X_t\}$ is stationary and if for every t,

$$X_t - \phi_1 X_{t-1} - \cdots - \phi_p X_{t-p} = Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q} \tag{3-9}$$

where $\{Z_t\}$ is a zero mean white noise process with variance $\sigma^2$: $\{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$, and the polynomials $(1 - \phi_1 z - \cdots - \phi_p z^p)$ and $(1 + \theta_1 z + \cdots + \theta_q z^q)$ have no common factors.
Equation 3-9 can be represented using the concise form

$$\phi(B) X_t = \theta(B) Z_t \tag{3-10}$$

The autoregressive (AR) operator of order p and the moving average (MA) operator of order q can be expanded as

$$\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p \tag{3-11}$$

and

$$\theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q \tag{3-12}$$

respectively, where $\phi_1, \phi_2, \ldots, \phi_p$ are the AR parameters, $\theta_1, \theta_2, \ldots, \theta_q$ are the MA parameters, and B is the backward shift operator ($B^j X_t = X_{t-j}$).

3.3.2. ARMA model identification

An ARMA(p,q) model can be fitted to time series data through inspection of the data, identification and removal of any trend or seasonality, identification of the correlation characteristics, and subsequent application of the Yule-Walker equations (Yule, 1927; Walker, 1931). Appropriate estimates of p and q can be made by considering the correlation characteristics of the data. Model identification and fitting for the time series of parameters c1 through c6 are discussed using parameters calculated from 5-minute radar rainfall observations over the period spanning 6:30 through 23:55, 10th September, 2001 (Figure 3-3).

A limitation of the translation vector model is that the vector parameters display erratic behavior during periods of low rainfall. The ARMA model representation of vector parameter fluctuations is inappropriate for periods when the rainfall field strength is on average less than approximately 1.0 mm/hr. This is not a serious limitation, as the system is to be used during periods of heavy rainfall in order to forecast flood conditions.

Inspection of the charts for vector parameters c1 ~ c6 suggests that the time series are non-stationary. This is confirmed by the charts in Figure 3-4, which display slowly decaying positive sample autocorrelation functions (ACFs) and cross-correlation functions (CCFs). Such behavior indicates that trend elimination through differencing is necessary to obtain a stationary time series to which an ARMA model can be fitted.
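The diagnostic described above can be illustrated with a minimal sketch. It uses a synthetic random walk rather than the radar-derived parameter series, and the helper names `sample_acf` and `significance_bound` are hypothetical. The sketch computes the sample ACF before and after differencing at lag one, together with the approximate 95% bounds of ±1.96n^(-1/2) for an uncorrelated series.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.dot(d, d)
    return np.array([np.dot(d[:len(d) - h], d[h:]) / denom
                     for h in range(max_lag + 1)])

def significance_bound(n):
    """Approximate 95% bound for sample correlations of an uncorrelated series."""
    return 1.96 / np.sqrt(n)

rng = np.random.default_rng(0)
steps = rng.normal(size=210)        # synthetic increments, not thesis data
series = np.cumsum(steps)           # non-stationary random walk

acf_raw = sample_acf(series, 20)             # decays slowly from 1
acf_diff = sample_acf(np.diff(series), 20)   # lag-one differenced series
bound = significance_bound(len(series) - 1)
```

A slowly decaying `acf_raw` alongside an `acf_diff` that stays mostly within `±bound` is the pattern that motivates differencing once before fitting an ARMA model.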
A generalization of the ARMA class of time series models which incorporates a wide range of non-stationary series is the ARIMA (AutoRegressive Integrated Moving Average) class of processes. These are processes that can be reduced to stationary ARMA processes through differencing, with {X_t} being an ARIMA(p,d,q) process if d is a non-negative integer and $Y_t = (1 - B)^d X_t$ is a causal ARMA(p,q) process.

Figure 3-3: Time series of translation vector parameters c1 ~ c6, 10/9/2001 (six panels: c1, c2, c4 and c5 in km/hr/km, c3 and c6 in km/hr, plotted against time in hr:min)

Differencing once at lag one produces the time series given in Figure 3-5 and the corresponding ACFs and CCFs given in Figure 3-6 and Figure 3-7, which suggest that an ARMA process may be suitable for the differenced series, equivalent to an ARIMA(p,1,q) process for the non-differenced series.

It is also necessary to consider the possibility of cross-correlation between vector parameters. Some of the sample cross-correlations $\hat{\rho}_{ij}(h),\ i \neq j$, for the differenced series lie outside the significance bounds of $\pm 1.96 n^{-1/2}$, within which approximately 95% of sample correlations should fall in the case of non-correlated series. In order to test for independence between each of the time series, maximum likelihood univariate ARMA models are fitted to each time series separately, and the correlations between the time series of residuals from the models are examined.
The hypothesis of independence between two series is rejected if it is observed that the sample cross-correlations $\hat{\rho}_{12}(h)$ of the residuals of the two series do not fall between the bounds of $\pm 1.96 n^{-1/2}$ with a probability of approximately 0.95. Inspection of the correlation between the time series of residuals indicates that strong interdependence exists between the parameters c1 and c3, and c2 and c3, which are the parameters used to describe translation in the x-direction, and also between c4 and c6, and c5 and c6, which are used to describe translation in the y-direction. This indicates the need to model the translation vector parameters as multivariate time series. There exists only weak dependence, if any, between other parameter combinations, and as such the x-direction parameter time series (c1, c2 and c3) are treated as being independent of the y-direction parameter time series (c4, c5 and c6) in this analysis.

Figure 3-4: Sample autocorrelation functions of c1 ~ c6, 10/9/2001 (six panels, lags 0 to 20)

If the multivariate time series processes X_t and Y_t are defined as

$$X_t = \begin{bmatrix} c_{t1} \\ c_{t2} \\ c_{t3} \\ c_{t4} \\ c_{t5} \\ c_{t6} \end{bmatrix}, \qquad Y_t = X_t - X_{t-1} = \nabla X_t = \begin{bmatrix} \nabla c_{t1} \\ \nabla c_{t2} \\ \nabla c_{t3} \\ \nabla c_{t4} \\ \nabla c_{t5} \\ \nabla c_{t6} \end{bmatrix} \tag{3-13}$$

where ∇ is the lag-1 difference operator, defined such that $\nabla X_t = X_t - X_{t-1} = (1 - B) X_t$, then a multivariate ARMA(p,q) process for {Y_t}, equivalent to a multivariate ARIMA(p,1,q) process for {X_t}, can be defined by requiring that {Y_t} satisfies a set of linear difference equations with constant
coefficients. {Y_t} is an ARMA(p,q) process if {Y_t} is stationary and if for every t,

$$Y_t - \Phi_1 Y_{t-1} - \cdots - \Phi_p Y_{t-p} = Z_t + \Theta_1 Z_{t-1} + \cdots + \Theta_q Z_{t-q} \tag{3-14}$$

where {Z_t} is zero mean white noise with covariance matrix Σ, written {Z_t} ~ WN(0, Σ), and the AR and MA coefficients {Φ_i} (i = 1, 2, …, p) and {Θ_i} (i = 1, 2, …, q) are real m×m matrices. Also, {Y_t} can be considered an ARMA(p,q) process with mean μ if {Y_t - μ} is an ARMA(p,q) process.

Figure 3-5: Time series of ∇c1 ~ ∇c6, 10/9/2001 (six panels plotted against time in hr:min)

Figure 3-6: Sample autocorrelation functions of ∇c1 ~ ∇c6, 10/9/2001 (six panels, lags 0 to 20)

Figure 3-7: Selected sample cross-correlation functions of ∇c1 ~ ∇c6, 10/9/2001 (panels for the pairs ∇c1,∇c2; ∇c1,∇c3; ∇c2,∇c3; ∇c4,∇c5; ∇c4,∇c6; ∇c5,∇c6; and ∇c3,∇c6; lags 0 to 20)

An AR(2) model with zero mean is fitted to the differenced series {Y_t} using Yule-Walker estimation, for modeling parallel translation and rotation (Case 2). This is equivalent to an ARIMA(2,1,0) model for {X_t}. The AR parameter matrices and the white noise covariance matrix are given below.

$$\Phi_1 = \begin{bmatrix} -0.650 & -0.213 & -0.00122 & 0 & 0 & 0 \\ -0.338 & -0.644 & -0.00236 & 0 & 0 & 0 \\ 25.5 & 25.0 & -0.348 & 0 & 0 & 0 \\ 0 & 0 & 0 & -0.487 & -0.218 & -0.00176 \\ 0 & 0 & 0 & 0.260 & -0.269 & 0.00199 \\ 0 & 0 & 0 & -18.4 & -0.0908 & -0.530 \end{bmatrix}$$

$$\Phi_2 = \begin{bmatrix} -0.325 & -0.0253 & -0.000565 & 0 & 0 & 0 \\ -0.295 & -0.425 & -0.00264 & 0 & 0 & 0 \\ 28.3 & 17.8 & -0.0125 & 0 & 0 & 0 \\ 0 & 0 & 0 & -0.292 & -0.241 & -0.00175 \\ 0 & 0 & 0 & 0.157 & -0.00102 & 0.00106 \\ 0 & 0 & 0 & -1.59 & 3.55 & -0.121 \end{bmatrix}$$

$$\Sigma = \begin{bmatrix} 0.00114 & 0.000092 & -0.0998 & 0 & 0 & 0 \\ 0.000092 & 0.000658 & -0.0531 & 0 & 0 & 0 \\ -0.0998 & -0.0531 & 13.2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0.000840 & -0.000087 & -0.0509 \\ 0 & 0 & 0 & -0.000087 & 0.000813 & -0.0507 \\ 0 & 0 & 0 & -0.0509 & -0.0507 & 8.80 \end{bmatrix}$$

Inspection of the multivariate residual series of the model developed in this analysis indicates that the ACFs and CCFs are negligible for all lags greater than zero, which is a requirement for a well-fitted model.

A time series model for the case of parallel translation only (Case 1) is also fitted by considering c_t3 and c_t6 as separate univariate processes. The time series and autocorrelation functions for c_t3 and c_t6 differ from the series given in the above figures for 10/9/2001, as they are calculated without considering rotation of the rainfall field. In this case, X_t and Y_t are defined as

$$X_t = \begin{bmatrix} c_{t3} \\ c_{t6} \end{bmatrix}, \qquad Y_t = X_t - X_{t-1} = \nabla X_t = \begin{bmatrix} \nabla c_{t3} \\ \nabla c_{t6} \end{bmatrix} \tag{3-15}$$

and an ARMA(0,1) model with zero mean is fitted to the differenced series {Y_t}, equivalent to an ARIMA(0,1,1) model for {X_t}.
The MA parameter matrix and the white noise covariance matrix are given below.

$$\Theta_1 = \begin{bmatrix} -0.689 & 0 \\ 0 & -0.590 \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} 2.49 & 0 \\ 0 & 2.68 \end{bmatrix}$$

3.3.3. Generation of vector parameters

Future translation vector parameters can be stochastically generated based on time series models estimated using the methodology proposed above. With each extrapolation step, white noise Z_t is sampled based on the characteristics of the covariance matrix Σ. For a multivariate AR(p) model for {Y_t}, initial values of X_{t-1}, …, X_{t-p}, X_{t-p-1} are estimated during the identification process based on the four most recently observed rainfall patterns, along with the corresponding values of Y_{t-1}, …, Y_{t-p}. The simulation proceeds through solution of Equation 3-14 for each future time step up to the desired forecast lead-time.

3.4. Time series analysis of growth-decay of rainfall fields

A time series model can be fitted to the parameters c7, c8 and c9, which comprise the growth-decay vector w, using the same approach as for the parameters that describe translation and rotation. Cross-correlation analysis indicates no correlation with parameters c1 through c6, and as such the time series model for growth-decay is developed independently. Here X_t and Y_t are defined as:

$$X_t = \begin{bmatrix} c_{t7} \\ c_{t8} \\ c_{t9} \end{bmatrix}, \qquad Y_t = X_t - X_{t-1} = \nabla X_t = \begin{bmatrix} \nabla c_{t7} \\ \nabla c_{t8} \\ \nabla c_{t9} \end{bmatrix} \tag{3-16}$$

The cross-correlation between c7, c8 and c9 is weaker than for the translation and rotation parameters, as is the temporal dependence. The time series process {X_t} demonstrates weak non-stationarity, and as such the differenced series {Y_t} is used for fitting a time series model. An AR(2) model is chosen for {Y_t}, equivalent to an ARIMA(2,1,0) model for {X_t}. The AR parameter matrices and the white noise covariance matrix identified for the period spanning 6:30 through 23:55, 10th September, 2001 are given below.
$$\Phi_1 = \begin{bmatrix} -0.566 & -0.128 & -0.000327 \\ 0.125 & -0.530 & 0.00190 \\ -11.7 & -3.36 & -0.800 \end{bmatrix}, \qquad \Phi_2 = \begin{bmatrix} -0.326 & -0.0520 & 0.00026 \\ 0.0908 & -0.0678 & 0.00259 \\ -6.88 & -25.0 & -0.618 \end{bmatrix}$$

$$\Sigma = \begin{bmatrix} 0.000984 & 0.000236 & -0.0587 \\ 0.000236 & 0.000656 & -0.0520 \\ -0.0587 & -0.0520 & 8.89 \end{bmatrix}$$

3.5. Statistical analysis of growth-decay of rainfall fields

The translation vector model can be used to model growth-decay dynamics of regional rainfall; however, it is unable to account for the small variations that occur at the mesh cell scale. As such, even if an observed rainfall pattern could be extrapolated into the future over a 5-minute time step using a set of optimal vector parameters that would give the best possible prediction, small errors between the prediction and the corresponding observation would be observed due to the limitations of the model.

Tachikawa et al. (2003) developed a stochastic model of real-time rainfall prediction error based on an analysis of the frequency distributions of prediction error of the translation vector model. This approach was capable of stochastically modeling rainfall growth-decay dynamics at the mesh cell level, with the minor drawback that the model could not simulate rainfall conditions for a mesh cell once rain ceased to fall within that cell.

An approach is developed here which considers a residual growth-decay term that is added to each mesh cell following translation and rotation. This residual growth-decay term is characterized by a distribution which is conditional on the rainfall strength at the mesh cell and is considered to be spatially correlated with surrounding residual growth-decay terms. This approach can be used as an alternative to time series analysis for modeling growth-decay of rainfall fields. It is designed to be used together with the time series model for translation and rotation of rainfall fields.
In this approach, the growth-decay vector from Equation 3-2 is set as w = 0, and replaced with a residual growth-decay term ε(x,y,t):

$$\frac{\partial z}{\partial t} + u \frac{\partial z}{\partial x} + v \frac{\partial z}{\partial y} = \varepsilon \tag{3-17}$$

Observations of calculated residual growth-decay fields suggest that the distribution of possible values of ε at a given mesh cell at a given time is primarily conditional on the following factors:

i. The magnitude of the extrapolated rainfall value at the target mesh cell (RA).
ii. The average magnitude of the extrapolated rainfall at cells within the proximity of the target cell (RB). For the examples used here a radius of two mesh cells is used to define this area, which includes all adjacent cells, and all cells separated by a distance of one cell from the target mesh cell.
iii. The growth-decay values at surrounding mesh cells (spatial correlation).

These dependencies are apparent in the example given in Figure 3-8, where (a) is the result of translation using the set of optimal model parameters identified for the previous 5-minute time period, and (b) is the corresponding residual growth-decay field, calculated as the difference between the translated field and the corresponding observed rainfall field for that time step.

Figure 3-8: (a) Optimal calculated rainfall pattern (mm/hr) for 11 September 2000 21:00, (b) corresponding calculated residual growth-decay field (mm/hr/5-min).

Distributions of growth-decay values are determined using sets of historical radar rainfall observations. The residual growth-decay pattern for a given 5-minute time step between two radar observations made successively at times t-1 and t0 can be calculated through the following procedure:

i. Identify the optimal translation vector parameters for the period between the radar observations made at t-1 and t0.
ii. Apply the translation vector parameters to the rainfall pattern observed at t-1 to produce an extrapolated rainfall pattern for t0.
iii. Subtract the optimal extrapolated pattern for t0 from the actual observed pattern for t0.

3.5.1. Statistical analysis

Analysis is performed on past records of rainfall observations to identify the statistical properties of the growth-decay value for mesh cells under various conditions. The procedure given above for calculating residual growth-decay patterns is applied to an example storm, with the distribution of growth-decay values tallied for various combinations of RA and RB.

The distributions of growth-decay rates for values of RA greater than or equal to 5 mm/hr can be approximately modeled as Weibull distributions or as lognormal distributions. The parameters for each distribution (Table 3-2) can be conveniently modeled as functions of RA. In considering the location parameter for each distribution, a growth-decay value less than -RA would lead to a negative resultant rainfall strength for that particular mesh cell, which is unacceptable. As such, the location parameter can be set to -RA, which is the logical lower boundary for the distribution. Shape and scale parameters are found by performing regression on the observed values of γ and α for the Weibull distribution, and of σ and m for the lognormal distribution.

Table 3-2: Parameterization of the lognormal and Weibull distributions

| Parameter | Lognormal | Weibull | Note |
|---|---|---|---|
| Location | θ | μ | Shifts the distribution relative to the standard distribution. In the case of the Weibull and lognormal distributions, defines the location of the lower boundary. Set at -RA. |
| Shape | σ | γ | Defines the shape of the distribution. Modeled as a function of RA. |
| Scale | m | α | Stretches the distribution in relation to the associated standard distribution. Modeled as a function of RA. |
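The three-step procedure for calculating a residual growth-decay pattern, given before Section 3.5.1, can be sketched as follows. This is a simplified illustration only: it assumes parallel translation by a whole number of mesh cells, and the displacement is passed in directly rather than identified from the data (step i). The names `translate_whole_cells` and `residual_growth_decay` are hypothetical.

```python
import numpy as np

def translate_whole_cells(field, shift_rows, shift_cols):
    """Shift a rainfall grid by whole mesh cells; vacated cells are set to 0."""
    out = np.zeros_like(field)
    rows, cols = field.shape
    src = field[max(0, -shift_rows): rows - max(0, shift_rows),
                max(0, -shift_cols): cols - max(0, shift_cols)]
    r0, c0 = max(0, shift_rows), max(0, shift_cols)
    out[r0: r0 + src.shape[0], c0: c0 + src.shape[1]] = src
    return out

def residual_growth_decay(previous_obs, current_obs, shift_rows, shift_cols):
    """Steps ii and iii: extrapolate the pattern at t-1, then subtract it
    from the pattern observed at t0."""
    extrapolated = translate_whole_cells(previous_obs, shift_rows, shift_cols)
    return current_obs - extrapolated

# Toy 4x4 example (mm/hr); values are illustrative, not radar data.
prev = np.zeros((4, 4))
prev[1, 1] = 10.0
curr = np.zeros((4, 4))
curr[1, 2] = 12.0
eps = residual_growth_decay(prev, curr, shift_rows=0, shift_cols=1)
```

Because the observed field is non-negative, every residual computed this way is at least -RA for its cell, which is the lower boundary used for the location parameter.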
Radar-observed rainfall data from a storm commencing 11/9/2000 was used for the statistical analysis, with a selection of the resulting distributions given in Figure 3-9. A visual inspection of the results suggests that the resulting distributions can be approximated using either lognormal or Weibull distributions. It must be noted that while low Kolmogorov-Smirnov test scores were achieved for all distributions, the scores were not low enough to confirm that the growth-decay observations were likely to have been sampled from lognormal or Weibull distributions. For the purpose of stochastic generation of growth-decay values, it is considered that either of these distribution types provides a suitable representation of the growth-decay distributions.

The distributions of growth-decay values for RA of less than 5 mm/hr cannot be suitably modeled by either lognormal or Weibull distributions, so values of the cumulative distribution function of growth-decay values for each RA are tabulated (Table 3-3) rather than fitted to a particular distribution. Furthermore, to differentiate between growth-decay values at mesh cells when surrounding areas are completely dry, and when some surrounding areas are experiencing rainfall, distributions for RA = 0 mm/hr and RB = 0 mm/hr, and RA = 0 mm/hr and RB ≥ 1 mm/hr, respectively, are tabulated separately (Table 3-4). The distribution for the case RA = 0 mm/hr and RB = 0 mm/hr as shown in Figure 3-9 is very heavily dominated by growth-decay values of 0 mm/hr/5-min, as there is often no rainfall activity in the vicinity of the target mesh cells being considered, and thus growth-decay seldom occurs. It is for this reason that the distinction between RB = 0 mm/hr and RB ≥ 1 mm/hr must be considered.

The location, scale and shape parameters for both the lognormal and Weibull distributions can be modeled as functions of RA for values of RA ≥ 5 mm/hr. The variation with RA of the scale and shape parameters fitted to the observed growth-decay distributions is given for the lognormal distribution in Figure 3-10, and for the Weibull distribution in Figure 3-11. Lines of best fit are calculated through regression analysis and, where necessary, through a Genetic Programming search based on the results for 5 mm/hr ≤ RA ≤ 60 mm/hr. Distributions for RA > 60 mm/hr are not considered in the regression, as the fitted parameters for these distributions are less reliable due to the low number of observations that are available for these very high rainfall strengths. However, the results in Figure 3-10 and Figure 3-11 suggest that the observed parameter trends can be extrapolated for higher values of RA. The functions identified for the location, scale and shape parameters are given in Table 3-5. Thus the stochastic characteristics of the growth-decay dynamics of rainfall fields can be conveniently described at the mesh cell level through the use of only three functions.
Figure 3-9: Distributions of observed growth-decay rate frequencies for various values of RA (panels for RA = 0, 5, 10, 15, 20, 30, 40 and 50 mm/hr, plotting probability P against growth-decay rate in mm/hr/5-min; Weibull and lognormal fits are overlaid on the observed data for RA ≥ 5 mm/hr)

Table 3-3: Cumulative probabilities for growth-decay rates for values of 1 mm/hr ≤ RA < 5 mm/hr

| ε | RA = 1 | RA = 2 | RA = 3 | RA = 4 |
|---|---|---|---|---|
| -4 | | | | 0.004 |
| -3 | | | 0.027 | 0.015 |
| -2 | | 0.057 | 0.070 | 0.144 |
| -1 | 0.146 | 0.142 | 0.234 | 0.286 |
| 0 | 0.693 | 0.736 | 0.705 | 0.686 |
| 1 | 0.893 | 0.857 | 0.826 | 0.805 |
| 2 | 0.976 | 0.948 | 0.921 | 0.920 |
| 3 | 0.993 | 0.982 | 0.972 | 0.959 |
| 4 | 0.997 | 0.991 | 0.983 | 0.977 |
| 5 | 0.999 | 0.996 | 0.990 | 0.988 |
| 6 | 0.999 | 0.997 | 0.995 | 0.992 |
| 7 | 0.999 | 0.998 | 0.996 | 0.995 |
| 8 | 1.000 | 0.999 | 0.997 | 0.997 |
| 9 | 1.000 | 0.999 | 0.998 | 0.998 |
| 10 | 1.000 | 0.999 | 0.998 | 0.998 |
| 11 | 1.000 | 0.999 | 0.999 | 0.999 |
| 12 | 1.000 | 1.000 | 0.999 | 0.999 |
| 13 | 1.000 | 1.000 | 0.999 | 0.999 |
| 14 | 1.000 | 1.000 | 1.000 | 0.999 |
| 15 | 1.000 | 1.000 | 1.000 | 0.999 |
| 16 | 1.000 | 1.000 | 1.000 | 1.000 |

Table 3-4: Cumulative probabilities for growth-decay rates for values of RA = 0 mm/hr

| ε | RB = 0 | RB ≥ 1 |
|---|---|---|
| 0 | 0.951 | 0.529 |
| 1 | 0.976 | 0.652 |
| 2 | 0.993 | 0.884 |
| 3 | 1.000 | 0.981 |
| 4 | 1.000 | 0.989 |
| 5 | 1.000 | 0.994 |
| 6 | 1.000 | 0.999 |
| 7 | 1.000 | 0.999 |
| 8 | 1.000 | 0.999 |
| 9 | 1.000 | 1.000 |

Figure 3-10: Variation of (a) scale and (b) shape parameters with RA for lognormal distribution (observed values and best fit, RA from 0 to 80 mm/hr)

Figure 3-11: Variation of (a) scale and (b) shape parameters with RA for Weibull distribution (observed values and best fit, RA from 0 to 80 mm/hr)

Table 3-5: Distribution parameters as functions of RA, for RA ≥ 5 mm/hr

| Param. | Lognormal | Weibull |
|---|---|---|
| Scale | m = -0.0016RA² + 0.871RA + 0.873 | α = -0.0023RA² + 1.06RA + 0.838 |
| Shape | σ = log2.5 1.43 2 5.0 2.0RA 9.0 (layout of this expression lost in extraction) | γ = 0.409ln(RA) + 1.73 |
| Location | θ = -RA | μ = -RA |

3.5.2. Estimating spatial correlation

In addition to modeling translation dynamics of a rain field at the regional scale and growth-decay dynamics at the mesh cell scale, it is also necessary to consider the relationship that exists on the intermesh scale. For this reason a framework for modeling spatial correlation is required. Spatial correlation can be assessed based on Moran's I, a weighted product-moment correlation coefficient.
$$I = \frac{N \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} w_{ij} z_i z_j}{S_0 \sum_{i=1}^{N} z_i^2} \tag{3-18}$$

Here N equals the number of mesh cells, w_ij is a weight denoting the proximity of the mesh cells i and j, z_i and z_j are values separated by a geographic distance, and S_0 is the sum of the weights:

$$S_0 = \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} w_{ij} \tag{3-19}$$

Moran's I is used to judge whether mesh cells in close proximity to each other are more similar than would be expected under spatial randomness. Values of I larger than 0 indicate positive spatial correlation and values smaller than 0 indicate negative spatial correlation. The expectation of I under the null hypothesis H0, that values are spatially independent, is given by:

$$E(I) = -\frac{1}{N - 1} \tag{3-20}$$

The expectation approaches zero as N increases.

Equation 3-18 can be modified to express the correlation between values separated by a given number of mesh cells defined by class h, with h = 1 representing neighboring mesh cells sharing an edge or a diagonal corner, h = 2 representing all pairs of mesh cells that have one cell lying between them, and so on.

$$I(h) = \frac{N \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} z_i z_j}{S_0 \sum_{i=1}^{N} z_i^2} \tag{3-21}$$

Here the weights are set to unity since each value pair is separated by approximately the same distance, so that only pairs (i, j) belonging to class h contribute to the sums and S_0 reduces to the number of such pairs. This representation allows for the calculation of a correlogram through calculation of the I statistic for a variety of different spatial lags.

The I(h) series for the residual growth-decay field shown in Figure 3-8(b) is given in Figure 3-12. A time series of I(1) and I(2) is charted for the period 11/9/2000 15:00 ~ 21:00 in Figure 3-13, which shows that the spatial correlation characteristics remain stable for the 6-hour period.
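The calculation of I(h) on a regular mesh can be sketched as follows. This is an illustrative reading of the definition rather than the code used in this research. It assumes that z is taken as the deviation from the field mean (the usual convention for Moran's I), that class h contains the cell pairs whose larger row or column offset equals h, and that each pair is counted once in each direction. The function name `morans_i` is hypothetical.

```python
import numpy as np

def morans_i(field, h):
    """Moran's I for pairs of mesh cells in class h with unit weights.

    Requires h to be smaller than both grid dimensions.
    """
    z = field - field.mean()          # assumption: deviations from the mean
    rows, cols = z.shape
    cross = 0.0                       # sum of z_i * z_j over class-h pairs
    pairs = 0                         # number of ordered pairs (S0 for unit weights)
    for dr in range(-h, h + 1):
        for dc in range(-h, h + 1):
            if max(abs(dr), abs(dc)) != h:
                continue
            a = z[max(0, dr): rows + min(0, dr), max(0, dc): cols + min(0, dc)]
            b = z[max(0, -dr): rows + min(0, -dr), max(0, -dc): cols + min(0, -dc)]
            cross += np.sum(a * b)
            pairs += a.size
    return (z.size * cross) / (pairs * np.sum(z * z))

# Synthetic test field: vertical stripes alternating between +1 and -1.
stripes = np.tile([1.0, -1.0], (10, 5))
correlogram = [morans_i(stripes, h) for h in (1, 2, 3)]
```

Evaluating the function for h = 1, 2, … gives a correlogram of the kind plotted in Figure 3-12. For the striped test field the sign of I(h) alternates with h, as neighboring columns are opposite and columns two apart are identical.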
Figure 3-12: Series of I(h) for residual growth-decay field (11/9/2000 21:00) (spatial correlation I(h) plotted against separation distance h = 1 to 10)

Figure 3-13: Time series of I(1) and I(2) for residual growth-decay fields for 11/9/2000 15:00 ~ 21:00 (No radar observation available for 18:55)

3.5.3. Conditional generation of spatially-correlated noise

In conducting a Monte Carlo simulation, it becomes necessary to be able to generate a field of spatially-correlated noise for use in sampling from distributions of growth-decay values. The following method is used for generating a noise field with spatial structure:

1. Generate an ordinary white noise field with no spatial correlation.
2. Calculate the field's spatial correlation.
3. Randomly swap any two values.
4. Recalculate the spatial correlation.
5. If the spatial correlation shows improvement then accept the swap, otherwise reverse it.
6. Repeat steps 3 - 5 until the desired spatial correlation is reached.

This method is computationally expensive, and as such it becomes necessary to prepare a number of white noise fields of varying spatial correlation prior to running a real-time Monte Carlo simulation, so as to avoid delaying the simulation. An example of a spatially correlated white noise field with first order spatial correlation I(1) = 0.85 is given in Figure 3-20(a).

3.6. Radar observation error

As rainfall is the primary driving input for hydrological models, radar observation errors often lead to significant errors in the simulated hydrographs. Radar observation errors can be analyzed and modeled separately as a spatially and temporally correlated stochastic process.
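The swap procedure of Section 3.5.3 can be sketched as follows. This is a minimal illustration under several assumptions: the field is small, "improvement" is interpreted as I(1) moving closer to the target value, I(1) is computed with unit weights over the eight surrounding cells using deviations from the field mean, and the loop stops after a fixed number of swaps if the target is not reached. The function names, the target of 0.3 and the tolerance are hypothetical choices for the example.

```python
import numpy as np

def morans_i1(field):
    """First-order Moran's I over the eight surrounding cells (unit weights)."""
    z = field - field.mean()
    rows, cols = z.shape
    cross, pairs = 0.0, 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            a = z[max(0, dr): rows + min(0, dr), max(0, dc): cols + min(0, dc)]
            b = z[max(0, -dr): rows + min(0, -dr), max(0, -dc): cols + min(0, -dc)]
            cross += np.sum(a * b)
            pairs += a.size
    return (z.size * cross) / (pairs * np.sum(z * z))

def correlate_by_swapping(field, target, tol=0.01, max_swaps=5000, rng=None):
    """Steps 2 to 6: swap random pairs, keeping only swaps that move I(1)
    closer to the target."""
    rng = np.random.default_rng() if rng is None else rng
    out = field.copy()
    flat = out.reshape(-1)            # view onto out, so swaps edit the grid
    current = morans_i1(out)
    for _ in range(max_swaps):
        if abs(current - target) <= tol:
            break
        i, j = rng.choice(flat.size, size=2, replace=False)
        flat[i], flat[j] = flat[j], flat[i]
        trial = morans_i1(out)
        if abs(trial - target) < abs(current - target):
            current = trial                        # accept the swap
        else:
            flat[i], flat[j] = flat[j], flat[i]    # reverse the swap
    return out, current

rng = np.random.default_rng(1)
noise = rng.uniform(size=(12, 12))    # step 1: uncorrelated uniform noise
i_start = morans_i1(noise)
shaped, i_final = correlate_by_swapping(noise, target=0.3, rng=rng)
```

Because values are only exchanged between cells, the set of noise values is unchanged and only their spatial arrangement differs. Recomputing the full statistic after every swap is what makes the method expensive, which is consistent with the need to prepare fields in advance of a real-time simulation.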
However, radar rainfall observation residual noise is not modeled separately in the proposed Monte Carlo simulation due to the use of a filtering method in the probabilistic flood forecasting system proposed in this research. The filter is used to estimate and correct modeled runoff error during the period up until the point in time when the forecast is made, and to predict future errors that may arise during the forecast period. Through sequential comparison of modeled runoff with observed runoff, biases in the simulated and forecasted runoff rates that result from errors in the rainfall observations made during the pre-forecast period are reduced. In the same way, the uncertainty in the runoff forecast due to rainfall observation errors is estimated together with the uncertainty arising due to limitations of the hydrological model and the choice of model parameters.

As the corrections made to the hydrological model outputs are based only on runoff rates observed during the pre-forecasting period, the uncertainties in forecasted runoff arising from residual noise in the rainfall forecast due to growth-decay dynamics are unaccounted for by the filter, necessitating the separate stochastic model for the rainfall forecasting process.

3.7. Monte Carlo simulation

The motivation behind analyzing the translation and growth-decay rate characteristics of past storm events is to provide the necessary knowledge for stochastically generating realistic time series of rainfall scenarios for use in a Monte Carlo simulation to aid in real-time probabilistic forecasting of rainfall-runoff conditions in a target watershed.

In designing a Monte Carlo simulation, it is important to first consider whether or not the stochastic characteristics identified for historical and recently observed rainfall fields can be used when modeling future rainfall.
From inspection of the time series of translation vector parameters it is found that:

• ACFs and CCFs for the various series have similar characteristics for all storm events up to the first lag, with correlation dropping to insignificant levels from the second lag onwards, indicating that the type and structure of the ARIMA model used is consistent during and between storms.
• The variance and covariance values which define the white noise characteristics for each time series show some variation between storms. The variation is considerably less within individual storms.

For the Monte Carlo simulation the assumption is made that the characteristics of the time series process can be considered invariable over a short-term period of several hours. As discussed in Section 3.5.2, the spatial correlation characteristics of the growth-decay term can be assumed stable over short periods.

With the above considerations in mind, the Monte Carlo simulation for probabilistically forecasting rainfall is designed as the three-stage process outlined in Figure 3-14.

Off-line analysis period: If the statistical model (Section 3.5) for growth-decay is to be used, the growth-decay dynamics of rainfall events in the target region are analyzed off-line in a preliminary analysis stage. The equations describing the scale and shape of the distributions used for modeling the stochastic characteristics of growth-decay dynamics for RA ≥ 5 mm/hr are determined during this stage, along with the cumulative probability tables for RA < 5 mm/hr.

Observation period: A period of τobs 5-minute time steps spanning t = -τobs through t = 0 (present) is used for making rainfall observations and calculating ARMA parameters for describing translation and rotation dynamics of the current rainfall. An ARMA model for describing growth-decay is also fitted if using the time series model for growth-decay as described in Section 3.4.
It is assumed that the translation, rotation and growth-decay dynamics of the rainfall fields are stable over the observation stage and the subsequent simulation stage. Additionally, the spatial correlation of the residual growth-decay fields is calculated for each time step during the observation period and averaged. Spatial correlation characteristics are assumed to remain stable over the observation and subsequent simulation stages.

Simulation period: A simulation is carried out where n sets of rainfall time series are stochastically generated. Each set is generated through the following process:

i. Commence simulation s = 1 from t = 0 based on observed rainfall pattern for t = 0.
ii. Generate ci parameters for time step t according to ARMA model based on parameters from previous time step(s).
iii. Translate rainfall pattern for five-minute period spanning time t and t+1 according to ci parameters generated in step ii.
iv. Generate growth-decay values for each mesh cell in the translated rainfall pattern. When using a time series model of growth-decay this is achieved by generating parameters c7 ~ c9 according to the fitted ARMA model, and applying the resulting parameters to the translated rainfall field. When using a statistical model of growth-decay this is achieved by calculating scale, shape and location parameters for the chosen distribution type (Weibull or lognormal) based on the RA and RB values at each mesh cell. A spatially correlated white noise field is generated such that a seed value of the range 0 < p ≤ 1 is obtained for each mesh cell. Each random value of p is used to sample a growth-decay value from the probability distribution function associated with the RA and RB values of that mesh cell. The procedure for conditional generation of spatially-correlated noise described in Section 3.5.3 is followed, with the exception that the spatial correlation is calculated based on the resulting sampled growth-decay values, rather than the values of p. The procedure iterates until a field of growth-decay values of suitably high spatial correlation is achieved, at which point these values are added to their corresponding values of RA.
v. Update time step to t = t + 1. If t < τsim then repeat procedure for new time step t by returning to step 3 above.
vi. If s = n then simulation is complete. Otherwise update the counter s = s + 1 and return to step 2 to commence simulation for next rainfall time series set.

[Figure 3-14 schematic. Off-line analysis stage (Event 1, Event 2, …): collect historical rainfall data; fit G-D distributions; determine m and σ functions; measure average spatial correlation of G-D noise. Real-time observation stage (t = -τobs to t = 0): observe rainfall; estimate ARMA parameters; calculate average spatial correlation. Real-time simulation stage (t = 0 to t = τsim): generate n rain time series by generating c1~c6 parameters, applying translation and rotation, and generating G-D noise using c7~c9 or the statistical model.]

Figure 3-14: Three-stage process for stochastic rainfall generation

3.8. Application

An application of the proposed probabilistic distributed rainfall forecasting model is presented for a major rainfall event that occurred within an approximately 240km (34°40’ - 36°40’N) by 240km (135°00’ - 138°00’ E) region in the vicinity of the Nagara River during 11th – 12th September, 2000. A region much larger than that required to cover the Nagara River basin is necessary so as to observe rain patterns hours in advance of their arrival at the basin. An observation period of τobs = 6 hours (11/9 15:00 ~ 21:00) and a simulation period of τsim = 6 hours (11/9 21:00 ~ 12/9 03:00) are chosen for this example.
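Before turning to the details of this example, the nested loop structure of the simulation period (steps i to vi above) can be illustrated with a minimal Python sketch. It is not the implementation used in this research: `toy_translation` and `toy_growth_decay` are hypothetical stand-ins for the fitted ARMA model and for either growth-decay model, translation is reduced to a whole-cell shift that wraps around the grid edges, and clamping rainfall at zero is an added assumption.

```python
import numpy as np

def toy_translation(state, rng):
    """Hypothetical stand-in for the fitted ARMA model of the translation parameters."""
    prev = np.zeros(2) if state is None else state
    new = prev + rng.normal(0.0, 0.5, size=2)      # random-walk velocity, cells per step
    shift = tuple(int(v) for v in np.rint(new))    # whole-cell shift (rows, columns)
    return shift, new

def toy_growth_decay(field, rng):
    """Hypothetical stand-in for a growth-decay model: noise on wet cells only."""
    return rng.normal(0.0, 1.0, size=field.shape) * (field > 0.0)

def simulate_rain_sets(field0, n_sets, n_steps, rng):
    """Generate n_sets rainfall time series of n_steps fields each."""
    sets = []
    for s in range(n_sets):                  # outer loop over series (steps i and vi)
        field = field0.copy()                # every series starts from the field at t = 0
        state = None
        series = []
        for t in range(n_steps):             # inner loop over 5-minute steps (step v)
            shift, state = toy_translation(state, rng)        # step ii
            field = np.roll(field, shift, axis=(0, 1))        # step iii (wraps at edges)
            # step iv, with rainfall clamped at zero (an assumption of this sketch)
            field = np.maximum(field + toy_growth_decay(field, rng), 0.0)
            series.append(field)
        sets.append(np.stack(series))
    return np.stack(sets)

rng = np.random.default_rng(0)
field0 = np.zeros((20, 20))
field0[8:12, 8:12] = 10.0                    # illustrative rain cell, mm/hr
sims = simulate_rain_sets(field0, n_sets=5, n_steps=12, rng=rng)
print(sims.shape)  # (5, 12, 20, 20)
```

The returned array is indexed as (series, time step, row, column), so per-cell exceedance probabilities could be estimated by averaging an indicator over the first axis.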
Parallel translation is used for modeling horizontal translation dynamics in this example, and an ARIMA(0,1,1) model is fitted for the c3 and c6 parameters based on the identified parameters from the observation period. The MA parameter matrix and the white noise covariance matrix are given below.

$$\Theta_1 = \begin{bmatrix} 0 & -0.392 \\ -0.956 & 0 \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} 1.63 & 0 \\ 0 & 2.91 \end{bmatrix}$$

The rainfall field observed at t = 0 is given in Figure 3-15. Simulated time series of c3 and c6, used to describe the translation of the rainfall field, are given in Figure 3-16.

A simulation is carried out using the time series model for generating growth-decay parameters as described in Section 3.4. An AR(2) model with zero mean is fitted to the differenced series {Yt} and the AR parameter matrices and the white noise covariance matrix are given below.

$$\Phi_1 = \begin{bmatrix} -0.784 & 0.0466 & -0.000907 \\ 0.0508 & -0.506 & 0.000791 \\ 16.9 & 4.22 & -0.791 \end{bmatrix}, \qquad \Phi_2 = \begin{bmatrix} -0.487 & -0.121 & -0.000667 \\ -0.210 & -0.534 & -0.00127 \\ 22.6 & 20.7 & -0.288 \end{bmatrix}$$

$$\Sigma = \begin{bmatrix} 0.00195 & 0.000193 & -0.124 \\ 0.000193 & 0.00137 & -0.173 \\ -0.124 & -0.173 & 44.4 \end{bmatrix}$$

The simulated rainfall fields for this simulation for t = 1, 6, 12 and 72 are given in Figure 3-18 and Figure 3-19, showing a weakening rainfall field moving to the North.

A second simulation is carried out using the statistical analysis approach for modeling growth-decay as described in Section 3.5. In this simulation the m and σ functions from Table 3-5 are used for describing the growth-decay distributions. The average value of I(1) for the observation period is calculated as 0.84 (refer to Figure 3-13), and this value is used in simulating growth-decay fields in the Monte Carlo simulation with a tolerance of ± 0.5. An example white noise field generated for the first 5-minute interval of the first simulation run is given in Figure 3-20(a).
The corresponding simulated growth-decay field is given in Figure 3-20(b), and the correlogram describing the field’s spatial correlation characteristics (Figure 3-23) closely resembles the correlogram calculated from observed data for the previous time step, as required. The simulated rainfall fields for t = 1, 6, 12 and 72 are given in Figure 3-21 and Figure 3-22, showing the rainfall field breaking up and weakening with time and moving steadily to the North-East for the majority of the simulation period.

Figure 3-15: Observed rainfall field at 11/9/2000 21:00 (t = 0) [color scale: observed rainfall, 0 to 120 mm/hr]

Figure 3-16: Simulated time series of parameters (a) c3, and (b) c6 [axes: c3 and c6 (km/hr) against time, 21:00 to 3:00]

Figure 3-17: Simulated time series of parameters (a) c7, (b) c8, and (c) c9 [axes: c7 and c8 (mm/hr/hr/km) and c9 (mm/hr/hr) against time, 21:00 to 3:00]

Figure 3-18: Simulated rainfall fields, 11/9/2000, (a) 21:05 (t = 1), (b) 21:30 (t = 6) [color scale: simulated rainfall, 0 to 120 mm/hr]

Figure 3-19: Simulated rainfall fields, 11/9/2000, (a) 22:00 (t = 12), (b) 03:00 (t = 72) [color scale: simulated rainfall, 0 to 120 mm/hr]

Figure 3-20: Simulated results, 21:05: (a) white noise field, (b) growth-decay field [axes: South to North (km) against West to East (km), 0 to 200 km; color scales: spatially correlated white noise, 0.0 to 1.0, and simulated growth-decay, -45 to 30 mm/hr/5min]

Figure 3-21: Simulated rainfall fields, 11/9/2000, (a) 21:05 (t = 1), (b) 21:30 (t = 6) [color scale: simulated rainfall, 0 to 120 mm/hr]

Figure 3-22: Simulated rainfall fields, 11/9/2000, (a) 22:00 (t = 12), (b) 03:00 (t = 72) [color scale: simulated rainfall, 0 to 120 mm/hr]

Figure 3-23: Series of I(h) for simulated growth-decay field (11/9/2000 21:05) [axes: spatial correlation I(h), -1 to 1, against separation distance h, 1 to 10]

3.9. Conclusion

A methodology for analyzing the short-term dynamics of rainfall fields has been presented together with a procedure for stochastically generating future rainfall fields based on a Monte Carlo simulation. Time series analysis was applied to a translation vector model to describe and model the horizontal movement of rainfall fields, and time series and statistical models were developed for stochastically generating growth-decay fields.

A multivariate autoregressive integrated time series model was developed for modeling the translation and rotation phenomena of rainfall fields, and two univariate ARIMA time series models were used for modeling the case where only parallel translation of rainfall fields is considered. The time series models perform well during the simulation period, producing realistic translation vectors that could reasonably be expected to occur considering recently observed rainfall phenomena.

Applications were carried out for both the time series model and the statistical model for generating growth-decay patterns for rainfall fields.
The time series model approach is convenient and easy to use, and effectively models the range of growth-decay phenomena that may be experienced for the rainfall field as a whole. A limitation of this approach is that the growth-decay noise at the mesh cell scale is not modeled; however, this does not limit the applicability of the approach for use in probabilistically forecasting discharge conditions.

The statistical model approach alternatively seeks to simulate rainfall growth-decay dynamics at the mesh cell scale. Realistic simulations of 5-minute growth-decay fields are achieved with similar spatial correlation characteristics to observed growth-decay fields. An important finding is that the stochastic properties of rainfall growth-decay dynamics in the region considered could be described by as few as three functions during periods of medium to heavy rainfall. Simulations longer than one hour (12 time steps) tend to produce dispersed rainfall fields with rainfall gradually spreading across the modeled region, which limits the applicability of the statistical model to very short-term rainfall and discharge forecasts.

4. ADAPTIVE UPDATING OF A DISTRIBUTED RAINFALL-RUNOFF MODEL

In order to facilitate the real-time updating of the calculated discharge given by the distributed rainfall-runoff model used for this research, such that an accurate real-time discharge prediction can be achieved, a scheme is developed here for utilizing observed discharge data available in real-time from discharge observation stations within the watershed. The rainfall-runoff model relies on the correct prior identification of model structure and parameters and the accuracy of the rainfall input to ensure a meaningful model output.
It is clear, however, that no matter how accurately model parameters are calibrated, the uniqueness of each hydrological event, and the inherent weakness of the model as a simplified representation of a physical reality, ensures that an error between model output and actual discharge will always be present. A number of studies (Kitanidis and Bras, 1980; Puente and Bras, 1987) have successfully used discharge observations to improve flood forecast accuracy through application of the state-space Kalman filter (Kalman, 1960) to the problem of real-time adaptive estimation of lumped rainfall-runoff model parameters.

The problem of updating a distributed rainfall-runoff model in real-time to reflect actual river discharge conditions poses two main challenges. The first challenge is that of how to combine a filtering algorithm with a kinematic wave-based model. Secondly, how can an entire watershed model be suitably updated using discharge observations available at only a limited number of locations? The above two issues are addressed here, and a scheme suitable for a distributed rainfall-runoff model such as Hydro-BEAM is proposed.

4.1. Overview of methodology

It is desired that real-time discharge observations available at a limited number of locations (observation points) within the watershed can be used together with a distributed rainfall-runoff model to improve the ability of the model to forecast future discharge rates. The method employed must be capable of improving the model’s ability to forecast discharge at each point within a watershed while restricting the forecast calculation time to an acceptably short period so as not to render the system unusable in real-time forecasting.

Shiiba et al.
(2000) developed a deterministic flood routing model based on the dynamic wave flood routing model, coupled with a simplified variation of the Kalman filter which approximates the error covariance matrix using a reduced rank square-root algorithm. The application of the model to a 4000m one-channel reach was investigated with an emphasis on reducing calculation time; however, further research on extending the model is needed before it can be used in channel networks.

Ideally, the deterministic kinematic wave model, which forms the basis of the distributed rainfall-runoff model used in this study and which, like the dynamic wave model, is based on the Saint-Venant equations, could be converted to a stochastic model through the introduction of noise terms to reflect the inaccuracies in the model. However, even with the simplifications made by Shiiba et al. (2000) to reduce the calculation time of the error covariance matrix, it is clear that the calculation time required for the application of the Kalman filter to such a stochastic model makes its real-time use at every grid point along a channel network prohibitive.

The approach investigated here involves the use of discharge observations at a limited number of observation points in a watershed to update the state of the rainfall-runoff model’s surface and subsurface discharge, with no changes made to the model parameters. Recursive updating of discharge levels over the entire watershed prior to a forecast is used to bring the amount and position of water in the physical model to closely resemble observed discharge rates. It is reasoned that a combination of the natural lag between rainfall input and runoff output, otherwise referred to as a watershed’s response time, together with continued adaptive updating based on extrapolated filter parameters, will allow a forecast of considerable time length to be made.
An algorithm for performing recursive updating of gain parameters at each of the watershed’s observation points is developed together with two schemes for using the weighted sums of each of these gains to update the discharge of other mesh cells within the watershed.

4.2. Adaptive updating algorithm

Adaptive updating must be applied to every major water volume within the watershed to ensure that all areas that contribute flow to the river channels are adjusted to achieve an improvement in the model’s representation of the watershed’s runoff dynamics and the discharge forecast. The proposed updating method recognizes that a difference exists between actual discharge at time t, $Q_t$, for a given location, and the modeled discharge $Q_{\mathrm{mod}.t}$:

$$Q_t = Q_{\mathrm{mod}.t} + \varepsilon_t \tag{4-1}$$

where $\varepsilon_t$ is a general noise term, which accounts for model and input errors. The updating method is based on the calculation of a time variable updating factor or gain for each mesh cell within the watershed, for each time step where discharge observations are available. This parameter is used as a multiplier to adjust the discharge rate of each water body within the corresponding mesh cell as follows:

$$\hat{Q}_t = \phi_{t|t-1}\, Q_{\mathrm{mod}.t|t-1} \tag{4-2}$$

where $Q_{\mathrm{mod}.t|t-1}$ is model discharge output at a given time step and $\hat{Q}_t$ is the updated model discharge for the same time step.

Depending on the water body being updated, the quantity $Q_{\mathrm{mod}}$ takes on a different physical meaning. Updating is applied to channel discharge, surface flow, and where applicable layer A discharge. In the case of a mesh cell containing a river channel, and a number of different land use types, the quantity $Q_{\mathrm{mod}}$ is updated separately for each of these mesh cell components. The scheme for a channel cross-section is depicted in Figure 4-1(a). Urbanized areas and non-river water bodies are treated as having a surface flow that eventually contributes to the river channel flow, and no contributing sub-surface flow.
Both of these cases can be considered in a similar fashion to the cells containing a river channel, in that the updating factor is applied to the surface flow only. Forest and field regions are modeled such that the flow in the uppermost subsurface layer, layer A, is assumed to eventually contribute to the flow in downstream river channels, together with surface flow. These flows are treated as a combined flow when being updated:

$$Q_{\mathrm{mod}} = Q_{\mathrm{mod}.S} + Q_{\mathrm{mod}.A} \tag{4-3}$$

where the subscripts S and A refer to the surface and subsurface layer A respectively, remembering that subsurface layer A must be filled before surface flow can occur. The updating scheme for surface flow and layer A flow in forest and field regions is depicted in Figure 4-1(b).

Figure 4-1: (a) Channel updating model (left), and (b) surface and layer A runoff updating model (right) [diagram label: $Q_{\mathrm{mod}.t|t} = \phi_t^* Q_{\mathrm{mod}.t|t-1}$]

A method for time variable parameter estimation presented in Young (1984) is used for the real-time estimation of the time-variable gain. A similar method has been used previously by Lees et al. (1994) for the Dumfries flood forecasting system. The procedure is considered here for the real-time updating of the state of a distributed rainfall-runoff model.

Recursive filtering of a time-variable gain parameter $\phi_t^*$ is performed for each model mesh cell containing a river discharge observation station. Here $\phi^*$ is used to denote a gain parameter calculated at a mesh cell containing an observation station, and $\phi$ is used to indicate an updating factor applied to a mesh cell of any description within a watershed. In this algorithm $\phi_t^*$ depends on the gain at the previous time step $\phi_{t-1}^*$, and a function of the prediction error $(Q_t - \hat{Q}_t)$, where $Q_t$ and $\hat{Q}_t$ are the observed discharge at time t and the predicted discharge for time t respectively. A predictor-corrector algorithm is employed for the recursive estimation of the gain parameter.
Prediction:

$$\hat{\phi}^*_{t|t-1} = \phi^*_{t-1}, \qquad P_{t|t-1} = P_{t-1} + C_{NVR} \tag{4-4}$$

Correction:

$$\phi^*_t = \hat{\phi}^*_{t|t-1} + \frac{P_{t|t-1}\hat{Q}_t\left(Q_t - \hat{\phi}^*_{t|t-1}\hat{Q}_t\right)}{1 + P_{t|t-1}\hat{Q}_t^2}, \qquad P_t = P_{t|t-1} - \frac{\left(P_{t|t-1}\hat{Q}_t\right)^2}{1 + P_{t|t-1}\hat{Q}_t^2} \tag{4-5}$$

Here $C_{NVR}$ is a noise variance ratio (NVR), and is used together with P to control the degree to which the time-variable gain is allowed to change between steps, with filter memory decreasing for increasing values of $C_{NVR}$. A simplifying a priori assumption is made that the variations of the gains calculated at each observation station are not correlated, and thus the updating algorithm is used as a single-input single-output (SISO) transfer function model to separately estimate the gain at each observation point.

The predictor-corrector equations are applied to the model in real-time prior to the commencement of a forecast as shown in Figure 4-2. Note that the final equation shown in the corrector step in Figure 4-2 uses $\phi_t$, which is calculated individually for each mesh cell in the watershed as a function of all of the updated $\phi_t^*$. It is used to update the model output $Q_{\mathrm{mod}.t|t-1}$ for each water body within the watershed at time t to $Q_{\mathrm{mod}.t}$.

[Figure 4-2 flow diagram. Blocks: Precipitation, Observation, Rainfall-runoff model loop, Predict (Equation 4-4), Correct (Equation 4-5), Unit Delay. The final equation of the corrector step is $Q_{\mathrm{mod}.t} = \phi_t Q_{\mathrm{mod}.t|t-1}$.]

Figure 4-2: Recursive filtering algorithm for estimation of adaptive gain parameter and updating of a distributed rainfall-runoff model’s discharge

4.3. Prediction and error estimation

As the objective of this research is to provide a forecast of future distributed flood conditions, it is important that the recursive filtering method discussed above is capable of improving the modeled runoff rates at future time steps, and not just those time steps where observation data is available.
This is achieved primarily by the action of the filter bringing the distributed state of the model to a condition that approximates observed conditions, and secondarily by allowing the filter to continue running for the forecast period under the assumption that runoff observations are approximately equal to model outputs during the prediction period:

$$Q_t \approx \hat{Q}_t \tag{4-6}$$

In this way, the value of $\phi_t^*$ slowly converges towards unity, at which point the filter ceases to influence the predicted value. This is a reasonable assumption considering that the model error is correlated for a small number of time steps. The rate of convergence is controlled in the filter by the value chosen for $C_{NVR}$, which can be set differently for the prediction period as required. Under the above assumption, $\phi_t^*$ is updated in the correction step of the filter as follows:

$$\phi^*_t = \hat{\phi}^*_{t|t-1} + \frac{P_{t|t-1}\hat{Q}_t^2\left(1 - \hat{\phi}^*_{t|t-1}\right)}{1 + P_{t|t-1}\hat{Q}_t^2} \tag{4-7}$$

An advantage associated with recursive parameter estimation is that prediction limits for the forecast can be estimated based on recursive estimation of the forecast error variance (Beven, 2000). Lees et al. (1994) used the algorithm to describe the discharge forecast in probabilistic terms. The variance of the n-step-ahead forecast error $\varepsilon_{t+n} = (Q_{t+n} - \hat{Q}_{t+n})$ is estimated as:

$$\mathrm{var}(\varepsilon_{t+n}) = \hat{\sigma}_t^2\left(1 + p_{t+n}\hat{Q}_{t+n}^2\right) \tag{4-8}$$

where $\hat{\sigma}_t^2$ is the current forecast error variance estimated recursively using

$$\hat{\sigma}_t^2 = \hat{\sigma}_{t-1}^2 + p_t\left(e_t^2 - \hat{\sigma}_{t-1}^2\right), \qquad e_t = \frac{Q_t - \hat{Q}_t}{\sqrt{1 + p_t\hat{Q}_t^2}} \tag{4-9}$$

where $e_t$ is the scaled forecast error. Only the error $Q_t - \hat{Q}_t$ at the current time can be known; however, the predictor-corrector equations can be used to extrapolate estimates of $p_t$ and $\phi_t^*$ to time t+n. Here $p_t$ is a recursively computed scalar gain.
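The predictor-corrector recursion of Section 4.2 and its continuation into the forecast period can be illustrated with a short Python sketch. It is a minimal example under stated assumptions rather than the implementation used in this research: the regressor $\hat{Q}_t$ is taken to be the model output, the initial values `phi0 = 1.0` and `p0 = 1.0` and the constant discharge series are illustrative, and holding the last model output during the forecast phase is a simplification made only for the example.

```python
def predict(phi_prev, p_prev, c_nvr):
    """Prediction step (Equation 4-4): random-walk gain, inflated P."""
    return phi_prev, p_prev + c_nvr

def correct(phi_pred, p_pred, q_hat, q_obs):
    """Correction step (Equation 4-5)."""
    denom = 1.0 + p_pred * q_hat ** 2
    phi = phi_pred + p_pred * q_hat * (q_obs - phi_pred * q_hat) / denom
    p = p_pred - (p_pred * q_hat) ** 2 / denom
    return phi, p

def run_filter(q_model, q_observed, n_forecast, c_nvr=0.01, phi0=1.0, p0=1.0):
    """Track the gain over an observation phase, then over a forecast phase."""
    phi, p = phi0, p0          # illustrative initial values (assumption)
    gains = []
    for q_hat, q_obs in zip(q_model, q_observed):
        phi_pred, p_pred = predict(phi, p, c_nvr)
        phi, p = correct(phi_pred, p_pred, q_hat, q_obs)
        gains.append(phi)
    q_hat = q_model[-1]        # hold the last model output (assumption of this sketch)
    for _ in range(n_forecast):
        phi_pred, p_pred = predict(phi, p, c_nvr)
        # Setting q_obs = q_hat (Equation 4-6) reduces Equation 4-5 to Equation 4-7.
        phi, p = correct(phi_pred, p_pred, q_hat, q_obs=q_hat)
        gains.append(phi)
    return gains

# Illustrative data: the model underestimates the observed discharge by 20 percent.
q_model = [100.0] * 10
q_observed = [120.0] * 10
gains = run_filter(q_model, q_observed, n_forecast=20)
print(gains[9], gains[-1])
```

In this example the gain approaches 1.2 during the observation phase and then moves back towards unity once observations are replaced by model outputs. How quickly it does so depends on the values of P and $C_{NVR}$, so a separate, smaller noise variance ratio could be passed for the forecast phase if slower convergence is wanted.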
While this procedure may be capable of providing an estimate of the uncertainty due to modeling errors in the forecast at observation points, no real-time observations of discharge are available for the remaining mesh cells within the watershed model, making it difficult to estimate the uncertainty in the forecast due to model errors at non-observation point mesh cells. It would be incorrect to use the weighting system used for distributed updating introduced below, as this would provide an overconfident forecast since it would be based on the results at the observation points, where the potential for reduction in model error due to updating is the greatest. The variance of the predicted discharge error at a given non-observation point mesh cell is likely to range between the value estimated for a nearby observation point and the value that can be achieved for Hydro-BEAM without the adaptive updating scheme.

It can be seen that this type of adaptive updating algorithm addresses errors due to hydrologic uncertainty, but does not take into account precipitation uncertainty, as the error variance is recursively estimated using discharge observations prior to the commencement of the rainfall forecast. This is convenient in that the algorithm can be used to complement the probabilistic distributed rainfall forecast described in Chapter 3, which considers precipitation uncertainty, to provide a complete probabilistic forecast considering both types of uncertainty.

4.4. Distributed updating for an entire watershed

A limitation of the range of adaptive updating or filtering methods presently used in river discharge forecasting is that while they are effective in improving forecast accuracy at locations where real-time observations of discharge are available, they cannot be used to improve forecasts at other locations in a watershed.
When using a distributed rainfall-runoff model, it serves no practical purpose to update the modeled discharge rates at only a limited number of locations in a watershed while ignoring other locations. The problem of updating a watershed based on real-time observations from a limited number of locations will now be discussed.

The approach taken here involves using the gain parameters calculated at each observation point as the basis for calculating an updating factor for other mesh cells within the distributed rainfall-runoff model. A method for updating every mesh cell within a watershed is considered in this section, with a modified approach presented in Section 4.5 for updating only those mesh cells within close proximity of observation points.

4.4.1. Definitions

The mesh cells of the flow routing map created for the target watershed are categorized here based on their geographical relationship to a watershed’s discharge observation stations, to aid in creating a systematic method for updating the distributed rainfall-runoff model used for this research. The definitions outlined in Table 4-1 are used below to describe the distributed updating scheme, and Figure 4-3 demonstrates the result of applying each definition to the Nagara River flow routing map.

Table 4-1: Watershed mesh cell categorization

Cell category        Definition
Mesh cell            Any cell lying within the target watershed.
Path cell            Any mesh cell that forms a part of the observation path.
Junction             Any path cell at a confluence or terminal position in the observation path, or that contains a discharge observation station.
Observation point    Any junction that contains a discharge observation station.

Figure 4-3: Basin mesh cell categories applied to the Nagara River watershed [legend categories: mesh cell, path cell, junction, observation point]

In Table 4-1 the observation path refers to the collection of mesh cells that are observation points or lie downstream of an observation point. It is evident that each definition also includes the attributes of each of the definitions that lie above it in the table. For example, an observation point is also a junction, a path cell, and a mesh cell.

4.4.2. Gain calculation for non-observation point mesh cells

An updating factor $\phi$ for each of the n mesh cells in the target watershed is calculated as a weighted sum of each of the m observation point gain parameters $\phi_i^*$ as follows:

$$\phi = \sum_{i=1}^{m} \alpha_i \phi_i^* \tag{4-10}$$

where $\alpha_i$ represents the weight applied to the ith observation point’s gain parameter $\phi_i^*$, m is the number of observation points in the watershed, and with the condition that:

$$\sum_{i=1}^{m} \alpha_i = 1 \tag{4-11}$$

In vector notation the above equation can be rewritten as:

$$\phi = \boldsymbol{\alpha}\,\boldsymbol{\phi}^{*T} \tag{4-12}$$

where $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ for a given mesh cell, and $\boldsymbol{\phi}^{*T}$ is the transpose of the vector of observation point gain parameters. Thus a set of m $\alpha_i$’s can be calculated for each mesh cell within the target watershed prior to running the rainfall-runoff model, expressing the relationship between each mesh cell and each observation point. In this way, updating factors for each mesh cell can be readily calculated in real-time during a typhoon event as a linear function of the updating factors at each of the observation points.

Two methods for the calculation of the updating factor weights are presented here. The simpler of the two methods is based on inverse distance weighting interpolation, with distance calculated based on the number of steps that must be traversed along river channels. A second method is proposed as an attempt to take into account certain properties of a watershed, which the inverse distance weighting technique cannot account for.

4.4.3. Updating factor calculation – inverse distance weighting interpolation

Inverse distance weighting lends itself to the problem of calculating a set of distributed weight factors for the real-time updating of a distributed rainfall-runoff model. The weighting function is based on inverse power:

$$w(d) = \frac{1}{d^p}, \quad p > 0 \tag{4-13}$$

where d represents distance and p is specified by the user. A variation of the inverse distance weighting interpolation technique was introduced by Shepard (1968), and along with the Thiessen polygon spatial interpolation technique (Thiessen, 1911), is often used in hydrology for estimating distributed hydrological variables from point observations, such as interpolating precipitation fields using information from an array of rain gauges.

In order to determine the set of distributed weight vectors for model updating, the method is modified such that d refers to the distance along the river channel network, measured in terms of mesh cells using the target watershed’s flow routing map. Also, the weights for a given mesh cell are chosen such that their sum equals unity. The weights for a given mesh cell are calculated as follows:

$$\alpha_i = \frac{1/d_i^p}{\sum_{j=1}^{m} 1/d_j^p} \tag{4-14}$$

with $d_i$ being the distance measured in mesh cells that must be traversed along the river channel network to reach observation point i, and with a suitable choice for p being 2.

4.4.4. Updating factor calculation – linear variation method

The following method is proposed such that mesh cell updating factors vary smoothly between adjacent connected mesh cells, and that the updating factors calculated for path cells are a function of the updating factors for the junctions located directly upstream and downstream. This is done in an attempt to limit the influence of observation points not directly located upstream or downstream from a mesh cell.
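For comparison with the linear variation method, the inverse distance weights of Equation 4-14 and the weighted sum of Equation 4-10 can be sketched in a few lines of Python. The function names, the distances and the gain values below are illustrative assumptions, and the handling of a zero distance (a cell that coincides with an observation point) is an added convention that the text does not specify.

```python
import numpy as np

def idw_weights(distances, p=2.0):
    """Normalized inverse distance weights (Equation 4-14).

    distances: channel-network distances, in mesh cells, from one mesh cell
    to each of the m observation points.
    """
    d = np.asarray(distances, dtype=float)
    if np.any(d == 0.0):
        # Assumed convention: a cell located at an observation point takes
        # that point's gain only.
        w = (d == 0.0).astype(float)
    else:
        w = 1.0 / d ** p
    return w / w.sum()

def updating_factor(alpha, phi_star):
    """Weighted sum of observation point gains (Equations 4-10 and 4-12)."""
    return float(np.dot(alpha, phi_star))

# Illustrative values: a cell 2, 4 and 8 mesh cells from three observation points.
alpha = idw_weights([2, 4, 8], p=2.0)
phi = updating_factor(alpha, [1.2, 0.9, 1.0])
print(alpha, phi)
```

Every observation point receives a non-zero weight here regardless of whether it lies on the same channel branch as the mesh cell, which is the behaviour the linear variation method attempts to limit.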
It should be noted, however, that the influence of observation points not located directly upstream or downstream cannot be completely removed, due to the necessity of having the updating factors vary smoothly along channels and between adjacent connected mesh cells.

Step 1 – Calculation of the weight vectors for the observation points

The observation points are numbered 1 to m in no particular order, and the weight vector $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ is determined for each observation point. The m weights for each observation point j are simply set as follows:

$$\alpha_i = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} \tag{4-15}$$

Step 2 - Calculation of the weight vectors for the remaining junctions

The weight vectors for the remaining junctions are then calculated, proceeding upstream to downstream. For each remaining junction the p neighboring upstream junctions (where an observation point is also considered a junction) for each upstream channel branch are used for determining the weight vector, together with the closest downstream observation point. Here ‘neighboring’ will be used to indicate a mesh cell of the same type which is connected upstream or downstream via a path of mesh cells such that the connecting path contains no other cells of that type.

The rationale behind the calculation of the updating factor here is that the value of the updating factor is first calculated based on a linear variation between upstream junction k and the downstream observation point, ignoring the presence of the other p-1 upstream junctions.
This is performed p times for each of the upstream junctions and then the average is taken, such that the updating factor is of the form:

$$\phi = \frac{1}{p}\sum_{k=1}^{p}\left[\phi_{USJ.k} + \frac{d_{USJ.k}}{d_{USJ.k} + d_{DSO}}\left(\phi_{DSO} - \phi_{USJ.k}\right)\right] \tag{4-16}$$

where $d_{USJ.k}$ represents the distance counted in number of mesh cells along a channel from the junction to upstream junction k, $d_{DSO}$ represents the distance from the junction to the neighboring downstream observation point, and $\phi_{USJ.k}$ and $\phi_{DSO}$ represent the updating factor of the upstream junction k and the neighboring downstream observation point, respectively. Through expansion the weight vector can be calculated as:

$$\boldsymbol{\alpha} = \frac{1}{p}\sum_{k=1}^{p}\left(1 - \frac{d_{USJ.k}}{d_{USJ.k} + d_{DSO}}\right)\boldsymbol{\alpha}_{USJ.k} + \frac{1}{p}\sum_{k=1}^{p}\frac{d_{USJ.k}}{d_{USJ.k} + d_{DSO}}\,\boldsymbol{\alpha}_{DSO} \tag{4-17}$$

where $\boldsymbol{\alpha}_{USJ.k}$ and $\boldsymbol{\alpha}_{DSO}$ refer to the weight vector of upstream junction k and the weight vector of the neighboring downstream observation point, respectively.

A special case occurs when the furthest downstream mesh cell, located at the basin mouth, is not an observation point. According to the definitions given in Section 4.4.1, the mesh cell at the basin mouth lies downstream of an observation point, making it a path cell, and is located at an end point of the observation path, qualifying it as a junction. In this case, the weight vector for this junction is calculated immediately following completion of step one above, and is the average of the weight vectors of the p upstream neighboring observation points:

$$\boldsymbol{\alpha} = \frac{1}{p}\sum_{k=1}^{p}\boldsymbol{\alpha}_{USO.k} \tag{4-18}$$

where the subscript USO.k refers to the kth neighboring upstream observation point. In calculating the remaining junctions, the basin mouth mesh cell can be used as an observation point in Equation 4-16 and Equation 4-17 in the case where no downstream observation point exists.
Step 3 - Calculation of the weight vectors for the remaining path cells

The weight vectors for the remaining path cells are simply calculated such that the updating factor varies linearly between junctions:

$$\phi = \phi_{USJ} + \frac{d_{USJ}}{d_{USJ} + d_{DSJ}}\left(\phi_{DSJ} - \phi_{USJ}\right) \tag{4-19}$$

where the subscripts USJ and DSJ refer to the neighboring upstream and downstream junctions respectively, recalling that observation points are also junctions. Through expansion the weight vector is calculated as:

$$\boldsymbol{\alpha} = \left(1 - \frac{d_{USJ}}{d_{USJ} + d_{DSJ}}\right)\boldsymbol{\alpha}_{USJ} + \frac{d_{USJ}}{d_{USJ} + d_{DSJ}}\,\boldsymbol{\alpha}_{DSJ} \tag{4-20}$$

Step 4 - Calculation of the weight vectors for the remaining mesh cells

The remaining mesh cells are assigned a weight vector equal to the weight vector of the neighboring downstream path cell:

$$\boldsymbol{\alpha} = \boldsymbol{\alpha}_{DSOPC} \tag{4-21}$$

where the subscript DSOPC refers to the neighboring downstream path cell.

An example of the weight distribution for the Nagara River watershed, Gifu, Japan, using observation points at Inari, Shimohorado, Mino, and Chusetsu is given in Figure 4-4. Each figure gives an indication of the relative influence that the gain parameter calculated for a given observation point will have on the estimated updating factor for each mesh cell within the watershed.

Figure 4-4: Influence of observation points on each mesh cell of the Nagara River watershed: (a) Inari, $\alpha_1$, (b) Shimohorado, $\alpha_2$, (c) Mino, $\alpha_3$, (d) Chusetsu, $\alpha_4$ [color scale: $\alpha_i$, 0.0 to 1.0]

4.5. Partial distributed updating

An alternative to the distributed updating approach introduced in Section 4.4 is considered here for watersheds where only a very limited number of observation points are available. In such cases the extrapolation of gain parameters across large distances may be undesirable, and an alternative that allows only a portion of the watershed to be updated becomes preferable. The following approach is adapted from Madsen et al.
(2005) whereby three types of functions (constant, triangular, mixed exponential) were introduced for determining the influence of observations on adjacent locations along a river. Considering that the influence of a gain value calculated for an observation point is limited to adjacent locations along a river and decreases with distance, an alternative exponential function (Sekii et al., 2005) is used in this research to spatially extrapolate the values of $\phi^{*}$ to adjacent river locations. In this approach, the mesh cell categorization is simplified to include only the three types given in Table 4-2.

Table 4-2: Mesh cell types for partial distributed updating

Type      Definition
Type 1    Mesh cells that contain observation stations.
Type 2    Mesh cells located between upstream and downstream observation stations.
Type 3    Mesh cells that are neither type 1 nor type 2.

The calculation of $\phi$ for each mesh cell is as follows.

Type 1 mesh cells

The value of $\phi$ for type 1 cells is defined as being equal to the gain calculated for that cell using the predictor-corrector algorithm given in Section 4.2.

Type 2 mesh cells

The value of $\phi$ for type 2 cells is calculated as

\[ \phi_{t} = 1.0 + \left( \phi^{*}_{t,us} - 1.0 \right) e^{\gamma_{up}} + \left( \phi^{*}_{t,ds} - 1.0 \right) e^{\gamma_{down}} \tag{4-22} \]

where

\[ \gamma_{up} = -\alpha \left( \frac{d_{us}}{d_{us} + d_{ds}} \right)^{2}, \qquad \gamma_{down} = -\alpha \left( \frac{d_{ds}}{d_{us} + d_{ds}} \right)^{2} \tag{4-23} \]

and $\phi^{*}_{t,us}$ and $\phi^{*}_{t,ds}$ are the gains at the neighboring upstream and downstream observation points respectively, and $d_{us}$ and $d_{ds}$ are the distances from the target type 2 mesh cell to the neighboring upstream and downstream observation points respectively, calculated in terms of the number of river mesh cells separating the locations. α is a parameter which controls the degree and distance to which gain values at observation points influence the gain calculated for surrounding mesh cells. In setting α, Equation 4-24 must be satisfied for every type 2 mesh cell in the watershed.
This ensures that the influence of the upstream and downstream gain values has a combined weight of no more than unity.

\[ e^{\gamma_{up}} + e^{\gamma_{down}} \le 1.0 \tag{4-24} \]

Type 3 mesh cells

The value of $\phi$ for type 3 cells is defined as

\[ \phi_{t} = 1.0 + \left( \phi_{t,neighbor} - 1.0 \right) e^{-\beta d_{neighbor}^{2}} \tag{4-25} \]

where $d_{neighbor}$ is the distance to the neighboring type 1 or type 2 mesh cell, calculated in terms of the number of mesh cells that water must flow through to travel to or from that mesh cell, and $\phi_{t,neighbor}$ is the gain at that neighboring cell. β is a parameter which has a similar role to α, and must be set such that for all type 3 mesh cells:

\[ e^{-\beta d_{neighbor}^{2}} \le 1.0 \tag{4-26} \]

4.6. Application

An application is conducted for the Nagara River watershed. Real-time discharge observations are currently made available at Mino, Akutami and Chusetsu. In the application presented here, observations at Mino and Chusetsu are used for updating, with discharge observations at Akutami reserved for validation of the distributed updating scheme. Due to the limited number of observation stations available, and the absence of an observation station in the upstream region, the partial distributed updating scheme introduced in Section 4.5 is used. The parameters given in Table 4-3 are used for each application, and the three storm events given in Table 4-4 are considered.

Figure 4-5 shows the influence at Akutami of the adaptive updating technique based on observations at Mino and Chusetsu, which are located at distances of approximately 17 km upstream and 8 km downstream respectively. Despite the highly accurate Hydro-BEAM results, for nearly all time steps the updated discharge shows an improvement over the model-calculated discharge results for Event 1.

Figure 4-6 and Figure 4-7 show the results at Chusetsu when using the adaptive updating technique for prediction with Event 2 and Event 3 respectively.
An improvement due to the adaptive updating scheme can be seen for the 1-hour ahead prediction in Figure 4-6 with the influence of the filter only minimal for the 2-hour ahead prediction, and negligible for the 3-hour ahead prediction. Conversely, the results for Event 3 show a substantial improvement for all three lead times. These results can be explained by considering that the influence of the predictor-corrector algorithm is in proportion to the error series observed up to the commencement of the prediction, and decays to zero during the prediction period. For most of the duration of Event 2 including the period prior to 16:00 and the period following 21:00 there is only a small relative error between observed and calculated discharge and as such the gain variable remains close to 1.0 for these periods and is slow to respond to the sudden increase in error observed at 17:00 through 19:00. In Event 3, the adaptive updater is continually compensating for a large calculation error, and as such the gain parameter for the location of Chusetsu and Mino remains above 1.0 for the duration of the event and is slow to recede to 1.0 during each 3-hour prediction period. A drawback associated with hydrological model filtering systems that calculate model error as being based on the difference between calculated and observed runoff is that the assumption is made that the modeling error is due to the flood wave being too large or too small, rather than considering that it could be temporally misaligned, having been delayed or having passed through the watershed too quickly. It can be seen that in Event 3 the response of Hydro-BEAM to the rainfall input appears to be delayed. The adaptive updating procedure attempts to correct the low discharge rate observed at Mino and Chusetsu by increasing the discharge rates in the surrounding regions, which leads to the predicted hydrographs overshooting the observed hydrograph at a later stage. 
This highlights the value of using a hydrological model that gives the correct temporal response to rainfall inputs.

Table 4-3: Updating parameters

Parameter            Optimized value
CNVR                 5.0×10^-7
CNVR (prediction)    5.0×10^-5
α                    6
β                    0.01

Table 4-4: Storm events used in application

Event      Storm term      Max discharge (Chusetsu)
Event 1    10~12/9/2000    5041.0 (t/s)
Event 2    18~20/6/2001    1096.0 (t/s)
Event 3    14~16/9/2001    510.0 (t/s)

[Figure 4-5: Updating results for Akutami (Event 1). Rainfall (mm/hr) and observed, calculated and updated discharge (m3/s) plotted against time.]

[Figure 4-6: Prediction results for Chusetsu (Event 2). Rainfall (mm/hr) and observed, calculated, and 1-hr, 2-hr and 3-hr ahead predicted discharge (m3/s) plotted against time.]

[Figure 4-7: Prediction results for Chusetsu (Event 3). Rainfall (mm/hr) and observed, calculated, and 1-hr, 2-hr and 3-hr ahead predicted discharge (m3/s) plotted against time.]

4.7. Conclusion

An adaptive updating scheme has been proposed for use with a distributed rainfall-runoff model.
Due to the large computational effort that would be required to run a traditional filtering scheme, such as the widely-used Kalman filter, for each mesh cell within a watershed, an alternative has been explored that can feasibly be used in real time and that updates a watershed model's discharge based on discharge observations available at a limited number of locations. Two schemes for distributing the influence of the adaptive updating scheme from observation stations to other areas within a watershed have been proposed, and applications for three storm events have been carried out to demonstrate updating of the downstream region of the Nagara River watershed.

Results of the application for prediction over a 3-hour lead time show that in nearly all cases the updated discharge is of higher or similar accuracy when compared with the non-updated discharge provided by Hydro-BEAM. The updating scheme has also been shown to effectively update mesh cells within close proximity of observation stations, as demonstrated for the location of Akutami, where available observation data was used for verification only. The inclusion of observations from Akutami would further increase the utility of the adaptive updating scheme.

When using the partial distributed updating scheme, depending on the size of the watershed, the locations and number of available observation stations, and the values chosen for α and β, the influence of the filter may not necessarily extend to all regions of the watershed. In the case of the Nagara River watershed, as the majority of residents live in downstream regions close to Mino, Akutami and Chusetsu, this is not considered a limitation.

5. AI-BASED ERROR CORRECTION FOR RAINFALL-RUNOFF MODELING

Assumptions associated with the model structure, and errors associated with observed rainfall and the physical structure of the river basin being modeled, can lead to errors in the simulated and predicted runoff rate outputs from the model. Furthermore, it is recognized through numerous attempts to calibrate the Hydro-BEAM model used in this research that there exists no one optimal set of model parameters that can adequately model hydrological dynamics for every type of rainfall event that may be encountered. For this reason, it is desirable to be able to recognize and decrease the model error in real time by comparing recent model outputs with discharge observations at those locations within the river basin where such information is available on a real-time basis.

An artificial intelligence based approach to error correction of outputs from distributed rainfall-runoff models is considered here. Unlike the real-time adaptive updating approach presented in the previous chapter, which seeks to improve the model by continuously correcting the state of an entire river basin through a feedback process based on runoff observations, the alternative AI-driven approach works by predicting the model error based on the model performance over recent time steps. These predicted errors are added to the simulated model discharge rates to give a discharge prediction for each location within the river basin where real-time discharge observations are available. These discharge predictions can then be interpolated and extrapolated across the river basin based on an understanding of spatial and temporal relationships between the hydrographs at each watershed location, as will be discussed in the next chapter.

5.1. Procedure of AI-based error correction approach

The procedure of the proposed error correction method used for locations in a river basin where real-time discharge observation data is available is illustrated in Figure 5-1. The time series of observed and modeled discharge rates to the present, and future discharge rates modeled using forecasted rainfall, are used as the inputs to the system. The input data are passed through a self-organizing map (SOM) which has been trained to group input data into clusters based on data similarity. The SOM determines which cluster the input data belongs to, and passes the data to a genetic programming (GP) function specifically chosen for modeling data displaying the characteristics represented by that particular cluster. The GP function is used to map the observed and modeled runoff rate inputs to an estimate of the discharge rate at the specified lead time. Where forecasts for multiple lead times are required, multiple GP functions must be provided for each cluster. Only one SOM needs to be built, provided that the same choice of input variables is used for all lead times. An artificial neural network (ANN) can be used together with or in place of the GP function.

[Figure 5-1: Schematic of proposed AI-based discharge forecasting approach for river basin locations with real-time discharge observation data. Elements shown: observed rainfall, forecasted rainfall, Hydro-BEAM, modeled runoff, observed runoff, self-organizing map (cluster analysis), genetic programming functions GP1 to GPn, feedforward neural network models NN1 to NNn, forecasted discharge.]

5.2. Genetic programming for error correction

Genetic programming (GP) is a machine learning technique based on Darwin's theory of natural selection that employs symbolic regression to search for a solution to a given problem.
GP belongs to a class of evolutionary algorithm methodologies which also includes genetic algorithms, evolutionary strategies, classifier systems, and evolutionary programming. Evolutionary algorithms share the common approach of randomly generating an initial population of solutions, assessing the fitness of each solution using a predefined objective function, and applying genetic operators to the 'fittest' solutions to replace the existing population with a new population. This evolutionary process of breeding and assessing new populations of solutions is repeated such that the population as a whole becomes stronger until a satisfactory solution is found.

A benefit of using GP is that, unlike black-box techniques such as the artificial neural network and support vector machine, it offers a readily understandable, transparent solution. As this solution takes the form of a function, often some hint can be obtained as to the underlying dynamic processes. Additionally, the user can assist the GP search process by giving more weight to any operators, variables, constants or combinations of the three that are thought to be important to the solution, thus directing the evolution of the population in the right direction.

A handful of applications of GP to flood forecasting have been reported to date. Babovic and Bojkov (2001) applied GP, together with a time-lag recurrent neural network and a Kohonen self-organizing map, to the task of predicting short-term runoff from various hydrometeorological observations. Through a combination of these three techniques, prediction accuracy superior to naïve prediction could be achieved under certain limited conditions.

Whigham and Crapper (2001) used a daily time series of rainfall-runoff observations together with GP to discover relationships suitable for describing rainfall-runoff dynamics in two watersheds.
Results using GP were found to be equivalent to those for a calibrated deterministic lumped parameter model during periods when rainfall and runoff were correlated. When such correlation was weak, it was found that GP had an advantage as the procedure did not assume any underlying physical processes.

A study was conducted by Liong et al. (2002) applying GP to the problem of determining the relationship between future runoff and recently observed rainfall and runoff data at the outlet of the Upper Bukit Timah watershed in Singapore. It was concluded that the functional relationships determined using GP could be used to give a short-term forecast superior to the naïve persistence forecasting technique.

Each of the above studies focused on using GP to discover the relationship between input hydrometeorological variables and future runoff conditions; however, the evolutionary process struggled to discover all the knowledge required to sufficiently model the full range of rainfall-runoff phenomena.

5.2.1. Genetic programming calculation procedure

GP uses parse tree structures to describe individual candidate solutions, as they allow for convenient description, calculation and manipulation of functions. A parse tree is a structure whose nodes are comprised of operators, variables and constants. In a tree structure, a node consisting of a function which accepts arguments (consisting of combinations of operators, variables and constants) is referred to as a branch of the tree, and a node consisting of a variable or constant, and thus accepting no arguments, is a terminal node and is referred to as a leaf of the tree, as demonstrated in Figure 5-2.
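As a minimal sketch of the parse tree idea, the function (a+b)/2 can be held as nested tuples and evaluated recursively. The tuple layout `(operator, left, right)`, the `OPS` table and the function name `evaluate` are assumptions chosen for illustration; they are not the representation used by the GP software in this research.

```python
import operator

# Binary arithmetic operators available at branch nodes (illustrative set)
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(node, variables):
    """Recursively evaluate a parse tree.

    A branch is a tuple (operator, left, right); a leaf is either a
    variable name (str) or a numeric constant.
    """
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, variables), evaluate(right, variables))
    if isinstance(node, str):
        return variables[node]   # leaf: variable
    return node                  # leaf: constant


# Root node "/", branch node "+", leaf nodes a, b and 2
tree = ("/", ("+", "a", "b"), 2)
value = evaluate(tree, {"a": 3, "b": 5})
```

With a = 3 and b = 5 the tree evaluates to 4.0.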
The set of operators used within a tree structure may include arithmetic operators (+, -, /) and mathematical functions (log, sin), and for more complex problems may be extended to include logical operators (IF, ELSE), iterative procedures (LOOP-UNTIL), and user-defined functions. Input variables and constants that are considered to be relevant to the phenomenon being modeled can be predefined as candidate tree components.

[Figure 5-2: Parse tree representation of (a+b)/2, showing the root node (/), a branch node (+) and leaf nodes (a, b, 2).]

An initial population of individuals is generated at the start of a GP run. The number of individuals is predefined based on user experience, and individuals are generated randomly from a set of candidate operators, variables and constants (both predefined and random). Often a limit to the maximum allowed complexity (number of branches, depth of tree structure) of functions is also defined.

Following the creation of each new generation, the fitness of each individual in the population is calculated to determine which will be used to create the next generation, and which will be expelled from the population. An objective function is used to measure the ability of each individual to correctly map input vectors to associated target outputs, with the fitness of each individual determined based on all data points in the training data set. This accuracy can be specified using one or more appropriate measures, such as the root mean square error or average absolute error.

In order to evolve the population, new solutions are generated from fit individuals through a process of crossover and mutation. In crossover, two new offspring solutions are generated from two parent solutions through exchange of genetic material. A single node from each parent tree is chosen randomly, and the node together with the entire branch below it is swapped with the other parent.
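The subtree exchange described above can be sketched as follows. Trees are again held as nested tuples whose first element is the operator; this layout and the helper names (`subtree_paths`, `get_subtree`, `replace_subtree`, `crossover`) are assumptions made for illustration rather than the implementation used in this research.

```python
import random


def subtree_paths(tree, path=()):
    """Return the path (tuple of child indices) to every node in the tree."""
    paths = [path]
    if isinstance(tree, tuple):
        # Element 0 is the operator; the remaining elements are children
        for i, child in enumerate(tree[1:], start=1):
            paths += subtree_paths(child, path + (i,))
    return paths


def get_subtree(tree, path):
    """Follow a path of child indices down to a node."""
    for i in path:
        tree = tree[i]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]


def crossover(parent1, parent2, rng=random):
    """Swap one randomly chosen subtree between two parents."""
    path1 = rng.choice(subtree_paths(parent1))
    path2 = rng.choice(subtree_paths(parent2))
    sub1 = get_subtree(parent1, path1)
    sub2 = get_subtree(parent2, path2)
    return (replace_subtree(parent1, path1, sub2),
            replace_subtree(parent2, path2, sub1))


# Parents (a+b)/2 and (c-a)*sqrt(0.5); swapping the second child of each
# root gives (a+b)/sqrt(0.5) and (c-a)*2.
p1 = ("/", ("+", "a", "b"), 2)
p2 = ("*", ("-", "c", "a"), ("sqrt", 0.5))
child1 = replace_subtree(p1, (2,), get_subtree(p2, (2,)))
child2 = replace_subtree(p2, (2,), get_subtree(p1, (2,)))
```

Because whole branches are exchanged, the total number of nodes across the two offspring equals the total across the two parents.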
Mutation allows the addition of new characteristics to the population, through randomly replacing a node in an individual solution with another element, with the restriction that the new element must take the same number of arguments as the old element (i.e. an addition operator which takes two arguments cannot be replaced by a variable x that accepts zero arguments). The concepts of crossover and mutation are demonstrated in Figure 5-3 and Figure 5-4, respectively.

A GP run proceeds as follows:

1. Randomly generate an initial population of n solutions.
2. Test each solution in the population against the objective function to determine fitness levels. Solutions that best solve the problem being modeled are assigned the highest fitness levels.
3. Select the fittest m solutions to be passed to the next generation, and to act as parents for the new solutions.
4. Perform crossover on random combinations of parents to generate n-m new trees to complete the new population. Alternatively, perform mutation on one individual remaining from the previous generation to produce a new offspring.
5. Repeat steps 2 through 4 until either i) a suitably accurate solution is generated, or ii) the maximum calculation time / number of allowed generations is reached.

A flowchart describing this process is given in Figure 5-5.
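The five steps above can be sketched as a generic generational loop. This is an illustration under stated assumptions, not the GP software used in this research: the function `evolve`, its arguments and the `mutation_rate` parameter are hypothetical, and the toy usage evolves plain numbers rather than parse trees so that the loop structure stays visible.

```python
import random


def evolve(create, fitness, crossover, mutate, n, m, max_generations,
           good_enough, mutation_rate=0.1, rng=random):
    """Generational loop following steps 1 to 5 (requires m >= 2)."""
    population = [create(rng) for _ in range(n)]           # step 1
    for _ in range(max_generations):                       # step 5 (ii)
        population.sort(key=fitness, reverse=True)         # step 2
        if fitness(population[0]) >= good_enough:          # step 5 (i)
            break
        parents = population[:m]                           # step 3
        children = []
        while len(children) < n - m:                       # step 4
            if rng.random() < mutation_rate:
                children.append(mutate(rng.choice(parents), rng))
            else:
                a, b = rng.sample(parents, 2)
                children.extend(crossover(a, b, rng))
        population = parents + children[:n - m]
    return max(population, key=fitness)


# Toy usage: individuals are numbers and the "problem" is to approach 3.0.
def toy_fitness(x):
    return -abs(x - 3.0)


def toy_create(rng):
    return rng.uniform(-10.0, 10.0)


def toy_crossover(a, b, rng):
    return (a + b) / 2.0, a + rng.random() * (b - a)


def toy_mutate(x, rng):
    return x + rng.gauss(0.0, 1.0)


best = evolve(toy_create, toy_fitness, toy_crossover, toy_mutate,
              n=20, m=5, max_generations=30, good_enough=-0.01,
              rng=random.Random(1))
```

Because the m fittest individuals are carried over unchanged in step 3, the best fitness in the population cannot decrease from one generation to the next.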
[Figure 5-3: Crossover. Parent 1: (a+b)/2; Parent 2: (c-a)×√0.5; Child 1: (a+b)/√0.5; Child 2: (c-a)×2.]

[Figure 5-4: Mutation. Parent: (a+b)/2; Child: (a-b)/2.]

[Figure 5-5: GP procedure flowchart. Start; randomly generate initial population; evaluate fitness of each individual; test stopping criteria (stop if met); remove least fit individuals and promote surviving individuals to next generation; select genetic operator, then either select 2 individuals, perform crossover and add 2 new children to population, or select 1 individual, perform mutation and add 1 new child to population; repeat until there are enough individuals.]

5.3. Feedforward artificial neural network for error correction

An artificial neural network (ANN) is a mathematical model used for data processing, inspired by the bioelectrical networks in the brain comprised of neurons and synapses. In an ANN, simple processing elements referred to as neurons are used to create networks that are capable of learning to model complex systems. For an introduction to the structure and design of artificial neural networks the reader is referred to Hagan et al. (1996).

A number of studies into the application of ANN in the field of rainfall-runoff modeling and flood forecasting have been carried out (Karunanithi et al., 1994; Lorrai and Sechi, 1995; Campolo et al., 1999). Hsu et al. (1995) compared ANN models with traditional black box models, concluding that an ANN model is capable of giving superior performance over a linear ARMAX (autoregressive moving average with exogenous inputs) time series approach when observed time series of flow rate and rainfall are used as input.
In general, ANN have been found to perform well in predicting short-term flood stage for flood events closely resembling in magnitude the previous flood events used for training the networks. ANN models, however, tend to perform poorly during extreme events, and Elshorbagy and Simonovic (2000) warn against using ANN models as the sole runoff prediction strategy. Also, it is difficult to determine the optimal ANN architecture for a given watershed, and in most cases a trial-and-error approach is still used.

As an alternative to the genetic programming strategy introduced above, an ANN can be considered for use in forecasting the error between the outputs of a physical rainfall-runoff model and the observed runoff rates. A feedforward neural network has been used for this purpose by Smith et al. (2004) and was found to provide similar accuracy to GP. An advantage of GP is that it is easier to use than an ANN approach, in that it uses a function in the forecasting stage rather than a complicated network of neurons. Furthermore, as a function can be easily understood by simple inspection, in many cases it might offer clues about the dynamics of the phenomena being explored, in this case the relationship between the observations, model outputs and error time series.

5.4. Self-organizing map for data clustering

Self-organizing maps (SOM) are considered here for their ability to classify input vectors into groups based on their similarity. The advantage of this is that a separate regression model can be tailor-made for each group as an alternative to a one-size-fits-all global model.

SOM were developed by Kohonen (2001) and are also referred to as self-organizing feature maps or Kohonen maps, and are a variation on the competitive neural network. An SOM can be taught through a process of unsupervised learning to map input vector data onto regions of a grid such that similar vectors are grouped together.
Furthermore, as the resulting map transforms multi-dimensional input vectors into one or two-dimensional space, an SOM can also be used as a data visualization technique.

The SOM used in this research is formed from a 2-dimensional rectangular grid of neurons. Each neuron is initialized with a random n-dimensional weight, with each of the weight's components taking a value between zero and one. A training set of input vectors is prepared using past observations and runs of the Hydro-BEAM model. The components of these input vectors are scaled to range between zero and one. Each input vector is passed through the map one at a time. The neurons in the map compete to see which has the weight with the smallest Euclidean distance to the input vector. The winning neuron i* and the neurons within close proximity are then updated to become more similar to the input vector according to the Kohonen rule.

\[ \mathbf{w}_{i}(q) = \mathbf{w}_{i}(q-1) + \alpha \left( \mathbf{p}(q) - \mathbf{w}_{i}(q-1) \right) = (1 - \alpha)\, \mathbf{w}_{i}(q-1) + \alpha\, \mathbf{p}(q), \qquad i \in N_{i^{*}}(d) \tag{5-1} \]

Here the neighborhood $N_{i^{*}}(d)$ refers to all the neurons that lie within a radius d of the neuron i*, p(q) is the current input vector, $\mathbf{w}_{i}(q-1)$ and $\mathbf{w}_{i}(q)$ are the previous and updated values respectively of the weight associated with neuron i, and α is a learning rate which controls the degree to which weights are modified. Figure 5-6 illustrates the basic structure of a SOM.

This process is repeated with all training input vectors passing through the network numerous times, and the weights associated with each neuron come to resemble those of other neurons in close proximity. As the learning process proceeds the value of d gradually decreases to reduce the range of influence of a winning neuron.

Once the learning stage is completed the map can then be used to sort input data into clusters based on similarity. Input data vectors with similar properties become associated with neurons that are in close proximity to each other on the map.
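A minimal sketch of this training procedure is given below, using only the Python standard library. The function names, the linear shrinking schedule for d, and the default values of `epochs`, `alpha` and `radius` are assumptions made for illustration; the text above does not specify the schedule or settings actually used.

```python
import math
import random


def kohonen_update(w, p, alpha):
    """Equation 5-1: move weight w toward input p by a fraction alpha."""
    return [wi + alpha * (pi - wi) for wi, pi in zip(w, p)]


def best_match(weights, p):
    """Grid position of the neuron whose weight is closest to input p."""
    return min(weights, key=lambda node: math.dist(weights[node], p))


def train_som(data, rows, cols, epochs=50, alpha=0.5, radius=2.0, seed=0):
    """Train a rows x cols SOM on input vectors scaled to [0, 1]."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Each neuron starts with a random weight with components in [0, 1)
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        # Assumed schedule: neighborhood radius d shrinks linearly
        d = radius * (1.0 - epoch / epochs)
        for p in data:
            winner = best_match(weights, p)
            for node in weights:
                if math.dist(node, winner) <= d:
                    weights[node] = kohonen_update(weights[node], p, alpha)
    return weights


# Four 2-dimensional input vectors forming two loose groups
samples = [[0.10, 0.10], [0.12, 0.08], [0.90, 0.90], [0.88, 0.92]]
som = train_som(samples, rows=3, cols=3)
cluster_of_first = best_match(som, samples[0])
```

After training, `best_match` returns the grid position associated with any new input vector, which is how data would be assigned to a region of the map.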
All training data are passed through the map after the learning process is completed and are sorted into clusters based on the region of the map that they become associated with. New data vectors can also be classified as belonging to a cluster in the same way.

[Figure 5-6: Basic structure of a self-organizing map]

The decision as to the number of clusters to use, and the decision as to which neurons should be grouped into which clusters, can be rather subjective in the case of the SOM used here for separating training data into groups for GP model identification. Following the learning stage, a tally of the number of training data associated with each neuron on the map can be made and the results plotted. A visual inspection of the clustering of the data will often reveal groups of neurons that have a large number of data vectors associated with them, separated by neurons that have very few. Clusters can be chosen such that they are centered on these groups of strong neurons.

5.5. Application

A case study is undertaken here which demonstrates the use of GP and SOM with the distributed rainfall-runoff model Hydro-BEAM to predict future runoff conditions for the Nagara River watershed. The use of the distributed rainfall-runoff model allows knowledge of physical processes in the watershed to be incorporated into the solution, leaving GP and SOM to focus on uncovering the less-understood prediction error structure.

Radar-observed rainfall patterns are used as input for Hydro-BEAM for both the observation period prior to when the prediction is made and for the prediction period spanning between time zero and the designated lead time. This in effect removes interference from rainfall forecast errors.
The effects of these errors on the prediction ability of the overall flood forecast system are considered separately in a Monte Carlo simulation of distributed rainfall conditions which is used to provide the rainfall forecast.

5.5.1. Problem formulation

The distributed rainfall-runoff model Hydro-BEAM is used together with observed and predicted rainfall patterns to predict runoff conditions for up to 6 hours ahead for the Nagara River watershed located in Gifu, Japan. At locations where real-time runoff observations are available, past model predictions are compared with runoff observations and these comparisons are used to estimate prediction error. The estimated prediction errors can then be added to the current model predictions to give an improved prediction.

In this application the inputs to GP are selected from among the following:

- The past observed runoff rates at the location of interest for the current time step and up to 5 previous hourly time steps, $Q^{o}_{-5} \sim Q^{o}_{0}$.
- The model-calculated runoff rates for the current time step, the 5 previous hourly time steps, and for each time step up to the lead time for which the prediction is required, $Q^{c}_{-5} \sim Q^{c}_{n}$.
- Where available, the observed runoff rates at (upstream) locations i = 1, ..., m in the watershed for time steps t = -5 ~ 0, labeled $Q^{o,i}_{t}$.
- The observed model errors, calculated from the time series of $Q^{o}$ and $Q^{c}$ above, for the current and 5 previous hourly steps, $E_{-5} \sim E_{0}$.

Three GP functions are required for each location of interest to estimate the future time series of prediction errors $E_{1} \sim E_{6}$.

5.5.2. Training, cross-check and verification data

A number of runs of Hydro-BEAM are performed to produce time series of model-calculated estimates, and the observed runoff rates for the corresponding periods are also obtained.
The resulting data set is divided into training, cross-check and verification data sets, comprising 8, 4 and 3 runoff events, respectively.

The GP search proceeds by evolving populations and measuring their fitness based on their ability to convert the inputs in the training set to their corresponding correct outputs. In order to avoid overfitting, the fittest member from each generation is tested against a cross-check data set which is prepared independently of the training data. The cross-check data set can be used to estimate the point during evolution when overfitting begins to occur. Overfitting occurs when the GP-evolved functions begin to lose their ability to generalize as they continue to fit closer to the specific examples presented in the training set. It shows itself at the point where the fittest GP functions from each generation begin to decrease in performance when tested against the cross-check data, despite increasing performance on the training data set.

Each GP run was performed over 30 minutes, with the population in each run set to be between 250 and 450 and the number of children produced at each generation set to be between 400 and 1000.

5.5.3. Results

Expressions were evolved for the location of Chusetsu for 1 and 3-hour ahead lead times. Each expression was evolved so as to be dimensionally correct, although this is not necessarily a requirement as the functions are being evolved through an empirical process and are not guaranteed to possess an underlying physical meaning. After collecting like terms, the expression identified for a 1-hour ahead lead time could be written as a linear function of the error time series: 0.614Qct Et 1 0.909 Et 0.614 Et 1 Qct 1 5-2

A suitable expression for a 3-hour lead time was also identified; however, it was found that the iterative use of Equation 5-2 over three time steps, predicting the 1-hour ahead error at each step, provided superior 3-hour ahead prediction performance.
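The iterative use of a 1-hour ahead error expression can be sketched as below. The one-step predictor in the usage example is a hypothetical stand-in with made-up coefficients and is not Equation 5-2, which also involves the calculated discharge. The sign convention (error taken as observed minus calculated discharge, so that predicted errors are added to the calculated discharge) is an assumption consistent with the description in this chapter.

```python
def iterate_error_forecast(one_step, past_errors, steps):
    """Apply a 1-step-ahead error predictor repeatedly.

    one_step    : function mapping the error history (list, oldest first)
                  to the next error value
    past_errors : observed errors up to the present
    steps       : number of hourly steps to predict ahead
    """
    history = list(past_errors)
    for _ in range(steps):
        # Each prediction is fed back in as if it had been observed
        history.append(one_step(history))
    return history[len(past_errors):]


def corrected_discharge(q_calc_future, predicted_errors):
    """Add the predicted errors to the model-calculated discharge."""
    return [q + e for q, e in zip(q_calc_future, predicted_errors)]


# Hypothetical linear predictor with illustrative coefficients only
def toy_one_step(history):
    return 0.5 * history[-1] + 0.25 * history[-2]


errors = iterate_error_forecast(toy_one_step, [4.0, 8.0], steps=3)
forecast = corrected_discharge([100.0, 110.0, 120.0], errors)
```

The last element of `forecast` is the 3-hour ahead corrected discharge; intermediate elements correspond to the 1 and 2-hour ahead values obtained along the way.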
Tests were also performed for 4 through 6-hour ahead predictions, with results suggesting that GP is incapable of predicting model error for these lead times.

The 1 and 3-hour ahead results for two rainfall events are given for Chusetsu in Figure 5-7 and Figure 5-8. A comparison of performance in terms of root mean square error is made in Table 5-1 for the two events between the following prediction strategies:

- Hydro-BEAM calculated discharge without a GP model
- Hydro-BEAM calculated discharge with GP model correction
- Naïve prediction, which assumes that the discharge at the target lead time will be the same as the current observed discharge.

Table 5-1: Prediction error comparison

Event           Lead time    RMSE Calc.    RMSE GP    RMSE Naïve
16-17/7/2001    1-hour       106.6         32.1       42.5
16-17/7/2001    3-hour       106.6         78.3       111.8
14-16/9/2001    1-hour       87.0          21.3       29.6
14-16/9/2001    3-hour       87.0          79.8       82.4

Results show that the accuracy of the GP-enabled prediction is superior to that of both the raw Hydro-BEAM calculated discharge and the naïve prediction for each case.

A SOM was also developed to group training data into clusters. An example 3-hour ahead prediction made using two separate GP models trained on the clustered data is shown in Figure 5-9, together with a prediction based on a GP model trained using the entire non-clustered data set, included for comparison. Two clusters were used as only a limited training data set was available. It can be seen from the example that data could be divided into meaningful clusters, making distinctions between times when the hydrograph is rising and falling, and between high and low discharge rates. Generally it was found that the use of SOM for predicting model error in the case of this research provided no significant advantage, and in some cases led to poorer prediction accuracy.
It is reasoned that this is due to requiring an already limited data set to be divided into two smaller clusters, such that each cluster does not have sufficient data for training a GP model without suffering the effects of overfitting.

Figure 5-7: Runoff predictions for Chusetsu (16-17/7/2001)
[Plot of discharge (m3/s) against time: observed discharge, calculated discharge, 1-hr ahead predicted discharge, 3-hr ahead predicted discharge]

Figure 5-8: Runoff predictions for Chusetsu (14-16/9/2001)
[Plot of discharge (m3/s) against time: observed discharge, calculated discharge, 1-hr ahead predicted discharge, 3-hr ahead predicted discharge]

Figure 5-9: 3-hour ahead runoff predictions for Chusetsu (14-16/9/2001)
[Plot of discharge (m3/s) against time: observed, GP predicted, GP & SOM Cluster 1, GP & SOM Cluster 2]

5.6. Conclusion

The utility of using a GP-fitted expression for predicting Hydro-BEAM modeling errors has been investigated in this chapter. It is found that Hydro-BEAM simulations of future discharge conditions can be improved for lead times of up to 3 hours through the use of expressions identified for individual locations of interest, with no prediction performance gains over the use of the raw Hydro-BEAM calculated results for lead times of 4 hours and longer.

The use of self-organizing maps has been considered for clustering training data to allow for developing GP expressions tailored to various specific discharge conditions.
While SOM has shown the ability to group training data into meaningful clusters based on observed and modeled discharge conditions, data clustering did not improve prediction accuracy over the use of a single GP model in the case of this research. The use of SOM may be appropriate in the case where large data sets are available.

The use of GP in this manner is restricted to locations where real-time and historical discharge observation data are available. Chapter 6 introduces a methodology for extending the Hydro-BEAM and GP-based discharge predictions to other locations within the target watershed.

6. INTERPOLATION OF RUNOFF PREDICTIONS FOR DISTRIBUTED FLOOD FORECASTING

The high spatial and temporal correlation between runoff rates at different locations in a watershed can be taken advantage of to allow interpolation and extrapolation of runoff rates to locations where real-time observations or predictions are currently unavailable. Provided with runoff rates at only a handful of observation stations, interpolation and extrapolation of runoff rates along a watershed's rivers are achieved based on knowledge acquired from multiple off-line precipitation-driven distributed hydrological simulations of historical runoff events. Local linear modeling and global regression are investigated for the analysis of spatial patterns across the watershed. This process uses the filtered predictions made using the AI-based error correction model discussed in the previous chapter to estimate future runoff conditions at all locations in a watershed, not just those where runoff observations are available.

6.1. Proposed interpolation strategy

Local Linear Modeling (LLM) and Global Linear Modeling (GLM) are investigated for their application to interpolation and extrapolation of runoff rates along river channels.
Because the interpolation system developed in this research must be used to identify hydrological patterns for hundreds of unique combinations of watershed locations under a variety of different hydrological conditions, it is essential to use a flexible strategy capable of adjusting itself to each different task in real time. Additionally, in consideration of global climate change, it is desirable that the system as a whole can grow and adapt to changing hydrological conditions. For these reasons, both strategies use a database containing numerous precipitation-driven rainfall-runoff simulation results from a distributed rainfall-runoff model calibrated to the target watershed of interest. The simulated hourly discharge rates at each watershed location (1 km spatial resolution) stored in this database can then be accessed in real time to recognize spatial and temporal patterns between hydrographs at different locations in the watershed, thus removing the need for the development of numerous pre-defined models. In this way the most probable discharge rates at various ungauged locations in a watershed can be estimated based on observations or predictions of discharge rates at each available discharge observation station. The system can be automatically updated following each new observed precipitation event simply by performing a hydrological simulation and adding the results to the database, thus increasing the diversity of the knowledge in the database through the inclusion of new hydrological phenomena.

6.2. Local linear modeling

6.2.1. Introduction

Local Linear Modeling is used here to approximate the relationship of future runoff states at watershed locations without discharge observation stations using the filtered predictions (and recent observations) of future runoff states at observation station locations.
This method provides an effective tool for finding an estimate or prediction for a query vector x by fitting a parametric function in the neighborhood of x. Unlike global models such as Artificial Neural Networks, which seek to fit a single global model to all of the training data, local models use only those training samples that are most similar to the query vector x to obtain a locally parametric model suitable for estimating f(x) in the vicinity of x. As linear regression is only used in the vicinity of the query, the LLM strategy is capable of modeling solution spaces that are globally nonlinear.

A local regression model is used to approximate a relationship between the query vector and output vector by drawing upon database simulation data and embedding it into a suitably determined state space. This state space is searched for the k nearest neighbors closest to the query vector. A regression is then performed on the neighborhood, from which an estimate of the state of the non-observation location can be made. Regressions of polynomial degree zero and one are respectively referred to here as Local Averaging Models (LAM) and Local Linear Models (LLM). Regressions of higher polynomial degree are possible; however, only those of degree one are considered here.

Atkeson et al. (1997) give the following linear model, which assumes that the constant 1 has been appended to all the input vectors x to include a constant term.

\[ y = \beta_0 + \beta_1 x_1 + \dots + \beta_d x_d + \varepsilon \tag{6-1} \]

Here $\beta_i$ are the set of model parameters requiring identification, $x_i$ are the model inputs, $d$ is the dimensionality of the training data and $\varepsilon$ is an error term to be minimized. The training examples are collected in matrix $\mathbf{X}$ and the model parameters are collected in matrix $\boldsymbol{\beta}$.
\[ \mathbf{y} = \mathbf{X}\boldsymbol{\beta} \tag{6-2} \]

The model is determined through estimation of the parameters $\beta_i$ using a regression which minimizes

\[ \sum_i \left( \mathbf{x}_i^T \boldsymbol{\beta} - y_i \right)^2 \tag{6-3} \]

through solution of the normal equations

\[ \mathbf{X}^T \mathbf{X} \boldsymbol{\beta} = \mathbf{X}^T \mathbf{y} \tag{6-4} \]

with the matrix $\mathbf{X}^T \mathbf{X}$ inverted for $\boldsymbol{\beta}$:

\[ \boldsymbol{\beta} = \left( \mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T \mathbf{y} \tag{6-5} \]

6.2.2. Nearest neighbors search

An exhaustive search strategy is used to find the k nearest neighbors to the query vector, which requires that the Euclidean distance $d_E$ between the query vector q and each data point x in the database be calculated for every query made.

\[ d_E(\mathbf{x}, \mathbf{q}) = \sqrt{\sum_j (x_j - q_j)^2} = \sqrt{(\mathbf{x} - \mathbf{q})^T (\mathbf{x} - \mathbf{q})} \tag{6-6} \]

Efficient nearest point search algorithms are available to speed the nearest neighbor lookup process, such as the k-d trees scheme (Bentley 1980, Moore 1991), which creates a data structure for storing the set of training points taken from a d-dimensional space to allow for rapid subsequent lookup. In the case of this research, the system is designed to be flexible to allow for changes in database size, data quality, and hydrological and climate change. The query vector has a variable form to allow for the unique requirements of each location within a watershed and for changes in the temporal correlation between discharge rates at spatially separated locations. For this reason, and as database search time is negligible, an efficient search algorithm is deemed unnecessary.

An option to prevent a given regression from being dominated by data points all taken from the same simulation is included in the system, whereby the maximum fraction of nearest neighbors that may be chosen from a given simulation event i is restricted to be

\[ k_i \le \frac{1}{a \cdot n\_sim + b}, \qquad i = 1, \dots, n\_sim \tag{6-7} \]

where a and b are chosen by the user such that their sum is unity (a = 0.05, b = 0.95 is used in this research) and n_sim is the number of simulations stored in the database.
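The search and regression steps of Equations 6-4 to 6-7 can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions, not the system used in the thesis: the function names and the `(inputs, output, simulation id)` tuple layout are hypothetical, the cap of Equation 6-7 is turned into a whole number of neighbors by rounding down (with a minimum of one), and the normal equations are solved by Gauss-Jordan elimination rather than an explicit matrix inverse.

```python
import math
from collections import defaultdict
from typing import List, Sequence, Tuple

# Assumed record layout: (input vector x, output y, id of source simulation).
Point = Tuple[Sequence[float], float, int]


def euclidean(x: Sequence[float], q: Sequence[float]) -> float:
    """Euclidean distance between data point x and query q (Equation 6-6)."""
    return math.sqrt(sum((xj - qj) ** 2 for xj, qj in zip(x, q)))


def nearest_neighbors(data: List[Point], q: Sequence[float], k: int,
                      n_sim: int, a: float = 0.05, b: float = 0.95) -> List[Point]:
    """Exhaustive k-nearest-neighbor search with a per-simulation cap (Equation 6-7)."""
    # Assumption: the maximum fraction is converted to a count by rounding down.
    max_per_sim = max(1, int(k / (a * n_sim + b)))
    taken = defaultdict(int)
    chosen: List[Point] = []
    for point in sorted(data, key=lambda p: euclidean(p[0], q)):
        if taken[point[2]] < max_per_sim:
            chosen.append(point)
            taken[point[2]] += 1
        if len(chosen) == k:
            break
    return chosen


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def local_linear_estimate(neighbors: List[Point], q: Sequence[float]) -> float:
    """Fit y = X beta on the neighborhood (Equations 6-4, 6-5) and evaluate at q."""
    X = [[1.0] + list(x) for x, _, _ in neighbors]  # constant 1 appended
    y = [yi for _, yi, _ in neighbors]
    d = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(d)] for i in range(d)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(d)]
    beta = solve(XtX, Xty)
    return sum(bi * xi for bi, xi in zip(beta, [1.0] + list(q)))


# Synthetic example: six points on the line y = 2x + 1, one per simulation.
data = [([float(x)], 2.0 * x + 1.0, x) for x in range(6)]
hood = nearest_neighbors(data, [2.5], k=4, n_sim=6)
print(round(local_linear_estimate(hood, [2.5]), 6))  # 6.0
```

A neighborhood whose points are collinear in the input space makes the normal equations singular, in which case `solve` raises `ZeroDivisionError`; a production system would need to guard against this, for example by enlarging the neighborhood.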
Furthermore, in recognizing that some observation stations will be more important than others in the regression stage, the elements of the query vector can be weighted during the nearest neighbor search to give priority to data elements from observation stations that have hydrographs that are highly correlated with the query location's hydrograph. These observation stations will often be those that are geographically closest to the query location. One approach towards choosing appropriate weights involves using the magnitude of the correlation vector $\boldsymbol{\varphi} = (\varphi_1, \varphi_2, \dots, \varphi_m)$, which is a measure of how highly correlated each query vector element is to the runoff at the target location. This correlation can be estimated from simulated data in the database, and assumes a linear relationship. These measures of correlation can be used to weight the elements of the query vector when searching for nearest neighbors: the higher the value of $\varphi_j$, the more influence the corresponding query vector element will have in determining suitable nearest neighbors for the regression. This modified measure of distance between query point and data point is referred to here as the Dimensionally Weighted Euclidean Distance (DWED).

\[ d_{DWED}(\mathbf{x}, \mathbf{q}) = \sqrt{\sum_j \varphi_j^2 (x_j - q_j)^2} = \sqrt{(\mathbf{x} - \mathbf{q})^T \boldsymbol{\varphi}^T \boldsymbol{\varphi} (\mathbf{x} - \mathbf{q})} \tag{6-8} \]

In the matrix form, $\boldsymbol{\varphi}$ is applied as a diagonal matrix of the weights $\varphi_j$.

6.3. Global regression

As the number of nearest neighbors approaches n_sim the modeling approach moves from a local modeling strategy to a global regression strategy. This global regression approach can be considered as an extension of the local linear regression described above, using all available simulation data in searching for a relationship between the particular combination of locations under investigation.

6.4. Choice of query vector form

The proposed interpolation system is designed to exploit the correlation that exists between the discharge rates at different locations within the same watershed.
It is therefore desirable to tailor the form of the query vector to suit each individual watershed location such that it maximizes the use of available correlated data. Since observations of discharge rates and filtered predictions of future discharge rates are available at observation stations, data from these locations form the basis of the query vector.

6.4.1. Temporal correlation between elements

An estimate of the correlation between the hydrograph of the target non-observation point and the hydrographs from each observation station is determined as:

\[ \varphi_s = \frac{\sum_{i=1}^{n} \left( Q_{t+s}^{i} - \bar{Q}_{t+s} \right) \left( T_{t}^{i} - \bar{T}_{t} \right)}{\sqrt{\sum_{i=1}^{n} \left( Q_{t+s}^{i} - \bar{Q}_{t+s} \right)^2 \sum_{i=1}^{n} \left( T_{t}^{i} - \bar{T}_{t} \right)^2}} \tag{6-9} \]

Here s is the number of time lag steps, n is the number of data points in the time series, and $Q_t$ and $T_t$ are the discharge rates at the observation station and target location respectively, at time step t. In most cases there will exist a given time lag at which the two hydrographs being compared have the highest correlation. For example, a target location's present discharge rate will have a higher correlation with an upstream location's discharge rate from a number of time steps prior than with the upstream location's present discharge rate. In other words, the influence of an upstream location's discharge takes some time to be felt by downstream locations. In the case of using the interpolation system in a prediction scenario, the optimal time lag for each combination of locations is chosen to be the non-positive time lag that shows the largest correlation. In the case where the observation station is downstream of the target location, the optimal time lag for that observation station will nearly always be zero, since positive time lags have no relevance in a prediction scenario. The query vector for a given target location thus takes the following form:

\[ \mathbf{q}^{t} = \left( Q_{1}^{t+s_1}, Q_{2}^{t+s_2}, \dots, Q_{m}^{t+s_m} \right) \tag{6-10} \]

where $s_1, s_2, \dots, s_m$ refer to the optimal lag of each of the m observation stations.

6.4.2. Additional elements

The inclusion of a number of additional query vector elements, which refer to other factors related to the hydrological dynamics in the watershed, may result in an increase in the accuracy of the interpolation method. Division of data points in the database into groups related to the stage of the hydrograph at the time of observation is considered. Here the hydrograph stage is simply described by one of the following four descriptors: low flow, rising, peak or falling. A low flow level is set for each observation station based on hydrological records, with any discharge rate below or equal to that level defined as 'low flow'. Any discharge rate above this level is then grouped based on the following rules:
- If the second derivative of the discharge rate time series is negative: 'peak'
- Else, if the first derivative of the discharge rate time series is positive: 'rising limb'
- Else, if the first derivative of the discharge rate time series is negative: 'falling limb'

In this way, the inflection points of the hydrograph are chosen as the transition points between rising limb, peak, and falling limb regions. This strategy of grouping data based on hydrograph stage has a similar effect to using a self-organizing map for data clustering.

6.5. Application

This section presents the results of an application to test the ability of the local modeling scheme to faithfully model the temporal-spatial relationship between watershed locations based on the distributed rainfall-runoff simulation results. The application is conducted for two typhoon events that occurred in the vicinity of the Nagara River watershed. The discharge observation stations used for this application exist within the watershed at the downstream locations of Chusetsu and Akutami, and the mid-stream locations of Mino and Shimohorado (Figure 6-1).
The vast majority of residences and facilities that require protection from flooding are also located in this southern area of the watershed. Hydro-BEAM calibration and database preparation are performed using simulation results from 10 major precipitation events that occurred in 2000-2004.

Figure 6-1: Nagara River flow routing map and discharge observation stations

Validation of the system is performed using two additional independent runoff events that occurred in 2003. Two scenarios are investigated here. The first scenario involves interpolating discharge rates for a location (Mino) that has observation stations located in both upstream (Shimohorado) and downstream (Akutami, Chusetsu) locations. The second scenario involves extrapolation of discharge rates to a location (Shimohorado) that has no observation stations located upstream, and three observation stations located downstream (Mino, Akutami, Chusetsu). In each case the observed runoff at the target location is only used for verification, and as such these locations are assumed to be without observation stations. The observed discharge rates at the four locations for the two events used in this application are given in Figure 6-2 and Figure 6-3.

Figure 6-2: Observed discharge, Event 1: 23-28/4/2003
[Plot of discharge (m3/s) against date / time for Chusetsu, Akutami, Mino and Shimohorado]

Figure 6-3: Observed discharge, Event 2: 11-13/7/2003
[Plot of discharge (m3/s) against date / time for Chusetsu, Akutami, Mino and Shimohorado]

6.6. Results and discussion

The correlation between hydrographs in the simulation database is analyzed to determine the optimal form of the query vectors. It is found that the discharges at the two target locations are best described by functions of the following form, where superscripts refer to hourly time lag steps and subscripts refer to location names:

QMt f QCt 2 , QAt 1 , QSt 1 Mino Shimohorado Q f Q , Q , Q t S t 3 C t 3 A t 1 M 6-11

Local linear modeling with a small number of nearest neighbors gave unstable results for both Mino and Shimohorado. It was found that stability and accuracy of the interpolation and extrapolation results improved as the number of nearest neighbors approached the number of data points in the database, equivalent to the global regression approach. The high linear correlation between hydrographs at each location studied also suggests that global linear regression is a valid approach. Results using global regression for Mino are given in Figure 6-4 and Figure 6-5, and for Shimohorado in Figure 6-6 and Figure 6-7. Table 6-1 gives the root mean square (RMS) error and mean absolute relative (MAR) error for the interpolation at Mino and the extrapolation at Shimohorado for the two events.
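The two error measures reported in Table 6-1 can be computed as in the following sketch. The formulas are not written out in this section, so the conventional definitions are assumed here: RMS error as the square root of the mean squared difference, and MAR error as the mean of the absolute difference divided by the observed value. The function names and the sample numbers are illustrative only and are not taken from the thesis data.

```python
import math
from typing import Sequence


def rms_error(observed: Sequence[float], estimated: Sequence[float]) -> float:
    """Root mean square error, in the units of the discharge series."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(observed, estimated)) / n)


def mar_error(observed: Sequence[float], estimated: Sequence[float]) -> float:
    """Mean absolute relative error (dimensionless); observed values must be nonzero."""
    n = len(observed)
    return sum(abs(e - o) / o for o, e in zip(observed, estimated)) / n


# Made-up discharge values (m3/s) for illustration.
obs = [100.0, 200.0]
est = [110.0, 180.0]
print(rms_error(obs, est))  # square root of 250
print(mar_error(obs, est))  # mean of 0.1 and 0.1
```

Because the MAR error divides by the observed discharge, it weights errors at low flows more heavily than the RMS error does, which is one reason the two measures can rank methods differently.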
Figure 6-4: Interpolation for Mino (23-28/4/2003)
[Plot of discharge (m3/s) against date / time: observed discharge (Mino), global regression, global regression & optimal lag, global regression & data division]

Figure 6-5: Interpolation for Mino (11-13/7/2003)
[Plot of discharge (m3/s) against date / time: observed discharge (Mino), global regression, global regression & optimal lag, global regression & data division]

Figure 6-6: Extrapolation for Shimohorado (23-28/4/2003)
[Plot of discharge (m3/s) against date / time: observed discharge (Shimohorado), global regression, global regression & optimal lag, global regression & data division]

Figure 6-7: Extrapolation for Shimohorado (11-13/7/2003)
[Plot of discharge (m3/s) against date / time: observed discharge (Shimohorado), global regression, global regression & optimal lag, global regression & data division]

Table 6-1: Global regression results for Mino and Shimohorado

                                       GR (a)              GR / optimal lag    GR / division
Location      Event                    RMS (b)   MAR (c)   RMS       MAR       RMS       MAR
Mino          Event 1: 23-28/4/2003    101       0.140     89.9      0.128     99.8      0.165
Mino          Event 2: 11-13/7/2003    23.8      0.0543    21.9      0.0532    38.0      0.0895
Shimohorado   Event 1: 23-28/4/2003    36.9      0.201     24.5      0.243     48.5      0.176
Shimohorado   Event 2: 11-13/7/2003    25.0      0.251     21.8      0.270     26.1      0.210

(a) Global regression
(b) Root mean square error (m3/s)
(c) Mean absolute relative error

Application results indicate that the global regression strategy proposed here is capable of estimating hydrographs at distributed positions within a watershed based on knowledge of the hydrographs at positions located at a distance. As would be expected, hydrograph shape is estimated accurately, with rising and falling limbs and hydrograph peaks timed well. For the unseen events, a mean absolute relative error in magnitude of the estimated runoff of the order of 0.05 to 0.15 was achieved for the two cases of interpolation for runoff at Mino, with less accurate results, of the order of 0.20 to 0.25, for extrapolation to the distant location of Shimohorado.

The results showed that a slight improvement in accuracy was gained for the interpolation at Mino through optimization of the query vector to consider the time lags at which the target location is optimally correlated. Division of the data points in the database to reflect their position in a hydrograph (baseflow, rising limb, peak, falling limb), so as to train separate regression models for each hydrograph stage, showed mixed results, with an increase in accuracy only for the extrapolation case at Shimohorado. These results are inconclusive regarding the benefit of employing lag optimization and data division.

6.7. Conclusion

A strategy for interpolation and extrapolation of runoff rates across a watershed has been introduced. Results indicate that global regression can be used to estimate the shape, timing and magnitude of hydrographs at locations separated from reference locations where runoff observations or predictions are available. Further investigation is required to determine the ability of the system to accurately extrapolate results to locations greatly separated from observation locations.

7. PROBABILISTIC FLOOD FORECASTING

Combined rainfall prediction and rainfall-runoff simulation procedures for estimation of future flood stage conditions generally attempt only to offer a best-guess estimate of future river watershed discharge conditions, without giving any information regarding the confidence of the forecast being made. This ignores the fact that there are a number of sources of forecast uncertainty, including rainfall measurement and forecasting errors, and errors associated with Hydro-BEAM and its parameters.

Information about the uncertainty in forecasts, otherwise referred to as predictive uncertainty, can be beneficial in a number of ways, especially when this uncertainty is described in the form of a probabilistic forecast, which gives the probability distribution of the variable being forecast. Risk-based decision-making becomes possible when probabilistic rather than deterministic forecasts are provided, with the potential for social and economic benefits resulting from the operation of floodgates and pumps, and other mitigation measures, with a view to risk minimization.

Risk-based flood warning is also made possible through probabilistic flood stage forecasting, where the probability of exceedance of design flood levels can be provided. This has the benefit of reminding the user that a given forecast is not certain, and alerts the user to the range of flood stage heights that could potentially be experienced. This would help to remove the confusion that would otherwise likely occur if a flood stage prediction were exceeded in a major flood event, leading to damage or loss of life as a result of misguided faith in what was a 'best' but by no means perfect estimate of future conditions.

For the above reasons, a framework is proposed for taking into account the uncertainty inherent in a flood stage forecasting system.
An estimate of future conditions that takes uncertainties into account through considering probabilities is referred to in this research as a 'forecast', while an estimate that ignores probabilities in inferring future conditions is referred to as a 'prediction', in line with the recommendations given in Krzysztofowicz (2001a).

7.1. Modeling uncertainty in flood forecasts

Uncertainty in watershed runoff predictions results from an inability to perfectly predict future rainfall conditions, and from the inadequacy of the mathematical model used to approximate a highly complex physical system. The uncertainty related to estimates of future rainfall conditions is referred to here as precipitation uncertainty, and the uncertainty related to the model structure, estimated model parameters, and data observations is referred to as hydrologic uncertainty.

Precipitation uncertainty is generally regarded as the most influential cause of uncertainty in a flood forecast (Moore, 2002). Ensemble or Monte Carlo simulation-based forecasts of future hydrological conditions may be used to estimate the uncertainty in a flood stage forecast due to uncertainty in the rainfall forecast input. Ensemble forecasts, however, cannot alone produce a complete probabilistic forecast, as they are only capable of estimating an output distribution of model flood stage, incorporating uncertainty in the precipitation input, while ignoring the hydrologic uncertainty arising from all other sources of uncertainty (Krzysztofowicz, 2001b). Additionally, an ensemble forecast often does not take into account the precipitation measurement error, assuming that the precipitation forecast is made based on perfectly observed climatic conditions.
One attempt at incorporating all known uncertainties in a short-term flood stage forecast involved a Bayesian forecasting system, which determines the probability distribution of a model flood stage under the hypothesis that there is no hydrologic uncertainty (Kelly and Krzysztofowicz, 2000), quantifies hydrologic uncertainty under the hypothesis that there is no uncertainty in the precipitation input (Krzysztofowicz and Herr, 2001), and integrates these uncertainties to produce a probabilistic flood stage forecast (Krzysztofowicz, 2001b).

Attempts to date to produce probabilistic forecasts of flood stage have considered rainfall as an averaged or point process using a coarse temporal resolution of the order of one hour, and have used lumped physical models or black box models to model the rainfall-runoff process. Examples include the precipitation uncertainty processor developed by Kelly and Krzysztofowicz (2000) for the aforementioned Bayesian forecasting system, which used a time series of 6-hour watershed average precipitation amounts as input for a lumped hydrologic model, and the real-time flood forecasting system of Lardet and Obled (1994), which uses stochastically generated hourly time series of rainfall as a lumped input to a rainfall-runoff model. A framework for probabilistic forecasting of discharge conditions throughout a watershed, considering rainfall at a fine spatial and temporal resolution, and using a distributed physically-based rainfall-runoff model, is presented here.

The probabilistic short-term forecast of watershed flood stage conditions presented in this research is based on a rainfall translation model and a deterministic rainfall-runoff model. Consideration is given to the effects of uncertainty in the rainfall forecast, as well as observational and modeling uncertainties.
These hydrologic and precipitation uncertainties are handled as follows:
- A Monte Carlo simulation of rainfall conditions is used to produce an ensemble forecast considering precipitation uncertainty.
- Two independent error correction approaches are proposed to reduce the influence of observation and model errors, and to provide an estimate of the uncertainty in the forecast due to hydrologic uncertainty:
  i. A recursive adaptive updating technique which updates the state of the target watershed in real time based on runoff observations.
  ii. An AI technology-based error prediction strategy that works to reduce the rainfall-runoff model error at locations where runoff observations are available in real time, and uses these corrected model rates to predict the runoff at surrounding locations in the watershed.

7.2. Probabilistic flood forecast formulation

An effective means by which to unambiguously convey the degree of certitude in a forecast is a predictive probability distribution function involving a numerical measure of the degree of certitude regarding the occurrence of an event. Charts of the probability density function (pdf), or the equivalent cumulative distribution function (cdf) describing the probability $P(Q \le q)$ of flood discharge Q being less than or equal to a designated discharge level q, are proposed as an appropriate means of describing a flood forecast for a given location within a watershed for each required forecast lead time. Additionally, a convenient method of displaying results of a distributed flood forecast, so as to provide information at a glance regarding future distributed watershed conditions, is to provide a color-coded plot of probability of exceedance in terms of percentage of design flood level for each location across a watershed.

If an appropriate distribution can be fitted to the ensemble forecast results, a single aggregated forecast in pdf or cdf form can be provided for each watershed point.
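Before any distribution is fitted, the cdf value and the probability of exceedance at a single location and lead time can be read directly from the ensemble members. The sketch below shows this empirical calculation; it is an illustration only, the function names are hypothetical, and the ensemble values and design level are made-up numbers rather than results from this research.

```python
from typing import Sequence


def empirical_cdf(ensemble: Sequence[float], q: float) -> float:
    """Fraction of ensemble members with discharge less than or equal to q."""
    return sum(1 for member in ensemble if member <= q) / len(ensemble)


def exceedance_probability(ensemble: Sequence[float], design_level: float) -> float:
    """Estimated probability that discharge exceeds the design level."""
    return 1.0 - empirical_cdf(ensemble, design_level)


# Made-up ensemble of forecast discharges (m3/s) at one location and lead time.
members = [100.0, 200.0, 300.0, 400.0]
print(empirical_cdf(members, 250.0))           # 0.5
print(exceedance_probability(members, 250.0))  # 0.5
```

Repeating the calculation for every grid location gives the values that a color-coded exceedance map would display. The resolution of the estimate is limited to one over the number of ensemble members, which is one motivation for fitting a smooth distribution instead.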
The aggregated forecast is achieved through combining the distributions resulting from consideration of precipitation uncertainty and hydrologic uncertainty.

7.2.1. Precipitation uncertainty

A Monte Carlo simulation approach that involves the generation of numerous future rainfall pattern series, and the input of these patterns into a deterministic rainfall-runoff model, is proposed in Chapter 3. A translation vector model for analysis of rainfall pattern movement is extended to include a time series analysis of observed pattern translation to allow for stochastic generation of future rainfall patterns based on the statistical properties of rainfall pattern translation and growth-decay characteristics. These generated future rainfall patterns are subsequently input into a distributed rainfall-runoff model, resulting in a distributed ensemble forecast of watershed flood stage based on the range of possible precipitation conditions that could be experienced. The goal of the Monte Carlo simulation is to use a stochastic rainfall generator and hydrologic model to generate numerous realistic future rainfall-runoff events such that an ensemble forecast of flood stage carrying a probabilistic meaning can be given.

7.2.2. Hydrologic uncertainty

In addition to improving the accuracy of the real-time flood stage forecast, the methods proposed in this research for assimilation of observed runoff data can be used to provide an estimate of the variance of the prediction error due to errors in measurement of hydrological inputs and shortcomings associated with the model and its parameterization. An estimate of the hydrologic uncertainty can be made as outlined in Chapter 4 by using the adaptive updating algorithm to recursively estimate the forecast error variance. A drawback of this approach is that it is limited to locations where real-time discharge observation data is available.
An estimate of the hydrologic uncertainty is also required for other non-observation point locations. As no observation data is available for these locations, the assumption is made that the predictive ability of Hydro-BEAM at these locations is at least as good as a naïve prediction whereby future discharge rates are estimated as being the same as the currently observed discharge rate. Error distributions can thus be determined based on Hydro-BEAM simulated hydrographs using observed rainfall, comparing n-hour ahead discharge rates with current rates for various locations to determine error distributions for the naïve prediction. Under the assumption that the error distributions are similar for runoff events of similar magnitude, these distributions can then be used in real time to estimate the degree of uncertainty of a runoff rate prediction for a given location and prediction lead time. In this way a prediction of a runoff rate can be converted to a cumulative distribution function of the range of possible runoff rates that may eventuate under the given rainfall time series when considering hydrologic uncertainty.

Error distributions resulting from hydrologic uncertainty are assumed to be lognormally distributed. This assumption is necessary to allow the error to be combined with the distribution resulting from the Monte Carlo simulation for precipitation uncertainty. In order to satisfy this assumption, adaptive updating (Chapter 4) is performed on the logarithm of the discharge, rather than the discharge itself. This is achieved using a simple preprocessor for converting the discharge to the logarithmic scale prior to updating

\[ Q' = \log Q, \qquad \varepsilon_h' = \log \varepsilon_h \tag{7-1} \]

together with a postprocessor for converting the discharge back to a real number scale once updating is completed:

\[ Q = e^{Q'}, \qquad \varepsilon_h = e^{\varepsilon_h'} \tag{7-2} \]

Here $\varepsilon_h$ is the forecast error due to hydrologic uncertainty.
Updating is performed on $Q'$, and $\varepsilon_h'$ is treated by the adaptive updating algorithm as being normally distributed with zero mean.

7.2.3. Combining precipitation uncertainty and hydrologic uncertainty

In order to produce a complete probabilistic forecast of future runoff conditions it is necessary to combine the effects of both precipitation uncertainty and hydrologic uncertainty in a single pdf or cdf. The forecast of future discharge $Q$ can be represented in the logarithmic scale as

$$ Q' = Q_p' + \varepsilon_h' \tag{7-3} $$

where $Q_p$ is a lognormally distributed variable with mean $\mu_p$ and variance $\sigma_p^2$ representing discharge modeled under precipitation uncertainty, and $Q_p' = \log Q_p$ is a normally distributed variable with mean $m_p$ and variance $s_p^2$. The logarithm of the forecast error due to hydrologic uncertainty, $\varepsilon_h'$, is normally distributed with mean $m_h$ (assumed equal to zero) and variance $s_h^2$. Equation 7-3 can be expressed as

$$ Q' = m_p + s_p r_p + m_h + s_h r_h = m_p + s_p r_p + s_h r_h \tag{7-4} $$

where subscripts p and h relate to precipitation and hydrologic uncertainties respectively, and $r_p$ and $r_h$ are independent random normal variables defined by:

$$ E[r_p] = 0, \quad E[r_p^2] = 1, \quad E[r_h] = 0, \quad E[r_h^2] = 1, \quad E[r_p r_h] = 0 \tag{7-5} $$

The mean $m$ and variance $s^2$ of $Q'$ can be described in terms of $m_p$, $s_p^2$ and $s_h^2$ as follows.

$$ m = E[m_p + s_p r_p + s_h r_h] = m_p + s_p E[r_p] + s_h E[r_h] = m_p \tag{7-6} $$

$$ s^2 = E[Q'^2] - \left(E[Q']\right)^2 = E\left[m_p^2 + 2 m_p s_p r_p + 2 m_p s_h r_h + s_p^2 r_p^2 + 2 s_p r_p s_h r_h + s_h^2 r_h^2\right] - m_p^2 = s_p^2 E[r_p^2] + s_h^2 E[r_h^2] = s_p^2 + s_h^2 \tag{7-7} $$

Defining $Q$ in terms of a single lognormal distribution then becomes a simple matter of converting $Q'$ from the logarithmic scale to the real scale. The mean, variance, skewness and kurtosis of $Q$ are:

$$ \begin{aligned} \mu &= e^{m_p + (s_p^2 + s_h^2)/2} \\ \sigma^2 &= e^{2 m_p + s_p^2 + s_h^2}\left(e^{s_p^2 + s_h^2} - 1\right) \\ \text{skewness} &= \left(e^{s_p^2 + s_h^2} + 2\right)\sqrt{e^{s_p^2 + s_h^2} - 1} \\ \text{kurtosis} &= e^{4(s_p^2 + s_h^2)} + 2e^{3(s_p^2 + s_h^2)} + 3e^{2(s_p^2 + s_h^2)} - 3 \end{aligned} \tag{7-8} $$

7.3. Application

An application of the probabilistic flood forecasting system is presented here.
The probabilistic rainfall forecast results for 11 September 2000 from Chapter 3, comprising results from 100 Monte Carlo simulations of rainfall dynamics between 11 September 21:00 and 12 September 3:00 are used for the precipitation input, and the distributed adaptive updating algorithm presented in Chapter 4 is used for assimilating real-time discharge observations and updating the middle reach of Nagara River and surrounding areas. The result of the ensemble forecast considering precipitation uncertainty based on 100 6-hour simulations is given for the location of Chusetsu in Figure 7-1. It can be seen from the ensemble that the generated rainfall input does not have a major influence on the hydrograph at downstream locations within the Nagara River watershed for the first 2 hours of the rainfall-runoff simulation. The influence on the hydrographs of midstream locations such as Mino and Akutami appears approximately an hour earlier. Generated hydrographs can be converted into cumulative distribution functions at each time step, thus describing the forecast of future discharge conditions at each point within a watershed in probabilistic terms. The ensemble data is found to fit a lognormal distribution function, and example cdfs are given for Chusetsu for 1 through 6-hour ahead forecasts (Figure 7-2). As is expected, these figures suggest increasing uncertainty in the forecasts with time, with very little uncertainty due to the precipitation forecast present for 1 and 2-hours-ahead forecasts. Hydrologic uncertainty, considering observation errors and modeling errors, is not considered in these figures. 
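The conversion of an ensemble into a lognormal cdf can be sketched in a few lines of Python. This is a minimal illustration only: the ensemble values, the function names and the hydrologic log-variance `s_h2` are hypothetical and are not taken from the Nagara River application. The optional widening of the cdf by `s_h2` follows the variance sum of Equation 7-7, and the moments follow the first two lines of Equation 7-8.

```python
import math
from statistics import NormalDist, fmean, pvariance


def fit_log_moments(ensemble):
    """Return the mean m_p and variance s_p^2 of log-discharge for one lead time."""
    logs = [math.log(q) for q in ensemble]
    return fmean(logs), pvariance(logs)


def combined_cdf(q, m_p, s_p2, s_h2=0.0):
    """P(Q <= q) for a lognormal Q with log-mean m_p and log-variance s_p^2 + s_h^2."""
    s = math.sqrt(s_p2 + s_h2)
    return NormalDist(m_p, s).cdf(math.log(q))


def lognormal_moments(m_p, s_p2, s_h2=0.0):
    """Mean and variance of Q on the real scale."""
    s2 = s_p2 + s_h2
    mean = math.exp(m_p + s2 / 2.0)
    var = math.exp(2.0 * m_p + s2) * (math.exp(s2) - 1.0)
    return mean, var


# Hypothetical six-member ensemble of discharge (m3/s) at a single lead time.
ensemble = [5200.0, 5600.0, 5900.0, 6100.0, 6500.0, 7000.0]
m_p, s_p2 = fit_log_moments(ensemble)

# Probability of not exceeding a threshold, first with precipitation
# uncertainty only, then with an assumed hydrologic log-variance added.
p_precip_only = combined_cdf(6750.0, m_p, s_p2)
p_combined = combined_cdf(6750.0, m_p, s_p2, s_h2=0.01)
```

For a threshold above the ensemble median, adding the hydrologic variance lowers the non-exceedance probability, which is the expected effect of a wider forecast distribution.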
Figure 7-1: Ensemble forecast for Chusetsu made at 21:00 11 September 2000 (observed and calculated discharge in m3/s against date and time, from 18:00 11 September to 3:00 12 September 2000, with t = 0 at the time of the forecast)

Figure 7-2: Probabilistic forecast of discharge considering precipitation uncertainty, 21:00 11 September 2000, Chusetsu (predictive probability P(Q<=q) against discharge Q in m3/s for lead times of 1 to 6 hours)

7.4. Conclusion

A framework has been proposed for the production of a probabilistic forecast of future distributed discharge conditions in a watershed. Methods for quantifying the two sources of forecast uncertainty that affect a flood forecast, being precipitation uncertainty and hydrologic uncertainty, have been proposed so as to provide a complete probabilistic forecast. The system provides a forecast for a lead-time of up to 6 hours of discharge conditions at 1km intervals along each major tributary within the midstream region of the Nagara River watershed. A forecast of discharge presented in both a distributed and probabilistic manner has a considerable benefit over the traditional approach of providing best-guess predictions for a small number of locations, as it allows the range of potential flood conditions to be identified for all populated areas in a watershed, which is necessary for effective planning of flood prevention and evacuation strategies. An approach for using such a forecast for providing optimal evacuation decisions is explored in Chapter 8.

Reduction of modeling error associated with hydrologic uncertainty was made possible during the ensemble forecast using the adaptive updating algorithm presented in Chapter 4. Alternatively, the AI-based error correction scheme presented in Chapter 5 and Chapter 6 can be used.
An advantage of using the adaptive updating algorithm is that it can also be used to provide an estimate of hydrologic uncertainty; however, this ability is limited to locations where real-time discharge observations are available.

8. EVACUATION DECISION

One of the most important features of a short-term flood forecast is its utility in helping to make decisions during times of flood risk. Such decisions include those related to the operation of hydraulic structures and the inundation of flood plains to reduce flood risk, and the evacuation of citizens from locations threatened by flood inundation. As an example application for the probabilistic flood forecast developed in this research, the development of a decision support system for evacuation decisions is investigated.

The problem of evacuation decision is essentially that of choosing an action from a variety of alternatives, each with different consequences which depend on the combination of the choice of action made and an uncertain future state of nature. Since by definition a probabilistic flood forecast can provide either an estimate of the probability with which a flood will occur or the probability at which different water levels may be experienced, and since the losses involved with each action-state combination can be estimated, the evacuation decision can be modeled as an engineering decision-making problem. In this way it is possible to use a distributed probabilistic flood forecast to provide an optimal decision regarding evacuation of residents that is based on the probability of flood occurrence at their location. This is considered superior to a decision based purely on a deterministic prediction of water level with no information as to the uncertainty involved in the prediction or the range of possible water levels that could be experienced.
A number of approaches for estimating damage due to inundation are discussed and recommendations are given for using the probabilistic flood forecast system in making evacuation decisions. The following discussion considers flooding which results from overtopping of embankments only; however, depending on the watershed and the hydrological conditions being considered, flooding due to embankment failure may also be an issue requiring attention.

8.1. Decision model

The decision regarding whether or not to evacuate an area involves making a choice as to a course of action based on limited available knowledge. The courses of action open to the decision maker in a time of flood risk are considered here to be the action of issuing an evacuation order or not issuing an evacuation order for each location within a river basin. The knowledge available on which this decision can be made includes the probabilistic flood forecast issued for each location, the costs associated with flooding, evacuation costs, and relevant topographical and demographical information for the river basin.

Ultimately, a course of action is desirable for each location within an area at risk that leads to zero casualties. Although in the interest of saving lives it may be necessary to issue evacuation orders even at times of low inundation risk, it is important to minimize such disruption to communities when possible. The approach suggested for this decision model is one that aims to minimize loss of life and disruptions to communities through identification of the evacuation decision and strategy that has the maximum expected value under current conditions.

8.1.1. Estimating potential costs

The costs considered in the decision model for evacuation can be categorized as losses resulting from preventable flood damage and losses resulting from evacuation.
Preventable flood damage is considered to be losses which could have been avoided through appropriate evacuation of citizens from an affected area, such as death and injury. Potential damage to buildings and property should not be considered when making an evacuation decision as this damage is the same regardless of whether an evacuation is ordered or not. Losses resulting from evacuation include costs associated with coordinating an evacuation and providing emergency services, lost profits due to business interruption, and costs associated with the inconvenience and lost time associated with vacating a residential dwelling. A trade-off therefore occurs between the number of hours or amount of money saved as a result of no evacuation and the potential for loss of life that could result from flooding.

Assigning equivalent cost values in terms of yen, dollars or other units to each of the above items is difficult and can be rather subjective. There are many arguments both for and against assigning a monetary value to human life, and in the case where a value is assigned the figure can vary greatly depending on the approach and background assumptions used.

8.1.2. Estimating inundation probability and severity

The probabilistic flood forecast developed in this thesis is capable of providing a forecast of when and where river banks are likely to be overtopped. In order to utilize this information for evacuation decision making, it is necessary to be able to determine the risk that overtopping presents to residents in regions adjacent to rivers. The ability to determine this depends on the detail to which urban flooding dynamics are understood and modeled in each region.
In any given watershed, depending on the resources available and geographic and demographic characteristics, a combination of strategies may be employed throughout the watershed to estimate flood depths resulting from embankment overtopping, such as linking the river network model with a detailed urban flood model, making estimates based on pre-existing flood hazard maps, or using a simple tank model strategy. The use of the probabilistic flood forecasting strategy with each of these scenarios is discussed below.

The most detailed approach to modeling flood depths resulting from embankment overtopping is that of employing an urban flood model. Ideally, this would allow for dynamic real-time mapping of inundation risk across a watershed and give a visual guide as to safe locations to evacuate to and the lowest risk routes to take. The kinematic wave equation is acceptable for modeling one-dimensional flow in a relatively steep channel network; however, once floodwaters overtop embankments and enter urban regions, a fully-distributed two-dimensional urban flood model is more suitable for accurately modeling flood dynamics. There exist a wide range of urban flood models and strategies that could be suitably adapted for use together with Hydro-BEAM for providing a probabilistic forecast of spatially-distributed inundation levels.

Once a forecast of inundation levels is made available, it then becomes necessary to estimate the potential for loss of life should inundation occur. The procedure proposed here assigns a severity index to each potential inundation level which varies from zero inundation through to a specified inundation level which would result in the death of the entire unevacuated population of the area being considered (Figure 8-1(b)). The combined use of this severity curve with a probabilistic forecast of inundation levels (Figure 8-1(a)) can be considered equivalent to a measure of the risk to life posed by future flood conditions.
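The combination of a severity curve with a forecast pdf can be illustrated with a short Python sketch. It assumes, for illustration only, a severity index that rises linearly from 0 at a lower level `s0` to 1 at an upper level `s1`, and a forecast pdf tabulated on a grid of inundation levels. The function names, the uniform pdf and the threshold values are hypothetical.

```python
def severity(level, s0, s1):
    """Severity index: 0 at or below s0, 1 at or above s1, linear in between
    (the linear shape is an assumption made for this sketch)."""
    if level <= s0:
        return 0.0
    if level >= s1:
        return 1.0
    return (level - s0) / (s1 - s0)


def risk(levels, pdf, s0, s1):
    """Trapezoidal approximation of the integral of p(x) * S(x) over x."""
    total = 0.0
    for i in range(len(levels) - 1):
        f_left = pdf[i] * severity(levels[i], s0, s1)
        f_right = pdf[i + 1] * severity(levels[i + 1], s0, s1)
        total += 0.5 * (f_left + f_right) * (levels[i + 1] - levels[i])
    return total


# Hypothetical forecast: inundation level uniformly distributed over 0 to 2 m,
# tabulated at 0.25 m steps, with severity rising between 0.5 m and 1.5 m.
levels = [0.25 * i for i in range(9)]
pdf = [0.5] * len(levels)
example_risk = risk(levels, pdf, s0=0.5, s1=1.5)
```

The result lies between zero (no part of the forecast distribution reaches a dangerous level) and one (the whole distribution lies at or above the level taken to be fatal for all unevacuated residents).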
Figure 8-1: (a) PDF of inundation levels, and (b) severity curve (panel (a): p(q) against inundation level (m); panel (b): severity from 0.0 to 1.0 against inundation level (m))

While the use of an urban flood model is attractive as it is capable of detailed flood modeling and consideration of facilities such as underground malls and subway stations which are at highest risk during flood events, the large amount of time and considerable difficulty involved with the development and calibration of such models often makes their use prohibitive. For many regions within a watershed, especially highly-populated areas close to major rivers, flood hazard maps may be available as a viable alternative to the development of a detailed urban model. Flood hazard maps depict the inundation depths that may result from embankment overflow or embankment failure during a severe flooding event, based on past flooding experience and regional topography. Such maps are quite subjective in that they rely heavily on the assumptions made regarding the flood event and overflow/failure scenario; however, in the absence of an urban flood model they can be used as a rough reference from which to assess the potential risk to urban locations posed by flood levels in adjacent river channels.

When using flood hazard maps, the shape of the severity curve must be determined individually for each location within the target region based on the potential for inundation as suggested by the hazard map, and the distance of the location from the river being considered. In this case the curve is given in terms of the flood discharge rate in the adjacent river, and varies from zero for the maximum flood discharge rate in the adjacent river that would lead to no flood damage (assumed for demonstration purposes here to be approximately equal to 100% of the design discharge rate for locations adjacent to a river) through to a specified discharge rate which would result in the death of the entire unevacuated population (Figure 8-2(b)).
Figure 8-2: (a) PDF of discharge rates for a location under analysis for a given future point in time, (b) Severity curve for location under analysis (panel (a): p(q) against discharge q in m3/s; panel (b): severity from 0.0 to 1.0 against discharge q in m3/s; the design discharge $q_{design}$ is marked on both discharge axes)

In many cases neither an online urban flood model nor a flood hazard map may be available for assessing the risk associated with potential flood conditions. A third and much less resource-intensive option that is available to the decision maker is to estimate urban flood levels that would result from predicted flood conditions through the use of a simple tank model representation of the regions adjacent to rivers. Elevation data is available at 50m intervals within Japan, and a tank model based on this data can be used to estimate which regions will experience urban flooding and to what degree, based on predicted flood levels within a river basin’s channel network and the associated embankment overflow rate. In this case a curve such as depicted in Figure 8-1(b) would be used to describe the severity associated with each inundation level.

8.1.3. Evacuation decision formulation and timing of the evacuation

The evacuation decision problem can be formulated as a multi-stage model. At regular time steps throughout the duration of a rainfall event a distributed probabilistic forecast of discharge is generated for each location of interest within the watershed for several time steps into the future. For a given location, a decision based on the forecasted flood conditions at each future time step is required. A choice is offered between two actions, $A_E$: order evacuation, or $\bar{A}_E$: do not order evacuation and delay decision one time step.
In making a decision when faced with a potential flood risk there is a trade-off between ordering an evacuation too early based on a highly-uncertain forecast which risks unnecessarily disturbing the public, and leaving the evacuation order until a point in time when it is too late to evacuate the majority of the public. In choosing between actions $A_E$ and $\bar{A}_E$ the decision method must be able to determine the optimal timing of the evacuation based on the amount of time it takes to evacuate a population. An evacuation progress index R(τ) is proposed to indicate the fraction of a population that would remain unevacuated for evacuation orders given at various warning lead times. This index can be plotted against lead time for each target location as a function decreasing from one to zero as given in Figure 8-3. The shape of the function will depend on the characteristics and demographics of the location being modeled.

Figure 8-3: Evacuation progress index (R, the fraction of the population remaining, against lead time τ in hours, falling from 1.0, 100% of population remaining, to 0.0, complete evacuation)

As both evacuation success and evacuation costs are modeled as being dependent on the period of time allocated for the evacuation (lead time), the decision model is able to optimize the timing of an evacuation should one be necessary. This can be achieved through considering the decision in terms of a multi-stage decision model (Figure 8-4). Although flood-related costs are modeled as a continuous function in this research, for the sake of this explanation a decision tree for the multi-stage model for the discrete (no flood / flood) evacuation problem is assumed. In this example the probability of flooding at the given lead time being considered is denoted $P_f$ and evacuation cost and flood damage are labeled C and D respectively. In using this multi-stage model, the expected value of action $\bar{A}_{E,\tau}$ is calculated as being the expected value of the optimal choice at the next time step.
Once this is calculated it can be compared with the expected value of $A_{E,\tau}$, and a decision can be made. In order to calculate the expected values of the actions $A_{E,\tau-1}$ and $\bar{A}_{E,\tau-1}$, the probability of flooding from the point of view of the next step, $P_f^*$, is required. Although this probability cannot be known at the present time step, the optimal estimate for this value can be considered equal to the value of $P_f$ from the point of view of the current time step. In the case where action $\bar{A}_E$ is chosen, this probability will be updated based on the newly-available probabilistic flood forecast made at the next time step, which is likely to include less uncertainty.

Figure 8-4: Multi-stage decision model (decision tree running from the present time through the next time step, at which a higher-accuracy prediction is available, to the lead time under consideration, with evacuation costs and flood damage listed for the flooding and no-flooding outcomes of each action)

8.1.4. Objective function formulation

A function is developed here to calculate the expected value of a given action at a given lead time. The function estimates the combined flood damage (D) and evacuation costs (C) for the location and lead time being considered.
Flood damage is defined for a location as the product of the number of people killed by the flood and the value attributed to an average human life, λ:

$$ D = S(q)\, R(A, \tau)\, n_{pop}\, \lambda \tag{8-1} $$

where S is the severity index representing the fatality rate associated with a flood of magnitude q, $n_{pop}$ is the number of people in the target location prior to the evacuation and R(A, τ) is the fraction of a population expected to remain unevacuated in the target location at a time τ after action A is taken, such that:

$$ R(A_E, \tau) = R(\tau), \qquad R(\bar{A}_E, \tau) = 1.0 \tag{8-2} $$

Evacuation cost is defined as:

$$ C = \left(1 - R(A, \tau)\right) n_{pop} \left(\alpha + \beta\tau\right) \tag{8-3} $$

where A is the action ($A_E$: evacuate; $\bar{A}_E$: don’t evacuate), α is the average estimated cost of evacuating an individual and β is the average value associated with one human hour that would be lost due to the disruption caused by an evacuation (assumed to end after τ time steps). The expected value (EV) of a given action per unit of population, in which flood damage and evacuation costs enter as negative values, can therefore be calculated by integrating over the range of forecasted discharge rates as

$$ EV(A, \tau) = -\int p(q, \tau)\, S(q)\, R(A, \tau)\, \lambda \, dq - \left(1 - R(A, \tau)\right)\left(\alpha + \beta\tau\right) \tag{8-4} $$

where p is the probability distribution function for discharge q at lead-time τ. The optimal decision at any given point in time during a rainfall event can thus be made by choosing the action that maximizes the expected value of the outcome with respect to A and τ. The expected value for both evacuate and don’t evacuate options is calculated and compared for every lead time up to a limit set by the flood forecast horizon. If the expected value is optimal for the evacuate option for any of these future time steps, an evacuation is ordered.

$$ \text{IF } EV(A_E, \tau) > EV(\bar{A}_E, \tau) \text{ for any } \tau = 1, 2, \ldots, \tau_{horizon} \text{ THEN choose } A_E, \quad \text{where } EV(\bar{A}_E, \tau) = \max\left[EV(A_E, \tau - 1),\; EV(\bar{A}_E, \tau - 1)\right] \tag{8-5} $$

8.1.5. Risk aversion

The decision model is developed above under the assumption that monetary costs are a suitable measure of value.
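Under that monetary-cost assumption, the expected value of Equation 8-4 and the recursion of Equation 8-5 can be sketched in Python as follows. This is an illustrative reading rather than the implementation used in this research. It assumes that R(τ) falls linearly from 1 at a lead time `r1` to 0 at a lead time `r0`, that the integral of p(q, τ) S(q) has already been evaluated as a single number `exp_severity`, and that the same value is reused as the best estimate at shorter lead times. The function names are hypothetical, and the numerical values are those quoted in the demonstration of Section 8.2.

```python
def remaining_fraction(tau, r1, r0):
    """R(tau): 1 for tau <= r1, 0 for tau >= r0, linear in between (assumed shape)."""
    if tau <= r1:
        return 1.0
    if tau >= r0:
        return 0.0
    return (r0 - tau) / (r0 - r1)


def ev_evacuate(tau, exp_severity, r1, r0, alpha, beta, lam):
    """Expected value of ordering evacuation with lead time tau, per unit of
    population. Costs are negative, so larger values are better."""
    r = remaining_fraction(tau, r1, r0)
    return -exp_severity * r * lam - (1.0 - r) * (alpha + beta * tau)


def ev_delay(tau, exp_severity, r1, r0, alpha, beta, lam):
    """Expected value of delaying: the best value available one step later."""
    if tau <= 0:
        # No lead time left: nobody can be evacuated, so R = 1.
        return -exp_severity * lam
    args = (exp_severity, r1, r0, alpha, beta, lam)
    return max(ev_evacuate(tau - 1, *args), ev_delay(tau - 1, *args))


def should_evacuate(tau, *args):
    """Decision test for a single lead time tau."""
    return ev_evacuate(tau, *args) > ev_delay(tau, *args)


# Expected severity 0.000625 at a 3-hour lead time; costs in yen.
costs = (10_000.0, 1_000.0, 50_000_000.0)  # alpha, beta, lambda
far_from_shelter = should_evacuate(3, 0.000625, 1.0, 2.5, *costs)
near_shelter = should_evacuate(3, 0.000625, 1.0, 2.0, *costs)
```

In this sketch an area needing up to 2.5 hours to evacuate is ordered to leave immediately, while an area that can be cleared in 2 hours can afford to wait one step for a better forecast. A full application would repeat the test for every lead time up to the forecast horizon, each with its own forecast distribution.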
Furthermore it should be noted that outcomes associated with death due to inundation, while likely to occur far less often than outcomes associated with evacuation false alarms, are extremely costly in comparison, especially considering that the costs, while measured in monetary terms, are in reality associated with loss of lives. The public are far more likely to forgive a series of evacuation false alarms than they are to forgive a one-off failure to issue an alarm which results in death. For these reasons a risk aversion strategy may be preferred by the authority responsible for issuing flood warnings. In such cases the authority may lean towards making decisions to order evacuations even when they are the less-than-optimal choice in terms of the expected value criterion.

For the case where the risk aversion can be assumed to arise from undesirable consequences associated with suffering a large one-off cost, a utility function (Figure 8-5) can be utilized to convert the cost of all possible outcomes, ranging from the worst $O_*$ through to the most desirable $O^*$, into their equivalent utility values as judged by the subjective views of the decision maker. The decision making process can then be carried out such that the action with the maximum expected value of utility is chosen as being the optimal solution from the viewpoint of the decision maker. The shape of the utility function is subjective and will vary between decision makers depending on their individual requirements. The method for constructing a utility function is presented in von Neumann and Morgenstern (1947).

Figure 8-5: Utility function (vertical axis from 0.0 to 1.0 against value ($), from the worst outcome $O_*$ to the most desirable outcome $O^* = 0$)

8.2. Demonstration of the evacuation decision framework

In order to demonstrate the value of the evacuation decision framework, it is used here for a hypothetical flood event occurring in the vicinity of the city of Mino. Mino is home to 24,100 residents in 7533 households (as at 2005).
The valley region located in the vicinity of the Mino discharge observation station at 35°32’58’’ N and 136°54’32’’ E is considered. The Nagara River traverses this valley region flowing north to south, with residences located along each bank. The areas within the region that are at risk of flooding are identified on a flood hazard map provided by Mino City Council (Figure 8-6). Potential flood levels that could be experienced due to bank failure or overtopping are given, and these are used as the basis for determining a set of severity curves for the region as described in Table 8-1, where the values of s0 and s1 denote the points between which the curves vary from a severity rating of zero through one. A severity level of zero indicates that conditions produced by the corresponding discharge at the adjacent river location carry no risk of taking life, and a severity level of one indicates conditions with the potential of taking the lives of all unevacuated residents remaining in the region. For example, areas given the extreme rating are judged to be at maximum risk for any discharge level exceeding 100% of the design discharge, and for this reason s0 = s1 = 100%. Conversely, it is recognized that in areas given the moderate rating, overtopping of river banks, although promoting dangerous conditions, will not cause conditions as severe as for locations with the extreme rating, where flood levels have the potential of exceeding a depth of 2.0m. For this reason s1 is set at 200% for moderate areas, which has the effect of creating a mildly sloping severity curve.

Each area is also rated in terms of estimates of the time required to evacuate residents from the area at risk of flooding as given in Table 8-2. The curve described by r0 and r1 recognizes that evacuation time will vary between residents depending on factors such as physical ability, access to transportation and preparedness.
Furthermore, it is recognized that there is likely to be a significant time lag between when the evacuation decision is made and when the warning reaches each resident in the area. For the example given here the initial cost associated with disrupting and evacuating an individual is assumed to be 10,000 yen, the average value associated with each human hour lost due to the evacuation is assumed to be 1000 yen, and the value associated with a human life is set at 50,000,000 yen.

Table 8-1: Severity curve parameters

Rating       Water depth     s0      s1
Extreme      2.0 – 5.0m      100%    100%
Very high    1.0 – 2.0m      100%    110%
High         0.5 – 1.0m      100%    120%
Moderate     0.0 – 0.5m      100%    200%

(Accompanying sketch: severity rises from 0.0 at s0 to 1.0 at s1, plotted against discharge as a percentage of qDesign.)

Table 8-2: Evacuation curve parameters

Rating    Distance to shelter    r1      r0
A         0.0 – 1.0km            1 hr    2 hr
B         1.0 km –               1 hr    2.5 hr

(Accompanying sketch: R, the fraction remaining, falls from 1.0 at r1 to 0.0 at r0, plotted against lead time τ in hours.)

Figure 8-6: Mino flood hazard map

Probabilistic flood forecast data for 1, 2 and 3-hour ahead forecasts made for Mino at hourly steps between midday and 15:00 are given in Table 8-3 for a hypothetical event. Although the example given considers only three forecast periods, a 6-hour ahead forecast would be used in the same manner. Probabilistic flood forecast data are provided in pdf and cdf formats as demonstrated in Chapter 7, and for the purpose of this example the forecasted cumulative probabilities of discharge not exceeding 100%, 105% and 110% of the design discharge at Mino are tabulated. The design water level at Mino is given at 6.60m, corresponding to a discharge of approximately 6750 m3/s. This event demonstrates a scenario where forecasts made at 12:00, 13:00 and 14:00 indicate a low yet significant probability that river banks will be overtopped.
Table 8-3: Probabilistic flood forecast data, Pq = P(Q≤q)

Forecast made at    Lead time    Forecast for    P100     P105     P110
12:00 (t=0)         1 hr         13:00           1.000    1.000    1.000
                    2 hr         14:00           1.000    1.000    1.000
                    3 hr         15:00           0.995    1.000    1.000
13:00               1 hr         14:00           1.000    1.000    1.000
                    2 hr         15:00           0.999    1.000    1.000
                    3 hr         16:00           0.95     0.98     0.999
14:00               1 hr         15:00           1.000    1.000    1.000
                    2 hr         16:00           0.999    1.000    1.000
                    3 hr         17:00           0.995    1.000    1.000
15:00               1 hr         16:00           0.999    1.000    1.000
                    2 hr         17:00           0.999    1.000    1.000
                    3 hr         18:00           1.000    1.000    1.000

P100, P105 and P110 are the cumulative probabilities P(q) at q100, q105 and q110, that is, at 100%, 105% and 110% of the design discharge. Design discharge at Mino: q100 = 6750 m3/s.

Using the severity curves and evacuation curves given in Table 8-1 and Table 8-2, together with Equation 8-4 and Equation 8-5, an evacuation decision can be made for each area within the proximity of the river cross-section adjacent to the Mino discharge observation station. Based on this information, the optimal decisions made for each location in the region are as follows.

12:00: Evacuation ordered for locations with severity ratings of high or greater located at a distance greater than 1km from a shelter.
13:00: Evacuation ordered for locations with a severity rating of moderate at a distance greater than 1km from a shelter, and for all remaining locations with severity ratings of very high or greater.
14:00: No further evacuation required; residents in locations with a severity rating of high and lower at a distance less than 1km from a shelter remain unevacuated.

In the decision made at 12:00 for locations at a distance less than 1km from a shelter the action of delaying evacuation one hour is taken.
As the decision model assumes a cost for each hour of disturbance due to evacuation, this option to delay the evacuation decision one hour is optimal based on the multi-stage decision model given in Figure 8-4: the estimated value of flood damage D for evacuation is unchanged if the evacuation is delayed until 13:00, as complete evacuation can be achieved in under 2 hours, but evacuation costs C = (α + βτ) are reduced slightly for the one hour delay as the time period is reduced from τ = 3 hours to τ = 2 hours. This is the correct decision considering that all residents from this area can still be evacuated in time based on an evacuation order given at 14:00 if the new forecast available at that time deems it necessary, and delaying the evacuation decision has the added advantage that the decision can be made based on new information, which may allow a false alarm to be avoided altogether. In the example given here the 3-hour ahead forecast made at 13:00 indicated a 5% probability that overtopping of river banks would occur at 16:00, thus in this case it does eventually become optimal to evacuate all locations at distances of greater than 1km from a shelter.

The decision for areas with severity rating high at a distance of greater than 1km from a shelter is demonstrated below:

$$ EV(A_E, 3) = -\int p(q,3)\, S(q)\, dq \cdot R(A_E, 3)\, \lambda - \left(1 - R(A_E, 3)\right)\left(\alpha + \beta \times 3\right) = -(10{,}000 + 1000 \times 3) = -13{,}000 \text{ yen} $$

$$ EV(\bar{A}_E, 3) = \max\left[EV(A_E, 2),\; EV(\bar{A}_E, 2)\right] $$

Here $p(q,3)$ is the best estimate for $p^*(q,2)$, such that:

$$ EV(A_E, 2) = -\int p(q,3)\, S(q)\, dq \cdot R(A_E, 2)\, \lambda - \left(1 - R(A_E, 2)\right)\left(\alpha + \beta \times 2\right) = -0.000625 \times 0.33 \times 50{,}000{,}000 - (1 - 0.33)(10{,}000 + 1000 \times 2) = -10{,}313 - 8{,}710 \approx -19{,}000 \text{ yen} $$

$$ EV(\bar{A}_E, 2) = \max\left[EV(A_E, 1),\; EV(\bar{A}_E, 1)\right] $$

$$ EV(A_E, 1) = EV(\bar{A}_E, 1) = -\int p(q,3)\, S(q)\, dq \cdot R(A_E, 1)\, \lambda = -0.000625 \times 1.0 \times 50{,}000{,}000 \approx -31{,}000 \text{ yen} $$

$$ EV(\bar{A}_E, 3) = \max\left[EV(A_E, 2),\; EV(\bar{A}_E, 2)\right] = \max\left[-19{,}000,\; -31{,}000\right] = -19{,}000 \text{ yen} $$

$EV(A_E, 3) > EV(\bar{A}_E, 3)$, and evacuation at 12:00 is optimal for the location considered.

At 13:00 the decision model suggests evacuation of residents from all remaining locations with severity ratings of very high or greater based on a 1 in 1000 chance of experiencing flooding at 15:00. This represents a very high probability that the evacuation will be a false alarm, but considering that flooding carries very high risk of death for these locations, this is not an unreasonable choice of action. In this way, the decision model demonstrates the ability to be more conservative in its approach toward areas that would suffer greatly due to flooding, and less conservative in dealing with areas where flooding would not be catastrophic.

8.3. Evacuation path planning using probabilistic information

Once a decision is made to evacuate a given location, it is necessary to give clear instructions as to where to evacuate to, and how to safely reach the evacuation shelter. Traditionally, residents living in areas at high risk of flooding have been educated as to the dangers of flooding and have been given advice as to where the nearest evacuation shelters are located should evacuation become necessary. While preparedness of this sort is invaluable for reducing confusion during flood evacuations, probabilistic modeling of urban flooding as discussed in the previous section can further improve the effectiveness of the evacuation effort through real-time flood hazard mapping and preparation of optimal evacuation routes based on flood risk. Once an evacuation order is issued, safe and rapid evacuation becomes the focus of the decision making.
As discussed in the previous section, depending on the urban flood modeling strategy used for the region at risk, probabilistic forecasts of flood inundation levels across the region can be provided together with an estimate of the severity that the range of possible inundation levels would have for unevacuated residents (Figure 8-1). The risk to a resident remaining in or moving through a given location at a given lead time is defined here as the integral of the product of the PDF of inundation levels (Figure 8-1(a)) and the severity curve (Figure 8-1(b)):

\[ \mathrm{risk}(\tau) = \int p(q,\tau)\,S(q)\,dq \tag{8-6} \]

In this way, the risk at lead time τ for a given location ranges from zero, for a location that will be completely safe at this lead time, to one, for a location that will experience extreme inundation. A plot of the risk across the region being considered can be made, and an optimal evacuation path can be chosen such that evacuees travel between their current locations and a designated evacuation shelter by traversing locations with the lowest risk rating. When choosing between multiple paths, the location at highest risk along each path is identified and the path for which this value is lowest is chosen. This is demonstrated in the conceptual flood risk map given in Figure 8-7, where the optimal route in terms of lowest risk to the evacuee is calculated in real time and may not necessarily be the shortest route to the evacuation shelter. Depending on the time required for evacuation, it may be necessary to consult flood risk maps generated for multiple lead times when choosing an evacuation path.

Figure 8-7: Conceptual flood risk maps for real-time evacuation path planning

It is currently technologically feasible in Japan and many other countries to have mobile phone handsets pinpoint an individual’s location using GPS satellites and communicate this location to a central emergency service.
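The risk definition of Equation 8-6 and the path selection rule described in this section (choose the path whose highest-risk location is lowest) can be illustrated with a short sketch. This is a minimal illustration under stated assumptions and is not the implementation used in this research: the gridded risk map, the 4-neighbour connectivity, the function names `location_risk` and `min_max_risk_path`, and all numerical values are hypothetical. The integral is approximated by the trapezoidal rule over discretized inundation levels, and the path is found with a Dijkstra-style search in which the cost of a path is the maximum risk encountered along it.

```python
import heapq


def location_risk(levels, pdf, severity):
    """Approximate risk = integral of p(q, tau) * S(q) dq (trapezoidal rule)."""
    total = 0.0
    for i in range(len(levels) - 1):
        f0 = pdf[i] * severity[i]
        f1 = pdf[i + 1] * severity[i + 1]
        total += 0.5 * (f0 + f1) * (levels[i + 1] - levels[i])
    return total


def min_max_risk_path(risk, start, goal):
    """Return (worst_risk, path) minimizing the highest risk met along the path.

    risk is a 2-D list of per-cell risk values in [0, 1]. Cells are linked to
    their 4 neighbours, which is an assumption made for this sketch.
    """
    rows, cols = len(risk), len(risk[0])
    best = {start: risk[start[0]][start[1]]}
    parent = {start: None}
    heap = [(best[start], start)]
    while heap:
        worst, cell = heapq.heappop(heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return worst, path[::-1]
        if worst > best[cell]:
            continue  # stale queue entry, a better route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                cand = max(worst, risk[nr][nc])
                if cand < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = cand
                    parent[(nr, nc)] = cell
                    heapq.heappush(heap, (cand, (nr, nc)))
    return None  # goal not reachable


# Hypothetical 3 x 3 risk map: evacuee at (0, 0), shelter at (0, 2).
risk_map = [
    [0.0, 0.6, 0.0],
    [0.1, 0.9, 0.0],
    [0.1, 0.2, 0.0],
]
worst, path = min_max_risk_path(risk_map, (0, 0), (0, 2))
# The direct route crosses a cell with risk 0.6; the longer detour returned
# here has a worst-case risk of 0.2.
```

In this hypothetical map the selected route is longer than the direct one, which mirrors the observation that the lowest-risk route is not necessarily the shortest. Maps for several lead times could be handled by calling the search once per map, although how these would be combined is left open here.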
Ideally, the probabilistic flood forecasting system developed in this research could be used, as discussed in this chapter, as the backbone of a flood warning and evacuation support service capable of supplying mobile phone handsets with a map of current and forecasted inundation levels and with evacuation directions automatically generated and updated in real time based on an individual’s location and specific requirements.

8.4. Conclusion

A decision support system for making evacuation decisions using probabilistic distributed flood forecasts is proposed and demonstrated. The system provides timely evacuation orders tailored to each area within the watershed based on potential inundation levels for the area and the distance and corresponding evacuation time to the nearest available shelter. The risk to each area is considered through estimating the probability with which inundation levels will occur for each forecast lead time, and the severity associated with each of these inundation levels. Three strategies are discussed for estimating inundation levels in urban areas using probabilistic forecasts of discharge within adjacent rivers.

A benefit of the approach presented here is that it provides a framework for choosing an acceptable level of risk for each area that may be tolerated prior to issuing an evacuation order. Informed decisions based on the additional information offered by probabilistic forecasts become possible by considering not only the probability of flooding, but also the potential for loss of life based on the geography of the region concerned.

An application of the system was demonstrated for a hypothetical discharge event considering a region of the city of Mino. Although overtopping of embankments did not occur for this event, evacuation was deemed necessary for several areas in the region based on forecasts suggesting up to a 1 in 20 chance of flood occurrence.
False alarms were avoided for areas that could be quickly evacuated or where the risk posed by the flood was considered low.

9. CONCLUSION

A framework for probabilistic forecasting of short-term distributed runoff conditions within a watershed has been proposed. The probabilistic forecast has been developed by dividing the various uncertainties inherent in a flood forecast into precipitation uncertainty, which covers errors associated with distributed rainfall forecasts, and hydrologic uncertainty, which covers model structure, model parameterization and model input errors. These uncertainties have been modeled through the use of a distributed rainfall-runoff model and a Monte Carlo simulation, with the resulting forecast presented in the form of a cumulative distribution function for each required forecast lead time and location.

The distributed rainfall-runoff model Hydro-BEAM is described and its value in real-time flood modeling and forecasting is demonstrated. The model is calibrated to the Nagara River watershed for the purpose of demonstrating applications of the various components of the flood forecasting system. Although Hydro-BEAM has been used in the past predominantly for long-term water quality and quantity simulations, its structure has been modified to suit real-time flood forecasting.

As observed and forecasted rainfall is the major input to Hydro-BEAM, and as forecasted rainfall is generally considered to be the major source of uncertainty (referred to as precipitation uncertainty) in the runoff forecast, considerable attention is given to the task of modeling the short-term dynamics of rainfall fields. A time series analysis model is coupled with a translation vector model to stochastically simulate the movement of rainfall fields, and time series and statistical models for growth-decay are also developed.
A Monte Carlo simulation based on these models is developed to simulate the range of possible future rainfall patterns that may develop from recently observed rainfall field dynamics. These rainfall pattern time series are input into Hydro-BEAM to allow an ensemble of future runoff conditions to be simulated considering the effects of precipitation uncertainty.

Although uncertainty in the rainfall forecast is largely responsible for error in forecasting runoff, especially at long lead times, other factors such as limitations associated with the rainfall-runoff model, calibration errors, and errors in radar rainfall observation are also responsible for considerable errors in runoff modeling. Two methods for using real-time discharge observations to reduce this type of error resulting from hydrologic uncertainty are developed to be compatible with a distributed rainfall-runoff model. The first of these methods is an adaptive updating technique which compares discharge observation data available at observation stations within a watershed with Hydro-BEAM simulated hydrographs at those same locations, and adjusts discharge levels throughout the watershed to bring simulated results closer to the observed data while maintaining continuity of discharge rate along the river network. This technique is found to effectively reduce modeling error for several hours into the future in the case of the Nagara River watershed.

A second, alternative technique for reducing forecasting error associated with hydrologic uncertainty is developed using AI models to perform data mining on historically observed and simulated data sets. Genetic programming (GP) is used to search for functions capable of describing the relationship between recently observed data and future modeling errors.
The use of a self-organizing map (SOM) is introduced in an attempt to cluster large historical data sets into meaningful groupings of data such that GP can be used to develop a range of functions suited to different hydrograph characteristics. Although the SOM remains promising for this purpose, its use is not recommended in the case of the Nagara River due to the limited amount of training data available for this watershed.

The AI-based error correction technique relies on real-time discharge observation data to forecast and correct future modeling error, and as such it is only applicable at locations in a watershed where discharge observation stations are located. For this reason a complementary technique is developed to interpolate and extrapolate forecast results at observation locations to other locations within the watershed. Global and local linear modeling approaches are investigated for identifying relationships between hydrographs at observation stations and non-observation locations. It is found that global linear modeling allows accurate simulations of discharge rates to be made for all locations in a watershed that lie within approximately 15 kilometers of a discharge observation station.

An example application is carried out in which the Monte Carlo simulation of rainfall-runoff considering precipitation uncertainty is coupled with the adaptive updating technique for reducing modeling error due to hydrologic uncertainty. A method for considering both forms of uncertainty in providing a comprehensive probabilistic forecast of discharge for all locations in a watershed is developed, with results presented in the form of cumulative distribution functions of discharge rates for a 6-hour lead time. As would be expected, the uncertainty in the discharge forecast, as indicated by the variance of the ensemble distribution, increases with increasing lead time.
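To illustrate how an ensemble of simulated discharges can be summarized as a cumulative distribution function, the following minimal sketch builds an empirical CDF for one location and evaluates the probability of exceeding a threshold. It is an illustration under stated assumptions and not the procedure used in this research: the function names `empirical_cdf` and `exceedance_probability`, the equal weighting of ensemble members, and the discharge values and units are all hypothetical.

```python
from bisect import bisect_right
from statistics import pvariance


def empirical_cdf(members):
    """Return F(q): the fraction of ensemble members with discharge <= q."""
    ordered = sorted(members)
    n = len(ordered)

    def cdf(q):
        return bisect_right(ordered, q) / n

    return cdf


def exceedance_probability(members, threshold):
    """Probability that discharge exceeds threshold (equal member weights assumed)."""
    return 1.0 - empirical_cdf(members)(threshold)


# Hypothetical discharge ensembles (m^3/s) at one location for two lead times (hours).
ensemble_by_lead_time = {
    1: [980.0, 1000.0, 1010.0, 1020.0],
    6: [700.0, 950.0, 1100.0, 1400.0],
}

# Ensemble variance per lead time, used here as a simple measure of forecast spread.
spread = {tau: pvariance(q) for tau, q in ensemble_by_lead_time.items()}

# Chance of exceeding a hypothetical 1000 m^3/s threshold at the 6-hour lead time.
p_exceed_6h = exceedance_probability(ensemble_by_lead_time[6], 1000.0)
```

With these made-up values the 6-hour ensemble is much more widely spread than the 1-hour ensemble, which is the qualitative behaviour described above. A practical ensemble would contain far more members than the four used here.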
As one example of a use for a probabilistic distributed flood forecasting system, the problem of issuing evacuation orders during flood risk periods is considered. An engineering decision-making approach is discussed which aims to minimize losses due to false evacuation alarms and deaths due to floods by making evacuation decisions and proposing evacuation routes that maximize the expected value of the outcome.

There is a great need for a flood forecasting system such as the one presented here that can provide a clear picture of potential future flood risks at all locations within a watershed. Such information is valuable not only in planning evacuations, but also in operating hydraulic equipment for flood mitigation during times of emergency with the goal of minimizing losses across an entire watershed.