Nonhomogeneous Hidden Markov Models Allowing Stochastic Downscaling of Synoptic Atmospheric Patterns to Local Hydrologic Phenomena

Peter Guttorp¹ and James P. Hughes²

Abstract.- A model for multistation precipitation, conditional on synoptic atmospheric patterns, is presented. It postulates the existence of an unobserved weather state which serves as a link between the large-scale atmospheric measures and the small-scale spatially discontinuous precipitation field. The weather state process is assumed to be conditionally Markov, given the atmospheric data. The rainfall process is then conditionally specified given the weather state. Various parameterizations for the weather state process and the rainfall process are discussed and a likelihood based estimation procedure is described. As an example the model is fit to a twenty station network of rain gauge stations in western Washington. We discuss the NHMM as a method of relating synoptic atmospheric data, such as the output of a general circulation model, to rainfall and other local (sub-grid scale) hydrologic processes, and outline possible extensions to continental scales.

INTRODUCTION

Stochastic downscaling of GCM data involves building a stochastic model to relate historical measurements of synoptic circulation to historical regional or local climate data (e.g. precipitation). Then, using GCM circulation data as input, the model can be used to produce simulations at the regional or local level. The strength of this procedure lies in its relative simplicity and computational ease compared to the difficult task of building a nested GCM. However, the procedure assumes that the relationships (between circulation and regional climate) identified in the historical data will be preserved under the alternative climate scenarios produced by the GCM.
Initial attempts at stochastic downscaling utilized "weather state" models (e.g. McCabe et al. (1989), Hay et al. (1991), Wilson et al. (1992), Bardossy and Plate (1992)). In these models, each day is explicitly classified into one of a few discrete categories (weather states) based on the synoptic circulation pattern for that day. Then separate rainfall models are fit conditional on the weather state. Hughes et al. (1993) developed such a model using historical circulation and rainfall and then applied the results to data from a GCM 2 x CO2 run to describe rainfall patterns, streamflow and flood frequency under the altered climate.

¹ Department of Statistics, Box 354322, University of Washington, Seattle, WA 98195-4322.
² Department of Biostatistics, University of Washington, Seattle, WA.

A difficulty with the weather state models as defined above is that the weather states must be explicitly defined a priori. A poor choice of the weather states will lead to a poor model. As an alternative, Zucchini and Guttorp (1991) proposed a hidden Markov model (HMM) as a model for precipitation. The HMM presupposes the existence of a few discrete weather states but it assumes that these states cannot be observed directly. Instead, only rainfall is observed and the rainfall is assumed to be conditionally temporally independent given the weather state. In this model the weather states are effectively defined by the rainfall process. The states chosen are those that separate the rainfall process into different spatial or temporal patterns.

Formally, let $R_t$ be the measurement (typically multivariate) of the regional or local process (e.g. rainfall occurrence) at time $t$ and $S_t$ be the weather state at time $t$. Then the hidden Markov model is defined by the assumptions

$$P(R_t \mid S_1^T, R_1^{t-1}) = P(R_t \mid S_t) \tag{1}$$

$$P(S_t \mid S_1^{t-1}) = P(S_t \mid S_{t-1}) \tag{2}$$

where the notation $S_1^T$ means all values of $S_t$ from time 1 to $T$.
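The two HMM assumptions can be illustrated with a short simulation. The sketch below is not taken from the paper: the two-state transition matrix, the three-station rain probabilities, the arbitrary initial state, and the names `simulate_hmm`, `TRANS` and `RAIN_PROB` are all hypothetical, and stations are assumed to be independent given the weather state purely for simplicity.

```python
import random

def simulate_hmm(trans, rain_prob, n_days, seed=0):
    """Simulate hidden weather states S_t and rain occurrences R_t.

    trans[i][j]     : P(S_t = j | S_{t-1} = i)   (illustrative values)
    rain_prob[s][k] : P(rain at station k | S_t = s); stations are assumed
                      independent given the state (an assumption of this sketch)
    """
    rng = random.Random(seed)
    n_states = len(trans)
    state = rng.randrange(n_states)  # arbitrary initial state (assumption)
    states, rain = [], []
    for _ in range(n_days):
        # Markov step: today's state depends only on yesterday's state.
        state = rng.choices(range(n_states), weights=trans[state])[0]
        # Emission step: today's rainfall depends only on today's state.
        rain.append([int(rng.random() < p) for p in rain_prob[state]])
        states.append(state)
    return states, rain

# Hypothetical parameters: state 0 is "wet", state 1 is "dry".
TRANS = [[0.8, 0.2], [0.3, 0.7]]
RAIN_PROB = [[0.9, 0.8, 0.85], [0.1, 0.2, 0.05]]

states, rain = simulate_hmm(TRANS, RAIN_PROB, n_days=1000)
```

In an actual application only `rain` would be observed; `states` is returned here so that the simulated rainfall can be compared with the hidden sequence that generated it.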
The term hidden refers to the fact that the weather state process, $S_t$, is an unobserved quantity and is not defined a priori (although the weather state for each day may be identified after the model is fit). Nonetheless, Zucchini and Guttorp (1991) found that the circulation patterns associated with particular weather states were interpretable. In effect, the model acts as an automatic classifier of circulation patterns into weather states that are associated with particular precipitation patterns.

The major drawback of the HMM approach is that it fails to incorporate synoptic circulation information in the definition of the weather states. The weather states are defined solely on the basis of rainfall. Thus, the model cannot be used for downscaling. However, an extension to the HMM which does incorporate atmospheric information and can be used for downscaling was developed by Hughes and Guttorp (1994a). They termed their model a nonhomogeneous hidden Markov model (NHMM). Letting $X_t$ be the measurement (or summary) of the atmospheric data at time $t$, the NHMM is defined by the assumptions

$$P(R_t \mid S_1^T, R_1^{t-1}, X_1^T) = P(R_t \mid S_t) \tag{3}$$

$$P(S_t \mid S_1^{t-1}, X_1^T) = P(S_t \mid S_{t-1}, X_t) \tag{4}$$

The first assumption says that the local or regional process, $R_t$, is conditionally temporally independent given the current weather state. In other words, the regional process is driven by the weather state process. The second assumption states that the (unobserved) weather state process is Markov with transition probabilities depending on the current values of the atmospheric information. Various parameterizations are possible for $P(S_t \mid S_{t-1}, X_t)$ and $P(R_t \mid S_t)$ and maximum likelihood techniques are available to estimate the parameters. Hughes and Guttorp (1994a) show that both the explicit weather state models mentioned previously and the hidden Markov model are special cases of the more general NHMM.
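To make the nonhomogeneous transition probabilities concrete, the following sketch computes one row of $P(S_t \mid S_{t-1}, X_t)$ under an assumed multinomial-logit parameterization. This form is chosen only for illustration and is not necessarily the parameterization used by Hughes and Guttorp (1994a); the function `transition_row`, the parameters `A` and `B`, and the two-component atmospheric summary are hypothetical.

```python
import math

def transition_row(prev_state, x, a, b):
    """Return [P(S_t = j | S_{t-1} = prev_state, X_t = x) for each state j].

    a[i][j] : baseline score for moving from state i to state j
    b[j]    : coefficients linking the atmospheric summary x to state j
    Both are hypothetical parameters of an assumed multinomial-logit form.
    """
    scores = [a[prev_state][j] + sum(bk * xk for bk, xk in zip(b[j], x))
              for j in range(len(b))]
    m = max(scores)                        # subtract the max for numerical stability
    w = [math.exp(s - m) for s in scores]
    total = sum(w)
    return [wi / total for wi in w]

# Hypothetical two-state example; x = (pressure anomaly, height gradient anomaly).
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.0, 0.0], [-1.5, 0.5]]

row_high_pressure = transition_row(0, [2.0, 0.0], A, B)
row_low_pressure = transition_row(0, [-2.0, 0.0], A, B)
```

Setting every coefficient in `B` to zero removes the dependence on `x`, so the transition matrix is the same every day. This is one way to see how the homogeneous HMM arises as a special case of the NHMM.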
Hughes and Guttorp (1994a) applied the NHMM to 24 years of winter data from a four station rain gauge network in Washington state using the following model for $P(R_t \mid S_t)$:

$$P(R_t = r \mid S_t = s) = \prod_{i=1}^{n} p_{si}^{r_i} (1 - p_{si})^{1 - r_i} \tag{5}$$

where $r_i$ is 1 if rain occurs at station $i$ and 0 otherwise, and $p_{si}$ is the probability of rain at station $i$ in weather state $s$. This model assumes that the probability of rain at each station depends only on the weather state and is independent of the other stations. For the network analyzed by Hughes and Guttorp (1994a) this assumption was reasonable since the rain stations were widely separated. Note that such an assumption does not imply that the rainfall occurrence processes at the stations are uncorrelated. Hughes and Guttorp (1994a) found strong (unconditional) correlations, on the order of 0.4 to 0.5, between the stations. This correlation is induced by the common weather state.

In Hughes and Guttorp (1994b), the NHMM was applied to 21 years of winter data in a densely packed network of 24 stations in the Puget Sound region of western Washington state. The assumption of spatial independence (even conditional on the weather state) was no longer tenable. Therefore, the following model was used for $P(R_t \mid S_t)$:

$$P(R_t = r \mid S_t = s) \propto \exp\left( \sum_{i} \alpha_{si} r_i + \sum_{i<j} \beta_s \frac{r_i r_j}{d_{ij}} \right) \tag{6}$$

where $d_{ij}$ is the distance between stations $i$ and $j$. The parameters $\beta_s$ measure the spatial dependence between stations. When $\beta_s$ is positive, stations $i$ and $j$ are positively correlated (within weather states) and the correlation decreases inversely with distance. A negative value for $\beta_s$ implies negative correlation between stations $i$ and $j$ (within weather states). When each $\beta_s$ is 0, equation (6) reduces to the independence model (equation (5)) with $\alpha_{si} = \log(p_{si}/(1 - p_{si}))$. As before, maximum likelihood may be used for parameter estimation. However, for large numbers of stations the constant of proportionality in equation (6) is difficult to evaluate, and Hughes and Guttorp (1994b) describe an alternative estimation strategy for that situation.
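The role of the constant of proportionality can be seen in a brute-force sketch for a single weather state. It assumes the spatial model has weights of the form $\exp(\sum_i \alpha_i r_i + \sum_{i<j} \beta\, r_i r_j / d_{ij})$, which matches the description above but is an assumption about the exact form. The function `autologistic_pmf`, the three-station distance matrix and all parameter values are hypothetical.

```python
import itertools
import math

def autologistic_pmf(alpha, beta, dist):
    """Return {r: P(R_t = r | S_t = s)} over all occurrence patterns r,
    for one weather state s with station effects alpha and dependence beta."""
    n = len(alpha)
    patterns = list(itertools.product([0, 1], repeat=n))

    def log_weight(r):
        main = sum(alpha[i] * r[i] for i in range(n))
        pair = sum(beta * r[i] * r[j] / dist[i][j]
                   for i in range(n) for j in range(i + 1, n))
        return main + pair

    weights = [math.exp(log_weight(r)) for r in patterns]
    z = sum(weights)  # normalizing constant: a sum over 2**n patterns
    return {r: w / z for r, w in zip(patterns, weights)}

# Hypothetical three-station example (distances in arbitrary units).
P_RAIN = [0.7, 0.5, 0.3]
ALPHA = [math.log(p / (1 - p)) for p in P_RAIN]
DIST = [[0, 10, 25], [10, 0, 20], [25, 20, 0]]

independent = autologistic_pmf(ALPHA, 0.0, DIST)   # beta = 0: independence model
dependent = autologistic_pmf(ALPHA, 5.0, DIST)     # beta > 0: positive dependence
```

The normalizing sum runs over 2**n patterns, which is trivial for three stations but involves roughly 1.7 x 10^7 terms for a 24 station network, for every weather state and every likelihood evaluation. This cost is what motivates an alternative estimation strategy for dense networks.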
When applied to the 24 station network mentioned above, and based on mean sea-level pressure and the north-south gradient of 500 mb geopotential height, the model identified five weather states. State 1 had a southwesterly flow over the Puget Sound region, both at the surface and aloft, resulting in high precipitation probabilities everywhere. State 2 had westerly surface flow, but southwest flow aloft. There was substantial rainfall in the north and south, but relatively low probability of rain in the central part of the region. State 3 was characterized by southwesterly flow at sea level and westerly flow aloft, with fairly weak surface winds. The stations bordering Puget Sound were in a rain shadow, while more easterly stations had high precipitation probabilities. State 4 was characterized by a high pressure system over the region, with consequent low precipitation rates. Finally, state 5 was similar to state 2, but with a more westerly 500 mb flow. The rainfall pattern was similar to that in state 2.

Statistics of the historical rainfall process (i.e. probability of rain, spatial and temporal correlations, storm duration distribution) were generally well reproduced, although stations at the periphery of the network or isolated stations tended to exhibit a predicted (simulated) duration distribution which was lighter-tailed than that which was actually observed.

MODELING PRECIPITATION ON LARGER SCALES

The NHMM has not yet been applied on subcontinental or continental scales. Some modification of the model will be necessary in this case since, in the present configuration, $S_t$ is a finite scalar quantity which is meant to describe the weather state at time $t$ over a (meteorologically) homogeneous region. One possibility would be to identify $K$ distinct precipitation/meteorological regions and then allow $S_t$ to be a $K$-variate vector where the $k$-th element corresponds to the weather state in the $k$-th region.
The model for $P(S_t \mid S_{t-1}, X_t)$ therefore becomes a model for a multivariate Markov process. The simplest such model is

$$P(S_t \mid S_{t-1}, X_t) = \prod_{i=1}^{K} P(S_t^{(i)} \mid S_{t-1}^{(i)}, X_t) \tag{7}$$

which postulates that each regional process evolves independently, conditional on the observed atmospheric process, $X_t$. Presumably, one would also want to assume that the rain stations in each region depend only on that region's hidden state:

$$P(R_t \mid S_t) = \prod_{i=1}^{K} P(R_t^{(i)} \mid S_t^{(i)}) \tag{8}$$

where $R^{(i)}$ denotes the set of rain stations in region $i$. Note that this assumption of spatial independence between regions does not preclude the use of a model for spatial dependence within each region.

Unfortunately, it seems unlikely that the situation will be as simple as the (conditional) regional independence model envisioned in (7). It is possible that some enhancement in skill can be gained by coupling weather states in each region with weather states in other regions. For instance, the movement of large scale storm fronts across regions would suggest that yesterday's weather state in one region could provide information about today's weather state in another region. On the other hand, if $N = \prod_{k=1}^{K} \dim(S^{(k)})$ is the total number of states in the $K$-variate process $S_t$, it is not computationally feasible to attempt to fit the entire transition matrix, $P(S_t \mid S_{t-1}, X_t)$, with its $N(N-1)$ free elements. A reasonable compromise between these extremes might be to assume that today's state in region $i$ depends only on yesterday's state in the regions surrounding region $i$. That is,

$$P(S_t \mid S_{t-1}, X_t) = \prod_{i=1}^{K} P(S_t^{(i)} \mid S_{t-1}^{M(i)}, X_t) \tag{9}$$

where $M(i)$ is read "the neighborhood of $i$". Basic meteorologic principles and knowledge of the regions under consideration should be used to determine which regions constitute the neighborhood of region $i$.

Another approach to modeling continental scale precipitation would be to divide the atmospheric variables into region-specific measures (call these $X_t^{(i)}$ for region $i$) and (sub)continental measures (call these $X_t^{c}$) and to postulate the existence of an unobserved continental scale weather state (call this $C_t$).
Under these assumptions, one might write

$$P(S_t, C_t \mid S_{t-1}, X_t) = P(C_t \mid X_t^{c}) \prod_{i=1}^{K} P(S_t^{(i)} \mid S_{t-1}^{(i)}, X_t^{(i)}, C_t) \tag{10}$$

In other words, the weather state in region $i$ today depends on the weather state in region $i$ yesterday, the atmospheric data in region $i$ today and the overall continental weather state today. The probability distribution of the continental weather state is completely defined by the continental atmospheric variables.

SUMMARY

Simulations of precipitation conditional on a set of atmospheric data can be used to downscale the output of a GCM into local precipitation, and to assess the regional and local effects of climate change scenarios (by conditioning the simulations on the output of GCM runs assuming altered climate scenarios). This does, of course, require the untestable assumption that precipitation reacts to the altered climate in the same fashion as it does to the historical climate. In other words, we must assume that the climate change affects only the distribution of weather states, rather than which weather states are appropriate.

REFERENCES

Bardossy, A. and E. J. Plate (1992) Space-time models for daily rainfall using atmospheric circulation patterns. Water Resour. Res., 28, 1247-1259.

Hay, L., G. J. McCabe, D. M. Wolock, and M. A. Ayers (1991) Simulation of precipitation by weather type analysis. Water Resour. Res., 27, 493-501.

Hughes, J. P., D. P. Lettenmaier, and P. Guttorp (1993) A stochastic approach for assessing the effects of changes in regional circulation patterns on local precipitation. Water Resour. Res., 29, 3303-3315.

Hughes, J. P. and P. Guttorp (1994a) A class of stochastic models for relating synoptic atmospheric patterns to regional hydrologic phenomena. Water Resour. Res., 30, 1535-1546.

Hughes, J. P. and P. Guttorp (1994b) Incorporating spatial dependence and atmospheric data in a model of precipitation. J. Applied Meteor., 33, 1503-1515.

McCabe, G. J., L. E. Hay, M. A. Ayers, and D. M. Wolock (1989) Assessment of climate change using weather type analysis.
Presented at the National Conference on Hydraulic Engineering, Am. Soc. of Civ. Eng., New Orleans, La., Aug. 14-18.

Wilson, L. L., D. P. Lettenmaier, and E. Skyllingstad (1992) A hierarchical stochastic model of large-scale atmospheric circulation patterns and multiple station daily precipitation. J. Geophys. Res., 97, 2791-2809.

Zucchini, W. and P. Guttorp (1991) A hidden Markov model for space-time precipitation. Water Resour. Res., 27, 1917-1923.