Statistical hydro-ecological models
Mike Dunbar, National Hydroecology Technical Advisor
mike.dunbar@environment-agency.gov.uk
August 2013 (Statistics for Environmental Evaluation 2004)

Structure
• (About me)
• Statistical modelling using monitoring data
• Hydroecology: river flows and ecological response
• River ecology and land management stressors

Some history
Mid-1980s
• Brookes: quantified the massive extent of river channelisation in England and Wales
• More focus on flows downstream of dams
• Roll-out of national bioassessment methods
1990s
• Addressing site-specific low flow problems
• Development of River Habitat Survey
• Growing interest in river restoration/rehabilitation
• Development of LIFE metric (see later)
2000s
• European Water Framework Directive
• Importance of hydromorphology (made-up word) increasingly recognised
• DRIED-UP project (basis of this talk)

EU Water Framework Directive: England (2012)
[Bar chart: percentage of waterbodies assigned a reason for failure attributed to each pressure, axis 0% to 60%. Pressures listed: Physical Modification; Phosphate; Abstraction and Flow; Dissolved Oxygen; Sediment; Ammonia; Specific Pollutants; BOD; Pressure on Ground Water; Priority Substances; Alien Species; pH; Nitrate; Temperature; Fish Stocking; Other Pollutants; Still under investigation]

Where it all started for me
• Extence, Balbi and Chadd (1999)
• Dunbar and Clarke, 2002 (2005?)
Centre for Ecology and Hydrology – Mike Dunbar

More context
• Desperate need to ‘upscale’ our detailed knowledge spatially and temporally for it to be useful for river management
It’s generally well known that
• Physical environment affects river and stream biota
• Biota have definable niches for physical microhabitat as well as water quality
• Distribution of biota is related to catchment characteristics
• Multiple pressures are the norm
How to upscale: use national datasets
• Macroinvertebrate biological monitoring
• River Habitat Survey

Indicator organisms: Macroinvertebrates
Taxa shown: Rhyacophila, Simulium, Perla, Caenis, Sericostoma, Lymnaea, Sigara, Gerris
• Fast velocity water, clean gravel / cobble substrates
• Slow / still water and / or silty substrates

LIFE
\[ \mathrm{LIFE} = \frac{\sum f(\text{flowgroup}, \text{abundance})}{n} \]
Where n = number of different taxa in sample
Groups based on a huge literature survey (which I didn’t do)

                           Abundance in sample
Group                      A (1-9)   B (10-99)   C (100-999)   D (1000+)
I    Rapid                    9         10           11            12
II   Moderate/fast            8          9           10            11
III  Slow/sluggish            7          7            7             7
IV   Flowing/standing         6          5            4             3
V    Standing                 5          4            3             2
VI   Drought resistant        4          3            2             1

Standard sampling method
• Assess habitat
• 3-minute kick/sweep sample
• 1-minute hand search
• Sample processing

DRIED-UP
Distinguishing the Relative Importance of Environmental Data Underpinning flow Pressure assessment
• Four R&D phases so far (DU1-4)
• Mainly funded by Environment Agency, some contribution from NERC/CEH and EU
• Two papers (DU1 and DU2), about three reports
• Currently undergoing testing in the EA

[Figure: flow (discharge, m³/s) and LIFE score plotted against year, 1986 to 2000; samples marked Sp and Au]

[Figure: LIFE_F plotted against Q10z, one panel per site, panels labelled by site ID]

[Figure: original data, panels of LIFE against normalised flow]

Analysis
Data
• Using subset of Environment Agency historical macroinvertebrate monitoring data
  • Extensively screened for water quality impacts
• Modelled historical daily flows where gauges not available
• Physical habitat quantified by a River Habitat Survey
Biotic index
• LIFE, in the manner of other biotic indices
• Relate preceding flows to the LIFE score for each sample
Explanatory variables
• Flow magnitudes, statistics of flows preceding sample (http://www.ceh.ac.uk/data/nrfa/)
• River Habitat Survey: Habitat Modification, Habitat Quality

[Figure: flow (m³/s) and LIFE score time series, 01/01/95 to 01/01/04]

Examples of sites

Multilevel statistical models
• Also called mixed-effects, or hierarchical
• Extension of linear regression to hierarchically
structured data
• Very common in social sciences, educational and medical statistics
• Not very common in environmental sciences

Multilevel / hierarchical approach
• Terminology: sample i (level 1), nested within site j (level 2)

Problems with alternative approaches
Site-by-site
• You need a surprisingly large amount of biological data to model the LIFE-flow relationship for a site
• Particularly if you are interested in response to different flow variables
• So site-specific flow-biology relationships can be highly uncertain (and misleading)
• If multiple flow variables are “tested”, this uncertainty is even greater than you think
Ignore group structure
• Weak, unrealistic models
• Unsuitable for prediction
• Can’t handle multi-level predictors

Common patterns
• Both high (Q10) and low (Q95) flow magnitudes influence LIFE score
• Autumn samples more sensitive to high flow magnitude
• Extent of resectioning decreases LIFE score
• Extent of resectioning increases response of LIFE to low flow magnitude
• Year trend: upwards, varies by site

DRIED-UP 1: 2005
[Figure: panels a and b, LIFE score (species) against flow (Q95) z-scores. Data from 11 sites in the East Midlands]

DRIED-UP 3: 2010
[Figure: three panels of LIFE score (family) against normalised antecedent low flows (Q95): influence of HMSRS on response (upland sites) and influence of HMSRS on response (lowland sites), each with lines at 0%, 33%, 66% and 100%; and modelled response of each individual site]
Modelled mean response of LIFE score to Q95z for upland and lowland sites as mediated by HMSRS, and response of each individual site. Percentages are of the maximum HMSRS score observed in the dataset. NB: model fitted excluding normalised Q10 term.
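
The contrast between site-by-site fitting and the multilevel approach can be illustrated with a toy random-intercept calculation. This is only a minimal sketch and not the DRIED-UP model itself: the function name, the LIFE scores and the two variance components are all invented for illustration, the variances are treated as known (a real multilevel fit estimates them), and the overall mean is taken as the simple mean of all samples.

```python
from statistics import mean


def shrunken_site_means(samples_by_site, between_var, within_var):
    """Partial-pooling estimate of each site's mean LIFE score.

    between_var: assumed variance of true site means around the overall mean.
    within_var:  assumed sample-to-sample variance within a site.
    Both are hypothetical inputs here, not estimated from data.
    """
    all_scores = [s for scores in samples_by_site.values() for s in scores]
    grand_mean = mean(all_scores)
    estimates = {}
    for site, scores in samples_by_site.items():
        n = len(scores)
        # Weight on the site's own mean; approaches 1 as n grows.
        weight = between_var / (between_var + within_var / n)
        estimates[site] = grand_mean + weight * (mean(scores) - grand_mean)
    return estimates


# Invented LIFE scores, for illustration only.
life = {
    "site_A": [7.9, 8.1, 8.0, 7.8, 8.2, 8.0],  # well-sampled site
    "site_B": [6.4],                            # single sample
}
estimates = shrunken_site_means(life, between_var=0.25, within_var=0.25)
```

With these made-up numbers the single-sample site is pulled halfway towards the overall mean, while the six-sample site stays close to its own mean. That is the sense in which a poorly sampled site leans on the rest of the dataset.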

Borrowing strength
• In DRIED-UP, each site in the model “borrows strength” from the dataset as a whole
• Or: the DU dataset makes site-specific relationships more robust
• This is very handy for prediction

Prediction
• In ecology at least, too much focus on model selection as the end point
• Actually we should take more time making predictions...
• Plug in flow (normalised seasonal Q95 and Q10) + habitat
  1. No new biological data
  2. New biological data (borrowing strength)
• Example later

Conclusions
• Modelling approach accounts for the spatial-temporal structure in the data
• Common effect of both high and low flows for both upland and lowland sites
• Physical habitat can influence both overall LIFE and its response to flow
• Consistent signature from resectioning across upland and lowland
• Effect of high flows greater on autumn samples (i.e. summer flows)
• There are implications for water resource management, river rehabilitation, climate change mitigation

More information?

Taking the modelling forward
DRUWID – DRIED-UP with Incremental Drought

Chilterns
• NW of London
• Major aquifer: large number of water supply boreholes
• Abstraction impacts on river flows
• New housing development
• “Chalk Streams”: high conservation value and public interest
• Strong climatic control, overlaid with anthropogenic influence
• E.g. River Misbourne
Photo: Misbourne River Action

DRUWID concept
• 6 years in the making...
• How to capture more of the complexity of the flow regime?
• How to describe the impact of drought?
Solutions
• Mixed effects approach
• More flow variables AND flow-flow interactions
• Multi-model inference
  • Rank alternate competing hypotheses
  • Often no single model “best”
  • Stepwise etc. approaches all flawed
  • Totally avoids issues of “significance”, in-out
  • Information-theoretic approach
Further details:
• Burnham and Anderson (2002): model selection and multi-model inference...
• Anderson (2010): model-based inference in the life sciences...
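
Ranking competing models in the information-theoretic way described above is commonly done with Akaike weights, as set out in Burnham and Anderson (2002). The sketch below assumes AIC is the criterion; the slides do not say which criterion DRUWID used. The model names and AIC values are hypothetical, invented purely to show the arithmetic.

```python
import math


def akaike_weights(aic_by_model):
    """Return {model name: Akaike weight} for a set of candidate models."""
    best = min(aic_by_model.values())
    # Relative likelihood of each model from its AIC difference to the best.
    rel = {
        name: math.exp(-0.5 * (aic - best))
        for name, aic in aic_by_model.items()
    }
    total = sum(rel.values())
    # Normalise so the weights sum to 1 across the candidate set.
    return {name: r / total for name, r in rel.items()}


# Hypothetical AIC values, invented for illustration only.
candidates = {
    "Q95 only": 412.0,
    "Q95 + Q10": 405.0,
    "Q95 + Q10 + Q95:Q10": 404.0,
}
weights = akaike_weights(candidates)
ranked = sorted(weights, key=weights.get, reverse=True)
```

The weights give a graded ranking instead of an in-or-out decision: here the two closest models both keep substantial weight, which is the "often no single model best" situation.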
Hydrological complexity: existing approaches

DRUWID application to Chilterns
• 42 sites in 9 catchments
• Still using gauged flows, but also an indicator as to whether the site was dry the summer before sampling
• Chose lags up to 2 years as a reasonable compromise
• Separate models for spring and autumn

Reasonable compromise: this formula
\[
\begin{aligned}
y_{ijk} ={}& \beta_0 + v_{1jk}x_{1ik} + \beta_2 x_{2ik} + u_{3jk}x_{3ik} + u_{4jk}x_{4ik} + u_{2jk} + \beta_8 x_{8i} + w_{1k} \\
& + \beta_5 + \beta_6 + \beta_7 + \beta_9 x_{9j} + \beta_{10}x_{10j} + \beta_{11}x_{11j} + \beta_{12}x_{12j} \\
& + \beta_{13}x_{1ik}x_{2ik} + \beta_{14}x_{1ik}x_{3ik} + \beta_{15}x_{2ik}x_{3ik} + \beta_{16}x_{3ik}x_{4ik} + e_{ijk}
\end{aligned}
\]
\[ v_{1jk} = \beta_1 + \beta_{17}x_{9j} + \beta_{18}x_{10j} + \beta_{19}x_{11j} + \beta_{20}x_{12j} + u_{1jk} \]
\[ w_{1k} \sim N(0, \sigma_w^2), \qquad u \sim \mathrm{MVN}(0, \Omega), \qquad e_{ijk} \sim N(0, \sigma^2) \]
\[
\Omega = \begin{pmatrix}
\sigma_1^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
\sigma_{12} & \sigma_2^2 & \sigma_{23} & \sigma_{24} \\
\sigma_{13} & \sigma_{23} & \sigma_3^2 & \sigma_{34} \\
\sigma_{14} & \sigma_{24} & \sigma_{34} & \sigma_4^2
\end{pmatrix}
\]
• 20 fixed parameters (intercept and 19 slopes) plus 6 variances and 6 covariances
• And catchment ID only varies overall LIFE, not any of the flow response slopes

Chilterns DRUWID: Summary of variables

Interactions are very, very important

Further illustration of one flow:flow interaction

Prediction

DRUWID is work in progress
• Was funded by CEH, but development used EA Chilterns data
• Methodology can be applied elsewhere
• Relatively quick to set up, just need the data

DRUWID shows that
• Can expand number of antecedent flow descriptors without model selection / overfitting problems
• Can use interaction effects neatly

Chilterns DRUWID shows that
• See lag effects in ecological response to past flow conditions, over at least two years
• Sequencing is important
• Drying pattern matters
• Resectioning important again...
• Also livestock poaching
• Habitat still matters...
Photo: Misbourne River Action

DRIED-UP and DRUWID summary
• Both totally reliant on multilevel / mixed effects approach
DRIED-UP = a “national” model
• Derive robust site-specific relationships where only relatively short series of monitoring data are available
• WFD, RSA, drought, ?licensing?
• Influential in building our understanding that ecological response is a consequence of interacting multiple stressors
DRUWID extends the DRIED-UP concept to consider impacts of drought
• DRUWID is a framework rather than a specific model
• It’s more complex so works best using relatively compact regional datasets

Stats learnings
• Power of mixed-effects / multilevel approach
• Need good understanding of multiple linear regression
• No single go-to book
• Look outside environmental sciences: social, medical

End of Part 1!