WATER RESOURCES RESEARCH, VOL. 48, W09544, doi:10.1029/2011WR011543, 2012

Can time domain and source area tracers reduce uncertainty in rainfall-runoff models in larger heterogeneous catchments?

R. Capell,¹ D. Tetzlaff,¹ and C. Soulsby¹

Received 31 October 2011; revised 16 July 2012; accepted 2 August 2012; published 25 September 2012.

[1] We used both time domain (deuterium) and source area (alkalinity) tracers to reduce uncertainty in simple conceptual rainfall-runoff models applied to a larger (749 km²) heterogeneous catchment with both upland and lowland headwaters. Stepwise, tracer-aided model development resulted in different model structures for the uplands and lowlands as representative elementary watersheds (REWs). These were differentiated by the parameterization of a nonlinear overland flow mechanism in the former, and the incorporation of high soil moisture storage capacity in the latter. Use of tracers and recession characteristics also helped to reduce parameter uncertainty and provided models that could simulate flows and tracer responses reasonably well over a full hydrological year. However, it was apparent that other processes (e.g., more complex mixing, fractionation, etc.) would need to be parameterized to explain the full variation in isotope dynamics. It is also evident that the information content of tracer data declines as the intensity of sampling decreases, particularly in the lowlands. The models of the REWs were coupled to provide plausible simulations of the up-scaled flow and tracer response at the outfall of the 749 km² catchment, though the usefulness of source area tracers decreased markedly at this larger scale. Whereas the approach provides a step toward simple models that are likely to give the “right answer for the right reasons,” further improvements appear to require increased parameterization and/or higher-resolution tracer data.

Citation: Capell, R., D. Tetzlaff, and C.
Soulsby (2012), Can time domain and source area tracers reduce uncertainty in rainfall-runoff models in larger heterogeneous catchments?, Water Resour. Res., 48, W09544, doi:10.1029/2011WR011543.

¹ Northern Rivers Institute, School of Geosciences, University of Aberdeen, Aberdeen, UK.
Corresponding author: R. Capell, School of Geosciences, University of Aberdeen, Elphinstone Rd., Aberdeen AB24 3UF, Scotland, UK. (rene.capell@gmail.com)
©2012. American Geophysical Union. All Rights Reserved. 0043-1397/12/2011WR011543

1. Introduction

[2] Recent progress in rainfall-runoff modeling in small experimental catchments has seen the use of time domain (e.g., isotopes) or source area (e.g., alkalinity) tracers to constrain the selection of appropriate model structures and parameter ranges [Beven, 2012]. It is uncommon for both time domain and source area tracers to be used concurrently in such tracer-aided models, and rarely have they been applied to larger mesoscale catchments (>100 km²). Yet this up-scaling is important, as it is commensurate with the scales where models are needed to inform management decisions concerning climate change impacts, flood alleviation, sustainable allocation of water resources, and protection of riverine ecology [Reed et al., 2006]. Unfortunately, spatial variability of catchment characteristics and associated hydroclimatic drivers makes it challenging to optimize appropriate model structures at such larger scales, and the potential for tracers in assisting this process is largely unexplored [Soulsby et al., 2008]. Contrasting geographic regions usually require different model structures. Although these may be achieved by using more spatially distributed models, this necessitates either high-resolution data or a level of parameterization that makes model identification based on rainfall-runoff relationships virtually impossible [Beven, 2001].
On the other hand, increased heterogeneity of larger catchments can inhibit simple lumped modeling approaches, particularly where marked landscape changes result in different dominant runoff generation processes [Beven, 2000]. Increased parameterization can offset this shortcoming [Fenicia et al., 2008a]. However, increased complexity in modeling approaches makes it more difficult to use additional process-based “soft” data, such as tracers, to aid model evaluation [Seibert and McDonnell, 2002]. Thus, there remain strong arguments for retaining low-parameter approaches, which allow additional opportunities to assess whether models provide the “right answers for the right reasons” [Kirchner, 2006]. This is particularly important if calibrated models are to be used as tools to evaluate the effects of environmental change on hydrological function [Beven, 2007; Dunn et al., 2008b].

[3] The uncertainties associated with extrapolating point-scale empirical observations increase in larger-scale catchments, where routine hydrometric and meteorological data are usually available, but process-based information is usually restricted. Here, environmental tracers can present versatile and comparatively cheap tools for adding information to hydrometric data and detecting the dominant influences on the integrated runoff response by interpolating the likely processes governing the dynamics in a top-down manner [e.g., Klemes, 1983; Kendall and Coplen, 2001; Soulsby et al., 2006b]. Tracers can thus provide empirical insights into the dynamics of catchment systems that have the potential to be useful in modeling [e.g., Seibert et al., 2003; Birkel et al., 2010].
Information provided by both geochemical and isotopic tracer responses in catchment runoff models can help to constrain viable parameter sets and, more fundamentally, increase confidence in the choice of model structure [e.g., Uhlenbrook et al., 2004; McGuire et al., 2007; Birkel et al., 2010; McMillan et al., 2011, 2012].

[4] There is a long history of using tracers in modeling studies. For example, Robson et al. [1992] used acid neutralizing capacity as the source area tracer of the geographic provenance of water in the Plynlimon catchments and compared this with the equivalent partitioning of water with an application of TOPMODEL. In the same catchment, Neal et al. [1988] used chloride as a conservative time domain tracer to demonstrate that, whereas the Birkenes model could simulate rainfall-runoff relationships, it failed to simulate stream chloride concentrations, indicating that storage changes and mixing processes were poorly represented in the model. Despite this, it is only recently that both “source area” and “time domain” tracers have been used in conjunction in dynamic rainfall-runoff models to constrain model parameters and structures in a consistent way [e.g., Birkel et al., 2011a]. Moreover, most other modeling applications of geographic source area or time domain tracers, either to constrain uncertainty or as a basis for model rejection, have focused on smaller (typically <10 km²) experimental catchments [e.g., Fenicia et al., 2008b; Uhlenbrook et al., 2002; Page et al., 2007; Fenicia et al., 2010]. The utility of tracers in aiding modeling studies at larger scales is thus an area of significant research potential.
[5] The usefulness of identifying hydrologically similar areas, or representative elementary watersheds (REW), is well established as a basis for hydrological modeling at larger scales [e.g., Wood et al., 1988], and recent work has suggested that marked differences in catchment characteristics can provide important classification tools that can aid this disaggregation [e.g., Buttle, 2006; Wagener et al., 2007; Carillo et al., 2011; Sawicz et al., 2011]. Very recent approaches have used such connections between key landscape controls and dominant runoff generation processes to delineate hydrologically similar areas [Savenije, 2010; Gharari et al., 2011; Ali et al., 2012]. However, different landscape controls for delineating dominant runoff generation processes may be needed at different spatial scales. For example, soil cover has been identified as the first-order control on runoff generation processes at the hillslope [e.g., Scherrer and Naef, 2003] and small catchment scale [Schmocker-Fackel et al., 2007]. Tetzlaff et al. [2007] also showed that generalized hydrological properties of soil types can serve as predictive landscape controls on runoff dynamics even in larger catchments (>10 km²), using the U.K. Hydrology of Soil Types (HOST) classification [Boorman et al., 1995]. However, in larger river systems that encompass different geological provinces, bedrock characteristics have been shown to be a dominant landscape control on runoff generation processes [e.g., Didszun and Uhlenbrook, 2008; Fröhlich et al., 2008; Gleeson and Manning, 2008; Tetzlaff and Soulsby, 2008; Tetzlaff et al., 2011]. Studies that seek to integrate tracers into modeling processes at larger catchment scales thus need to identify the first-order landscape controls to delineate the different REWs.
[6] The aim of this study is to explore the utility of both time domain and source area tracers in the larger, nested catchments of the North Esk in Scotland, which has two distinct hydrogeomorphologic/hydroclimatic provinces. This builds on previous work where the spatial and temporal dynamics of runoff in the catchment were examined using a range of environmental tracers [Capell et al., 2011, 2012]. Catchment geology and landscape evolution history (glaciation, soils, topography) were shown to be the most important factors controlling large-scale runoff dynamics and hydrochemistry. In the wetter, cooler uplands, shallow responsive soils overlie low permeability bedrock and give a rapid runoff response. In contrast, in the drier, warmer lowlands, deeper and more freely draining soils facilitate recharge of sandstone aquifers, increase baseflows, and attenuate storm hydrographs [Capell et al., 2011, 2012]. Based on these previous findings, the specific research objectives are: (1) to develop parsimonious tracer-aided rainfall-runoff models based on landscape heterogeneity with a simple division into two major hydrological units representing the uplands and lowlands; (2) to identify appropriate parameter ranges and model structures representing dominant runoff processes in these different landscapes using tracer constraints (with alkalinity as a source area tracer and deuterium as a time domain tracer); and (3) to combine the upland and lowland models to simulate flow and tracer dynamics at a larger catchment scale, using this as a learning framework to explore the potential and problems associated with such tracer-aided upscaling in more complex, heterogeneous landscapes.

2. Study Sites and Data

[7] The 749 km² North Esk catchment in northeast Scotland combines contrasting lowland and upland landscape characteristics.
An abrupt change between landscape units is caused by a major geological discontinuity that separates the northern mountains of Scotland from the lowlands, and results in a distinct geomorphologic separation in the study catchment (Figure 1 and Table 1). Along with the North Esk outfall, two of its subcatchments were investigated: the 56 km² Water of Mark (upland catchment) and the 131 km² Luther Water (lowland catchment).

[8] A detailed description of the study area is given by Capell et al. [2011]. In brief, the mountainous western parts of the North Esk, including the Water of Mark, have a maximum elevation of 932 m a.s.l. The eastern part of the North Esk with the Luther Water catchment drains the lowlands. The upland geology is dominated by granitic and metamorphic bedrock, with low permeability apart from fractures [cf. Haria and Shand, 2004]. In contrast, the sedimentary bedrock of the lowlands, predominantly sandstones and conglomerates, dates from the lower Devonian and forms an important regional aquifer system. This aquifer is actively recharged, and chlorofluorocarbon (CFC) dating resulted in estimated average groundwater ages of up to 40 years [Ó Dochartaigh et al., 2006].

Figure 1. (a) Catchment topography with simplified stream network and sampling sites, with subcatchments outlined; (b) bedrock geology and Quaternary sediments. Inset: overview map of Scotland showing the location of the study catchment.

[9] The distribution of soils is strongly influenced by geology and topography. The uplands are dominated by peats and peaty soils on higher elevation plateaus and upper hillslopes and mineral podzols on lower slopes. Lowland soils are predominantly brown forest soils and mineral podzols.
The HOST mapping scheme [Boorman et al., 1995] was used in previous studies to classify dominant flow paths based on soil cover, which proved to be an effective predictor for regionalization at the catchment scale [Soulsby et al., 2006a; Tetzlaff et al., 2009]. This scheme distinguishes hydrologically responsive soils with dominantly lateral flow paths and little recharge to depth, from more freely draining soils with dominantly vertical flow paths that recharge groundwater. Under the prevailing cool and wet climate conditions, responsive soils are usually at, or near to, saturation, and saturation excess overland and near-surface lateral flows dominate storm runoff generation and produce flashy storm hydrographs (Figure 2a). Freely draining soils dominate in areas that facilitate significant groundwater recharge and large volumes of aquifer storage, which produces longer mean transit times and attenuates the runoff response (Figure 2b). Responsive soils cover 97% of the upland catchment (Water of Mark), whereas freely draining soils (85%) dominate the lowland catchment (Luther Water). The North Esk catchment integrates both landscape types, resulting in 53% freely draining and 47% responsive soils, and the streamflow response reflects this (Figure 2c).

Table 1. Summary of Physical and Hydrological Catchment Properties for the Three Studied Catchments

| Property | Uplands | Lowlands | Outlet |
|---|---|---|---|
| River name | Water of Mark | Luther Water | North Esk |
| Area (km²) | 56 | 131 | 749 |
| Topography | | | |
| Mean elevation (m) | 600 | 146 | 310 |
| Min. elevation (m) | 257 | 24 | 12 |
| Max. elevation (m) | 932 | 518 | 932 |
| Mean slope (°) | 11.2 | 5.8 | 8.5 |
| Max. slope (°) | 66 | 51 | 78 |
| Bedrock Geology | | | |
| Sedimentary (%) | – | 79.8 | 37.7 |
| Metamorphic (%) | 78.0 | 17.5 | 55.3 |
| Igneous (%) | 22.0 | 2.7 | 7.0 |
| Dominant Soils (HOST) | | | |
| Peaty podzol (%) | 25.6 | 10.9 | 25.3 |
| Brown forest soil (gleying) (%) | – | 53.9 | 23.8 |
| Eroded peat (%) | 31.0 | – | 7.5 |
| Blanket peat (%) | 34.0 | 3.6 | 9.4 |
| Humus iron podzol (%) | 9.0 | 21.8 | 19.9 |
| Sum responsive soils (%) | 97.3 | 14.5 | 46.5 |
| Sum freely draining soils (%) | 2.8 | 85.4 | 53.4 |
| Hydrometry (October 2004–October 2009) | | | |
| Qmean (mm d⁻¹) | 1.4 | 0.9 | 1.3 |
| Q95 (mm d⁻¹) | 0.54 | 0.27 | 0.37 |
| Q5 (mm d⁻¹) | 8.7 | 4.0 | 6.4 |
| Mean annual sum (mm) | 1066 | 502 | 752 |
| Precipitation (mm), October 2008–October 2009 | 1256 | 1165 | 1161 |

[10] Five years of daily hydrometric and climatic data (October 2004 to September 2009), which included one year with weekly tracer data (October 2008 to September 2009), were used. Discharge data were provided by the Scottish Environmental Protection Agency (SEPA). Precipitation data were available from the British Atmospheric Data Centre (BADC) at a daily resolution from nine gauges in and around the catchment (maximum distance: 10 km). For the lowland areas, a gradient-inverse-distance-squared (GIDS) interpolation [Hrachowitz et al., 2009] was used to estimate daily catchment-wide precipitation. In the upland areas, interpolation results were unreliable because of a lack of measurements at higher elevations. Therefore, station data were used with a simple altitudinal correction to ensure a reasonable water balance. Given the size of the study catchments and the subdued terrain in the lowland areas, regional groundwater flows may be an additional source of error in the catchment water balance, though the geological structure of the aquifer (a syncline with a NE–SW axis) suggests such errors are probably small. Daily air temperatures from two sites in the North Esk covering the upland–lowland elevation gradient were available through the BADC. The average of both stations was used as catchment temperature at the outlet. Daily actual evapotranspiration ET (mm), estimated with a Penman-Monteith equation adjusted to aerodynamic and canopy roughness characteristics, was available from the U.K. Environmental Change Network (ECN) at Glensaugh, in the northeast of the catchment.
To account for higher temperatures and land use differences in the lowland areas, annual ETa estimates from the Met Office Rainfall and Evaporation Calculation System (MORECS) were used to correct the evapotranspiration for lowland areas.

[11] Weekly stream tracer samples were analyzed for stable water isotopes (δ²H), alkalinity, and major ion concentrations. δ²H was analyzed using a laser spectrometer (DLT 100, Los Gatos Research) at a precision of ±2 ‰, applying five standards for calibration. The δ²H signatures are presented in δ notation (‰) relative to Vienna standard mean ocean water (VSMOW). δ²H in precipitation was available from two sites located north and south of the North Esk basin [Birkel et al., 2011b; Speed et al., 2010], and weekly averages were used as the precipitation δ²H input signature for modeling purposes. The δ²H precipitation signatures of the subcatchments were further corrected to average stream signatures, which showed a close correlation to mean catchment elevation [Capell et al., 2012]. Stream water alkalinities were determined by acidimetric titration [Neal, 2001]. The collected stream water samples were also analyzed for major ion content using inductively coupled plasma mass spectrometry and ion chromatography. Stream water alkalinities and δ²H signatures were used in the development and evaluation of simple conceptual models, as the alkalinity summarized much of the variation of other ions [Capell et al., 2011, 2012].

3. Model Development

3.1. Translating Perceptual Catchment Understanding Into Conceptual Rainfall-Runoff Models

[12] Simple, lumped water balance models were developed to simulate daily rainfall-runoff dynamics in the contrasting landscapes (as REWs) of the North Esk. The model structures were extended in a stepwise fashion to determine the lowest level of complexity necessary to capture the main processes driving runoff generation following a “top-down” approach.
The selected model components and configurations are summarized in Table 2; this shows the water balance and tracer model equations ordered by the flow of water through the model, as well as model parameters used in each component and the performance measures for each model in the upland and lowland catchments. In addition to rainfall-runoff data, tracer data helped conceptualize the dominant runoff generation processes in relation to different landscape types.

Figure 2. Rainfall-runoff response in (a) upland Water of Mark, (b) lowland Luther Water, and (c) outlet of the North Esk catchment.

[13] In all tested model configurations, a degree-day snowmelt routine was implemented to account for snow accumulation and melt events during the winter season (Table 2, equation (1)). The snow module is similar to the HBV model snow routine [Bergström, 1975]: precipitation is accumulated as snow if temperature T (°C) falls below a threshold value Tthresh (°C), usually close to 0 °C, and melt water is released from the snow pack according to a factor Dmelt (mm °C⁻¹ d⁻¹), forming effective precipitation input Peff to the runoff generation modules. A proportion of the meltwater is retained in the snow pack and allowed to refreeze if temperatures fall below Tthresh. To minimize model parameterization, the maximum volume of retained water is fixed at 10% of the snow pack and the maximum refreezing volume at 5% of resident meltwater. These values are based on previous studies using the HBV model [Seibert, 2000].

[14] For the runoff generation routines in both the uplands and lowlands, model development started from a simple linear reservoir model (Model 1 in Table 2), with discharge Q being proportional to the active catchment storage S and the runoff coefficient k (Table 2, equation (8)).
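The degree-day snow routine of paragraph [13] can be sketched for a single daily time step as follows. This is a minimal illustration, not the authors' implementation: the function name, the default parameter values, the routing of rain through the liquid store of the snow pack, and the choice to always refreeze the full 5% are assumptions made here for the example.

```python
def snow_step(precip, temp, pack, liquid,
              t_thresh=0.0, d_melt=3.0,
              retention_frac=0.10, refreeze_frac=0.05):
    """Advance a degree-day snow store by one day.

    precip  daily precipitation (mm)
    temp    daily air temperature (deg C)
    pack    frozen snow water equivalent (mm)
    liquid  meltwater held inside the pack (mm)
    Returns (p_eff, pack, liquid), with p_eff the effective
    precipitation passed on to the runoff routines (mm).
    """
    if temp < t_thresh:
        # Cold day: precipitation accumulates as snow and part of the
        # retained meltwater refreezes (assumed: the full 5% share).
        pack += precip
        refreeze = refreeze_frac * liquid
        pack += refreeze
        liquid -= refreeze
        return 0.0, pack, liquid

    # Warm day: degree-day melt, limited by the available snow.
    melt = min(pack, d_melt * (temp - t_thresh))
    pack -= melt
    # Assumption: rain and melt both pass through the liquid store.
    liquid += melt + precip
    # The pack retains liquid water up to 10% of its frozen content.
    capacity = retention_frac * pack
    p_eff = max(0.0, liquid - capacity)
    liquid -= p_eff
    return p_eff, pack, liquid


# Example: 4 mm of snowfall at -2 deg C, then a thaw day at +1 deg C.
p_cold, pack_a, liquid_a = snow_step(4.0, -2.0, 0.0, 0.0)
p_warm, pack_b, liquid_b = snow_step(0.0, 1.0, pack_a, liquid_a, d_melt=2.0)
```

In this example the thaw day melts 2 mm, of which the remaining 2 mm pack retains 0.2 mm, so that 1.8 mm leaves as effective precipitation and the water balance closes.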
This initial structure was iteratively tested against observed hydrometric data and then extended with the aim of better capturing the observed discharge dynamics, which were evaluated with a range of performance measures. Besides single linear storage concepts, different model configurations with two runoff-generation storage components were also evaluated. Both included a linear base flow storage (Table 2, equation (8)); however, for the generation of fast runoff components (Q1) from shallow storages (S1), single linear (Model 2) and nonlinear storages (Model 3) were alternatively implemented (Table 2, equations (4)–(6)). Evapotranspiration ET is subtracted sequentially, starting with evaporation from the snow pack (if present), followed by the resident shallow storage volumes. In the configuration using a soil storage Ss, which explicitly conceptualizes shallow water held by the soil matrix, this storage serves as a reservoir for ET.

[15] The models were further extended to assess whether conceptualization of assumed dominant runoff processes, indicated by previous tracer studies and by the catchment soil (HOST classes) and geology characteristics, could improve their performance. Thus, in Model 4, an additional soil storage Ss, accounting for the predominance of deeper soils that mainly contribute to groundwater recharge in the lowlands, was implemented (Table 2, equation (2)). Precipitation only enters the runoff-generating storage components if the soil storage capacity Ss,max (mm) is exceeded. Previous tracer studies inferred such a mechanism from high concentrations of geochemical tracers, even during peak flows, suggesting a strong influence of pre-event water in these areas [Capell et al., 2011]. The assumption of linear shallow storage, however, turned out to be insufficient to capture observed peak flow dynamics. These peak flows were most likely dominated by fast translatory flow through near-surface storages aided by agricultural drains [Birkel et al., 2011b].
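The threshold behaviour of the soil storage Ss in Model 4 can be illustrated with a short sketch of one daily step. The function and variable names are hypothetical, and the order of operations (filling the store first, then removing ET, limited by the available water) is an assumption made for this example rather than a detail given in the text.

```python
def soil_buffer_step(precip, et, s_soil, s_soil_max):
    """One daily step of a threshold soil buffer (all values in mm).

    Precipitation first fills the remaining soil storage capacity;
    only the excess is passed on as effective precipitation.
    ET is then taken from the soil store, limited by its content.
    Returns (p_eff, s_soil, actual_et).
    """
    deficit = s_soil_max - s_soil
    p_eff = max(0.0, precip - deficit)   # water exceeding S_s,max
    s_soil += precip - p_eff             # store fills up to capacity
    actual_et = min(et, s_soil)          # ET cannot empty below zero
    s_soil -= actual_et
    return p_eff, s_soil, actual_et


# Example: a nearly full store (45 of 50 mm) receiving 10 mm of rain
# passes 5 mm on, whereas a drier store (40 mm) absorbs a 3 mm event.
wet = soil_buffer_step(10.0, 2.0, 45.0, 50.0)
dry = soil_buffer_step(3.0, 2.0, 40.0, 50.0)
```

The second case shows the filtering effect described for the lowlands: small events on a dry soil generate no input to the runoff-generating stores.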
These lowland shallow runoff sources were, therefore, conceptualized as the nonlinear shallow storage implemented in Model 3 (Table 2, equation (5)) and retained in Model 4. A final model (Model 5) included a conceptualization of saturation overland flow (SOF) (Table 2, equation (3)) to explore the potential importance of shallow lateral flows relating to the high proportion of responsive, thin soils in these areas. SOF runoff is generated from Peff (mm) depending on the volume in the shallow storage S1 (mm). The fraction of Peff contributing to SOF, cSOF (-), is determined as a power function with the maximum storage parameter Smax (mm) and exponent aSOF (-). This gives the SOF runoff QSOF (mm) at each time step.

Table 2. Model Equations and Configurations Used in the Stepwise Development Process, and Performance Measures for Model Evaluation^a

Runoff Generation, Storage Routine

| Module | Model equation | Parameters | Equation number |
|---|---|---|---|
| Day-degree snow routine | Snow pack retained if $T < T_{thresh}$, melt computed with $P_{eff}(t) = D_{melt}\,(T - T_{thresh})$ | $T_{thresh}$ (°C), $D_{melt}$ (mm °C⁻¹ d⁻¹) | (1) |
| Soil buffer storage | $S_s(t) = S_s(t-1) + [P_{eff}(t) - ET(t)]\,\Delta t$; $P_{eff} = P - (S_{s,max} - S_s)/\Delta t$ (mm d⁻¹) | $S_{s,max}$ (mm) | (2) |
| SOF runoff | $c_{SOF} = 1 - \left(1 - \dfrac{S_1}{S_{max}}\right)^{1+a_{SOF}}$ (-), $Q_{SOF} = c_{SOF}\,P_{eff}$ (mm d⁻¹) | $a_{SOF}$ (-), $S_{max}$ (mm) | (3) |
| Shallow storage | $S_1(t) = S_1(t-1) + [-Q_1(t-1) + (P_{eff}(t) - Q_{SOF}) - ET(t) - R]\,\Delta t$ (mm d⁻¹) | $R$ (mm d⁻¹) | (4) |
| Nonlinear shallow storage runoff | $Q_1 = k_S\,S_1^{\,1+a_S}$ (mm d⁻¹) | $k_S$ (d⁻¹), $a_S$ (-) | (5) |
| Linear shallow storage runoff | $Q_1 = k_S\,S_1$ (mm d⁻¹) (solved analytically) | $k_S$ (d⁻¹) | (6) |
| Baseflow storage | $S_2(t) = S_2(t-1) + [-Q_2(t-1) + R]\,\Delta t$ (mm d⁻¹) | | (7) |
| Linear baseflow storage runoff | $Q_2 = k_G\,S_2$ (mm d⁻¹) (solved numerically) | $k_G$ (d⁻¹) | (8) |

Alkalinity Concentration Equilibrium Routine

| Module | Model equation | Parameters | Equation number |
|---|---|---|---|
| Bounded exponential growth model | $y(t) = (y(t-1) - y_0)\,e^{-\beta} + \alpha\,(1 - e^{-\beta}) + y_0$ (μeq l⁻¹) | $y_0$ (μeq l⁻¹), $\alpha$ (μeq l⁻¹), $\beta$ (d⁻¹) | (9) |

Isotope Mixing Model Routine

| Module | Model equation | Parameters | Equation number |
|---|---|---|---|
| Peff signatures in snow pack | $\delta_{P_{eff}}(t) = \dfrac{\delta_P(t)\,P(t) + \delta_{SNOW}(t-1)\,SNOW(t-1)}{P_{eff}(t) + SNOW(t-1)}$ (‰) | | (10) |
| Peff signatures from soil buffer storage | $\delta_{P_{eff}}(t) = \dfrac{\delta_P(t)\,P(t) + \delta_{P_{eff}}(t-1)\,S_s(t-1)}{P_{eff}(t) + S_s(t-1)}$ (‰) | | (11) |
| Runoff signatures from SOF | $\delta_{Q_{SOF}}(t) = \delta_{P_{eff}}(t)$ (‰) | | (12) |
| Runoff signatures from shallow storage | $\delta_{Q_1}(t) = \dfrac{\delta_{P_{eff}}(t)\,P_{eff}(t) + \delta_{S_1}(t-1)\,[S_1(t-1) + pS_1]}{P_{eff}(t) + S_1(t-1) + pS_1}$ (‰) | $pS_1$ (mm) | (13) |
| Runoff signatures from baseflow storage | $\delta_{Q_2}(t) = \dfrac{\delta_{S_1}(t)\,R + \delta_{S_2}(t-1)\,[S_2(t-1) + pS_2]}{R + S_2(t-1) + pS_2}$ (‰) | $pS_2$ (mm) | (14) |
| Combined runoff signatures | $\delta_Q(t) = \dfrac{\delta_{Q_{SOF}}(t)\,Q_{SOF}(t) + \delta_{Q_1}(t)\,Q_1(t) + \delta_{Q_2}(t)\,Q_2(t)}{Q_{SOF} + Q_1(t) + Q_2(t)}$ (‰) | | (15) |

Notes: SOF only present in the uplands model. pS1 and pS2 are implemented as passive mixing volumes in each storage component. Peff will be updated sequentially in soil buffer if snow melt is generated. The alkalinity routine is implemented in shallow and baseflow storages, with y0 representing the input concentration of Peff and R in S1 and S2, respectively.

Model Maximum Performance for the Calibration Period (October 2008–September 2009)

| Data | Measure | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|---|---|---|---|---|---|---|
| Uplands | NSE | 0.28 | 0.33 | 0.44 | 0.46 | 0.53 |
| Uplands | log(NSE) | 0.27 | 0.35 | 0.38 | 0.59 | 0.63 |
| Uplands | R² | 0.29 | 0.36 | 0.41 | 0.53 | 0.55 |
| Uplands | Alkalinity R² | 0.19 | 0.35 | 0.45 | 0.53 | 0.64 |
| Uplands | δ²H R² | 0.28 | 0.40 | 0.43 | 0.44 | 0.60 |
| Lowlands | NSE | 0.32 | 0.43 | 0.59 | 0.71 | 0.62 |
| Lowlands | log(NSE) | 0.49 | 0.45 | 0.50 | 0.68 | 0.62 |
| Lowlands | R² | 0.36 | 0.44 | 0.60 | 0.72 | 0.66 |
| Lowlands | Alkalinity R² | 0.24 | 0.38 | 0.51 | 0.63 | 0.44 |
| Lowlands | δ²H R² | 0.00 | 0.04 | 0.04 | 0.44 | 0.19 |

^a Peff and ET were updated sequentially while passing through the model routines. Models 4 (lowlands) and 5 (uplands) are the configurations chosen for the study catchments. The equations are presented in the order of flow through the model structures, corresponding to the conceptual model structures (Figure 3).
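The saturation-dependent partitioning of Peff into SOF can be sketched as below. The power-function form used here, cSOF = 1 − (1 − S1/Smax)^(1+aSOF), is a reading of Table 2, equation (3), and should be treated as an assumption of this example, as should the function name and the clipping of the storage ratio to the range 0 to 1.

```python
def sof_runoff(p_eff, s1, s_max, a_sof):
    """Split effective precipitation into saturation overland flow.

    p_eff  effective precipitation (mm)
    s1     current shallow storage (mm)
    s_max  maximum storage parameter (mm)
    a_sof  exponent parameter (-)
    Returns (q_sof, c_sof): SOF runoff (mm) and contributing fraction (-).
    """
    # Relative filling of the shallow store, clipped to [0, 1]
    # (the clipping is an assumption to keep the sketch well defined).
    fill = min(max(s1 / s_max, 0.0), 1.0)
    c_sof = 1.0 - (1.0 - fill) ** (1.0 + a_sof)
    return c_sof * p_eff, c_sof


# Example: a half-full store with a_sof = 1 routes 75% of Peff to SOF.
q_half, c_half = sof_runoff(10.0, 50.0, 100.0, 1.0)
q_empty, c_empty = sof_runoff(10.0, 0.0, 100.0, 1.0)
q_full, c_full = sof_runoff(10.0, 100.0, 100.0, 1.0)
```

With this form, an empty store generates no SOF and a full store converts all of Peff to SOF, which gives the nonlinear, wetness-dependent peak response described for the uplands.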
The remaining fraction of Peff is routed to a linear shallow runoff store S1.

[16] In the upland areas, previous tracer-based investigations revealed a dynamic contribution of surface or near-surface lateral flow paths to stormflow but also a base flow component with fairly long turnover times [Capell et al., 2011, 2012]. Model development drew on this information because the standard linear and nonlinear storage model structures (Models 1–3) simulated substantially underestimated peak flows or base flow depletion to zero. This is reflected in poor overall performances of the rainfall-runoff modules for these model configurations (Table 2) and necessitated the integration of a separate, saturation-dependent runoff generation parameter (Table 2, equation (3)) [Seibert et al., 2003; Savenije, 2010]. This also improved the simulations of alkalinity and deuterium. The overall efficiency statistics for flow in the best models were still quite low, but this mainly reflected a crude approximation of snowmelt in topographically complex catchments like the Water of Mark [Kling and Nachtnebel, 2009].

[17] Similarly, whereas the standard model configurations (Models 1–3) performed better for the lowland catchment (Table 2), the incorporation of the soil storage parameter (Table 2, equation (2)), which was also informed by previous observations of tracer dynamics [Capell et al., 2011, 2012], resulted in much better performance statistics for flow simulations. This store was needed to filter precipitation inputs and to provide a threshold before runoff could be generated by the upper and lower storages for storm runoff and base flow generation. The parameter also improved alkalinity predictions significantly and was essential in getting the deuterium variations close to measured values.
The main failure in the flow simulations was the prediction of runoff peaks during the rewetting period at the end of the summer; however, it is likely that rainfall measurement errors in summer convective events are a reason for this, as well as model errors.

Figure 3. Conceptual model structures for uplands (a) and lowlands (b). Both structures incorporate a simple snow accumulation and melt module, active below a threshold air temperature (TT), and a linear base flow storage (S2). The upland model conceptualizes saturation excess overland flow (QSOF) as a function of the linear shallow storage (S1), whereas the lowland model features an additional soil storage (Ss) and a nonlinear shallow storage (S1). Both model structures use additional passive storage volumes for isotope tracer mixing (pS1 and pS2), which are not connected to the runoff generation.

[18] Figure 3 shows a conceptual diagram of the final model structures accepted for lowland (Model 4) and upland areas (Model 5). The models differentiate components representing near-surface storages and groundwater from deeper sources with longer flow paths. As discussed above, these were selected based on model performance evaluation, which is consistent with the landscape characteristics and the perceptual models developed from the a priori tracer analyses. For modeling at the catchment outlet, the two model structures were coupled and precipitation distributed according to the proportion of uplands and lowlands. The parameter sets for the coupled model were restricted to ranges identified as behavioral in the subcatchment models. The models were implemented using explicit forward approximation. An exception was the use of an analytical solution for the shallow linear storage Q1 (Table 2, equation (6)), as it was assumed this might improve simulations of processes where rapid subdaily runoff dynamics were observed.
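The difference between the explicit forward approximation and an analytical solution of a linear store can be made concrete with a short sketch. The paper does not give the analytical expression used; the version below is the standard closed-form solution of dS/dt = I − kS for constant inflow I over one time step, and the function names are hypothetical.

```python
import math


def linear_store_explicit(storage, inflow, k, dt=1.0):
    """Explicit forward step: outflow rate evaluated at the step start."""
    q = k * storage                       # outflow rate (mm/d)
    new_storage = storage + (inflow - q) * dt
    return new_storage, q * dt            # storage (mm), outflow volume (mm)


def linear_store_analytical(storage, inflow, k, dt=1.0):
    """Exact solution of dS/dt = inflow - k*S over one step."""
    s_eq = inflow / k                     # equilibrium storage (mm)
    new_storage = s_eq + (storage - s_eq) * math.exp(-k * dt)
    outflow = storage + inflow * dt - new_storage
    return new_storage, outflow


# Example: a fast store (k = 0.5 per day) draining from 10 mm.
s_exp, q_exp = linear_store_explicit(10.0, 0.0, 0.5)
s_ana, q_ana = linear_store_analytical(10.0, 0.0, 0.5)

# For k * dt > 1 the explicit scheme overdrains the store,
# whereas the analytical solution stays positive.
s_bad, _ = linear_store_explicit(10.0, 0.0, 1.5)
s_ok, _ = linear_store_analytical(10.0, 0.0, 1.5)
```

The explicit step releases 5 mm in this example against about 3.93 mm from the exact solution, which illustrates why an analytical treatment can matter for stores that respond faster than the daily time step.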
Although subsequent testing indicated that this did not result in significant improvements in the modeled runoff response, the analytical solution was retained for the final model.

[19] A recession analysis of both upland and lowland catchments revealed little temporal variability in behavior for the low-flow end of the discharge recessions, which underpins the justification for two-storage model structures [Capell et al., 2012]. Master recession curves [Lamb and Beven, 1997; Fenicia et al., 2006] were constructed using 5 years of discharge data, selecting all recession events with a minimum length of 6 days (Figure 4). In both catchments, exponential decay functions were fitted with coefficients of determination R² of 0.99 (uplands) and 0.96 (lowlands), respectively. Based on these, the low-flow behavior in both model structures was conceptualized with linear groundwater storages S2 (Table 2, equations (7) and (8)) generating base flow from deeper groundwater sources. The regression parameters derived from the master recession curves were used in the model structures to constrain acceptable parameter ranges for the base flow storage coefficients kG (d⁻¹) in both landscapes (Table 3). In both models the recharge from the shallow storage S1 into S2 is also conceptualized as recharge parameter R (mm), assuming a constant recharge rate. Base flow runoff from these storages is computed identically to the shallow storage in the upland model (Table 2, equation (4)).

3.2. Tracer Incorporation

[20] Previous analysis of tracer dynamics [Capell et al., 2011, 2012] helped identify important runoff generation mechanisms at larger scales and guide their conceptualization into simplified lumped models. To further evaluate appropriateness of the structures and identify feasible parameter ranges, tracer transport modules were also integrated into the modeling.
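Deriving a base flow storage coefficient from a recession curve, as described in paragraph [19], can be sketched with an ordinary least-squares fit of log-transformed flows. This is an illustration under stated assumptions: the fitting in log space, the function name, and the synthetic recession (generated with a decay rate of 0.037 d⁻¹ purely for demonstration) are not taken from the paper.

```python
import math


def fit_recession_coefficient(times, flows):
    """Fit Q(t) = Q0 * exp(-k * t) by linear regression of ln(Q) on t.

    times  days since the start of the recession
    flows  discharge values (mm/d), all strictly positive
    Returns (k, q0): decay coefficient (1/d) and initial flow (mm/d).
    """
    logs = [math.log(q) for q in flows]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    slope = sxy / sxx
    intercept = l_mean - slope * t_mean
    return -slope, math.exp(intercept)


# Synthetic 10-day recession, for illustration only.
times = list(range(10))
flows = [2.0 * math.exp(-0.037 * t) for t in times]
k_est, q0_est = fit_recession_coefficient(times, flows)
```

For a linear store with Q = kG·S, the fitted decay rate corresponds directly to the storage coefficient, which is why a well-defined exponential low-flow recession supports the linear groundwater storage used in both model structures.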
A recession function between the geographic source area tracer alkalinity and the timing of the discharge recession was implemented to simulate stream water alkalinities and for model evaluation (Table 2, equation (9)). The alkalinity recession characteristics were used to constrain parameter ranges, in the same way as the discharge recession parameters. Alkalinity is a proxy for acid neutralization capacity (ANC) and has been widely used as a source area tracer in Scottish upland catchments [e.g., Soulsby et al., 2003; Tetzlaff et al., 2008; Birkel et al., 2011b]. Alkalinity correlates with the concentration of weathering-derived solutes and is thus highest during low flows, which represent water from sources with longer contact times with minerogenic materials in the subsoil and aquifers. The lowest alkalinities occur during high flows, when sources such as lateral flow from peats and thin acidic soils with short contact times dominate the runoff. Alkalinities can therefore vary substantially between different landscape types, which makes alkalinity a useful tracer for detecting the temporal and spatial variation of runoff sources. In the North Esk, stream alkalinities ranged from 29 μeq l⁻¹ to 552 μeq l⁻¹ in the upland catchment and from 670 μeq l⁻¹ to 1547 μeq l⁻¹ in the lowland catchment (Table 4). The higher stormflow alkalinities at the lowland site are indicative of the dominance of subsurface flow paths. Previous analysis of long-term data showed strong correlation of alkalinity and runoff in upland rivers [Tetzlaff et al., 2008]. Reasonable power law regressions representing flow-concentration relationships were found in the 1 year of weekly data for up- and lowlands (R² of 0.72 and 0.62, respectively) (Figure 5). These were used, similarly to the master recession curves for discharge, to establish stream water alkalinity recession curves for the study period (October 2008 to September 2009) (Figure 5).
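A power law concentration-discharge relationship of the kind referred to above can be fitted by ordinary least squares in log-log space. The Python sketch below is an illustration only: the function name is hypothetical, the fitting procedure actually used in the study is not described in this section, and the R² returned here is computed on the log-transformed values, which may differ from the R² reported in the text.

```python
import math


def fit_power_law(discharge, concentration):
    """Fit C = a * Q**b by linear regression of log(C) on log(Q).

    Returns (a, b, r2), where r2 is the coefficient of determination
    of the regression in log-log space. All inputs must be positive.
    """
    x = [math.log(q) for q in discharge]
    y = [math.log(c) for c in concentration]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    r2 = sxy ** 2 / (sxx * syy)
    return a, b, r2
```

A negative exponent b corresponds to the dilution behavior described above, with the highest alkalinities at low flows.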
A bounded exponential growth model was found to be a good approximation for the temporal dynamics of stream alkalinities y(t) (μeq l⁻¹). A regression model of the form

$$y(t) = y_0 + \alpha \left(1 - e^{-\beta t}\right)$$

was fitted, with y0 (μeq l⁻¹) being the initial alkalinity as precipitation enters the subsurface, α (μeq l⁻¹) being the long-term equilibrium alkalinity, and β (d⁻¹) giving the coefficient of increment toward equilibrium. In the hydrological model, a daily analytical solution of the bounded exponential growth was implemented using the same parameters (equation (9) in Table 2).

Figure 4. Master recession curves using 5 years daily discharge time series. (a) Upland catchment using 27 recession events, and (b) lowland catchment using 26 recession events. The exponential regression functions were fitted to the well-confined lower end of the curves, below the dotted horizontal lines.

Table 3. Regression Parameters of Empirically Derived Alkalinity Relationships Including Standard Errors, Passive Mixing Volumes Necessary to Fit the 2H Stream Signatures, and Model Performance Measures

Parameter/Performance | Upland | Lowland | Coupled Model at Outlet
Empirically derived baseflow storage coefficient
kG (d⁻¹) | 0.037 ± 0.01 | 0.027 ± 0.01 | –
Alkalinity equilibrium model
y0 (μeq l⁻¹) | 119 ± 6 | 950 ± 12 | –
α (μeq l⁻¹) | 386 ± 20 | 346 ± 36 | –
β (d⁻¹) | 0.07 ± 0.01 | 0.08 ± 0.02 | –
Deuterium transport model
pS1 (mm) | 0 | >100 | –
pS2 (mm) | >100 | >200 | –
Model maximum performance (performance thresholds)
NSE | 0.53 (0.45) | 0.71 (0.6) | 0.74
VE | 0.57 (0.5) | 0.73 (0.6) | 0.70
log(NSE) | 0.63 (0.5) | 0.68 (0.6) | 0.74
R² | 0.55 (0.45) | 0.72 (0.6) | 0.74
Alkalinity R² | 0.64 (0.3) | 0.63 (0.3) | 0.59
2H R² | 0.60 (0.4) | 0.44 (0.15) | 0.57
Cross-evaluation (upland/lowland, outlet with both)
NSE | 0.46 | 0.62 | 0.68/0.68

Table 4. Summary Statistics for Deuterium and Alkalinity for the Sampling Year 2008 to 2009

Site | n | Mean | Minimum | Maximum | Standard Deviation | 95th Percentile | 5th Percentile
Alkalinity (μeq l⁻¹)
Uplands | 51 | 235 | 29 | 552 | 109 | 376 | 44
Lowlands | 49 | 1123 | 670 | 1547 | 184 | 1329 | 784
Outlet | 51 | 573 | 253 | 745 | 132 | 733 | 298
2H (‰)
Precipitation | 51a | −50.3 | −94.3 | −12.6 | 18.8 | −84.2 | −25.0
Uplands | 51 | −57.0 | −73.2 | −49.6 | 4.3 | −63.0 | −53.6
Lowlands | 49 | −52.8 | −58.3 | −48.5 | 2.0 | −55.6 | −49.9
Outlet | 51 | −55.2 | −67.1 | −45.7 | 3.4 | −58.7 | −51.3
a Volume-weighted composite samples from two stations around the catchment.

Figure 5. Empirical recession functions between stream water alkalinity and time during flow recession periods (period October 2008 to September 2009). (a) Upland catchment using seven recession events, and (b) lowland catchment using eight recession events. Inset: concentration–discharge relationship of weekly samples. A bounded exponential growth model was used for fitting the alkalinity recessions, a power law model for the concentration–discharge relationships.

[21] At each time step, alkalinity of Peff entering the subsurface storages is initialized as y0, as the storage volumes are updated according to equation (7) in Table 2 and mixed with incoming Peff and recharge R. The exponential growth model (Figure 5) approximates alkalinity recession dynamics very well (R² of 0.99 (uplands) and 0.96 (lowlands)). However, some of the alkalinity variation is masked by the estimation of daily alkalinities using the concentration discharge relationship and the subsequent loss of variability found in the measured weekly samples. Nevertheless, the samples covered a wide range of high and low flows and were therefore viewed as representative of the overall dynamics, and the obtained parameters were then used in the model structure to constrain parameter ranges (Table 3).

[22] The model was extended to facilitate the mixing and routing of the stable isotope 2H as a time domain tracer through the storages between precipitation entering and discharge leaving the system.
This was used to estimate the total storage volumes of the catchment models. 2H dynamics are characterized by strong intra-annual variability in precipitation (standard deviation of 18.8‰ for the year). This variability is attenuated in the stream water signatures by internal mixing processes (Table 4). Variability of 2H in stream water varies spatially, with stronger attenuation where larger mixing volumes and longer transit times are involved in runoff generation. In the lowland catchment, the attenuation of 2H in stream water (2.0‰ std. dev.) is much stronger than in the uplands (4.3‰ std. dev.), indicating the influence of larger subsurface storage volumes. These dynamics were used as an independent tool to evaluate the storage mixing volumes necessary to be able to simulate the isotope response in the stream. Passive (i.e., not contributing to runoff generation) mixing volumes pS1 and pS2, freely parameterized, were assigned to the linear and nonlinear storages (Table 2, equations (13) and (14); Figure 3) to incorporate the mixing of 2H signatures with modeled stored water volumes.

[23] The mixing model routine is defined in equations (10) to (15) of Table 2. The 2H signatures, expressed as deviations from VSMOW (‰), are tracked in the following modeled storage and runoff components: P (precipitation, model input variable), Ss (soil buffer storage), Peff (effective precipitation from snow pack and soil buffer), S1 (shallow storage), S2 (base flow storage), QSOF (SOF runoff), Q1 (shallow storage runoff), Q2 (base flow storage runoff), and Q (modeled runoff). All storage component and internal flow signatures were recalculated iteratively in each time step using the volumetric model components, adding the passive mixing volumes for shallow and base flow storages. The isotope mixing equations were solved numerically in the model code.
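The role of a passive mixing volume can be illustrated with a single volume-weighted mixing step. The Python sketch below is a simplified illustration under the complete-mixing assumption and is not a transcription of equations (10) to (15) in Table 2; the function name and the numbers in the usage note are hypothetical.

```python
def mix_storage(active_mm, passive_mm, delta_store, inflow_mm, delta_in):
    """Return the 2H signature (per mil) of a store after complete mixing.

    The inflow mixes instantaneously with the active (runoff-generating)
    volume plus the passive volume, which takes part in mixing only.
    """
    volume_before = active_mm + passive_mm
    volume_after = volume_before + inflow_mm
    if volume_after <= 0.0:
        # Nothing to mix: keep the previous signature.
        return delta_store
    return (volume_before * delta_store + inflow_mm * delta_in) / volume_after
```

With 20 mm of active storage at −55‰ and 10 mm of inflow at −90‰, the mixed signature is about −66.7‰ without passive storage, but only about −56.1‰ when 300 mm of passive storage is added. This is the damping effect that pS1 and pS2 are meant to provide.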
For this study, the 2H was considered to be conservative; no fractionation processes were included and instantaneous and complete mixing of isotope tracers in each storage compartment was assumed [Dunn et al., 2010; Birkel et al., 2011b]. The model storages were initialized as being empty and a 3 year warm-up period was run to fill and equilibrate the model storage volumes. The 2H signatures of the model storages (Ss, S1, S2) were initialized with the average 2H signature measured in stream water during the 1 year tracer survey, a method commonly used in transit time modeling [e.g., Hrachowitz et al., 2011].

4. Model Calibration and Parameter Identifiability

[24] Model performance was assessed using multiobjective, multicriteria evaluation based on Monte Carlo (MC) random sampling (10⁶ realizations) during the calibration year 2008–2009. The performance of the runoff dynamics was evaluated using four objective evaluation criteria: the Nash-Sutcliffe efficiency NSE [Nash and Sutcliffe, 1970] together with the NSE of log-scaled runoff, log(NSE), the coefficient of determination R², and the volumetric efficiency (VE) [Criss and Winston, 2008], which results in an unbiased evaluation of volumetric errors as runoff volume deviations are not weighted depending on the flow conditions (Table 2). The performance of the alkalinity and isotope tracer dynamics was assessed using the coefficient of determination R², thus focusing on the model performance of the simulated temporal dynamic and accounting for systematic concentration biases (Table 2).

[25] Model calibration was based on MC sampling of parameter sets. Initially, MC sample sets were computed applying random sampling from wide parameter ranges using a uniform distribution (Table 3, Table 5). The model performances derived from these MC samples were used as an objective measure during the iterative testing of model configurations.
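For reference, three of the runoff criteria named above can be written compactly. The Python sketch below uses the standard definitions of the Nash-Sutcliffe efficiency and of the volumetric efficiency of Criss and Winston [2008]; the function names are hypothetical, and the exact formulations used in the study are those given in Table 2, which is not reproduced in this section.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def log_nse(obs, sim):
    """NSE of log-transformed flows (requires strictly positive values)."""
    return nse([math.log(o) for o in obs], [math.log(s) for s in sim])


def volumetric_efficiency(obs, sim):
    """VE: 1 - sum of absolute volume errors / total observed volume."""
    abs_err = sum(abs(s - o) for o, s in zip(obs, sim))
    return 1.0 - abs_err / sum(obs)
```

A parameter set would then be accepted as behavioral only if every criterion exceeds its threshold, which is the screening step described in the following text.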
Following the selection of the upland and lowland model structures based on the evaluation criteria and prior knowledge, a stepwise integration of multicriteria evaluation measures [Khu et al., 2008] was then used to evaluate the performance of the parameter combinations of the two models using: (1) the runoff dynamics, (2) the additional tracer performances, and (3) the parameter ranges from the independently derived recession functions for base flow storage coefficients and the alkalinity equilibrium model (Table 5). Efficiency performance thresholds were subjectively established for all four rainfall-runoff criteria to identify behavioral parameter sets with acceptable scores for each. The critical threshold value was set as 0.6 for all criteria in the lowlands, and as 0.5 (VE, log(NSE)) and 0.45 (NSE, R²) in the uplands, corresponding to the weaker maximum performances of the upland model (Table 3). These probably reflect the complex nature of the Water of Mark catchment with marked environmental gradients and topographic complexity (Figure 1), which resulted in large uncertainties over precipitation inputs and other hydroclimatic drivers including snowmelt. The resulting parameter sets were further constrained by introducing additional information to the performance evaluation procedure, seeking to reduce the subjectivity in the selection of behavioral parameter sets [Seibert and McDonnell, 2002]. For this, tracer performance criteria and the parameter ranges established by the alkalinity-recession functions were used. Tracer model performance was evaluated using threshold criteria (Table 3). Behavioral parameter ranges for the storage coefficients kG and the alkalinity model parameters y0, α, and β were constrained using the empirically derived regression parameters with boundary estimates based on the regression standard errors (Table 3). The resulting parameter space is given in Table 5.

Table 5. Model Parameters for Upland and Lowland Model Structures With Parameter Ranges for Initial Monte Carlo Simulations and Accepted Performance Envelopes Under Rainfall-Runoff Performance and Additional Tracer Performance Constraints(a)

Model Parameter | Upland: Initial Sampling Range | Upland: Rainfall-Runoff Constrained | Upland: Constrained by Recession Functions | Lowland: Initial Sampling Range | Lowland: Rainfall-Runoff Constrained | Lowland: Constrained by Recession Functions
Snow Module
Tthresh (°C) | −2 to 3 | 1.2–3 | 1.3–3 | −2 to 2 | −2 to 2 | −1.9 to 1.65
Dmelt (mm °C⁻¹ d⁻¹) | 0.1–7 | 1.1–7 | 1.5–6.6 | 0.1–7 | 0.1–7 | 0.12–5.4
Runoff Generation Module
Smax (mm) | 10–400 | 47–400 | 116–400 | n.a. | n.a. | n.a.
aSOF (-) | 0–5 | 0.0003–5 | 0.0003–5 | n.a. | n.a. | n.a.
aS (-) | n.a. | n.a. | n.a. | 0–1 | 0.001–0.99 | 0.14–0.83
Ss,max (mm) | n.a. | n.a. | n.a. | 1–1000 | 1–1000 | 8–970
kS (d⁻¹) | 0.001–0.5 | 0.09–0.35 | 0.09–0.32 | 0.0001–0.4 | 0.001–0.25 | 0.01–0.15
R (mm d⁻¹) | 1–10 | 1.9–10 | 2.3–10 | 0.1–10 | 1.1–9.9 | 3.5–9.0
kG (d⁻¹) | 0.0001–0.15 | 0.001–0.05 | 0.001–0.04 | 0.00002–0.3 | 0.004–0.049 | 0.004–0.017
Alkalinity Module
y0 (μeq l⁻¹) | 25–200 | 25–200 | 107–131 | 500–1500 | 500–1500 | 926–974
α (μeq l⁻¹) | 100–700 | 100–700 | 346–424 | 150–750 | 150–750 | 274–418
β (d⁻¹) | 0.02–0.11 | 0.02–0.11 | 0.05–0.09 | 0.04–0.12 | 0.04–0.12 | 0.04–0.11
Deuterium Storage Module
pS1 (mm) | 0–500 | 0–500 | 0–498 | 1–1000 | 3–997 | 134–997
pS2 (mm) | 10–1000 | 10–1000 | 84–997 | 10–5000 | 10–2250 | 168–2100
(a) Parameters that are constrained by empirical recession functions are highlighted in bold.

[26] The rainfall-runoff performance of the final model structures was considered reasonable with maximum NSEs of 0.71 in the lowlands and 0.53 in the uplands (Table 3).
Evaluation MC runs in the hydrological year preceding the sampling period (October 2007–September 2008) yielded similar maximum performances with the identified parameter ranges. A cross-evaluation, forcing the uplands and lowlands model structures with data from the other respective subcatchment, decreased the model performances (maximum NSE: 0.62 in the lowlands and 0.44 in the uplands) and confirmed the importance of the specific model structures for different catchment landscapes (Table 3). The ability of both models to simulate the observed tracer dynamics was also regarded as reasonable [maximum R² of 0.64 and 0.63 (alkalinity), 0.44 and 0.60 (2H) in lowlands and uplands, respectively]. The weaker performances of the modeled tracer dynamics were accepted given the simplifying conceptualizations necessary to integrate the tracers into the lumped model schemes and the lower information content of the measured data because of the lower sampling density of the weekly tracer survey.

[27] Parameter identifiability under increasing constraints was assessed using cumulated efficiencies of the sampled parameter space [Hornberger and Spear, 1981]. For the rainfall-runoff performance, an average efficiency score E was calculated from the averaged scores of NSE, log(NSE), and R²:

$$E = \frac{\mathrm{NSE} + \log(\mathrm{NSE}) + R^2}{3}$$

For alkalinity and 2H model performances, R² scores were used directly.

[28] The cumulative efficiencies were calculated as the accumulated, normalized difference between the efficiency score and the chosen threshold values. They represent empirical cumulative distribution functions of the efficiency score. Figures 6 and 7 indicate increasing parameter identifiability with added constraints for the majority of the model components for the uplands and lowlands, respectively. Steep segments of the cumulative curves represent high-efficiency scores for the parameter value given on the x axis, whereas plateaus indicate poor efficiency scores.
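One possible reading of this procedure is sketched below in Python. The average score follows the equation for E given above. The cumulative curve, however, rests on assumptions that the text does not specify: samples are sorted by parameter value, scores below the threshold contribute zero, and the curve is normalized by the total excess over the threshold. The function names are hypothetical.

```python
def average_efficiency(nse, log_nse, r2):
    """Average rainfall-runoff efficiency score E."""
    return (nse + log_nse + r2) / 3.0


def cumulative_efficiency(param_values, scores, threshold):
    """Cumulated, normalized excess of score over threshold along a parameter.

    Returns a list of (parameter value, cumulative efficiency) pairs sorted
    by parameter value; the last cumulative value equals 1.
    """
    pairs = sorted(zip(param_values, scores))
    # Assumption: samples scoring below the threshold add nothing.
    excess = [max(score - threshold, 0.0) for _, score in pairs]
    total = sum(excess)
    if total == 0.0:
        raise ValueError("no sample exceeds the threshold")
    curve = []
    running = 0.0
    for (value, _), e in zip(pairs, excess):
        running += e
        curve.append((value, running / total))
    return curve
```

Under these assumptions, a steep rise of the curve over a narrow parameter interval means that most of the well-performing samples fall in that interval, which is how Figures 6 and 7 are read in the text.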
[29] In the uplands (Figure 6), the multicriteria calibration increases the identifiability of the model parameters. Alkalinity parameters are only constrained after the introduction of the recession function, but this, in turn, constrains other parameters, notably recharge R and the nonlinearity factor for the generation of SOF runoff, aSOF. The parameterization of the 2H mixing model is generally poorly constrained in the upland catchment. However, the occurrence of two steep segments in the shallow passive storage pS1 indicates a bimodal distribution with highest performances at small passive storages, reflecting strong short-term dynamics in the 2H response, and at large passive storages, consistent with a longer-term damped response.

Figure 6. Cumulated efficiency C (-) of parameter ranges under increasing constraints (sensitivity analysis) for the upland subcatchment. Bold lines show the full sampled parameter space, thin lines include a runoff performance threshold, and dashed lines include constraints derived from empirical recession functions.

[30] In the lowland catchment (Figure 7), the snow module parameters Tthresh and Dmelt show no sensitivity to performance constraints as snow is much less significant here compared to the uplands. After constraining the rainfall-runoff criteria, the identifiability of the base flow coefficient kG and the recharge R increases noticeably. The constraints added by the recession functions further enhance the identifiability of kG and R. Additionally, the identifiability of maximum soil storage Ss,max is greatly increased below 200 mm, probably because of short-term variability observed in the stream water 2H signature. The 2H model shows a bimodal distribution for pS1, comparable to the uplands.
However, slow performance increase at low values for pS1 indicates that below 100 mm passive storage the 2H is not sufficiently damped. Thus, minimum transit times even at high flows are longer than in the uplands where the 2H model performs equally well with smaller passive storages.

Figure 7. Cumulated efficiency C of parameter ranges under increasing constraints (sensitivity analysis) for the lowlands subcatchment. Bold lines show the full sampled parameter space, thin lines include a runoff performance threshold, and dashed lines include constraints derived from empirical recession functions.

[31] For the catchment outlet, where outputs from the contrasting REWs are integrated in the observed runoff dynamics, the two model structures were coupled and parameterized using the constrained parameter ranges identified in the two subcatchments (Table 5). Percentage geology was chosen to distinguish between the two main landscape units. Precipitation input was then distributed between both modeled regions based on the proportion of sedimentary bedrock in the lowlands (38%) and metamorphic and igneous bedrocks in the uplands (62%). The distribution parameter was allowed to vary by ±5%, but no identifiable performance difference was detected. MC simulations (10⁶ random samples) were used to evaluate the model performance. The performance of the coupled model with the a priori constrained parameter ranges (Table 5) is comparable (maximum NSE 0.74) to those of the models in up- and lowlands. The tracer model performances are also reasonable (maximum R² of 0.59 (alkalinity) and 0.57 (2H)).
A cross-evaluation using both model structures separately to model the outlet runoff gave maximum NSE values of 0.68 for each, which is an acceptable if slightly poorer performance but does not account for the known differences in runoff processes in the contrasting catchment parts.

5. Modeled Hydrographs and Tracer Responses

[32] Figure 8 shows the precipitation, observed and final modeled hydrographs and tracer responses in the up- and lowland subcatchments and the integrating outlet for the study period 2008–2009. The modeled envelopes are defined by the 5th and 95th percentiles of simulation values for all accepted parameter sets using the constraints described above. Despite similar climate forcing, the upland hydrograph shows a more rapid and intense runoff response to storm events, and a steeper decline to base flow (Figure 8a). In contrast, the lowland catchment hydrograph shows a much more subdued response with smaller peak flows and limited direct runoff reaction to smaller precipitation events, particularly in summer (Figure 8b). The outlet catchment hydrograph visibly integrates both aspects; the upland influence is evident in the runoff peaks, but these are moderated in smaller events and more pronounced base flow recession periods are evident (Figure 8c). Precipitation is fairly evenly distributed throughout the year, though the biggest storm events occur in autumn and winter. The intra-annual runoff dynamics follow the precipitation distribution, and snowmelt events are responsible for large winter and spring floods in late December 2008 and mid-February 2009.

[33] The observed stream water alkalinity dynamics are strongly influenced by dominant soils and geology. The range varied between 400 μeq l⁻¹ in the upland and 1000 μeq l⁻¹ in the lowland catchment, but both showed pronounced temporal dynamics reflecting the active differentiation of geographic sources (Figure 8).
Again, the catchment outlet shows clear mixing as it integrates the subcatchment responses. The 2H signatures observed in the precipitation show strong temporal variation, with 2H concentrations being most depleted in winter and most enriched in summer. In comparison, the signature in runoff is strongly attenuated but shows a pronounced response to storm events in the uplands, which propagates to the larger-scale catchment outlet, whereas the lowlands dynamics are much more attenuated.

[34] The modeled runoff time series envelopes bracket the observed dynamics quite well at most times in the uplands. Peak flows in the uplands are sometimes overestimated in moderate-sized events, and the timing and rate of snowmelt are only crudely captured, as the melt in December is delayed whereas the February melt proceeds too quickly. Nevertheless, the timing of the onset of stormflow and the recession into base flow are generally well matched (Figure 8). This also applies to the lowland catchment, where the modeled envelope follows the recession period in March to June 2009 particularly well. However, a major precipitation event in July 2009 showed little response in the observed runoff, which the model overestimates. This may reflect the model poorly capturing the rewetting process, or it may be that the precipitation event is overestimated if a more localized convective summer precipitation event was erroneously extrapolated to a larger area (Figure 8b). The hydrograph dynamics of the larger catchment outlet are captured well by the coupled model, even though some more moderate flow fluctuations observed in July and August 2009 are not entirely bracketed by the runoff envelopes.

[35] The modeled alkalinity envelopes capture the observed dynamics in the up- and lowlands, though some subtle variations are missed (Figures 8a and 8b).
For example, in the uplands the model's failure to capture the length of the February melt results in overestimation of alkalinities. Conversely, in the lowlands, low-flow alkalinity at the start of the year is systematically underpredicted, despite good flow simulations, implying that the end-member was not as well defined as assumed. However, model failures are much more apparent at the larger scale of the catchment outlet, where the model systematically underestimates alkalinity concentrations, especially at low flows (Figure 8c). This is a limitation that is mainly caused by the transfer of constrained parameter ranges from the separately fitted upland and lowland models to the coupled model. More extensive tracer surveys indicate that some lowland tributaries in close proximity had higher alkalinities than those in the Luther Water as a result of local geological differences, which violates the mixing assumption implicit in the parameter transfer and points to limitations in using alkalinity, or indeed other weathering-based tracers, in larger catchments.

[36] The results of the 2H models are ambiguous. On the one hand, the precipitation signal is sufficiently damped in all three catchments and the models capture the most dynamic characteristics of the relationship between 2H in precipitation and discharge at the more responsive upland (Figure 8a) and whole catchment (Figure 8c) scales. On the other hand, there is a notable systematic overestimation of the short-term variation in the summer season in all models, but most notably in the lowlands, probably relating to the selective depletion of the modeled shallow storage components by evapotranspiration and subsequent runoff generation of more enriched and less-well-mixed precipitation, when much more complex mixing processes are likely to occur in reality.
In the lowlands, the lack of dynamic stream response provides limited variability for the model to track; however, in the winter the model underestimates the deuterium values, again implying that the precipitation signal is insufficiently well mixed in the model. However, this is not evident at the large catchment scale, as the influence of the upland headwaters provides a larger volume of water where isotopic values are better predicted and also exhibits variability that the integrated model is able to capture.

Figure 8. Observed and modeled time series of runoff, alkalinity, and 2H. Modeled time series as 5th to 95th percentile accepted model result envelopes: (a) Water of Mark subcatchment and upland model, (b) Luther Water subcatchment with lowland model, and (c) integrating North Esk catchment using the coupled outlet model.

6. Discussion and Wider Implications

[37] This study provided an opportunity to combine both geographic source area tracers and time domain tracers to aid the development of rainfall-runoff models in large catchments with contrasting headwaters. Specific tracer-aided models were developed for REWs for upland and lowland tributaries and then integrated to simulate flow and tracer responses in the larger catchment. The integration of such alternative sources of “soft” or “orthogonal” data is increasingly viewed as an efficient, viable means of constraining structural and parameter uncertainty to produce more realistic process representation in hydrological models [Seibert and McDonnell, 2002]. This is an area of considerable potential as the use of multiple model schemes like FUSE [e.g., Clark et al., 2008, 2010; McMillan et al., 2012] and FLEX is becoming possible and high-resolution tracer data sets are becoming more readily available.
These approaches are consistent with the philosophy of using models as learning tools for understanding catchment functioning [Beven, 2007; Dunn et al., 2008b]. Moreover, integrating additional data into models, while maintaining low levels of parameterization, allows the capabilities and limitations of simple conceptualizations of hydrological systems to be explored [Kirchner, 2006]. In larger catchments with marked differences in landscape evolution history, tracers can help to identify changes in dominant runoff generation mechanisms in the main REWs and inform the choice of model structures. Given the simplicity of the lumped approach used, together with the size of the study catchments, the associated uncertainties in measurements and catchment heterogeneity, the accepted model configurations generally performed quite well. Some important insights emerge from the work, which highlight some of the challenges and possible limitations to tracer-aided modeling at these larger scales. [38] As well as aiding the development of plausible model structures, tracer data provided an effective means for constraining model parameters for the groundwater subroutine (Table 5). The additional information content of the alkalinity data was utilized by the simple alkalinity-recession function and the associated equilibrium conceptualization (Figure 5) used to estimate concentrations in the model storage components, thus, providing additional constraints on the dynamics of runoff sources. [39] The use of 2H signatures, which are strongly dampened after passage through the catchment, highlighted the incapacity of simple storage structures to sufficiently provide mixing volumes especially during stormflow generation (see 2H model performances in Table 2). This led to the integration of buffering volumes and storm runoff mixing mechanisms into the model concepts. 
This, however, also highlights the limitation of lumped storage conceptualizations as a computational basis for the modeling of larger catchments, as their structure necessitates simplifying assumptions on transport and mixing processes in the subsurface. This was particularly apparent in simulating the lowland deuterium response, where the low signal-to-noise ratio provided limited tracer variation in streams and a weak test of the models' “goodness of fit.” Moreover, the simple representation of ET losses, mixing and drainage from the upper soil store, particularly in the lowlands, seemed to result in the systematic underestimation of the damping of precipitation inputs [e.g., Dunn et al., 2008a]. This led to the model overestimating 2H values in the summer and underestimating them in the winter. Improvements could result from inclusion of additional parameters to overcome the assumption of complete mixing, which is unrealistic [e.g., Kirchner et al., 2000]. However, this would move the models to greater complexity and possibly increase problems of identifiability.

[40] Elucidating the nature of mixing in the main stores represents a potentially fertile area in integrating empirical data with catchment scale models [McNamara et al., 2011]. Recent regional studies in the Scottish Highlands have shown that tracer-based assessment of the minimum catchment storage needed to cause tracer-based damping often implies storages (10³ mm) two orders of magnitude greater than the storage changes caused by the annual water balance (10¹ mm) [Soulsby et al., 2011].
In the case of the present study, the different storage estimates identified with the 2H integration into the landscape-specific models are consistent with known characteristics of the bedrock aquifer in the catchment lowlands [Ó Dochartaigh et al., 2006], whereas the upland bedrock is considered largely impermeable and groundwater is restricted to fractures and shallow drifts [e.g., Soulsby et al., 2000, 2006b]. The mixing volumes of each storage component for the 2H mixing model (pS1 and pS2, Table 3) showed lower performance limits, which allowed for an approximation of the minimum subsurface mixing volume in each landscape. The minimum storage estimates in the model components generating runoff were identified as at least 100 mm passive mixing volume in the uplands model and at least 300 mm in the lowlands model. The volumes are an order of magnitude greater than the dynamic model storages, and are higher by at least a factor of two in the lowlands.

[41] In this study, we implemented the models using fixed step forward approximation in almost all model components; the only exception was the linear shallow storage (Table 2, equation (6)). It has recently been stressed how numerical implementation can have a significant influence on the stability of parameter performances, as well as the size of internal fluxes and storage volumes, which can be important, especially when comparing performances of different model structures [Kavetski and Clark, 2010]. The chosen implementation can lead to instability of performance ranges and thus misleading acceptability of parameter sets because of artifacts of numerical approximation, which adds to uncertainty in the internal storage estimates and, more generally, in the validity of the performance levels of parameter ranges [Clark and Kavetski, 2010].
Although a full exploration of these issues was beyond the scope of this study, some testing in the most dynamic upland catchment indicated that similar simulations were given by both explicit-forward and analytical solutions for the model structure used here. Moreover, the acceptable tracer simulations provide additional reassurance that the uncertainties in the internal fluxes are reasonably constrained.

[42] The combination of models for the upland and lowland REWs was able to simulate the flow and isotope response quite well at the larger catchment scale. However, some of the limitations of using tracers at larger scales also became apparent. As with the lowlands, the low signal-to-noise ratio in the damped stream water isotope response, together with the weekly data, gives only a weak basis to test the models' capabilities. More serious are the limitations of alkalinity as a source area tracer at larger scales. The observed stream alkalinity at the catchment outlet integrates the spatial end members of the total catchment area. Although the investigated subcatchments covered around 25% of the total catchment area, local geological and land use differences elsewhere in the lowlands resulted in inputs of more alkaline water [cf. Capell et al., 2011]; thus, the end-member space was not adequately sampled [cf. Hooper et al., 1990] in the approach used. Although the large-scale alkalinity dynamics were crudely captured, the end-member values were poorly constrained despite the broad envelope of the simulations (Figure 8c).
Furthermore, even though stream water alkalinity can be considered a conservative tracer at event timescales [e.g., Soulsby et al., 2003], the alkalinity of geographical sources will change toward equilibration endpoints over longer time periods. On larger scales with many spatially disconnected sources, the sensitivity of alkalinity as an end-member mixing tracer therefore becomes increasingly limited. An alternative would be to use an increased number of end members, and/or end members defined by multivariate characterization of whole ion chemistries [Capell et al., 2011]. However, at larger scales, similar problems of multiple sources and nonconservative behavior mean that use in simple model conceptualizations would be extremely challenging.

7. Conclusion

[43] Insights gained from this study have been instructive in learning about the conceptualizations needed for modeling runoff generation and time domain and source area tracer transport in contrasting REWs and their integration at larger catchment scales [Dunn et al., 2008b]. This, in turn, informs future challenges in such efforts to improve process-based models that retain low levels of parameterization and are based on relatively simple, integrated field measurements. Unfortunately, it seems that improvements are contingent upon either increased model complexity or tracer data collected at a higher temporal and spatial resolution. The former includes new parameterization to account for processes affecting isotopic tracers (e.g., partial mixing, fractionation, etc.), or, more fundamentally, better representation of snowmelt in headwater areas. The latter includes more intense stream water sampling on daily or subdaily time steps; for isotope tracers, this is becoming increasingly feasible and cost effective with new technological developments [Berman et al., 2009]. This would allow a better test of modeling capabilities, especially where the tracer damping results in reduced variability in weekly samples.
In addition, measurements of tracer signals in hydrological stores that reflect the internal model variables (i.e., soil water, shallow and deeper groundwater, etc.) might help to identify the limitations of the current model structure [Seibert et al., 2003; Birkel et al., 2011b]. This would have the advantage of allowing tracer signals to be tracked through different stores, providing new model diagnostics [cf. McMillan et al., 2012]. Such characterization is obviously more easily envisaged in smaller REWs, and the development of tracer-aided models at larger scales is likely to remain challenging and frustrated by greater uncertainty.

Notation

P     catchment-wide precipitation, mm
Peff  effective precipitation, i.e., precipitation amount after passage through snow and soil layers, mm
ET    actual evapotranspiration, mm
QSOF  modeled saturation excess overland flow runoff, mm
Ss    modeled storage in soil layer, mm
S1    modeled shallow storage, mm
Q1    modeled runoff from shallow storage, mm
R     modeled recharge from shallow to base flow storage (freely parameterizable), mm
S2    modeled base flow storage, mm
Q2    modeled runoff from base flow storage, mm

[44] Acknowledgments. This work is supported by a University of Aberdeen, College of Physical Sciences Ph.D. studentship. The authors would like to acknowledge discharge data provision by the Scottish Environmental Protection Agency (SEPA) and precipitation data provision by the Met Office via the British Atmospheric Data Centre (BADC). The authors are grateful to Christian Birkel for helpful comments on an earlier draft of the manuscript. We are also grateful to Martyn Clark and three anonymous reviewers for their comments in the peer review process.

References

Ali, G., D. Tetzlaff, C. Soulsby, J. J. McDonnell, and R. Capell (2012), A comparison of similarity indices for catchment classification using a cross-regional dataset, Adv. Water Resour., 40, 11–22, doi:10.1016/j.advwatres.2012.01.008.
Bergström, S.
(1975), The development of a snow routine for the HBV-2 model, Nord. Hydrol., 6(2), 73–92, doi:10.2166/nh.1975.006.
Berman, E. S. F., M. Gupta, C. Gabrielli, T. Garland, and J. J. McDonnell (2009), High-frequency field-deployable isotope analyzer for hydrological applications, Water Resour. Res., 45(10), W10201, doi:10.1029/2009WR008265.
Beven, K. (2000), Uniqueness of place and process representations in hydrological modelling, Hydrol. Earth Syst. Sci., 4(2), 203–213, doi:10.5194/hess-4-203-2000.
Beven, K. (2001), How far can we go in distributed hydrological modelling?, Hydrol. Earth Syst. Sci., 5(1), 1–12, doi:10.5194/hess-5-1-2001.
Beven, K. (2007), Towards integrated environmental models of everywhere: Uncertainty, data and modelling as a learning process, Hydrol. Earth Syst. Sci., 11, 460–467, doi:10.5194/hess-11-460-2007.
Beven, K. (2012), Rainfall-Runoff Modeling: The Primer, 2nd ed., Wiley-Blackwell, Hoboken, NJ.
Birkel, C., D. Tetzlaff, S. M. Dunn, and C. Soulsby (2010), Towards a simple dynamic process conceptualization in rainfall-runoff models using multi-criteria calibration and tracers in temperate, upland catchments, Hydrol. Process., 24(3), 260–275, doi:10.1002/hyp.7478.
Birkel, C., D. Tetzlaff, S. M. Dunn, and C. Soulsby (2011a), Using time domain and geographic source tracers to conceptualize streamflow generation processes in lumped rainfall-runoff models, Water Resour. Res., 47, W02515, doi:10.1029/2010WR009547.
Birkel, C., D. Tetzlaff, S. M. Dunn, and C. Soulsby (2011b), Using lumped conceptual rainfall-runoff models to simulate daily isotope variability with fractionation in a nested mesoscale catchment, Adv. Water Resour., 34(3), 383–394, doi:10.1016/j.advwatres.2010.12.006.
Boorman, D. B., J. M. Hollis, and A. Lilly (1995), Hydrology of soil types: A hydrological classification of the soils of the United Kingdom, Institute of Hydrology Report 126, Institute of Hydrology, Wallingford, U. K.
Buttle, J.
(2006), Mapping first-order controls on streamflow from drainage basins: The T3 template, Hydrol. Process., 20(15), 3415–3422, doi:10.1002/hyp.6519.
Capell, R., D. Tetzlaff, I. A. Malcolm, A. J. Hartley, and C. Soulsby (2011), Using hydrochemical tracers to conceptualise hydrological function in a larger scale catchment draining contrasting geologic provinces, J. Hydrol., 408, 164–177, doi:10.1016/j.jhydrol.2011.07.034.
Capell, R., D. Tetzlaff, A. J. Hartley, and C. Soulsby (2012), Linking metrics of hydrological function and transit times to landscape controls in a heterogeneous mesoscale catchment, Hydrol. Process., 26(3), 405–420, doi:10.1002/hyp.8139.
Carrillo, G., P. A. Troch, M. Sivapalan, T. Wagener, C. Harman, and K. Sawicz (2011), Catchment classification: Hydrological analysis of catchment behavior through process-based modeling along a climate gradient, Hydrol. Earth Syst. Sci., 15(11), 3411–3430, doi:10.5194/hess-15-3411-2011.
Clark, M. P., and D. Kavetski (2010), Ancient numerical daemons of conceptual hydrological modeling: 1. Fidelity and efficiency of time stepping schemes, Water Resour. Res., 46(10), W10510, doi:10.1029/2009WR008894.
Clark, M. P., A. G. Slater, D. E. Rupp, R. A. Woods, J. A. Vrugt, H. V. Gupta, T. Wagener, and L. E. Hay (2008), Framework for understanding structural errors (FUSE): A modular framework to diagnose differences between hydrological models, Water Resour. Res., 44(8), W00B02, doi:10.1029/2007WR006735.
Clark, M. P., H. K. McMillan, D. B. G. Collins, D. Kavetski, and R. A. Woods (2010), Hydrological field data from a modeller’s perspective: Part 2. Process-based evaluation of model hypotheses, Hydrol. Process., 25(4), 523–543, doi:10.1002/hyp.7902.
Criss, R., and W. Winston (2008), Do Nash values have value? Discussion and alternate proposals, Hydrol. Process., 22(14), 2723–2725, doi:10.1002/hyp.7072.
Didszun, J., and S.
Uhlenbrook (2008), Scaling of dominant runoff generation processes: Nested catchments approach using multiple tracers, Water Resour. Res., 44, W02410, doi:10.1029/2006WR005242.
Dunn, S. M., J. R. Bacon, C. Soulsby, D. Tetzlaff, M. I. Stutter, S. Waldron, and I. A. Malcolm (2008a), Interpretation of homogeneity in delta O-18 signatures of stream water in a nested sub-catchment system in north-east Scotland, Hydrol. Process., 22(24), 4767–4782, doi:10.1002/hyp.7088.
Dunn, S., J. Freer, M. Weiler, M. Kirkby, J. Seibert, P. Quinn, G. Lischeid, D. Tetzlaff, and C. Soulsby (2008b), Conceptualization in catchment modelling: Simply learning?, Hydrol. Process., 22(13), 2389–2393, doi:10.1002/hyp.7070.
Dunn, S. M., C. Birkel, D. Tetzlaff, and C. Soulsby (2010), Transit time distributions of a conceptual model: Their characteristics and sensitivities, Hydrol. Process., 24(12), 1719–1729, doi:10.1002/hyp.7560.
Fenicia, F., H. H. G. Savenije, P. Matgen, and L. Pfister (2006), Is the groundwater reservoir linear? Learning from data in hydrological modelling, Hydrol. Earth Syst. Sci., 10(1), 139–150.
Fenicia, F., H. H. G. Savenije, P. Matgen, and L. Pfister (2008a), Understanding catchment behavior through stepwise model concept improvement, Water Resour. Res., 44, W01402, doi:10.1029/2006WR005563.
Fenicia, F., J. McDonnell, and H. Savenije (2008b), Learning from model improvement: On the contribution of complementary data to process understanding, Water Resour. Res., 44(6), W06419, doi:10.1029/2007WR006386.
Fenicia, F., S. Wrede, D. Kavetski, L. Pfister, L. Hoffmann, H. H. G. Savenije, and J. J. McDonnell (2010), Assessing the impact of mixing assumptions on the estimation of streamwater mean residence time, Hydrol. Process., 24(12), 1730–1741, doi:10.1002/hyp.7595.
Fröhlich, H., L. Breuer, K. Vache, and H. Frede (2008), Inferring the effect of catchment complexity on mesoscale hydrologic response, Water Resour. Res., 44(9), W09414, doi:10.1029/2007WR006207.
Gharari, S., M.
Hrachowitz, F. Fenicia, and H. H. G. Savenije (2011), Hydrological landscape classification: Investigating the performance of HAND based landscape classifications in a Central European meso-scale catchment, Hydrol. Earth Syst. Sci., 15(11), 3275–3291, doi:10.5194/hess-15-3275-2011.
Gleeson, T., and A. H. Manning (2008), Regional groundwater flow in mountainous terrain: Three-dimensional simulations of topographic and hydrogeologic controls, Water Resour. Res., 44, W10403, doi:10.1029/2008WR006848.
Haria, A. H., and P. Shand (2004), Evidence for deep sub-surface flow routing in forested upland Wales: Implications for contaminant transport and stream flow generation, Hydrol. Earth Syst. Sci., 8(3), 334–344.
Hooper, R. P., N. Christophersen, and N. E. Peters (1990), Modelling streamwater chemistry as a mixture of soilwater end-members: An application to the Panola Mountain catchment, Georgia, U.S.A., J. Hydrol., 116, 321–343, doi:10.1016/0022-1694(90)90131-G.
Hornberger, G. M., and R. C. Spear (1981), An approach to the preliminary analysis of environmental systems, J. Environ. Manage., 12(1), 7–18.
Hrachowitz, M., C. Soulsby, D. Tetzlaff, J. J. C. Dawson, S. M. Dunn, and I. A. Malcolm (2009), Using long-term data sets to understand transit times in contrasting headwater catchments, J. Hydrol., 367, 237–248, doi:10.1016/j.jhydrol.2009.01.001.
Hrachowitz, M., C. Soulsby, D. Tetzlaff, and I. A. Malcolm (2011), Sensitivity of mean transit time estimates to model conditioning and data availability, Hydrol. Process., 25(6), 980–990, doi:10.1002/hyp.7922.
Kavetski, D., and M. P. Clark (2010), Ancient numerical daemons of conceptual hydrological modeling: 2. Impact of time stepping schemes on model analysis and prediction, Water Resour. Res., 46(10), W10511, doi:10.1029/2009WR008896.
Kendall, C., and T. Coplen (2001), Distribution of oxygen-18 and deuterium in river waters across the United States, Hydrol. Process., 15(7), 1363–1393.
Khu, S.-T., H. Madsen, and F.
di Pierro (2008), Incorporating multiple observations for distributed hydrologic model calibration: An approach using a multi-objective evolutionary algorithm and clustering, Adv. Water Resour., 31(10), 1387–1398, doi:10.1016/j.advwatres.2008.07.011.
Kirchner, J. (2006), Getting the right answers for the right reasons: Linking measurements, analyses, and models to advance the science of hydrology, Water Resour. Res., 42(3), W03S04, doi:10.1029/2005WR004362.
Kirchner, J., X. Feng, and C. Neal (2000), Fractal stream chemistry and its implications for contaminant transport in catchments, Nature, 403(6769), 524–527.
Klemes, V. (1983), Conceptualization and scale in hydrology, J. Hydrol., 65, 1–23, doi:10.1016/0022-1694(83)90208-1.
Kling, H., and H. P. Nachtnebel (2009), A spatio-temporal comparison of water balance modelling in an Alpine catchment, Hydrol. Process., 23(7), 997–1009, doi:10.1002/hyp.7207.
Lamb, R., and K. J. Beven (1997), Using interactive recession curve analysis to specify a general catchment storage model, Hydrol. Earth Syst. Sci., 1, 101–113.
McGuire, K. J., M. Weiler, and J. J. McDonnell (2007), Integrating tracer experiments with modeling to assess runoff processes and water transit times, Adv. Water Resour., 30(4), 824–837, doi:10.1016/j.advwatres.2006.07.004.
McMillan, H. K., M. P. Clark, W. B. Bowden, M. Duncan, and R. A. Woods (2011), Hydrological field data from a modeller’s perspective: Part 1. Diagnostic tests for model structure, Hydrol. Process., 25(4), 511–522, doi:10.1002/hyp.7841.
McMillan, H., D. Tetzlaff, M. Clark, and C. Soulsby (2012), Do time-variable tracers aid the evaluation of hydrological model structure? A multimodel approach, Water Resour. Res., 48, W05501, doi:10.1029/2011WR011688.
McNamara, J. P., D. Tetzlaff, K. Bishop, C. Soulsby, M. Seyfried, N. E. Peters, B. T. Aulenbach, and R. Hooper (2011), Storage as a metric of catchment comparison, Hydrol. Process., 25(21), 3364–3371, doi:10.1002/hyp.8113.
Nash, J. E., and J. V.
Sutcliffe (1970), River flow forecasting through conceptual models part I: A discussion of principles, J. Hydrol., 10(3), 282–290, doi:10.1016/0022-1694(70)90255-6.
Neal, C. (2001), Alkalinity measurements within natural waters: Towards a standardised approach, Sci. Tot. Env., 265, 99–113.
Neal, C., N. Christophersen, R. Neale, C. J. Smith, P. G. Whitehead, and B. Reynolds (1988), Chloride in precipitation and streamwater for the upland catchment of River Severn, mid-Wales; some consequences for hydrochemical models, Hydrol. Process., 2(2), 155–165, doi:10.1002/hyp.3360020206.
Ó Dochartaigh, B. E., P. L. Smedley, A. M. MacDonald, and W. G. Darling (2006), Baseline Scotland: The Lower Devonian aquifer of Strathmore, British Geological Survey Commissioned Report, British Geological Survey, Nottingham, U. K.
Page, T., K. J. Beven, J. Freer, and C. Neal (2007), Modelling the chloride signal at Plynlimon, Wales, using a modified dynamic TOPMODEL incorporating conservative chemical mixing (with uncertainty), Hydrol. Process., 21(3), 292–307, doi:10.1002/hyp.6186.
Reed, P. M., et al. (2006), Bridging river basin scales and processes to assess human–climate impacts and the terrestrial hydrologic system, Water Resour. Res., 42(7), W07418, doi:10.1029/2005WR004153.
Robson, A., K. Beven, and C. Neal (1992), Towards identifying sources of subsurface flow: A comparison of components identified by a physically based runoff model and those determined by chemical mixing techniques, Hydrol. Process., 6(2), 199–214, doi:10.1002/hyp.3360060208.
Savenije, H. H. G. (2010), Topography driven conceptual modelling (FLEX-Topo), Hydrol. Earth Syst. Sci., 14(12), 2681–2692, doi:10.5194/hess-14-2681-2010.
Sawicz, K., T. Wagener, M. Sivapalan, P. A. Troch, and G. Carrillo (2011), Catchment classification: Empirical analysis of hydrologic similarity based on catchment function in the Eastern USA, Hydrol. Earth Syst. Sci., 15(9), 2895–2911, doi:10.5194/hess-15-2895-2011.
Scherrer, S., and F.
Naef (2003), A decision scheme to indicate dominant hydrological flow processes on temperate grassland, Hydrol. Process., 17(2), 391–401, doi:10.1002/hyp.1131.
Schmocker-Fackel, P., F. Naef, and S. Scherrer (2007), Identifying runoff processes on the plot and catchment scale, Hydrol. Earth Syst. Sci., 11(2), 891–906.
Seibert, J. (2000), Multi-criteria calibration of a conceptual runoff model using a genetic algorithm, Hydrol. Earth Syst. Sci., 4, 215–224, doi:10.5194/hess-4-215-2000.
Seibert, J., and J. McDonnell (2002), On the dialog between experimentalist and modeler in catchment hydrology: Use of soft data for multicriteria model calibration, Water Resour. Res., 38(11), 1241, doi:10.1029/2001WR000978.
Seibert, J., A. Rodhe, and K. Bishop (2003), Simulating interactions between saturated and unsaturated storage in a conceptual runoff model, Hydrol. Process., 17(2), 379–390, doi:10.1002/hyp.1130.
Soulsby, C., R. Malcolm, R. Helliwell, R. C. Ferrier, and A. Jenkins (2000), Isotope hydrology of the Allt a’ Mharcaidh catchment, Cairngorms, Scotland: Implications for hydrological pathways and residence times, Hydrol. Process., 14(4), 747–762, doi:10.1002/(SICI)1099-1085(200003)14:4<747::AID-HYP970>3.0.CO;2-0.
Soulsby, C., P. Rodgers, R. Smart, J. Dawson, and S. Dunn (2003), A tracer-based assessment of hydrological pathways at different spatial scales in a mesoscale Scottish catchment, Hydrol. Process., 17(4), 759–777, doi:10.1002/hyp.1163.
Soulsby, C., D. Tetzlaff, S. Dunn, and S. Waldron (2006a), Scaling up and out in runoff process understanding: Insights from nested experimental catchment studies, Hydrol. Process., 20(11), 2461–2465, doi:10.1002/hyp.6338.
Soulsby, C., D. Tetzlaff, P. Rodgers, S. Dunn, and S. Waldron (2006b), Runoff processes, stream water residence times and controlling landscape characteristics in a mesoscale catchment: An initial evaluation, J.
Hydrol., 325, 197–221.
Soulsby, C., C. Neal, H. Laudon, D. Burns, P. Merot, M. Bonell, S. Dunn, and D. Tetzlaff (2008), Catchment data for process conceptualization: Simply not enough?, Hydrol. Process., 22(12), 2057–2061, doi:10.1002/hyp.7068.
Soulsby, C., K. Piegat, J. Seibert, and D. Tetzlaff (2011), Catchment-scale estimates of flow path partitioning and water storage based on transit time and runoff modelling, Hydrol. Process., 25(25), 3960–3976, doi:10.1002/hyp.8324.
Speed, M., D. Tetzlaff, C. Soulsby, M. Hrachowitz, and S. Waldron (2010), Isotopic and geochemical tracers reveal similarities in transit times in contrasting mesoscale catchments, Hydrol. Process., 24(9), 1211–1224, doi:10.1002/hyp.7593.
Tetzlaff, D., and C. Soulsby (2008), Sources of baseflow in larger catchments: Using tracers to develop a holistic understanding of runoff generation, J. Hydrol., 359, 287–302, doi:10.1016/j.jhydrol.2008.07.008.
Tetzlaff, D., C. Soulsby, S. Waldron, I. Malcolm, P. Bacon, S. Dunn, A. Lilly, and A. Youngson (2007), Conceptualization of runoff processes using a geographical information system and tracers in a nested mesoscale catchment, Hydrol. Process., 21(10), 1289–1307, doi:10.1002/hyp.6309.
Tetzlaff, D., S. Uhlenbrook, S. Eppert, and C. Soulsby (2008), Does the incorporation of process conceptualization and tracer data improve the structure and performance of a simple rainfall-runoff model in a Scottish mesoscale catchment?, Hydrol. Process., 22(14), 2461–2474, doi:10.1002/hyp.6841.
Tetzlaff, D., J. Seibert, K. McGuire, H. Laudon, D. Burn, S. Dunn, and C. Soulsby (2009), How does landscape structure influence catchment transit time across different geomorphic provinces?, Hydrol. Process., 23(6), 945–953, doi:10.1002/hyp.7240.
Tetzlaff, D., C. Soulsby, M. Hrachowitz, and M. Speed (2011), Relative influence of upland and lowland headwaters on the isotope hydrology and transit times of larger catchments, J.
Hydrol., 400, 438–447, doi:10.1016/j.jhydrol.2011.01.053.
Uhlenbrook, S., M. Frey, C. Leibundgut, and P. Maloszewski (2002), Hydrograph separations in a mesoscale mountainous basin at event and seasonal timescales, Water Resour. Res., 38(6), 1096, doi:10.1029/2001WR000938.
Uhlenbrook, S., S. Roser, and N. Tilch (2004), Hydrological process representation at the meso-scale: The potential of a distributed, conceptual catchment model, J. Hydrol., 291, 278–296, doi:10.1016/j.jhydrol.2003.12.038.
Wagener, T., M. Sivapalan, P. Troch, and R. Woods (2007), Catchment classification and hydrologic similarity, Geogr. Compass, 1(4), 901–931, doi:10.1111/j.1749-8198.2007.000.
Wood, E. F., M. Sivapalan, K. Beven, and L. Band (1988), Effects of spatial variability and scale with implications to hydrologic modeling, J. Hydrol., 102, 29–47, doi:10.1016/0022-1694(88)90090-X.