Available online at www.sciencedirect.com
Environmental Pollution 156 (2008) 544–552
www.elsevier.com/locate/envpol

A classification and regression tree model of controls on dissolved inorganic nitrogen leaching from European forests

James J. Rothwell a,*, Martyn N. Futter b, Nancy B. Dise a

a Department of Environmental and Geographical Sciences, Manchester Metropolitan University, John Dalton Building, Chester Street, Manchester M1 5GD, UK
b Macaulay Land Use Research Institute, Craigiebuckler, Aberdeen AB15 8QH, UK

* Corresponding author. E-mail address: j.j.rothwell@mmu.ac.uk (J.J. Rothwell).

Received 9 October 2007; received in revised form 27 December 2007; accepted 8 January 2008

doi:10.1016/j.envpol.2008.01.007

Classification and regression trees provide new insights into the non-linear behaviour of nitrogen leaching from forests.

Abstract

Often, there is a non-linear relationship between atmospheric dissolved inorganic nitrogen (DIN) input and DIN leaching that is poorly captured by existing models. We present the first application of the non-parametric classification and regression tree approach to evaluate the key environmental drivers controlling DIN leaching from European forests. DIN leaching was classified as low (<3), medium (3–15) or high (>15 kg N ha⁻¹ year⁻¹) at 215 sites across Europe. The analysis identified throughfall NO₃⁻ deposition, acid deposition, hydrology, soil type, the carbon content of the soil, and the legacy of historic N deposition as the dominant drivers of DIN leaching for these forests. Ninety-four percent of sites were successfully classified into the appropriate leaching category. This approach shows promise for understanding complex ecosystem responses to a wide range of anthropogenic stressors as well as an improved method for identifying risk and targeting pollution mitigation strategies in forest ecosystems.

© 2008 Elsevier Ltd. All rights reserved.

Keywords: Nitrogen leaching; Forest; Threshold; Non-linear; Model; Prediction

1. Introduction

Leaching of dissolved inorganic nitrogen (DIN) from forest soils is a problem across Europe as it causes acidification of surface waters and eutrophication of coastal marine environments (Vitousek et al., 1997). There are three main mechanisms that may cause DIN leaching from non-agricultural ecosystems: (i) deposition of atmospheric N surplus to the requirements of plant and microbial communities; (ii) disturbance to the vegetation community; and (iii) enhanced mineralization of soil N (Gundersen et al., 2006). Human activities related to fossil fuel burning and agriculture are the main sources of excess atmospheric N pollution (Vitousek et al., 1997).

Several empirical models based on linear relationships between DIN leaching and hypothesised environmental drivers have been derived for forest ecosystems in Europe (e.g. Dise et al., 1995, 1998a,b; Gundersen et al., 1998; MacDonald et al., 2002; Kristensen et al., 2004; van der Salm et al., 2007) and North America (e.g. Fenn et al., 1998; Aber et al., 2003; Lovett et al., 2002). Prediction of DIN leaching has been limited by the high degree of variability in the response of forest ecosystems to N deposition, and the linear techniques used. As a consequence the main use of empirical modelling to this point has been identification of potential drivers and hypothesis testing.

Classification and regression trees (Breiman et al., 1984) (also described as ‘partitioning trees’ in this paper) are a data mining technique for empirical model building and hypothesis formulation. A classification and regression tree builds a set of decision rules for identifying response variable group membership or value based on a dichotomous partitioning of predictor variables.
A major advantage of partitioning trees is that assumptions which are required for the appropriate use of parametric statistics, such as Gaussian distribution of predictor variables, do not need to be satisfied. Traditional linear techniques such as multiple linear regression are also only able to identify a limited number of predictor variables, often due to multi-collinearity constraints, and predictor and response variables must show a linear relationship over their entire range. In contrast, tree-based models allow the complex interactions between the predictor variables to be represented, with no assumptions of linearity. Multiple linear regression identifies global relationships in the data set, whereas partitioning trees are able to identify local relationships. Although classification and regression trees can be used for empirical model building, large data sets are required for the development of statistically valid models.

Recently, partitioning trees have been used to identify potential causal relationships in a variety of environmental data sets (e.g. Dobbertin and Biging, 1998; De’ath and Fabricius, 2000; Bennett et al., 2006; Lawler et al., 2006; Sullivan et al., 2006). The approach has also been used to investigate controls on soil NO₃⁻-N in a large watershed with heterogeneous land use (Lamsal et al., 2006), but has not previously been used to analyse the dynamics of N pollution in a large number of forested ecosystems.

The aim of this study is to use classification and regression tree analysis to determine the broad-scale predictors of DIN leaching from forest soils in Europe and to enhance our understanding of forest N dynamics. A further aim is to establish whether outcomes from the partitioning tree analysis can be used to identify possible management strategies for the reduction of N pollution in surface waters caused by DIN leaching from European forest soils.

2. Materials and methods

2.1. Data set

Data from the Indicators of Forest Ecosystem Functioning (IFEF) data set were used in this analysis. IFEF is a compilation of published studies of N inputs and N outputs from European forests together with ecosystem data. The IFEF data set has previously been used to investigate European-scale controls on DIN, dissolved aluminium and base cation leaching (e.g. Dise et al., 2001; Armbruster et al., 2002; MacDonald et al., 2002). We used data from 215 plot and catchment scale forest sites from the data set. These sites are situated primarily across northern and central Europe. Sites were limited to those for which DIN input ≥ DIN leaching to avoid heavily damaged or disturbed ecosystems. Sites were also excluded from the analysis if there was evidence of liming or if the forests were dominated by calcareous soils. All data are average measurements collected over several years for the period 1985–2000.

Data from the IFEF data set show a non-linear relationship between atmospheric DIN input and DIN leaching (Fig. 1). In the IFEF data set, similar European data sets (e.g. van der Salm et al., 2007) and North American data sets (e.g. Aber et al., 2003) no DIN leaching is observed at low levels of DIN deposition. However, the leaching of DIN is highly variable at intermediate levels of DIN deposition. Some sites do not leach DIN while others show high leaching for the same level of DIN deposition. High levels of DIN deposition always lead to DIN leaching, but the amount leached is highly variable. Such threshold relationships between DIN deposition and DIN leaching have been demonstrated in essentially all analyses of European and North American forest data sets (cf. Dise and Wright, 1995; MacDonald et al., 2002; Aber et al., 2003; van der Salm et al., 2007). Clearly, such threshold behaviour cannot be modelled appropriately with standard linear statistical methods.

Fig. 1. Relationship between DIN leaching and throughfall deposition of DIN for IFEF forest sites. [Plot: DIN leaching (kg N ha⁻¹ y⁻¹) against DIN TF (kg N ha⁻¹ y⁻¹).]

An examination of the cumulative distribution of DIN leaching for the 215 forest sites revealed slope breakpoints at 3 and 15 kg N ha⁻¹ year⁻¹. The 129 sites with observed leaching of <3 kg N ha⁻¹ year⁻¹ were categorised as low leaching. There were 53 sites leaching between 3 and 15 kg N ha⁻¹ year⁻¹, which were categorised as medium leaching, while the 33 sites leaching in excess of 15 kg N ha⁻¹ year⁻¹ were categorised as high leaching (these counts follow the per-leaf site numbers and category percentages in Table 2). Fig. 2 shows the location of sites across Europe and their DIN leaching category. Sites in central and northern Scandinavia and Finland, France and Spain all have low levels of leaching. Medium levels of DIN leaching are seen in parts of the UK and across central Europe. High DIN leaching is observed in Ireland, the Netherlands, Germany, Southern Scandinavia and the Czech Republic.

The predictor variables used in this analysis are listed in Table 1. Candidate variables included N deposition (in bulk and throughfall), site properties (elevation, bedrock geology, soil type, soil chemistry and tree type), climate (mean annual temperature and precipitation) and modelled cumulative historical N deposition from 1880 to 2000 (cf. Schopp et al., 2003). These variables were chosen as they are known to affect DIN leaching and data were commonly available.

2.2. Statistical techniques

Classification and regression trees are a non-parametric technique for the sequential partitioning of a data set composed of a response variable and any number of potential predictor variables, using dichotomous criteria (Breiman et al., 1984). After each split, the technique searches for the predictor variable that provides the most effective binary separation of the range in the response variable. As a result, predictor variables can be used more than once.
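To make the split search concrete, the following is a minimal sketch in Python of how a single continuous predictor could be scanned for its best binary cut. Each cut is scored by the reduction in the G² likelihood-ratio chi-square, which is an assumption made for illustration: the function names (`g2`, `best_binary_split`) and the toy values are hypothetical, and the exact split criterion applied by the statistical software in this study may differ.

```python
import math
from collections import Counter


def g2(labels):
    """Likelihood-ratio chi-square of one node: -2 * sum(n_k * ln(n_k / n)).

    A pure node (a single category) scores 0; mixed nodes score higher.
    """
    n = len(labels)
    return -2.0 * sum(c * math.log(c / n) for c in Counter(labels).values())


def best_binary_split(values, labels):
    """Return (threshold, G2 reduction) for the best cut on one predictor.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, 0.0) if no cut reduces G2.
    """
    pairs = sorted(zip(values, labels))
    parent = g2(labels)
    best_threshold, best_gain = None, 0.0
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # identical values cannot be separated by a cut
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        gain = parent - g2(left) - g2(right)
        if gain > best_gain:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_gain = gain
    return best_threshold, best_gain


# Made-up throughfall values and leaching categories (not data from this study).
toy_no3_tf = [2.0, 4.0, 6.0, 9.0, 12.0, 20.0]
toy_category = ["low", "low", "low", "high", "high", "high"]
threshold, gain = best_binary_split(toy_no3_tf, toy_category)
print(threshold, round(gain, 3))
```

A full tree would repeat this search over every candidate predictor at every node and split on the best result, which is why the same predictor can reappear further down a branch.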
Unlike traditional regression methods, both predictor and response variables may be continuous or categorical. This was useful as DIN leaching can be influenced by factors that are categorical (e.g. tree type or soil type) and the IFEF data set contains a mixture of continuous and categorical variables.

The classification and regression tree analysis was performed using JMP 5.1 (SAS Institute). The criterion used for selecting the splits on the nodes was set to ‘Max Split Statistic’. This split selection method examines all possible splits for each predictor variable at each node. Missing values were assigned to ‘Closest’ and the minimum split size for nodes was set to three.

With no independent test sample, a k-fold cross validation procedure was used. This procedure randomly partitions the data set into k equal sized groups. Each group is then sequentially used as a test set for the model derived from the combined set of remaining groups. This ensures that roughly unbiased estimates for predictions are obtained. In this application five was selected for the k-fold cross validation, a value commonly used for this type of validation (Breiman et al., 1984; Witten and Frank, 2005).

Model goodness-of-fit was assessed using the G² statistic. The G² statistic is a likelihood-ratio chi-square, analogous to a sum of squares for continuous data. The significance of each additional split in the tree was assessed using the Akaike Information Criterion (AIC) (Akaike, 1974). Statistical significance was assessed at p ≤ 0.05. Splitting was stopped immediately prior to the first split that would have resulted in a leaf node with an AIC probability p > 0.05.

Fig. 2. Spatial distribution of forest sites used in the analysis and their observed DIN leaching category. [Map legend: DIN leaching category < 3, 3–15 and > 15 kg N ha⁻¹ y⁻¹.]

Table 1
Predictors used in model building

Code | Description | Units | Range
Alt | Altitude above sea level | m | 4–2976
MAT | Mean annual temperature | °C | 1.9–14
MAP | Mean annual precipitation | mm | 395–3032
Dist Coast | Distance to nearest coast | km | 1–770
Bedrock | Bedrock type | Igneous; sedimentary; metamorphic | -
Tree Type | Tree type | Coniferous; deciduous; nitrogen fixer; shrub | -
Soil Type | Soil type (no calcareous soils in database) | Alisol; anthrosol; arenosol; cambisol; gley; gleysol; histosol; luvisol; other non-calcareous; peat; podzol; umbrisol | -
C:N Org | Organic layer C:N ratio | kg kg⁻¹ | 10.2–50.9
pH B | pH of the B horizon | None | 3.10–6.98
Oa %C | % organic carbon in Oa horizon | % | 10.7–50.2
Oa %N | % total N in Oa horizon | % | 0.09–2.35
Soil %C | % organic carbon in mineral horizon | % | 0.45–14.3
Soil %N | % total N in mineral horizon | % | 0.02–2.21
NH₄⁺-N BP | Ammonium, as N in bulk precipitation | kg ha⁻¹ year⁻¹ | 0.18–15.1
NO₃⁻-N BP | Nitrate as N in bulk precipitation | kg ha⁻¹ year⁻¹ | 0.35–12.0
NH₄⁺-N TF | Ammonium as N in throughfall | kg ha⁻¹ year⁻¹ | 0.14–51.5
NO₃⁻-N TF | Nitrate as N in throughfall | kg ha⁻¹ year⁻¹ | 0.14–19.8
Cum N-in | Cumulative historical N deposition (1880–2000) | kg ha⁻¹ | 7656–251594
SO₄²⁻-S TF | Sulphate as S in throughfall | kg ha⁻¹ year⁻¹ | 1.76–118
Acid TF | Acid throughfall (NO₃⁻-N + NH₄⁺-N + SO₄²⁻-S) | kg ha⁻¹ year⁻¹ | 2.04–160
Runoff/Seep | Runoff or seepage flux | mm year⁻¹ | 38–2642

3. Results

The G² and AIC statistics obtained during model building are shown in Fig. 3. The first split that would have resulted in a leaf node with an AIC test statistic probability > 0.05 was obtained with a model with 15 terminal leaf nodes. This split was removed and the final model comprised 14 terminal leaf nodes.

Fig. 3. Model G² statistic and probability that variation explained by an additional predictor is more than would be expected due to chance alone. Terminal leaf nodes 1–14 correspond to A–N respectively in Fig. 4 and Table 2. [Plot: G² and probability against the number of terminal leaf nodes.]

Overall, the model classified 94.1% of sites into the correct leaching category. 98.2% of the low-leaching sites were correctly classified. The remaining 1.8% were predicted to have medium levels of leaching. 87.8% of the medium-leaching sites were correctly classified, 4.4% were classified as low and 7.8% as high. 87.5% of the high-leaching sites were correctly classified. The remaining 12.5% were classified as medium. All cases of mis-classification placed the sites into the next category, i.e. no high-leaching sites were classified as low or low-leaching sites as high.

Fig. 4 shows the classification and regression tree criteria used in predicting DIN leaching. Each of the 215 sites was assigned to one of 14 leaf nodes (‘A’ to ‘N’) which characterised broad-scale controls on DIN leaching. A dichotomous key showing decision criteria for each terminal leaf node is presented in Table 3.

The first split within the database that partitioned the tree into two main branches occurs at the level of NO₃⁻-N in throughfall. No sites receiving low current NO₃⁻-N in throughfall (<7.7 kg N ha⁻¹ year⁻¹) fall in the high DIN leaching category, regardless of the history of N deposition, climate, or any other relevant environmental or site characteristic (leaf nodes ‘A’ through ‘F’). In contrast, almost no sites receiving high current NO₃⁻-N in throughfall (≥7.7 kg N ha⁻¹ year⁻¹) fall in the low DIN leaching category (leaf nodes ‘I’ to ‘N’).
Within the low NO₃⁻-N-in branch, sites that have received low historical N deposition (<9.9 × 10⁴ kg N ha⁻¹ from 1880 to 2000) and are situated on soils that are either highly organic (peats, histosols, umbrisols), deep and well-weathered (podzols, cambisols) or developed on fluvio-glacial sands (arenosols) all fall in the low DIN leaching category (leaf node ‘B’). Forests receiving low current and historical N deposition but underlain by other soils show on average higher DIN leaching, with some leaching at intermediate levels (leaf node ‘A’). Forests with low current NO₃⁻-N deposition but high historical N deposition may still show low DIN leaching if they have low water fluxes in runoff or seepage (<~200 mm year⁻¹; leaf node ‘C’). With higher water fluxes, forests must have an organic carbon-rich Oa horizon (>~35%) to fall in the low DIN leaching category (leaf nodes ‘E’ and ‘F’, with ‘F’ being at the upper range of the low NO₃⁻-N deposition sites and showing some intermediate leaching).

On the other main branch of the tree, at high NO₃⁻-N-in (≥7.7 kg N ha⁻¹ year⁻¹), the next split is at the level of potential acid deposition in throughfall (kg (NO₃⁻-N-in) + (NH₄⁺-N-in) + (SO₄²⁻-S-in)). All sites that receive high current NO₃⁻-N in throughfall, as well as high current levels of acid deposition (>~80 kg ha⁻¹ year⁻¹), leach in the high DIN category (leaf node ‘N’). However, if acid deposition is lower, most sites leach at the intermediate level. The only exceptions to this are some sites with a low organic horizon C:N (<24) (leaf node ‘J’) and all sites located on podzols or luvisols (well-weathered, acid soils) that receive high amounts of precipitation (MAP ca. 1040–1280 mm) (leaf node ‘M’); these forests fall in the high DIN leaching category. Interestingly, once precipitation exceeds 1280 mm, forests leach less N (node ‘L’). Four of these sites are remote high altitude sites and three are located on moorland spruce plantations.
Table 2
Classification summary and leaf formula (predictor descriptions, units and numerical ranges are shown in Table 1)

Leaf node | Low (%) | Medium (%) | High (%) | No. of sites | NO₃⁻-N TF | Cum N-in | Soil type | Runoff/Seep | Oa %C | Acid TF | MAP | C:N Org | NH₄⁺-N TF
A | 50 | 50 | 0 | 4 | <7.7 | <99102 | Other non-calcareous | - | - | - | - | - | -
B | 100 | 0 | 0 | 80 | <7.7 | <99102 | Arenosol, cambisol, histosol, peat, podzol, umbrisol | - | - | - | - | - | -
C | 100 | 0 | 0 | 20 | <7.7 | ≥99102 | - | <199 | - | - | - | - | -
D | 0 | 100 | 0 | 14 | <7.7 | ≥99102 | - | >199 | <35.6 | - | - | - | -
E | 100 | 0 | 0 | 16 | <7.57 | ≥99102 | - | >199 | ≥35.6 | - | - | - | -
F | 33 | 67 | 0 | 3 | 7.57–7.7 | ≥99102 | - | >199 | ≥35.6 | - | - | - | -
G | 33 | 67 | 0 | 3 | ≥7.7 | - | Podzol | - | - | <77.7 | <1040 | ≥24 | <22.2
H | 100 | 0 | 0 | 9 | ≥7.7 | - | Arenosol, cambisol, other non-calcareous | - | - | <77.7 | <1040 | ≥24 | <22.2
I | 0 | 100 | 0 | 3 | ≥7.7 | - | - | - | - | <77.7 | <1040 | ≥24 | ≥22.2
J | 0 | 71 | 29 | 7 | ≥7.7 | - | - | - | - | <77.7 | <1040 | <24 | -
K | 0 | 100 | 0 | 16 | ≥7.7 | - | Cambisol, gleysol, other non-calcareous | - | - | <77.7 | ≥1040 | - | -
L | 0 | 100 | 0 | 6 | ≥7.7 | - | Podzol, luvisol | - | - | <77.7 | ≥1281 | - | -
M | 0 | 0 | 100 | 5 | ≥7.7 | - | Podzol, luvisol | - | - | <77.7 | 1040–1281 | - | -
N | 0 | 10 | 90 | 29 | ≥7.7 | - | - | - | - | ≥77.7 | - | - | -

Fig. 4. Classification and regression tree showing the decision criteria for predicting DIN leaching from European forest soils. Identifiers ‘A’ to ‘N’ for the terminal leaf nodes are shown, together with DIN leaching category. [Tree diagram: the first split is on NO₃⁻-N TF at 7.7 kg ha⁻¹ y⁻¹; the split criteria for each leaf are listed in Table 2. Legend: DIN leaching category < 3, 3–15 and > 15 kg N ha⁻¹ y⁻¹.]
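The decision criteria in Fig. 4 and Table 2 can also be written as a short function. The sketch below, in Python, is only an illustration: the function name and the dictionary keys (`no3_tf`, `cum_n_in`, `soil`, `runoff`, `oa_c`, `acid_tf`, `map_mm`, `cn_org`, `nh4_tf`) are hypothetical, the thresholds are copied from Table 2, and the returned percentages are the observed category shares within each leaf rather than a calibrated prediction for a new site. Soil types that the tree does not list at a given split raise an error instead of being guessed.

```python
def classify_din_leaching(site):
    """Return (leaf node, {category: % of sites in that leaf}) per Table 2.

    `site` is a dict using the hypothetical keys described above; units are
    those of Table 1 (kg ha-1 year-1, kg ha-1, mm, %).
    """
    soil = site.get("soil")
    if site["no3_tf"] < 7.7:  # low current nitrate throughfall branch
        if site["cum_n_in"] < 99102:  # low historical N deposition
            if soil in {"arenosol", "cambisol", "histosol", "peat",
                        "podzol", "umbrisol"}:
                return "B", {"low": 100}
            if soil == "other non-calcareous":
                return "A", {"low": 50, "medium": 50}
            raise ValueError(f"soil type not covered at this split: {soil}")
        if site["runoff"] < 199:
            return "C", {"low": 100}
        if site["oa_c"] < 35.6:
            return "D", {"medium": 100}
        if site["no3_tf"] < 7.57:
            return "E", {"low": 100}
        return "F", {"low": 33, "medium": 67}

    # high current nitrate throughfall branch (>= 7.7)
    if site["acid_tf"] >= 77.7:
        return "N", {"medium": 10, "high": 90}
    if site["map_mm"] < 1040:
        if site["cn_org"] < 24:
            return "J", {"medium": 71, "high": 29}
        if site["nh4_tf"] >= 22.2:
            return "I", {"medium": 100}
        if soil == "podzol":
            return "G", {"low": 33, "medium": 67}
        if soil in {"arenosol", "cambisol", "other non-calcareous"}:
            return "H", {"low": 100}
        raise ValueError(f"soil type not covered at this split: {soil}")
    if soil in {"cambisol", "gleysol", "other non-calcareous"}:
        return "K", {"medium": 100}
    if soil in {"podzol", "luvisol"}:
        if site["map_mm"] < 1281:
            return "M", {"high": 100}
        return "L", {"medium": 100}
    raise ValueError(f"soil type not covered at this split: {soil}")


# Hypothetical site: high nitrate and high acid throughfall.
example = {"no3_tf": 12.0, "acid_tf": 95.0}
print(classify_din_leaching(example))
```

Only the keys needed along a site's path through the tree are read, so a site that reaches leaf ‘N’ needs nothing beyond nitrate and acid throughfall.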
The only forests that receive high NO₃⁻-N in throughfall (≥7.7 kg N ha⁻¹ year⁻¹) but leach in the low DIN leaching category are those that receive <1040 mm precipitation year⁻¹, have a high organic horizon C:N ratio (≥24), and receive <22 kg NH₄⁺-N ha⁻¹ year⁻¹ in throughfall. Of these, forests on deeper soils with a higher buffering capacity (arenosols, cambisols) leach the lowest levels of N, with all in the low DIN leaching category (node ‘H’). Forests developed on podzols leach on average more N (node ‘G’).

Fig. 5a shows low DIN leaching sites in leaf nodes ‘B’ and ‘C’. The sites in leaf node ‘B’ are mostly in southern Scandinavia. They are currently receiving relatively low levels of NO₃⁻-N in throughfall and have historically low levels of N deposition. The sites in leaf node ‘C’ are mostly found in a band running through the mountainous region of Germany and into the Czech Republic. They have received high levels of historical N deposition but are now receiving low levels of NO₃⁻-N in throughfall. High-leaching sites from leaf node ‘N’ are distributed across central Europe (Fig. 5b). These have relatively low NO₃⁻-N in throughfall but high levels of acid deposition (enhanced NH₄⁺-N and/or SO₄²⁻-S). Note that the partitioning tree was able to successfully distinguish these high-leaching forests in southern Scandinavia from the majority of low-leaching sites, by assigning them to node ‘N’ rather than ‘A’.

Table 4 shows the relative contributions of predictors to the final classification. All predictors in Table 4 make statistically significant contributions (p ≤ 0.05) to explaining the observed pattern in DIN leaching across European forests. Of the suite of variables used in model building, NO₃⁻-N in throughfall was the most important in predicting the observed pattern of DIN leaching. Acid throughfall was the next most important variable. The mean annual precipitation (MAP), percentage of organic carbon in the Oa horizon (Oa %C), and soil type were all approximately equivalent in their contribution to explaining DIN leaching. The remainder of the predictors (cumulative historical N deposition, runoff/seepage water flux, C:N ratio of the organic horizon, and NH₄⁺-N in throughfall) explained a minor, but statistically significant, part of DIN leaching (Table 4).

Table 3
Dichotomous key for leaching status (leaf nodes are displayed in Fig. 4)

Code | Decision
1 | Is the NO₃⁻-N in throughfall less than 7.7 kg ha⁻¹ year⁻¹? Yes: go to 2. No: go to 7.
2 | Is the cumulative historical N deposition less than 99102 kg ha⁻¹? Yes: go to 3. No: go to 4.
3 | If the site is on one of the following soils: arenosol; cambisol; histosol; peat; podzol or umbrisol: Low, 100% (leaf node B). For sites where the soil type is other non-calcareous: Low, 50%; medium, 50% (leaf node A).
4 | Is the runoff/seepage water flux less than 199 mm year⁻¹? Yes: Low, 100% (leaf node C). No: go to 5.
5 | Is the percent organic carbon in the Oa horizon less than or equal to 35.6%? Yes: Medium, 100% (leaf node D). No: go to 6.
6 | Is the NO₃⁻-N in throughfall less than 7.57 kg ha⁻¹ year⁻¹? Yes: Low, 100% (leaf node E). No: Low, 33%; medium, 67% (leaf node F).
7 | Is acid throughfall less than 77.7 kg ha⁻¹ year⁻¹? Yes: go to 8. No: Medium, 10%; high, 90% (leaf node N).
8 | Is the mean annual precipitation less than 1040 mm year⁻¹? Yes: go to 9. No: go to 12.
9 | Is the organic layer C:N ratio less than 24? Yes: Medium, 71%; high, 29% (leaf node J). No: go to 10.
10 | Is NH₄⁺-N in throughfall greater than 22.2 kg ha⁻¹ year⁻¹? Yes: Medium, 100% (leaf node I). No: go to 11.
11 | For sites on podzols: Low, 33%; medium, 67% (leaf node G). For sites on arenosols, cambisols or other non-calcareous soils: Low, 100% (leaf node H).
12 | Is the site on a cambisol, gleysol or other non-calcareous soil? Yes: Medium, 100% (leaf node K). No: if the site is on a podzol or luvisol, go to 13.
13 | Is the mean annual precipitation less than 1281 mm year⁻¹? Yes: High, 100% (leaf node M). No: Medium, 100% (leaf node L).

Eight of the 215 sites were misclassified by the partitioning tree. Two sites with low DIN leaching were classified as medium, four sites with medium leaching were categorised as high and two high-leaching sites were classified as medium. The two low-leaching sites classified as medium were located in Sweden. These forests were near each other and located close to the sea. One was quite wet (~900 mm precipitation year⁻¹) and the other had a very high C:N ratio (38.9). All four of the medium-leaching sites that were classified as high were on cambisols. One high-leaching site classified as medium had a borderline DIN export (16.8 kg ha⁻¹ year⁻¹; 15 is the cutoff) and the other had high historic N deposition, relatively low levels of acid throughfall, a low C:N ratio in the organic layer and a high percent organic carbon in the Oa horizon.

There are a number of generalisations that can be drawn from these results. Low levels of NO₃⁻-N in throughfall combined with low cumulative historical N deposition almost always result in low levels of DIN leaching (leaf nodes ‘A’ and ‘B’). Sites with high cumulative historical N deposition can still show low levels of DIN leaching if current atmospheric NO₃⁻-N deposition is low and the forest has at least one major characteristic predisposing it for N retention, such as low runoff water flux (leaf node ‘C’) or high % soil organic carbon (leaf nodes ‘E’ and ‘F’). Organic layer C:N ratios are most useful for distinguishing between low- and medium-leaching sites (leaf nodes ‘H’ vs. ‘J’). After accounting for the primary drivers, sites with a C:N ratio ≥24 (leaf node ‘H’) leach low levels of DIN, whereas sites with a C:N ratio <24 (leaf node ‘J’) leach medium, and occasionally high levels of DIN.
High levels of acid throughfall combined with high levels of NO₃⁻-N in throughfall will almost certainly result in high levels of DIN leaching (leaf node ‘N’).

4. Discussion

Predicting DIN leaching from forest soils using linear statistical techniques has been limited by the non-linear relationships between DIN leaching and predictor variables (such as N deposition or the C:N ratio of the soil), by the dependence of DIN leaching on some aggregate site characteristics that are best described categorically (e.g. soil type), and by complex interactions among predictors. The partitioning tree approach is a statistical tool ideally suited for interrogating complex non-linear environmental data sets. Using this tree-based method for evaluating the controls on DIN leaching in European forest ecosystems, NO₃⁻-N in throughfall deposition can be hypothesised as the primary driver of DIN leaching. The analysis revealed that even with high historical N deposition, sites with low levels of contemporary NO₃⁻-N deposition can leach low levels of DIN.

The second most important predictor of DIN leaching was anthropogenic acid deposition. This analysis suggests that higher levels of acid throughfall, if that acid throughfall has high NO₃⁻-N, will almost always result in high DIN leaching.

Numerous studies have demonstrated that DIN leaching in forests is negatively correlated with the forest floor C:N ratio and have reported a threshold of between 23 and 25, below which significant DIN leaching may occur (e.g. Dise et al., 1998a; Gundersen et al., 1998; Aber et al., 2003; van der Salm et al., 2007). The partitioning tree analysis reveals that the soil organic layer C:N ratio is important under some circumstances, but is not a ubiquitous predictor of DIN leaching.
This supports previous arguments that a low C:N ratio of the organic horizon is only one of several factors (including sustained, elevated N deposition, low to intermediate net primary productivity and low soil pH) that are necessary to initiate DIN leaching from most temperate forests (Dise et al., 1998a; MacDonald et al., 2002). DIN leaching from sites with high NO₃⁻-N in throughfall, moderate levels of acid throughfall and lower precipitation is sensitive to organic soil C:N ratio. At these sites, soil C:N ratios less than 24 are associated with higher DIN leaching. It is worth noting that the threshold of 24 was determined independently by the partitioning tree technique.

The results of this study revealed that hydrology (mean annual precipitation and runoff/seepage) is an important control on DIN leaching. High levels of runoff and precipitation are generally correlated with higher levels of DIN leaching. However, for those sites receiving very high levels of precipitation (≥~1280 mm) DIN leaching is moderate, even when NO₃⁻-N in throughfall is high (leaf node ‘L’). A possible explanation for this occurrence may be a hydrological effect. When soil horizons are saturated with water, high precipitation leads to an increase in the proportion of overland flow in the total runoff and a reduction in the contribution from seepage water (rich in DIN). This could lead to a partial bypassing of DIN stores in the upper soil horizons (Creed and Band, 1998).

Fig. 5. Sites with low (left panel) and high (right panel) DIN leaching in Central Europe. (A) Leaf nodes B and C. (B) Leaf node N. Leaf node description is displayed in Table 2.

The partitioning tree model revealed that soil type is important to DIN leaching. Highly organic, gleyed, and well-buffered soils are all associated with low DIN leaching.
In particular, cambisols, or brown earths, are less likely to leach high levels of DIN at high levels of N deposition. These soils tend to be less acid, less strongly weathered and with a higher buffering capacity than podzols (Bridges, 1997). However, the level of N deposition is still the driving force: Scandinavian podzols, which have always received relatively low levels of N deposition, only leach low levels of DIN. Podzols in Germany and the Netherlands receiving high levels of N deposition leach high levels of DIN (Fig. 5b).

The partitioning tree analysis also provides insights into the relative roles of oxidised and reduced N deposition in DIN leaching in European forests, identifying NO₃⁻-N as the dominant driver, and NH₄⁺-N on its own of importance only after several other factors have been accounted for (Table 4). The role of NH₄⁺-N is often confounded by the fact that, for nearly all European forests receiving high DIN, throughfall is dominated by NH₄⁺-N. Therefore, lowering the input of reduced N is essential for recovery of these ecosystems. However, regional analyses, including this one, consistently show that oxidised N is a significantly stronger driver of DIN leaching than reduced N. Previous work using earlier versions of the IFEF database and simple regressions showed that DIN leaching was three times higher for a given input of NO₃⁻-N than for the same input of NH₄⁺-N (Dise et al., 1998b). Ecosystem reasons for the enhanced leaching of NO₃⁻-N include preferential vegetation uptake of NH₄⁺-N, nitrification of NH₄⁺-N, and enhanced soil retention of NH₄⁺-N on cation exchange sites (references in Dise et al., 1998b).

Some of the forest sites were misclassified by the partitioning tree model. The two low-leaching sites that were misclassified as medium-leaching were located very close to the sea.
At these coastal locations it may be the case that less reactive N and SO2 4 -N occurs as acid deposition and more is accompanied by marine-derived basic cations. One of these coastal sites was characterised by quite high rainfall, and as such it Table 4 Predictor variable contributions to explaining DIN leaching Predictor G2 NO 3 -N TF Acid TF MAP Oa %C Soil Type Cum N-in Runoff/Seep C:N Org NHþ 4 -N TF 152 48.4 34.5 32.9 31.1 22.8 19.2 13.7 8.28 J.J. Rothwell et al. / Environmental Pollution 156 (2008) 544e552 is possible that this site may have significant levels of denitrification, which is unaccounted in this analysis. The other coastal site had a very high C:N ratio in the organic horizon. C-rich forest sites can soak up more N than expected. It is also worth noting that one of the misclassified sites was placed in the incorrect category by <1 kg N ha1 year1, an amount easily within year-to-year variation in DIN export. This points out the need in future analyses for determining error ranges between categories. Lovett et al. (2004) demonstrate that tree species can exert a control on N cycling within a particular region. Interrogation of the IFEF data set using the partitioning tree approach provided no evidence that tree type (conifer vs. deciduous) or tree species (results not shown) controls DIN leaching when other drivers have been accounted for. One reason for the difference in findings may be the scale: the Lovett et al. (2004) study was over a relatively narrow geographic range (Catskill Mountains) whereas the IFEF data cover much of Europe. It is clear, for example, that one tree genus can cover a very wide range in N deposition (e.g. Picea) or be limited in the database to particular ranges of N deposition (e.g. Pseudotsuga, Fagus) which is likely confounded with climate (Fig. 6a). However, similar to Lovett et al. 
(2004) we did find that most Quercus-dominated sites receiving low to moderate N deposition showed a low C:N ratio with low DIN leaching (Fig. 6b).

This analysis highlights the major controls on DIN leaching in European forests and may therefore be useful for formulating appropriate management strategies for controlling surface water acidification and coastal eutrophication caused by excess leaching of DIN from forest soils. Such a strategy could take the form of a dichotomous key (an example, based on this analysis, is shown in Table 3, although we would expect a key in practical use to be much simpler). The analysis suggests that reducing NO₃⁻-N deposition, even at those sites with a long legacy of historical N deposition, is the most effective strategy for lowering DIN leaching. Similarly, reducing atmospheric NH₄⁺-N may help reduce DIN leaching in some circumstances, particularly where NH₄⁺-N is a major component of atmospheric deposition. Further reductions in atmospheric emissions of acidifying pollutants are also likely to lead to improvements in DIN leaching from European forests. Results from large-scale N deposition manipulation experiments, such as NITREX (Wright and van Breemen, 1995), also revealed that reductions in N deposition at sites displaying symptoms of N saturation significantly reduced DIN leaching after NO₃⁻-N deposition was reduced (Bredemeier et al., 1998). Long-term environmental monitoring in central Europe has also shown that reductions in NO₃⁻-N deposition have resulted in a decrease in DIN leaching in N-saturated catchments even with a legacy of high historical N deposition (Kopacek et al., 1998).
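As an illustration only, the leaching categories used in this analysis (low <3, medium 3–15, high >15 kg N ha⁻¹ year⁻¹) can be written as a simple rule, with an added flag for sites that lie close to a category boundary, as was the case for one of the misclassified sites. The Python sketch below is not part of the original analysis: the names `classify_din_leaching` and `LeachingClass` and the default margin of 1 kg N ha⁻¹ year⁻¹ are assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Category boundaries (kg N ha-1 year-1) as defined for this analysis.
LOW_MAX = 3.0    # fluxes below this value are "low"
HIGH_MIN = 15.0  # fluxes above this value are "high"


@dataclass(frozen=True)
class LeachingClass:
    category: str        # "low", "medium" or "high"
    near_boundary: bool  # True if the flux lies within `margin` of a boundary


def classify_din_leaching(din_flux: float, margin: float = 1.0) -> LeachingClass:
    """Assign a DIN leaching category and flag near-boundary fluxes.

    `margin` is a hypothetical tolerance (kg N ha-1 year-1), not a value
    derived in the paper; it stands in for an error range between categories.
    """
    if din_flux < 0:
        raise ValueError("DIN leaching flux must be non-negative")
    if din_flux < LOW_MAX:
        category = "low"
    elif din_flux <= HIGH_MIN:
        category = "medium"
    else:
        category = "high"
    # Distance to the nearest of the two category boundaries.
    distance = min(abs(din_flux - LOW_MAX), abs(din_flux - HIGH_MIN))
    return LeachingClass(category, distance <= margin)
```

A site flagged with `near_boundary=True` could be treated as belonging to either adjacent category when judging whether a prediction is a genuine misclassification; how wide the margin should be would need to be established from the year-to-year variation in DIN export.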
The advantage of classification and regression trees is that splits are performed locally within a tree branch as opposed to globally (across the entire data set) as is the case with traditional statistical modelling approaches such as stepwise multiple regression and ordination. For example, there is a successful distinction between sites that have had very different N deposition histories, but have similar contemporary N input (Fig. 5a). The partitioning tree approach enables the subtleties and inter-connections between forcing variables that may not previously have been apparent to be identified and modelled. This is especially appropriate for data sets where there are threshold responses.

Although this has primarily been an exploratory study, we have shown that partitioning tree analyses show excellent potential for identifying the controls on DIN leaching from forest soils, and for use as a predictive tool. Fully exploring the potential of this technique would require a relatively large, complete data set with few or no missing values, as well as a separate database of validation sites. This study has identified the most important potential drivers that should be assembled into such databases. Further work could also evaluate the applicability of partitioning tree analyses to other forested and non-forested ecosystems.

Fig. 6. Relationship between DIN leaching and throughfall deposition of DIN for IFEF forest sites displayed according to tree species (A); relationship between DIN leaching and organic layer C:N ratio for IFEF forest sites displayed according to tree species (B).

5. Conclusions

The partitioning tree approach employed in this analysis successfully classified European forest sites into three categories based on DIN leaching. The deposition of NO₃⁻-N in throughfall is the primary determinant of DIN leaching.
In this analysis, it is more important than NH₄⁺-N deposition or cumulative historical N deposition, which suggests that the most effective strategy for reducing DIN leaching is reducing atmospheric input of NO₃⁻-N. Hydrology, soil type and organic carbon content of the soil are the most important ecosystem characteristics that modify the response of a forest to high NO₃⁻-N deposition and high acid deposition. Classification and regression trees are able to provide insights into the biogeochemistry of European forests that cannot be obtained using traditional empirical modelling approaches.

Acknowledgements

Thanks go to the contributors to the IFEF data set. Funding for this project was provided by the European Union (5th Framework Programme) as part of the projects C-NTER (contract no. QLK5-2001-00596) and DYNAMIC (contract no. 2000.60.NL,3B). We would also like to thank two anonymous reviewers for helpful comments on an earlier version of the manuscript.

References

Aber, J.D., Goodale, C.L., Ollinger, S.V., Smith, M.L., Magill, A.H., Martin, M.E., Hallett, R.A., Stoddard, J.L., 2003. Is nitrogen deposition altering the nitrogen status of northeastern forests? BioScience 53 (4), 375–389.
Akaike, H., 1974. A new look at statistical model identification. IEEE Transactions on Automatic Control 19 (6), 716–723.
Armbruster, M., MacDonald, J., Dise, N.B., Matzner, E., 2002. Throughfall and output fluxes of Mg in European forest ecosystems: a regional assessment. Forest Ecology and Management 164, 137–147.
Dobbertin, M., Biging, G.S., 1998. Using the non-parametric classifier CART to model forest tree mortality. Forest Science 44 (4), 507–516.
Bennett, J.P., Jepsen, E.A., Roth, J.A., 2006. Field responses of Prunus serotina and Asclepias syriaca to ozone around southern Lake Michigan. Environmental Pollution 142 (2), 354–366.
Bredemeier, M., Blanck, K., Tietema, A., Boxman, A., Emmett, B.A., Kjønaas, O.J., Moldan, F., Gundersen, P., Schleppi, P., Wright, R.F., 1998.
Input–output budgets at the NITREX sites. Forest Ecology and Management 101 (1–3), 37–56.
Breiman, L., Friedman, J., Olshen, R., Stone, C., 1984. Classification and Regression Trees. Wadsworth, Belmont, CA.
Bridges, E.M., 1997. World Soils. Cambridge University Press, Cambridge.
Creed, I.F., Band, L.E., 1998. Export of nitrogen from catchments within a temperate forest: the need for an understanding of topographic regulation of variable source area dynamics. Water Resources Research 34, 3105–3120.
De’ath, G., Fabricius, K.E., 2000. Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81 (11), 3178–3192.
Dise, N.B., Wright, R.F., 1995. Nitrogen leaching from European forests in relation to nitrogen deposition. Forest Ecology and Management 71 (1–2), 153–161.
Dise, N.B., Matzner, E., Forsius, M., 1998a. Evaluation of organic horizon C:N ratio as an indicator of nitrate leaching in conifer forests across Europe. Environmental Pollution 102 (S1), 453–456.
Dise, N.B., Matzner, E., Gundersen, P., 1998b. Synthesis of nitrogen pools and fluxes from European forest ecosystems. Water, Air and Soil Pollution 105, 143–154.
Dise, N.B., Matzner, E., Armbruster, M., MacDonald, J., 2001. Aluminium output fluxes from forest ecosystems in Europe: a regional assessment. Journal of Environmental Quality 30, 1747–1756.
Fenn, M.E., Poth, M.A., Aber, J.D., Baron, J.S., Bormann, B.T., Johnson, D.W., Lemly, A.D., McNulty, S.G., Ryan, D.E., Stottlemyer, R., 1998. Nitrogen excess in North American ecosystems: predisposing factors, ecosystem responses, and management strategies. Ecological Applications 8 (3), 706–733.
Gundersen, P., Callesen, I., de Vries, W., 1998. Nitrate leaching in forest ecosystems is related to forest floor C/N ratios. Environmental Pollution 102 (S1), 403–407.
Gundersen, P., Schmidt, I.K., Raulund-Rasmussen, K., 2006. Leaching of nitrate from temperate forests – effects of air pollution and forest management.
Environmental Reviews 14, 1–57.
Kopacek, J., Hejzlar, J., Stuchlik, E., Fott, J., Vesely, J., 1998. Reversibility of acidification of mountain lakes after reduction in nitrogen and sulphur emissions in central Europe. Limnology and Oceanography 43 (2), 357–361.
Kristensen, H.L., Gundersen, P., Callesen, I., Reinds, G.J., 2004. Throughfall nitrogen deposition has different impacts on soil solution nitrate concentration in European coniferous and deciduous forests. Ecosystems 7 (2), 180–192.
Lamsal, S., Grunwald, S., Bruland, G.L., Bliss, C.M., Comerford, N.B., 2006. Regional hybrid geospatial modeling of soil nitrate-nitrogen in the Santa Fe River Watershed. Geoderma 135, 233–247.
Lawler, J.J., White, D., Neilson, R.P., Blaustein, A.R., 2006. Predicting climate-induced range shifts: model differences and model reliability. Global Change Biology 12, 1568–1584.
Lovett, G.M., Weathers, K.C., Arthur, M.A., 2002. Control of nitrogen loss from forested watersheds by soil carbon: nitrogen ratio and tree species composition. Ecosystems 5, 712–718.
Lovett, G.M., Weathers, K.C., Arthur, M.A., Schultz, J.C., 2004. Nitrogen cycling in a northern hardwood forest: do species matter? Biogeochemistry 67, 289–308.
MacDonald, J.A., Dise, N.B., Matzner, E., Armbruster, M., Gundersen, P., Forsius, M., 2002. Nitrogen input together with ecosystem nitrogen enrichment predict nitrate leaching from European forests. Global Change Biology 8, 1028–1033.
Schöpp, W., Posch, M., Mylona, S., Johansson, M., 2003. Long-term development of acid deposition (1880–2030) in sensitive freshwater regions in Europe. Hydrology and Earth System Sciences 7, 436–446.
Sullivan, M.S., Jones, M.J., Lee, D.C., Marsden, S.J., Fielding, A.H., Young, E., 2006. A comparison of predictive methods in extinction risk studies: contrasts and decision trees. Biodiversity and Conservation 15, 1977–1991.
van der Salm, C., de Vries, W., Reinds, G.J., Dise, N.B., 2007.
N leaching across European forests: derivation and validation of empirical relationships using data from intensive monitoring plots. Forest Ecology and Management 238, 81–91.
Vitousek, P.M., Aber, J.D., Howarth, R.W., Likens, G.E., Matson, P.A., Schindler, D.W., Schlesinger, W.H., Tilman, D.G., 1997. Human alteration of the global nitrogen cycle: sources and consequences. Ecological Applications 7 (3), 737–750.
Witten, I.H., Frank, E., 2005. Data Mining: Practical Machine Learning Tools and Techniques, second ed. Elsevier, San Francisco.
Wright, R.F., van Breemen, N., 1995. The NITREX project: an introduction. Forest Ecology and Management 71 (1–2), 1–5.