Environmental Pollution 156 (2008) 544–552
A classification and regression tree model of controls on dissolved
inorganic nitrogen leaching from European forests
James J. Rothwell a,*, Martyn N. Futter b, Nancy B. Dise a
a Department of Environmental and Geographical Sciences, Manchester Metropolitan University, John Dalton Building, Chester Street, Manchester M1 5GD, UK
b Macaulay Land Use Research Institute, Craigiebuckler, Aberdeen AB15 8QH, UK
Received 9 October 2007; received in revised form 27 December 2007; accepted 8 January 2008
Classification and regression trees provide new insights into the non-linear behaviour of nitrogen leaching from forests.
Abstract
Often, there is a non-linear relationship between atmospheric dissolved inorganic nitrogen (DIN) input and DIN leaching that is poorly captured by existing models. We present the first application of the non-parametric classification and regression tree approach to evaluate the key
environmental drivers controlling DIN leaching from European forests. DIN leaching was classified as low (<3), medium (3–15) or high
(>15 kg N ha⁻¹ year⁻¹) at 215 sites across Europe. The analysis identified throughfall NO₃⁻ deposition, acid deposition, hydrology, soil
type, the carbon content of the soil, and the legacy of historic N deposition as the dominant drivers of DIN leaching for these forests. Ninety-four
percent of sites were successfully classified into the appropriate leaching category. This approach shows promise for understanding complex
ecosystem responses to a wide range of anthropogenic stressors and offers an improved method for identifying risk and targeting pollution mitigation strategies in forest ecosystems.
© 2008 Elsevier Ltd. All rights reserved.
Keywords: Nitrogen leaching; Forest; Threshold; Non-linear; Model; Prediction
1. Introduction
Leaching of dissolved inorganic nitrogen (DIN) from forest
soils is a problem across Europe as it causes acidification of
surface waters and eutrophication of coastal marine environments (Vitousek et al., 1997). There are three main mechanisms that may cause DIN leaching from non-agricultural
ecosystems: (i) deposition of atmospheric N surplus to the requirements of plant and microbial communities; (ii) disturbance to the vegetation community; and (iii) enhanced
mineralization of soil N (Gundersen et al., 2006). Human activities related to fossil fuel burning and agriculture are the
main sources of excess atmospheric N pollution (Vitousek
et al., 1997).
* Corresponding author.
E-mail address: j.j.rothwell@mmu.ac.uk (J.J. Rothwell).
0269-7491/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.envpol.2008.01.007
Several empirical models based on linear relationships between DIN leaching and hypothesised environmental drivers
have been derived for forest ecosystems in Europe (e.g. Dise
et al., 1995, 1998a,b; Gundersen et al., 1998; MacDonald
et al., 2002; Kristensen et al., 2004; van der Salm et al.,
2007) and North America (e.g. Fenn et al., 1998; Aber
et al., 2003; Lovett et al., 2002). Prediction of DIN leaching
has been limited by the high degree of variability in the response of forest ecosystems to N deposition, and the linear
techniques used. As a consequence the main use of empirical
modelling to this point has been identification of potential
drivers and hypothesis testing.
Classification and regression trees (Breiman et al., 1984)
(also described as ‘partitioning trees’ in this paper) are
a data mining technique for empirical model building and hypothesis formulation. A classification and regression tree
builds a set of decision rules for identifying response variable
group membership or value based on a dichotomous partitioning of predictor variables. A major advantage of partitioning
trees is that assumptions which are required for the appropriate
use of parametric statistics, such as Gaussian distribution of
predictor variables, do not need to be satisfied. Traditional linear techniques such as multiple linear regression are also only
able to identify a limited number of predictor variables, often
due to multi-collinearity constraints, and predictor and response variables must show a linear relationship over their entire range. In contrast, tree-based models allow the complex
interactions between the predictor variables to be represented,
with no assumptions of linearity. Multiple linear regression
identifies global relationships in the data set, whereas partitioning trees are able to identify local relationships. Although
classification and regression trees can be used for empirical
model building, large data sets are required for the development of statistically valid models.
Recently, partitioning trees have been used to identify potential causal relationships in a variety of environmental data
sets (e.g. Dobbertin and Biging, 1998; De’ath and Fabricius,
2000; Bennett et al., 2006; Lawler et al., 2006; Sullivan
et al., 2006). The approach has also been used to investigate
controls on soil NO₃⁻-N in a large watershed with heterogeneous land use (Lamsal et al., 2006), but has not previously
been used to analyse the dynamics of N pollution in a large
number of forested ecosystems.
The aim of this study is to use classification and regression
tree analysis to determine the broad-scale predictors of DIN
leaching from forest soils in Europe and to enhance our understanding of forest N dynamics. A further aim is to elucidate if
outcomes from the partitioning tree analysis can be used to
identify possible management strategies for the reduction of
N pollution in surface waters caused by DIN leaching from
European forest soils.
2. Materials and methods
2.1. Data set
Data from the Indicators of Forest Ecosystem Functioning (IFEF) data set
were used in this analysis. IFEF is a compilation of published studies of N inputs and N outputs from European forests together with ecosystem data. The
IFEF data set has previously been used to investigate European-scale controls
on DIN, dissolved aluminium and base cation leaching (e.g. Dise et al., 2001;
Armbruster et al., 2002; MacDonald et al., 2002). We used data from 215 plot
and catchment scale forest sites from the data set. These sites are situated primarily across northern and central Europe. Sites were limited to those for
which DIN input ≥ DIN leaching to avoid heavily damaged or disturbed ecosystems. Sites were also excluded from the analysis if there was evidence of
liming or if the forests were dominated by calcareous soils. All data are average measurements collected over several years for the period 1985–2000.
Data from the IFEF data set show a non-linear relationship between atmospheric DIN input and DIN leaching (Fig. 1). In the IFEF data set, similar European data sets (e.g. van der Salm et al., 2007) and North American data sets
(e.g. Aber et al., 2003) no DIN leaching is observed at low levels of DIN deposition. However, the leaching of DIN is highly variable at intermediate
levels of DIN deposition. Some sites do not leach DIN while others show
high leaching for the same level of DIN deposition. High levels of DIN deposition always lead to DIN leaching, but the amount leached is highly variable.
Such threshold relationships between DIN deposition and DIN leaching have
been demonstrated in essentially all analyses of European and North American
forest data sets (cf. Dise and Wright, 1995; MacDonald et al., 2002; Aber
et al., 2003; van der Salm et al., 2007). Clearly, such non-linear threshold relationships
are poorly suited to models based on standard linear statistical methods.
Fig. 1. Relationship between DIN leaching (kg N ha⁻¹ y⁻¹) and throughfall deposition of DIN (DIN TF, kg N ha⁻¹ y⁻¹)
for IFEF forest sites.
An examination of the cumulative distribution of DIN leaching for the 215
forest sites revealed slope breakpoints at 3 and 15 kg N ha⁻¹ year⁻¹. The 33
sites with observed leaching of <3 kg N ha⁻¹ year⁻¹ were categorised as low
leaching. There were 129 sites leaching between 3 and 15 kg N ha⁻¹ year⁻¹
which were categorised as medium leaching while the 53 sites leaching in
excess of 15 kg N ha⁻¹ year⁻¹ were categorised as high leaching.
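The categorisation above can be written as a small lookup. The following is a minimal Python sketch, not the authors' code: the function name `leaching_category` is hypothetical, and placing a site that leaches exactly 15 kg N ha⁻¹ year⁻¹ in the medium class is an assumption based on the wording "in excess of 15".

```python
# Breakpoints taken from the text (kg N ha^-1 year^-1).
LOW_MAX = 3.0      # sites below this value are "low"
MEDIUM_MAX = 15.0  # sites above this value are "high"


def leaching_category(din_leaching: float) -> str:
    """Map an annual DIN leaching flux onto low / medium / high.

    Assumption: the boundaries are low < 3 <= medium <= 15 < high.
    """
    if din_leaching < 0:
        raise ValueError("DIN leaching cannot be negative")
    if din_leaching < LOW_MAX:
        return "low"
    if din_leaching <= MEDIUM_MAX:
        return "medium"
    return "high"
```

For example, a site exporting 16.8 kg N ha⁻¹ year⁻¹ falls just inside the high class under these boundaries.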
Fig. 2 shows the location of sites across Europe and their DIN leaching category. Sites in central and northern Scandinavia and Finland, France and Spain
all have low levels of leaching. Medium levels of DIN leaching are seen in
parts of the UK and across central Europe. High DIN leaching is observed
in Ireland, the Netherlands, Germany, Southern Scandinavia and the Czech
Republic.
The predictor variables used in this analysis are listed in Table 1. Candidate variables included N deposition (in bulk and throughfall), site properties
(elevation, bedrock geology, soil type, soil chemistry and tree type), climate
(mean annual temperature and precipitation) and modelled cumulative historical N deposition from 1880 to 2000 (cf. Schopp et al., 2003). These variables
were chosen as they are known to affect DIN leaching and data were commonly available.
2.2. Statistical techniques
Classification and regression trees are a non-parametric technique for the
sequential partitioning of a data set composed of a response variable and
any number of potential predictor variables, using dichotomous criteria (Breiman et al., 1984). After each split, the technique searches for the predictor variable that provides the most effective binary separation of the range in the
response variable. As a result, predictor variables can be used more than
once. Unlike traditional regression methods, both predictor and response variables may be continuous or categorical. This was useful as DIN leaching can
be influenced by factors that are categorical (e.g. tree type or soil type) and the
IFEF data set contains a mixture of continuous and categorical variables.
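To make the idea of a dichotomous split concrete, the sketch below searches one continuous predictor for the threshold that best separates a categorical response. It is an illustration only: it scores splits by the reduction in a likelihood-ratio chi-square (G²), which is a swapped-in, generic criterion and not necessarily the calculation JMP performs; the function names `g2` and `best_split` are hypothetical, and categorical predictors are not handled.

```python
import math
from collections import Counter


def g2(labels):
    """Likelihood-ratio chi-square of one node: 2 * sum(n_k * ln(n / n_k)).

    It is zero for a pure node and grows as the classes become more mixed.
    """
    n = len(labels)
    return 2.0 * sum(c * math.log(n / c) for c in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, G2 reduction) for the best rule `value < threshold`.

    Every midpoint between consecutive distinct predictor values is tried.
    Returns (None, 0.0) if no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    parent = g2(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # tied values cannot be separated by a threshold
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        gain = parent - g2(left) - g2(right)
        if gain > best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2, gain)
    return best


# Made-up throughfall values and leaching classes, for illustration only.
threshold, gain = best_split(
    [2.0, 5.0, 7.0, 9.0, 12.0, 20.0],
    ["low", "low", "low", "high", "high", "high"],
)
```

Growing a tree amounts to repeating this search over all predictors at each node and recursing into the two resulting groups, which is why the same predictor can appear more than once.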
The classification and regression tree analysis was performed using JMP
5.1 (SAS Institute). The criterion used for selecting the splits on the nodes
was set to ‘Max Split Statistic’. This split selection method examines all possible splits for each predictor variable at each node. Missing values were assigned to ‘Closest’ and the minimum split size for nodes was set to three.
With no independent test sample, a k-fold cross validation procedure was
used. This procedure randomly partitions the data set into k equal sized groups.
Each group is then sequentially used as a test set for the model derived from
the combined set of remaining groups. This ensures that roughly unbiased estimates for predictions are obtained. In this application five was selected for
the k-fold cross validation, a value commonly used for this type of validation
(Breiman et al., 1984; Witten and Frank, 2005).
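The partitioning step of k-fold cross validation can be sketched as follows. This is a generic illustration in Python, not the JMP procedure used in the study; the function name `k_fold_indices` and the fixed random seed are assumptions made so that the example is reproducible.

```python
import random


def k_fold_indices(n_sites, k=5, seed=0):
    """Yield (train, test) index lists for k random, roughly equal folds.

    Each site appears in exactly one test set; the model for a fold is
    built from the sites in the remaining k - 1 folds.
    """
    indices = list(range(n_sites))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# With 215 sites and k = 5, every test fold holds 43 sites.
splits = list(k_fold_indices(215, k=5))
```

A tree would then be fitted on each `train` list and scored on the matching `test` list, and the five scores combined.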
Model goodness-of-fit was assessed using the G² statistic. The G² statistic
is a likelihood-ratio chi-square, analogous to a sum of squares for continuous
data. The significance of each additional split in the tree was assessed using the
Akaike Information Criterion (AIC) (Akaike, 1974). Statistical significance
was assessed at p ≤ 0.05. Splitting was stopped immediately prior to the first
split that would have resulted in a leaf node with an AIC probability p > 0.05.
Fig. 2. Spatial distribution of forest sites used in the analysis and their observed DIN leaching category (<3, 3–15 or >15 kg N ha⁻¹ y⁻¹).
3. Results
The G² and AIC statistics obtained during model building
are shown in Fig. 3. The first split that would have resulted
in a leaf node with an AIC test statistic probability > 0.05
was obtained with a model with 15 terminal leaf nodes. This
split was removed and the final model comprised 14 terminal
leaf nodes.
Overall, the model classified 94.1% of sites into the correct
leaching category. 98.2% of the low-leaching sites were correctly classified. The remaining 1.8% were predicted to have
medium levels of leaching. 87.8% of the medium-leaching
sites were correctly classified, 4.4% were classified as low and
7.8% as high. 87.5% of the high-leaching sites were correctly
classified. The remaining 12.5% were classified as medium.
All cases of misclassification placed the sites into the next
category, i.e. no high-leaching sites were classified as low or
low-leaching sites as high.
Table 1
Predictors used in model building
Code | Description | Units | Range
Alt | Altitude above sea level | m | 4–2976
MAT | Mean annual temperature | °C | 1.9–14
MAP | Mean annual precipitation | mm | 395–3032
Dist Coast | Distance to nearest coast | km | 1–770
Bedrock | Bedrock type | Igneous; sedimentary; metamorphic |
Tree Type | Tree type | Coniferous; deciduous; nitrogen fixer; shrub |
Soil Type | Soil type (no calcareous soils in database) | Alisol; anthrosol; arenosol; cambisol; gley; gleysol; histosol; luvisol; other non-calcareous; peat; podzol; umbrisol |
C:N Org | Organic layer C:N ratio | kg kg⁻¹ | 10.2–50.9
pH B | pH of the B horizon | None | 3.10–6.98
Oa %C | % organic carbon in Oa horizon | % | 10.7–50.2
Oa %N | % total N in Oa horizon | % | 0.09–2.35
Soil %C | % organic carbon in mineral horizon | % | 0.45–14.3
Soil %N | % total N in mineral horizon | % | 0.02–2.21
NH₄⁺-N BP | Ammonium as N in bulk precipitation | kg ha⁻¹ year⁻¹ | 0.18–15.1
NO₃⁻-N BP | Nitrate as N in bulk precipitation | kg ha⁻¹ year⁻¹ | 0.35–12.0
NH₄⁺-N TF | Ammonium as N in throughfall | kg ha⁻¹ year⁻¹ | 0.14–51.5
NO₃⁻-N TF | Nitrate as N in throughfall | kg ha⁻¹ year⁻¹ | 0.14–19.8
Cum N-in | Cumulative historical N deposition (1880–2000) | kg ha⁻¹ | 7656–251594
SO₄²⁻-S TF | Sulphate as S in throughfall | kg ha⁻¹ year⁻¹ | 1.76–118
Acid TF | Acid throughfall (NO₃⁻-N + NH₄⁺-N + SO₄²⁻-S) | kg ha⁻¹ year⁻¹ | 2.04–160
Runoff/Seep | Runoff or seepage flux | mm year⁻¹ | 38–2642
Fig. 3. Model G² statistic and probability that variation explained by an additional predictor is more than would be expected due to chance alone, plotted against the number of terminal leaf nodes. Terminal
leaf nodes 1–14 correspond to A–N respectively in Fig. 4 and Table 2.
Fig. 4 shows the classification and regression tree criteria
used in predicting DIN leaching. Each of the 215 sites was assigned to one of 14 leaf nodes (‘A’ to ‘N’) which characterised
broad-scale controls on DIN leaching. A dichotomous key
showing decision criteria for each terminal leaf node is presented in Table 3.
The first split within the database that partitioned the tree
into two main branches occurs at the level of NO₃⁻-N in
throughfall. No sites receiving low current NO₃⁻-N in throughfall
(<7.7 kg N ha⁻¹ year⁻¹) fall in the high DIN leaching category, regardless of the history of N deposition, climate, or any
other relevant environmental or site characteristic (leaf nodes
‘A’ through ‘F’). In contrast, almost no sites receiving high
current NO₃⁻-N in throughfall (≥7.7 kg N ha⁻¹ year⁻¹) fall in
the low DIN leaching category (leaf nodes ‘I’ to ‘N’).
Within the low NO₃⁻-N-in branch, sites that have received
low historical N deposition (<9.9 × 10⁴ kg N ha⁻¹ from
1880 to 2000) and are situated on soils that are either highly organic (peats, histosols, umbrisols), deep and well-weathered
(podzols, cambisols) or developed on fluvio-glacial sands (arenosols) all fall in the low DIN leaching category (leaf node ‘B’).
Forests receiving low current and historical N deposition but
underlain by other soils show on average higher DIN leaching,
with some leaching at intermediate levels (leaf node ‘A’).
Forests with low current NO₃⁻-N deposition but high historical N deposition may still show low DIN leaching if they have
low water fluxes in runoff or seepage (<~200 mm year⁻¹;
leaf node ‘C’). With higher water fluxes, forests must have
an organic carbon-rich Oa horizon (>~35%) to fall in the
low DIN leaching category (leaf nodes ‘E’ and ‘F’, with ‘F’
being at the upper range of the low NO₃⁻-N deposition sites
and showing some intermediate leaching).
On the other main branch of the tree, at high NO₃⁻-N-in
(≥7.7 kg N ha⁻¹ year⁻¹), the next split is at the level of potential
acid deposition in throughfall (kg (NO₃⁻-N-in) + (NH₄⁺-N-in) + (SO₄²⁻-S-in)).
Almost all sites that receive high current NO₃⁻-N
in throughfall, as well as high current levels of acid deposition
(>~80 kg ha⁻¹ year⁻¹), leach in the high DIN category (leaf
node ‘N’; 90% high and 10% medium in Table 2). However, if acid deposition is lower, most sites
leach at the intermediate level. The only exceptions to this
are some sites with a low organic horizon C:N (<24) (leaf
node ‘J’) and all sites located on podzols or luvisols (well-weathered, acid soils) that receive high amounts of precipitation (MAP ca. 1040–1280 mm) (leaf node ‘M’); these forests
fall in the high DIN leaching category. Interestingly, once precipitation exceeds 1280 mm, forests leach less N (node ‘L’).
Four of these sites are remote high altitude sites and three
are located on moorland spruce plantations.
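Table 2
Classification summary and leaf formula (predictor descriptions, units and numerical ranges are shown in Table 1; "-" marks a predictor not used for that leaf node)
Leaf node | Low (%) | Medium (%) | High (%) | No. of sites | NO₃⁻-N TF | Cum N-in | Soil type | Runoff/Seep | Oa %C | Acid TF | MAP | C:N Org | NH₄⁺-N TF
A | 50 | 50 | 0 | 4 | <7.7 | <99102 | Other non-calcareous | - | - | - | - | - | -
B | 100 | 0 | 0 | 80 | <7.7 | <99102 | Arenosol, cambisol, histosol, peat, podzol, umbrisol | - | - | - | - | - | -
C | 100 | 0 | 0 | 20 | <7.7 | ≥99102 | - | <199 | - | - | - | - | -
D | 0 | 100 | 0 | 14 | <7.7 | ≥99102 | - | >199 | <35.6 | - | - | - | -
E | 100 | 0 | 0 | 16 | <7.57 | ≥99102 | - | >199 | ≥35.6 | - | - | - | -
F | 33 | 67 | 0 | 3 | 7.57–7.7 | ≥99102 | - | >199 | ≥35.6 | - | - | - | -
G | 33 | 67 | 0 | 3 | ≥7.7 | - | Podzol | - | - | <77.7 | <1040 | ≥24 | <22.2
H | 100 | 0 | 0 | 9 | ≥7.7 | - | Arenosol, cambisol, other non-calcareous | - | - | <77.7 | <1040 | ≥24 | <22.2
I | 0 | 100 | 0 | 3 | ≥7.7 | - | - | - | - | <77.7 | <1040 | ≥24 | ≥22.2
J | 0 | 71 | 29 | 7 | ≥7.7 | - | - | - | - | <77.7 | <1040 | <24 | -
K | 0 | 100 | 0 | 16 | ≥7.7 | - | Cambisol, gleysol, other non-calcareous | - | - | <77.7 | ≥1040 | - | -
L | 0 | 100 | 0 | 6 | ≥7.7 | - | Podzol, luvisol | - | - | <77.7 | ≥1281 | - | -
M | 0 | 0 | 100 | 5 | ≥7.7 | - | Podzol, luvisol | - | - | <77.7 | 1040–1281 | - | -
N | 0 | 10 | 90 | 29 | ≥7.7 | - | - | - | - | ≥77.7 | - | - | -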
Fig. 4. Classification and regression tree showing the decision criteria for predicting DIN leaching from European forest soils. Identifiers ‘A’ to ‘N’ for the terminal
leaf nodes are shown, together with DIN leaching category (<3, 3–15 or >15 kg N ha⁻¹ y⁻¹).
The only forests that receive high NO₃⁻-N in throughfall
(≥7.7 kg N ha⁻¹ year⁻¹) but leach in the low DIN leaching category are those that receive <1040 mm precipitation year⁻¹,
have a high organic horizon C:N ratio (≥24), and receive
<22 kg NH₄⁺-N ha⁻¹ year⁻¹ in throughfall. Of these, forests
on deeper soils with a higher buffering capacity (arenosols,
cambisols) leach the lowest levels of N, with all in the low
DIN leaching category (node ‘H’). Forests developed on podzols leach on average more N (node ‘G’).
Fig. 5a shows low DIN leaching sites in leaf nodes ‘B’ and
‘C’. The sites in leaf node ‘B’ are mostly in southern Scandinavia. They are currently receiving relatively low levels of
NO₃⁻-N in throughfall and have historically low levels of N deposition. The sites in leaf node ‘C’ are mostly found in a band
running through the mountainous region of Germany and into
the Czech Republic. They have received high levels of historical N deposition but are now receiving low levels of NO₃⁻-N
in throughfall. High-leaching sites from leaf node ‘N’ are distributed across central Europe (Fig. 5b). These have relatively
low NO₃⁻-N in throughfall but high levels of acid deposition
(enhanced NH₄⁺-N and/or SO₄²⁻-S). Note that the partitioning
tree was able to successfully distinguish these high-leaching
forests in southern Scandinavia from the majority of low-leaching sites, by assigning them to node ‘N’ rather than ‘A’.
Table 4 shows the relative contributions of predictors to the
final classification. All predictors in Table 4 make statistically
significant contributions (p ≤ 0.05) to explaining the observed
pattern in DIN leaching across European forests. Of the suite
of variables used in model building, NO₃⁻-N in throughfall was
the most important in predicting the observed pattern of DIN
leaching. Acid throughfall was the next most important variable. The mean annual precipitation (MAP), percentage of organic carbon in the Oa horizon (Oa %C), and soil type were all
approximately equivalent in their contribution to explaining
DIN leaching. The remainder of the predictors (cumulative
historical N deposition, runoff/seepage water flux, C:N ratio
of the organic horizon, and NH₄⁺-N in throughfall) explained
a minor, but statistically significant, part of DIN leaching
(Table 4).
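One way to read the contributions in Table 4 is as shares of the summed G². The Python sketch below uses the G² values reported in Table 4; expressing them as percentages of their total is an illustrative normalisation introduced here, not a calculation reported in the paper, and the names `G2_BY_PREDICTOR` and `g2_shares` are hypothetical.

```python
# G2 contribution of each predictor, as listed in Table 4.
G2_BY_PREDICTOR = {
    "NO3-N TF": 152.0,
    "Acid TF": 48.4,
    "MAP": 34.5,
    "Oa %C": 32.9,
    "Soil Type": 31.1,
    "Cum N-in": 22.8,
    "Runoff/Seep": 19.2,
    "C:N Org": 13.7,
    "NH4-N TF": 8.28,
}


def g2_shares(g2_by_predictor):
    """Express each predictor's G2 as a percentage of the summed G2."""
    total = sum(g2_by_predictor.values())
    return {name: 100.0 * g2 / total for name, g2 in g2_by_predictor.items()}


shares = g2_shares(G2_BY_PREDICTOR)
ranked = sorted(shares, key=shares.get, reverse=True)
```

On this reading, NO₃⁻-N in throughfall accounts for roughly 42% of the summed G², more than three times the share of acid throughfall, while MAP, Oa %C and soil type each contribute a similar, smaller share.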
Eight of the 215 sites were misclassified by the partitioning
tree. Two sites with low DIN leaching were classified as medium, four sites with medium leaching were categorised as
high and two high-leaching sites were classified as medium.
The two low-leaching sites classified as medium were located
in Sweden. These forests were near each other and located
close to the sea. One was quite wet (~900 mm precipitation
year⁻¹) and the other had a very high C:N ratio (38.9). All
four of the medium-leaching sites that were classified as
high were on cambisols. One high-leaching site classified as
medium had a borderline DIN export (16.8 kg ha⁻¹ year⁻¹;
15 is the cutoff) and the other had high historic N deposition,
relatively low levels of acid throughfall, a low C:N ratio in the
organic layer and a high percent organic carbon in the Oa
horizon.
Table 3
Dichotomous key for leaching status (leaf nodes are displayed in Fig. 4)
Code | Decision
1 | Is the NO₃⁻-N in throughfall less than 7.7 kg ha⁻¹ year⁻¹? Yes: go to 2. No: go to 7.
2 | Is the cumulative historical N deposition less than 99102 kg ha⁻¹? Yes: go to 3. No: go to 4.
3 | If the site is on one of the following soils: arenosol; cambisol; histosol; peat; podzol or umbrisol: Low, 100% (leaf node B). For sites where the soil type is other non-calcareous: Low, 50%; medium, 50% (leaf node A).
4 | Is the runoff/seepage water flux less than 199 mm year⁻¹? Yes: Low, 100% (leaf node C). No: go to 5.
5 | Is the percent organic carbon in the Oa horizon less than or equal to 35.6%? Yes: Medium, 100% (leaf node D). No: go to 6.
6 | Is the NO₃⁻-N in throughfall less than 7.57 kg ha⁻¹ year⁻¹? Yes: Low, 100% (leaf node E). No: Low, 33%; medium, 67% (leaf node F).
7 | Is acid throughfall less than 77.7 kg ha⁻¹ year⁻¹? Yes: go to 8. No: Medium, 10%; high, 90% (leaf node N).
8 | Is the mean annual precipitation less than 1040 mm year⁻¹? Yes: go to 9. No: go to 12.
9 | Is the organic layer C:N ratio less than 24? Yes: Medium, 71%; high, 29% (leaf node J). No: go to 10.
10 | Is NH₄⁺-N in throughfall greater than 22.2 kg ha⁻¹ year⁻¹? Yes: Medium, 100% (leaf node I). No: go to 11.
11 | For sites on podzols: Low, 33%; medium, 67% (leaf node G). For sites on arenosols, cambisols or other non-calcareous soils: Low, 100% (leaf node H).
12 | Is the site on a cambisol, gleysol or other non-calcareous soil? Yes: Medium, 100% (leaf node K). No: if the site is on a podzol or luvisol, go to 13.
13 | Is the mean annual precipitation less than 1281 mm year⁻¹? Yes: High, 100% (leaf node M). No: Medium, 100% (leaf node L).
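The dichotomous key in Table 3 translates directly into nested conditions. The following Python sketch is one possible transcription, offered as an illustration and not as the authors' implementation: the function name `leaf_node` and the dictionary keys (`no3_tf`, `cum_n_in`, `soil_type`, `runoff`, `oa_pct_c`, `acid_tf`, `map_mm`, `cn_org`, `nh4_tf`) are hypothetical, and soil types that Table 3 does not list at a given step are assumed to follow the "other non-calcareous" branch.

```python
def leaf_node(site):
    """Return the Table 3 leaf node ('A' to 'N') for a dict of predictors.

    Thresholds are copied from Table 3; units follow Table 1.
    """
    if site["no3_tf"] < 7.7:                               # step 1
        if site["cum_n_in"] < 99102:                       # step 2
            listed = {"arenosol", "cambisol", "histosol",
                      "peat", "podzol", "umbrisol"}
            # Step 3; unlisted soils are assumed to behave as "other".
            return "B" if site["soil_type"] in listed else "A"
        if site["runoff"] < 199:                           # step 4
            return "C"
        if site["oa_pct_c"] <= 35.6:                       # step 5
            return "D"
        return "E" if site["no3_tf"] < 7.57 else "F"       # step 6
    if site["acid_tf"] >= 77.7:                            # step 7
        return "N"
    if site["map_mm"] < 1040:                              # step 8
        if site["cn_org"] < 24:                            # step 9
            return "J"
        if site["nh4_tf"] > 22.2:                          # step 10
            return "I"
        # Step 11; non-podzol soils are assumed to map to node H.
        return "G" if site["soil_type"] == "podzol" else "H"
    if site["soil_type"] in {"podzol", "luvisol"}:         # step 12
        return "M" if site["map_mm"] < 1281 else "L"       # step 13
    return "K"  # cambisol, gleysol or other non-calcareous soil


# Leaching categories observed in each leaf node (%), from Table 3.
LEAF_OUTCOME = {
    "A": "low 50, medium 50", "B": "low 100", "C": "low 100",
    "D": "medium 100", "E": "low 100", "F": "low 33, medium 67",
    "G": "low 33, medium 67", "H": "low 100", "I": "medium 100",
    "J": "medium 71, high 29", "K": "medium 100", "L": "medium 100",
    "M": "high 100", "N": "medium 10, high 90",
}
```

A hypothetical podzol site with 10 kg ha⁻¹ year⁻¹ of NO₃⁻-N in throughfall, acid throughfall of 50 kg ha⁻¹ year⁻¹ and 1100 mm of precipitation would be routed through steps 1, 7, 8, 12 and 13 to leaf node M.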
There are a number of generalisations that can be drawn
from these results. Low levels of NO₃⁻-N in throughfall combined with low cumulative historical N deposition almost always result in low levels of DIN leaching (leaf nodes ‘A’
and ‘B’). Sites with high cumulative historical N deposition
can still show low levels of DIN leaching if current atmospheric NO₃⁻-N deposition is low and the forest has at least
one major characteristic predisposing it for N retention, such
as low runoff water flux (leaf node ‘C’) or high % soil organic
carbon (leaf nodes ‘E’ and ‘F’). Organic layer C:N ratios are
most useful for distinguishing between low- and medium-leaching sites (leaf nodes ‘H’ vs. ‘J’). After accounting for
the primary drivers, sites with a C:N ratio ≥24 (leaf node
‘H’) leach low levels of DIN, whereas sites with a C:N ratio
<24 (leaf node ‘J’) leach medium, and occasionally high,
levels of DIN. High levels of acid throughfall combined
with high levels of NO₃⁻-N in throughfall will almost certainly
result in high levels of DIN leaching (leaf node ‘N’).
4. Discussion
Predicting DIN leaching from forest soils using linear statistical techniques has been limited by the non-linear
relationships between DIN leaching and predictor variables
(such as N deposition or the C:N ratio of the soil), the dependence of DIN leaching on some aggregate site characteristics
that are best described categorically (e.g. soil type) and complex interactions among predictors. The partitioning tree approach is a statistical tool ideally suited for interrogating
complex non-linear environmental data sets. Using this tree-based method for evaluating the controls on DIN leaching in
European forest ecosystems, NO₃⁻-N in throughfall deposition
can be hypothesised as the primary driver of DIN leaching.
The analysis revealed that even with high historical N deposition, sites with low levels of contemporary NO₃⁻-N deposition
can leach low levels of DIN. The second most important predictor of DIN leaching was anthropogenic acid deposition.
This analysis suggests that higher levels of acid throughfall,
if that acid throughfall has high NO₃⁻-N, will always result
in high DIN leaching.
Numerous studies have demonstrated that DIN leaching in
forests is negatively correlated with the forest floor C:N ratio
and have reported a threshold of between 23 and 25, below
which significant DIN leaching may occur (e.g. Dise et al.,
1998a; Gundersen et al., 1998; Aber et al., 2003; van der
Salm et al., 2007). The partitioning tree analysis reveals that
the soil organic layer C:N ratio is important under some circumstances, but is not a ubiquitous predictor of DIN leaching.
This supports previous arguments that a low C:N ratio of the
organic horizon is only one of several factors (including sustained, elevated N deposition, low to intermediate net primary
productivity and low soil pH) that are necessary to initiate DIN
leaching from most temperate forests (Dise et al., 1998a; MacDonald et al., 2002). DIN leaching from sites with high NO₃⁻-N
in throughfall, moderate levels of acid throughfall and lower
precipitation is sensitive to organic soil C:N ratio. At these
sites, soil C:N ratios less than 24 are associated with higher
DIN leaching. It is worth noting that the threshold of 24 was
determined independently by the partitioning tree technique.
The results of this study revealed that hydrology (mean annual precipitation and runoff/seepage) is an important control
on DIN leaching. High levels of runoff and precipitation are
generally correlated with higher levels of DIN leaching. However, for those sites receiving very high levels of precipitation
(~1280 mm) DIN leaching is moderate, even when NO₃⁻-N
in throughfall is high (leaf node ‘L’). A possible explanation
for this occurrence may be a hydrological effect. When soil
horizons are saturated with water, high precipitation leads to
an increase in the proportion of overland flow in the total runoff and a reduction in the contribution from seepage water
(rich in DIN). This could lead to a partial bypassing of DIN
stores in the upper soil horizons (Creed and Band, 1998).
Fig. 5. Sites with low (left panel) and high (right panel) DIN leaching in Central Europe. (A) Leaf nodes B and C. (B) Leaf node N. Leaf node descriptions are
given in Table 2.
The partitioning tree model revealed that soil type is important to DIN leaching. Highly organic, gleyed, and well-buffered soils are all associated with low DIN leaching. In
particular, cambisols, or brown earths, are less likely to leach
high levels of DIN at high levels of N deposition. These soils
tend to be less acid, less strongly weathered and with a higher
buffering capacity than podzols (Bridges, 1997). However, the
level of N deposition is still the driving force: Scandinavian
podzols, which have always received relatively low levels of
N deposition, only leach low levels of DIN. Podzols in Germany and the Netherlands receiving high levels of N deposition leach high levels of DIN (Fig. 5b).
The partitioning tree analysis also provides insights into the
relative roles of oxidised and reduced N deposition in DIN
leaching in European forests, identifying NO₃⁻-N as the dominant
driver, and NH₄⁺-N on its own of importance only after
several other factors have been accounted for (Table 4). The
role of NH₄⁺-N is often confounded by the fact that, for nearly
all European forests receiving high DIN, throughfall is dominated by NH₄⁺-N. Therefore, lowering the input of reduced N
is essential for recovery of these ecosystems. However, regional analyses, including this one, consistently show that oxidised N is a significantly stronger driver of DIN leaching than
reduced N. Previous work using earlier versions of the IFEF
database and simple regressions showed that DIN leaching
was three times higher for a given input of NO₃⁻-N than for
the same input of NH₄⁺-N (Dise et al., 1998b). Ecosystem reasons
for the enhanced leaching of NO₃⁻-N include preferential
vegetation uptake of NH₄⁺-N, nitrification of NH₄⁺-N, and enhanced
soil retention of NH₄⁺-N on cation exchange sites (references in Dise et al., 1998b).
Some of the forest sites were misclassified by the partitioning tree model. The two low-leaching sites that were misclassified as medium-leaching were located very close to the sea.
At these coastal locations it may be the case that less reactive
N and SO₄²⁻-S occurs as acid deposition and more is accompanied by marine-derived basic cations. One of these coastal
sites was characterised by quite high rainfall, and as such it
is possible that this site may have significant levels of denitrification, which is unaccounted for in this analysis. The other
coastal site had a very high C:N ratio in the organic horizon.
C-rich forest sites can soak up more N than expected. It is
also worth noting that one of the misclassified sites was placed
in the incorrect category by <1 kg N ha⁻¹ year⁻¹, an amount
easily within year-to-year variation in DIN export. This points
to the need in future analyses for determining error ranges
between categories.
Table 4
Predictor variable contributions to explaining DIN leaching
Predictor | G²
NO₃⁻-N TF | 152
Acid TF | 48.4
MAP | 34.5
Oa %C | 32.9
Soil Type | 31.1
Cum N-in | 22.8
Runoff/Seep | 19.2
C:N Org | 13.7
NH₄⁺-N TF | 8.28
Lovett et al. (2004) demonstrate that tree species can exert
a control on N cycling within a particular region. Interrogation
of the IFEF data set using the partitioning tree approach provided no evidence that tree type (conifer vs. deciduous) or tree
species (results not shown) controls DIN leaching when other
drivers have been accounted for. One reason for the difference
in findings may be the scale: the Lovett et al. (2004) study was
over a relatively narrow geographic range (Catskill Mountains) whereas the IFEF data cover much of Europe. It is clear,
for example, that one tree genus can cover a very wide range
in N deposition (e.g. Picea) or be limited in the database to
particular ranges of N deposition (e.g. Pseudotsuga, Fagus)
which is likely confounded with climate (Fig. 6a). However,
similar to Lovett et al. (2004), we did find that most Quercus-dominated sites receiving low to moderate N deposition showed a low C:N ratio with low DIN leaching (Fig. 6b).

[Fig. 6 panels: (a) DIN leaching (kg ha⁻¹ yr⁻¹) against DIN TF (kg ha⁻¹ yr⁻¹); (b) DIN leaching (kg ha⁻¹ yr⁻¹) against C:N Org. Symbols: Betula, Fagus, Picea, Pinus, Pseudotsuga, Quercus.]
This analysis highlights the major controls on DIN leaching
in European forests and may therefore be useful for formulating appropriate management strategies for controlling surface
water acidification and coastal eutrophication caused by excess leaching of DIN from forest soils. Such a strategy could
take the form of a dichotomous key (an example, based on this
analysis, is shown in Table 3, although we would expect a key
in practical use to be much simpler). The analysis suggests that reducing NO₃⁻-N deposition, even at those sites with a long legacy of historical N deposition, is the most effective strategy for lowering DIN leaching. Similarly, reducing atmospheric NH₄⁺-N may help reduce DIN leaching in some circumstances, particularly where NH₄⁺-N is a major component of atmospheric deposition. Further reductions in atmospheric emissions of acidifying pollutants are also likely to reduce DIN leaching from European forests. Results from large-scale N deposition manipulation experiments, such as NITREX (Wright and van Breemen, 1995), also revealed that, at sites displaying symptoms of N saturation, DIN leaching fell significantly after NO₃⁻-N deposition was reduced (Bredemeier et al., 1998). Long-term environmental monitoring in central Europe has also shown that reductions in NO₃⁻-N deposition have resulted in a decrease in DIN leaching in N-saturated catchments, even with a legacy of high historical N deposition (Kopacek et al., 1998).
The advantage of classification and regression trees is that
splits are performed locally within a tree branch as opposed to
globally (across the entire data set) as is the case with traditional statistical modelling approaches such as stepwise multiple regression and ordination. For example, there is
a successful distinction between sites that have had very different N deposition histories, but have similar contemporary
N input (Fig. 5a). The partitioning tree approach allows subtleties in, and interconnections between, forcing variables that may not previously have been apparent to be identified and modelled. This is especially appropriate for data sets where there are threshold responses.
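The local nature of the splitting can be illustrated with a minimal sketch. It assumes a likelihood-ratio statistic (G²) as the split criterion, which is an assumption made here based on the G² column of Table 4 and not a documented detail of the software used. The feature names (`no3_tf`, `map`) and all values are invented for illustration only.

```python
import math
from collections import Counter


def g2(left, right):
    """Likelihood-ratio statistic G^2 for a two-way split of class labels."""
    n = len(left) + len(right)
    totals = Counter(left) + Counter(right)
    stat = 0.0
    for child in (left, right):
        for label, observed in Counter(child).items():
            expected = len(child) * totals[label] / n
            stat += 2.0 * observed * math.log(observed / expected)
    return stat


def best_split(rows, features):
    """Return (G^2, feature, threshold) for the best split of `rows` only."""
    best = None
    for f in features:
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between neighbours
            left = [y for x, y in rows if x[f] < t]
            right = [y for x, y in rows if x[f] >= t]
            cand = (g2(left, right), f, t)
            if best is None or cand[0] > best[0]:
                best = cand
    return best


def grow(rows, features, min_g2=1.0):
    """Recursively partition; each branch re-searches all features locally."""
    labels = [y for _, y in rows]
    split = best_split(rows, features)
    if len(set(labels)) == 1 or split is None or split[0] < min_g2:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    _, f, t = split
    return {
        "feature": f,
        "threshold": t,
        "below": grow([r for r in rows if r[0][f] < t], features, min_g2),
        "above": grow([r for r in rows if r[0][f] >= t], features, min_g2),
    }


# Hypothetical sites: precipitation matters only where NO3 input is high.
rows = [
    ({"no3_tf": 2, "map": 600}, "low"),
    ({"no3_tf": 3, "map": 1500}, "low"),
    ({"no3_tf": 4, "map": 700}, "low"),
    ({"no3_tf": 5, "map": 1400}, "low"),
    ({"no3_tf": 12, "map": 1400}, "high"),
    ({"no3_tf": 13, "map": 600}, "medium"),
    ({"no3_tf": 14, "map": 1500}, "high"),
    ({"no3_tf": 15, "map": 700}, "medium"),
]
tree = grow(rows, ["no3_tf", "map"])
```

In this toy data set the root split falls on `no3_tf`, and `map` is selected only inside the high-input branch. A single global model fitted across all sites would have to represent that dependence as an explicit interaction term.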
Although this has primarily been an exploratory study, we
have shown that partitioning tree analyses have excellent potential for identifying the controls on DIN leaching from forest
soils, and for use as a predictive tool. Fully exploring the potential of this technique would require a relatively large, complete data set with few or no missing values, as well as
a separate database of validation sites. This study has identified the most important potential drivers that should be assembled into such databases. Further work could also evaluate the
applicability of partitioning tree analyses to other forested and
non-forested ecosystems.
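Testing against a separate database of validation sites could be sketched as follows. The tree is represented as a nested dictionary, and both the thresholds and the validation sites are hypothetical placeholders; none of the values come from the IFEF data set.

```python
from collections import Counter


def predict(tree, site):
    """Walk a nested-dict tree until a leaf (a class label string) is reached."""
    while isinstance(tree, dict):
        branch = "below" if site[tree["feature"]] < tree["threshold"] else "above"
        tree = tree[branch]
    return tree


def validate(tree, sites):
    """Return (accuracy, confusion) for labelled sites not used in fitting."""
    confusion = Counter((observed, predict(tree, x)) for x, observed in sites)
    correct = sum(n for (obs, pred), n in confusion.items() if obs == pred)
    return correct / len(sites), confusion


# Hypothetical one-split tree and hypothetical validation sites.
tree = {"feature": "no3_tf", "threshold": 8.5, "below": "low", "above": "high"}
validation_sites = [
    ({"no3_tf": 2.0}, "low"),
    ({"no3_tf": 10.0}, "high"),
    ({"no3_tf": 9.0}, "medium"),
    ({"no3_tf": 4.0}, "low"),
]
accuracy, confusion = validate(tree, validation_sites)
```

The confusion counts show which categories are mistaken for one another, which a single accuracy figure does not reveal.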
Fig. 6. Relationship between DIN leaching and throughfall deposition of DIN for IFEF forest sites displayed according to tree species (a); relationship between DIN leaching and organic layer C:N ratio for IFEF forest sites displayed according to tree species (b).
5. Conclusions
The partitioning tree approach employed in this analysis
successfully classified European forest sites into three categories based on DIN leaching. The deposition of NO₃⁻-N in throughfall is the primary determinant of DIN leaching. In this analysis, it is more important than NH₄⁺-N deposition or cumulative historical N deposition, which suggests that the most effective strategy for reducing DIN leaching is reducing atmospheric input of NO₃⁻-N. Hydrology, soil type and organic carbon content of the soil are the most important ecosystem characteristics that modify the response of a forest to high NO₃⁻-N deposition and high acid deposition. Classification and regression trees are able to provide insights into the biogeochemistry of European forests that cannot be obtained using traditional empirical modelling approaches.
Acknowledgements
Thanks go to the contributors to the IFEF data set. Funding
for this project was provided by the European Union (5th
Framework Programme) as part of the projects C-NTER (contract no. QLK5-2001-00596) and DYNAMIC (contract no.
2000.60.NL,3B). We would also like to thank two anonymous
reviewers for helpful comments on an earlier version of the
manuscript.
References
Aber, J.D., Goodale, C.L., Ollinger, S.V., Smith, M.L., Magill, A.H., Martin, M.E., Hallett, R.A., Stoddard, J.L., 2003. Is nitrogen deposition altering the nitrogen status of northeastern forests? BioScience 53 (4), 375–389.
Akaike, H., 1974. A new look at statistical model identification. IEEE Transactions on Automatic Control 19 (6), 716–723.
Armbruster, M., MacDonald, J., Dise, N.B., Matzner, E., 2002. Throughfall and output fluxes of Mg in European forest ecosystems: a regional assessment. Forest Ecology and Management 164, 137–147.
Dobbertin, M., Biging, G.S., 1998. Using the non-parametric classifier CART to model forest tree mortality. Forest Science 44 (4), 507–516.
Bennett, J.P., Jepsen, E.A., Roth, J.A., 2006. Field responses of Prunus serotina and Asclepias syriaca to ozone around southern Lake Michigan. Environmental Pollution 142 (2), 354–366.
Bredemeier, M., Blanck, K., Tietema, A., Boxman, A., Emmett, B.A., Kjønaas, O.J., Moldan, F., Gundersen, P., Schleppi, P., Wright, R.F., 1998. Input–output budgets at the NITREX sites. Forest Ecology and Management 101 (1–3), 37–56.
Breiman, L., Friedman, J., Olshen, R., Stone, C., 1984. Classification and Regression Trees. Wadsworth, Belmont, CA.
Bridges, E.M., 1997. World Soils. Cambridge University Press, Cambridge.
Creed, I.F., Band, L.E., 1998. Export of nitrogen from catchments within a temperate forest: the need for an understanding of topographic regulation of variable source area dynamics. Water Resources Research 34, 3105–3120.
De’ath, G., Fabricius, K.E., 2000. Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81 (11), 3178–3192.
Dise, N.B., Wright, R.F., 1995. Nitrogen leaching from European forests in relation to nitrogen deposition. Forest Ecology and Management 71 (1–2), 153–161.
Dise, N.B., Matzner, E., Forsius, M., 1998a. Evaluation of organic horizon C:N ratio as an indicator of nitrate leaching in conifer forests across Europe. Environmental Pollution 102 (S1), 453–456.
Dise, N.B., Matzner, E., Gundersen, P., 1998b. Synthesis of nitrogen pools and fluxes from European forest ecosystems. Water, Air and Soil Pollution 105, 143–154.
Dise, N.B., Matzner, E., Armbruster, M., MacDonald, J., 2001. Aluminium output fluxes from forest ecosystems in Europe: a regional assessment. Journal of Environmental Quality 30, 1747–1756.
Fenn, M.E., Poth, M.A., Aber, J.D., Baron, J.S., Bormann, B.T., Johnson, D.W., Lemly, A.D., McNulty, S.G., Ryan, D.E., Stottlemyer, R., 1998. Nitrogen excess in North American ecosystems: predisposing factors, ecosystem responses, and management strategies. Ecological Applications 8 (3), 706–733.
Gundersen, P., Callesen, I., de Vries, W., 1998. Nitrate leaching in forest ecosystems is related to forest floor C/N ratios. Environmental Pollution 102 (S1), 403–407.
Gundersen, P., Schmidt, I.K., Raulund-Rasmussen, K., 2006. Leaching of nitrate from temperate forests – effects of air pollution and forest management. Environmental Reviews 14, 1–57.
Kopacek, J., Hejzlar, J., Stuchlik, E., Fott, J., Vesely, J., 1998. Reversibility of acidification of mountain lakes after reduction in nitrogen and sulphur emissions in central Europe. Limnology and Oceanography 43 (2), 357–361.
Kristensen, H.L., Gundersen, P., Callesen, I., Reinds, G.J., 2004. Throughfall nitrogen deposition has different impacts on soil solution nitrate concentration in European coniferous and deciduous forests. Ecosystems 7 (2), 180–192.
Lamsal, S., Grunwald, S., Bruland, G.L., Bliss, C.M., Comerford, N.B., 2006. Regional hybrid geospatial modeling of soil nitrate-nitrogen in the Santa Fe River Watershed. Geoderma 135, 233–247.
Lawler, J.J., White, D., Neilson, R.P., Blaustein, A.R., 2006. Predicting climate-induced range shifts: model differences and model reliability. Global Change Biology 12, 1568–1584.
Lovett, G.M., Weathers, K.C., Arthur, M.A., 2002. Control of nitrogen loss from forested watersheds by soil carbon:nitrogen ratio and tree species composition. Ecosystems 5, 712–718.
Lovett, G.M., Weathers, K.C., Arthur, M.A., Schultz, J.C., 2004. Nitrogen cycling in a northern hardwood forest: do species matter? Biogeochemistry 67, 289–308.
MacDonald, J.A., Dise, N.B., Matzner, E., Armbruster, M., Gundersen, P., Forsius, M., 2002. Nitrogen input together with ecosystem nitrogen enrichment predict nitrate leaching from European forests. Global Change Biology 8, 1028–1033.
Schopp, W., Posch, M., Mylona, S., Johansson, M., 2003. Long-term development of acid deposition (1880–2030) in sensitive freshwater regions in Europe. Hydrology and Earth System Sciences 7, 436–446.
Sullivan, M.S., Jones, M.J., Lee, D.C., Marsden, S.J., Fielding, A.H., Young, E., 2006. A comparison of predictive methods in extinction risk studies: contrasts and decision trees. Biodiversity and Conservation 15, 1977–1991.
van der Salm, C., de Vries, W., Reinds, G.J., Dise, N.B., 2007. N leaching across European forests: derivation and validation of empirical relationships using data from intensive monitoring plots. Forest Ecology and Management 238, 81–91.
Vitousek, P.M., Aber, J.D., Howarth, R.W., Likens, G.E., Matson, P.A., Schindler, D.W., Schlesinger, W.H., Tilman, D.G., 1997. Human alteration of the global nitrogen cycle: sources and consequences. Ecological Applications 7 (3), 737–750.
Witten, I.H., Frank, E., 2005. Data Mining: Practical Machine Learning Tools and Techniques, second ed. Elsevier, San Francisco.
Wright, R.F., van Breemen, N., 1995. The NITREX project: an introduction. Forest Ecology and Management 71 (1–2), 1–5.