After 50 years of quantitative palaeoecology

After 50 Years of Quantitative Palaeoecology
– Senescence, Maturity, or Progress?
H John B Birks
University of Bergen and University College London
Lanzhou, August 2015
Typical ‘Life’ of a Scientific Approach – Main Phases
[Figure: activity (e.g. publications) plotted against time (1965–2015), rising through the Pioneer, Building, and Mature? phases and then branching into progress, stability, or senescence]
Where will we go from 2015?
At end of mature phase, stable state implies now accepted as ‘normal science’
Senescence implies some of the earlier ideas not very useful, best forgotten
about!
Progress indicates more to be done, not yet fully mature
The Pioneering Phase: 1965–1974
Cognisance, ignorance, knowledge, and uncertainty
(began 50 years ago)
All descriptive – characterise patterns in complex multivariate data
(stratigraphical or surface-sample data). Exceptions are Webb & Bryson (1972),
which provides narratives (untestable climate reconstructions) from pollen data,
and Mosimann (1965), which presents robust statistical methods for estimating
errors in pollen counting – sadly ignored today!
The Building Phase: 1975–1985
(began 40 years ago)
Primarily descriptive or narrative, hint of
analytical hypothesis testing in Birks &
Peglar (1979) in relation to different
interglacials.
At the same time, important developments going on in quantitative
plant ecology
J Ecol
1973
Vegetatio
1980
1987
Ecology
1986
Correspondence analysis (CA),
detrended correspondence
analysis (DCA), canonical
correspondence analysis (CCA)
The Mature Phase: 1985–2015
(began 30 years ago)
Primarily narratives (plausible but untestable environmental
reconstructions) and analytical hypothesis-testing
Culminated in the ‘big blue book’ of 2012 edited by Andy
Lotter, Steve Juggins, John Smol, and myself
Was that the ‘pinnacle’ of the subject’s life?
At same time, applied statisticians were developing new
‘state-of-the-art’ techniques for handling and analysing huge
data-sets, so-called ‘data-mining’ and ‘statistical learning’
procedures.
Data-sets so large that analysis can ‘learn’ from the data
when split into a ‘training’ set, a ‘validation’ set, and a ‘test’
set. With ever-increasing computer power, can repeat the
analyses with such random splitting of the data many times
(e.g. 1000) to assess uncertainties, significance levels, etc.
‘Statistics in the computer age’
Cross-validation – bootstrapping, leave-one-out, leave-n-out
(where n may be 100, 500, etc. objects)
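The repeated random splitting described above can be sketched as follows. This is a minimal illustration, not a method from the talk: the data are simulated, the model is a one-predictor least-squares line, and the names `fit_line` and `repeated_split_rmse` are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def repeated_split_rmse(xs, ys, n_repeats=1000, test_fraction=0.2, seed=0):
    """Prediction error (RMSE) on held-out samples for many random splits."""
    rng = random.Random(seed)
    n = len(xs)
    n_test = max(1, round(test_fraction * n))
    errors = []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)                      # a fresh random split each time
        test, train = idx[:n_test], idx[n_test:]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        sq = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in test]
        errors.append((sum(sq) / len(sq)) ** 0.5)
    return errors


# Simulated (hypothetical) data: a noisy linear relationship.
noise = random.Random(1)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + noise.gauss(0, 0.5) for x in xs]

errors = repeated_split_rmse(xs, ys)
ranked = sorted(errors)
print("median RMSE:", round(statistics.median(errors), 3))
print("central 95% of splits:", round(ranked[25], 3), "to", round(ranked[974], 3))
```

The spread of the 1000 error values is what gives an empirical handle on uncertainty; leave-one-out is the special case where each test set holds a single object.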
Discuss six of these new techniques that in the last few years
have shown themselves to be important additions to the
quantitative palaeoecologist’s tool-kit.
To put them into context, outline the basic uses of numerical
methods in quantitative palaeoecology (Birks 2013
Encyclopedia of Quaternary Science 3, 821-830).
Data collection and data
assessment
• Identification
• Error estimation
Data summarisation
• Single stratigraphical or
geographical data-sets (e.g.
zonation, ordination)
• Two or more stratigraphical
or geographical data-sets
Data analyses
• Palynological richness
• Population analysis
• Rate-of-change analysis
• Time-series analysis
• Pollen-based climate
reconstructions
Data interpretation
• Vegetation reconstruction
• Causative factors
Quaternary data-sets can be modern (‘surface samples’) (M)
or fossil (stratigraphical) (F) data
Six new techniques
1. Co-correspondence analysis (Co-CA)
2. Classification and regression trees (CART) + indicator species analysis (INDVAL)
3. Procrustes analysis and comparison of ordinations and classification
4. Principal curves (PC)
   (1–4: M or F, data summarisation, data analysis)
5. Intrinsic and extrinsic drivers of change
   (5: F, data interpretation)
6. Statistical significance of environmental reconstructions
   (6: F, data analysis, data interpretation)
Palaeoecological pioneers of these new techniques in the last
2–3 years – next generation
Jacob Carstensen
Tom Davidson
Ulrike Herzschuh
Vivian Felde
Gavin Simpson
Alistair Seddon
And the applied statisticians who developed these methods
Cajo ter Braak
Robert Tibshirani
Trevor Hastie
Mark Hill
Glenn De’ath
Pierre Legendre
Co-Correspondence Analysis (Co-CA)
ter Braak & Schaffers (2004) Ecology 85: 834-846
Problems: are carabid beetles in grassland more closely related to vegetation structure
(height, cover, biomass, etc.) than to vegetation composition? Are fen bryophytes more closely related
to vegetation composition than to water chemistry?
Data - beetles, plants, vegetation type, vegetation structure, and environmental variables
all from same set of sites.
Approaches
1. RDA/CCA
- predict beetles from environmental data.
- cannot predict beetles from vegetation data because may be more plant species
(predictors) than sites. No constraints.
2. CA/DCA
of beetles and plants separately, correlate the axes (compare with Procrustes rotation).
Correlative rather than predictive approach.
Can reduce plant data to (D)CA axes first, use these as predictors. Will work if major
patterns in one biological data-set are important for the other response data-set. Need
not be so.
Need a one-step method where the most important relationships are expressed in the first
few axes so that nothing important is missed. Co-correspondence analysis.
Co-correspondence analysis (Co-CA)
Problem with combined CA is that each CA has its own site weights (the site's total
abundance of the species in the analysis). Pointless to have weights that are a sum of
both beetles and plants.
Co-CA works as in CA (reciprocal averaging algorithm) but has an explicit maximisation
criterion: the covariance between the WA site scores of the beetles and the WA site scores
of the plants is maximised. Replaces linear combinations
(PCA, PLS, RDA) with weighted averages, so it is suitable for unimodal biological data.
Symmetric, descriptive Co-CA (can swap data sets)
Asymmetric, predictive Co-CA (data A are thought to influence data B)
CA - selects species scores (by WA) to maximise variance of site scores under the
constraint that the species scores have unit variance. Symmetrical in that species and
sites can be interchanged in the optimisation criterion.
Co-CA - calculates two sets of WA species and site scores to maximise weighted
covariance between the two sets of WA species and site scores with allowance for
differences in weights among data. What is maximised is covariance between two sets of
site scores with common site weights; covariance is maximised by finding appropriate
species weights. Species scores of one set are weighted averages of other set's site
scores and site scores are weighted averages of the species scores of own set.
Beetles 91 species, 173 plant species, 30 sites.
Eigenvalues of the first three axes of separate CAs and DCAs and of symmetric Co-CA of beetles and plants.

Data             Method               Axis 1   Axis 2   Axis 3
Beetles          CA                   0.50     0.36     0.32     4.99
                 DCA                  0.50     0.32     0.21
                 Length of gradient   3.22     2.74     2.57
Plants           CA                   0.57     0.53     0.42     5.65
                 DCA                  0.57     0.41     0.27
                 Length of gradient   3.44     2.99     2.88
Beetles-plants   Co-CA                0.25     0.13     0.08     0.94

Highly structured data-sets - high eigenvalues, long gradients.
Correlation coefficients between beetle-derived and plant-derived
site scores of the first three axes of separate CAs and DCAs and of
symmetric Co-CA (% fit = the percentage fit of the beetle data by
the first two plant-derived axes).

Method   Axis 1   Axis 2   Axis 3   % fit (2 axes)
CA       0.88     0.27     0.46     15
DCA      0.89     0.53     0.07     16
Co-CA    0.96     0.94     0.88     19
Highest correlations for all three axes in Co-CA
Percentage variance for 2 axes highest (19%) in Co-CA
Direct ordination procedure for relating one community data-set to another
community data-set.
Combines WA and PLS to maximise covariance between WA species scores of one
community data-set with those of another. Finds ecological gradients common to
both.
Species assemblages are a multivariate 'bio-assay' of the environment.
Assemblages analysed by Co-CA often give better predictions of another set of
assemblages than using environmental variables alone. Often environmental
basis for ecological gradients is not precisely known.
Fen bryophytes - vascular plants: 28% explained (Co-CA)
Fen bryophytes - environmental variables: 17% explained
Co-CA can be used to find good indicators for biodiversity. Not all species groups
are equally easy to sample or identify. Can try to predict a 'difficult' group from
an 'easy' group. Need representative full data for both species groups from
common set of sites for Co-CA. After this, only the 'easy' group need be sampled.
Another idea is to look at biological data at different taxonomic levels - species,
genera, families, or as functional types. See how well each predicts each other.
A palaeoecological application of Co-CA
Setesdal, south-central Norway (* = lakes)
[Figures from Felde et al. (2014): Setesdal pollen percentages; Setesdal plant abundances; Setesdal plants as pollen equivalents]
How similar are the patterns in the modern pollen and the
modern vegetation (plants or pollen equivalents)?
Co-correspondence analysis
See how the co-correspondence decreases with increasing elevation:
far-distance pollen blown up from low elevations distorts the
pollen–vegetation relationship.
Co-CA has also been used to quantify co-correspondence between down-core
variables (must be in identical samples)
(e.g. diatoms, cladocera, chironomids)
Felde et al. (2014)
Classification and Regression Trees (CART) and
Indicator-Species Analysis (IndVal)
Common questions in analysis of large multivariate data-sets
(fossil or modern) are
(1) Are there any ‘real’ groups or clusters in the data (i.e. groups
that are not simply an artefact of the clustering algorithm, which
will, by design, force any data, even random numbers, into
‘groups’)?
(2) How many ‘real’ groups are there in a data-set?
Related question is which variables are ‘indicative’ of particular
groups (‘indicator species’).
Recent developments in applied statistics now make it possible
to answer these questions.
Explain variation of single response variable by one or more explanatory
or predictor variables.
Response variable can be quantitative (regression trees) or categorical
(classification trees).
Predictor variables can be categorical and/or quantitative.
Trees constructed by repeated splitting of data, defined by a simple rule
based on single predictor variable.
At each split, data partitioned into two mutually exclusive groups, each of
which is as homogeneous as possible. Splitting procedure is then applied
to each group separately.
Aim is to partition the response into homogeneous groups but to keep the
tree as small and as simple as possible.
Usually create an over-large tree first, pruned back to desired size by
cross-validation.
Each group typically characterised by either the distribution (categorical
response) or mean value (quantitative response) of the response variable,
group size, and the predictor variables that define it.
Splitting Procedures
Way that predictor variables are used to form splits depends on their type.
1. Categorical variable with two levels (e.g. small, large), only one
split is possible, with each level defining a group.
2. Categorical variables with more than two levels, any combination of
levels can be used to form a split. With k levels, there are $2^{k-1} - 1$
possible splits.
3. Quantitative predictor variables, a split is defined by values less
than and greater than some chosen value. Only the rank order of
quantitative variables determines a split, and for u unique values
there are u-1 possible splits.
From all possible splits of predictor variables, select the one that maximises the
homogeneity of the two resulting groups. Homogeneity can be defined in many
ways, depending on the type of response variable.
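A minimal sketch of split selection for a quantitative predictor and a quantitative response, using the within-group sum of squares as the homogeneity measure. The depth and rating values are made up for illustration, and the function names are hypothetical rather than taken from any CART package.

```python
def sum_of_squares(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def n_categorical_splits(k):
    """Number of distinct binary splits of a categorical predictor with k levels."""
    return 2 ** (k - 1) - 1


def best_split(x, y):
    """Threshold on x that minimises the summed within-group SS of y.

    With u unique values of x there are u - 1 candidate thresholds,
    placed midway between consecutive unique values.
    """
    unique = sorted(set(x))
    best = None
    for lo, hi in zip(unique, unique[1:]):
        threshold = (lo + hi) / 2
        left = [yv for xv, yv in zip(x, y) if xv < threshold]
        right = [yv for xv, yv in zip(x, y) if xv >= threshold]
        within_ss = sum_of_squares(left) + sum_of_squares(right)
        if best is None or within_ss < best[1]:
            best = (threshold, within_ss)
    return best


# Hypothetical data: abundance rating jumps below 3.5 m depth.
depth_m = [1, 2, 3, 4, 5, 6]
rating = [0.0, 0.2, 0.1, 2.0, 2.2, 1.9]

threshold, within_ss = best_split(depth_m, rating)
print("split at depth <", threshold, "m; within-group SS =", round(within_ss, 3))
print("binary splits for a 3-level factor:", n_categorical_splits(3))
```

A full tree would apply `best_split` recursively to each resulting group and over every predictor, then prune by cross-validation.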
Trees drawn graphically, with root node representing the undivided data at the
top, and the branches and leaves (each leaf representing a final group) beneath.
Can also show summary statistics of nodes and distributional plots.
Ecological example
Regression tree
(5 point abundance)
Regression tree analysis of the abundance of
the soft coral species Asterospicularia laurae
rated on a 0-5 scale; only values 0-3 were
observed. The explanatory variables were
shelf position (inner, mid, outer), site location
(back, flank, front, channel), and depth (m).
Each of the three splits (nonterminal nodes) is
labelled with the variable and its values that
determine the split. For each of the four
leaves (terminal nodes), the distribution of
the observed values of A. laurae is shown in a
histogram. Each node is labelled with the
mean rating and number of observations in
the group (italic, in parentheses). A. laurae is
least abundant on inner- and mid-reefs (mean
rating = 0.038) and most abundant on front
outer-reefs at depths  3m (1.49). The tree
explained 49.2% of the total ss, and the
vertical depth of each split is proportional to
the variation explained.
Classification tree
( +/ - )
Classification tree on the
presence-absence of A.
laurae. Each leaf is
labelled (classified)
according to whether A.
laurae is predominantly
present or absent, the
proportions of
observations in that
class, and the number of
observations in the
group (italic, in
parentheses). The
misclassification rate of
the model was 9%,
compared to 15% for
the null model (guessing
with the majority, in this
case the 85% of
absences).
Splits minimise sum-of-squares within groups in regression tree; splits are based on
proportions of presence and absence in the classification tree.
CART can be used for (i) description and summarisation of data and (ii) prediction purposes for
new data.
Can identify the environmental conditions under which a taxon is particularly abundant
(regression tree) or particularly frequent (classification tree).
Regression trees explaining the abundances
of the soft coral taxa Efflatounaria, Sinularia
spp., and Sinularia flexibilis in terms of the
four spatial variables (shelf position, location,
reef type, and depth) and four physical
variables (sediment, visibility, waves, and
slope). At the bottom of the cross-validation
plots (a, d, g), the bar charts show the relative
proportions of trees of each size selected
under the 1-SE rule (grey) and minimum rules
(white) from a series of 50 cross-validations.
For Efflatounaria (a), a five-leaf tree is most
likely by either the 1-SE or the minimum rule.
For Sinularia spp. (d), five- to eight-leaf trees
have support, and for S. flexibilis (g), five- to
nine-leaf trees have support. Cross-validation
plots (a, d, g), representative of the modal
choice for each taxon according to the 1-SE
rule, are also shown. For all three taxa, a five-leaf tree was selected (c, f, i). The shaded
ellipses enclose nodes pruned from the full
trees (b, e, h), each of which accounted for >
99% of the total ss.
Multivariate Regression Trees
De'ath, G. (2002) Ecology 83, 1105-1117
Natural extension of univariate regression trees. Considers multivariate response, not
single response.
Replace univariate response by multivariate assemblage response and redefine the
impurity of a node by summing the univariate impurity measure over the multivariate
response.
Extend univariate sum-of-squares impurity criterion to multivariate sum-of-squares about
the multivariate mean. Sum of squared Euclidean distances (SSD) of samples about the
node centroid.
Each split minimises the SSD of samples from the centroids of the nodes to which they
belong. Maximises the SSD between node centroids (cf. k-means clustering). This
minimises SSD between all pairs of samples within nodes and maximises SSD between all
pairs of samples in different nodes.
Each tree leaf can be characterised by multivariate mean of its samples, number of
samples at the leaf, and the predictor values that define it.
Forms clusters of sites by repeated splitting of data, each split defined by simple rule
based on environmental values. Splits chosen to minimise the dissimilarity of sites within
node.
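The multivariate impurity can be sketched in the same way: each candidate split is scored by the summed SSD of samples about their node centroids. The pollen proportions and depths below are invented, and this shows a single split only, not a full MRT with pruning.

```python
def ssd_about_centroid(rows):
    """Sum of squared Euclidean distances of samples about their centroid."""
    if not rows:
        return 0.0
    n = len(rows)
    centroid = [sum(col) / n for col in zip(*rows)]
    return sum((v - c) ** 2 for row in rows for v, c in zip(row, centroid))


def best_multivariate_split(predictor, assemblages):
    """Threshold on one predictor that minimises the summed within-node SSD."""
    unique = sorted(set(predictor))
    best = None
    for lo, hi in zip(unique, unique[1:]):
        threshold = (lo + hi) / 2
        left = [row for p, row in zip(predictor, assemblages) if p < threshold]
        right = [row for p, row in zip(predictor, assemblages) if p >= threshold]
        ssd = ssd_about_centroid(left) + ssd_about_centroid(right)
        if best is None or ssd < best[1]:
            best = (threshold, ssd)
    return best


# Hypothetical core: three taxa (proportions) at six depths.
depth_cm = [10, 20, 30, 40, 50, 60]
pollen = [
    [0.70, 0.20, 0.10],
    [0.65, 0.25, 0.10],
    [0.60, 0.25, 0.15],
    [0.15, 0.25, 0.60],
    [0.10, 0.20, 0.70],
    [0.10, 0.25, 0.65],
]

boundary, ssd = best_multivariate_split(depth_cm, pollen)
print("first zone boundary at", boundary, "cm; within-node SSD =", round(ssd, 4))
```

Because the only predictor is depth, every split is a stratigraphically contiguous boundary, which is why the approach can serve as a constrained clustering.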
MRT is a form of constrained clustering, with constraints set by
the predictor variables and their values
MRT can be extended to dissimilarity measures other than
squared Euclidean distance (distance-based MRT)
Can identify indicator species using Dufrêne & Legendre (1997)
INDVAL approach
Use multivariate regression trees for numerical zonation of
fossil data (depth or age as predictor)
Lowest cross-validated relative error is for 8 groups, but
6 groups lie within one standard error of 8 groups, so
6 groups is the simplest partition into ‘real’,
statistically different groups
Simpson & Birks (2012)
Can also be applied to modern surface samples (multivariate
classification trees with vegetation groups as predictors)
Felde et al. (2014)
Given these groups, are there any statistically significant
‘indicator species’?
Basic concept and tradition in ecology and biogeography – characteristic or
indicator species e.g. species characteristic of particular habitat, geographical
region, vegetation type. Valuable in monitoring, conservation, management,
description, and stratigraphy.
Add ecological meaning to groups of sites discovered by clustering
INDICATOR SPECIES – indicative of particular groups of sites. ‘Good’ indicator
species should be found mostly in a single group of a classification and be
present at most of the sites belonging to that group. Important DUALITY (faithful
AND high constancy)
INDVAL – Dufrêne & Legendre (1997) Ecological Monographs 67, 345-366
Derives indicator species from any hierarchical or non-hierarchical classification
of objects
Indicator value index based only on within-species abundance and occurrence
comparisons. Its value is not affected by the abundances of other species.
Significance of indicator value of each species is assessed by a randomisation
procedure.
Indicator Species Value
Specificity measure (FAITHFULNESS)
$A_{ij} = N\,\mathrm{individuals}_{ij} / N\,\mathrm{individuals}_{i.}$
where $N\,\mathrm{individuals}_{ij}$ is the mean abundance of species i across the sites in group j
and $N\,\mathrm{individuals}_{i.}$ is the sum of the mean abundance of species i over all groups
(means are used to remove any effects of variation in the number of sites belonging to the
various groups)
Fidelity measure (CONSTANCY)
$B_{ij} = N\,\mathrm{sites}_{ij} / N\,\mathrm{sites}_{.j}$
where $N\,\mathrm{sites}_{ij}$ is the number of sites in group j where species i is present
and $N\,\mathrm{sites}_{.j}$ is the total number of sites in cluster j
$A_{ij}$ is maximum when species i is present in group j only
$B_{ij}$ is maximum when species i is present in all sites in group j
Indicator value
$\mathrm{INDVAL}_{ij} = (A_{ij} \cdot B_{ij} \cdot 100)\,\%$
Indicator value of species i for a grouping of sites is the largest value of INDVALij observed over all
groups j of that classification.
INDVALi = max (INDVALij)
Will be 100% when individuals of species i are observed at all sites belonging to a single group
A random re-allocation procedure of sites among the
groups is used to test the significance of INDVALi
Can be computed for any given partition of grouping of
sites and/or for all levels of a hierarchical classification of
sites.
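The definitions above translate directly into a short calculation. The following is a sketch for one species and one partition, with a random re-allocation test; the abundances and group labels are hypothetical, and the function names are not from any published package.

```python
import random


def indval(abundances, groups):
    """INDVAL (%) of one species for every group of a site partition."""
    labels = sorted(set(groups))
    mean_abundance, constancy = {}, {}
    for g in labels:
        values = [a for a, label in zip(abundances, groups) if label == g]
        mean_abundance[g] = sum(values) / len(values)
        constancy[g] = sum(1 for v in values if v > 0) / len(values)   # B_ij
    total = sum(mean_abundance.values())
    result = {}
    for g in labels:
        specificity = mean_abundance[g] / total if total > 0 else 0.0  # A_ij
        result[g] = specificity * constancy[g] * 100
    return result


def indval_test(abundances, groups, n_perm=999, seed=0):
    """Observed max INDVAL and its permutation p-value."""
    rng = random.Random(seed)
    observed = max(indval(abundances, groups).values())
    shuffled = list(groups)
    as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)              # random re-allocation of sites
        if max(indval(abundances, shuffled).values()) >= observed:
            as_large += 1
    return observed, (as_large + 1) / (n_perm + 1)


# Hypothetical species found at every site of group "a" and nowhere else.
abundances = [5, 3, 4, 6, 0, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = indval(abundances, groups)
observed, p_value = indval_test(abundances, groups)
print(per_group, observed, p_value)
```

The species reaches the maximum of 100% for group "a" because it is both faithful (present in that group only) and constant (present at all of its sites).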
[Figure: Diagram of the analysis steps for the indicator value method. A species × sites table is turned into either a site ranking (ordination: PCoA, CA, MDS, DCA) or site groups (hierarchical clustering, e.g. UPGMA, Ward; non-hierarchical clustering, e.g. k-means); any site typology can then be used for measuring species indicator power, where the observed value is compared with a randomised INDVAL distribution built from random permutations of sites in the typology]
Carabid beetles 97 species. 123 year-catches from 69 different localities
representing 9 habitats.
[Figure: TWINSPAN dendrogram with indicator species (abundance level in parentheses) shown at each division.
Habitat groups labelled: Chalky xeric grasslands; Chalky mesic grasslands; Zn grasslands and xeric sandy heathlands; Atypical and xeric gravelly heathlands; Temporary flooded heathlands; Peaty heathlands; Fringes of ponds and alluvial grasslands; Swamps and raised mires.
Indicator species shown: P. madidus, H. rubripes, P. cupreus, P. melanarius, A. equestris, P. versicolor, C. problematicus, P. lepidus, C. melanocephalus, A. ater, B. ruficolis, D. globosus, C. violaceus, T. cognatus, P. diligens, P. rhaeticus, A. fuliginosus, P. minor, L. pilicornis, T. secalis, P. nigrita, A. communis]
Dendrogram representing the TWINSPAN classification of the year-catch
cycles. The indicator species relative abundance levels are expressed on an
ordinal scale (1, 0-2%; 2, 2-5%; 3, 5-10%; 4, 10-20%; and 5, 20-100%).
Felde et al. (2014)
Modern pollen and vegetation types
Procrustes Analysis and Comparisons of
Ordinations and Classification
Many ordination or ‘scaling’ methods can be used to summarise
complex multivariate data in few (usually 1 or 2) dimensions
Principal components analysis
Correspondence analysis
Detrended correspondence analysis
Non-metric multidimensional scaling
Metric scaling (= principal
coordinates analysis)
Constrained ordinations (e.g.
canonical correspondence analysis)
All these methods make different assumptions of the data (linear
or unimodal responses, abundant taxa have greatest influence, rare
taxa important or unimportant, etc.)
Are the results we obtain with different methods consistent or are
they method-dependent?
Need a numerical method to compare two ordinations of the same
n samples. Procrustes rotation or analysis
Procrustes Rotation
rotates ordinations to maximum similarity between two
ordinations and estimates the minimised difference
Two configurations of points in ordinations representing the same n
samples.
Take one configuration as fixed, move the other to match as closely as
possible and to minimise the sum of squared distances of the
transformed points from the respective points of the fixed configuration
1. Translation of origin – shift the origins of the co-ordinate axes
2. Rotation and/or reflection of axes
3. Uniform scaling (deflate or inflate the axis scale)
Single points can move a lot although the sum of squared distances can
stay fairly constant, especially in large data sets.
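The three steps (translation, rotation and/or reflection, uniform scaling) can be sketched for two-dimensional ordinations with a closed-form rotation angle. This is an illustrative implementation under stated assumptions: only two axes, made-up site scores, and the symmetric statistic m² = 1 − (fit)² / (SS_target · SS_other) as one common standardisation.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def procrustes_2d(target, other):
    """Fit `other` to `target`; return (fitted points, m2 statistic)."""
    tx, ty = centroid(target)
    ox, oy = centroid(other)
    X = [(x - tx, y - ty) for x, y in target]       # 1. translation
    Y = [(x - ox, y - oy) for x, y in other]
    ss_x = sum(a * a + b * b for a, b in X)
    ss_y = sum(a * a + b * b for a, b in Y)

    best = None
    for reflect in (False, True):                   # 2. rotation / reflection
        Yr = [(a, -b) for a, b in Y] if reflect else Y
        A = sum(x1 * y1 + x2 * y2 for (x1, x2), (y1, y2) in zip(X, Yr))
        B = sum(x2 * y1 - x1 * y2 for (x1, x2), (y1, y2) in zip(X, Yr))
        fit = math.hypot(A, B)
        if best is None or fit > best[0]:
            best = (fit, math.atan2(B, A), Yr)
    fit, theta, Yr = best

    scale = fit / ss_y                              # 3. uniform scaling
    c, s = math.cos(theta), math.sin(theta)
    fitted = [(tx + scale * (c * a - s * b), ty + scale * (s * a + c * b))
              for a, b in Yr]
    m2 = 1 - fit ** 2 / (ss_x * ss_y)
    return fitted, m2


# Hypothetical check: the second configuration is the first one rotated
# by 90 degrees, inflated three times, and shifted.
target = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
other = [(5.0 - 3 * y, -2.0 + 3 * x) for x, y in target]

fitted, m2 = procrustes_2d(target, other)
print([tuple(round(v, 6) for v in p) for p in fitted], round(m2, 6))
```

Per-site residuals are the distances between each `fitted` point and its `target` point; for more than two axes the rotation is usually found by singular value decomposition instead.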
Procrustes rotation of
NMDSCAL (circles) and
PCA (arrows)
ordinations
Procrustes rotation residuals
(differences between NMDSCAL and
PCA ordination site scores)
Generalised Procrustes Rotation
Any number of configurations. Basic idea is to find a consensus
or centroid configuration so that the fit of ordinary Procrustes
rotation to this centroid over all configurations, is optimal.
Minimise $m^2$, where $m^2 = \sum_i m_i^2$ and $m_i^2$ is the Procrustes statistic
for each pair-wise comparison.
1. Translation
2. Rotation and/or reflection
3. Scaling
Can ordinate all the $m^2$ values in a principal coordinates
analysis
Example – to compare results of 12 different ordination methods to the same data
$m^2$ can be considered as squared distances in a PCOORD
analysis
N = non-metric scaling
C = correspondence analysis
P = principal co-ordinates analysis
Location of ordination methods on the
two principal co-ordinates axes: these
two axes represent 75% of the
variation in the $m^2$ statistics.
1,2 = presence/absence data only
2 = all joint absences ignored
3,4,5 = abundance data
3 = log abundance data
4 = all joint absences ignored
5 = abundance data
Comparison of ordinations of modern pollen data
See how CA, DCA, and CCA form
second axis
Felde et al. (2014)
Can also use PROTEST to assess if the $m^2$ value for a given comparison
of two ordinations is statistically significantly different from random
expectation derived from a computer-intensive randomisation test
Can have many classifications (clusterings or partitionings) of
same data
e.g.
k-means clustering
Spherical k-means
TWINSPAN
Ward’s clustering method
Multivariate classification
trees
How can we compare these clusterings?
How to Compare Classifications
(1) Cross-classification table
(2) Rand coefficient (1971)
J. Amer. Stat. Assoc. 66, 846-850
$$c = 1 - \frac{\tfrac{1}{2}\left[\sum_i \left(\sum_j n_{ij}\right)^2 + \sum_j \left(\sum_i n_{ij}\right)^2\right] - \sum_i \sum_j n_{ij}^2}{\tfrac{1}{2}\, n (n - 1)}$$

                        Classification A
                        I     II    III
Classification B   I    2     2     1
                   II   1     0     4      (n = 10)

$c = 1 - [\tfrac{1}{2}\{(2 + 1)^2 + (2 + 0)^2 + (1 + 4)^2 + (2 + 2 + 1)^2 + (1 + 0 + 4)^2\} - (2^2 + 2^2 + 1^2 + 1^2 + 0^2 + 4^2)] / 45$
$= 1 - [\tfrac{1}{2}\{38 + 50\} - 26] / 45$
$= 1 - 18/45$
$= 0.6$
Range
0 (dissimilar) to 1 (identical classifications)
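The worked example can be checked with a few lines of code. This sketch computes Rand's coefficient from a cross-classification table of counts; the function name is hypothetical.

```python
def rand_coefficient(table):
    """Rand's (1971) coefficient from a cross-classification table of counts."""
    n = sum(sum(row) for row in table)
    row_sq = sum(sum(row) ** 2 for row in table)          # squared row totals
    col_sq = sum(sum(col) ** 2 for col in zip(*table))    # squared column totals
    cell_sq = sum(v ** 2 for row in table for v in row)   # squared cell counts
    disagreements = 0.5 * (row_sq + col_sq) - cell_sq     # discordant pairs
    return 1 - disagreements / (n * (n - 1) / 2)


# Rows: classification B (I, II); columns: classification A (I, II, III).
table = [[2, 2, 1],
         [1, 0, 4]]

c = rand_coefficient(table)
print(round(c, 3))
```

For this table the numerator is 18 discordant pairs out of 45 possible pairs, reproducing the value of 0.6 from the hand calculation.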
Rand's coefficient should be corrected for chance so as to
ensure
1. its expected value is 0 when the partitions are selected at
random (subject to the constraint that the row and
column totals are fixed)
2. its maximum value is 1
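One widely used correction with exactly these two properties is the adjusted Rand index of Hubert & Arabie (1985). The slides do not say which correction is intended, so using it here is an assumption; the sketch works from the same kind of cross-classification table.

```python
from math import comb


def adjusted_rand(table):
    """Chance-corrected Rand index (Hubert & Arabie form; an assumed choice).

    Undefined (division by zero) in the degenerate case where both
    partitions place every object in a single group.
    """
    n = sum(sum(row) for row in table)
    pairs_cells = sum(comb(v, 2) for row in table for v in row)
    pairs_rows = sum(comb(sum(row), 2) for row in table)
    pairs_cols = sum(comb(sum(col), 2) for col in zip(*table))
    expected = pairs_rows * pairs_cols / comb(n, 2)   # fixed row/column totals
    maximum = (pairs_rows + pairs_cols) / 2
    return (pairs_cells - expected) / (maximum - expected)


table = [[2, 2, 1],
         [1, 0, 4]]          # same table as the Rand worked example
identical = [[3, 0],
             [0, 2]]         # two identical partitions

ari = adjusted_rand(table)
print(round(ari, 3), adjusted_rand(identical))
```

For the worked-example table the corrected value is 16/97, far lower than the uncorrected 0.6, which shows how much of the raw agreement is expected by chance alone.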
The similarity between two independent classifications of the
same set of objects can be assessed by comparing their Rand
statistic with its distribution under the randomisation model.
For small values of n objects, the complete set of n! values of
Rand can be evaluated. For large values of n, comparison is
made with the values resulting from a random subset of
permutations.
Matrix of Rand’s (1971) Coefficients between Partitions of the Lichti-Federovich and Ritchie (1968) Data Based on Vegetation-Landform Units and
Partitions Suggested by Several Numerical Classifications of the Surface-Pollen Data
Numerical pollen            Vegetation – landform classification
classification              3 groups   7 groups   11 groups
Agglomerative (3 groups)    0.76       0.65       0.64
Agglomerative (5 groups)    0.69       0.76       0.77
Agglomerative (9 groups)    0.61       0.86       0.87
Hybrid (9 groups)           0.61       0.86       0.87
Hybrid (11 groups)          0.59       0.85       0.88
Given the Rand values between all pairs of classifications, can
ordinate them using principal coordinates analysis to see how
similar different classifications are
Felde et al. (2014)
Results from related methods (e.g. k-means, Ward’s method)
that are based on the same underlying numerical approach (e.g.
sum-of-squares) are more similar than results from methods
with a different underlying numerical approach (e.g. random
forests, TWINSPAN).
Bottom line is that, in general, palaeoecological data are well
structured and any robust ordination or clustering method
detects this structure.
General recommendations for data summarisation
1. Modern surface samples
Correspondence analysis, detrended correspondence
analysis, principal curves
Multivariate classification trees, k-means clustering
2. Fossil data
Principal components analysis, CA, DCA, principal curves
Multivariate regression trees
Principal Curves (PC)
Principal components analysis (PCA) widely used as data-summarisation technique. Axes are linear combinations of the
data that best explain, in a statistical sense, the data.
Components are inherently linear and if data do not follow linear
patterns, PCA is sub-optimal in capturing this non-linear
variation. Hence CA, non-metric scaling, or principal coordinates
analysis are used as ecological and palaeoecological data are
inherently non-linear. Species responses are non-linear, usually
unimodal.
De'ath, G. (1999) Ecology 80, 2237-2253
Principal curves are smooth one-dimensional curves in a high-dimensional space.
Form of non-linear PCA, analogous to LOESS smoothers as a non-linear regression tool.
Principal curves minimise sum of squared distances from data (as
does PCA) but to a curve, not to a line or plane as in PCA.
Two species along single gradient
Principal curve showing gradient location
(a) least-square regression
(b) PCA
(c) cubic smoothing spline
(d) PC – combines (b) and (c) to
create PC. Tries to minimise
the orthogonal distances
Simpson &
Birks (2012)
Degree of smoothness constrained by a penalty term. Optimal
degree of smoothing identified by generalised cross-validation.
Point on the PC to which an object projects is the point on the
curve that is closest to the object in m dimensions.
Fitting is a complex two-step iterative procedure. Start with a PCA result.
25 sites, 4 species, Gaussian responses, one gradient. Plotted on first two
PCA axes. Iterative fitting of principal curves.
(a) Data using PCA axis 2 (39.4%) as initial curve
(b) First iteration, smooth spline 3 d.f.
(c) Convergence with 3 d.f.
(d) Improved and final fit with 7 d.f. 98.3% variance
(e) Result of using PCA axis 1 (50.4%) as start and 3 d.f.
(f) As (e) but 7 d.f.
Also models, using smoothers, the response variables along
the PC
Principal curves and real data
12 hunting spiders at 28 sites and 6 environmental variables
Principal curve superimposed on PCA biplot. Numbers are locations along the gradient. Principal
curve captures 90% of species variance. Modelled environmental variable values for 6 locations
show PC is mainly moisture, sand, moss and twig gradient.
Species responses along principal curves
All are unimodal. Optima well
approximated by intersection of
species vectors with the
principal curve. Curves have
approximately equal tolerances.
Ideal for finding 1-dimensional
gradients that explain species
composition as well, or better
than, higher dimensional
ordination methods. Have been
extended to 2-dimensional
gradients as principal surfaces.
Abundances and response curves from the principal curve
gradient analysis of the hunting spider data. Each panel
represents a single species (8-letter code). The plots suggest
that the principal curve fit is adequate and show unimodal
response curves of approximately equal tolerances, with
maxima located at widely varying locations along the gradient.
Less restrictive in assumptions
than PCA, CA, or DCA. Only
assumes smooth responses.
Very neutral method (cf. LOESS
in regression).
Computationally difficult, hardly
used yet...
Palaeoecological use –
Abernethy Forest late-glacial–early-Holocene pollen data
Simpson & Birks (2012)
PC axis 1
46.5%
PC axis 2
23.7%
Total
80.2%
PC
95.8%
PCA1 + PCA2 80.2%
CA1 + CA2
Simpson & Birks (2012)
Distance along PC
expressed as rate of
change per kyr
Distance along gradient
expressed as a proportion of total
gradient for PC, PCA1, and CA1
52.3%
Simpson & Birks (2012)
Response curves for 9 most abundant pollen taxa in
Abernethy Forest
Modern pollen assemblages: PC using different start configurations (Felde et al. 2014)

Start    Ordination   PC
PCA      41.6%        76.1%
PCoA     69.7%        77.9%
CA       37.8%        79.3%
NMDS     69.5%        73.2%
RDA      55.5%        74.8%
CCA      58.6%        72.7%
PC always as good as or better than simple ordination 1 or 2 axes
Felde et al. (2014)
PCs very useful and powerful data-summarisation technique
for very long (1–3.5 million year) pollen records of
alternating glacial and interglacial stages from Colombia,
Siberia, and Greece – on-going work by Vivian Felde,
Chronis Tzedakis, Henry Hooghiemstra, Ulrike Herzschuh,
and myself.
Intrinsic and Extrinsic Drivers of Change
Palaeoecological fossil sequences only, data and
interpretation analyses rather than data summarisation.
What drives observed stratigraphical changes in a sequence?
Williams et al. (2011) J Ecol 99: 664-677
Extrinsic and intrinsic forcing of abrupt ecological change:
case studies from the late Quaternary
Extrinsic (external) drivers of abrupt ecological change and
intrinsic (internal) drivers of abrupt ecological change
Extrinsic
Intrinsic
Williams et al. (2011)
How to detect extrinsic and intrinsic drivers from palaeoecological data? Need fossil and past environmental data
Seddon et al. (2014)
Ecology 95:
3046-3055
To detect regime shifts (change points), several methods
for ‘change-point’ analysis
3 major
change points
Seddon et al. (2014)
Non-linear regressions of diatom transitions in relation to drivers
Closely track
environment,
suggesting extrinsic
drivers
Seddon et al. (2014)
Change from mangrove to a microbial-mat dominated system 945 yr BP
No fit in non-linear regression between
δ13C and Ti influx
Good fits for two halves of data
Suggestive of intrinsic regime shift
‘Regime shift’ or ‘tipping point’
Extrinsic 
Intrinsic 
Major challenge to apply this methodology to evaluate relative
importance of extrinsic and intrinsic drivers. Suspect extrinsic
drivers are the most frequent and important
Evaluation of Palaeoenvironmental
Reconstructions
Major breakthrough in Quaternary science was the development
of transfer functions (calibration functions) by Imbrie & Kipp
(1971) that transformed fossil data (e.g. pollen, foraminifera,
diatoms) into estimates of past environment (e.g. climate, sea-surface temperatures, lake-water pH)
John Imbrie
Nilva Kipp†
General Theory of Reconstruction
Diagram (based on a diagram by Steve Juggins): modern data Ym and Xm are used to derive the transfer function Ûm, which is applied to fossil data Yf to estimate Xf
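As one concrete instance of a transfer function, here is a minimal weighted-averaging (WA) sketch in the notation of the diagram: Ym and Xm are modern biological and environmental data, Yf is the fossil data, and Xf is the estimate. The function names and the tiny data-set are hypothetical, and the deshrinking step used in practical WA is deliberately left out to keep the sketch short.

```python
def wa_optima(modern_counts, modern_env):
    """WA optimum of each taxon: abundance-weighted mean of the environment.
    Taxa absent from every modern sample get None."""
    optima = []
    for k in range(len(modern_counts[0])):
        total = sum(row[k] for row in modern_counts)
        weighted = sum(row[k] * x for row, x in zip(modern_counts, modern_env))
        optima.append(weighted / total if total > 0 else None)
    return optima


def wa_reconstruct(fossil_counts, optima):
    """WA estimate for each fossil sample: abundance-weighted mean of the
    optima of the taxa present (None if no taxon with an optimum is present)."""
    estimates = []
    for row in fossil_counts:
        pairs = [(y, u) for y, u in zip(row, optima) if u is not None]
        total = sum(y for y, _ in pairs)
        estimates.append(sum(y * u for y, u in pairs) / total if total > 0 else None)
    return estimates


# Hypothetical training set: three modern samples, two taxa.
Ym = [[10, 0], [5, 5], [0, 10]]
Xm = [4.0, 6.0, 8.0]          # e.g. summer temperature at each modern site
Yf = [[10, 0], [0, 10], [5, 5]]

Um = wa_optima(Ym, Xm)        # the fitted transfer function (taxon optima)
Xf = wa_reconstruct(Yf, Um)   # reconstructed values for the fossil samples
print(Um, Xf)
```

Note that the procedure returns a number for any fossil sample containing at least one calibrated taxon, whether or not the environmental variable actually controlled the assemblage.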
Many assumptions in this approach irrespective of which
numerical method is used to derive the transfer function.
“Environmental variable (e.g. summer temperature) to
be reconstructed is, or is linearly related to, an
ecologically important variable in the system”
“Other environmental variables than, say summer
temperature, have negligible influence, or their joint
distribution with summer temperature in the fossil set is
the same as in the training set”
Birks et al. (1990)
Numerical methods such as weighted averaging (WA), WA
partial least squares, and modern analogue technique (MAT)
will produce ‘reconstructions’ even with random data!
Key question therefore
Is an environmental reconstruction statistically significant?
Telford and Birks (2011) Quat Sci Rev 30: 1272-1278
doi: 10.1016/j.quatscirev.2011.03.002
A reconstruction is considered statistically
significant if it explains more of the variance in the
fossil data than most (95% by convention)
reconstructions derived from transfer functions
trained on randomised data
Richard Telford
Stages
• PCA of fossil core data to determine the maximum amount of variance explicable by one axis or latent variable, say 30%
• Do a reconstruction and use the reconstruction as an ‘environmental’ variable in a RDA to see how much variance the reconstruction explains, say 20%
• Do 999 reconstructions using the same biological data, modern and fossil, but with environmental data drawn from a uniform distribution
• Derive an empirical distribution of variance explained based on 999 randomisations and calculate the p-value of the actual reconstructed value as
p = (number of reconstructions explaining ≥ 20%, including the actual one) / (number of randomised reconstructions + 1 for the actual one)
Round Loch of Glenhead, p = 0.006
Telford & Birks (2011)
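The randomisation stages can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: the transfer function is plain WA without deshrinking, the PCA stage is omitted, the data are invented, and the variance explained in a RDA with one constraining variable is computed directly as the summed regression sums of squares of each centred taxon on the reconstruction divided by the total sum of squares. All function names are hypothetical.

```python
import random


def wa_optima(counts, env):
    """WA optimum per taxon (None if the taxon is absent from the training set)."""
    optima = []
    for k in range(len(counts[0])):
        total = sum(row[k] for row in counts)
        num = sum(row[k] * x for row, x in zip(counts, env))
        optima.append(num / total if total > 0 else None)
    return optima


def wa_reconstruct(counts, optima):
    """WA estimate per sample; 0.0 is an arbitrary fallback for empty samples."""
    out = []
    for row in counts:
        pairs = [(y, u) for y, u in zip(row, optima) if u is not None]
        total = sum(y for y, _ in pairs)
        out.append(sum(y * u for y, u in pairs) / total if total > 0 else 0.0)
    return out


def variance_explained(fossil, x):
    """Proportion of fossil variance explained by the single variable x."""
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((v - x_mean) ** 2 for v in x)
    if sxx == 0:
        return 0.0  # a constant reconstruction explains nothing
    explained = total = 0.0
    for k in range(len(fossil[0])):
        col = [row[k] for row in fossil]
        c_mean = sum(col) / n
        sxy = sum((v - x_mean) * (c - c_mean) for v, c in zip(x, col))
        explained += sxy ** 2 / sxx
        total += sum((c - c_mean) ** 2 for c in col)
    return explained / total


def reconstruction_p_value(modern, env, fossil, n_random=999, seed=1):
    """Return (observed variance explained, p-value from randomised training data)."""
    rng = random.Random(seed)
    observed = variance_explained(fossil, wa_reconstruct(fossil, wa_optima(modern, env)))
    n_as_good = 1  # the actual reconstruction is counted in the numerator
    for _ in range(n_random):
        fake_env = [rng.uniform(0.0, 1.0) for _ in env]
        fake = wa_reconstruct(fossil, wa_optima(modern, fake_env))
        if variance_explained(fossil, fake) >= observed:
            n_as_good += 1
    return observed, n_as_good / (n_random + 1)


# Hypothetical counts: 8 modern and 6 fossil samples of 4 taxa.
modern = [[9, 1, 0, 0], [7, 3, 0, 0], [4, 5, 1, 0], [2, 6, 2, 0],
          [0, 5, 4, 1], [0, 2, 6, 2], [0, 1, 4, 5], [0, 0, 2, 8]]
env = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
fossil = [[8, 2, 0, 0], [5, 4, 1, 0], [1, 6, 3, 0],
          [0, 3, 5, 2], [0, 1, 5, 4], [0, 0, 3, 7]]

observed, p = reconstruction_p_value(modern, env, fossil)
print(observed, p)
```

With 999 randomisations the smallest attainable p-value is 1/1000, because the actual reconstruction is always counted in both numerator and denominator.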
Can test if more than one reconstruction made from one
biological data-set is statistically significant.
Chukchi Sea dinoflagellates – summer sea-surface temperature;
sea-ice duration; summer salinity
Summer salinity not significant (p = 0.146)
What about ice duration and SST?
Telford & Birks (2011)
Partial out SST first as it explains marginally more of the variance
(p = 0.003). Ice duration is no longer significant once SST has been
partialled out, so it carries no significant independent information.
Telford & Birks (2011)
Applicable to almost all reconstruction methods, not just
WA or WA-PLS
Many reconstructions turn out not to be statistically
significant as basic assumptions of the transfer functions are
violated because of spatial autocorrelation, or because of
strong collinearity of environmental variables (e.g. July
temperature, JJA temperature, growing season length,
growing degree days).
Important approach because it is testing a hypothesis,
namely a reconstruction. Analytical phase in palaeoecology.
Conclusions
Quantitative palaeoecology is not senescent or stable but is
continuing to make important progress – principal curves,
statistical testing of reconstructions, intrinsic and extrinsic
drivers, co-correspondence analysis, etc.
Progress is possible because of very talented young generation
of researchers in quantitative palaeoecology and a brilliant set of
applied statisticians. Essential to have effective and full joint
discussions. Both are needed if progress is to continue.
Quaternary palaeoecology has reached a major stage in its
development, namely identifying key questions and priority
research areas for palaeoecology.
December 2012 Palaeo-50 workshop
905 questions submitted from 127 individuals in 26
countries and 5 continents
Reduced by removing duplicates to 804 questions in 55
topics
The 66 participants then narrowed the 804 questions
down to 50 in 6 topics during an intensive 2-day workshop
1. Human-environment interactions in the Anthropocene
2. Biodiversity, conservation, and novel ecosystems
3. Ecosystem processes and biogeochemical cycling
4. Comparing, combining, and synthesising information from multiple records
5. Developments in palaeoecology
Alistair Seddon, Anson Mackay, Ambroise Baker
Seddon et al. (2014)
Of the 50 questions, 18 are clearly quantitative and can only
be answered using state-of-the-art numerical procedures,
and 17 require significant numerical input. (In total 35 out of
the 50 questions require quantitative input)
Quantitative palaeoecology is thus now part of mainstream
palaeoecology. Quite a change from the pioneer phase of
1965-1974 – continual criticism and doubts about the value
of what we were trying to do.
Much has happened in quantitative palaeoecology in the last 50
years. Very much is still to be done by the current generation of
active researchers and the upcoming new generation.
The subject is very much alive, well, and progressing. Hopefully it
will continue to develop in the next 50 years.
Acknowledgements
Cajo ter Braak
Alistair Seddon
Vivian Felde
Trevor Hastie
Gavin Simpson
Steve Juggins
Anson Mackay
Richard Telford
Cathy Jenks