Progress in Oceanography 58 (2003) 235–262
www.elsevier.com/locate/pocean
An overview of statistical methods applied to CPR data
G. Beaugrand a,∗, F. Ibañez b, J.A. Lindley a
a Sir Alister Hardy Foundation for Ocean Science, The Laboratory, Citadel Hill, Plymouth PL1 2PB, UK
b Observatoire océanologique, Laboratoire d’Océanologie de Villefranche, BP 28, 06230 Villefranche-Sur-Mer, France
Abstract
Since the beginning of the Continuous Plankton Recorder (CPR) survey in 1931, information on the abundance of
a large number of plankton species or taxa has been obtained on a monthly basis in the northern North Atlantic. The
many different ecological issues in which the survey has been involved have led to the application of a range of
statistical methods. In this paper, we review some of the methods applied to the CPR data by presenting new and up-to-date analyses. Special attention is devoted to multivariate analysis, which has been used extensively to extract information from the CPR database. Results obtained from recently applied geostatistical methods on CPR data are then
considered. An example of a time series decomposition by the use of Eigenvector filtering is presented to illustrate
time-series analysis.
© 2003 Elsevier Ltd. All rights reserved.
Contents
1. Introduction
2. The descriptive period of the CPR survey
3. Multivariate analyses
3.1. Ordination in reduced space
3.1.1. Standardised Principal Component Analysis (PCA)
3.1.2. Centred PCA at diel and seasonal scales
3.1.3. Three-mode Principal Component Analysis
3.1.4. Non-metric multidimensional scaling (MDS)
3.2. Cluster analysis
3.2.1. Seriation
3.2.2. Cluster Analysis and ordination
3.3. Indicator-value method
4. Geostatistics
4.1. Spatial interpolation
4.2. Semi-variograms and cumulative semi-variograms
5. Time series analysis
5.1. Cumulative sums
5.2. Eigenvector filtering (EVF) and power spectra
5.3. Maximum entropy spectral and cross-spectral analyses
6. Conclusions
∗ Corresponding author. Tel.: +44-1752-633133; fax: +44-1752-600015.
E-mail address: gbea@mail.pml.ac.uk (G. Beaugrand).
0079-6611/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.
doi:10.1016/j.pocean.2003.08.006
1. Introduction
Since the start of the Continuous Plankton Recorder (CPR) monitoring survey in 1931, large amounts
of data have been accumulated. At present, information on the abundance of more than 450 species or
taxa has been gathered. A total of some 178,000 CPR samples were collected by the year 2000, comprising
~2 million entries and ~80 million data-points in the database. This programme has become the largest
plankton monitoring programme in the world, considering both its wide spatial coverage and long time span.
The CPR survey has been involved in the investigation of many ecological issues. Biogeographical
studies have been conducted showing spatial distribution throughout the North Atlantic Ocean and shelf
seas of more than 250 species such as Calanus finmarchicus and Calanus helgolandicus (Colebrook,
Glover & Robinson, 1961a; Edinburgh Oceanographic Laboratory, 1973). Recently, the mapping protocol
has been improved using the Lambert conical projection (Planque, 1996) and mapping techniques such as
kriging and the inverse squared distance interpolation method (Planque, 1996; Planque & Ibañez, 1997;
Beaugrand, Reid, Ibañez, & Planque, 2000a). A number of investigations have allowed a better characterisation of seasonal cycles and of spatial changes for many taxa (Glover, 1957; Colebrook, 1979; Colebrook,
1984). Other works have examined long-term changes in phytoplankton and zooplankton in relation to
hydro-meteorological forcing (Colebrook, 1981; Colebrook, 1982a; Colebrook, 1991; Taylor, Colebrook,
Stephens, & Baker, 1992; Reid, Edwards, Hunt, & Warner, 1998a; Edwards, John, Hunt, & Lindley, 1999).
Recent results using this large dataset indicate that year-to-year changes in standing stock, production and
community structure of plankton may be related to the North Atlantic Oscillation (NAO) and climate
change (Fromentin & Planque, 1996; Reid & Planque, 2000; Beaugrand, Ibañez, & Reid, 2000b; Beaugrand,
Reid, Ibañez, Lindley, & Edwards, 2002a). Other studies on diel vertical migration of some calanoid copepods (Hays, Proctor, John, & Warner, 1994; Hays, 1995; Hays, 1996; Hirst & Batten, 1998), spatial and
temporal changes in the diversity of copepods (Beaugrand & Edwards, 2001; Beaugrand, Ibañez, & Lindley,
2001), monitoring of non-indigenous species (Edwards, John, Johns, & Reid, 2001a), and unusual events
(Lindley et al., 1990; Edwards, John, Hunt & Lindley, 1999; Edwards, Reid, & Planque, 2001b; Edwards,
Beaugrand, Reid, Rowden, & Jones, 2002) have been undertaken and have led to a better understanding of
the ecology of many species, exceptional events and the functioning of North Atlantic pelagic ecosystems.
The many issues in which the CPR data have been used have involved the deployment of numerous
statistical analyses, of which only a limited number can reasonably be presented in this paper. Since most
statistical analyses found in classical statistical manuals can be used on the CPR data, only those methods
that have often been applied to the CPR dataset and for which it was possible to include a clear example
associated with a particular ecological issue are emphasised in this review. Moreover, the importance of
scales of variability, as stressed by many authors (e.g. Levin, 1992; Angel, 1994; Mann & Lazier, 1996;
Haury & McGowan, 1998; Lundberg, Ranta, Ripa, & Kaitala, 2000) is also addressed as this is as important
as the analyses themselves.
2. The descriptive period of the CPR survey
Until 1964, geographical distribution, annual and year-to-year variability of species or taxa sampled by
the CPR survey were mainly investigated by the use of graphs, contour diagrams or maps (Lucas, 1941;
Lucas, 1942; Rees, 1952; Glover, 1952). Most statistical analyses were restricted to one dimension. Despite
that, results were immediately meaningful and good progress was made in describing the biogeography of
species around the United Kingdom (Lucas, 1940; Robinson, 1961; Colebrook, John, & Brown, 1961b).
The spatial distribution of Centropages hamatus and Ceratium fusus is shown in Fig. 1, based on data
collected during the period 1948–1956 (Colebrook, John & Brown, 1961b; Robinson, 1961). This shows
the coastal distribution of Centropages hamatus, whereas the dinoflagellate Ceratium fusus has a wider
distribution, occurring in both oceanic and neritic waters (Fig. 1). This way of presenting the spatial distribution of plankton (on gridded charts of 1° of latitude by 2° of longitude), described by Colebrook, John &
Brown (1961b), was used to produce the first atlas of plankton in the North Atlantic Ocean (Edinburgh
Oceanographic Laboratory, 1973). Seasonal cycles of plankton around the British Isles were also investigated. For example, Rae and Rees (1947) presented the seasonal cycle of Temora longicornis and the
group Para-Pseudocalanus spp. This way of investigating CPR results is still used today, although the
application of multivariate statistics has radically changed the way in which information is extracted from
the CPR dataset.
3. Multivariate analyses
While graphical presentation of CPR data is useful, it soon became clear that the huge mass of multidimensional information provided by the Survey had to be sorted and reduced according to its relevance.
For most techniques reviewed in this paper, no mathematical expressions are given in the text and readers
are referred to specialised books (e.g. Jolliffe, 1986; Legendre & Legendre, 1983; Legendre & Legendre,
1998) and other references cited in the text. Table 1 lists the types of multivariate analyses that have been
applied to CPR data.
Fig. 1. Spatial distribution of the calanoid copepod Centropages hamatus (a) and the dinoflagellate Ceratium fusus (b) for the period
1948–1956 around the British Isles. From Colebrook, Glover & Robinson (1961a) and Robinson (1961).
3.1. Ordination in reduced space
This type of multivariate analysis has been applied extensively to CPR data. It consists of representing
the relationships between objects and observations in a reduced number of dimensions (Legendre & Legendre, 1998). Principal Component Analysis (PCA) is an example. This ordination method has greatly helped
in the extraction of relevant information in many types of tables derived from the CPR dataset (Table 2).
Non-metric multidimensional scaling (Shepard, 1962; Kruskal, 1964) and three-mode Principal Component
Analysis (Jolliffe, 1986) are other techniques that have been applied more recently.
3.1.1. Standardised Principal Component Analysis (PCA)
Following Williamson (1961) and Cassie (1963), who were among the first to apply Principal Component
Analysis in plankton ecology, Colebrook (1964) started to analyse data on abundance from the CPR using
multivariate techniques. Standardised PCA was first applied by him to examine the main patterns of variability in the distributions of 22 different taxa around the United Kingdom (Fig. 2). He used eigenvectors
(Fig. 2(a)) to investigate the relationships between the species and principal components to examine the
spatial distribution of groups of species (Fig. 2(c)–(d)). Fig. 2(a) illustrates the separation between neritic
and southern species along the first axis, while the second axis separates northern oceanic and intermediate
species. The 22 taxa were classified into five species associations (northern and southern oceanic, northern
and southern intermediate, and neritic) and their locations (see Fig. 2(b)–(d)) were in part explained by
the effect of temperature and its seasonal variability, and also salinity.
Table 1
Types of multivariate analysis performed on CPR data
Multivariate technique | Ecological goal | Authors
Standardised PCA | See Table 2 | See Table 2
Centred PCA | See Table 2 | See Table 2
Seriation | Examination of the relations between species based on their annual fluctuation in abundance | Colebrook (1964), Colebrook and Robinson (1964), Colebrook (1969)
Cluster Analysis. Single linkage agglomerative (nearest-neighbour) clustering method | Grouping of species or taxa | Lindley (1987), Lindley and Williams (1994)
Cluster Analysis. Hierarchical agglomerative flexible clustering technique (Lance & Williams, 1967) | Clustering of pixels or geographical areas to identify regions with similar year-to-year or annual patterns in the abundance of species | Planque and Ibañez (1997), Beaugrand, Reid, Ibañez & Planque (2000a)
Cluster Analysis. Complete linkage agglomerative clustering | Partition of the North Atlantic Ocean based on the diel and seasonal pattern of diversity of calanoid copepods | Beaugrand, Ibañez, Lindley & Reid (2002b)
Indicator-value method (Dufrêne & Legendre, 1997) | Determination of species associations based on the relative abundance and presence of species in distinct areas in the North Atlantic | Beaugrand, Ibañez, Lindley & Reid (2002b)
Non-metric multidimensional scaling | Ordination of species or taxa based on the similarity of their spatial distribution | Lindley (1987), Lindley and Williams (1994)
Mantel correlogram | Study of relationships between the size of spatial structures and their temporal variability | Planque and Ibañez (1997)
Generalised additive models | Spatial and temporal modelling of the abundance of species | Beare & McKenzie (1999a, 1999b)
Three-mode PCA | Analyses of biological tables structured in space and time. Evaluation and quantification of the interactions between biology, space and time | Beaugrand, Reid, Ibañez & Planque (2000a)
Table 2
Diversity of matrices on which principal component analysis has been performed
Analysis | Tables | Correlation/covariance matrix | Ecological goal | Authors
Standardised PCA | Matrix area × taxa | Taxa × taxa | Identification of species assemblages. Examination of the relations between species. Geographical locations of species associations. | Colebrook (1964, 1984)
Standardised PCA | Matrix years × geographical areas | Areas × areas | Extraction of major patterns of year-to-year variability in the abundance of species and its variation in space. | Colebrook (1978, 1982b, 1986)
Standardised PCA | Matrix years × taxa | Taxa × taxa | Examination of the relationships between species on the basis of their year-to-year and long-term changes in a region. | Colebrook (1978, 1982b), Reid et al. (1998b), Reid and Beaugrand (2002)
Standardised PCA | Buys-Ballot table geographical areas × months-total copepods-colour index | Months-total copepods-colour index × months-total copepods-colour index | Determination of the relationships between the timing of the amplitude and the duration of the spring bloom for total copepods and phytoplankton (eigenvectors). Examination of spatial changes in the characteristics of the seasonal cycle (principal components). | Colebrook (1979)
Standardised PCA | Table geographical areas × months | Months × months | Investigation of the relationships between months for species such as Temora longicornis and Acartia clausii and examination of the spatial coherence of the seasonal cycle. | Colebrook (1981, 1982a, 1984)
Standardised PCA | Table months × taxa | Taxa × taxa | Investigation of the relationships between months and ordination of species according to their main pattern of seasonality. | Colebrook (1984)
Standardised PCA | Table years × months | Months × months | Relationships between the seasonal cycle and the year-to-year variability of species. | Colebrook (1985a)
Standardised PCA for table with missing data | Matrix months × map pixels of the abundance of C. finmarchicus | Pixels × pixels | Determination of seasonal cycle of C. finmarchicus and investigation of its spatial variation. | Planque, Hays, Ibañez and Gamble (1997)
Centred PCA | Buys-Ballot table months-2-hour period × pixels for diversity of calanoid copepods | Pixels × pixels | Determination of seasonal and diel patterns of the diversity of calanoid copepods. Quantification of the two scales of variability at a mesoscale resolution in the North Atlantic. Examination of the spatial variation of the diversity of calanoids at diel and seasonal scales. | Beaugrand et al. (2001)
The ‘simplification’ of multidimensional space by this method proved satisfactory and led to the extensive
use of standardised PCA on CPR data. Examples of studies that used this method of ordination are summarised in Table 2. Standardised PCA was much used to extract the main patterns of year-to-year and long-term changes in the community structure of phytoplankton and zooplankton, typically in CPR Standard
Areas (Colebrook, 1978; Colebrook, 1982a). In most of the CPR Standard Areas, Colebrook (1978, 1982a)
reported a declining trend of about 70% for zooplankton taxa and 60% for phytoplankton taxa. As these
changes were detected consistently throughout a large geographical region, and were shown to be correlated
with westerly weather, Colebrook (1986) argued that these changes were being triggered by meteorological forcing.
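As an illustration only, the steps of a standardised PCA can be sketched in a few lines of Python with NumPy. The sketch assumes a small synthetic table of geographical rectangles × taxa; the function name `standardised_pca` and all data are hypothetical and are not taken from the CPR database or from the software used in the studies cited. The table is standardised by column, the correlation matrix between taxa is diagonalised, and the eigenvectors, principal components and the correlation r between each taxon and each component are returned.

```python
import numpy as np

def standardised_pca(table):
    """PCA on the correlation matrix of a (sites x taxa) abundance table."""
    x = np.asarray(table, dtype=float)
    # Standardise each taxon (column) to zero mean and unit variance.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # Correlation matrix between taxa (taxa x taxa).
    corr = (z.T @ z) / (x.shape[0] - 1)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = z @ eigvecs                  # principal components, one row per site
    explained = eigvals / eigvals.sum()   # fraction of total variance per axis
    # Correlation r between each taxon and each principal component.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, eigvecs, scores, explained, loadings

# Synthetic example: 30 hypothetical rectangles x 5 hypothetical taxa.
rng = np.random.default_rng(0)
gradient = rng.normal(size=(30, 1))
table = gradient @ np.array([[1.0, 0.8, -0.9, 0.1, 0.0]]) + 0.5 * rng.normal(size=(30, 5))
eigvals, eigvecs, scores, explained, loadings = standardised_pca(table)
```

Plotting the first two columns of `eigvecs` against each other gives a scatter diagram of taxa of the kind described for Fig. 2(a), and mapping the columns of `scores` gives the spatial pattern of each component. Squaring `loadings` gives, for each taxon, the share of its variance carried by each axis.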
Fig. 2. Principal Component Analysis on a matrix of geographical rectangles × species or taxa (22) in the eastern North Atlantic.
(a) Scatter diagrams for the first two eigenvectors. Each point on this diagram represents one species. Five species groups were
identified by examination of the first three eigenvectors. The points were clustered on the basis of the ecological knowledge of the
author. (b)–(d) Maps of the distribution of the first three principal components. From Colebrook, Glover & Robinson (1961a).
This type of PCA was re-applied in 1998 (Reid, Planque, & Edwards, 1998b) and 2001 (Reid & Beaugrand, 2002), and an example is presented for a set of zooplankton taxa in the North Sea (Fig. 3). A total
of 28 taxa (Table 3), which were abundant and did not have a high frequency of zeros during the period
1958–1999, were selected. Scatter plots of the first two eigenvectors are shown (Fig. 3(a)) as well as long-term changes in the associated principal components (Fig. 3(b) and Fig. 3(c)). Groups of years
distinguished by a Cluster Analysis (Lance & Williams, 1967; hierarchical agglomerative flexible
algorithm) are indicated. The first principal component (Fig. 3(b), 30.2% of the total variance) shows that
there was a period of high values from 1962 to 1976 followed by one of low values from 1983 for both
cold-water mixed oceanic and neritic species (e.g. Euphausiacea and C. finmarchicus), which were positively related to the first axis. A strong increase was detected during a cold period between 1978 and 1982
(see Fig. 3(a) and Table 3). For species negatively related to the first axis, the long-term changes showed
the inverse pattern with an increasing trend followed by a significant decrease during the cold period (see
Fig. 3(a) and Table 3). This pattern of variability was followed by the warmer-water, neritic or pseudo-oceanic species such as C. helgolandicus, Temora longicornis, Corycaeus spp. and decapod larvae.
Fig. 3. (a) Ordination by PCA of the 28 species or taxa listed in Table 3 in the plane of the first two principal components (50.5%
of the total variability). (b) Year-to-year changes in the first principal component. (c) Year-to-year changes in the second principal
component. Periods detected by a Cluster Analysis using the flexible algorithm of Lance and Williams (1967) are indicated. Overall,
there is a good temporal connection with the exception of the years 1958 (period 1), 1975 (period 2), 1991 (period 5) and 1993 (period 5).
Table 3
List of species used in a PCA to examine long-term change in zooplankton community structure in the North Sea with normalised
eigenvectors 1 and 2. The correlation between each species or taxon and the corresponding principal components is indicated by (r).
The coefficient of determination (r²) indicates the contribution of a species to the first two axes. Numbers (column 1) correspond to
those shown in Fig. 3
Identification number | Names of taxa | Eigenvector 1: r | Eigenvector 1: r² | Eigenvector 2: r | Eigenvector 2: r²
1 | Calanus I-IV | 0.497 | 0.247 | 0.763 | 0.583
2 | Pseudocalanus elongatus Adult | −0.401 | 0.161 | 0.667 | 0.445
3 | Para-Pseudocalanus spp. | −0.604 | 0.365 | 0.688 | 0.474
4 | Temora longicornis | −0.708 | 0.501 | −0.043 | 0.001
5 | Acartia spp. | −0.529 | 0.280 | −0.032 | 0.001
6 | Centropages typicus | −0.623 | 0.388 | −0.081 | 0.006
7 | Centropages hamatus | −0.445 | 0.198 | 0.061 | 0.003
8 | Oithona spp. | −0.270 | 0.073 | 0.844 | 0.713
9 | Corycaeus spp. | −0.743 | 0.552 | −0.208 | 0.043
10 | Calanus Total Traverse | 0.500 | 0.250 | 0.784 | 0.614
11 | Podon spp. | −0.547 | 0.300 | −0.283 | 0.080
12 | Evadne spp. | −0.517 | 0.267 | 0.289 | 0.083
13 | Limacina retroversa | −0.234 | 0.054 | 0.724 | 0.524
14 | Lamellibranchia larvae | −0.538 | 0.289 | 0.396 | 0.156
15 | Chaetognatha Traverse | −0.632 | 0.399 | 0.541 | 0.293
16 | Cyphonautes larvae | −0.571 | 0.326 | 0.420 | 0.176
17 | Echinoderm larvae | −0.607 | 0.369 | 0.004 | 0.000
18 | Larvacea | −0.571 | 0.326 | −0.334 | 0.111
19 | Calanus finmarchicus | 0.616 | 0.379 | 0.719 | 0.517
20 | Calanus helgolandicus | −0.633 | 0.401 | −0.138 | 0.019
21 | Decapoda larvae | −0.789 | 0.622 | −0.094 | 0.008
22 | Euphausiacea Total | 0.688 | 0.474 | 0.452 | 0.204
23 | Chaetognatha Eyecount | −0.515 | 0.265 | 0.503 | 0.253
24 | Harpacticoida Total | −0.580 | 0.336 | 0.002 | 0.000
25 | Metridia Total Traverse | −0.482 | 0.233 | 0.421 | 0.177
26 | Copepod nauplii | −0.425 | 0.181 | −0.334 | 0.112
27 | Cirripede larvae | −0.249 | 0.062 | −0.278 | 0.077
28 | Euphausiacea calyptopis | 0.367 | 0.135 | 0.055 | 0.003
The second principal component (Fig. 3(c), 20.32% of the total variance) displays a decreasing trend
for temperate neritic and pseudo-oceanic taxa such as Para-Pseudocalanus spp., Oithona spp., Limacina
retroversa and colder-water taxa such as Calanus finmarchicus. These opposing trends led to a change in
the ecosystem of the North Sea with a decrease in indicators of cold water and an increase in warmer-water pseudo-oceanic and neritic taxa. This confirms the trend discovered by Beaugrand, Reid, Ibañez,
Lindley & Edwards (2002a), who found an increase in the abundances of warm-temperate and temperate
species, which was associated with decreases in colder-water species. These changes have been linked to
the climatic warming observed in the North-East Atlantic in recent decades.
3.1.2. Centred PCA at diel and seasonal scales
Until recently, little attention has been devoted to the analysis of spatial changes in pelagic diversity
(Lindley, 1998; Beaugrand, Reid, Ibañez & Planque, 2000a) at all temporal scales. Studies have been
carried out to examine in more detail spatial patterns of pelagic biodiversity at diel and seasonal scales
(Beaugrand, Ibañez & Lindley, 2001; Beaugrand, Ibañez, Lindley, & Reid, 2002b). PCA was used to
identify the spatial patterns in diversity (in terms of the number of taxa per CPR sample) of calanoid
copepods and to detect major seasonal and diel patterns of change across the northern North Atlantic
Ocean. Fig. 4 shows (left) the first four eigenvectors and (right) monthly and diel changes of the corresponding principal components from January to December based on 40 years of CPR sampling (1958–1997).
They represent a total explained variance of 63.0%. The monthly and diel plot of the first principal component (Fig. 4, PC1, 47.8%) shows that strong diel variations occurred throughout the year. These diel changes
were more pronounced from April to October. Seasonal changes were also detected but were weaker than
diel changes. The value and intensity of diel variations were clearly detected during winter. As the first
eigenvector is only composed of positive values, high values (in red on the first map in Fig. 4) indicate
where monthly and diel changes were strongest. This pattern occurred predominantly in the south-west
sector of the North Atlantic Drift Province (Longhurst, 1998).
Fig. 4. Spatial, seasonal and diel changes in calanoid diversity in the northern North Atlantic. Mapping of the first four eigenvectors
and monthly and diel changes in the corresponding principal components (PC 1-4). The symbol above each graph indicates midnight
and the dashed lines between them denote midday. Modified from Beaugrand et al. (2001).
PC2 (Fig. 4, 8.4%) shows that there were seasonal changes in species richness although diel changes
were still detectable in almost all months. The diel changes were weaker in summer than in spring, autumn
and winter. The corresponding eigenvector has both negative and positive values. High negative values
should be negatively related to the signal displayed by PC2 and inversely related to high positive eigenvector values. Thus, in the northern part of the North Atlantic Drift Province, the southern part of the Atlantic
Subarctic Province and the North Sea there were large seasonal changes in diversity with high values
occurring mainly in summer and low diel changes. In contrast, regions south of 50°N had high values for
diversity, mainly in spring, and showed higher diel variations.
PC3 (Fig. 4, 4.6%) displays the contrast that exists between seasonal changes in diversity between spring,
autumn and winter periods. High negative values of EV3 in the Bay of Biscay region indicate high diversity
during spring and the high positive values reflect high diversity during autumn and winter in the Gulf
Stream extension region.
PC4 (Fig. 4, 2.2%) shows two seasonal maxima in March–April and in July–October. May, June and
winter months are characterised by a lower value. Examination of EV4 shows that this pattern occurs off
the Iberian coast.
Diel and seasonal changes were modelled by multiplying the first four principal components by their
respective eigenvectors. Fig. 5 shows the seasonal and diel variability of calanoid diversity by re-estimation
of the original matrix. The North Atlantic Drift Province (Longhurst, 1998) can be clearly divided into
two parts; one to the south-west that is highly variable at a diel scale; the other to the north-east that is
highly variable on a seasonal scale. Consideration of these two scales of variability gave better discrimination between regions, as a result of which new divisions of the North Atlantic Ocean and adjacent seas
were proposed and new hypotheses about factors that contribute to the regulation of pelagic diversity
suggested (Beaugrand, Ibañez & Lindley, 2001).
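The re-estimation of the original matrix from a few components can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure coded by the authors: the function name `reconstruct_from_pcs`, the table dimensions and the random data are all hypothetical. The table is centred by column (no scaling, as in a centred PCA), decomposed, and rebuilt from the product of the first principal components and their eigenvectors.

```python
import numpy as np

def reconstruct_from_pcs(table, n_components):
    """Re-estimate a table from its first principal components (centred PCA)."""
    x = np.asarray(table, dtype=float)
    col_means = x.mean(axis=0)
    centred = x - col_means              # centred PCA: columns are not scaled
    # SVD of the centred table; rows of vt are eigenvectors of the covariance matrix.
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]   # principal components
    eigvecs = vt[:n_components]                        # matching eigenvectors
    approx = scores @ eigvecs + col_means
    explained = (s[:n_components] ** 2).sum() / (s ** 2).sum()
    return approx, explained

rng = np.random.default_rng(1)
# Hypothetical table: 144 month/2-hour-period combinations (rows) by 50 pixels (columns).
table = rng.normal(size=(144, 4)) @ rng.normal(size=(4, 50)) + 0.1 * rng.normal(size=(144, 50))
approx, explained = reconstruct_from_pcs(table, n_components=4)
```

Here `explained` plays the role of the cumulative percentage of variance quoted for the first four axes, and `approx` is the filtered table from which seasonal and diel variability could then be summarised for each pixel.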
3.1.3. Three-mode Principal Component Analysis
This numerical technique has recently been applied to CPR data to investigate long-term changes in the
community structure of pelagic ecosystems along the SA route. This route, which crosses the English
Channel, the Celtic Sea and the Bay of Biscay, was divided into twenty sections ranging in length from
20 to 70 km, but which contained the same number of CPR samples (188 observations for each section,
making a total of 3760 samples). Selecting the most common phytoplankton and zooplankton species, a
three-way table of the annual mean abundance of each taxon for each section and for each year over the
period 1979–1995 was constructed. In oceanography, methods that allow the analysis of such complex
tables are rare. A three-mode PCA was developed and applied in conjunction with cluster analysis
(Beaugrand, Ibañez & Reid, 2000b). The calculation of a three-mode PCA is made in two stages. First,
three ‘classical’ PCAs are performed on the matrices time-space × species (mode species), time-species ×
space (mode space) and space-species × time (mode time). Secondly, a core matrix, which establishes the
interrelationships between each mode, is calculated from the three eigenvector matrices computed in the
first step of the analysis. Fig. 6 presents the results of this analysis, showing the regions identified (Fig.
6(a)) and the long-term changes from the three principal components, species-locations (Fig. 6(b), mode
time), years-locations (Fig. 6(c), mode species) and species-time (Fig. 6(d), mode space). Five different
zones, corresponding to a distinct interannual variability in plankton abundance, were identified (Fig. 6(a)–
(b)). The zones were also characterised by distinct physical processes. It was even possible to detect the
effects of the Ushant Front, which corresponded to zone 3. Significant negative correlations were detected
between the NAO index, air temperature and the first principal component in the English Channel. Thalassionema nitzschioides, Nitzschia delicatissima and various zooplankton taxa mainly present in the English
Channel showed interannual variability in abundance that differed from that in the Bay of Biscay (Fig. 6(c) and (d)).
The first principal component in each mode was indicative of plankton abundance and showed
a decrease between 1988 and 1991 in the English Channel (Fig. 6), a period that coincided with a high
NAO index as well as the beginning of the 1989/1991 high-salinity anomaly (Becker & Dooley, 1995).
Furthermore, especially in the northeast and central English Channel, higher abundances were observed at
times of negative or low NAO values. At times of high and positive NAO indices, westerly winds are
stronger throughout this area, and this may lead to an increase in mixing, which could delay the onset of
water column stabilisation essential for the seasonal increase in net primary production (Dickson, Meincke,
Malmberg, & Lee, 1988).
Fig. 5. (a) Intensity of the diel variability (as a percentage) in the diversity of calanoid copepods in the North Atlantic Ocean. (b)
Intensity of the seasonal variability (as a percentage) in the diversity of calanoids. Redrawn from Beaugrand et al. (2001).
Fig. 6. Interannual variability along the SA route. (a) Location of the SA route sampled by the CPR survey. The five regions detected
by the analysis are superimposed. (b) Year mode. Variability of the first principal component (species-locations). The groups determined from the cluster analysis are indicated for species on the ordinate and for locations on the abscissa. (c) Species mode. Year-to-year variability in the first principal component (years-locations). The grey level indicates the intensity of the first component. The
groups determined from the cluster analysis are indicated for years on the ordinate and for locations on the abscissa. (d) Location
mode. Variability in the first principal component (species-years). The groups determined from the cluster analysis are indicated for
species on the ordinate and for years on the abscissa. Redrawn from Beaugrand, Ibañez & Reid (2000b). Z1: northern eastern English
Channel; Z2: southern western English Channel; Z3: Ushant Front; Z4: Celtic Sea; Z5: Bay of Biscay. TN: Thalassionema nitzschioides; AC: Acartia spp.; CH: Calanus helgolandicus; PP: Para-Pseudocalanus spp.; ND: Nitzschia delicatissima; LI: Limacina spp.;
CT: Centropages typicus; OI: Oithona spp.; CF: Ceratium fusus; CM: Ceratium macroceros; CC: Clausocalanus spp.
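The two-stage calculation described above for three-mode PCA can be sketched in Python with NumPy. This is a Tucker-style illustration under stated assumptions, not the algorithm of Beaugrand, Ibañez & Reid (2000b): the function names, the number of components retained per mode, the standardisation by taxon and the random years × sections × taxa array are all hypothetical. The array is unfolded once per mode, the eigenvectors of each unfolding are extracted, and a core array is obtained by projecting the data onto the three sets of eigenvectors.

```python
import numpy as np

def mode_eigenvectors(tensor, mode, n_components):
    """Eigenvectors for one mode: PCA of the array unfolded along that mode."""
    # Put the chosen mode last, then flatten the other two modes into rows.
    unfolded = np.moveaxis(tensor, mode, -1).reshape(-1, tensor.shape[mode])
    # Right singular vectors are the eigenvectors of unfolded.T @ unfolded.
    _, _, vt = np.linalg.svd(unfolded, full_matrices=False)
    return vt[:n_components].T           # shape: (size of mode, n_components)

def three_mode_pca(tensor, ranks):
    """Stage 1: one PCA per mode. Stage 2: core array linking the three modes."""
    a, b, c = (mode_eigenvectors(tensor, m, r) for m, r in enumerate(ranks))
    # Core array: the data projected onto the three sets of eigenvectors.
    core = np.einsum("tsk,tp,sq,kr->pqr", tensor, a, b, c)
    return a, b, c, core

rng = np.random.default_rng(2)
# Hypothetical array: 17 years x 20 route sections x 11 taxa.
raw = rng.gamma(shape=2.0, scale=1.0, size=(17, 20, 11))
# Assumed pre-treatment: standardise each taxon over all years and sections.
tensor = (raw - raw.mean(axis=(0, 1))) / raw.std(axis=(0, 1))
a, b, c, core = three_mode_pca(tensor, ranks=(3, 3, 3))
# Low-rank re-estimation of the array from the core and the eigenvectors.
approx = np.einsum("pqr,tp,sq,kr->tsk", core, a, b, c)
```

Large absolute entries of `core` indicate which combinations of year, location and species components interact most strongly. Retaining all components in every mode reproduces the array exactly, which is a convenient check on the implementation.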
3.1.4. Non-metric multidimensional scaling (MDS)
MDS is a non-parametric ordination method that aims to project multidimensional space into a reduced
number of dimensions, generally two. This analysis, which can be applied with almost any coefficient of
association (see Legendre & Legendre, 1998), in contrast to PCA (Euclidean distance) or correspondence
analysis (χ² distance), has been applied to CPR data by Lindley (1987), Lindley and Williams (1994), and
Edwards (2000). The analysis presented here to illustrate the method is one that was performed by Lindley
and Williams (1994) but the dendrogram and MDS scatter plot for this tow were not presented in that
paper. Plankton was sampled by a Continuous Plankton and Environmental Recorder (CPER) between
Aberdeen and Grimsby in the North Sea along route LR. Four groups of plankton were recognised. One
group occurred mainly in samples from Areas A (Aberdeen end) and C (unstratified water near Grimsby),
the second occurred mainly in samples from Area A with a few from Area B (Grimsby), the third and
largest group occurred throughout the tow, and the fourth was found mainly in Area B. MDS used in
conjunction with cluster analysis (hierarchical agglomerative single-clustering method) made it possible to
identify three regions along the transect on the basis of their plankton composition (20 taxa were
considered). Fig. 7 shows a clear separation between station 21 situated in unstratified water and other
stations located in more stratified areas. The cluster analysis grouped the northern and southern stations.
The stress coefficient for the MDS plot was 0.08 indicating that the projection of the multidimensional
space into two dimensions was satisfactory. This was also confirmed by the cluster analysis.
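For reference, a stress coefficient of this kind is commonly computed with Kruskal's stress formula 1. The text does not state which formula was used for the value of 0.08, so the expression below is an assumption given only to make the quantity concrete:

$$
\mathrm{Stress} = \sqrt{\frac{\sum_{i<j} \left(d_{ij} - \hat{d}_{ij}\right)^{2}}{\sum_{i<j} d_{ij}^{2}}}
$$

Here $d_{ij}$ is the distance between stations $i$ and $j$ in the two-dimensional plot and $\hat{d}_{ij}$ is the disparity obtained by monotone regression of these distances on the original dissimilarities. A value of 0 means that the rank order of the dissimilarities is reproduced exactly in the reduced space.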
3.2. Cluster analysis
Cluster analysis is a powerful multivariate tool that is used to group objects or descriptors. With the
exception of probabilistic clustering methods (e.g. Clifford & Goodall, 1967), which necessitate a particular
association coefficient (e.g. Goodall’s probabilistic coefficient), this technique can be applied to
almost any distance or similarity matrix between objects or descriptors. The choice of the coefficient of
association depends on the type (e.g. quantitative, semi-quantitative or qualitative data) and nature
(abundance of species or presence/absence) of data and the hypothesis that is under study (Legendre &
Legendre, 1998). Results from cluster analysis are often represented by means of a dendrogram.
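As an illustration of the general procedure, the following minimal sketch computes a Bray-Curtis dissimilarity between stations and groups them by single-linkage agglomerative clustering, cut at a fixed level. The station abundances, the threshold of 0.4 and the function names are hypothetical and are not taken from the analyses reviewed here.

```python
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den if den > 0 else 0.0


def single_linkage(samples, threshold):
    """Group samples by single linkage, cutting the hierarchy at `threshold`.

    Two clusters are merged when their two closest members are less
    dissimilar than the threshold, which corresponds to cutting the
    single-linkage dendrogram at that level.
    """
    clusters = [{i} for i in range(len(samples))]
    merged = True
    while merged:
        merged = False
        for a, b in combinations(range(len(clusters)), 2):
            closest = min(bray_curtis(samples[i], samples[j])
                          for i in clusters[a] for j in clusters[b])
            if closest < threshold:
                clusters[a] |= clusters[b]
                del clusters[b]
                merged = True
                break
    return sorted(sorted(c) for c in clusters)


# Hypothetical abundances (rows: stations, columns: taxa).
stations = [
    [10, 0, 3, 1],
    [12, 1, 2, 0],
    [0, 8, 0, 9],
    [1, 9, 1, 7],
]
groups = single_linkage(stations, threshold=0.4)
print(groups)
```

With these made-up data the first two stations and the last two stations form separate groups. Any other association coefficient could be substituted for `bray_curtis` without changing the clustering step.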
3.2.1. Seriation
Before cluster analysis techniques became available, relationships between objects or descriptors were
investigated by rearrangement of an association matrix. Colebrook (1964); Colebrook and Robinson (1964)
and Colebrook (1969) applied this technique to study relationships between species and to detect species
associations based on their geographical variation in abundance or to examine geographical similarities in
the interannual variability of a species (e.g. Temora longicornis in the North Sea; Colebrook, 1969).
Fig. 7. Two-dimensional ordination of the 21 locations sampled by a Continuous Plankton and Environmental Recorder (CPER)
between Aberdeen and Grimsby in the North Sea. A Bray-Curtis similarity coefficient was used and a cluster analysis was applied
to group locations on the scatter plot.
3.2.2. Cluster Analysis and ordination
Lindley (1987) was one of the first to apply cluster analysis (single hierarchical agglomerative clustering
method) in conjunction with an ordination method. These two techniques were applied to investigate the
distribution of 36 species of decapod larvae around the British Isles. The resulting dendrogram demonstrated
the presence of seven groups of species, although the shape of the dendrogram clearly indicated a gradient
in the distribution of larvae. The distributions of these decapods were explained by the interaction between
life histories of organisms and bathymetric depth, which is quite important in the ecology of benthic
organisms (Glémarec, 1973). The joint application of cluster analysis and an ordination method makes it
possible to inspect visually how far the projection of the multidimensional space onto a two-dimensional
scatter plot deforms the relationships between objects. This procedure has been recommended by several authors (e.g. Legendre &
Legendre, 1998).
3.3. Indicator-value method
The recently proposed ‘Indicator-value method’ (Dufrêne & Legendre, 1997) has been applied to calanoid
copepods by Beaugrand, Ibañez, Lindley & Reid (2002b). This method enables species associations to be
identified. Several steps are necessary to detect such associations. If the goal is to identify indicator species
in an area, a cluster analysis is first applied in order to identify the regions. Alternatively the regions can
be determined a priori if the area under investigation is already well known. This can be done using any
type of data (e.g. abundance, diversity, or abiotic factors). Then, a measure of specificity and of fidelity
must be calculated. The specificity Aij is the ratio of the mean abundance of species i in the pixels
of group j (Nindividualsij) to the sum of the mean abundances of species i over all groups (Nindividualsi.).
$$A_{ij} = \frac{\mathrm{Nindividuals}_{ij}}{\mathrm{Nindividuals}_{i.}}$$
The fidelity Bij is the ratio of the number of pixels in group j where species i is present (Nsitesij) to the
total number of pixels in this group (Nsites.j).
$$B_{ij} = \frac{\mathrm{Nsites}_{ij}}{\mathrm{Nsites}_{.j}}$$
The indicator value (INDVALij) is computed by multiplying the specificity and fidelity indices, as these
two quantities represent independent information.
$$\mathrm{INDVAL}_{ij} = A_{ij} \times B_{ij} \times 100$$
Dufrêne and Legendre (1997) retained the maximum indicator value for each species among all groups.
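The calculation of specificity, fidelity and the indicator value can be sketched as follows. This is a minimal illustration with hypothetical pixel abundances and group labels; the function name `indval` and the data layout are assumptions made for the example and do not reproduce the code used in the published analysis.

```python
def indval(abundance, groups):
    """Indicator values per species and group (specificity x fidelity x 100).

    abundance: {species: [abundance in each pixel]}
    groups:    [group label of each pixel], in the same pixel order
    """
    labels = sorted(set(groups))
    result = {}
    for species, values in abundance.items():
        mean_in_group = {}
        fidelity = {}
        for g in labels:
            in_g = [v for v, lab in zip(values, groups) if lab == g]
            # Mean abundance of the species in the pixels of group g.
            mean_in_group[g] = sum(in_g) / len(in_g)
            # Fraction of the pixels of group g where the species is present.
            fidelity[g] = sum(1 for v in in_g if v > 0) / len(in_g)
        total = sum(mean_in_group.values())
        result[species] = {
            g: (mean_in_group[g] / total if total > 0 else 0.0) * fidelity[g] * 100
            for g in labels
        }
    return result


# Hypothetical example: six pixels already assigned to two regions.
groups = ["warm", "warm", "warm", "cold", "cold", "cold"]
abundance = {
    "sp1": [4, 6, 2, 0, 0, 0],
    "sp2": [1, 0, 0, 3, 0, 5],
}
values = indval(abundance, groups)
# Retain, for each species, the group with the maximum indicator value.
best = {s: max(v, key=v.get) for s, v in values.items()}
print(values, best)
```

In this made-up case `sp1` is a perfect indicator of the warm group (value 100), whereas `sp2` is a weaker indicator of the cold group because it is absent from one cold pixel and present in one warm pixel.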
This method has been used with CPR data to derive species assemblages from calanoid diversity (108
taxa) (Beaugrand, Ibañez, Lindley & Reid, 2002b) as shown in Fig. 8. To take one example, the warm-temperate oceanic assemblage comprises 16 taxa. The boundary of this assemblage is quite sharp and its
geographic coverage does not extend into water depths <200 m (Fig. 8(a)). The influence of the Oceanic
Polar Front at about 52–53°N on the latitudinal distribution of this association is strong between the Northwest Corner (see Worthington, 1976) at 51°N, 44°W and the mid-Atlantic ridge. West of this, the latitudinal
front becomes meridional and the association extends to the north to about 58°N south of Iceland and
55°N to the west of Ireland. To take another example, Fig. 8(c) and (d) shows a clear complementarity
between the distribution of warm-water and cold-water species. At a lower level of distance in the dendrogram (not shown), warm-water species were divided into coastal, continental shelf, and pseudo-oceanic
species (Fig. 9) while cold-water species were divided into cold-temperate, subarctic and arctic species
associations (Fig. 10). The subtropical and warm-temperate oceanic and pseudo-oceanic species associations
Fig. 8. Spatial distribution of calanoid copepod assemblages in the northern North Atlantic. The maps show the percentage of
occurrence of taxa per pixel for each assemblage defined at the partition level of 1.152 of a dendrogram (not shown). Abundance
data were transformed into presence/absence data. Then, the percentage of present taxa per pixel for each assemblage was computed.
An elevated percentage denotes a high degree of spatial aggregation of taxa inside an assemblage and vice versa. A blank pixel
inside the survey area indicates the absence of species for a given group. (a) Warm-temperate oceanic species assemblage (16 species
or taxa). (b) Bay of Biscay and southern European shelf-edge assemblage (4 species). (c) temperate neritic and pseudo-oceanic species
assemblage (12 species or taxa) (d) cold-temperate, subarctic and arctic species assemblage (11 species or taxa). (e)–(f). Subtropical
and warm-temperate species assemblage (25 species). Redrawn from Beaugrand, Ibañez, Lindley & Reid (2002b).
were clearly detectable in the path of the Gulf Stream extension (Fig. 8(e) and (f)). These species assemblages have been used to define in greater detail the ecosystems and ecotones of the North Atlantic Ocean
and adjacent seas and to understand better the factors regulating diversity. Four modulating factors have
been identified: 1. temperature, 2. hydrodynamics, 3. stratification, and 4. seasonal variability. These factors
are often linked, but they can act at different scales, and their contributions can vary geographically.
Moreover, this study clearly detected the influence of warm currents on diversity and hence the functional
characteristics of ecotones west of Europe and in the Gulf Stream extension. Relationships between species
associations and water masses or currents are strong. These assemblages may, therefore, represent an
important environmental indicator for monitoring marine ecosystems and evaluating the impact of climate
change. Other techniques combining clustering methods and Bayesian probabilities, used recently by Anneville, Souissi, Ibañez, Ginot, Druart and Angeli (2002), could be used with the CPR data.
Fig. 9. Spatial distribution of calanoid copepod assemblages in the northern North Atlantic. Subdivision of the temperate neritic
and pseudo-oceanic species assemblages (see Fig. 8(c)) at the partition level of 0.5 of the dendrogram (not shown). Data were coded
in two states: 0 when the abundance of a taxon was less than half the mean, and 1 when the abundance was more than half the mean.
The percentage of species in one square was then calculated. This transformation allowed the main centres of the spatial distribution
of taxa inside a group to be detected. Redrawn from Beaugrand, Ibañez, Lindley & Reid (2002b).
4. Geostatistics
4.1. Spatial interpolation
Selection of appropriate interpolation methods for spatial representation of plankton data is a key stage
in making spatial and temporal comparisons of biological variables. The method selected should be rapid
in calculation time and applicable to all species both rare and abundant. Many methods exist to interpolate
data (e.g. Lam, 1983). Two of the available methods have been used on CPR data. Planque (1996) applied
the kriging procedure for the first time to CPR data. Kriging has the advantage that it takes into consideration spatial scales of change in ecological variability. This method allows a standard deviation of the
interpolation error to be derived, whereas this is less obvious in the case of the alternative inverse squared
distance method. Kriging was also used by Planque and Fromentin (1996) and has subsequently been
applied successfully to abundant species and to the diversity of calanoid copepods (Planque & Ibañez,
1997; Beaugrand, 1999). The total number of taxa identified per CPR sample has recently been mapped
using this procedure (Beaugrand, 1999). Fig. 11 shows monthly changes in the total number of taxa identified per CPR sample. A clear contrast is seen between the total number of taxa in the eastern and western
northern North Atlantic. High values were found during winter off Canada and in the Bay of Biscay. The
area of high numbers of taxa per CPR sample then extended progressively northwards until September. The
Fig. 10. Spatial distribution of calanoid copepod assemblages in the northern North Atlantic. Subdivision of the cold-temperate,
subarctic and arctic species assemblages (see Fig. 8(d)) at the partition level of 0.5 of the dendrogram (not shown). (a) Cold-temperate
species assemblage (4). (b) Subarctic species assemblage (4). (c) Arctic species assemblage (3). Redrawn from Beaugrand, Ibañez,
Lindley & Reid (2002b).
use of kriging with CPR data is limited, however, by three main problems. First, there is the interpretation
of the variogram and its approximation using a theoretical model. For each spatial interpolation, this step
must be checked. Secondly, both geometric and zonal anisotropy must be corrected (Wackernagel, 1995).
Thirdly, it is difficult to use kriging for rare species because of the high proportion of zeros in the matrices.
In practice it is hard to verify all these parameters when a large number of maps is produced, so another
method, called inverse squared distance, has been applied (Beaugrand, Reid, Ibañez & Planque, 2000a;
Planque & Batten, 2000). It is simpler than kriging and gives similar results when the radius of interpolation
is relatively small (i.e. less than 300 nautical miles). Fig. 12 shows the mean spatial distribution of some
taxa using this technique. There are considerable differences in the distribution patterns of the illustrated
taxa. The distribution of some taxa is complementary (e.g. Euchaeta norvegica and the group Para-Pseudocalanus). In the case of Metridia lucens, a higher abundance is seen in the pelagic ecotones situated
west of the British Isles and also along the path of the Gulf Stream extension and the North Atlantic
Fig. 11. Seasonal changes in the total number of taxa per CPR sample. Kriging was applied to interpolate the data spatially (period
1958–1997, search radius = 200 nautical miles, neighbours between 5 and 15). Redrawn from Beaugrand (1999).
Current. Spatial interpolation techniques should be applied to CPR data with care because of the spatial
and temporal heterogeneity of CPR sampling.
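The inverse squared distance method can be sketched as below. This is a minimal illustration rather than the procedure used for the published maps: distances are treated as planar for simplicity (a real application would use great-circle distances in nautical miles), and the function name, the sample layout and the default neighbour limits of 5 and 15 are assumptions chosen to mirror the settings quoted in the figure captions.

```python
import math


def inverse_squared_distance(target, samples, radius, min_n=5, max_n=15):
    """Interpolate a value at `target` from nearby (x, y, value) samples.

    Returns None when fewer than `min_n` samples fall within `radius`,
    so that sparsely sampled pixels stay blank instead of being filled.
    """
    tx, ty = target
    nearby = []
    for x, y, value in samples:
        d = math.hypot(x - tx, y - ty)
        if d <= radius:
            nearby.append((d, value))
    if len(nearby) < min_n:
        return None
    # Keep at most `max_n` of the closest samples.
    nearby.sort(key=lambda pair: pair[0])
    nearby = nearby[:max_n]
    # A sample located exactly on the target determines the estimate.
    if nearby[0][0] == 0:
        return nearby[0][1]
    weights = [1.0 / d ** 2 for d, _ in nearby]
    return sum(w * v for w, (_, v) in zip(weights, nearby)) / sum(weights)


# Hypothetical samples: four at distance 1 and one at distance 2.
samples = [(1, 0, 10.0), (-1, 0, 10.0), (0, 1, 10.0), (0, -1, 10.0), (2, 0, 20.0)]
estimate = inverse_squared_distance((0, 0), samples, radius=3)
too_sparse = inverse_squared_distance((0, 0), samples, radius=1.5)
print(estimate, too_sparse)
```

The distant sample receives a quarter of the weight of each close sample, and the second call returns `None` because only four samples lie within the smaller radius.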
4.2. Semi-variograms and cumulative semi-variograms
Planque (1996) used experimental semi-variograms to investigate the spatial scale of variability of C.
finmarchicus and C. helgolandicus. Semi-variograms were modelled using a spherical model. Important
year-to-year variability was found in the shape of the experimental semi-variograms. This variability was
attributed to the sensitivity of the method to a small number of CPR samples. A clear spatial dependency
in the abundance of Calanus species was found within a range of about 400 km. C. helgolandicus exhibited
a more complex pattern from April to August, which was attributed to a multi-scale distribution pattern.
In that case, spherical models were fitted to the experimental variograms to interpolate by kriging the
abundance of the two species of Calanus for all months of the period 1958–1992. This same procedure
was then repeated in subsequent studies for different periods at different scales (Planque & Fromentin,
1996; Fromentin & Planque, 1996). This technique probably helped to discover the well-known negative
Fig. 12. Mean spatial distribution of some key species or taxa in the North Atlantic Ocean. An inverse squared distance method
was used with a search radius of 250 nautical miles (neighbours between 5 and 15). Spatial interpolation was performed for each 2-month
and 2-hour period (144 maps). The mean of all maps was then calculated.
relationship between the state of the North Atlantic Oscillation and the abundance of C. finmarchicus (Fromentin &
Planque, 1996).
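A minimal sketch of an experimental semi-variogram and of the spherical model is given below. The sample coordinates and values are hypothetical, distances are planar, and the parameter names (`nugget`, `partial_sill`, `rng`) are assumptions made for illustration; the range of 400 used in the last line simply echoes the order of magnitude reported above.

```python
import math


def experimental_semivariogram(points, lag_width, n_lags):
    """Half the mean squared difference between pairs, by distance class.

    points: list of (x, y, value); distances are in the unit of lag_width.
    Returns (mean distance, semivariance, number of pairs) per class.
    """
    sums = [0.0] * n_lags
    dists = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            xi, yi, zi = points[i]
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)
            k = int(h // lag_width)  # index of the distance class
            if k < n_lags:
                sums[k] += 0.5 * (zi - zj) ** 2
                dists[k] += h
                counts[k] += 1
    return [(dists[k] / counts[k], sums[k] / counts[k], counts[k])
            for k in range(n_lags) if counts[k] > 0]


def spherical_model(h, nugget, partial_sill, rng):
    """Spherical model: reaches nugget + partial_sill at the range `rng`."""
    if h <= 0:
        return 0.0
    if h >= rng:
        return nugget + partial_sill
    ratio = h / rng
    return nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)


# Hypothetical transect: four samples 100 km apart.
transect = [(0, 0, 1.0), (100, 0, 2.0), (200, 0, 4.0), (300, 0, 3.0)]
gamma = experimental_semivariogram(transect, lag_width=100, n_lags=4)
model_at_200 = spherical_model(200, nugget=0.0, partial_sill=1.0, rng=400)
print(gamma, model_at_200)
```

The example also shows why the choice of distance classes matters: the last class here rests on a single pair of samples, so its semivariance is poorly constrained.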
However, classical experimental semi-variograms are highly sensitive to irregular distribution of observations inside the spatial domain (Wackernagel, 1995; Sen, 1989). Sampling by the CPR survey is irregular,
so an unbiased estimate of the experimental semi-variogram is difficult to obtain. Moreover, the choice of
distance classes is quite arbitrary and may strongly influence the shape of the curve. This led Sen (1989)
to propose a new way to evaluate the spatial dependence of observations for geological purposes. The
principle of Sen’s cumulative semi-variograms is to calculate for each sampling point a semi-variogram
based on geographical distances and dissimilarity between the particular sampling point and the others.
Values for each semi-variogram are pooled, and then, as there are as many semi-variograms as there are
points, it is possible to map the spatial dependence of the data. This gives an indication of the anisotropy
of the regional variable and shows spatial changes in the scale of variability of the variable. Sen’s cumulative semi-variograms were applied to CPR data by Beaugrand and Ibañez (2002). Fig. 13 shows the result
of applying the procedure to one month, using only data collected at night. The regional dependence in
Fig. 13. Application of Sen’s cumulative semi-variogram to show the spatial dependence of calanoid diversity in the northern North
Atlantic. The value of the PCSV (Point Cumulative Semi-Variogram) was fixed at 1000. Blue indicates diversity changes at small
spatial scales, while red indicates that diversity varies at a larger spatial scale. Main surface currents are superimposed. Redrawn
from Beaugrand and Ibañez (2002).
the path of the North Atlantic Current and in the north-eastern part of the North Atlantic Drift Province
(see Longhurst, 1998) was low (400 km for a value of the Point Cumulative Semi-Variograms of 1000).
This indicated that calanoid diversity varied at a small spatial scale. The regional dependence
of diversity inside the two subtropical gyres was stronger (1200 km for a value of the Point Cumulative
Semi-Variograms of 1000), indicating that diversity there varied at a larger spatial scale.
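The idea of reading off a distance at a fixed value of the point cumulative semi-variogram can be sketched as follows. This is a simplified reading of the approach described above, not the implementation of Sen (1989) or of Beaugrand and Ibañez (2002): distances are planar, the dissimilarity is taken as half the squared difference, and the function names and data are hypothetical.

```python
import math


def point_cumulative_semivariogram(site, others):
    """Cumulative half squared differences, ordered by distance from `site`.

    site and others are (x, y, value) tuples; returns (distance, cumulative
    value) pairs for the other points, from the nearest to the farthest.
    """
    x0, y0, z0 = site
    ordered = sorted((math.hypot(x - x0, y - y0), z) for x, y, z in others)
    curve, total = [], 0.0
    for distance, z in ordered:
        total += 0.5 * (z0 - z) ** 2
        curve.append((distance, total))
    return curve


def distance_at_level(curve, level):
    """First distance at which the cumulative curve reaches `level`."""
    for distance, total in curve:
        if total >= level:
            return distance
    return None


# Hypothetical diversity values around one sampling point.
site = (0, 0, 10.0)
others = [(300, 0, 15.0), (100, 0, 12.0), (200, 0, 6.0)]
curve = point_cumulative_semivariogram(site, others)
reach = distance_at_level(curve, level=10.0)
print(curve, reach)
```

Repeating the calculation for every sampling point and mapping the resulting distances gives a picture of where the variable changes over short or long distances: a short distance at the chosen level corresponds to variability at a small spatial scale.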
5. Time series analysis
Objectives of this type of analysis are to describe and decompose time series and then develop models
to enable forecasting (Diggle, 1990). To date, most time series analyses applied to the CPR dataset
have been used to describe seasonal, year-to-year and long-term changes in the abundance of species.
Then, correlation and cross-correlation techniques have been applied to identify environmental parameters
responsible for the observed trends (e.g. Fromentin & Planque, 1996). Time series analyses applied to the
CPR data are presented in Table 4.
5.1. Cumulative sums
This technique is a simple method, which consists of graphically detecting local changes in a time series
and assessing the intensity and duration of these changes (Ibañez, Fromentin, & Castel, 1993). This function
is calculated by subtracting a reference value (i.e. the mean of the time series) from every value of the time
series and progressively summing the residuals (Ibañez et al., 1993). This function was applied to the
CPR data with the objective of emphasising the relationships between community change and surface air
temperature in the English Channel, the Celtic Sea and the Bay of Biscay (Beaugrand et al., 2000b, Fig.
6). A clear relationship was detected between changes in community structure (Principal Component year-location from a three-mode Principal Component Analysis calculated on the three-dimensional table, years
× locations × species) and surface air temperature in the English Channel and the Celtic Sea (Fig. 14).
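The cumulative sum function can be sketched in a few lines. The example series is hypothetical, and the optional `reference` argument is an assumption added for illustration; by default the mean of the series is used, as described above.

```python
def cumulative_sums(series, reference=None):
    """Running sum of the deviations of a series from a reference value."""
    if reference is None:
        reference = sum(series) / len(series)
    sums, running = [], 0.0
    for value in series:
        running += value - reference
        sums.append(running)
    return sums


# Hypothetical series with a shift to higher values after the third point.
series = [10, 10, 10, 14, 14, 14]
cusum = cumulative_sums(series)
turning_point = cusum.index(min(cusum))
print(cusum, turning_point)
```

A descending slope indicates a period of values below the reference and an ascending slope a period above it, so the change of slope at the third point marks the shift in the series.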
5.2. Eigenvector filtering (EVF) and power spectra
The Eigenvector filtering method, also known as Principal Component Analysis of Processes (Ibañez &
Etienne, 1991) or singular-spectrum analysis (Vautard, Yiou, & Ghil, 1992), was used on CPR data by
Table 4
Examples of time series analyses used to interpret CPR data

| Types of analyses | Ecological goal or utility | Authors |
| --- | --- | --- |
| Box and Jenkins models (AR, MA, ARMA, ARIMA) | Modelling of a time series and forecasting. | Rothschild (1998) |
| Multinomial logit model | This kind of Generalized Additive Model was applied to analyse the CPR data, using recorded values. This technique was utilised to reveal the seasonal, spatial and long-term variability in the abundance of species. | Beare & McKenzie (1999b) |
| Cumulative sums | Examination of local trends in a time series. | Beaugrand, Reid, Ibañez & Planque (2000a) |
| Polynomial regression | Determination of the trend in the abundance of Calanus finmarchicus, C. helgolandicus and environmental parameters. This type of regression is used to de-trend the different time series and to take into account temporal autocorrelation in the calculation of correlations (e.g. correlation between C. finmarchicus and the NAO index). | Fromentin and Planque (1996) |
| Eigenvector filtering | Decomposition of a signal. Smoothing of a time series and extraction of the trends. Emphasis of the major signal in the time series and quantification of temporal variability by the use of eigenvalues (Ibañez & Dauvin, 1988; Ibañez & Etienne, 1991). | Colebrook (1978, 1982b) |
| Power spectra | Assessment of the scales of variability of a variable. | Colebrook (1979) |
| Maximum Entropy Spectral analysis | Assessment of the scales of temporal variability of a variable. This was used on time series of sea-surface temperature, principal components and species abundance. This analysis is more adapted to short time series and measures for which a higher degree of error is expected than for classical spectral analysis (Legendre & Legendre, 1998). | Colebrook (1981, 1982a, 1985b, 1991) |
| Maximum entropy cross spectral analysis | Examination of the common patterns of temporal variability for two pairs of variables (e.g. total copepods and sea-surface temperature). This method uses both coherence and phase diagrams and determines relationships between variables for all possible scales of variability. | Colebrook (1985b, 1986, 1991) |
Fig. 14. Cumulative sums of surface air temperature and the 2-dimensional principal component (years-locations) in the eastern
English Channel. The figure shows the negative relationship between both variables. Redrawn from Beaugrand, Reid, Ibañez &
Planque (2000a).
Colebrook (1978, 1982a). He used this method to smooth and emphasise the trend of plankton time series
and applied it directly to abundance data and to principal components (Colebrook, 1978). In more recent
years, EVF has seldom been used on CPR data. An application to the decomposition and quantification
of the scales of temporal variability in the diversity of calanoid copepods is presented in Fig. 15. A time
series was built up for the North Sea (0°E–10°W, 50°N–60°N) from 1958 to 1999 using the diversity
(as the number of taxa per CPR sample) for each year (× 42), for daylight and dark periods (× 2) and for each
two-month period (× 6). Hence, the length of the time series was 504 (2 × 6 × 42). The lag chosen for
the Toeplitz matrix (autocovariance matrix) was 100. This lag was selected in order to eliminate long-term
cycles and to emphasise the trend in the time series. The autocorrelation function (Fig. 15(a)) of the time
Fig. 15. Variability in the diversity of calanoid copepods from 1958 to 1999 for every two-month period and dark/daylight periods
(504 points). (a) Autocorrelation function (99% and 95% confidence intervals are indicated). (b) Gain spectra assessed from the
results of the EVF.
series was high, with a lag of one year between successive maxima, which clearly reflected the strong effect
of seasonality on the diversity of calanoid copepods. The gain function was also calculated according to
the procedure described by Ibañez and Etienne (1991) (Fig. 15(b)). This function indicates the periods
emphasised by the corresponding eigenvectors of the EVF. A high value in the gain function for long
periods (infinity) indicates a trend while a high value for small periods can often be attributed to noise.
The gain function (Fig. 15(b)) shows that the first two axes of the decomposition (S1 and S2, 24.9% and
24.1% of the total variance of the time series, respectively) generated cycles with a period of about one
year (the seasonal cycle). The third series (S3, 3.7%) emphasised the long-term variability of the time
series, the fourth (S4, 2.8%) pointed out a cyclical trend with a period of approximately 16 years and the
fifth series (S5, 2.8%) identified the day/night variation in diversity. Fig. 16 confirms the results of the
gain function. Series 1 (the original time series reassessed by multiplying the first eigenvector by the first
principal component) and 2 (the original time series reassessed by multiplying the second eigenvector by
the second principal component) represented the seasonal cycle. Series 3 indicated the trend of the time
series. This trend showed three peaks of high diversity in 1959, 1972 and 1990, which corresponded to
warmer sea surface temperature. The low diversity in 1980 corresponded to the inflow of cold water into
the North Sea. Series 4 clearly showed a pseudo-cycle of about 16 years, as was evident from the gain function.
Fig. 16. Series recalculated from the data presented in Fig. 15 using the first five eigenvectors. As the use of the second eigenvector
gave a similar result to that of the first, it is not represented here. (a) first series: seasonal variability. (b) third series: long-term trend.
(c) fourth series: cyclical variability (pseudo-period of about 16 years) with a slight influence of diel variability in diversity. (d) fifth
series: diel variability in the diversity of calanoid copepods.
Series 5 emphasised the day/night variation in diversity. Coefficients of variation calculated for each series
indicated that the seasonal variability was more important than the year-to-year variability (Fig. 17). This
result confirms the observations on diel and seasonal variability made by Beaugrand, Ibañez and Lindley
(2001). The result also shows that it is important to take seasonal variability into account in the examination
of calanoid copepod diversity. Diel variability is also important in relation to year-to-year variability and
should also be considered (Fig. 17). This may also apply to abundance data.
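The main steps of Eigenvector filtering can be sketched as follows: estimate the autocovariance function, build the Toeplitz matrix, extract an eigenvector and project the lagged series onto it. The sketch is an illustration under several assumptions and is not the code used for Figs. 15 to 17: the series is a synthetic, purely seasonal signal with 12 values per year, the window of 24 lags is hypothetical (the analysis above used 100), and only the leading eigenvector is computed, by power iteration, to keep the example free of external libraries.

```python
import math


def autocovariance(series, max_lag):
    """Biased autocovariance estimates for lags 0 .. max_lag - 1."""
    n = len(series)
    mean = sum(series) / n
    centred = [v - mean for v in series]
    return [sum(centred[t] * centred[t + k] for t in range(n - k)) / n
            for k in range(max_lag)]


def leading_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and its eigenvector of a symmetric
    positive semi-definite matrix, obtained by power iteration."""
    size = len(matrix)
    vector = [1.0] + [0.0] * (size - 1)
    value = 0.0
    for _ in range(iterations):
        product = [sum(row[j] * vector[j] for j in range(size)) for row in matrix]
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
        value = norm
    return value, vector


# Synthetic series: 42 years x 12 values per year, seasonal cycle only.
series = [math.sin(2 * math.pi * t / 12) for t in range(504)]
window = 24  # hypothetical lag for the Toeplitz (autocovariance) matrix
acov = autocovariance(series, window)
toeplitz = [[acov[abs(i - j)] for j in range(window)] for i in range(window)]
eigenvalue, eigenvector = leading_eigenpair(toeplitz)

# Share of the total variance carried by the first eigenvector
# (the trace of the Toeplitz matrix is window x variance).
explained = eigenvalue / (window * acov[0])

# First principal component: projection of the lagged series
# (which has zero mean here) onto the first eigenvector.
component = [sum(eigenvector[j] * series[t + j] for j in range(window))
             for t in range(len(series) - window + 1)]

# Autocorrelation at a lag of one year (12 values).
acf_one_year = acov[12] / acov[0]
print(round(explained, 2), round(acf_one_year, 2))
```

For a purely seasonal signal the variance is shared almost equally between two eigenvectors in quadrature, which is consistent with the pairing of the first two axes described above; a real series would add further axes for trend, pseudo-cycles and noise.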
5.3. Maximum entropy spectral and cross-spectral analyses
Colebrook was the first to apply Maximum Entropy Spectral and Cross-Spectral analyses to CPR data
(Colebrook, 1981, 1982b, 1985a, 1991). Colebrook and Taylor (1984) used these techniques to analyse
temporal variability in the abundance of plankton sampled on a monthly basis and physical data such as
sea-surface temperature from 1948 to 1980. Maximum Entropy spectral and cross-spectral analyses were
used to determine the characteristic frequency of long-term variability in the abundance of plankton (first
principal component from a standardised PCA on the matrix years × species) and to examine similarities
between plankton and physical variables around the British Isles. Using coherence and phase spectra, these
authors identified a number of characteristic periods (e.g. 10–12 years, 5–6 years, 3–4 years). Wavelengths
of 3–4 years were associated with surface-heat exchange phenomena.
6. Conclusions
Considering the current number of years (43) and months (516) recorded from 1958 to 2000, for all
species or taxa (about 450) and standard areas (33), about 7.7 million graphs would be needed to examine
year-to-year and long-term changes in the seasonal cycle of each species or taxon in all standard areas.
More than ever, multivariate analyses need to be used to extract relevant information contained in the
Fig. 17. Quantification of temporal scales of variability in calanoid copepod diversity. The coefficient of variation was calculated
for the first five series reconstructed from the EVF (see Fig. 16).
database. This review has emphasised how important statistical analyses have been, and are likely to continue to be, in the interpretation of CPR data.
There is a clear need to develop techniques to improve the sorting of information in the CPR database
and to evaluate relationships between biological and environmental data. Environmental parameters are
available from the web (e.g. temperature, CZCS data, salinity, wind speed, wind direction). Environmental
tables could be gathered and compared with biological tables assembled from the CPR database. Techniques
so far not applied to CPR data (e.g. Redundancy Analysis or Canonical Correspondence Analysis) may be
appropriate. Other techniques based on probability distributions (e.g. randomisation procedures for customised
distributions, Poisson and Poisson-like distributions for rare species, Lévy and log-Lévy distributions for
geometrically fractal distributions, Bayesian techniques for regional variables) could also help to assess
relationships between biological and physical variables.
Acknowledgements
The authors are grateful to all past and present members and supporters of the Sir Alister Hardy Foundation for Ocean Science whose continuous efforts have allowed the long-term establishment and maintenance of the CPR dataset. We are particularly grateful to Philip C. Reid, Martin Edwards, Benjamin
Planque, Arnold Taylor and the two referees for advice and comments on the manuscript. The research
presented was supported by the European Community Research Project No. MAS3-CT98-5058, the Netherlands (contract RKZ595) and the French ‘Programme National en environnement côtier, thème: influence
des facteurs hydroclimatiques ou anthropiques sur la variabilité spatio-temporelle des populations et écosystèmes marins’ (PNEC art 4).
References
Angel, M. V. (1994). Spatial distribution of marine organisms: patterns and processes. In P. J. Edwards, R. M. May, & N. R. Webb
(Eds.), Large-scale ecology and conservation biology (pp. 59–109). Cambridge: Blackwell Scientific Publications.
Anneville, O., Souissi, S., Ibañez, F., Ginot, V., Druart, J. -C., & Angeli, N. (2002). Temporal mapping of phytoplankton assemblages
in Lake Geneva: Annual and interannual changes in their patterns of succession. Limnology & Oceanography, 47, 1355–1366.
Beare, D. J., & McKenzie, E. (1999a). Connecting ecological and physical time-series: the potential role of changing seasonality.
Marine Ecology Progress Series, 178, 307–309.
Beare, D. J., & McKenzie, E. (1999b). The multinomial logit model: a new tool for exploring Continuous Plankton Recorder data.
Fisheries Oceanography, 8(Suppl. 1), 25–39.
Beaugrand, G. (1999). Le programme Continuous Plankton Recorder (CPR) et son application à l’étude des changements spatiotemporels de la biodiversité pélagique en Atlantique nord et en mer du Nord. Océanis, 25, 417–433.
Beaugrand, G., & Edwards, M. (2001). Comparison in performance among four indices used to evaluate diversity in pelagic ecosystems. Oceanologica Acta, 24, 467–477.
Beaugrand, G., & Ibañez, F. (2002). Spatial dependence of pelagic diversity in the North Atlantic Ocean. Marine Ecology Progress
Series, 232, 197–211.
Beaugrand, G., Ibañez, F., & Lindley, J. A. (2001). Geographical distribution and seasonal and diel changes of the diversity of
calanoid copepods in the North Atlantic and North Sea. Marine Ecology Progress Series, 219, 205–219.
Beaugrand, G., Ibañez, F., Lindley, J. A., & Reid, P. C. (2002b). Diversity of calanoid copepods in the North Atlantic and adjacent
seas: species associations and biogeography. Marine Ecology Progress Series, 232, 179–195.
Beaugrand, G., Ibañez, F., & Reid, P. C. (2000b). Long-term and seasonal fluctuations of plankton in relation to hydroclimatic features
in the English Channel, Celtic Sea and Bay of Biscay. Marine Ecology Progress Series, 200, 93–102.
Beaugrand, G., Reid, P. C., Ibañez, F., Lindley, J. A., & Edwards, M. (2002a). Reorganisation of North Atlantic marine copepod
biodiversity and climate. Science, 296, 1692–1694.
Beaugrand, G., Reid, P. C., Ibañez, F., & Planque, B. (2000a). Biodiversity of North Atlantic and North Sea calanoid copepods.
Marine Ecology Progress Series, 204, 299–303.
Becker, G., & Dooley, H. (1995). The 1989/91 high salinity anomaly. Ocean Challenge, 6, 52–57.
Cassie, R. M. (1963). Multivariate analysis in the interpretation of numerical plankton data. New Zealand Journal of Science, 6, 36–58.
Clifford, H. T., & Goodall, D. W. (1967). A numerical contribution to the classification of the Poaceae. Australian Journal of Botany,
15, 499–519.
Colebrook, J. M. (1964). Continuous Plankton Records: a principal component analysis of the geographical distribution of zooplankton.
Bulletins of Marine Ecology, 6, 78–100.
Colebrook, J. M. (1969). Variability in plankton. Progress in Oceanography, 5, 115–125.
Colebrook, J. M. (1978). Continuous Plankton Records: zooplankton and environment, north-east Atlantic and North Sea, 1948–1975.
Oceanologica Acta, 1, 9–23.
Colebrook, J. M. (1979). Continuous Plankton Records: seasonal cycles of phytoplankton and copepods in the North Atlantic Ocean
and the North Sea. Marine Biology, 51, 23–32.
Colebrook, J. M. (1981). Continuous Plankton Records: persistence in time-series of annual means of abundance of zooplankton.
Marine Biology, 61, 143–149.
Colebrook, J. M. (1982a). Continuous plankton records: phytoplankton, zooplankton and environment, North-East Atlantic and North
Sea, 1958–1980. Oceanologica Acta, 5, 473–480.
Colebrook, J. M. (1982b). Continuous Plankton Records: persistence in time-series and the population dynamics of Pseudocalanus
elongatus and Acartia clausi. Marine Biology, 66, 289–294.
Colebrook, J. M. (1984). Continuous Plankton Records: relationships between species of phytoplankton and zooplankton in the
seasonal cycle. Marine Biology, 83, 313–323.
Colebrook, J. M. (1985a). Continuous Plankton Records: overwintering and annual fluctuations in the abundance of zooplankton.
Marine Biology, 84, 261–265.
Colebrook, J. M. (1985b). Sea surface temperature and zooplankton, North Sea, 1948 to 1983. Journal du Conseil. Conseil International pour l’Exploration de la Mer, 42, 179–185.
Colebrook, J. M. (1986). Environmental influences on long-term variability in marine plankton. Hydrobiologia, 142, 309–325.
Colebrook, J. M. (1991). Continuous Plankton Records: from seasons to decades in the plankton of the North-East Atlantic. In T.
Kawasaki, S. Tanaka, Y. Toba, & A. Taniguchi (Eds.), Long-term variability of pelagic fish population and their environments
(pp. 29–45). Oxford: Pergamon Press.
Colebrook, J. M., Glover, R. S., & Robinson, G. A. (1961a). Contribution towards a plankton atlas of the North-Eastern Atlantic
and the North Sea. General introduction. Bulletins of Marine Ecology, 5, 67–80.
Colebrook, J. M., John, D. E., & Brown, W. W. (1961b). Contribution towards a plankton atlas of the North-Eastern Atlantic and
the North Sea. Part II: Copepoda. Bulletins of Marine Ecology, 5, 90–97.
Colebrook, J. M., & Robinson, G. A. (1964). Continuous Plankton Records: annual variations of abundance of plankton, 1948–1960.
Bulletins of Marine Ecology, 6, 52–69.
Colebrook, J. M., & Taylor, A. H. (1984). Significant time scales of long-term variability in the plankton and the environment.
Rapport et Procès-verbaux des réunions. Conseil International pour l’Exploration de la Mer, 183, 20–26.
Dickson, R. R., Meincke, J., Malmberg, S. A., & Lee, A. J. (1988). The ‘Great Salinity Anomaly’ in the northern North Atlantic,
1968–1982. Progress in Oceanography, 20, 103–151.
Diggle, P. J. (1990). Time series: a biostatistical introduction. Oxford: Clarendon Press.
Dufrêne, M., & Legendre, P. (1997). Species assemblages and indicator species: the need for a flexible asymmetrical approach. Ecological Monographs, 67, 345–366.
Edinburgh Oceanographic Laboratory (1973). Continuous plankton records: A plankton atlas of the North Atlantic and the North
Sea. Bulletins of Marine Ecology, 7, 1–174.
Edwards, M. (2000). Large-scale temporal and spatial patterns of marine phytoplankton and climate variability in the North Atlantic.
Ph.D. Thesis, Plymouth University, England.
Edwards, M., Beaugrand, G., Reid, P. C., Rowden, A. A., & Jones, M. B. (2002). Ocean climate anomalies and the ecology of the
North Sea. Marine Ecology Progress Series, 239, 1–10.
Edwards, M., John, A. W. G., Hunt, H. G., & Lindley, J. A. (1999). Exceptional influx of oceanic species into the North Sea late
1997. Journal of the Marine Biological Association of the United Kingdom, 79, 737–739.
Edwards, M., John, A. W. G., Johns, D. G., & Reid, P. C. (2001a). Case history and persistence of the non-indigenous diatom
Coscinodiscus wailesii in the north-east Atlantic. Journal of the Marine Biological Association of the United Kingdom, 81,
207–211.
Edwards, M., Reid, P. C., & Planque, B. (2001b). Long-term and regional variability of phytoplankton biomass in the northeast
Atlantic (1960–1995). ICES Journal of Marine Science, 58, 39–49.
Fromentin, J. -M., & Planque, B. (1996). Calanus and environment in the eastern North Atlantic. II. Influence of the North Atlantic
Oscillation on C. finmarchicus and C. helgolandicus. Marine Ecology Progress Series, 134, 111–118.
Glémarec, M. (1973). The benthic communities of the European North Atlantic continental shelf. Oceanography and Marine Biology
Annual Review, 11, 263–289.
Glover, R. S. (1952). Continuous Plankton Records: the Euphausiacea of the north-eastern Atlantic and the North Sea. Hull Bulletins
of Marine Ecology, 3, 185–214.
Glover, R. S. (1957). An ecological survey of the Scottish herring fishery. Part II: the planktonic environment of the herring. Bulletins
of Marine Ecology, 5, 1–43.
Haury, L. R., & McGowan, J. A. (1998). Time-space scales in marine biogeography. Intergovernmental Oceanographic Commission
Workshop Report, 142, 163–170.
Hays, G. C. (1995). Diel vertical migration behaviour of Calanus hyperboreus at temperate latitudes. Marine Ecology Progress Series,
127, 301–304.
Hays, G. C. (1996). Large-scale patterns of diel vertical migration in the North Atlantic. Deep-Sea Research I, 43, 1601–1615.
Hays, G. C., Proctor, C. A., John, A. W. G., & Warner, A. J. (1994). Interspecific differences in the diel vertical migration of marine
copepods: the implications of size, color, and morphology. Limnology and Oceanography, 39, 1621–1629.
Hirst, A. G., & Batten, S. D. (1998). Long-term changes in the diel vertical migration behaviour of Calanus finmarchicus in the
North Sea are unrelated to fish predation. Marine Ecology Progress Series, 171, 307–310.
Ibañez, F., & Dauvin, J. -C. (1988). Long-term changes (1977 to 1987) in a muddy fine sand Abra alba—Melinna palmata community
from the western English Channel: multivariate time series analysis. Marine Ecology Progress Series, 49, 65–81.
Ibañez, F., & Etienne, M. (1991). Le filtrage des séries chronologiques par l’analyse en composantes principales de processus (ACPP).
Journal de Recherche océanographique, 16, 27–33.
Ibañez, F., Fromentin, J. -M., & Castel, J. (1993). Application de la méthode des sommes cumulées à l’analyse des séries chronologiques en océanographie. Comptes Rendus de l’Académie des Sciences de Paris, Sciences de la Vie, 316, 745–748.
Jolliffe, I. T. (1986). Principal Component Analysis. New York: Springer-Verlag New York Inc.
Kruskal, J. B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29, 1–27.
Lam, N. S. N. (1983). Spatial interpolation methods: a review. The American Cartographer, 10, 129–149.
Lance, G. N., & Williams, W. T. (1967). A general theory of classificatory sorting strategies. I. Hierarchical systems. Computer
Journal, 9, 373–380.
Legendre, P., & Legendre, L. (1983). Echantillonnage et traitement des données. In S. Frontier (Ed.), Stratégies d’échantillonnage
en écologie (pp. 163–216). Paris: Masson.
Legendre, P., & Legendre, L. (1998). Numerical Ecology. (2nd ed.). Amsterdam: Elsevier Science B.V.
Levin, S. A. (1992). The problem of pattern and scale in ecology. Ecology, 73, 1943–1967.
Lindley, J. A. (1987). Continuous Plankton Records: the geographical distribution and seasonal cycles of Decapod Crustacean larvae
and pelagic post-larvae in the north-eastern Atlantic Ocean and the North Sea, 1981–3. Journal of the Marine Biological Association of the United Kingdom, 67, 145–167.
Lindley, J. A. (1998). Diversity, biomass and production of decapod crustacean larvae in a changing environment. Invertebrate
Reproduction and Development, 33, 209–219.
Lindley, J. A., Roskell, J., Warner, A. J., Halliday, N. C., Hunt, H. G., John, A. W. G., & Jonas, T. D. (1990). Doliolids in the
German Bight in 1989: evidence for exceptional inflow into the North Sea. Journal of the Marine Biological Association of the
United Kingdom, 70, 679–682.
Lindley, J. A., & Williams, R. (1994). Relating plankton assemblages to environmental variables using instruments towed by ships-of-opportunity. Marine Ecology Progress Series, 107, 245–262.
Longhurst, A. (1998). Ecological Geography of the Sea. London: Academic Press.
Lucas, C. E. (1940). Ecological investigations with the Continuous Plankton Recorder: the phytoplankton in the southern North Sea,
1932–1937. Hull Bulletins of Marine Ecology, 1, 73–170.
Lucas, C. E. (1941). Continuous plankton records: phytoplankton in the North Sea, 1938–39. Part 1. Diatoms. Hull Bulletins of
Marine Ecology, 2, 19–46.
Lucas, C. E. (1942). Continuous plankton records: phytoplankton in the North Sea, 1938–39. Part II.—Dinoflagellates, Phaeocystis,
etc. Hull Bulletins of Marine Ecology, 2, 47–70.
Lundberg, P., Ranta, E., Ripa, J., & Kaitala, V. (2000). Population variability in space and time. Trends in Ecology and Evolution,
15, 460–464.
Mann, K. H., & Lazier, J. R. N. (1996). Dynamics of marine ecosystems: biological-physical interactions in the oceans. (2nd ed.).
Cambridge: Blackwell Science.
Planque, B. (1996). Spatial and temporal fluctuations in Calanus populations sampled by the Continuous Plankton Recorder. Ph.D.
Thesis, Université Pierre et Marie Curie, Paris, France.
Planque, B., & Batten, S. D. (2000). Calanus finmarchicus in the North Atlantic: the year of Calanus in the context of interdecadal
change. ICES Journal of Marine Science, 57, 1528–1535.
Planque, B., & Fromentin, J. -M. (1996). Calanus and environment in the eastern North Atlantic. I. Spatial and temporal patterns of
C. finmarchicus and C. helgolandicus. Marine Ecology Progress Series, 134, 101–109.
Planque, B., Hays, G. C., Ibañez, F., & Gamble, J. C. (1997). Large scale spatial variations in the seasonal abundance of Calanus
finmarchicus. Deep-Sea Research (I), 44, 315–326.
Planque, B., & Ibañez, F. (1997). Long-term time series in Calanus finmarchicus abundance—a question of space? Oceanologica
Acta, 20, 159–164.
Rae, K. M., & Rees, C. B. (1947). Continuous plankton records: the Copepoda in the North Sea, 1938–1939. Hull Bulletins of Marine
Ecology, 2, 95–132.
Rees, C. B. (1952). Continuous Plankton Records: the decapod larvae in the North Sea, 1947–1949. Hull Bulletins of Marine Ecology,
3, 157–184.
Reid, P. C., & Beaugrand, G. (2002). Interregional biological responses in the North Atlantic to hydrometeorological forcing. In K.
Sherman, & H. -R. Skjoldal (Eds.), Changing states of the Large Marine Ecosystems of the North Atlantic (pp. 27–48). Amsterdam:
Elsevier Science.
Reid, P. C., Edwards, M., Hunt, H. G., & Warner, A. J. (1998a). Phytoplankton change in the North Atlantic. Nature, 391, 546.
Reid, P. C., & Planque, B. (2000). Long-term planktonic variations and the climate of the North Atlantic. In D. Mills (Ed.), The ocean
life of Atlantic salmon. Environmental and biological factors influencing survival (pp. 153–169). Bodmin: Fishing News Books.
Reid, P. C., Planque, B., & Edwards, M. (1998b). Is variability in the long-term results of the Continuous Plankton Recorder survey
a response to climate change? Fisheries Oceanography, 7, 282–288.
Robinson, G. A. (1961). Contribution towards a plankton atlas of the North-Eastern Atlantic and the North Sea. Part I: phytoplankton.
Bulletins of Marine Ecology, 5, 81–89.
Rothschild, B. J. (1998). Year class strengths of zooplankton in the North Sea and their relation to cod and herring abundance.
Journal of Plankton Research, 20, 1721–1741.
Sen, Z. (1989). Cumulative semi-variogram models of regionalized variables. Mathematical Geology, 21, 891–903.
Shepard, R. N. (1962). The analysis of proximities: multidimensional scaling with an unknown distance function. Psychometrika, 27,
125–139.
Taylor, A. H., Colebrook, J. M., Stephens, J. A., & Baker, N. G. (1992). Latitudinal displacements of the Gulf Stream and the
abundance of plankton in the North-East Atlantic. Journal of the Marine Biological Association of the United Kingdom, 72,
919–921.
Vautard, R., Yiou, P., & Ghil, M. (1992). Singular-spectrum analysis: a toolkit for short, noisy chaotic signals. Physica D, 58, 95–126.
Wackernagel, H. (1995). Multivariate geostatistics. An introduction with applications. Berlin: Springer-Verlag Berlin Heidelberg.
Williamson, M. H. (1961). An ecological survey of a Scottish herring fishery. Part IV: changes in the plankton during the period
1949 to 1959. Appendix: a method for studying the relation of plankton variations to hydrography. Bulletins of Marine Ecology,
5, 207–229.
Worthington, L. V. (1976). On the North Atlantic circulation. Oceanography studies, 6, 1–110.