An overview of statistical methods applied to CPR data G. Beaugrand

Progress in Oceanography 58 (2003) 235–262
www.elsevier.com/locate/pocean

An overview of statistical methods applied to CPR data

G. Beaugrand a,∗, F. Ibañez b, J.A. Lindley a

a Sir Alister Hardy Foundation for Ocean Science, The Laboratory, Citadel Hill, Plymouth PL1 2PB, UK
b Observatoire océanologique, Laboratoire d’Océanologie de Villefranche, BP 28, 06230 Villefranche-Sur-Mer, France

∗ Corresponding author. Tel.: +44-1752-633133; fax: +44-1752-600015. E-mail address: gbea@mail.pml.ac.uk (G. Beaugrand).

Abstract

Since the beginning of the Continuous Plankton Recorder (CPR) survey in 1931, information on the abundance of a large number of plankton species or taxa has been obtained on a monthly basis in the northern North Atlantic. The many different ecological issues in which the survey has been involved have led to the application of a range of statistical methods. In this paper, we review some of the methods applied to the CPR data by presenting new and up-to-date analyses. Special attention is devoted to multivariate analysis, which has been used extensively to extract information from the CPR database. Results obtained from recently applied geostatistical methods on CPR data are then considered. An example of a time series decomposition by the use of Eigenvector filtering is presented to illustrate time-series analysis.

© 2003 Elsevier Ltd. All rights reserved.
0079-6611/$ - see front matter. doi:10.1016/j.pocean.2003.08.006

Contents

1. Introduction
2. The descriptive period of the CPR survey
3. Multivariate analyses
   3.1. Ordination in reduced space
      3.1.1. Standardised Principal Component Analysis (PCA)
      3.1.2. Centred PCA at diel and seasonal scales
      3.1.3. Three-mode Principal Component Analysis
      3.1.4. Non-metric multidimensional scaling (MDS)
   3.2. Cluster analysis
      3.2.1. Seriation
      3.2.2. Cluster Analysis and ordination
   3.3. Indicator-value method
4. Geostatistics
   4.1. Spatial interpolation
   4.2. Semi-variograms and cumulative semi-variograms
5. Time series analysis
   5.1. Cumulative sums
   5.2. Eigenvector filtering (EVF) and power spectra
   5.3. Maximum entropy spectral and cross-spectral analyses
6. Conclusions

1. Introduction

Since the start of the Continuous Plankton Recorder (CPR) monitoring survey in 1931, large amounts of data have been accumulated. At present, information on the abundance of more than 450 species or taxa has been gathered. A total of some 178,000 CPR samples were collected by the year 2000, comprising ~2 million entries and ~80 million data-points in the database. This programme has become the largest plankton monitoring programme in the world, considering both its wide spatial coverage and long time span.

The CPR survey has been involved in the investigation of many ecological issues. Biogeographical studies have been conducted showing spatial distribution throughout the North Atlantic Ocean and shelf seas of more than 250 species such as Calanus finmarchicus and Calanus helgolandicus (Colebrook, Glover & Robinson, 1961a; Edinburgh Oceanographic Laboratory, 1973). Recently, the mapping protocol has been improved using the Lambert conical projection (Planque, 1996) and mapping techniques such as kriging and the inverse squared distance interpolation method (Planque, 1996; Planque & Ibañez, 1997; Beaugrand, Reid, Ibañez, & Planque, 2000a). A number of investigations have allowed a better characterisation of seasonal cycles and of spatial changes for many taxa (Glover, 1957; Colebrook, 1979; Colebrook, 1984). Other works have examined long-term changes in phytoplankton and zooplankton in relation to hydro-meteorological forcing (Colebrook, 1981; Colebrook, 1982a; Colebrook, 1991; Taylor, Colebrook, Stephens, & Baker, 1992; Reid, Edwards, Hunt, & Warner, 1998a; Edwards, John, Hunt, & Lindley, 1999).
Recent results using this large dataset indicate that year-to-year changes in standing stock, production and community structure of plankton may be related to the North Atlantic Oscillation (NAO) and climate change (Fromentin & Planque, 1996; Reid & Planque, 2000; Beaugrand, Ibañez, & Reid, 2000b; Beaugrand, Reid, Ibañez, Lindley, & Edwards, 2002a). Other studies on diel vertical migration of some calanoid copepods (Hays, Proctor, John, & Warner, 1994; Hays, 1995; Hays, 1996; Hirst & Batten, 1998), spatial and temporal changes in the diversity of copepods (Beaugrand & Edwards, 2001; Beaugrand, Ibañez, & Lindley, 2001), monitoring of non-indigenous species (Edwards, John, Johns, & Reid, 2001a), and unusual events (Lindley et al., 1990; Edwards, John, Hunt & Lindley, 1999; Edwards, Reid, & Planque, 2001b; Edwards, Beaugrand, Reid, Rowden, & Jones, 2002) have been undertaken and have led to a better understanding of the ecology of many species, exceptional events and the functioning of North Atlantic pelagic ecosystems.

The many issues in which the CPR data have been used have involved the deployment of numerous statistical analyses, of which only a limited number can reasonably be presented in this paper. Since most statistical analyses found in classical statistical manuals can be used on the CPR data, only those methods that have often been applied to the CPR dataset and for which it was possible to include a clear example associated with a particular ecological issue are emphasised in this review. Moreover, the importance of scales of variability, as stressed by many authors (e.g. Levin, 1992; Angel, 1994; Mann & Lazier, 1996; Haury & McGowan, 1998; Lundberg, Ranta, Ripa, & Kaitala, 2000), is also addressed as this is as important as the analyses themselves.

2. The descriptive period of the CPR survey

Until 1964, geographical distribution, annual and year-to-year variability of species or taxa sampled by the CPR survey were mainly investigated by the use of graphs, contour diagrams or maps (Lucas, 1941; Lucas, 1942; Rees, 1952; Glover, 1952). Most statistical analyses were restricted to one dimension. Despite that, results were immediately meaningful and good progress was made in describing the biogeography of species around the United Kingdom (Lucas, 1940; Robinson, 1961; Colebrook, John, & Brown, 1961b). The spatial distribution of Centropages hamatus and Ceratium fusus is shown in Fig. 1, based on data collected during the period 1948–1956 (Colebrook, John & Brown, 1961b; Robinson, 1961). This shows the coastal distribution of Centropages hamatus, whereas the dinoflagellate Ceratium fusus has a wider distribution, occurring in both oceanic and neritic waters (Fig. 1). This way of presenting the spatial distribution of plankton (on gridded charts of 1° of latitude by 2° of longitude), described by Colebrook, John & Brown (1961b), was used to produce the first atlas of plankton in the North Atlantic Ocean (Edinburgh Oceanographic Laboratory, 1973).

Seasonal cycles of plankton around the British Isles were also investigated. For example, Rae and Rees (1947) presented the seasonal cycle of Temora longicornis and the group Para-Pseudocalanus spp. This way of investigating CPR results is still used today, although the application of multivariate statistics has radically changed the way in which information is extracted from the CPR dataset.

3. Multivariate analyses

While graphical presentation of CPR data is useful, it soon became clear that the huge mass of multidimensional information provided by the Survey had to be sorted and reduced according to its relevance. For most techniques reviewed in this paper, no mathematical expressions are given in the text and readers are referred to specialised books (e.g.
Jolliffe, 1986; Legendre & Legendre, 1983; Legendre & Legendre, 1998) and other references cited in the text. Table 1 lists the types of multivariate analyses that have been applied to CPR data.

Fig. 1. Spatial distribution of the calanoid copepod Centropages hamatus (a) and the dinoflagellate Ceratium fusus (b) for the period 1948–1956 around the British Isles. From Colebrook, Glover & Robinson (1961a) and Robinson (1961).

Table 1. Types of multivariate analysis performed on CPR data

| Multivariate technique | Ecological goal | Authors |
|---|---|---|
| Standardised PCA | See Table 2 | See Table 2 |
| Centred PCA | See Table 2 | See Table 2 |
| Seriation | Examination of the relations between species based on their annual fluctuation in abundance | Colebrook (1964), Colebrook and Robinson (1964), Colebrook (1969) |
| Cluster Analysis. Single linkage agglomerative (nearest-neighbour) clustering method | Grouping of species or taxa | Lindley (1987), Lindley and Williams (1994) |
| Cluster Analysis. Hierarchical agglomerative flexible clustering technique (Lance & Williams, 1967) | Clustering of pixels or geographical areas to identify regions with similar year-to-year or annual patterns in the abundance of species | Planque and Ibañez (1997), Beaugrand, Reid, Ibañez & Planque (2000a) |
| Cluster Analysis. Complete linkage agglomerative clustering | Partition of the North Atlantic Ocean based on the diel and seasonal pattern of diversity of calanoid copepods | Beaugrand, Ibañez, Lindley & Reid (2002b) |
| Indicator-value method (Dufrêne & Legendre, 1997) | Determination of species associations based on the relative abundance and presence of species in distinct areas in the North Atlantic | Beaugrand, Ibañez, Lindley & Reid (2002b) |
| Non-metric multidimensional scaling | Ordination of species or taxa based on the similarity of their spatial distribution | Lindley (1987), Lindley and Williams (1994) |
| Mantel correlogram | Study of relationships between the size of spatial structures and their temporal variability | Planque and Ibañez (1997) |
| Generalised additive models | Spatial and temporal modelling of the abundance of species | Beare & McKenzie (1999a, 1999b) |
| Three-mode PCA | Analyses of biological tables structured in space and time. Evaluation and quantification of the interactions between biology, space and time | Beaugrand, Reid, Ibañez & Planque (2000a) |

3.1. Ordination in reduced space

This type of multivariate analysis has been applied extensively to CPR data. It consists of representing the relationships between objects and observations in a reduced number of dimensions (Legendre & Legendre, 1998). Principal Component Analysis (PCA) is an example. This ordination method has greatly helped in the extraction of relevant information in many types of tables derived from the CPR dataset (Table 2). Non-metric multidimensional scaling (Shepard, 1962; Kruskal, 1964) and three-mode Principal Component Analysis (Jolliffe, 1986) are other techniques that have been applied more recently.

Table 2. Diversity of matrices on which principal component analysis has been performed

| Type of PCA | Table analysed | Correlation/covariance matrix | Ecological goal | Authors |
|---|---|---|---|---|
| Standardised PCA | Matrix area × taxa | taxa × taxa | Identification of species assemblages. Examination of the relations between species. Geographical locations of species associations. | Colebrook (1964, 1984) |
| Standardised PCA | Matrix years × geographical areas | areas × areas | Extraction of major patterns of year-to-year variability in the abundance of species and its variation in space. | Colebrook (1978, 1982b, 1986) |
| Standardised PCA | Matrix years × taxa | taxa × taxa | Examination of the relationships between species on the basis of their year-to-year and long-term changes in a region. | Colebrook (1978, 1982b), Reid et al. (1998b), Reid and Beaugrand (2002) |
| Standardised PCA | Buys-Ballot table geographical areas × months-total copepods-colour index | months-total copepods-colour index × months-total copepods-colour index | Determination of the relationships between the timing of the amplitude and the duration of the spring bloom for total copepods and phytoplankton (eigenvectors). Examination of spatial changes in the characteristics of the seasonal cycle (principal components). | Colebrook (1979) |
| Standardised PCA | Table geographical areas × months | months × months | Investigation of the relationships between months for species such as Temora longicornis and Acartia clausii and examination of the spatial coherence of the seasonal cycle. | Colebrook (1981, 1982a, 1984) |
| Standardised PCA | Table months × taxa | taxa × taxa | Investigation of the relationships between months and ordination of species according to their main pattern of seasonality. | Colebrook (1984) |
| Standardised PCA | Table years × months | months × months | Relationships between the seasonal cycle and the year-to-year variability of species. | Colebrook (1985a) |
| Standardised PCA for table with missing data | Matrix months × map pixels of the abundance of C. finmarchicus | pixels × pixels | Determination of seasonal cycle of C. finmarchicus and investigation of its spatial variation. | Planque, Hays, Ibañez and Gamble (1997) |
| Centred PCA | Buys-Ballot table months-2-hour period × pixels for diversity of calanoid copepods | pixels × pixels | Determination of seasonal and diel patterns of the diversity of calanoid copepods. Quantification of the two scales of variability at a mesoscale resolution in the North Atlantic. Examination of the spatial variation of the diversity of calanoids at diel and seasonal scales. | Beaugrand et al. (2001) |

3.1.1. Standardised Principal Component Analysis (PCA)

Following Williamson (1961) and Cassie (1963), who were among the first to apply Principal Component Analysis in plankton ecology, Colebrook (1964) started to analyse data on abundance from the CPR using multivariate techniques. Standardised PCA was first applied by him to examine the main patterns of variability in the distributions of 22 different taxa around the United Kingdom (Fig. 2). He used eigenvectors (Fig. 2(a)) to investigate the relationships between the species and principal components to examine the spatial distribution of groups of species (Fig. 2(c)–(d)). Fig. 2(a) illustrates the separation between neritic and southern species along the first axis, while the second axis separates northern oceanic and intermediate species. The 22 taxa were classified into five species associations (northern and southern oceanic, northern and southern intermediate, and neritic) and their locations (see Fig. 2(b)–(d)) were in part explained by the effect of temperature and its seasonal variability, and also salinity. The ‘simplification’ of multidimensional space by this method proved satisfactory and led to the extensive use of standardised PCA on CPR data. Examples of studies that used this method of ordination are summarised in Table 2.

Standardised PCA was much used to extract the main patterns of year-to-year and long-term changes in the community structure of phytoplankton and zooplankton, typically in CPR Standard Areas (Colebrook, 1978; Colebrook, 1982a). In most of the CPR Standard Areas, Colebrook (1978, 1982a) reported a declining trend of about 70% for zooplankton taxa and 60% for phytoplankton taxa. As these changes were detected consistently throughout a large geographical region, and were shown to be correlated with westerly weather, Colebrook (1986) argued that these changes were being triggered by meteorological forcing.

Fig. 2. Principal Component Analysis on a matrix of geographical rectangles × species or taxa (22) in the eastern North Atlantic. (a) Scatter diagrams for the first two eigenvectors. Each point on this diagram represents one species.
Five species groups were identified by examination of the first three eigenvectors. The points were clustered on the basis of the ecological knowledge of the author. (b)–(d) Maps of the distribution of the first three principal components. From Colebrook, Glover & Robinson (1961a).

This type of PCA was re-applied in 1998 (Reid, Planque, & Edwards, 1998b) and 2001 (Reid & Beaugrand, 2002), and an example is presented for a set of zooplankton taxa in the North Sea (Fig. 3). A total of 28 taxa (Table 3), which were abundant and did not have a high frequency of zeros during the period 1958–1999, were selected. Scatter plots of the first two eigenvectors are shown (Fig. 3(a)) as well as long-term changes in the associated principal components (Fig. 3(b) and Fig. 3(c)). Groups of years, distinguished by a Cluster Analysis (Lance & Williams, 1967; hierarchical agglomerative flexible algorithm), are indicated.

Fig. 3. (a) Ordination by PCA of the 28 species or taxa listed in Table 3 in the plane of the two first principal components (50.5% of the total variability). (b) Year-to-year changes in the first principal component. (c) Year-to-year changes in the second principal component. Periods detected by a Cluster Analysis using the flexible algorithm of Lance and Williams (1967) are indicated. Overall, there is a good temporal connection with the exception of the years 1958 (period 1), 1975 (period 2), 1991 (period 5), 1993 (period 5).

Table 3. List of species used in a PCA to examine long-term change in zooplankton community structure in the North Sea with normalised eigenvectors 1 and 2. The correlation between each species or taxon and the corresponding principal components is indicated by (r). The coefficient of determination (r²) indicates the contribution of a species to the first two axes. Numbers (column 1) correspond to those shown in Fig. 3.

| No. | Name of taxon | Eigenvector 1: r | Eigenvector 1: r² | Eigenvector 2: r | Eigenvector 2: r² |
|---|---|---|---|---|---|
| 1 | Calanus I-IV | 0.497 | 0.247 | 0.763 | 0.583 |
| 2 | Pseudocalanus elongatus Adult | -0.401 | 0.161 | 0.667 | 0.445 |
| 3 | Para-Pseudocalanus spp. | -0.604 | 0.365 | 0.688 | 0.474 |
| 4 | Temora longicornis | -0.708 | 0.501 | -0.043 | 0.001 |
| 5 | Acartia spp. | -0.529 | 0.280 | -0.032 | 0.001 |
| 6 | Centropages typicus | -0.623 | 0.388 | -0.081 | 0.006 |
| 7 | Centropages hamatus | -0.445 | 0.198 | 0.061 | 0.003 |
| 8 | Oithona spp. | -0.270 | 0.073 | 0.844 | 0.713 |
| 9 | Corycaeus spp. | -0.743 | 0.552 | -0.208 | 0.043 |
| 10 | Calanus Total Traverse | 0.500 | 0.250 | 0.784 | 0.614 |
| 11 | Podon spp. | -0.547 | 0.300 | -0.283 | 0.080 |
| 12 | Evadne spp. | -0.517 | 0.267 | 0.289 | 0.083 |
| 13 | Limacina retroversa | -0.234 | 0.054 | 0.724 | 0.524 |
| 14 | Lamellibranchia larvae | -0.538 | 0.289 | 0.396 | 0.156 |
| 15 | Chaetognatha Traverse | -0.632 | 0.399 | 0.541 | 0.293 |
| 16 | Cyphonautes larvae | -0.571 | 0.326 | 0.420 | 0.176 |
| 17 | Echinoderm larvae | -0.607 | 0.369 | 0.004 | 0.000 |
| 18 | Larvacea | -0.571 | 0.326 | -0.334 | 0.111 |
| 19 | Calanus finmarchicus | 0.616 | 0.379 | 0.719 | 0.517 |
| 20 | Calanus helgolandicus | -0.633 | 0.401 | -0.138 | 0.019 |
| 21 | Decapoda larvae | -0.789 | 0.622 | -0.094 | 0.008 |
| 22 | Euphausiacea Total | 0.688 | 0.474 | 0.452 | 0.204 |
| 23 | Chaetognatha Eyecount | -0.515 | 0.265 | 0.503 | 0.253 |
| 24 | Harpacticoida Total | -0.580 | 0.336 | 0.002 | 0.000 |
| 25 | Metridia Total Traverse | -0.482 | 0.233 | 0.421 | 0.177 |
| 26 | Copepod nauplii | -0.425 | 0.181 | -0.334 | 0.112 |
| 27 | Cirripede larvae | -0.249 | 0.062 | -0.278 | 0.077 |
| 28 | Euphausiacea calyptopis | 0.367 | 0.135 | 0.055 | 0.003 |

The first principal component (Fig. 3(b), 30.2% of the total variance) shows there was a period of high values from 1962 to 1976 followed by one of low values from 1983 for both cold-water mixed oceanic and neritic species (e.g. Euphausiacea and C. finmarchicus), which were positively related to the first axis. A strong increase was detected during a cold period between 1978 and 1982 (see Fig. 3(a) and Table 3).
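As a minimal sketch of the calculation behind a table such as Table 3, the following Python fragment standardises a small years × taxa matrix, extracts the leading eigenpair of the correlation matrix by power iteration, and rescales the eigenvector by the square root of its eigenvalue to obtain the correlation r between each taxon and the first principal component. The abundance values, the taxon names, the function names and the choice of power iteration are all assumptions made for illustration; they are not the data or the software used for the analysis described here.

```python
import math

def standardise(column):
    """Centre a variable on zero and scale it to unit variance."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def correlation_matrix(z):
    """Correlation matrix (taxa x taxa) of already standardised variables."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]

def leading_eigenpair(matrix, iterations=1000):
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    vec = [1.0 / (k + 1) for k in range(len(matrix))]  # arbitrary start vector
    value = 0.0
    for _ in range(iterations):
        prod = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        value = math.sqrt(sum(x * x for x in prod))
        vec = [x / value for x in prod]
    return value, vec

# Hypothetical annual mean abundances (8 years) for three invented taxa.
taxa = {
    "taxon_a": [5, 7, 8, 6, 9, 10, 12, 11],
    "taxon_b": [4, 6, 7, 6, 8, 9, 11, 12],
    "taxon_c": [9, 8, 8, 7, 5, 4, 2, 3],
}
z = [standardise(values) for values in taxa.values()]
eigenvalue, eigenvector = leading_eigenpair(correlation_matrix(z))

# First principal component: one score per year.
pc1 = [sum(e * zj[t] for e, zj in zip(eigenvector, z)) for t in range(len(z[0]))]

# Correlation r of each taxon with PC1, and its square r2.
r = {name: e * math.sqrt(eigenvalue) for name, e in zip(taxa, eigenvector)}
r2 = {name: value ** 2 for name, value in r.items()}
explained = eigenvalue / len(taxa)  # share of the total variance on axis 1
```

In this toy example the two taxa that increase over time receive r values of one sign and the declining taxon receives the opposite sign, which is the kind of contrast read from the first axis in Table 3. Further axes would require deflation or a full eigendecomposition, which this sketch omits.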
For species negatively related to the first axis, the long-term changes showed the inverse pattern with an increasing trend followed by a significant decrease during the cold period (see Fig. 3(a) and Table 3). This pattern of variability was followed by the warmer-water, neritic or pseudo-oceanic species such as C. helgolandicus, Temora longicornis, Corycaeus spp. and decapod larvae. The second principal component (Fig. 3(c), 20.32% of the total variance) displays a decreasing trend for temperate neritic and pseudo-oceanic taxa such as Para-Pseudocalanus spp., Oithona spp., Limacina retroversa and colder-water taxa such as Calanus finmarchicus. These opposing trends led to a change in the ecosystem of the North Sea with a decrease in indicators of cold water and an increase in warmer-water pseudo-oceanic and neritic taxa. This confirms the trend discovered by Beaugrand, Reid, Ibañez, Lindley & Edwards (2002a), who found an increase in the abundances of warm-temperate and temperate species, which was associated with decreases in colder-water species. These changes have been linked to the climatic warming observed in the North-East Atlantic in recent decades.

3.1.2. Centred PCA at diel and seasonal scales

Until recently, little attention has been devoted to the analysis of spatial changes in pelagic diversity (Lindley, 1998; Beaugrand, Reid, Ibañez & Planque, 2000a) at all temporal scales. Studies have been carried out to examine in more detail spatial patterns of pelagic biodiversity at diel and seasonal scales (Beaugrand, Ibañez & Lindley, 2001; Beaugrand, Ibañez, Lindley, & Reid, 2002b). PCA was used to identify the spatial patterns in diversity (in terms of the number of taxa per CPR sample) of calanoid copepods and to detect major seasonal and diel patterns of change across the northern North Atlantic Ocean.

Fig. 4 shows (left) the first four eigenvectors and (right) monthly and diel changes of the corresponding principal components from January to December based on 40 years of CPR sampling (1958–1997). They represent a total explained variance of 63.0%. The monthly and diel plot of the first principal component (Fig. 4, PC1, 47.8%) shows that strong diel variations occurred throughout the year. These diel changes were more pronounced from April to October. Seasonal changes were also detected but were weaker than diel changes. The value and intensity of diel variations were clearly detected during winter. As the first eigenvector is only composed of positive values, high values (in red on the first map in Fig. 4) indicate where monthly and diel changes were strongest. This pattern occurred predominantly in the south-west sector of the North Atlantic Drift Province (Longhurst, 1998).

Fig. 4. Spatial, seasonal and diel changes in calanoid diversity in the northern North Atlantic. Mapping of the first four eigenvectors and monthly and diel changes in the corresponding principal components (PC 1-4). The symbol above each graph indicates midnight and the dashed lines between them denote midday. Modified, from Beaugrand et al. (2001).

PC2 (Fig. 4, 8.4%) shows that there were seasonal changes in species richness although diel changes were still detectable in almost all months. The diel changes were weaker in summer than in spring, autumn and winter. The corresponding eigenvector has both negative and positive values. High negative values should be negatively related to the signal displayed by PC2 and inversely related to high positive eigenvector values. Thus, in the northern part of the North Atlantic Drift Province, the southern part of the Atlantic Subarctic Province and the North Sea there were large seasonal changes in diversity with high values occurring mainly in summer and low diel changes.
In contrast, regions south of 50°N had high values for diversity, mainly in spring, and showed higher diel variations.

PC3 (Fig. 4, 4.6%) displays the contrast that exists between seasonal changes in diversity between spring, autumn and winter periods. High negative values of EV3 in the Bay of Biscay region indicate high diversity during spring and the high positive values reflect high diversity during autumn and winter in the Gulf Stream extension region. PC4 (Fig. 4, 2.2%) shows two seasonal maxima in March–April and in July–October. May, June and winter months are characterised by a lower value. Examination of EV4 shows that this pattern occurs off the Iberian coast.

Diel and seasonal changes were modelled by multiplying the first four principal components by their respective eigenvectors. Fig. 5 shows the seasonal and diel variability of calanoid diversity by re-estimation of the original matrix. The North Atlantic Drift Province (Longhurst, 1998) can be clearly divided into two parts; one to the south-west that is highly variable at a diel scale; the other to the north-east that is highly variable on a seasonal scale. Consideration of these two scales of variability gave better discrimination between regions, as a result of which new divisions of the North Atlantic Ocean and adjacent seas were proposed and new hypotheses about factors that contribute to the regulation of pelagic diversity suggested (Beaugrand, Ibañez & Lindley, 2001).

3.1.3. Three-mode Principal Component Analysis

This numerical technique has recently been applied to CPR data to investigate long-term changes in the community structure of pelagic ecosystems along the SA route. This route, which crosses the English Channel, the Celtic Sea and the Bay of Biscay, was divided into twenty sections ranging in length from 20 to 70 km, but which contained the same number of CPR samples (188 observations for each section, making a total of 3760 samples).
Selecting the most common phytoplankton and zooplankton species, a three-way table of the annual mean abundance of each taxon for each section and for each year over the period 1979–1995 was constructed. In oceanography, methods that allow the analysis of such complex tables are rare. A three-mode PCA was developed and applied in conjunction with cluster analysis (Beaugrand, Ibañez & Reid, 2000b). The calculation of a three-mode PCA is made in two stages. First, three ‘classical’ PCAs are performed on the matrices time-space × species (mode species), time-species × space (mode space) and space-species × time (mode time). Secondly, a core matrix, which establishes the interrelationships between each mode, is calculated from the three eigenvector matrices computed in the first step of the analysis.

Fig. 6 presents the results of this analysis, showing the regions identified (Fig. 6(a)) and the long-term changes from the three principal components, species-locations (Fig. 6(b), mode time), years-locations (Fig. 6(c), mode species) and species-time (Fig. 6(d), mode space). Five different zones, corresponding to a distinct interannual variability in plankton abundance, were identified (Fig. 6(a)–(b)). The zones were also characterised by distinct physical processes. It was even possible to detect the effects of the Ushant Front, which corresponded to zone 3. Significant negative correlations were detected between the NAO index, air temperature and the first principal component in the English Channel. Thalassionema nitzschioides, Nitzschia delicatissima and various zooplankton taxa mainly present in the English Channel showed interannual variability in abundance that differed from that in the Bay of Biscay (Fig. 6(c) and (d)). The first principal component in each mode was indicative of plankton abundance and showed a decrease between 1988 and 1991 in the English Channel (Fig. 6), a period that coincided with a high NAO index as well as the beginning of the 1989/1991 high-salinity anomaly (Becker & Dooley, 1995). Furthermore, especially in the northeast and central English Channel, higher abundances were observed at times of negative or low NAO values. At times of high and positive NAO indices, westerly winds are stronger throughout this area, and this may lead to an increase in mixing, which could delay the onset of water column stabilisation essential for the seasonal increase in net primary production (Dickson, Meincke, Malmberg, & Lee, 1988).

Fig. 5. (a) Intensity of the diel variability (as a percentage) in the diversity of calanoid copepods in the North Atlantic Ocean. (b) Intensity of the seasonal variability (as a percentage) in the diversity of calanoids. Redrawn, from Beaugrand et al. (2001).

Fig. 6. Interannual variability along the SA route. (a) Location of the SA route sampled by the CPR survey. The five regions detected by the analysis are superimposed. (b) Year mode. Variability of the first principal component (species-locations). The groups determined from the cluster analysis are indicated for species on the ordinate and for locations on the abscissa. (c) Species mode. Year-to-year variability in the first principal component (years-locations). The grey level indicates the intensity of the first component. The groups determined from the cluster analysis are indicated for years on the ordinate and for locations on the abscissa. (d) Location mode. Variability in the first principal component (species-years). The groups determined from the cluster analysis are indicated for species on the ordinate and for years on the abscissa. Redrawn, from Beaugrand, Ibañez & Reid (2000b). Z1: northern eastern English Channel; Z2: southern western English Channel; Z3: Ushant Front; Z4: Celtic Sea; Z5: Bay of Biscay. TN: Thalassionema nitzschioides; AC: Acartia spp.; CH: Calanus helgolandicus; PP: Para-Pseudocalanus spp.; ND: Nitzschia delicatissima; LI: Limacina spp.; CT: Centropages typicus; OI: Oithona spp.; CF: Ceratium fusus; CM: Ceratium macroceros; CC: Clausocalanus spp.

3.1.4. Non-metric multidimensional scaling (MDS)

MDS is a non-parametric ordination method that aims to project multidimensional space into a reduced number of dimensions, generally two. This analysis, which can be applied with almost any coefficient of association (see Legendre & Legendre, 1998), in contrast to PCA (Euclidean distance) or correspondence analysis (χ² distance), has been applied to CPR data by Lindley (1987), Lindley and Williams (1994) and Edwards (2000). The analysis presented here to illustrate the method is one that was performed by Lindley and Williams (1994) but the dendrogram and MDS scatter plot for this tow were not presented in that paper.

Plankton was sampled by a Continuous Plankton and Environmental Recorder (CPER) between Aberdeen and Grimsby in the North Sea along route LR. Four groups of plankton were recognised. One group occurred mainly in samples from Areas A (Aberdeen end) and C (unstratified water near Grimsby), the second occurred mainly in samples from Area A with a few from Area B (Grimsby), the third and largest group occurred throughout the tow, and the fourth was found mainly in Area B. MDS used in conjunction with cluster analysis (hierarchical agglomerative single-linkage clustering method) made it possible to identify three regions along the transect on the basis of their plankton composition (20 taxa were considered). Fig. 7 shows a clear separation between station 21 situated in unstratified water and other stations located in more stratified areas.
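The two ingredients named for this example, a Bray-Curtis coefficient (reported in the caption of Fig. 7) and single-linkage agglomerative clustering, can be sketched in a few lines of Python. This is an illustrative sketch only: the station counts are invented, the function names are hypothetical, and the MDS projection itself is not reproduced.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical)."""
    total = sum(x) + sum(y)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(x, y)) / total

def single_linkage(samples, n_groups):
    """Merge the two nearest clusters repeatedly until n_groups remain."""
    n = len(samples)
    dist = {(i, j): bray_curtis(samples[i], samples[j])
            for i in range(n) for j in range(i + 1, n)}

    def gap(c1, c2):
        # Nearest-neighbour rule: distance between the closest pair of members.
        return min(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)

    clusters = [[i] for i in range(n)]
    while len(clusters) > n_groups:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        a, b = min(pairs, key=lambda pq: gap(clusters[pq[0]], clusters[pq[1]]))
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters

# Hypothetical counts of four taxa at five stations along a transect.
stations = [
    [10, 8, 0, 1],
    [12, 9, 1, 0],
    [11, 7, 0, 2],
    [0, 1, 9, 12],
    [1, 0, 10, 11],
]
groups = single_linkage(stations, n_groups=2)
```

With these invented counts the first three stations fall into one group and the last two into another. A similarity coefficient, as quoted in the figure caption, is simply one minus the dissimilarity computed here.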
The cluster analysis grouped the northern and southern stations. The stress coefficient for the MDS plot was 0.08, indicating that the projection of the multidimensional space into two dimensions was satisfactory. This was also confirmed by the cluster analysis.

Fig. 7. Two-dimensional ordination of the 21 locations sampled by a Continuous Plankton and Environmental Recorder (CPER) between Aberdeen and Grimsby in the North Sea. A Bray-Curtis similarity coefficient was used and a cluster analysis was applied to group locations on the scatter plot.

3.2. Cluster analysis

Cluster analysis is a powerful multivariate tool that is used to group objects or descriptors. With the exception of probabilistic clustering methods (e.g. Clifford & Goodall, 1967), which necessitate a particular association coefficient (e.g. Goodall’s probabilistic coefficient), this technique can be applied to almost any distance or similarity matrix between objects or descriptors. The choice of the coefficient of association depends on the type (e.g. quantitative, semi-quantitative or qualitative data) and nature (abundance of species or presence/absence) of the data and on the hypothesis under study (Legendre & Legendre, 1998). Results from cluster analysis are often represented by means of a dendrogram.

3.2.1. Seriation

Before cluster analysis techniques became available, relationships between objects or descriptors were investigated by rearrangement of an association matrix. Colebrook (1964), Colebrook and Robinson (1964) and Colebrook (1969) applied this technique to study relationships between species and to detect species associations based on their geographical variation in abundance, or to examine geographical similarities in the interannual variability of a species (e.g. Temora longicornis in the North Sea; Colebrook, 1969).

3.2.2. Cluster analysis and ordination

Lindley (1987) was one of the first to apply cluster analysis (single hierarchical agglomerative clustering method) in conjunction with an ordination method. These two techniques were applied to investigate the distribution of 36 species of decapod larvae around the British Isles. The resulting dendrogram demonstrated the presence of seven groups of species, although the shape of the dendrogram clearly indicated a gradient in the distribution of larvae. The distributions of these decapods were explained by the interaction between the life histories of the organisms and bathymetric depth, which is quite important in the ecology of benthic organisms (Glémarec, 1973). The joint application of cluster analysis and an ordination method enables the deformation introduced by projecting the multidimensional space onto a two-dimensional scatter plot to be inspected visually. This procedure has been recommended by several authors (e.g. Legendre & Legendre, 1998).

3.3. Indicator-value method

The recently proposed ‘Indicator-value method’ (Dufrêne & Legendre, 1997) has been applied to calanoid copepods by Beaugrand, Ibañez, Lindley & Reid (2002b). This method enables species associations to be identified. Several steps are necessary to detect such associations. If the goal is to identify indicator species in an area, a cluster analysis is first applied in order to identify the regions. Alternatively, the regions can be determined a priori if the area under investigation is already well known. This can be done using any type of data (e.g. abundance, diversity, or abiotic factors). Then, a measure of specificity and a measure of fidelity must be calculated. The specificity $A_{ij}$ is the ratio of the average abundance of species i in the pixels of group j ($\mathrm{Nindividuals}_{ij}$) to the sum of the mean abundances of species i in all groups ($\mathrm{Nindividuals}_{i.}$):

$$A_{ij} = \frac{\mathrm{Nindividuals}_{ij}}{\mathrm{Nindividuals}_{i.}}$$
The fidelity $B_{ij}$ is the ratio of the number of pixels of group j in which species i is present ($\mathrm{Nsites}_{ij}$) to the total number of pixels in this group ($\mathrm{Nsites}_{.j}$):

$$B_{ij} = \frac{\mathrm{Nsites}_{ij}}{\mathrm{Nsites}_{.j}}$$

The indicator value ($\mathrm{INDVAL}_{ij}$) is computed by multiplying the specificity and fidelity indices, as these two quantities represent independent information:

$$\mathrm{INDVAL}_{ij} = A_{ij} \times B_{ij} \times 100$$

Dufrêne and Legendre (1997) retained the maximum indicator value for each species among all groups.

This method has been used with CPR data to derive species assemblages from calanoid diversity (108 taxa) (Beaugrand, Ibañez, Lindley & Reid, 2002b) as shown in Fig. 8. To take one example, the warm-temperate oceanic assemblage comprises 16 taxa. The boundary of this assemblage is quite sharp and its geographic coverage does not extend into water depths <200 m (Fig. 8(a)). The influence of the Oceanic Polar Front at about 52–53°N on the latitudinal distribution of this association is strong between the Northwest Corner (see Worthington, 1976) at 51°N, 44°W and the mid-Atlantic ridge. West of this, the latitudinal front becomes meridional and the association extends to the north to about 58°N south of Iceland and 55°N to the west of Ireland. To take another example, Fig. 8(c) and (d) shows a clear complementarity between the distribution of warm-water and cold-water species. At a lower level of distance in the dendrogram (not shown), warm-water species were divided into coastal, continental shelf, and pseudo-oceanic species (Fig. 9) while cold-water species were divided into cold-temperate, subarctic and arctic species associations (Fig. 10). The subtropical and warm-temperate oceanic and pseudo-oceanic species associations were clearly detectable in the path of the Gulf Stream extension (Fig. 8(e) and (f)).

Fig. 8. Spatial distribution of calanoid copepod assemblages in the northern North Atlantic. The maps show the percentage of occurrence of taxa per pixel for each assemblage defined at the partition level of 1.152 of a dendrogram (not shown). Abundance data were transformed into presence/absence data. Then, the percentage of present taxa per pixel for each assemblage was computed. An elevated percentage denotes a high degree of spatial aggregation of taxa inside an assemblage and vice versa. A blank pixel inside the survey area indicates the absence of species for a given group. (a) Warm-temperate oceanic species assemblage (16 species or taxa). (b) Bay of Biscay and southern European shelf-edge assemblage (4 species). (c) Temperate neritic and pseudo-oceanic species assemblage (12 species or taxa). (d) Cold-temperate, subarctic and arctic species assemblage (11 species or taxa). (e)–(f) Subtropical and warm-temperate species assemblage (25 species). Redrawn from Beaugrand, Ibañez, Lindley & Reid (2002b).

These species assemblages have been used to define in greater detail the ecosystems and ecotones of the North Atlantic Ocean and adjacent seas and to understand better the factors regulating diversity. Four modulating factors have been identified: 1. temperature, 2. hydrodynamics, 3. stratification, and 4. seasonal variability. These factors are often linked, but they can act at different scales, and their contributions can vary geographically. Moreover, this study clearly detected the influence of warm currents on diversity and hence the functional characteristics of ecotones west of Europe and in the Gulf Stream extension. Relationships between species associations and water masses or currents are strong. These assemblages may, therefore, represent an important environmental indicator for monitoring marine ecosystems and evaluating the impact of climate change.
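The indicator-value calculation of Section 3.3 (specificity multiplied by fidelity, times 100, with the maximum over groups retained for each species) can be sketched in a few lines. This is only an illustrative sketch: the function names `indicator_values` and `best_indicator`, the nested-list data layout and the toy abundances are assumptions made here, not part of the published analyses.

```python
def indicator_values(abundance, groups):
    """Illustrative IndVal sketch (names and data layout are assumptions).

    abundance[p][i] is the abundance of species i in pixel p;
    groups[p] is the group label (e.g. a cluster) of pixel p.
    Returns {group: [INDVAL of species 0, species 1, ...]} in percent.
    """
    n_species = len(abundance[0])
    labels = sorted(set(groups))
    mean_abund, fidelity = {}, {}
    for g in labels:
        rows = [row for row, lab in zip(abundance, groups) if lab == g]
        # Mean abundance of each species in the pixels of group g.
        mean_abund[g] = [sum(r[i] for r in rows) / len(rows) for i in range(n_species)]
        # Fidelity B_ij: share of the pixels of group g where species i is present.
        fidelity[g] = [sum(1 for r in rows if r[i] > 0) / len(rows) for i in range(n_species)]
    indval = {g: [] for g in labels}
    for i in range(n_species):
        total = sum(mean_abund[g][i] for g in labels)
        for g in labels:
            # Specificity A_ij: mean abundance in group g over the sum of group means.
            specificity = mean_abund[g][i] / total if total > 0 else 0.0
            indval[g].append(100.0 * specificity * fidelity[g][i])
    return indval


def best_indicator(indval):
    """For each species, keep the group with the maximum indicator value."""
    n_species = len(next(iter(indval.values())))
    return [max(indval, key=lambda g: indval[g][i]) for i in range(n_species)]


# Toy data: four pixels, three species, two groups (values are made up).
demo_abundance = [[4, 0, 2], [2, 0, 0], [0, 1, 2], [0, 3, 2]]
demo_groups = ["A", "A", "B", "B"]
demo_indval = indicator_values(demo_abundance, demo_groups)
demo_best = best_indicator(demo_indval)
```

In the toy data the third species occurs in both groups but is twice as abundant on average, and present in every pixel, in group B, so its indicator value is higher there.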
Other techniques combining clustering methods and Bayesian probabilities, used recently by Anneville, Souissi, Ibañez, Ginot, Druart and Angeli (2002), could be used with the CPR data.

Fig. 9. Spatial distribution of calanoid copepod assemblages in the northern North Atlantic. Subdivision of the temperate neritic and pseudo-oceanic species assemblages (see Fig. 8(c)) at the partition level of 0.5 of the dendrogram (not shown). Data were coded in two states: 0 when the abundance of the taxon was less than half the mean, and 1 when the abundance was more than half the mean. The percentage of species in one square was then calculated. This transformation allowed the main centres of the spatial distribution of taxa inside a group to be detected. Redrawn from Beaugrand, Ibañez, Lindley & Reid (2002b).

4. Geostatistics

4.1. Spatial interpolation

Selection of appropriate interpolation methods for spatial representation of plankton data is a key stage in making spatial and temporal comparisons of biological variables. The method selected should be rapid in calculation time and applicable to all species, both rare and abundant. Many methods exist to interpolate data (e.g. Lam, 1983). Two of the available methods have been used on CPR data. Planque (1996) applied the kriging procedure for the first time to CPR data. Kriging has the advantage that it takes into consideration spatial scales of change in ecological variability. This method allows a standard deviation of the interpolation error to be derived, whereas this is less obvious in the case of the alternative inverse squared distance method. Kriging was also used by Planque and Fromentin (1996) and has subsequently been applied successfully to abundant species and to the diversity of calanoid copepods (Planque & Ibañez, 1997; Beaugrand, 1999).
The total number of taxa identified per CPR sample has recently been mapped using this procedure (Beaugrand, 1999). Fig. 11 shows monthly changes in the total number of taxa identified per CPR sample. A clear contrast is seen between the total number of taxa in the eastern and western northern North Atlantic. High values were found during winter off Canada and in the Bay of Biscay. Then, high numbers of taxa per CPR sample progressively extended northwards until September.

Fig. 10. Spatial distribution of calanoid copepod assemblages in the northern North Atlantic. Subdivision of the cold-temperate, subarctic and arctic species assemblages (see Fig. 8(d)) at the partition level of 0.5 of the dendrogram (not shown). (a) Cold-temperate species assemblage (4). (b) Subarctic species assemblage (4). (c) Arctic species assemblage (3). Redrawn from Beaugrand, Ibañez, Lindley & Reid (2002b).

The use of kriging with CPR data is limited, however, by three main problems. First, there is the interpretation of the variogram and its approximation using a theoretical model. For each spatial interpolation, this step must be checked. Secondly, both geometric and zonal anisotropy must be corrected (Wackernagel, 1995). Thirdly, it is difficult to use kriging for rare species because of the high proportion of zeros in the matrices. In practice it is hard to verify all these parameters when a large number of maps is produced, so another method, called inverse squared distance, has been applied (Beaugrand, Reid, Ibañez & Planque, 2000a; Planque & Batten, 2000). It is simpler than kriging and gives similar results when the radius of interpolation is relatively small (i.e. less than 300 nautical miles). Fig. 12 shows the mean spatial distribution of some taxa using this technique. There are considerable differences in the distribution patterns of the illustrated taxa.
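The inverse squared distance method mentioned above can be sketched as follows. Several points are assumptions made for illustration only: coordinates are treated as planar and distances as Euclidean (real CPR maps would need distances on the sphere), "neighbours between 5 and 15" is read as a minimum number of samples required inside the search radius and a maximum number actually used, and the function name, arguments and toy values are hypothetical.

```python
import math


def inverse_squared_distance(observations, target, radius,
                             min_neighbours=5, max_neighbours=15):
    """Illustrative inverse squared distance estimate at one target node.

    observations: iterable of (x, y, value); target: (x, y).
    Assumes planar coordinates in the same unit as `radius`.
    Returns None when too few samples fall inside the search radius.
    """
    tx, ty = target
    # Distance of every observation to the target, nearest first.
    ranked = sorted((math.hypot(x - tx, y - ty), value) for x, y, value in observations)
    near = [(d, v) for d, v in ranked if d <= radius][:max_neighbours]
    if len(near) < min_neighbours:
        return None
    if near[0][0] == 0.0:
        # A sample sits exactly on the node: use it directly.
        return near[0][1]
    weights = [1.0 / d ** 2 for d, _ in near]
    return sum(w * v for w, (_, v) in zip(weights, near)) / sum(weights)


# Toy example: four samples at distance 1 and one at distance 2 (made-up values).
demo_obs = [(1, 0, 1.0), (-1, 0, 1.0), (0, 1, 1.0), (0, -1, 1.0), (2, 0, 5.0)]
demo_estimate = inverse_squared_distance(demo_obs, (0, 0), radius=3.0)
demo_too_few = inverse_squared_distance(demo_obs, (0, 0), radius=1.5)
```

Returning no value when fewer than the minimum number of neighbours is available leaves poorly sampled nodes blank rather than extrapolating into them, which is one plausible way to handle the heterogeneous sampling of the survey.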
The distribution of some taxa is complementary (e.g. Euchaeta norvegica and the group Para-Pseudocalanus). In the case of Metridia lucens, a higher abundance is seen in the pelagic ecotones situated west of the British Isles and also along the path of the Gulf Stream extension and the North Atlantic Current. Spatial interpolation techniques should be used with care in any application to the CPR data because of the spatial and temporal heterogeneity of CPR sampling.

Fig. 11. Seasonal changes in the total number of taxa per CPR sample. Kriging was applied to interpolate the data spatially (period 1958–1997, search radius = 200 nautical miles, neighbours between 5 and 15). Redrawn from Beaugrand (1999).

Fig. 12. Mean spatial distribution of some key species or taxa in the North Atlantic Ocean. An inverse squared distance method was used with a search radius of 250 nautical miles (neighbours between 5 and 15). Spatial interpolation was performed for each 2-month and 2-hour period (144 maps). The mean of all maps was then calculated.

4.2. Semi-variograms and cumulative semi-variograms

Planque (1996) used experimental semi-variograms to investigate the spatial scale of variability of C. finmarchicus and C. helgolandicus. Semi-variograms were modelled using a spherical model. Substantial year-to-year variability was found in the shape of the experimental semi-variograms. This variability was attributed to the sensitivity of the method to a small number of CPR samples. A clear spatial dependency in the abundance of Calanus species was found within a range of about 400 km. C. helgolandicus exhibited a more complex pattern from April to August, which was attributed to a multi-scale distribution pattern. In that case, spherical models were fitted to the experimental variograms to interpolate by kriging the abundance of the two species of Calanus for all months of the period 1958–1992. This same procedure was then repeated in subsequent studies for different periods at different scales (Planque & Fromentin, 1996; Fromentin & Planque, 1996). This technique probably helped to discover the well-known negative relationship between the state of the North Atlantic Oscillation and the abundance of C. finmarchicus (Fromentin & Planque, 1996).

However, classical experimental semi-variograms are highly sensitive to an irregular distribution of observations inside the spatial domain (Wackernagel, 1995; Sen, 1989). Sampling by the CPR survey is irregular, so an unbiased estimate of the experimental semi-variogram is difficult to obtain. Moreover, the choice of distance classes is quite arbitrary and may strongly influence the shape of the curve. This led Sen (1989) to propose a new way to evaluate the spatial dependence of observations for geological purposes. The principle of Sen’s cumulative semi-variograms is to calculate for each sampling point a semi-variogram based on geographical distances and dissimilarity between the particular sampling point and the others. Values for each semi-variogram are pooled, and then, as there are as many semi-variograms as there are points, it is possible to map the spatial dependence of the data. This gives an indication of the anisotropy of the regional variable and shows spatial changes in the scale of variability of the variable. Sen’s cumulative semi-variograms were applied to CPR data by Beaugrand and Ibañez (2002). Fig. 13 shows the result of applying the procedure to one month, using only data collected at night. The regional dependence in the path of the North Atlantic Current and in the north-eastern part of the North Atlantic Drift Province (see Longhurst, 1998) was low (400 km for a value of the Point Cumulative Semi-Variograms of 1000). This indicated that calanoid diversity varied at a small spatial scale. The regional dependence of diversity inside the two subtropical gyres was stronger (1200 km for a value of the Point Cumulative Semi-Variograms of 1000) and indicated a greater spatial variability in calanoid diversity.

Fig. 13. Application of Sen’s cumulative semi-variogram to show the spatial dependence of calanoid diversity in the northern North Atlantic. The value of the PCSV (Point Cumulative Semi-Variogram) was fixed at 1000. Blue indicates diversity changes at small spatial scales, while red indicates that diversity varies at a larger spatial scale. Main surface currents are superimposed. Redrawn from Beaugrand and Ibañez (2002).

5. Time series analysis

The objectives of this type of analysis are to describe and decompose time series and then to develop models that enable forecasting (Diggle, 1990). To date, most time series analyses of the CPR dataset have been used to describe seasonal, year-to-year and long-term changes in the abundance of species. Correlation and cross-correlation techniques have then been applied to identify environmental parameters responsible for the observed trends (e.g. Fromentin & Planque, 1996). Time series analyses applied to the CPR data are presented in Table 4.

5.1. Cumulative sums

This simple technique consists of graphically detecting local changes in a time series and assessing the intensity and duration of these changes (Ibañez, Fromentin, & Castel, 1993). The function is calculated by subtracting a reference value (i.e. the mean of the time series) from every value of the time series and then summing the residuals cumulatively (Ibañez et al., 1993). This function was applied to the CPR data with the objective of emphasising the relationships between community change and surface air temperature in the English Channel, the Celtic Sea and the Bay of Biscay (Beaugrand et al., 2000b, Fig. 6).
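The cumulative sums function described in Section 5.1 is short enough to sketch directly. The function name `cumulative_sums`, the optional `reference` argument and the toy series are illustrative assumptions; only the principle (subtract a reference value, usually the series mean, then accumulate the residuals) comes from the text.

```python
from itertools import accumulate


def cumulative_sums(series, reference=None):
    """Cumulative sum of deviations from a reference value.

    If no reference is given, the mean of the series is used.
    A falling curve marks a run of values below the reference,
    a rising curve a run above it; a turning point marks a local change.
    """
    if reference is None:
        reference = sum(series) / len(series)
    return list(accumulate(x - reference for x in series))


# Toy series with a step change after the third value (made-up numbers).
demo_series = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
demo_cusum = cumulative_sums(demo_series)
# Position of the minimum: the last value before the shift to higher values.
demo_turning_point = demo_cusum.index(min(demo_cusum))
```

When the series mean is used as the reference, the curve always returns to zero at the last point, so only its shape (slopes and turning points) carries information.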
A clear relationship was detected between changes in community structure (principal component year-location from a three-mode Principal Component Analysis calculated on the three-dimensional table, years × locations × species) and surface air temperature in the English Channel and the Celtic Sea (Fig. 14).

Table 4
Examples of time series analyses used to interpret CPR data
Types of analyses | Ecological goal or utility | Authors
Box and Jenkins models (AR, MA, ARMA, ARIMA) | Modelling of a time series and forecasting. | Rothschild (1998)
Multinomial logit model | This kind of Generalized Additive Model was applied to analyse the CPR data, using recorded values. This technique was utilised to reveal the seasonal, spatial and long-term variability in the abundance of species. | Beare & McKenzie (1999b)
Cumulative sums | Examination of local trends in a time series. | Beaugrand, Reid, Ibañez & Planque (2000a)
Polynomial regression | Determination of the trend in the abundance of Calanus finmarchicus, C. helgolandicus and environmental parameters. This type of regression is used to de-trend the different time series and to take into account temporal autocorrelation in the calculation of correlations (e.g. correlation between C. finmarchicus and the NAO index). | Fromentin and Planque (1996)
Eigenvector filtering | Decomposition of a signal. Smoothing of a time series and extraction of the trends. Emphasis of the major signal in the time series and quantification of temporal variability by the use of eigenvalues (Ibañez & Dauvin, 1988; Ibañez & Etienne, 1991). | Colebrook (1978, 1982b)
Power spectra | Assessment of the scales of variability of a variable. | Colebrook (1979)
Maximum Entropy Spectral analysis | Assessment of the scales of temporal variability of a variable. This was used on time series of sea-surface temperature, principal components and species abundance. This analysis is more adapted to short time series and measures for which a higher degree of error is expected than for classical spectral analysis (Legendre & Legendre, 1998). | Colebrook (1981, 1982a, 1985b, 1991)
Maximum entropy cross spectral analysis | Examination of the common patterns of temporal variability for two pairs of variables (e.g. total copepods and sea-surface temperature). This method uses both coherence and phase diagrams and determines relationships between variables for all possible scales of variability. | Colebrook (1985b, 1986, 1991)

Fig. 14. Cumulative sums of surface air temperature and the 2-dimensional principal component (years-locations) in the eastern English Channel. The figure shows the negative relationship between both variables. Redrawn from Beaugrand, Reid, Ibañez & Planque (2000a).

5.2. Eigenvector filtering (EVF) and power spectra

The Eigenvector filtering method, also known as Principal Component Analysis of Processes (Ibañez & Etienne, 1991) or singular-spectrum analysis (Vautard, Yiou, & Ghil, 1992), was used on CPR data by Colebrook (1978, 1982a). He used this method to smooth and emphasise the trend of plankton time series and applied it directly to abundance data and to principal components (Colebrook, 1978). In more recent years, EVF has seldom been used on CPR data. An application to the decomposition and quantification of the scales of temporal variability in the diversity of calanoid copepods is presented in Fig. 15. A time series was built up for the North Sea (0°E–10°W, 50°N–60°N) from 1958 to 1999 with the diversity (as the number of taxa per CPR sample) for each year (× 42), for daylight and dark periods (× 2) and for each two-month period (× 6). Hence, the length of the time series was 504 (2 × 6 × 42). The lag chosen for the Toeplitz matrix (autocovariance matrix) was 100. This lag was selected in order to eliminate long-term cycles and to emphasise the trend in the time series.
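The decomposition described above (a Toeplitz autocovariance matrix built for a chosen lag, its eigenvectors, and series rebuilt from each eigenvector and its principal component) can be sketched with NumPy in the manner of singular-spectrum analysis. This is a sketch under stated assumptions, not the procedure of Ibañez and Etienne (1991): the biased autocovariance estimator, the diagonal averaging used for reconstruction, the function names and the synthetic series are all choices made here, and the gain function is not computed.

```python
import numpy as np


def eigenvector_filtering(series, lag):
    """Illustrative EVF / singular-spectrum sketch (assumptions in the lead-in)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = x.size
    # Biased autocovariance estimates for lags 0 .. lag-1 (keeps the matrix
    # positive semi-definite).
    acov = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(lag)])
    # Toeplitz matrix: entry (i, j) is the autocovariance at lag |i - j|.
    offsets = np.abs(np.arange(lag)[:, None] - np.arange(lag)[None, :])
    toeplitz = acov[offsets]
    eigvals, eigvecs = np.linalg.eigh(toeplitz)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Lagged windows of the series, one window of length `lag` per row.
    windows = x[np.arange(n - lag + 1)[:, None] + np.arange(lag)[None, :]]
    pcs = windows @ eigvecs                    # principal components
    explained = 100.0 * eigvals / eigvals.sum()
    return x, explained, eigvecs, pcs


def reconstruct(pcs, eigvecs, k):
    """Rebuild the part of the series carried by eigenvector k (diagonal averaging)."""
    n_windows, lag = pcs.shape[0], eigvecs.shape[0]
    part = np.outer(pcs[:, k], eigvecs[:, k])  # part[t, j] contributes to x[t + j]
    total = np.zeros(n_windows + lag - 1)
    count = np.zeros(n_windows + lag - 1)
    for j in range(lag):
        total[j : j + n_windows] += part[:, j]
        count[j : j + n_windows] += 1.0
    return total / count


# Synthetic series: a cycle of period 6 plus a weak linear trend (made-up data).
t = np.arange(120)
demo = 10.0 + 3.0 * np.sin(2.0 * np.pi * t / 6.0) + 0.02 * t
centred, explained, eigvecs, pcs = eigenvector_filtering(demo, lag=24)
seasonal = reconstruct(pcs, eigvecs, 0) + reconstruct(pcs, eigvecs, 1)
```

Because the eigenvectors are orthonormal, summing the reconstructions over all eigenvectors returns the centred series, which gives a simple check that nothing is lost in the decomposition. The `explained` percentages here are shares of the eigenvalue sum of the Toeplitz matrix and need not match the variance percentages reported for Fig. 15.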
Fig. 15. Variability in the diversity of calanoid copepods from 1958 to 1999 for every two-month period and dark/daylight periods (504 points). (a) Autocorrelation function (99% and 95% confidence intervals are indicated). (b) Gain spectra assessed from the results of the EVF.

The autocorrelation function (Fig. 15(a)) of the time series was high, with a lag of one year between successive maxima, which clearly showed the strong effect of seasonality on the diversity of calanoid copepods. The gain function was also calculated according to the procedure described by Ibañez and Etienne (1991) (Fig. 15(b)). This function indicates the periods emphasised by the corresponding eigenvectors of the EVF. A high value in the gain function for long periods (infinity) indicates a trend, while a high value for small periods can often be attributed to noise. The gain function (Fig. 15(b)) shows that the first two axes of the decomposition (S1 and S2, 24.9% and 24.1% of the total variance of the time series, respectively) generated cycles with a period of about one year (the seasonal cycle). The third series (S3, 3.7%) emphasised the long-term variability of the time series, the fourth (S4, 2.8%) pointed out a cyclical trend with a period of approximately 16 years and the fifth series (S5, 2.8%) identified the day/night variation in diversity. Fig. 16 confirms the results of the gain function. Series 1 (the original time series reassessed by multiplying the first eigenvector by the first principal component) and 2 (the original time series reassessed by multiplying the second eigenvector by the second principal component) represented the seasonal cycle. Series 3 indicated the trend of the time series. This trend showed three peaks of high diversity in 1959, 1972 and 1990, which corresponded to warmer sea surface temperature. The low diversity in 1980 corresponds with the inflow of cold water into the North Sea.
Series 4 clearly shows a pseudo-cycle of about 16 years, evident from the gain function. Series 5 emphasised the day/night variation in diversity. Coefficients of variation calculated for each series indicated that the seasonal variability was more important than the year-to-year variability (Fig. 17). This result confirms the observations on diel and seasonal variability made by Beaugrand, Ibañez and Lindley (2001). The result also shows that it is important to take seasonal variability into account in the examination of calanoid copepod diversity. Diel variability is also important in relation to year-to-year variability and should also be considered (Fig. 17). This may also apply to abundance data.

Fig. 16. Series recalculated from the data presented in Fig. 15 using the first five eigenvectors. As the use of the second eigenvector gave a similar result to that of the first, it is not represented here. (a) First series: seasonal variability. (b) Third series: long-term trend. (c) Fourth series: cyclical variability (pseudo-period of about 16 years) with a slight influence of diel variability in diversity. (d) Fifth series: diel variability in the diversity of calanoid copepods.

5.3. Maximum entropy spectral and cross-spectral analyses

Colebrook was the first to apply Maximum Entropy Spectral and Cross-Spectral analyses to CPR data (Colebrook, 1981, 1982b, 1985a, 1991). Colebrook and Taylor (1984) used these techniques to analyse temporal variability in the abundance of plankton sampled on a monthly basis and physical data such as sea-surface temperature from 1948 to 1980.
Maximum Entropy spectral and cross-spectral analyses were used to determine the characteristic frequency of long-term variability in the abundance of plankton (first principal component from a standardised PCA on the matrix years × species) and to examine similarities between plankton and physical variables around the British Isles. Using coherence and phase spectra, these authors identified a number of characteristic periods (e.g. 10–12 years, 5–6 years, 3–4 years). Periods of 3–4 years were associated with surface-heat exchange phenomena.

Fig. 17. Quantification of temporal scales of variability in calanoid copepod diversity. The coefficient of variation was calculated for the first five series reconstructed from the EVF (see Fig. 16).

6. Conclusions

Considering the current number of years (43) and months (516) recorded from 1958 to 2000, for all species or taxa (about 450) and standard areas (33), about 7.7 million graphs would be needed to examine year-to-year and long-term changes in the seasonal cycle of each species or taxon in all standard areas. More than ever, multivariate analyses need to be used to extract relevant information contained in the database. This review has emphasised how important statistical analyses have been, and are likely to continue to be, in the interpretation of CPR data. There is a clear need to develop techniques to improve the sorting of information in the CPR database and to evaluate relationships between biological and environmental data. Environmental parameters are available from the web (e.g. temperature, CZCS data, salinity, wind speed, wind direction). Environmental tables could be gathered and compared with biological tables assembled from the CPR database. Techniques so far not applied to CPR data (e.g. Redundancy Analysis or Canonical Correspondence Analysis) may be appropriate. Other techniques based on probability distributions (e.g.
randomisation procedure for customised distributions, Poisson and Poisson-like distributions for rare species, Levy and log-Levy distributions for geometrically fractal distributions, Bayesian techniques for regional variables) could also help to assess relationships between biological and physical variables.

Acknowledgements

The authors are grateful to all past and present members and supporters of the Sir Alister Hardy Foundation for Ocean Science whose continuous efforts have allowed the long-term establishment and maintenance of the CPR dataset. We are particularly grateful to Philip C. Reid, Martin Edwards, Benjamin Planque, Arnold Taylor and the two referees for advice and comments on the manuscript. The research presented was supported by the European Community Research Project No. MAS3-CT98-5058, the Netherlands (contract RKZ595) and the French ‘Programme National en environnement côtier, thème: influence des facteurs hydroclimatiques ou anthropiques sur la variabilité spatio-temporelle des populations et écosystèmes marins’ (PNEC art 4).

References

Angel, M. V. (1994). Spatial distribution of marine organisms: patterns and processes. In P. J. Edwards, R. M. May, & N. R. Webb (Eds.), Large-scale ecology and conservation biology (pp. 59–109). Cambridge: Blackwell Scientific Publications.
Anneville, O., Souissi, S., Ibañez, F., Ginot, V., Druart, J. -C., & Angeli, N. (2002). Temporal mapping of phytoplankton assemblages in Lake Geneva: Annual and interannual changes in their patterns of succession. Limnology & Oceanography, 47, 1355–1366.
Beare, D. J., & McKenzie, E. (1999a). Connecting ecological and physical time-series: the potential role of changing seasonality. Marine Ecology Progress Series, 178, 307–309.
Beare, D. J., & McKenzie, E. (1999b). The multinomial logit model: a new tool for exploring Continuous Plankton Recorder data. Fisheries Oceanography, 8(Suppl. 1), 25–39.
Beaugrand, G. (1999).
Le programme Continuous Plankton Recorder (CPR) et son application à l’étude des changements spatiotemporels de la biodiversité pélagique en Atlantique nord et en mer du Nord. Océanis, 25, 417–433.
Beaugrand, G., & Edwards, M. (2001). Comparison in performance among four indices used to evaluate diversity in pelagic ecosystems. Oceanologica Acta, 24, 467–477.
Beaugrand, G., & Ibañez, F. (2002). Spatial dependence of pelagic diversity in the North Atlantic Ocean. Marine Ecology Progress Series, 232, 197–211.
Beaugrand, G., Ibañez, F., & Lindley, J. A. (2001). Geographical distribution and seasonal and diel changes of the diversity of calanoid copepods in the North Atlantic and North Sea. Marine Ecology Progress Series, 219, 205–219.
Beaugrand, G., Ibañez, F., Lindley, J. A., & Reid, P. C. (2002b). Diversity of calanoid copepods in the North Atlantic and adjacent seas: species associations and biogeography. Marine Ecology Progress Series, 232, 179–195.
Beaugrand, G., Ibañez, F., & Reid, P. C. (2000b). Long-term and seasonal fluctuations of plankton in relation to hydroclimatic features in the English Channel, Celtic Sea and Bay of Biscay. Marine Ecology Progress Series, 200, 93–102.
Beaugrand, G., Reid, P. C., Ibañez, F., Lindley, J. A., & Edwards, M. (2002a). Reorganisation of North Atlantic marine copepod biodiversity and climate. Science, 296, 1692–1694.
Beaugrand, G., Reid, P. C., Ibañez, F., & Planque, B. (2000a). Biodiversity of North Atlantic and North Sea calanoid copepods. Marine Ecology Progress Series, 204, 299–303.
Becker, G., & Dooley, H. (1995). The 1989/91 high salinity anomaly. Ocean Challenge, 6, 52–57.
Cassie, R. M. (1963). Multivariate analysis in the interpretation of numerical plankton data. New Zealand Journal of Science, 6, 36–58.
Clifford, H. T., & Goodall, D. W. (1967). A numerical contribution to the classification of the Poaceae. Australian Journal of Botany, 15, 499–519.
Colebrook, J. M. (1964). Continuous Plankton Records: a principal component analysis of the geographical distribution of zooplankton. Bulletins of Marine Ecology, 6, 78–100.
Colebrook, J. M. (1969). Variability in plankton. Progress in Oceanography, 5, 115–125.
Colebrook, J. M. (1978). Continuous Plankton Records: zooplankton and environment, north-east Atlantic and North Sea, 1948–1975. Oceanologica Acta, 1, 9–23.
Colebrook, J. M. (1979). Continuous Plankton Records: seasonal cycles of phytoplankton and copepods in the North Atlantic Ocean and the North Sea. Marine Biology, 51, 23–32.
Colebrook, J. M. (1981). Continuous Plankton Records: persistence in time-series of annual means of abundance of zooplankton. Marine Biology, 61, 143–149.
Colebrook, J. M. (1982a). Continuous Plankton Records: phytoplankton, zooplankton and environment, North-East Atlantic and North Sea, 1958–1980. Oceanologica Acta, 5, 473–480.
Colebrook, J. M. (1982b). Continuous Plankton Records: persistence in time-series and the population dynamics of Pseudocalanus elongatus and Acartia clausi. Marine Biology, 66, 289–294.
Colebrook, J. M. (1984). Continuous Plankton Records: relationships between species of phytoplankton and zooplankton in the seasonal cycle. Marine Biology, 83, 313–323.
Colebrook, J. M. (1985a). Continuous Plankton Records: overwintering and annual fluctuations in the abundance of zooplankton. Marine Biology, 84, 261–265.
Colebrook, J. M. (1985b). Sea surface temperature and zooplankton, North Sea, 1948 to 1983. Journal du Conseil. Conseil International pour l’Exploration de la Mer, 42, 179–185.
Colebrook, J. M. (1986). Environmental influences on long-term variability in marine plankton. Hydrobiologia, 142, 309–325.
Colebrook, J. M. (1991). Continuous Plankton Records: from seasons to decades in the plankton of the North-East Atlantic. In T. Kawasaki, S. Tanaka, Y. Toba, & A.
Taniguchi (Eds.), Long-term variability of pelagic fish population and their environments (pp. 29–45). Oxford: Pergamon Press. Colebrook, J. M., Glover, R. S., & Robinson, G. A. (1961a). Contribution towards a plankton atlas of the North-Eastern Atlantic and the North Sea. General introduction. Bulletins of Marine Ecology, 5, 67–80. Colebrook, J. M., John, D. E., & Brown, W. W. (1961b). Contribution towards a plankton atlas of the North-Eastern Atlantic and the North Sea. Part II: Copepoda. Bulletins of Marine Ecology, 5, 90–97. Colebrook, J. M., & Robinson, G. A. (1964). Continuous Plankton Records: annual variations of abundance of plankton, 1948–1960. Bulletins of Marine Ecology, 6, 52–69. Colebrook, J. M., & Taylor, A. H. (1984). Significant time scales of long-term variability in the plankton and the environment. Rapport et Procès-verbaux des réunions. Conseil International pour l’Exploration de la Mer, 183, 20–26. Dickson, R. R., Meincke, J., Malmberg, S. A., & Lee, A. J. (1988). The ‘Great Salinity Anomaly’ in the northern North Atlantic, 1968–1982. Progress in Oceanography, 20, 103–151. Diggle, P. J. (1990). Time series: a biostatistical introduction. Oxford: Clarendon Press. Dufrêne, M., & Legendre, P. (1997). Species assemblages and indicator species: the need for a flexible asymetrical approach. Ecological Monographs, 67, 345–366. Edinburgh Oceanographic Laboratory (1973). Continuous plankton records: A plankton atlas of the North Atlantic and the North Sea. Bulletins of Marine Ecology, 7, 1–174. Edwards, M. (2000). Large-scale temporal and spatial patterns of marine phytoplankton and climate variability in the North Atlantic. Ph.D. Thesis, Plymouth University, England. Edwards, M., Beaugrand, G., Reid, P. C., Rowden, A. A., & Jones, M. B. (2002). Ocean climate anomalies and the ecology of the North Sea. Marine Ecology Progress Series, 239, 1–10. Edwards, M., John, A. W. G., Hunt, H. G., & Lindley, J. A. (1999). 
Exceptional influx of oceanic species into the North Sea late 1997. Journal of the Marine Biological Association of the United Kingdom, 79, 737–739. Edwards, M., John, A. W. G., Johns, D. G., & Reid, P. C. (2001a). Case history and persistence of the non-indigenous diatom Coscinodiscus wailesii in the north-east Atlantic. Journal of the Marine Biological Association of the United Kingdom, 81, 207–211. Edwards, M., Reid, P. C., & Planque, B. (2001b). Long-term and regional variability of phytoplankton biomass in the northeast Atlantic (1960–1995). ICES Journal of Marine Science, 58, 39–49. Fromentin, J. -M., & Planque, B. (1996). Calanus and environment in the eastern North Atlantic. II. Influence of the North Atlantic Oscillation on C. finmarchicus and C. helgolandicus. Marine Ecology Progress Series, 134, 111–118. Glémarec, M. (1973). The benthic communities of the European North Atlantic continental shelf. Oceanography and Marine Biology Annual Review, 11, 263–289. G. Beaugrand et al. / Progress in Oceanography 58 (2003) 235–262 261 Glover, R. S. (1952). Continuous Plankton Records: the Euphausiacea of the north-eastern Atlantic and the North Sea. Hull Bulletins of Marine Ecology, 3, 185–214. Glover, R. S. (1957). An ecological survey of the Scottish herring fishery. Part II: the planktonic environment of the herring. Bulletins of Marine Ecology, 5, 1–43. Haury, L. R., & McGowan, J. A. (1998). Time-space scales in marine biogeography. Intergovernmental Oceanographic Commission Workshop Report, 142, 163–170. Hays, G. C. (1995). Diel vertical migration behaviour of Calanus hyperboreus at temperate latitudes. Marine Ecology Progress Series, 127, 301–304. Hays, G. C. (1996). Large-scale patterns of diel vertical migration in the North Atlantic. Deep-Sea Research I, 43, 1601–1615. Hays, G. C., Proctor, C. A., John, A. W. G., & Warner, A. J. (1994). 
Interspecific differences in the diel vertical migration of marine copepods: the implications of size, color, and morphology. Limnology and Oceanography, 39, 1621–1629. Hirst, A. G., & Batten, S. D. (1998). Long-term changes in the diel vertical migration behaviour of Calanus finmarchicus in the North Sea are unrelated to fish predation. Marine Ecology Progress Series, 171, 307–310. Ibañez, F., & Dauvin, J. -C. (1988). Long-term changes (1977 to 1987) in a muddy fine sand Abra alba—Melinna palmata community from the western English Channel: multivariate time series analysis. Marine Ecology Progress Series, 49, 65–81. Ibañez, F., & Etienne, M. (1991). Le filtrage des séries chronologiques par l’analyse en composantes principales de processus (ACPP). Journal de Recherche océanographique, 16, 27–33. Ibañez, F., Fromentin, J. -M., & Castel, J. (1993). Application de la méthode des sommes cumulées à l’analyse des séries chronologiques en océanographie. Comptes Rendus de l’Académie des Sciences de Paris, Sciences de la Vie, 316, 745–748. Jolliffe, I. T. (1986). Principal Component Analysis. New York: Springer-Verlag New York Inc. Kruskal, J. B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29, 1–27. Lam, N. S. N. (1983). Spatial interpolation methods: a review. American Cartography, 10, 129–149. Lance, G. N., & Williams, W. T. (1967). A general theory of classificatory sorting strategies. I. Hierarchical systems. Computer Journal, 9, 373–380. Legendre, P., & Legendre, L. (1983). Echantillonnage et traitement des données. In S. Frontier (Ed.), Stratégies d’échantillonnage en écologie (pp. 163–216). Paris: Masson. Legendre, P., & Legendre, L. (1998). Numerical Ecology. (2nd ed.). Amsterdam: Elsevier Science B.V. Levin, S. A. (1992). The problem of pattern and scale in ecology. Ecology, 73, 1943–1967. Lindley, J. A. (1987). 
Continuous Plankton Records: the geographical distribution and seasonal cycles of Decapod Crustacean larvae and pelagic post-larvae in the north-eastern Atlantic Ocean and the North Sea, 1981–3. Journal of the Marine Biological Association of the United Kingdom, 67, 145–167. Lindley, J. A. (1998). Diversity, biomass and production of decapod crustacean larvae in a changing environment. Invertebrate Reproduction and Development, 33, 209–219. Lindley, J. A., Roskell, J., Warner, A. J., Halliday, N. C., Hunt, H. G., John, A. W. G., & Jonas, T. D. (1990). Doliolids in the German Bight in 1989: evidence for exceptional inflow into the North Sea. Journal of the Marine Biological Association of the United Kingdom, 70, 679–682. Lindley, J. A., & Williams, R. (1994). Relating plankton assemblages to environmental variables using instruments towed by shipsof-opportunity. Marine Ecology Progress Series, 107, 245–262. Longhurst, A. (1998). Ecological Geography of the Sea. London: Academic Press. Lucas, C. E. (1940). Ecological investigations with the Continuous Plankton Recorder: the phytoplankton in the southern North Sea, 1932-1937. Hull Bulletins of Marine Ecology, 1, 73–170. Lucas, C. E. (1941). Continuous plankton records: phytoplankton in the North Sea, 1938–39. Part 1. Diatoms. Hull Bulletins of Marine Ecology, 2, 19–46. Lucas, C. E. (1942). Continuous plankton records: phytoplankton in the North Sea, 1938–39. Part II.—Dinoflagellates, Phaeocystis, etc. Hull Bulletins of Marine Ecology, 2, 47–70. Lundberg, P., Ranta, E., Ripa, J., & Kaitala, V. (2000). Population variability in space and time. Trends in Ecology and Evolution, 15, 460–464. Mann, K. H., & Lazier, J. R. N. (1996). Dynamics of marine ecosystems: biological-physical interactions in the oceans. (2nd ed.). Cambridge: Blackwell Science. Planque, B. (1996). Spatial and temporal fluctuations in Calanus populations sampled by the Continuous Plankton Recorder. Ph.D. 
Thesis, Université Pierre et Marie Curie, Paris, France. Planque, B., & Batten, S. D. (2000). Calanus finmarchicus in the North Atlantic: the year of Calanus in the context of interdecadal change. ICES Journal of Marine Science, 57, 1528–1535. Planque, B., & Fromentin, J. -M. (1996). Calanus and environment in the eastern North Atlantic. I. Spatial and temporal patterns of C. finmarchicus and C. helgolandicus. Marine Ecology Progress Series, 134, 101–109. Planque, B., Hays, G. C., Ibañez, F., & Gamble, J. C. (1997). Large scale spatial variations in the seasonal abundance of Calanus finmarchicus. Deep-Sea Research (I), 44, 315–326. 262 G. Beaugrand et al. / Progress in Oceanography 58 (2003) 235–262 Planque, B., & Ibañez, F. (1997). Long-term time series in Calanus finmarchicus abundance—a question of space? Oceanologica Acta, 20, 159–164. Rae, K. M., & Rees, C. B. (1947). Continuous plankton records: the Copepoda in the North Sea, 1938–1939. Hull Bulletins of Marine Ecology, 2, 95–132. Rees, C. B. (1952). Continuous Plankton Records: the decapod larvae in the North Sea, 1947–1949. Hull Bulletins of Marine Ecology, 3, 157–184. Reid, P. C., & Beaugrand, G. (2002). Interregional biological responses in the NorthAtlantic to hydrometeorological forcing. In K. Sherman, & H. -R. Skjoldal (Eds.), Changing states of the Large Marine Ecosystems of the North Atlantic (pp. 27–48). Amsterdam: Elsevier Science. Reid, P. C., Edwards, M., Hunt, H. G., & Warner, A. J. (1998a). Phytoplankton change in the North Atlantic. Nature, 391, 546. Reid, P. C., & Planque, B. (2000). Long-term planktonic variations and the climate of the North Atlantic. In D. Mills (Ed.), The ocean life of Atlantic salmon. Environmental and biological factors influencing survival (pp. 153–169). Bodmin: Fishing News Books. Reid, P. C., Planque, B., & Edwards, M. (1998b). Is variability in the long-term results of the Continuous Plankton Recorder survey a response to climate change? 
Fisheries Oceanography, 7, 282–288. Robinson, G. A. (1961). Contribution towards a plankton atlas of the North-Eastern Atlantic and the North Sea. Part I: phytoplankton. Bulletins of Marine Ecology, 5, 81–89. Rothschild, B. J. (1998). Year class strengths of zooplankton in the North Sea and their relation to cod and herring abundance. Journal of Plankton Research, 20, 1721–1741. Sen, Z. (1989). Cumulative semi-variogram models of regionalized variables. Mathematical Geology, 21, 891–903. Shepard, R. N. (1962). The analysis of proximities: multidimensional scaling with an unknown distance function. Psychometrika, 27, 125–139. Taylor, A. H., Colebrook, J. M., Stephens, J. A., & Baker, N. G. (1992). Latitudinal displacements of the Gulf Stream and the abundance of plankton in the North-East Atlantic. Journal of the Marine Biological Association of the United Kingdom, 72, 919–921. Vautard, R., Yiou, P., & Ghil, M. (1992). Singular-spectrum analysis: a toolkit for short, noisy chaotic signals. Physica D, 58, 95–126. Wackernagel, H. (1995). Multivariate geostatistics. An introduction with applications. Berlin: Springer-Verlag Berlin Heidelberg. Williamson, M. H. (1961). An ecological survey of a Scottish herring fishery. Part IV: changes in the plankton during the period 1949 to 1959. Appendix: a method for studying the relation of plankton variations to hydrography. Bulletins of Marine Ecology, 5, 207–229. Worthington, L. V. (1976). On the North Atlantic circulation. Oceanography studies, 6, 1–110.
