Group size versus individual group size frequency distributions: a nontrivial
distinction
Roger Jovani a,*, Roddy Mavor b,1
a Estación Biológica de Doñana, CSIC
b Seabird Monitoring Programme, JNCC
Keywords:
colony size
crowding
group living
group size
individual group size
seabird
Understanding group size variation is a major challenge in animal ecology. However, we argue that
understanding group sizes from an individual point of view (i.e. individual group sizes) and the relationship with population group sizes may be even more important. This may seem redundant, but in the
present study we show that it is not. We analysed colony sizes of 20 seabird species breeding in Britain
and Ireland from the Seabird 2000 project (19 978 colonies; 3 779 919 nests) comparing group (= colony)
size frequency distributions (GSFDs) with their individual group size frequency distribution (IndGSFD)
counterparts. We did so for the first time for a number of species with semilogarithmic plots, and
correlated eight statistics from each GSFD-IndGSFD pair. Shape-related variables (e.g. skewness) of GSFD-IndGSFD
pairs were largely unrelated, with only 1-15% of redundancy. In fact, species with similar GSFDs
had individuals concentrating in either the largest or the medium-sized groups. There was a trend
towards those species with higher group size variation having individuals living in a narrower range of
group sizes. Some group size-related measures (e.g. mean group size) showed a tight linear correlation in
log-log scatterplots between GSFDs and IndGSFDs. However, this correlation disappeared in linear
scatterplots for two of the four measures. Moreover, group size-related measures were always a poor
surrogate of corresponding individual group size measures. We discuss how animal grouping research
could benefit from similar comparisons between GSFDs and IndGSFDs and how this can be carried out in
a meaningful way.
Most animals live in groups either temporarily or permanently.
Group size shapes the cost/benefit payoff of group living, with some
group sizes often conferring higher fitness than others (Krause &
Ruxton 2002). However, empirical and modelling approaches
have shown that even when there is a clear peak in the fitness
function of group sizes (i.e. there is an ‘optimal’ group size), a huge
variation in group sizes still tends to exist. After decades of study,
understanding this variation remains an unsolved challenge in
animal ecology research (Giraldeau & Caraco 2000; Gerard et al.
2002; Krause & Ruxton 2002; Safran et al. 2007; Sumpter 2010).
A major driver of this research agenda has been the description of
group size frequency distributions (hereafter GSFDs; e.g. Götmark
1982; Wirtz & Lörscher 1983; Brown et al. 1990; Stacey & Koenig
1990; Avilés & Tufiño 1998; Krause & Ruxton 2002; Jovani & Tella
2007; Serrano & Tella 2007; Jovani et al. 2008a,b). These studies
examined group sizes from a population point of view. However,
group sizes can be viewed from an individual point of view as well.
* Correspondence: R. Jovani, Department of Evolutionary Ecology, Estación Biológica de Doñana, CSIC, Américo Vespuccio s/n, E-41092 Sevilla, Spain.
E-mail address: jovani@ebd.csic.es (R. Jovani).
1 R. Mavor is at the Seabird Monitoring Programme, JNCC, Inverdee House, Baxter Street, Aberdeen AB11 9QA, U.K.
Describing individual group size selection, the reasons behind these
choices and their constraints has proved to be a powerful mechanistic
approach to explaining population group size patterns (Brown &
Brown 2000; Safran 2004; Safran et al. 2007; Serrano & Tella 2007;
Jovani et al. 2008b). However, surprisingly few studies have analysed, per se, individual group size frequency distribution patterns
(IndGSFD; but see Jarman 1974; Wirtz & Lörscher 1983; Wesołowski
et al. 1985; Reiczigel et al. 2005, 2008). An illustrative example of this
uneven attention to GSFDs versus IndGSFDs is the book on cooperative breeding in birds edited by Stacey & Koenig (1990) in which 14 of
18 chapters (each covering a study species) show a histogram of the
GSFD of the population, but only one chapter (Emlen 1990) shows
both the GSFD and the IndGSFD. This previous lack of attention paid to
IndGSFDs could be because the properties (e.g. mean) of GSFDs and
their IndGSFD counterparts are biologically redundant, thus presenting only a mathematical subtlety without biological relevance. In
fact, some evidence would suggest that this might be the case.
Figure 1. Semilogarithmic group (colony) size frequency distributions (in black; left Y axis, number of groups) and corresponding individual frequency distributions (in grey; right Y axis, number of individuals, i.e. nests) for 20 seabird
species breeding in Britain and Ireland, shown in panels (a)-(t) against group size. Logarithmic bins of the form $[X^n, X^{n+1} - 1]$ with n = 0, 1, 2, 3, ... are used; for instance, for X = 2, bins are [1-1], [2-3], [4-7], [8-15]. The X
axis shows the logarithmic midpoint of the bin (i.e. $10^{(\log(\text{minimum group size of the bin}) + \log(\text{maximum group size of the bin}))/2}$), and the linear Y axis shows the number of groups (or individuals for IndGSFDs) for each bin.
First, a given GSFD has a unique IndGSFD counterpart. For instance,
in a hypothetical population of 16 individuals distributed among five
groups of sizes 2, 2, 3, 4 and 5, specific individuals will be present in
(i.e. experience) groups of sizes 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5 and 5
(individual group sizes). Thus, GSFD-IndGSFD pairs are completely
interlocked, and therefore potentially redundant. Second, although the
mean of an IndGSFD is always larger than that of its GSFD counterpart
(Preston 1948, 1962; Lloyd 1967), the distinction between the two
may be biologically meaningless. For instance, in the above example,
mean group size is (2 + 2 + 3 + 4 + 5)/5 = 3.2 and the mean
individual group size is (2 + 2 + 2 + 2 + 3 + 3 + 3 + 4 + 4 + 4 +
4 + 5 + 5 + 5 + 5 + 5)/16 = 3.625; surely not a large difference in
biological terms. Finally, Lloyd (1967) showed that the mean of an
IndGSFD exceeds the mean of its GSFD by exactly the variance/mean
ratio of its GSFD, thus showing that one is the trivial predictable
outcome of the other [e.g. in the above example mean
IndGSFD = 3.2 + (1.36/3.2) = 3.625]. Moreover, Iwao (1968) and
recently Reiczigel et al. (2005, 2008) have shown a very tight linear
correlation between log(mean GSFD) and log(mean IndGSFD) across
different taxa, suggesting that mean GSFDs and mean IndGSFDs hold
essentially the same biological information.
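The hypothetical population above can be reproduced in a few lines. The following is a minimal Python sketch, not the authors' code (the study itself used MatLab); the function name `individual_group_sizes` is our own. It expands a list of group sizes into one entry per individual and checks Lloyd's (1967) relationship using the population variance.

```python
from statistics import fmean, pvariance

def individual_group_sizes(group_sizes):
    """Expand group sizes into one entry per individual (the IndGSFD data)."""
    return [size for size in group_sizes for _ in range(size)]

groups = [2, 2, 3, 4, 5]                  # hypothetical population from the text
individuals = individual_group_sizes(groups)

mean_group = fmean(groups)                # mean of the GSFD
mean_individual = fmean(individuals)      # mean of the IndGSFD

# Lloyd (1967): mean IndGSFD = mean GSFD + (variance / mean) of the GSFD,
# where the variance is the population (not sample) variance.
lloyd_prediction = mean_group + pvariance(groups) / mean_group

print(mean_group, mean_individual, lloyd_prediction)
```

Under these assumptions the two routes to the mean individual group size agree, which is the sense in which a GSFD and its IndGSFD are interlocked.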
However, we show here that GSFD measures should not be used
as surrogates of corresponding IndGSFD measures, and that the
direct study of IndGSFDs combined with GSFDs can reveal interesting
nonredundant information about group living. First, understanding
IndGSFDs may be biologically even more important than understanding group size variation. This is because most of the processes
shaping the ecology and evolution of species (natural selection/
demography) have the individual rather than the group as the unit.
For instance, if breeding success is lowered at large group sizes
(negative density dependence), an important measure of the impact
of these processes upon population demography will not be the
proportion of large/small group sizes in the population, but rather the
proportion of individuals breeding within such group sizes.
Second, contrary to the evidence stated above, IndGSFDs may
not yield redundant information about their GSFD counterparts.
This is because natural GSFD patterns do not all follow the same
ideal theoretical distribution, but are considerably more complex.
For instance, in a previous study we showed that similarly shaped
GSFDs from 20 seabird species when plotted in standard histograms hide contrasting patterns that are unravelled when the same
data are plotted with logarithmic bins (Jovani et al. 2008a). Thus,
we predicted that GSFDs with different combinations of skewness,
variability or maximum group sizes could have nontrivial impacts
on their IndGSFDs. We reanalysed this seabird data set by
comparing GSFDs and their IndGSFD counterparts.
METHODS
We built on a previous study by Jovani et al. (2008a) in which we
analysed the colony sizes (here also called group sizes) of seabird
species breeding in Britain and Ireland. This is a data set from
Seabird 2000, a collaboration between the Joint Nature Conservation Committee, U.K., and the Royal Society for the Protection of
Birds, U.K. The project involved over 1000 surveyors following
detailed instructions for the census of each seabird species. No less
important was the meticulous checking during the process of data
entry, both by routine quality control by the Recorder 2000 software, and later by data entry personnel. The result is the
highest-quality data on a snapshot (mainly 1998-2002) of bird
colony sizes for a large area, and possibly the largest data set on
animal group sizes considering different species in a large area.
Overall, it covers 20 seabird species, 19 978 colonies and 3 779 919
nests. For further details of Seabird 2000 see Mitchell et al. (2004),
and of the data set analysed here see Jovani et al. (2008a).
Plotting Frequency Distributions
The data set of individual group sizes for each species was
created from their GSFDs as explained in the hypothetical example
in the Introduction, that is, with one value (the colony size in which
the breeding pair was nesting) for each breeding pair of the species.
Our unit of measure is typical in bird coloniality studies, that is, the
breeding pair (the nest), and thus we used the nest as the ‘individual’ of IndGSFDs to lend comparability to other studies on group
living. IndGSFDs were plotted in semilogarithmic plots (Fig. 1)
following the same procedures as for GSFDs detailed in Jovani et al.
(2008a), where we used Preston’s (1962) methods with slight
modifications (see Fig. 1 and Pueyo & Jovani 2006 for details).
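As an illustration of the binning described in the Fig. 1 caption, the sketch below assigns each group to a logarithmic bin [X^n, X^(n+1) - 1] and tallies both groups and individuals per bin. It is a hedged Python approximation with names of our own choosing (`log_bin_index`, `log_binned_counts`, `log_midpoint`); it does not reproduce the modifications of Preston's method used for the published figure.

```python
import math
from collections import Counter

def log_bin_index(size, base=2):
    """Index n of the bin [base**n, base**(n+1) - 1] that contains `size` (size >= 1)."""
    n = 0
    while base ** (n + 1) <= size:
        n += 1
    return n

def log_binned_counts(group_sizes, base=2):
    """Return {n: (groups_in_bin, individuals_in_bin)} for every occupied bin."""
    groups = Counter()
    individuals = Counter()
    for size in group_sizes:
        n = log_bin_index(size, base)
        groups[n] += 1            # GSFD: each group counts once
        individuals[n] += size    # IndGSFD: each member counts once
    return {n: (groups[n], individuals[n]) for n in sorted(groups)}

def log_midpoint(n, base=2):
    """Logarithmic midpoint of bin n: 10**((log10(min) + log10(max)) / 2)."""
    lo, hi = base ** n, base ** (n + 1) - 1
    return 10 ** ((math.log10(lo) + math.log10(hi)) / 2)

counts = log_binned_counts([2, 2, 3, 4, 5])   # hypothetical population from the Introduction
```

Integer arithmetic is used to find the bin so that sizes lying exactly on a bin boundary are not misplaced by floating-point logarithms.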
GSFD and IndGSFD Statistics
Seabird GSFDs are clearly not Gaussian distributions, but show
distributions closer to log-normal and power laws (Jovani et al.
2008a). Thus, parametric measures such as the mean (and even log-normal measures such as the geometric mean) are not the most
appropriate. The only parametric measure used was the mean to
compare our results with those of Iwao (1968) and Reiczigel et al.
(2005, 2008). Overall, we calculated eight statistics from each GSFD
and IndGSFD. Our aim was to achieve a general description of the
characteristics of GSFDs and IndGSFDs to be able to compare these
two ways of looking at group size frequency distributions. For the size
of the groups (or of individual group sizes) of each species we
calculated the 5th percentile, the median, the mean and the 95th
percentile. Minimum and maximum group sizes and individual group
sizes were not measured because they are, by definition, the same for
GSFD-IndGSFD pairs. To characterize the shape of the distributions
we calculated the skewness (a measure of asymmetry), fit of GSFDs
and IndGSFDs to a log-normal distribution as measured by the Kolmogorov-Smirnov statistic, kurtosis (a measure of ‘peakedness’
around the mean), and population variability, a nonparametric
counterpart of the coefficient of variation (CV) which quantifies the
mean deviation of all group size pairs within populations (see Heath
2006 for details). Statistics were calculated with standard MatLab
(MathWorks, Natick, MA, U.S.A.) functions applied to nontransformed
group (and individual group) sizes. The fit to a log-normal distribution
was calculated with log-transformed group sizes and individual
group sizes. Population variability was calculated by modifying the
code in version 1.1 of the variability calculator for MatLab by Heath
(2006). All measures along with the data necessary to plot the
frequency distributions were retrieved from the MatLab algorithm
freely available from the Supplementary Material.
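To make some of these statistics concrete, here is a rough Python sketch applied to the hypothetical population from the Introduction. It is not the Supplementary Material algorithm: the percentile uses a simple nearest-rank rule (MatLab's interpolation may give slightly different values), the skewness is the plain moment-based form, and `population_variability` is our reading of Heath's (2006) measure as the mean of 1 - min/max over all pairs, which should be checked against that paper before use.

```python
import math
from itertools import combinations
from statistics import fmean

def nearest_rank_percentile(values, p):
    """p-th percentile by the nearest-rank rule (an assumption, not MatLab's method)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

def skewness(values):
    """Moment-based skewness: third central moment / (second central moment)**1.5."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m3 = fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5

def population_variability(values):
    """Mean proportional difference, 1 - min/max, over all pairs (assumed form of PV)."""
    return fmean([1 - min(a, b) / max(a, b) for a, b in combinations(values, 2)])

groups = [2, 2, 3, 4, 5]                                  # GSFD data
individuals = [s for s in groups for _ in range(s)]       # IndGSFD data

for label, data in (("GSFD", groups), ("IndGSFD", individuals)):
    print(label,
          nearest_rank_percentile(data, 5),
          nearest_rank_percentile(data, 50),
          nearest_rank_percentile(data, 95),
          round(skewness(data), 3),
          round(population_variability(data), 3))
```

Even in this toy case the IndGSFD has a lower skewness than its GSFD, because larger groups contribute more entries to the individual-level data.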
Figure 1 (continued). Bars give the number of groups (or of individuals for IndGSFDs) in each bin. Note that all black bars must have their corresponding grey bar below, but because of the highly right-skewed distributions there are some grey bars that
are too narrow to be visualized, e.g. in (g). Note the log scale only in the X axis. (a) Uria aalge; (b) Rissa tridactyla; (c) Fulmarus glacialis; (d) Alca torda; (e) Sterna paradisaea; (f)
Hydrobates pelagicus; (g) Fratercula arctica; (h) Phalacrocorax aristotelis; (i) Chroicocephalus (= Larus) ridibundus; (j) Phalacrocorax carbo; (k) Larus argentatus; (l) Cepphus grylle; (m)
Larus canus; (n) Sterna albifrons; (o) Sterna hirundo; (p) Larus fuscus; (q) Puffinus puffinus; (r) Larus marinus; (s) Stercorarius skua; (t) Stercorarius parasiticus.
We used the Pearson product-moment correlation coefficient to
calculate the linear correlation of each statistic between each
GSFD-IndGSFD pair across the 20 analysed species. Although it is
impossible to determine accurately whether our data (20 values for
each statistic) follow a normal distribution, we used Pearson instead
of rank correlation coefficients (e.g. Spearman correlation) because
the latter do not test for the tightness of the correlation to a linear
one (which is what we wanted to test) but rather for how consistently
y increases with x. From the scattering of
data in Figs 2 and 3, we thought Pearson correlations were better
suited for Fig. 3, and thus we interpreted Pearson correlations from
Fig. 2 with caution. Pearson correlation r = 1 (or -1) would indicate
that all data fall along the linear trend line fitted to the data, and
values closer to 0 would indicate a complete scatter of values around
the fitted trend line. The coefficient of determination, R2 (r squared),
was calculated as a measure of the variance in each IndGSFD statistic
(e.g. median individual group size) explained as a linear function of
the corresponding GSFD counterpart (e.g. median group size of the
population), that is, a measure of the redundancy, with r = 1 meaning
that GSFD-IndGSFD pairs provide essentially the same information
about the grouping patterns of the species.
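The correlation step can be sketched as follows. This is an illustrative Python version with a hand-written `pearson_r` helper (our name), applied to the rounded mean group sizes and mean individual group sizes listed in Table 2; because those values are rounded, the result should only be expected to come out close to, not identical with, the raw-data r reported for the mean in Table 1.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Mean group size (GSFD) and mean individual group size (IndGSFD) per species,
# in the row order of the "Mean" block of Table 2.
mean_gsfd = [3, 9, 13, 15, 18, 23, 32, 33, 51, 52, 52, 117, 166, 181, 210,
             613, 845, 1226, 1534, 6269]
mean_indgsfd = [22, 142, 994, 50, 67, 217, 3767, 310, 186, 740, 1565, 8005,
                2728, 3055, 4424, 3894, 12191, 33250, 14856, 82432]

r_raw = pearson_r(mean_gsfd, mean_indgsfd)
r_squared_raw = r_raw ** 2                      # coefficient of determination
r_log = pearson_r([math.log10(v) for v in mean_gsfd],
                  [math.log10(v) for v in mean_indgsfd])

print(round(r_raw, 3), round(r_squared_raw, 3), round(r_log, 3))
```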
RESULTS
GSFDs versus IndGSFDs Histograms
Figure 1 (black bars) shows the same GSFDs as those previously
reported in Jovani et al. (2008a). These are semilogarithmic plots in
which a distribution with a Gaussian shape thus corresponds to a log-normal distribution. Seabirds in Britain and Ireland show contrasting
GSFDs, from clear log-normal distributions (e.g. Fig. 1a, b; Jovani et al.
2008a) to very skewed log-normal distributions (Fig. 1r-t; following
power laws as detailed in Jovani et al. 2008a). However, species show
different combinations of kurtosis and skewness (Figs 1, 2) so that many
of the distributions depart from neat log-normals (e.g. Fig. 1g, k). This is
important because if all species followed the same distribution,
IndGSFDs would be easy to predict from their GSFDs (e.g. compare Fig. 1a
and b). However, Fig. 1 shows that this is not so trivial for real animal
grouping patterns. For instance, GSFDs in Fig. 1d and g are similar, but
their IndGSFDs (grey bars in Fig. 1) are very different, while Fig. 1g and p
show contrasting GSFDs but similar IndGSFDs. Note that these patterns
remain hidden when we plot the same data in linear (standard)
histograms (Appendix Fig. A1). This apparent lack of a general rule
linking GSFD-IndGSFD pairs leads us to ask whether GSFD-IndGSFD
pairs provide redundant or complementary information.
GSFD versus IndGSFD Statistics
Kurtosis and population variability showed a negative correlation between IndGSFDs and GSFDs (ca. -0.4), explaining ca. 15% of
variance (Fig. 2, Table 1). This did not reach statistical significance,
possibly because of low sample size, but also note the potential
effect of outliers. In any case, the scattering of data around the trend
line was considerable. In general, the shape of the IndGSFDs was
not redundant with their GSFDs: only 1-15% of the variance in
IndGSFD characteristics was explained by corresponding GSFD
characteristics (Fig. 2, Table 1).
As expected, all IndGSFDs were more left skewed (with lower
skewness values) than their GSFD counterparts (Fig. 2). This is
because the mean IndGSFD is constrained to being larger than the
mean GSFD (see above), and thus the left tail of IndGSFDs extends
further than in the corresponding GSFDs (Fig. 1). However, skewness of
IndGSFDs and GSFDs was uncorrelated across species (Table 1, Fig. 2).
Figure 2. Correlation of shape-related statistics describing group size frequency distributions (X axis) and their corresponding individual group size frequency distributions (Y axis) for the 20
seabird species studied. (a) Skewness, (b) Kolmogorov-Smirnov, (c) kurtosis and (d) population variability. Species codes are the same as in Fig. 1.
Figure 3. Correlation of log(size-related statistics) describing group size frequency distributions (X axis) and their corresponding individual group size frequency distributions (Y axis). See Table 1
for correlation statistics. Species codes are the same as in Fig. 1. (a) 5th percentile, (b) median, (c) mean and (d) 95th percentile. Grey vertical lines show the potential range of
log(individual group size) values for a species according to its group size frequency distribution (see Discussion for more details). For instance, since the mean group size of species t
was 3.3 and its maximum group size was 107, this species could only have a mean individual group size between 3.3 and 107, and shows an intermediate empirical value of 22.
However, species q could have values between 6269.3 and 120 000 and shows a value close to its potential maximum (82 431.7).
The other three shape-related characteristics showed either higher or lower values in GSFDs than in IndGSFDs (Fig. 2). The fit
to a log-normal distribution was uncorrelated between GSFDs and
IndGSFDs. In other words, any combination was possible. This is easy
to visualize comparing Fig. 1a, g and r. In Fig. 1a, a neat log-normal
GSFD leads to a slightly left-skewed log-normal IndGSFD, but in
Fig. 1g a slight departure from a log-normal shape of the GSFD
produces a highly skewed IndGSFD. The opposite occurs in Fig. 1r.
Group size-related measures were also analysed for their correlation between GSFDs and corresponding IndGSFDs. This was done
for raw variables and also for their logarithms (to compare with
Reiczigel et al. 2008), because these are two approaches that give
complementary information. Mean and 95th percentile group sizes
were highly correlated in both linear and log-log plots (Fig. 3,
Table 2, Appendix Fig. A2), with 75-95% of redundancy (Table 1).
However, despite this strong linear correlation, mean and 95th
percentile group sizes were very different between GSFDs and corresponding IndGSFDs (Table 2). Median and 5th percentile group
sizes were significantly correlated in log-log plots but not when raw
data were analysed (compare Fig. 3 and Fig. A2, Table 1), and raw
data were highly different between GSFD and IndGSFDs (Table 2).

Table 1
Pearson correlation coefficients (r), coefficient of determination (i.e. variance
explained by the linear model, R2), and P values for each graph in Figs 2, 3 and A2

                              Raw data                    Log (data)
Statistic                     r        R2      P          r        R2      P
Shape-related statistics
  Skewness                    0.112    0.013   0.639
  Kolmogorov-Smirnov          0.219    0.048   0.354
  Kurtosis                    -0.388   0.151   0.091
  Population variability      -0.392   0.154   0.087
Size-related statistics
  5th percentile              0.201    0.041   0.395      0.610    0.372   <0.001
  Median                      0.179    0.032   0.449      0.734    0.539   <0.001
  Mean                        0.973    0.950   <0.001     0.903    0.815   <0.001
  95th percentile             0.865    0.749   <0.001     0.838    0.702   <0.001

DISCUSSION
Our results show the first comparison of GSFD versus IndGSFD
in semilogarithmic histograms. They have revealed a nontrivial
relationship between the group sizes of a population and the group
sizes in which individuals live, something difficult to appreciate in
standard histograms (compare Fig. 1 and Fig. A1). This challenges
previous evidence suggesting the redundancy of this double
approach to animal group sizes (see Introduction).

Table 2
Group size-related measures for group size frequency distributions (GSFDs) and corresponding individual group size frequency distributions (IndGSFDs)

5th percentile         | Median                 | Mean                   | 95th percentile
SC  GSFD  IndGSFD      | SC  GSFD  IndGSFD      | SC  GSFD  IndGSFD      | SC  GSFD    IndGSFD
d   1     1            | t   1     8            | t   3     22           | t   12      98
l   1     2            | r   2     41           | r   9     142          | s   30      2293
g   1     2            | s   2     195          | s   13    994          | r   37      983
k   1     4            | o   6     157          | l   15    50           | l   50      208
m   1     4            | m   6     300          | n   18    67           | m   80      11 219
p   1     6            | p   6     3309         | h   23    217          | h   84      1720
r   1     7            | l   7     31           | m   32    3767         | n   85      220
i   1     9            | n   8     50           | o   33    310          | o   122     1033
h   1     14           | h   8     68           | j   51    186          | e   166     4000
j   1     14           | k   10    295          | e   52    740          | k   190     10 129
t   1     17           | e   14    200          | k   52    1565         | j   200     558
s   1     35           | i   16    2500         | p   117   8005         | p   234     19 487
n   1     46           | j   25    125          | d   166   2728         | d   613     11 384
o   1     65           | d   27    961          | c   181   3055         | c   710     12 276
e   1     579          | g   29    40 000       | i   210   4424         | i   800     14 575
c   2     50           | c   31    950          | b   613   3894         | b   2759    11 077
f   2     309          | f   59    6800         | f   845   12 191       | g   3104    59 471
q   2     3286         | q   61    101 800      | g   1226  33 250       | f   4866    27 297
a   5     517          | b   154   2361         | a   1534  14 856       | a   7344    75 493
b   6     165          | a   205   8679         | q   6269  82 432       | q   41 697  120 000

SC: species codes are the same as in Fig. 1.

Mean Group Size
We confirm the unavoidable mathematical fact that individuals
live in larger groups than the average group size in their population,
and the strong linear correlation between log(mean group sizes)
and log(individual mean group sizes) previously reported by Iwao
(1968) and Reiczigel et al. (2005, 2008). Reiczigel et al. (2008,
page 719) argued that ‘Since mean group size tends to predict mean
crowding (Fig. 3), this approach may also be useful as a rough
approximation’. However, our results contradict this interpretation
for the following reasons.
First, note that the apparent good fit shown in Fig. 3 (Table 1) and the
similar Figure 3 in Reiczigel et al. (2008) is not so surprising when
considering the potential individual group size values that a species can
have with a given GSFD. This is what we have attempted to illustrate in
Fig. 3 with the vertical grey lines. Given that a statistic (e.g. mean) of
individual group sizes is always larger than its group size counterpart
(Fig. 3; Preston 1962; Lloyd 1967), and that individual group sizes can
never be larger than the maximum group size of the population (i.e.
individuals cannot live in larger groups than the largest group of the
population), these grey lines show the range of individual group size
values that a given population can have. This shows that a tight linear fit
to log(data) is simply a mathematical constraint imposed by the
interlocked nature of GSFD-IndGSFD pairs: any random distribution of
dots within the grey lines would create a tight linear correlation.
Second, even within the narrow range of values that a species could
exhibit in Fig. 3, species differ considerably in their relative position
within their corresponding grey line. This is very biologically relevant
because of the logarithmic scale (compare Fig. 3 and Fig. A2), even
hiding paradoxical situations (also acknowledged by Reiczigel et al.
2005, 2008): species with clearly larger group sizes can have individuals living in clearly smaller groups. For instance, Rissa tridactyla has
a median group size of 154 nests, clearly larger than the 29 nests for
Fratercula arctica (species b and g in Table 2, respectively). However, the
median individual of species b lives in groups of 2361 nests and that of
species g in groups of 40 000 nests. This could imply a huge difference
in the ecology of the population (e.g. for the strength of negative
density-dependent processes) and in the evolution of the species (e.g.
behavioural adaptations to living in a given social scenario).
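The idea of a species' position within its grey line can be expressed numerically. The Python sketch below is our own illustration, not part of the original analysis; `relative_log_position` is a hypothetical helper. It places the observed mean individual group size between its lower bound (the mean group size) and its upper bound (the maximum group size) on a logarithmic scale, using the values quoted for species t and q in the Fig. 3 caption.

```python
import math

def relative_log_position(mean_group, max_group, mean_individual):
    """Position of the mean individual group size within its feasible range,
    on a log scale: 0 = mean group size, 1 = maximum group size."""
    lo = math.log10(mean_group)
    hi = math.log10(max_group)
    return (math.log10(mean_individual) - lo) / (hi - lo)

# Values quoted in the Fig. 3 caption.
position_t = relative_log_position(3.3, 107, 22)              # Stercorarius parasiticus
position_q = relative_log_position(6269.3, 120000, 82431.7)   # Puffinus puffinus

print(round(position_t, 2), round(position_q, 2))
```

On this scale species t sits near the middle of its range while species q sits close to its upper limit, in line with the description in the caption.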
Third, even for variables showing a tight linear correlation between
GSFDs and IndGSFDs on log-log (Fig. 3) and linear axes (Fig. A2), GSFD
measures (e.g. mean group size) were a poor approximation of corresponding IndGSFD measures; to be at least a rough approximation they
would need to be close to the x = y line in Fig. A2 (see also raw data in
Table 2). Note that if one is interested in the mean group size experienced by individuals in a given species, the mean group size is a poor
predictor (Table 2). For instance, suppose Chroicocephalus ridibundus
(species i) suffers a strong negative density dependence on breeding
success when nesting in colonies of 2500 pairs or more. In that case,
the median colony size (i.e. 16 nests) would be highly misleading in the
evaluation of the demographic consequences of this density dependence, because it would suggest a negligible effect. However, in fact,
50% of the population breeds in colonies of 2500 nests or more (i.e.
median individual colony size = 2500; Table 2), thus having a probable
effect on individual fitness and population demography.
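The contrast between the share of colonies and the share of individuals at or above a threshold size can be shown with a small Python sketch. The colony sizes below are invented for illustration and are not the Seabird 2000 data; the function names are ours.

```python
def share_of_groups_at_or_above(group_sizes, threshold):
    """Fraction of groups whose size is at least `threshold`."""
    return sum(1 for s in group_sizes if s >= threshold) / len(group_sizes)

def share_of_individuals_at_or_above(group_sizes, threshold):
    """Fraction of individuals living in groups of size at least `threshold`."""
    return sum(s for s in group_sizes if s >= threshold) / sum(group_sizes)

# Hypothetical colony sizes: many small colonies and two very large ones.
colonies = [1, 2, 5, 10, 16, 20, 40, 2500, 3000]

share_groups = share_of_groups_at_or_above(colonies, 2500)
share_individuals = share_of_individuals_at_or_above(colonies, 2500)

print(round(share_groups, 3), round(share_individuals, 3))
```

In this made-up population only a minority of colonies reach the threshold, yet almost all nests are found in them, which is why the individual-level view matters for demographic inference.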
Overall, this shows that while it is true that in a comparative
(interspecific) study on seabirds, one can infer the log(mean individual group size) from the log(mean group size) of the species, it is
also true that for a given species, one can only predict that mean
individual group size will be larger than mean group size and lower
than the maximum group size in the population, thus losing relevant
information on the group sizes experienced by individuals. In any
case, GSFD measures are a poor approximation of corresponding
IndGSFD measures (Table 2).
Other Group Size Statistics
We have analysed not only the mean but also several other
statistics of GSFD-IndGSFD pairs and we have found interesting new
information potentially linking individual behaviour and population
patterns. For instance, in half of the studied species, individuals live
in a definite range of intermediate to large group sizes (e.g. Fig. 1a, h,
l, r), avoiding the lower half of their GSFDs. In almost another half of
the species, individuals cluster in the largest group sizes (e.g. Fig. 1f,
g, p, q). In others, there does not seem to be a clear preference (e.g.
Fig. 1t). In fact, Fig. 2 shows either a lack of correlation or a negative
correlation between group size variation and individual group size
variation. This challenges our view about the link between individual
behaviour and group size population patterns and poses a paradox:
species with larger group size variation have a larger proportion of
individuals concentrated in particular group sizes.
The Relevance of Logarithmic Binning
Animal GSFDs often do not follow normal distributions but show
highly right-skewed frequency histograms, with many small groups
and very few large ones (Götmark 1982; Wirtz & Lörscher 1983; Brown
et al. 1990; Stacey & Koenig 1990; Avilés & Tufiño 1998; Krause &
Ruxton 2002; Jovani & Tella 2007; Serrano & Tella 2007; Jovani et al.
2008a,b). However, this apparent uniformity among species in their
GSFDs is not real, but the result of a history of poor plotting practice in animal
grouping research. For populations/species with small ranges of group
sizes (ca. 1-50), using linear bins (e.g. [1,5], [6,10], [11,15], ...) clearly
highlights the underlying distribution even for highly skewed distributions (see several good examples in Stacey & Koenig 1990). Often,
however, group sizes range from a few individuals to several hundred
or even hundreds of thousands. This makes linear bins a poor choice
for detecting differences in GSFD properties across time, space or taxa,
because very large groups inevitably confine most of the groups in the
smallest one/few bins. Here, we have used logarithmic bins (see
Methods). This approach has been key to unravelling the surprising
nontrivial relationship between GSFDs and IndGSFDs. This is easy to
appreciate in Fig. A1, where we have plotted in standard histograms
the same data as in Fig. 1, and where the difficulty of visualizing the
contrasting patterns within and between GSFDs and their IndGSFDs
found in Fig. 1 is apparent. Also, semilogarithmic plots are a direct way
of assessing how individuals are distributed across group sizes. This is
a powerful way of identifying either possible preferences of individuals for particular group sizes (something difficult to appreciate from
GSFD alone) or the relevance that particular processes (e.g. high
negative density dependence in survival in large group sizes) could
have upon a population.
Individual Group Sizes versus Crowding
We have not used the term ‘crowding’ recently coined by
Reiczigel et al. (2005). ‘Crowding’ has been an interesting
contribution that opens the analysis to any statistic of IndGSFDs
instead of only focusing on the mean group size experienced by
individuals (i.e. the ‘typical group size’ of Jarman 1974). However,
we prefer ‘individual group size’ because it is the logical individual
counterpart of ‘group size’ without any connotation about the
consequences of group size. ‘Crowding’ suggests that larger group
sizes imply greater density. This is true when space is finite as
occurs when parasite intensity increases in a host, or similarly sized
hosts show different parasite intensities (Poulin 2007). However,
this need not be the case in other situations. For instance, nest
spacing in seabird colonies is often constant despite colony sizes
ranging from tens to thousands of nests (Nelson 1980, page 125).
Conservation Implications
It was not necessary to plot IndGSFD for seabirds in Britain and
Ireland to know that there are some species such as Puffinus puffinus in which a few colonies harbour a large proportion of the total
population, and, thus, are colonies of special conservation concern
(Mitchell et al. 2004). However, we believe that plotting IndGSFDs
of all species together as in Fig. 1 gives a more informed point of
view on the degree to which this occurs in the different species.
This is especially important because by only knowing the sizes of
colonies (black bars in Fig. 1) it is difficult, without plotting them, to
predict how concentrated, in a few large colonies, the population is
(e.g. compare black and grey bars in Fig. 1d versus g or in r versus s).
For instance, knowing the maximum colony sizes of a species is not
enough to know the proportion of the total breeding population
that will be lost if, for instance, the largest five colonies are
destroyed (and birds do not move to other colonies; Fig. 4). Obviously, species with the largest colonies are more sensitive to losing
one of their five largest colonies. However, the correlation
(r = 0.498, P = 0.026) only explained R2 = 0.248 of variation and
20-80% of a population can be contained in the five largest colonies
of a large-colony species (e.g. a, g, q), clearly a significant range for
the purposes of conservation.
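A minimal Python sketch of this calculation follows, assuming that birds from destroyed colonies do not relocate. The two species below are hypothetical and share the same maximum colony size; the function name `share_in_largest` is ours.

```python
def share_in_largest(group_sizes, k=5):
    """Fraction of all individuals found in the k largest groups."""
    ordered = sorted(group_sizes, reverse=True)
    return sum(ordered[:k]) / sum(ordered)

# Two hypothetical species with the same maximum colony size (1000 nests).
species_a = [1000] + [10] * 4 + [5] * 100    # one dominant colony, many tiny ones
species_b = [1000] + [900] * 20              # many similarly large colonies

share_a = share_in_largest(species_a)
share_b = share_in_largest(species_b)

print(round(share_a, 3), round(share_b, 3))
```

Although both invented species have the same largest colony, the fraction of the population held in the five largest colonies differs widely, so maximum colony size alone is a weak guide to this kind of vulnerability.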
Group size variation could come from two sources: from individual
behaviour (e.g. owing to a larger underlying genetic predisposition for
contrasting group sizes, Brown & Brown 2000; Serrano & Tella 2007), or
because of formation–destruction dynamics (e.g. all large colonies
started with a few nests). What these analyses tell us is that in species
with larger colony size variation, individuals live in more specific
colony sizes. The paradox is potentially solved by colony size dynamics:
intraspecifically, variability in individual behaviour (e.g. owing to
underlying genetics) could promote colony size variation. However,
because of the imperative colony size dynamics, when a species shows
a preference for breeding in large colonies, all smaller colonies also
exist in the population (i.e. very small colony sizes are pervasive even in
bird species with huge colonies; Brown et al. 1990; see Table 1 in Jovani
et al. 2008a), thus leading to higher colony size variation even when
most individuals prefer to live in some particularly large colonies.
The 5th percentile showed the weakest correlation for group
size-related variables between GSFDs and IndGSFDs (Fig. 3, Table 1,
Fig. A2). In fact, all species showed a lowest group size of fewer than
10 nests, but even species with a minimum group size of one nest
showed contrasting 5th percentile individual group sizes from one to
579 nests. Since minimum group sizes of species often show very
low values, often close to one nest, that is, solitary breeders (e.g.
Brown et al. 1990; Krause & Ruxton 2002; this study), minimum
colony sizes could scarcely be seen as a species-specific trait.
However, our analyses show that seabirds in Britain and Ireland
differ substantially in the smaller (5th percentile) group sizes in
which individuals live, thus presenting the possibility that this could
be a species-specific trait. This necessitates a study comparing
populations of the same species in different parts of the world.
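The contrast between a low percentile of colony sizes and the same percentile taken from the individuals' point of view can be sketched as follows. This is only an illustration: the percentile rule used here (the smallest size at which the cumulative share reaches q) is an assumption and may differ from the one used for Table 1, and the function names and the colony list are hypothetical.

```python
def weighted_percentile(sizes, weights, q):
    """Smallest size at which the cumulative weight share reaches q (0 < q <= 1)."""
    pairs = sorted(zip(sizes, weights))
    threshold = q * sum(weights)
    cumulative = 0.0
    for size, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return size
    return pairs[-1][0]


def group_percentile(sizes, q):
    """Percentile of the GSFD: every colony counts once."""
    return weighted_percentile(sizes, [1] * len(sizes), q)


def individual_percentile(sizes, q):
    """Percentile of the IndGSFD: every colony counts once per nest."""
    return weighted_percentile(sizes, sizes, q)


# Hypothetical species: 50 solitary nests and 40 colonies of 500 nests.
colonies = [1] * 50 + [500] * 40
print(group_percentile(colonies, 0.05), individual_percentile(colonies, 0.05))
```

In this made-up case the 5th percentile colony is a solitary nest, whereas the 5th percentile individual breeds in a colony of 500 nests.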
Figure 4. Correlation between the maximum group (colony) size of each species and the
proportion of all the breeding pairs of the species found in their largest five colonies. [Plot not reproduced: x axis, maximum group size; y axis, proportion of nests in largest group (0 to 1); points are labelled with the species codes a–t.]
Modelling Implications
Theoretical approaches are aimed at understanding group size
variation, but not individual group size variation. An important (and
apparently trivial and obvious) starting point of animal grouping
models is that, ideally, mean group size in a population should be the
group size conferring the highest fitness to the individuals (reviewed
in Clark & Mangel 1986). Subsequent modelling approaches, however,
have questioned the validity of this assumption showing, for instance,
that the difference between optimal and realized mean group sizes
depends on whether group members have control over the entrance
of newcomers to the group (Giraldeau & Caraco 2000). However, the
initial assumption (i.e. that mean group sizes should be close to
optimal group sizes) has not been questioned. This could be
misleading because in genetically unrelated animals (e.g. a huge
seabird colony) what should be expected is not that groups should be
of an optimal size, but that most of the individuals of the population
should live in such optimal group sizes, that is, show an adaptive
behaviour. If group size variation is low, mean population group size
and mean individual group size may be essentially the same (see
hypothetical example in the Introduction, Lloyd 1967), and thus
approaches modelling these kinds of GSFDs remain essentially
equally valid. However, our results show that mean individual group
sizes could be many times larger than population mean group sizes
(e.g. Fig. 1f, g, i, p, q; Table 2). Thus, the empirical finding that group
size is often larger than the optimal group size (Giraldeau & Caraco
2000; Krause & Ruxton 2002; Sumpter 2010) is even more
intriguing when examined from the individual point of view.
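The gap between the two means can be made concrete with a short sketch. It assumes that mean individual group size is the size-weighted mean of group sizes, sum(n²)/sum(n), that is, the group size experienced by an average individual; the function names and both colony lists are hypothetical.

```python
def mean_group_size(sizes):
    """Mean of the GSFD: average size over groups."""
    return sum(sizes) / len(sizes)


def mean_individual_group_size(sizes):
    """Mean of the IndGSFD: average group size experienced by individuals."""
    return sum(n * n for n in sizes) / sum(sizes)


# Low group size variation: both means coincide.
uniform = [10, 10, 10, 10]
# High variation: nine solitary nests and one colony of 991 nests.
skewed = [1] * 9 + [991]

for sizes in (uniform, skewed):
    print(mean_group_size(sizes), mean_individual_group_size(sizes))
```

With no variation both means are 10; in the skewed case the mean group size is 100, whereas the average individual lives in a group of about 982.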
More generally, since GSFDs and IndGSFDs have been shown to
yield different information, models could be tested (and their
design aided, Grimm & Railsback 2005) by how well they reproduce
not only mean group sizes of GSFDs but also several of their
properties (e.g. skewness), as well as for their IndGSFDs. These new
approaches will surely benefit from current advances in the
statistical treatment of IndGSFDs (Reiczigel et al. 2008; Neuhäuser
2009; Neuhäuser et al. 2010).
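As a rough sketch of such a check, the skewness of a model's output could be computed for both distributions. The moment-based skewness on untransformed sizes used below is an assumption (the statistics in this study may have been calculated differently, for example on log-transformed sizes), and the function names and the example colony list are hypothetical.

```python
def weighted_skewness(values, weights):
    """Standardized third central moment of values under the given weights.

    Undefined (division by zero) when all values are identical.
    """
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    m2 = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    m3 = sum(w * (v - mean) ** 3 for v, w in zip(values, weights)) / total
    return m3 / m2 ** 1.5


def gsfd_skewness(sizes):
    """Skewness of the GSFD: every group has weight 1."""
    return weighted_skewness(sizes, [1] * len(sizes))


def indgsfd_skewness(sizes):
    """Skewness of the IndGSFD: every group is weighted by its size."""
    return weighted_skewness(sizes, sizes)


# Hypothetical model output: four solitary nests and one colony of 10 nests.
colonies = [1, 1, 1, 1, 10]
print(gsfd_skewness(colonies), indgsfd_skewness(colonies))
```

For this made-up output the GSFD is right-skewed while the IndGSFD is left-skewed, so a model could match one shape and still miss the other.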
Finally, fitting theoretical models (e.g. power laws or truncated
power laws) to empirical data has been shown to unravel interesting
factors shaping population grouping patterns (e.g. Bonabeau et al.
1999; Sjöberg et al. 2000; Lusseau et al. 2004; Jovani et al. 2008b).
Our study clearly shows that contrasting results can be found if individual group sizes are studied instead of population group sizes.
Therefore, where the aim of the study demands it, it would be worthwhile to take this dual approach to group sizes, either to complement group size analyses or to gain a new perspective on the causes
and consequences of group living.
References
Avilés, L. & Tufiño, P. 1998. Colony size and individual fitness in the social spider
Anelosimus eximius. American Naturalist, 152, 403–418.
Bonabeau, E., Dagorn, L. & Fréon, P. 1999. Scaling in animal group-size distributions.
Proceedings of the National Academy of Sciences, U.S.A., 96, 4472–4477.
Brown, C. R. & Brown, M. B. 2000. Heritable basis for choice of group size in
a colonial bird. Proceedings of the National Academy of Sciences, U.S.A., 97,
14825–14830.
Brown, C. R., Stutchbury, B. J. & Walsh, P. D. 1990. Choice of colony size in birds.
Trends in Ecology & Evolution, 5, 398–403.
Clark, C. W. & Mangel, M. 1986. The evolutionary advantages of group foraging.
Theoretical Population Biology, 30, 45–75.
Emlen, S. T. 1990. White-fronted bee-eaters: helping in a colonially nesting species.
In: Cooperative Breeding in Birds. Long-Term Studies of Ecology and Behavior (Ed. by
P. B. Stacey & W. D. Koenig), pp. 487–526. Cambridge: Cambridge University
Press.
Gerard, J.-F., Bideau, E., Maublanc, M.-L., Loisel, P. & Marchal, C. 2002. Herd size
in large herbivores: encoded in the individual or emergent? Biological Bulletin,
202, 275–282.
Giraldeau, L.-A. & Caraco, T. 2000. Social Foraging Theory. Princeton, New Jersey:
Princeton University Press.
Götmark, F. 1982. Coloniality in five Larus gulls: a comparative study. Ornis Scandinavica, 13, 211–224.
Grimm, V. & Railsback, S. F. 2005. Individual-Based Modeling and Ecology. Princeton, New Jersey: Princeton University Press.
Heath, J. P. 2006. Quantifying temporal variability in population abundances. Oikos,
115, 573–581.
Iwao, S. 1968. A new regression method for analyzing the aggregation pattern of
animal populations. Research Population Ecology, 10, 1–20.
Jarman, P. J. 1974. The social organization of antelope in relation to their ecology.
Behaviour, 48, 215–268.
Jovani, R. & Tella, J. L. 2007. Fractal bird nest distribution produces scale-free
colony sizes. Proceedings of the Royal Society B, 274, 2465–2469.
Jovani, R., Mavor, R. & Oro, D. 2008a. Hidden patterns of colony size variation in
seabirds: a logarithmic point of view. Oikos, 117, 1774–1781.
Jovani, R., Serrano, D., Ursúa, E. & Tella, J. L. 2008b. Truncated power laws reveal
a link between low-level behavioral processes and grouping patterns in
a colonial bird. PLoS ONE, 3, e1992, doi:10.1371/journal.pone.0001992.
Krause, J. & Ruxton, G. D. 2002. Living in Groups. Oxford: Oxford University Press.
Lloyd, M. 1967. Mean crowding. Journal of Animal Ecology, 36, 1–30.
Lusseau, D., Williams, R., Wilson, B., Grellier, K., Barton, T. R., Hammond, P. S. &
Thompson, P. M. 2004. Parallel influence of climate on the behaviour of Pacific
killer whales and Atlantic bottlenose dolphins. Ecology Letters, 7, 1068–1076.
Mitchell, P. I., Newton, S. F., Ratcliffe, N. & Dunn, T. E. 2004. Seabird Populations of
Britain and Ireland. London: T. & A. D. Poyser.
Nelson, B. 1980. Seabirds. Their Biology and Ecology. Toronto: Hamlyn.
Neuhäuser, M. 2009. The importance of the biological system underlying the data
when choosing a statistical test: why penguins need to be treated differently to
parasites. Animal Behaviour, 77, e1–e3.
Neuhäuser, M., Kotzmann, J., Walier, M. & Poulin, R. 2010. The comparison of
mean crowding between two groups. Journal of Parasitology, 96, 477–481.
Poulin, R. 2007. Evolutionary Ecology of Parasites. 2nd edn. Princeton, New Jersey:
Princeton University Press.
Preston, F. W. 1948. The commonness, and rarity, of species. Ecology, 29, 254–283.
Preston, F. W. 1962. The canonical distribution of commonness and rarity, Part I.
Ecology, 43, 185–215.
Pueyo, S. & Jovani, R. 2006. Comment on ‘A Keystone Mutualism Drives Pattern in
a Power Function’. Science, 313, 1739c.
Reiczigel, J., Lang, Z., Rózsa, L. & Tóthmérész, B. 2005. Properties of crowding
indices and statistical tools to analyze parasite crowding data. Journal of Parasitology, 91, 245–252.
Reiczigel, J., Lang, Z., Rózsa, L. & Tóthmérész, B. 2008. Measures of sociality: two
different views of group size. Animal Behaviour, 75, 715–721.
Safran, R. J. 2004. Adaptive site selection rules and variation in group size of barn
swallows: individual decisions predict population patterns. American Naturalist,
164, 121–131.
Safran, R. J., Doerr, V. A. J., Sherman, P. W., Doerr, E. D., Flaxman, S. M. &
Winkler, D. W. 2007. Group breeding in vertebrates: linking individual and
population-level approaches. Evolutionary Ecology Research, 9, 1163–1185.
Serrano, D. & Tella, J. L. 2007. The role of despotism and heritability in determining
settlement patterns in the colonial lesser kestrel. American Naturalist, 169,
E53–E67.
Sjöberg, M., Albrectsen, B. & Hjältén, J. 2000. Truncated power laws: a tool for
understanding aggregation patterns in animals? Ecology Letters, 3, 90–94.
Stacey, P. B. & Koenig, W. D. 1990. Cooperative Breeding in Birds. Long-term Studies
of Ecology and Behavior. Cambridge: Cambridge University Press.
Sumpter, D. J. T. 2010. Collective Animal Behavior. Princeton, New Jersey: Princeton
University Press.
Wesołowski, T., Głażewska, E., Głażewski, L., Hejnowicz, E., Nawrocka, B.,
Nawrocki, P. & Okońska, K. 1985. Size, habitat distribution and site turnover of
gull and tern colonies on the middle Vistula. Acta Ornithologica, 21, 46–67.
Wirtz, P. & Lörscher, J. 1983. Group sizes of antelopes in an east African national
park. Behaviour, 84, 135–156.
Acknowledgments
This and previous work on the Seabird 2000 data set would not
have been possible without the collaboration between the Joint
Nature Conservation Committee and the Royal Society for the
Protection of Birds, and the over 1000 volunteers that have gathered the data. We also thank José L. Tella, Daniel Oro, José
A. Donázar, Olga Ceballos, David Serrano, Jaime Potti and Ainara
Cortés-Avizanda for discussion, and David Lusseau, Steve Oswald
and an anonymous referee for interesting contributions. R.J. is
supported by a Ramón y Cajal research contract (RYC-2009-03967)
from the Ministerio de Ciencia e Innovación.
Supplementary Material
Supplementary material associated with this article can be
found, in the online version, at doi:10.1016/j.anbehav.2011.07.037.
Appendix
Figure A1. Group (colony) size frequency distribution (in black; left Y axis) and corresponding individual group size frequency distribution linear histograms (in grey; right Y axis) for 20
seabird species breeding in Britain and Ireland. Grey bars are always shown in their full length, but the first group size bin (the left-hand black bar of each graph) has been cut and its real
value depicted by the left-hand value inside each graph. Each graph shows, from left to right, 20 linear bins ranging from the smallest to the largest group size in the species. Bins are of
different length between graphs, but constant within graphs. Bin length is calculated as (maximum group size/20). For instance, in (e) the maximum group size (the number above the largest
bin) is 4000. Thus, in (e), bin length is 4000/20 = 200; the first bin is (0–200], the second bin (200–400] and the last bin (3800–4000] (‘(’ means that the number is not included in
the bin and ‘]’ that it is included). See Fig. 1 for label details. [Histograms not reproduced: panels (a)–(t), one per species; x axis, group size; left Y axis, number of groups (colonies); right Y axis, number of individuals (nests).]
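The binning rule described in this caption can be sketched in Python as follows. The function name `linear_bin_counts` and the example colony list are hypothetical; the sketch assumes integer group sizes of at least one nest and uses integer arithmetic so that sizes lying exactly on a bin edge fall in the lower bin, as the half-open intervals require.

```python
def linear_bin_counts(sizes, n_bins=20):
    """Count groups and individuals in n_bins equal bins of the form (lower, upper].

    Bin length is (maximum group size / n_bins), so the largest group always
    falls in the last bin. Sizes must be integers >= 1.
    """
    max_size = max(sizes)
    groups = [0] * n_bins        # GSFD: number of groups per bin
    individuals = [0] * n_bins   # IndGSFD: number of nests per bin
    for n in sizes:
        # Equivalent to ceil(n * n_bins / max_size) - 1, without float rounding.
        idx = (n * n_bins - 1) // max_size
        groups[idx] += 1
        individuals[idx] += n
    return max_size / n_bins, groups, individuals


# Hypothetical colonies for a species whose largest colony has 4000 nests, as in (e).
colonies = [1, 200, 201, 400, 4000]
bin_length, groups, individuals = linear_bin_counts(colonies)
print(bin_length, groups[:2], groups[-1], individuals[:2], individuals[-1])
```

Here the colonies of 1 and 200 nests share the first bin (0–200], those of 201 and 400 nests share the second, and the colony of 4000 nests is alone in the last bin, where it nevertheless contributes most of the nests.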
Figure A2. Correlation of size-related statistics describing group size frequency distributions and their corresponding individual group size frequency distributions. (a) 5th
percentile, (b) median, (c) mean and (d) 95th percentile. Species codes are the same as in Fig. 1. See Table 1 for correlation statistics. Dashed lines depict the x = y line. [Scatterplots not reproduced: x axis, group size frequency distribution; y axis, individual group size frequency distribution.]