Joel E. Cohen and Meng Xu - Black Rock Forest Consortium

advertisement
1
THIS PAGE IS TO BE REMOVED FROM THE BLINDED VERSION
2
Random sampling of skewed distributions and Taylor's power law of fluctuation scaling
3
Joel E. Cohen and Meng Xu
4
Authors' footnote
5
Joel E. Cohen is Professor, and Meng Xu was Postdoctoral Associate, at The Rockefeller
6
University, 1230 York Avenue, Box 20, New York, NY, 10065-6399 (e-mail:
7
cohen@rockefeller.edu, xumeng04@gmail.com). Xu is now Lecturer, Department of
8
Mathematics and Physics, University of New Haven, West Haven, CT 06516. This work was
9
partially supported by National Science Foundation grants EF-1038337 and DMS-1225529. The
10
authors thank Richard Chandler, Russell Millar and Andrew Wood for helpful comments and
11
Priscilla K. Rogerson for assistance.
12
THIS PAGE IS TO BE REMOVED FROM THE BLINDED VERSION
13
0
14
Random sampling of skewed distributions and Taylor's power law of fluctuation scaling
15
Abstract. Taylor’s law (TL), a widely verified quantitative pattern in ecology, describes the
16
variance in population density as a power-law function of the average population density:
17
variance = a(mean)b, a > 0. In the past half-century, multiple mechanisms have been proposed to
18
explain and interpret TL. Here we show that TL arises when data are randomly sampled in
19
blocks from any frequency distribution with four finite moments, and that b is positive whenever
20
the frequency distribution has positive skewness. We give approximate formulas for the sample
21
estimate bΜ‚ of b and the variance of 𝑏̂: if the distribution has mean M, variance V, coefficient of
22
variation CV = V1/2/M, skewness γ1 = μ3/V3/2, and kurtosis κ = μ4/V2, then as the number N of
23
blocks and sample sizes in all blocks become large, bΜ‚ converges in probability to 𝛾1⁄𝐢𝑉 and
24
(𝑁 − 2)𝑠 2 (𝑏̂) converges in probability to [πœ… − 1 − 𝛾12 ]⁄(𝐢𝑉)2. In simulations and an empirical
25
example using counts of oak trees from Black Rock Forest, the formula for bΜ‚ agrees with the
26
estimate of b obtained by least-squares regression. Our null model provides a baseline against
27
which more complex explanations of TL can be compared.
28
Key words: delta method; least-square regression; skewness; variance function
29
1. Introduction
30
Taylor's law (TL), sometimes called Taylor's power law of fluctuation scaling after Taylor
31
(1961), is the subject of more than a thousand papers in ecology, epidemiology, biomedical
32
sciences and other applied statistical areas (Eisler et al. 2008, Jørgensen et al. Draft), but is not
33
vary widely known among some statisticians. For example, Davidian and Carroll (1987, p. 1079)
34
"model the variance as proportional to a power of the mean response" as their first example of
35
"variance functions." This model is precisely TL, variance = a(mean)b, a > 0, but Davidian and
1
36
Carroll do not cite the large prior ecological literature (e.g., review by Taylor 1984) on precisely
37
that model, substantial related prior statistical research (e.g., Bartlett 1936, 1947, Tweedie 1946,
38
1947, 1984 ) and practical applications to the design of sampling plans for the control of insect
39
pests of soybeans (Kogan et al. 1974, Bechinski and Pedigo 1981) and cotton (review by Wilson
40
et al. 1989).
41
In recent decades, Jørgensen, Kendal and colleagues (Jørgensen 1997, Kendal and Jørgensen
42
2011, Jørgensen et al. Draft) have done much to develop a statistical theory of TL but awareness
43
in the statistical community of TL and its associated literature remains limited.
44
In applications, TL arises when observations of a nonnegative real-valued random variable
45
(originally, counts of the size of a local population of insects) are grouped into N blocks, j = 1,
46
…, N. When the N points (log of the sample mean of observations in block j, log of the sample
47
variance of observations in block j) are plotted, TL predicts that the points will approximate a
48
straight line (Taylor 1961).
49
A linear relationship between the logarithm of the sample variance and the logarithm of the
50
sample mean of population density is one of the most frequently confirmed empirical patterns in
51
ecology and has been verified for hundreds of species, recently including bacteria (Ramsayer et
52
al. 2012; Kaltz et al. 2012), forest trees (Cohen, Xu, and Schuster 2012, 2013a), and people
53
(Cohen, Xu, and Brunborg 2013b). Examples of TL have also been found in diverse other fields
54
(see reviews by Eisler et al. 2008, Jørgensen et al., Draft), including cell populations within
55
organisms, epidemiology of measles and whooping cough, cancer metastases, single nucleotide
56
polymorphisms and genes on chromosomes, and non-biological measurements such as
57
precipitation, packet switching on the Internet, stock market trading, and number theory.
2
58
Theories of the mechanistic or stochastic origins of TL and interpretations of the intercept log a
59
and slope b of the linear relationship log variance = log a + b×log mean (mathematically
60
equivalent to its power-law form) are as diverse as the empirical applications (again see reviews
61
of Eisler et al. 2008, Jørgensen et al. Draft). Taylor (1961) proposed that b was an index of
62
aggregation in populations (Taylor 1961). Kilpatrick and Ives (2003) theorized that negative
63
interactions among species leads to b < 2. Ballantyne (2005) proposed that b = 2, which
64
corresponds to a constant coefficient of variation, is a consequence of deterministic population
65
growth. Ballantyne and Kerkhoff (2007) suggested that individuals’ reproductive correlation
66
determines the size of b. Cohen et al. (2013a) showed that the Lewontin-Cohen stochastic
67
multiplicative population model (a geometric random walk) implied TL and calculated log a and
68
b explicitly. Cohen (2013) showed that exponentially growing, non-interacting clones would
69
asymptotically satisfy TL with b = 2.
70
As TL has been a subject of investigation for more than half a century, it is somewhat surprising
71
that a simple null hypothesis to explain the origins of TL under certain conditions has so far
72
apparently not been considered. Here we show that TL can arise as a consequence of randomly
73
blocking observations arising from simple random sampling of a single underlying nonnegative-
74
valued probability distribution. No other mechanisms are required to generate TL. We derive
75
explicit approximate formulae for the TL slope b and its variance when random samples from
76
any distribution with four finite moments are randomly grouped into blocks. We give simulations
77
to illustrate how well these theoretical formulae approximate the slope, and we give an empirical
78
example using published data on counts of oak trees.
79
Our formulae imply that b > 0 arises from random sampling in blocks of any right-skewed
80
distribution, and b < 0 arises from random sampling in blocks of any left-skewed distribution.
3
81
This derivation provides a purely statistical "null-model" of the slope b. This null model does not
82
purport to be a universal explanation of TL in all or most circumstances. However, the
83
availability of this null model will require that future more elaborate explanations of TL, in terms
84
of specific mechanisms, show first why this null model is not a sufficient explanation.
85
2. Results
86
Suppose X is a nonnegative real-valued random variable with cumulative distribution function F,
87
mean E(X) = M > 0, variance var(X) = V > 0, and finite central moments E([X - M]h) = μh, h = 3,
88
4. Consider N > 2 "blocks" or sets of independently and identically distributed (iid) observations
89
(random samples) of X. Let xij denote observation i in block j, i = 1, …, nj, assuming the sample
90
size of block j satisfies nj > 3, j = 1, …, N. The total number of observations is O = n1+ n2+β‹― +
91
nN. For block j the sample mean and its expectation and variance are π‘šπ‘— = (π‘₯1𝑗 + β‹― + π‘₯𝑛𝑗𝑗 )⁄𝑛𝑗 ,
92
𝐸(π‘šπ‘— ) = 𝑀, π‘£π‘Žπ‘Ÿ(π‘šπ‘— ) = 𝑉 ⁄𝑛𝑗 . The unbiased estimator of the sample variance and its
93
expectation and variance are
𝑛𝑗
94
𝑛𝑗
𝑛𝑗 − 3 2
1
1
2
𝑣𝑗 =
∑ π‘₯𝑖𝑗
−
π‘šπ‘—2 , 𝐸(𝑣𝑗 ) = 𝑉, π‘£π‘Žπ‘Ÿ(𝑣𝑗 ) = (πœ‡4 −
𝑉 ).
𝑛𝑗 − 1
𝑛𝑗 − 1
𝑛𝑗
𝑛𝑗 − 1
𝑖=1
95
The formula for π‘£π‘Žπ‘Ÿ(𝑣𝑗 ) is from Neter et al. (1990).
96
In this model, the variation between blocks in the sample mean is small because it arises only
97
from differences due to random sampling of the same distribution for every block. Variation
98
between blocks in the sample variance is also small for the same reason. Precisely how small the
99
differences are depends on the sample size per block, as shown in the above formulas for
100
π‘£π‘Žπ‘Ÿ(π‘šπ‘— ) and π‘£π‘Žπ‘Ÿ(𝑣𝑗 ). Since any two smoothly varying functions can be locally linearly related,
4
101
the logarithm of the sample variance per block can be approximated as a linear function of the
102
logarithm of the sample mean per block, which is just TL: log 𝑣𝑗 = log π‘Ž + 𝑏 log π‘šπ‘— , across
103
blocks j = 1, …, N.
104
It is customary in ecological applications and elsewhere to estimate the intercept log a and the
105
slope b of TL using ordinary least-squares regression of log(vj) as a linear function of log(mj),
106
ignoring the variability in the x-coordinate log(mj). This practice may be defensible for large nj
107
because, at least in this model, π‘£π‘Žπ‘Ÿ(log π‘šπ‘— ) is inversely proportional to nj (see Appendix for
108
proof).
109
In the following statement of our main analytical result, the symbol ''→𝑃 " means convergence in
110
probability. By definition, the coefficient of variation of the underlying distribution of X is CV =
111
V1/2/M, the skewness is γ1 = μ3/V3/2, and the kurtosis is κ = μ4/V2.
112
Theorem. Let 𝑏̂ and 𝑠 2 (𝑏̂) denote the least-squares estimator of b and the unbiased estimator of
113
its sample variance when all blocks are weighted equally. Then the limits in probability of 𝑏̂ and
114
𝑠 2 (𝑏̂) satisfy, for large N and 𝑛𝑗 , 𝑗 = 1, 2, … , 𝑁,
115
116
𝑏̂ →𝑃
π‘π‘œπ‘£(π‘šπ‘— , 𝑣𝑗 ) π‘£π‘Žπ‘Ÿ(π‘šπ‘— )
⁄
= πœ‡3 𝑀⁄𝑉 2 = 𝛾1⁄𝐢𝑉 .
𝑀𝑉
𝑀2
(𝑁 − 2)𝑠 2 (𝑏̂) →𝑃 𝑀2 (πœ‡4 𝑉 − 𝑉 3 − πœ‡32 )⁄𝑉 4 = (πœ… − 1 − 𝛾12 )⁄(𝐢𝑉)2 .
117
We prove this Theorem in the Appendix. Since CV > 0, the first formula shows that random
118
sampling of any right-skewed distribution (one with 𝛾1 > 0) generates TL with positive slope.
119
From the second formula on the extreme right, since 𝑠 2 (𝑏̂) ≥ 0 intrinsically, we conclude that
5
120
πœ… − 1 − 𝛾12 ≥ 0 for any distribution with finite fourth moment. Rohatgi and Székely (1989)
121
proved independently that πœ… − 1 − 𝛾12 ≥ 0 for any distribution with finite fourth moment.
122
3. Simulations
123
If X is gamma distributed with probability density function 𝑓(π‘₯|𝛼, 𝛽) =
124
π‘₯ 𝛼−1 exp(− π‘₯⁄𝛽 )⁄[𝛽 𝛼 Γ(𝛼)] , 𝛼 ≥ 1, 𝛽 > 0, where Γ(𝛼) is the gamma function evaluated at 𝛼,
125
then M = αβ, V = αβ2, μ3 = 2αβ3, and μ4 = 3αβ4(α + 2). Consequently, the estimator of the TL
126
slope is 𝑏̂ = 2 and 𝑠 2 (𝑏̂) = (2𝛼 + 2)/(𝑁 − 2). In the special case when α = 1 and β= 1/λ, X is
127
exponentially distributed with parameter λ > 0 and probability density function f(λ) = λ⋅exp(-λx).
128
In this case, M = 1/λ, V = 1/λ2, μ3= 2/λ3, μ4 = 9/λ4 and 𝑏̂ = 2, s2(𝑏̂) = 4/(N-2).
129
To illustrate this model numerically, we created four matrices with n = 100 rows and N = 100
130
columns. In each matrix, the elements were iid from, respectively, an exponential, gamma,
131
lognormal, and normal distribution (Figure 1A, B, C, D). For each matrix, we plotted the log
132
sample variance vj of each column j on the ordinate against the log sample mean mj on the
133
abscissa, j = 1, …, N. For the three positively skewed distributions (A, B, C), a positive
134
approximately linear relationship was observed. For the symmetric normal distribution with zero
135
skewness (D), no relationship between the log sample variance and the log sample mean was
136
observed.
137
We investigated the exponential distribution with λ = 1 in greater detail (Table 1). The median of
138
𝑏̂ estimated from the Theorem was always less than, but became closer to, the corresponding
139
value estimated from linear regression, as both n and N increased. For each set of n and N, the
140
95% confidence interval (CI) of 𝑏̂ from each method contained the theoretically calculated
141
asymptotic value 2. On the other hand, the median of (N - 2)s2(𝑏̂) from the Theorem was bigger
6
142
than its corresponding median estimated from linear regression, although the 95% CIs from both
143
methods overlapped and included the theoretically calculated asymptotic value 4, except when n
144
= 10, N = 100 under linear regression. It is not surprising that the delta method should yield a
145
quite accurate approximation to b in this situation as all variation is due to random sampling
146
alone.
147
Figure 1. Taylor's law in random sampling from (A) an exponential (λ = 1), (B) a gamma (α = 4,
148
β = 1), (C) a lognormal (μ = 1, σ = 1), and (D) a shifted normal (5 + 𝒩(0,1)) distribution, i.e., a
149
𝒩(0,1) distribution with 5 added to each value to make each block's mean positive with high
150
probability. For each panel, 10,000 independent and identically distributed observations from the
151
selected distribution were arranged in a single matrix with n = 100 rows and N = 100 columns.
152
For each column j, the sample mean mj and the sample variance vj were calculated and plotted
153
here on log-log coordinates, j = 1, …, N. For the three positively skewed distributions (A, with
154
skewness 2; B, with skewness 1; and C, with skewness (e+2)(e-1)1/2 ≈ 6.1849), a positive
155
approximately linear relationship was observed. The solid line is the least-squares linear
156
regression log10 vj = log10 a + b log10 mj. The parameter values are (A) exponential slope 1.8501,
157
intercept -0.0116; predicted slope bΜ‚ ± [s2(bΜ‚)]1/2 = 2 ± [4/98]1/2 = 2 ± 0.2020; (B) slope 2.1643,
158
intercept -0.6969; predicted slope bΜ‚ ± [s2(bΜ‚)]1/2 = 2 ± [10/98]1/2 = 2 ± 0.3194; (C) slope 4.0138,
159
intercept -1.1455; predicted slope bΜ‚ ± [s2(bΜ‚)]1/2 = 4.7183 ± (0.6660) (D) slope 0.0233, intercept -
160
0.0243; predicted slope bΜ‚ ± [s2(bΜ‚)]1/2 = 0 ± 0.1429. The slope estimated by linear regression fell
161
within one estimated standard deviation of the theoretically predicted slope except for the
162
lognormal observations (panel (C)), where the linear regression slope was 1.06 standard
163
deviation below the theoretically predicted slope.
7
164
8
165
Table 1. Estimating the slope b of Taylor's law (known to have slope 2 in this case) and its sample variance (known to be 4) by
166
two methods: linear regression, and the formulae in the Theorem using sample moments. We generated 10,000 n × N matrices,
167
where each element was iid from an exponential distribution with parameter λ = 1. For each n × N matrix, for each column j =
168
1, …, N, we calculated the column mean mj and column variance vj over elements in column j. The conventional least-squares
169
estimator (column (3)) of the linear regression slope b of the log vj as a linear function of the log mj and (N-2) times the sample
170
variance (column (5)) of the least-squares regression slope were calculated from MATLAB function “LinearModel.fit”
171
(Snedecor and Cochran 1980, pp. 155). For the same n × N matrix, we calculated 𝑏̂ (column (4)) and its sample variance s2(𝑏̂)
172
Μ‚ Μ‚2
Μ‚ 2 Μ‚4 − 𝑉̂ 2 )⁄𝑉̂ 3 − πœ‡
Μ‚ 2 ⁄𝑉̂ 4 } respectively, where
times (N-2) (column (6)) from the approximate formulae πœ‡
̂𝑀
Μ‚3 2 𝑀
3 ⁄𝑉 and {𝑀 (πœ‡
173
Μ‚ , 𝑉̂ , πœ‡
𝑀
Μ‚,
Μ‚4 were the sample moments estimated from all nN elements of the matrix. For every combination of n = 10,
3 and πœ‡
174
100, and N = 10, 100, the table shows the 2.5%, 50%, and 97.5% quantiles of the 10,000 estimates of the slope and (N-2) times
175
the variance of the slope of each method separately. The random matrices were simulated by using function “exprnd”
176
(MATLAB R2012b). Quantiles were calculated using the “Distributions” platform in JMP 10 (SAS Institute 2012).
177
9
(1)
(2)
N
N
(3) Regression 𝑏̂
(4) Approximate 𝑏̂
95% CI
median 95% CI
median
(5) (N-2)×Regression s2(𝑏̂ )
(6) (N-2)×Approximate s2(𝑏̂)
95% CI
median
95% CI
median
10
10
(0.8206, 3.1690)
2.0076
(0.5438, 4.2160) 1.7473
(0.4852, 9.3439)
2.1504
(0.3408, 15.8448)
2.2072
10
100 (1.6828, 2.3115)
1.9995
(1.4116, 2.7970) 1.9400
(1.5868, 3.6929)
2.4250
(1.7934, 9.0552)
3.6162
(0.5583, 3.4533)
1.9960
(0.5315, 3.6667) 1.9594
(0.7088, 13.5331)
3.1270
(0.6696, 15.7344)
3.2208
100 100 (1.6262, 2.3751)
1.9999
(1.5808, 2.4530) 1.9916
(2.3203, 5.3334)
3.5280
(2.4108, 6.5366)
3.9004
100 10
10
178
4. Empirical example: counts of oaks in Black Rock Forest
179
We report here a new analysis of data previously published and analyzed differently by Cohen et
180
al. (2012). In the Black Rock Forest, Cornwall, New York, 12 contiguous study plots, each
181
approximately 75 m by 75 m, were laid out in a rectangular array with three rows labeled A, B, C
182
and four columns labeled 1, 2, 3, 4. Each plot was identified by its row label and its column
183
label. For example, plot A1 was in the northwest corner of the study area while plot C4 was in
184
the southeast corner. Cohen et al. (2012, p. 15833, Figure 7) gave a map and contour diagram of
185
the study area and plots. Each plot was divided into nine approximately 25 m × 25 m subplots.
186
The number of oaks with diameter not less than 1 inch (2.54 cm) at breast height (approximately
187
1.5 m above ground) was recorded for each of the 108 subplots (Table 2). Here we consider only
188
the data from 2007. Cohen et al. (2012) give additional data and details.
189
At first sight, it seems plausible to propose that the data from this study design might be modeled
190
by the null model, since the plots (blocks) and subplots (observations within a block) are square
191
areas arbitrarily imposed on a contiguous rectangle of hillside forest. However, a one-way
192
analysis of variance (JMP 10, SAS Institute 2012) rejected (with P < 0.0001) the null hypothesis
193
that the variability in oak tree counts between plots could be accounted for by random sampling
194
from a single underlying distribution. Plots B4 and C4 in the southeast corner of the study area had
195
far more oaks per subplot than the other plots (Figure 2A). Plot A1 may have had fewer than average
196
oaks per subplot.
11
197
Table 2. Counts of oak trees in 2007 in Black Rock Forest plots A1, A2, …, C4, by subplot C, E, …, W. C = central, E = east
198
of central, N = north of central, S = south of central, W = west of central. This contingency table was derived from data
199
originally published as an online supplement by Cohen et al. (2012).
Plot
A1
A2
A3
A4
B1
B2
B3
B4
C1
C2
C3
C4
C
3
11
15
7
4
12
24
38
10
6
8
23
E
8
6
12
8
5
10
15
22
14
9
11
38
N
8
4
7
14
2
13
11
49
5
12
11
23
NE
8
16
8
9
9
12
20
17
6
8
8
51
NW
6
9
14
6
7
4
13
35
3
4
16
10
S
7
15
14
13
8
9
11
19
6
10
10
12
SE
2
18
15
10
7
14
9
40
12
13
9
19
SW
9
5
15
19
8
12
5
18
8
7
6
4
W
6
5
8
15
12
4
11
23
6
6
17
11
Plot
57
89
108
101
62
90
119
261
70
75
96
191
6.33
9.89
12.00 11.22 6.89
7.78
8.33
10.67 21.22
Subplot
total
Mean
Variance 5.75
28.61 11.50 18.44 8.61
10.00 13.22 29.00
13.75 33.19 136.00 12.69 8.75
12
13.50 223.94
200
Figure 2. Oak trees in Black Rock Forest, 2007. (A) Number of oaks per subplot in each of 12plots separately. The horizontal
201
line is the mean number of oaks per subplot, 12.21. A pooled estimate of the error variance (within plots) was 2.18. (B) Mean
202
and variance (plotted on log-log coordinates) of the number of oak trees per subplot in each of 12 plots. The straight (red) line
203
is the least-squares fitted linear model. The slightly convex (blue) curve is the least-squares fitted quadratic model. The
204
rightmost data point corresponds to plot B4, and the next-to-rightmost point corresponds to plot C4. The relationship between
205
variance and mean in these two outlying plots is consistent with that observed in the remaining 10 plots. (C) Frequency
206
histogram of the number of oak trees per subplot.
A
B
C
13
207
A least-squares linear model, TL, had coefficients Log(Variance(NTrees)) = -2.501856 +
208
2.3041749*Log(Mean(NTrees)). To test for linearity, we also fitted by least-squares a quadratic
209
model, obtaining the coefficients Log(Variance(NTrees)) = -0.936401 +
210
1.058334*Log(Mean(NTrees)) + 0.2398604*[Log(Mean(NTrees))]2. In the latter model, a t-test
211
to test the null hypothesis that the coefficient of the quadratic term did not differ significantly
212
from zero had P > 0.736, so there was no statistically significant evidence of nonlinearity. As
213
Figure 2B suggests, the quadratic term did not significantly improve the fit to the data. In the
214
former model, which is TL, the linear coefficient, 2.30417, was approximately 7.22 times its
215
standard error, so there was statistically significant evidence (P < 0.0001) that the slope was
216
positive, as is obvious from Figure 2B. The frequency histogram of oak trees per subplot was
217
right-skewed (Figure 2C).
218
The 108 counts of oak trees per subplot have the following moments (rounded to two decimal
219
places): mean 12.21, standard deviation 8.86, skewness 2.34, kurtosis 9.59, and CV 0.73. From
220
the Theorem (again rounding to two decimal places), bΜ‚ ± [s2(bΜ‚)]1/2 = 3.23 ± 0.77. The least-
221
squares estimate of the slope, 2.30, differed from the estimate of the slope derived from the
222
(counterfactual) model of iid sampling by 1.20 estimated standard deviations. Despite the
223
inhomogeneity among plots in the mean number of oaks per subplot revealed by the ANOVA,
224
the null model predicted a slope of TL that was in reasonable agreement with that observed. The
225
covariation between the mean and variance of samples from a skewed distribution in this
226
example contributed substantially to the relationship described by TL.
14
227
5. Conclusions
228
The Theorem makes it possible to compare the parameters of Taylor's law estimated from block-
229
structured data with the parameters that would arise from random sampling and random blocking
230
of the underlying frequency distribution. In ecological applications, this comparison makes it
231
possible to compare the effects of biologically based blocking with random sampling of a
232
frequency distribution. It remains to be seen how generally this null model will approximate well
233
the slope of TL estimated in empirical examples. To demonstrate mechanisms other than random
234
sampling that generate TL, it will be necessary to design studies in which the variation in means
235
between blocks is too large to be accounted for by random sampling, or in which the estimate of
236
b differs significantly from that predicted by the Theorem.
237
6. Appendix
238
If X is a real-valued random variable with finite mean E(X) and finite variance var(X), and if a
239
real-valued function f of real x is twice differentiable at E(X), then the delta method (Oehlert
240
1992; Hosmer, Lemeshow and May 2008, pp. 355-358) gives the approximations
241
𝑓(𝑋) ≈ 𝑓(𝐸(𝑋)) + (𝑋 − 𝐸(𝑋)){(𝑓 ′ (π‘₯))|π‘₯=𝐸(𝑋) },
242
𝐸(𝑓(𝑋)) ≈ 𝑓(𝐸(𝑋)) + {
243
𝑓 ′′ (π‘₯)
2
| π‘₯=𝐸(𝑋) } ⋅ π‘£π‘Žπ‘Ÿ(𝑋),
2
π‘£π‘Žπ‘Ÿ(𝑓(𝑋)) ≈ {(𝑓 ′ (π‘₯))|π‘₯=𝐸(𝑋) } π‘£π‘Žπ‘Ÿ(𝑋).
244
In practice, we compute sample moments from observations of X, plug them in to replace the
245
population moments, and accept the result as approximations to the left sides.
15
246
Lemma 1. If x > 0 and f(x) = log(x), then 𝑓 ′ (π‘₯) = 1⁄π‘₯ , 𝑓 ′′ (π‘₯) = −π‘₯ −2. Assume the sample size
247
in block 𝑗 is 𝑛𝑗 (𝑗 = 1, 2, … , 𝑁) and N is the number of blocks. Assume mj is the sample mean in
248
block j and E(mj) = M > 0. Then the approximations given by the delta method are log π‘šπ‘— ≈
249
log 𝑀 + (π‘šπ‘— − 𝑀)⁄𝑀, π‘£π‘Žπ‘Ÿ(log π‘šπ‘— ) ≈ 𝑉 ⁄(𝑛𝑗 𝑀2 ), 𝐸(log π‘šπ‘— ) ≈ log 𝑀 − 𝑉 ⁄(2𝑛𝑗 𝑀2 ).
250
Proof. In the delta method, we set X = mj , f(x) = log(x). From Loève (1977, p. 276, Exercise 5),
251
Oehlert (1992) showed essentially that for π‘ž ≥ 0, 𝐸 {|π‘šπ‘— − 𝑀|
252
use this bound with q = 0, 1/2, and 1 separately. Applying Taylor’s expansion to log π‘šπ‘— yields
253
2
2(π‘ž+1)
} = 𝑂 (𝑛𝑗−(π‘ž+1) ). We shall
3
log π‘šπ‘— = log 𝑀 + (π‘šπ‘— − 𝑀)/𝑀 − (π‘šπ‘— − 𝑀) /(2𝑀2 ) + 𝑂 ((π‘šπ‘— − 𝑀) ).
254
Following Oehlert’s notation, we define 𝑔(π‘šπ‘— ) = log π‘šπ‘— , and 𝐴2 (π‘šπ‘— ) = log 𝑀 + (π‘šπ‘— − 𝑀)/
255
𝑀 − (π‘šπ‘— − 𝑀) /(2𝑀2 ). Because M > 0 and because the logarithmic function is infinitely
256
differentiable in any open interval that contains M, by Taylor’s theorem, there exists a finite
257
constant C > 0, such that |𝑔(π‘šπ‘— ) − 𝐴2 (π‘šπ‘— )| ≤ 𝐢 |(π‘šπ‘— − 𝑀) |. From Oehlert (1992) with q =
258
1/2, we have 𝐸 {𝐢 |(π‘šπ‘— − 𝑀) |} = 𝑂(𝑛𝑗−3 2 ). Therefore, as 𝑛𝑗 → ∞, for 1 < πœ‚ < 2, 𝑛𝑗 πœ‚ ⋅
259
𝐸{|𝑔(π‘šπ‘— ) − 𝐴2 (π‘šπ‘— )|} = 𝑂 (𝑛𝑗 πœ‚−2 ) → 0. Here “→” denotes point-wise convergence. By triangle
260
inequality, 𝐸 (𝑔(π‘šπ‘— )) = 𝐸 (𝐴2 (π‘šπ‘— )) + π‘œ(𝑛𝑗 −πœ‚ ). After substitution, 𝐸(log π‘šπ‘— ) = log 𝑀 +
261
𝐸(π‘šπ‘— − 𝑀)/𝑀 − 𝐸 {(π‘šπ‘— − 𝑀) } /(2𝑀2 ) + π‘œ(𝑛𝑗 −πœ‚ ) = log 𝑀 − 𝑉 ⁄(2𝑀2 𝑛𝑗 ) + π‘œ(𝑛𝑗 −πœ‚ ).
262
Hence 𝐸(log π‘šπ‘— ) ≈ log 𝑀 − 𝑉 ⁄(2𝑀2 𝑛𝑗 ). As 𝑛𝑗 → ∞, this leads to the first-order approximation
263
𝐸(log π‘šπ‘— ) → log 𝑀.
2
3
3
3
⁄
3
2
16
264
Now we estimate π‘£π‘Žπ‘Ÿ(log π‘šπ‘— ) using the first-order Taylor expansion of log π‘šπ‘— , namely,
265
log π‘šπ‘— = log 𝑀 + (π‘šπ‘— − 𝑀)/𝑀 + 𝑂 ((π‘šπ‘— − 𝑀) ). Denote 𝐴1 (π‘šπ‘— ) = log 𝑀 + (π‘šπ‘— − 𝑀)/𝑀.
266
By Taylor’s theorem, there exists a finite constant 𝐢1 > 0, such that |𝑔(π‘šπ‘— ) − 𝐴1 (π‘šπ‘— )| ≤
267
𝐢1 |(π‘šπ‘— − 𝑀) |. From Oehlert (1992) with q = 0, we have 𝐸 {𝐢1 |(π‘šπ‘— − 𝑀) |} = 𝑂(𝑛𝑗−1 ). We
268
now approximate 𝐸 {(log π‘šπ‘— ) } using the delta method.
269
2
2
2
2
2
{𝑔(π‘šπ‘— )} = {𝑔(π‘šπ‘— ) − 𝐴1 (π‘šπ‘— ) + 𝐴1 (π‘šπ‘— )}
2
272
273
2
= {𝑔(π‘šπ‘— ) − 𝐴1 (π‘šπ‘— )} + {𝐴1 (π‘šπ‘— )} + 2{𝐴1 (π‘šπ‘— )} ⋅ {𝑔(π‘šπ‘— ) − 𝐴1 (π‘šπ‘— )}.
270
271
2
In other words,
2
2
2
{𝑔(π‘šπ‘— )} − {𝐴1 (π‘šπ‘— )} = {𝑔(π‘šπ‘— ) − 𝐴1 (π‘šπ‘— )} + 2{𝐴1 (π‘šπ‘— )} ⋅ {𝑔(π‘šπ‘— ) − 𝐴1 (π‘šπ‘— )}.
2
2
4
Since |𝑔(π‘šπ‘— ) − 𝐴1 (π‘šπ‘— )| ≤ 𝐢1 |(π‘šπ‘— − 𝑀) |, |𝑔(π‘šπ‘— ) − 𝐴1 (π‘šπ‘— )| ≤ 𝐢12 |(π‘šπ‘— − 𝑀) |. So
2
𝐢1
3
|(π‘šπ‘— − 𝑀) |.
𝑀
274
{𝐴1 (π‘šπ‘— )} ⋅ {𝑔(π‘šπ‘— ) − 𝐴1 (π‘šπ‘— )} ≤ 𝐢1 log 𝑀 ⋅ |(π‘šπ‘— − 𝑀) | +
275
|{𝑔(π‘šπ‘— )} − {𝐴1 (π‘šπ‘— )} | ≤ 𝐢12 |(π‘šπ‘— − 𝑀) | + 2𝐢1 log 𝑀 ⋅ |(π‘šπ‘— − 𝑀) | +
2
2
4
2
2𝐢1
3
|(π‘šπ‘— − 𝑀) |.
𝑀
276
From Oehlert (1992) using q = 1 for the first term on the right side, q = 0 for the second term,
277
and q = 1/2 for the third term, the expectation of the right side of the above inequality is
278
𝑂(𝑛𝑗 −1 ). As 𝑛𝑗 → +∞, for 0 < 𝛾 < 1, 𝑛𝛾 𝐸 |{𝑔(π‘šπ‘— )} − {𝐴1 (π‘šπ‘— )} | ≤ 𝑂(𝑛𝑗 𝛾−1 ) → 0. From
279
the triangle inequality, 𝐸 [{𝑔(π‘šπ‘— )} ] = 𝐸 [{𝐴1 (π‘šπ‘— )} ] + π‘œ(𝑛𝑗 −𝛾 ). Thus the approximate mean
2
2
2
17
2
2
2
2
280
of (log π‘šπ‘— ) is 𝐸 {(log π‘šπ‘— ) } ≈ 𝐸 [{log 𝑀 + (π‘šπ‘— − 𝑀)} /𝑀] = 𝐸 {(log 𝑀)2 + 2 (log 𝑀)(π‘šπ‘— −
281
𝑀)/𝑀 + (π‘šπ‘— − 𝑀) ⁄𝑀2 } = (log 𝑀)2 + 𝑉 ⁄(𝑀2 𝑛𝑗 ).
282
Overall, the estimated variance of log π‘šπ‘— from the delta method using the first-order Taylor
283
expansion of log π‘šπ‘— is π‘£π‘Žπ‘Ÿ(log π‘šπ‘— ) = 𝐸 {(log π‘šπ‘— ) } − {𝐸(log π‘šπ‘— )} ≈ (log 𝑀)2 +
284
𝑉 ⁄(𝑀2 𝑛𝑗 ) − (log 𝑀)2 = 𝑉 ⁄(𝑀2 𝑛𝑗 ). This proves Lemma 1.
285
Lemma 2. Under the assumptions of Lemma 1, also assume vj is the sample variance in block j
286
and E(vj) = V > 0. Then the approximations given by the delta method are log 𝑣𝑗 ≈ log 𝑉 +
287
(𝑣𝑗 − 𝑉)⁄𝑉 , π‘£π‘Žπ‘Ÿ(log 𝑣𝑗 ) ≈ (πœ‡4 − 𝑛𝑗−1 𝑉 2 ) /(𝑛𝑗 𝑉 2 ), 𝐸(log 𝑣𝑗 ) ≈ log 𝑉 − 2𝑛 (𝑉42 − 𝑛𝑗 −1).
288
Proof. Setting X = vj and following the same arguments as in the proof of Lemma 1 gives the
289
results.
290
Lemma 3. Under the assumptions of Lemma's 1 and 2, the covariance of the sample mean and
291
sample variance is π‘π‘œπ‘£(𝑣𝑗 , π‘šπ‘— ) = πœ‡3 ⁄𝑛𝑗 ,where μ3 is the third central moment.
292
Zhang (2007) gives a proof of this classical formula, which has been known at least since 1903
293
(Editorial 1903, p. 279, equation (xiii); Editorial 1913, p. 7, equation (xxvi); Neyman 1925, p.
294
479, equation (67); Neyman 1926).
295
Proof of Theorem. When all blocks are weighted equally, the least-squares estimator of b and the
296
unbiased estimator of its sample variance are respectively (Snedecor and Cochran 1980, pp. 155)
297
𝑏̂ = π‘π‘œπ‘£+ (log 𝑣𝑗 , log π‘šπ‘— )⁄π‘£π‘Žπ‘Ÿ+ (log π‘šπ‘— ),
2
2
2
𝑛 −3
1
𝑗
πœ‡
𝑗
18
𝑛 −3
𝑗
298
2
2
𝑠 2 (𝑏̂) = [π‘£π‘Žπ‘Ÿ+ (log 𝑣𝑗 )⁄π‘£π‘Žπ‘Ÿ+ (log π‘šπ‘— ) − {π‘π‘œπ‘£+ (log 𝑣𝑗 , log π‘šπ‘— )} ⁄{π‘£π‘Žπ‘Ÿ+ (log π‘šπ‘— )} ] /(𝑁 − 2).
299
The notations π‘π‘œπ‘£+ (log 𝑣𝑗 , log π‘šπ‘— ) and π‘£π‘Žπ‘Ÿ+ (log π‘šπ‘— ) are to be read as the covariance and
300
variance across all blocks and not as referring to any single block j. Explicitly, the sample
301
estimators are defined by
302
π‘£π‘Žπ‘Ÿ+ (log π‘šπ‘— ) =
1
𝑁−1
2
∑𝑁
𝑗=1(log π‘šπ‘— ) −
2
1
1
𝑁(𝑁−1)
2
(∑𝑁
𝑗=1 log π‘šπ‘— ) ,
2
1
303
𝑁
π‘£π‘Žπ‘Ÿ+ (log 𝑣𝑗 ) = 𝑁−1 ∑𝑁
𝑗=1(log 𝑣𝑗 ) − 𝑁(𝑁−1) (∑𝑗=1 log 𝑣𝑗 ) ,
304
𝑁
𝑁
π‘π‘œπ‘£+ (log 𝑣𝑗 , log π‘šπ‘— ) = 𝑁−1 ∑𝑁
𝑗=1(log π‘šπ‘— ⋅ log 𝑣𝑗 ) − 𝑁(𝑁−1) (∑𝑗=1 log π‘šπ‘— )(∑𝑗=1 log 𝑣𝑗 ).
305
They are all consistent by law of large numbers: as 𝑁 → ∞, π‘£π‘Žπ‘Ÿ+ (log π‘šπ‘— ) →𝑃 π‘£π‘Žπ‘Ÿ(log π‘šπ‘— ),
306
π‘£π‘Žπ‘Ÿ+ (log 𝑣𝑗 ) →𝑃 π‘£π‘Žπ‘Ÿ(log 𝑣𝑗 ), and π‘π‘œπ‘£+ (log 𝑣𝑗 , log π‘šπ‘— ) →𝑃 π‘π‘œπ‘£(log 𝑣𝑗 , log π‘šπ‘— ).
307
To find the limits in probability of 𝑏̂ and 𝑠 2 (𝑏̂), we approximate the above estimators by the
308
delta method using Lemmas 1, 2, and 3. We first approximate the numerator and the
309
denominator of 𝑏̂ separately. For the numerator of bΜ‚, namely, π‘π‘œπ‘£+ (log 𝑣𝑗 , log π‘šπ‘— ), The first
310
term is approximately
1
1
19
𝑁
𝑁
𝑗=1
𝑗=1
1
1
1
1
∑(log π‘šπ‘— ⋅ log 𝑣𝑗 ) ≈
∑ {log 𝑀 + (π‘šπ‘— − 𝑀)} ⋅ {log 𝑉 + (𝑣𝑗 − 𝑉)}
𝑁−1
𝑁−1
𝑀
𝑉
311
𝑁
𝑁
𝑗=1
𝑗=1
𝑁
log 𝑉
log 𝑀
=
⋅ log 𝑀 ⋅ log 𝑉 +
∑(π‘šπ‘— − 𝑀) +
∑(𝑣𝑗 − 𝑉)
(𝑁 − 1)𝑀
(𝑁 − 1)𝑉
𝑁−1
312
𝑁
1
+
∑(π‘šπ‘— − 𝑀)(𝑣𝑗 − 𝑉).
(𝑁 − 1)𝑀𝑉
313
𝑗=1
314
315
316
317
The second term of the numerator of bΜ‚ is approximately
1
1
𝑁(𝑁−1)
1
𝑉
1
𝑁
𝑁
𝑁
(∑𝑁
𝑗=1 log π‘šπ‘— )(∑𝑗=1 log 𝑣𝑗 ) ≈ 𝑁(𝑁−1) ∑𝑗=1 {log 𝑀 + 𝑀 (π‘šπ‘— − 𝑀)} ⋅ ∑𝑗=1 {log 𝑉 +
𝑁
log 𝑉
log 𝑀
𝑁
(𝑣𝑗 − 𝑉)} = 𝑁−1 ⋅ log 𝑀 ⋅ log 𝑉 + (𝑁−1)𝑀 ∑𝑁
𝑗=1(π‘šπ‘— − 𝑀) + (𝑁−1)𝑉 ∑𝑗=1(𝑣𝑗 − 𝑉) +
1
𝑁(𝑁−1)𝑀𝑉
𝑁
∑𝑁
𝑗=1(π‘šπ‘— − 𝑀) ∑𝑗=1(𝑣𝑗 − 𝑉).
1
1
318
𝑁
Therefore π‘π‘œπ‘£+ (log 𝑣𝑗 , log π‘šπ‘— ) ≈ (𝑁−1)𝑀𝑉 ∑𝑁
𝑗=1(π‘šπ‘— − 𝑀)(𝑣𝑗 − 𝑉) − 𝑁(𝑁−1)𝑀𝑉 ∑𝑗=1(π‘šπ‘— −
319
𝑁
𝑁
𝑁
𝑀) ∑𝑁
𝑗=1(𝑣𝑗 − 𝑉) = (𝑁−1)𝑀𝑉 ∑𝑗=1 π‘šπ‘— 𝑣𝑗 − 𝑁(𝑁−1)𝑀𝑉 ∑𝑗=1 π‘šπ‘— ∑𝑗=1 𝑣𝑗 =
320
2
1
1
1
2
𝑁
denominator of 𝑏̂ is approximately π‘£π‘Žπ‘Ÿ+ (log π‘šπ‘— ) ≈ 𝑀2 {(𝑁−1) ∑𝑁
𝑗=1 π‘šπ‘— − 𝑁(𝑁−1) (∑𝑗=1 π‘šπ‘— ) } =
321
π‘π‘œπ‘£+ (π‘šπ‘— ,𝑣𝑗 ) π‘£π‘Žπ‘Ÿ+ (π‘šπ‘— )
π‘£π‘Žπ‘Ÿ+ (π‘šπ‘— )⁄𝑀2 . Consequently, for large nj, 𝑗 = 1, 2, … , 𝑁, 𝑏̂ ≈
⁄ 𝑀2 . By
𝑀𝑉
322
π‘π‘œπ‘£(π‘šπ‘— ,𝑣𝑗 ) π‘£π‘Žπ‘Ÿ(π‘šπ‘— )
consistency, for large N, using Lemma 3 in the numerator, 𝑏̂ ≈
⁄ 𝑀2 =
𝑀𝑉
323
1
πœ‡3
⁄
𝑉
𝑛𝑗 𝑀𝑉 𝑛𝑗 𝑀2
1
= πœ‡3 𝑀⁄𝑉 2 = 𝛾1⁄𝐢𝑉 .
20
π‘π‘œπ‘£+ (π‘šπ‘— ,𝑣𝑗 )
𝑀𝑉
. Similarly, the
324
The derivation of π‘£π‘Žπ‘Ÿ+ (log 𝑣𝑗 ) is the same as that of π‘£π‘Žπ‘Ÿ+ (log π‘šπ‘— ). Replacing π‘šπ‘— with 𝑣𝑗 and 𝑀
325
with 𝑉 yields π‘£π‘Žπ‘Ÿ+ (log 𝑣𝑗 ) ≈ π‘£π‘Žπ‘Ÿ+ (𝑣𝑗 )⁄𝑉 2. For large N and 𝑛𝑗 , 𝑗 = 1, 2, … , 𝑁, substituting into
326
the formula for 𝑠 2 (𝑏̂) the estimators corresponding to π‘£π‘Žπ‘Ÿ+ (π‘šπ‘— ), π‘£π‘Žπ‘Ÿ+ (𝑣𝑗 ), and 𝑏̂ yields
327
(𝑁 − 2)𝑠 (𝑏̂) ≈ (
2
πœ‡4
𝑉
𝑀2 (πœ‡4 𝑉 − 𝑉 3 − πœ‡32 )
πœ… − 1 − 𝛾12
2
2
− 1)⁄ 2 − (πœ‡3 𝑀⁄𝑉 ) =
=
,
(𝑁 − 2)𝑉 4
(𝑁 − 2)(𝐢𝑉)2
𝑉2
𝑀
328
where κ = μ4/V2 is the kurtosis. This completes the proof.
329
References
330
Ballantyne, F. and Kerkhoff, A. J. (2007). The observed range for temporal mean-variance
331
scaling exponents can be explained by reproductive correlation. Oikos 116, 174-180.
332
Ballantyne, F. (2005). The upper limit for the exponent of Taylor’s power law is a consequence
333
of deterministic population growth. Evolutionary Ecology Research 7(8), 1213-1220.
334
Bartlett, M. S. 1936 Some notes on insecticide tests in the laboratory and in the field.
335
Supplement to the Journal of the Royal Statistical Society 3(2):185-194. Stable URL:
336
http://www.jstor.org/stable/2983670
337
M. S. Bartlett 1947 The use of transformations. Biometrics 3(1):39-52 Stable URL:
338
http://www.jstor.org/stable/3001536
339
Bechinski, E. J. and Pedigo, L. P. (1981). Population dispersion and development of sampling
340
plans for Orius insidiosus and Nabis spp. in soybeans. Environmental Entomology 10(6), 956-
341
959.
21
342
Cohen, J. E. (2013). Taylor's power law of fluctuation scaling and the growth-rate theorem.
343
Theoretical Population Biology doi: 10.1016/j.tpb.2013.04.002.
344
Cohen, J. E., Xu, M. and Schuster, W. S. F. (2013a). Stochastic multiplicative population growth
345
predicts and interprets Taylor’s power law of fluctuation scaling. Proceedings of the Royal
346
Society B, Biological Sciences 280, 20122955.
347
Cohen, J. E., Xu, M. and Brunborg, H. (2013b). Taylor's law applies to spatial variation in a
348
human population. Genus 69(1), 25-60.
349
Cohen, J. E., Xu, M. and Schuster, W. S. F. (2012). Allometric scaling of population variance
350
with mean body size is predicted from Taylor’s law and density-mass allometry. Proceedings of
351
the National Academy of Sciences, USA 109(39), 15829–15834.
352
Davidian, M. and Carroll, R. J. 1987 Variance function estimation. Journal of the American
353
Statistical Association 82(400):1079-1091, December. Stable URL:
354
http://www.jstor.org/stable/2289384 he
355
Editorial [probably K. Pearson, then the editor] (1903). On the probable errors of frequency
356
constants. Biometrika. 2(3), 273-281. Stable URL: http://www.jstor.org/stable/2331603.
357
Editorial [probably K. Pearson, then the editor] (1913). On the probable errors of frequency
358
constants. Biometrika. 9(1/2), 1-10. Stable URL: http://www.jstor.org/stable/2331796.
359
Eisler, Z., Bartos, I. and Kertész, J. (2008). Fluctuation scaling in complex systems: Taylor’s law
360
and beyond. Advances in Physics 57, 89-142.
22
361
Hosmer, D. W., Lemeshow, S. and May S. (2008). Applied Survival Analysis: Regression
362
Modeling of Time-to-Event Data, 2nd Edition, John Wiley & Sons, Inc.
363
Jørgensen, B., Kendal, W. S., Demétrio, C. G. B. and Holst, R. (Draft 2012). The ecological
364
footprint of Taylor’s universal power law. Draft.
365
Jørgensen, B. (1997). The Theory of Dispersion Models. Chapman & Hall, London.
366
Kaltz, O., Escobar-Páramo, P., Hochberg, M. E. and Cohen, J. E. (2012). Bacterial microcosms
367
obey Taylor's law: effects of abiotic and biotic stress and genetics on mean and variance of
368
population density. Ecological Processes, 1:5.
369
Kendal, W.S. and Jørgensen, B. (2011). Taylor's power law and fluctuation scaling explained by
370
a central-limit-like convergence. Physical Review E 83, 066115.
371
Kilpatrick, A. M. and Ives, A. R. (2003). Species interactions can explain Taylor’s power law for
372
ecological time series. Nature 422, 65-68.
373
Kogan, M., Ruesink, W. G. and McDowell, K. (1974). Spatial and temporal distribution patterns
374
of the bean leaf beetle, Cerotoma trifurcata (Forster), on soybeans in Illinois. Environmental
375
Entomology 3(4), 607-617.
376
Loève, M. (1977). Probability Theory I, 4th edition, New York, Springer-Verlag.
377
Neter, J., Wasserman, W. and Kutner, M. H. (1990). Applied Linear Statistical Models, 3rd
378
edition, 622-623.CRC Press.
23
379
Neyman, J. (1925). Contributions to the theory of small samples drawn from a finite population.
380
Biometrika 17(3/4), 472-479. Stable URL: http://www.jstor.org/stable/2332092.
381
Neyman, J. (1926). On the correlation of the mean and the variance in samples drawn from an
382
"infinite" population. Biometrika 18(3/4), 401-413. Stable URL:
383
http://www.jstor.org/stable/2331958.
384
Oehlert, G. W. (1992). A note on the delta method. The American Statistician 46(1), 27–29.
385
Ramsayer, J., Fellous, S., Cohen, J. E. and Hochberg, M. E. (2012). Taylor’s law holds in
386
experimental bacterial populations but competition does not influence the slope. Biology Letters
387
8(2), 316-319.
388
Rohatgi, V. K. and Székely, G. J. (1989). Sharp inequalities between skewness and kurtosis.
389
Statistics & Probability Letters 8, 297-299.
390
SAS Institute Inc. 2012. JMP® 10 Basic Analysis and Graphing. Cary, NC: SAS Institute Inc.
391
Snedecor, G. W. and Cochran, W. G. (1980). Statistical Methods. 7th ed. Iowa State University
392
Press, Ames.
393
Taylor, L. R. (1961). Aggregation, variance and the mean. Nature 189(4766), 732–735.
394
Taylor, L. R. 1984 Assessing and interpreting the spatial distributions of insect populations.
395
Annual Review of Entomology 29: 321-357. doi:10.1146/annurev.en.29.010184.001541
396
Tweedie, M. C. K. 1946 The regression of the sample variance on the sample mean. J. Lond.
397
Math. Soc. 21:22–28. doi:10.1112/jlms/s1-21.1.22
24
398
Tweedie, M. C. K. 1947 Functions of a statistical variate with given means, with special
399
reference to Laplacian distributions. Proc. Cambridge Philos. Soc. 43:41–49.
400
doi:10.1017/S0305004100023185
401
Tweedie, M. C. K. 1984 An index which distinguishes between some important exponential
402
families. In: Statistics: Applications and New Directions. Proceedings of the Indian Statistical
403
Institute Golden Jubilee International Conference, J. K. Ghosh and J. Roy, eds. Indian Statistical
404
Institute, Calcutta. Pp. 579-604.
405
Wilson, L. T., Sterling, W. L., Rummel, D. R. and DeVay, J. E. (1989). Quantitative sampling
406
principles in cotton. Ch. 5 In Integrated Pest Management Systems and Cotton Production (eds
407
R. E. Frisbie, K. M. El-Zik, and L. T. Wilson) pp. 85-119. John Wiley & Sons, Inc.
408
Zhang, L. (2007). Sample mean and sample variance: their covariance and their (in)dependence.
409
The American Statistician 61(2), 159-160.
25
Download