Statistics answers questions

BESC 320 (Water & Bioenvironmental Science)
Intro to probabilistic thought
Statistics answers questions:
It gives measures of effect size
It sets confidence limits on conclusions
It is simple in principles and general practice
Therefore, it is a crime to NOT be an expert
• E.g. two simple tests
1. A lady is asked in eight instances whether
milk or tea was added first to her cup
2. The lady is asked to rate how much she
enjoys each cup on a scale of 1-100
Ronald Aylmer Fisher
(1890-1962)
• Founding father of modern statistics and
the Darwinian synthesis. Book rec: The Origins
of Theoretical Population Genetics, by Will Provine ($10)
• In 1919 began work as a statistician at the Rothamsted
Agricultural Experiment Station in the UK.
• Published many papers and wrote several books on
experimental design and evolution.
• Creative demonstration of powers of statistical
analysis using data from a “Lady Tasting Tea”.
1. A lady is asked in eight instances
whether milk or tea is added first
                     Correct   Incorrect
Actual data          8         0
Random expectation   4         4
Calculate the degree of pattern (deviation from random). If it is improbably large,
conclude that bias (causation) exists. That is, you conclude she can tell the difference.
If the pattern could reasonably be due to chance alone, then accept
the default (null) hypothesis that she cannot tell (i.e. unpatterned data).
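How improbable is 8 correct out of 8? A minimal sketch, assuming each cup is an independent 50:50 guess (which matches the 4:4 random expectation in the table). The function name `binomial_upper_tail` is made up for illustration.

```python
from math import comb

def binomial_upper_tail(n_trials, n_correct, p_guess=0.5):
    """Probability of at least n_correct successes in n_trials independent guesses."""
    return sum(
        comb(n_trials, k) * p_guess**k * (1 - p_guess) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )

# Chance of getting all eight cups right by pure guessing.
p_all_eight = binomial_upper_tail(8, 8)
print(f"P(8 of 8 correct by guessing) = {p_all_eight:.4f}")

# Alternative assumption: the taster knows exactly four cups of each kind
# were prepared, so a perfect score is one of comb(8, 4) possible sortings.
p_fixed_design = 1 / comb(8, 4)
print(f"P(perfect sorting, 4 + 4 design) = {p_fixed_design:.4f}")
```

Either way the probability is far below 0.05, so a perfect score is hard to attribute to chance.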
2. The lady rates how much she
enjoys each cup on a scale of 1-100
Calculate deviation from random (using the trick of comparing between- and within-group variance).
Both approaches contrast Yin of Pattern with Yang of Random
☯
At Rothamsted Fisher recognized problems with
agricultural experiments
Same field, same treatment, but plant
performance is uneven...
[Picture: one field, with patches labeled "Thin Growth" and "Thick Growth"]
Fisher’s Solution:
Replicate, Randomize
(Spread variation without
bias among treatments)
Source of Picture: http://www.ipm.iastate.edu/ipm/icm/files/images/uneven-corn-VS6.jpg
Fisher’s Lessons from Rothamsted
Experiments prior to Fisher generally involved
two fields (containing hundreds of plants), each
receiving a treatment (e.g. two levels of N)
[Figure: growth (y-axis) by treatment (x-axis), field with high N vs. field with low N]
Problem: So much variability exists within each
field it is difficult or impossible to tease out the
treatment effect (i.e. a signal to noise problem)
Fisher’s Solution at Rothamsted
– Old Problematic Design: One large field receiving high
nitrogen (N), one large field receiving low nitrogen (N).
(Today this design is sometimes called “pseudoreplication” if
the experimenter attempts to say that the sample size is the
number of plants.)
– New Improved Design: Many small plots, randomly
receiving high N or low N; plots can also be blocked to
help tease out the variation due to location and local
conditions.
Hurlbert, S. H. (1984). Pseudoreplication and the design of ecological field experiments. Ecological Monographs 54(2): 187-211.
Examples of Correct & Incorrect Ways
to Randomize Treatments
Correct Ways:
• Use a random
number table.
• Pick treatments
from a hat.
• Flip a coin.
Incorrect Ways:
• Haphazardly decide which experimental
units should receive which treatments.
(Problem: too tempting for experimenter to bias.)
• Use a net to grab the goldfish in an
ecology study. (Problem: might pick just the
easiest to catch, sickly animals.)
• Alternate treatments (every other one).
(Problem: that’s systematic, not random; who knows
what other factors vary in the same systematic way.)
• Assign people to drug study on the basis
of their last name. (Problem: could be related to
a person’s ancestry.)
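The "treatments from a hat" idea can be sketched in a few lines. This is only an illustration: the plot names, the function `randomize_treatments`, and the seed are assumptions, not part of the original slides.

```python
import random

def randomize_treatments(units, treatments, seed=None):
    """Assign treatments to units at random, in equal numbers per treatment."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)  # a seed makes the draw reproducible
    # Fill the "hat" with equal numbers of each treatment label, then shuffle.
    hat = list(treatments) * (len(units) // len(treatments))
    rng.shuffle(hat)
    return dict(zip(units, hat))

plots = [f"plot_{i}" for i in range(1, 13)]  # hypothetical experimental units
assignment = randomize_treatments(plots, ["high N", "low N"], seed=1)
for plot, treatment in assignment.items():
    print(plot, treatment)
```

The experimenter never chooses which unit gets which treatment, so none of the biases listed above can creep in.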
Fisher, Randomization, Replication & Blocking
• No replication (or pseudoreplication) (Rothamsted, pre-Fisher):
[Figure: field with high N; field with low N]
• Replicated with complete randomization:
[Figure: field broken up into smaller plots]
Treatments are applied to plots rather than to an entire field;
this improves replication & interspersion of treatments.
• Replicated, randomized and blocked design:
[Figure: field broken up into smaller plots & plots are grouped; each dashed rectangle is a block]
Plots are blocked by location or other condition; treatments
are applied randomly to plots within blocks.
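A replicated, randomized and blocked layout might be generated as follows. This is a sketch under assumed numbers (4 blocks of 4 plots each); the function name `randomize_within_blocks` is hypothetical.

```python
import random

def randomize_within_blocks(n_blocks, plots_per_block, treatments, seed=None):
    """Return {block number: list of treatments, one per plot}, shuffled per block."""
    reps, remainder = divmod(plots_per_block, len(treatments))
    if remainder:
        raise ValueError("plots per block must be a multiple of the number of treatments")
    rng = random.Random(seed)
    layout = {}
    for block in range(1, n_blocks + 1):
        labels = list(treatments) * reps  # every block receives every treatment
        rng.shuffle(labels)               # random order within the block only
        layout[block] = labels
    return layout

layout = randomize_within_blocks(n_blocks=4, plots_per_block=4,
                                 treatments=["high N", "low N"], seed=2)
for block, labels in layout.items():
    print(f"block {block}: {labels}")
```

Because each block contains both treatments equally often, differences among blocks (location, soil, moisture) cannot masquerade as a treatment effect.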
Another of Fisher’s Contributions to Statistics:
The Analysis of Variance (ANOVA)
Allows scientists to mathematically partition variation in
a measured variable due to different sources
(treatments, blocks, plots, for example).
Some of Fisher’s contributions to the field of statistics grew out of
his experience with spatial agricultural experiments at
Rothamsted.
Why do these two plants differ in growth? Is it because of block,
treatment, or extraneous variation within plots?
At Rothamsted, Fisher saw firsthand that the purpose of good experimental design is
not to eliminate variation entirely, but rather to try to ensure that extraneous
variation is spread evenly among treatments. In the case of ANOVA, the
experimental design can enable the variation to be partitioned mathematically
during analysis.
Variation in growth of plants can be partitioned into different sources of variation:
1. Variation in soil moisture, texture, etc. within a plot.
2. Variation between treatments (high N and low N).
3. Variation in soil moisture, texture, sunlight, etc., among blocks.
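The partitioning can be illustrated with a tiny randomized block example. All growth values below are made up for illustration (one plot per treatment in each of three blocks); the residual term stands in for extraneous plot-level variation.

```python
from statistics import mean

# Hypothetical plant growth, one plot per treatment within each block.
growth = {
    "block 1": {"high N": 24.0, "low N": 18.0},
    "block 2": {"high N": 30.0, "low N": 25.0},
    "block 3": {"high N": 27.0, "low N": 20.0},
}
blocks = list(growth)
treatments = ["high N", "low N"]

grand_mean = mean(growth[b][t] for b in blocks for t in treatments)
block_means = {b: mean(growth[b][t] for t in treatments) for b in blocks}
treatment_means = {t: mean(growth[b][t] for b in blocks) for t in treatments}

# Sums of squares: total variation and its three sources.
ss_total = sum((growth[b][t] - grand_mean) ** 2 for b in blocks for t in treatments)
ss_block = len(treatments) * sum((m - grand_mean) ** 2 for m in block_means.values())
ss_treatment = len(blocks) * sum((m - grand_mean) ** 2 for m in treatment_means.values())
ss_residual = sum(
    (growth[b][t] - block_means[b] - treatment_means[t] + grand_mean) ** 2
    for b in blocks
    for t in treatments
)

# F statistic for treatment: among-treatment variance over residual variance.
df_treatment = len(treatments) - 1
df_residual = (len(blocks) - 1) * (len(treatments) - 1)
f_treatment = (ss_treatment / df_treatment) / (ss_residual / df_residual)

print(f"SS total = {ss_total}, block = {ss_block}, "
      f"treatment = {ss_treatment}, residual = {ss_residual}")
print(f"F for treatment = {f_treatment}")
```

The three component sums of squares add up to the total, which is the sense in which ANOVA "partitions" variation.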
This and following slides by TJ DeWitt
Let us try an experiment and analysis
Fiji water is awesome
Everyone knows that
Let’s prove it
Two tests:
I. Side-by-side comparison
II. Scaled measure of quality
1. Chi-square (χ²) on our water preference test data
                      Actual data   Random expectation
Fiji                  22            26.5
RO (remineralized)    31            26.5
(Random expectation is that of 53 random outcomes: 26.5 per water.)
Calculate the deviation (bias) from random. If it is improbably large, conclude that
students have a patterned taste preference; otherwise do not make that conclusion.
χ² = ∑ (obs − exp)² / exp
   = (22 − 26.5)²/26.5 + (31 − 26.5)²/26.5 = 1.528
The probability, P, of getting a metric of pattern this great due only to chance
is 0.216, which is not improbable. Generally, if P < 0.05 we consider the pattern
improbable due to chance. Thus we are safest concluding there is insufficient
evidence of pattern here; i.e., no taste preference noted.
FYI: Get P values in Excel® for χ² tests by entering, e.g., “=CHIDIST(1.528,1)” into a cell.
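The same calculation can be sketched in Python using only the standard library. The helper names are made up for illustration; the P value uses the identity that, for 1 degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(χ²/2)).

```python
from math import erfc, sqrt

def chi_square(observed, expected):
    """Sum of (obs - exp)^2 / exp over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))

observed = [22, 31]                  # Fiji, RO
expected = [sum(observed) / 2] * 2   # 53 choices split evenly: 26.5 each
chi2 = chi_square(observed, expected)
p = p_value_1df(chi2)
print(f"chi-square = {chi2:.3f}, P = {p:.3f}")  # chi-square = 1.528, P = 0.216
```

With more than two categories the degrees of freedom change, and `p_value_1df` no longer applies.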
2. t-test on our water preference data
Score 1   Score 2
64        85
50        80
40        80
95        85
70        65
55        50
100       75
50        80
70        60
50        50
80        90
71        42
0         100
75        80
100       5
50        70
85        70
90        95
75        70
75        50
70        85
80        70
0         56
85        60
80        60
100       75
85        65
60        90
67        35
70        50
80        20
10        10
60        40
21        63
75        70
65        90
10        60
50        60
75        85
100       100
80        73
88        67
60        80
100       100
50        50
90        80
90        78.5
Data above, one student per row (you can paste into Excel®).
Random expectation: an average difference of 0 between the
water taste scores given by students for Fiji and RO.
Recall that measures of pattern in statistics scale the
among-group deviations to the within-group deviations.
Here our measure of pattern is a t statistic: the average
difference between scores divided by the standard error of the
within-individual differences (their standard deviation over √n):
t = (avg1 − avg2) / (stdev(diffs) / sqrt(n))
  = (66.94 − 67.12) / (30.95 / 6.86) = −0.04
Not big. The P-value is 0.97. It would be common to
get a measure of pattern this large (or larger) by chance.
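A paired t-test on these scores can be sketched with the standard library alone. Two assumptions are baked in: consecutive values in the data column form one student's pair (this pairing reproduces the first average, 66.94), and the P value comes from a simple Simpson-rule integration of the t density rather than from a statistics package.

```python
from math import exp, lgamma, pi, sqrt
from statistics import mean, stdev

# (score 1, score 2) per student, transcribed from the data column.
pairs = [
    (64, 85), (50, 80), (40, 80), (95, 85), (70, 65), (55, 50), (100, 75),
    (50, 80), (70, 60), (50, 50), (80, 90), (71, 42), (0, 100), (75, 80),
    (100, 5), (50, 70), (85, 70), (90, 95), (75, 70), (75, 50), (70, 85),
    (80, 70), (0, 56), (85, 60), (80, 60), (100, 75), (85, 65), (60, 90),
    (67, 35), (70, 50), (80, 20), (10, 10), (60, 40), (21, 63), (75, 70),
    (65, 90), (10, 60), (50, 60), (75, 85), (100, 100), (80, 73), (88, 67),
    (60, 80), (100, 100), (50, 50), (90, 80), (90, 78.5),
]

diffs = [a - b for a, b in pairs]
n = len(diffs)
# t = mean difference divided by its standard error.
t = mean(diffs) / (stdev(diffs) / sqrt(n))

def t_density(x, df):
    """Probability density of Student's t distribution."""
    const = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return const * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided P value via Simpson's rule on the density from 0 to |t| (steps even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3
    return 1 - 2 * central_area

p = two_sided_p(t, df=n - 1)
print(f"n = {n}, t = {t:.2f}, P = {p:.2f}")
```

The sign of t only reflects which water is subtracted from which; the two-sided P value depends on its magnitude.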
So what are the cardinal points?
The field of statistics provides tools to measure pattern against
random (or a priori) expectations
Test statistics, like χ², t, F, Λ, are metrics of pattern
1. They generally measure among-group (or along-gradient) variation
relative to within-group (or off-gradient) variation
2. They can be compared to the greatest values of the test
statistic one might expect to arise by chance alone
3. A P value is the probability of a pattern equal to or greater
than that observed arising by chance alone
Independent replication is important in statistical analysis so that
pattern due to sloppy experimental design cannot intrude to
create either excess bias or noise.
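The idea of comparing a test statistic with what arises "by chance alone" can be made concrete by simulation. This sketch assumes 53 independent 50:50 choices, as in the water preference test; the function names, seed, and number of simulations are arbitrary illustrative choices.

```python
import random

def chi_square_two_groups(count_a, count_b):
    """Chi-square statistic against an even split between two categories."""
    expected = (count_a + count_b) / 2
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected

def simulated_p_value(observed_stat, n_choices, n_sims=20000, seed=0):
    """Share of purely random experiments whose statistic is >= the observed one."""
    rng = random.Random(seed)
    as_extreme = 0
    for _ in range(n_sims):
        # Each simulated student picks water A with probability 0.5.
        count_a = sum(rng.random() < 0.5 for _ in range(n_choices))
        if chi_square_two_groups(count_a, n_choices - count_a) >= observed_stat:
            as_extreme += 1
    return as_extreme / n_sims

observed_stat = chi_square_two_groups(22, 31)
p_sim = simulated_p_value(observed_stat, n_choices=53)
print(f"observed chi-square = {observed_stat:.3f}, simulated P = {p_sim:.3f}")
```

The simulated P tends to come out somewhat larger than the 0.216 from the χ² distribution, because counts are discrete while the χ² curve is a continuous approximation. The conclusion is the same: a split like 22 vs. 31 is common under chance.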