Introduction to Oneway ANOVA Analysis of Variance

Introduction to
ANOVA
(ANalysis Of
VAriance)
Why ANOVA?
• Effects of CHO loading
• how much?
• 1 gm/kg? 2 gm/kg? 5 gm/kg?
• Effects of bracing on GRF
• which brace
• taping? Swed O? ActiveAnkle?
• Effeks uf alcohol on spelin
• what blood/alcohol level
• 0.04? 0.08? 0.10?
ANalysis Of VAriance
• 1-way ANOVA
• Grouping variable = factor = independent variable
• The variable will consist of a number of levels
• If 1-way ANOVA is being used, there will be >2 levels of
one IV.
• E.G. What type of program has the greatest
impact on aggression?
• Violent movies, soap operas, or “infomercials”?
• Type of program is the independent variable or factor
• Violent movies is one level of the factor
• soap operas is one level of the factor
• Infomercials is one level of the factor
• Aggression is the dependent variable
Example of Oneway
ANOVA (single factor)
• No reason to assume correlation
between the cases in the “k” groups
• (k = number of groups)
• Question: does CHO affect time to
fatigue??
• IV: diet (3 levels of IV or factor)
• DV: Endurance time on bike
How to compare more
than 2 means?
• α refers to risk of making a Type 1 error
• with each comparison, we have “α”
chances of making a Type 1 error
• α = 0.05
• 5 times in 100 we will reject a true null
hypothesis when running each comparison
Type 1 error rate is
exponentially cumulative
Family Wise error rate
$\alpha_{FW} = 1 - (1 - \alpha)^{c}$
where c is the number of
comparisons to be made
ie if α = 0.05 and three means
Type 1 error rate is
exponentially cumulative
Family Wise error rate with three
means to compare
$\alpha_{FW} = 1 - (1 - 0.05)^{3} = 0.143$
Type 1 error rate is
exponentially cumulative
Estimating Family Wise error rate
$\alpha_{FW} \approx c\,\alpha$
where c is the number of comparisons to
be made
Note: always overestimates the error rate
ie if α = 0.05: k = 3; k = 4?????
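The k = 3 and k = 4 cases can be checked with a short Python sketch. The function names are illustrative, and the exact formula assumes the comparisons are independent; comparing every pair among k means gives c = k(k - 1)/2 comparisons.

```python
from math import comb

def familywise_error(alpha: float, c: int) -> float:
    """Exact familywise Type 1 error rate, assuming c independent comparisons."""
    return 1 - (1 - alpha) ** c

def familywise_error_estimate(alpha: float, c: int) -> float:
    """Quick estimate c * alpha; it is never smaller than the exact rate."""
    return c * alpha

def pairwise_comparisons(k: int) -> int:
    """Number of pairwise comparisons among k group means."""
    return comb(k, 2)

for k in (3, 4):
    c = pairwise_comparisons(k)
    exact = familywise_error(0.05, c)
    estimate = familywise_error_estimate(0.05, c)
    print(f"k={k}: c={c}, exact={exact:.3f}, estimate={estimate:.2f}")
```

With three means the exact rate is already close to three times the nominal α, and it keeps growing as more groups are added.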
ANOVA is an
attempt to
maintain the
FW error rate
at a known
(acceptable)
level
Example of ANOVA:
Return to our original question
Question: does amount of CHO ingested
affect time to fatigue??
IV: diet (3 levels of IV)
DV: Endurance time on bike
1-way ANOVA (one IV)
• IV = Grouping variable = factor
• The IV consists of a number of levels
Steps to Oneway
ANOVA
• set α (0.05)
• set sample size
• Thirty randomly selected subjects
• Three randomly assigned groups
• n = 10 in each group
• Grp 1: Regular Diet
• Grp 2: CHO supp diet (0.5 g/kg)
• Grp 3: CHO supp diet (1.0 g/kg)
• set HO:
Set statistical
hypotheses: I
HO
• Null hypothesis
• Any observed
difference between
the 3 groups will be
attributable to
random sampling
errors
H1 (HA)
• Alternative hypothesis
• If HO is rejected, the
difference is not
attributable to random
sampling errors
(perhaps diet)?
Set statistical
hypotheses: II
• HO
• Null hypothesis
• The population
means of the 3
groups are equal
• H1
• Alternative hypothesis
• The population means
of the three groups
differ in some way
Note: no directional
hypothesis; Null may be
false in many different
ways
Steps
• Set α (0.05)
• set sample size (n = 10/grp)
• set Ho
• test all subjects with a standardized
protocol (bike)
Subject Data: file ANOVA1.sav
Steps
• Set α (0.05)
• set sample size (n = 10/grp)
• set Ho
• test all subjects with a standardized protocol
(bike)
• get descriptive statistics of each group
• histograms
• mean, SD, n
• compare the group means
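The descriptive step (mean, SD, n per group) might look like this in Python. The scores are hypothetical stand-ins, not the contents of ANOVA1.sav, and `describe` is an illustrative helper name.

```python
from statistics import mean, stdev

# Hypothetical endurance times in minutes; NOT the values stored in ANOVA1.sav.
groups = {
    "Regular diet": [36.0, 40.0, 38.0, 42.0, 39.0],
    "CHO 0.5 g/kg": [43.0, 45.0, 41.0, 47.0, 44.0],
    "CHO 1.0 g/kg": [44.0, 46.0, 48.0, 42.0, 45.0],
}

def describe(scores):
    """Return n, mean and sample SD (n - 1 denominator) for one group."""
    return {"n": len(scores), "mean": mean(scores), "sd": stdev(scores)}

summary = {name: describe(scores) for name, scores in groups.items()}
for name, stats in summary.items():
    print(f"{name}: n={stats['n']}, mean={stats['mean']:.1f}, SD={stats['sd']:.2f}")
```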
How to compare the
groups?
• With k = 3, α = 0.05, $\alpha_{FW}$ = ???
Concept of ANOVA
• Evaluate the effect of treatment
(the IV) by analyzing the amount of
variation among the subgroup
sample means (DV)
But how much variation is expected
if the subgroup population means are
equal?
Some
Nomenclature
• Grand Mean ($\bar{\bar{X}}$): mean of ALL scores,
regardless of group
• ie all 30 scores
• Group Mean ($\bar{X}$): mean of all scores from
subjects treated the same
• groups of 10
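3 Sources of
Variability
(Deviation Scores!!!!)
• $X - \bar{\bar{X}}$: Total Variability (individual
scores around Grand Mean)
• $X - \bar{X}$: Within Group Variability
• $\bar{X} - \bar{\bar{X}}$: Between Group Variability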
3 Sources of
Variability
$X - \bar{\bar{X}}$ (each score minus the Grand Mean): will sum to 0, so square it
for each subject, then sum.
Gives us The Total Sum of
Squares
3 Sources of
Variability
$X - \bar{X}$: Within Group Variability
(individual scores around
Group Mean)
3 Sources of
Variability
$X - \bar{X}$: Within Group Variability
(scores around Group Mean)
Reflects INHERENT variability
(all treated the same)
Within-group
• Variation between people that is not due to
the grouping factor
• Example:
• You might assign people to three different tanning beds to
see which has the greatest tanning effect
• But folk within each type of bed would still vary greatly in
the degree of tanning they achieved
• Within group variance is the pooled variance from
all levels of the grouping factor (similar to pooled
SD in t-test)
3 Sources of
Variability
$X - \bar{X}$ (each score minus its Group Mean): will sum to 0, so square it
for each subject, then sum.
Gives us Within Group
Sum of Squares
3 Sources of
Variability
$\bar{X} - \bar{\bar{X}}$: Between Group Variability
(Groups around Grand Mean)
Between-group variation
• Is the variation normally expected
between people (within-group variation),
plus variation due to the grouping factor
3 Sources of
Variability
$\bar{X} - \bar{\bar{X}}$: Between Group Variability
(Groups around Grand Mean)
Reflects inherent and
TREATMENT EFFECT
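3 Sources of
Variability
$\bar{X} - \bar{\bar{X}}$ (each Group Mean minus the Grand Mean): will sum to 0, so square it
for each group, weight it by the number of subjects in that group, then sum.
Gives us Between Group
Sum of Squares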
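The three sums of squares can be sketched in Python as follows. The scores are hypothetical (not the diet data) and `sums_of_squares` is an illustrative name. The point of the sketch is that the total sum of squares splits exactly into the within and between parts.

```python
from statistics import mean

# Hypothetical scores for three groups; NOT the diet study data.
groups = [
    [36.0, 40.0, 38.0, 42.0],
    [43.0, 45.0, 41.0, 47.0],
    [44.0, 46.0, 48.0, 42.0],
]

def sums_of_squares(groups):
    """Return (SS total, SS within, SS between) for a list of groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    group_means = [mean(g) for g in groups]
    # Individual scores around the grand mean
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Individual scores around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    return ss_total, ss_within, ss_between

ss_total, ss_within, ss_between = sums_of_squares(groups)
print(f"total={ss_total:.2f}, within={ss_within:.2f}, between={ss_between:.2f}")
```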
Recall
• Size of the Sum of Squares is affected
by
• size of each deviation score
• number of cases that are summed
Calculate the MEAN SQUARE of a sum
of squares by dividing through by the
degrees of freedom contributing to the sum.
3 Sources of
Variability
$X - \bar{X}$ (Within Group):
df for EACH group = n-1
Statistics
Humour
Two unbiased estimators were
sitting in a bar. The first says
“So how do you like married life?“
The other replies, "It's pretty good
if you don't mind giving up that
one degree of freedom!"
3 Sources of
Variability
$X - \bar{X}$ (Within Group):
df for EACH group = n-1
df for TOTAL groups = k (n-1)
For our Diet study:
df Within = 3 (10 - 1) = 27
3 Sources of
Variability
$\bar{X} - \bar{\bar{X}}$ (Between Group):
df = k -1
For our Diet study
df Between = 3 - 1 = 2
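The degrees-of-freedom bookkeeping for equal group sizes fits in a few lines of Python. `anova_df` is an illustrative name, and the total df (all N scores minus 1) is added here only as a check that the between and within parts add up.

```python
def anova_df(k: int, n: int) -> dict:
    """Degrees of freedom for a one-way ANOVA with k groups of n subjects each."""
    return {
        "between": k - 1,        # groups around the grand mean
        "within": k * (n - 1),   # n - 1 per group, summed over k groups
        "total": k * n - 1,      # all scores around the grand mean
    }

df = anova_df(k=3, n=10)
print(df)
```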
A new ratio between
variabilities for us to
consider
$\frac{\text{Inherent Variability} + \text{Treatment Effect}}{\text{Inherent Variability}}$
A new ratio between
variabilities for us to
consider
$\frac{\text{Between}}{\text{Within}} = \frac{\text{Inherent} + \text{Treatment}}{\text{Inherent}}$
Between: between group variability
Within: within group variability
A new ratio between
variabilities for us to
consider
$\frac{MS_{Between}}{MS_{Within}} = \frac{\text{Inherent} + \text{Treatment}}{\text{Inherent}}$
By using Mean Square, account for different
number of cases contributing to each estimate
of error (random SE).
A new ratio between
variabilities for us to
consider
$\frac{MS_{Between}}{MS_{Within}} = \frac{\text{Inherent} + \text{Treatment}}{\text{Inherent}}$
Note: if Treatment effect = 0 (ie no effect)
the ratio will be equal to ????
A new ratio between
variabilities for us to
consider
$F = \frac{MS_{Between}}{MS_{Within}}$
Note: if Treatment effect = 0 (ie no effect)
the ratio will be approximately 1.00
Evaluating Fobserved
with the F distribution
• A distribution of F ratios is not normally
distributed
• follows an F distribution
• positively skewed
• depends on the number of degrees of
freedom in the numerator (MS between) and
the denominator (MS within)
The F distribution
(hypothetical)
[Figure: hypothetical F distribution, F axis from 0 to 8]
Fcritical : the F value that
must be equaled or
exceeded to classify a
difference among group
means as statistically
significant (identify a
main effect)
Fcritical depends on df of MSbetween and MSwithin, and chosen α
The F distribution
(hypothetical)
[Figure: hypothetical F distribution, F axis from 0 to 8, with the region of rejection marked; F.05 = ???]
For our Diet study, with α = 0.05 and df = 2 and 27, Fcritical = ???
The F distribution
(hypothetical)
F distribution for df 2, 27
Concept of evaluating
Fobs against Fcrit
F distribution for df 2, 27
Area = 0.05 (5%)
Fcrit = 3.35
Concept of evaluating
Fobs against Fcrit
F distribution for df 2, 27
Area = 0.05 (5%)
Fcrit = 3.35
Fobs < Fcrit, Decision: ?????
Concept of evaluating
Fobs against Fcrit
F distribution for df 2, 27
Area = 0.05 (5%)
Fcrit = 3.35
Fobs ≥ Fcrit, Decision: ?????
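Fcrit is normally read from an F table or taken from software. For the special case of numerator df = 2 (three groups, as here) the upper-tail area of the F distribution has a simple closed form, so the table value can be checked with plain Python. The function names are illustrative, the shortcut does not apply to other numerator df, and Fobs = 11.13 is the value reported for the diet study.

```python
def f_upper_tail_df2(f: float, df_within: int) -> float:
    """P(F >= f) for an F distribution with numerator df = 2.

    Closed form valid ONLY when the numerator df is exactly 2.
    """
    return (1 + 2 * f / df_within) ** (-df_within / 2)

def f_critical_df2(alpha: float, df_within: int) -> float:
    """F value that cuts off an upper-tail area of alpha (numerator df = 2 only)."""
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)

def decide(f_obs: float, f_crit: float) -> str:
    """Reject Ho when the observed F equals or exceeds the critical F."""
    return "reject Ho" if f_obs >= f_crit else "fail to reject Ho"

f_crit = f_critical_df2(alpha=0.05, df_within=27)
print(f"Fcrit(2, 27) = {f_crit:.2f}")
print(decide(11.13, f_crit))
```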
Running Oneway ANOVA
(single factor ANOVA)
Using SPSS
Demonstrate with anova1.sav
1-way ANOVA in SPSS
Procedure: Choose the
appropriate procedure,
and…
1-way ANOVA in SPSS
Dialog box: slide the
variables…
…into the appropriate
places
ANOVA in SPSS
[SPSS output: oneway ANOVA table]
Decision
• Since Fobs = 11.13 ≥ Fcrit of 3.35,
our decision is to reject Ho (which states
that the difference among the
means is not more than would be
expected by chance) and accept HA
(which states that the means differ in
some way).
Omnibus F: identify a significant main effect
Decision
• Since Fobs = 11.13 ≥ Fcrit of 3.35,
our decision is to reject Ho (which states
that the difference among the
means is not more than would be
expected by chance) and accept HA
(which states that the means differ in
some way).
How to determine which means differ?
[Figure: Time to Fatigue (mins), axis from 35 to 50, by Diet Group (Normal, 0.5g CHO, 1.0g CHO)]
Is Normal different from 0.5 g/kg?
From 1.0 g/kg?
Is 0.5 g/kg different from 1.0 g/kg?
Significant result…now
what?
There are more than 2 means
Among all means, or just two?
Better do a follow-up, mate
There is a significant difference among the means
Don’t know
Rats
Ok then
Why not use 3 unpaired
t-tests?
• Normal vs 0.5 g/kg
• Normal vs 1.0 g/kg
• 0.5 g/kg vs 1.0 g/kg
Why not use 3 unpaired
t-tests?
• Normal vs 0.5 g/kg
• Normal vs 1.0 g/kg
• 0.5 g/kg vs 1.0 g/kg
Because we will be operating with
an inflated Family Wise α.
Post Hoc tests
• After the Fact comparisons of means
used to identify which specific pairs of
means are significantly different
• Designed to maintain a specified
Family Wise α level regardless of how
many pairs of means are compared
Post Hoc tests
• Follow-up tests
• ONLY compute after a significant ANOVA
• Like a collection of little t-tests
• But they control overall type 1 error comparatively
well
• They do not have as much power as the omnibus
test (the ANOVA) – so you might get a significant
ANOVA & no sig. Follow-up
• Purpose is to identify the locus of the effect (what
means are different, exactly?)
Significant result…now
what?
• Follow-up tests – most common…
• Tukey’s HSD (honestly sig. diff.)
• Formula:
$HSD = q\sqrt{\frac{MS_{within}}{n_{group}}}$
• But it’s easier to use SPSS…
Follow-ups to ANOVA in
SPSS
Choose “post-hoc”
test (meaning
‘after this’)
Follow-ups to ANOVA in
SPSS
Check the
appropriate
box for the
HSD (Tukey,
not Tukey’s b)
Run Tukey’s HSD test:
Oneway in SPSS
• Use our diet data (ANOVA1.sav)
[SPSS output: Tukey HSD results. Annotations: “Groups that do not differ” and “And one that does”]
Assumptions to test in
One-Way
1. Samples should be independent (as with
independent t-test – does not mean perfectly
uncorrelated)
2. Each of the k populations should be normal
(important only when samples are small…if there’s
a problem, can use Kruskal-Wallis test)
3. The k samples should have equal variances (this is
the homogeneity of variance assumption, and we’ll
look at it shortly…violations are important mostly
with small samples and unequal n’s)
Homogeneity of
variance - SPSS
1. Click on
the ‘options’
button
Homogeneity of
variance - SPSS
2. Choose
homogeneity of
variance (I’ve
also chosen
descriptives
here)
3. Click
continue
Homogeneity of
variance - SPSS
4. SPSS output
The test has to be significant for
there to be a violation
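The homogeneity option in SPSS reports a Levene statistic. A minimal Python sketch of the mean-centred version is given below, using hypothetical scores and an illustrative function name. It returns only the statistic W, which is then compared against an F distribution with k - 1 and N - k degrees of freedom; whether this matches every detail of the SPSS output is an assumption.

```python
from statistics import mean

def levene_statistic(groups):
    """Levene's W: a one-way ANOVA F computed on absolute deviations
    of each score from its own group mean (mean-centred version)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every score from its group mean
    abs_dev = []
    for g in groups:
        m = mean(g)
        abs_dev.append([abs(x - m) for x in g])
    dev_means = [mean(z) for z in abs_dev]
    grand_dev_mean = mean([z for zs in abs_dev for z in zs])
    between = sum(len(zs) * (m - grand_dev_mean) ** 2
                  for zs, m in zip(abs_dev, dev_means))
    within = sum((z - m) ** 2
                 for zs, m in zip(abs_dev, dev_means) for z in zs)
    return ((n_total - k) / (k - 1)) * between / within

# Hypothetical scores whose spread grows from group to group.
example = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
w = levene_statistic(example)
print(f"Levene W = {w:.3f} with df = {len(example) - 1}, "
      f"{sum(len(g) for g in example) - len(example)}")
```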
Reporting ANOVA
results
Table 1. Descriptive statistics of mean time to exhaustion
(minutes) by diet group (n = 10). A solid line joins pairs of
means that are not significantly different (Tukey’s HSD,
α = 0.05)
        Regular Diet    0.5g/kg CHO    1.0g/kg CHO
Mean    38.9            44.2           44.7
                        ___________________
SD      3.5             2.9            2.7
[Figure: Time to Fatigue (mins), axis from 0 to 50, by Diet Group (Normal, 0.5g CHO, 1.0g CHO), with an asterisk marker]
Figure 1. Descriptive statistics of time to exhaustion
with different diets. An asterisk indicates group means
that are not significantly different (α = 0.05)
Reporting ANOVA results
Table 2. ANOVA summary table for the
effects of diet on time to exhaustion.
Source    df    SS       MS       F       p
Diet      2     206.6    103.3    11.1    0.0003
Error     27    250.6    9.3
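The MS and F columns of Table 2 follow directly from the SS and df columns. A short Python check using the table's own values (the variable names are illustrative):

```python
# Values taken from Table 2 (diet study)
ss_between, df_between = 206.6, 2    # Source: Diet
ss_within, df_within = 250.6, 27     # Source: Error

# Mean square = sum of squares divided by its degrees of freedom
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F is the ratio of the two mean squares
f_obs = ms_between / ms_within

print(f"MS between = {ms_between:.1f}")
print(f"MS within  = {ms_within:.4f}")
print(f"F({df_between}, {df_within}) = {f_obs:.2f}")
```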
Optional: include in appendix if not in body of thesis
Reporting ANOVA results
Descriptive statistics for the mean time to exhaustion
for the three diet groups are presented in Table 1 and
graphically in Figure 1. A oneway ANOVA at α =
0.05 revealed a significant difference among the diet
groups for mean time to exhaustion ($F_{2,27}$ = 11.13,
p = 0.0003). Tukey’s HSD was used to identify the
source of the significant omnibus F, and indicated that
the mean time to exhaustion for the regular diet group
(38.9 ± 3.5 minutes) was significantly shorter than the
time for the groups receiving 0.5 grams CHO per kilogram
body weight (g/kg) or 1.0 g/kg. These two groups, with
means of 44.2 (± 2.9) and 44.7 (± 2.7) minutes
respectively, were not significantly different.
Reporting ANOVA results
These results suggest that CHO supplements of at least 0.5
g/kg of CHO will increase time to exhaustion on the bicycle
by about 5.5 minutes or 14%. The data also suggest a plateau
effect of CHO supplementation, with no additional increase
in time to exhaustion seen with 1.0 g/kg compared to 0.5 g/kg.
In discussion, address whether the observed
increase is physiologically meaningful, and
elaborate on the concept of a plateau effect
with CHO supplements.
Calculating
Tukey’s HSD test
$HSD = q\sqrt{\frac{MS_{within}}{n}}$
Honestly Significant Difference
the magnitude of mean difference that must
exist to claim levels are Significantly Different
Tukey’s HSD test
$HSD = q\sqrt{\frac{MS_{within}}{n}}$
q: The studentized range statistic (table E, p. 470)
depends on the number of levels to be
compared and df within and α
Tukey’s HSD test
$HSD = q\sqrt{\frac{MS_{within}}{n}}$
For our diet study: k = 3 (# of levels) and
df within = 27, α = 0.05
From Table F, q = ???
Tukey’s HSD test
$HSD = q\sqrt{\frac{MS_{within}}{n}}$
For our diet study: k = 3 (# of levels) and
df within = 27, α = 0.05
From Table 8, q = 3.51
Tukey’s HSD test
$HSD = q\sqrt{\frac{MS_{within}}{n}}$
MSwithin: Mean Square Within,
taken from ANOVA
Summary Table
For our diet study,
MSwithin = 9.2815
Tukey’s HSD test
$HSD = q\sqrt{\frac{MS_{within}}{n}}$
n: Number of Subjects in EACH group
For our diet study, n = 10
Tukey’s HSD test
$HSD = 3.51\sqrt{\frac{9.2815}{10}} = 3.382$
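The same arithmetic as a small Python function. `tukey_hsd` is an illustrative name, and q still has to be looked up in a studentized range table; it is passed in here rather than computed.

```python
from math import sqrt

def tukey_hsd(q: float, ms_within: float, n_per_group: int) -> float:
    """Smallest mean difference that counts as significant under Tukey's HSD.

    q is the studentized range statistic looked up for the chosen alpha,
    the number of levels and df within. Assumes equal n in every group.
    """
    return q * sqrt(ms_within / n_per_group)

# Diet study values: q = 3.51, MS within = 9.2815, n = 10 per group
hsd = tukey_hsd(q=3.51, ms_within=9.2815, n_per_group=10)
print(f"HSD = {hsd:.3f} minutes")
```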
Apply Tukey’s HSD test
value of 3.4 to the diet
data:
• Normal vs 0.5 g/kg
• 38.9 vs 44.2 minutes
• difference = -5.3 minutes *
• Normal vs 1.0 g/kg
• 38.9 vs 44.7 minutes
• difference = -5.8 minutes *
• 0.5 g/kg vs 1.0 g/kg
• 44.2 vs 44.7 minutes
• difference = -0.5 minutes
* difference is larger than the HSD value, so the two means are significantly different
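The three comparisons can also be run in a loop. This Python sketch uses the group means from the slides and the HSD value of 3.382; the dictionary keys and variable names are illustrative.

```python
from itertools import combinations

hsd = 3.382  # minutes, Tukey's HSD value for the diet study

# Group means (minutes) from the diet study
means = {"Normal": 38.9, "0.5 g/kg": 44.2, "1.0 g/kg": 44.7}

significant_pairs = []
for (name_a, mean_a), (name_b, mean_b) in combinations(means.items(), 2):
    diff = mean_a - mean_b
    # A pair differs significantly when the absolute difference reaches the HSD
    is_sig = abs(diff) >= hsd
    if is_sig:
        significant_pairs.append((name_a, name_b))
    flag = "*" if is_sig else ""
    print(f"{name_a} vs {name_b}: difference = {diff:.1f} minutes {flag}")
```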