Essentials of Biostatistics in Public Health

BS704 Class 8
Analysis of Variance
HW Set #7
Chapter 7
Problems 5, 14, 19 and 28
R Problem Set 7 (on Blackboard)
Due November 2
Please complete Quiz 9 Before Nov 2
An RCT to Assess the Efficacy of a New Drug for Asthma in Children
• Background characteristics
  - Age
  - Sex
  - Years since diagnosis of asthma
• Outcomes
  - Self-reported improvement in symptoms
  - FEV1
Did the randomization work?
| Characteristic | Placebo | New Drug | p |
|---|---|---|---|
| Age, years | 10 (2.4) | 9.9 (2.1) | .76 |
| % Male | 54% | 43% | .04 |
| Yrs since Dx | 3.4 (1.9) | 3.1 (2.1) | .34 |
1. Yes
2. No
What are the hypotheses to compare ages?
1. H0: μ1=μ2 vs H1: μ1≠μ2
2. H0: p1=p2 vs H1: p1≠p2
3. H0: μ=10 vs H1: μ≠10
4. H0: μd=0 vs H1: μd≠0
What test would be used to compare % improvement between groups?
1. Test for equality of means
2. Test for equality of proportions
3. Test for mean difference
4. No clue
What test would be used to compare FEV1 between groups?
1. Test for equality of means
2. Test for equality of proportions
3. Test for mean difference
4. No clue
Objectives
• Understand the procedure for testing the equality of k > 2 means
• Perform the test by hand and using R
• Appropriately interpret results
Hypothesis Testing Procedures
1. Set up null and research hypotheses, select α
2. Select test statistic
3. Set up decision rule
4. Compute test statistic
5. Draw conclusion & summarize significance (p-value)
Hypothesis Testing for More than 2 Means - Analysis of Variance
• Continuous outcome
• k Independent Samples, k > 2
H0: μ1=μ2=μ3 … =μk
H1: Means are not all equal
Test Statistic
$$F = \frac{\sum n_j (\bar{X}_j - \bar{X})^2 / (k-1)}{\sum\sum (X - \bar{X}_j)^2 / (N-k)}$$
(Find critical value in Table 4)
Test Statistic - F Statistic
• Comparison of two estimates of variability in data
• Between treatment variation is based on the assumption that H0 is true (i.e., population means are equal)
• Within treatment, Residual or Error variation is independent of H0 (i.e., we do not assume that the population means are equal and we treat each sample separately)
F Statistic
$$F = \frac{\sum n_j (\bar{X}_j - \bar{X})^2 / (k-1)}{\sum\sum (X - \bar{X}_j)^2 / (N-k)}$$
Numerator: difference BETWEEN each group mean and overall mean
Denominator: difference between each observation and its group mean (WITHIN group variation - ERROR)
F Statistic
F = MSB/MSE
MS = Mean Square
What values of F indicate that H0 is likely true?
Decision Rule
Reject H0 if F > Critical Value of F with
df1=k-1 and df2=N-k
from Table 4
k= # comparison groups
N=Total sample size
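ANOVA Table
| Source of Variation | Sums of Squares | df | Mean Squares | F |
|---|---|---|---|---|
| Between Treatments | $SSB = \sum n_j (\bar{X}_j - \bar{X})^2$ | k-1 | MSB = SSB/(k-1) | F = MSB/MSE |
| Error | $SSE = \sum\sum (X - \bar{X}_j)^2$ | N-k | MSE = SSE/(N-k) | |
| Total | $SST = \sum\sum (X - \bar{X})^2$ | N-1 | | |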
Example
Is there a significant difference in mean weight loss among 4 different diet programs?
(Data are pounds lost over 8 weeks)
| Low-Cal | Low-Fat | Low-Carb | Control |
|---|---|---|---|
| 8 | 2 | 3 | 2 |
| 9 | 4 | 5 | 2 |
| 6 | 3 | 4 | -1 |
| 7 | 5 | 2 | 0 |
| 3 | 1 | 3 | 3 |
Example
Summary Statistics on Weight Loss by Treatment
| | Low-Cal | Low-Fat | Low-Carb | Control |
|---|---|---|---|---|
| n | 5 | 5 | 5 | 5 |
| Mean | 6.6 | 3.0 | 3.4 | 1.2 |
Overall Mean = 3.6
Is there a statistically significant
difference in weight loss programs?
1. Yes
2. No
3. ??
Example
1. H0: μ1=μ2=μ3=μ4
H1: Means are not all equal
α=0.05
2. Test statistic
$$F = \frac{\sum n_j (\bar{X}_j - \bar{X})^2 / (k-1)}{\sum\sum (X - \bar{X}_j)^2 / (N-k)}$$
Example
3. Decision rule
df1=k-1=4-1=3
df2=N-k=20-4=16
Reject H0 if F > 3.24
Example
$$SSB = \sum n_j (\bar{X}_j - \bar{X})^2$$
$$= 5(6.6-3.6)^2 + 5(3.0-3.6)^2 + 5(3.4-3.6)^2 + 5(1.2-3.6)^2 = 75.8$$
Example
$$SSE = \sum\sum (X - \bar{X}_j)^2$$
| Low-Cal | (X-6.6) | $(X-6.6)^2$ |
|---|---|---|
| 8 | 1.4 | 2.0 |
| 9 | 2.4 | 5.8 |
| 6 | -0.6 | 0.4 |
| 7 | 0.4 | 0.2 |
| 3 | -3.6 | 13.0 |
| Total | 0 | 21.4 |
Example
$$SSE = \sum\sum (X - \bar{X}_j)^2$$
| Low-Fat | (X-3.0) | $(X-3.0)^2$ |
|---|---|---|
| 2 | -1.0 | 1.0 |
| 4 | 1.0 | 1.0 |
| 3 | 0 | 0 |
| 5 | 2.0 | 4.0 |
| 1 | -2.0 | 4.0 |
| Total | 0 | 10.0 |
Example
$$SSE = \sum\sum (X - \bar{X}_j)^2$$
| Low-Carb | (X-3.4) | $(X-3.4)^2$ |
|---|---|---|
| 3 | -0.4 | 0.2 |
| 5 | 1.6 | 2.6 |
| 4 | 0.6 | 0.4 |
| 2 | -1.4 | 2.0 |
| 3 | -0.4 | 0.2 |
| Total | 0 | 5.4 |
Example
$$SSE = \sum\sum (X - \bar{X}_j)^2$$
| Control | (X-1.2) | $(X-1.2)^2$ |
|---|---|---|
| 2 | 0.8 | 0.6 |
| 2 | 0.8 | 0.6 |
| -1 | -2.2 | 4.8 |
| 0 | -1.2 | 1.4 |
| 3 | 1.8 | 3.2 |
| Total | 0 | 10.6 |
Example
$$SSE = \sum\sum (X - \bar{X}_j)^2 = 21.4 + 10.0 + 5.4 + 10.6 = 47.4$$
Example
| Source of Variation | Sums of Squares | df | Mean Squares | F |
|---|---|---|---|---|
| Between Treatments | 75.8 | 3 | 25.3 | 8.43 |
| Error | 47.4 | 16 | 3.0 | |
| Total | 123.2 | 19 | | |
Example
4. Compute test statistic
F=8.43
5. Conclusion. Reject H0 because 8.43 > 3.24. We have statistically significant evidence at α=0.05 to show that there is a difference in mean weight loss among 4 different diet programs.
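The hand calculation for the diet example can be reproduced with a short script. The following is a minimal Python sketch (Python is used here only for illustration; the function and variable names are hypothetical, and the critical value 3.24 is simply the tabled value quoted in the decision rule). Because the script does not round the overall mean or the squared deviations along the way, it gives SSB = 75.75, SSE = 47.2 and F of about 8.56 rather than the rounded hand values 75.8, 47.4 and 8.43. The conclusion is the same.

```python
# One-way ANOVA for the diet example, computed without intermediate rounding.
groups = {
    "Low-Cal": [8, 9, 6, 7, 3],
    "Low-Fat": [2, 4, 3, 5, 1],
    "Low-Carb": [3, 5, 4, 2, 3],
    "Control": [2, 2, -1, 0, 3],
}


def one_way_anova(samples):
    """Return sums of squares, degrees of freedom, mean squares and F."""
    k = len(samples)                                  # number of comparison groups
    n_total = sum(len(s) for s in samples)            # total sample size N
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]
    # Between-treatment variation: group means around the overall mean.
    ssb = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Error variation: observations around their own group mean.
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df1, df2 = k - 1, n_total - k
    msb, mse = ssb / df1, sse / df2
    return {"SSB": ssb, "SSE": sse, "df1": df1, "df2": df2,
            "MSB": msb, "MSE": mse, "F": msb / mse}


result = one_way_anova(list(groups.values()))
F_CRITICAL = 3.24  # tabled value for df1=3, df2=16, alpha=0.05 (from the example)

print({name: round(value, 2) for name, value in result.items()})
print("Reject H0" if result["F"] > F_CRITICAL else "Do not reject H0")
```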
ANOVA Using R
.csv data file
Example
An investigator wishes to compare the
average time to relief of headache pain
under three distinct medications, A, B and
C.
Fifteen patients who suffer from chronic
headaches are randomly selected for the
investigation.
The outcome is time to pain relief, in
minutes.
One Way ANOVA
RCT to Compare 3 Medications for
Chronic Pain N=15
Randomize
A
B
C
Outcome: Time to Pain Relief, minutes
One Way ANOVA (cont’d)
Data
| | Drug A | Drug B | Drug C |
|---|---|---|---|
| | 30 | 25 | 15 |
| | 35 | 20 | 20 |
| | 40 | 30 | 25 |
| | 25 | 20 | 20 |
| | 35 | 30 | 20 |
| Mean | 33.0 | 25.0 | 20.0 |
One Way ANOVA (cont’d)
1. Hypotheses
H0: μ1 = μ2 = μ3
H1: means not all equal
α=0.05
2. Test Statistic F
One Way ANOVA (cont’d)
3. Decision Rule
df1=k-1=3-1=2, df2=N-k=15-3=12
Reject H0 if F > 3.89
4. Compute Sums of Squares
One Way ANOVA (cont’d)
$$\bar{X} = \frac{33.0 + 25.0 + 20.0}{3} = 26.0$$
$$SSB = \sum n_j (\bar{X}_j - \bar{X})^2 = 5\left((33-26.0)^2 + (25-26.0)^2 + (20-26.0)^2\right) = 430$$
$$SSE = \sum\sum (X - \bar{X}_j)^2$$
One Way ANOVA (cont’d)
Drug A
| X | (X-33) | $(X-33)^2$ |
|---|---|---|
| 30 | -3 | 9 |
| 35 | 2 | 4 |
| 40 | 7 | 49 |
| 25 | -8 | 64 |
| 35 | 2 | 4 |
| Total | 0 | 130 |
One Way ANOVA (cont’d)
Drug B
| X | (X-25) | $(X-25)^2$ |
|---|---|---|
| 25 | 0 | 0 |
| 20 | -5 | 25 |
| 30 | 5 | 25 |
| 20 | -5 | 25 |
| 30 | 5 | 25 |
| Total | 0 | 100 |
One Way ANOVA (cont’d)
Drug C
| X | (X-20) | $(X-20)^2$ |
|---|---|---|
| 15 | -5 | 25 |
| 20 | 0 | 0 |
| 25 | 5 | 25 |
| 20 | 0 | 0 |
| 20 | 0 | 0 |
| Total | 0 | 50 |
One Way ANOVA (cont’d)
$$SSE = \sum\sum (X - \bar{X}_j)^2 = 130 + 100 + 50 = 280$$
| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between | 430.0 | 2 | 215 | 9.21 |
| Error | 280.0 | 12 | 23.3 | |
| Total | 710.0 | 14 | | |
One Way ANOVA (cont’d)
Reject H0 since 9.21 > 3.89 –
Means are not all equal.
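One way to check an ANOVA table is to compute the total sum of squares directly and confirm that it equals SSB + SSE. The following minimal Python sketch does this for the pain relief data (variable names are hypothetical; Python is used only for illustration). For these data the pieces come out as whole numbers, so the script should agree with the hand calculation: 430 + 280 = 710.

```python
# Check that the sums of squares partition: SST = SSB + SSE (pain relief data).
drug_times = {
    "A": [30, 35, 40, 25, 35],
    "B": [25, 20, 30, 20, 30],
    "C": [15, 20, 25, 20, 20],
}

all_times = [x for times in drug_times.values() for x in times]
grand_mean = sum(all_times) / len(all_times)

# Total variation: every observation around the overall mean.
sst = sum((x - grand_mean) ** 2 for x in all_times)

# Between and error pieces, computed group by group.
ssb = 0.0
sse = 0.0
for times in drug_times.values():
    group_mean = sum(times) / len(times)
    ssb += len(times) * (group_mean - grand_mean) ** 2
    sse += sum((x - group_mean) ** 2 for x in times)

k = len(drug_times)
n_total = len(all_times)
f_stat = (ssb / (k - 1)) / (sse / (n_total - k))

print(f"SSB={ssb:.1f}  SSE={sse:.1f}  SST={sst:.1f}  F={f_stat:.2f}")
```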
Paper – Testosterone Replacement
• Study design?
  - RCT
• Number of comparison groups?
  - placebo, no exercise
  - testosterone, no exercise
  - placebo and exercise
  - testosterone and exercise
• Primary outcomes?
  - Change in muscle strength, body weight, muscle volume, lean body mass (continuous)
Paper – Testosterone Replacement
• Objective is to compare mean change in muscle strength, body weight, muscle volume, lean body mass (one at a time) across four treatment groups
• Figure 1 – generalizability?
Paper – Testosterone Replacement
• Table 1 – what tests were used?
• Table 2 – what tests were used?
Practice Problem – Complete the ANOVA Table
H0: μ1=μ2=μ3=μ4=μ5
H1: means not all equal
α=0.05
| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between | | | | |
| Within | | 50 | 2.5 | |
| Total | 225 | | | |
Practice Problem – Complete the ANOVA Table
H0: μ1=μ2=μ3=μ4=μ5
H1: means not all equal
α=0.05
| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between | 100 | 4 | 25 | 10 |
| Within | 125 | 50 | 2.5 | |
| Total | 225 | | | |
Reject H0 if F > F0.05(4,50)=2.56
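The missing entries follow from the relationships in the ANOVA table: SS = MS × df, SST = SSB + SSE, and df between = k - 1. A minimal Python sketch of that arithmetic, assuming the same givens as the practice problem (5 groups, total SS of 225, within df of 50, within MS of 2.5), is shown here; the function name and dictionary layout are hypothetical.

```python
# Fill in an ANOVA table from partial information (practice problem values).
def complete_anova_table(k, ss_total, df_within, ms_within):
    """Derive the remaining ANOVA table entries from the given ones."""
    ss_within = ms_within * df_within      # MS = SS / df, so SS = MS * df
    ss_between = ss_total - ss_within      # SST = SSB + SSE
    df_between = k - 1
    ms_between = ss_between / df_between
    return {
        "Between": {"SS": ss_between, "df": df_between, "MS": ms_between},
        "Within": {"SS": ss_within, "df": df_within, "MS": ms_within},
        "Total": {"SS": ss_total, "df": df_between + df_within},
        "F": ms_between / ms_within,
    }


table = complete_anova_table(k=5, ss_total=225, df_within=50, ms_within=2.5)
F_CRITICAL = 2.56  # F0.05(4,50), the tabled value quoted in the problem

for source in ("Between", "Within", "Total"):
    print(source, table[source])
print("F =", table["F"], "-> reject H0" if table["F"] > F_CRITICAL else "-> do not reject H0")
```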
ANOVA
• When the sample sizes are equal, the design is said to be balanced
• Balanced designs give greatest power and are more robust to violations of the normality assumption
Extensions
• Multiple Comparison Procedures: used to test for specific differences in means after rejecting equality of all means
• Higher-Order ANOVA: tests for differences in means as a function of several factors
Extensions
• Repeated Measures ANOVA: tests for differences in means when there are multiple measurements in the same participants (e.g., measures taken serially in time)
Multiple Comparisons Procedures (MCPs)
If we reject H0 in an ANOVA, we conclude that the k means are not all equal. Which means are different?
Pairwise comparisons H0: μi=μj
General contrasts H0: (μi+μj)/2=μk
MCPs (continued)
• With k treatments there are k(k-1)/2 possible pairwise comparisons
• The overall Type I error rate can be as large as α{k(k-1)/2}
• There are a number of different MCPs; they differ in terms of treatment of Type I error rate
MCPs (continued)
Error rate per comparison (ER_PC) = P(Type I error) on any one test or comparison (usually ER_PC is 0.05).
Error rate per experiment (ER_PE) = the number of Type I errors we expect to make in any experiment under H0 (in 100 tests, we expect to make 5 Type I errors = # tests × α).
MCPs (continued)
Familywise error rate (FW_ER) = P(at least 1 Type I error) in experiment.
$$FW\_ER = 1 - (1 - \alpha_i)^c$$
where α_i is the ER_PC and c = # contrasts in experiment.
MCPs (continued)
Example. Suppose we test the equality of 5 treatment means using ANOVA and the null hypothesis is rejected at α = 0.05.
Suppose that it is of interest to perform all pairwise comparisons.
There are k(k-1)/2 = 5(5-1)/2 = 10 distinct pairwise comparisons.
MCPs (continued)
Suppose we wish to conduct each comparison at a 5% level of significance.
NOTE: Only tests that are of substantive interest should be run and not all possible tests.
ER_PC = 0.05.
ER_PE = 10(0.05) = 0.5.
FW_ER = 1 - (1 - 0.05)^10 = 0.401.
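The three error rates are easy to tabulate for other numbers of groups. The following minimal Python sketch uses hypothetical function names and applies the familywise formula as stated, which treats the c comparisons as independent. It reproduces the values for k = 5 and shows how quickly the familywise rate grows with k.

```python
# Error rates when all pairwise comparisons are run at level alpha each.
def pairwise_count(k):
    """Number of distinct pairwise comparisons among k treatments."""
    return k * (k - 1) // 2


def error_rates(k, alpha=0.05):
    """Return (c, ER_PE, FW_ER) for all pairwise comparisons of k means."""
    c = pairwise_count(k)
    er_pe = c * alpha                 # expected number of Type I errors
    fw_er = 1 - (1 - alpha) ** c      # P(at least one Type I error), independent tests assumed
    return c, er_pe, fw_er


c, er_pe, fw_er = error_rates(k=5, alpha=0.05)
print(f"k=5: c={c}, ER_PE={er_pe:.2f}, FW_ER={fw_er:.3f}")

for k in (3, 4, 5, 6):
    n_tests, expected_errors, familywise = error_rates(k)
    print(f"k={k}: {n_tests} comparisons, FW_ER={familywise:.3f}")
```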
Scheffe MCP
• Conservative procedure that controls familywise error rate regardless of the number of contrasts
• Handles both pairwise and general contrasts
• Other MCPs include the Tukey procedure, Duncan procedure (multiple range test), Fisher's Least Significant Difference, the Newman-Keuls test, and Dunnett's test (used to compare a control to several active treatments).
Scheffe MCP
For pairwise tests
H0: μi = μj
H1: μi ≠ μj
$$F = \frac{(\bar{X}_i - \bar{X}_j)^2}{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
Reject H0 if F > (k-1) Fα(k-1, N-k)
One Way ANOVA-Scheffe
In the example, we determined that the 3 drugs were significantly different with respect to mean time to pain relief (the means are not all equal).
| | Drug A | Drug B | Drug C |
|---|---|---|---|
| Mean | 33.0 | 25.0 | 20.0 |
Which drugs are different?
Scheffe Test – Drug A Vs. B
H0: μA = μB
H1: μA ≠ μB
$$F = \frac{(\bar{X}_A - \bar{X}_B)^2}{MSE\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}$$
Reject H0 if F > (k-1) Fα(k-1, N-k)
(k-1) F0.05(2,12) = 2(3.89) = 7.78
Scheffe Test – Drug A Vs. B
$$F = \frac{(\bar{X}_A - \bar{X}_B)^2}{MSE\left(\frac{1}{n_A} + \frac{1}{n_B}\right)} = \frac{(33.0 - 25.0)^2}{23.3\left(\frac{1}{5} + \frac{1}{5}\right)} = 6.87$$
Do not reject H0 since 6.87 < 7.78. No significant difference in mean times to pain relief for Drugs A and B.
Scheffe Test – Drug A Vs. C
H0: μA = μC
H1: μA ≠ μC
$$F = \frac{(\bar{X}_A - \bar{X}_C)^2}{MSE\left(\frac{1}{n_A} + \frac{1}{n_C}\right)}$$
Reject H0 if F > (k-1) Fα(k-1, N-k)
(k-1) F0.05(2,12) = 2(3.89) = 7.78
Scheffe Test – Drug A Vs. C
$$F = \frac{(\bar{X}_A - \bar{X}_C)^2}{MSE\left(\frac{1}{n_A} + \frac{1}{n_C}\right)} = \frac{(33.0 - 20.0)^2}{23.3\left(\frac{1}{5} + \frac{1}{5}\right)} = 18.13$$
Reject H0 since 18.13 > 7.78. Significant difference in mean times to pain relief for Drugs A and C.
Scheffe Test – Drug B Vs. C
H0: μB = μC
H1: μB ≠ μC
$$F = \frac{(\bar{X}_B - \bar{X}_C)^2}{MSE\left(\frac{1}{n_B} + \frac{1}{n_C}\right)}$$
Reject H0 if F > (k-1) Fα(k-1, N-k)
(k-1) F0.05(2,12) = 2(3.89) = 7.78
Scheffe Test – Drug B Vs. C
$$F = \frac{(\bar{X}_B - \bar{X}_C)^2}{MSE\left(\frac{1}{n_B} + \frac{1}{n_C}\right)} = \frac{(25.0 - 20.0)^2}{23.3\left(\frac{1}{5} + \frac{1}{5}\right)} = 2.68$$
Do not reject H0 since 2.68 < 7.78. No significant difference in mean times to pain relief for Drugs B and C.
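The three pairwise Scheffe tests can be run in one loop. The following minimal Python sketch uses hypothetical names, computes MSE from the raw pain relief data, and takes the tabled value F0.05(2,12) = 3.89 from the decision rule rather than computing it. Because MSE is not rounded to 23.3, the F statistics differ slightly in the second decimal from the hand values (about 6.86, 18.11 and 2.68), but the decisions match.

```python
from itertools import combinations

# Scheffe pairwise comparisons for the pain relief example.
drug_times = {
    "A": [30, 35, 40, 25, 35],
    "B": [25, 20, 30, 20, 30],
    "C": [15, 20, 25, 20, 20],
}

k = len(drug_times)
n_total = sum(len(times) for times in drug_times.values())
means = {drug: sum(times) / len(times) for drug, times in drug_times.items()}

# Error mean square from the one-way ANOVA: SSE / (N - k).
sse = sum((x - means[drug]) ** 2 for drug, times in drug_times.items() for x in times)
mse = sse / (n_total - k)

F_TABLE = 3.89                       # tabled F0.05(2, 12), taken from the slides
scheffe_critical = (k - 1) * F_TABLE

results = {}
for first, second in combinations(drug_times, 2):
    n_first, n_second = len(drug_times[first]), len(drug_times[second])
    f_stat = (means[first] - means[second]) ** 2 / (mse * (1 / n_first + 1 / n_second))
    results[(first, second)] = {"F": f_stat, "reject": f_stat > scheffe_critical}
    decision = "reject H0" if f_stat > scheffe_critical else "do not reject H0"
    print(f"{first} vs {second}: F={f_stat:.2f} vs {scheffe_critical:.2f} -> {decision}")
```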
Overall conclusion??
Tukey Test in R
ANOVA
Pairwise Tests using Tukey MCP
Only significant result is A vs C