732A35 1

Regression analysis: Observational study
Analysis of Variance: Experimental study
Analysis of Variance = ANOVA
We still have a normally distributed response variable, but the explanatory variables are qualitative variables called factors.
The values of a factor are called levels.
Y = number of TVs sold
Is there a difference in the number of TVs sold between a newspaper advertisement and a TV advertisement?
The type of advertisement is determined in advance, so it is an experimental study, not an observational study.
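If $Y_i \sim N(\mu_i, \sigma^2)$, $i = 1, 2$, and $\mu_1 = \mu_2$, then
$$t = \frac{\bar{Y}_1 - \bar{Y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t(n_1 + n_2 - 2)$$
where
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$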
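As a minimal sketch of the two-sample statistic above, the following Python snippet computes the pooled variance and the t statistic using only the standard library. The sales counts and the function name `pooled_t` are made up for illustration and do not come from the slides.

```python
import math
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Return the pooled two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq, s2_sq = variance(sample1), variance(sample2)  # sample variances
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    t = (mean(sample1) - mean(sample2)) / math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical numbers of TVs sold (illustration only).
newspaper = [12, 15, 11, 14]
tv = [18, 20, 17, 19, 21, 19]

t_stat, df = pooled_t(newspaper, tv)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The statistic is then compared with a percentile of the t-distribution with `df` degrees of freedom.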
Assume now that we want to compare three levels, that is
1) Newspaper advertisement
2) TV advertisement
3) No advertisement
We need a new model for this.
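$$Y_{ij} = \mu_i + \varepsilon_{ij}$$
$i = 1, 2, \ldots, r$, where $r$ is the number of levels
$j = 1, 2, \ldots, n_i$, where $n_i$ is the number of observations for level $i$
$\varepsilon_{ij} \sim N(0, \sigma^2)$ iid is the random component
$E(Y_{ij}) = \mu_i$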
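$$Y_{ij} = \mu_{\cdot} + \tau_i + \varepsilon_{ij}$$
$i = 1, 2, \ldots, r$, where $r$ is the number of levels
$j = 1, 2, \ldots, n_i$, where $n_i$ is the number of observations for level $i$
$\tau_i$ = factor effect for level $i$
$$\sum_{i=1}^{r} \tau_i = 0, \quad \text{that is,} \quad \tau_1 + \tau_2 + \cdots + \tau_{r-1} = -\tau_r$$
This holds if $n_i = n$ for all $i$. Otherwise:
$$\frac{\sum_i n_i \tau_i}{n_T} = 0$$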
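Example
Factor level   Observations                                        $n_i$       $\bar{Y}_{i\cdot}$
1              $y_{11}, y_{12}, y_{13}, y_{14}$                    $n_1 = 4$   $\bar{Y}_{1\cdot}$
2              $y_{21}, y_{22}, y_{23}, y_{24}, y_{25}, y_{26}$    $n_2 = 6$   $\bar{Y}_{2\cdot}$
3 = r          $y_{31}, y_{32}, y_{33}$                            $n_3 = 3$   $\bar{Y}_{3\cdot}$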
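$$\sum_{j=1}^{n_i} Y_{ij} = Y_{i\cdot}$$
$$\bar{Y}_{i\cdot} = \frac{Y_{i\cdot}}{n_i}$$
$$\sum_{i=1}^{r} n_i = n_T$$
$$\bar{Y}_{\cdot\cdot} = \frac{\sum_{i,j} Y_{ij}}{n_T} = \frac{Y_{\cdot\cdot}}{n_T}$$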
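Ordinary least squares estimates (OLS):
Minimize:
$$Q = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (Y_{ij} - \mu_i)^2$$
Result:
$$\hat{\mu}_i = \bar{Y}_{i\cdot}$$
Or if $\mu_i = \mu_{\cdot} + \tau_i$, then $\hat{\mu}_{\cdot} = \bar{Y}_{\cdot\cdot}$ and $\hat{\tau}_i = \bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}$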
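The least squares result can be checked numerically. The sketch below, in plain Python, computes the level means, the overall mean and the estimated factor effects, and confirms that moving away from the level means increases Q. The data and the helper name `q` are hypothetical and only mirror the 4/6/3 layout of the example.

```python
# Hypothetical numbers of TVs sold per advertisement type (illustration only).
samples = [
    [12, 15, 11, 14],          # level 1: newspaper
    [18, 20, 17, 19, 21, 19],  # level 2: TV
    [9, 10, 11],               # level 3: no advertisement
]

def q(mus):
    """Least squares criterion Q for candidate level means `mus`."""
    return sum((y - mu) ** 2 for ys, mu in zip(samples, mus) for y in ys)

n_total = sum(len(ys) for ys in samples)                # n_T
level_means = [sum(ys) / len(ys) for ys in samples]     # estimates of mu_i
grand_mean = sum(sum(ys) for ys in samples) / n_total   # estimate of mu_dot
effects = [m - grand_mean for m in level_means]         # estimates of tau_i

# With unequal n_i the estimated effects satisfy the weighted constraint.
weighted_sum = sum(len(ys) * t for ys, t in zip(samples, effects))

q_at_means = q(level_means)
q_shifted = q([m + 0.5 for m in level_means])  # any other choice gives a larger Q
print(level_means, round(grand_mean, 3), q_at_means, q_shifted)
```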
$$e_{ij} = Y_{ij} - \hat{Y}_{ij} = Y_{ij} - \bar{Y}_{i\cdot}$$
The residuals sum to zero and should have the same properties as in regression analysis.
That is:
- Normally distributed
- Constant variance
- Independent
Plot the residuals against the fitted values and, if needed, in observational order.
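A small sketch of the residual computation, assuming made-up sales data: the fitted value for every observation is its level mean, so the residuals sum to zero within each level. The resulting (fitted value, residual) pairs are what a residual plot would display; no plotting library is assumed here.

```python
# Hypothetical numbers of TVs sold per advertisement type (illustration only).
samples = {
    "newspaper": [12, 15, 11, 14],
    "tv": [18, 20, 17, 19, 21, 19],
    "none": [9, 10, 11],
}

# The fitted value for every observation in a level is the level mean.
fitted = {level: sum(ys) / len(ys) for level, ys in samples.items()}
residuals = {level: [y - fitted[level] for y in ys] for level, ys in samples.items()}

# Residuals sum to zero within each level, hence also overall.
residual_sums = {level: sum(es) for level, es in residuals.items()}

# (fitted value, residual) pairs, the input for a residuals-versus-fitted plot.
pairs = [(fitted[level], e) for level, es in residuals.items() for e in es]
print(residual_sums)
```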
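$$SSTO = SSTR + SSE$$
$$SSTO = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2$$
$$SSTR = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{r} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2$$
$$SSE = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2 = \sum_{i=1}^{r} \sum_{j=1}^{n_i} e_{ij}^2$$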
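The decomposition can be verified on a small data set. This is a sketch with hypothetical sales counts, computing the three sums of squares directly from their definitions.

```python
# Hypothetical numbers of TVs sold per advertisement type (illustration only).
samples = [
    [12, 15, 11, 14],          # newspaper
    [18, 20, 17, 19, 21, 19],  # TV
    [9, 10, 11],               # no advertisement
]

all_obs = [y for ys in samples for y in ys]
grand_mean = sum(all_obs) / len(all_obs)
level_means = [sum(ys) / len(ys) for ys in samples]

# Total, treatment and error sums of squares, straight from the definitions.
ssto = sum((y - grand_mean) ** 2 for y in all_obs)
sstr = sum(len(ys) * (m - grand_mean) ** 2 for ys, m in zip(samples, level_means))
sse = sum((y - m) ** 2 for ys, m in zip(samples, level_means) for y in ys)

print(f"SSTO = {ssto:.4f}, SSTR + SSE = {sstr + sse:.4f}")
```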
Source      df          SS      MS                             F
Treatment   $r - 1$     SSTR    $MSTR = \frac{SSTR}{r - 1}$    $\frac{MSTR}{MSE}$
Error       $n_T - r$   SSE     $MSE = \frac{SSE}{n_T - r}$
Total       $n_T - 1$   SSTO
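The ANOVA table can be assembled in a few lines of Python. This is only a sketch: the function name `one_way_anova` and the data are assumptions made for illustration.

```python
def one_way_anova(samples):
    """Return the one-way ANOVA table as {source: (df, SS, MS, F)}."""
    r = len(samples)
    n_total = sum(len(ys) for ys in samples)
    grand_mean = sum(sum(ys) for ys in samples) / n_total
    means = [sum(ys) / len(ys) for ys in samples]
    sstr = sum(len(ys) * (m - grand_mean) ** 2 for ys, m in zip(samples, means))
    sse = sum((y - m) ** 2 for ys, m in zip(samples, means) for y in ys)
    mstr = sstr / (r - 1)
    mse = sse / (n_total - r)
    return {
        "Treatment": (r - 1, sstr, mstr, mstr / mse),
        "Error": (n_total - r, sse, mse, None),
        "Total": (n_total - 1, sstr + sse, None, None),  # SSTO = SSTR + SSE
    }

# Hypothetical numbers of TVs sold per advertisement type (illustration only).
table = one_way_anova([[12, 15, 11, 14], [18, 20, 17, 19, 21, 19], [9, 10, 11]])

for source, (df, ss, ms, f) in table.items():
    ms_text = "" if ms is None else f"{ms:.3f}"
    f_text = "" if f is None else f"{f:.3f}"
    print(f"{source:<10}{df:>4}{ss:>10.3f}{ms_text:>10}{f_text:>10}")
```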
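$$\hat{\sigma}^2 = MSE$$
$$MSE = \frac{\sum_{i=1}^{r} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2}{n_T - r} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{n_i} e_{ij}^2}{n_T - r}$$
Easier to calculate:
$$MSE = \frac{\sum_i (n_i - 1) s_i^2}{n_T - r}, \qquad s_i^2 = \frac{1}{n_i - 1} \sum_j (Y_{ij} - \bar{Y}_{i\cdot})^2$$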
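A short check that the two ways of computing MSE agree, using the sample variance from Python's `statistics` module. The data are hypothetical.

```python
from statistics import variance

# Hypothetical numbers of TVs sold per advertisement type (illustration only).
samples = [[12, 15, 11, 14], [18, 20, 17, 19, 21, 19], [9, 10, 11]]

r = len(samples)
n_total = sum(len(ys) for ys in samples)

# Direct definition: squared residuals divided by n_T - r.
mse_direct = sum(
    (y - sum(ys) / len(ys)) ** 2 for ys in samples for y in ys
) / (n_total - r)

# Shortcut: pool the within-level sample variances s_i^2.
mse_pooled = sum((len(ys) - 1) * variance(ys) for ys in samples) / (n_total - r)

print(mse_direct, mse_pooled)
```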
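$H_0: \mu_1 = \mu_2 = \cdots = \mu_r$
$H_a$: not all means are equal
or, in terms of factor effects:
$H_0: \tau_1 = \tau_2 = \cdots = \tau_r = 0$
$H_a$: not all effects are zero
$$F^* = \frac{MSTR}{MSE} \sim F(r - 1, n_T - r) \quad \text{under } H_0$$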
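$$E(MSE) = \sigma^2$$
so MSE is an unbiased estimate of $\sigma^2$
$$E(MSTR) = \sigma^2 + \frac{\sum_i n_i (\mu_i - \mu_{\cdot})^2}{r - 1}, \quad \text{where } \mu_{\cdot} = \frac{\sum_i n_i \mu_i}{n_T}$$
$E(MSTR) = \sigma^2$ when the null hypothesis is true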
If we reject the null hypothesis, we would like to investigate where the differences are.
Calculate confidence intervals for
- $\mu_i$
- Difference: $D = \mu_i - \mu_{i'}$, $i \neq i'$
- Contrast: $L = \sum_{i=1}^{r} c_i \mu_i$, where $\sum_{i=1}^{r} c_i = 0$
Example of a contrast:
$$L = \frac{\mu_1 + \mu_2}{2} - \mu_3$$

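- $\hat{\mu}_i = \bar{Y}_{i\cdot}$, with $s(\bar{Y}_{i\cdot}) = \sqrt{\frac{MSE}{n_i}}$
- $\hat{D} = \bar{Y}_{i\cdot} - \bar{Y}_{i'\cdot}$, with $s(\hat{D}) = \sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_{i'}} \right)}$
- $\hat{L} = \sum_{i=1}^{r} c_i \bar{Y}_{i\cdot}$, with $s(\hat{L}) = \sqrt{MSE \sum_{i=1}^{r} \frac{c_i^2}{n_i}}$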
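The three standard errors can be sketched as small Python helpers. The MSE, sample sizes, level means and function names below are hypothetical values chosen for illustration; the contrast is the example $(\mu_1 + \mu_2)/2 - \mu_3$.

```python
import math

# Hypothetical summary statistics (illustration only).
mse = 2.2                    # mean square error, n_T - r = 10 degrees of freedom
n = [4, 6, 3]                # sample sizes n_i
means = [13.0, 19.0, 10.0]   # level means

def se_mean(i):
    """Standard error of the estimated mean of level i (0-based index)."""
    return math.sqrt(mse / n[i])

def se_difference(i, k):
    """Standard error of the estimated difference between levels i and k."""
    return math.sqrt(mse * (1 / n[i] + 1 / n[k]))

def contrast(c):
    """Estimate and standard error of the contrast with coefficients c."""
    if abs(sum(c)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    estimate = sum(ci * m for ci, m in zip(c, means))
    se = math.sqrt(mse * sum(ci ** 2 / ni for ci, ni in zip(c, n)))
    return estimate, se

l_hat, se_l = contrast([0.5, 0.5, -1.0])
print(se_mean(0), se_difference(0, 1), l_hat, se_l)
```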


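Denote one of the parameters $\mu_i$, $D$ or $L$ by $\theta$.
For a single confidence interval, use the t-distribution:
$$t = t(1 - \alpha/2;\, n_T - r)$$
- CI for $\theta$: $\hat{\theta} \pm t \cdot s(\hat{\theta})$
- Test statistic to test if $\theta = \theta_0$: $\frac{\hat{\theta} - \theta_0}{s(\hat{\theta})}$
- Compare with the $t(n_T - r)$ distribution above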
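A sketch of a single interval and test in Python, assuming hypothetical summary statistics. The critical value 2.228 is the tabulated t(0.975; 10); in practice it would be looked up for the actual $\alpha$ and $n_T - r$. The function names are illustrative only.

```python
import math

def confidence_interval(estimate, se, critical_value):
    """Return (lower, upper) for estimate +/- critical_value * se."""
    half_width = critical_value * se
    return estimate - half_width, estimate + half_width

def t_ratio(estimate, se, theta_0=0.0):
    """Test statistic for the hypothesis theta = theta_0."""
    return (estimate - theta_0) / se

# Hypothetical values: MSE = 2.2 on 10 df, TV level with n = 6 and mean 19.
mse, n_tv, mean_tv = 2.2, 6, 19.0
se = math.sqrt(mse / n_tv)
t_crit = 2.228  # t(0.975; 10), taken from a t table

lower, upper = confidence_interval(mean_tv, se, t_crit)
ratio = t_ratio(mean_tv, se, theta_0=18.0)
print(f"95% CI: ({lower:.3f}, {upper:.3f}); reject theta = 18: {abs(ratio) > t_crit}")
```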

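For $g$ confidence intervals, use the Bonferroni method
$$B = t\left(1 - \frac{\alpha}{2g};\, n_T - r\right)$$
Family confidence $1 - \alpha$
- CI for $\theta$: $\hat{\theta} \pm B \cdot s(\hat{\theta})$
- Test statistic to test if $\theta = \theta_0$: $\frac{\hat{\theta} - \theta_0}{s(\hat{\theta})}$
- Compare the absolute value of the test statistic with $B$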
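The only change from the single-interval case is the percentile that is looked up. A tiny sketch (the function name is hypothetical) shows how that percentile moves towards 1 as the number of intervals $g$ grows, which is why Bonferroni intervals get wider.

```python
def bonferroni_percentile(alpha, g):
    """Percentile of the t-distribution used for each of g joint intervals."""
    return 1 - alpha / (2 * g)

alpha = 0.05
percentiles = {g: bonferroni_percentile(alpha, g) for g in (1, 3, 10)}

for g, p in percentiles.items():
    # g = 1 reproduces the single-interval percentile 1 - alpha / 2.
    print(f"g = {g:>2}: look up t({p:.5f}; n_T - r)")
```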

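For all pairwise differences $D$, use Tukey's method
$$T = \frac{1}{\sqrt{2}}\, q(1 - \alpha;\, r, n_T - r)$$
$q$ is a percentile of the studentized range distribution
Family confidence $1 - \alpha$
- CI for $D$: $\hat{D} \pm T \cdot s(\hat{D})$
- Test statistic to test if $D = D_0$: $\frac{\hat{D} - D_0}{s(\hat{D})}$. Most often $D_0 = 0$
- Compare the absolute value of the test statistic with $T$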
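A sketch of Tukey intervals for all pairwise differences, assuming hypothetical summary statistics. The value 3.88 is the tabulated q(0.95; 3, 10) from a studentized range table; it is entered by hand here rather than computed.

```python
import math
from itertools import combinations

# Hypothetical summary statistics (illustration only).
mse = 2.2  # on n_T - r = 10 degrees of freedom
n = {"newspaper": 4, "tv": 6, "none": 3}
means = {"newspaper": 13.0, "tv": 19.0, "none": 10.0}

q_crit = 3.88               # q(0.95; 3, 10), taken from a studentized range table
T = q_crit / math.sqrt(2)   # Tukey multiplier

intervals = {}
for a, b in combinations(means, 2):
    d_hat = means[a] - means[b]
    se = math.sqrt(mse * (1 / n[a] + 1 / n[b]))
    intervals[(a, b)] = (d_hat - T * se, d_hat + T * se)

for pair, (lower, upper) in intervals.items():
    print(pair, f"({lower:.3f}, {upper:.3f})")
```

An interval that does not cover zero points to a difference between the two levels at family confidence $1 - \alpha$.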

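For contrasts $L$, use Scheffé's method
$$S^2 = (r - 1) F(1 - \alpha;\, r - 1, n_T - r)$$
Family confidence $1 - \alpha$
- CI for $L$: $\hat{L} \pm S \cdot s(\hat{L})$
- Test statistic to test if $L = 0$: $\frac{\hat{L}}{s(\hat{L})}$
- Compare the absolute value of the test statistic with $S$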
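A sketch of a Scheffé interval for the contrast $(\mu_1 + \mu_2)/2 - \mu_3$, assuming hypothetical summary statistics. The value 4.10 is the tabulated F(0.95; 2, 10), entered by hand.

```python
import math

# Hypothetical summary statistics (illustration only).
mse = 2.2  # on n_T - r = 10 degrees of freedom
n = [4, 6, 3]
means = [13.0, 19.0, 10.0]
c = [0.5, 0.5, -1.0]  # contrast coefficients, summing to zero

r = len(means)
f_crit = 4.10                      # F(0.95; 2, 10), taken from an F table
S = math.sqrt((r - 1) * f_crit)    # Scheffe multiplier

l_hat = sum(ci * m for ci, m in zip(c, means))
se_l = math.sqrt(mse * sum(ci ** 2 / ni for ci, ni in zip(c, n)))

lower, upper = l_hat - S * se_l, l_hat + S * se_l
reject = abs(l_hat / se_l) > S  # test of L = 0
print(f"L_hat = {l_hat:.2f}, CI = ({lower:.3f}, {upper:.3f}), reject L = 0: {reject}")
```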

Chapter 15, 16 and 17