Residual Analysis for ANOVA Models

KNNL – Chapter 18
Residuals
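Model Errors (unobserved): $\varepsilon_{ij} = Y_{ij} - \mu_i$
Residuals (observed): $e_{ij} = Y_{ij} - \hat{Y}_{ij} = Y_{ij} - \bar{Y}_{i\cdot}$
$$E\{e_{ij}\} = 0 \qquad \sigma^2\{e_{ij}\} = \sigma^2\left(\frac{n_i - 1}{n_i}\right)$$
$$s^2\{e_{ij}\} = MSE\left(\frac{n_i - 1}{n_i}\right) \qquad s\{e_{ij}\} = \sqrt{MSE\left(\frac{n_i - 1}{n_i}\right)}$$
Semi-Studentized Residual (Residual divided by estimate of $\sigma$, trivial to compute):
$$e_{ij}^* = \frac{e_{ij}}{\sqrt{MSE}}$$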
Studentized Residual (Residual divided by its standard error, messier to compute):
$$r_{ij} = \frac{e_{ij}}{\sqrt{MSE\left(\frac{n_i - 1}{n_i}\right)}}$$
Studentized Deleted Residual:
$$t_{ij} = e_{ij}\left[\frac{n_T - r - 1}{SSE\left(\frac{n_i - 1}{n_i}\right) - e_{ij}^2}\right]^{1/2}$$
Model Departures Detected With Residuals and Plots
• Errors have non-constant variance
• Errors are not independent
• Existence of Outlying Observations
• Omission of Important Predictors
• Non-normal Errors
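The residual types defined in the Residuals section are the raw material for spotting these departures (the studentized deleted residual in particular is used to flag outliers). The following is a minimal sketch of those formulas in plain Python. The function name `anova_residuals`, the dictionary keys, and the data are hypothetical, and the sketch assumes every treatment has at least two observations.

```python
import math

def anova_residuals(groups):
    """Compute e, e*, r and t for every observation in a one-way layout.

    groups: list of lists, one inner list of responses per treatment.
    Assumes each treatment has n_i >= 2 observations.
    """
    r = len(groups)
    n_T = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    mse = sse / (n_T - r)

    results = []
    for g, m in zip(groups, means):
        factor = (len(g) - 1) / len(g)  # (n_i - 1) / n_i
        rows = []
        for y in g:
            e = y - m
            rows.append({
                "e": e,                                   # residual
                "semi": e / math.sqrt(mse),               # semi-studentized
                "stud": e / math.sqrt(mse * factor),      # studentized
                "deleted": e * math.sqrt(                 # studentized deleted
                    (n_T - r - 1) / (sse * factor - e ** 2)),
            })
        results.append(rows)
    return results

# Hypothetical data: three treatments with unequal sample sizes.
groups = [[10.0, 12.0, 14.0], [20.0, 22.0, 21.0, 25.0], [15.0, 19.0]]
res = anova_residuals(groups)
for i, rows in enumerate(res, start=1):
    for j, row in enumerate(rows, start=1):
        print(i, j, {k: round(v, 3) for k, v in row.items()})
```

Large absolute values of the deleted residual are the ones to examine as possible outliers; the cutoff used to judge them is not part of this sketch.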
Common Plots
• Residuals versus Treatment
• Residuals versus Treatment Mean
• Aligned Dot Plot (aka Strip Chart)
• Residuals versus Time
• Residuals versus Omitted Variables
• Box Plots, Histograms, Normal Probability Plots
Tests for Constant Variance $H_0: \sigma_1^2 = \cdots = \sigma_r^2$
Hartley's Test: (Assumes normal data, equal sample sizes)
$$H^* = \frac{\max(s_i^2)}{\min(s_i^2)}$$
Reject $H_0$ if $H^* > H(1-\alpha;\, r,\, n-1)$, where $n_1 = \cdots = n_r = n$
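As a small illustration, the Hartley statistic is just the ratio of the largest to the smallest sample variance. The sketch below uses only the standard library; the function name `hartley_statistic` and the data are hypothetical. It stops at the statistic, because the critical value $H(1-\alpha; r, n-1)$ has to be looked up in a table of Hartley's distribution.

```python
import statistics

def hartley_statistic(groups):
    """Return (H*, r, n - 1) for equal-sized treatment groups."""
    sizes = {len(g) for g in groups}
    if len(sizes) != 1:
        raise ValueError("Hartley's test assumes equal sample sizes")
    n = sizes.pop()
    variances = [statistics.variance(g) for g in groups]  # s_i^2
    return max(variances) / min(variances), len(groups), n - 1

# Hypothetical data: r = 3 treatments, n = 3 observations each.
groups = [[10.0, 12.0, 14.0], [20.0, 22.0, 27.0], [15.0, 19.0, 17.0]]
h_star, r, df = hartley_statistic(groups)
print(h_star, r, df)
```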
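Brown-Forsythe Test: (Robust to non-normality, allows unequal sample sizes)
$$d_{ij} = \left|Y_{ij} - \tilde{Y}_i\right| \qquad i = 1,\dots,r \quad j = 1,\dots,n_i \qquad \tilde{Y}_i = \text{median}\left(Y_{i1},\dots,Y_{in_i}\right)$$
$$\bar{d}_{i\cdot} = \frac{\sum_{j=1}^{n_i} d_{ij}}{n_i} \qquad \bar{d}_{\cdot\cdot} = \frac{\sum_{i=1}^{r}\sum_{j=1}^{n_i} d_{ij}}{n_T}$$
$$MSTR_{BF} = \frac{\sum_{i=1}^{r} n_i\left(\bar{d}_{i\cdot} - \bar{d}_{\cdot\cdot}\right)^2}{r - 1} \qquad MSE_{BF} = \frac{\sum_{i=1}^{r}\sum_{j=1}^{n_i}\left(d_{ij} - \bar{d}_{i\cdot}\right)^2}{n_T - r}$$
$$F_{BF}^* = \frac{MSTR_{BF}}{MSE_{BF}}$$
Reject $H_0$ if $F_{BF}^* > F(1-\alpha;\, r-1,\, n_T - r)$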
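The Brown-Forsythe statistic is an ordinary one-way ANOVA F-ratio computed on the absolute deviations from the treatment medians. Here is a minimal standard-library sketch; the function name `brown_forsythe` and the data are hypothetical. If SciPy is available, `scipy.stats.levene(..., center='median')` should give the same statistic together with a p-value.

```python
import statistics

def brown_forsythe(groups):
    """Return (F*_BF, r - 1, n_T - r) from absolute deviations about medians."""
    r = len(groups)
    n_T = sum(len(g) for g in groups)

    # d_ij = |Y_ij - median_i|
    d = [[abs(y - statistics.median(g)) for y in g] for g in groups]
    d_bar_i = [sum(di) / len(di) for di in d]
    d_bar = sum(sum(di) for di in d) / n_T

    mstr = sum(len(di) * (m - d_bar) ** 2
               for di, m in zip(d, d_bar_i)) / (r - 1)
    mse = sum((x - m) ** 2
              for di, m in zip(d, d_bar_i) for x in di) / (n_T - r)
    return mstr / mse, r - 1, n_T - r

# Hypothetical data: three treatments.
groups = [[10.0, 12.0, 14.0], [20.0, 22.0, 27.0], [15.0, 19.0, 17.0]]
f_bf, df1, df2 = brown_forsythe(groups)
print(f_bf, df1, df2)
```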
Remedial Measures
• Normally distributed, Unequal variances – Use Weighted Least Squares with weights: $w_{ij} = 1/s_i^2$
$$F_w^* = \frac{\left[SSE_w(R) - SSE_w(F)\right]/(r - 1)}{SSE_w(F)/(n_T - r)}$$
Conclude means not all equal if $F_w^* > F(1-\alpha;\, r-1,\, n_T - r)$
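To make the weighted F-test concrete, here is a sketch under these assumptions: the full model fits a separate mean per treatment, the reduced model fits one common mean, and both are fit by weighted least squares with $w_{ij} = 1/s_i^2$. The function name `weighted_f_test` and the data are hypothetical.

```python
import statistics

def weighted_f_test(groups):
    """Return (F*_w, r - 1, n_T - r, SSE_w(R), SSE_w(F))."""
    r = len(groups)
    n_T = sum(len(g) for g in groups)
    means = [statistics.mean(g) for g in groups]
    weights = [1.0 / statistics.variance(g) for g in groups]  # w_ij = 1/s_i^2

    # Full model: a separate mean for each treatment.
    sse_full = sum(w * (y - m) ** 2
                   for g, m, w in zip(groups, means, weights) for y in g)

    # Reduced model: one common mean, estimated by weighted least squares.
    total_w = sum(w * len(g) for g, w in zip(groups, weights))
    mu_hat = sum(w * y for g, w in zip(groups, weights) for y in g) / total_w
    sse_reduced = sum(w * (y - mu_hat) ** 2
                      for g, w in zip(groups, weights) for y in g)

    f_w = ((sse_reduced - sse_full) / (r - 1)) / (sse_full / (n_T - r))
    return f_w, r - 1, n_T - r, sse_reduced, sse_full

# Hypothetical data: three treatments with visibly different spreads.
groups = [[10.0, 12.0, 14.0], [20.0, 22.0, 27.0], [15.0, 19.0, 17.0]]
f_w, df1, df2, sse_r, sse_f = weighted_f_test(groups)
print(f_w, df1, df2)
```

With these weights $SSE_w(F)$ always equals $n_T - r$, because each treatment contributes $(n_i - 1)s_i^2 / s_i^2$, so the denominator of $F_w^*$ is 1.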
• Non-normal data (with possibly unequal variances) – Variance Stabilizing Transformations and Box-Cox Transformation
  – Variance proportional to mean: $Y' = \sqrt{Y}$
  – Standard Deviation proportional to mean: $Y' = \log(Y)$
  – Standard Deviation proportional to mean²: $Y' = 1/Y$
  – Response is a (binomial) proportion: $Y' = 2\arcsin\left(\sqrt{Y}\right)$
• Non-parametric tests – F-test based on ranks and
Kruskal-Wallis Test
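One way to choose among the variance-stabilizing transformations listed above is to see which of $s_i^2/\bar{Y}_{i\cdot}$, $s_i/\bar{Y}_{i\cdot}$ or $s_i/\bar{Y}_{i\cdot}^2$ is roughly constant across treatments. The sketch below does that with the standard library; the names `TRANSFORMS` and `spread_ratios` and the data are hypothetical, and the data are constructed so that the standard deviation is proportional to the mean.

```python
import math
import statistics

# The four transformations from the list above.
TRANSFORMS = {
    "sqrt": math.sqrt,                                   # variance ~ mean
    "log": math.log,                                     # sd ~ mean
    "reciprocal": lambda y: 1.0 / y,                     # sd ~ mean^2
    "arcsine": lambda y: 2.0 * math.asin(math.sqrt(y)),  # proportions
}

def spread_ratios(groups):
    """Per-treatment ratios; the most stable column suggests a transformation."""
    rows = []
    for g in groups:
        m = statistics.mean(g)
        s = statistics.stdev(g)
        rows.append({"var/mean": s ** 2 / m,
                     "sd/mean": s / m,
                     "sd/mean^2": s / m ** 2})
    return rows

# Hypothetical data in which the sd doubles whenever the mean doubles.
groups = [[8.0, 10.0, 12.0], [16.0, 20.0, 24.0], [32.0, 40.0, 48.0]]
ratios = spread_ratios(groups)
for row in ratios:
    print({k: round(v, 4) for k, v in row.items()})

# sd/mean is constant here, so try the log transformation and re-check spread.
log_sds = [statistics.stdev([TRANSFORMS["log"](y) for y in g]) for g in groups]
print([round(s, 4) for s in log_sds])
```

After the log transformation the three standard deviations coincide for this constructed example; real data will only be approximately stabilized.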
Effects of Model Departures
• Non-normal Data – Generally not problematic for the F-test if the data are not too far from normal and the sample sizes are reasonably large
• Unequal Error Variances – As long as sample sizes are
approximately equal, generally not a problem in
terms of F-test.
• Non-independence of error terms – Can cause problems with the tests. Use Repeated Measures ANOVA if the same subject receives each treatment
Nonparametric Tests
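Rank all observations across treatments from 1 to $n_T$, assigning average ranks when ties occur
$$\bar{R}_{i\cdot} = \frac{\sum_{j=1}^{n_i} R_{ij}}{n_i} \qquad \bar{R}_{\cdot\cdot} = \frac{\sum_{i=1}^{r}\sum_{j=1}^{n_i} R_{ij}}{n_T} = \frac{1 + \cdots + n_T}{n_T} = \frac{n_T(n_T + 1)/2}{n_T} = \frac{n_T + 1}{2}$$
$$SSTO_R = \sum_{i=1}^{r}\sum_{j=1}^{n_i}\left(R_{ij} - \bar{R}_{\cdot\cdot}\right)^2 \qquad SSTR_R = \sum_{i=1}^{r} n_i\left(\bar{R}_{i\cdot} - \bar{R}_{\cdot\cdot}\right)^2 \qquad SSE_R = \sum_{i=1}^{r}\sum_{j=1}^{n_i}\left(R_{ij} - \bar{R}_{i\cdot}\right)^2$$
(Approximate) F-test:
$$F_R^* = \frac{SSTR_R/(r - 1)}{SSE_R/(n_T - r)} = \frac{MSTR_R}{MSE_R}$$
Conclude means not all equal if $F_R^* > F(1-\alpha;\, r-1,\, n_T - r)$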
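The rank-based F-test can be sketched as follows: pool the observations, assign average ranks to ties, then run the usual one-way ANOVA computations on the ranks. The helper names `midranks` and `rank_f_test` and the data are hypothetical.

```python
def midranks(values):
    """Ranks from 1 to len(values), with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in order[pos:end + 1]:
            ranks[k] = avg
        pos = end + 1
    return ranks

def rank_f_test(groups):
    """Return (F*_R, r - 1, n_T - r) computed on the pooled ranks."""
    flat = [y for g in groups for y in g]
    ranks = midranks(flat)
    ranked, start = [], 0
    for g in groups:  # split the pooled ranks back into treatments
        ranked.append(ranks[start:start + len(g)])
        start += len(g)

    r, n_T = len(groups), len(flat)
    grand = sum(ranks) / n_T
    means = [sum(rg) / len(rg) for rg in ranked]
    sstr = sum(len(rg) * (m - grand) ** 2 for rg, m in zip(ranked, means))
    sse = sum((x - m) ** 2 for rg, m in zip(ranked, means) for x in rg)
    return (sstr / (r - 1)) / (sse / (n_T - r)), r - 1, n_T - r

# Hypothetical data: three treatments, no ties.
groups = [[10.0, 12.0, 14.0], [20.0, 22.0, 27.0], [15.0, 19.0, 17.0]]
f_r, df1, df2 = rank_f_test(groups)
print(f_r, df1, df2)
```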


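Simultaneous CIs for Differences in Mean Ranks:
$$\bar{R}_{i\cdot} - \bar{R}_{i'\cdot} \pm z\left(1 - \frac{\alpha}{2g}\right)\sqrt{\frac{n_T(n_T + 1)}{12}\left(\frac{1}{n_i} + \frac{1}{n_{i'}}\right)}$$
where $g$ is the number of pairwise comparisons being made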
Kruskal-Wallis Test (Directly computed in most software packages):
$$X_{KW}^2 = \frac{SSTR_R}{SSTO_R/(n_T - 1)} = \left[\frac{12}{n_T(n_T + 1)}\sum_{i=1}^{r} n_i \bar{R}_{i\cdot}^2\right] - 3(n_T + 1)$$
Conclude means not all equal if $X_{KW}^2 > \chi^2(1-\alpha;\, r-1)$
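A sketch of the Kruskal-Wallis statistic, followed by the simultaneous intervals for differences in mean ranks, using only the standard library. The function names and the data are hypothetical, and the intervals assume all $g = r(r-1)/2$ pairs are compared. If SciPy is available, `scipy.stats.kruskal` reports the statistic with a p-value.

```python
import math
from statistics import NormalDist

def midranks(values):
    """Ranks from 1 to len(values), with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in order[pos:end + 1]:
            ranks[k] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks

def kruskal_wallis(groups):
    """Return (X2_KW, r - 1, mean ranks) using SSTR_R / (SSTO_R / (n_T - 1))."""
    flat = [y for g in groups for y in g]
    ranks = midranks(flat)
    n_T, r = len(flat), len(groups)
    grand = (n_T + 1) / 2
    means, start = [], 0
    for g in groups:
        means.append(sum(ranks[start:start + len(g)]) / len(g))
        start += len(g)
    sstr = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ssto = sum((x - grand) ** 2 for x in ranks)
    return sstr / (ssto / (n_T - 1)), r - 1, means

def mean_rank_intervals(means, sizes, alpha=0.05):
    """Simultaneous intervals for all pairwise mean-rank differences."""
    n_T = sum(sizes)
    pairs = [(i, k) for i in range(len(means)) for k in range(i + 1, len(means))]
    z = NormalDist().inv_cdf(1 - alpha / (2 * len(pairs)))  # z(1 - alpha/(2g))
    out = {}
    for i, k in pairs:
        half = z * math.sqrt(n_T * (n_T + 1) / 12 * (1 / sizes[i] + 1 / sizes[k]))
        diff = means[i] - means[k]
        out[(i, k)] = (diff - half, diff + half)
    return out

# Hypothetical data: three treatments, no ties.
groups = [[10.0, 12.0, 14.0], [20.0, 22.0, 27.0], [15.0, 19.0, 17.0]]
x2_kw, df, mean_ranks = kruskal_wallis(groups)
sizes = [len(g) for g in groups]
n_T = sum(sizes)
# Shortcut form of the statistic (agrees with the ratio form when there are no ties).
shortcut = (12 / (n_T * (n_T + 1))
            * sum(n * m ** 2 for n, m in zip(sizes, mean_ranks))
            - 3 * (n_T + 1))
intervals = mean_rank_intervals(mean_ranks, sizes)
print(x2_kw, shortcut, df)
print(intervals)
```

An interval that excludes zero points to a difference between that pair of treatments; the keys of `intervals` are zero-based treatment index pairs.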