Box-Cox Procedure for Single-Factor ANOVA

When both the normality and homoscedasticity assumptions are violated, the Box-Cox procedure provides a general method for finding a suitable transformation of the response variable. Below, the Box-Cox procedure is applied to the Servo-Data example. The following SAS program must be run once for each candidate value of lambda within the range from -1 to 1. The best transformation is the one that yields the smallest value of SSE.

    data one;
      input trt y;
      label trt = "Location"
            y   = "Time Between Failures";
      ypr = y;      /* copy of the original response */
      obs = _n_;    /* observation number */
      dum = 1;      /* constant key used for merging */
      cards;
    1 4.41
    1 100.65
    1 14.45
    1 47.13
    1 85.21
    2 8.24
    2 81.16
    2 7.35
    2 12.29
    2 1.61
    3 106.19
    3 33.83
    3 78.88
    3 342.81
    3 44.33
    ;

    proc means;
      var y;
      output out=ydesc min=miny n=size;
    run;

    data two;
      set ydesc;
      dum = 1;
    run;

    data three;
      merge one two;
      by dum;
      /* Shift the response so that every value is positive. */
      if miny <= 0 then y = y - miny + 1;
      keep y trt size dum obs ypr;
    run;

    /* Accumulate the geometric mean of y in u. The shifted y is used rather */
    /* than ypr, so the result is defined even when the raw data contain     */
    /* non-positive values.                                                  */
    data four;
      set three;
      if obs = 1 then do;
        u = y**(1/size);
        v = u;
      end;
      if obs > 1 then do;
        v = u*(y**(1/size));
        u = v;
      end;
      retain u;
    run;

    proc print;
    run;

    /* Keep only the last observation, where u holds the full geometric mean. */
    /* obs and size are dropped so they do not overwrite the values in three  */
    /* during the merge below.                                                */
    data five;
      set four;
      if obs = size;
      dum = 1;
      keep u dum;
    run;

    proc print;
    run;

    data six;
      merge three five;
      by dum;
      /* The following statement needs to be changed for different values of lambda. */
      lambda = 0.1;
      if lambda = 0 then do;
        k = u;
        w = k*log(y);
      end;
      if lambda ne 0 then do;
        k = lambda*(u**(lambda-1));
        w = ((y**lambda) - 1)/k;
      end;
    run;

    proc print;
    run;

    proc plot;
      plot w*trt;
    run;

    proc glm;
      class trt;
      model w = trt;
      output out=resi r=resids;
      means trt / lsd duncan tukey scheffe;
      title "Analysis of Variance for Ch.18 p. 792";
      title2 "With Follow-Up Tests";
    run;

    data two;
      set resi;
      /* The following statement creates a dummy variable with value 1 for every     */
      /* observation in the data set. This variable will be used to merge the sample */
      /* statistics with every observation in the original data set.                 */
      dum = 1;
    run;

    proc sort;
      by resids;
    run;

    proc means noprint;
      var resids;
      output out=meanr mean=mu std=s n=size;
      title;
      title2;
    run;

    data three;
      set meanr;
      dum = 1;
    run;

    data three;
      merge two three;
      by dum;
      p = (_n_ - 0.5)/size;
      /* The following equation would need to be changed for q-q plots for other      */
      /* probability distributions. For example, for an exponential(mu) distribution, */
      /* the statement would be Q = -mu*log(1-p).                                     */
      Q = probit(p);
    run;

    proc plot;
      plot resids*trt;
      title 'Plot of Residuals vs. Factor Levels';
    run;

    proc reg noprint;
      model resids = Q;
      plot (resids predicted.)*Q / overlay;
      title 'Normal Probability Plot';
      title2 'For Residuals from ANOVA';
    run;
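
For reference, the transformation that the program computes in the variable w can be written out as follows. The symbols K_1 and K_2 and the double-subscript notation Y_{ij} (observation j at factor level i, n observations in total) do not appear in the program; they are assumed here, following common textbook notation. The block also assumes the amsmath package for the cases environment.

```latex
\[
W_{ij} =
\begin{cases}
K_1\left(Y_{ij}^{\lambda} - 1\right), & \lambda \neq 0,\\[4pt]
K_2 \ln Y_{ij}, & \lambda = 0,
\end{cases}
\qquad
K_2 = \Bigl(\prod_{i}\prod_{j} Y_{ij}\Bigr)^{1/n},
\qquad
K_1 = \frac{1}{\lambda K_2^{\lambda - 1}} .
\]

\[
SSE(\lambda) = \sum_{i}\sum_{j}\left(W_{ij} - \overline{W}_{i\cdot}\right)^{2}
\]
```

In the program, u plays the role of K_2 (the geometric mean of the responses), k equals 1/K_1 when lambda is nonzero, and w is W. Scaling by K_1 and K_2 is what puts the SSE values from different choices of lambda on a comparable scale, so the error sum of squares reported by proc glm can be compared directly across runs.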