Statistical Analysis
Professor Lynne Stokes
Department of Statistical Science

Lecture #19
Analysis of Designs with Random Factor Levels


Fermentation Process Experiment
MGH Ex 10.17

Response (%) by process and batch:

Process   Batch 1   Batch 2   Batch 3   Batch 4   Batch 5
F1           84        79        76        82        74
F2           83        72        82        97        76
F3           92        87        82        84        75
F4           89        74        80        79        83


Fermentation Process Experiment

    Proc GLM data=Ferment;
      class batch process;
      model response = batch process;
      random batch / test;
      lsmeans process / stderr pdiff;
    run;


Fermentation Process Experiment
Mason, Gunst, & Hess: Exercise 10.17

The GLM Procedure
Dependent Variable: Response

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              7      389.0000000    55.5714286      1.73   0.1934
Error             12      386.0000000    32.1666667
Corrected Total   19      775.0000000

R-Square   Coeff Var   Root MSE   Response Mean
0.501935    6.958977   5.671567        81.50000

Source    DF     Type I SS   Mean Square   F Value   Pr > F
Batch      4   324.0000000    81.0000000      2.52   0.0965
Process    3    65.0000000    21.6666667      0.67   0.5846

Source    DF   Type III SS   Mean Square   F Value   Pr > F
Batch      4   324.0000000    81.0000000      2.52   0.0965
Process    3    65.0000000    21.6666667      0.67   0.5846


Fermentation Process Experiment
The Random Statement Produces This Output

The GLM Procedure

Source    Type III Expected Mean Square
Batch     Var(Error) + 4 Var(Batch)
Process   Var(Error) + Q(Process)

Tests of Hypotheses for Mixed Model Analysis of Variance
Dependent Variable: Response

Source             DF   Type III SS   Mean Square   F Value   Pr > F
Batch               4    324.000000     81.000000      2.52   0.0965
Process             3     65.000000     21.666667      0.67   0.5846
Error: MS(Error)   12    386.000000     32.166667


Fermentation Process Experiment
Least Squares Means

Process   Response LSMEAN   Standard Error   Pr > |t|   LSMEAN Number
F1             79.0000000        2.5364017     <.0001               1
F2             82.0000000        2.5364017     <.0001               2
F3             84.0000000        2.5364017     <.0001               3
F4             81.0000000        2.5364017     <.0001               4

The LSMEANS standard errors use only the fixed-effects computing formulas. These are incorrect (see the Proc Mixed results).

Least Squares Means for effect Process
Pr > |t| for H0: LSMean(i) = LSMean(j)
Dependent Variable: Response

i/j        1        2        3        4
1                0.4193   0.1886   0.5874
2       0.4193            0.5874   0.7852
3       0.1886   0.5874            0.4193
4       0.5874   0.7852   0.4193


Estimation of Variance Components: Method of Moments

Equate mean squares to their expected mean squares and solve.

Random Main Effects Model

\[ MS_A = \hat\sigma_e^2 + r\,\hat\sigma_a^2, \qquad MS_E = \hat\sigma_e^2 \]

Method of moments:

\[ \hat\sigma_a^2 = \frac{MS_A - MS_E}{r} \]

F test: MS_A / MS_E


Fermentation Process Experiment
Method of Moments

Source    Type III Expected Mean Square
Batch     Var(Error) + 4 Var(Batch)
Process   Var(Error) + Q(Process)

\[ \hat\sigma_e = \sqrt{32.1667} = 5.67, \qquad \hat\sigma_a = \sqrt{\frac{81.0000 - 32.1667}{4}} = 3.49 \]


Estimation of Variance Components: Method of Moments
Three-Factor Random Effects Model

\[ E(MS_{ABC}) = \sigma_e^2 + r\,\sigma_{abc}^2, \qquad E(MS_E) = \sigma_e^2 \]

Method of moments:

\[ \hat\sigma_{abc}^2 = \frac{MS_{ABC} - MS_E}{r} \]

F test: MS_ABC / MS_E


Estimation of Variance Components
Three-Factor Random Effects Model

\[ MS_{AB} = \hat\sigma_e^2 + r\,\hat\sigma_{abc}^2 + rc\,\hat\sigma_{ab}^2 \]
\[ MS_{ABC} = \hat\sigma_e^2 + r\,\hat\sigma_{abc}^2 \]
\[ MS_E = \hat\sigma_e^2 \]
\[ \hat\sigma_{ab}^2 = \frac{MS_{AB} - MS_{ABC}}{rc} \]

F test: MS_AB / MS_ABC


Estimation of Variance Components
Three-Factor Random Effects Model

\[ E(MS_A) = \sigma_e^2 + r\,\sigma_{abc}^2 + rc\,\sigma_{ab}^2 + rb\,\sigma_{ac}^2 + rbc\,\sigma_a^2 \]
\[ E(MS_{AB}) = \sigma_e^2 + r\,\sigma_{abc}^2 + rc\,\sigma_{ab}^2 \]
\[ E(MS_{ABC}) = \sigma_e^2 + r\,\sigma_{abc}^2 \]
\[ E(MS_E) = \sigma_e^2 \]
\[ \hat\sigma_a^2 = \frac{MS_A - MS_{AB} - MS_{AC} + MS_{ABC}}{rbc} \]

F test: no exact test


Estimation of Variance Components
Confidence Intervals

\[ E(MS_{ABC}) = \sigma_e^2 + r\,\sigma_{abc}^2, \qquad E(MS_E) = \sigma_e^2 \]

\[ \frac{SS_E}{\sigma_e^2} = \frac{\nu_E\,MS_E}{\sigma_e^2} \sim \chi^2(\nu_E) \]

\[ \frac{\nu_E\,MS_E}{\chi^2_{\alpha/2}} < \sigma_e^2 < \frac{\nu_E\,MS_E}{\chi^2_{1-\alpha/2}} \]

Here \chi^2_{\alpha/2} is the upper 100(\alpha/2)% point of \chi^2(\nu_E), so the left endpoint is the smaller one.


Estimation of Variance Components
Confidence Intervals

\[ \frac{MS_{ABC} / (\sigma_e^2 + r\,\sigma_{abc}^2)}{MS_E / \sigma_e^2} \sim F(\nu_{ABC}, \nu_E) \]

\[ \frac{1}{r}\left(\frac{MS_{ABC}}{MS_E\,F_{\alpha/2}} - 1\right) < \frac{\sigma_{abc}^2}{\sigma_e^2} < \frac{1}{r}\left(\frac{MS_{ABC}}{MS_E\,F_{1-\alpha/2}} - 1\right) \]


Estimation of Variance Components
Confidence Intervals

\[ E(MS_{AB}) = \sigma_e^2 + r\,\sigma_{abc}^2 + rc\,\sigma_{ab}^2, \qquad E(MS_{ABC}) = \sigma_e^2 + r\,\sigma_{abc}^2, \qquad E(MS_E) = \sigma_e^2 \]

\[ \frac{MS_{AB} / (\sigma_e^2 + r\,\sigma_{abc}^2 + rc\,\sigma_{ab}^2)}{MS_{ABC} / (\sigma_e^2 + r\,\sigma_{abc}^2)} \sim F(\nu_{AB}, \nu_{ABC}) \]

\[ \frac{1}{rc}\left(\frac{MS_{AB}}{MS_{ABC}\,F_{\alpha/2}} - 1\right) < \frac{\sigma_{ab}^2}{\sigma_e^2 + r\,\sigma_{abc}^2} < \frac{1}{rc}\left(\frac{MS_{AB}}{MS_{ABC}\,F_{1-\alpha/2}} - 1\right) \]


Testing Variance Components
Three-Factor Random Effects Model

\[ E(MS_A) = \sigma_e^2 + r\,\sigma_{abc}^2 + rc\,\sigma_{ab}^2 + rb\,\sigma_{ac}^2 + rbc\,\sigma_a^2 \]
\[ E(MS_{AB}) = \sigma_e^2 + r\,\sigma_{abc}^2 + rc\,\sigma_{ab}^2 \]
\[ E(MS_{ABC}) = \sigma_e^2 + r\,\sigma_{abc}^2 \]
\[ E(MS_E) = \sigma_e^2 \]
\[ \hat\sigma_a^2 = \frac{MS_A - MS_{AB} - MS_{AC} + MS_{ABC}}{rbc} \]

F test: no exact test


Satterthwaite's Approximate F Statistic
Assumptions

MS_1, MS_2, ..., MS_k are pairwise independent ANOVA mean squares, with

\[ \frac{SS_j}{E(MS_j)} \sim \chi^2(\nu_j), \qquad \operatorname{var}\!\left(\frac{SS_j}{E(MS_j)}\right) = 2\nu_j, \qquad \operatorname{var}(MS_j) = \frac{2\,E(MS_j)^2}{\nu_j} \]


Satterthwaite's Approximate F Statistic
Approximation

\[ L = \sum_{j=1}^{k} c_j\,MS_j \]

Treat \nu_L L / E(L) as approximately \chi^2(\nu_L), so that

\[ E(L) = \sum_j c_j\,E(MS_j), \qquad \operatorname{var}(L) = \sum_j c_j^2\,\frac{2\,E(MS_j)^2}{\nu_j} = \frac{2\,E(L)^2}{\nu_L} \]


Satterthwaite's Approximate F Statistic
Solution

\[ \tfrac{1}{2}\operatorname{var}(L) = \sum_j c_j^2\,E(MS_j)^2 / \nu_j, \qquad E(L) = \sum_j c_j\,E(MS_j) \]

\[ \nu_L = \frac{2\,E(L)^2}{\operatorname{var}(L)} = \frac{\left\{\sum_j c_j\,E(MS_j)\right\}^2}{\sum_j c_j^2\,E(MS_j)^2 / \nu_j} \]


Satterthwaite's Approximate F Statistic
Application

\[ L = \sum_{j=1}^{k} c_j\,MS_j, \qquad M = \sum_{j=1}^{k} d_j\,MS_j \]

Under H0: E(L) = E(M).
Regardless of H0: \nu_M M / E(M) is approximately \chi^2(\nu_M).
Select M to be independent of L.

\[ F = \frac{L / E(L)}{M / E(M)} = \frac{L}{M} \sim F(\nu_L, \nu_M) \quad \text{(approximately, under } H_0\text{)} \]

\[ \hat\nu_L = \frac{\left\{\sum_j c_j\,MS_j\right\}^2}{\sum_j c_j^2\,MS_j^2 / \nu_j} \]


Satterthwaite's Approximate F Statistic
Testing \sigma_a^2 in the three-factor random effects model (expected mean squares as listed under Testing Variance Components)

Approximation #1

\[ L_1 = MS_A, \qquad M_1 = MS_{AB} + MS_{AC} - MS_{ABC}, \qquad F = \frac{L_1}{M_1}, \qquad E(L_1 \mid H_0) = E(M_1) \]

M_1 and F can be negative.

Approximation #2

\[ L_2 = MS_A + MS_{ABC}, \qquad M_2 = MS_{AB} + MS_{AC}, \qquad F = \frac{L_2}{M_2}, \qquad E(L_2 \mid H_0) = E(M_2) \]

M_2 and F are positive.


Random Effects Testing
Three-Factor Random Effects Model

Source   Mean Square   Expected Mean Square
A        MS_A          \sigma_e^2 + r\sigma_{abc}^2 + rc\sigma_{ab}^2 + rb\sigma_{ac}^2 + rbc\sigma_a^2
AB       MS_AB         \sigma_e^2 + r\sigma_{abc}^2 + rc\sigma_{ab}^2
ABC      MS_ABC        \sigma_e^2 + r\sigma_{abc}^2
Error    MS_E          \sigma_e^2

• Effects are not necessarily tested against error
• Test main effects even if interactions are significant
• There may not be an exact test (mixed effects models)

Proc GLM: Random ... / Test produces Satterthwaite approximate test statistics.
Fixed-effects standard errors may be incorrect.


Restricted Maximum Likelihood

\[ y = X\beta + Zu, \qquad Zu \sim N(0, V) \]
\[ Zu = \sum_{j=1}^{k} Z_j u_j, \qquad u_j \sim NID(0, \sigma_j^2 I_{n_j}), \qquad V = \sum_{j=1}^{k} Z_j Z_j' \sigma_j^2 \]
\[ w = My = MX\beta + v, \qquad MX = 0, \qquad v = MZu \sim N(0, MVM') \]

Estimate the \sigma_j^2 by maximizing L(\sigma_1^2, \sigma_2^2, ..., \sigma_k^2 \mid w). Then

\[ \tilde\beta = (X'\tilde V^{-1}X)^{-1} X'\tilde V^{-1} y \]


Proc Mixed

    Proc Mixed data=Ferment Cl;
      class batch process;
      model response = process;
      random batch;
      lsmeans process / adjust=tukey pdiff;
    run;


Fermentation Process Experiment

The Mixed Procedure
Covariance Parameter Estimates

Cov Parm   Estimate   Alpha     Lower     Upper
Batch       12.2083    0.05    2.8125   2023.05
Residual    32.1667    0.05   16.5405   87.6518

Balanced design: same as method of moments.


Fermentation Process Experiment

Type 3 Tests of Fixed Effects

Effect    Num DF   Den DF   F Value   Pr > F
Process        3       12      0.67   0.5846

Least Squares Means

Effect    Process   Estimate   Standard Error   DF   t Value   Pr > |t|
Process   F1         79.0000           2.9791   12     26.52     <.0001
Process   F2         82.0000           2.9791   12     27.53     <.0001
Process   F3         84.0000           2.9791   12     28.20     <.0001
Process   F4         81.0000           2.9791   12     27.19     <.0001

Correct standard errors.


Fermentation Process Experiment

Differences of Least Squares Means

Effect    Process   _Process   Estimate   Standard Error   DF   t Value   Pr > |t|
Process   F1        F2          -3.0000           3.5870   12     -0.84     0.4193
Process   F1        F3          -5.0000           3.5870   12     -1.39     0.1886

Other pairwise comparisons are on the next output page.

Effect    Process   _Process   Adjustment      Adj P
Process   F1        F2         Tukey-Kramer   0.8363
Process   F1        F3         Tukey-Kramer   0.5261

Note: the standard error of a difference is smaller than \sqrt{2}\,SE(\bar y_i).
Random effects cancel in \bar y_{i_1} - \bar y_{i_2} (pairwise balance needed).


Randomized Complete Block Designs

Factorial structure with a main effect for blocks.
Nothing new.


Latin Square Designs

Control two sources of variability.

Restrictions
• Factor of interest and two blocking factors
• Each at k levels
• No interactions among the experimental and blocking factors

Experiment size
• Latin square: n = k^2
• Complete factorial: n = k^3


Analysis of Latin Square Designs

\[ y_{ijk} = \mu + a_i + b_j + \gamma_k + e_{ijk} \]

a_i: ith row block effect
b_j: jth column block effect
\gamma_k: kth factor level effect
e_{ijk}: error, the variation from all sources except blocks and factor main effects

Main effects analysis of variance model.


Balanced Incomplete Block Designs

Used when blocks contain fewer experimental units than the number of unique factor-level combinations.

• b blocks
• f factor-level combinations
• k < f experimental units per block
• No interactions with the design factor(s)


Asphalt-Pavement Rating Study

Purpose: assess the deterioration of highway pavement
Response: rating, 0 = no pavement remaining, 100 = excellent condition
Design factor: 16 district engineers (random)
Blocking factor: 16 road segments (random)

b = 16 road segments (blocks)
f = 16 engineers (factor-level combinations)
k = 6 engineers per road segment


Asphalt-Pavement Rating Study
Design

[Table: ratings by road segment (rows, 1-16) and engineer (columns, 1-16); six engineers rate each road segment. Data: Mason, Gunst, & Hess, Table 10.4.]


Analysis of Variance with Unbalanced Data

Error sums of squares
• Models 1 and 2 are hierarchical: Model 2 has a subset of the Model 1 terms
• SSE_2 \ge SSE_1

Reduction in error sums of squares

\[ R(M_1 \mid M_2) = SSE_2 - SSE_1 \]

Testing effects

\[ F = \frac{R(M_1 \mid M_2) / (\nu_2 - \nu_1)}{MS_{E_1}}, \qquad df = \nu_2 - \nu_1 \]


Balanced Incomplete Block Design

Model 1: y_{ij} = \mu + \alpha_i + b_j + e_{ij}
Model 2: y_{ij} = \mu + \alpha_i + e_{ij}
Model 3: y_{ij} = \mu + b_j + e_{ij}

Block effect: R(M_1 | M_2) = SSE_2 - SSE_1
Factor effect: R(M_1 | M_3) = SSE_3 - SSE_1

SAS PROC GLM Type I sums of squares: two model fits.


Asphalt-Pavement Rating Study
Model fit 1: Road entered first

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model             30      13422.12500     447.40417      7.10   <.0001
Error             65       4098.83333      63.05897
Corrected Total   95      17520.95833

R-Square   Coeff Var   Root MSE   Rating Mean
0.766061    12.93405   7.940968      61.39583

Source     DF     Type I SS   Mean Square   F Value   Pr > F
Road       15   11786.95833     785.79722     12.46   <.0001
Engineer   15    1635.16667     109.01111      1.73   0.0668

Source     DF   Type III SS   Mean Square   F Value   Pr > F
Road       15   11005.16667     733.67778     11.63   <.0001
Engineer   15    1635.16667     109.01111      1.73   0.0668


Asphalt-Pavement Rating Study
Model fit 2: Engineer entered first (overall ANOVA table and fit statistics are the same as in fit 1)

Source     DF     Type I SS   Mean Square   F Value   Pr > F
Engineer   15    2416.95833     161.13056      2.56   0.0047
Road       15   11005.16667     733.67778     11.63   <.0001

Source     DF   Type III SS   Mean Square   F Value   Pr > F
Engineer   15    1635.16667     109.01111      1.73   0.0668
Road       15   11005.16667     733.67778     11.63   <.0001


Asphalt-Pavement Rating Study
Mason, Gunst, & Hess: Table 10.4

The GLM Procedure

Source     Type III Expected Mean Square
Engineer   Var(Error) + 5.3333 Var(Engineer)
Road       Var(Error) + 5.3333 Var(Road)


Asphalt-Pavement Rating Study

The Mixed Procedure
Convergence criteria met

Covariance Parameter Estimates

Cov Parm   Estimate   Alpha     Lower     Upper
Road         121.96    0.05   63.5064    323.11
Engineer     7.9899    0.05    2.2767    222.91
Residual    63.4641    0.05   46.1892   92.6823


Asphalt-Pavement Rating Study: Analysis of Variance
MGH Table 10.6

Source          df       SS    MS   F Value   p-Value
Road Segments   15   11,005   734     11.63     0.000
Engineers       15    1,635   109      1.73     0.067
Error           65    4,099    63
Total           95   17,521

Not additive: 11,005 + 1,635 + 4,099 = 16,739, which is less than the total of 17,521.


Balanced Incomplete Block Designs
Multiple Comparisons

Use adjusted factor-level averages (MGH Exhibit 10.5):

\[ \bar y_{i,\mathrm{adj}} = \bar y_{\cdot\cdot} + \frac{N - r}{N - b}\left(\bar y_{i\cdot} - \bar y_{i,\mathrm{blocks}}\right) \]

where \bar y_{i,blocks} is the average of the r block averages containing factor-level i.
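
The method-of-moments calculations for the fermentation experiment earlier in this lecture can be checked outside SAS. The following is a minimal Python sketch, not part of the original lecture: the names RESPONSE, moment_estimates, and satterthwaite_df are hypothetical, the data are the 20 responses from MGH Ex 10.17, and the design is assumed balanced with fixed processes and random batches. It computes the batch and error mean squares, solves the expected-mean-square equations for the variance components, and compares the fixed-effects standard error of a process mean with the one that includes the batch variance. The small helper at the end implements the estimated Satterthwaite degrees of freedom for a linear combination of mean squares.

```python
import math

# Fermentation data from the lecture: one list per process, ordered by batch 1-5.
RESPONSE = {
    "F1": [84, 79, 76, 82, 74],
    "F2": [83, 72, 82, 97, 76],
    "F3": [92, 87, 82, 84, 75],
    "F4": [89, 74, 80, 79, 83],
}


def moment_estimates(data):
    """Mean squares and method-of-moments variance components for a
    balanced design with fixed processes and random batches (assumed)."""
    processes = list(data)
    f = len(processes)             # number of process levels
    b = len(data[processes[0]])    # number of batches
    grand = sum(sum(row) for row in data.values()) / (f * b)
    process_means = [sum(data[p]) / b for p in processes]
    batch_means = [sum(data[p][j] for p in processes) / f for j in range(b)]

    ss_batch = f * sum((m - grand) ** 2 for m in batch_means)
    ss_process = b * sum((m - grand) ** 2 for m in process_means)
    ss_total = sum((y - grand) ** 2 for row in data.values() for y in row)
    ss_error = ss_total - ss_batch - ss_process

    ms_batch = ss_batch / (b - 1)
    ms_error = ss_error / ((b - 1) * (f - 1))

    # E(MS_Batch) = Var(Error) + f * Var(Batch); E(MS_Error) = Var(Error)
    var_error = ms_error
    var_batch = (ms_batch - ms_error) / f

    return {
        "ms_batch": ms_batch,
        "ms_error": ms_error,
        "var_error": var_error,
        "var_batch": var_batch,
        # process-mean standard error that ignores the batch variance
        "se_mean_fixed": math.sqrt(ms_error / b),
        # process-mean standard error that includes the batch variance
        "se_mean_mixed": math.sqrt((var_batch + var_error) / b),
        # difference of two process means: batch effects cancel
        "se_diff": math.sqrt(2 * ms_error / b),
    }


def satterthwaite_df(coefs, mean_squares, dfs):
    """Estimated degrees of freedom for L = sum_j c_j * MS_j."""
    numerator = sum(c * ms for c, ms in zip(coefs, mean_squares)) ** 2
    denominator = sum(
        (c * ms) ** 2 / df for c, ms, df in zip(coefs, mean_squares, dfs)
    )
    return numerator / denominator


est = moment_estimates(RESPONSE)
for name, value in est.items():
    print(f"{name}: {value:.4f}")
```

Run as written, the sketch gives a batch variance estimate of 12.2083 and an error variance estimate of 32.1667, matching the Proc Mixed covariance parameter estimates for this balanced design. The two process-mean standard errors come out as 2.5364 (the Proc GLM LSMEANS value) and 2.9791 (the Proc Mixed value), and the standard error of a difference is 3.5870.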