Analysis of Repeated Measures Data
Ramon C. Littell

Outline:
1. Introduction
2. Univariate and Multivariate Analyses Using PROC GLM
3. Mixed Model Analyses Using PROC MIXED

Effect of growth regulators on chrysanthemum plants
Data courtesy James Barrett and Terril Nell

Multivariate Data Set

    proc print data=mumsmult;
    run;

    Obs  blk  trt  plt   ht1   ht2   ht3   ht4   ht5   ht6  elong  chem
      1    1    3    1   3.0   4.0   5.5  19.5  33.0  44.5   41.5  bonzi
      2    1    3    2   2.5   3.0   5.5  18.0  33.5  46.5   44.0  bonzi
      3    1    9    1   1.0   2.0   2.0   6.0  14.0  26.5   25.5  sumag
      4    1    9    2   4.0   4.5   7.0  17.0  31.5  44.0   40.0  sumag
      5    2    3    1   1.0   2.0   4.0  17.5  35.5  47.0   46.0  bonzi
      6    2    3    2   4.0   4.5  11.5  29.0  41.5  54.0   50.0  bonzi
      7    2    9    1   3.0   4.0   5.5  17.5  33.0  48.0   45.0  sumag
      8    2    9    2   3.5   4.5   7.0  19.0  31.5  43.5   40.0  sumag
      9    3    3    1   3.0   3.5   5.5  19.0  34.5  46.0   43.0  bonzi
     10    3    3    2   3.5   4.0   6.0  17.0  35.0  48.5   45.0  bonzi
     11    3    9    1   3.0   3.5   6.5  15.5  30.0  40.5   37.5  sumag
     12    3    9    2   2.0   2.0   2.5   5.5  10.5  23.0   21.0  sumag
     13    4    3    1   2.5   4.0   5.0  13.0  30.0  44.0   41.5  bonzi
     14    4    3    2   3.0   3.5   5.0  17.0  33.5  48.0   45.0  bonzi
     15    4    9    1   4.0   4.5   6.5  14.0  28.0  38.0   34.0  sumag
     16    4    9    2   4.0   4.5   8.0  21.0  36.0  47.0   43.0  sumag
     17    5    3    1   2.0   2.5   3.0   9.0  19.5  32.5   30.5  bonzi
     18    5    3    2   3.5   4.0   6.5  20.0  35.5  47.5   44.0  bonzi
     19    5    9    1   3.0   4.0   7.0  19.0  31.5  43.0   40.0  sumag
     20    5    9    2   4.0   4.5   6.5  16.0  26.0  37.0   33.0  sumag

Univariate ANOVA at each time

    proc glm data=mumsmult;
      class blk chem plt;
      model ht1-ht6 = blk chem blk*chem;
      estimate 'bonzi-sumag' chem 1 -1;

    The GLM Procedure

    Class Level Information
    Class   Levels  Values
    blk     5       1 2 3 4 5
    chem    2       bonzi sumag
    plt     2       1 2

    Number of Observations Read  20
    Number of Observations Used  20

    The GLM Procedure
    Dependent Variable: ht1

    Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
    Model             9      4.61250000   0.51250000     0.44  0.8834
    Error            10     11.62500000   1.16250000
    Corrected Total  19     16.23750000

    R-Square  Coeff Var  Root MSE  ht1 Mean
    0.284065   36.24178  1.078193  2.975000

    Source    DF  Type III SS  Mean Square  F Value  Pr > F
    blk        4   1.30000000   0.32500000     0.28  0.8846
    chem       1   0.61250000   0.61250000     0.53  0.4846
    blk*chem   4   2.70000000   0.67500000     0.58  0.6836

    Parameter       Estimate  Standard Error  t Value  Pr > |t|
    bonzi-sumag  -0.35000000      0.48218254    -0.73    0.4846

Summary of Results from ANOVA at each Time

Mean Squares (p-values) and Chem Differences (std. err.) at each Time

    Time  Blk   Chem (p)    Blk*Chem  Error  Difference (s.e.)
    1     .325  .613 (.48)  .675      1.16   -0.35 (.48)
    2     .481  .450 (.51)  .794      0.95   -0.30 (.44)
    3     2.64  0.05 (.93)  3.46      5.73   -0.10 (1.07)
    4     25.3  40.6 (.25)  27.1      27.8    2.85 (2.36)
    5     46.0  177  (.10)  46.0      54.4    5.95 (3.30)
    6     54.5  231  (.06)  37.2      52.5    6.80 (3.24)

Conclusions:
• Differences between Chems do not approach statistical significance until Time 6 (p=.06), even though trends appear to separate at Time 4
• Mean squares increase with Time, corresponding to growth of plants
• Univariate analyses at each time are valid, but not the most efficient

Multivariate ANOVA

Statement added to the PROC GLM step:

      repeated time / printe;
    run;

    Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r|
    DF = 10

              ht1       ht2       ht3       ht4       ht5       ht6
    ht1  1.000000  0.951572  0.915772  0.821807  0.688554  0.688022
                     <.0001    <.0001    0.0019    0.0191    0.0193
    ht2  0.951572  1.000000  0.927271  0.831303  0.740812  0.722152
           <.0001              <.0001    0.0015    0.0091    0.0121
    ht3  0.915772  0.927271  1.000000  0.922833  0.791272  0.788787
           <.0001    <.0001              <.0001    0.0037    0.0039
    ht4  0.821807  0.831303  0.922833  1.000000  0.918780  0.909319
           0.0019    0.0015    <.0001              <.0001    0.0001
    ht5  0.688554  0.740812  0.791272  0.918780  1.000000  0.989512
           0.0191    0.0091    0.0037    <.0001              <.0001
    ht6  0.688022  0.722152  0.788787  0.909319  0.989512  1.000000
           0.0193    0.0121    0.0039    0.0001    <.0001

    MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time Effect
    H = Type III SSCP Matrix for time
    E = Error SSCP Matrix
    S=1  M=1.5  N=2

    Statistic                      Value  F Value  Num DF  Den DF  Pr > F
    Wilks' Lambda             0.00202528   591.31       5       6  <.0001
    Pillai's Trace            0.99797472   591.31       5       6  <.0001
    Hotelling-Lawley Trace  492.75942574   591.31       5       6  <.0001
    Roy's Greatest Root     492.75942574   591.31       5       6  <.0001

    MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time*chem Effect
    H = Type III SSCP Matrix for time*chem
    E = Error SSCP Matrix
    S=1  M=1.5  N=2

    Statistic                    Value  F Value  Num DF  Den DF  Pr > F
    Wilks' Lambda           0.36066374     2.13       5       6  0.1925
    Pillai's Trace          0.63933626     2.13       5       6  0.1925
    Hotelling-Lawley Trace  1.77266579     2.13       5       6  0.1925
    Roy's Greatest Root     1.77266579     2.13       5       6  0.1925

Conclusions from Multivariate ANOVA:
• Effect of Time significant; no surprise
• Time*Chem not significant (p=.19); this reflects the weakness of the multivariate test

Univariate ANOVA (Split-plot in time)

    The GLM Procedure
    Repeated Measures Analysis of Variance
    Tests of Hypotheses for Between Subjects Effects

    Source                DF  Type III SS  Mean Square  F Value  Pr > F
    blk                    4  293.9666667   73.4916667     0.82  0.5425
    chem                   1  183.7687500  183.7687500     2.04  0.1833
    blk*chem               4  288.7000000   72.1750000     0.80  0.5505
    Error ("whole plot")  10  899.0208333   89.9020833

    The GLM Procedure
    Repeated Measures Analysis of Variance
    Univariate Tests of Hypotheses for Within Subject Effects

                                                                           Adj Pr > F
    Source                    DF  Type III SS  Mean Square  F Value  Pr > F     G-G     H-F
    time                       5  26437.68542   5287.53708   502.04  <.0001  <.0001  <.0001
    time*blk                  20    223.03333     11.15167     1.06  0.4184  0.4270  0.4276
    time*chem                  5    266.16875     53.23375     5.05  0.0008  0.0403  0.0107
    time*blk*chem             20    172.55000      8.62750     0.82  0.6799  0.5531  0.6112
    Error(time) ("sub plot")  50    526.60417     10.53208

Conclusions from Split-plot in Time ANOVA:
• Chem Diff not significant (p=.18)
• Chem*Time significant (unadj. p=.0008, H-F adj. p=.01)

Test for Justification of “Split-plot in time” ANOVA

    Sphericity Tests

    Variables              DF  Mauchly's Criterion  Chi-Square  Pr > ChiSq
    Transformed Variates   14            4.6116E-7   118.17504      <.0001
    Orthogonal Components  14            6.7767E-6   96.406309      <.0001

Conclusions:
• The sphericity assumption does not hold
• Therefore the Split-plot in Time analysis is not justified. It would result in incorrect standard errors and invalid tests of hypotheses.

Mixed Model Repeated Measures Analyses

    proc print data=mumsuni;
    run;

    Obs  blk  trt  plt  chem   time    ht
      1    1    3    1  bonzi     1   3.0
      2    1    3    1  bonzi     2   4.0
      3    1    3    1  bonzi     3   5.5
      4    1    3    1  bonzi     4  19.5
      5    1    3    1  bonzi     5  33.0
      6    1    3    1  bonzi     6  44.5
      7    1    3    2  bonzi     1   2.5
      8    1    3    2  bonzi     2   3.0
      9    1    3    2  bonzi     3   5.5
     10    1    3    2  bonzi     4  18.0
     11    1    3    2  bonzi     5  33.5
     12    1    3    2  bonzi     6  46.5
     13    1    9    1  sumag     1   1.0
     14    1    9    1  sumag     2   2.0
     15    1    9    1  sumag     3   2.0
     16    1    9    1  sumag     4   6.0
     17    1    9    1  sumag     5  14.0
     18    1    9    1  sumag     6  26.5
     19    1    9    2  sumag     1   4.0
     20    1    9    2  sumag     2   4.5
     21    1    9    2  sumag     3   7.0
     22    1    9    2  sumag     4  17.0
     23    1    9    2  sumag     5  31.5
     24    1    9    2  sumag     6  44.0
     25    2    3    1  bonzi     1   1.0
     26    2    3    1  bonzi     2   2.0
     27    2    3    1  bonzi     3   4.0
     28    2    3    1  bonzi     4  17.5
     29    2    3    1  bonzi     5  35.5
     30    2    3    1  bonzi     6  47.0
     31    2    3    2  bonzi     1   4.0
     32    2    3    2  bonzi     2   4.5
     33    2    3    2  bonzi     3  11.5
     34    2    3    2  bonzi     4  29.0
     35    2    3    2  bonzi     5  41.5
     36    2    3    2  bonzi     6  54.0
     37    2    9    1  sumag     1   3.0
     38    2    9    1  sumag     2   4.0
     39    2    9    1  sumag     3   5.5
     40    2    9    1  sumag     4  17.5

Mixed model analysis of repeated measures data incorporates the covariance into the analysis, resulting in efficient and valid analyses. The first step is to model the covariance structure. It is usually best to begin with an unstructured covariance and examine the estimated covariance matrix for patterns.
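The mixed model analyses read the data in univariate (long) form, one row per plant per time, as in the mumsuni listing. The slides do not show how mumsuni was created; the following is a minimal sketch of one way to derive it from mumsmult with a DATA step, assuming only the variable names that appear in the two listings. The array name h is a hypothetical name chosen for illustration.

```sas
/* Sketch only: reshape the wide data set (one row per plant)  */
/* into long form (one row per plant per time).                */
/* Assumes mumsmult holds ht1-ht6 and elong as in the listing. */
data mumsuni;
  set mumsmult;
  array h{6} ht1-ht6;      /* the six repeated height measures */
  do time = 1 to 6;
    ht = h{time};          /* height at this measurement time  */
    output;                /* write one row per time           */
  end;
  drop ht1-ht6 elong;      /* leaves blk trt plt chem time ht  */
run;
```

With 20 plants and 6 times, this would produce 120 rows, of which the listing shows the first 40.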
The MIXED procedure uses syntax similar to the GLM procedure. A major distinction is that only fixed effects appear in the model statement. The repeated statement is used to define the covariance structure. The MIXED procedure employs likelihood methods to fit the model and compute inferential statistics.

    proc mixed data=mumsuni;
      class blk chem plt time;
      model ht = chem time chem*time / ddfm=kr;
      repeated time / sub=plt(blk*chem) type=un r rcorr;
    run;

    The Mixed Procedure

    Model Information
    Data Set                   WORK.MUMSUNI
    Dependent Variable         ht
    Covariance Structure       Unstructured
    Subject Effect             plt(blk*chem)
    Estimation Method          REML
    Residual Variance Method   None
    Fixed Effects SE Method    Prasad-Rao-Jeske-Kackar-Harville
    Degrees of Freedom Method  Kenward-Roger

    Class Level Information
    Class   Levels  Values
    blk     5       1 2 3 4 5
    chem    2       bonzi sumag
    plt     2       1 2
    time    6       1 2 3 4 5 6

    Iteration History
    Iteration  Evaluations  -2 Res Log Like   Criterion
    0          1               669.21399990
    1          1               356.89670434  0.00000000

    Convergence criteria met.

This is good news!

Covariance Matrix from r option in repeated statement:

    Estimated R Matrix for plt(blk*chem) 1 1 bonzi
    Row    Col1    Col2     Col3     Col4     Col5     Col6
    1    0.8681  0.7806   1.6236   3.4014   4.3750   4.1736
    2    0.7806  0.8111   1.5667   3.4639   4.7583   4.6444
    3    1.6236  1.5667   4.5361  10.2403  12.5236  11.8722
    4    3.4014  3.4639  10.2403  27.1181  34.4472  32.9903
    5    4.3750  4.7583  12.5236  34.4472  50.6736  49.3125
    6    4.1736  4.6444  11.8722  32.9903  49.3125  49.5417

Interpretation:
• The variance of height is .8681 at time 1, .8111 at time 2, 4.536 at time 3, etc.
• The covariance is .7806 between heights at times 1 and 2, 1.624 between times 1 and 3, 1.567 between times 2 and 3, etc.
• General pattern:
  - variances increase with time
  - covariances increase with time

Correlation Matrix from rcorr option in repeated statement:

    Estimated R Correlation Matrix for plt(blk*chem) 1 1 bonzi
    Row    Col1    Col2    Col3    Col4    Col5    Col6
    1    1.0000  0.9302  0.8182  0.7011  0.6596  0.6364
    2    0.9302  1.0000  0.8168  0.7386  0.7422  0.7327
    3    0.8182  0.8168  1.0000  0.9233  0.8260  0.7920
    4    0.7011  0.7386  0.9233  1.0000  0.9293  0.9001
    5    0.6596  0.7422  0.8260  0.9293  1.0000  0.9842
    6    0.6364  0.7327  0.7920  0.9001  0.9842  1.0000

Interpretation:
• The correlation is .9302 between heights at times 1 and 2, .8182 between times 1 and 3, .8168 between times 2 and 3, etc.
• General pattern:
  - correlations decrease with time interval
  - correlations of equal time lag are similar

Selecting a Covariance Structure

The next step is to select a covariance structure from those with the characteristics identified in the “unstructured” covariance and correlation matrices. One such candidate is heterogeneous autoregressive.

    proc mixed data=mumsuni;
      class blk chem plt time;
      model ht = chem time chem*time / ddfm=kr;
      repeated time / sub=plt(blk*chem) type=arh(1) r rcorr;
    run;

    The Mixed Procedure

    Model Information
    Data Set                   WORK.MUMSUNI
    Dependent Variable         ht
    Covariance Structure       Heterogeneous Autoregressive
    Subject Effect             plt(blk*chem)
    Estimation Method          REML
    Residual Variance Method   None
    Fixed Effects SE Method    Prasad-Rao-Jeske-Kackar-Harville
    Degrees of Freedom Method  Kenward-Roger

    Iteration History
    Iteration  Evaluations  -2 Res Log Like   Criterion
    0          1               669.21399990
    1          2               390.36247704  0.04137490
    2          1               389.73086224  0.02128789
    3          1               387.28045811  0.00456145
    4          1               386.77198610  0.00057582
    5          1               386.71272264  0.00001498
    6          1               386.71128525  0.00000001
    7          1               386.71128394  0.00000000

    Convergence criteria met.

Interpretation: The estimation algorithm converges in 7 steps; more good news.
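For reference, the heterogeneous first-order autoregressive structure requested by type=arh(1) can be written as below. The symbols $\sigma_i$ (standard deviation of height at time $i$) and $\rho$ (correlation between adjacent times) are notation introduced here for illustration; they do not appear in the original slides.

```latex
\operatorname{Cov}(y_i, y_j) = \sigma_i \, \sigma_j \, \rho^{|i-j|},
\qquad i, j = 1, \dots, 6
```

This structure uses six variances and one correlation, 7 covariance parameters in all, compared with 6(6+1)/2 = 21 for type=un. It lets the variances change with time while making the correlation depend only on the time lag and decay as the lag grows, which are the patterns seen in the unstructured estimates.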
    Estimated R Matrix for plt(blk*chem) 1 1 bonzi
    Row    Col1    Col2     Col3     Col4     Col5     Col6
    1    1.0498  0.9780   2.0913   4.3011   5.0151   4.4498
    2    0.9780  1.0623   2.2716   4.6720   5.4475   4.8335
    3    2.0913  2.2716   5.6636  11.6484  13.5819  12.0511
    4    4.3011  4.6720  11.6484  27.9341  32.5708  28.8999
    5    5.0151  5.4475  13.5819  32.5708  44.2811  39.2904
    6    4.4498  4.8335  12.0511  28.8999  39.2904  40.6491

    Estimated R Correlation Matrix for plt(blk*chem) 1 1 bonzi
    Row    Col1    Col2    Col3    Col4    Col5    Col6
    1    1.0000  0.9261  0.8576  0.7942  0.7355  0.6812
    2    0.9261  1.0000  0.9261  0.8576  0.7942  0.7355
    3    0.8576  0.9261  1.0000  0.9261  0.8576  0.7942
    4    0.7942  0.8576  0.9261  1.0000  0.9261  0.8576
    5    0.7355  0.7942  0.8576  0.9261  1.0000  0.9261
    6    0.6812  0.7355  0.7942  0.8576  0.9261  1.0000

    Fit Statistics
    -2 Res Log Likelihood     386.7
    AIC (smaller is better)   400.7
    AICC (smaller is better)  401.8
    BIC (smaller is better)   407.7

Interpretation: The arh(1) covariance and correlation matrices are similar to the “unstructured” matrices. The AICC fit index is 401.8 and the BIC fit index is 407.7.

    Type 3 Tests of Fixed Effects
    Effect     Num DF  Den DF  F Value  Pr > F
    chem            1    18.6     2.39  0.1393
    time            5    35.9   274.95  <.0001
    chem*time       5    35.9     2.70  0.0361

Interpretation: Chem*time is significant (p=.036) when using the arh(1) covariance.

Another covariance structure that is often useful is the Toeplitz structure.
    proc mixed data=mumsuni;
      class blk chem plt time;
      model ht = chem time chem*time / ddfm=kr;
      repeated time / sub=plt(blk*chem) type=toep r rcorr;
    run;

    The Mixed Procedure

    Model Information
    Data Set                   WORK.MUMSUNI
    Dependent Variable         ht
    Covariance Structure       Toeplitz
    Subject Effect             plt(blk*chem)
    Estimation Method          REML
    Residual Variance Method   Profile
    Fixed Effects SE Method    Prasad-Rao-Jeske-Kackar-Harville
    Degrees of Freedom Method  Kenward-Roger

    Iteration History
    Iteration  Evaluations  -2 Res Log Like     Criterion
    0          1               669.21399990
    1          2               521.60761870  17800.229199
    2          1               514.30158009  3121.5889640
    3          1               508.26590893  999.48842706
    4          1               505.47449670    0.02318228
    5          3               503.09397952    0.00247659
    6          1               502.66411897    0.00016872
    7          1               502.63720490    0.00000106
    8          1               502.63704334    0.00000000

    Convergence criteria met.

Interpretation: The estimation algorithm converges in 8 steps; even more good news.

    Estimated R Matrix for plt(blk*chem) 1 1 bonzi
    Row     Col1     Col2     Col3     Col4     Col5     Col6
    1    22.0547  19.3033  14.4120   9.7663   7.1461   8.9552
    2    19.3033  22.0547  19.3033  14.4120   9.7663   7.1461
    3    14.4120  19.3033  22.0547  19.3033  14.4120   9.7663
    4     9.7663  14.4120  19.3033  22.0547  19.3033  14.4120
    5     7.1461   9.7663  14.4120  19.3033  22.0547  19.3033
    6     8.9552   7.1461   9.7663  14.4120  19.3033  22.0547

    Estimated R Correlation Matrix for plt(blk*chem) 1 1 bonzi
    Row    Col1    Col2    Col3    Col4    Col5    Col6
    1    1.0000  0.8752  0.6535  0.4428  0.3240  0.4060
    2    0.8752  1.0000  0.8752  0.6535  0.4428  0.3240
    3    0.6535  0.8752  1.0000  0.8752  0.6535  0.4428
    4    0.4428  0.6535  0.8752  1.0000  0.8752  0.6535
    5    0.3240  0.4428  0.6535  0.8752  1.0000  0.8752
    6    0.4060  0.3240  0.4428  0.6535  0.8752  1.0000

    Fit Statistics
    -2 Res Log Likelihood     502.6
    AIC (smaller is better)   514.6
    AICC (smaller is better)  515.5
    BIC (smaller is better)   520.6

Interpretation: The “toep” and “unstructured” correlation matrices are similar, but the covariance matrices are quite different because of the heterogeneous variances, which “toep” does not accommodate. The AICC fit index is 515.5 and the BIC fit index is 520.6 for toep, both larger than for arh(1). This indicates using the arh(1) covariance structure.

Mixed Model Repeated Measures Analyses

The next step is to use the selected covariance structure and compute inferential statistics. For this example, differences between the growth retardants at each time are of interest.

    proc mixed data=mumsuni;
      class blk chem plt time;
      model ht = chem time chem*time / ddfm=kr;
      repeated time / sub=plt(blk*chem) type=arh(1);
      estimate 'bonzi-sumag' chem 1 -1;
      estimate 'b-s time1' chem 1 -1 chem*time 1 0 0 0 0 0 -1 0 0 0 0 0;
      estimate 'b-s time2' chem 1 -1 chem*time 0 1 0 0 0 0 0 -1 0 0 0 0;
      estimate 'b-s time3' chem 1 -1 chem*time 0 0 1 0 0 0 0 0 -1 0 0 0;
      estimate 'b-s time4' chem 1 -1 chem*time 0 0 0 1 0 0 0 0 0 -1 0 0;
      estimate 'b-s time5' chem 1 -1 chem*time 0 0 0 0 1 0 0 0 0 0 -1 0;
      estimate 'b-s time6' chem 1 -1 chem*time 0 0 0 0 0 1 0 0 0 0 0 -1;
    run;

    Estimates
    Label        Estimate  Standard Error    DF  t Value  Pr > |t|
    bonzi-sumag    2.4750          1.6025  18.6     1.54    0.1393
    b-s time1     -0.3500          0.4582  17.5    -0.76    0.4552
    b-s time2     -0.3000          0.4609  17.4    -0.65    0.5237
    b-s time3    -0.10000          1.0643  17.4    -0.09    0.9262
    b-s time4      2.8500          2.3636  18.3     1.21    0.2433
    b-s time5      5.9500          2.9759  19.6     2.00    0.0596
    b-s time6      6.8000          2.8513  20.5     2.38    0.0268

Comparison of inferential results with ANOVA at each time:

          ANOVA                       MIXED
    Time  Chem p-value  Diff. (s.e.)  Chem p-value  Diff. (s.e.)
    1     .48           -0.35 (.48)   .46           -0.35 (.46)
    2     .51           -0.30 (.44)   .52           -0.30 (.46)
    3     .93           -0.10 (1.07)  .93           -0.10 (1.06)
    4     .25            2.85 (2.36)  .24            2.85 (2.36)
    5     .10            5.95 (3.30)  .06            5.95 (2.97)
    6     .06            6.80 (3.24)  .03            6.80 (2.85)
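The per-time comparisons above depend on hand-coded coefficient vectors, which are easy to mistype. As a hedged alternative that is not part of the original analysis, the same questions could be put to PROC MIXED through the LSMEANS statement with its SLICE= and DIFF options. The sketch below reuses the model from the slides; the subject effect is written as plt(blk*chem) to match the Subject Effect line in the Model Information output.

```sas
/* Sketch only: same arh(1) model, with LSMEANS in place of    */
/* the hand-coded ESTIMATE statements.                          */
proc mixed data=mumsuni;
  class blk chem plt time;
  model ht = chem time chem*time / ddfm=kr;
  repeated time / subject=plt(blk*chem) type=arh(1);
  lsmeans chem*time / slice=time;  /* test of chem within each time  */
  lsmeans chem / diff;             /* chem difference averaged over time */
run;
```

Because chem has only two levels, each sliced F statistic should equal the square of the corresponding t value in the Estimates table. The sliced tests do not print the estimated differences or their standard errors, so the ESTIMATE statements remain the more direct route when those quantities are needed.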