Comparison of Repeated Measures and Covariance Analysis for Pretest-Posttest Data

Chunmei Zhou

Introduction

We compare repeated measures analysis and covariance analysis for pretest-posttest data. We consider a study design in which subjects are randomized to a drug or placebo group and systolic blood pressure (SBP) is measured before and after treatment. To develop models for assessing the effect of treatment on SBP, we first treat baseline blood pressure as a covariate. As an alternative, the baseline and final blood pressures can be treated as repeated measures.

Models

We assume the study is conducted on a large population of subjects, s = 1, ..., N. Potentially, each of the N subjects could receive either of the two treatments (p = 1, 2), and the pretest (t = 1) and posttest (t = 2) measures are taken on each subject. We assume the time interval between pretest and posttest measurements is the same for all subjects. A model for the response of subject s measured at time t under treatment p is

$$Y_{sptk} = \mu_{spt} + E_{sptk},$$

where $\mu_{spt}$ is a fixed constant corresponding to the potentially observable response of subject s at time t under treatment p, and $E_{sptk}$ is the response error.

We can express $\mu_{spt}$ using $\mu_{sp1}$ and $\mu_{sp2}$. For the pretest,

$$\mu_{sp1} = \mu_{s1} + (\mu_{sp1} - \mu_{s1}) \quad \text{for } p = 1, 2, \qquad \text{where } \mu_{s1} = \frac{1}{2}\sum_{p=1}^{2} \mu_{sp1}.$$

The term $(\mu_{sp1} - \mu_{s1})$ corresponds to the difference in baseline response due to treatment assignment. The posttest response $\mu_{sp2}$ is unchanged at this step.

We assume treatment assignment has no effect on the pretest measure, so $\mu_{sp1} - \mu_{s1} = 0$. If we let $\mu_{s1} = \mu_s$, then

$$\mu_{sp2} = \mu_s + (\mu_{sp2} - \mu_s) = \mu_s + \delta_{sp},$$

where $\delta_{sp}$ is the deviation of the posttest response from the baseline response of subject s due to treatment assignment. The response can then be represented by the model

$$Y_{sptk} = \mu_s + X_{spt}\,\delta_{sp} + E_{sptk}, \qquad X_{spt} = \begin{cases} 0 & \text{if } t = 1 \\ 1 & \text{if } t = 2. \end{cases}$$

We define the mean of the baseline measures as $\mu = \frac{1}{N}\sum_{s=1}^{N} \mu_s$. Then $\mu_s$ can be expressed as $\mu_s = \mu + (\mu_s - \mu) = \mu + \beta_s$, where $\beta_s$ is the deviation of subject s from the overall baseline mean.
726864958 Page 1 4/11/2020

We define the mean of the difference between the posttest and pretest measures as $\delta_p = \frac{1}{N}\sum_{s=1}^{N} \delta_{sp}$ for p = 1 and 2. Thus $\delta_{sp} = \delta_p + (\delta_{sp} - \delta_p) = \delta_p + \beta_{sp}$, where $\beta_{sp}$ is the deviation of subject s from the mean posttest-pretest difference.

The model for the response then becomes

$$Y_{sptk} = \mu + \beta_s + X_{spt}\,\delta_p + X_{spt}\,\beta_{sp} + E_{sptk}.$$

From this model we get

$$Y_{sp1k} = \mu + \beta_s + E_{sp1k} \quad \text{and} \quad Y_{sp2k} = \mu + \beta_s + \delta_p + \beta_{sp} + E_{sp2k}.$$

If we assume a linear relationship between the pre- and post-treatment measures, then

$$Y_{sp2k} = \mu + \beta_s + \delta_p + \beta_{sp} + \gamma\,Y_{sp1k} + E_{sp2k}.$$

Next, we express the models for randomly selected subjects. If we index the selection by the subscript i, the response can be represented by the repeated measures model

$$Y_{iptk} = \mu + X_{ipt}\,\delta_p + X_{ipt}\,B_{ip} + B_i + E_{iptk},$$

where $B_i$ is a random variable representing the deviation from the population mean at baseline for the ith selected subject, $B_{ip}$ is a random variable representing the deviation from the mean posttest-pretest difference for the ith selected subject, and $E_{iptk}$ is a random variable representing the response error of the SBP measure.

If we use the baseline measure as a covariate, a model for the final SBP is

$$Y_{ip2k} = \mu + \delta_p + \gamma\,Y_{ip1k} + B_{ip} + B_i + E_{ip2k},$$

which is called the covariance model. Here $\gamma$ is the coefficient of the baseline measure and $E_{ip2k}$ is a random variable representing the response error of the final measure.

Vector and Matrix Representation of the Model

We can write both models in vector and matrix form as the mixed model

$$Y = X\alpha + ZB + E.$$

For the repeated-measures model with a sample of n subjects,

$$\underset{2n \times 1}{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_{2n} \end{pmatrix}, \quad
\underset{2n \times 2}{X} = \begin{pmatrix} 1 & X_{1p1} \\ 1 & X_{1p2} \\ \vdots & \vdots \\ 1 & X_{np1} \\ 1 & X_{np2} \end{pmatrix}, \quad
\underset{2 \times 1}{\alpha} = \begin{pmatrix} \mu \\ \delta_p \end{pmatrix},$$

$$\underset{2n \times 2n}{Z} = \begin{pmatrix}
X_{1p1} & 1 & 0 & 0 & \cdots & 0 & 0 \\
X_{1p2} & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & X_{2p1} & 1 & \cdots & 0 & 0 \\
0 & 0 & X_{2p2} & 1 & \cdots & 0 & 0 \\
\vdots & & & & \ddots & & \vdots \\
0 & 0 & 0 & 0 & \cdots & X_{np1} & 1 \\
0 & 0 & 0 & 0 & \cdots & X_{np2} & 1
\end{pmatrix}, \quad
\underset{2n \times 1}{B} = \begin{pmatrix} B_{1p} \\ B_1 \\ \vdots \\ B_{np} \\ B_n \end{pmatrix}.$$

For this model, $V = \mathrm{var}(ZB + E) = ZGZ' + R$, where $G = \sigma^2 I_{2n}$.
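As a worked check of this covariance structure, consider the block of V that belongs to a single subject i. This is only a sketch under the assumptions already stated: $X_{ip1} = 0$, $X_{ip2} = 1$, and both random effects share the variance $\sigma^2$. The labels $Z_i$ and $V_i$ (the rows of Z and the block of V for subject i) are introduced here for illustration, and $R_i$ denotes that subject's 2 x 2 residual block.

```latex
Z_i = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix},
\qquad
V_i = \sigma^2 Z_i Z_i' + R_i
    = \sigma^2 \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} + R_i .
```

Under these assumptions the off-diagonal entry $\sigma^2$ is the pretest-posttest covariance induced by the shared subject effect $B_i$, and the posttest variance carries an extra $\sigma^2$ from $B_{ip}$.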
The R matrices for the placebo and treatment groups are different. For the placebo group,

$$R_i = \begin{pmatrix} \sigma_e^2 & 0 \\ 0 & \sigma_e^2 + \sigma_p^2 \end{pmatrix};$$

for the treatment group,

$$R_i = \begin{pmatrix} \sigma_e^2 & 0 \\ 0 & \sigma_e^2 + \sigma_t^2 \end{pmatrix}.$$

For the covariance model,

$$\underset{n \times 1}{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
\underset{n \times 3}{X} = \begin{pmatrix} 1 & 1 & Y_{11} \\ 1 & 1 & Y_{12} \\ \vdots & \vdots & \vdots \\ 1 & 1 & Y_{1n} \end{pmatrix}, \quad
\underset{3 \times 1}{\alpha} = \begin{pmatrix} \mu \\ \delta_p \\ \gamma \end{pmatrix}, \quad
\underset{n \times 2n}{Z} = I_n \otimes 1_2', \quad
\underset{2n \times 1}{B} = \begin{pmatrix} B_{1p} \\ B_1 \\ \vdots \\ B_{np} \\ B_n \end{pmatrix}.$$

Again $V = \mathrm{var}(ZB + E) = ZGZ' + R$, where $G = \sigma^2 I_{2n}$ and $R = \sigma_e^2 I_n$. Because $ZZ' = 2I_n$, the result is $V = (2\sigma^2 + \sigma_e^2) I_n$.

Data Description

We simulate data for a pretest-posttest study to compare covariance analysis and repeated measures analysis. We assume the response variable is normally distributed, the sample size is 30, the average response at baseline is 175, and the treatment effect is -5. The SAS code for the simulation is given below:

    %let basemean=175;   *Overall average response at baseline;
    %let nsub=30;        *Number of subjects;
    %let err_v=100;      *Residual error variance at a time point;
    %let tsubv=100;      *Treatment by subject variance;
    %let subv=200;       *Variance of the subject effects;
    %let nrep=1000;      *Number of replications of the simulation;
    %let effp1=0;        *Placebo effect;
    %let effp2=-5;       *Treatment effect;

    data sbp;
      do trial=1 to &nrep;
        do subj=1 to &nsub;
          m=&basemean;
          sub=sqrt(&subv)*rannor(3345);
          do time=1 to 2;
            v=sqrt(&err_v)*rannor(3345);
            if subj<=&nsub/2 then do;
              trt=1;
              if time=1 then y=m+sub+v;
              if time=2 then y=m+sub+&effp1+v;
            end;
            if subj>&nsub/2 then do;
              trt=2;
              if time=1 then y=m+sub+v;
              if time=2 then y=m+sub+&effp2+v+sqrt(&tsubv)*rannor(3345);
            end;
            output;
          end;
        end;
      end;

Model Fitting Results

First, we fit the covariance model with placebo as the reference group, using the following statements:

    proc mixed data=a;
      by trial;
      class subj tref;
      model sbp2=sbp1 tref/s;
      make 'solutionf' out=est1;
    run;

An example of the variance and treatment effect estimates from one trial is shown below:

    Covariance Parameter Estimates

    Cov Parm    Estimate
    Residual    207.11

    Solution for Fixed Effects

    Effect      tref   Estimate   Standard Error   DF   t Value   Pr > |t|
    Intercept          66.6827    31.0513          27    2.15     0.0409
    sbp1                0.6280     0.1834          27    3.42     0.0020
    tref        0      -5.9770     5.3231          27   -1.12     0.2714
    tref        1       0          .               .     .        .

The residual error variance is estimated as 207.11 and the treatment effect as -5.9770.

Then we fit the repeated measures model with placebo and pretest as the reference groups. We also create a group variable that takes the value 1 for the placebo group at posttest, 2 for the treatment group at posttest, and 3 for the pretest. We use this group variable to estimate the different R matrices. The SAS code is shown below:

    proc mixed data=b;
      by trial;
      class subj tref timref grp;
      model y=tref timref tref*timref/s;
      repeated /group=grp;
      random subj;
      make 'solutionf' out=est2;
    run;

An example of the variance and treatment effect estimates from one trial is shown below:

    Covariance Parameter Estimates

    Cov Parm    Group    Estimate
    subj                 170.26
    Residual    grp 1     72.3522
    Residual    grp 2    259.29
    Residual    grp 3     49.7515

    Solution for Fixed Effects

    Effect        tref   timref   Estimate   Standard Error   DF   t Value   Pr > |t|
    Intercept                     168.13     3.8298           28   43.90     <.0001
    tref          0                 4.6322   5.4162           28    0.86     0.3997
    tref          1                 0        .                .     .        .
    timref               0          4.1336   2.8531           28    1.45     0.1585
    timref               1          0        .                .     .        .
    tref*timref   0      0         -7.7003   5.3612           28   -1.44     0.1620
    tref*timref   0      1          0        .                .     .        .
    tref*timref   1      0          0        .                .     .        .
    tref*timref   1      1          0        .                .     .        .

Four variance component estimates are given. The between-subject variance is estimated as 170.26. The estimated R matrix for the placebo group is

$$R_i = \begin{pmatrix} \sigma_e^2 & 0 \\ 0 & \sigma_e^2 + \sigma_p^2 \end{pmatrix} = \begin{pmatrix} 49.75 & 0 \\ 0 & 72.35 \end{pmatrix},$$

and the estimated R matrix for the treatment group is

$$R_i = \begin{pmatrix} \sigma_e^2 & 0 \\ 0 & \sigma_e^2 + \sigma_t^2 \end{pmatrix} = \begin{pmatrix} 49.75 & 0 \\ 0 & 259.29 \end{pmatrix}.$$

The treatment effect is estimated as -7.7003.

Comparison of Two Approaches

To compare the two approaches (covariance analysis and repeated measures analysis), we generate data under six sets of variance components (subject variance, residual error variance, and treatment-by-subject variance). For each set we run the simulation 1000 times and obtain an estimate of the treatment effect from each method in every trial.
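The six variance settings can be generated by wrapping the simulation step in a macro. The following is a minimal sketch, not the program actually used: the macro name `simulate`, its parameters, and the output data set names `sim1` to `sim6` are hypothetical, while the sample size (30), baseline mean (175), treatment effect (-5), seed (3345), and the six variance settings are taken from the simulation listing and from the captions of Tables 1 to 6.

```sas
/* Hypothetical wrapper around the simulation step: one call per variance setting */
%macro simulate(subv=, err_v=, tsubv=, out=);
  data &out;
    do trial = 1 to 1000;
      do subj = 1 to 30;
        sub = sqrt(&subv) * rannor(3345);      /* subject effect */
        trt = 1 + (subj > 15);                 /* 1 = placebo, 2 = drug */
        do time = 1 to 2;
          v = sqrt(&err_v) * rannor(3345);     /* residual error */
          y = 175 + sub + v;
          /* drug group at posttest: effect -5 plus treatment-by-subject term */
          if trt = 2 and time = 2 then
            y = y - 5 + sqrt(&tsubv) * rannor(3345);
          output;
        end;
      end;
    end;
    keep trial subj trt time y;
  run;
%mend simulate;

/* Variance settings as listed in the captions of Tables 1 to 6 */
%simulate(subv=200, err_v=100, tsubv=1000, out=sim1);
%simulate(subv=200, err_v=100, tsubv=100,  out=sim2);
%simulate(subv=100, err_v=50,  tsubv=100,  out=sim3);
%simulate(subv=100, err_v=100, tsubv=100,  out=sim4);
%simulate(subv=200, err_v=200, tsubv=100,  out=sim5);
%simulate(subv=400, err_v=400, tsubv=400,  out=sim6);
```

Note that the last call as written would need `tsubv=100` to match the Table 6 caption; the value shown should be adjusted accordingly before use. Reusing one seed also means the six data sets are built from the same underlying stream of standard normal draws, and the source does not state whether the original runs did this.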
However, the repeated-measures model did not produce results for all 1000 simulations, because some trials ended with an infinite likelihood or a convergence failure. The simulation results are shown in the following tables.

Table 1. Treatment effect estimates from the two models over 1000 trials (true treatment effect -5, subject variance 200, residual error variance 100, treatment-by-subject variance 1000)

    Method              No. of trials   Mean of estimates   Std. dev. of estimates
    Covariance          1000            -4.66                9.20
    Repeated measures    877            -4.66                9.27

Table 2. Treatment effect estimates from the two models over 1000 trials (true treatment effect -5, subject variance 200, residual error variance 100, treatment-by-subject variance 100)

    Method              No. of trials   Mean of estimates   Std. dev. of estimates
    Covariance          1000            -4.97                5.30
    Repeated measures    911            -4.99                5.70

Table 3. Treatment effect estimates from the two models over 1000 trials (true treatment effect -5, subject variance 100, residual error variance 50, treatment-by-subject variance 100)

    Method              No. of trials   Mean of estimates   Std. dev. of estimates
    Covariance          1000            -4.94                4.14
    Repeated measures    962            -4.92                4.39

Table 4. Treatment effect estimates from the two models over 1000 trials (true treatment effect -5, subject variance 100, residual error variance 100, treatment-by-subject variance 100)

    Method              No. of trials   Mean of estimates   Std. dev. of estimates
    Covariance          1000            -4.95                5.07
    Repeated measures    962            -4.98                5.69

Table 5. Treatment effect estimates from the two models over 1000 trials (true treatment effect -5, subject variance 200, residual error variance 200, treatment-by-subject variance 100)

    Method              No. of trials   Mean of estimates   Std. dev. of estimates
    Covariance          1000            -5.00                6.74
    Repeated measures    965            -5.05                7.65

Table 6. Treatment effect estimates from the two models over 1000 trials (true treatment effect -5, subject variance 400, residual error variance 400, treatment-by-subject variance 100)

    Method              No. of trials   Mean of estimates   Std. dev. of estimates
    Covariance          1000            -5.06                9.21
    Repeated measures    970            -5.13               10.51

To better illustrate the distribution of the estimates, we draw box plots of the estimates summarized in Table 5 and Table 6. The plots show that the estimates from the repeated-measures model are more scattered and have a larger range. The subject variance and error variance in Figure 2 are twice those in Figure 1, so the estimates are more dispersed in Figure 2 than in Figure 1.

[Figure 1: box plots of the treatment effect estimates, one for the covariance model and one for the repeated-measures model]

Figure 1. Box plots of treatment effect estimates from the two models over 1000 trials (true treatment effect -5, subject variance 200, residual error variance 200, treatment-by-subject variance 100)

[Figure 2: box plots of the treatment effect estimates, one for the covariance model and one for the repeated-measures model]

Figure 2. Box plots of treatment effect estimates from the two models over 1000 trials (true treatment effect -5, subject variance 400, residual error variance 400, treatment-by-subject variance 100)

The tables and box plots suggest that the mean estimates of the treatment effect from the two methods differ very little. As the variance components increase, the standard deviations of the estimates increase as well. However, in all six settings the standard deviation of the estimates is smaller for the covariance model than for the repeated-measures model. We therefore conclude that, in the long run, covariance analysis may give a more precise estimate of the treatment effect.

Discussion

Repeated-measures analysis of variance and analysis of covariance are two common approaches to the analysis of pretest-posttest data.
They are closely related, but their assumptions and the variance estimates for the parameters differ. Our results suggest that covariance analysis may improve the precision of the treatment effect estimates compared with repeated measures analysis. Using a baseline covariate compensates for any differences between the treatment groups in the mean level of the covariate before treatment is received. However, the assumption of a linear relationship between the pre- and post-treatment values may not hold. If it does not, fitting a baseline covariate could lead to less precise results.

Attachment: SAS Program

    OPTIONS LINESIZE=80 PAGESIZE=55 NOCENTER NODATE NONUMBER NOFMTERR;
    *******************************************************************;
    * BioEpi 740 Final Project                                         ;
    * PROGRAM NAME          DATE          PROGRAMMER                   ;
    Title1 "Source:BE740FINAL.SAS 5/12/2001 CMZ " ;
    * Description:                                                     ;
    * -Simulate data for a pretest-posttest design                     ;
    * -Applied Mixed Model on final Systolic Blood Pressure using      ;
    *  baseline SBP as a covariate                                     ;
    * -Applied Mixed Model considering baseline and final SBP as       ;
    *  repeated measures                                               ;
    *******************************************************************;

    %let basemean=175;   *Overall average response at baseline;
    %let nsub=30;        *Number of subjects;
    %let err_v=100;      *Residual error variance at a time point;
    %let tsubv=100;      *Treatment by subject variance;
    %let subv=200;       *Variance of the subject effects;
    %let nrep=1000;      *Number of replications of the simulation;
    %let effp1=0;        *Placebo effect;
    %let effp2=-5;       *Treatment effect;

    data sbp;
      do trial=1 to &nrep;
        do subj=1 to &nsub;
          m=&basemean;
          sub=sqrt(&subv)*rannor(3345);
          do time=1 to 2;
            v=sqrt(&err_v)*rannor(3345);
            if subj<=&nsub/2 then do;
              trt=1;
              if time=1 then y=m+sub+v;
              if time=2 then y=m+sub+&effp1+v;
            end;
            if subj>&nsub/2 then do;
              trt=2;
              if time=1 then y=m+sub+v;
              if time=2 then y=m+sub+&effp2+v+sqrt(&tsubv)*rannor(3345);
            end;
            output;
          end;
        end;
      end;
    run;

    /*ODS LISTING CLOSE;*/ /* turn printed output off */

    proc means mean var maxdec=2 data=sbp nway;
      by trial;
      class time trt;
      var y;
      output out=meandata (drop=_type_ _freq_) mean=mean var=var;
      title2 "nsub=&nsub Baseline Mean=&basemean Resid Var=&err_v";
      title3 "Subject Var=&subv Trt*Sub Var=&tsubv Trt=&effp2";
      title4 "Table1. Descriptive Statistics for Simulation Data";
    run;

    ********************************************;
    ** Create data set for covariance analysis **;
    ** and fit covariance model                **;
    ********************************************;
    proc sort data=sbp;
      by trial subj;

    data a;
      array sbp[2];
      retain sbp1 sbp2;
      set sbp;
      by trial subj;
      sbp[time]=y;
      if trt=2 then tref=0;
      if trt=1 then tref=1;
      if last.subj then output;
      keep trial subj tref sbp1 sbp2;

    proc mixed data=a;
      by trial;
      class subj tref;
      model sbp2=sbp1 tref/s;
      make 'solutionf' out=est1;
      title4 "Table2. Covariance Model Fitting Results";
    run;

    ****************************************************;
    ** Create data set for repeated-measures analysis **;
    ** Create group variable for R matrices estimates **;
    ** Use time=1 and trt=1 as reference group        **;
    ** Fit repeated-measures model                    **;
    ****************************************************;
    data b;
      set sbp;
      if trt=1 and time=2 then grp=1;
      if trt=2 and time=2 then grp=2;
      if time=1 then grp=3;
      if trt=2 then tref=0;
      if trt=1 then tref=1;
      if time=2 then timref=0;
      if time=1 then timref=1;

    proc sort data=b;
      by trial grp;

    proc mixed data=b;
      by trial;
      class subj tref timref grp;
      model y=tref timref tref*timref/s;
      repeated /group=grp;
      random subj;
      make 'solutionf' out=est2;
      title4 "Table3. Repeated Measures Model Fitting Results With";
      title5 "different residual variances";
    run;

    *********************************************************************;
    ** Construct summary table for 1000 estimates of treatment effect  **;
    ** from covariance analysis and repeated-measures analysis results **;
    *********************************************************************;
    proc transpose data=est1 out=cov (keep=trial est3) prefix=est;
      by trial;
      var estimate;
    run;

    proc transpose data=est2 out=repeat (keep=trial est6) prefix=est;
      by trial;
      var estimate;
    run;

    ods listing; /* turn printed output on */

    data compare;
      merge cov repeat;
      by trial;
      covest=est3;
      repest=est6;
      label covest="Est. of Trt. Eff.*from*cov.*model: *EST"
            repest="Est.*of*trt.*eff.*from*repeat*model:*EST";

    proc means data=compare n mean std maxdec=2;
      var covest repest;
    run;
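The repeated-measures fits did not succeed for every trial, so it can help to record which trials failed rather than infer the count from missing estimates. The sketch below is an assumption-laden variant of the repeated-measures step, not part of the original program: it assumes a SAS release in which PROC MIXED provides the ConvergenceStatus ODS table with a numeric Status column (0 taken to mean a converged fit), and the data set name `conv2` is hypothetical.

```sas
/* Assumption: PROC MIXED exposes the ODS tables SolutionF and ConvergenceStatus */
ods output SolutionF=est2 ConvergenceStatus=conv2;

proc mixed data=b;
  by trial;
  class subj tref timref grp;
  model y = tref timref tref*timref / s;
  repeated / group=grp;
  random subj;
run;

/* Tabulate the convergence status across trials (0 assumed to mean converged) */
proc freq data=conv2;
  tables status;
run;
```

Trials stopped by an infinite likelihood may not appear in `conv2` at all, so the number of trials present in `est2` should still be compared against the 1000 simulated trials.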