Comparison of Repeated Measures and Covariance Analysis
for Pretest-Posttest Data
Chunmei Zhou
Introduction
We compare repeated-measures analysis and covariance analysis for pretest-posttest
data. We consider a study design in which subjects are randomized to a drug or placebo
group and their systolic blood pressure (SBP) is measured before and after treatment. To
develop models for assessing the effect of treatment on SBP, we first treat baseline blood
pressure as a covariate. As an alternative, the baseline and final blood pressure can be
treated as repeated measures.
Models
We assume the study is conducted on a large number of subjects, indexed by s = 1, …, N.
Potentially, each of the N subjects could receive either of the two treatments (p = 1, 2), and
the pretest (t = 1) and posttest (t = 2) measures are taken on every subject. We assume the
time interval between pretest and posttest measurements is the same for all subjects.
A model for a response for subject s measured at time t for treatment p is given by
$$Y_{sptk} = \mu_{spt} + E_{sptk},$$
where $\mu_{spt}$ is a fixed constant corresponding to the potentially observable response of
subject s at time t for treatment p, and $E_{sptk}$ corresponds to response error.
We can express $\mu_{spt}$ using $\mu_{sp1}$ and $\mu_{sp2}$.
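For p = 1, 2,
$$\mu_{sp1} = \mu_{s1} + (\mu_{sp1} - \mu_{s1}), \qquad \text{where } \mu_{s1} = \frac{1}{2}\sum_{p=1}^{2} \mu_{sp1}.$$
The term $(\mu_{sp1} - \mu_{s1})$ corresponds to the difference in baseline response due to
treatment assignment. For the posttest,
$$\mu_{sp2} = \mu_{sp2}.$$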
We assume treatment assignment has no effect on the pretest measure; therefore
$\mu_{sp1} - \mu_{s1} = 0$. If we let $\mu_{s1} = \mu_s$, then
$\mu_{sp2} = \mu_s + (\mu_{sp2} - \mu_s) = \mu_s + \delta_{sp}$, where $\delta_{sp}$ corresponds
to the deviation from the subject's mean baseline response for subject s due to treatment
assignment.
Then the response can be represented via the model
$$Y_{sptk} = \mu_s + X_{spt}\,\delta_{sp} + E_{sptk}, \qquad \text{where } X_{spt} = \begin{cases} 0 & \text{if } t = 1 \\ 1 & \text{if } t = 2. \end{cases}$$
We define a parameter for the mean of the baseline measures as
$\mu = \frac{1}{N}\sum_{s=1}^{N} \mu_s$; then $\mu_s$ can be expressed as
$\mu_s = \mu + (\mu_s - \mu) = \mu + \beta_s$, where $\beta_s$ corresponds to the deviation
from the overall baseline mean for subject s.
726864958
Page 1
4/11/2020
We define a parameter for the mean of the difference between the posttest and pretest
measures as $\delta_p = \frac{1}{N}\sum_{s=1}^{N} \delta_{sp}$ for p = 1 and 2; thus
$\delta_{sp} = \delta_p + (\delta_{sp} - \delta_p) = \delta_p + \beta_{sp}$, where $\beta_{sp}$
corresponds to the deviation from the mean of the difference between the posttest and
pretest measures for subject s.
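Then we can modify the model for the response as
$$Y_{sptk} = \mu + \beta_s + X_{spt}\,\delta_p + X_{spt}\,\beta_{sp} + E_{sptk}.$$
From this model, we get
$$Y_{sp1k} = \mu + \beta_s + E_{sp1k} \quad \text{and} \quad Y_{sp2k} = \mu + \beta_s + \delta_p + \beta_{sp} + E_{sp2k}.$$
If we assume there is a linear relationship between pre- and post-treatment measures, then
$$Y_{sp2k} = \mu + \beta_s + \delta_p + \beta_{sp} + \gamma\,Y_{sp1k} + E_{sp2k},$$
where $\gamma$ is the coefficient of the baseline measure.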
Next, we express models for randomly selected subjects. Suppose we index the selection by
the subscript i; then the response can be represented via the model
$$Y_{iptk} = \mu + X_{ipt}\,\delta_p + X_{ipt}\,B_{ip} + B_i + E_{iptk},$$
which is the repeated-measures model. Here $B_i$ is a random variable that represents the
deviation from the population mean at baseline for the ith selected subject, $B_{ip}$ is
another random variable corresponding to the deviation from the mean of the difference
between the posttest and pretest measures for the ith selected subject, and $E_{iptk}$ is a
random variable that represents the response error for the SBP measure.
If we use the baseline measure as a covariate, a model for the final SBP is given by
$$Y_{ip2k} = \mu + \delta_p + \gamma\,Y_{ip1k} + B_{ip} + B_i + E_{ip2k},$$
which is also called the covariance model. Here $\gamma$ is the coefficient of the baseline
measure, and $E_{ip2k}$ is a random variable that represents the response error for the final
measure.
Vector and Matrix Representation of the Model
We can represent the models in vector and matrix form as the mixed model
$$Y = X\alpha + ZB + E.$$
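For the repeated-measures model, if the sample size of our study is n, then
$$\underset{2n\times 1}{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_{2n} \end{pmatrix}, \quad
\underset{2n\times 2}{X} = \begin{pmatrix} 1 & X_{1p1} \\ 1 & X_{1p2} \\ \vdots & \vdots \\ 1 & X_{np1} \\ 1 & X_{np2} \end{pmatrix}, \quad
\underset{2\times 1}{\alpha} = \begin{pmatrix} \mu \\ \delta_p \end{pmatrix},$$
$$\underset{2n\times 2n}{Z} = \begin{pmatrix}
X_{1p1} & 1 & 0 & 0 & \cdots & 0 & 0 \\
X_{1p2} & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & X_{2p1} & 1 & \cdots & 0 & 0 \\
0 & 0 & X_{2p2} & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & X_{np1} & 1 \\
0 & 0 & 0 & 0 & \cdots & X_{np2} & 1
\end{pmatrix}, \quad \text{and} \quad
\underset{2n\times 1}{B} = \begin{pmatrix} B_{1p} \\ B_1 \\ \vdots \\ B_{np} \\ B_n \end{pmatrix}.$$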
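For this model, $V = \operatorname{var}(ZB + E) = ZGZ' + R$, where $G = \sigma^2 I_{2n}$. The R
matrices for the placebo and treatment groups are different. For the placebo group,
$$R_i = \begin{pmatrix} \sigma_e^2 & 0 \\ 0 & \sigma_e^2 + \sigma_p^2 \end{pmatrix};$$
for the treatment group,
$$R_i = \begin{pmatrix} \sigma_e^2 & 0 \\ 0 & \sigma_e^2 + \sigma_t^2 \end{pmatrix}.$$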
1
 Y1 

 
1
 Y2 
For the covariance model, Y    , X  
n3



 
1
Y 

 n
1 Y11 


 
1 Y12 
α

,
 p  , Z  I n  1' 2 , and
   31   n2 n

 
1 Y1n 
 B1 p 


 B1 
Β     . V=var(ZB+E)=ZGZ+R , where G=2I2n and R=e2In. As a result,
2 n1


 Bnp 
B 
 n
V  ( 2   e2 )I n .
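As a quick numerical check of the covariance-model variance structure, the sketch below builds Z = I_n ⊗ 1'_2, G, and R for a small case and forms V = ZGZ' + R. It is only an illustration: it uses Python rather than SAS, the helper names are invented for this example, and n = 3, σ² = 200, and σ_e² = 100 are arbitrary values chosen for readability.

```python
# Minimal sketch: form V = Z G Z' + R for the covariance model with
# Z = I_n (x) 1'_2, G = sigma2 * I_2n, R = sigma2_e * I_n.
# All names and numeric values here are illustrative assumptions.

def identity(k):
    """k x k identity matrix as a list of lists."""
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]

def matmul(a, b):
    """Plain matrix product of two list-of-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def scale(a, c):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def covariance_model_v(n, sigma2, sigma2_e):
    # Row i of Z has ones in columns 2i and 2i+1, the slots that hold
    # the two random effects (B_ip and B_i) of subject i.
    z = [[1.0 if j // 2 == i else 0.0 for j in range(2 * n)] for i in range(n)]
    g = scale(identity(2 * n), sigma2)
    r = scale(identity(n), sigma2_e)
    return add(matmul(matmul(z, g), transpose(z)), r)

v = covariance_model_v(n=3, sigma2=200.0, sigma2_e=100.0)
for row in v:
    print(row)
```

Because each row of Z picks up the two random effects of one subject, every diagonal entry of V equals twice σ² plus σ_e², and the off-diagonal entries are zero.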
Data Description
We simulate data for a pretest-posttest study to compare covariance analysis and
repeated measures. We assume the response variable is normally distributed, the sample
size is 30, the average baseline response is 175, and the treatment effect is –5. The SAS
code for the simulation is given below:
data sbp;
  %let basemean=175;   *overall average response for baseline;
  %let nsub=30;        *Number of subjects;
  %let err_v=100;      *Residual error variance at a time point;
  %let tsubv=100;      *Treatment by subject variance;
  %let subv=200;       *Variance of the subject effects;
  %let nrep=1000;      *Number of replications of the simulation;
  %let effp1=0;        *Placebo effect;
  %let effp2=-5;       *Treatment effect;
  do trial=1 to &nrep;
    do subj=1 to &nsub;
      m=&basemean;
      sub=sqrt(&subv)*rannor(3345);
      do time=1 to 2;
        v=sqrt(&err_v)*rannor(3345);
        if subj<=&nsub/2 then do;
          trt=1;
          if time=1 then y=m+sub+v;
          if time=2 then y=m+sub+&effp1+v;
        end;
        if subj>&nsub/2 then do;
          trt=2;
          if time=1 then y=m+sub+v;
          if time=2 then y=m+sub+&effp2+v+sqrt(&tsubv)*rannor(3345);
        end;
        output;
      end;
    end;
  end;
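For readers who want to trace the data-generating steps outside SAS, the following Python sketch mirrors one trial of the simulation. It is a loose translation under stated assumptions, not the program used for the results: the function name `simulate_trial` and the dictionary layout are invented for illustration, and Python's generator differs from `rannor`, so the simulated values will not match the SAS output even with the same seed.

```python
import random

def simulate_trial(nsub=30, basemean=175.0, subv=200.0, err_v=100.0,
                   tsubv=100.0, effp1=0.0, effp2=-5.0, rng=None):
    """Simulate one pretest-posttest trial with the structure of the SAS step.

    The first half of the subjects get treatment 1 (placebo), the rest get
    treatment 2 (drug). Only the drug group's posttest value receives the
    extra treatment-by-subject term.
    """
    rng = rng or random.Random()
    rows = []
    for subj in range(1, nsub + 1):
        sub = rng.gauss(0.0, subv ** 0.5)          # subject effect
        trt = 1 if subj <= nsub / 2 else 2
        for time in (1, 2):
            v = rng.gauss(0.0, err_v ** 0.5)       # residual error
            y = basemean + sub + v
            if time == 2:
                if trt == 1:
                    y += effp1
                else:
                    y += effp2 + rng.gauss(0.0, tsubv ** 0.5)
            rows.append({"subj": subj, "trt": trt, "time": time, "y": y})
    return rows

# The seed value is reused from the SAS code only for familiarity.
rows = simulate_trial(rng=random.Random(3345))
print(len(rows), rows[0])
```

Each subject contributes two rows, one per time point, which is the long layout used by the repeated-measures fit.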
Model Fitting Results
First, we fit the covariance model with placebo as the reference group, using the following
statements:
proc mixed data=a;
by trial;
class subj tref;
model sbp2=sbp1 tref/s;
make 'solutionf' out=est1;
run;
An example of the resulting variance and treatment effect estimates is shown below:
Covariance Parameter Estimates

Cov Parm    Estimate
Residual      207.11

Solution for Fixed Effects

                                 Standard
Effect      tref   Estimate         Error    DF   t Value   Pr > |t|
Intercept           66.6827       31.0513    27      2.15     0.0409
sbp1                 0.6280        0.1834    27      3.42     0.0020
tref        0       -5.9770        5.3231    27     -1.12     0.2714
tref        1             0             .     .         .          .
The residual error variance is 207.11 and the treatment effect is estimated as –5.9770.
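The fixed-effects part of the covariance model is an ordinary regression of the posttest value on the pretest value and a treatment indicator. The Python sketch below illustrates that regression by solving the normal equations on noise-free toy data; it is not the PROC MIXED fit used above. The function names and the data are hypothetical, and the indicator is coded 1 for the drug group and 0 for placebo, so its coefficient plays the role of the `tref 0` row in the SAS output.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ancova_fit(pre, post, treated):
    """Least-squares fit of post = b0 + b1 * pre + b2 * treated."""
    rows = [[1.0, p, float(t)] for p, t in zip(pre, treated)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, post)) for i in range(3)]
    return solve_linear(xtx, xty)

# Toy data (assumed, noise-free): posttest = 60 + 0.6 * pretest - 5 * treated.
pre = [160.0, 170.0, 180.0, 190.0, 165.0, 175.0, 185.0, 195.0]
treated = [0, 0, 0, 0, 1, 1, 1, 1]
post = [60.0 + 0.6 * p - 5.0 * t for p, t in zip(pre, treated)]

intercept, slope, effect = ancova_fit(pre, post, treated)
print(round(intercept, 4), round(slope, 4), round(effect, 4))
```

With noise-free data the fit recovers the generating coefficients up to rounding error; with simulated data the treatment coefficient varies from trial to trial, which is the variation summarized later in the comparison tables.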
Then we fit the repeated-measures model using placebo and pretest as the reference groups.
We also create a group variable with value 1 for the placebo group at posttest, 2 for the
treatment group at posttest, and 3 for the pretest. We use this group variable to get estimates
for the different R matrices. The SAS code is shown below:
proc mixed data=b;
by trial;
class subj tref timref grp;
model y=tref timref tref*timref/s;
repeated /group=grp;
random subj;
make 'solutionf' out=est2;
run;
An example of the resulting variance and treatment effect estimates is shown below:
Covariance Parameter Estimates

Cov Parm    Group    Estimate
subj                   170.26
Residual    grp 1     72.3522
Residual    grp 2      259.29
Residual    grp 3     49.7515

Solution for Fixed Effects

                                            Standard
Effect         tref   timref   Estimate        Error    DF   t Value   Pr > |t|
Intercept                        168.13       3.8298    28     43.90     <.0001
tref           0                 4.6322       5.4162    28      0.86     0.3997
tref           1                      0            .     .         .          .
timref                0          4.1336       2.8531    28      1.45     0.1585
timref                1               0            .     .         .          .
tref*timref    0      0         -7.7003       5.3612    28     -1.44     0.1620
tref*timref    0      1               0            .     .         .          .
tref*timref    1      0               0            .     .         .          .
tref*timref    1      1               0            .     .         .          .
Four variance component estimates are given. The between-subject variance is estimated as
170.26. The R matrix for the placebo group is
$$R_i = \begin{pmatrix} \sigma_e^2 & 0 \\ 0 & \sigma_e^2 + \sigma_p^2 \end{pmatrix} = \begin{pmatrix} 49.75 & 0 \\ 0 & 72.35 \end{pmatrix},$$
and the R matrix for the treatment group is
$$R_i = \begin{pmatrix} \sigma_e^2 & 0 \\ 0 & \sigma_e^2 + \sigma_t^2 \end{pmatrix} = \begin{pmatrix} 49.75 & 0 \\ 0 & 259.29 \end{pmatrix}.$$
The estimate for the treatment effect is –7.7003.
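If the posttest diagonal entry of each R matrix is read as the pretest residual variance plus a group-specific component, the extra posttest variance in each group follows by subtraction from the residual estimates reported above. The short Python sketch below does that arithmetic; the function and constant names are invented for illustration, and the additive reading of the R matrix is an assumption of this sketch.

```python
def extra_posttest_variance(resid_pretest, resid_posttest):
    """Posttest residual variance in excess of the pretest residual variance."""
    return resid_posttest - resid_pretest

# Residual estimates from the example PROC MIXED output.
RESID_PRETEST = 49.7515          # grp 3: pretest, both groups
RESID_POST_PLACEBO = 72.3522     # grp 1: placebo group at posttest
RESID_POST_TREATMENT = 259.29    # grp 2: treatment group at posttest

extra_placebo = extra_posttest_variance(RESID_PRETEST, RESID_POST_PLACEBO)
extra_treatment = extra_posttest_variance(RESID_PRETEST, RESID_POST_TREATMENT)
print(f"placebo: {extra_placebo:.2f}, treatment: {extra_treatment:.2f}")
```

These are estimates from a single simulated trial of 30 subjects, so they can sit well away from the variances used to generate the data.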
Comparison of Two Approaches
To compare the two approaches, covariance analysis and repeated measures, we chose six
sets of variance components (subject variance, residual error variance, and
treatment-by-subject variance) to generate data. For each set we ran the simulation 1000
times and obtained an estimate of the treatment effect from each method on every trial.
However, the repeated-measures model did not produce fitting results for all 1000
simulations, because some trials ended with an infinite likelihood or a convergence failure.
The simulation results are shown in the following tables.
Table 1. Comparison of estimates of the treatment effect from the two models over 1000 trials, with
treatment effect –5, subject variance 200, residual error variance 100, and treatment-by-subject variance 1000

Method              No. of Trials   Mean of Est. of trt effect   Std. Dev. of Estimates
Covariance                   1000                        -4.66                     9.20
Repeated measures             877                        -4.66                     9.27
Table 2. Comparison of estimates of the treatment effect from the two models over 1000 trials, with
treatment effect –5, subject variance 200, residual error variance 100, and treatment-by-subject variance 100

Method              No. of Trials   Mean of Est. of trt effect   Std. Dev. of Estimates
Covariance                   1000                        -4.97                     5.30
Repeated measures             911                        -4.99                     5.70
Table 3. Comparison of estimates of the treatment effect from the two models over 1000 trials, with
treatment effect –5, subject variance 100, residual error variance 50, and treatment-by-subject variance 100

Method              No. of Trials   Mean of Est. of trt effect   Std. Dev. of Estimates
Covariance                   1000                        -4.94                     4.14
Repeated measures             962                        -4.92                     4.39
Table 4. Comparison of estimates of the treatment effect from the two models over 1000 trials, with
treatment effect –5, subject variance 100, residual error variance 100, and treatment-by-subject variance 100

Method              No. of Trials   Mean of Est. of trt effect   Std. Dev. of Estimates
Covariance                   1000                        -4.95                     5.07
Repeated measures             962                        -4.98                     5.69
Table 5. Comparison of estimates of the treatment effect from the two models over 1000 trials, with
treatment effect –5, subject variance 200, residual error variance 200, and treatment-by-subject variance 100

Method              No. of Trials   Mean of Est. of trt effect   Std. Dev. of Estimates
Covariance                   1000                        -5.00                     6.74
Repeated measures             965                        -5.05                     7.65
Table 6. Comparison of estimates of the treatment effect from the two models over 1000 trials, with
treatment effect –5, subject variance 400, residual error variance 400, and treatment-by-subject variance 100

Method              No. of Trials   Mean of Est. of trt effect   Std. Dev. of Estimates
Covariance                   1000                        -5.06                     9.21
Repeated measures             970                        -5.13                    10.51
To better illustrate the distribution of the estimates, we draw box plots of the estimates
summarized in Table 5 and Table 6. The graphs show that the estimates from the
repeated-measures model are more scattered and have a larger range. The subject variance
and error variance in Figure 2 are twice those in Figure 1, so the estimates are more
dispersed in Figure 2 than in Figure 1.
[Box plots of the treatment effect estimates; each figure has two panels, Covariance Model and Repeated Measures.]
Figure 1. Box plots of estimates of the treatment effect from the two models over 1000 trials, with
treatment effect –5, subject variance 200, residual error variance 200, and treatment-by-subject
variance 100
Figure 2. Box plots of estimates of the treatment effect from the two models over 1000 trials, with
treatment effect –5, subject variance 400, residual error variance 400, and treatment-by-subject
variance 100
The results in the tables and graphs above suggest that the means of the treatment effect
estimates from the two methods do not differ much. As the variances increase, the standard
deviations of the estimates increase too. However, the standard deviation of the estimates is
always smaller for the covariance model than for the repeated-measures model. Therefore,
we conclude that, in the long run, covariance analysis may give a more precise estimate of
the treatment effect.
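The comparison of standard deviations can be made concrete by turning the figures in Tables 1 to 6 into variance ratios and fit rates. The Python sketch below is illustrative only: the dictionary layout and the name `summarize` are invented here, and the ratio compares standard deviations computed over different numbers of trials, since the repeated-measures summaries cover only the trials that produced a fit.

```python
# Per table: (covariance SD, repeated-measures SD, repeated-measures fits obtained)
tables = {
    1: (9.20, 9.27, 877),
    2: (5.30, 5.70, 911),
    3: (4.14, 4.39, 962),
    4: (5.07, 5.69, 962),
    5: (6.74, 7.65, 965),
    6: (9.21, 10.51, 970),
}

def summarize(tables, n_trials=1000):
    """Variance ratio (repeated measures over covariance) and fit rate per table."""
    summary = {}
    for number, (sd_cov, sd_rep, n_fit) in tables.items():
        summary[number] = {
            "variance_ratio": (sd_rep / sd_cov) ** 2,
            "fit_rate": n_fit / n_trials,
        }
    return summary

summary = summarize(tables)
for number, row in summary.items():
    print(f"Table {number}: variance ratio {row['variance_ratio']:.2f}, "
          f"fit rate {row['fit_rate']:.3f}")
```

A variance ratio above 1 means the repeated-measures estimates were more variable than the covariance estimates for that set of variance components.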
Discussion
Repeated-measures analysis of variance and analysis of covariance are two common
approaches to the analysis of pretest-posttest data. They are closely related, but their
assumptions and the variance estimates for their parameters differ.
Our results suggest that covariance analysis may improve the precision of the
treatment effect estimates compared with repeated-measures analysis. Using a baseline
covariate compensates for any differences between the treatment groups in the mean level
of the covariate before treatment is received. However, the assumption of a linear
relationship between pre- and post-treatment values may not hold. If it does not, fitting a
baseline covariate could lead to less precise results.
Attachment:
SAS Program:
OPTIONS LINESIZE=80 PAGESIZE=55 NOCENTER NODATE NONUMBER NOFMTERR;
*******************************************************************;
*  BioEpi 740 Final Project                                        ;
*  PROGRAM NAME             DATE          PROGRAMMER               ;
Title1 "Source:BE740FINAL.SAS    5/12/2001    CMZ ";
* Description:                                                     ;
* -Simulate data for a pretest-posttest design                     ;
* -Applied Mixed Model on final Diastolic Blood Pressure using     ;
*  baseline DBP as a covariate                                     ;
* -Applied Mixed Model considering baseline and final DBP as       ;
*  repeated measures                                               ;
*******************************************************************;
data sbp;
  %let basemean=175;   *Overall average response at baseline;
  %let nsub=30;        *Number of subjects;
  %let err_v=100;      *Residual error variance at a time point;
  %let tsubv=100;      *Treatment by subject variance;
  %let subv=200;       *Variance of the subject effects;
  %let nrep=1000;      *Number of replications of the simulation;
  %let effp1=0;        *Placebo effect;
  %let effp2=-5;       *Treatment effect;
  do trial=1 to &nrep;
    do subj=1 to &nsub;
      m=&basemean;
      sub=sqrt(&subv)*rannor(3345);
      do time=1 to 2;
        v=sqrt(&err_v)*rannor(3345);
        if subj<=&nsub/2 then do;
          trt=1;
          if time=1 then y=m+sub+v;
          if time=2 then y=m+sub+&effp1+v;
        end;
        if subj>&nsub/2 then do;
          trt=2;
          if time=1 then y=m+sub+v;
          if time=2 then y=m+sub+&effp2+v+sqrt(&tsubv)*rannor(3345);
        end;
        output;
      end;
    end;
  end;
run;
/*ODS LISTING CLOSE;*/ /* turn printed output off */
proc means mean var maxdec=2 data=sbp nway;
by trial;
class time trt;
var y;
output out=meandata (drop=_type_ _freq_)mean=mean var=var;
title2 "nsub=&nsub Baseline Mean=&basemean Resid Var=&err_v";
title3 "Subject Var=&subv Trt*Sub Var=&tsubv Trt=&effp2";
title4 "Table1. Descriptive Statistics for Simulation Data";
run;
********************************************;
** Create data set for covariance analysis **;
** and fit covariance model               **;
********************************************;
proc sort data=sbp; by trial subj;
data a;
array sbp[2];
retain sbp1 sbp2;
set sbp;
by trial subj;
sbp[time]=y;
if trt=2 then tref=0;
if trt=1 then tref=1;
if last.subj then output;
keep trial subj tref sbp1 sbp2;
proc mixed data=a;
by trial;
class subj tref;
model sbp2=sbp1 tref/s;
make 'solutionf' out=est1;
title4 "Table2. Covariance Model Fitting Results";
run;
****************************************************;
** Create data set for repeated-measures analysis **;
** Create group variable for R matrices estimates **;
** Use time=1 and trt=1 as reference group        **;
** Fit repeated-measures model                    **;
****************************************************;
data b;
set sbp;
if trt=1 and time=2 then grp=1;
if trt=2 and time=2 then grp=2;
if time=1 then grp=3;
if trt=2 then tref=0;
if trt=1 then tref=1;
if time=2 then timref=0;
if time=1 then timref=1;
proc sort data=b;
by trial grp;
proc mixed data=b;
by trial;
class subj tref timref grp;
model y=tref timref tref*timref/s;
repeated /group=grp;
random subj;
make 'solutionf' out=est2;
title4 "Table3. Repeated Measures Model Fitting Results With";
title5 "different residual variances";
run;
*********************************************************************;
** Construct summary table for 1000 estimates of treatment effect **;
** from covariance analysis and repeated-measures analysis results **;
*********************************************************************;
proc transpose data=est1 out=cov (keep=trial est3) prefix=est;
by trial;
var estimate;
run;
proc transpose data=est2 out=repeat(keep=trial est6) prefix=est;
by trial;
var estimate;
run;
ods listing; /* turn printed output on */
data compare;
merge cov repeat;
by trial;
covest=est3;
repest=est6;
label covest="Est. of Trt. Eff.*from*cov.*model: *EST"
repest="Est.*of*trt.*eff.*from*repeat*model:*EST";
proc means data=compare n mean std maxdec=2;
var covest repest;
run;