Professor Martin Bland

Randomised Controlled Trials in the
Social Sciences
Analysis of randomised trials
Martin Bland
Professor of Health Statistics
University of York
www-users.york.ac.uk/~mb55/
Trials in the social and health sciences
Randomisation began in agricultural research:
--- many treatments, complex designs,
--- research material plants,
--- few practical problems.
Randomisation in social and health sciences:
--- few treatments (usually 2), simple designs,
--- research material people, needing intervention,
--- many practical problems.
Trials in the social and health sciences
Practical problems:
• must recruit in service setting, using service personnel,
• must get consent, may have refusals → unrepresentative
samples,
• misallocations in treatment may occur in service setting,
due to mistakes, resource pressures, sabotage
→ groups not being comparable,
• treatments must be applied by service personnel, rather
than by the researchers themselves → inconsistency,
• subjects may drop out at any stage → missing data,
• may not be able to randomise individuals → cluster
randomisation.
Trials in the social and health sciences
Analytical problems:
• non-comparable groups: intention to treat,
• missing data: imputation,
• allocation in groups: cluster level analyses, robust
standard errors, multilevel modelling.
The Lanarkshire Milk Experiment: a warning
from history
Nutritional experiment comparing ¾ pint milk per day with
no milk, and raw milk with pasteurised milk.
Spring 1930.
67 primary schools, 33 raw milk, 34 pasteurised.
Within schools, children allocated to “feeders” or
“controls”.
20,000 children.
“Student” (W. S. Gosset). The Lanarkshire milk experiment.
Biometrika 1931; 23: 398.
The Lanarkshire Milk Experiment: a warning
from history
Allocation to “feeders” or “controls”
Left to the head teacher, groups to be “representative”.
“The teachers selected the two classes of pupils, those
getting milk and those acting as controls, in two different
ways. In certain cases they selected them by ballot and
in others by an alphabetical system.” (Student, quoting
original Report).
N.B. “classes” means “categories”, not school classes.
“So far so good, but after invoking the goddess of chance
they unfortunately wavered in their adherence to her . . .”
(Student).
The Lanarkshire Milk Experiment: a warning
from history
Allocation to “feeders” or “controls”
“In any particular school where there was any group to
which these methods had given an undue proportion of
well-fed or ill-nourished children, others were substituted
in order to obtain a more level selection.” (Student,
quoting original Report.)
The controls were heavier and taller than the feeders.
“Presumably this discrimination in height and weight was
not made deliberately, but it would seem probable that the
teachers, swayed by the very human feeling that the
poorer children needed the milk more than the
comparatively well to do, must have unconsciously made
too large a substitution of the ill-nourished among the
feeders and too few among the controls and that this
unconscious selection affected, secondarily, both
measurements.” (Student).
The Lanarkshire Milk Experiment: a warning
from history
Measurement
Children were weighed in their indoor clothes.
Start: February (heavier winter clothes)
End: June (lighter summer clothes)
“. . . since the selection was probably affected by poverty
it is reasonable to suppose that the feeders would lose
less weight from this cause than the controls.” (Student).
The Lanarkshire Milk Experiment: a warning
from history
What should they have done?
Clearly, they should have randomised and stuck to the
randomisation.
Once the trial had been done, could they retrieve it?
If they knew the original allocation, they could have
analysed by intention to treat.
Analysis by intention to treat
We allocate subjects randomly so that we will have
comparable groups which differ only in the intervention
and by chance.
Mistakes in treatment, sabotage, refusals, and drop-outs
lead to non-comparable groups.
Solution: we analyse subjects in the comparable groups
to which they were originally allocated.
This is analysis by intention to treat.
Analysis by intention to treat
1954 field trial of Salk poliomyelitis vaccine: a lesson
from history
Carried out using two different designs simultaneously,
due to a dispute about the correct method.
In some districts, second grade school-children were
invited to participate in the trial, and randomly allocated to
receive vaccine or an inert saline injection (placebo).
In other districts, all second grade children were offered
vaccination and the first and third grade left unvaccinated
as controls.
Meier P. The biggest health experiment ever: the 1954 field trial of
the Salk poliomyelitis vaccine. In: Tanur JM et al. (eds.) Statistics: a
Guide to the Biological and Health Sciences. San Francisco: Holden-Day, 1977.
Analysis by intention to treat
Result of the field trial of Salk poliomyelitis vaccine
Study group                    Number in    Paralytic polio
                               group        Number      Rate per
                                            of cases    100000
Randomized control:
  Vaccinated                   200745        33          16
  Control                      201229       115          57
  Not inoculated               338778       121          36
Observed control:
  Vaccinated 2nd grade         221998        38          17
  Control 1st and 3rd grade    725173       330          46
  Unvaccinated 2nd grade       123605        43          35
Poliomyelitis
Viral disease transmitted by the faecal-oral route.
Before the development of vaccine almost everyone in the
population was exposed to it, usually in childhood.
In the majority of cases, paralysis does not result and
immunity is conferred without the child being aware of
having been exposed to polio.
In about one in 200 cases, paralysis or death occurs and a
diagnosis of polio is made.
The older the exposed individual is, the greater the chance
of paralysis developing.
Poliomyelitis
Children who are protected from infection by high standards
of hygiene are likely to be older when they are first exposed
to polio than those children from homes with low standards
of hygiene, and thus more likely to develop the clinical
disease.
There are many factors which may influence parents in their
decision as to whether to volunteer or refuse their child for a
vaccine trial.
These may include education, personal experience, current
illness, and others, but certainly include interest in health
and hygiene.
Thus in this trial the high risk children tended to be
volunteered and the low risk children tended to be refused.
Poliomyelitis
The higher risk volunteer control children experienced 57
cases of polio per 100000, compared to 36/100000 among
the lower risk refusers.
Suppose that the vaccine were saline instead, and that the
randomised vaccinated children had the same polio
experience as those receiving saline.
We would expect 200745 × 57 / 100000 = 114 cases, instead
of the 33 observed.
Total cases in randomised areas would be 114 + 115 + 121 =
350 and the rate per 100000 would be 47.
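The arithmetic of this thought experiment can be checked with a short sketch. It uses only the counts from the table of results; the function and variable names are illustrative and are not part of the original analysis.

```python
# Counts from the randomised-control areas of the 1954 Salk field trial.
vaccinated_n = 200745
control_n, control_cases = 201229, 115
not_inoculated_n, not_inoculated_cases = 338778, 121


def rate_per_100000(cases, n):
    """Paralytic polio cases per 100000 children."""
    return 100000 * cases / n


# Thought experiment: the vaccine is inert, so the vaccinated volunteers
# get polio at the same (rounded) rate as the volunteer controls.
control_rate = round(rate_per_100000(control_cases, control_n))
expected_vaccinated_cases = round(vaccinated_n * control_rate / 100000)

# Pool all three groups in the randomised areas.
total_cases = expected_vaccinated_cases + control_cases + not_inoculated_cases
total_n = vaccinated_n + control_n + not_inoculated_n
overall_rate = rate_per_100000(total_cases, total_n)

print(control_rate, expected_vaccinated_cases, total_cases, round(overall_rate))
```

Under this assumption the pooled rate in the randomised areas is close to the 46 per 100000 seen in the 1st and 3rd grade controls of the observed control areas.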
Analysis by intention to treat
In the observed control areas of the Salk trial, the
vaccinated and control groups are not comparable.
How could we analyse the trial?
Compare all second grade children, both vaccinated and
refused, to the control group.
The rate in the second grade children
= (38 + 43) / (221998 + 123605) = 23 per 100,000.
Compare with 46 per 100,000 in the 1st and 3rd grade controls.
Analysis by intention to treat
In the observed control areas of the Salk trial, the vaccinated
and control groups are not comparable.
Compare all second grade children, both vaccinated and
refused, to the control group.
The rate in the second grade children is 23 per 100,000,
which is less than the rate of 46 in the control group,
demonstrating the effectiveness of the vaccine.
The “treatment” which we are evaluating is not vaccination
itself, but a policy of offering vaccination and treating those
who accept.
This is analysis by intention to treat.
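A minimal sketch of the comparison in the observed control areas, using the counts from the table of results. The names are illustrative. It contrasts the intention to treat comparison (all second grade children, who were offered vaccine) with the on treatment comparison (vaccinated children only).

```python
def rate_per_100000(cases, n):
    """Paralytic polio cases per 100000 children."""
    return 100000 * cases / n


# (cases, number in group) from the observed control areas.
vaccinated_2nd = (38, 221998)
unvaccinated_2nd = (43, 123605)
control_1st_3rd = (330, 725173)

# Intention to treat: everyone offered vaccination, whether or not
# they accepted, is compared with the controls.
offered_cases = vaccinated_2nd[0] + unvaccinated_2nd[0]
offered_n = vaccinated_2nd[1] + unvaccinated_2nd[1]
itt_rate = rate_per_100000(offered_cases, offered_n)

# On treatment: only those actually vaccinated, a self-selected group.
on_treatment_rate = rate_per_100000(*vaccinated_2nd)

control_rate = rate_per_100000(*control_1st_3rd)

print(round(itt_rate), round(on_treatment_rate), round(control_rate))
```

The on treatment rate looks more favourable than the intention to treat rate, but it compares volunteers with a group that includes non-volunteers, so the groups are not comparable.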
Analysis by intention to treat
The random allocation procedure produces comparable
groups and it is these we must compare, whatever selection
may be made within them.
We therefore analyse the data according to the way we
intended to treat subjects, not the way in which they were
actually treated.
The alternative, analysing by treatment actually received, is
called on treatment analysis or per protocol analysis.
Analysis by intention to treat
Analysis by intention to treat is not free of bias.
As some participants may receive the other group's
treatment, the difference may be smaller than it should be.
We know that there is a bias and we know that it will make
the treatment difference smaller, by an unknown amount.
On treatment analyses are biased in favour of showing a
difference, whether there is one or not.
The Lanarkshire Milk Experiment: a warning
from history
Pasteurised or raw milk?
Some schools were provided with raw milk, some schools
were provided with pasteurised milk.
The children in one school were allocated to the same
type of milk.
“In so far as the conditions of this investigation are
concerned the effects of raw and pasteurised milk on
growth in weight and height are, so far as we can judge,
equal.” (Fisher and Bartlett, 1931, quoting original
Report).
Fisher RA, Bartlett S. Pasteurised and raw milk. Nature 1931; 127:
591-592.
The Lanarkshire Milk Experiment: a warning
from history
Pasteurised or raw milk?
“It is somewhat unfortunate . . . The whole of the milk
supplied to any one school [was] either raw or
pasteurised. In the absence of the records from the
separate schools, it is impossible altogether to eliminate
the doubt which this choice of method introduces.”
(Fisher and Bartlett 1931).
The children in this trial were allocated as a group rather
than as individuals.
It is a cluster allocated study.
(We do not know how clusters were allocated.)
Cluster randomised trials
Also called group randomised trials.
Research subjects are not sampled independently, but in
a group.
For example:
• all the patients in a general practice are allocated to
the same intervention, the general practice forming a
cluster,
• all pupils in a school class are allocated to the same
intervention, the class forming a cluster.
Cluster randomised trials
Members of a cluster will be more like one another than
they are like members of other clusters.
We need to take this into account in the analysis and
design.
Cluster randomised trials
Methods of analysis which ignore clustering:
• two sample t method,
• chi-squared test for a two way table,
• difference between two proportions,
• relative risk,
• analysis of covariance,
• logistic regression.
May mislead, because they assume that all subjects are
independent observations.
Cluster randomised trials
Methods which ignore clustering may mislead, because
they assume that all subjects are independent
observations.
Observations within the same cluster are correlated.
May lead to standard errors which are too small,
confidence intervals which are too narrow, P values
which are too small.
Cluster randomised trials
A little simulation
Four cluster means, two in each group, from a Normal
distribution with mean 10 and standard deviation 2.
Generated 10 members of each cluster by adding a
random number from a Normal distribution with mean
zero and standard deviation 1.
The null hypothesis, that there is no difference between
the means in the two populations, is true.
Two-sample t test comparing the means, ignoring the
clustering.
Cluster randomised trials
A little simulation
1000 times:
600 significant differences, with P<0.05
502 highly significant, with P<0.01.
If t test ignoring the clustering were valid, expect 50
significant differences, 5%, and 10 highly significant
ones.
The analysis assumes that we have 20 independent
observations in each group. This is not true.
We have two independent clusters of observations, but
the observations in those clusters are really the same
thing repeated ten times.
Cluster randomised trials
A little simulation: valid statistical analysis
Possible analysis:
• find the means for the four clusters
• carry out a two-sample t test using these four
means only.
1000 simulation runs:
53 (5.3%) significant at P<0.05
14 (1.4%) highly significant at P<0.01
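The simulation described above can be sketched as follows. This is a re-creation under the stated settings (two clusters per group, cluster means Normal with mean 10 and SD 2, ten members per cluster with within-cluster SD 1), not the original program. The seed, function names, and the use of tabulated 5% critical values in place of P values are assumptions of this sketch, so the counts will not match the slide exactly.

```python
import math
import random
import statistics

# Two-sided 5% critical values of Student's t distribution.
T_CRIT_38_DF = 2.024  # 20 + 20 observations, ignoring clustering
T_CRIT_2_DF = 4.303   # 2 + 2 cluster means


def pooled_t(a, b):
    """Equal-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se


def simulate_group(rng, n_clusters=2, cluster_size=10):
    """Return one group's clusters; each cluster is a list of observations."""
    clusters = []
    for _ in range(n_clusters):
        cluster_mean = rng.gauss(10, 2)  # between-cluster SD 2
        clusters.append([cluster_mean + rng.gauss(0, 1)  # within-cluster SD 1
                         for _ in range(cluster_size)])
    return clusters


def run_simulation(n_runs=1000, seed=1):
    """Proportion of runs significant at 5% for each analysis."""
    rng = random.Random(seed)
    naive_hits = cluster_hits = 0
    for _ in range(n_runs):
        g1, g2 = simulate_group(rng), simulate_group(rng)
        # Analysis ignoring clustering: all 20 observations per group.
        flat1 = [y for c in g1 for y in c]
        flat2 = [y for c in g2 for y in c]
        if abs(pooled_t(flat1, flat2)) > T_CRIT_38_DF:
            naive_hits += 1
        # Valid analysis: one summary value (the mean) per cluster.
        means1 = [statistics.mean(c) for c in g1]
        means2 = [statistics.mean(c) for c in g2]
        if abs(pooled_t(means1, means2)) > T_CRIT_2_DF:
            cluster_hits += 1
    return naive_hits / n_runs, cluster_hits / n_runs


naive_rate, cluster_rate = run_simulation()
print(f"ignoring clustering: {naive_rate:.1%} significant at 5%")
print(f"using cluster means: {cluster_rate:.1%} significant at 5%")
```

The null hypothesis is true in every run, so a valid test should reject in about 5% of runs.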
Cluster randomised trials
A little simulation
Simulation is very extreme.
Two groups of two clusters and a very large cluster
effect.
Have seen a proposed study with two groups of two
clusters.
Smaller cluster effect would only reduce the shrinking of
the P values, it would not remove it.
Simulation shows that spurious significant differences
can occur if we ignore the clustering.
How big is the effect of clustering?
The design effect is the factor by which we must multiply the
sample size of a trial which is not clustered to achieve the
same power.
Alternatively, the power of a cluster randomised trial is the
power of an individually randomised trial whose size is the
cluster trial's size divided by the design effect.
Design effect:
Deff = 1 + (m − 1)×ICC
where m is the number of observations in a cluster and ICC
is the intra-cluster correlation coefficient, the correlation
between pairs of subjects chosen at random from the same
cluster.
How big is the effect of clustering?
Deff = 1 + (m − 1)×ICC
If m =1, cluster size one, no clustering, then Deff =1,
otherwise Deff will exceed 1.
ICC is usually quite small, 0.04 is a typical figure for health
trials.
However, can be much larger.
In the incentives trial, ICC = 0.39.
Approximate design effect, with about 6 students per class:
Deff = 1 + (6 − 1)×0.39 = 2.95
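The design effect formula is simple enough to express as a small function. The sketch below assumes equal cluster sizes, as the formula does; the function name is illustrative.

```python
def design_effect(cluster_size, icc):
    """Deff = 1 + (m - 1) * ICC for clusters of equal size m."""
    if cluster_size < 1:
        raise ValueError("cluster size must be at least 1")
    return 1 + (cluster_size - 1) * icc


# Cluster size one means no clustering, whatever the ICC.
print(design_effect(1, 0.39))

# A typical health-trial ICC with clusters of six.
print(round(design_effect(6, 0.04), 2))

# The incentives trial: ICC = 0.39, about six students per class.
print(round(design_effect(6, 0.39), 2))
```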
How big is the effect of clustering?
If we estimate the required sample size ignoring clustering,
we must multiply it by the design effect to get the sample
size required for the clustered sample.
Alternatively, if the sample size is estimated ignoring the
clustering, the clustered sample has the same power as for
a simple sample of size equal to what we get if we divide
our sample size by the design effect.
How big is the effect of clustering?
Deff = 1 + (m − 1)×ICC
Clustering may have a large effect if the ICC is large OR if
the cluster size is large.
E.g., if ICC = 0.001, cluster size = 500, the design effect
will be 1 + (500 − 1)×0.001 ≈ 1.5.
Need to increase the sample size by 50% to achieve the
same power as an unclustered trial.
Need to estimate variances both within and between
clusters.
If the number of clusters is small, the between clusters
variance will have few degrees of freedom and we will be
using the t distribution in inference rather than the Normal.
This too will cost in terms of power.
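The two uses of the design effect, inflating a planned sample size and deflating an achieved one, can be sketched as below. The unclustered sample size of 400 is a made-up figure for illustration; the other inputs come from the slides. The loss of degrees of freedom when there are few clusters is not modelled here.

```python
import math


def design_effect(cluster_size, icc):
    """Deff = 1 + (m - 1) * ICC for clusters of equal size m."""
    return 1 + (cluster_size - 1) * icc


def clustered_sample_size(unclustered_n, cluster_size, icc):
    """Subjects needed in a clustered trial for the same power."""
    return math.ceil(unclustered_n * design_effect(cluster_size, icc))


def effective_sample_size(clustered_n, cluster_size, icc):
    """Size of the simple sample with the same power as the clustered one."""
    return clustered_n / design_effect(cluster_size, icc)


# Tiny ICC but large clusters: the sample size still grows by about 50%.
deff_large_clusters = design_effect(500, 0.001)
n_needed = clustered_sample_size(400, 500, 0.001)  # 400 is hypothetical

# Incentives trial: 152 students analysed, Deff about 2.95.
n_effective = effective_sample_size(152, 6, 0.39)

print(round(deff_large_clusters, 3), n_needed, round(n_effective, 1))
```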
Analysis of cluster randomised trials
Several approaches can be used to allow for clustering:
• summary statistics for each cluster
• adjust standard errors using the design effect
• robust variance estimates
• generalised estimating equations (GEEs)
• multilevel modelling
• Bayesian hierarchical models
• others
Any method which takes into account the clustering
will be a vast improvement compared to methods
which do not.
A trial of incentives to attend adult literacy
classes
Carole Torgerson, Greg Brooks, Jeremy Miles,
David Torgerson
Classes randomised to incentive or no incentive.
Two groups of 14 classes.
Labelled “X” and “Y” in this data set.
Blinded for analysis.
Group X: 77 students
Group Y: 86 students
Outcome variable: number of sessions attended.
[Figure: histogram of sessions attended mid-post (x-axis: sessions attended, 0 to 15; y-axis: frequency, 0 to 25).]
Compare mean number of sessions ignoring clustering:
. ttest sessions , by(group)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       X |      70    6.685714    .4177941    3.495516    5.852238    7.519191
       Y |      82    5.280488    .2991881    2.709263    4.685197    5.875778
---------+--------------------------------------------------------------------
combined |     152    5.927632    .2566817    3.164585     5.42048    6.434783
---------+--------------------------------------------------------------------
    diff |            1.405226    .5037841                .4097968    2.400656
------------------------------------------------------------------------------
Degrees of freedom: 150

                  Ho: mean(X) - mean(Y) = diff = 0

     Ha: diff < 0               Ha: diff != 0              Ha: diff > 0
       t =   2.7893                t =   2.7893              t =   2.7893
   P < t =   0.9970          P > |t| =   0.0060          P > t =   0.0030
Stata version 8.
P = 0.006: a highly significant difference!
But it is wrong: it ignores the clustering!
Compare mean number of sessions ignoring clustering,
regression:
. regress sessions group

      Source |       SS       df       MS              Number of obs =     152
-------------+------------------------------           F(  1,   150) =    7.78
       Model |  74.5694526     1  74.5694526           Prob > F      =  0.0060
    Residual |  1437.63449   150  9.58422997           R-squared     =  0.0493
-------------+------------------------------           Adj R-squared =  0.0430
       Total |  1512.20395   151  10.0145957           Root MSE      =  3.0958

------------------------------------------------------------------------------
    sessions |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |  -1.405226   .5037841    -2.79   0.006    -2.400656   -.4097968
       _cons |   8.090941   .8152001     9.93   0.000     6.480183    9.701699
------------------------------------------------------------------------------
P = 0.006 — identical to two sample t method.
It is still wrong — it ignores the clustering!
Compare mean number of sessions including clustering, two
sample t method on cluster means:
. ttest sessions , by(group)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       1 |      14     6.69932    .7457716    2.790422    5.088178    8.310461
       2 |      14    5.189229    .3974616    1.487165    4.330565    6.047893
---------+--------------------------------------------------------------------
combined |      28    5.944274     .439363     2.32489    5.042776    6.845773
---------+--------------------------------------------------------------------
    diff |            1.510091    .8450746                -.226985    3.247166
------------------------------------------------------------------------------
Degrees of freedom: 26

                  Ho: mean(1) - mean(2) = diff = 0

     Ha: diff < 0               Ha: diff != 0              Ha: diff > 0
       t =   1.7869                t =   1.7869              t =   1.7869
   P < t =   0.9572          P > |t| =   0.0856          P > t =   0.0428

P = 0.0856: not significant.
Almost correct: it takes the data structure into account, but not
the variation in class size.
[Figure: histograms of number in class (2 to 10) for Group X and Group Y; y-axis: frequency.]
Compare mean number of sessions including clustering,
regression method, weighted by class size:
. regress session group [aweight=learner]
(sum of wgt is   1.6300e+02)

      Source |       SS       df       MS              Number of obs =      28
-------------+------------------------------           F(  1,    26) =    2.77
       Model |  13.3075302     1  13.3075302           Prob > F      =  0.1082
    Residual |  124.992713    26  4.80741204           R-squared     =  0.0962
-------------+------------------------------           Adj R-squared =  0.0615
       Total |  138.300243    27  5.12223123           Root MSE      =  2.1926

------------------------------------------------------------------------------
    sessions |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |  -1.380902   .8299839    -1.66   0.108    -3.086958    .3251548
       _cons |    8.00502    1.33388     6.00   0.000      5.26319    10.74685
------------------------------------------------------------------------------
P = 0.108 — not significant.
Correct — it takes the data structure into account, including the
variation in class size.
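To show what weighting by class size does, here is a minimal sketch of a weighted comparison of cluster means. The data are made up for illustration and are not the trial data. The weight normalisation (weights rescaled to sum to the number of clusters) is an assumption intended to mirror analytic weights, and all names are hypothetical.

```python
import math
import statistics

# Hypothetical sessions-attended data: one inner list per class (cluster).
group_x = [[9, 7, 8], [5, 6, 4, 7, 6], [10, 8]]
group_y = [[4, 5, 6, 5], [7, 6], [3, 5, 4, 4, 6, 5]]


def summarise(clusters):
    """Return (cluster size, cluster mean) pairs."""
    return [(len(c), statistics.mean(c)) for c in clusters]


def weighted_mean(summary):
    """Mean of cluster means, weighted by cluster size."""
    total = sum(n for n, _ in summary)
    return sum(n * m for n, m in summary) / total


def weighted_difference(summary_a, summary_b):
    """Size-weighted difference between groups and its standard error."""
    k = len(summary_a) + len(summary_b)            # number of clusters
    total_n = sum(n for n, _ in summary_a + summary_b)
    scale = k / total_n                            # weights sum to k
    mean_a, mean_b = weighted_mean(summary_a), weighted_mean(summary_b)
    rss = sum(scale * n * (m - mean_a) ** 2 for n, m in summary_a)
    rss += sum(scale * n * (m - mean_b) ** 2 for n, m in summary_b)
    residual_var = rss / (k - 2)                   # k - 2 degrees of freedom
    w_a = scale * sum(n for n, _ in summary_a)
    w_b = scale * sum(n for n, _ in summary_b)
    se = math.sqrt(residual_var * (1 / w_a + 1 / w_b))
    return mean_a - mean_b, se


diff, se = weighted_difference(summarise(group_x), summarise(group_y))
print(f"difference = {diff:.3f}, SE = {se:.3f}, t = {diff / se:.2f}")
```

With weights equal to cluster size, the estimated difference equals the difference between the individual-level group means, while the standard error and degrees of freedom are based on the number of clusters.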
Compare individual number of sessions including clustering,
robust standard error method (Huber-White-sandwich method):
. regress sessions group, cluster(class)

Regression with robust standard errors                 Number of obs =     152
                                                       F(  1,    27) =    2.79
                                                       Prob > F      =  0.1062
                                                       R-squared     =  0.0493
Number of clusters (class) = 28                        Root MSE      =  3.0958

------------------------------------------------------------------------------
             |               Robust
    sessions |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |  -1.405226   .8407909    -1.67   0.106    -3.130387     .319934
       _cons |   8.090941   1.535933     5.27   0.000     4.939466    11.24242
------------------------------------------------------------------------------
P = 0.106 — not significant.
Correct — it takes the data structure into account.
Very similar estimate and P value to method using means.
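As a rough cross-check, one of the other approaches listed earlier, adjusting the standard error by the design effect, can be applied to the regression that ignored clustering. The sketch below takes the unadjusted estimate and standard error from that output and the approximate design effect of 2.95; it assumes equal cluster sizes, so it is only an approximation.

```python
import math

# From the regression ignoring clustering.
difference = 1.405226
naive_se = 0.5037841

# Approximate design effect for the incentives trial.
deff = 1 + (6 - 1) * 0.39

# Inflate the standard error by the square root of the design effect.
adjusted_se = naive_se * math.sqrt(deff)
adjusted_t = difference / adjusted_se

print(f"adjusted SE = {adjusted_se:.3f}, t = {adjusted_t:.2f}")
```

The adjusted standard error is close to the robust standard error of 0.84 reported above.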
I can do that using SPSS.
So what is the advantage?
We can use subject-level covariates.
Mid-score = reading score before randomisation.
[Figure: histogram of mid score (scaled), 0 to 100; y-axis: frequency, 0 to 30.]
Compare individual number of sessions including clustering,
robust standard error method, adjusting for mid-score:
. regress sessions group midscl, cluster(class)

Regression with robust standard errors                 Number of obs =     152
                                                       F(  2,    27) =   11.91
                                                       Prob > F      =  0.0002
                                                       R-squared     =  0.1956
Number of clusters (class) = 28                        Root MSE      =  2.8572

------------------------------------------------------------------------------
             |               Robust
    sessions |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |  -1.533053   .6128085    -2.50   0.019    -2.790433   -.2756742
      midscl |   -.049151   .0104713    -4.69   0.000    -.0706363   -.0276658
       _cons |   10.56678   1.304614     8.10   0.000     7.889936    13.24363
------------------------------------------------------------------------------
P = 0.019 — significant.
Correct — it takes the data structure into account.
Adjustment for mid-score reduces the standard error, and the
difference is now significant.