Point Estimation: Odds Ratios, Hazard Ratios/Rates, Risk

Clinical Trials in 20 Hours
Confidence Intervals
Elizabeth S. Garrett
esg@jhu.edu
Oncology Biostatistics
March 27, 2002
What is a “confidence interval”?
• It is an interval that tells the precision with which a
sample statistic estimates a parameter of interest.
• Examples:
– parameter of interest: progression-free survival time:
“The 95% confidence interval on progression-free survival is 13 to 26
weeks.”
– parameter of interest: response rate
“The 95% confidence interval on response rate is 0.20 to 0.40.”
– Parameter of interest: change in %CD34+ cells
“The 95% confidence interval for %CD34+ cells is 0.2 to 0.4.”
Different Interpretations of the 95%
confidence interval
• “We are 95% sure that the TRUE parameter value
is in the 95% confidence interval”
• “If we repeated the experiment many many times,
95% of the time the TRUE parameter value would
be in the interval”
• “Before performing the experiment, the
probability that the interval would contain the true
parameter value was 0.95.”
Example
Leisha Emens, M.J. Kennedy, John H. Fetting, Nancy E. Davidson, Elizabeth
Garrett, Deborah A. Armstrong
“A phase 1 toxicity and feasibility trial of sequential dose dense induction
chemotherapy with doxorubicin, paclitaxel, and 5-fluorouracil followed by
high dose consolidation for high risk primary breast cancer”
83 patients underwent leukopheresis for peripheral blood stem cell collection after
conventional dose adjuvant therapy, and 14 patients underwent the procedure
on the dose dense adjuvant protocol 9626.
Results: Compared to the standard dose doxorubicin-containing adjuvant therapy,
the dose dense regimen decreased CD34+ peripheral blood stem cell (PBSC)
yields, requiring that 50% of patients have a supplemental bone marrow harvest.
Question: What can we say about %CD34+ peripheral blood stem cell
yields in each of the two groups?
Example
• %CD34+ PBSC in trials 9601
and 9626.
[Figure: distribution of cd34 (%CD34+ PBSC, axis from 0 to 1.5) by trial, 9601 and 9626]
• We can estimate the mean
%CD34+ PBSC in each trial:
– 0.40 in the standard group
– 0.30 in the dose-dense
group.
• We can conclude:
– “We estimate that %CD34+
PBSC in the standard group
is 0.40 and in the dose
dense group is 0.30.”
• But, how “sure” are we
about those estimates?
Quantifying Uncertainty
• Standard deviation: measures the variation of a
variable in the population.
– The standard deviation of %CD34+ PBSC in
the standard group is 0.27 and is 0.20 in the
dose dense group.
– Technically,
\[ s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2} \]
\[ s_{9601} = \sqrt{\frac{1}{82}\sum_{i=1}^{83}(\%CD34_i - 0.40)^2} \]
For normally distributed variables….
68% of individuals’
values fall within
1 standard deviation (s)
of the mean (x̄)
[Figure: normal curve centered at x̄ with the central 68% of the area, x̄ ± s, marked]
For normally distributed variables….
95% of individuals’
values fall within
1.96 standard
deviations (1.96s)
of the mean (x̄)
[Figure: normal curve centered at x̄ with the central 95% of the area, x̄ ± 1.96s, marked]
Standard deviation versus standard error
• The standard deviation (s) describes variability
between individuals in a population.
• The standard error describes variation of a sample
statistic.
• Example: We are interested in the mean %CD34+
PBSC. (We denote the sample mean by x̄.)
– The standard deviation (0.27 in standard and 0.20 in
dose dense) describes how individuals differ.
– The standard error of the mean describes the precision
with which we can make inferences about the true mean.
Standard error of the mean
• Standard error of the mean (sem):
\[ s_{\bar{x}} = \mathrm{sem} = \frac{s}{\sqrt{n}} \]
• Comments:
– n = sample size
– even for large s, if n is large, the sem is small, so we
can estimate the mean with good precision
– always smaller than the standard deviation (s) when n > 1
Example
• In standard group, s = 0.27 and n = 83:
\[ s_{\bar{x}} = \mathrm{sem} = \frac{0.27}{\sqrt{83}} = 0.03 \]
• In dose dense group, s = 0.20 and n = 14:
\[ s_{\bar{x}} = \mathrm{sem} = \frac{0.20}{\sqrt{14}} = 0.05 \]
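The two standard errors above can be checked with a few lines of Python. This is only a minimal sketch: the function name `standard_error_of_mean` is made up for illustration, and the inputs are the s and n values quoted on the slide.

```python
from math import sqrt


def standard_error_of_mean(s: float, n: int) -> float:
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return s / sqrt(n)


# Values quoted in the slides for the two trials.
sem_standard = standard_error_of_mean(0.27, 83)    # standard group (9601)
sem_dose_dense = standard_error_of_mean(0.20, 14)  # dose dense group (9626)

print(f"{sem_standard:.2f} {sem_dose_dense:.2f}")  # prints "0.03 0.05"
```

The dose dense group has the smaller standard deviation but the larger standard error, because it has far fewer patients.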
Sampling Distribution
The sampling distribution of a sample statistic refers
to what the distribution of the statistic would look
like if we chose a large number of samples from
the same population.
The sample statistic of interest
to us is the mean.
[Figure: distribution of the variable x (Mean = 3, s = 2.45); x axis from 0 to 12]
Sampling Distribution of the Mean
By the Central Limit Theorem, it is
true that even if a variable is
NOT normally distributed, for
large sample size, the sampling
distribution of the mean is
normally distributed.
[Figure: distribution of x (Mean = 3, s = 2.45; x axis from 0 to 12) alongside histograms of sample means for different sample sizes n (panel labels include n = 25 and n = 100), plotted over the range 2.0 to 4.0]
Sampling Distributions
[Figure: histograms of sample means over the range 2.0 to 4.0, one panel per sample size (panel labels: n = 25, 50, 100 and 500), each annotated with its sem (values shown: 0.47, 0.23, 0.47 and 0.10)]
Central Limit Theorem Main Ideas
• The sampling distribution of a sample
statistic is often normally distributed
• The mathematical result comes from the
Central Limit Theorem. For the theorem to
work, n should be large.
• Statisticians have derived formulas to
calculate the standard deviation of the
sampling distribution and it is called the
standard error of the statistic
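The main ideas above can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not the code behind the slides' figures: it draws repeated samples from a skewed gamma population chosen so that its mean is 3 and its standard deviation is about 2.45 (the values shown on the earlier slides; whether the slides used this distribution is not stated), and compares the standard deviation of the sample means with s/sqrt(n). The function name `simulate_sample_means`, the seed, and the number of repetitions are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_sample_means(n, n_samples=1000, seed=0):
    """Draw n_samples samples of size n from a skewed population; return their means."""
    rng = random.Random(seed)
    # Gamma(shape=1.5, scale=2.0) has mean 3 and standard deviation sqrt(6), about 2.45.
    return [
        sum(rng.gammavariate(1.5, 2.0) for _ in range(n)) / n
        for _ in range(n_samples)
    ]


population_sd = sqrt(1.5) * 2.0  # sqrt(shape) * scale for a gamma distribution

for n in (25, 100, 500):
    sample_means = simulate_sample_means(n)
    print(
        f"n={n}: mean of sample means = {mean(sample_means):.2f}, "
        f"sd of sample means = {stdev(sample_means):.3f}, "
        f"s/sqrt(n) = {population_sd / sqrt(n):.3f}"
    )
```

As n grows, the spread of the simulated means should shrink roughly in line with s/sqrt(n), which is the quantity the slides call the standard error.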
Sampling Distribution of the Mean
• In general for large n, means have a normal
distribution.
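• It is true that 95% of sample means will be within
1.96 sem of the true mean, μ.
\[
\begin{aligned}
-1.96\,\mathrm{sem} &\le \bar{x} - \mu \le 1.96\,\mathrm{sem} \\
-\bar{x} - 1.96\,\mathrm{sem} &\le -\mu \le -\bar{x} + 1.96\,\mathrm{sem} \\
\bar{x} + 1.96\,\mathrm{sem} &\ge \mu \ge \bar{x} - 1.96\,\mathrm{sem} \\
\bar{x} - 1.96\,\mathrm{sem} &\le \mu \le \bar{x} + 1.96\,\mathrm{sem}
\end{aligned}
\]
The last line is the 95% confidence interval for the mean.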
General formula for 95% confidence interval
\[ \bar{x} \pm 1.96\,\mathrm{sem} = \bar{x} \pm 1.96\,\frac{s}{\sqrt{n}} \]
• Notes:
– sample size must be sufficiently large for non-normal variables.
– how large is large? depends on skewness of
variable
– VERY often people use 2 instead of 1.96.
Example
• In the standard group, the mean was 0.40, s =
0.27, and n = 83:
\[ 0.40 \pm 1.96\,\frac{0.27}{\sqrt{83}} \;\Rightarrow\; 0.34 \le \mu \le 0.46 \]
• In the dose dense group, the mean was 0.30, s =
0.20, and n = 14:
\[ 0.30 \pm 1.96\,\frac{0.20}{\sqrt{14}} \;\Rightarrow\; 0.20 \le \mu \le 0.40 \]
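The two intervals in this example can be reproduced with a short Python sketch. The function name `mean_confidence_interval` is hypothetical, and the inputs are the means, standard deviations and sample sizes quoted on the slide.

```python
from math import sqrt


def mean_confidence_interval(xbar, s, n, z=1.96):
    """Normal-approximation confidence interval for a mean: xbar +/- z * s / sqrt(n)."""
    sem = s / sqrt(n)
    return xbar - z * sem, xbar + z * sem


standard = mean_confidence_interval(0.40, 0.27, 83)
dose_dense = mean_confidence_interval(0.30, 0.20, 14)

print(f"standard:   {standard[0]:.2f} to {standard[1]:.2f}")      # 0.34 to 0.46
print(f"dose dense: {dose_dense[0]:.2f} to {dose_dense[1]:.2f}")  # 0.20 to 0.40
```

With only 14 patients in the dose dense group, the large-sample multiplier 1.96 is a rough choice; a t-based multiplier would give a somewhat wider interval.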
Not only 95%….
• 90% confidence interval:
NARROWER than 95%
\[ \bar{x} \pm 1.65\,\mathrm{sem} \]
• 99% confidence interval:
WIDER than 95%
\[ \bar{x} \pm 2.58\,\mathrm{sem} \]
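The multipliers for other confidence levels come from the standard normal distribution. As a minimal sketch, Python's standard library can compute them; the helper name `z_multiplier` is made up for illustration.

```python
from statistics import NormalDist


def z_multiplier(confidence: float) -> float:
    """Two-sided standard normal multiplier for a given confidence level."""
    alpha = 1 - confidence
    # The upper alpha/2 quantile of the standard normal distribution.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: x-bar +/- {z_multiplier(level):.3f} sem")
```

The three multipliers are approximately 1.645, 1.960 and 2.576, which the slides round to 1.65, 1.96 and 2.58.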
But why do we always see 95% CI’s?
• “Duality” between confidence intervals and pvalues
• Example: Assume that we are testing for a significant
change in QOL due to an intervention, where QOL is
measured on a scale from 0 to 50.
– 95% confidence interval: (-2, 13)
– pvalue = 0.07
• If the 95% confidence interval includes 0, then a t-test of
the hypothesis that the treatment effect is 0 will not be
significant at the alpha = 0.05 level.
• If the 95% confidence interval does not include 0, then the
same t-test will be significant at the alpha = 0.05 level.
Other Confidence Intervals
• Differences in means
• Response rates
• Differences in response rates
• Hazard ratios
• Median survival
• Difference in median survival
• ……..
Difference in Means
• Example: What is the 95% confidence interval for
the difference in %CD34+ PBSCs in the two
trials?
\[ (\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
95% CI:
\[ (0.40 - 0.30) \pm 1.96\sqrt{\frac{0.27^2}{83} + \frac{0.20^2}{14}} = (-0.02,\ 0.22) \]
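The interval for the difference in means can be verified numerically. This is a sketch under the same large-sample assumptions as the formula; the function name `difference_in_means_ci` is hypothetical, and the inputs are the slide's values for the two trials.

```python
from math import sqrt


def difference_in_means_ci(x1, s1, n1, x2, s2, n2, z=1.96):
    """Normal-approximation confidence interval for the difference x1 - x2."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)  # standard error of the difference
    diff = x1 - x2
    return diff - z * se, diff + z * se


# Standard group (mean 0.40, s 0.27, n 83) versus dose dense group (mean 0.30, s 0.20, n 14).
lower, upper = difference_in_means_ci(0.40, 0.27, 83, 0.30, 0.20, 14)
print(f"{lower:.2f} to {upper:.2f}")  # -0.02 to 0.22
```

The lower limit is negative, so the interval includes 0: the data are compatible with no difference between the trials as well as with a difference of about 0.2.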
95% Confidence Intervals for Proportions
• Socinski et al., Phase III Trial Comparing a Defined Duration of
Therapy versus Continuous Therapy Followed by Second-Line
Therapy in Advanced-Stage IIIB/IV Non-Small-Cell Lung Cancer
JCO, March 1, 2002.
• Patients and Methods: Arm A (4 cycles of carboplatin at an AUC of 6
and paclitaxel), Arm B (continuous treatment with carboplatin/
paclitaxel until progression). At progression, patients from each arm
receive second-line weekly paclitaxel at 80 mg/m²/week.
• Results: 230 patients were randomized (114 in Arm A and 116 in Arm
B). Overall response rates were 22% and 24% for arms A and B.
Grade 2 to 4 neuropathy was seen in 14% and 27% of Arm A and B
patients, respectively.
95% Confidence Intervals for Proportions
• What are 95% confidence intervals for the response rates
in the two arms?
p (1  p )
• standard error of a sample proportion is
n
• An equation for confidence interval for a proportion:
p (1  p )
p  196
.
n
• Note: this is an approximation based on the central limit
theorem! Using statistical programs, you can get “exact”
confidence intervals.
• Assumptions:
– n is reasonably large
– p is not “too” close to 0 or 1
– rule of thumb: pn > 5 (and, by the same logic, n(1 − p) > 5)
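For readers curious what an "exact" interval involves, here is a rough sketch of one common choice, the Clopper-Pearson interval, built directly from the binomial distribution. Treating this as the method the slides have in mind is an assumption, the function names are made up, and the count of 25 responders out of 114 is a hypothetical figure chosen to be close to a 22% response rate. A statistical package would normally be used instead of hand-rolled bisection.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_root(f, iterations: int = 60) -> float:
    """Root of an increasing function f on (0, 1), found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Clopper-Pearson ("exact") confidence interval for a binomial proportion."""
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect_root(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect_root(
        lambda p: alpha / 2 - binom_cdf(k, n, p)
    )
    return lower, upper


# Hypothetical count: 25 responders out of 114 patients (about 22%).
lower, upper = clopper_pearson(25, 114)
print(f"exact 95% CI: {lower:.3f} to {upper:.3f}")
```

Unlike the central limit theorem formula, this interval is not symmetric around the observed proportion and stays within 0 and 1 even when p is close to either boundary.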
Example: Response Rate to Treatment
\[ p \pm 1.96\sqrt{\frac{p(1-p)}{n}} \]
• Arm A:
\[ 0.22 \pm 1.96\sqrt{\frac{0.22(0.78)}{114}} = (0.14,\ 0.30) \]
• Arm B:
\[ 0.24 \pm 1.96\sqrt{\frac{0.24(0.76)}{116}} = (0.16,\ 0.32) \]
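The response-rate intervals can be reproduced with a short Python sketch of the same approximation. The function name `proportion_ci` is hypothetical; the proportions and sample sizes are those reported for the two arms.

```python
from math import sqrt


def proportion_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    se = sqrt(p * (1 - p) / n)  # standard error of the sample proportion
    return p - z * se, p + z * se


arm_a = proportion_ci(0.22, 114)
arm_b = proportion_ci(0.24, 116)

print(f"Arm A: {arm_a[0]:.2f} to {arm_a[1]:.2f}")  # 0.14 to 0.30
print(f"Arm B: {arm_b[0]:.2f} to {arm_b[1]:.2f}")  # 0.16 to 0.32
```

The two intervals overlap almost entirely, which matches the similar response rates reported for the two arms.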
Example: Grade 2 to 4 Neuropathy
\[ p \pm 1.96\sqrt{\frac{p(1-p)}{n}} \]
• Arm A:
\[ 0.14 \pm 1.96\sqrt{\frac{0.14(0.86)}{114}} = (0.08,\ 0.20) \]
• Arm B:
\[ 0.27 \pm 1.96\sqrt{\frac{0.27(0.73)}{116}} = (0.19,\ 0.35) \]
95% Confidence Interval for Difference in Proportions
( p1  p 2 )  196
.
p1 (1  p1 ) p 2 (1  p 2 )

n1
n2
What is the 95% confidence interval for the
difference in rates of neuropathy in arms A
and B?
0.27( 0.73) 014
. ( 0.86)
( 0.27  014
. )  196
.

 ( 0.03,0.23)
116
114
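A quick numerical check of the neuropathy comparison, as a sketch using the same approximation. The function name `difference_in_proportions_ci` is made up for illustration; the rates and sample sizes are those reported for Arm B and Arm A.

```python
from math import sqrt


def difference_in_proportions_ci(p1, n1, p2, n2, z=1.96):
    """Normal-approximation confidence interval for the difference p1 - p2."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Grade 2 to 4 neuropathy: Arm B (27% of 116) minus Arm A (14% of 114).
lower, upper = difference_in_proportions_ci(0.27, 116, 0.14, 114)
print(f"{lower:.2f} to {upper:.2f}")  # 0.03 to 0.23
```

Here the interval excludes 0, so by the duality with hypothesis tests the difference in neuropathy rates would be significant at the 0.05 level under this approximation.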
Recap
• 95% confidence intervals are used to quantify certainty
about parameters of interest.
• Confidence intervals can be constructed for any parameter
of interest (we have just looked at some common ones).
• The general formulas shown here rely on the central limit
theorem
• You can choose level of confidence (does not have to be
95%).
• Confidence intervals are often preferable to pvalues
because they give a “reasonable range” of values for a
parameter.
Some Confidence Intervals in Survival Analysis
Example: Urba et al. Randomized Trial of Preoperative Chemoradiation
Versus Surgery Alone in Patients with Locoregional Esophageal Carcinoma,
JCO, Jan 15, 2001.
                      Hazard Ratio    95% CI
Chemo v. surgery      0.69            0.46-1.06

                      Arm I                Arm II
                      %      95% CI        %      95% CI
1 year survival       58     46-73         72     58-84
3 year survival       16     8-30          30     20-46
What about the confidence interval for the 1 year and 3 year
difference?
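One rough way to approach this question from published numbers alone is sketched below. It rests on several assumptions that the paper's own analysis may not share: each reported interval is treated as a symmetric normal interval (the reported ones are not quite symmetric), a standard error is backed out from its width, and the two arms are treated as independent. The function names are hypothetical, and the result should be read as an approximation, not as the trial's reported finding.

```python
from math import sqrt

Z95 = 1.96


def se_from_ci(lower, upper, z=Z95):
    """Approximate standard error implied by a reported 95% interval's width."""
    return (upper - lower) / (2 * z)


def approximate_difference_ci(est1, ci1, est2, ci2, z=Z95):
    """Approximate CI for est2 - est1, assuming independent arms and normality."""
    se = sqrt(se_from_ci(*ci1, z) ** 2 + se_from_ci(*ci2, z) ** 2)
    diff = est2 - est1
    return diff, diff - z * se, diff + z * se


# Survival percentages and 95% CIs as reported in the table (Arm I, then Arm II).
one_year = approximate_difference_ci(58, (46, 73), 72, (58, 84))
three_year = approximate_difference_ci(16, (8, 30), 30, (20, 46))

for label, (diff, lower, upper) in (("1 year", one_year), ("3 year", three_year)):
    print(f"{label}: difference {diff} points, approx. 95% CI {lower:.1f} to {upper:.1f}")
```

Under these assumptions both approximate intervals include 0 while extending to differences of roughly 30 percentage points, which is the kind of "reasonable range" the following slide argues readers should be shown.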
• Why not provide confidence intervals for...
– Difference in median survival
– Difference in 1 year survival
– Difference in 3 year survival
• Would give readers a “reasonable range” of values to
consider for treatment effect that are intuitive.
• What is remembered?
– P = 0.09 which means insignificant result
– But, can anyone remember the treatment effect?
Confidence Intervals for Reporting Results of
Clinical Trials, Simon
• “[Hypothesis tests] are sometimes overused and their
results misinterpreted.”
• “Confidence intervals are of more than philosophical
interest, because their broader use would help eliminate
misinterpretations of published results.”
• “Frequently, a significance level or pvalue is reduced to a
‘significance test’ by saying that if the level is greater than
0.05, then the difference is ‘not significant’ and the null
hypothesis is ‘not rejected’….The distinction between
statistical significance and clinical significance should not
be confused.”
Caveats
“They should not be interpreted as reflecting the absence of a
clinically important difference in true response
probabilities.”
                Experiment 1       Experiment 2
Treatment       Response           Response
A               13/25 (52%)        500/1000 (50%)
B               8/25 (32%)         450/1000 (45%)
Trt effect      20%                5%
95% CI          -7% - 47%          0.6% - 9%
Pvalue          0.25               0.03
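The confidence intervals in the two experiments can be reproduced from the response counts with the difference-in-proportions approximation. This is a sketch: the function name `response_difference_ci` is hypothetical, and the p-values in the table are not recomputed here because the slides do not say which test produced them.

```python
from math import sqrt


def response_difference_ci(r1, n1, r2, n2, z=1.96):
    """Difference in response rates (treatment A minus B) with a normal-approximation CI."""
    p1, p2 = r1 / n1, r2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, diff - z * se, diff + z * se


# Responders and patients for treatments A and B in each experiment.
experiments = {
    "Experiment 1": (13, 25, 8, 25),
    "Experiment 2": (500, 1000, 450, 1000),
}

for name, counts in experiments.items():
    diff, lower, upper = response_difference_ci(*counts)
    print(
        f"{name}: effect {100 * diff:.0f}%, "
        f"95% CI {100 * lower:.1f}% to {100 * upper:.1f}%"
    )
```

The small experiment has the larger estimated effect but an interval so wide that it includes both no effect and a very large one; the large experiment pins down a small effect precisely.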
Excellent References on Use of Confidence Intervals
in Clinical Trials
• Richard Simon, “Confidence Intervals for Reporting
Results of Clinical Trials”, Annals of Internal Medicine,
v.105, 1986, 429-435.
• Leonard Braitman, “Confidence Intervals Extract
Clinically Useful Information from the Data”, Annals of
Internal Medicine, v. 108, 1988, 296-298.
• Leonard Braitman, “Confidence Intervals Assess Both
Clinical and Statistical Significance”, Annals of Internal
Medicine, v. 114, 1991, 515-517.