Statistical Inference - University of Dundee

Statistics for Health Research
Assessing the Evidence: Statistical Inference
Peter T. Donnan
Professor of Epidemiology and Biostatistics
Objectives of Session
• Understand idea of inference
• Confidence interval approach
• Significance testing
• Briefly - Some simple tests
Statistical Inference
The aim is to draw conclusions
(INFER) from the specific
(sample) to the more general
(population).
Are differences between groups
chance occurrences or do they
represent statistically significant
results (i.e. real differences)?
Extrapolating from the
sample to population
Illustrations Ian Christie, Orthopaedic & Trauma Surgery, Copyright
2002 University of Dundee
Two approaches: confidence
intervals and hypothesis testing
Confidence Intervals
Random variability means that there
is statistical (random) variation
around any summary statistic:
Mean, proportion, difference
between means, etc
Confidence intervals
Uncertainty expressed as a Confidence
Interval defined by an upper and a
lower value:
Summary statistic ± constant × standard error
e.g. For 95% CI constant = 1.96
from Normal distribution
Confidence intervals
For a percentage the standard
error is given by:
$se = \sqrt{\dfrac{p(100 - p)}{n}}$
So for p = 35%, se = 4.8%,
where n = 100
Confidence intervals
Consider a prevalence of 35% for the
uptake of statins for secondary
prevention of MI from one practice
prevalence = 35%,
95% CI = 35% ± 1.96 × 4.8%
= 25.6% to 44.4%
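The calculation above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `percentage_ci` is hypothetical, and it assumes the Normal approximation with the constant 1.96 used on the slide.

```python
import math

def percentage_ci(p, n, z=1.96):
    """Return (se, lower, upper) for a percentage p (0 to 100) observed in n subjects."""
    se = math.sqrt(p * (100 - p) / n)  # standard error of a percentage
    return se, p - z * se, p + z * se

# Statin uptake example: p = 35%, n = 100
se, lower, upper = percentage_ci(35, 100)
print(f"se = {se:.1f}%, 95% CI = {lower:.1f}% to {upper:.1f}%")
```

The slide rounds the standard error to 4.8% before multiplying, so its limits (25.6% to 44.4%) may differ from the unrounded calculation by about 0.1 percentage points.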
Confidence intervals
For a mean the standard error is
given by:
$se = \dfrac{s}{\sqrt{n}}$
where s is the standard
deviation of the distribution
Confidence intervals
Consider a mean cholesterol measurement of
5.4 mmol/l for a group of 100 patients
with type 2 diabetes and standard
deviation s = 1.1 mmol/l
$\bar{x}$ = 5.4, 95% CI = 5.4 ± 1.96 × 1.1/√100
= 5.2 to 5.6 mmol/l
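The same arithmetic for a mean can be checked with a short sketch. The helper name `mean_ci` is hypothetical, and the Normal constant 1.96 is assumed, as on the slide.

```python
import math

def mean_ci(mean, sd, n, z=1.96):
    """Return (se, lower, upper) for a sample mean with standard deviation sd and size n."""
    se = sd / math.sqrt(n)  # standard error of the mean
    return se, mean - z * se, mean + z * se

# Cholesterol example: mean 5.4 mmol/l, s = 1.1 mmol/l, n = 100
se, lower, upper = mean_ci(5.4, 1.1, 100)
print(f"se = {se:.2f}, 95% CI = {lower:.1f} to {upper:.1f} mmol/l")
```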
Confidence intervals
Confidence intervals give estimation of
precision of summary statistic
Precise
Imprecise
Major determinant of precision is sample size
Confidence intervals
Warning!
Confidence intervals are usually
interpreted in a Bayesian way even
though a frequentist method is used to
estimate them
The probability of the true value lying
within the confidence interval is NOT,
repeat NOT 95%
Bayesian confidence intervals are called CREDIBLE
INTERVALS and probability of true value lying in
credible interval IS 95%
Frequentist Confidence Interval
means that with repeat samples…
[Figure: the study sample and its 95% CI, followed by repeat samples ad infinitum]
A 95% confidence interval means that
95% of the intervals calculated from repeat
studies would contain the true proportion
To put it another way the
95% Confidence Interval is…
[Figure: the sample and its 95% CI, followed by repeat samples ad infinitum]
…one of many that could be constructed
with the assurance that 95% of the
time the true value of the parameter
would be included
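The repeat-sampling idea can be illustrated with a small simulation. This is a sketch under stated assumptions: the true proportion (0.35), sample size (100), number of repeat studies (2000) and random seed are arbitrary illustrative choices, and the function name `coverage` is hypothetical.

```python
import math
import random

def coverage(true_p=0.35, n=100, n_studies=2000, z=1.96, seed=1):
    """Fraction of repeat-study 95% CIs that contain the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Draw one study of n subjects from a population with proportion true_p
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        # Does this study's interval include the true value?
        if p_hat - z * se <= true_p <= p_hat + z * se:
            hits += 1
    return hits / n_studies

cov = coverage()
print(f"Proportion of intervals containing the true value: {cov:.3f}")
```

The fraction should come out close to 0.95, although the Normal approximation for a proportion means it will not be exactly 95%.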
Statistical Inference:
Hypothesis testing
Are differences between groups
chance occurrences or do they
represent statistically significant
results (i.e. real differences)?
The process of inference starts from
a neutral position – Null Hypothesis
Statistical Inference:
Hypothesis testing
The null hypothesis (H0) is usually
set to ‘there is no difference’

Collect data and carry out
hypothesis tests

Reject or fail to reject the null hypothesis
Legal analogy
Hypothesis testing
Legal trial: Defendant assumed innocent until proved guilty
Hypothesis test: Null hypothesis assumes no difference between groups
Legal analogy
Hypothesis testing
Legal trial: Examine evidence
Hypothesis test: Calculate test statistic based on evidence from sample data
Legal analogy
Hypothesis testing
Legal trial:
1. Accept evidence proves guilt
2. Evidence does not prove guilt ‘not proven’
Hypothesis test:
1. Accept significant difference between groups
2. Insufficient evidence to reject H0
Legal Analogy
Hypothesis Testing
No statistical significance is not the same as no difference
Statistical Inference:
Hypothesis testing
The test statistic generally consists of:
(Summary statistic – H0 value) / (Standard error of summary statistic)
e.g. Test that the mean is different from zero:
$t = \dfrac{\text{Mean} - 0}{\text{se(Mean)}}$
Statistical Inference:
Hypothesis testing
The test statistic is then compared with
tabulated values of a distribution (e.g.
Normal distribution, t-distribution)
Assuming the null hypothesis is true, what
is the probability of obtaining the actual
observed value of the test statistic, t?
How likely is the value of t, to have
occurred by chance alone?
Statistical Inference:
Hypothesis testing
Assuming the null hypothesis is true, what
is the probability of obtaining the actual
observed or greater value of the test
statistic, t?
Using the distribution of t, which is similar to a
Normal distribution, this probability can be
obtained as in the figure:
p = 0.042 (2.1% in each tail)
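How a two-sided p-value is built from the two tails can be sketched as follows. This is an illustration only: it uses the standard Normal distribution as a stand-in for the t-distribution (reasonable for large samples), and the function name `two_sided_p` is hypothetical.

```python
from statistics import NormalDist

def two_sided_p(z):
    """Two-sided p-value for a test statistic z under a standard Normal distribution."""
    one_tail = 1 - NormalDist().cdf(abs(z))  # area beyond |z| in one tail
    return 2 * one_tail                      # the same area lies in the other tail

# The familiar 1.96 leaves 2.5% in each tail, i.e. p is about 0.05
print(round(two_sided_p(1.96), 3))
```

In the figure the two tail areas of 2.1% are added in the same way to give p = 0.042.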
Statistical Inference:
Hypothesis testing
If probability of the occurrence of the
observed value < 5% or p < 0.05 then
this is unlikely to be a chance finding
Result is declared statistically significant
Fortunately most statistical software (e.g.
SPSS) will carry out the test you request
and give p-values (SPSS labels as ‘Sig’)
Two group hypothesis testing
We will consider three common tests:
1. t-test for difference between two means
2. Chi-squared test (χ²) for difference between two proportions
3. Logrank test for difference between two groups' median survival
All are easily carried out in SPSS
Are practices with access to
community hospitals further away on
average from general hospitals?
             No access    Access to CH
n            17           10
Mean         8.68 km      21.30 km
SD           11.90 km     5.68 km
Se (mean)    2.89         1.79
Example t-test
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - 0}{s_p \sqrt{1/n_1 + 1/n_2}}$
1 and 2 refer to the two groups
n is the number in each group
$\bar{x}$ refers to the mean and
$s_p$ is the pooled standard deviation
Example t-test
$t = \dfrac{(8.68 - 21.30) - 0}{10.112 \times 0.398}$
= -12.62 / 4.024
= -3.13
With 25 degrees of freedom from t-tables
p = 0.004 and so the difference of
12.62 is highly statistically significant
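The worked example can be reproduced from the summary statistics alone. The sketch below is illustrative rather than what SPSS does internally: the function names are hypothetical, the pooled standard deviation is assumed to be the usual weighted combination of the two group variances, and the p-value is obtained by numerically integrating the t density with Simpson's rule instead of reading t-tables.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled standard deviation; returns (t, df, sp)."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    se_diff = sp * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se_diff, df, sp

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area under the t density between -|t| and |t|."""
    a = abs(t)
    h = a / steps  # Simpson's rule on [0, |t|]; steps must be even
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 1 - 2 * area_0_to_t  # the density is symmetric about zero

# No access: mean 8.68, SD 11.90, n = 17; access to CH: mean 21.30, SD 5.68, n = 10
t, df, sp = pooled_t(8.68, 11.90, 17, 21.30, 5.68, 10)
p = two_sided_p(t, df)
print(f"sp = {sp:.3f}, t = {t:.2f}, df = {df}, p = {p:.3f}")
```

The output should agree with the slide to the precision shown there: a pooled SD of about 10.11, t of about -3.13 on 25 degrees of freedom, and p of about 0.004.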
Consider a recent RCT
• Rimonabant vs. placebo to reduce body weight in obese people (BMI > 30 kg/m²)
• Rimonabant (20 mg daily) inhibits the effects of cannabinoid agonists, which in turn affects energy balance
• Mean reduction in body weight at one year was 6.6 kg vs. 1.8 kg (rimonabant vs. placebo)
• Difference was 4.7 kg (95% CI 4.1, 5.4)
• By end of year 2 mean weight was back to start!
Are practices with access to community
hospitals more likely to have training
status?
                      Community Hospital
                      No          Yes
No training status    12 (71%)    4 (40%)
Training status       5 (29%)     6 (60%)
Are practices with access to
community hospitals more likely to
have training status?
• Is the difference in proportions
60% - 29% = 31% well within the
realms of chance or a statistically
significant finding?
• Null hypothesis: Difference = 0
• Use chi-squared (χ²) test for
significance of difference
Pearson Chi-Squared Test
                      Comm. Hosp.
                      No      Yes     Total
No training status    a       b       a+b
Training status       c       d       c+d
Total                 a+c     b+d     N
Pearson Chi-Squared Test
$\chi^2 = \dfrac{N (|ad - bc|)^2}{(a + b)(c + d)(a + c)(b + d)}$
where N = a+b+c+d and |ad – bc|
means take the positive value of the
calculation
Pearson Chi-Squared Test
$\chi^2 = \dfrac{27 (72 - 20)^2}{16 \times 11 \times 17 \times 10}$
= 2.44 with 1 degree of freedom
df = (no. rows – 1) x (no. columns – 1)
P = 0.118 which is not statistically significant
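The chi-squared calculation for a 2 x 2 table can be sketched directly from the formula. The function name `pearson_chi2_2x2` is hypothetical. The p-value line relies on the fact that a chi-squared variable with 1 degree of freedom is the square of a standard Normal variable, so its upper tail can be written with the complementary error function; no continuity correction is applied, which matches the figures on the slide.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value (1 df) for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of chi-squared with 1 df: P(Z^2 > chi2) = erfc(sqrt(chi2 / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Training status by access to a community hospital: a = 12, b = 4, c = 5, d = 6
chi2, p = pearson_chi2_2x2(12, 4, 5, 6)
print(f"chi-squared = {chi2:.2f}, p = {p:.3f}")
```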
More complicated analyses
• Introduced simple two-group tests
• Results of more complicated analyses are expressed in the same way
• Summary statistic and 95% confidence interval
• Usually p-value is also stated but often implicit from the confidence interval
• Beware spurious significance e.g. p = 0.034729 (3 d.p. are enough)
• ‘Importance’ refers to size of difference
• An ‘important’ result can be statistically non-significant
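To see why the p-value is often implicit from the confidence interval, the standard error can be recovered from the interval width and used to form a test statistic. The sketch below applies this to the rimonabant difference of 4.7 kg (95% CI 4.1, 5.4) quoted earlier. It assumes a symmetric Normal-based interval, and the function name `p_from_ci` is hypothetical.

```python
from statistics import NormalDist

def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate (se, z, p) for H0: true value = 0, recovered from a 95% CI."""
    se = (upper - lower) / (2 * z_crit)  # the CI spans 2 x 1.96 standard errors
    z = estimate / se                    # test statistic against a null value of 0
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p

se, z, p = p_from_ci(4.7, 4.1, 5.4)
print(f"se = {se:.2f}, z = {z:.1f}, p < 0.001: {p < 0.001}")
```

An interval that lies well away from zero, as here, already implies a very small p-value; an interval that includes zero implies p > 0.05.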
Sacred 5% level
• 5% level is arbitrary
• Practical choice before computer era to make tables easier to construct
• Are p = 0.046 and p = 0.051 different?
• In the past researchers tended to present only p-values
• Now emphasis is on size of effect and 95% CI
• Unfortunately, editors are still influenced by p-values, leading to publication bias
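One way to look at the question of whether p = 0.046 and p = 0.051 are really different is to convert each p-value back to the size of the test statistic that produces it. This sketch assumes a two-sided test against a standard Normal distribution, and the function name `z_for_two_sided_p` is hypothetical.

```python
from statistics import NormalDist

def z_for_two_sided_p(p):
    """Absolute test statistic that gives a two-sided p-value of p under a Normal distribution."""
    return NormalDist().inv_cdf(1 - p / 2)

for p in (0.046, 0.05, 0.051):
    print(f"p = {p}: |z| = {z_for_two_sided_p(p):.3f}")
```

The two test statistics differ by less than 0.05 standard errors, so the evidence on either side of the 5% line is practically the same.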
Summary
• Do not get carried away by p-values
• Interpretation requires knowledge of
area to put into context, but also
understanding of what the tests do
• A p-value close to 5% is approaching
significance and may suggest it is worth
investigating
• The size of the effect is more clinically
or scientifically important