Confidence intervals
Kristin Tolksdorf (based on previous EPIET material)
18th EPIET/EUPHEM Introductory course, 01.10.2012

Inferential statistics
• Uses patterns in the sample data to draw inferences about the population represented, accounting for randomness.
• Two basic approaches:
  – Hypothesis testing
  – Estimation

Criticism of significance testing
“Epidemiological applications need more than a decision as to whether chance alone could have produced an association.” (Rothman et al. 2008)
→ Estimation of an effect measure (e.g. RR, OR) rather than significance testing
→ Estimation of a mean
→ Estimation of a proportion

Why estimation?
Norovirus outbreak on a Greek island:
“The risk of illness was higher among people who ate raw seafood (RR=21.5).”
How confident can we be in the result?
What is the precision of our point estimate?

The epidemiologist needs measurements rather than probabilities
• χ² is a test of association.
• OR and RR are measures of association on a continuous scale, with an infinite number of possible values.
• The best estimate = point estimate
• Range of “most plausible” values, given the sample data → confidence interval → precision of the point estimate

Confidence interval (CI)
• Range of values, on the basis of the sample data, in which the population value (or true value) may lie.
• Frequently used formulation: “If the data collection and analysis could be replicated many times, the CI should include the true value of the measure 95% of the time.”

Confidence interval (CI)
[Figure: sampling distribution with α = 5%. The central area 1 − α lies between the lower and upper limits of the 95% CI, with α/2 in each tail.]
95% CI = x − 1.96 SE up to x + 1.96 SE (x: point estimate, SE: its standard error)
• Indicates the amount of random error in the estimate
• Can be calculated for any “test statistic”, e.g. means, proportions, ORs, RRs

CI terminology
RR = 1.45 (0.99 - 2.13)
• Point estimate: 1.45
• Confidence interval: 0.99 - 2.13
• Lower confidence limit: 0.99
• Upper confidence limit: 2.13

Width of confidence interval depends on …
• amount of variability in the data
• size of the sample
• level of confidence (usually 90%, 95%, 99%)
A common way to use the CI for an OR/RR:
• If 1.0 is included in the CI → non-significant
• If 1.0 is not included in the CI → significant

Looking at the CI
[Figure: CIs of study A and study B on an RR axis running from RR = 1 to large RR.]
• Study A: large sample, precise results, narrow CI: SIGNIFICANT
• Study B: small sample, wide CI: NON-SIGNIFICANT
• Study A: effect close to NO EFFECT
• Study B: no information about the absence of a large effect

More studies: better or worse?
[Figure: 20 studies with different results, plotted around RR = 1.]
• Clinical or biological significance?

Norovirus on a Greek island
• How confident can we be in the result?
• Relative risk = 21.5 (point estimate)
• 95% CI for the relative risk: (8.9 - 51.8)
Interpretation: if the study could be replicated many times, 95% of the intervals calculated this way would include the true relative risk. For this study the interval runs from 8.9 to 51.8.
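The interval formula and the factors that drive its width can be illustrated with a short Python sketch. It is a minimal illustration and not part of the original course material: the function names `normal_ci`, `ci_width` and `includes_null` and the standard deviation of 10 are hypothetical, and the normal approximation (estimate ± z × SE) is assumed to hold.

```python
from math import sqrt
from statistics import NormalDist


def normal_ci(estimate, se, level=0.95):
    """Normal-approximation CI: estimate +/- z * SE (assumes approximate normality)."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for a 95% level
    return estimate - z * se, estimate + z * se


def ci_width(sd, n, level=0.95):
    """Width of the CI for a mean when SE = sd / sqrt(n)."""
    lower, upper = normal_ci(0.0, sd / sqrt(n), level)
    return upper - lower


def includes_null(lower, upper, null_value=1.0):
    """For an OR/RR: True if the null value 1.0 lies inside the CI (non-significant)."""
    return lower <= null_value <= upper


# Hypothetical standard deviation of 10; only sample size and confidence level vary.
for n in (25, 100, 400):
    print("n =", n, "width =", round(ci_width(10, n), 2))
for level in (0.90, 0.95, 0.99):
    print("level =", level, "width =", round(ci_width(10, 100, level), 2))

# CI terminology slide: RR = 1.45 (0.99 - 2.13) includes 1.0.
print(includes_null(0.99, 2.13))
# Norovirus example: 95% CI 8.9 - 51.8 excludes 1.0.
print(includes_null(8.9, 51.8))
```

The width shrinks as the sample grows and widens as the confidence level rises. Quadrupling the sample size halves the width, because the SE scales with 1/√n.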
Norovirus on a Greek island
“The risk of illness was higher among people who ate raw seafood (RR=21.5, 95% CI 8.9 to 51.8).”

Example: Chlordiazepoxide use and congenital heart disease (n = 1 644)
              Cases   Controls
C use             4          4
No C use        386      1 250
OR = (4 x 1250) / (4 x 386) = 3.2
p = 0.080; 95% CI = 0.6 - 17.5
From Rothman K
[Figure: OR = 3.2, p = 0.080, 95% CI 0.6 - 17.5]

Example: Chlordiazepoxide use and congenital heart disease, large study (n = 17 151)
              Cases   Controls
C use           240        211
No C use      7 900      8 800
OR = (240 x 8800) / (211 x 7900) = 1.3
p = 0.013; 95% CI = 1.1 - 1.5

Precision and strength of association
[Figure labels: Strength, Precision]

Confidence interval provides more information than p value
• Magnitude of the effect (strength of association)
• Direction of the effect (RR > or < 1)
• Precision of the point estimate of the effect (variability)
The p value cannot provide any of these!

What we have to evaluate the study
• χ²: test of association, depends on sample size
• p value: probability that equal (or more extreme) results can be observed by chance alone
• OR, RR: direction and strength of association (independent of sample size)
  – if > 1: risk factor
  – if < 1: protective factor
• CI: magnitude and precision of effect

Comments on p-values and CIs
• Presence of significance does not prove clinical or biological relevance of an effect.
• A lack of significance is not necessarily a lack of an effect: “Absence of evidence is not evidence of absence”.

Comments on p-values and CIs (continued)
• A huge effect in a small sample or a small effect in a large sample can result in identical p values.
• If any true effect exists, however small, a statistical test will give a significant result once the sample is big enough.
• p values and CIs do not provide any information on the possibility that the observed association is due to bias or confounding.
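The two chlordiazepoxide tables can be used to sketch how an odds ratio and its 95% CI are obtained from a 2x2 table. The slides do not state which method produced their intervals, so the log (Woolf) approximation used here is an assumption of this sketch, and the function name `odds_ratio_ci` is hypothetical. For the large study it gives roughly the interval shown on the slide. For the small study it gives a narrower interval than 0.6 to 17.5, which presumably comes from a different method.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and approximate 95% CI for a 2x2 table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    Uses the log (Woolf) method, which is an assumption of this sketch.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(OR)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


small = odds_ratio_ci(4, 4, 386, 1250)        # n = 1 644
large = odds_ratio_ci(240, 211, 7900, 8800)   # n = 17 151

for label, (or_, lower, upper) in (("small study", small), ("large study", large)):
    print(f"{label}: OR = {or_:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

The small study has the stronger point estimate but an interval that includes 1.0. The large study has the weaker point estimate but a narrow interval that excludes 1.0.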
χ² and relative risk
(E = exposed, NE = not exposed)

Small sample:
              E     NE   Total
Cases         9      5      14
Non-cases    51     55     106
Total        60     60     120
χ² = 1.3, p = 0.13
RR = 1.8, 95% CI [0.6 - 4.9]

Large sample:
              E     NE   Total
Cases        90     50     140
Non-cases   510    550    1060
Total       600    600    1200
χ² = 12, p = 0.0002
RR = 1.8, 95% CI [1.3 - 2.5]

Common source outbreak suspected
              Exposure
              Yes      No    Total
Cases          15      50       65
Non-cases      20     200      220
Total          35     250      285
AR%         42.9%   20.0%      23%
χ² = 9.1
p = 0.002
RR = 2.1
95% CI = 1.4 - 3.4
REMEMBER: These values do not provide any information on the possibility that the observed association is due to bias or confounding.

The ultimate (eye) test
• Hypothesis testing: χ² test
  – Question: Is the proportion of facilitators wearing glasses equal to the proportion of fellows wearing glasses?
• Estimation of quantities: proportion
  – What is the proportion of fellows/facilitators wearing glasses?

The ultimate (eye) test
                     Fellows   Facilitators
Glasses: yes              11              6
Glasses: no               27              8
Total                     38             14
Proportion      11/38 = 0.29    6/14 = 0.43
SE                     0.074          0.132
95% CI           0.14 - 0.44    0.17 - 0.69

Recommendations
• Always look at the raw data (2x2 table). How many cases can be explained by the exposure?
• Interpret with caution associations that achieve statistical significance.
• Be doubly cautious if the statistical significance was not expected.
• Use confidence intervals to describe your results.

Suggested reading
• Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. Philadelphia, PA: Lippincott Williams & Wilkins; 2008.
• Goodman SN, Royall R. Evidence and scientific research. Am J Public Health 1988; 78: 1568.
• Goodman SN. Toward evidence-based medical statistics. 1: The P value fallacy. Ann Intern Med 1999; 130: 995.
• Poole C. Low P-values or narrow confidence intervals: which are more durable? Epidemiology 2001; 12: 291.

Previous lecturers
• Alain Moren
• Paolo D’Ancona
• Lisa King
• Preben Aavitsland
• Doris Radun
• Manuel Dehnert
• Ágnes Hajdu
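Worked example: the proportions, standard errors and 95% CIs in the glasses example can be reproduced with a few lines of Python. This is a minimal sketch assuming the simple normal-approximation (Wald) interval, p ± 1.96 × SE with SE = √(p(1 − p)/n), which matches the slide's figures up to rounding. The function name `proportion_ci` is hypothetical.

```python
from math import sqrt


def proportion_ci(successes, total, z=1.96):
    """Proportion, standard error and Wald 95% CI (normal approximation)."""
    p = successes / total
    se = sqrt(p * (1 - p) / total)
    return p, se, p - z * se, p + z * se


# Counts of people wearing glasses, taken from the example table.
for group, yes, total in (("fellows", 11, 38), ("facilitators", 6, 14)):
    p, se, lower, upper = proportion_ci(yes, total)
    print(f"{group}: p = {p:.2f}, SE = {se:.3f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With only 14 facilitators the normal approximation is rough, which is one reason the facilitators' interval is so much wider than the fellows'.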