Summary of Measures and Design (3h)
“A haunting tale of risk, rate and odds”
Hein Stigum
Presentation, data and programs at: http://folk.uio.no/heins/ courses
May-16 H.S.

Concepts

Risk: $p = \frac{\text{cases}}{N}$ (probability, proportion, %)
Rate: $r = \frac{\text{new cases}}{N \cdot t}$ (compare km/h; cases/person-time)
Odds: $o = \frac{p}{1-p} = \frac{N_{\text{disease}}}{N_{\text{healthy}}}$

Risk   0.0100   0.1000   0.3000
Odds   0.0101   0.1111   0.4286

Cohorts

• Closed cohort (start to end): count persons, risk
• Open cohort (start to end): count person-time, rate
• Closed cohort with time-varying covariates (start to end): count person-time, rate

Epidemiological measures

• Frequency (How much disease?)
  – prevalence
  – incidence
• Association (More disease among exposed?)
  – Risk difference
  – Risk ratio
  – Odds ratio
• Potential impact (Important cause?)
  – Attributable fraction

Frequency measures

Name                                             Equation                              Type   Cohorts
P, Prevalence                                    Existing cases / Population           risk   -
IP, Incidence Proportion (Cumulative Incidence)  New cases / Healthy                   risk   closed cohort
IR, Incidence Rate                               New cases / Risk time                 rate   any cohort
Odds                                             p / (1 - p) = Disease / NotDisease    odds   -

May convert rate to risk: $IP = 1 - e^{-IR \cdot t} \approx IR \cdot t$ for small $IR \cdot t$

Disease frequency depicted

[Figure: cohort followed from start to end, showing existing cases, healthy subjects, new cases and risk time.]

Exercise: Risk, Rate and Odds

Cohort of 200 subjects followed for 10 years

10 year follow up   Disease +   Disease -     N   Risk time
                           30         170   200       1 850

To fill in: Frequency (Risk, Rate, Odds)

1. Calculate the 10-year risk of disease
2. Calculate the rate of disease
3. Calculate the 10-year odds of disease
4. Explain the results in words

5 minutes

Frequency measures

        existing cases     new cases
risk    Prevalence         Incidence proportion
rate    -                  Incidence rate
odds    Prevalence odds    Incidence odds

Association measures

• More disease among exposed?
  – Compare frequency among exposed (1) and unexposed (0)
  – Difference: RD = risk1 - risk0 (0 = no effect)
  – Ratio: RR = risk1/risk0 (1 = no effect)

Frequency   Association or Effect
            Difference            Ratio
Risk        Risk Difference, RD   Risk Ratio, RR
Rate        Rate Difference       Rate Ratio, RR, IRR, (HRR)
Odds        -                     Odds Ratio, OR

DESIGNS

The 2 by 2 table

              Disease +   Disease -
Exposure +    a = 100     b = 100
Exposure -    c = 10      d = 100

$OR = \frac{a \cdot d}{b \cdot c} = 10.0$
$se(\ln(OR)) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} = \sqrt{.01 + .01 + .1 + .01} = \sqrt{.13} \approx 0.36$

The lowest number sets the precision.
To increase power:
  Cohort: balance exposure (north-south)
  Case-Control: balance disease (east-west)

Time

E, D    no time         cross-section
E → D   prospective     cohort, nested CC, Case-Cohort
E ← D   retrospective   traditional Case-Control
(Direction is read relative to the present time.)

Designs

• Aims
  – Disease occurrence
  – Exposure-Disease association
• Designs
  – Cross-sectional studies
  – Cohort studies
  – Case-control studies
    • Case-Cohort (inside an existing cohort)
    • Nested Case Control (inside an existing cohort)
    • Traditional Case Control (at the end of an imaginary cohort)

3 examples

• Gender and Smoking
  – Do girls smoke more than boys?
• Exercise and Coronary Heart Disease (CHD)
  – Does exercise reduce the risk?
• Genes and Diabetes type 1
  – Does gene-type increase the risk?

Consider: Reversed Causality, Frequency of outcome, Recall bias
What design should we use?

CROSS-SECTION

Prevalence depicted

[Figure: population from start to end, showing existing cases and healthy subjects.]
Prevalence risk: Exposed P1, Unexposed P0
Prevalence odds: Exposed O1, Unexposed O0

Cross-sectional example

              Smoking +   Smoking -       N    Risk   Rate   Odds
Girls               300         700   1 000    0.30      -   0.43
Boys                140         860   1 000    0.14      -   0.16
Total               440       1 560   2 000
Association   Difference                       0.16
              Ratio                             2.1           2.6

Disease freq 22.0 %, Exposure freq 50.0 %
Interpretations in the two last slides

Pro: fast and inexpensive
Con: reversed causality

Ex   Exposure   Disease      Dis freq   Cross-section   Cohort   Case-Control
1    Girl       Smoking      22 %       OK
2    Exercise   CHD          9 %        Rev. Caus.
3    Gene       Diabetes 1   1 %
Typical problems:                       Rev. Caus.      Size     Recall

COHORT

Incidence risk, rate and odds depicted

[Figure: cohort followed from start to end, showing existing cases, healthy subjects, new cases and risk time.]
Exposed: R1, Unexposed: R0

Exercise: Cohort

Full Cohort, 3 year follow up

              CHD +   CHD -        N   Risk time
Exercise +      100   1 900    2 000       5 850
Exercise -      800   7 200    8 000      22 800
Total           900   9 100   10 000      28 650

Disease freq 9.0 %, Exposure freq 20.0 %
To fill in: Frequency (Risk, Rate, Odds) and Association (Difference, Ratio)

1. Calculate the 3-year risk of disease for exposed and unexposed
2. Calculate the rate of disease for exposed and unexposed
3. Calculate the 3-year odds of disease for exposed and unexposed
4. Calculate the difference and ratio association measures, use no exercise as reference.
5. Explain the results in words

10 minutes

CASE CONTROL STUDIES

Traditional Case-Control

Full Cohort, 10 year follow up

              Diabetes +   Diabetes -         N   Risk time    Risk    Rate    Odds
Gene +               100        3 900     4 000      39 500   0.025   0.003   0.026
Gene -             1 000       95 000    96 000     955 000   0.010   0.001   0.011
Total              1 100       98 900   100 000     994 500
Association   Difference                                      0.015   0.001       -
              Ratio                                            2.40    2.42    2.44

Disease freq 1.1 %, Exposure freq 4.0 %

Case-Control, sampling fraction f=0.1

              Diabetes +   Diabetes -   Pseudo odds
Gene +               100          390         0.256
Gene -             1 000        9 500         0.105
Total              1 100        9 890
Association   Ratio                            2.44

Money saved?

• In practice: All cases, 1-5 controls per case
• Sample controls independent of exposure
• Sample controls from base population of cases

Traditional Case-Control cont.

One Control per Case, sampling fraction f=?

              Diabetes +   Diabetes -   Pseudo odds
Gene +               100           43         2.305
Gene -             1 000        1 057         0.946
Total              1 100        1 100
Association   Ratio                            2.44

Ex   Exposure   Disease      Dis freq   Cross-section   Cohort   Case-Control
1    Girl       Smoking      22 %       OK
2    Exercise   CHD          9 %        Rev. Caus.      OK       Recall bias
3    Gene       Diabetes 1   1 %                        Large    OK
Typical problems:                       Rev. Caus.      Size     Recall

Nested studies: case-control inside a cohort

Exposure information:
  relatively cheap: store blood, questionnaire (full cohort)
  relatively expensive: analyze blood (cases + controls)

[Figure: timeline from cohort start (store blood, questionnaire) through repeated questionnaires to cohort end; blood is analyzed for cases and controls to obtain exposure, giving a prospective study.]

Nested studies cont.

For rare outcomes:
  Cohort problems: Poor balance, large study
  Trad. Case-Control problems: Recall bias, Selection of controls

Nested studies (Case-Cohort, Nested case-control):
• Efficient design (balanced)
• Prospective (no recall bias)
• Easy selection of controls

Case-Control studies

Design                     Nested   Direction       Sample controls
Case-Cohort                Yes      Prospective     At start
Nested Case-Control        Yes      Prospective     During follow up
Traditional Case-Control   No       Retrospective   At end of (imaginary) cohort

E → D: prospective
E ← D: retrospective

Case-Cohort

[Figure: cohort from start to end with existing cases, healthy subjects, new cases and risk time; controls are sampled at the start.]

Nested Case-Control

[Figure: cohort from start to end with existing cases, healthy subjects, new cases and risk time; at each case, controls are sampled from the risk set.]

Traditional Case-Control

[Figure: cohort from start to end with existing cases, healthy subjects, new cases and risk time; controls are sampled at the end.]

Exercise: Case-Control studies

Full cohort, 10 year follow up

              Cancer +   Cancer -         N   Risk time    Risk    Rate    Odds
Vit D +          1 500     78 500    80 000     714 000   0.019   0.002   0.019
Vit D -            600     39 400    40 000     357 600   0.015   0.002   0.015
Total            2 100    117 900   120 000   1 071 600
Association   Difference                                  0.004   0.000       -
              Relative                                     1.25    1.25    1.25

Disease freq 1.8 %, Exposure freq 66.7 %
Sampling fraction f=0.1

1. Include all cases from the full cohort. Sample controls as in 2-4 below:
2. Case-Cohort: sample 10% disease free (independent of exposure) at the start of the cohort. Calculate the pseudo risks of disease.
3. Nested Case-Control: sample 10% of the risk time (independent of exposure) as control. Calculate the pseudo rates of disease.
4. Traditional Case-Control: sample 10% disease free (independent of exposure) at the end of the cohort. Calculate the pseudo odds of disease.
5. Calculate risk-, rate- and odds ratios.

20 minutes

Case-Cohort vs. Nested Case-Control

Design                Loss to follow up                           Tailored sampling                         Reuse controls        Estimation
Case-Cohort           Low efficiency if much loss to follow up    Simple random                             Yes                   Modified Cox
Nested Case-Control   Good                                        Stratified sampling. Counter matching.    Not straightforward   Stratified Cox or Conditional logistic

Case-Cohort: generalist
Nested Case-Control: specialist

Measures and regression model

Outcome      Exposed               Unexposed             Regression
Continuous   mean1                 mean0                 Linear regression
Binary       risk1, rate1, odds1   risk0, rate0, odds0   ?

Adjusted measures

Remove the effect of confounders in regression models

Frequency   Association or Effect
            Difference            Ratio
Risk        RD: Linear-binomial   RR: Log-binomial
Rate        RD: Linear-Poisson    IRR: Cox, Poisson
Odds        -                     OR: Logistic

Generalized linear models

Family (y|x)   Link       Measure    Alternative
binomial       logit      OR
binomial       log        RR         Poisson regression with robust variance
binomial       identity   RD         Linear regression with robust variance
Poisson        identity   RateDiff

Summing up

• Frequency
  – risk, rate, odds
• Association
  – risk, rate, odds in exposed vs unexposed
• Designs
  – Cross-sectional (no time)
  – Cohort (prospective)
  – Case-control
    • Case-Cohort (nested, prospective)
    • Nested Case Control (nested, prospective)
    • Traditional Case Control (retrospective)

Logistic regression is not the only model!

EXTRA MATERIAL

INTERPRETATIONS OF DIFFERENCE AND RATIO MEASURES
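As a worked illustration of the difference and ratio measures, the arithmetic for the cross-sectional smoking example (girls as exposed, boys as unexposed) can be written out from the counts given in that example. The subscripts 1 and 0 follow the deck's notation for exposed and unexposed; the intermediate decimals are computed here and rounded as on the slides.

```latex
\begin{align*}
\text{risk}_1 &= \frac{300}{1000} = 0.30, &
\text{risk}_0 &= \frac{140}{1000} = 0.14 \\
\text{odds}_1 &= \frac{300}{700} \approx 0.43, &
\text{odds}_0 &= \frac{140}{860} \approx 0.16 \\
RD &= \text{risk}_1 - \text{risk}_0 = 0.30 - 0.14 = 0.16 \\
RR &= \frac{\text{risk}_1}{\text{risk}_0} = \frac{0.30}{0.14} \approx 2.1 \\
OR &= \frac{\text{odds}_1}{\text{odds}_0} = \frac{300 \cdot 860}{700 \cdot 140} \approx 2.6
\end{align*}
```

No rate measures appear because a cross-section has no risk time.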
Cross-sectional example, interpretations

              Smoking +   Smoking -       N    Risk   Rate   Odds
Girls               300         700   1 000    0.30      -   0.43
Boys                140         860   1 000    0.14      -   0.16
Total               440       1 560   2 000
Association   Difference                       0.16
              Ratio                             2.1           2.6

Disease freq 22.0 %, Exposure freq 50.0 %

Risk Difference=0.16: Among boys 14% smoke; girls smoke 16 percentage points more
Risk Ratio=2.1: Among boys 14% smoke; girls smoke 2.1 times as often
Rate Difference, Rate Ratio: - (no risk time in a cross-section)
Odds Ratio=2.6: The odds of smoking are 2.6 times as high for girls as for boys

Cohort exercise, interpretations

Full Cohort, 3 year follow up

              CHD +   CHD -        N   Risk time    Risk    Rate    Odds
Exercise +      100   1 900    2 000       5 850   0.050   0.017   0.053
Exercise -      800   7 200    8 000      22 800   0.100   0.035   0.111
Total           900   9 100   10 000      28 650
Association   Difference                          -0.050  -0.018       -
              Ratio                                 0.50    0.49    0.47

Disease freq 9.0 %, Exposure freq 20.0 %

Risk among unexposed=0.10: With no exercise, the 3-year risk of coronary heart disease is 10%
Risk Difference=-0.05: those who exercise have 5 percentage points less risk
Risk Ratio=0.5: those who exercise have half the risk

Rate among unexposed=0.035: With no exercise, the rate of CHD is 35 new cases per 1000 person-years
Rate Difference=-0.018: those who exercise have 18 fewer cases per 1000 person-years
Rate Ratio=0.49: those who exercise have about half the rate of CHD

Odds Ratio=0.47: The odds of CHD are approximately half among subjects who exercise compared with those who do not
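
The numbers in the cohort exercise can be reproduced with a few lines of code. The following is a minimal Python sketch, not part of the course material: the `Group` class and the `association` function are hypothetical names introduced here for illustration, and the counts and risk times are taken from the cohort table above, with no exercise as the reference group.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """One exposure group of a cohort (hypothetical helper for illustration)."""
    cases: int        # new cases during follow-up (CHD +)
    healthy: int      # subjects without disease at the end (CHD -)
    risk_time: float  # person-years at risk

    @property
    def n(self) -> int:
        return self.cases + self.healthy

    @property
    def risk(self) -> float:
        # incidence proportion: new cases / persons at start
        return self.cases / self.n

    @property
    def rate(self) -> float:
        # incidence rate: new cases / risk time
        return self.cases / self.risk_time

    @property
    def odds(self) -> float:
        # odds: diseased / not diseased
        return self.cases / self.healthy


def association(exposed: Group, unexposed: Group) -> dict:
    """Difference and ratio measures, with the unexposed group as reference."""
    return {
        "risk_difference": exposed.risk - unexposed.risk,
        "rate_difference": exposed.rate - unexposed.rate,
        "risk_ratio": exposed.risk / unexposed.risk,
        "rate_ratio": exposed.rate / unexposed.rate,
        "odds_ratio": exposed.odds / unexposed.odds,
    }


# Counts from the cohort exercise (3 year follow up)
exercise = Group(cases=100, healthy=1900, risk_time=5850)
no_exercise = Group(cases=800, healthy=7200, risk_time=22800)

measures = association(exercise, no_exercise)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

After rounding, the printed values agree with the table: differences of -0.050 (risk) and -0.018 (rate), and ratios of 0.50, 0.49 and 0.47 for risk, rate and odds.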