Summary of Measures and Design
3h
“A haunting tale of risk, rate and odds”
Hein Stigum
Presentation, data and programs at: http://folk.uio.no/heins/ courses
May-16
H.S.
Concepts
Risk: $p = \frac{\text{cases}}{N}$ (probability, proportion, %)
Rate: $r = \frac{\text{new cases}}{N \cdot t}$ (Km/h, cases/person-time)
Odds: $o = \frac{p}{1-p} = \frac{N_{\text{disease}}}{N_{\text{healthy}}}$

risk      odds
0.0100    0.0101
0.1000    0.1111
0.3000    0.4286
Cohorts
Closed cohort (start to end): count persons, risk
Open cohort (start to end): count person-time, rate
Closed cohort with time varying covariates (start to end): count person-time, rate
Epidemiological measures
• Frequency: How much disease?
– prevalence
– incidence
• Association: More disease among exposed?
– Risk difference
– Risk ratio
– Odds ratio
• Potential impact: Important cause?
– Attributable fraction
Frequency measures
Name                                          Equation                                   Type   Cohorts
Prevalence                                    P = Existing Cases / Population            risk   -
Incidence Proportion (Cumulative Incidence)   IP = New Cases / Healthy                   risk   closed cohort
Incidence Rate                                IR = New Cases / RiskTime                  rate   any cohort
Odds                                          odds = p / (1 - p) = Disease / NotDisease  odds   -

May convert rate to risk:
$IP = 1 - e^{-IR \cdot t} \approx IR \cdot t$ for small IR
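The rate-to-risk conversion above can be checked numerically. The sketch below is illustrative only: it assumes a constant incidence rate over the follow-up period, and the function names and example rates are hypothetical, not from the slides.

```python
import math

def rate_to_risk(incidence_rate, time):
    """Incidence proportion implied by a constant incidence rate over `time`."""
    return 1.0 - math.exp(-incidence_rate * time)

def rate_to_risk_approx(incidence_rate, time):
    """Linear approximation IP ~ IR * t, reasonable only when IR * t is small."""
    return incidence_rate * time

# Hypothetical rates per person-year, 10 years of follow up
for ir in (0.001, 0.01, 0.1):
    exact = rate_to_risk(ir, 10)
    approx = rate_to_risk_approx(ir, 10)
    print(f"IR={ir:<6} exact={exact:.4f} approx={approx:.4f}")
```

The approximation drifts away from the exact value as IR times t grows, which is why it is stated only for small rates.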
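Disease frequency depicted
[Figure: existing cases and healthy subjects at start; new cases arise among the healthy during the risk time between start and end. Labels in the figure: t, a.]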
Exercise: Risk, Rate and Odds
Cohort of 200 subjects followed for 10 years

10 year follow up
Disease +   Disease -   N     Risk time
30          170         200   1 850

Frequency: Risk, Rate, Odds (to be calculated)

1. Calculate the 10-year risk of disease
2. Calculate the rate of disease
3. Calculate the 10-year odds of disease
4. Explain the results in words
5 minutes
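One possible way to work through the exercise is sketched below. It uses the counts from the table and assumes the risk time of 1 850 is measured in person-years, which the slide does not state explicitly.

```python
# Counts from the exercise table
cases = 30
healthy = 170
n = cases + healthy      # 200 subjects
risk_time = 1850         # assumed to be person-years

risk = cases / n             # 10-year risk (incidence proportion)
rate = cases / risk_time     # incidence rate per person-year
odds = cases / healthy       # 10-year odds of disease

print(f"risk = {risk:.3f}")
print(f"rate = {rate:.4f} per person-year")
print(f"odds = {odds:.3f}")
```

In words: 15% of the cohort developed the disease within 10 years, new cases arose at about 16 per 1 000 person-years, and there were about 0.18 diseased subjects per healthy subject at the end.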
Frequency measures
        existing cases     new cases
risk    Prevalence         Incidence proportion
rate    -                  Incidence rate
odds    Prevalence odds    Incidence odds
Association measures
• More disease among exposed?
– Compare frequency among exposed (1) and unexposed (0)
– Difference: RD = risk1 - risk0 (0 = no effect)
– Ratio: RR = risk1 / risk0 (1 = no effect)

Association or Effect
Frequency   Difference            Ratio
Risk        Risk Difference, RD   Risk Ratio, RR
Rate        Rate Difference       Rate Ratio, RR, IRR, (HRR)
Odds        -                     Odds Ratio, OR
DESIGNS
The 2 by 2 table
              Disease +   Disease -
Exposure +    a = 100     b = 100
Exposure -    c = 10      d = 100

$OR = \frac{a \cdot d}{b \cdot c} = 10.0$
$se(\ln(OR)) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} = \sqrt{0.01 + 0.01 + 0.1 + 0.01} = \sqrt{0.13} \approx 0.36$
The lowest number sets the precision
To increase power:
Cohort: balance exposure (north-south)
Case-Control: balance disease (east-west)
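The point that the lowest cell count sets the precision can be illustrated with a short sketch. The Wald confidence interval is an addition for illustration and is not shown on the slide; the function name and the "balanced" comparison table are hypothetical.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio, se of ln(OR), and a Wald confidence interval for a 2 by 2 table."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, se, lower, upper

# Table from the slide: the cell with 10 subjects dominates the variance
or_slide, se_slide, lo_slide, hi_slide = odds_ratio_with_ci(100, 100, 10, 100)
print(f"OR = {or_slide:.1f}, se(ln OR) = {se_slide:.2f}")

# Hypothetical balanced table: every cell holds 100, so the se drops
_, se_balanced, _, _ = odds_ratio_with_ci(100, 100, 100, 100)
print(f"balanced se(ln OR) = {se_balanced:.2f}")
```

The term 1/c = 0.1 contributes most of the variance 0.13, so adding subjects to the other cells would barely narrow the interval.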
Time
E,D (no time): cross-section
E → D (prospective): cohort, nested CC, Case-Cohort
E ← D (retrospective): traditional Case-Control
[Figure: time axis with the present time marked]
Designs
• Aims
– Disease occurrence
– Exposure-Disease association
• Designs
– Cross-sectional studies
– Cohort studies
– Case-control studies
• Case-Cohort (inside an existing cohort)
• Nested Case Control (inside an existing cohort)
• Traditional Case Control (at the end of an imaginary cohort)
3 examples
• Gender and Smoking
– Do girls smoke more than boys?
• Exercise and Coronary Heart Disease (CHD)
– Does exercise reduce the risk?
• Genes and Diabetes type 1
– Does gene-type increase the risk?
What design should we use? Consider:
Reversed Causality
Frequency of outcome
Recall bias
CROSS-SECTION
Prevalence depicted
[Figure: existing cases and healthy subjects observed at one point in time between start and end]
Prevalence risk: Exposed: P1, Unexposed: P0
Prevalence odds: Exposed: O1, Unexposed: O0
Cross-sectional example
         Smoking +   Smoking -   N       Risk time   Risk   Rate   Odds
Girls    300         700         1 000   -           0.30   -      0.43
Boys     140         860         1 000   -           0.14   -      0.16
Total    440         1 560       2 000
Disease freq 22.0 %, Exposure freq 50.0 %

Association   Risk   Rate   Odds
Difference    0.16   -      -
Ratio         2.1    -      2.6
Interpretations in the two last slides

Pro: fast and inexpensive
Con: reversed causality

Ex   Exposure   Disease      Dis freq   Cross-section   Cohort   Case-Control
1    Girl       Smoking      22 %       OK
2    Exercise   CHD          9 %        Rev. Caus.
3    Gene       Diabetes 1   1 %
Typical problems:                       Rev. Caus.      Size     Recall
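The prevalence measures in the smoking example can be reproduced with a few lines. This is a minimal sketch using the counts from the table; the function and key names are hypothetical.

```python
def prevalence_measures(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Prevalence risk and odds per group, plus difference and ratio measures."""
    risk1 = cases_exposed / n_exposed
    risk0 = cases_unexposed / n_unexposed
    odds1 = risk1 / (1 - risk1)
    odds0 = risk0 / (1 - risk0)
    return {
        "risk1": risk1, "risk0": risk0,
        "odds1": odds1, "odds0": odds0,
        "RD": risk1 - risk0,   # risk difference
        "RR": risk1 / risk0,   # risk ratio
        "OR": odds1 / odds0,   # odds ratio
    }

# Girls are the exposed group, boys the reference
m = prevalence_measures(300, 1000, 140, 1000)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

With a prevalence as high as 22%, the odds ratio (2.6) is noticeably further from 1 than the risk ratio (2.1).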
COHORT
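Incidence risk, rate and odds depicted
[Figure: existing cases and healthy subjects at start; new cases arise among the healthy during the risk time between start and end. Labels in the figure: t, a.]
Exposed: R1
Unexposed: R0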
Exercise: Cohort
Full Cohort, 3 year follow up
              CHD +   CHD -   N        Risk time
Exercise +    100     1 900   2 000    5 850
Exercise -    800     7 200   8 000    22 800
Total         900     9 100   10 000   28 650
Disease freq 9.0 %, Exposure freq 20.0 %

Frequency: Risk, Rate, Odds (to be calculated)
Association: Difference, Ratio (to be calculated)

1. Calculate the 3-year risk of disease for exposed and unexposed
2. Calculate the rate of disease for exposed and unexposed
3. Calculate the 3-year odds of disease for exposed and unexposed
4. Calculate the difference and ratio association measures, use no exercise as reference.
5. Explain the results in words
10 minutes
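A possible worked solution is sketched below, using the counts from the table and no exercise as the reference group. The risk time is assumed to be in person-years, and the variable names are hypothetical.

```python
# (cases, non-cases, risk time) per exposure group, from the exercise table
exercise = {"cases": 100, "noncases": 1900, "risk_time": 5850}
no_exercise = {"cases": 800, "noncases": 7200, "risk_time": 22800}

def frequencies(group):
    """Risk, rate and odds of disease for one exposure group."""
    n = group["cases"] + group["noncases"]
    return {
        "risk": group["cases"] / n,
        "rate": group["cases"] / group["risk_time"],
        "odds": group["cases"] / group["noncases"],
    }

f1 = frequencies(exercise)      # exposed
f0 = frequencies(no_exercise)   # unexposed (reference)

differences = {k: f1[k] - f0[k] for k in ("risk", "rate")}
ratios = {k: f1[k] / f0[k] for k in ("risk", "rate", "odds")}

print("exposed:    ", {k: round(v, 3) for k, v in f1.items()})
print("unexposed:  ", {k: round(v, 3) for k, v in f0.items()})
print("differences:", {k: round(v, 3) for k, v in differences.items()})
print("ratios:     ", {k: round(v, 2) for k, v in ratios.items()})
```

The three ratio measures (risk, rate, odds) come out close to each other here because the disease is fairly uncommon in both groups.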
CASE CONTROL STUDIES
Traditional Case-Control
Full Cohort: 10 year follow up
          Diabetes +   Diabetes -   N         Risk time   Risk    Rate    Odds
Gene +    100          3 900        4 000     39 500      0.025   0.003   0.026
Gene -    1 000        95 000       96 000    955 000     0.010   0.001   0.011
Total     1 100        98 900       100 000   994 500
Disease freq 1.1 %, Exposure freq 4.0 %

Association   Risk    Rate    Odds
Difference    0.015   0.001   -
Ratio         2.40    2.42    2.44

Case-Control: sampling fraction f=0.1
          Diabetes +   Diabetes -   Pseudo odds
Gene +    100          390          0.256
Gene -    1 000        9 500        0.105
Total     1 100        9 890

Association   Odds
Ratio         2.44
Money saved?

• In practice: All cases, 1-5 controls per case
• Sample controls independent of exposure
• Sample controls from base population of cases
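Why the odds ratio survives control sampling can be sketched as follows. The code uses expected control counts (no random sampling variation), which is a simplification; the function name is hypothetical, and the second sampling fraction is simply the one that yields one control per case in this cohort.

```python
def pseudo_odds(cases_exp, cases_unexp, noncases_exp, noncases_unexp, f):
    """Pseudo odds per group and their ratio when a fraction f of non-cases is sampled."""
    controls_exp = f * noncases_exp       # expected exposed controls
    controls_unexp = f * noncases_unexp   # expected unexposed controls
    odds_exp = cases_exp / controls_exp
    odds_unexp = cases_unexp / controls_unexp
    return odds_exp, odds_unexp, odds_exp / odds_unexp

# Full cohort counts from the slide (gene + / gene -)
full_or = (100 / 3900) / (1000 / 95000)

# 10% of the disease-free sampled as controls
o1, o0, or_10pct = pseudo_odds(100, 1000, 3900, 95000, f=0.1)
print(f"f=0.1: pseudo odds {o1:.3f}, {o0:.3f}, OR {or_10pct:.2f}")

# One control per case: f = cases / non-cases
f_one = 1100 / 98900
p1, p0, or_one = pseudo_odds(100, 1000, 3900, 95000, f=f_one)
print(f"f={f_one:.4f}: pseudo odds {p1:.3f}, {p0:.3f}, OR {or_one:.2f}")
```

The sampling fraction cancels in the ratio, so the pseudo odds themselves are not interpretable but their ratio equals the cohort odds ratio, provided controls are sampled independently of exposure.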
Traditional Case-Control cont.
One Control per Case: sampling fraction f=?
          Diabetes +   Diabetes -   Pseudo odds
Gene +    100          43           2.305
Gene -    1 000        1 057        0.946
Total     1 100        1 100

Association   Odds
Ratio         2.44

Ex   Exposure   Disease      Dis freq   Cross-section   Cohort   Case-Control
1    Girl       Smoking      22 %       OK
2    Exercise   CHD          9 %        Rev. Caus.      OK       Recall bias
3    Gene       Diabetes 1   1 %                        Large    OK
Typical problems:                       Rev. Caus.      Size     Recall
Nested studies: case-control inside a cohort
Exposure information:
relatively cheap: store blood, questionnaire (full cohort)
relatively expensive: analyze blood (cases+controls)
[Figure: prospective study from cohort start to cohort end. Blood is stored at the start and questionnaires are collected during follow up; at the end, the stored blood of cases and controls is analyzed for exposure.]
Nested studies cont.
Cohort problems (for rare outcomes)
Poor balance → large study
Trad. Case-Control problems
Recall bias
Selection of controls
Nested studies: Case-Cohort, Nested case-control
• Efficient design (balanced)
• Prospective (no recall bias)
• Easy selection of controls
Case-Control studies
Design                     Nested   Direction       Sample controls
Case-Cohort                Yes      Prospective     At start
Nested Case-Control        Yes      Prospective     During follow up
Traditional Case-Control   No       Retrospective   At end of (imaginary) cohort

E → D: prospective
E ← D: retrospective
Case-Cohort
[Figure: existing cases and healthy subjects at start; new cases arise during the risk time between start and end; controls are sampled at the start.]
Nested Case-Control
[Figure: existing cases and healthy subjects at start; new cases arise during the risk time between start and end; each time a case occurs, controls are sampled from the risk set at that time.]
Traditional Case-Control
[Figure: existing cases and healthy subjects at start; new cases arise during the risk time between start and end; controls are sampled at the end.]
Exercise: Case-Control studies
Full cohort, 10 year follow up
           cancer +   cancer -   N         Risk time   Risk    Rate    Odds
Vit D +    1 500      78 500     80 000    714 000     0.019   0.002   0.019
Vit D -    600        39 400     40 000    357 600     0.015   0.002   0.015
Total      2 100      117 900    120 000   1 071 600
Disease freq 1.8 %, Exposure freq 66.7 %

Association   Risk    Rate    Odds
Difference    0.004   0.000   -
Relative      1.25    1.25    1.25

Sampling fraction f=0.1
1. Include all cases from the full cohort. Sample controls as in 2-4 below:
2. Case-Cohort: sample 10% disease free (independent of exposure) at the start of the cohort. Calculate the pseudo risks of disease.
3. Nested Case-Control: sample 10% of the risk time (independent of exposure) as control. Calculate the pseudo rates of disease.
4. Traditional Case-Control: sample 10% disease free (independent of exposure) at the end of the cohort. Calculate the pseudo odds of disease.
5. Calculate risk-, rate- and odds ratios.
20 minutes
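The three control sampling schemes in the exercise can be compared side by side. The sketch below works with expected counts under a 10% sampling fraction and assumes everyone is disease free at the start of the cohort; the dictionary layout and names are hypothetical.

```python
# Full cohort from the exercise; the exposed row is the one with N = 80 000,
# consistent with the stated exposure frequency of 66.7 %
cohort = {
    "exposed":   {"cases": 1500, "noncases": 78500, "n": 80000, "risk_time": 714000},
    "unexposed": {"cases": 600,  "noncases": 39400, "n": 40000, "risk_time": 357600},
}
f = 0.1  # sampling fraction for controls

def pseudo_frequencies(group, f):
    """All cases divided by the expected controls under each sampling scheme."""
    cases = group["cases"]
    return {
        "pseudo_risk": cases / (f * group["n"]),          # case-cohort: sample at start
        "pseudo_rate": cases / (f * group["risk_time"]),  # nested CC: sample risk time
        "pseudo_odds": cases / (f * group["noncases"]),   # traditional CC: sample at end
    }

e1 = pseudo_frequencies(cohort["exposed"], f)
e0 = pseudo_frequencies(cohort["unexposed"], f)
ratios = {name: e1[name] / e0[name] for name in e1}

for name in e1:
    print(f"{name}: exposed {e1[name]:.4f}, unexposed {e0[name]:.4f}, ratio {ratios[name]:.2f}")
```

Each scheme recovers a different cohort measure: the case-cohort ratio matches the risk ratio, the nested case-control ratio matches the rate ratio, and the traditional case-control ratio matches the odds ratio.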
Case-Cohort vs. Nested Case-Control
Design                Loss to follow up                           Tailored sampling                         Reuse controls        Estimation
Case-Cohort:          Low efficiency if much loss to follow up    Simple random                             Yes                   Modified Cox
Nested Case-Control:  Good                                        Stratified sampling. Counter matching.    Not straightforward   Stratified Cox or Conditional logistic

Case-Cohort: generalist
Nested Case-Control: specialist
Measures and regression model
Outcome      Exposed               Unexposed             Regression
Continuous   mean1                 mean0                 Linear regression
Binary       risk1, rate1, odds1   risk0, rate0, odds0   ?
Adjusted measures
Remove the effect of confounders in regression models

Association or Effect
Frequency   Difference            Ratio
Risk        RD: Linear-binomial   RR: Log-binomial
Rate        RD: Linear-Poisson    IRR: Cox, Poisson
Odds        -                     OR: Logistic

Generalized linear models
Family (y|x)   Link       Measure    Alternative
binomial       logit      OR
binomial       log        RR         Poisson regression with robust variance
binomial       identity   RD         Linear regression with robust variance
Poisson        identity   RateDiff
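How the link function determines the measure can be seen without fitting anything. With one binary exposure and no confounders, a binomial model reproduces the two observed risks exactly, so the exposure coefficient has a closed form. The sketch below uses the risks from the cohort exercise (100/2 000 and 800/8 000); it is not a substitute for a GLM routine, which adjusted analyses would need.

```python
import math

# Observed 3-year risks from the cohort exercise
risk1 = 100 / 2000   # exercise
risk0 = 800 / 8000   # no exercise (reference)

def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1 - p))

# Exposure coefficient b1 in g(p) = b0 + b1 * x for three link functions g
b1_identity = risk1 - risk0                      # identity link: b1 = RD
b1_log = math.log(risk1) - math.log(risk0)       # log link:      b1 = ln(RR)
b1_logit = logit(risk1) - logit(risk0)           # logit link:    b1 = ln(OR)

print(f"identity link: RD = {b1_identity:.3f}")
print(f"log link:      RR = {math.exp(b1_log):.2f}")
print(f"logit link:    OR = {math.exp(b1_logit):.2f}")
```

The same data therefore give a risk difference, a risk ratio or an odds ratio depending only on the chosen link.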
Summing up
• Frequency
– risk, rate, odds
• Association
– risk, rate, odds in exposed vs unexposed
• Designs
– Cross-sectional: no time
– Cohort: prospective
– Case-control
• Case-Cohort: nested, prospective
• Nested Case Control: nested, prospective
• Traditional Case Control: retrospective
Logistic regression is not the only model!
EXTRA MATERIAL
INTERPRETATIONS OF DIFFERENCE AND RATIO MEASURES
Cross-sectional example, interpretations
         Smoking +   Smoking -   N       Risk time   Risk   Rate   Odds
Girls    300         700         1 000   -           0.30   -      0.43
Boys     140         860         1 000   -           0.14   -      0.16
Total    440         1 560       2 000
Disease freq 22.0 %, Exposure freq 50.0 %

Association   Risk   Rate   Odds
Difference    0.16   -      -
Ratio         2.1    -      2.6

Risk Difference=0.16: Among boys 14% smoke; among girls the proportion is 16 percentage points higher
Risk Ratio=2.1: Among boys 14% smoke; among girls the proportion is 2.1 times as high
Rate Difference, Rate Ratio: not available (no risk time in a cross-sectional study)
Odds Ratio=2.6: The odds of smoking are 2.6 times higher for girls than for boys
Cohort exercise, interpretations
Full Cohort, 3 year follow up
              CHD +   CHD -   N        Risk time   Risk    Rate    Odds
Exercise +    100     1 900   2 000    5 850       0.050   0.017   0.053
Exercise -    800     7 200   8 000    22 800      0.100   0.035   0.111
Total         900     9 100   10 000   28 650
Disease freq 9.0 %, Exposure freq 20.0 %

Association   Risk     Rate     Odds
Difference    -0.050   -0.018   -
Ratio         0.50     0.49     0.47

Risk among unexposed=0.10: With no exercise, the 3-year risk of coronary heart disease is 10%
Risk Difference=-0.05: those who exercise have 5 percentage points less risk
Risk Ratio=0.5: those who exercise have half the risk
Rate among unexposed=0.035: With no exercise, the rate of CHD is 35 new cases per 1000 person years
Rate Difference=-0.018: those who exercise have 18 fewer cases per 1000 person years
Rate Ratio=0.49: those who exercise have about half the rate of CHD
Odds Ratio=0.47: The odds of CHD are approximately half among subjects who exercise compared to those who do not