RosenbergAndRankin_PAFs

Multifactorial Population Attributable
Fractions: Approaches, Examples, and
Issues to Consider
Deborah Rosenberg, PhD and Kristin Rankin, PhD
Epidemiology and Biostatistics
School of Public Health
University of Illinois at Chicago
Wednesday, May 30th, 3:15-5:00pm
Deborah Rosenberg, PhD
Research Associate Professor
Kristin Rankin, PhD
Research Assistant Professor

Division of Epidemiology and Biostatistics
University of IL School of Public Health

Training Course in MCH Epidemiology

Background

In any multivariable analysis, the goal is to generate
unconfounded / independent estimates of effect for
each of many factors, taking into account the
relationships among and intersection of those factors.

Different analytic approaches are required to obtain
these mutually exclusive estimates according to
whether ratio measures of association or population
attributable fractions (PAFs) are of interest.
Background
In contrast to relative risks or odds ratios,
the PAF is a function of both the magnitude of association
and the prevalence of risk in the population.
The crude PAF (Levin, 1953):
$$\mathrm{PAF} = p_c \cdot \frac{RR - 1}{RR}$$
where $p_c$ is the proportion exposed among those with the outcome and $RR$ is the relative risk.
Extension to a multivariable PAF over exposure levels $j = 0, \dots, k$:
Rothman:
$$\mathrm{PAF} = \sum_{j=0}^{k} p_j \left( \frac{RR_j - 1}{RR_j} \right)$$
or
Bruzzi:
$$\mathrm{PAF} = 1 - \sum_{j=0}^{k} \frac{p_j}{RR_j}$$
Background
• When estimating relative risks or odds ratios, independence
can be achieved through use of usual adjustment
procedures to control for confounding:
“Does a risk factor confer excess risk of disease for an
individual after holding all other factors constant?”
• When estimating PAFs, usual adjustment does not result in
mutually exclusive PAFs, nor does it address the dynamics
of how the prevalence of risk factors might change in the
population over time:
“How much will eliminating a risk factor reduce the
prevalence of disease in the population given that
disease may still occur in the presence of other risk
factors that have not yet been eliminated?”
Background



After adjustment, both ratio measures and PAFs
may be overestimates because of residual
confounding.
For PAFs, overestimation is of greater concern
because simple adjustment assumes that only the
factor of interest will be eliminated (its prevalence
will be reduced to 0) while the prevalence of other
factors remains constant.
Concern about the precision of PAF estimates is
also critical since these measures directly speak to
the potential impact of public health action.
Background
Methods that go beyond the usual adjustment approach
have been developed to handle the problem of obtaining
mutually exclusive and mutually adjusted PAFs:

Modifiable risk factors are considered together as
belonging to what might be called a “risk system”
− Expected disease reduction due to elimination of any
one risk factor is quantified by acknowledging every
possible sequence for eliminating all factors in the “risk
system” over time.
− The maximum expected disease reduction due to
elimination of all risk factors in the “risk system” can also
be appropriately quantified.
Organizing Factors into a Risk System:
A Framework for Computing PAFs

“Adjusted” PAF: The PAF for eliminating a risk
factor from a risk system after controlling for
other factors (Miettinen, 1974)

Summary PAF: The PAF for the maximum
expected disease reduction when all factors in a
risk system are simultaneously eliminated
(Bruzzi, 1985)

Component PAF: The separate PAF for every
possible combination of exposure levels in the
risk system (the set of joint and separate effects
of risk factors)
Organizing Factors into a Risk System:
A Framework for Computing PAFs

Sequential PAF: The PAF for eliminating a risk
factor in a particular order from a risk system; sets
of sequential PAFs comprise all possible removal
sequences

Average PAF: The PAF summarizing all possible
sequences for eliminating a single modifiable risk
factor (Eide and Gefeller, 1995)
Organizing Factors into a Risk System:
A Framework for Computing PAFs
The Average PAF

The AvgPAF for a risk factor is both mutually
exclusive and mutually adjusted

The sum of AvgPAFs for all modifiable risk factors
in a system equals the summary PAF (sumPAF) for
the risk system as a whole—the % of disease
reduction expected if all of the factors are
simultaneously and completely eliminated from the
population
Example: Simultaneously Considering 3 Risk
Factors for a Health Outcome
Let 3 factors be called M1, M2, and M3
The SummaryPAF for this “risk system” can be partitioned
into seven component PAFs:
What is the proportion
of disease attributable
to each combination
of risk factors?
M1 and M2 and M3
M1 and M2
M1 and M3
M2 and M3
M1 alone
M2 alone
M3 alone
Summary PAF Partitioned into
7 Component PAFs
[Figure: pie chart, "Joint and Separate Effects of M1, M2, & M3". For 3 risk factors for an outcome, the Summary PAF of 0.34 is split into seven component slices (M1 & M2 & M3; M1 & M2; M1 & M3; M2 & M3; M1 alone; M2 alone; M3 alone) with slice values 0.04, 0.02, 0.06, 0.02, 0.03, 0.02, and 0.15. The remaining 0.66 is labeled Unknown.]
Component PAFs still fail to provide a
single estimate of the impact of each factor
Example: Simultaneously Considering 3 Risk
Factors for a Health Outcome
The SummaryPAF for this “risk system” can be partitioned
six different ways into three sequential PAFs
Computation of sequential PAFs involves a series of
subtractions of adjusted and/or summary PAFs
based on repeatedly redefining the risk system and
its corresponding Summary PAF in terms of
particular combinations of risk factors
Example: Simultaneously Considering 3 Risk
Factors for a Health Outcome
Six Sequences for Three Risk Factors
Sequence #1: Eliminate M1, then M2, then M3
Sequence #2: Eliminate M1, then M3, then M2
Sequence #3: Eliminate M2, then M1, then M3
Sequence #4: Eliminate M2, then M3, then M1
Sequence #5: Eliminate M3, then M1, then M2
Sequence #6: Eliminate M3, then M2, then M1
For each factor, there are two 1st sequential PAFs, two 2nd
sequential PAFs, and two 3rd Sequential PAFs.
Example: Simultaneously Considering 3 Risk
Factors for a Health Outcome
EXAMPLE (Sequence #1): Eliminate M1, then M2, then M3
1st SeqPAF* (M1)
= PAF(M1) adjusting for M2 & M3 and other covariates**
2nd SeqPAF (M2)
= SumPAF(M1 & M2) adjusting for M3 and other covariates
– PAF(M1) adjusting for M2 & M3 and other covariates
3rd SeqPAF (M3)
= SumPAF (M1 & M2 & M3)
– SumPAF(M1 & M2) adjusting for M3 and other covariates
*“adjusted” PAF
**covariates are unmodifiable factors
Example: Simultaneously Considering 3 Risk
Factors for a Health Outcome
EXAMPLE (Sequence #5): Eliminate M3, then M1, then M2
1st SeqPAF* (M3)
= PAF(M3) adjusting for M1 & M2 and other covariates**
2nd SeqPAF (M1)
= SumPAF(M3 & M1) adjusting for M2 and other covariates
– PAF(M3) adjusting for M1 & M2 and other covariates
3rd SeqPAF (M2)
= SumPAF(M1 & M2 & M3)
– SumPAF(M3 & M1) adjusting for M2 and other covariates
*“adjusted” PAF
**covariates are unmodifiable factors
Summary PAF Partitioned into
Sequential PAFs for Sequences #1 and #5
[Figure: two pie charts for 3 risk factors for an outcome, Summary PAF = 0.34, each with an Unknown slice of 0.66. One chart shows the 1st, 2nd, and 3rd Sequential PAFs for the sequence in which M1 is eliminated first, M2 second, and M3 last (1st seqPAF M1, 2nd seqPAF M2, 3rd seqPAF M3). The other shows the sequence in which M3 is eliminated first, M1 second, and M2 last (1st seqPAF M3, 2nd seqPAF M1, 3rd seqPAF M2). Slice values shown: 0.07, 0.23, 0.08, 0.19, 0.05, 0.06.]
Sequential PAFs still fail to provide
a single estimate of the impact of each factor
Example: Simultaneously Considering 3 Risk
Factors for a Health Outcome
In general: the number of possible sequences is a
function of the number of factors in the risk system and
becomes large quickly as the number of variables
increases.
# of Risk   # of Possible Removal    Number of Unique
Factors     Orderings / Sequences    Sequential PAFs for a factor
2           2! = 2                   2
3           3! = 6                   4
4           4! = 24                  8
5           5! = 120                 16
6           6! = 720                 32
7           7! = 5,040               64
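The counts in the table above can be checked by brute force. The sketch below is illustrative only: it assumes that a sequential PAF for a factor is determined by the set of factors removed before it (not by their order), which matches the slides' note that the two 1st seqPAFs for a factor coincide. The function name `count_sequences` is hypothetical.

```python
from itertools import permutations


def count_sequences(k):
    """Return (number of removal orderings,
    number of unique sequential PAFs for one factor) for k risk factors."""
    orderings = list(permutations(range(k)))
    # Assumption: factor 0's sequential PAF depends only on WHICH factors
    # were already removed, so we collect the distinct predecessor sets.
    predecessor_sets = {frozenset(seq[:seq.index(0)]) for seq in orderings}
    return len(orderings), len(predecessor_sets)


for k in range(2, 8):
    n_orderings, n_unique = count_sequences(k)
    print(k, n_orderings, n_unique)
```

Under that assumption the orderings grow as k! while the unique sequential PAFs per factor grow as 2^(k-1).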
Example: Simultaneously Considering 3 Risk
Factors for a Health Outcome
The SummaryPAF for this “risk system” can be partitioned
into three average PAFs--one for each of the 3 factors
The sequences are broken apart, rearranging the
sequential PAFs so that the six 1st, 2nd, and 3rd
sequential PAFs for M1 are grouped together as are
those for M2 and M3
The AveragePAF for each factor is then calculated
as the simple arithmetic average across the 6
sequential PAFs for each factor
Example: Simultaneously Considering 3 Risk
Factors for a Health Outcome
For example, the Average PAF for M1 is the sum of the following
6 Sequential PAFs divided by 6:
The two 1st SeqPAFs for M1 (Seq #1 & Seq #2):
= PAF(M1), adjusting** for M2 & M3 (multiplied by 2)
The two 2nd SeqPAFs for M1 (Seq #3 & Seq #5):
= SumPAF(M2 & M1), adj. for M3 – PAF(M2), adj. for M1 & M3
= SumPAF(M3 & M1), adj. for M2 – PAF(M3), adj. for M1 & M2
The two 3rd SeqPAFs for M1 (Seq #4 & Seq #6):
= SumPAF(M1 & M2 & M3) – SumPAF(M2 & M3), adj. for M1 (multiplied by 2)
**Adjustment also includes covariates
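The averaging described above can be sketched in a few lines. This is a minimal illustration, not the authors' software: the function `average_pafs` is hypothetical, and the summary PAF values in `S` are HYPOTHETICAL numbers chosen only so that the full three-factor system has a Summary PAF of 0.34; they are not taken from the slides.

```python
from itertools import permutations


def average_pafs(factors, summary_paf):
    """Average each factor's sequential PAFs over all removal orderings.

    summary_paf maps a frozenset of eliminated factors to the (adjusted)
    summary PAF for eliminating exactly that set of factors.
    """
    totals = {f: 0.0 for f in factors}
    orderings = list(permutations(factors))
    for order in orderings:
        removed = frozenset()
        for f in order:
            # Sequential PAF: summary PAF after removing f
            # minus summary PAF before removing f.
            totals[f] += summary_paf[removed | {f}] - summary_paf[removed]
            removed = removed | {f}
    return {f: totals[f] / len(orderings) for f in factors}


# HYPOTHETICAL inputs for illustration only.
S = {
    frozenset(): 0.0,
    frozenset({"M1"}): 0.10,
    frozenset({"M2"}): 0.12,
    frozenset({"M3"}): 0.20,
    frozenset({"M1", "M2"}): 0.18,
    frozenset({"M1", "M3"}): 0.27,
    frozenset({"M2", "M3"}): 0.29,
    frozenset({"M1", "M2", "M3"}): 0.34,
}

avg = average_pafs(["M1", "M2", "M3"], S)
print(avg, sum(avg.values()))
```

Whatever inputs are used, the average PAFs sum to the summary PAF for the full system, because each ordering's sequential PAFs telescope to that total.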
Summary PAF Partitioned into
Average PAFs for Three Risk Factors
The avgPAF for a factor is the simple average of all of
that factor’s seqPAFs. With 3 factors:
$$\mathrm{avgPAF} = \frac{\text{sum of all 6 seqPAFs}}{6}$$
[Figure: pie chart, "Average PAFs for M1, M2, and M3 Accounting for All Sequences for Eliminating These Risk Factors". 3 risk factors for an outcome, Summary PAF = 0.34; slices for avgPAF M1, avgPAF M2, and avgPAF M3 with values 0.05, 0.08, and 0.21, plus an Unknown slice of 0.66.]
The Average PAF is a
single estimate of the impact
of each risk factor
Example: Smoking, Cocaine Use
and Low Birthweight
Example: Smoking, Cocaine Use, and
Low Birthweight: Crude Associations
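Smoking:
$$\text{Crude } RR = \frac{10.00}{6.25} = 1.60$$
$$\text{Crude PAF} = \frac{200}{700} \cdot \frac{1.6 - 1}{1.6} = 0.107$$
Cocaine use:
$$\text{Crude } RR = \frac{30.00}{6.29} = 4.77$$
$$\text{Crude PAF} = \frac{90}{700} \cdot \frac{4.77 - 1}{4.77} = 0.102$$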
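The crude calculations can be reproduced as a small sketch. The function name `crude_paf` is hypothetical; the counts (200 of 2,000 smokers and 500 of 8,000 non-smokers with low birthweight; 90 of 300 cocaine users and 610 of 9,700 non-users) are the marginal totals from the deck's stratified tables.

```python
def crude_paf(exposed_cases, total_cases, rr):
    """Crude PAF: proportion of cases exposed times (RR - 1) / RR."""
    p_cases = exposed_cases / total_cases
    return p_cases * (rr - 1) / rr


# Crude relative risks from the marginal low-birthweight percentages
smoking_rr = (200 / 2000) / (500 / 8000)   # 10.00% vs 6.25%
cocaine_rr = (90 / 300) / (610 / 9700)     # 30.00% vs 6.29%

smoking_paf = crude_paf(200, 700, smoking_rr)
cocaine_paf = crude_paf(90, 700, cocaine_rr)
print(round(smoking_rr, 2), round(smoking_paf, 3))
print(round(cocaine_rr, 2), round(cocaine_paf, 3))
```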
Smoking and Cocaine Organized into a
Risk System
If smoking and cocaine use were recoded as a
single “substance use” variable:
Freq        |  LOW BIRTHWEIGHT  |
Row Pct     |   yes  |    no    | Total
____________|________|__________|_______
smoke and   |     52 |       98 |   150
cocaine     |  34.67 |    65.33 |
____________|________|__________|_______
cocaine     |     38 |      112 |   150
only        |  25.33 |    74.67 |
____________|________|__________|_______
smoke       |    148 |     1702 |  1850
only        |   8.00 |    92.00 |
____________|________|__________|_______
neither     |    462 |     7388 |  7850
            |   5.89 |    94.11 |
____________|________|__________|_______
Total       |    700 |     9300 | 10000

Freq        |  LOW BIRTHWEIGHT  |
Row Pct     |   yes  |    no    | Total
____________|________|__________|_______
any smoke   |    238 |     1912 |  2150
or cocaine  |  11.07 |    88.93 |
____________|________|__________|_______
neither     |    462 |     7388 |  7850
            |   5.89 |    94.11 |
____________|________|__________|_______
Total       |    700 |     9300 | 10000

With RR = 11.07 / 5.89 = 1.88 for any substance use:
$$\text{SummaryPAF} = \frac{238}{700} \cdot \frac{1.88 - 1}{1.88} = 0.16$$
Component PAFs and Summary PAF for the
Smoking-Cocaine Risk System
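Using Rothman’s formula, the Summary PAF is the
sum of component PAFs:
$$\sum_{j=0}^{k} p_j \left( \frac{RR_j - 1}{RR_j} \right)$$
$$\mathrm{PAF}_3 = \frac{52}{700} \cdot \frac{5.89 - 1}{5.89} = 0.062 \quad \text{(smoke and cocaine)}$$
$$\mathrm{PAF}_2 = \frac{38}{700} \cdot \frac{4.30 - 1}{4.30} = 0.042 \quad \text{(cocaine only)}$$
$$\mathrm{PAF}_1 = \frac{148}{700} \cdot \frac{1.36 - 1}{1.36} = 0.056 \quad \text{(smoke only)}$$
$$\mathrm{PAF}_0 = \frac{462}{700} \cdot \frac{1 - 1}{1} = 0.0 \quad \text{(neither)}$$
$$\mathrm{PAF}_3 + \mathrm{PAF}_2 + \mathrm{PAF}_1 + \mathrm{PAF}_0 = 0.062 + 0.042 + 0.056 + 0.0 = 0.16$$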
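As a check on the component sum, the sketch below recomputes the component PAFs from the four-category substance use table and compares the Rothman and Bruzzi forms of the summary PAF. Variable names are illustrative; the cell counts are those in the deck's table.

```python
# (low-birthweight cases, total births) for each exposure category
cells = {
    "neither": (462, 7850),            # reference level, RR = 1
    "smoke only": (148, 1850),
    "cocaine only": (38, 150),
    "smoke and cocaine": (52, 150),
}

total_cases = sum(cases for cases, _ in cells.values())  # 700
ref_cases, ref_n = cells["neither"]
ref_risk = ref_cases / ref_n

components = {}
bruzzi_sum = 0.0
for name, (cases, n) in cells.items():
    rr = (cases / n) / ref_risk          # relative risk vs "neither"
    p = cases / total_cases              # proportion of all cases in this category
    components[name] = p * (rr - 1) / rr # Rothman component
    bruzzi_sum += p / rr                 # Bruzzi term

summary_rothman = sum(components.values())
summary_bruzzi = 1 - bruzzi_sum
print({k: round(v, 3) for k, v in components.items()})
print(round(summary_rothman, 3), round(summary_bruzzi, 3))
```

The two forms agree algebraically because the proportions of cases across all categories sum to 1.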
Limitation of Component PAFs from the
Smoking-Cocaine Risk System
While the component PAFs of a risk system sum
to the Summary PAF for the system as a whole,
they do not provide mutually exclusive measures
of the PAF for each risk factor.
Here, the Summary PAF = 0.16,
but the two factors overlap:
the component PAFs still do not
disentangle smoking and cocaine
for those who do both.
[Figure: pie chart with slices Coc = 0.042, Sm/Coc = 0.062, Sm = 0.056, and a remainder of 0.84.]
The “Adjusted” PAF: Obtaining a Single PAF
for a Given Risk Factor
The Stratified Approach: The PAF for eliminating a
risk factor after controlling for other risk factors
With the Rothman formula, data are organized into the
more traditional strata set-up for adjustment:
Not assuming homogeneity, pj & RRj are stratum-specific:
$$\mathrm{PAF} = \sum_{j=1}^{\#\text{ of strata}} \frac{\#\text{ of exposed cases}_j}{\text{Total cases, all strata}} \cdot \frac{RR_j - 1}{RR_j}$$
Assuming homogeneity, overall:
$$\mathrm{PAF} = \frac{\#\text{ of exposed cases, all strata}}{\text{Total cases, all strata}} \cdot \frac{RR_{\text{adjusted}} - 1}{RR_{\text{adjusted}}}$$
The “Adjusted” PAF: Obtaining a Single PAF
for a Given Factor
Reorganizing the data to get an adjusted PAF with Rothman’s formula:
the four-category table (smoke and cocaine, cocaine only, smoke only,
neither) is split into strata of cocaine use. In the four-category
table, the smoke and cocaine category has RR = 5.89:
$$\mathrm{PAF} = \frac{52}{700} \cdot \frac{5.89 - 1}{5.89} = 0.062$$

COCAINE=YES
Freq        |  LOW BIRTHWEIGHT  |
Row Pct     |   yes  |    no    | Total
____________|________|__________|_______
smoke       |     52 |       98 |   150
yes         |  34.67 |    65.33 |
____________|________|__________|_______
smoke       |     38 |      112 |   150
no          |  25.33 |    74.67 |
____________|________|__________|_______
Total       |     90 |      210 |   300

COCAINE=NO
Freq        |  LOW BIRTHWEIGHT  |
Row Pct     |   yes  |    no    | Total
____________|________|__________|_______
smoke       |    148 |     1702 |  1850
yes         |   8.00 |    92.00 |
____________|________|__________|_______
smoke       |    462 |     7388 |  7850
no          |   5.89 |    94.11 |
____________|________|__________|_______
Total       |    610 |     9090 |  9700
The “Adjusted” PAF: The PAF for Smoking,
Controlling for Cocaine Use*
COCAINE=YES stratum (52 of 150 smokers and 38 of 150 non-smokers with low birthweight), RR = 34.67 / 25.33 = 1.37:
$$\mathrm{PAF} = \frac{52}{700} \cdot \frac{1.37 - 1}{1.37} = 0.020$$
COCAINE=NO stratum (148 of 1850 smokers and 462 of 7850 non-smokers with low birthweight), RR = 8.00 / 5.89 = 1.36:
$$\mathrm{PAF} = \frac{148}{700} \cdot \frac{1.36 - 1}{1.36} = 0.056$$
$$\mathrm{PAF}_{\text{“Adjusted”}} = 0.020 + 0.056 = 0.076$$
*Using stratum-specific estimates
The “Adjusted” PAF: The PAF for Cocaine
Controlling for Smoking*
SMOKE=YES
Freq        |  LOW BIRTHWEIGHT  |
Row Pct     |   yes  |    no    | Total
____________|________|__________|_______
cocaine     |     52 |       98 |   150
yes         |  34.67 |    65.33 |
____________|________|__________|_______
cocaine     |    148 |     1702 |  1850
no          |   8.00 |    92.00 |
____________|________|__________|_______
Total       |    200 |     1800 |  2000

RR = 4.33:
$$\mathrm{PAF} = \frac{52}{700} \cdot \frac{4.33 - 1}{4.33} = 0.057$$

SMOKE=NO
Freq        |  LOW BIRTHWEIGHT  |
Row Pct     |   yes  |    no    | Total
____________|________|__________|_______
cocaine     |     38 |      112 |   150
yes         |  25.33 |    74.67 |
____________|________|__________|_______
cocaine     |    462 |     7388 |  7850
no          |   5.89 |    94.11 |
____________|________|__________|_______
Total       |    500 |     7500 |  8000

RR = 4.30:
$$\mathrm{PAF} = \frac{38}{700} \cdot \frac{4.30 - 1}{4.30} = 0.042$$
$$\mathrm{PAF}_{\text{“Adjusted”}} = 0.057 + 0.042 = 0.099$$
*Using stratum-specific estimates
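The stratum-specific ("not assuming homogeneity") adjusted PAFs for both factors can be sketched as follows. The helper name `stratified_adjusted_paf` and the tuple layout are assumptions made for this illustration; the counts come from the stratified tables in the deck.

```python
def stratified_adjusted_paf(strata, total_cases):
    """Sum stratum-specific PAF contributions.

    strata: list of (exposed_cases, exposed_n, unexposed_cases, unexposed_n),
    one tuple per stratum of the factor being controlled for.
    """
    paf = 0.0
    for exp_cases, exp_n, unexp_cases, unexp_n in strata:
        rr = (exp_cases / exp_n) / (unexp_cases / unexp_n)  # stratum-specific RR
        paf += (exp_cases / total_cases) * (rr - 1) / rr
    return paf


# Smoking, controlling for cocaine use (strata: cocaine yes, cocaine no)
smoking_given_cocaine = stratified_adjusted_paf(
    [(52, 150, 38, 150), (148, 1850, 462, 7850)], total_cases=700
)
# Cocaine use, controlling for smoking (strata: smoke yes, smoke no)
cocaine_given_smoking = stratified_adjusted_paf(
    [(52, 150, 148, 1850), (38, 150, 462, 7850)], total_cases=700
)
print(round(smoking_given_cocaine, 3), round(cocaine_given_smoking, 3))
```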
Limitations of the “Adjusted” PAF:
The resulting adjusted PAFs still are not mutually
exclusive and they do not meet the criterion of
summing to the Summary PAF for all factors
combined:
Component PAFs: 0.042 + 0.062 + 0.056 = 0.16 (remainder 0.84)
Adjusted PAFs: 0.076 + 0.099 = 0.175 (remainder 0.825)
0.175 ≠ 0.16
Sequential PAFs (PAFSEQ) for the
Smoking-Cocaine Risk System
For the smoking-cocaine risk system, there are 2
possible sequences:
1. Eliminate smoking first (a), controlling for
cocaine use, then eliminate cocaine use (b)
2. Eliminate cocaine use first (a), controlling for
smoking, then eliminate smoking (b)
And within each sequence, there are two
sequential PAFs
Sequential PAFs (PAFSEQ) for the
Smoking-Cocaine Risk System
1. The 1st sequential PAF for eliminating smoking first,
controlling for cocaine use (the “adjusted” PAF):
PAFSEQ1a (S|C) = 0.076
2. The 2nd sequential PAF for eliminating cocaine use
after smoking has already been eliminated is the
remainder of the Summary PAF:
PAFSEQ1b = PAFSUM – PAFSEQ1a (S|C)
= 0.16 – 0.076 = 0.084
Sequential PAFs (PAFSEQ) for the
Smoking-Cocaine Risk System
1. The 1st sequential PAF for eliminating cocaine use
first, controlling for smoking (the “adjusted” PAF):
PAFSEQ2a (C|S) = 0.099
2. The 2nd sequential PAF for eliminating smoking after
cocaine use has already been eliminated is the remainder
of the Summary PAF:
PAFSEQ2b = PAFSUM – PAFSEQ2a (C|S)
= 0.16 – 0.099 = 0.061
Sequential PAFs (PAFSEQ) for the
Smoking-Cocaine Risk System
By definition, the sequential PAFs within the two possible
sequences sum to the Summary PAF, but they still do not
provide single measures of the impact of smoking or cocaine
use regardless of the order in which they are eliminated
Smoking First: 0.076 + 0.084 = 0.16 (remainder 0.84)
Cocaine Use First: 0.099 + 0.061 = 0.16 (remainder 0.84)
Average PAF (PAFAVG) for the
Smoking-Cocaine Risk System
To obtain a single estimate, the sequential PAFs are
rearranged, grouping the two for smoking together
and the two for cocaine together and then calculating
an average for each factor:
1. Eliminating smoking first, averaged with
eliminating smoking second
2. Eliminating cocaine use first, averaged with
eliminating cocaine use second
Average PAF (PAFAVG) for the
Smoking-Cocaine Risk System
Averaging Sequential PAFs
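Average PAF for Smoking:
$$\frac{\mathrm{PAF}_{SEQ}(S|C) + \left( \mathrm{PAF}_{SUM} - \mathrm{PAF}_{SEQ}(C|S) \right)}{2} = \frac{0.076 + 0.061}{2} \approx 0.07$$
Average PAF for Cocaine Use:
$$\frac{\mathrm{PAF}_{SEQ}(C|S) + \left( \mathrm{PAF}_{SUM} - \mathrm{PAF}_{SEQ}(S|C) \right)}{2} = \frac{0.099 + 0.084}{2} \approx 0.09$$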
Average PAFs for the
Smoking-Cocaine Risk System
The Average PAFs for each factor in the risk system
are mutually exclusive and their sum equals the
Summary PAF:
[Figure: pie chart with slices avgPAF smoking = 0.07, avgPAF cocaine use = 0.09, and a remainder of 0.84.]
0.0685 + 0.0915 = 0.16
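The two-factor bookkeeping can be reproduced from the rounded values on the slides (Summary PAF 0.16, adjusted PAFs 0.076 and 0.099). This is an illustrative sketch; the variable names are not from the original.

```python
paf_sum = 0.16                                   # Summary PAF for the risk system
adjusted = {"smoking": 0.076, "cocaine": 0.099}  # 1st sequential ("adjusted") PAFs
other = {"smoking": "cocaine", "cocaine": "smoking"}

# 2nd sequential PAF: what is left of the Summary PAF after the other
# factor has been eliminated first.
second = {f: paf_sum - adjusted[other[f]] for f in adjusted}

# Average PAF: mean of the factor's 1st and 2nd sequential PAFs.
average = {f: (adjusted[f] + second[f]) / 2 for f in adjusted}

print({f: round(v, 4) for f, v in second.items()})
print({f: round(v, 4) for f, v in average.items()})
print(round(sum(average.values()), 4))
```

With unrounded inputs from the tables, the averages shift slightly (roughly 0.068 and 0.091) but still sum to the Summary PAF.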
Model Building Issues and Strategies
in the Context of
Estimating and Reporting PAFs
Modeling to Generate AvgPAFs

Regression modeling is a flexible and efficient
method for obtaining the series of relative risks
needed for calculating seqPAFs as the number of
variables being considered increases.

As an intermediate step in estimating avgPAFs, the
modeling process will differ from that for estimating
ratio measures of association:
− Variable selection and organization
− Level of measurement & choice of reference groups
− Confounding and effect modification
− Model building strategies for choosing a final model
Variable Selection and Organization
Classifying factors as modifiable or unmodifiable

When modeling to estimate relative risks or odds
ratios, explicit differentiation between modifiable and
unmodifiable factors is not necessary, although this
differentiation is certainly conceptually important.

When modeling is an intermediate step in estimating
avgPAFs, the question of modifiability must be tackled
from the start, since variable handling proceeds
differently according to how variables are classified.
Variable Selection and Organization
Classifying factors as modifiable or unmodifiable



Unmodifiable factors are only used as potential
confounders or effect modifiers; PAFs are not calculated for them.
Modifiable factors are factors that can possibly be
altered with clear intervention strategies; these are the
factors in the “risk system”.
Being in the pool of modifiable factors not only
influences final PAF estimates, but also may change
choices about level of measurement, reference level,
and handling of confounding and effect modification.
Variable Selection and Organization
Classifying factors as modifiable or unmodifiable
Both broad and narrow definitions of modifiability
may be reasonable, with some researchers
computationally treating all variables as modifiable,
including factors such as race/ethnicity, age, and
poverty; others treat as modifiable only factors that
reflect a much narrower perspective on modifiability
How close should the connection be
between classification as modifiable
and public health interventions?
Level of Measurement and Reference Groups

For unmodifiable factors, decisions about level of
measurement, categorization schemes, and choice
of reference groups can be made with the same
considerations relevant to any epidemiologic
modeling process.

For modifiable factors for which avgPAFs will be
computed, however, level of measurement is
constrained since avgPAFs are discrete
measures anchored to levels, or categories, of
exposure.
Level of Measurement and Reference Groups

Modifiable factors are in effect treated as dichotomous,
comparing “any” to “no” risk. This mirrors the
interpretation of a PAF as the proportion of disease
reduction given complete elimination of exposure.

Defining the reference group as “lower risk” rather than
“no” risk is one way to pull back from reporting an
unrealistic maximum impact. For example:
>= 2 days exercise, rather than >= 5 days exercise
<=1 medical risk factor rather than 0 medical risks
Level of Measurement and Reference Groups
Continuous or ordinal variable:

the “j” relative risks will be exponentiated multiples of a
beta coefficient for each observed value.

the “unexposed”, or reference group, will include only
those with the single lowest value of the variable;
conversely, those with any other value will be considered
“exposed”

With the single lowest value as the reference group, the
proportion of exposed will likely be artificially high (an
ordinal variable with a few levels will typically have a
broader (more inclusive) reference group)
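As a sketch of the first point above, assume a log-linear (or logistic) model with a single coefficient $\beta$ for the continuous variable $x$, and let $x_0$ be its lowest observed value; these symbols are illustrative and do not appear in the slides. The relative risk (or odds ratio) at each observed value $x_j$ would then be

$$RR_j = \exp\{\beta \, (x_j - x_0)\}, \qquad RR_0 = 1,$$

so every observed value above $x_0$ contributes its own term $p_j (RR_j - 1)/RR_j$ to the PAF sum, which is why nearly the whole population ends up counted as "exposed".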
Level of Measurement and
Reference Groups
Dummy variables—
whether a recoded continuous or ordinal variable
or a nominal variable with k categories:


the “j” relative risks are computed from separate
beta coefficients corresponding to the k-1
categories
for the special case of dichotomous variables, a
single relative risk is computed from a single
beta coefficient
Level of Measurement and Reference Groups

With dummy variables, the analyst can choose
between reporting category-specific avgPAFs or an
avgPAF for the summation over all categories. Note
that while it is not possible to obtain a single ratio
measure for a set of dummies, it is possible to obtain
a single avgPAF.

If public health programming differs by exposure
category, then using dummy variables and reporting
dummy-specific avgPAFs may be more appropriate
than summing over all categories.
Level of Measurement and Reference Groups
Dummy variables may also be used to break
up a single construct into modifiable and
unmodifiable parts. Using prenatal care as an
example, dummy variables might be:
1. no prenatal care (modifiable)
2. inadequate prenatal care (modifiable)
3. adequate+ (high end) prenatal care (unmodifiable)
with adequate prenatal care as the reference group
Confounding and Effect Modification
• For each modifiable factor, confounding by all other
modifiable and unmodifiable factors should be
assessed. This is no different than assessment of
confounding in any model when precise estimation
of multiple “exposures” is of interest
• Accounting for effect modification, however, is
related to whether it is present within the pool of
unmodifiable factors, across modifiable and
unmodifiable factors, or within the pool of modifiable
factors
Confounding and Effect Modification
Within unmodifiable factors—use product terms or
ignore the interaction if it does not have an impact on
the measures of association for the modifiable factors
—this may depend on the pattern and strength of the
confounding effect of the interacting unmodifiable
factors on the factors in the risk system
Across modifiable and unmodifiable factors
—this might point to doing modeling stratified by the
unmodifiable factor involved in the interaction; if the
unmodifiable variable is continuous, it would have to
be recoded into categories for stratification
Confounding and Effect Modification
Within modifiable factors—use either product
terms or common reference coding to create a set
of dummy variables
--even insignificant interaction among modifiable
factors might be modeled in order to obtain the
most precise joint and separate effects of the
factors in the risk system
--stratified modeling should not be used, since it
would not be possible to compute an avgPAF for
the modifiable factor defining the strata
Model Building Strategies and Choosing a
Final Model

Unmodifiable factors should remain in a model only if
they are confounders or effect modifiers of the
modifiable factors in the risk system—their
independent association with the outcome is not of
interest

Modifiable factors should remain in a model based on
the size and reliability of the avgPAFs computed for
them; parsimony is not as important when building a
model to obtain avgPAFs, so variables with
insignificant RRs/ORs may be included in a final model
if the PAFs based on them are meaningfully large.
Model Building Issues and Strategies
Possible Model building strategies
1. Build separate models for each modifiable factor plus the
complete set of unmodifiable factors, assessing the results
for the “adjusted” PAF (there is no avgPAF with only one
modifiable factor in the model).
2. Build a combined model including all modifiable factors
selected from the initial models along with all unmodifiable
factors that were confounders or effect modifiers in any of
those models.
3. Drop unmodifiable factors if their status as confounders or
effect modifiers changes in the combined model and drop
modifiable factors if their avgPAF does not meet the
researcher’s criteria for inclusion
Model Building Issues and Strategies
Possible Model building strategies

Alternatively, a first step might be to build
models with subsets of substantively related
modifiable variables, assessing the results for
the avgPAFs for the variables in each subset,
with steps two and three for building a combined
model and revisiting the status of unmodifiable
factors proceeding as previously described.
Interpretation Issues to Consider
PAF should not be mis-interpreted as the percent of
diseased who have the risk factor of interest or the
percent of cases for which an identifiable risk factor can
be found.
Example: the Summary PAF for the impact of 10 factors
on breast cancer=0.25.
Incorrect: While many risk factors have been identified as
causes of breast cancer, 75% of all breast cancer cases
do not have an identifiable risk factor.
Incorrect: Only 25 percent of breast cancer cases can be
attributed to one or more risk factors; 75% of breast
cancers occur in women with no risk factors.
Rockhill, et al., 1998
Interpretation Issues to Consider
Incorrect: The pie should represent 100% of subjects with
PAL, not 100% of all subjects.
Oppermann, et al., 2004
Interpretation Issues to Consider
Rothman: With a PAF of 25%, the following
interpretation is not completely true: 25% of
disease would be reduced if X risk factor were
eliminated.
1) Assumes all biases are absent
2) Assumes that absence of risk factor would
not expand person-years at risk, which could
subsequently lead to more cases (in the case
of competing risks)
Rothman & Greenland, 1998
Interpretation Issues to Consider
Rothman Example 1:
PAF=0.25 for smoking in relation to coronary deaths.
Elimination of smoking could lead to fewer lung cancer deaths,
which would lead to more people living long enough to die from
coronary heart disease. Therefore, “25% fewer coronary deaths
would have occurred had these doctors not smoked” is
somewhat misleading.
Rothman Example 2:
PAF=0.20 for spermicide in relation to Down’s syndrome
Elimination of spermicide use could lead to more pregnancies,
which would lead to more Down’s syndrome cases. Therefore,
“20% fewer Down’s syndrome cases would have occurred had
the couple not used spermicide” is somewhat misleading.
Interpretation Issues to Consider
Causality takes on a prominent role when attributable
risk measures are reported since these measures
claim that a health outcome can be reduced given full
or partial elimination of one or more risk factors.
Differentiating between modifiable and unmodifiable
factors only indirectly addresses causality, but public
health interventions are often appropriately focused
on factors that can be changed in the population
even when strict causal criteria are not met or
causality has yet to be established.
Limitations in available software packages:

computation of only the 1st sequential (adjusted) PAF
with 95% CIs, but not the average PAF (AFLOGIT in
STATA)

computation of avgPAFs but only for certain study
designs, e.g. not case-control

computation of avgPAFs, but only accommodates
dichotomous variables and handles all factors as
modifiable – no adjustment for unmodifiable covariates

Variance estimates for the adjusted and summary
PAF were derived by Benichou and Gail (1990) based
on the delta method, but fully flexible variance
estimation of the avgPAF is still not available
Variance Estimation and Confidence Intervals
Once variance estimates for Average PAFs are fully
incorporated into statistical software, interpretation of
resulting confidence intervals will become important.
• As always, narrower CIs will mean increased
reliability of the point estimate of the avgPAF
• The CIs across multiple avgPAFs will
undoubtedly overlap. What will the overlap imply
about the prioritization process across
modifiable factors?
• Will a CI with a lower bound < 0 mean a factor is
not significant and therefore not a priority?
Review and Final Comments
Methodological advances might include:
− differentially weighting removal sequences prior
to computing Average PAFs in order to reflect
funding streams or political will, since in reality
not all removal sequences are equally likely
− incorporating measures of uptake and efficacy of
the public health interventions aimed at
particular risk factors
Review and Final Comments
Potential Impact Fraction (PIF)*: the estimation of
impact; incorporates the practicalities of interventions
Success Rate = Proportion of affected individuals
whose risk status will change due to participation
in the intervention programs
Relative Efficacy = Extent to which a successful
risk-factor change results in a reduction of risk to
the level of persons never exposed
PIF = avgPAF x Success Rate x Relative Efficacy
*Morgenstern, et al (1982); Bulterys, et al (1997)
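The PIF product is simple enough to sketch directly. The function name is hypothetical; the example inputs (avgPAF 0.090, success rate 0.75, relative efficacy 0.80) are one row of theoretical values from the deck's sports participation example.

```python
def potential_impact_fraction(avg_paf, success_rate, relative_efficacy):
    """PIF = avgPAF x Success Rate x Relative Efficacy."""
    return avg_paf * success_rate * relative_efficacy


pif = potential_impact_fraction(avg_paf=0.090, success_rate=0.75, relative_efficacy=0.80)
print(round(pif, 3))
```

Because the success rate and relative efficacy are both at most 1, the PIF can never exceed the avgPAF it is built from.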
Review and Final Comments
PIFs and Impact Numbers:
Increasing Participation in Sports as a Strategy to Reduce
Overweight among Adolescent Females

Target Group               avgPAF   Success   Relative    Potential Impact   Impact
                                    Rate(a)   Efficacy(b) Fraction(c)        Number(d)
White Females              0.090    0.75      0.80        0.054              76,844
                           0.090    0.20      0.80        0.014              20,492
                           0.090    0.05      0.80        0.004               5,123
                           0.090    0.75      0.40        0.027              38,422
                           0.090    0.20      0.40        0.007              10,246
                           0.090    0.05      0.40        0.002               2,561
African-American Females   0.094    0.75      0.80        0.056              39,371
                           0.094    0.20      0.80        0.015              10,499
                           0.094    0.05      0.80        0.004               2,625
                           0.094    0.75      0.40        0.028              19,685
                           0.094    0.20      0.40        0.008               5,249
                           0.094    0.05      0.40        0.002               1,312

(a) Theoretical values for the proportion of adolescents whose risk status will change due to participation in the intervention program
(b) Theoretical values for the extent to which a successful risk factor change results in a reduction of risk to the level of unexposed adolescents
(c) Estimated proportion of adolescent overweight (calculated by multiplying the avgPAF by the success rate and the relative efficacy)
(d) Estimated number of cases of adolescent overweight in the target group across the U.S. that could be eradicated as a result of the theoretical intervention (calculated by multiplying the impact fraction by the total cases in the target group)
Review and Final Comments

The number of average PAFs equals the number of
variables in a risk system.

Average PAFs, by considering every possible
sequence, yield mutually exclusive estimates,
making comparisons of the potential impact of risk
reduction intervention strategies possible

The average PAF may be a better measure of
impact than the first sequential (“adjusted”) PAF
since typically there are multiple interventions
operating simultaneously—risk reduction activities
are unordered and often intersect
Review and Final Comments
As always, having an explicit conceptual framework /
logic model is important for multivariable analysis
Conceptualization is particularly critical when
producing PAFs because decisions about variable
handling and model building will determine the
computational steps as well as influencing the
substantive interpretation of results.
Review and Final Comments

Average PAFs allow for the sorting of modifiable risk
factors according to the potential impact of risk factor
reduction strategies on an outcome in the population;
Ratio measures only provide the magnitude of the
association between a risk factor and an outcome

Typically, the PAF is the proportion of an outcome that
could be reduced if a risk factor is completely eliminated
in the population – take care not to over-interpret findings

While the interpretation of average PAFs is strengthened
by evidence of causality, an average PAF cannot itself
establish causality