The long road from it-works-somewhere to it-will-work-for-us

Who’s Afraid of External Validity?
Nancy Cartwright
Error/Evidence Project
Error Conference
June 2010
1
Topic: The long road from it-works-somewhere to it-will-work-for-us
2
Context
- Evidence-based medicine (EBM) and evidence-based public health policy and practice (EBPHPP).
- We spend a lot of effort and money buying well-conducted RCTs in the belief they will be relevant as evidence (E) to policy prediction (H).
- The RCT structure can (in the ideal) ensure the RCT conclusion is highly credible (P(E) is high).
- What shows that this matters to H?
- So: When are we entitled to treat an RCT as evidentially relevant to our policy prediction?
3
Topic: RCTs, evidential relevance and policy prediction
- External validity: the conclusion drawn from the study holds in the target population
  - if the two populations are ‘sufficiently similar’, or
  - if the two share certain abstract sets of conditions on causal and probabilistic structure.
- Evidential relevance: the conclusion drawn from the study is evidentially relevant to a policy prediction in the target population
  - under conditions that can be discovered by good searches,
    - horizontally and
    - vertically.
4
Policy prediction: Two kinds of causal claims
1. It-worked-somewhere claims: T caused O somewhere under some conditions (e.g. in study population φ administered by method M).
   - Ideal RCTs can clinch these kinds of claims.
2. It-will-work-for-(at-least-some-of)-us claims: T will cause O in some units of our population administered as it will be administered given policy P.
   - RCTs can be evidentially relevant, conditional on
     - theoretical evidence and
     - empirical evidence
     that horizontal and vertical search results are correct.
NB: I pick a weak policy conclusion to minimize the relevance requirements.
5
Two slogans assumed throughout
- Achinstein: Evidential relevance = explanatory relevance.
  - This works well for evidence for policy predictions.
  - The explanations are the ones that would hold were the prediction true.
- Cartwright: It ain’t evidence without evidence it’s evidence.
  - For evidence-based policy you can’t just say it’s evidence; you have to produce reasons, and good ones.
6
Explanatory relevance
- E is explanatorily relevant to H iff
  - Direct explanation: E explains H, or H explains E. Or
  - Indirect explanatory relevance: Correct explanations for E and H have some feature in common.
- This is an objective relation.
- Subjective: Your entitlement to take E as evidence in favour of H depends on the good reasons you have to believe E is explanatorily relevant to H.
7
Indirect explanatory relevance
[Diagram with the following labelled elements]
- U(E): Unshared elements of explanation in φ
- X(H,E): Shared element of explanations in φ, θ
- U(H): Unshared elements of explanation in θ
- E: RCT conclusion for φ
- H: Policy prediction for θ
8
Flow of evidential support
[Diagram with the following labelled elements]
- U(E): Unshared elements of explanation in φ
- X(H,E): Shared element of explanations in φ, θ
- U(H): Unshared elements of explanation in θ
- E: RCT conclusion for φ
- H: Policy prediction for θ
9
First lesson
- The relevance of E to H flows through the common explanatory element: X(H,E).
- This route is not available unless the unshared element of the explanation for H (U(H)) obtains.
- SO:
  - E is evidentially relevant to H only conditional on the unshared element U(H) obtaining.
  - You are entitled to treat E as relevant evidence for H only to the extent that you have good reasons to support the claim that U(H) obtains.
- Advice on how to find U(H)…
10
Pies, pancakes and unshared elements
- Causes are INUS conditions:
  - Insufficient
  - Nonredundant parts of
  - Unnecessary but
  - Sufficient conditions
  (taking ‘sufficiency’ in Anscombe’s sense of ‘enough’, and contributions, not total effects, as outcomes)
- A pie: a set of causes that are jointly enough for a contribution to the effect.
[Figure: a pie with slices C1, C2, C3, C4, C5 and CN]
11
Smoking – C3 – causes lung cancer.
But not all smokers develop lung cancer. Genetic and
environmental factors contribute. Sufficient Cause A is a ‘pie’ of
slices, including smoking, that together cause lung cancer.
And people develop lung cancer without smoking. Sufficient
Cause B is a ‘pie’ of factors, not including smoking (no C3), that
together cause lung cancer. Working in a coal mine is C8.
[Figure: two pies. Sufficient Cause A has slices C1, C2, C3, C4, C5, CM and CN; Sufficient Cause B has slices C1, C2, C6, C7 and C8.]
12

The treatment variable T is never enough by
itself to promote O – it is only a slice of a pie.

It won’t work without the other slices.

And it won’t work if there are any pies of
sufficient strength operating to prevent O.
13

- So, the least that’s needed to ensure that T produces O is a set that includes:
  - All the helping factors in a pie for T.
  - The negation of at least one factor from every pie in any combination of pies strong enough to prevent O.
- Call such a set N: N is a set of necessary complementary factors sufficient for T to produce O.
- Thus the principle: T & N c→ O (T together with N causes O)
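
The condition on N can be made concrete with a small sketch. The Python below is illustrative only: the factor names (`C1`, `D1`, and so on), the function `treatment_produces_outcome`, and the simplifying assumption that each preventing pie is strong enough on its own to block O are not from the talk.

```python
from typing import FrozenSet, Iterable

Pie = FrozenSet[str]  # a pie: a set of factors that are jointly enough


def treatment_produces_outcome(
    present: FrozenSet[str],
    helping_pie: Pie,
    preventing_pies: Iterable[Pie],
) -> bool:
    """Return True when T's pie is complete and no preventing pie is complete.

    Assumption (not from the talk): every preventing pie can block O by
    itself, so negating one factor from each preventing pie is enough.
    """
    if not helping_pie <= present:
        return False  # a slice of T's own pie is missing
    return not any(pie <= present for pie in preventing_pies)


# Hypothetical factors: T needs helpers C1 and C2; D1 and D2 together prevent O.
HELPING_PIE: Pie = frozenset({"T", "C1", "C2"})
PREVENTING_PIES = [frozenset({"D1", "D2"})]

unit_with_n = frozenset({"T", "C1", "C2", "D1"})         # helpers present, D2 absent
unit_missing_helper = frozenset({"T", "C1", "D1"})       # helper C2 absent
unit_blocked = frozenset({"T", "C1", "C2", "D1", "D2"})  # preventing pie complete

for unit in (unit_with_n, unit_missing_helper, unit_blocked):
    print(sorted(unit), treatment_produces_outcome(unit, HELPING_PIE, PREVENTING_PIES))
```

Only the first unit satisfies N in this toy setup: it has every helping factor and lacks at least one slice of the preventing pie.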
14
Flow of evidential support
[Diagram with the following labelled elements]
- Shared principle: T & N c→ O
- N holds in φ
- RCT conclusion: P_φ(O | T) > P_φ(O | ¬T)
- N holds in a subset of θ
- Policy prediction: T helps some units in θ
15
Conclusion 1
- The unshared explanatory element is N: a set of complementary factors sufficient to ensure T produces O.
- Happily, N is smaller than a full set of confounders.
- But if N does not obtain, the RCT is not evidentially relevant to policy prediction.
- And without good reason to think N holds in your population, you do not have good reason to count the RCT as evidence.
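
A toy calculation may help show why the RCT result alone does not carry over. The sketch below is an assumption-laden illustration, not the talk's own model: each unit is a hypothetical pair of flags (`has_n`, `has_other_cause`), a treated unit shows O if N holds or some other sufficient cause is present, and an untreated unit shows O only through the other cause. The population compositions are made up for the example.

```python
from fractions import Fraction
from typing import List, Tuple

# Each unit is (has_n, has_other_cause); both flags are hypothetical.
Unit = Tuple[bool, bool]


def outcome_rates(units: List[Unit]) -> Tuple[Fraction, Fraction]:
    """Return (P(O|T), P(O|not T)) for a population under the toy model.

    With T, a unit shows O if N holds (T and N together cause O) or if some
    other sufficient cause is present; without T, only the other cause counts.
    """
    total = len(units)
    with_t = sum(1 for has_n, other in units if has_n or other)
    without_t = sum(1 for _, other in units if other)
    return Fraction(with_t, total), Fraction(without_t, total)


def treatment_helps(units: List[Unit]) -> bool:
    """True when the treated rate exceeds the untreated rate."""
    p_treated, p_untreated = outcome_rates(units)
    return p_treated > p_untreated


# Made-up populations: N is common in phi, absent or rare in theta.
phi = [(True, False)] * 6 + [(False, True)] * 2 + [(False, False)] * 2
theta_without_n = [(False, True)] * 2 + [(False, False)] * 8
theta_with_some_n = [(True, False)] * 1 + [(False, True)] * 2 + [(False, False)] * 7

for name, population in [
    ("phi", phi),
    ("theta without N", theta_without_n),
    ("theta with some N", theta_with_some_n),
]:
    print(name, outcome_rates(population), treatment_helps(population))
```

In this sketch the treatment raises the outcome rate in φ, does nothing in a θ where N never holds, and helps only the units in θ that have N. Knowing the φ result by itself does not tell you which θ you are in.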
16
Pancakes: baking powder, RCTs and evidential relevance
Given flour, milk and eggs, baking powder can turn the whole mix into pancakes.
[Figure: pictorial equation using + and =]
17
But you can’t make pancakes without flour, eggs and milk no matter how much baking powder you pour in the bowl.
[Figure: pictorial equation using + and ≠]
RCTs can’t make evidence without support for the necessary complementary factors.
18
Horizontal v vertical search
- Horizontal searches look for complementary factors.
- Vertical searches survey levels of abstraction, usually to find shared explanatory principles,
  - since explanations involving features of various levels of abstraction can all be true at once.
- Ex: straight lines, great circles and Euclidean lines.
  - On a sphere an object subject to no forces traverses a great circle; on a sphere an object subject to no forces traverses a straight line.
  - On a sphere traversing a great circle is what it is to traverse a straight line.
19
What counts as a
straight line?
20
Cautions
- Citing the wrong level of factor in an RCT conclusion can limit the scope of its evidential relevance.
- Worse:
  - You may correctly surmise that populations φ and θ share a common causal structure (the same causal principles),
  - but incorrectly identify the level of abstraction of the features in those principles.
  - Then you make bad predictions from taking results in φ as evidence for predictions in θ.
21
EX: The Bangladesh Integrated Nutrition Program (BINP)
- BINP provided pregnant women with nutritional counseling.
- This had ‘worked’ elsewhere.
- But analysis by the World Bank’s Operations Evaluation Department found no significant impact on infants’ nutritional status in the Bangladesh study population.
- This despite reasonable horizontal search: knowledge alone is not enough; resources are needed too. Hence a supplemental feeding programme.
22
Failure of evidential relevance
- Supposition: Explanations for the results elsewhere and for Bangladesh success, had it occurred, would share a common principle:
  Better nutritional knowledge in mothers plus supplemental feeding improves the nutritional status of their children.
- But the two populations did not share this principle.
23
Failure of vertical search
- Two problems:
  - Food leakage
  - Men and mothers-in-law.
- Howard White:
  The program targeted the mothers of young children. But mothers are frequently not the decision makers, and rarely the sole decision makers, with respect to the health and nutrition of their children. For a start, women do not go to market in rural Bangladesh; it is men who do the shopping. And for women in joint households – meaning they live with their mother-in-law – as a sizeable minority do, then the mother-in-law heads the women’s domain. Indeed, project participation rates are significantly lower for women living with their mother-in-law in more conservative parts of the country.
24
Genuinely shared principle
Better nutritional knowledge results in better
nutrition for a child in those who
1. have sufficient resources to use that
knowledge to improve the child’s nutrition,
2. control what food is procured with those
resources,
3. control how food gets dispensed, and
4. hold the child’s interests as central in
performing 2. and 3.
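
The four conditions can be read as a conjunction over abstract roles rather than over the concrete label ‘mother’. The sketch below is a hedged illustration: the `Recipient` class, its field names and the two example recipients are hypothetical and are not drawn from BINP data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipient:
    """Person who receives nutritional counselling (hypothetical fields)."""
    has_sufficient_resources: bool   # condition 1
    controls_food_procurement: bool  # condition 2
    controls_food_dispensing: bool   # condition 3
    child_interests_central: bool    # condition 4


def knowledge_improves_nutrition(recipient: Recipient) -> bool:
    """The shared principle applies only when all four conditions hold."""
    return (
        recipient.has_sufficient_resources
        and recipient.controls_food_procurement
        and recipient.controls_food_dispensing
        and recipient.child_interests_central
    )


# Illustrative recipients only: both are mothers, but they fill different roles.
mother_who_decides = Recipient(True, True, True, True)
mother_in_joint_household = Recipient(True, False, False, True)

print(knowledge_improves_nutrition(mother_who_decides))
print(knowledge_improves_nutrition(mother_in_joint_household))
```

The point of writing the check over roles is that ‘being a mother’ appears nowhere in it: whether counselling a mother helps depends on whether she occupies the roles named in conditions 1 to 4.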
25

- In the Bangladesh programme
  - Supplementary food did not generally constitute sufficient resources.
  - The feature ‘being a mother’ did not in general constitute the more abstract features in 2. and 3.
- There is a shared principle at a higher level of abstraction but the horizontal complementary factors required by that principle are lacking.
- Earlier study results are not evidentially relevant without these requisite complements.
- Evidential relevance requires
  - correct horizontal factors,
  - correct vertical factors and
  - a correct match between the two.
26
Flow of evidential support
[Diagram with the following labelled elements]
- Shared principle: T & N c→ O
- N holds in φ
- RCT conclusion: P_φ(O | T) > P_φ(O | ¬T)
- N holds in a subset of θ
- Policy prediction: T helps some units in θ
27
Epistemic demands of EBPHPP
- Cartwright dictum: E isn’t evidence for you unless you have evidence that it’s evidence.
- To stop the regress:
  - Cochrane and Campbell reviews are reasonable stopping points for policy analysts in judging quality/credibility of E (P(E) is high).
- But who polices the evidential relevance of E?
28
Whence the evidence for relevance?



You can’t argue you’ve got the right vertical
level and the right horizontal factors without
invoking lots of theory and lots of other
empirical results.
You may not need these to argue for quality,
but you can’t avoid them in defending
relevance.
And without relevance what’s the point?
29
What I have done
1. Reminded you that quality/credibility is not by a long shot all that matters about evidence.
2. Shifted the focus from external validity to evidential relevance.
3. Equated evidential and explanatory relevance.
4. Insisted that it’s not evidence without evidence it’s evidence.
5. Used INUS pies to characterize a set of minimal conditions that must be satisfied to turn RCT results into evidence.
6. Stressed the importance of the abstract and the concrete.
This results in….
30
Bad news
- To be justified in taking a result as evidence requires a great deal of theory, local knowledge and empirical results.
- These go far beyond the trial methodology that we have mastered so well.
- Without these, taking trial results as evidence is pure guesswork.
31
Good news
- The focus on evidential and explanatory relevance provides a structure within which to organize what’s needed:
  - The theory of causes as INUS conditions provides an abstract characterization of what you need from a horizontal search.
  - And warnings about the relations of the abstract and the concrete show the importance of vertical search.
32
Good news
You don’t have to
know everything.
You have a catalogue
for search.
33