Descriptive versus hypothesis-driven science
Evaluating science
• The value of a scientific study can in principle be assessed with respect to (a) the quality of the science; (b) the value of the scientific question being investigated.
[Figure: value of scientific question (vertical axis, "trivial, dangerous" to "important") against quality of scientific study (horizontal axis, "awful" to "superb"). Region labels: "Where most science should be, but isn’t" and "Where too much science is."]
University of Ottawa - BIO 4000
© Scott Findlay
24/09/2009 12:00 PM
Descriptive versus hypothesis-driven science
• Descriptive science is concerned with the characterization and/or quantification of patterns in nature.
• The main issues: (a) what are the important observed patterns? (b) are they more imagined than real?
• Hypothesis-driven science is concerned with testing (scientific) hypotheses advanced to explain the (real, one assumes) patterns uncovered by descriptive science.
• The main issue: how likely is it that the hypothesis in question is true?
University of Ottawa - BIO 5901
Descriptive versus hypothesis-driven science
• Descriptive science provides the grist for the hypothesis-driven science mill …
• … while hypothesis-driven science often in turn provides directions as to where it might be productive to look for (more) patterns.
• Both types of science are necessary, neither is sufficient …
• … but it is CRITICAL that they be distinguished!
[Figure: descriptive science and hypothesis-driven science, each feeding the other.]
What is hypothesis-driven science?
• The accumulation of knowledge about the world through the testing of causal theories (explanations).
• A causal theory is a statement about the cause(s) of observed phenomena (“events”, “effects”).
• Science attempts to infer causal relationships (“If A, then B”) by application of the scientific method.
[Figure: Cause and (observed) Effect, linked by arrows labelled Causality and Inference.]
What makes a “scientific” hypothesis?
• According to Sir Karl Popper, all scientific hypotheses must be refutable, at least in principle.
• A refutable hypothesis is one for which, at least in principle, there are empirical observations which could be inconsistent with the hypothesis.
• So testability = refutability (falsifiability)
Science à la Popper
• Science proceeds by eliminating potential hypotheses for what we want to explain.
• “When you have eliminated the impossible, Watson, whatever remains – however improbable – is the truth.” Sherlock Holmes, The Sign of Four
[Figure: hypotheses and what we want to explain.]
The logic of scientific inquiry: deduction
• Deduction: if the axioms (premises) are true, the conclusion is necessarily true (reasoning from general to particular).

All swans are white.
This bird is a swan.
∴ This bird is white.

If (this is a swan) then (it is white).
This bird is a swan.
∴ This bird is white.
The logic of scientific inquiry: induction
• Induction: even if all the axioms are true, the conclusion is not necessarily true (reasoning from the particular to the general).

This bird is a swan & it is white …
… and this bird is a swan & it is white …
∴ All swans are white.
The scientific method
[Figure: cycle of the scientific method linking Hypothesis, Predictions, Experiment, Observations and Conclusions. The step from Hypothesis to Predictions is labelled Deduction; the remaining labelled steps are Induction.]
The logic of hypotheses and predictions
• The logic of Popper’s view of science can be represented by a standard deductive syllogism.
• The implication is that one cannot prove an hypothesis, only support (corroborate) it. To do otherwise would be to commit the logical fallacy of affirming the consequent.

If H then P
P
∴ H
(Fallacy of affirming the consequent)

If H then P
-P
∴ -H
(No logical fallacy!)
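The difference between the two syllogisms can be checked mechanically with a truth table. The sketch below is only an illustration: the helper names (`implies`, `is_valid`) are made up for this example and are not part of the lecture. An argument form is treated as valid when the conclusion is true in every row where all premises are true.

```python
from itertools import product


def implies(a, b):
    """Material implication: 'if a then b'."""
    return (not a) or b


def is_valid(premises, conclusion):
    """Return True if the conclusion holds in every truth assignment
    of (H, P) for which all premises hold."""
    for h, p in product([True, False], repeat=2):
        if all(premise(h, p) for premise in premises) and not conclusion(h, p):
            return False  # found a counterexample row
    return True


# If H then P; P; therefore H  (affirming the consequent)
affirming_the_consequent_valid = is_valid(
    [lambda h, p: implies(h, p), lambda h, p: p],
    lambda h, p: h,
)

# If H then P; not P; therefore not H  (refutation)
refutation_valid = is_valid(
    [lambda h, p: implies(h, p), lambda h, p: not p],
    lambda h, p: not h,
)

print("affirming the consequent valid:", affirming_the_consequent_valid)
print("refutation valid:", refutation_valid)
```

The counterexample row for the first form is H false with P true: the prediction is observed even though the hypothesis is false.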
Hypotheses and predictions: why the bathroom light doesn’t work
What we want to explain: light switch on, but no light
Hypotheses:
• Power off to house
• Bulb burnt out
• Short in circuit
Hypotheses and predictions
• Causal hypothesis: a statement about the cause(s) of some observed event/pattern.
• Prediction: the empirical result/pattern one will see in a particular experiment if the hypothesis is true.
• Inference: the conclusion (H is supported or refuted) is based on deductive inference.

If the light bulb is burnt out (H) … then the light will work if the old bulb is replaced (P).

If H then P
-P
∴ -H
Hypotheses, experiments and predictions
Hypothesis | Experiment | Prediction
Light bulb burnt out | Replace bulb with new bulb | Light will work
Power off to house | Try other electric switches | No other switch will work
Short in circuit | Replace bulb with new bulb | New bulb will blow &/or breaker will trip
Two different hypotheses, same experiment, same prediction
Hypothesis | Experiment | Prediction
Power off to house | Try other lights/outlets in bathroom and bedroom | No other lights/outlets work
Short circuit & thrown breaker | Try other lights/outlets in bathroom and bedroom | No other lights/outlets work
Two different hypotheses, same experiment, different prediction
Hypothesis | Experiment | Prediction
Power off to house | Try outlets on different breaker from bathroom | No other outlets will work
Short circuit & thrown breaker | Try outlets on different breaker from bathroom | Other outlets will work
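The contrast between these two experiments can be written down as a small lookup: an experiment helps to separate two hypotheses only if they predict different outcomes for it. The sketch below is illustrative only; the dictionary layout and the function name `discriminates` are assumptions made for this example.

```python
# Predicted outcome of each experiment under each hypothesis,
# transcribed from the two bathroom-light tables.
PREDICTIONS = {
    "try other lights/outlets in bathroom and bedroom": {
        "power off to house": "no other lights/outlets work",
        "short circuit & thrown breaker": "no other lights/outlets work",
    },
    "try outlets on different breaker from bathroom": {
        "power off to house": "no other outlets work",
        "short circuit & thrown breaker": "other outlets work",
    },
}


def discriminates(experiment, h1, h2):
    """True if the experiment yields different predictions under h1 and h2."""
    predicted = PREDICTIONS[experiment]
    return predicted[h1] != predicted[h2]


for experiment in PREDICTIONS:
    verdict = discriminates(
        experiment, "power off to house", "short circuit & thrown breaker"
    )
    print(f"{experiment}: discriminates = {verdict}")
```

Only the second experiment can refute one hypothesis while leaving the other standing.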
Beware ancillary assumptions!
• In very many cases, there are unstated and unvalidated assumptions that must be true in order for the deducibility condition to be satisfied.
• Published science is FULL of this sort of logical misstep – be on your guard!

If the light bulb is burnt out (H) … then the light will work if the old bulb is replaced (P).
Ancillary assumption: new bulb actually works!
Good practice: reconstruct the experimental logic
• Hypothesis: the bulb is burnt out
• Experiment: replace bulb with new bulb
• Prediction: there should be light!
• Ancillary assumption: new bulb itself works
• Validate assumption: try new bulb first in another light that is known to be working.
• Procedure: (1) reconstruct the logic of the experiment; (2) identify necessary ancillary assumptions (AAs); (3) how many AAs are in fact known to be true in the experiment in question?
Necessary and sufficient causal hypotheses
Causal hypothesis | Prediction(s) | Refutation
“A is a necessary, but not sufficient, cause of B” | For all observed B events, A is always present. | Event B occurs, but A is not present.
“A is a sufficient, but not necessary, cause of B” | Whenever A is present, event B occurs. | Presence of A does not result in event B.
“A is both a necessary and sufficient cause of B” | For all observed B events, A is always present, AND whenever A is present, event B always occurs. | Event B occurs, but A is not present; OR presence of A does not result in event B.
“A is a contributing cause of B” | When A is present, B is more likely to occur than when it is not; OR when B is observed, A is more likely to be present. | When A is present, event B is no more likely to occur than when it is absent; AND when event B is observed, A is no more likely to be present than when B is not observed.
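The refutation conditions for the first three rows of the table reduce to looking for a single counterexample among the observed cases. A minimal sketch follows, assuming each observation is recorded as a pair (A present, B occurred); the function names and the example data are hypothetical.

```python
def refutes_necessary(observations):
    """'A is necessary for B' is refuted if B ever occurs without A."""
    return any(b and not a for a, b in observations)


def refutes_sufficient(observations):
    """'A is sufficient for B' is refuted if A is ever present without B."""
    return any(a and not b for a, b in observations)


def refutes_necessary_and_sufficient(observations):
    """Refuted if either of the two counterexamples above is found."""
    return refutes_necessary(observations) or refutes_sufficient(observations)


# Hypothetical observations: (A present?, B occurred?)
observations = [(True, True), (True, False), (False, False)]

print("necessary refuted:", refutes_necessary(observations))
print("sufficient refuted:", refutes_sufficient(observations))
print("necessary and sufficient refuted:",
      refutes_necessary_and_sufficient(observations))
```

In this made-up data set the case with A present but no B refutes sufficiency, while nothing observed contradicts necessity. The contributing-cause row is probabilistic and would need a statistical comparison rather than a single counterexample.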
Why it is important
Causal hypothesis | Prediction(s) | Good design
“A is a necessary, but not sufficient, cause of B” | For all observed B events, A is always present. | Large number of B events required.
“A is a sufficient, but not necessary, cause of B” | Whenever A is present, event B always occurs. | Large number of A events required.
“A is both a necessary and sufficient cause of B” | For all observed B events, A is always present, AND whenever A is present, event B always occurs. | Large number of BOTH A and B events required.
• Good experimental designs depend on the type of hypothesis, as do the inferences drawn from the results.
Hypothesis testability
• An hypothesis H is testable in an experimental design E iff (a) it has at least one prediction P that is derived deductively from H; and (b) there is at least one possible experimental outcome that is inconsistent with P.
• So, testability depends both on the hypothesis H and the experiment E.
• N.B. Testability is a minimal condition – what really matters is how good the test is (more on this later!)
The bathroom light: testability
Causal hypothesis | Experiment | Testable?
Light bulb burnt out | Try new (working) bulb | Yes
Power off to house | Try new (working) bulb | Yes
Short circuit & bathroom breaker blown | Try new (working) bulb | Yes
The bathroom light, testability redux
Causal hypothesis | Experiment | Testable?
Light bulb burnt out | Try razor outlet in bathroom | No
Power off to house | Try razor outlet in bathroom | Yes
Short circuit & bathroom breaker blown | Try razor outlet in bathroom | Yes
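One way to make the testability condition concrete is to list, for a given experiment, which possible outcomes are consistent with the hypothesis. Under that representation (an assumption of this sketch, not a definition from the lecture), H is testable in E when at least one possible outcome falls outside the consistent set. The sketch also assumes the razor outlet is on the bathroom breaker.

```python
def is_testable(consistent_outcomes, possible_outcomes):
    """H is testable in E if some possible outcome of E is inconsistent
    with what H predicts, i.e. lies outside the consistent set."""
    consistent = set(consistent_outcomes)
    return any(outcome not in consistent for outcome in possible_outcomes)


# Experiment: try the razor outlet in the bathroom.
POSSIBLE = ["razor outlet works", "razor outlet dead"]

# Outcomes consistent with each hypothesis (illustrative encoding).
CONSISTENT = {
    # A burnt-out bulb says nothing about the outlet: both outcomes fit.
    "light bulb burnt out": ["razor outlet works", "razor outlet dead"],
    "power off to house": ["razor outlet dead"],
    # Assumes the razor outlet shares the bathroom breaker.
    "short circuit & bathroom breaker blown": ["razor outlet dead"],
}

testable = {h: is_testable(c, POSSIBLE) for h, c in CONSISTENT.items()}
for hypothesis, verdict in testable.items():
    print(f"{hypothesis}: testable = {verdict}")
```

The result matches the table: the burnt-out-bulb hypothesis makes no prediction about the razor outlet, so this experiment cannot refute it.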
Competing versus non-competing hypotheses
• Two hypotheses H1 and H2 are competing if both cannot be true, such that experimental evidence supporting H1 is evidence refuting H2, and vice versa, in experiments where only one is testable.
• In general, (truly) competing causal hypotheses are rare in science, because very rarely can an experiment designed to test a single hypothesis exclude the possibility of multiple causality.
Competing versus non-competing hypotheses
Causal hypothesis | Experiment | Competing?
Light bulb burnt out | Check multiple non-bathroom outlets | No
Power off to house | Check multiple non-bathroom outlets | No
Short circuit & bathroom breaker blown | Check multiple non-bathroom outlets | No
Inferential strength, Ψ
• All hypothesis-driven (HD) studies should conclude that either (a) the hypothesis H is supported (results are consistent with predictions); or (b) H is not supported (results are inconsistent with predictions).
• Inductive inference means inferring from (a) or (b) that H is indeed true or false.
• An experiment E having large Ψ is one for which result (a) means that H is very likely to be true; and for which result (b) means that H is very likely to be false.
[Figure: two panels showing the probability that the hypothesis is true (0 to 1) for each possible result. Experiment 1 (results R11 to R14): low Ψ. Experiment 2 (results R21 to R24): high Ψ.]
A priori (Ψ) and a posteriori (Ψ∗) inferential strength
• The a priori inferential strength Ψ of an experiment is the maximum Ψ that is achievable irrespective of the experimental results.
• This sets the upper limit on the a posteriori inferential strength Ψ∗, i.e. that associated with a given experimental result.
[Figure: a priori inferential strength plotted against a posteriori inferential strength.]
What determines the a priori inferential strength of an experiment?
• Type of experiment (manipulative versus correlational (observational))
• Number of independent predictions per hypothesis
• Number of hypotheses tested (in the same experiment)
• Adequacy of controls
• Accuracy and precision (validity) of experimental methods
• Extent of extrapolation from experimental (model) system to system of real interest
• Sample size
Factors affecting a priori inferential strength: study type
• All other things being equal, manipulative studies (where hypothesized causal factors are experimentally manipulated) have greater inferential strength than correlational studies because in the former, other potential causal factors are (presumably!) controlled.
[Figure: a priori inferential strength of correlational versus manipulative studies.]
Inferential strength and study type: manipulative versus correlational
[Figure: primary production plotted against phosphorus concentration for a manipulative study and a correlational study.]
Effects of exotic wetland plants on native wetland biodiversity
[Figure, two panels. 1 m2 scale: average native richness for Control, E1 and E3. Wetland scale: Log10 native richness against Log10 exotic abundance.]
Factors affecting a priori inferential strength: number of independent predictions
• Each independent prediction serves as a “filter” for different candidate hypotheses, so the more predictions (P1, P2, …), the more the set of candidate hypotheses is winnowed down.
[Figure: candidate hypotheses passed through successive filters P1 and P2; eliminated hypotheses are marked X.]
Factors affecting a priori inferential strength: number of independent predictions
• Multiple predictions provide more independent opportunities for rejection, making for a stronger test.

(R = refuted, S = supported)

If H then (P1 and P2)
Outcome: -P1 | P1
Conclusion: ∴ -H (R) | -(-H) (S)

If H then (P1 and P2)
Outcome: -P1 & P2 | P1 & -P2 | -P1 & -P2 | P1 & P2
Conclusion: ∴ -H (R) | -H (R) | -H (R) | -(-H) (S)
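The counting behind this slide can be enumerated directly: if H implies every one of n predictions, any outcome in which at least one prediction fails refutes H. The sketch below is illustrative only; the function names are made up, and it treats each prediction as simply observed (True) or not observed (False).

```python
from itertools import product


def classify_outcomes(n_predictions):
    """Map every combination of prediction results to 'S' (H supported)
    or 'R' (H refuted), given that H implies all of the predictions."""
    verdicts = {}
    for outcome in product([True, False], repeat=n_predictions):
        verdicts[outcome] = "S" if all(outcome) else "R"
    return verdicts


def count_refuting(n_predictions):
    """Number of distinct outcomes that refute H."""
    verdicts = classify_outcomes(n_predictions)
    return sum(1 for verdict in verdicts.values() if verdict == "R")


for n in (1, 2, 3):
    total = 2 ** n
    print(f"{n} prediction(s): {count_refuting(n)} of {total} outcomes refute H")
```

With one prediction, one of two outcomes refutes H; with two predictions, three of four do, which is the sense in which the test becomes stronger.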
Factors affecting a priori inferential strength: number of alternate hypotheses
• Because for any observed pattern there are many alternate explanations …
• … the more hypotheses tested in a single experiment, the greater the inferential strength.
[Figure: candidate hypotheses filtered by predictions P1 and P2 in experiments E1 and E2; eliminated hypotheses are marked X.]
Factors affecting a priori inferential strength: adequacy of controls
• To unambiguously attribute observed effects to manipulated or correlated factors, one must ensure that the effects of other factors are adequately controlled.
• In manipulative studies, a good control is a “treatment” (or set thereof) that allows one to isolate the effect of only the putative causal factor(s).
• In correlational studies, good control means also measuring variables that might be expected to have some influence on the outcome of interest, in addition to those for which one has hypotheses (and predictions).
Elements of good control
• Randomization of sample units (subjects, plots, etc.) to treatments (not always possible).
• Design allows for the evaluation of the potential effects of different elements of the experimental methodology on the outcome of interest.
Example: effects of an oncolytic virus on tumour development in mice
• Biological question: does in vitro selection for tumour host specificity increase tumour-killing effectiveness in vivo?
• Procedure: evolve two different (A, B) viral quasispecies in vitro using tumour host cells to select for high tumour specificity. Assess change in specificity over generations.
• Introduce evolved virus into spontaneous tumour mouse model, assess tumour growth.
• Design question: what are the appropriate controls?
Possible design: is it adequately controlled?
• Treatment 1: 10 mice inoculated with evolved viral species A
• Treatment 2: 10 mice inoculated with evolved viral species B
• Control: 10 untreated mice
Good design, adequately controlled
• Treatment 1: 10 mice inoculated with evolved viral species A
• Treatment 2: 10 mice inoculated with evolved viral species B
• Control 1: 10 untreated mice
• Control 2: 10 mice with sham inoculations on same schedule as T1 and T2
• Control 3: 10 mice treated with ancestral (unevolved) virus A
• Control 4: 10 mice treated with ancestral (unevolved) virus B
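Randomization of sample units to treatments, listed earlier as an element of good control, could be carried out for this six-group design roughly as follows. This is a sketch under stated assumptions: mice are identified by the numbers 1 to 60, the group labels are shorthand invented here, and the seed is arbitrary and fixed only so the allocation can be reproduced.

```python
import random


def assign_to_groups(subject_ids, group_names, seed=None):
    """Randomly allocate subjects to equally sized groups."""
    rng = random.Random(seed)      # local generator, reproducible with a seed
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    size, remainder = divmod(len(shuffled), len(group_names))
    if remainder != 0:
        raise ValueError("subjects cannot be split into equal groups")
    return {
        name: sorted(shuffled[i * size:(i + 1) * size])
        for i, name in enumerate(group_names)
    }


GROUPS = [
    "T1 evolved A", "T2 evolved B",
    "C1 untreated", "C2 sham inoculation",
    "C3 ancestral A", "C4 ancestral B",
]

# Hypothetical mouse IDs 1..60, ten mice per group.
assignment = assign_to_groups(range(1, 61), GROUPS, seed=2009)

for group, mice in assignment.items():
    print(group, mice)
```

Shuffling the whole list once and then slicing it guarantees that every mouse lands in exactly one group and that all groups have the same size.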
Example: primary production (PP) in lakes and phosphorus [P]
• Results consistent with prediction that [P] limits PP, but there is no randomization, so the observed relationship may arise from the fact that, e.g., lakes with more phosphorus also have more nitrogen (so other factors have not been randomized).
• So, measurement of nitrogen provides a type of control.
[Figure: primary production plotted against phosphorus, with nitrogen concentration also shown.]
Predictions as models
• All science (descriptive or hypothesis-driven) involves determining the probability φ that some model M [which represents a possible pattern (descriptive science) or a prediction (hypothesis-driven science)] is in fact true.
• Hypothesis testing then requires estimating how well the observed pattern matches the predicted pattern M.
[Figure: H: light bulb burnt out. Probability of light (0 to 1) against replacement bulb (not working, working), showing the model M together with observations that fit it well and observations that fit it poorly.]
Models: dependent and independent variables
• A model can be represented mathematically as an equation, so alternate models give different equations.
• To convert an hypothesis to a model, the (putative) effect Y is on the left hand side (the dependent variable), and the (putative) cause(s) X are on the right hand side (the independent variables).
• All statistical analysis (even a t-test!) involves fitting a specific mathematical model.
[Figure: observations of Y against X with two alternate models, Y = f(X) and Y = g(X).]
Alternate hypotheses, alternate models
• H1: increases in X cause a proportional increase in Y (model f)
• H2: increases in X result in a decelerating increase in Y, indicating saturation (model g)
• Question: how well do the observations fit the two hypotheses?
[Figure: observations of Y against X with two alternate models, Y = f(X) and Y = g(X).]
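The question of how well the observations fit each hypothesis can be made concrete by fitting both models and comparing their residual sums of squares. Everything specific in the sketch below is an assumption made for illustration: the data are invented, model f is taken to be Y = aX, model g is taken to be the saturating form Y = aX/(K + X) with K fixed at 2, and each model has a single scale parameter a estimated by least squares.

```python
def fit_scale(basis, ys):
    """Least-squares estimate of a in the model y = a * basis."""
    return sum(b * y for b, y in zip(basis, ys)) / sum(b * b for b in basis)


def residual_sum_of_squares(predicted, ys):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - p) ** 2 for p, y in zip(predicted, ys))


# Hypothetical observations (not from the lecture).
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [3.4, 4.9, 6.8, 7.9, 9.0]

K = 2.0  # assumed half-saturation constant for model g

f_basis = xs                          # model f: Y = a * X (proportional)
g_basis = [x / (K + x) for x in xs]   # model g: Y = a * X / (K + X) (saturating)

a_f = fit_scale(f_basis, ys)
a_g = fit_scale(g_basis, ys)

rss_f = residual_sum_of_squares([a_f * b for b in f_basis], ys)
rss_g = residual_sum_of_squares([a_g * b for b in g_basis], ys)

print(f"model f: a = {a_f:.3f}, RSS = {rss_f:.3f}")
print(f"model g: a = {a_g:.3f}, RSS = {rss_g:.3f}")
```

For these made-up data the saturating model leaves much smaller residuals, so the observations would favour H2 over H1. A real analysis would also estimate K and account for the number of fitted parameters.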
Model fit, data accuracy and inferential strength
• The lower the measurement accuracy (greater bias), the lower the extent to which “observed” model fit reflects the true model fit …
• … so that inferences about which model fits better will be invalid.
• E.g. a method that overestimates at larger values of Y …
• … leading to incorrect inferences concerning true model fit (f better fit than g).
[Figure: models Y = f(X) and Y = g(X) against X, with high-accuracy observations and with biased (low-accuracy) observations.]
Model fit, data precision and inferential strength
• As measurement precision declines, the differences in observed fit among different models decline.
• Because one is unsure which is the better/best model, inferences about which hypotheses are supported/refuted are weaker.
[Figure: models Y = f(X) and Y = g(X) against X, with high-precision observations and with low-precision observations.]
Factors affecting a priori inferential strength: extrapolation
• Experiments are almost always conducted on “model” systems.
• Thus, drawing inferences from results of experiments with model systems requires that we assume that the causal structure of the model system is similar to that of the system we are really interested in.
• The greater the degree of extrapolation, the less likely this is to be true, and the lower the inferential strength.
[Figure: model system and system of interest located on axes of spatial scale and temporal scale, with the extrapolation between them.]
Common types of extrapolation
• Interspecies (very common in biomedical studies – are rats really people?)
• From experimental indicators (that which we measure or estimate) to system properties of real interest (e.g. from expression levels to protein levels, from species richness to “biodiversity”, etc.)
• Spatial and temporal scales
• In vitro to in vivo
Factors affecting a priori inferential strength: sample size
• The larger the sample size, the smaller the uncertainty associated with estimates based on the sample …
• … and thus, the greater the ability to detect differences in fit among competing models …
• … leading to stronger inferences about degree of support or refutation.
[Figure: fits of Y = f(X) and Y = g(X) to observations with large N and with small N.]
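For the simplest estimate, a sample mean, the link between sample size and uncertainty is the familiar standard error, the sample standard deviation divided by the square root of N. The sketch below uses that standard formula as an illustration; the lecture does not name a specific uncertainty measure, and the numbers are made up.

```python
import math


def standard_error(sample_sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return sample_sd / math.sqrt(n)


# Hypothetical standard deviation of 2.0 measured at several sample sizes.
for n in (25, 100, 400):
    print(f"N = {n:3d}: standard error = {standard_error(2.0, n):.3f}")
```

Quadrupling the sample size halves the standard error, so gains in certainty come more and more slowly as N grows.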
Conclusions: are the conclusions supported by the data?
• Are observed results consistent or inconsistent with bona fide predictions?
• Does the study have strong a posteriori inferential strength?