Decision making as a model
2. Statistics and decision making
Bayesian statistics:
p(H|D) ∝ p(H)∙p(D|H)
If H refers to possible values of θ:
pdf(θ|D) ∝ pdf(θ)∙L(θ|D)
NB: L: Likelihood function!
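A minimal sketch of the proportionality above, on a grid of θ values. The binomial data (7 successes in 10 trials), the flat prior and the function name grid_posterior are illustrative assumptions, not part of the lecture.

```python
from math import comb

def grid_posterior(prior, likelihood):
    """Multiply prior by likelihood pointwise, then normalise to sum to 1."""
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

# Hypothetical data D: 7 successes in 10 trials; theta is the success probability.
thetas = [i / 10 for i in range(11)]
prior = [1 / len(thetas)] * len(thetas)                         # flat prior (assumption)
likelihood = [comb(10, 7) * t**7 * (1 - t)**3 for t in thetas]  # L(theta|D)
posterior = grid_posterior(prior, likelihood)

# With a flat prior the posterior peaks where the likelihood peaks.
best = thetas[posterior.index(max(posterior))]
```

Replacing the flat prior with an informative one shifts the posterior without touching the likelihood, which is the point of the formula.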
From about 1925 the Bayesian approach in inductive statistics
was marginalised (it is now making a comeback)
In “classical” statistics the frequentist interpretation
of probability is preferred
Hypotheses are TRUE or FALSE
(we don’t know which for certain
- not a matter of probability)
and are accepted or rejected based on
data D and likelihood p(D|H)
e.g. test of significance (Fisher):
Null hypothesis H0 about some population parameter
compute pdf(S|H0) for statistic S (for sample of n)
do experiment (gives Sx and p)
If p is small, reject H0,
you could accept some alternative
[Figure: probability density pdf(S|H0) against statistic S, with observed value Sx and tail area p]
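A small sketch of the significance test, assuming the statistic S is the mean of n draws from a normal population with known standard deviation and that the test is one-sided. All numbers and the name one_sided_p are hypothetical.

```python
from statistics import NormalDist

def one_sided_p(sample_mean, mu0, sigma, n):
    """p = P(S >= Sx | H0), with S the mean of n draws from Normal(mu0, sigma)."""
    null_dist = NormalDist(mu0, sigma / n ** 0.5)  # pdf(S|H0)
    return 1 - null_dist.cdf(sample_mean)

# Hypothetical experiment: H0 says mu = 100 (sigma = 15), observed mean of 25 cases is 106.
p = one_sided_p(sample_mean=106, mu0=100, sigma=15, n=25)
reject_h0 = p < 0.05  # "small" is a convention chosen here, not given by the data
```

Here Sx lies two standard errors above the H0 mean, so p is about 0.023.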
Neyman & Pearson
[Figure: pdf(S|H0) and pdf(S|H1) against statistic S]
Specify H0, H1 and their pdf’s.
Decide on a criterion based on
β = p(type II error)
and α = p(type I error)
do experiment, compute Sx and choose between H0 and H1
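A sketch of how α and β follow from a criterion, assuming both pdf’s are normal and H1 lies above H0. The distributions and function names are illustrative assumptions.

```python
from statistics import NormalDist

def error_rates(h0, h1, criterion):
    """Choose H1 when S > criterion; return (alpha, beta)."""
    alpha = 1 - h0.cdf(criterion)  # type I error: choose H1 although H0 is true
    beta = h1.cdf(criterion)       # type II error: choose H0 although H1 is true
    return alpha, beta

def criterion_for_alpha(h0, alpha):
    """Criterion on S that gives exactly the requested type I error rate."""
    return h0.inv_cdf(1 - alpha)

# Hypothetical pdf's: S ~ Normal(0, 1) under H0 and Normal(2, 1) under H1.
h0, h1 = NormalDist(0, 1), NormalDist(2, 1)
crit = criterion_for_alpha(h0, 0.05)
alpha, beta = error_rates(h0, h1, crit)
```

Moving the criterion trades one error for the other: lowering α raises β as long as the two pdf’s stay the same.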
Neyman & Pearson more suitable for decision making than
for science!
For completeness:
Likelihood approach without priors:
Fisher, Royall
p(H|D) ∝ p(H)∙p(D|H)
Irrespective of p(H):
how strong is D’s support for H ?
Example: model selection:
Akaike AIC = -log(L) + k
BIC = -log(L) + k log(n)/2
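The two criteria can be sketched directly from the formulas above. Note that these are written on half the scale of the more common form (-2 log(L) + 2k and -2 log(L) + k log(n)); the ranking of models is the same on either scale. The log-likelihoods below are made-up numbers for illustration.

```python
from math import log

def aic(log_likelihood, k):
    """AIC on the scale used above: -log(L) + k, with k free parameters."""
    return -log_likelihood + k

def bic(log_likelihood, k, n):
    """BIC on the scale used above: -log(L) + k*log(n)/2, with n observations."""
    return -log_likelihood + k * log(n) / 2

# Hypothetical comparison: a 3-parameter model against a 5-parameter model, n = 100.
simple = {"aic": aic(-100.0, 3), "bic": bic(-100.0, 3, 100)}
complex_ = {"aic": aic(-98.5, 5), "bic": bic(-98.5, 5, 100)}
prefer_simple = simple["aic"] < complex_["aic"]  # lower is better
```

BIC penalises extra parameters more heavily than AIC as soon as log(n)/2 exceeds 1, that is for n of 8 or more.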
Signal Detection Theory
Application of Neyman-Pearson to processing sonar or radar
signals against a noisy background
Military technology (WW2):
Hypothesis 0:
there is no signal, only noise
Hypothesis 1:
there is a signal and noise
NB.1
On the basis of some “evidence” I have to
act, although I do not know which H is true!
NB.2
This is typically a “classic” approach, but at
the end Bayes will creep in by the back door!
Fundamental assumptions
of signal detection theory
[Figure: probability density against “Evidence”, e.g.…..????]
1. Effect (= a value of “Evidence”) of signal is variable (according
to a probability distribution).
2. Effect of Noise is also variable.
Problem: is this “Evidence” (= a point on x-axis)
the effect of a signal (+ noise) or of noise only?
3. If signal is weak, distributions overlap
and errors are unavoidable, whichever criterion is adopted
Terminology:
                   Response “No”        Response “Yes”
Signal (+noise)    miss                 hit
(only) noise       correct rejection    false alarm
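The four outcomes can be written as a small lookup. The function name classify is an illustrative choice.

```python
def classify(signal_present, said_yes):
    """Name the outcome of one trial from the true state and the response."""
    if signal_present:
        return "hit" if said_yes else "miss"
    return "false alarm" if said_yes else "correct rejection"

# All four combinations of state and response.
outcomes = {
    (state, response): classify(state, response)
    for state in (True, False)
    for response in (True, False)
}
```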
The stronger the signal (or the better the
detector)… the further the distributions
lie apart
Given some sensitivity (= a
distribution for noise and one
for signal)
several response criteria can
be adopted
Depending on personal preference or “pay-off”
in this situation:
-How bad is a miss, how important is a hit?
-How bad is a false alarm, how important is a
correct rejection?
-How often do signals occur? (think of Bayes!)
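A sketch of how the three questions combine into one criterion. It uses the standard expected-value result from signal detection theory: say “yes” when the likelihood ratio exceeds β_opt = [p(N)/p(S)] ∙ [(V_CR + C_FA)/(V_H + C_M)]. The placement on the x-axis additionally assumes equal-variance normal distributions with noise mean 0 and signal mean d'. Values and costs are entered as positive magnitudes, and all numbers are hypothetical.

```python
from math import log

def optimal_beta(p_signal, value_hit, cost_miss, value_cr, cost_fa):
    """Likelihood-ratio criterion that maximises expected pay-off."""
    p_noise = 1 - p_signal
    return (p_noise / p_signal) * (value_cr + cost_fa) / (value_hit + cost_miss)

def criterion_location(beta, d_prime):
    """x-axis position whose likelihood ratio equals beta (equal-variance normal model)."""
    return d_prime / 2 + log(beta) / d_prime

# Symmetric pay-offs, signals on half of the trials: criterion halfway between the means.
neutral = criterion_location(optimal_beta(0.5, 1, 1, 1, 1), d_prime=2.0)

# Same pay-offs, but signals on only 10% of trials: a stricter criterion.
rare = criterion_location(optimal_beta(0.1, 1, 1, 1, 1), d_prime=2.0)
```

The prior odds p(N)/p(S) are where Bayes comes in: rare signals or costly false alarms push the criterion to the right, costly misses push it to the left.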
Two types of application:
1. Normative: distributions are known,
try to find optimal criterion (for
optimal behavior)
-Is that a hostile plane?
-Does this mammogram indicate a malignancy?
-Is there a weapon in this suitcase?
-Can we admit this student to this school?
-What is the best cut-off score for this test?
2. Descriptive: Behavior is known, try to
reconstruct distributions and criterion as a
rational model
-How good is this person at detecting a v
among u’s?
-Is this person inclined to say “yes” in a
recognition test?
-How well are judges or juries able to
distinguish between the guilty
and the innocent?
-Do judges and lay juries differ in their
bias for convicting or acquitting?
-How good is this test?
An experiment with noise
(blank) and signal
(target) trials:
A strict (“high”) criterion
results in few hits and few
false alarms
Hit rate = proportion of hits (of signal trials)
False alarm rate = proportion of false alarms (of noise trials)
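Both rates follow from the four outcome counts of such an experiment. The counts below are hypothetical.

```python
def rates(hits, misses, false_alarms, correct_rejections):
    """Return (hit rate, false alarm rate) from the four outcome counts."""
    hit_rate = hits / (hits + misses)                             # of signal trials
    fa_rate = false_alarms / (false_alarms + correct_rejections)  # of noise trials
    return hit_rate, fa_rate

# Hypothetical experiment: 50 signal trials and 50 noise trials.
hit_rate, fa_rate = rates(hits=40, misses=10, false_alarms=5, correct_rejections=45)
```

Each rate is conditional on the trial type, so the two need not sum to 1.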
A lax (“low”) criterion results in
more hits and more false alarms,
given the same sensitivity
[Figure: hits against false alarms]
The ROC (receiver operating characteristic) curve
connects points in a Hit/FA plot that result from adopting several
criteria given the same sensitivity (= same distributions)
The ROC-curve characterises detector sensitivity (or signal
strength) independent of criterion
Important: sensitivity and criterion are theoretically
independent
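A sketch of how ROC points arise, assuming equal-variance normal distributions with noise mean 0 and signal mean d'. That model and the name roc_points are assumptions for illustration; the ROC idea itself does not depend on them.

```python
from statistics import NormalDist

def roc_points(d_prime, criteria):
    """(FA rate, hit rate) for each criterion, given one fixed sensitivity."""
    noise = NormalDist(0, 1)
    signal = NormalDist(d_prime, 1)
    # Respond "yes" when the evidence exceeds the criterion.
    return [(1 - noise.cdf(c), 1 - signal.cdf(c)) for c in criteria]

criteria = [-1.0, 0.0, 1.0, 2.0]           # lax to strict
curve = roc_points(1.0, criteria)          # one sensitivity, several criteria
diagonal = roc_points(0.0, criteria)       # no sensitivity: hit rate equals FA rate
```

Stricter criteria move the point towards the lower left of the plot, while the curve it moves along is fixed by d'.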
Same sensitivity (for this
signal), several criteria
ROC-curve, also called:
Receiver Operating Characteristic
Relative Operating Characteristic
Isosensitivity Curve
[Figure: ROC-curve, hits against false alarms]
Greater sensitivity:
ROC-curve
further from diagonal
(Perfection would be: all hits
and no false alarms)
Suggests two types of measure for sensitivity (independent of
criterion):
1. Distance between signal and
noise distributions
(e.g. d' )
2. Area under ROC-curve: A
No distinction between signal and noise:
A = .50
(ROC-curve reflects only bias for saying “yes” or “no”)
Perfect distinction between signal and noise: A = 1.
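Both measures can be sketched under the equal-variance normal model, which is an assumption here: d' = z(hit rate) - z(FA rate), and the area under the resulting ROC-curve is A = Φ(d'/√2). The rates used are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for z-scores and areas

def d_prime(hit_rate, fa_rate):
    """Distance between signal and noise means in standard deviation units."""
    return _STD.inv_cdf(hit_rate) - _STD.inv_cdf(fa_rate)

def area_from_d_prime(d):
    """Area under the ROC-curve implied by d' (equal-variance normal model)."""
    return _STD.cdf(d / 2 ** 0.5)

d = d_prime(0.8, 0.1)          # hypothetical hit and false alarm rates
area = area_from_d_prime(d)
chance = area_from_d_prime(0)  # no distinction between signal and noise
```

Rates of exactly 0 or 1 have no finite z-score, so inv_cdf raises an error for them; some correction of the observed rates is then needed before computing d'.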
Types of measures for criterion:
1. Position on x-axis (e.g. c)
2. Likelihood ratio
p(xc|S)/p(xc|N)
= h/f (e.g. β)
3. Position in ROC-plot
(lower left vs. upper right)
4. Slope of tangent on ROC
[Figure: h and f are the heights of the signal and noise distributions at criterion c]
Signal Detection Theory is applied in many contexts!
Examples:
Breast cancer?
PSA-indices for screening prostate cancer
[Figures: ROC-curves, hit rate against FA rate]
Psychodiagnosis:
1. How good is this test at distinguishing relevant categories?
2. What is a good cut-off score
(at which score should I hire the candidate / admit the student /
send the client to a psychiatrist or an asylum)?
[Figure: test score distributions for control group and patients]
Comer & Kendall 2005:
Children’s Depression Inventory detects depression in a
sample of anxious and anxious + depressive children
Several cut-off scores
What are the costs of missing
a weapon/explosive at an
airport?
What are the costs of a
false alarm?
What are the costs of
screening (apparatus,
personnel, delay)?