Do we teach too much S.H.I.T.

Paul Hewson
School of Computing and Mathematics
and Royal Statistical Society Centre for Statistical Education
paul.hewson@plymouth.ac.uk
7th July 2011
Outline
1 Plan
2 Do
3 Check
4 Review
The question
Do we teach too much
Statistical
Hypothesis
Inference
Testing
The answer
Yes
Thank you for listening
Any questions?
Let’s do a quiz
The answer
            Disease    No disease    Total
Test +ve    950        99,900        100,850
Test -ve    50         899,100       899,150
Total       1,000      999,000       1,000,000
Your probability of having the disease if screened positive is
950 / 100,850 ≈ 0.009
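The table can be rebuilt from three inputs, which makes the arithmetic easy to check. This is a minimal sketch: the prevalence, sensitivity and false positive rate below are not stated on the slide, they are read back off the table counts and should be treated as assumptions.

```python
from fractions import Fraction

# Assumed inputs, inferred from the table counts (not stated on the slide)
population = 1_000_000
prevalence = Fraction(1, 1000)         # 1,000 of 1,000,000 have the disease
sensitivity = Fraction(95, 100)        # 950 of 1,000 diseased test positive
false_positive_rate = Fraction(1, 10)  # 99,900 of 999,000 healthy test positive

diseased = population * prevalence
healthy = population - diseased

true_pos = diseased * sensitivity
false_neg = diseased - true_pos
false_pos = healthy * false_positive_rate
true_neg = healthy - false_pos

# Everyone who screens positive, whether diseased or not
all_pos = true_pos + false_pos

# Prob(disease given test +ve): true positives as a share of all positives
p_disease_given_pos = true_pos / all_pos

print("Test +ve:", int(true_pos), int(false_pos), int(all_pos))
print("Test -ve:", int(false_neg), int(true_neg), int(false_neg + true_neg))
print("Prob(disease | +ve) =", round(float(p_disease_given_pos), 3))
```

Using exact fractions keeps the counts whole, so they can be compared directly with the table.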
The point of that exercise
Prob(A given B) isn’t the same as Prob(B given A)
So Prob(Data given Null) isn’t the same as Prob(Null given Data)
Statement 1
If the null hypothesis is correct then this Datum cannot
occur
It has, however, occurred
Therefore the null hypothesis is false
(Aristotle, Modus tollens - deny the antecedent by denying the
consequent - fine if there is no uncertainty involved)
But let’s add uncertainty
If the null hypothesis is correct then these data are highly
unlikely
These data have occurred
Therefore the null hypothesis is highly unlikely
This analogy is invalid, as we shall see!
Example of statement 1
If a person is a Martian, then he is not a Member of
Parliament
This person is a Member of Parliament
Therefore he is not a Martian
(correct Modus tollens)
Let’s be a little bit more sensible
If a person is British, then he is not a Member of
Parliament (WRONG!)
This person is a Member of Parliament
Therefore he is not British
This needs a slight correction to make it valid: we have to allow for
uncertainty
Probabilising it
If a person is British, he is probably not a Member of
Parliament
This person is a Member of Parliament
Therefore he is probably not British
The argument has the same form, but leads us to a false conclusion
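A few lines of arithmetic show why the probabilised argument fails. This is only a sketch: the population figure is a round number chosen for illustration, and the assumption that every MP is British is a simplification, neither comes from the slides.

```python
# Illustrative round numbers, assumed for this sketch (not from the slides)
british_people = 60_000_000   # rough size of the British population
members_of_parliament = 650   # seats in the House of Commons
british_mps = 650             # simplifying assumption: every MP is British

# Premise: if a person is British, he is probably not an MP
p_mp_given_british = british_mps / british_people

# Conclusion under test: if a person is an MP, is he probably not British?
p_british_given_mp = british_mps / members_of_parliament

print(f"Prob(MP given British)     = {p_mp_given_british:.6f}")
print(f"Prob(not MP given British) = {1 - p_mp_given_british:.6f}")
print(f"Prob(British given MP)     = {p_british_given_mp:.2f}")
```

The premise holds (almost no British person is an MP), yet the conclusion is as wrong as it could be, because the two conditional probabilities answer different questions.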
Which is the same as saying
If H0 is true then this result would probably not occur
This result has occurred
Therefore H0 is probably not true
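To see how far Prob(H0 given result) can sit from Prob(result given H0), Bayes' theorem can be applied to a significant result. This is a hedged sketch, not a method from the slides: it assumes a single alternative hypothesis with known power, and the prior probabilities of H0 are hypothetical values picked for illustration.

```python
def prob_null_given_significant(prior_null, alpha, power):
    """Prob(H0 is true given a significant result), by Bayes' theorem.

    prior_null: assumed prior probability that H0 is true
    alpha:      Prob(significant result given H0 true)
    power:      Prob(significant result given H0 false)
    """
    significant_and_null = alpha * prior_null          # false positives
    significant_and_alt = power * (1.0 - prior_null)   # true positives
    return significant_and_null / (significant_and_null + significant_and_alt)


# Hypothetical priors: the same alpha gives very different answers
for prior_null in (0.5, 0.9, 0.99):
    posterior = prob_null_given_significant(prior_null, alpha=0.05, power=0.8)
    print(f"Prior Prob(H0) = {prior_null:.2f} -> "
          f"Prob(H0 given significant) = {posterior:.2f}")
```

With alpha fixed at 0.05, the probability that H0 is true after a significant result depends heavily on the prior and the power, which is exactly the information the syllogism ignores.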
My contention
Maybe the students who don’t “get” hypothesis testing are
the ones who are paying attention and thinking about it.
Maybe the dangerous ones are the ones that can pass the
“learn and churn” assessment without thinking, and go on to
use this misunderstanding in the “real” world
This is all well documented
“Mindless Statistics”, G. Gigerenzer (2004) The Journal of
Socio-Economics 33:587-606
“The Religion of Statistics as Practiced in Medical Journals”,
D. Salsburg (1985) The American Statistician 39:220-223
And many more
It’s in the popular press
“Chances Are”, Steven Strogatz, The New York Times, 25th
April 2010
“Odds Are, It’s Wrong”, Tom Siegfried, Science News, 27th March
2010
And it does matter!
Ioannidis JPA, 2005 Why Most Published Research Findings
Are False. PLoS Med 2(8): e124.
doi:10.1371/journal.pmed.0020124
Summary
There are lots of problems with significance testing in both:
Pedagogy
Research practice
So why persevere?
Inertia
It suits publish or perish
Two flavours of SHIT
Fisher - all about rejecting a null - BUT ONLY WHERE YOU
HAVE NOTHING ELSE TO GUIDE YOU, i.e., weak,
exploratory research
Neyman-Pearson - choosing between a null and a specified alternative hypothesis, which requires power calculations
And weird hybrids
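As an illustration of the kind of power calculation the Neyman-Pearson approach asks for, the sketch below estimates the sample size per group for a two-sided, two-sample comparison of means. It uses the normal approximation and a standardised effect size (difference in means divided by the common standard deviation). The function name and the default alpha and power are assumptions for illustration, not taken from the slides.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-sample z-test.

    effect_size: standardised difference in means (assumed known in advance)
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = standard_normal.inv_cdf(power)           # quantile giving the power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Smaller effects need far larger samples for the same alpha and power
for d in (0.8, 0.5, 0.2):
    print(f"effect size {d}: about {n_per_group(d)} per group")
```

The calculation only makes sense once an alternative hypothesis, and so an effect size, has been specified, which is the step the hybrid approaches tend to skip.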
Shift emphasis to statistical literacy
Study design is vital (why do randomised experiments work,
how does random sampling have validity)
Bias in data collection/design
How do results generalise beyond study population
Graphing and understanding data
Inference is only a very small part of the toolbox
(very well documented: Aliaga et al., Utts and Heckard, etc.;
called the “post reform” approach in the US)
Think about effect sizes
Practical versus statistical significance
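The gap between practical and statistical significance can be shown with a short calculation. This is a sketch under stated assumptions: a two-sample z-test with equal group sizes and unit variance, with hypothetical effect sizes and sample sizes chosen to make the contrast obvious.

```python
import math


def two_sided_p(effect_size, n_per_group):
    """Two-sided p-value for a two-sample z-test on a standardised effect.

    Assumes equal group sizes and a known common standard deviation of 1.
    """
    z = effect_size * math.sqrt(n_per_group / 2)
    # Two-sided tail area of the standard normal beyond |z|
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical studies: a trivial effect in a huge sample,
# and a moderate effect in a small sample
p_trivial_effect = two_sided_p(effect_size=0.02, n_per_group=100_000)
p_moderate_effect = two_sided_p(effect_size=0.5, n_per_group=20)

print(f"effect 0.02, n = 100,000 per group: p = {p_trivial_effect:.2g}")
print(f"effect 0.5,  n = 20 per group:      p = {p_moderate_effect:.2g}")
```

The negligible effect is highly significant and the moderate one is not, so the p-value on its own says little about whether a result matters.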
“Proper” science is reproducible
What does one study tell us anyway, regardless of what the
p-value says?
Let’s all use Bayesian Statistics
Hopefully we have time for a quick demo
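The demo itself is not part of the slides. As a stand-in, here is a minimal sketch of what a Bayesian analysis can look like: a conjugate Beta-Binomial update for a proportion. The uniform prior, the data (7 successes in 10 trials) and the function names are all hypothetical, chosen only for illustration.

```python
def beta_posterior(prior_a, prior_b, successes, failures):
    """Conjugate update: Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


def prob_above(a, b, threshold, steps=10_000):
    """Prob(theta > threshold) under Beta(a, b), by midpoint integration on a grid."""
    width = 1.0 / steps
    midpoints = [(i + 0.5) * width for i in range(steps)]
    # Unnormalised Beta density; the normalising constant cancels in the ratio
    density = [t ** (a - 1) * (1 - t) ** (b - 1) for t in midpoints]
    above = sum(d for t, d in zip(midpoints, density) if t > threshold)
    return above / sum(density)


# Hypothetical data: uniform Beta(1, 1) prior, 7 successes and 3 failures
post_a, post_b = beta_posterior(1, 1, successes=7, failures=3)
posterior_mean = post_a / (post_a + post_b)
p_theta_above_half = prob_above(post_a, post_b, threshold=0.5)

print(f"Posterior: Beta({post_a}, {post_b})")
print(f"Posterior mean = {posterior_mean:.3f}")
print(f"Prob(theta > 0.5 given data) = {p_theta_above_half:.3f}")
```

The output is a direct probability statement about the parameter given the data, which is what the earlier slides argue people wrongly read into a p-value.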