Most brain science papers are neurotrash: Official
Don't believe everything you read
By Andrew Orlowski
Posted in Science, 12th April 2013 11:24 GMT
A group of academics from Oxford, Stanford, Virginia and Bristol universities have looked at a range of
subfields of neuroscience and concluded that most of the results are statistically worthless.
The researchers found that most structural and volumetric MRI studies are very small and have minimal
power to detect differences between compared groups (for example, healthy people versus those with mental
health disorders). Their paper also noted that a clear excess of "significance bias" (too many results deemed
statistically significant) has been demonstrated in studies of brain volume abnormalities, and that similar
problems appear to exist in fMRI studies of the blood-oxygen-level-dependent (BOLD) response.
The team - researchers at Stanford Medical School, Virginia, Bristol and the Human Genetics department at
Oxford - looked at 246 neuroscience articles published in 2011, excluding papers where the test data was
unavailable. They found that the papers' median statistical power - the probability that a study will detect an
effect when there is a genuine effect there to be found - was just 21 per cent. What that means in practice is
that if you were to run one of the experiments five times, you'd only expect to find the effect once.
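To make the idea of statistical power concrete, here is a minimal Monte Carlo sketch in Python. The group size (n = 15 per arm) and true effect size (Cohen's d = 0.5) are hypothetical choices, not figures from the paper; with these settings a simple two-sample t-test lands in the same low-power territory the authors describe.

```python
# Monte Carlo sketch of statistical power for a two-sample t-test.
# The settings are hypothetical (n = 15 per group, true effect of
# d = 0.5 standard deviations), not figures from the paper.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, d, alpha, runs = 15, 0.5, 0.05, 10_000

hits = 0
for _ in range(runs):
    control = rng.normal(0.0, 1.0, n)   # control group
    treated = rng.normal(d, 1.0, n)     # group with a genuine effect
    _, p = stats.ttest_ind(treated, control)
    hits += p < alpha                   # did the study detect it?

print(f"Estimated power: {hits / runs:.0%}")  # roughly 26% here
```

For an effect of this size, pushing power up to the conventional 80 per cent target would require around 64 subjects per group - several times larger than many of the studies surveyed.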
A further survey of fMRI studies - research of a kind that has long filled the popular media with dramatic
claims - found that their median statistical power was just 8 per cent.
Low statistical power causes three problems, the authors said. Firstly, there is a low probability of finding true
effects; secondly, there is a low probability that a "positive" finding is actually true; and thirdly, the magnitude
of the effect tends to be exaggerated when a positive is discovered - the so-called "winner's curse".
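The second problem can be put in numbers with the standard positive-predictive-value calculation the paper draws on: PPV = (power × R) / (power × R + α), where R is the pre-study odds that a probed effect is real. The sketch below assumes a hypothetical R of 0.25 (one in five probed effects is real) purely for illustration.

```python
# Positive predictive value: the chance that a nominally significant
# finding reflects a real effect. The pre-study odds R = 0.25 is a
# hypothetical assumption for illustration, not a figure from the paper.
def ppv(power: float, alpha: float = 0.05, prior_odds: float = 0.25) -> float:
    """PPV = (power * R) / (power * R + alpha)."""
    return (power * prior_odds) / (power * prior_odds + alpha)

print(f"PPV at 80% power: {ppv(0.80):.0%}")  # 80%
print(f"PPV at 21% power: {ppv(0.21):.0%}")  # ~51%
print(f"PPV at  8% power: {ppv(0.08):.0%}")  # ~29%
```

Under these assumptions, at the 8 per cent power seen in the fMRI sample, fewer than a third of nominally significant findings would be expected to reflect real effects.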
Further problems led the authors to believe that power is even lower than these figures suggest. They noted:
[T]he summary effect size estimates that we used to determine the statistical power of individual
studies are themselves likely to be inflated owing to bias — our excess of significance test provided
clear evidence for this. Therefore, the average statistical power of studies in our analysis may in fact
be even lower than the 8–31% range we observed.
Publishing is a highly competitive enterprise, with certain kinds of findings more likely to be
published than others. Research that produces novel results, statistically significant results (that is,
typically p < 0.05) and seemingly "clean" results is more likely to be published. As a consequence,
researchers have strong incentives to engage in research practices that make their findings
publishable quickly, even if those practices reduce the likelihood that the findings reflect a true (that
is, non-null) effect.
The paper is titled Power failure: Why small sample size undermines the reliability of neuroscience and is
published in the May 2013 edition of Nature Reviews Neuroscience. The conclusions have wide
implications for the field.
Button et al note that advances in computer processing have made crunching large data sets faster and
easier, but the statistical rigour hasn't kept pace. They call for research to be fundamentally redesigned to
maintain the credibility of neuroscience.
These dramatic advances in the flexibility of research design and analysis have occurred without
accompanying changes to other aspects of research design, particularly power. For example, the
average sample size has not changed substantially over time despite the fact that neuroscientists are
likely to be pursuing smaller effects.
The increase in research flexibility and the complexity of study designs combined with the stability of
sample size and search for increasingly subtle effects has a disquieting consequence: a dramatic
increase in the likelihood that statistically significant findings are spurious. This may be at the root of
the recent replication failures in the preclinical literature and the correspondingly poor translation of
these findings into humans.
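The mechanism behind that last point is easy to demonstrate. The sketch below simulates one well-known flexible practice - peeking at the data and stopping as soon as p < 0.05, often called optional stopping - which is our illustrative stand-in for the design flexibility the authors describe. The sample sizes and peeking schedule are hypothetical.

```python
# Sketch of how analytic flexibility inflates false positives, using
# "optional stopping": peek at accumulating data and stop as soon as
# p < 0.05. Both groups are drawn from the SAME distribution, so
# every "significant" result here is spurious. Settings hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
runs, max_n, alpha = 5_000, 60, 0.05

false_positives = 0
for _ in range(runs):
    a = rng.normal(0.0, 1.0, max_n)
    b = rng.normal(0.0, 1.0, max_n)
    for n in range(10, max_n + 1, 10):  # peek every 10 subjects per group
        _, p = stats.ttest_ind(a[:n], b[:n])
        if p < alpha:
            false_positives += 1
            break

print(f"False-positive rate with peeking: {false_positives / runs:.0%}")
# Well above the nominal 5%, despite there being no effect at all.
```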
Kate Button, one of the authors behind the paper, has a nice article at the Guardian explaining the issues.
"The current reliance on small, low-powered studies is wasteful and inefficient, and it undermines the ability of
neuroscience to gain genuine insight into brain function and behaviour. It takes longer for studies to converge
on the true effect, and litters the research literature with bogus or misleading results," writes Button.
Demand for brain science has increased from policy wonks and other pseuds looking for a "neuroscientific
explanation" to settle their turf wars; from journalists eager to fill pages with brightly coloured pictures and
grabby headlines; and from academics chasing publications.
And it isn't just the brain boffins who are cutting corners and making improbable exaggerations. Jonah Lehrer,
journalist and neuroscience evangelist, resigned from The New Yorker magazine and later parted ways
with WiReD last year after admitting that he had fabricated quotes attributed to Bob Dylan. ®



Related stories
Chemical-dipped TRANSPARENT BRAINS bare all for science (11 April 2013)
WiReD surgically removes damaged neurotrash 'expert' (4 September 2012)
Neurotrash creativity 'expert' created Dylan quotes from thin air (31 July 2012)