Over the last 25 years, functional MRI (fMRI) has risen to a prominent place in neuroscience research, spawning tens of thousands of peer-reviewed papers and even more popular science articles on how various stimuli affect our brains. Now, one paper has challenged everything, claiming that the most widely used analysis software can produce false-positive rates of up to 70 percent, meaning many published results may be artifacts of the software rather than real insights into the brain.
Neuroimaging research exploits the fact that when a brain region is active, it consumes more oxygen. Where alternative techniques require invasive procedures, fMRI machines noninvasively detect the oxygen-rich blood flowing to the parts of the brain that are working hardest.
Unfortunately, drawing results from fMRI images is more than a matter of taking a scan and observing the pattern. Images are broken up into “voxels”, like three-dimensional pixels. Most often, recordings of activity during a task are compared with activity when the brain is at rest, and the results are run through software packages that determine whether bright voxels are random noise or form significant clusters indicating that a brain region is in heavy use.
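The cluster-forming step can be pictured with a toy example. The sketch below is not any package's algorithm; it is a minimal one-dimensional stand-in (a real brain image is 3D) that groups contiguous voxels whose test statistic exceeds a threshold, which is the raw material the statistical software then judges for significance:

```python
def clusters_above_threshold(values, threshold):
    """Return lists of indices for each contiguous run of voxels whose
    statistic exceeds the threshold -- a 1-D toy stand-in for the
    cluster-forming step in fMRI analysis (real images are 3-D)."""
    clusters, current = [], []
    for i, value in enumerate(values):
        if value > threshold:
            current.append(i)       # voxel is "bright": extend the run
        elif current:
            clusters.append(current)  # run ended: record the cluster
            current = []
    if current:                      # close a cluster touching the edge
        clusters.append(current)
    return clusters

# Two bright clusters: voxels 1-2 together, voxel 5 alone.
print(clusters_above_threshold([0.1, 2.5, 3.1, 0.2, 0.1, 2.9], 2.0))
```

The statistical question the packages answer is whether clusters like these are larger than chance would produce — which is exactly where, the paper argues, the calibration goes wrong.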
A paper in the Proceedings of the National Academy of Sciences argues that the three most common software packages have a statistical flaw and give numerous false positives, showing clustering where none exists.
The damning claim is made by experienced contributors to the field. First author Dr Anders Eklund of Linköping University, Sweden, has written many highly cited peer-reviewed papers on the use of fMRI.
“These results question the validity of some 40,000 fMRI studies and may have a large impact on the interpretation of neuroimaging results,” the authors write.
Previous papers have indicated that false positives exist in certain fMRI datasets, but it was unclear whether the problem would disappear when multiple scans were grouped together, as usually occurs. Eklund’s latest work indicates it applies to a disturbingly large number of papers. Eklund collected 499 resting-state fMRI scans used by other researchers as controls in their studies and matched them up in random groups of 20. This produced a total of 3 million comparisons for analysis.
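The logic of this check can be illustrated with a simulation. Eklund's actual analysis ran full fMRI software pipelines on real resting-state scans; the sketch below is only a toy stand-in for the null-group idea, assuming pure Gaussian noise for each "subject" and a plain two-sample t-test. Because every group is noise, a correctly calibrated test should flag a difference in roughly 5 percent of comparisons:

```python
import random
import statistics

def two_sample_t(a, b):
    """Two-sample t statistic for equal-sized groups."""
    n = len(a)
    mean_diff = statistics.mean(a) - statistics.mean(b)
    se = ((statistics.variance(a) + statistics.variance(b)) / n) ** 0.5
    return mean_diff / se

def null_false_positive_rate(n_tests=2000, group_size=20, seed=1):
    """Repeatedly split pure-noise 'subjects' into two random groups of
    20 and test for a difference. With a well-calibrated test, only
    about 5% of comparisons should exceed the two-tailed 5% critical
    value (t is about 2.024 for 38 degrees of freedom)."""
    rng = random.Random(seed)
    critical = 2.024
    hits = 0
    for _ in range(n_tests):
        a = [rng.gauss(0, 1) for _ in range(group_size)]
        b = [rng.gauss(0, 1) for _ in range(group_size)]
        if abs(two_sample_t(a, b)) > critical:
            hits += 1
    return hits / n_tests

print(null_false_positive_rate())  # should hover near 0.05
```

In this idealized setting the rate stays near 5 percent; the paper's striking finding is that the real packages, run on real resting-state data, did not.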
Where properly functioning software would show differences in the brain behavior between these groups in no more than 5 percent of cases, the packages showed differences in up to 70 percent. For one of the packages, the problem was exacerbated by a bug in the 15-year-old software that was only fixed in May 2015.
"If you've spent months gathering data at great cost, you should be more interested in letting the analysis take time so that it's correct," Eklund said in a statement.
A preprint of the paper generated considerable debate within the neuroimaging community. The makers of one of the packages have argued that the problems Eklund has found only occur when the package is used inappropriately. Even if this is true, it is possible that such use has been widespread.
The software packages assume that the correlation in activity between neighboring voxels follows a Gaussian shape (a type of bell curve also known as a normal distribution). When the brain's actual activation deviates from that pattern, the packages can attribute relationships that don't exist, making it look as though a cluster of voxels is firing together when this may not be the case.
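Why the shape of the spatial correlation matters can be seen in a toy simulation. This is not the packages' method, just a 1-D sketch under simplified assumptions (boxcar smoothing standing in for spatial correlation, a per-scan percentile threshold): the wider the correlation between neighboring voxels, the larger the clusters that pure noise produces, so a cluster-size cutoff calibrated under the wrong correlation shape will misjudge what chance alone can do:

```python
import random

def smooth(xs, half_width):
    """Crude boxcar moving average, standing in for spatial correlation
    between neighboring voxels."""
    n = len(xs)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(xs[lo:hi]) / (hi - lo))
    return out

def mean_max_cluster(half_width, n_sims=300, n_vox=200, seed=7):
    """Average size of the largest suprathreshold run in pure-noise
    'scans'. Thresholding at each scan's own 90th percentile keeps the
    fraction of bright voxels fixed, isolating the effect of the
    correlation width on cluster size."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_sims):
        xs = smooth([rng.gauss(0, 1) for _ in range(n_vox)], half_width)
        cut = sorted(xs)[int(0.9 * n_vox)]
        best = run = 0
        for v in xs:
            run = run + 1 if v > cut else 0
            best = max(best, run)
        total += best
    return total / n_sims

# Wider spatial correlation => much larger chance clusters in pure noise.
print(mean_max_cluster(0), mean_max_cluster(4))
```

The same fraction of voxels is "bright" in both cases; only the correlation width changes, yet the chance cluster sizes differ sharply — which is why getting the assumed correlation shape wrong can turn noise into apparently significant clusters.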
If Eklund is right, it's possible some of those images you have seen comparing how the brain looks on drugs, or in love, or when watching porn, compared to a normal baseline, may represent glitches in the software, rather than real differences.
Eklund and his co-authors conclude: “It is not feasible to redo 40,000 fMRI studies, and lamentable archiving and data-sharing practices mean most could not be reanalyzed either.” They propose validating existing fMRI methods and increasing data-sharing to try to work out which past studies were actually valid.
[H/T: The Register]