Posted on Categories Discover Magazine
I recently decided to revisit a 2014 case that regular readers might remember.
Back in 2014, I posted about a terrible piece of statistical ‘spin’ that somehow made it into the peer-reviewed Journal of Psychiatric Research. The offending authors, led by Swedish psychiatrist Lars H. Thorell, had run a study to determine whether an electrodermal hyporeactivity test was able to predict suicide and suicidal behaviour in depressed inpatients.
Now, the standard way to evaluate the performance of a predictive test is with two metrics: sensitivity (the proportion of people who have the outcome and test positive) and specificity (the proportion of people who don’t have the outcome and test negative). Each of these can range from 0 to 1 (alternatively written as 0 to 100%), but on their own, neither of them means much: you have to consider them together. For a test which is completely uninformative (like flipping a proverbial coin), sensitivity + specificity will total 1 (100%). For a perfect test, they’ll total 2 (200%). Any introductory stats textbook will tell you this.
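To make the arithmetic concrete, here is a minimal Python sketch. The patient counts are invented purely for illustration and the function names are hypothetical; none of it comes from the Thorell et al. data.

```python
from fractions import Fraction


def sensitivity(tp: int, fn: int) -> Fraction:
    """Proportion of people WITH the outcome whom the test flags as positive."""
    return Fraction(tp, tp + fn)


def specificity(tn: int, fp: int) -> Fraction:
    """Proportion of people WITHOUT the outcome whom the test clears as negative."""
    return Fraction(tn, tn + fp)


# Hypothetical cohort: 100 people with the outcome, 900 without.
# An uninformative test flags 30% of each group, regardless of true status.
coin_total = sensitivity(tp=30, fn=70) + specificity(tn=630, fp=270)

# A perfect test flags everyone with the outcome and no one else.
perfect_total = sensitivity(tp=100, fn=0) + specificity(tn=900, fp=0)

print(float(coin_total), float(perfect_total))  # 1.0 2.0
```

The uninformative test flags the same share of both groups, so whatever it gains in sensitivity it loses in specificity, and the two always sum to 1.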
In their 2013 paper, Thorell et al. reported the “sensitivity” and “specificity” of their test, and the numbers looked very good. Check out for example Table 3:
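But what Thorell et al. called “specificity” (or “raw specificity”, a term unknown to statistics before that point) was actually a different metric called the negative predictive value (NPV): the proportion of negative test results that turn out to be correct. When the outcome being predicted is rare, the NPV will be high for almost any test, even an uninformative one. The true specificity of the electrodermal test in predicting suicide and suicide attempts was poor (around 33%).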
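The gap between specificity and NPV is easy to see with a small sketch. The counts below are hypothetical, chosen only so that the outcome is rare and the specificity is low; they are not the figures from the 2013 paper.

```python
def specificity(tn: int, fp: int) -> float:
    """True negatives as a share of everyone WITHOUT the outcome."""
    return tn / (tn + fp)


def npv(tn: int, fn: int) -> float:
    """True negatives as a share of everyone who TESTED negative."""
    return tn / (tn + fn)


# Hypothetical cohort of 1000: 50 with the outcome, 950 without.
# The test misses 10 of the 50 and wrongly flags 650 of the 950.
spec = specificity(tn=300, fp=650)
neg_pred = npv(tn=300, fn=10)
print(f"specificity={spec:.2f}  NPV={neg_pred:.2f}")  # specificity=0.32  NPV=0.97

# A coin flip on the same cohort: half of each group tests negative.
coin_spec = specificity(tn=475, fp=475)
coin_npv = npv(tn=475, fn=25)
print(f"coin specificity={coin_spec:.2f}  coin NPV={coin_npv:.2f}")  # 0.50 and 0.95
```

In this made-up cohort, even the coin flip achieves an NPV of 95%, simply because 95% of the patients never have the outcome. That is why quoting the NPV under the name “specificity” makes a weak test look strong.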
Thorell et al.’s specificity switcheroo was so outrageous that it led to two letters to the editor (1, 2) in complaint. However, the Thorell et al. paper was never retracted.
Four years on, I decided to see what had become of “raw specificity”. It turns out that Thorell and his colleagues have continued to cite the 2013 paper in subsequent publications, repeating the claim that the electrodermal hyporeactivity test has good “sensitivity and raw specificity”. The most recent such reference was earlier this year in the journal BMC Psychiatry.
The survival of ‘raw specificity’ is no surprise. Once the concept had entered (or contaminated) the literature in the 2013 paper, the damage was done. Subsequent peer reviewers can hardly be blamed for allowing an author to quote the conclusions of their previous, peer-reviewed paper. Perhaps more should have been done to push for the retraction of the 2013 paper, but that ship has sailed.
There could still be icebergs in the ship’s path, however. It turns out that Thorell, and a company he directs called Emotra AB, have run a new trial of electrodermal testing, called EUDOR. EUDOR is big: 1573 patients were recruited. And it could mean big money for Emotra AB, who in June raised 13.8 million Swedish kronor ($1.63m) from investors.
Stay tuned for Part 2, where we’ll see whether this was money well spent.