Gelman’s full analysis of the paper is featured at Columbia’s Statistical Modeling, Causal Inference, and Social Science website. Reading through the preprint edition of the study in Santa Clara County, Gelman looked at how the authors handled each step of the test and what measures they took to address potential errors such as selection bias in the sample group. He also points out that the authors don’t explain how they adjusted their data to account for location, sex, or ethnicity. For that matter, the preprint also doesn’t include the raw data. It only includes the calculated results. Without that data, trying to reverse-engineer what actually happened in this study is more than a little problematic.
Still, Gelman plows on to look at just how these values were generated. The first big issue is with the values for specificity (the ability to correctly identify negatives) and sensitivity (the ability to correctly identify positives).
“If the specificity is 90%, we’re sunk. With a 90% specificity, you’d expect to see 333 positive tests out of 3330, even if nobody had the antibodies at all. … On the other hand, if the specificity were 100%, then we could take the result at face value.”
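The arithmetic behind that quote is easy to check. What follows is a minimal sketch, assuming that nobody in the sample truly has antibodies, so every positive result is a false positive. The function name is made up for illustration, and the specificity values are simply ones that come up in this discussion.

```python
def expected_false_positives(n_tests, specificity):
    """Expected number of false positives when no one tested is truly positive.

    Assumption: every person in the sample is a true negative, so each test
    has a (1 - specificity) chance of wrongly coming back positive.
    """
    return n_tests * (1.0 - specificity)


N_TESTS = 3330  # number of tests cited in Gelman's quote

# Specificity values mentioned in this discussion
for spec in (0.90, 0.985, 0.995, 1.00):
    fp = expected_false_positives(N_TESTS, spec)
    print(f"specificity {spec:.1%}: about {fp:.0f} expected false positives")
```

At 90% specificity this reproduces Gelman’s figure of 333 false positives out of 3330 tests, and at 100% it drops to zero, which is why small changes in the assumed specificity matter so much.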
The authors of the paper list a specificity of 99.5%, a value apparently calculated through testing 30 known positive samples and 30 known negative samples. In fact, in the Los Angeles study, the same authors list the same test as having 100% specificity.
However, as Gelman points out, the testing reported doesn’t produce a fixed value of 99.5%. It actually supports a range of specificity somewhere between 98% and 100%. Considering that a specificity of 98.5% would by itself generate the number of positives reported, without a single real positive being in the sample … that’s a big issue. And it’s just one of several factors Gelman points out that lead to a very concise summary.
“I think the authors of the above-linked paper owe us all an apology. We wasted time and effort discussing this paper whose main selling point was some numbers that were essentially the product of a statistical error.”
As Gelman goes on to explain, his demand for an apology is serious, not because the authors of the study made errors, but because they made obvious errors that would have been easily detected by having someone review the technique and results. He also points out that one of his own studies on the novel coronavirus generated some very exciting results. But rather than calling up the media, he double-checked the math, found some points that could potentially invalidate the results, and shelved the study to await additional data and analysis.
By pushing these results into the public without review, the researchers involved haven’t just planted a false narrative in the public mind, they’ve reinforced the “Lots more people have had it. Hey, we all had it back in (insert month here)” talking point that will not seem to die, despite all evidence to the contrary. That action has given impetus to the right-wing media sources pushing to “reopen America.” It has also planted doubts about the accuracy of antibody testing, which is a critical component of safely bringing this outbreak to an eventual close.
And it’s also painted a big bull’s-eye on Stanford.
“The study got attention and credibility in part because of the reputation of Stanford. Fair enough: Stanford’s a great institution. Amazing things are done at Stanford. But Stanford has also paid a small price for publicizing this work, because people will remember that ‘the Stanford study’ was hyped but it had issues.”
Finally, it’s worth noting that Gelman isn’t discounting the value of this testing, or even the results—there does seem to be a “signal” here. It’s only that the study fails to be upfront about the potential uncertainties and the way in which data was manipulated. The data has value. The reported results … maybe not.
Having posted critical articles concerning this work twice in one day, it may seem like I am going out of my way to point out the potential errors and damage generated by this study and by the attention it has received in the media. Because yes.
Thanks to those who pointed me at Andrew Gelman’s critique, and to AKALib who pointed out that, despite the way the media had tagged the Santa Clara study as coming from Stanford, and the LA County study as coming from USC, it was largely the same group of researchers carrying out the same test in each case, and quite remarkably producing almost identical results despite distinctly different populations. Almost as if they were measuring their own error rate rather than real data.