Something seemed odd when I first read that 90% of medical articles on Wikipedia contained errors. The paper was from an obscure osteopathic school called Campbell University in Buies Creek, North Carolina.

Like most abstracts you read, this one was a little fuzzy about the methods used. The top 10 most-costly medical conditions were identified, and a Wikipedia article for each was reviewed by 2 randomly assigned investigators.

For 9 of the 10 conditions, significant discordance between the assertions found on Wikipedia and selected peer-reviewed sources existed. The authors concluded, “Caution should be used when using Wikipedia to answer questions regarding patient care.”

This led to the usual barrage of media handwringing. For example, BBC News Health headlined its story “Trust your doctor, not Wikipedia, say scientists.” To its credit, the BBC did include some dissent from a representative of the Wiki Project Med Foundation.

FiveThirtyEight, overseen by the famous statistician Nate Silver, accepted the study’s results without question, as did the Huffington Post. But there are many fundamental problems with the paper (full text here).

The 2 assigned reviewers were either medical residents or rotating interns. They may not have had the experience or background to properly determine the correctness of the assertions found on Wikipedia.

The authors assumed that the comparator peer-reviewed articles chosen by the young doctors were appropriate and authoritative. This is not necessarily true. My experience from 40 years of teaching residents is that most of them are ill-equipped to judge the validity of research papers.

In nearly every case, the 2 reviewers of a topic did not find the same number of assertions, and the number of discordant comparisons each one reported differed greatly, as you can see from a portion of the paper’s Table 3 below.

The statistics used raise many questions.

As someone on Reddit astutely pointed out, the reviewers found a seemingly higher number of assertion errors in the concussion article than in the one on diabetes. Take a look at the numbers in the table above.

Somehow, concussion was the only condition that did not have a statistically significant number of discordant assertions, while diabetes did.

I’ve read the paper 3 times and I still don’t quite understand what P values the authors are actually comparing. In fact, the paper makes no sense to me.

The most commonsense take on the paper was that of Dr. Kent Sepkowitz on The Daily Beast, who pointed out that most decisions that doctors make are not evidence based.

Often, a particular patient doesn’t quite match up with the evidence, no solid evidence exists, or the evidence is conflicting. Everyone talks about guidelines, but just look at the controversies that continue to swirl around guidelines for prostate screening, mammography, statins, and many other topics. It’s just not that simple.

I think it’s OK for trainees and doctors to continue to use Wikipedia. However, with any information source, good judgment, healthy skepticism, and an understanding of the patient’s specific situation should guide the final decision about care.