Who’s afraid of Big Bad Data?

In the last 24 hours I’ve watched two lectures about the Big Data revolution in medicine. Someday soon, I’ve learned, we’ll have the gazillions of data points needed to provide just the right treatment for your cancer (or whatever else ails you). As with so many human problems these days, the solution lies in gathering enough data and then massaging them mercilessly.

But I wonder whether, in the rush to embrace the promise of Big Data, some people are forgetting the GIGO principle: garbage in, garbage out. Data are only useful if they’re accurate, and human beings are notoriously bad at accuracy.

A few years ago I spent a ridiculous amount of time calling and writing to a provider of diagnostic services to try to get an incorrect diagnosis code changed. Someone had transposed two digits in the code, so that instead of leg pain I was on record as having cloudy hemodialysis fluid. This diagnosis remains in my records with both the provider and the insurer, because after ten phone calls over a period of several months I finally gave up on getting it changed.

Why the insurer paid for a procedure connected with dialysis, when I’ve never been anywhere near a dialysis machine, is beyond me. But my point is that this kind of thing happens thousands of times every day. Some errors are never caught, and some anomalies are introduced intentionally by unprincipled researchers.

Surely all this accumulated garbage is wreaking havoc in Big Data repositories. When you’re dealing with people’s health, you don’t want someone to have transposed two nucleotides or entered the wrong dose of medication into a spreadsheet. Bigger is not necessarily better.

In short, until we find someone more reliable than we are to enter the data, you won’t see me jumping on the Big Data bandwagon.