The Problems Inherent in Testing Large Populations With Even Relatively-Reliable Methods

Summary: An example of the political, law-enforcement, and practical problems of detecting vs. dealing with potential terrorists supports Echidne’s analysis of the problem with breast-cancer detection recommendations.

Matthew Yglesias discusses the difference between “common sense” anecdotal evidence and statistical evidence.

Suppose I invent a magical device that can be pointed at a Muslim and say with 90% accuracy whether or not he’s an al-Qaeda operative. Well, if I start waving it around and it starts beeping on one guy, what should we conclude about him? A terrifyingly large number of people are going to say “there’s a ninety percent chance he’s with al-Qaeda! Let’s panic!” In fact, that’s not the case. There are a billion Muslims in the world. A test with 90 percent accuracy is going to mistakenly classify about 100 million of them as al-Qaeda operatives. And al-Qaeda actually has fewer than 10,000 people working for it. I’m going to get something like 10,000 false positives for every actual terrorist I find.

Meanwhile, applying the test to people is going to have severe consequences. The public doesn’t understand this correctly and is going to be put into a wholly unwarranted state of panic about the prevalence of terrorists. People will, of course, demand that those flagged by my machine be subjected to extra-heightened scrutiny. It’s easy to imagine lots of innocent people being mistakenly killed or subjected to discrimination or shunning. And that sense of besiegement and unfair treatment would ultimately heighten tensions between the world’s Muslims and the West, while wasting massive quantities of law enforcement resources chasing basically worthless leads.

Read the quote in context here.
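
To make the arithmetic in the quote concrete, here is a minimal sketch in Python. It rests on assumptions that are mine rather than Yglesias’: I read “90% accuracy” as meaning the device is right 90% of the time both for real operatives (sensitivity) and for everyone else (specificity), and I use 10,000 operatives, the upper bound he gives. The function name screening_outcomes is just an illustrative label.

```python
def screening_outcomes(population, actual_positives, sensitivity, specificity):
    """Return (true positives, false positives, chance a flagged person is a real positive)."""
    actual_negatives = population - actual_positives
    # Real positives the test correctly flags.
    true_positives = actual_positives * sensitivity
    # Everyone else the test wrongly flags.
    false_positives = actual_negatives * (1 - specificity)
    # Positive predictive value: of all the people flagged, the share who are real positives.
    ppv = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, ppv


# Assumed inputs: one billion people tested, 10,000 actual operatives,
# and "90% accuracy" applied in both directions.
tp, fp, ppv = screening_outcomes(1_000_000_000, 10_000, 0.9, 0.9)

print(f"true positives:  {tp:,.0f}")
print(f"false positives: {fp:,.0f}")
print(f"false positives per true positive: {fp / tp:,.0f}")
print(f"chance a flagged person is a real operative: {ppv:.4%}")
```

Under these assumptions the device flags roughly 9,000 real operatives alongside about 100 million innocent people, or about 11,000 false alarms per real hit, which is in line with the “something like 10,000” in the quote. A beep means about a 0.009% chance, not a 90% chance.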

It seems odd to call a discussion of terrorism and racial profiling “non-controversial,” and perhaps even more odd for me to quote so extensively about something seemingly so remote from anything having to do with my main topics of relationships, sex, and gender.

Yet I bring it up to support a post by Echidne of the Snakes defending the statistics and methodology behind the new mammogram restrictions.

It was seriously principled, and courageous, for her to go out on a limb like that. As with a lot of problems in mitigation, it’s easy to point to someone who benefited from the status quo, but harder to identify those who suffered from it.

I think Yglesias’ post explains the cost well: more testing at certain ages (even if the tests were very accurate, which they aren’t in either Yglesias’ or Echidne’s case) would tend to overwhelm the system, and individuals, with false positives on the one hand, and still-treatable cases on the other.

Without intending any gender equivalencies at all, it’s instructive to note that a similar situation arose in prostate cancer detection 10 or 15 years ago: PSA tests brought the price of detection down and the rate of early detection way, way up. But, as Echidne notes, detection isn’t the same thing as treatment. At all. In fact, detection isn’t even the same thing as understanding the disease!

For better or worse, the imbalance between detection on the one hand and both understanding and treatment on the other was so lopsided that it became a big problem for medical ethics. It turns out overwhelming numbers of men over 50 or so have detectable early prostate cancer, but for most it’s so slow to grow that they die of old age before they can die of the cancer. For most, but not all. Enough die, and die fairly horribly, to make treatment a consideration. But the treatments (burning off, cutting off, or poisoning) are generally so debilitating and expensive that they shouldn’t be undertaken unless you’re sure it’s the bad kind. Which makes it a shame that researchers, then and now, can’t tell whether an early cancer will go bad.

The line between the risks and benefits of breast-cancer testing is much harder to draw than it was for prostate-cancer testing. And so we’re stuck (or I should say “stuck”) with statistical analysis. Which is why it’s really nice to have a committed, ethical, and highly-interested statistician explain these particular findings for us. And with breast cancer the benefits are close enough to the costs (barring further progress in the development of treatment, anyway) that it’s really hard to say what the right thing to do might be. And so we’re likely to run into really big shifts in the conclusions.

On a final note I especially appreciated Echidne’s explanation of not only the cost vs. benefit of testing, but how the cost incurred for marginally-valuable testing might be diverting funding from research into treatment or prevention. (emphasis mine.)

Screening is not treatment. To do it at all is based on the hope that early detection raises the odds of survival. This has been shown to be true for cervical cancer and the pap test and also for colon cancer screenings. But the most recent evidence suggests that breast cancer screening is less effective than previously thought. As I mentioned in an earlier post, researchers now suspect that mammograms capture a lot of tumors which might either disappear on their own or never grow much, while missing the very aggressive tumors which develop very rapidly. It is the latter types which are reflected in the mortality statistics.

...

The choice to pay for screening (by both individuals and the society) is ultimately a value judgment. But resources are not infinite. If money is spent (by both individuals and the society) in one type of screening, it is not available for other types of screening or for other types of prevention or treatment.

It’s hard when answers aren’t cut and dried, and even harder when the ranges are so close you can get these big shifts in recommendations. And when it’s a controversial subject it’s even harder. Cool that she was willing to dig into it.

Update: See also Amanda Marcotte’s take, with another allusion to prostate cancer (it’s being downscaled too) and more backup links.


It confused me how many people got up in arms about the new recommendations and started screaming “rationing!” I mean, all they did was say instead of every year it should be every other year. If they think it should be more often, why not every 6 months? Every 3? Every week?? Clearly at some point pretty much everyone will be willing to concede that we don’t need THAT frequent a testing schedule! So why is once a year the magic number such that doing it less frequently is suddenly insufficient?

[Well, there are actually a lot of good reasons to be suspicious, but probably not necessarily to be paranoid. More to Echidne’s point, if going to every other year is roughly as effective outcome-wise, can we work things so any savings that do arise can be channeled towards investigating prevention and cures? Thanks, P. —fl]


This post gave me new insight about a subject I really didn’t understand. Thanks for sharing the perspectives that way. Very helpful and well-written.

[I think Matt Yglesias and Echidne have to get most of the credit, but I was glad to pass it along. Thanks, Cristy. —fl]
