The Inevitability of False Positives

I was reading an article about web scanner coverage and false positives by Larry Suto that RSnake linked to on ha.ckers. Though this is only tangentially related to the actual paper, it reminded me of something interesting — the inevitability of false positives when detecting something rare.

When measuring the error of a detection process, there are three pertinent statistics: Type I error (a false positive, detecting something that isn't really there), Type II error (a false negative, missing something that is there), and the crossover error rate (the error rate at which the Type I and Type II rates are equal; essentially, the minimum error of the process). We normally think of trying to minimize the crossover error rate, since we want detection processes that are as accurate as possible, but sometimes one kind of error is objectively worse than the other, so we may choose to minimize false negatives even if that means accepting more false positives.
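If it helps to pin the two rates down, here is a minimal Python sketch showing how they fall out of a detector's raw counts; the counts themselves are invented purely for illustration.

```python
# A minimal sketch: the two error rates computed from raw detector counts.
# All counts below are invented purely for illustration.

def type_i_rate(fp, tn):
    """False positive rate: alarms raised when nothing was actually there."""
    return fp / (fp + tn)

def type_ii_rate(fn, tp):
    """False negative rate: real events the detector missed."""
    return fn / (fn + tp)

# Hypothetical results over 10,000 trials:
tp, fn = 95, 5        # 100 real events, 5 of them missed
fp, tn = 198, 9702    # 9,900 non-events, 198 false alarms

print(f"Type I  (false positive) rate: {type_i_rate(fp, tn):.1%}")   # 2.0%
print(f"Type II (false negative) rate: {type_ii_rate(fn, tp):.1%}")  # 5.0%
```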

For instance, it is very annoying if the fingerprint scanner used to log onto your laptop routinely fails to recognize you, forcing you to swipe the reader again and again. Too many false negatives annoy the user. Of course, if it let everyone in, that would be even worse, but we're willing to run the risk that somebody with fingerprints rather similar to yours might get in if it makes the thing more pleasant to use. On the other hand, if the fingerprint scanner is on the vault with the nuclear weapons in it, false positives are very bad, while a false negative is really not too terrible; you probably don't need to access the nuclear weapons very often, so if you have to swipe your finger four times to get in, that's okay. For that scanner, you'll optimize to minimize Type I error, even if this raises your rate of Type II error and your crossover error rate.
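To make the trade-off a bit more concrete, here is a hypothetical sketch of a matcher that returns a similarity score, where the choice of acceptance threshold decides which kind of error we tolerate. The scores and thresholds are invented for illustration; no real biometric system is this simple.

```python
# Hypothetical: a matcher returns a similarity score in [0, 1];
# where we set the acceptance threshold decides which error we tolerate.

def accept(score, threshold):
    return score >= threshold

# Laptop login: a low threshold so the owner rarely gets rejected
# (fewer false negatives), at the cost of more false accepts.
LAPTOP_THRESHOLD = 0.60

# Weapons vault: a high threshold so impostors essentially never get in
# (fewer false positives), even if the owner has to swipe several times.
VAULT_THRESHOLD = 0.95

owner_score, impostor_score = 0.88, 0.72   # invented example scores

print(accept(owner_score, LAPTOP_THRESHOLD), accept(impostor_score, LAPTOP_THRESHOLD))  # True, True  (impostor slips in)
print(accept(owner_score, VAULT_THRESHOLD), accept(impostor_score, VAULT_THRESHOLD))    # False, False (owner swipes again)
```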

However, what people often fail to recognize is how badly the numbers skew when the thing to be detected is exceedingly rare. For instance, we currently have many processes in the country designed, ultimately, to detect terrorists: border guards, profiling, no-fly lists, and so on. These all have error rates; sometimes they would miss a real terrorist, and, to the dismay of civil libertarians and air travelers everywhere, sometimes they “catch” innocent people.

A Type I error rate of 0.01% sounds pretty good. Imagine you have a terrorist detector with a Type II error rate of zero: it always detects real terrorists. And its Type I rate is only 0.01%, meaning it generates a false alarm just one time in 10,000. Sounds great, doesn't it? We should deploy it immediately! If this thing points out a terrorist, you've got the right guy. The government can proudly advertise that its detector is 99.99% accurate.

But wait… there are 280 million people in the United States. How many are actual terrorists? I hope not very many, but let's be paranoid and imagine there are 1,000 lying in wait (though I'd wager that if there were, we would have seen at least one terrorist attack on U.S. soil sometime within the last five years). This means that only about 0.00036% of the people we scan are real terrorists, and every one of them will set off the alarm, since our detector has a false negative rate of zero. But our false positive rate of 0.01% is actually far higher than the rate of real terrorists in the population: scanning everyone produces roughly 28,000 false alarms against 1,000 real ones. So while a negative from our terrorist detector is right every time, a positive from it is wrong about 97% of the time. In other words, if the alarm goes off, you can be 3% sure that you've got the right guy!
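For the skeptical, the arithmetic works out as follows; this is a short Python sketch using the assumed figures from above.

```python
# Working the numbers above: a detector with no false negatives and a
# 0.01% false positive rate, applied to the whole U.S. population.

population = 280_000_000
terrorists = 1_000              # assumed, as in the text
false_positive_rate = 0.0001    # 0.01%, i.e. one false alarm per 10,000 innocents

innocents = population - terrorists
true_positives = terrorists                       # Type II error rate is zero
false_positives = false_positive_rate * innocents

total_alarms = true_positives + false_positives
print(f"Chance a scanned person is a terrorist: {terrorists / population:.6%}")  # ~0.000357%
print(f"False alarms raised: {false_positives:,.0f}")                            # ~28,000
print(f"Probability an alarm is wrong: {false_positives / total_alarms:.1%}")    # ~96.6%
print(f"Probability an alarm is right: {true_positives / total_alarms:.1%}")     # ~3.4%
```

That is roughly twenty-eight false alarms for every real terrorist caught, which is where the 97% figure comes from.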

Doesn't sound so good put that way. When the alarm goes off, you can be almost certain (about 97% certain) that you've got an innocent man. Detecting a rare thing without drowning in false positives turns out to be genuinely difficult.

