Screenings and Type I and Type II errors
by Sherry F. Colb
In my column for this week, I talk about some commonalities between breast cancer (and other medical) screenings and dog sniffs. In both cases, I explain, the screening might seem relatively innocuous and noninvasive, which looks like a good reason to go ahead and have it done. But as it turns out, because of type I errors (false positives), the odds diminish that the first screening step will also be the last, and the decision to screen can therefore unwittingly become a decision to take the many invasive steps that follow a positive result. In the column, I suggested that this seemingly inexorable momentum from screening to more invasive searches may represent a good reason to avoid many screenings in the first place, in the absence of some independent reason to believe that there is something to find (i.e., some symptom of the disease for which one is testing or some basis for suspecting that a person to be "sniffed" by a dog is in possession of narcotics).
In this post, I want to be clear about what I am not saying. I am not suggesting that screenings (of the medical or forensic variety) are best avoided at all times. Whether to "screen" in the absence of evidence of a problem is a question best answered by reference to factors including how accurate the screening device is (i.e., how likely we are to get a type I error of the sort that worries me in the column) and how helpful it would be to learn now (and how costly it might be to delay learning until a later point) about the condition for which one would be screening. To illustrate without unnecessarily courting controversy, I shall provide two completely hypothetical examples.
Assume that we have a medical screening test (a blood test) that will indicate a heightened possibility that one currently has cancer of the esophagus. Assume as well that absent this test, esophageal cancer will generally be untreatable by the time the patient actually develops symptoms that might send him or her to the doctor with a complaint. Assume further that the screening test is about 80% accurate (in the sense that only one fifth of the patients who test positive for esophageal cancer turn out to be cancer free). Assume finally that the test to confirm the cancer involves a biopsy of the esophagus that can be performed under local anesthetic and from which the patient recovers within a day in almost all cases. Given all of these assumptions, it seems to me that the screening is worthwhile, notwithstanding the fact that one could avoid type I errors by skipping it.
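To make the arithmetic behind that hypothetical 80% figure concrete, here is a minimal sketch in Python. The size of the screened population and the rate of positive screens are entirely invented for illustration; only the "four out of five positives are real" assumption comes from the hypothetical above.

```python
# Hypothetical illustration of the type I (false positive) arithmetic.
# All numbers below are made up for the sake of the example.

screened = 100_000        # hypothetical number of people who take the blood test
positive_rate = 0.02      # hypothetical fraction of screens that come back positive
ppv = 0.80                # "80% accurate": four in five positives truly have the cancer

positives = screened * positive_rate          # people sent on to a confirmatory biopsy
true_positives = positives * ppv              # positives who actually have cancer
false_positives = positives * (1 - ppv)       # healthy people who get a biopsy anyway

print(f"Positive screens:       {positives:,.0f}")
print(f"Cancers actually found: {true_positives:,.0f}")
print(f"Unnecessary biopsies:   {false_positives:,.0f}")
```

Even under these charitable assumptions, one out of every five people funneled into the confirmatory biopsy turns out to be healthy, which is the residual cost weighed in the paragraphs that follow.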
Assume now that we have a physical device that can, from outside a building, indicate a heightened possibility of the presence of nuclear weapons inside the building. Assume as well that failure to identify the presence of nuclear weapons inside the building could result in the obvious. Assume further that the device is approximately 80% accurate (in the same way as the esophageal cancer screening device is about 80% accurate) and that, as in the medical test, the confirmation process is not enormously invasive. Assume that confirmation involves entering the building with a more precisely calibrated detection device and waving the device around in some part of each room to get an accurate measure. In this case too, it appears that the screening is worthwhile.
Yet even in these relatively easy cases, we can see that there are not inconsiderable costs. If everyone acts on the assumption that the screening is a good idea, then a large number of healthy people will undergo unnecessary esophageal biopsies, along with the accompanying anxiety. And a correspondingly large number of innocent people will have police walking through their homes, waving nuclear-weapon-detection devices in each room (and, in the process, seeing many private things inside those homes). We may nonetheless make the choice (professionally, to recommend a medical screen, or judicially, to authorize a forensic screen), because on balance, it provides the greatest good for the greatest number of individuals. This is a utilitarian calculus, though it may be less offensive to deontologists than those calculi in which we know in advance which individuals will benefit and which will suffer needlessly (e.g., if we were to experiment upon A directly in order to help B and C).
As a result, when we are speaking of individual rights (to privacy or to medical integrity), it is appropriate that we feel some residual discomfort about pushing screenings upon a reluctant audience. This does not mean that we must refrain from doing so, but the residual discomfort remains, is worthy of consideration, and is largely traceable to the type I errors discussed in my column.