The use of error rates in forensic DNA statistics.
The results of DNA typing may qualitatively either include or exclude a potential suspect as the donor of DNA in an evidence sample. Exclusions require no statistical analysis since they are absolute. Inclusions require some assessment of how likely it would be to get the observed genetic match if, in fact, the suspect did not leave the evidence DNA. Completing this statistical assessment requires quantifying the chance of two events. (i) How likely is it that the suspect coincidentally has a DNA profile that matches the profile of the unknown person who really did leave the DNA evidence? (ii) How likely is it that the laboratory would declare a match between the evidence and the suspect when in fact their DNA profiles do not match? The events described in questions (i) and (ii) could each lead to a declared match when the suspect was not the source of the DNA, and thus the chance of each event must be statistically evaluated.
The product rule answers only the first of the two important statistical questions in a forensic DNA case. The second question requires some estimate of laboratory false positive rates. All forensic laboratories I am familiar with fail to provide any estimate of the lab error rate. The only circumstance under which lab error would be unimportant is when its rate is much less than the chance of a coincidental genetic match. In many STR cases using Profiler Plus and Cofiler, for instance, the chance of a coincidental genetic match may be in the range of 1 in billions to 1 in several quadrillion. Only when lab error rates are lower than this can they be safely ignored. The second National Research Council Committee (NRC II), in choosing to disagree with the first NRC committee, never argued that lab error rates are this small. Instead, they made several other arguments. These arguments have been criticized by the scientific community and, as I will document, are flawed either in logic or in light of current scientific evidence that the NRC II committee was apparently unaware of.
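The way the two chances combine can be illustrated with a short calculation. What follows is only a sketch: it assumes that a coincidental match and a laboratory false positive are independent events, the function name is hypothetical, and the input values are illustrative rather than measured rates for any laboratory.

```python
def declared_match_probability(p_coincidence: float, p_lab_error: float) -> float:
    """Chance of a declared match when the suspect is NOT the source.

    Assumes, for illustration only, that a coincidental profile match and a
    laboratory false positive are independent events.
    """
    # P(A or B) = P(A) + P(B) - P(A) * P(B) for independent A and B.
    return p_coincidence + p_lab_error - p_coincidence * p_lab_error


# Illustrative inputs, not measured rates.
p_coincidence = 1e-9  # coincidental match: 1 in a billion
p_lab_error = 1e-3    # hypothetical lab false positive rate: 1 in 1000

total = declared_match_probability(p_coincidence, p_lab_error)
lab_share = p_lab_error / total  # ratio of the lab error rate to the combined chance

print(f"combined chance of a false declared match: {total:.3e}")
print(f"lab error rate as a share of that chance:  {lab_share:.6f}")
```

With inputs of this kind the combined chance is essentially the lab error rate alone, which is why reporting only the coincidental match probability can badly understate the chance of a false declared match.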
There were four basic arguments offered by the NRC II committee to support their conclusion that lab error rates can be ignored. (i) Errors in a specific forensic case depend on many variables that can't be accounted for by proficiency tests or any single number. (ii) Estimates of lab error rates would require excessively large numbers of proficiency tests. (iii) While the total number of proficiency tests could be increased by pooling tests from many labs, this would unnecessarily punish good labs for the errors of bad labs. (iv) When labs make errors they take steps to improve, and time has shown, as in the case of Cellmark, that errors are now less common than they were in 1988 and 1989.
To evaluate these arguments we need to step back and reassess what we are really trying to do. As stated above, what we need to know is whether false matches are likely to occur at a rate of about 1 in several billion or less, or whether the rate is more likely to be, say, 1 in 1000. While the points raised by NRC II and summarized in (i) and (ii) above clearly prevent accurate discrimination between error rates like 1 in 800 vs. 1 in 1000, they certainly don't prevent one from distinguishing between rates of 1 in 1000 vs. 1 in 200 billion! This point becomes even more obvious once one notes that almost all proficiency tests, where errors are typically observed, are much less demanding than actual forensic cases. Thus, it would be ludicrous to assume that if errors occur 1 in 1000 times on simple proficiency tests they will occur less often in real casework. Point (iii) is true to a limited extent. Some labs might be penalized by the use of an industry-wide error rate while others would have their apparent error rate diminished. While it stands to reason that no two labs will have exactly the same error rate, there is no reason to believe that labs will differ over a range that really matters, like 1 in 1000 to 1 in 200 billion. This last conclusion is supported by the observation that lab errors are not generated by only one lab and that they usually involve human error. The final argument made by the NRC II committee, (iv), is simply wrong. For instance, the NRC II committee used Cellmark as an example. They cited the facts that the laboratory had made 2 false matches in 1988 and 1989 but, based on 450 additional tests through 1994, had made no additional errors.
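The contrast between a rate of 1 in 1000 and a rate of 1 in 200 billion can be made concrete with a small calculation. This is a minimal sketch, assuming that tests are independent and share one constant error rate (a simplification that the text does not claim); the function name is hypothetical. It computes how likely it would be to see at least one false match in a run of 450 tests, the number of additional Cellmark tests cited above, under each candidate rate.

```python
import math


def prob_at_least_one_error(rate: float, n_tests: int) -> float:
    """P(at least one false match in n_tests), assuming independent tests
    that share one constant error rate (a simplifying assumption)."""
    # 1 - (1 - rate)**n_tests, computed with log1p/expm1 so that very small
    # rates are not lost to floating-point rounding.
    return -math.expm1(n_tests * math.log1p(-rate))


n_tests = 450  # number of additional Cellmark tests cited in the text

p_if_1_in_1000 = prob_at_least_one_error(1 / 1000, n_tests)
p_if_1_in_200_billion = prob_at_least_one_error(1 / 200e9, n_tests)

print(f"rate 1 in 1000:        {p_if_1_in_1000:.3f}")
print(f"rate 1 in 200 billion: {p_if_1_in_200_billion:.3e}")
```

Under these assumptions, a single observed false match in a few hundred tests would be extraordinarily improbable if the true rate were 1 in 200 billion, so any observed error is strong evidence against rates that small. The converse also holds: an error-free run of 450 tests remains quite likely under a rate of 1 in 1000, so such a run cannot by itself show that the error rate has become negligible.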
For some reason the NRC II committee was not aware that on 17 November 1995, seven months before their report was released, Cellmark discovered a false match that had occurred in an actual case (People of the State of California vs. John Ivan Kocak, No. SCD110465). Apparently, in reaching their conclusion about the improved stature of DNA testing, the NRC II committee was not aware of (i) the false match in a proficiency test by a technician in the California Department of Justice in July 1993, (ii) the false conclusion of paternity made in a DNA test by Genelex in October 1993, or (iii) the Kocak case. Since the NRC II report, additional evidence of these types of problems has continued to appear, such as (i) the false match made in a proficiency test by SERI in California in September 1997, (ii) the switching of evidence samples that occurred at the Minnesota BCA in October 1997, (iii) the errors in the APEX proficiency tests, and (iv) the Philadelphia Police Department false positive, to name a few.