You are testifying in court regarding the results of an image forensic analysis that has detected evidence of tampering in a critical piece of evidence. On cross-examination you are asked how precise your forensic analysis is. How do you answer this question?
There are three basic quantities that need to be considered when assessing the precision of a forensic analysis. Here I will discuss two of them, and in a follow-up entry I will describe the often-overlooked third component. The first quantity, the true positive rate, is the frequency with which a forensic analysis will correctly detect a specific manipulation when the image is fake¹. The second quantity, the false positive rate, is the frequency with which a forensic analysis will incorrectly detect a manipulation even though the image is authentic. The ideal forensic tool has a high true positive rate and a low false positive rate. Both of these values must be taken into consideration when assessing precision. Other statistical measures could be considered, such as sensitivity (another name for the true positive rate) and specificity (one minus the false positive rate), but precision most directly measures what we are most interested in: the likelihood that an image is fake given that a forensic analysis classifies it as fake.
Consider a forensic analysis with a true positive rate of 90% and a false positive rate of 5%. If this forensic analysis detects evidence of tampering, then the precision is computed as the ratio of true positives to the sum of true and false positives: 0.9/(0.9 + 0.05) = 94.7%. Notice that if the false positive rate is 0%, then the precision is 100%, even though the true positive rate is only 90%. In fact, even if the true positive rate is 1%, the precision remains 100%. With a low true positive rate and a 0% false positive rate, your forensic analysis won’t often find a fake, but when it does, you will be absolutely sure that it is a fake.
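The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `precision` is my own, and the calculation treats the two rates as directly comparable, which implicitly assumes that fake and authentic images are equally common.

```python
def precision(true_positive_rate: float, false_positive_rate: float) -> float:
    """Ratio of true positives to all positives.

    Assumption: fake and authentic images are equally common, so the two
    rates can be combined directly, as in the worked example in the text.
    """
    total = true_positive_rate + false_positive_rate
    if total == 0:
        # The analysis never reports tampering, so precision is undefined.
        raise ValueError("precision is undefined when both rates are zero")
    return true_positive_rate / total


print(f"{precision(0.90, 0.05):.1%}")  # 94.7%
print(f"{precision(0.01, 0.00):.1%}")  # 100.0%
```

The second call reproduces the point made above: with a 0% false positive rate, the precision is 100% no matter how low the true positive rate is.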
The true and false positive rates might be available from the original scientific studies on which a specific forensic technique was based. If they are not, then you need to measure them. This can sometimes be a challenge, but it is critical to determining the precision of your forensic analysis. To begin, collect a large number of authentic images. To be sure that you have original images, you might want to take these images yourself, but make sure that they are varied in terms of their content, lighting, quality, resolution, etc. To compute the false positive rate, apply your forensic analysis to each authentic image. The false positive rate is the fraction of these images that are incorrectly identified as fake. To compute the true positive rate, apply your forensic analysis to a large and varied collection of fake images. The true positive rate is the fraction of these fakes that are correctly identified.
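The measurement procedure just described might look something like the following sketch. The names `flag_rate` and `measure_rates` are hypothetical, and the detector is assumed to be any function that returns True when it classifies an image as fake. The toy detector and the numbers standing in for images are placeholders, not a real forensic technique or real data.

```python
from typing import Callable, Iterable, Tuple, TypeVar

Image = TypeVar("Image")


def flag_rate(detector: Callable[[Image], bool], images: Iterable[Image]) -> float:
    """Fraction of the given images that the detector classifies as fake."""
    results = [bool(detector(image)) for image in images]
    if not results:
        raise ValueError("at least one image is required")
    return sum(results) / len(results)


def measure_rates(
    detector: Callable[[Image], bool],
    authentic_images: Iterable[Image],
    fake_images: Iterable[Image],
) -> Tuple[float, float]:
    """Return (true_positive_rate, false_positive_rate) for a detector."""
    # Authentic images flagged as fake are false positives.
    false_positive_rate = flag_rate(detector, authentic_images)
    # Fake images flagged as fake are true positives.
    true_positive_rate = flag_rate(detector, fake_images)
    return true_positive_rate, false_positive_rate


# Toy stand-in: each "image" is a single score and the detector is a threshold.
def toy_detector(score: float) -> bool:
    return score > 0.5


authentic = [0.1, 0.2, 0.6, 0.3]  # one of four is wrongly flagged
fakes = [0.9, 0.7, 0.4, 0.8]      # three of four are correctly flagged

tpr, fpr = measure_rates(toy_detector, authentic, fakes)
print(tpr, fpr)  # 0.75 0.25
```

In practice the two lists would be replaced by large, varied collections of known authentic and known fake images, and `toy_detector` by the forensic analysis being evaluated.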
The number of images that you need to test will range from a few thousand to hundreds of thousands, depending on the complexity of your forensic analysis. This is why a large database of known original images can be invaluable to a forensics lab.
Determining the precision of any forensic analysis is critical. To do so we must measure the frequency with which a forensic analysis correctly identifies a fake and the frequency with which it incorrectly flags an authentic image as fake. These two quantities combine to provide a measure of the precision of a forensic analysis. There is, however, one important piece of data that is missing: the prior likelihood, which I will describe in next week’s posting.
¹ Some forensic techniques work in reverse, classifying an image as original based on certain patterns. In this case, the true positive rate is the frequency with which a forensic analysis will correctly classify an image as original, and the false positive rate is the frequency with which it will incorrectly classify a fake as an original. In either case, the precision is measured in the same way.