What Does the Test Result Mean?
I thought I would take a moment to write about understanding the test result again, because I find it an oft-misunderstood (or at least under-understood) concept. Some parts of this can get a bit challenging. I hope you find it useful.

Many people (including examiners) have the following thought process: “Suspect takes a polygraph, suspect fails a polygraph, and polygraph is about 85% accurate, so there is an 85% chance the suspect is lying.” Although this line of reasoning might be true under some circumstances, it is more likely that the statement is not true. There may be a greater than 85% chance the suspect is lying, or there may be a less than 85% chance. The result really depends on the probability that the suspect was lying before taking the polygraph, something called “the prior probability” or “the prior” for short.
To begin, let us start with the concept of the prior probability. Sometimes the term “base rate” is used as a synonym for the prior probability because base rates can be used to estimate the prior. The better term is prior probability, and it essentially refers to the prevalence of the target behavior or crime that you are evaluating in a person or testing population. For demonstration purposes, I will use the term prior probability of Guilt. Incidentally, if you know the prior probability of Guilt, you can estimate the prior probability of Innocence, as it is 1 minus the probability of Guilt. The following are several levels of prior probability of Guilt. Note that Guilty and Innocent refer to the ground truth state of the testing population. The red circles represent the Guilty and the green circles represent the Innocent:
Next, let us talk about base rates of test results in the testing population. In this case, I am referring to the pass/fail rates of the test and they can be very different depending on the population. For example, pass/fail rates are likely to differ among sex offender maintenance screening, sex offender monitoring screening, public safety pre-employment screening, periodic security screening, diagnostic criminal testing, etc. Essentially, pass/fail rates vary depending on the test used, the target behaviors selected, and the test data analysis model used.
Here is something important to remember: the pass/fail rate of a test is NOT the base rate or prior probability of Guilt (lying) or Innocence (truth-telling). Many people incorrectly conflate these two concepts (mix them together). The pass/fail base rates are composed of the tests that were correct plus those that were incorrect. Most we get right, but some we get wrong. The prior probability of Guilt refers to the condition you have at the beginning of the test: how many in the testing population are Guilty and how many are Innocent. Recall that the test result is a combination of the number of Guilty or Innocent subjects that are correctly identified and the number of errors that occurred. An accurate, reliable test will make more correct identifications than errors over time.

In short, when giving someone a test result, we should know how confident we can be that the subject was lying, given that they failed the test, or vice versa. This concept has been called the “Outcome Confidence” and it describes the level of confidence that you have in the test result. The more formal names for this are positive predictive value (PPV) and negative predictive value (NPV). Basically, this concept provides the consumer or end-user with a value that can be used to update the assessment of the test subject. Recall we had a prior probability of Guilt estimate before the test. After the test, you can have an updated estimate of Guilt or Innocence because of the test result.
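To make the difference between the fail rate and the prior concrete, here is a minimal sketch. It assumes a hypothetical test with 90% sensitivity and a 10% false positive rate; these values and the function name `fail_rate` are illustrative only and are not taken from any study.

```python
def fail_rate(prior_guilt, sensitivity, false_positive_rate):
    """Share of the whole testing population that fails the test.

    The fail rate mixes correct results (Guilty who fail) with
    errors (Innocent who fail), so it is not the prior itself.
    """
    true_positives = sensitivity * prior_guilt
    false_positives = false_positive_rate * (1.0 - prior_guilt)
    return true_positives + false_positives


# Illustrative values only (hypothetical test, not from any study).
SENSITIVITY = 0.90
FALSE_POSITIVE_RATE = 0.10

for prior in (0.10, 0.50, 0.90):
    rate = fail_rate(prior, SENSITIVITY, FALSE_POSITIVE_RATE)
    print(f"prior of Guilt = {prior:.2f} -> fail rate = {rate:.2f}")
```

Under these assumptions, a population in which 10% are Guilty fails at a rate of 18%. The fail rate and the prior coincide here only at 50%.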
But how can you derive this post-test estimate?

Bayes’ Theorem

Bayes’ Theorem is a formula that can be used in polygraph testing to estimate how much a test result changes the Outcome Confidence of Guilt or Innocence. Bayes’ Theorem takes what is known before the test, applies what is known about the test, and updates the knowledge about the subject, given the test results. Here is Reverend Bayes and his theorem applied to polygraph testing.

$$p(\text{deceptive} \mid \text{SR}) = \frac{p(\text{SR} \mid \text{deceptive})\, p(\text{deceptive})}{p(\text{SR} \mid \text{deceptive})\, p(\text{deceptive}) + p(\text{SR} \mid \text{truthful})\, p(\text{truthful})}$$

$$p(\text{truthful} \mid \text{NSR}) = \frac{p(\text{NSR} \mid \text{truthful})\, p(\text{truthful})}{p(\text{NSR} \mid \text{truthful})\, p(\text{truthful}) + p(\text{NSR} \mid \text{deceptive})\, p(\text{deceptive})}$$

I will now dissect this equation to help you better understand its meaning. Let’s look at the equation for the Outcome Confidence in a failed polygraph test. To do this, I will need to define some terms and concepts.

a) True Positives = the Guilty subjects properly identified by failing the test. As a proportion of all Guilty subjects, this is the sensitivity of the test.
b) False Positives = the errors made on Innocent subjects who had a positive test result (failed the lie detection test).
c) True Negatives = the Innocent subjects properly identified by passing the polygraph. As a proportion of all Innocent subjects, this is the specificity of the test.
d) False Negatives = the errors made on Guilty subjects who ended up passing the test.
e) SR = Significant Response, a positive test result, or failed the test.
f) NSR = No Significant Response, a negative test result, or passed the test.
g) p(deceptive | SR) = “The probability the subject is deceptive, given that they failed the test.”
h) p(SR | deceptive) = “The probability the subject would fail the test if they were deceptive.” This is estimated by the test sensitivity.
i) p(deceptive) = the prior probability of Guilt before the subject took the test, sometimes called the “base rate of Guilt or lying.”
j) p(SR | truthful) = the probability of failing the test, given that the subject is telling the truth (false positive rate).
k) p(truthful) = the prior probability of Innocence before the subject took the test, sometimes called the “base rate of Innocence or truth.”

Now consider a plain English explanation of Bayes’ equation applied to a failed polygraph test.

$$p(\text{deceptive} \mid \text{SR}) = \frac{p(\text{SR} \mid \text{deceptive})\, p(\text{deceptive})}{p(\text{SR} \mid \text{deceptive})\, p(\text{deceptive}) + p(\text{SR} \mid \text{truthful})\, p(\text{truthful})}$$

Question: “What is the probability the subject is deceptive, given they failed the test?” This is p(deceptive | SR).

Answer: It is

the probability someone will fail the test if they are lying (sensitivity) times the prior probability (or base rate) of Guilt: p(SR | deceptive) * p(deceptive)

DIVIDED BY

the probability someone will fail the test if they are lying (sensitivity) times the prior probability (or base rate) of Guilt: p(SR | deceptive) * p(deceptive)

PLUS

the probability someone will fail the test if they are telling the truth (false positive rate) times the prior probability (or base rate) of Innocence: p(SR | truthful) * p(truthful)

Stated another way, it is equal to the True Positives ÷ All Positives while taking into consideration the prior probability (or the base rate) of deception and truth telling. You should do the same thing to estimate the Outcome Confidence in a passed polygraph result as well. To do that, take the second equation above, the one for p(truthful | NSR).
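The two Bayes equations above can be sketched in a few lines of code. This is only an illustration: the function names are hypothetical, and the example values assume a made-up test with 90% sensitivity and specificity, no Inconclusive results, and an even prior.

```python
def outcome_confidence_fail(prior_guilt, sensitivity, false_positive_rate):
    """p(deceptive | SR): Outcome Confidence in a failed test (PPV)."""
    true_pos = sensitivity * prior_guilt                   # p(SR | deceptive) * p(deceptive)
    false_pos = false_positive_rate * (1.0 - prior_guilt)  # p(SR | truthful) * p(truthful)
    return true_pos / (true_pos + false_pos)


def outcome_confidence_pass(prior_guilt, specificity, false_negative_rate):
    """p(truthful | NSR): Outcome Confidence in a passed test (NPV)."""
    true_neg = specificity * (1.0 - prior_guilt)           # p(NSR | truthful) * p(truthful)
    false_neg = false_negative_rate * prior_guilt          # p(NSR | deceptive) * p(deceptive)
    return true_neg / (true_neg + false_neg)


# Hypothetical test: 90% sensitivity and specificity, no inconclusives,
# and an even (50%) prior probability of Guilt.
ppv = outcome_confidence_fail(0.50, sensitivity=0.90, false_positive_rate=0.10)
npv = outcome_confidence_pass(0.50, specificity=0.90, false_negative_rate=0.10)
print(f"Outcome Confidence, failed test: {ppv:.2f}")
print(f"Outcome Confidence, passed test: {npv:.2f}")
```

Each function takes the prior as its first argument, so the same test characteristics can be re-evaluated for a different testing population by changing that one number.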
For a passed test, the Outcome Confidence is True Negatives ÷ All Negatives, again accounting for the prior probabilities (or base rates). You should see now that the number and type of errors made depend on the accuracy PLUS the prior probability of Guilt (or base rate) of the testing population. If you are testing mostly Guilty subjects, you will make more errors on Guilty subjects. If there are more Innocent subjects in the testing pool, then you will make more errors on them. You can use estimates from the APA Meta-analytic review for sensitivity, specificity, false positive and false negative rates, for a given technique or for all techniques combined, to calculate the Outcome Confidence. But one thing you have to do is estimate a prior probability of Guilt. Interestingly, you can do this based on historical testing data, and Gubin et al. wrote an article about how to do this.

Pictures versus a thousand (or more) words

OK, now I will use the “picture is worth a thousand words” approach to visually demonstrate these concepts.
Let’s assume there is a polygraph test that is 90% accurate in identifying Guilty and Innocent subjects and has no Inconclusive results. Also assume the prior probability of Guilt is 50%, or .5. That means the Guilty and Innocent subjects are evenly split 50/50 in the testing population.

Membership News 14 APA Magazine 2017, 50 (4)
In the table above, the Outcome Confidence mirrors the test accuracy (sensitivity and specificity) and that is great! In the image below, I will show the same concept in a “testing funnel.”

In this example, there were 500 Guilty and 500 Innocent subjects tested. Of the 500 subjects that passed the test, the test correctly identified 90%, so the Outcome Confidence is .9 or 90%. You can tell the end-user or consumer that you are 90% confident that those who passed the test were telling the truth.

“Real-World” Estimate Examples

Now, let’s take a look at what happens when we use the estimated accuracy and error rates for all polygraph techniques from Table 2 in the Meta-analytic review (APA, 2012) and vary the base rate.

PDD - All techniques (APA, 2012)
            Pass Test    Fail Test
Innocent    0.72 (TN)    0.14 (FP)
Guilty      0.08 (FN)    0.81 (TP)

With an even base rate

Starting with even base rates, assume that you test 500 Guilty and 500 Innocent subjects. Of the 475 that failed the test, the test correctly identified 85%. The Outcome Confidence is .85. You can tell the end-user or consumer that you are 85% confident that those who failed this test, under these conditions, were lying. Here is the testing funnel.
When base rates are lower

Now we will apply the testing funnel approach to another “real-world” polygraph testing situation: PCSOT monitoring testing, a screening approach currently taught in APA training programs. Note: I am not advocating in favor of, or against, this screening tool. I am simply using this because I read a recent article in Federal Probation (Cohen & Spidell, 2016) that provides what seem to be reliable estimates of the base rate of the target for this testing. In this article, they estimated the 3-year recidivism rate for a new sex crime by federally convicted sex offenders to be 2.8%. This means that in a group of 1000 sex offenders, only 28 are guilty of reoffending and 972 are innocent.
As you can imagine, with these ratios, the test is likely going to have a large number of false positives simply because of the large number of innocent people in the testing population. Using the 2.8% value as the prior probability (base rate) and the values from Table 2 in the Meta-analytic review, the testing funnel would look like the following:
That suggests that a failed test under these circumstances provides an Outcome Confidence of about 14%. In other words, of those who failed, 14% were Guilty subjects and the rest were Innocent. This implies that a failed test result here may not be very helpful to the end-user or consumer. If a person fails the monitoring test under these conditions, you can be about 14% confident they were actually reoffending.

Discussion

This phenomenon is often surprising to examiners and end users. As stated earlier, a common misunderstanding exists that if a person takes a test and fails the test, assuming the test is 85% accurate, there is an 85% chance the person is lying about the relevant question targets. As you can see in the examples, that may be true if the prior probability is near chance (50%). But once that prior starts moving away from 50%, Outcome Confidence changes.
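How quickly the Outcome Confidence of a failed test moves with the prior can be seen by sweeping the prior while holding the test rates fixed. This is a rough sketch: the function name is hypothetical, the default rates are the 0.81 (TP) and 0.14 (FP) values quoted from Table 2, and the intermediate priors are arbitrary illustration points rather than estimates for any real population.

```python
def confidence_in_failed_test(prior_guilt, tp_rate=0.81, fp_rate=0.14):
    """p(deceptive | SR) for a given prior probability of Guilt."""
    guilty_who_fail = tp_rate * prior_guilt
    innocent_who_fail = fp_rate * (1.0 - prior_guilt)
    return guilty_who_fail / (guilty_who_fail + innocent_who_fail)


# 0.028 is the 2.8% recidivism estimate used in the text; the other
# priors are arbitrary points chosen only to show the trend.
for prior in (0.028, 0.10, 0.25, 0.50, 0.75, 0.90):
    confidence = confidence_in_failed_test(prior)
    print(f"prior = {prior:5.3f} -> Outcome Confidence (failed) = {confidence:.2f}")
```

With these rates, the 2.8% prior gives about 0.14 and the 50% prior gives about 0.85, matching the two funnels discussed in the text. The same failed result means very different things in the two populations.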
In conclusion, this review of Bayes’ Theorem should help you better understand that prior probabilities (base rates) do affect Outcome Confidence. As such, your clients also need to be aware of this in order to properly assess risk and to consider what actions to take with the test results. This concept may put you in the uncomfortable position of having to estimate the prior probability of Guilt (base rate), but it is vitally important for appreciating what a test result really means. Credibility assessment professionals must learn to understand Outcome Confidence and be able to explain it as it relates to the test result. Responsible professionals have a duty to understand and report testing results and the error estimate surrounding those results. The days of “He failed the heck out of the test” are long behind us.