False positives and false negatives

TL;DR

  1. The goal of testing people is to find out whether they are infected.
  2. Tests can always generate false positive (test incorrectly indicates person is infected) and false negative (test incorrectly does not indicate person is infected) results.
  3. Only people who are infected can have false negative test results.
  4. Only people who are not infected can have false positive test results.
  5. We do not know and will never find out the true infection rate … We try to make inferences based on observable things like test results.
  6. With random testing, given infection rate \(\rho\), false positive probability \(\alpha\), and false negative probability \(\beta\), and sample size \(N\), we expect:
    1. Number of infected people: \(\rho N\)
    2. Number of non-infected people: \((1-\rho) N\)
    3. Number of true positives: \((1-\beta)\rho N\)
    4. Number of false negatives: \(\beta\rho N\)
    5. Number of true negatives: \((1-\alpha)(1-\rho) N\)
    6. Number of false positives: \(\alpha(1-\rho) N\)
    7. Proportion of false positives among positive tests: \[\omega=\frac{\alpha(1-\rho)}{(1-\beta)\rho + \alpha(1-\rho)}\]
    8. For \(\rho=1\%\), \(\alpha=1\%\), \(\beta=30\%\), for example, \[\omega\approx 58.6\%\] indicating that almost 60% of all positive test results are false positives.
    9. Testing everyone all the time would waste resources if infection rates are low.
    10. Testing everyone all the time would be unnecessary if infection rates are high: Just assume everyone is infected.
    11. Testing sick people whose illness is not easily explained by other conditions seems optimal (IIRC, this is close to what health authorities used to recommend before what I can only call "March madness.")
    12. You can use the accompanying simulator to try out scenarios using other assumptions. Note that the proportion of false positives diminishes if you assume higher infection rates, but in that world large absolute numbers of people still test positive. And, false negatives become an issue.
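The arithmetic in items 6–8 can be checked with a few lines of Python (a minimal sketch; the function name is mine):

```python
# Proportion of false positives among positive test results (item 7).
def false_positive_share(rho, alpha, beta):
    # omega = alpha(1 - rho) / [(1 - beta) rho + alpha (1 - rho)]
    true_pos = (1 - beta) * rho
    false_pos = alpha * (1 - rho)
    return false_pos / (true_pos + false_pos)

# Item 8: rho = 1%, alpha = 1%, beta = 30%
omega = false_positive_share(rho=0.01, alpha=0.01, beta=0.30)
print(f"{omega:.1%}")  # -> 58.6%
```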

Discussion

False positives and false negatives are perfectly natural consequences of the fact that most decisions are made in circumstances that are subject to some amount of random influence. In popular media, the results of these tests are presented as dichotomies, but, in fact, they have four possible outcomes.

People tend to think of two outcomes, sick or not sick, when a person is tested for a disease. As the CDC explains politely,

Laboratory test results should always be considered in the context of clinical observations and epidemiological data in making a final diagnosis and patient management decisions.

The condition of the person should indicate whether the person is sick or not.

The goal of a diagnostic test is to tell, with some uncertainty, whether a specific condition is present. A positive test result indicates that the person has that condition; a negative test result indicates that they do not.

Every test has four possible outcomes:

  1. True positive: the person is infected and tests positive.
  2. False positive: the person is not infected but tests positive.
  3. True negative: the person is not infected and tests negative.
  4. False negative: the person is infected but tests negative.

At the time a test result is determined, you do not know which positive is a true positive and which is a false positive, nor which negative is a true negative and which is a false negative …

If you had a way of telling, you wouldn't need the test, would you?

Excuse the snark there, but that last little bit seems to be significant in how often it is ignored … especially when it comes to Covid-19.

If you look carefully, you see that people refer to every positive test result as a case or every person who tested positive as a patient. It bears repeating that no one is a patient unless and until they are actually sick. If someone tests positive and never shows symptoms, maybe they are just not infected instead of being asymptomatic spreaders.

As part of developing tests, researchers try to estimate the false positive and false negative probabilities associated with them. These are calculated under the assumption that test subjects are randomly and independently picked and tested in perfect conditions. They do not apply when the underlying random sampling assumption is violated.

If, for example, a government decides they are going to test everyone who is arriving at the country's ports, or all health care personnel, or all office workers, or everyone who has a cough, etc., these are not randomly and independently picked test subjects. In what follows, we'll make some calculations based on random testing, but this point bears emphasizing: If people who are more likely to be infected are more likely to be tested, the actual false positive rate will be lower than the theoretical one. If people who are likely not to be infected are more likely to be tested, the actual false negative rate will be lower than the theoretical one.

In addition, even with random testing, real world false positive and false negative rates are likely to be higher than the theoretical levels based on the assumption that every lab and every lab worker and every sample collector works perfectly:

… under laboratory conditions, these RT-PCR tests should never show more than 5% false positives or 5% false negatives.

It is important to remember that laboratory testing verifies the analytical sensitivity and analytical specificity of the RT-PCR tests. They represent idealised testing. In a clinical or community setting there may be inefficient sampling, lab contamination, sample degradation or other sources of error that will lead to increased numbers of false positives or false negatives. The diagnostic sensitivity and diagnostic specificity of a test can only be measured in operational conditions.

Even when these estimates of false positive and negative probabilities are not applicable, knowing their implications helps us to put published numbers about test results in context.

Expected numbers of false positive and false negative test results

Denote the rate of infection at a given moment in time using \(\rho\). Roughly speaking, this is the probability that a randomly picked member of the population is infected. As with all such parameters in Stats, the true value of \(\rho\) is not observable. We try to deduce it by obtaining samples of test results.

In every sample of size \(T\), there will be \(P\) positive test results and \(N\) negative test results. Denote the percentage of positive test results using \(\pi\equiv P/T\) and the percentage of negative test results using \(\nu\equiv N/T\).

Clearly, \[P+N\equiv T\] and \[\pi+\nu\equiv 1.\]

Finally, denote the probability of a false positive using \(\alpha\) and the probability of a false negative using \(\beta\).

If we know (we don't and we can't, but, assume, for the sake of argument) the infection rate, \(\rho\), the false positive rate, \(\alpha\), and the false negative rate \(\beta\), we can calculate the expected percentage of positive test results. The key is to remember that false positives (by definition) can only come from among the not infected and false negatives can only come from among the infected.

Given \(T\) randomly and independently picked test subjects, we expect a fraction \(\rho\) of them to be infected. A fraction \(\beta\) of the infected will falsely test negative. Therefore, the expected number of infected who test positive will be \[P_T = (1 - \beta)\rho T.\] Here the subscript \({}_T\) refers to the fact that this is the expected number of true positive test results.

That is not the only source of positive test results. Some of the test subjects who are not infected will also test positive. First, the expected number of test subjects who are not infected is \((1-\rho)T\). Among these, a fraction \(\alpha\) will falsely test positive. Therefore, the expected number of false positives is given by \(P_F=\alpha(1 - \rho)T\).

This means the total number of positive test results will be given by the sum of the number of people who are infected and test positive (true positives) and the number of people who are not infected and test positive (false positives): \[P\equiv\ P_T+P_F\equiv \left[(1-\beta)\rho + \alpha(1 - \rho)\right]T.\]

The logic works similarly for false negatives. The total number of people who test negative will be the sum of the true negatives and the false negatives. The true negatives can only come from among people who are not infected who do not falsely test positive: \[N_T=(1-\alpha)(1-\rho)T.\] The false negatives come from among the infected: \[N_F=\beta\rho T.\] Therefore, the expected number of negative test results is given by \[N\equiv N_T+N_F\equiv\left[(1-\alpha)(1-\rho) + \beta\rho\right]T.\]

The expected percentages of positive and negative test results, \(\pi\) and \(\nu\), respectively, are then given simply by dropping the \(T\) term from those expressions. We can check our logic by making sure they add up to 100%: \[\begin{eqnarray} \pi + \nu &=& \left[(1-\beta)\rho + \alpha(1 - \rho)\right] + \left[(1-\alpha)(1-\rho) + \beta\rho\right] \\ &=& \left[(1-\beta) + \beta\right]\rho + \left[\alpha + (1-\alpha)\right](1-\rho) \\ &=& \rho + (1 - \rho) \\ &\equiv& 1.\end{eqnarray}\]
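The expressions for \(\pi\) and \(\nu\), and the sanity check that they sum to one, translate directly into code (a sketch; the function name is mine):

```python
def expected_shares(rho, alpha, beta):
    """Expected proportions of positive (pi) and negative (nu) test
    results under random sampling."""
    pi = (1 - beta) * rho + alpha * (1 - rho)   # true + false positives
    nu = (1 - alpha) * (1 - rho) + beta * rho   # true + false negatives
    return pi, nu

pi, nu = expected_shares(rho=0.01, alpha=0.01, beta=0.30)
assert abs(pi + nu - 1.0) < 1e-12  # shares always sum to 100%
```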

At any point in time, we really only know the number of positive and negative test results, and the theoretical false positive and false negative rates calculated using the assumption of random sampling. We do not and cannot know the true infection rate. But, we can use either of these equations to express it in terms of the other quantities. For example, starting from the percentage of positive test results, we have: \[\begin{eqnarray} \pi &=& \left[(1-\beta)\rho + \alpha(1 - \rho)\right] \\ &=& \rho - \beta\rho + \alpha-\alpha\rho \\ &=& (1-\alpha-\beta)\rho + \alpha \\ (\pi - \alpha) &=& (1-\alpha-\beta)\rho \\ \rho &=& \frac{\pi-\alpha}{1-\alpha-\beta}.\end{eqnarray}\]

All we did was rewrite the expression for the percentage of positive test results to express \(\rho\) as a function of it. This immediately allows us to see a couple of things more clearly. First, since the infection rate cannot be negative, we know that the numerator and the denominator must have the same sign. I sure hope no one is using tests where the sum of the probabilities of false positives and false negatives is greater than 100%. So, let's assume that the denominator is positive. That means the numerator cannot be negative. That is, \[\pi\geq\alpha\] must hold. Intuitively, this means that the observed percentage of positive tests cannot be less than the false positive rate. This makes sense: Even if no one is infected, there will be at least \(\alpha\%\) positive test results.

Another observation is that it is impossible for more than 100% of the population to be infected. Therefore, assuming both the numerator and the denominator are positive, we must have \[\pi-\alpha\leq 1-\alpha-\beta\rightarrow\pi\leq 1-\beta. \] That is, the observed percentage of positive test results must be less than the true negative probability. Or, paraphrasing, the observed percentage of negative test results (recall that \(\nu\equiv 1-\pi\)) must be no less than the false negative probability.
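The inversion and the two feasibility bounds can be packaged together (a sketch, assuming \(\alpha+\beta<1\); the function name is mine):

```python
def implied_infection_rate(pi, alpha, beta):
    """rho = (pi - alpha) / (1 - alpha - beta), valid only when
    alpha <= pi <= 1 - beta, as derived above."""
    if alpha + beta >= 1:
        raise ValueError("false positive and false negative probabilities sum to >= 100%")
    if not (alpha <= pi <= 1 - beta):
        raise ValueError("observed positive share is outside the feasible range")
    return (pi - alpha) / (1 - alpha - beta)
```

For example, feeding back the expected positive share from the running example, \(\pi=1.69\%\) with \(\alpha=1\%\) and \(\beta=30\%\), recovers \(\rho=1\%\).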

To put this in context, consider the numbers for a country I am familiar with. On September 21, this country reported approximately 110,000 tests were conducted and about 1,500 were positive. That is, about 1.4% tested positive. If we assume that the false positive probability is 1% and the false negative probability is 30%, then this is consistent with an infection rate of about 0.6%, assuming random testing. Of course, testing may not be random and may be limited to people who present with illness, in which case the test results are not informative regarding the overall infection rate among the whole population.
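Plugging the reported figures into the inversion derived above (a sketch; the \(\alpha\) and \(\beta\) values are the illustrative assumptions from the text):

```python
pi = 1_500 / 110_000        # ~1.4% of tests were positive
alpha, beta = 0.01, 0.30    # assumed false positive / false negative probabilities
rho = (pi - alpha) / (1 - alpha - beta)
print(f"{rho:.2%}")         # roughly half a percent, assuming random testing
```

Using the rounded \(\pi=1.4\%\) instead gives \((0.014-0.01)/0.69\approx 0.6\%\), the ballpark figure quoted above.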

According to covidtracking.com, New York state reported 0.98% positive test results. Under the same assumptions, a population infection rate of 0 is consistent with this result, assuming random testing. Once again, if testing is not random, then the results are not informative regarding the whole population.

Some real world numbers

You can only have so much fun rewriting the same equations in different ways. Let's see what we can deduce from some real numbers. Unfortunately, really good estimates of real life false positive and false negative probabilities for SARS-CoV-2 tests are hard to come by. One paper puts the false positive probability in the range of 5% to 30% and the false negative probability in the similarly wide range of 2% to 29%. This article about a rapid SARS-CoV-2 test mentions a false negative rate of 30% with current tests. The review of this device in Lancet claims this particular test does not produce false positives (although the 95% confidence intervals cover 0–4%). Personally, I cannot imagine a real world context where there are never any false positives from any test.

To show the effects of various assumptions about infection, false positive, and false negative probabilities, I built a simple false positive/false negative simulator using the arithmetic explained above. You can try it out to see the impact of various assumptions on the infection rate and the probabilities of false positives and false negatives.

Let's suppose 1% of the population is infected at a given point in time. Also, assume that the false positive probability is relatively small at 1% (I find that a little unrealistic given that everyone involved has an incentive to find positives everywhere, but let's stick with that). Finally, let's assume that the false negative probability with existing tests is 30%.

Note that we do not know who's infected and who's not—that's why people are being tested.

Let's suppose we test 100,000 randomly and independently picked individuals. It is important to note here that the Lancet review of the CovidNudge device makes it clear that the assessment for that device was not done on a random selection of people:

Samples were collected from three groups: self-referred, health-care workers or their family members with suspected COVID-19 who were not admitted to hospital (between April 10 and May 12, at St Mary's Hospital and the John Radcliffe Hospital); patients admitted to an emergency department with suspected COVID-19 (between April 2 and 24, at St Mary's Hospital); and consecutive hospital inpatient admissions with or without suspected COVID-19…

So, let's note that and move on to the arithmetic.

Given a 1% infection rate, we expect 1,000 people in our sample to be infected. Since the false negative probability is 30%, 300 of those will test negative and 700 of them will test positive.

There are also 99,000 people in our sample who are not infected.

Since the false positive probability is 1%, we expect 990 of them to test positive.

That is, out of a random sample of 100,000, we expect a total of 1,690 people to test positive. Out of these 700 will actually be infected, and 990 will not.

That is, almost 60% of the people who test positive will be false positives (people identified as infected when they really aren't).

Now, imagine you test 100,000,000 people instead of 100,000. The relative numbers remain the same, but you now need to multiply all absolute numbers by a thousand.

So, out of 100,000,000 people, about 1,690,000 will test positive. Out of those, 700,000 will be infected, but 990,000 will be false positives (people identified as infected when they are not).
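The worked example above can be reproduced with a small helper (a sketch; names are mine):

```python
def breakdown(N, rho, alpha, beta):
    """Expected counts of the four outcomes in a random sample of size N."""
    infected = rho * N
    healthy = (1 - rho) * N
    return {
        "true_positives": (1 - beta) * infected,
        "false_negatives": beta * infected,
        "true_negatives": (1 - alpha) * healthy,
        "false_positives": alpha * healthy,
    }

b = breakdown(100_000_000, rho=0.01, alpha=0.01, beta=0.30)
# 700,000 true positives + 990,000 false positives = 1,690,000 positive tests
```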

The relative numbers do get "better" if we, say, assume an infection rate of 10%. In that case, 89% of those who test positive will indeed be infected (7,000,000) for a paltry 900,000 false positives. However, now we have to contend with the 3,000,000 people who are infected but tested negative.

Now let's suppose it is possible to bring the false negative probability to 6%.

In that case, we expect 91.3% (9,400,000) of those who test positive to be infected, whereas 8.7% (900,000) will be false positives. However, only 0.7% of those who tested negative will be infected. However, given the total number of tests, that still represents 600,000 infected people incorrectly testing negative (assuming that the infection rate is 10%).

Let's think about the case where the infection rate is 1%: In that case, we are back in the world where more than half (51.3%, 990,000) of those who test positive will be false positives. However, even in that case, a test that reduces the false negative probability from 30% to 6% is preferable, as the proportion of infected among people who test negative falls to about 0.1% (60,000 out of 98,070,000).

These all matter because there is a push to test everyone and extensively contact trace everyone with whom those who test positive have come in contact. In some countries, those who test positive lose a significant portion of their freedom indefinitely. If false positives dominate test results, that means significant harm is being done to those people and their contacts directly. Indirectly, resources are being wasted chasing down people who are not likely to be infected (that is, contacts of a non-infected person who falsely tested positive). In addition, people who falsely test negative remain free to roam and infect others, imposing costs that way.

A more effective approach would be not to try to test everyone, but to focus on those with symptoms that are not explained by other infections. For example, it seems to me that requiring a negative test for SARS-CoV-2 to rule out infection is not the most effective way (and serves too many vested interests). If a person presents with flu symptoms and tests positive for the flu, that seems to be a good enough indication that they don't need to be tested for SARS-CoV-2, especially since there is a quick test that is fit for the purpose during flu season. Second, requiring that people who are not sick test negative for SARS-CoV-2 (usually at regular intervals) is destructive in the presence of false positives. If everyone is tested weekly, there is about a 41% chance that a non-infected individual will falsely test positive at least once over the course of a year due to pure random chance. That is, eventually, significant portions of the population will be laid off for two weeks or more solely due to false positives.
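The 41% figure comes from compounding the per-test false positive probability over a year of weekly tests (a sketch, assuming independent tests):

```python
alpha = 0.01    # assumed per-test false positive probability
weeks = 52      # one test per week for a year
p_at_least_one = 1 - (1 - alpha) ** weeks
print(f"{p_at_least_one:.0%}")  # -> 41%
```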

Rationally considering courses of action means not just focusing on the potentially beneficial aspects (e.g., if we test everyone all the time, we'll identify more infected individuals) but also the real costs of said action (e.g., significant parts of the economy will be idled, reducing opportunity for everyone).

Since late February, early March, almost half of the world's annual GDP has been destroyed, and almost all countries' leaders have explicitly condemned their citizens to a lower growth rate in the years to come. The opportunities foregone are real costs; they are not simply "just money," even though most of the time we measure them in currency so as to have a common yardstick.

You can use the accompanying simulator to try out scenarios using other assumptions.

Here are links to some scenarios you might want to look at:

For your convenience, I've also embedded the app here:

Views expressed here are personal opinions of Covid2020.icu authors and do not represent any of our past, present, or future clients and/or employers.