Why and when positivity is misleading

By Daniel Simons

Last Updated: 29 November 2020

The university has repeatedly touted the low “positivity ratio” in press releases, press conferences, mass emails, and editorials. They told community members to watch that number as an indicator of how the campus was doing, and they used it to claim that our rates were better than those at other schools. But that statistic is meaningless when you’re testing everyone multiple times each week, and it doesn’t permit direct comparison of infection rates between communities. Those are misuses (abuses, really) of the positivity statistic, and they are misleading. This page describes what the positivity ratio actually is, when it is (and isn’t) useful to report positivity, and when it is a meaningless or deceptive statistic.

What is the positivity ratio?

The positivity ratio is a statistic used to estimate rates of infection in a broader population when you can’t test everyone. If you have a population of 100,000 people, you can test 1,000 of them and use the positivity ratio for that thousand to estimate how many of the 100,000 are infected. For example, if 20 out of 1,000 tested positive (2%), then to the extent that your sample of 1,000 was typical of that broader community, you could estimate that had you tested everyone, about 2,000 would have tested positive. If you took a different sample of 1,000 on another day, you could get another estimate of that population infection rate as a way to see whether positivity has increased, decreased, or stayed the same. In both cases, you are attempting to estimate the percentage of people in the broader population who are infected. A stable positivity ratio given a comparable number of tests implies that the rate of infection is stable.
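The scaling step in that example can be written out as a minimal sketch. The function name `estimate_infected` is made up for illustration, and the calculation assumes the sample is typical of the population, which is exactly the assumption the rest of this page questions.

```python
def estimate_infected(sample_positives: int, sample_size: int, population: int) -> float:
    """Scale the positives found in a sample up to the whole population.

    Assumes the sample is representative of the population.
    """
    # Multiply before dividing so the whole-number example stays exact.
    return sample_positives * population / sample_size


# Numbers from the example above: 20 positives among 1,000 people tested,
# in a community of 100,000.
estimate = estimate_infected(20, 1_000, 100_000)
print(estimate)  # 2000.0
```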

In March 2020, testing nationwide was focused primarily on symptomatic people, and the question was what percentage of them had Covid-19 as opposed to some other illness. In that context, the population of interest was people with symptoms, and the goal was to determine whether Covid-19 was the cause of those symptoms. For those symptomatic people, the ratio of positive tests to all tests (the “positivity ratio”) gave an estimate of the proportion of sick people who had Covid. As testing has become more widely available, the positivity ratio has been used to estimate the infection rate in entire communities, not just in those who are showing symptoms.

Why is the positivity ratio meaningless and misleading when you’re testing everyone?

The key idea of a positivity ratio is that it provides an estimate of the infection rate in an entire population based on a sample of people from that population. Much like a political poll, it assumes that the sample of a relatively small number of people is “representative” or “typical” of the broader population so that whatever you learn from that sample will generalize to everyone. If you can poll everyone (i.e., hold an election), you no longer need the polls because you actually know the vote totals. Similarly, if you can test everyone, there’s no need to estimate the population infection rate from a sample because you have tested the whole population: You actually know the number you’re trying to estimate.

Put simply, you should just report the proportion of people who are infected rather than the proportion of tests that came back positive. Why is it misleading to report the positivity ratio when you’re testing everyone repeatedly? Let’s work through the logic with a concrete example.

This summer, the university tested all returning athletes repeatedly. They refused to release the testing results at the time, but they eventually provided a summary of those early summer test results on August 3. In their press release, they touted the low positivity ratio among the athletes of 1.9% as of July 30. You might think, based on that 1.9% positivity ratio, that only 1.9% of the returning athletes tested positive for Covid over the summer. But in reality, 14% of the returning athletes tested positive!

\({12 \over 164} = 7.3\%\)

\({11 \over 152} = 7.2\%\)

Fall semester

\({23 \over 164} = 14.0\%\)

Misleading positivity numbers on the campus dashboard

When you test everyone, as Illinois is doing, there is no need to estimate a rate of infection: you can just compute the actual rate. The purpose of a positivity ratio is to estimate the overall infection rate from a sample of people in that population, but when you are testing everyone, it is an invalid statistic. It does not estimate the population infection rate.

The 7-day positivity featured prominently on the dashboard and touted repeatedly to the media and public is even more meaningless and misleading. The numerator is the number of positive tests during that 7-day window. The denominator is the total number of tests. But that total number includes multiple tests from the same people each week. Just as the 1.9% positivity for the athletes this summer was misleading because the same athletes were tested repeatedly, the 7-day positivity is not interpretable because it combines multiple tests of the same people. It is not a valid estimate of the population infection rate. A more meaningful number to report would be the number of people newly infected during that week. If you want to report a proportion, a more meaningful one would be the proportion of the people tested that week who were positive (i.e., use people tested rather than total tests in the denominator). Another concrete example might clarify why positivity is an invalid estimate of infections when you are testing everyone:

Through day 2 of the table below:

\({cases \over people} = {100 \over 10{,}000} = 0.01\)

\({cases \over tests} = {100 \over 19{,}950} \approx 0.005\)

Through day 25 of the table below:

\({cases \over people} = {1{,}179 \over 10{,}000} = 0.1179\)

\({cases \over tests} = {1{,}179 \over 235{,}535} \approx 0.005\)

Day	Daily Tests	Daily Positivity	Daily Cases	Cumulative Cases	Cumulative Tests
1	10000	0.005	50	50	10000
2	9950	0.005	50	100	19950
3	9900	0.005	50	150	29850
4	9850	0.005	49	199	39700
5	9801	0.005	49	248	49501
6	9752	0.005	49	297	59253
7	9703	0.005	49	346	68956
8	9654	0.005	48	394	78610
9	9606	0.005	48	442	88216
10	9558	0.005	48	490	97774
11	9510	0.005	48	538	107284
12	9462	0.005	47	585	116746
13	9415	0.005	47	632	126161
14	9368	0.005	47	679	135529
15	9321	0.005	47	726	144850
16	9274	0.005	46	772	154124
17	9228	0.005	46	818	163352
18	9182	0.005	46	864	172534
19	9136	0.005	46	910	181670
20	9090	0.005	45	955	190760
21	9045	0.005	45	1000	199805
22	9000	0.005	45	1045	208805
23	8955	0.005	45	1090	217760
24	8910	0.005	45	1135	226670
25	8865	0.005	44	1179	235535
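The table's numbers are consistent with a simple rule, which is inferred here from the rows rather than stated anywhere in the text: start with 10,000 people, test everyone who has not yet tested positive each day, and have 0.5% of each day's tests come back positive, rounded to the nearest whole case. A minimal sketch under that assumption (the function name `simulate` and the rounding rule are illustrative guesses, not the author's code):

```python
def simulate(population=10_000, days=25):
    """Rebuild the table under an assumed rule: each day, test everyone who
    has not yet tested positive, with 0.5% of that day's tests positive."""
    rows = []
    cumulative_cases = 0
    cumulative_tests = 0
    for day in range(1, days + 1):
        daily_tests = population - cumulative_cases
        # 0.5% of the day's tests, rounded half up, using integer arithmetic
        # to avoid floating-point surprises at exact halves such as 49.5.
        daily_cases = (daily_tests * 5 + 500) // 1000
        cumulative_cases += daily_cases
        cumulative_tests += daily_tests
        rows.append((day, daily_tests, daily_cases, cumulative_cases, cumulative_tests))
    return rows


rows = simulate()
day, daily_tests, daily_cases, total_cases, total_tests = rows[-1]
print(f"proportion infected: {total_cases / 10_000:.4f}")        # 0.1179
print(f"cumulative positivity: {total_cases / total_tests:.4f}")  # 0.0050
```

Under this rule the positivity stays near 0.5% on every day and in total, while the share of people who have been infected climbs to nearly 12%.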

Testing everyone repeatedly, with rapid results, is essential for containing spread on campus. But it invalidates positivity as an estimate of the infection rate. When testing everyone, you can just compute the numbers that matter directly: the number of cases and the proportion of tested people who are infected. We need the numbers analogous to the 14% we could compute for the athletes. The university does not provide that information. It also does not provide the information necessary to compute it.
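To make the difference between the two denominators concrete, here is a minimal sketch using a tiny, entirely hypothetical test log (the campus does not release data in this form). The names `test_log`, `positivity_ratio`, and `proportion_infected` are made up for illustration.

```python
# Hypothetical log of (person_id, tested_positive) records for one period.
# Most people appear several times because they are tested repeatedly.
test_log = [
    ("a", False), ("a", False), ("a", True),
    ("b", False), ("b", False), ("b", False),
    ("c", False), ("c", False),
    ("d", True),
    ("e", False), ("e", False),
]


def positivity_ratio(log):
    """Positive tests divided by total tests (cases / tests)."""
    positives = sum(1 for _, result in log if result)
    return positives / len(log)


def proportion_infected(log):
    """People with at least one positive test divided by people tested
    (cases / people)."""
    people_tested = {person for person, _ in log}
    people_positive = {person for person, result in log if result}
    return len(people_positive) / len(people_tested)


print(f"positivity ratio:    {positivity_ratio(test_log):.3f}")
print(f"proportion infected: {proportion_infected(test_log):.3f}")
```

In this toy log, 2 of 11 tests are positive but 2 of 5 people are infected. The more often each person is tested, the further the per-test ratio falls below the per-person proportion.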

(Subtle statistical point: The campus tested about 38,000 people each week, with many people testing 2 or 3 times each week. On average, they conducted about 10,000 tests each day. Given that people are tested repeatedly every 3-4 days, the single-day positivity can’t be used to estimate the campus infection rate because the daily samples are not independent of each other.)

The university touts the low positivity ratio as an important index of how the campus is doing. Ironically, in a mass email sent right before the start of the semester (August 21), Chancellor Jones defined the positivity ratio incorrectly as “the percentage of people who test positive out of those who have been tested.” That number is interesting and important, but it is the proportion infected (\(cases \over people\)), not the positivity ratio (\(cases \over tests\)). Even more ironically, the proportion infected is what we should be paying attention to, but the campus never provided that number. They also never discussed the proportion infected in press releases, interviews, or emails, possibly because it would show that more than 8% of the entire tested population (about 49,000 people) and more than 14% of an estimated 25,000 undergraduates tested positive during the fall semester. The email also stated that the positivity ratio “tells us how widespread the infection is here in our community and whether our levels of testing are keeping up with levels of disease transmission.” It doesn’t. When you are testing everyone in the population of interest multiple times each week, neither the daily positivity ratio, nor the 7-day positivity ratio, nor the total positivity ratio provides an index of the infection rate. The actual infection rate does.

Positivity ratios are hard to interpret even when used properly

The CUPHD (Champaign-Urbana Public Health District) and IDPH (Illinois Department of Public Health) have used positivity more appropriately. They estimate the rates of infection for Champaign County, Illinois Region 6, or the whole state of Illinois using (mostly) independent samples from those populations of interest. The people who get tested one day in Champaign County likely have little overlap with those tested on other days, and the groups of people tested from day to day within a region are likely fairly similar in their makeup. Consequently, changes in the positivity ratio can be meaningfully interpreted as changes in the infection rate in the population.

But even then, positivity is not straightforward to interpret. Positivity varies not just with the proportion of the population that is infected, but also with the number of tests conducted and who has access to those tests (only sick people? anyone who wants one?). If two communities have different access to testing or different testing strategies, differences in positivity do not necessarily mean that they have differences in the prevalence of infection. A really high positivity ratio likely means that a community is conducting too little testing and is focusing the tests they are conducting on people with symptoms. A really low positivity ratio could mean that infection rates are low and/or that the community is conducting widespread testing. Positivity combines infection rates and the amount of testing in ways that make it hard to compare ratios directly whenever the amount of testing varies over time or differs across regions.
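A toy calculation can illustrate how the testing strategy alone moves positivity. Every number here is hypothetical: two communities share the same 2% prevalence, tests of symptomatic people are assumed to come back positive 40% of the time, and tests of everyone else are assumed to come back positive at the prevalence rate. The function name `blended_positivity` is made up for this sketch, and the model is deliberately crude.

```python
def blended_positivity(symptomatic_tests, general_tests,
                       prevalence, symptomatic_hit_rate):
    """Positivity when tests are a mix of symptomatic and general testing.

    Simplifying assumptions (not from any real data): symptomatic tests are
    positive at `symptomatic_hit_rate`; all other tests are positive at the
    community `prevalence`.
    """
    positives = (symptomatic_tests * symptomatic_hit_rate
                 + general_tests * prevalence)
    return positives / (symptomatic_tests + general_tests)


# Community A: limited testing, only people with symptoms.
community_a = blended_positivity(500, 0, prevalence=0.02,
                                 symptomatic_hit_rate=0.40)

# Community B: same prevalence, same symptomatic testing, plus widespread
# testing of people without symptoms.
community_b = blended_positivity(500, 19_500, prevalence=0.02,
                                 symptomatic_hit_rate=0.40)

print(f"A: {community_a:.2%}  B: {community_b:.2%}")
```

With these made-up inputs, community A reports 40% positivity and community B reports roughly 3%, even though the same share of each population is infected.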

Within a community, if the number of tests is stable (or increasing) and the people being tested are relatively similar from day to day, an increase in the positivity ratio reflects an increase in the proportion of people infected with Covid. If positivity is increasing faster than the rate of increase in testing, that also reflects increasing infection levels. So, when Illinois Region 6 and Champaign County saw a big spike in positivity throughout October and November, those increases were meaningful because the increases in positivity were greater than the increases in testing. When positivity declines, that can be due to increased testing or to reduced infections.

Although changes in positivity within a community can be meaningful, the absolute level of positivity can be hard to interpret as a level of infection. For positivity to provide an unbiased estimate of the infection rate in the community, you would need to test people from the community at random to ensure that there was nothing systematically different about the people who were and were not tested. That almost never happens in practice, so it’s essential to think about how the tested sample is similar to and different from the broader population to which you want to generalize. If the sample isn’t representative, then positivity in the sample might not generalize to infection rates in the broader community.