Update on Dec 8, 2020: An IllinoisNewsroom article by Lee Gaines on December 7 reported additional numbers from the university. Some were changes from previous claims/reports and others were new. I have updated the prose and numbers below to reflect this new information, with all changes flagged with “Update.”
Table of Contents
The numbers
849,912 Total tests during the in-person part of the semester, including the 8 days prior to the start of classes when students returned to Champaign-Urbana and began testing (August 16 - November 22; an average of 8,585 tests per day).
3,641 Total cases during the in-person part of the semester (August 24 - November 22).
3,923 Total cases during the in-person part of the semester, including the 8 days prior to the start of classes when students returned to Champaign-Urbana and began testing (August 16 - November 22).
40 Cases/Day Average daily cases between August 16 and November 22 (99 days).
~13% Estimated percentage of undergraduates on campus who tested positive between August 16 and November 22 (assuming 25k undergraduates in Champaign-Urbana, with undergraduates constituting 83% of positive tests). Update: IllinoisNewsroom reported 24,112 undergrads tested in week 1, with 83% of positives (3,226/3,885) coming from undergraduates (changed from the original claim of 95% – it’s unclear whether the percentage changed over the semester or the original claim was wrong). Using my 25k estimate (slightly conservative) and 83% changed my estimate of the undergraduate infection percentage from 14.9% to 13%.
The “Calculations and assumptions” section describes the assumptions underlying these calculations.
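As a quick check, the arithmetic behind the ~13% estimate can be sketched in a few lines. The case figures are those reported by IllinoisNewsroom; the 25,000 undergraduate count is my own (slightly conservative) estimate:

```python
# Sketch of the ~13% undergraduate-infection estimate.
# Case figures are from the IllinoisNewsroom report; the 25,000
# undergraduate count is the author's estimate, not an official number.
total_cases = 3885            # total positives cited in the report
undergrad_share = 0.83        # 3226/3885 of positives were undergraduates
undergrads_in_town = 25_000   # estimated undergraduates in Champaign-Urbana

undergrad_cases = total_cases * undergrad_share
pct_infected = 100 * undergrad_cases / undergrads_in_town
print(f"{pct_infected:.1f}% of undergraduates tested positive")  # ~12.9%
```

Rounding 12.9% up gives the ~13% figure above; the earlier 14.9% estimate came from the same calculation with the original 95% share.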
Summary: What went well
Summary: What went poorly
Narrative Summary
Preparations and predictions
During the spring and summer of 2020, the University developed its own saliva-based test that proved to have low false-positive and false-negative rates. That testing capability made reopening a possibility, and the University has promoted that achievement heavily in the popular media. A lab in Vet Med was converted to handle processing of the tests. The SHIELD Committee was set up to lead the effort. Many employees at the University dropped other aspects of their careers to devote their time to this huge effort.
The initial plan to reopen was based on testing everyone once per week. But, modeling by the physicists on the SHIELD Committee (led by Nigel Goldenfeld and Sergei Maslov) showed that once/week testing would be insufficient to mitigate the spread of covid on campus (see time stamp 29:31 in this video), and that twice/week testing was necessary. In various press releases and interviews (e.g., Illinois Newsroom story, NPR story, and video briefing), the University announced a “conservative” model prediction of fewer than 700 cases in total over the course of the semester. Based on extant infection rates in Illinois, the University modelers guessed that about 200 people would arrive on campus already infected. If all went as planned, rapid testing would identify those cases promptly, infected people would be isolated, and close contacts would be quarantined. According to the University, after a brief surge of cases as students arrived and intermingled, the rapid and frequent testing would mitigate spread. The modelers predicted that within about 3 weeks, new daily cases would be reduced to low single-digits per day, with active infections remaining “below 100 at any given time” (see timepoint 32:59 in this video). Once case numbers dropped, new cases were predicted to come mostly from people outside of Champaign-Urbana bringing Covid to the University rather than from uncontrolled spread at the University.
Testing on arrival
Mandatory testing for returning students began on August 16, although many students arrived in Champaign-Urbana before that (leases tend to start on August 1 or August 15). The University required all 5,300 students living on campus to be tested before moving in. Although it asked students who lived off campus to test as soon as they arrived in Champaign-Urbana, the University did not know how many students had returned, so comprehensive enforcement of the requirement to test upon arrival was not possible. Before classes started on Monday, August 24, a total of 282 people had tested positive, but that number likely underestimates the number of people who returned to Champaign-Urbana already infected. Many students living off campus likely were not tested until after the start of classes (see the large spike in tests during the first week), meaning that they might have spread Covid-19 before getting tested. Update: IllinoisNewsroom reported 5,236 undergraduates in university housing, fewer than the original approximation of 5,300.
The initial surge
The weekend before classes started and the weekend at the end of the first week of classes both reportedly were huge party weekends. By the start of classes (August 24), the 7-day average for daily cases already was up to 47/day. By the end of the first week of classes, the University had already exceeded the predicted semester total of 700 cases. By September 4, the 7-day average was 128 cases/day. Test results were increasingly delayed, extending to more than 40 hours during the first week.
According to the modelers, without a change in policies, the rapid and uncontrolled spread at the University during the initial weeks would result in 8,000 cases by the end of the semester. On September 2, the University announced a 2-week “lock down” in which students were asked to leave their rooms only for essential activities (e.g., testing, classes, meals). In that announcement (see also this press release), the Chancellor blamed the surge in cases on “recent unacceptable behavior by a small number of students.” The University amplified that message during a Zoom press conference (not posted online, but the link had been posted publicly, so I watched and took notes), and SHIELD Committee Chair Martin Burke placed blame exclusively on those students. During the media conference, a journalist asked the administration whether they took any responsibility for the outbreak or for the “experiment” of bringing students back in person. The University denied that it was an experiment and declined to accept any responsibility. (See this July 21 article in The Atlantic that predicted colleges would blame students for outbreaks.)
Mitigation
Weekly new case numbers dropped to 100-300/week for the 5 weeks during and following the restrictions on non-essential activities. Week 8 had the lowest weekly total of the semester (74 total cases), and the 7-day average hit its lowest point (10 cases/day) on October 19 and 20.
A second surge
After October 20, cases began climbing steadily. Week 10 had 236 cases, and, following Halloween and the start of Big Ten football, Week 11 had 360. That was the worst week since Week 2 of the semester. The University again asked students to minimize non-essential activities, and cases began dropping slightly, with 282 cases in Week 12 and 216 in Week 13. Thanksgiving break began at the end of Week 13, and students were discouraged from returning to campus after break (classes shifted to remote instruction after break).
Final outcome
Between August 16 and November 22, the University conducted 849,912 Covid tests. That testing found 3,923 cases of Covid-19. Over those 99 days, the University averaged 40 cases/day. Spread at the University was never entirely eliminated, and it never reliably dropped to the single-digit daily numbers predicted by the modeling that was used to justify reopening. By Thanksgiving break, the University had more than 5x the number of cases that the University’s modelers had described as a conservative estimate.
Interpretation
The University had far more cases than it had predicted, and far more than other universities with similarly pervasive testing:
Although testing is necessary for success, it does not guarantee it. Why did other schools succeed in preventing a large outbreak when the University did not? Perhaps it was bad luck. Perhaps the party culture at the University was more problematic than at other universities. Perhaps other institutions went to greater efforts to develop community buy-in for mitigation efforts (e.g., with rewards in addition to threats, student-guided initiatives, transparency/communication about the situation on campus). Perhaps other institutions did a better job of initial testing and quarantining as students arrived on campus, preventing spread from those arriving already infected. Perhaps they made the quarantine and isolation procedures non-aversive to students (e.g., Cornell isolated/quarantined students at their on-campus hotel). Perhaps they did a better job of tracking who was in town so that they could verify the coverage of their testing net.
What this fall semester showed clearly is that testing alone is insufficient. Avoiding a repeat of the fall semester in the spring will require creativity, student-guided initiatives, improved transparency, administrative leadership, and, of course, massive testing. A good place to start would be to examine how the University’s policies and strategies differed from those of universities that were more successful.
The blame game
The University repeatedly blamed the surge of cases during the opening weeks on a small group of non-compliant students. SHIELD Committee Chair Martin Burke stated “that a small number of students can cause an extraordinary level of damage if they choose to willfully break the law.” The University amplified that message by threatening punishments for non-compliant behavior. They even adopted post-9/11 anti-terrorism language—“see something, say something”—in promoting a website where students could anonymously report the behavior of peers. That “blame the students” message was amplified in popular media interviews and stories.
Non-compliant students are an easy target for ire. The action of throwing or attending a party while knowingly positive for Covid-19 is deplorable—it puts the University community (and broader Champaign-Urbana community) at risk. No doubt such non-compliance contributed to spread. But singling out one cause and blaming everything on it ignores other causes for the failure to contain spread. In fact, the very modeling that the University relied on to reopen suggests a different factor likely played a much bigger role than the handful of students who deliberately “broke the law”: the failure to provide sufficiently timely test results.
The modeling suggested that with timely results and twice-weekly testing, most infected people could be isolated within 2-3 days, before they had reached their most infectious levels. The few infected people missed by an initial test—due to a (rare) false negative result or insufficient viral load to be detected—would be caught by the next test a few days later. With frequent tests, rapid results, and immediate isolation and tracing, you’d quickly catch and contain spread, and the number of new cases would be reduced to a trickle. According to the modeling, even some partying would not massively increase the rate of infection.
Experts at the University knew that rapid test results were essential: In an August 19 press release about the University’s saliva test, Rebecca Smith, an epidemiologist on the SHIELD Committee, stated: “unless we have a test that can give them results very quickly, by the time somebody finds out they are infected, they will have spread the virus.” The University—particularly SHIELD Committee Chair Martin Burke—repeatedly and publicly claimed that the University would return test results in under 5-6 hours (e.g., this Washington Post story).
Audacious plans, especially never-before-attempted ones, rarely go as smoothly as expected. This one didn’t. According to a number of reports from students, notification delays for test results ballooned to more than 40 hours during the first week of classes. Students who tested on Friday morning of the first week of classes didn’t learn their results until Sunday.
The delay in providing test results meant that during the first week of classes, and especially during the weekend following that first week of classes, there were hundreds of infected students who didn’t yet know they were infected. As far as they knew, they were negative. Had they known that they were infected, the vast majority of those students would have done exactly what was asked of them—isolated themselves and helped contact tracers identify who else might have been exposed. Instead, because of delays in learning their test results, they likely attended social gatherings while maximally infectious, unwittingly and unintentionally exposing others to Covid-19. Whereas only a few students deliberately “broke the law” by breaking isolation when they knew they were positive for Covid-19, perhaps 100x as many students unwittingly infected others through no fault of their own, simply because they didn’t yet know the results of their tests.
It’s easier to place all of the blame on a few non-compliant students than to confront the fact that the impressive testing effort did not scale efficiently enough to prevent outbreaks of an aggressive virus. When planning something as logistically massive as the testing operation at the University, it’s essential to anticipate the consequences of delays and glitches, because they are inevitable. If it takes 5 hours from test to results when processing 1,500 tests in a day, the per-test time almost inevitably will increase when scaled to 15,000 tests.
In stories and interviews, Burke claimed that daily testing capacity was 10,000 tests or even 20,000 tests per day. For example, in an August 28 interview in The Scientist, Burke stated, “we had to get to 20,000 tests per day in order to be able to meet the demand of twice per week for 60,000 people. I’m very excited to tell you that we achieved that.” Burke similarly told the Washington Post that they could complete 20,000 tests per day with results in 3-5 hours (see also CBS Evening News on Sept 10).
Yet, investigative reporting by Willie Cui at the Daily Illini showed that the Vet Med lab established to process the tests was built and staffed to handle only once-weekly testing of everyone, not twice-weekly testing; the University did not increase the capacity of that lab when it became clear that twice-weekly testing would be necessary. Consequently, the lab was not built to handle more than 10,000 tests per day and never could have returned results as promptly as needed even if there had been no other logistics problems. Only at the very end of the semester was the lab capable of processing slightly more than 10,000 tests per day.
Without increasing the capacity of the test-processing lab, the University had no chance to return results within 5-6 hours. More than 15,000 people tested on four of the first five days of classes. Delays of many hours or days were inevitable. Not until mid-semester did delays drop below 24 hours, and even that was not consistent. Only in the final 4-5 weeks did they drop under 10 hours. Note that the University did not report the testing delays even though they have that information; these estimates are based on many anecdotal reports from students.
Did student non-compliance contribute to the outbreak at the University? Of course. Gatherings, bars, partying, and risky behavior undoubtedly were a major source of spread. But, failed logistics were a substantial factor as well. The University clearly recognized that delays were a problem. They worked to improve test processing logistics, adjusted testing schedules to better distribute tests across days of the week, and reduced testing frequency for faculty, staff and grad students who were at lower risk. Yet, the University refused to acknowledge any role for test delays in the surge of cases, continuing to focus all blame on scofflaw students.
When deciding whether to bring students back to Champaign-Urbana in the midst of a pandemic, it was imperative to anticipate failures and delays. Knowing in advance where all of the fail points will be is nearly impossible, especially for an operation this big, but it is inevitable that fail points will emerge. The University hoped everything would scale without a hitch and adopted the most optimistic possible modeling predictions. Instead, the University should have evaluated what would happen if the testing operation did not scale and should have justified reopening by addressing what would happen with more realistic assumptions.
The models got it right (if you start with the right assumptions)
By blaming the outbreak entirely on non-compliant students, the University indirectly placed the blame on the modelers for not predicting how students would act. The modelers were pilloried online for inadequately accounting for student non-compliance, including by the well-known web comic XKCD. Some of that criticism was deserved: Nigel Goldenfeld had stated in an August 11 press briefing (time stamp: 24:47) that they had modeled a “worst-case scenario” because the “social life of students is not well documented.” Did the models fully anticipate how existing incentive structures might lead to non-compliance or address all aspects of student social engagement optimally? No. The modelers largely ignored guidance from social scientists about likely non-compliance by those subsets of the undergraduate population that likely would engage in highly risky partying. But, delayed test results likely contributed more to the surge than a handful of non-compliant students. And, critically, the consequences of test delays were predicted by the University’s models!
The University relied on the prediction of fewer than 700 cases all semester to justify reopening. Ideally, they would have made a reopening decision based on realistic or even conservative (“bad case”) scenarios, but they instead adopted the most optimistic possible assumptions. Those optimistic assumptions largely proved to be misguided or wrong, some predictably so (like the assumption of rapid results for more than 10,000 tests/day). The list of questionable or incorrect assumptions, many of which were explained in the “blame game” and “narrative summary” sections of this document, included the following:
The model forecast of 700 cases in total over the course of the fall semester assumed that test results would be returned in 5-6 hours so that contact tracing could start promptly. But, the models also showed what would happen without assuming rapid test results. In fact, they predicted what would happen to the total number of cases in the fall semester if contact tracing were delayed, either due to slow notification of a positive test or to failures to reach infected people promptly. And that prediction was accurate! Unfortunately, the modelers did not clearly communicate the effects of such delays (at least not publicly).
The alternative predictions that took delays into account were mentioned by the University only in a short segment during a video briefing (time point: 30:40) in which Nigel Goldenfeld (one of the University’s modelers) presented the following slide:
The slide is hard to understand, and it is not well explained in the video. Here’s what it shows:
The slide notes that a 1-day delay in notification leads to 10x the number of quarantined people, but it doesn’t directly state the prediction for the total number of infections. You have to read that off of the complicated plot on the left that shows predictions for the total number of cases over the course of the fall semester. It is a color “heat map” using a log scale of the infection count on the right axis, delay to notify someone of a positive test on the X axis, and compliance with contact tracing on the left axis.
To find the predicted total number of cases, you first have to locate the color patch corresponding to a combination of compliance on the left and delay on the bottom. You then find that color on the scale to the right of the heat map and find the number that is aligned with it. Once you find that number (let’s call it \(x\)), you raise 10 to that power (\(10^x\)).
For example, to arrive at the University’s prediction of 700 cases for the whole semester, you need to make the most optimistic possible assumptions about compliance and delay: the dark purple patch in the upper left corner of the graph with perfect compliance and no delay in notification. That dark purple color appears at the bottom of the scale on the right, which I estimate to be somewhere between 2.8 and 2.9 on that scale. So, if we assume an exponent of 2.85, we get \(10^{2.85}\), which is a prediction of about 700 cases.
The graph also shows what the models predicted with different assumptions. For example, to see the effect of delayed test results, just shift along the X axis. Even if we assume perfect compliance (staying on the top row of the graph), a delay of 1 Day on the X axis shifts you into one of the aqua-green squares, an exponent of about 3.6. So, a delay of 1 day in starting contact tracing alone leads to a model prediction of \(10^{3.6}\) or just under 4000 cases for the semester. That’s exactly what happened!
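The conversion from the heat map’s log scale to case counts can be sketched in a few lines. The exponents here are my visual estimates from the slide described above, not published values:

```python
# Converting the heat map's log-scale color values to predicted case totals.
# The color on the slide maps to an exponent x; the predicted semester total
# is 10**x. Both exponents below are the author's visual estimates.
best_case_exponent = 2.85      # perfect compliance, no notification delay
one_day_delay_exponent = 3.6   # perfect compliance, 1-day notification delay

print(round(10 ** best_case_exponent))      # ~708: the "fewer than 700" forecast
print(round(10 ** one_day_delay_exponent))  # ~3981: close to the actual 3923 cases
```

The striking part is how steep the log scale is: shifting the exponent by just 0.75 multiplies the predicted caseload by more than 5x.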
As Goldenfeld said in the video, when relying almost exclusively on testing to mitigate spread, “the only way contact tracing works is if you do it effectively and fast. And that’s why our test is so important, because it is so fast” (timestamp 31:09). Burke also publicly noted that “you really do have a preciously short window of time to find out who’s positive, quickly help them isolate safely and stop them from spreading it to others.” Ironically, the models showed exactly what would happen if test results were delayed—rapid test results are essential to prevent spread. It’s not easy to grasp the consequences of delayed test results or contact tracing from a one-minute explanation of a color heat map on a log scale, though.
The nearly 4000 cases we experienced this term are consistent with what the models showed would happen if the start of contact tracing were delayed by 24-30 hours, as it was for much of the term. The models got it right, but the modelers and the University based their forecasts on modeling assumptions that they knew (or at least should have known) were impossibly optimistic.
Transparency, Openness, and Accuracy
From before the opening until the end of the semester, the University touted their transparency and openness. In a November semester wrap-up video, for example, Chancellor Jones talked about “the importance of communication” and claimed that “we are truly transparent” (timestamp 59:38). But, the University provided minimal information, less than many other institutions (e.g., Ohio State, Northeastern, Boston University, Cornell). The University dashboard provided the daily numbers of tests and cases but little other useful information. In addition to missing and incomplete information, the University also provided misleading information.
Missing information
Misleading information
The University repeatedly claimed that test results (10k or 20k per day) would be returned rapidly (in 5-6 hours) and that such rapid testing was necessary for successful mitigation. But, the lab responsible for processing tests was never equipped to handle two tests per week for everyone being tested. Consequently, it was never feasible to test more than 10,000 people per day while still returning tests rapidly. The model predictions used to justify reopening assumed rapid test results. Those same models showed that testing delays of approximately 24-48 hours would result in 3000-8000 cases at the University, and that’s what we saw. It’s not clear why the University continually claimed that results would be returned quickly when the lab wasn’t built to handle the necessary capacity (see “the blame game”).
The University falsely claimed to have an FDA Emergency Use Authorization (an EUA) for their test (see this story from Lee Gaines and Christine Herman who filed a FOIA request to get the information). The announcement that the test had an EUA was promoted by Gov. Pritzker in an August 19 news conference. The University subsequently altered their earlier press release with no indication that it had been changed. (The original press release also implied that the University could process up to 20,000 tests/day).
The University claimed that people would only be allowed to enter buildings and classrooms by showing evidence that they were “cleared” by a negative test in the past 4 days using the SaferIllinois App. The University planned to station 800 “wellness support monitors” at building entrances to limit access to those who had tested negative recently, but according to investigative reporting by a team of journalism students writing for CU-CitizenAccess, the University was unable to do so. In the end, they hired only 250 monitors, and in spot-checks, the journalism students found no monitoring at 2 of 9 classroom buildings on October 20 and 3 of 9 on November 10. It’s not surprising that hiring that many monitors was problematic, but the University didn’t transparently report those hiring troubles or acknowledge that they were having to monitor buildings selectively.
The University repeatedly touted the importance of a low positivity ratio (positive tests divided by the number of tests). Both daily positivity and 7-day average positivity are featured prominently on the dashboard (see also this November 25 News-Gazette Column by President Tim Killeen). In a mass email, Chancellor Jones stated that the positivity ratio “tells us how widespread the infection is here in our community.” It doesn’t. When you are testing everyone in the population of interest multiple times each week, neither the daily positivity ratio, nor the 7-day positivity ratio, nor the total positivity ratio provides an index of the infection rate in the community. When testing everyone, you can just divide the number of positive tests by the number of people who were tested to find the actual proportion of people who were newly infected. It is uninformative to divide positive tests by the number of tests when each person is testing multiple times/week and you’re testing all of them. When you’re repeatedly testing everyone, positivity is uninformative at best and misleading at worst. I’ve written a separate essay explaining why.
Not only did the University promote its low “positivity,” they misdefined it. In a mass email sent at the start of the semester (August 21), Chancellor Jones defined the positivity ratio as “the percentage of people who test positive out of those who have been tested.” That number is interesting and important: it is the proportion infected (\(\frac{cases}{people}\)), though, not the positivity ratio (\(\frac{cases}{tests}\)). Ironically, the proportion infected is what we should be paying attention to when testing everyone repeatedly, but the University never provided that number.
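A toy example, with made-up numbers, shows how far the two ratios diverge once everyone is tested more than once a week:

```python
# Toy illustration (hypothetical numbers) of positivity vs. proportion infected
# when everyone in the population is tested repeatedly.
people = 10_000          # hypothetical population, all of whom get tested
tests_per_person = 2     # twice-weekly testing, as at the University
new_positives = 100      # hypothetical new cases that week

tests = people * tests_per_person
positivity = new_positives / tests            # cases / tests
proportion_infected = new_positives / people  # cases / people

print(f"positivity: {positivity:.1%}")                    # 0.5%
print(f"proportion infected: {proportion_infected:.1%}")  # 1.0%
```

Every extra test per person inflates the denominator of positivity and shrinks the reported number, even though the same number of people were actually infected. That is why positivity is uninformative when the testing net covers the whole population repeatedly.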
Details and considerations
Glossary of terms
Calculations and assumptions
Graphs
Interpretation: The number of tests varied from day to day (generally fewer tests on Wednesday, Saturday, and Sunday). Focus on the overall trends rather than the absolute numbers (see the rolling 7-day graph below for a smoothed version).
Interpretation: The dark blue line and numeric values present the rolling 7-day average of the number of new positive cases. For example, the value for September 20 is based on an average of the number of daily cases between September 14 and September 20. The faint blue bars are the daily case numbers from the above graph, repeated here to show the day-to-day variability. The 7-day average smooths out those differences because it averages a full week’s values (the blue line is essentially a smoothed version of the faint blue bars).
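The rolling average described above can be sketched in a few lines; the case counts here are made up for illustration, not taken from the dashboard:

```python
# Minimal sketch of the rolling 7-day average used in the graphs:
# each day's value is the mean of that day and the six days before it.
# The daily case counts below are hypothetical.
daily_cases = [40, 55, 30, 60, 50, 45, 35, 70, 65, 50]

def rolling_7day(values):
    """Return the 7-day trailing average, starting at the 7th day."""
    return [sum(values[i - 6 : i + 1]) / 7 for i in range(6, len(values))]

# First value averages days 1-7: (40+55+30+60+50+45+35)/7 = 45.0
print(rolling_7day(daily_cases))
```

Because each point averages a full week, the day-of-week dips (Wednesday and weekend testing lulls) cancel out, which is why the smoothed line is more trustworthy than any single day's count.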
Interpretation: Total number of positive cases over different time periods, starting with the return of students (August 16-23). Note that Thanksgiving break began on Saturday Nov. 21.
Interpretation: Rolling 7-day average of the number of daily tests on campus. A 7-day moving window includes all 7 days of the week, so each day’s value is directly comparable to the next (test numbers are lower on Sat, Sun, and Wed, but each data point in the moving average includes all days of the week).