By Gordon Hull
A recent piece in the Guardian points to the limitations of models in understanding the spread of COVID-19: principally, those models are often not based on reliable data specific to the disease, so they can be wildly off. As I argued earlier, the bottom line is that we really have no idea what we’re doing, and that lack of knowledge is an important reason why we’re having to adopt non-risk-based strategies like maximin to deal with the virus spread. The decision to move to those strategies needs to be understood as a (bio)political one (just as the general preference for risk is); that they are not risk-based is probably part of why right-wingers are deciding it’s time to go “back to work.” It’s not that they have any better idea what’s going on than anybody else, they’ve just adopted a different political epistemology, and have decided that their idea of economics is more important than everybody else’s view of human life.
In the middle of any efforts at modeling or analysis are indicators. These have been the subject of a fair amount of recent interest, as for example the recent (2015) anthology, The World of Indicators. As the editors note in the book’s introduction, “indicators are typically presented as taken-for-granted facts. Yet, indicators are not neutral representations of the world, but novel epistemic objects of regulation, domination, experimentation and critique” (5). More specifically, there is a fundamentally socio-political process at work behind any indicator. We can all see the point with indicators like “women’s social equality,” since that concept means different things to different people. So too, the point is clear enough in psychological categories like “normal” or “deviant” or “personality disorder,” as Foucault made clear.
Less obviously, it applies in more narrowly medical situations as well. Consider malaria. Malaria is a thing in that it’s a disease caused by a parasite. Everyone knows that – it’s not an essentially-contested concept like “privacy.” And yet, as Rene Gerrets indicates in his contribution to the anthology, similar problems apply to malaria tracking in sub-Saharan Africa.
Let’s now consider COVID-19. We know from epidemiologists that the growth in new cases will follow a more-or-less predictable curve: a period of exponential growth followed by a tapering and then a decline. We also know that various social distancing measures should be able to “flatten the curve,” i.e., to keep the growth rate low enough that the total number of cases will not overwhelm the healthcare system. But how do we know how many cases we have? Well, ideally we would test everyone, and the number of positive tests would represent the number of cases we have. But remember the malaria example: the number of positive tests is at best an indicator of the disease spread. The most obvious limitation of testing is that we aren’t testing everybody, so we have to figure out how many cases of disease aren’t being tested, or (more to the point) what kind of a sample of the total the tested cases represent. In particular, the symptoms are close enough to those of the flu or other respiratory infections that some cases will be masked (or overreported, if one relies on clinical diagnoses), and an unknown but potentially significant percentage of cases are sufficiently mild (or even asymptomatic) that they won’t register at all. These limitations have been discussed a lot.
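The “flatten the curve” logic can be sketched numerically. A minimal illustration, assuming a simple logistic curve for cumulative cases (the capacity and growth rates below are made-up numbers, not estimates for any real outbreak):

```python
import math

def logistic(t, K, r, t0):
    """Cumulative cases at day t: approaches K, with growth rate r, midpoint t0."""
    return K / (1 + math.exp(-r * (t - t0)))

K = 100_000              # hypothetical total cases over the epidemic
fast, slow = 0.30, 0.15  # hypothetical per-day growth rates, with and without distancing

# Peak daily new cases = the largest one-day increase in the cumulative curve.
peak_fast = max(logistic(t + 1, K, fast, 30) - logistic(t, K, fast, 30) for t in range(200))
peak_slow = max(logistic(t + 1, K, slow, 60) - logistic(t, K, slow, 60) for t in range(200))

print(f"peak daily cases, fast growth: {peak_fast:.0f}")
print(f"peak daily cases, slow growth: {peak_slow:.0f}")
```

The same total number of cases, spread over a longer period, yields a much lower peak daily load on hospitals, which is the whole point of distancing measures.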
But even if you stay with the positive-test regime, it’s important to note just how much the test as an indicator introduces its own complexities. The chart above represents the number of new cases a day in North Carolina, viewed on 3/26. If you stand back and look from a little distance, it looks encouraging – the curve seems to be somewhat leveling out (of course, the chart is an indicator of an indicator, and the underlying data can be represented in different ways, such as time-to-double. That’s obviously important, but not the point here). On the other hand, if you look closer, you see wild discrepancies in the daily totals that should give you pause. What’s going on? Some of the variation is of course noise, driven by the relatively small numbers involved. But some of it could be the product of underlying complexities in how “COVID cases” function as an indicator. If you look at the chart from two days later, you can at least see the problem. Here it is for 3/28, suddenly a lot less encouraging:
A big problem with indicators, as the malaria case underlines, is they smooth out what’s going on in their collection – and those changes are the results of politics and policy (in the broad sense). On 3/26, for example, citing dwindling testing supplies and increasing wait times for results, North Carolina’s Department of Health and Human Services urged those with mild symptoms to stay home and not get a test. From a clinical point of view, this is an easy call: if you have a scarce test, and the result is not going to make any difference in what you do for someone (because the person will stay at home), then you don’t administer the test: you save it for when it will make a clinical difference. In addition, lots of testing that will make no clinical difference takes healthcare workers and PPE away from helping more severely ill patients. So it’s also good planning. But that policy shift also means that “number of COVID cases” indicates something different, because it now is drawn from the set of people who exhibit pretty severe symptoms, rather than those who show any symptoms or are asymptomatic. It immediately becomes less useful for modeling, unless you feel pretty good about your understanding of the relation between the subset of people who are tested and the larger universe of those who have the disease.
Perhaps this is why the state said, last Monday, that it would be moving to a surveillance model analogous to the one it uses for seasonal influenza. Also useful, though remember that we began with the Guardian’s analysis of the limitations of models (rumor has it that there will be an explanation of what that means later today). More to the point here, the shift means that “number of COVID-19 cases” will indicate something different than it did the week before. In the meantime, earlier numbers are also suspect, because they increased in line with an increase in testing. So the number by itself cannot tell you whether actual cases were increasing, or whether the rise in positives mostly reflects the rise in tests.
Moreover, I’ve never seen an analysis of the error rate of COVID-19 tests – or a comparison between the various tests in use. If the test is not highly specific, false positives will tend to overstate the number of cases; if it is not highly sensitive, false negatives will tend to understate it. There is some evidence from Wuhan that negatives can be unreliable in apparently recovered patients. Is that true here? According to the NPR story, it’s possible that the initial tests the Chinese were using only picked up COVID-19 30-50% of the time, and that the retests are false positives based on lingering genetic material.
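The arithmetic behind that worry is easy to sketch. A minimal example, with purely illustrative numbers (the prevalence, sensitivity, and specificity below are assumptions, not measured values for any actual test):

```python
def observed_positives(n_tested, prevalence, sensitivity, specificity):
    """Expected number of positive results among n_tested people, given test error rates."""
    true_pos = n_tested * prevalence * sensitivity               # infected and detected
    false_pos = n_tested * (1 - prevalence) * (1 - specificity)  # uninfected but flagged
    return true_pos + false_pos

n, prev = 10_000, 0.05   # hypothetical: 10,000 tests, 5% of those tested truly infected
true_cases = n * prev    # 500 actual infections in the tested group

# A test that misses half of infections (roughly what was reported for early assays):
print(observed_positives(n, prev, sensitivity=0.5, specificity=0.99))  # ~345, an undercount
# A test that catches every infection but has a 5% false-positive rate:
print(observed_positives(n, prev, sensitivity=1.0, specificity=0.95))  # ~975, an overcount
```

Either way, the reported count can diverge substantially from the 500 true cases, and without published error rates you cannot tell in which direction.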
The sum of these problems is that it’s very hard to know what you’re looking at when you look at charts like the ones above.
Cross-state comparisons are even harder, because states differ both in their testing policies and in their capacity for testing by private labs, hospitals, or academic hospitals and labs. This handy chart in Vox makes the discrepancies clear. As of that chart’s compilation, New York had 30,811 confirmed cases, having tested 103,479 people, for a testing rate of 5,319 per million (and, thus, a 29.8% positive-test rate). Michigan had 2,294 positives out of 4,363 tests, for a testing rate of 437 per million (and a 52.6% positive rate). What can we say about COVID-19 spread comparisons between New York and Michigan? Well, New York has a lot more cases and a lot more tests… but more granular comparisons are a lot harder, and depend in part on testing protocols. What about looking at global data? Everybody knows about the differences in testing between, say, South Korea and the U.S. But what about countries with less capacity?
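The comparison can be made concrete with the figures from the Vox chart cited above:

```python
# Figures as reported in the Vox chart discussed in the text.
states = {
    "New York": {"positives": 30_811, "tests": 103_479, "tests_per_million": 5_319},
    "Michigan": {"positives": 2_294, "tests": 4_363, "tests_per_million": 437},
}

for name, s in states.items():
    rate = s["positives"] / s["tests"]
    print(f"{name}: {rate:.1%} positive, at {s['tests_per_million']} tests per million")
```

Michigan’s much higher positivity rate almost certainly does not mean the disease is more widespread there than in New York; with a tenth the testing rate, tests are plausibly being rationed to the sickest patients, which inflates the share that come back positive. The indicator reflects testing policy as much as disease spread.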
All of which is to say that COVID-19 testing offers an excellent example of both why indicators are necessary, and why they are difficult. For COVID-19, we need an indicator that is useful enough to guide policy about how much social distancing is enough, and that can do so in geographically-specific locations. We need to know how much testing and of what kind is necessary to support targeted social distancing policies. In other words, it needs to be robust enough to get us off the maximin strategy. We know how many “COVID-19 cases” we have. We just don’t know what that means.