Seeing the pandemic forest for the epidemic trees

Handicapping national case numbers to reveal under-reporting is a good idea, but not without international context

(Originally published May 28 in “What in the World”) This week the New York Times takes a valiant stab at handicapping India’s official Covid infection rates and death numbers to account for the widespread problem of under-counting. The article helpfully identifies a new source of infection data, SeroTracker, which compiles results of surveys using blood tests to search for Covid antibodies. The Times then uses it to conclude that India’s infection rate is likely at least 20 times higher than what’s being reported, while the death rate is likely at least five times higher.

There’s little doubt that India’s official counts miss many of those who are infected and many who eventually die of Covid. And while some critics stop short of saying this is part of a willful effort to minimize the pandemic and the government’s failure to control it, they accuse the government of taking credit for case data that it knows underestimate the situation.

The Times’ estimates, unfortunately, do little to clear things up. Surveys of past infection are a good cross-check of tests for current infections (after all, we can only confirm infections in people we bother to test), but they’re only surveys and so aren’t conclusive. And the Times’ estimates rely on some assumptions that raise even more questions. The biggest problem is that the Times applies this cross-check only to India, while under-counting is a problem everywhere, whether or not governments do it on purpose.

First, a quick review of Covid testing. There are two ways to test for viral infection: a polymerase chain reaction (PCR) test, which detects the genetic material of a particular virus, and an antibody test, which detects antibodies to that virus. In many ways, they test for the same thing: if a person is infected with a virus, the body produces antibodies against it, so the presence of antibodies is usually just as reliable an indicator of infection as a PCR test. And because testing for antibodies is usually cheaper and faster, it’s often the standard go-to for determining infection. While the lab work behind PCR tests is more complicated, collecting samples involves merely swabbing a patient’s mucosa (usually the nasopharyngeal area). Antibody testing is more straightforward, and some tests require only saliva or an oral swab, but to be really reliable, antibody testing requires drawing blood.

The bigger problem is that it takes time for the body to produce antibodies, during which time an infected person can already be contagious. And because antibodies against a virus linger in the body long after infection, positive antibody tests can often flag past infections in people who are no longer contagious. Hence the emphasis in the Covid pandemic on using PCR tests to determine infection.

In a perfect world, a doctor uses both kinds of test when diagnosing a patient. This enables them to estimate whether their patient is in the early stages of infection and can still benefit from treatment to limit infection, like antivirals, or whether they’re already in the later stages of infection, at which point there’s little to do but alleviate the symptoms and let nature take its course.

In a pandemic, there’s no time to do both. So governments have relied on rapid antigen tests, which detect viral proteins rather than antibodies, as a quick-and-dirty screen, while insisting on PCR tests when they need to know for sure.

Statistically, this creates problems, because few countries have systematic testing. Instead, we test only those people who require testing, either because they volunteer for it, or because they’ve been identified as at-risk. That means that the percentage of positive results among those tested should be much higher than the share of infected people in the entire population. That’s why we don’t take the ratio of positive tests to total tests and extrapolate outward to estimate infection rates in a country. The sample of people being tested is skewed toward a higher likelihood of infection.

That said, this skewed sampling, and the infection rates built on confirmed cases, leaves out the vast majority of the population: people who haven’t been tested. Many people live in rural areas and don’t have access to tests. Others have access to tests but simply don’t want to be tested. Or, as is so often the problem with Covid, they don’t have any symptoms and so it wouldn’t occur to them to go get tested.

Leaving aside the problem this creates for infection statistics, it’s a central challenge to defeating Covid, because the lack of systematic testing for an infection that is so often asymptomatic creates the potential for large reservoirs of infection that can go undetected. This is why mass vaccination is so important.

The only alternative is mass testing and isolation of those who test positive. Iceland was notable early in the pandemic for doing this and has today conducted twice as many tests as it has people. That’s feasible in a relatively prosperous nation of 343,000 people; less so in a poorer country like India, with nearly 1.4 billion. Thus, the list of places with the most tests-per-capita is dominated by small, wealthier locales: Denmark, Gibraltar, the Faeroe Islands, Cyprus, and the UAE.

Rather than test everyone, therefore, scientists are using the same technique used to estimate the prevalence of opinions in large populations: surveys. Because surveys take time, it’s no good using PCR tests for them. Instead, surveys use antibody tests to detect past infections and get a sense of how many people have been infected. The problem with these blood test surveys, or serological surveys, is that just like opinion surveys, the results are only as representative as the sample.

Hence, the Times appears to rely only on national surveys for its estimates of India’s real infection rate. The danger of doing this is somewhat obvious: taking a sample across a huge and diverse population is less likely to be representative than one taken across a smaller, more homogeneous population. But, hey, we do what we can with the tools we have.

The Times concludes that India’s infection rate is probably 15 times higher than the one officially reported, and that even this is a conservative estimate. More likely, it says, about 539 million people in India have been infected with Covid. That’s 20 times the official count of 27 million confirmed cases. And this might make sense until you realize that the Times is now asking us to believe that roughly 40% of India’s entire population has been infected with Covid. It then assumes that 0.3% of all those infected die, which would put the total death toll at 1.6 million, five times the official Covid death toll.
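The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch: the case count, infection estimate and fatality rate are the ones quoted above, while the population figure of about 1.38 billion is an assumption that does not come from the Times.

```python
# Figures quoted in the article.
OFFICIAL_CASES = 27_000_000          # confirmed cases
ESTIMATED_INFECTIONS = 539_000_000   # the Times' estimate of infections
FATALITY_RATE = 0.003                # 0.3% of those infected

# Assumption, not from the article: India's population, about 1.38 billion.
POPULATION = 1_380_000_000

multiple_of_official = ESTIMATED_INFECTIONS / OFFICIAL_CASES
share_of_population = ESTIMATED_INFECTIONS / POPULATION
implied_deaths = ESTIMATED_INFECTIONS * FATALITY_RATE

print(f"Estimated infections vs. confirmed cases: {multiple_of_official:.1f}x")
print(f"Share of population infected: {share_of_population:.0%}")
print(f"Implied deaths: {implied_deaths:,.0f}")
```

With that population assumption, the estimate works out to about 20 times the confirmed count, a bit under 40% of the population, and roughly 1.6 million deaths.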

Why, you may wonder, is India’s death toll so much less severely underestimated, according to the Times, than its infection rate? The Times doesn’t say, but it seems reasonable to assume that infections are undercounted so heavily because so many untested Covid carriers are asymptomatic. The death rate among confirmed cases is therefore likely to be higher than the overall Covid mortality rate because, as noted above, we tend to test only those we believe likely to be infected. That includes those whom the virus has already made ill. If we applied the mortality rate among confirmed cases to the Times’ new estimate of infections, it would suggest that 6.2 million people in India have died of Covid.
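Working backwards from the numbers in this piece gives a rough sense of how far apart the two mortality rates are. The sketch below uses only figures quoted here (6.2 million deaths, 539 million estimated infections, 27 million confirmed cases, and the Times’ 0.3% rate); the implied values it prints are derived, not reported, and the variable names are purely illustrative.

```python
# Figures quoted in the article.
ESTIMATED_INFECTIONS = 539_000_000
OFFICIAL_CASES = 27_000_000
DEATHS_AT_CONFIRMED_CASE_RATE = 6_200_000
TIMES_FATALITY_RATE = 0.003

# Mortality rate among confirmed cases implied by the 6.2 million figure.
implied_case_fatality_rate = DEATHS_AT_CONFIRMED_CASE_RATE / ESTIMATED_INFECTIONS

# Official death count that this rate implies for 27 million confirmed cases.
implied_official_deaths = implied_case_fatality_rate * OFFICIAL_CASES

# How much higher the confirmed-case rate is than the Times' assumed rate.
rate_ratio = implied_case_fatality_rate / TIMES_FATALITY_RATE

print(f"Implied rate among confirmed cases: {implied_case_fatality_rate:.2%}")
print(f"Implied official deaths: {implied_official_deaths:,.0f}")
print(f"Confirmed-case rate vs. Times' rate: {rate_ratio:.1f}x")
```

In other words, the mortality rate among confirmed cases comes out at a little over 1%, nearly four times the rate the Times assumes for all infections.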

So where does the Times get its 0.3% mortality rate? Not from serological surveys, but by applying an estimate for U.S. mortality rates calculated at the end of 2020 by a single study. Should we believe that Indians are just as likely to die of Covid as Americans were at the end of last year? Given that only about 10 million people die in India each year from any cause, the Times’ figure of 1.6 million seems plausible, particularly given that the U.S. CDC already estimates that Covid accounted for 10% of all U.S. deaths in 2020. This only makes sense if both India’s infection rates are as high as what the Times is estimating and its mortality rate is roughly the same as that in the U.S. But this seems unlikely: on the one hand, India’s healthcare system isn’t as good as America’s, so presumably anyone in India sick with Covid is less likely to survive than someone in the U.S. On the other hand, Americans without health insurance are less likely to get tested or, if they fall ill, to seek medical care until they’re in critical condition. And mortality rates due to Covid in much of the developing world have remained lower than those in developed countries, a phenomenon the Times has also investigated. Yes, under-reporting may be one reason, but developing countries also tend to have younger populations less susceptible to Covid and a lower incidence of “diseases of affluence” like diabetes and atherosclerosis that have been shown to increase the likelihood of a severe or fatal Covid infection.

Let’s assume, then, that these conflicting factors wash out and India’s Covid fatality rate really is roughly similar to the U.S. end-2020 estimate of 0.3%. That takes us back to the Times’ incredible estimate that 40% of India’s population has been infected. Broad national surveys can be deceptive because of their relatively narrow sample size. The Times’ own source, SeroTracker, actually allows us to include surveys of narrower populations, and doing so yields a median infection rate of roughly 20%, half what the Times considered a “more likely” scenario, but still 10 times the official count. If we hold our noses and apply the Times’ 0.3% fatality rate to that, we’d get a death toll about 2.5 times the official toll. Still dramatic, but more plausible than what the Times is suggesting.
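The alternative scenario can be sketched the same way. Two inputs here are assumptions rather than figures from the surveys: a population of about 1.38 billion, and an official death toll of roughly 320,000, which is simply the Times’ 1.6 million divided by its “five times” multiple. The `scenario` helper is hypothetical and exists only for this illustration.

```python
def scenario(infection_rate, fatality_rate, population,
             official_cases, official_deaths):
    """Infections and deaths implied by a survey-based infection rate."""
    infections = infection_rate * population
    deaths = infections * fatality_rate
    return {
        "infections": infections,
        "deaths": deaths,
        "case_multiple": infections / official_cases,
        "death_multiple": deaths / official_deaths,
    }


# Assumption: India's population, about 1.38 billion.
POPULATION = 1_380_000_000
# Assumption: official deaths backed out of "1.6 million is five times".
OFFICIAL_DEATHS = 1_600_000 / 5

result = scenario(
    infection_rate=0.20,        # median survey rate cited above
    fatality_rate=0.003,        # the Times' 0.3%
    population=POPULATION,
    official_cases=27_000_000,  # confirmed cases cited above
    official_deaths=OFFICIAL_DEATHS,
)

print(f"Infections: {result['infections']:,.0f} "
      f"({result['case_multiple']:.1f}x confirmed cases)")
print(f"Deaths: {result['deaths']:,.0f} "
      f"({result['death_multiple']:.1f}x official toll)")
```

Under those assumptions, a 20% infection rate implies about 276 million infections and a little over 800,000 deaths, in line with the multiples of roughly 10 and 2.5 given above.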

And what the Times didn’t tell us is that other countries have higher infection rates than India, according to SeroTracker, including some affluent ones like Kuwait, Qatar and Singapore. And the under-reporting problem goes well beyond India, too: South Africa’s median survey rate of 43% is 17 times the official infection rate of just 2.7%; Peru’s median survey rate of 43% is 7 times its official infection rate of 5.8%; and Singapore, which has earned plaudits for controlling the virus, has a median survey rate of 32% vs. an official rate of confirmed infections of just 1%.
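The same comparison can be laid out for several countries at once. This sketch uses only the percentages quoted above; the `underreporting_multiple` helper is a hypothetical name for the simple ratio of survey rate to official rate.

```python
# (median survey rate %, official infection rate %) as quoted above.
RATES = {
    "South Africa": (43.0, 2.7),
    "Peru": (43.0, 5.8),
    "Singapore": (32.0, 1.0),
}


def underreporting_multiple(survey_rate, official_rate):
    """How many times higher the survey rate is than the official rate."""
    return survey_rate / official_rate


multiples = {
    country: underreporting_multiple(survey, official)
    for country, (survey, official) in RATES.items()
}

# List countries from the largest gap to the smallest.
for country, multiple in sorted(multiples.items(),
                                key=lambda item: item[1], reverse=True):
    print(f"{country}: survey rate is about {multiple:.0f}x the official rate")
```

Because the percentages quoted are rounded, multiples computed this way may differ slightly from those in the text. By this measure, Singapore's gap is the widest of the three.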

The lesson here? Don’t test, don’t tell. Which is why former U.S. President Trump at one point called for a slowdown in testing to reduce the bad press of rising case numbers.

But if you want perhaps the best example of why the Times’ approach may not be reliable, apply the same methodology to the U.S.: the median survey rate for the U.S., according to SeroTracker, is only about 4.7%. Even national surveys yield a median infection rate of only 6.5%. Yet the Times’ own source for fatality rates estimates that at least one-third of all Americans have been infected. The official rate for U.S. infections is about 10%. That would suggest that the U.S. has been overreporting infections.

Survey says? [Bzzzz]

Clearly, there needs to be a consistent way to handicap confirmed case counts to account for the lack of testing. Serological surveys offer a helpful look backwards, but they may not be the way to do that while we’re still in the grip of the pandemic. And using them to heap further criticism on one country’s handling of the virus is misleading.
