## Test by batches to save tests

Given the severe shortage of covid testing capacity, creative approaches are needed to expand the effective testing capacity. If the population base rate is relatively low, it can be effective to pool samples and test by batches. Doing so would imply substantial rates of false positives since every member of a test batch would be presumptively sick, but this would still allow three major use cases that will be crucial for relaxing quarantine without allowing the epidemic to go critical so long as testing capacity remains finite:

1. Quickly clear from quarantine every member of a batch with negative results
2. Ration tests more effectively for individual diagnoses with either no loss or an identifiable loss in precision but at the expense of delay
3. Use random sampling to creating accurate tracking data with known confidence intervals for the population of the country as a whole or for metropolitan areas

Here is the outline of the algorithm. Assume that one has x people who were exposed to the virus, the virus has an infectiousness rate of 1/x, and you have only one test. Without testing, every one of these x people must quarantine. Under these assumptions, the expected number of actually infected people is 1 out of the x exposed, or more precisely a random draw from a Poisson distribution with a mean of one. This means that 36.8% of the time nobody is infected, 36.8% of the time one person is infected, 18.4% of the time two are infected, and 8% of the time three or more are infected. With only one test, only one individual can be tested and cleared, but if you pool and test them as a batch, over a third of the time you can clear the whole batch.

Alternately, suppose that you have two tests available for testing x people with an expected value of one infection. Divide the x exposed people into batch A and B then test each batch. If nobody is infected both batches will clear. If one person is infected, one batch will clear and the other will not. Even if two people are infected, there is a 50% chance they will both be in the same batch and thus the other batch will clear, and if there are three there is a 12.5% chance they are all in the same batch, etc. Thus with only two tests there is a 36.8% chance you can clear both batches and a 47.7% chance you can clear one of the two batches. This is just an illustration based on the assumption that the overall pool has a single expected case. The actual probabilities will vary depending on the population mean. The lower the population mean (or base rate), the larger you can afford to make the size of the test batches and still gain useful information.

This most basic application of testing pooled samples is sufficient to understand the first use case: to clear batches of people immediately. Use cases could be to clear groups of recently arrived international travelers or to weekly test medical personnel or first responders. There would be substantial false positives, but this is still preferable to a situation where we quarantine the entire population.

Ideally though we want to get individual diagnoses, which implies applying the test iteratively. Return to the previous example but suppose that we have access to (x/2)+2 tests. We use the first two tests to divide the exposed pool into two test batches. There is a 36.7% chance both batches test negative, in which case no further action is necessary and we can save the remaining x/2 test kits. The modal outcome though (47.7%) is that one of the two test batches tests positive. Since we have x/2 test kits remaining and x/2 people in the positive batch, we can now test each person in the positive batch individually and meanwhile release everyone in the negative batch from quarantine.

There is also a 15.5% chance that both batches test positive, in which case the remaining x/2 test kits will prove inadequate, but if we are repeating this procedure often enough we can borrow kits from the ⅓ of cases where both batches test negative. Thus with testing capacity of approximately ½ the size of the suspected population to be tested, we can test the entire suspected population. A batch test / individual test protocol will slow down testing as some people will need to be tested twice (though their samples only collected once) but allow us to greatly economize on testing capacity.

Finally, we can use pooled batches to monitor population level infection rates as an indication of when we can ratchet up or ratchet down social distancing without diverting too many tests from clinical or customs applications. Each batch will either test positive or negative and so the survey will only show whether at least one person in the batch was positive, but not how many.

For instance, suppose one collects a thousand nasal swabs from a particular city and divides them into a hundred batches of ten each and then finds that only two of these hundred batches test positive. This is equivalent to 98% rate of batches having exactly zero infected test subjects. Even though the test batch data are dichotomous, one can infer the mean of a Poisson just from the number of zeroes and so this corresponds to a population mean of about 2%. Although this sounds equivalent to simply testing individuals, the two numbers can diverge considerably. For instance, if half the batches test positive, this implies a population base rate of 70%.

[Update]

On Twitter, Rich Davis notes that large batches risk increasing false negatives. This is a crucial empirical question. I lack the bench science knowledge to provide estimates but experts would need to provide empirical estimates before this system were implemented.

Entry filed under: Uncategorized.