
Group Testing for SARS-CoV-2 Allows for Up to 10-Fold Efficiency Increase Across Realistic Scenarios and Testing Strategies

METHODS
published: 18 August 2021
doi: 10.3389/fpubh.2021.583377
Frontiers in Public Health | www.frontiersin.org | August 2021 | Volume 9 | Article 583377
Edited by:
Olivier Vandenberg,
Laboratoire Hospitalier Universitaire
de Bruxelles (LHUB-ULB), Belgium
Reviewed by:
Elizaveta Padalko,
Ghent University Hospital, Belgium
Mohamed Gomaa Kamel,
Minia University, Egypt
*Correspondence:
Felix Krahmer
felix.krahmer@tum.de
†These authors have contributed
equally to this work
‡These authors have contributed
equally to the repository and the web
application
Specialty section:
This article was submitted to
Infectious Diseases – Surveillance,
Prevention and Treatment,
a section of the journal
Frontiers in Public Health
Received: 14 July 2020
Accepted: 26 July 2021
Published: 18 August 2021
Citation:
Verdun CM, Fuchs T, Harar P,
Elbrächter D, Fischer DS, Berner J,
Grohs P, Theis FJ and Krahmer F
(2021) Group Testing for SARS-CoV-2
Allows for Up to 10-Fold Efficiency
Increase Across Realistic Scenarios
and Testing Strategies.
Front. Public Health 9:583377.
doi: 10.3389/fpubh.2021.583377
Claudio M. Verdun 1,2†, Tim Fuchs 1†, Pavol Harar 3,4‡, Dennis Elbrächter 5‡, David S. Fischer 6,
Julius Berner 5‡, Philipp Grohs 3,5,7, Fabian J. Theis 1,6 and Felix Krahmer 1,8*
1Department of Mathematics, Technical University of Munich, Garching, Germany, 2Department of Electrical and Computer
Engineering, Technical University of Munich, Munich, Germany, 3Research Network Data Science, University of Vienna,
Vienna, Austria, 4Department of Telecommunications, Brno University of Technology, Brno, Czechia, 5Faculty of
Mathematics, University of Vienna, Vienna, Austria, 6Institute of Computational Biology, Helmholtz Zentrum München,
Munich, Germany, 7Johann Radon Institute for Computational and Applied Mathematics, Austrian Academy of Sciences,
Linz, Austria, 8Munich Data Science Institute, Technical University of Munich, Garching, Germany
Background: Due to the ongoing COVID-19 pandemic, demand for diagnostic testing
has increased drastically, resulting in shortages of necessary materials to conduct the
tests and overwhelming the capacity of testing laboratories. The supply scarcity and
capacity limits affect test administration: priority must be given to hospitalized patients
and symptomatic individuals, which can prevent the identification of asymptomatic
and presymptomatic individuals and hence effective tracking and tracing policies. We
describe optimized group testing strategies applicable to SARS-CoV-2 tests in scenarios
tailored to the current COVID-19 pandemic and assess significant gains compared to
individual testing.
Methods: We account for biochemically realistic scenarios in the context of dilution
effects on SARS-CoV-2 samples and consider evidence on specificity and sensitivity
of PCR-based tests for the novel coronavirus. Because of the current uncertainty
and the temporal and spatial changes in the prevalence regime, we provide analysis
for several realistic scenarios and propose fast and reliable strategies for massive
testing procedures.
Key Findings: We find significant efficiency gaps between different group testing
strategies in realistic scenarios for SARS-CoV-2 testing, highlighting the need for an
informed decision of the pooling protocol depending on estimated prevalence, target
specificity, and high- vs. low-risk population. For example, using one of the presented
methods, all 1.47 million inhabitants of Munich, Germany, could be tested using only
around 141 thousand tests if an infection rate below 0.4% is assumed. Using 1
million tests, the 6.69 million inhabitants of the city of Rio de Janeiro, Brazil, could
be tested as long as the infection rate does not exceed 1%. Moreover, we provide
an interactive web application, available at www.group-testing.com, for visualizing the
different strategies and designing pooling schemes according to specific prevalence
scenarios and test configurations.
Verdun et al. Efficiency Increase for SARS-CoV-2 Testing
Interpretation: Altogether, this work may help provide a basis for an efficient upscaling
of current testing procedures, which takes the population heterogeneity into account
and is fine-grained towards the desired study populations, e.g., mild/asymptomatic
individuals vs. symptomatic ones but also mixtures thereof.
Funding: German Science Foundation (DFG), German Federal Ministry of Education and
Research (BMBF), Chan Zuckerberg Initiative DAF, and Austrian Science Fund (FWF).
Keywords: group testing, SARS-CoV-2, pooling, COVID-19, informative testing, RT-PCR
1. INTRODUCTION
The current spreading state of the COVID-19 pandemic urges
authorities around the world to take measures in order to
contain the disease or, at least, to reduce its propagation speed,
as commonly referred to by the term “curve flattening1.” At
the time of writing, the World Health Organization (WHO)
reported 12,552,765 cases and 561,617 deaths with 230,370
new cases in the last 24 hours2. In particular, more than 50
countries had reported to the WHO larger outbreaks of local
transmission and severe depletion of the workforce, for example,
among healthcare workers (HCWs). Also, given
the current number of tests described by several government
agencies, the reported case count likely underrepresents the total number of
SARS-CoV-2 infections globally.
Even though much research is currently directed
toward a cure for this infectious disease, to date, the most
effective reasonable measure against its spread is the tracking
and subsequent isolation of positive cases via an intensive test
procedure on a large part of the population or at least important
risk groups (1). A pilot study conducted by the University of
Padua and the Italian Red Cross in Vò, Italy, showed encouraging
results in this direction3.
At present, the standard tests for the detection of SARS-CoV-2
are nucleic acid amplification tests (NAAT), such
as the quantitative reverse transcription-polymerase chain
reaction (qRT-PCR). These biochemical tests are based on
samples from the lower respiratory or upper respiratory
tract of tested individuals4. Sampling the lower respiratory tract
is too delicate an operation to be widely applicable and is usually only
feasible for hospitalized patients. In routine laboratory
diagnosis, by contrast, sampling the upper respiratory tract with
nasopharyngeal and oropharyngeal swabs is much less invasive
and usually the method of choice.
1Why outbreaks like coronavirus spread exponentially, and how to “flatten
the curve,” The Washington Post. Available online at: https://wapo.st/2wLMbzI
(accessed July 10, 2020).
2Coronavirus disease 2019 (COVID-19), Situation Report—174, World Health
Organization Webpage. Available online at: https://bit.ly/2ZoN8JJ (accessed July
13, 2020).
3In one Italian town, we showed mass testing could eradicate the coronavirus, The
Guardian. Available online at: https://bit.ly/2VBsmDM (accessed July 10, 2020).
4Laboratory testing for 2019 novel coronavirus (2019-nCoV) in suspected human
cases, World Health Organization Webpage. Available online at: https://bit.ly/38SLDH1
(accessed July 10, 2020).
The demand for this type of SARS-CoV-2 testing, however,
is drastically increasing in many healthcare systems, resulting in
shortages of necessary materials to conduct the test or capacity
limits of the testing laboratories5.
The concept of group testing (also called pooled testing or
pooling) is a promising way to make better use of the available
capacities by mixing the samples of different individuals before
testing, and to first perform the test on these mixtures, the so-
called pools, as if it were only one sample. This idea goes back
to mathematical ideas developed in the 1940s and has since been
used for tests based on various biospecimens such as swab, urine,
and blood (2–4). In particular, group tests are employed when
testing for sexually transmitted diseases such as HIV, chlamydia,
and gonorrhea, and were recently used in viral epidemics such as
influenza; see, e.g., (5, 6) and references therein.
Very recently, there have also been successful proofs of
concept for experimental pooling strategies in SARS-CoV-2
testing. An Israeli research team demonstrated the feasibility of
pooling up to 32 samples; they encountered false negative rates of
around 10% (7). Subsequently, a German initiative filed a patent
for a new approach that allows for so-called minipools combining
5–10 samples with a significantly reduced false negative rate (8).
Similarly, a US American research group performed a test with
12 pools of 5 specimens, each from individuals at risk, and were
able to correctly identify the two infected individuals out of the
60 with only 22 tests (9).
The main goal of these works is to demonstrate the feasibility
of the experimental design; they propose to use the original
group testing design by Dorfman of including each specimen
into exactly one pool then testing every specimen of the
pool again individually in case of a positive outcome of the
group test (2). Other works over the last weeks have suggested
refined approaches, typically based on examples or, from a more
theoretical viewpoint, on a simplified model (10–16).
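The test counts behind the two-stage Dorfman design can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it assumes error-free tests and independent infections with probability p, and the function names as well as the upper limit of 32 on the pool size (borrowed from the pooling experiment cited above) are choices made here for the example.

```python
def dorfman_tests_used(n_pools: int, pool_size: int, n_positive_pools: int) -> int:
    """Tests consumed by one run of two-stage Dorfman pooling (error-free tests).

    Every pool is tested once; every member of a positive pool is retested.
    """
    return n_pools + n_positive_pools * pool_size


def dorfman_expected_tests_per_person(p: float, k: int) -> float:
    """Expected tests per person for prevalence p and pool size k.

    Assumes independent infections and error-free tests: one pool test is
    shared by k people, and k retests are needed with probability 1 - (1-p)^k.
    """
    q = 1.0 - p
    return 1.0 / k + (1.0 - q ** k)


def best_dorfman_pool_size(p: float, k_max: int = 32) -> int:
    """Pool size in 2..k_max that minimizes the expected tests per person."""
    return min(range(2, k_max + 1),
               key=lambda k: dorfman_expected_tests_per_person(p, k))


# 12 pools of 5 specimens with the two infected individuals in different pools.
print(dorfman_tests_used(n_pools=12, pool_size=5, n_positive_pools=2))
print(best_dorfman_pool_size(0.01))
```

The first call reproduces the 22 tests reported for the 60 specimens, under the assumption that the two infected individuals fell into different pools.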
In this manuscript, we will demonstrate and systematically
explore that even within the limitations of the initial experimental
designs for COVID-19 testing, more sophisticated pooling
strategies can lead to a significantly reduced number of tests.
Thus, connecting the recent SARS-CoV-2 pool tests to the rich
literature on group testing developed over the last decades
may be a key ingredient for effectual national responses to the
current pandemic. Such connections have been established by
5Why Widespread Coronavirus Testing Isn’t Coming Anytime Soon, The New
Yorker. Available online at: https://bit.ly/3dCAHz9 (accessed July 10, 2020).
FIGURE 1 | Hierarchical Testing as proposed by Dorfman vs. array testing: In the left figure, 100 specimens are randomly sorted in groups/rows of size 10. As
indicated on the right-hand side, the row-wise group test correctly identifies the groups which contain a positive sample (indicated by the red color). Every sample of a
positive group will be flagged as possibly positive (indicated by the bold circle) and used for the next stage of tests. In the right figure, we illustrate array testing where,
in addition to testing the row groups, column group tests are performed simultaneously. Only specimens which were tested positive in both group tests will be flagged
as possibly positive. While this is an example with two simultaneous pool tests, a larger number of simultaneous tests can also be performed.
Abdalhamid, Bilder and McCutchen by incorporating a decision
step regarding how to optimize the number of samples within
each pool based on the estimated infection rate—this led to the
choice of 5 for the pool size (9). The problem of choosing the right
pool size had previously been analyzed in many works (17–19).
Moreover, we argue that a massive testing program based on pooled
tests can have significant positive effects on the physical and
mental health of the general population, given that it can allow for
partial reopenings or the use of less restrictive social distancing
measures, hence allaying social deprivation and isolation with its
strong negative effects (20,21).
The theoretical and practical understanding of group testing
developed since the first results of Dorfman (2), however, goes far
beyond merely optimizing the pool sizes (22,23). For example,
it is also possible to study group testing in the case of responses
involving three categories or more (24), and to use pooling for the
more involved problem of estimating the prevalence of a disease
in a population (17).
The main message of this paper is that in realistic prevalence
regimes for the current COVID-19 pandemic, concepts like
array testing and informative testing, explained in detail in
section 2, may help to improve the testing efficiency even
significantly beyond the gain achieved by the simple pooling
strategies implemented in the first approaches. By no means
do we claim statistical originality; our goal is rather to explore
and numerically compare classical methods for a variety of
realistic parameter choices, demonstrating their efficiency for
large-scale SARS-CoV-2 testing. This paper is accompanied by
a repository of source code that allows for parallel computation
and comparative visualizations6.
6Harar P, Berner J, Elbrächter D, et al. Group Testing Simulations (2020). Available
online at: https://gitlab.com/hararticles/group-testing-simulations.
2. GROUP TESTING
As described in section 1, group testing (GT) is the procedure of
performing joint tests on mixtures of specimens, so-called pools,
as a whole, instead of administering individual tests, thereby
requiring significantly fewer tests than the number of specimens
to be tested. Ideally, this joint test will produce a positive outcome
if any one of the specimens in the pool is infected and a
negative outcome otherwise. Because of the limited information
contained in a positive outcome, it is required to test certain
specimens multiple times—either in parallel for all the specimens
or sequentially with additional testing only for those specimens
with positive test results.
Sequential test designs in which the grouping of samples into
pools in each stage depends on the results of the former stages
are called adaptive. For non-adaptive methods, in contrast, all the
sample groupings are specified in advance, which translates into
a one-stage procedure in which all pool tests can be performed
in parallel.
A special class of adaptive test designs is hierarchical tests,
where in the first stage, each specimen is included in exactly one
pool, and, in every subsequent stage, groups with positive results
are divided into smaller non-overlapping groups and retested,
while all specimens contained in groups with negative results are
discarded. The original Dorfman test, for example, is a two-stage
hierarchical group test.
The left part of Figure 1 illustrates the hierarchical structure
of the Dorfman test with an illustrative 10 × 10 microplate. Each
circle in the plate represents the specimen of a separate individual,
and the red circles are the infected ones that need to be identified.
The specimens are then amalgamated row by row to perform
a group test for each row. A positive test result indicates that
some individual in the corresponding row is infected. Once the
results from the group tests are available, they can be used for the
next stage, so only the specimens sharing a pool with an infected
specimen will need to be retested.
Entirely non-adaptive group testing procedures have been
designed and analyzed using techniques at the interface of
coding theory (25), information theory (23), and compressive
sensing (26–28). The symbiosis among those fields leads to
developments such as the establishment of optimal theoretical
bounds for the best expected group testing strategies (29).
However, some of the developments lead to algorithms that
may not be practically efficient to implement and, consequently,
are not suited for many medical applications including
SARS-CoV-2 testing.
Nevertheless, the idea of including every specimen in multiple
pools to be tested in parallel is an integral part of many medical
testing procedures, as the implementation of hierarchical tests
with many stages can be rather complex and hard to automate.
Often, the test proceeds by arranging the specimens in a two-
dimensional array and assembling all the specimens of each
column in a pool. Then, the same procedure is done with all
the specimens of each row (30). This testing strategy is a special
instance of the so-called array testing, already mentioned in
section 1. In this way, every specimen is included in exactly
two pools. All the specimens in the intersection of two pools
with positive test results have to be retested in a second stage,
but the number of these individual tests can be considerably
smaller than for the Dorfman design. The right part of Figure 1 illustrates the
array testing procedure for a 10 × 10 microplate with two
infected individuals; here only four of the 100 specimens need to
be retested.
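The difference between row-only pooling and array testing on such a plate can be made concrete with a short sketch. It assumes error-free tests; the positions of the two infected specimens are hypothetical and chosen here so that they share neither a row nor a column, and all names are illustrative.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]  # (row, column) position of a specimen on the plate


def row_pool_retests(infected: Set[Cell], size: int) -> List[Cell]:
    """Dorfman-style row pools: every specimen in a positive row is retested."""
    positive_rows = {r for r, _ in infected}
    return [(r, c) for r in sorted(positive_rows) for c in range(size)]


def array_retests(infected: Set[Cell]) -> List[Cell]:
    """Array testing: retest only intersections of positive rows and columns."""
    positive_rows = {r for r, _ in infected}
    positive_cols = {c for _, c in infected}
    return [(r, c) for r in sorted(positive_rows) for c in sorted(positive_cols)]


SIZE = 10
infected = {(2, 3), (6, 7)}  # hypothetical positions, different rows and columns

# Total tests = first-stage pool tests + second-stage individual retests.
dorfman_total = SIZE + len(row_pool_retests(infected, SIZE))
array_total = 2 * SIZE + len(array_retests(infected))

print(len(row_pool_retests(infected, SIZE)), dorfman_total)
print(len(array_retests(infected)), array_total)
```

In this configuration the array design spends ten additional pool tests in the first stage but needs only four retests instead of twenty, so it uses fewer tests overall. If the two infected specimens shared a row or a column, the number of intersections would be smaller still.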
Sometimes, for array tests, an initial master pool consisting
of all specimens in a certain array is formed and all the $k^2$
individuals are tested together. This allows for the rejection of a
large group in case it exhibits a negative result. Otherwise, one
proceeds with the array strategy described above. It is important
to note, however, that master pooling should only be used when there
are no clear restrictions on the pool size, e.g., given by dilution
effects. In case such effects are not present, as claimed
recently at least for small pool sizes (8), master pooling strategies
could be explored.
Another important methodological advancement in group
testing is the design of informative tests, i.e., testing strategies
that are not based on the assumption of a uniform infection
rate, but rather incorporate different estimates for the infection
rate of subgroups of the population. We expect that such
strategies will be of particular relevance for SARS-CoV-2 testing;
for example, the infection rate among healthcare professionals
or elderly care workers is expected to be higher than for
citizens working from home due to different levels of exposure
and, similarly, a stratification based on the level of symptoms
also seems reasonable. A first attempt to make use of such
a stratification for SARS-CoV-2 testing has recently been
made with two subpopulations (13). This paper, however, only
assembles homogeneous pools within the two subpopulations
and hence does not make use of the full power of informative
testing. Namely, the testing efficiency can be significantly
improved by smartly assembling combined pools with members
of both subpopulations.
Indeed our simulations confirm that this approach, when
available, can help improve testing efficiency for realistic
choices of parameters. At the same time, we expect that for
best performance, one will have to employ a combination of
different approaches.
As for many other applications, the design of the GT strategy
needs to be driven by the following challenges (31).
i. What practical considerations restrict the pooling strategies
available to the laboratory?
ii. How do the pool size and the choice of the assay for NAAT
affect the ability of a pooling algorithm to detect infected
individuals in a testing population?
iii. Given the assay and maximum pool size, what efficiencies
can be expected for different pooling strategies in testing
populations with different prevalences of the disease or well-
defined subgroups of varying prevalence?
iv. How can pooling strategies be expected to impact the accuracy
of the results?
Especially the fourth point has not received much attention in
the literature on GT approaches for SARS-CoV-2 testing yet.
Like most other testing procedures, qRT-PCR for COVID-19
misclassifies some negative specimens as positive and vice versa,
as quantified by the sensitivity and the specificity of the test (the
precise definitions are recalled in section 3).
Causes of these inaccuracies that have been documented
include low viral load in some patients, difficulty to collect samples
from COVID-19 patients, insufficient sample loading during qRT-
PCR tests, and RNA degradation during the sample handling
process (32). Some of these effects are likely to be amplified in group
testing procedures, so it becomes even more important to take
errors into account.
At the same time, the accuracy of a test is difficult to assess.
Namely, as described above, NAAT is used to quantify the
abundance of SARS-CoV-2 genetic material in a sample similarly
to tests for other viral infections (33). In the specific case of
qRT-PCR, the abundance measurement is on a continuous scale,
the cycle (Ct) at which the readout, given by a fluorescence
trace, surpasses a threshold. A decision boundary for a positive
observation, i.e., infected, has to be established based on negative
samples, i.e., biological control. Accordingly, the estimates on
false negative and false positive rates of NAAT tests (and group
tests in particular) for the SARS-CoV-2 infection depend on the
strength of the classifier induced by this decision boundary. The
accuracy of this classifier is influenced by several factors such as
the following.
1. The ability of the test to selectively amplify virus genetic
material depends on primer design. Multiple primers for qRT-
PCR testing on COVID-19 samples were recently compared
and found to be similarly strong, with a few exceptions of
published weaker primers (34).
2. A large worry about group testing is that the pooling of few
positive samples with many negative samples could push the
virus concentration in the pooled sample below the detection
limit, increasing the false negative rate. This effect has
been investigated by studying the test accuracy for dilutions
containing virus samples, and false negative rates were found
to be below 10% at a wide range of dilutions, suggesting that
the qRT-PCR stage of the testing pipeline introduces small
error rates only (7). Still, it is of fundamental importance to
accurately estimate the errors introduced by dilution effects
since a good understanding of the error is crucial to allow for
any reliable inference in a disease study (35).
3. Sample extraction methods may have varying yield in
virus material: This yield depends on the tissue or fluid that is
sampled and on the processing of the sample, such as the time
between sampling and qRT-PCR or the temperature at which
the sample is held. One would expect this sample extraction
to mostly have a destructive effect and to inflate false negative rates
rather than false positive rates.
4. The establishment of gold standard disease labels on samples
that were also tested with NAAT is of fundamental importance
to assess the overall accuracy of the classifier. There is little
such data for COVID-19 testing right now. To this end, a
recent study analyzed the positive test result rate of qRT-PCR
tests on COVID-19 patients identified based on symptoms,
where the symptom-based diagnosis served as ground truth
(36). They found false negative rates of individual tests of
around 11–25% on sputum samples. At the same time, false
positive rates are hard to estimate in the current situation
in which non-symptomatic infections occur at an unknown
frequency and because of the lack of reference gold-standard
labels for positive observations that are non-symptomatic.
However, as sample collection likely contributes little to
false positive rates, the overall false positive rate of a group
test would largely depend on the qRT-PCR stage in which
there is reason to believe that it should be small. Some
previous studies on the use of PCR for similar infectious
diseases such as SARS-like viruses as well as for SARS-CoV-2
reported high sensitivity for PCR (34,37). Indeed, in the
absence of cell culture methods, qRT-PCR tests are considered
to be the gold standard for the identification and diagnosis
of most pathogens.
The importance of the estimates described above led to a
recent collaborative effort between FIND, a Swiss foundation
for diagnostics, and the World Health Organization for the
COVID-19 pandemic in order to evaluate the qRT-PCR tests
and to assess their accuracy (38). FIND is currently evaluating
a list of more than 300 SARS-CoV-2 commercially available
tests and establishing accurate estimates for sensitivity and
specificity with their respective confidence intervals7. Based
on the preliminary findings, in this work we will assume
that the specificity of a single PCR test is 99%. For the
sensitivity, we will mostly assume the value of 99% as well
but also explore the impact of lower values, to account for
potential dilution effects, in the several tables presented in
the Appendix.
7FIND Evaluation Update: SARS-CoV-2 Molecular Diagnostics, Foundation for
Innovative New Diagnostics. Available online at: https://www.finddx.org/covid-19/sarscov2-eval-molecular/
(accessed July 10, 2020).
A common thread in the various aspects discussed in this
section seems to be the large variety of relevant parameters due
to differences between testing scenarios and uncertainty as a
consequence of infected individuals without symptoms. In this
note, we aim to illustrate that the test design of choice should
very much depend on these parameters to make the best use of
the testing capacities. We will provide a numerical comparison
between different designs for large classes of parameters, such as
the sensitivity, specificity, and the expected number of tests per
person, so the design can be constantly adapted to what is the
best fit to the current best estimate of, e.g., the infection rate and
the sensitivity.
Before discussing our numerical results, we will precisely
introduce the relevant design parameters and testing strategies
in the next section.
3. METHODS
3.1. Terminology
We start by introducing some terminology.
Prevalence p: This is the assumed infection rate of the
population that is going to be tested, that is, the fraction of
the population that is infected. Hence it also is the probability
of infection for a randomly selected individual. For simplicity
of notation, we will write q = 1 − p for the probability
that a randomly selected individual is negative. When the test
subjects can be divided into groups with different fractions
of infected subjects, we also speak of the prevalences of these
subgroups. Without further specification, however, the term
refers to the full population to be tested.
Number of stages: This denotes how many steps the method
performs sequentially and these steps are characterized by the
fact that each stage requires the results from the previous one.
In this paper, we will study adaptive methods with up to three
stages, even though more stages, usually up to four in the case
of infectious diseases, can be used (30).
Divisibility: This refers to the maximal number of tests that
can be performed on a given specimen. This number provides
a limitation on how many group tests can be performed, in
parallel or in different stages, that include the corresponding
test subject.
Group size k: This is the size of the groups that are used in a
pooling scheme. For a testing strategy to be feasible, one needs
to ensure that the maximal group size k still allows for reliable
detection of a single positive in a pool of size k.
Sensitivity Se: This is the probability that an individual test
correctly returns a positive result when applied to a positive
specimen or pool. A priori, this probability can be different
depending on the number of positives included, for example,
due to dilution effects (35,39,40), but we will neglect this
important distinction for the mathematical description below
and assume that a PCR test has a fixed sensitivity independent
of pool size. Analogously, for a pooling strategy X, Se(X) is the
probability of the whole method X returning a positive result
for a positive specimen.
Specificity Sp: This is the probability that an individual test
correctly returns a negative result when applied to a negative
specimen or pool. Again we assume that a PCR test has a
fixed specificity independent of pool size. In case dilution
effects are taken into account and more specific information
on how the sensitivity/specificity changes with the pool size k
is added, one should write Se and Sp with a dependency on k.
Analogously, for a pooling strategy X, Sp(X) is the probability
of the method X returning a negative result when a negative
specimen is tested.
Expected number of tests per person E: We consider the
expected number of tests per person as a measure of efficiency.
Naturally, the expected number of tests per person of a method
depends on the prevalence p as well as Se and Sp, but also on
the design parameters, such as the group size k and the number
of stages. We will write E(X) to denote the expected number of
tests per person for a method X, without explicitly indicating
its dependence on these parameters for the sake of notational
simplicity. There has been recent discussion in the literature about
alternative objective functions which directly take into account
the effects on specificity and sensitivity. The findings, however,
show that such an alternative choice most often does not affect
the optimal group testing configuration (41).
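To see how E(X), Se(X), and Sp(X) interact, the three quantities can be evaluated for the two-stage Dorfman design. This is a sketch under stated assumptions rather than the paper's own derivation (which follows in section 3.2): infections are independent with probability p, test errors are independent across tests, and Se and Sp do not depend on the pool size. The function name and the default values of 99% (taken from the assumptions stated in section 2) are choices made for this example.

```python
from typing import Tuple


def dorfman_characteristics(p: float, k: int,
                            se: float = 0.99, sp: float = 0.99
                            ) -> Tuple[float, float, float]:
    """Return (E, Se, Sp) of two-stage Dorfman pooling with imperfect tests.

    Assumptions (not taken from the source): independent infections,
    independent test errors, se and sp independent of the pool size k.
    """
    q = 1.0 - p

    # A pool tests positive if it contains a positive and the test detects
    # it, or if it is entirely negative and the test gives a false alarm.
    pool_positive = se * (1.0 - q ** k) + (1.0 - sp) * q ** k
    expected_tests = 1.0 / k + pool_positive

    # A positive specimen is reported positive only if both the pool test
    # and the individual retest come back positive.
    se_method = se * se

    # A negative specimen is reported positive only if its pool tests
    # positive and its own retest is a false positive.
    others_negative = q ** (k - 1)
    pool_positive_given_negative = (se * (1.0 - others_negative)
                                    + (1.0 - sp) * others_negative)
    sp_method = 1.0 - pool_positive_given_negative * (1.0 - sp)

    return expected_tests, se_method, sp_method


e, se_x, sp_x = dorfman_characteristics(p=0.01, k=10)
print(f"E = {e:.4f}, Se(X) = {se_x:.4f}, Sp(X) = {sp_x:.4f}")
```

Under these assumptions the retest stage lowers the sensitivity of the whole method below that of a single test, while the specificity of the method is at least that of a single test.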
The optimal choice of design will depend on the
aforementioned parameters. In section 4, we will explore
these dependencies numerically.
There is also some theory on the optimal design choice and
the necessary number of tests. An argument given by Sobel
and Groll (42), which is based on the seminal works by
Shannon and Huffman (43, 44), yields a theoretical lower
bound for the expected number of tests per individual of any
group testing method. More precisely, they showed that
$$E(X) \geq -p \log_2 p - q \log_2 q, \qquad q = 1 - p,$$
must hold for any method X with $S_e(X) = S_p(X) = 1$. In addition
to its theoretical interest, this bound pragmatically indicates how
much further improvement might still be possible. Note that it is
only a bound, which may well not be achievable with practically
feasible methods. Figures 5, 6 illustrate how the methods discussed
here compare to this bound and how much gain one could expect for
any large-scale group testing strategy.
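To make the bound concrete, the following minimal sketch evaluates it for a few prevalence values. It assumes perfect tests and a base-2 logarithm (the natural unit for binary test outcomes); the function names are illustrative and not taken from the paper or its repository.

```python
import math


def entropy_lower_bound(p: float) -> float:
    """Lower bound -p*log2(p) - q*log2(q) on the expected tests per person.

    Assumes perfectly accurate tests (Se = Sp = 1) and prevalence p in [0, 1].
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("prevalence p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0  # no uncertainty, so the bound is trivial
    q = 1.0 - p
    return -p * math.log2(p) - q * math.log2(q)


def max_individuals_per_test(p: float) -> float:
    """Upper limit on individuals per test implied by the bound."""
    bound = entropy_lower_bound(p)
    return math.inf if bound == 0.0 else 1.0 / bound


if __name__ == "__main__":
    for p in (0.0025, 0.01, 0.05, 0.15):
        print(f"p = {p:.4f}: E >= {entropy_lower_bound(p):.4f}, "
              f"at most {max_individuals_per_test(p):.1f} individuals per test")
```

The reciprocal of the bound is the quantity plotted as the theoretical limit when efficiency is expressed as individuals tested per available test.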
Regarding the influence of the infection rate, Ungar established
that for infection rates $p \geq (3 - \sqrt{5})/2 \approx 38\%$, the
optimal pool size is 1, so no group testing scheme is better than
individual testing (45). Also, on an intuitive level, one may
expect that the higher the prevalence, the higher the expected
number of tests. In fact, Yao and Hwang proved that the minimum of
the expected number of tests over all possible test strategies is
non-decreasing in p for $p < (3 - \sqrt{5})/2$ (46).
Therefore, in the COVID-19 pandemic, where the prevalence
in most countries, both among the tested individuals and in
the entire population, is believed to be clearly smaller than
the threshold provided by Ungar’s theorem, one can expect
a significant reduction in the average number of tests by
employing suitable group testing methods. In the following
subsection, we will discuss some of these methods and their
mathematical formulation.
3.2. Standard Group Testing Methods
In this subsection, we will recall some standard methods for
group testing that we will numerically explore in the following
section. An overview of these methods and their mathematical
formulation can be found in the book by Kim and colleagues
while their mathematical derivation was published by Johnson
et al. (47,48).
3.2.1. 2-Stage Hierarchical Testing (D2)
Dorfman’s method is an adaptive method that tests, in a first
stage, each individual as part of a group of size k (2). Then, in
the groups that tested positive, all individuals are tested again
individually in a second stage. Consequently, the method requires
a divisibility of 2. The probability that a pool of size k drawn
at random from the population tests positive, here denoted by
$P_k$, is
$$P_k = (1 - S_p)\, q^k + S_e\, (1 - q^k), \tag{1}$$
the expected number of tests per person of the method is given by
$$E(D2) = \frac{1}{k} + P_k,$$
and its sensitivity and specificity are
$$S_e(D2) = S_e^2, \qquad S_p(D2) = 1 - (1 - S_p)\, P_{k-1}.$$
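The D2 formulas can be evaluated directly. The sketch below is only an illustration of Equation (1) and the expressions for E(D2), $S_e(D2)$ and $S_p(D2)$; the function names, the default accuracy of 99%, and the brute-force search over group sizes up to 16 are assumptions made here, not the implementation used for the paper's simulations.

```python
def pool_positive_prob(k: int, p: float, se: float, sp: float) -> float:
    """P_k from Equation (1): probability that a random pool of size k tests positive."""
    q = 1.0 - p
    return (1.0 - sp) * q**k + se * (1.0 - q**k)


def dorfman(k: int, p: float, se: float = 0.99, sp: float = 0.99):
    """Return (E(D2), Se(D2), Sp(D2)) for group size k and prevalence p."""
    expected_tests = 1.0 / k + pool_positive_prob(k, p, se, sp)
    sensitivity = se**2
    # A negative specimen is a false positive only if its pool (whose other
    # k-1 members are random) tests positive and its own test errs.
    specificity = 1.0 - (1.0 - sp) * pool_positive_prob(k - 1, p, se, sp)
    return expected_tests, sensitivity, specificity


def best_pool_size(p: float, se: float = 0.99, sp: float = 0.99, k_max: int = 16) -> int:
    """Group size in 2..k_max that minimizes E(D2) (hypothetical helper)."""
    return min(range(2, k_max + 1), key=lambda k: dorfman(k, p, se, sp)[0])


if __name__ == "__main__":
    for p in (0.005, 0.01, 0.05, 0.10):
        k = best_pool_size(p)
        e, se_d2, sp_d2 = dorfman(k, p)
        print(f"p = {p:.3f}: k = {k}, E(D2) = {e:.4f}, "
              f"Se(D2) = {se_d2:.4f}, Sp(D2) = {sp_d2:.5f}")
```

With perfect tests and p = 1%, the search returns k = 11, the classical optimum of Dorfman's scheme at that prevalence.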
A slight improvement of Dorfman’s method is possible by
omitting one of the individual tests per pool in the second
stage and only performing it in a third stage when at least
one of the other second-stage tests of that pool has a positive
result, exploiting that if all other second-stage results of that
pool are negative, the remaining specimen must be infected for the
positive group test to be valid (42).
A more significant modification was proposed by Sterrett
(3). In his method, the second stage is modified by performing
individual tests until the first positive is found. Then a pooling
procedure similar to the first stage is performed for the
remaining, still unlabeled, specimens, and this scheme is repeated
until all specimens are labeled. While requiring a smaller number
of tests per individual on average, especially for small infection
rates (19), the number of stages that need to be performed
sequentially is not known a priori and may be very high. As
such Sterrett’s method is more involved in practice, while D2 is
a simple and straightforward procedure. Thus the latter is often
preferred in applications, which is also why we will perform the
simulations for the original form of D2 in this paper.
3.2.2. 3-Stage Hierarchical Testing (D3)
In this method, each individual is tested as part of a pool of size
kin the first stage. Every pool that tests positive is then split into
subgroups, which are tested in a second stage. Every member of
a subgroup with a positive result in the second stage is tested
individually in a third stage. Consequently, this method requires
a divisibility of 3. In this paper, we focus on the case that all
subgroups are of size s. The expected number of tests per person,
sensitivity, and specificity of this method are given by
$$E(D3) = \frac{1}{k} + \frac{1}{s} P_k + S_e^2\, (1 - q^s) + (1 - S_p)\, q^s P_{k-s},$$
$$S_e(D3) = S_e^3,$$
$$S_p(D3) = 1 - (1 - S_p)^2\, q^{s-1} P_{k-s} - (1 - S_p)\, S_e^2\, (1 - q^{s-1}).$$
A schematic comparison between the hierarchical tests with two
and three stages, D2 and D3, is given in Figure 2.
FIGURE 2 | Comparison between D2 and D3.
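As a numerical companion to the D3 formulas, the following sketch evaluates them for a given pool size k and subgroup size s. The function names and the requirement that s divides k are assumptions made for this illustration; the paper's own computations rely on binGroup/binGroup2.

```python
def pool_positive_prob(k: int, p: float, se: float, sp: float) -> float:
    """P_k from Equation (1): probability that a random pool of size k tests positive."""
    q = 1.0 - p
    return (1.0 - sp) * q**k + se * (1.0 - q**k)


def three_stage(k: int, s: int, p: float, se: float = 0.99, sp: float = 0.99):
    """Return (E(D3), Se(D3), Sp(D3)) for pool size k split into subgroups of size s."""
    if not (1 < s < k) or k % s != 0:
        raise ValueError("this sketch assumes 1 < s < k and that s divides k")
    q = 1.0 - p
    p_k = pool_positive_prob(k, p, se, sp)
    p_rest = pool_positive_prob(k - s, p, se, sp)  # P_{k-s}
    expected_tests = (
        1.0 / k                          # stage 1: one pool test shared by k people
        + p_k / s                        # stage 2: subgroup tests in positive pools
        + se**2 * (1.0 - q**s)           # stage 3: subgroup truly contains a positive
        + (1.0 - sp) * q**s * p_rest     # stage 3: subgroup is a false positive
    )
    sensitivity = se**3
    specificity = (
        1.0
        - (1.0 - sp) ** 2 * q ** (s - 1) * p_rest
        - (1.0 - sp) * se**2 * (1.0 - q ** (s - 1))
    )
    return expected_tests, sensitivity, specificity


if __name__ == "__main__":
    for p in (0.004, 0.01, 0.05):
        e, se_d3, sp_d3 = three_stage(16, 4, p)
        print(f"p = {p:.3f}: E(D3) = {e:.4f}, Se(D3) = {se_d3:.4f}, Sp(D3) = {sp_d3:.5f}")
```

The choice k = 16 and s = 4 in the usage example is arbitrary; other divisor pairs can be compared the same way.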
3.2.3. Array Testing (A2)
This is a 2-stage method, originally proposed by Phatarfod and
Sudbury and later explored by Kim et al. (47,49,50), that
tests every individual twice in a first stage as a part of two
different groups of size k. In a second stage all the individuals, for
which both group test results are positive, are tested individually.
Consequently, this method requires divisibility 3. A schematic
overview of array testing for different scenarios is given in
Figures 3,4.
Precisely determining the optimal way to assemble the pools is
non-trivial, see, e.g., the publication by Kim et al. (47), but
the following configuration provides a good trade-off between
simplicity and the expected number of tests. First, $k^2$ specimens
are arranged in a $k \times k$ array; then every row and every column
is pooled and subjected to a group test. This ensures that each
specimen is tested exactly twice as part of a group of size k
and constitutes the unique intersection of these two pools. For
$S_p = S_e = 1$, it is sufficient to test a person individually only
if both its row and column tests return positive results. A
negative specimen is then retested only if both its row and its
column contain a positive among their other k - 1 members, which
happens with probability $(1 - q^{k-1})^2$. In this case one obtains
the following formula for the expected number of tests per person
$$E(A2) = \frac{2}{k} + p + q\, (1 - q^{k-1})^2.$$
If $S_e$ or $S_p$ differ from 100%, the first stage may yield positive
rows without any positive columns or vice versa. In this case, it
makes sense to test every member of such a row or column
individually (47, 51). This results in a slight increase in
sensitivity at the expense of a slight increase in the expected
number of tests per person. As this change makes the formulas much
more involved, we omit them here and refer to the corresponding
literature (47).
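The perfect-test case of array testing is simple enough to check by simulation. The sketch below compares the closed-form value $2/k + p + q(1 - q^{k-1})^2$ with a Monte Carlo estimate on random $k \times k$ arrays. It assumes $S_e = S_p = 1$ and independent infections; the function names, number of simulated arrays and seed are arbitrary choices for illustration.

```python
import random


def a2_expected_tests(k: int, p: float) -> float:
    """Closed-form expected tests per person for A2 with perfect tests."""
    q = 1.0 - p
    return 2.0 / k + p + q * (1.0 - q ** (k - 1)) ** 2


def simulate_a2(k: int, p: float, n_arrays: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of tests per person for k x k array testing."""
    rng = random.Random(seed)
    total_tests = 0
    for _ in range(n_arrays):
        # True marks an infected specimen.
        grid = [[rng.random() < p for _ in range(k)] for _ in range(k)]
        row_positive = [any(row) for row in grid]
        col_positive = [any(grid[r][c] for r in range(k)) for c in range(k)]
        # Stage 2: retest every specimen whose row AND column pool is positive.
        retests = sum(
            1 for r in range(k) for c in range(k)
            if row_positive[r] and col_positive[c]
        )
        total_tests += 2 * k + retests  # k row pools + k column pools + retests
    return total_tests / (n_arrays * k * k)


if __name__ == "__main__":
    k, p = 10, 0.05
    print(f"formula:    {a2_expected_tests(k, p):.4f}")
    print(f"simulation: {simulate_a2(k, p):.4f}")
```

The two numbers should agree up to sampling noise, which gives a quick sanity check on the squared term in the formula.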
FIGURE 3 | Illustration of a simple A2 procedure where the positive individuals
are uniquely determinable after the first stage. Every individual a, b, c, ..., i gets
pooled exactly twice.
FIGURE 4 | Illustration of a simple A2 procedure where two negative
samples are also flagged in the first stage.
A2 can be generalized to procedures with three or more
simultaneous pools. In this case, the pools could be assembled, for
example, by creating pools along the diagonals and/or the
anti-diagonals (footnote 8) of an array, in addition to rows and
columns (51). An advantage of such approaches is that the group
tests for all these pools can be performed in parallel, which can
lead to faster test results, but one has to take into account that
the sensitivity decreases with the number of pool tests per
individual.
The method above can be extended to higher-dimensional
procedures, i.e., j > 3, and a connection to optimally efficient
two-stage methods can be established. Note that these arrays have
size $k^j$, which quickly renders this approach practically infeasible
as j and k increase. More concretely, Berger et al. (52) showed
that if the prevalence is p = 0.01, then an (almost) optimally
efficient two-stage method can be achieved with j = 6 and k = 74,
i.e., a 6-dimensional array with side length 74. However, the
population to be screened would then need to contain
$74^6 \approx 164$ billion individuals, which is impractical for any
real-world application. Thus, the quest for methods that use
the same principles but are effective for realistic population
sizes remains.
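The size argument is plain arithmetic and can be reproduced in a couple of lines. The helper name below is hypothetical; it simply computes the number of cells $k^j$ of a j-dimensional array with side length k.

```python
def array_population(k: int, j: int) -> int:
    """Number of specimens needed to fill a j-dimensional array of side length k."""
    return k**j


if __name__ == "__main__":
    for k, j in ((16, 2), (16, 3), (74, 6)):
        print(f"k = {k}, j = {j}: {array_population(k, j):,} specimens")
```

For k = 74 and j = 6 this gives 164,206,490,176, i.e., the roughly 164 billion individuals mentioned above.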
3.2.4. Non-adaptive Array Testing (A1)
All the group testing methods discussed so far terminate with
an individual test for all specimens with positive test results
in all previous stages, to avoid false positives that are based
only on the choice of the pools. In a situation with a shortage of
test components, there may be scenarios where one is willing to
accept a significant number of additional false positives as a
means to reduce the expected number of tests and simplify the
test design; in particular, it is desirable to perform all
tests necessary for a testing procedure in parallel.
Footnote 8. By combining different diagonals resp. anti-diagonals in a suitable way, such that
one gets groups of size k obeying the unique intersection property.
Toward this goal, one may consider replacing the last stage of
individual tests in an adaptive procedure by an additional pooling
dimension to be performed in parallel, hence transforming the
adaptive into a non-adaptive method.
When this adaptation is applied to the Dorfman method, one
obtains a procedure $A1_2$ that is identical to the first stage of A2.
When applied to A2, it yields a method $A1_3$ of three parallel
pool tests per specimen, again without a decisive individual test
at the end. By design, the resulting methods have a significantly
lower specificity, but lead to a reduction in necessary tests. An
additional advantage is that the resulting methods are fully
non-adaptive and can be performed in a single testing stage,
allowing for faster test results. At the same time, the adaptation
from the methods D2 and A2 affects neither the required
divisibility nor the sensitivity of the resulting procedure, as
adding another pooling dimension is accompanied by omitting the
last stage: one is really just trading specificity for a lower
number of tests and non-adaptivity.
Hence, a suitable decision parameter is the minimal acceptable
specificity. By the trade-off just mentioned, this also implicitly
determines the group size and hence the expected number of tests
per person via the relations
$$E(A1_j) = \frac{j}{k},$$
$$S_e(A1_j) = S_e^j,$$
$$S_p(A1_j) = 1 - P_{k-1}^j,$$
where j = 2, 3. It is important to note that such tests can
only be used when a certain false positive rate can be accepted.
If a non-adaptive method with perfect detection of positive
individuals, i.e., assuming perfectly accurate RT-PCR, is required,
a theoretical result by Aldridge shows that no testing strategy
is better than individual testing (53). Also, in contrast to the
adaptive tests discussed above, the minimal expected number of
tests per person alone is not a viable criterion for the optimal
choice of the group size k, since it would yield a strong bias
toward tests with many false positives. For the remainder of this
work, the threshold for the minimal acceptable specificity is set
to 95%. Nevertheless, we will give a short comparison with
thresholds of 90 and 97% in section 4.
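The way the specificity threshold determines the group size can be sketched as follows. Since $E(A1_j) = j/k$ falls with k while $S_p(A1_j)$ also falls with k, the sketch picks the largest group size that still meets the threshold. The function names, the cap of 16 on the group size, and the default accuracies of 99% are assumptions for this illustration; this is not the authors' separate $A1_j$ implementation.

```python
def pool_positive_prob(k: int, p: float, se: float, sp: float) -> float:
    """P_k from Equation (1): probability that a random pool of size k tests positive."""
    q = 1.0 - p
    return (1.0 - sp) * q**k + se * (1.0 - q**k)


def non_adaptive(j: int, k: int, p: float, se: float = 0.99, sp: float = 0.99):
    """Return (E, Se, Sp) of the non-adaptive method with j parallel pools of size k."""
    p_rest = pool_positive_prob(k - 1, p, se, sp)  # P_{k-1}
    return j / k, se**j, 1.0 - p_rest**j


def largest_admissible_k(j: int, p: float, se: float = 0.99, sp: float = 0.99,
                         min_specificity: float = 0.95, k_max: int = 16):
    """Largest k in 2..k_max whose specificity meets the threshold, or None."""
    admissible = [
        k for k in range(2, k_max + 1)
        if non_adaptive(j, k, p, se, sp)[2] >= min_specificity
    ]
    return max(admissible) if admissible else None


if __name__ == "__main__":
    for p in (0.01, 0.03, 0.12):
        for j in (2, 3):
            k = largest_admissible_k(j, p)
            if k is None:
                print(f"p = {p:.2f}, j = {j}: no admissible group size")
            else:
                e, se_j, sp_j = non_adaptive(j, k, p)
                print(f"p = {p:.2f}, j = {j}: k = {k}, E = {e:.3f}, "
                      f"Se = {se_j:.3f}, Sp = {sp_j:.3f}")
```

Under these assumptions, j = 2 at p = 12% only admits k = 2, so that j/k = 1 and nothing is gained over individual testing.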
3.3. Extension to the Informative Case
As described in section 2, it is possible to incorporate prior
information such as demographic, clinical, spatial, or temporal
knowledge into refined estimates for the prevalence and to
stratify the population accordingly, reflecting the heterogeneous
distribution of the infected individuals. This heterogeneity, first
explored by Nebenzahl and Sobel (54), and Hwang (55), can be
exploited for refined GT strategies.
From a mathematical point of view, informative tests are
somewhat more challenging to analyze (56–59). To illustrate
the findings of the analysis of the informative tests and
demonstrate its relevance for SARS-CoV-2 testing, we will work
with a scenario where two distinct subpopulations, one with
a high prevalence $p_{\mathrm{high}}$ (e.g., HCWs) and another, larger,
subpopulation of individuals with low prevalence $p_{\mathrm{low}}$ (e.g.,
representative samples of the general population) are to be tested.
As shown for example by Bilder and Tebbs (60), informative
testing reduces the expected number of tests per individual
even further when compared with their corresponding non-
informative counterparts. As argued by them, it is crucial to
exploit this heterogeneity and employ an efficient mixing strategy
of individuals from both subpopulations to form the pools.
Our goal here is to exploit such strategies from a different
perspective, as will be discussed in the next section: testing as
many individuals as possible with the available tests, subject to
the constraint of continually testing high-risk individuals
such as HCWs.
4. NUMERICAL RESULTS
In this section, we numerically explore different design
choices in group testing for SARS-CoV-2. A key tool is the R
package binGroup for identification and estimation using group
testing, which features the computation of optimal parameter
choices for standard group testing algorithms (footnote 9) (61). We
have complemented this package with a repository of source code for
parallel computation and comparative visualization that has been
used to create all the graphics in this section and is available
for the reader to produce visualizations adapted to different
prevalence ranges of interest (footnote 10).
As indicated in the previous section, the choices of the correct
method and the optimal group size k heavily depend on several
constraints, most importantly the underlying prevalence p (or
the subpopulation prevalences for a refined model). In this work,
instead of attempting to find the optimal method, we evaluate the
properties of a group testing design for a single fixed group size.
We investigate different infection scenarios with the different
group testing methods described above. We apply the tests D2,
D3, A2, and $A1_j$ with overall prevalence varying from 0.25 to
15%. The results for D2, D3, and A2 have been simulated using
binGroup2, while $A1_j$ has been implemented separately (footnote 11).
An important aspect to take into account when putting the
number of individuals tested per available test into perspective is
that methods based on multiple pools or stages will typically have
a smaller overall sensitivity than individual tests, cf. section 3.2.
It is crucial to integrate the sensitivity considerations into
any pooling strategy (40). In Tables A2–A4, we illustrate the
(potential) efficiency increase assuming a sensitivity of 99 and
90%, respectively, for the qRT-PCR test. As mentioned before,
extensive tests are currently being performed to confirm the
high accuracy of qRT-PCR for SARS-CoV-2 testing. Indeed, they
indicate that many available PCR procedures for SARS-CoV-2
testing show a sensitivity of or close to 100% (62). Nevertheless,
an appropriate quantitative understanding of the effects of pooling
and viral load progression on the sensitivity is still under active
discussion (63).
FIGURE 5 | (A) The number of individuals that can be tested per available test for the different adaptive methods. Here, the sensitivity and specificity are assumed to
be 99%. The theoretical bound given by (42) is also shown for comparison, and the maximum group size is assumed to be 16. (B) Zoomed version of (A) that
illustrates the low-prevalence regime of infection rates up to 2%.
Footnote 9. While working on this manuscript, the updated package binGroup2 with
improved and unified functionalities was released. Even though some of our
calculations were performed with it, since the repository makes use of the previous
version, we kept it here for consistency.
Footnote 10. Harar P, Berner J, Elbrächter D, et al. Group Testing Simulations (2020). Available
online at: https://gitlab.com/hararticles/group-testing-simulations.
Footnote 11. Bilder CR, Zhang B, Schaarschmidt F, Tebbs JM, et al. binGroup2: A Package For
Grouptesting (2020). Available online at: https://cran.r-project.org/web/packages/binGroup2/binGroup2.pdf, Version 1.0.2.
For a PCR sensitivity of 99%, we observe that the reduction in
sensitivity caused by the use of a pooling method is very small
(97% for D3, A2, and $A1_3$; 98% for D2 and $A1_2$). Only a single PCR
procedure showed a low sensitivity of 90% when choosing a specific
gene target (compared to 100% when choosing another target) (62).
In that case, we find a sensitivity of 73% for D3, A2, and $A1_3$
and 81% for D2 and $A1_2$. While the specificity of PCR already
appears to be close to 100%, the tables indicate that D2, D3,
and A2 improve the specificity even further, while $A1_2$ and $A1_3$
fulfill the preset threshold $S_p(\cdot) \geq 95\%$. Due to the
specificity constraint, $A1_2$ cannot be recommended for very high
infection rates of at least 12%, as there is no reduction in
necessary tests over individual testing. $A1_3$ is more robust but
shows the same behavior at p > 15%.
$S_e(\cdot)$ and $S_p(\cdot)$ depend mostly on the method and the
underlying sensitivity $S_e$ of the qRT-PCR test and barely change
with increasing p. Therefore, Table A5 shows the change of
$S_e(\cdot)$ and $S_p(\cdot)$ for p = 3% and varying $S_e$. It should be
noted that the sensitivity $S_e(\cdot)$ is virtually independent of
the specificity $S_p$ of PCR. Only a slight change in initial group
size can be detected. As explained in section 3, the sensitivity
can be computed as $S_e(D2) = S_e(A1_2) = S_e^2$ and
$S_e(A2) \approx S_e(D3) = S_e(A1_3) = S_e^3$.
To reflect practical considerations such as dilution effects
(7), we constrain the group size to at most sixteen (footnote 12). We
observe that all the methods yield a significantly reduced
expected number of tests per person compared to individual
testing. This improvement decays with growing infection
rate, in line with our discussion above. For prevalence values
below 4%, and hence including the estimated range of current
infection rates for SARS-CoV-2 in different countries (footnote 13),
all adaptive methods (D2, D3, A2) allow testing at least 3 times
as many individuals with the same number of tests. Around
a prevalence of 3%, both non-adaptive methods allow testing
around 5 individuals per test if a false positive rate of up to 5%
can be accepted.
Footnote 12. Since writing the article, further publications have demonstrated the feasibility of
pooling specimens for even larger pool sizes of up to 30 (64).
Compared to individual testing, where only a single individual
can be tested per available test, Figures 5, 6 demonstrate the
average number of individuals who can be tested per available
test when applying different group testing methods. For infection
rates as high as 2%, up to 5 times as many individuals as
available tests can be tested using adaptive methods. For a low
prevalence below 0.5%, this number varies between a 7- and 15-fold
efficiency increase.
Figure 6 shows the efficiency improvement of $A1_j$ compared
to the corresponding adaptive method. The specificity reduction,
the biggest drawback of the proposed non-adaptive methods,
is controlled by setting the threshold to 90, 95, and 97%.
Naturally, the methods relying on the lowest threshold show the
biggest improvement. The suggested threshold of 95% leads to a
significant improvement of $A1_2$ compared to D2 for an infection
rate between 0.4 and 5%. $A1_3$ significantly exceeds A2 and D3 for
a prevalence between 2.5 and 5%.
This is exemplified by some numerical examples in Table A1.
For example, for an infection rate of 0.4%, the city of Munich
with 1.47 million inhabitants could be tested with only 141
thousand tests using D3. The 6.69 million inhabitants of
Rio de Janeiro could be tested using around 1 million tests if the
infection rate does not exceed 1% and the adaptive methods D3
or A2 are performed. If a false positive rate of up to 5% is
considered acceptable, the non-adaptive method $A1_2$ would only
require 836,000 tests and at the same time allow for higher
prevalence values of up to 1.5%.
FIGURE 6 | The number of individuals that can be tested per available test for different non-adaptive methods and their corresponding adaptive methods. $A1_j$(9X) denotes the
non-adaptive method $A1_j$ with a specificity threshold of 9X%. Here, the sensitivity and specificity of qRT-PCR are assumed to be 99%. The theoretical bound given by
(42) is also shown for comparison, and the maximum group size is assumed to be 16.
Footnote 13. Infection rates of viruses involved in outbreaks worldwide as of 2020. Statista.
Available online at: https://bit.ly/2wOmuyo (accessed April 28, 2020).
To summarize, below 1% infection rate, any of the presented
group testing procedures will constitute an extreme improvement
over individual testing, while D3 shows the best performance.
For $1\% \leq p < 6\%$, A2 and D3 show a comparable performance,
which is superior to D2. For $p \geq 10\%$, all adaptive methods show
a similar performance.
Considering the non-adaptive methods, $A1_2$ requires a
significantly reduced expected number of tests for an infection
rate between 1 and 4%. For a prevalence between 3 and 8%,
$A1_3$ shows the highest reduction in the number of tests of all
methods. However, the trade-off between the lowest number of
tests and a false positive rate of up to 5% has to be considered
when choosing the testing method.
Next, we numerically explore the average number of tests
of different approaches for informative testing, with the goal
of finding the best way to incorporate refined knowledge
about different prevalences for distinct subpopulations. Each
plot of Figure A1 compares the expected number of tests
per person of two informative testing methods, namely the
approach of choosing pools separately for the subpopulations,
and the approach of assembling the pools with members of all
subpopulations. We study a model with two subpopulations of
different prevalence, and consider prevalence values between 5
and 25% for the high-risk and between 0.1 and 5% for the
low-risk group. As far as we are aware, the assumption of
different prevalence values for two subpopulations was first made
in the context of SARS-CoV-2 by Deckert et al. (13), who speak
of homogeneous pools and use non-informative D2 for their
analysis. However, the question of whether and how to adjust
the testing procedure based on subpopulation knowledge did not
arise in that work.
We find that for A2 and D3, the advantage of assembling
combined pools from both subpopulations gets larger when the
prevalence of the low-risk group decreases. How it depends on
the prevalence of the high-risk group differs depending on the
method and also on the constraints imposed on the group size. For
D2, however, the same phenomenon was not observed. More
experiments of the same type but with different group sizes as
well as different sensitivities and specificities can be visualized
in our web application (footnote 14).
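The comparison between separate and combined pools can be illustrated for the simplest case, D2. The sketch below is deliberately naive: it generalizes Equation (1) to a pool whose members have different prevalences and retests every member of a positive pool, without any of the refinements of the informative procedures in the literature (56–60). The function names, the 1:7 mixing ratio, and the cap of 16 on the group size are assumptions made here.

```python
from math import prod


def pool_positive_prob(prevalences, se: float, sp: float) -> float:
    """Probability that a pool tests positive when member i has prevalence p_i."""
    all_negative = prod(1.0 - p for p in prevalences)
    return (1.0 - sp) * all_negative + se * (1.0 - all_negative)


def d2_tests_per_person(prevalences, se: float, sp: float) -> float:
    """E(D2) for one pool: 1/k for the pool test plus a retest if it is positive."""
    return 1.0 / len(prevalences) + pool_positive_prob(prevalences, se, sp)


def separate_pools(p_high, p_low, frac_high, se, sp, k_max: int = 16) -> float:
    """Each subpopulation gets its own optimal group size; population-weighted E."""
    def best(p):
        return min(d2_tests_per_person([p] * k, se, sp) for k in range(2, k_max + 1))
    return frac_high * best(p_high) + (1.0 - frac_high) * best(p_low)


def combined_pools(p_high, p_low, n_high, n_low, se, sp) -> float:
    """Every pool mixes n_high high-risk and n_low low-risk specimens."""
    return d2_tests_per_person([p_high] * n_high + [p_low] * n_low, se, sp)


if __name__ == "__main__":
    # One high-risk individual for every seven low-risk individuals.
    sep = separate_pools(0.10, 0.01, 1 / 8, 0.99, 0.99)
    mix = combined_pools(0.10, 0.01, 1, 7, 0.99, 0.99)
    print(f"separate pools: {sep:.4f} tests per person")
    print(f"combined pools: {mix:.4f} tests per person")
```

In this single configuration, separate pools need fewer tests than naively mixed D2 pools. No general conclusion should be drawn from one example, and the sketch says nothing about A2 or D3.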
5. DISCUSSION
In this manuscript, we provide a comparison of general strategies
for group testing in view of their application to medical diagnosis
in the current COVID-19 pandemic.
Our numerical study confirms the recent observation that,
even under practical constraints for pooled SARS-CoV-2 tests,
such as restrictions on the pool size, and for prevalence
values in the estimated range of current infection rates in
many regions (footnote 13), group testing is typically more
efficient than individual testing and allows for an efficiency
increase of up to a factor of 10 across realistic scenarios and
testing strategies. We also find significant efficiency gaps
between different group testing strategies in realistic scenarios
for SARS-CoV-2 testing, highlighting the need for an informed
decision of the pooling protocol. The repository for parallel
computation and comparative visualization accompanying this
manuscript allows the reader to visualize the performance of the
different approaches similarly to the tables and graphics
contained in this paper for different sets of parameters
(footnote 10).
For every scenario and method, an optimal pool size can be
determined. However, the pool size is constrained biochemically
by dilution effects and by sensitivity considerations. For a low
prevalence, this can prevent choosing the optimal pool size.
We find that within pooling protocols, sophisticated methods
that employ multiple stages or multiple pools per sample,
or exploit prevalence estimates for subpopulations have the
strongest advantages at low prevalences.
Such low prevalence values are realistic assumptions, especially
for large-scale tests of representative parts of the population,
so these methods are particularly suited for full population
screens or representative sub-population screens with the goal
of reducing transmission and flattening the infection curve.
This is of fundamental importance since transmission before
the onset of symptoms has been commonly reported and
asymptomatic cases seem to be very common (65). For example,
328 of the 634 positive cases on board the formerly
quarantined Diamond Princess cruise ship were asymptomatic
at the time of testing, which corresponds to 52% of the cases.
Another study conducted in a homeless shelter in Boston,
MA, USA, confirmed that standard COVID-19 symptoms like
cough, shortness of breath, and fever were uncommon among
individuals who tested positive and strongly argues for universal
PCR testing on that basis (66). Also, besides enhancing the tests
of mild/asymptomatic cases, some disease control centers, such
as the ECDC, recommend that group testing should potentially
be applied to prevalence studies (footnote 15).
Footnote 14. Harar P, Berner J, Elbrächter D, et al. Group Testing Simulations (2020). Available
online at: https://gitlab.com/hararticles/group-testing-simulations.
The pooling schemes suggested here can also include routine
tests of cohesive subpopulations with high prevalence, such as
healthcare workers, and therefore propose a sensible way to
include commonly available information about risk groups into
the setup (67). For certain scenarios, our numerical experiments
show a reduced expected number of tests when employing
combined pools consisting of high-risk and low-risk individuals
provided some estimates for the prevalence in these two parts of
the test population are available.
One could also envision separate pooled tests with different
requirements on specificity and population coverage in sub-
populations with different prevalence, again highlighting the
importance of proper stratification: High specificity is for
example likely desirable among healthcare workers whereas
specificity may be partially traded for coverage during contact
tracing. At the heart of these trade-offs lie considerations about
the societal cost of false positives in comparison to the cost of
missed diagnosis because of a lack of available tests.
The improved test efficiency of group testing is, however,
only one aspect of test design. Carefully tracing every single
specimen throughout the whole process is of utmost importance.
As this is already the case for individual tests, the additional
requirements for tracing pooled specimens are rather
minor and typically covered by the specimen registration in
the laboratory information systems (LIS). Moreover, the FDA
has published an amendment for pooling protocols which
includes guidelines for the appropriate traceability/registration
of the pooled samples (footnote 16). From the IT point of view,
sample tracing can be implemented, for example, via a hash file,
which has proven successful for large-scale implementations of
group testing, see (68).
Nevertheless, practitioners have to take several factors
into account when deciding if group testing can provide
a feasible solution for mass testing procedures (40). Some
important practical considerations are time constraints,
specimen conservation for multi-stage testing, and resource
availability, as well as the actual execution of the test in the
labs, such as variations in pipetting and sample collection. In
particular, the decision at which stage the pooling takes place
(pre-pre-analytical, pre-analytical, or analytical) is crucial for
the expected turnaround time (69). All of these aspects need to be
carefully considered before the establishment of massive pooled
test policies.
Footnote 15. Laboratory support for COVID-19 in the EU/EEA, European Center for
Disease Prevention and Control. Available online at: https://www.ecdc.europa.eu/en/novel-coronavirus/laboratory-support (accessed April 28, 2020).
Footnote 16. In Vitro Diagnostics EUAs - Molecular Diagnostic Tests for SARS-CoV-2,
US Food and Drug Administration. Available online at: https://bit.ly/2QCxCIQ
(accessed April 30, 2021).
qRT-PCR-based tests are currently widely deployed for
COVID-19 diagnosis and, more generally, to identify current
infections (37, 70). As with any nucleic acid amplification test (NAAT),
one can only identify cases where virus particles can still be
detected. Thus, for long-term disease monitoring, NAATs will
have to be complemented by serological tests, as these can be
used to infer the immunity state of a patient and hence identify
past asymptomatic infections through detection of disease-
specific antibodies. Such tests have already been deployed in a
few cases (71,72). In contrast to the PCR testing procedures
mainly discussed in this paper, the main intention of serological
testing is to obtain accurate estimations of the number of
unidentified previous infections as a measure for the progress
toward herd immunity. Group testing can also be expected to
yield accuracy gains for this problem. Namely, group testing for
prevalence estimation is an active area of research with many
recent advancements. Also, in settings like a hospital, nursing
homes, or similar, the employment of rapid and massive testing
may be superior for overall infection control compared to less
frequent, highly sensitive tests with prolonged turnaround times.
Therefore, pooling strategies for antigen tests or other point-of-
care tests should also be considered in this scenario and we are
confident that some of these results can be employed once pooled
tests become available (17,73). In any case, there are still many
well-established methodological tools available in the literature
that have not yet been explored for SARS-CoV-2 testing, so we
advocate for a continued exchange between theory, simulation
and visualization, and practice.
DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online
repositories. The names of the repository/repositories and
accession number(s) can be found below: Source code is
available on GitLab: https://gitlab.com/hararticles/group-testing-simulations.
AUTHOR CONTRIBUTIONS
CV: conceptualization (manuscript), methodology, validation,
formal analysis, investigation, writing—original draft, review and
editing, and visualization. TF: conceptualization (manuscript),
methodology, software, validation, formal analysis, investigation,
writing—original draft, review and editing, and visualization. PH,
DE, and JB: methodology, software, validation, formal analysis,
investigation, data curation, writing—review and editing, and
visualization. DF: validation, investigation, writing—original
draft, and review and editing. PG: conceptualization (project),
supervision, and project administration. FT: supervision and
project administration. FK: conceptualization (project and
manuscript), supervision, and project administration. All authors
contributed to the article and approved the submitted version.
FUNDING
CV gratefully acknowledges support by the German Science
Foundation (DFG) within the Gottfried Wilhelm Leibniz Prize
under Grant BO 1734/20-1, under contract number PO-1347/3-2,
and within Germany’s Excellence Strategy EXC-2111 390814868.
CV and FK gratefully acknowledge support by the German Science
Foundation in the context of the Emmy Noether junior research
group KR 4512/1-1. TF and FK gratefully acknowledge funding by the
German Science Foundation (project KR 4512/2-2). FT gratefully
acknowledges support by the BMBF (grant# 01IS18036A and
grant# 01IS18053A) and by the Chan Zuckerberg Initiative
DAF (advised fund of Silicon Valley Community Foundation,
182835). JB and DE gratefully acknowledge support by the Austrian
Science Fund (FWF) under grants I3403-N32 and P 30148. PG
and PH declare that no external funding was received. DF
acknowledges support from a German Research Foundation
(DFG) fellowship through the Graduate School of Quantitative
Biosciences Munich (QBM) [GSC 1006 to DF] and by the
Joachim Herz Stiftung. All data are publicly available. FK had the
final responsibility for the decision to submit for publication.
ACKNOWLEDGMENTS
The authors are grateful to Luciana Jesus da Costa, from the
Virology Department at the Federal University of Rio de Janeiro,
for insightful discussions about SARS-CoV-2. This manuscript
has been released as a pre-print at medRxiv (74).
SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found
online at: https://www.frontiersin.org/articles/10.3389/fpubh.2021.583377/full#supplementary-material
REFERENCES
1. Fraser C, Riley S, Ferguson NM. Factors that make an infectious disease
outbreak controllable. Proc Natl Acad Sci USA. (2004) 101:6146–51.
doi: 10.1073/pnas.0307506101
2. Dorfman R. The detection of defective members of large populations. Ann
Math Statist. (1943) 14:436–40. doi: 10.1214/aoms/1177731363
3. Sterrett A. On the detection of defective members of large populations. Ann
Math Statist. (1957) 28:1033–6. doi: 10.1214/aoms/1177706807
4. Sobel M, Groll PA. Binomial group-testing with an unknown proportion of
defectives. Technometrics. (1966) 8:631–56. doi: 10.2307/1266636
5. Tebbs JM, McMahan CS, Bilder CR. Two-stage hierarchical group testing
for multiple infections with application to the Infertility Prevention Project.
Biometrics. (2013) 69:1064–73. doi: 10.1111/biom.12080
6. Hourfar MK, Themann A, Eickmann M, Puthavathana P, Laue T, Seifried
E, et al. Blood screening for influenza. Emerg Infect Dis. (2007) 13:1081–3.
doi: 10.3201/eid1307.060861
7. Yelin I, Aharony N, Tamar ES, Argoetti A, Messer E, Berenbaum D, et al.
Evaluation of COVID-19 RT-qPCR test in multi-sample pools. Clin Infect Dis.
(2020) 71:2073–8. doi: 10.1093/cid/ciaa531
8. Schmidt M, Hoehl S, Berger A, Zeichhardt H, Hourfar K, Ciesek S, et al.
FACT-Frankfurt adjusted COVID-19 testing- a novel method enables high-
throughput SARS-CoV-2 screening without loss of sensitivity. medRxiv
[Preprint]. (2020). doi: 10.1101/2020.04.28.20074187
9. Abdalhamid B, Bilder CR, McCutchen EL, Hinrichs SH, Koepsell SA, Iwen PC.
Assessment of specimen pooling to conserve SARS CoV-2 testing resources.
Am J Clin Pathol. (2020) 153:715–8. doi: 10.1101/2020.04.03.20050195
10. Shani-Narkiss H, David Gilday O, Yayon N, Daniel Landau I. Efficient and
practical sample pooling for high-throughput PCR diagnosis of COVID-19.
medRxiv [Preprint]. (2020). doi: 10.1101/2020.04.06.20052159
11. Mentus C, Romeo M, DiPaola C. Analysis and applications of non-adaptive
and adaptive group testing methods for COVID-19. medRxiv [Preprint].
(2020). doi: 10.1101/2020.04.05.20050245
12. Sinnott-Armstrong N, Klein D, Hickey B. Evaluation of group
testing for SARS-CoV-2 RNA. medRxiv [Preprint]. (2020).
doi: 10.1101/2020.03.27.20043968
13. Deckert A, Bärnighausen T, Kyei N. Pooled-sample analysis strategies for
COVID-19 mass testing: a simulation study. Bull World Health Organ. (2020)
98:590. doi: 10.2471/BLT.20.257188
14. Theagarajan LN. Group testing for COVID-19: how to stop worrying and test
more. arXiv [Preprint]. arXiv:2004.06306. (2020).
15. de Wolff T, Pfluger D, Rehme M, Heuer J, Bittner MI. Evaluation of pool-based
testing approaches to enable population-wide screening for COVID-19. PLoS
ONE. (2020) 15:e0243692. doi: 10.1371/journal.pone.0243692
16. Shental N, Levy S, Skorniakov S, Wuvshet V, Shemer-Avni Y, Porgador A,
et al. Efficient high throughput SARS-CoV-2 testing to detect asymptomatic
carriers. medRxiv [Preprint]. (2020). doi: 10.1101/2020.04.14.20064618
17. Bilder CR. Group testing for estimation. In: Balakrishnan N, Colton
T, Everitt B, Piegorsch W, Ruggeri F, Teugels JL, editors. Wiley
StatsRef: Statistics Reference Online. John Wiley & Sons (2019). p. 1–11.
doi: 10.1002/9781118445112.stat08231
18. Hughes-Oliver JM. Pooling experiments for blood screening and drug
discovery. In: Dean A, Lewis S, editors. Screening. New York, NY: Springer
(2006). p. 48–68. doi: 10.1007/0-387-28014-6_3
19. Malinovsky Y, Albert PS. Revisiting nested group testing procedures:
new results, comparisons, and robustness. Am Stat. (2019) 73:117–25.
doi: 10.1080/00031305.2017.1366367
20. Orben A, Tomova L, Blakemore SJ. The effects of social deprivation on
adolescent development and mental health. Lancet Child Adolesc Health.
(2020) 4:634–40. doi: 10.1016/S2352-4642(20)30186-3
21. Lyng GD, Sheils NE, Kennedy CJ, Griffin DO, Berke EM. Identifying
optimal COVID-19 testing strategies for schools and businesses: balancing
testing frequency, individual test technology, and cost. PLoS ONE. (2021)
16:e0248783. doi: 10.1371/journal.pone.0248783
22. Du DZ, Hwang FK. Combinatorial Group Testing and Its Applications. 2nd ed.
Singapore: World Scientific (2000). doi: 10.1142/4252
23. Aldridge M, Johnson O, Scarlett J. Group testing: an information theory
perspective. Found Trends Commun Inform Theory. (2019) 15:196–392.
doi: 10.1561/0100000099
24. Kumar S. Multinomial group-testing. SIAM J Appl Math. (1970) 19:340–50.
doi: 10.1137/0119032
25. Atia GK, Saligrama V. Boolean compressed sensing and noisy group testing.
IEEE Trans Inf Theory. (2012) 58:1880–901. doi: 10.1109/TIT.2011.2178156
26. Bryan K, Leise T. Making do with less: an introduction to compressed sensing.
SIAM Rev. (2013) 55:547–66. doi: 10.1137/110837681
27. Gilbert AC, Iwen MA, Strauss MJ. Group testing and sparse signal
recovery. In: 42nd Asilomar Conference on Signals, Systems and Computers.
Pacific Grove, CA (2009). p. 1059–62. doi: 10.1109/ACSSC.2008.5074574
28. Chan CL, Jaggi S, Saligrama V, Agnihotri S. Non-adaptive group testing:
explicit bounds and novel algorithms. IEEE Trans Inf Theory. (2014)
60:3019–35. doi: 10.1109/TIT.2014.2310477
29. Zaman N, Pippenger N. Asymptotic analysis of optimal nested
group-testing procedures. Prob Eng Inform Sci. (2016) 30:547–52.
doi: 10.1017/S0269964816000267
30. Bilder CR. Group testing for identification. In: Wiley StatsRef: Statistics
Reference Online. (2019). p. 1–11. doi: 10.1002/9781118445112.stat08227
31. Westreich DJ, Hudgens MG, Fiscus SA, Pilcher CD. Optimizing
screening for acute human immunodeficiency virus infection with
pooled nucleic acid amplification test. J Clin Microbiol. (2008) 46:1785–92.
doi: 10.1128/JCM.00787-07
32. Lu R, Wang J, Li M, Wang Y, Dong J, Cai W. SARS-CoV-2 detection using
digital PCR for COVID-19 diagnosis, treatment monitoring and criteria for
discharge. medRxiv [Preprint]. (2020). doi: 10.1101/2020.03.24.20042689
33. Sheridan C. Coronavirus and the race to distribute reliable diagnostics. Nat
Biotechnol. (2020) 38:382–4. doi: 10.1038/d41587-020-00002-2
34. Vogels CBF, Brito AF, Wyllie AL, Fauver JR, Ott IM, Kalinich CC, et al.
Analytical sensitivity and efficiency comparisons of SARS-CoV-2 qRT-PCR
assays. Nat Microbiol. (2020) 5:1299–305. doi: 10.1038/s41564-020-0761-6
35. Wein LM, Zenios S. Pooled testing for HIV screening: capturing the dilution
effect. Operat Res. (1996) 44:543–69. doi: 10.1287/opre.44.4.543
36. Yang Y, Yang M, Shen C, Wang F, Li J, Zhang M, et al. Evaluating the
accuracy of different respiratory specimens in the laboratory diagnosis
and monitoring the viral shedding of 2019-nCoV infections. Innov. (2020)
1:100061. doi: 10.1101/2020.02.11.20021493
37. Corman VM, Landt O, Kaiser M, Molenkamp R, Meijer A, Chu DKW.
Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR. Euro
Surveill. (2020) 25:1–8. doi: 10.2807/1560-7917.ES.2020.25.3.2000045
38. Waltz E. Testing the Tests: Which COVID-19 Tests Are Most Accurate? IEEE
Spectrum. (2020).
39. Ransohoff DF, Feinstein AR. Problems of spectrum and bias in evaluating
the efficacy of diagnostic tests. N Engl J Med. (1978) 299:926–30.
doi: 10.1056/NEJM197810262991705
40. Haber G, Malinovsky Y, Albert PS. Is group testing ready for prime-time in
disease identification? arXiv [Preprint]. arXiv:2004.04837. (2020).
41. Hitt BD, Bilder CR, Tebbs JM, McMahan CS. The objective function
controversy for group testing: much ado about nothing? Stat Med. (2019)
38:4912–23. doi: 10.1002/sim.8341
42. Sobel M, Groll PA. Group testing to eliminate efficiently all defectives
in a binomial sample. Bell Syst Tech J. (1959) 38:1179–252.
doi: 10.1002/j.1538-7305.1959.tb03914.x
43. Shannon CE. A mathematical theory of communication. Bell Syst Tech J.
(1948) 27:379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x
44. Huffman DA. A method for the construction of minimum redundancy codes.
Proc IRE. (1952) 40:1098–103. doi: 10.1109/JRPROC.1952.273898
45. Ungar P. The cutoff point for group testing. Commun Pure Appl Math. (1960)
13:49–54. doi: 10.1002/cpa.3160130105
46. Yao YC, Hwang FK. A fundamental monotonicity in group testing. SIAM J
Discrete Math. (1988) 1:256–9. doi: 10.1137/0401026
47. Kim HY, Hudgens MG, Dreyfuss JM, Westreich DJ, Pilcher CD. Comparison
of group testing algorithms for case identification in the presence of test error.
Biometrics. (2007) 63:1152–63. doi: 10.1111/j.1541-0420.2007.00817.x
48. Johnson NL, Kotz S, Wu XZ. Inspection Errors for Attributes in Quality
Control. London: Chapman and Hall Ltd (1991).
49. Phatarfod RM, Sudbury A. The use of a square array scheme in blood testing.
Stat Med. (1994) 13:2337–43. doi: 10.1002/sim.4780132205
50. Kim HY, Hudgens MG. Three-dimensional array-based
group testing algorithms. Biometrics. (2009) 65:903–10.
doi: 10.1111/j.1541-0420.2008.01158.x
51. Woodbury CP, Fitzloff JF, Vincent SS. Sample multiplexing for greater
throughput in HPLC and related methods. Anal Chem. (1995) 67:885–90.
doi: 10.1021/ac00101a015
52. Berger T, Mandell JW, Subrahmanya P. Maximally
efficient two-stage screening. Biometrics. (2000) 56:833–40.
doi: 10.1111/j.0006-341X.2000.00833.x
53. Aldridge M. Individual testing is optimal for nonadaptive group testing
in the linear regime. IEEE Trans Inf Theory. (2019) 65:2058–61.
doi: 10.1109/TIT.2018.2873136
54. Nebenzahl E, Sobel M. Finite and infinite models for generalized group-testing
with unequal probabilities of success for each item. In: Cacoullos T, editor.
Discriminant Analysis and Applications. New York, NY: Academic Press Inc.
(1973). p. 239–84.
55. Hwang FK. A generalized binomial group testing problem. J
Am Stat Assoc. (1975) 70:923–6. doi: 10.1080/01621459.1975.10480324
56. Bilder CR, Tebbs J, Chen P. Informative retesting. J Am Stat Assoc. (2010)
105:942–55. doi: 10.1198/jasa.2010.ap09231
57. McMahan C, Tebbs J, Bilder CR. Informative Dorfman screening.
Biometrics. (2012) 68:287–96. doi: 10.1111/j.1541-0420.2011.01644.x
58. McMahan C, Tebbs J, Bilder CR. Two-dimensional informative array testing.
Biometrics. (2012) 68:793–804. doi: 10.1111/j.1541-0420.2011.01726.x
59. Black MS, Bilder CR, Tebbs JM. Optimal retesting configurations for
hierarchical group testing. J R Stat Soc Ser C. (2015) 64:693–710.
doi: 10.1111/rssc.12097
60. Bilder CR, Tebbs JM. Pooled-testing procedures for screening high volume
clinical specimens in heterogeneous populations. Stat Med. (2012) 31:3261–8.
doi: 10.1002/sim.5334
61. Bilder CR, Zhang B, Schaarschmidt F, Tebbs JM. binGroup: a package for
group testing. R J. (2010) 2:56–60. doi: 10.32614/RJ-2010-016
62. FIND Evaluation Update: SARS-CoV-2 Molecular Diagnostics. Foundation
for Innovative New Diagnostics. Available online at: https://www.finddx.org/covid-19/sarscov2-eval-molecular/
(accessed April 28, 2020).
63. Nguyen NT, Aprahamian H, Bish EK, Bish DR. A methodology for deriving
the sensitivity of pooled testing, based on viral load progression and pooling
dilution. J Transl Med. (2019) 17:252. doi: 10.1186/s12967-019-1992-2
64. Lohse S, Pfuhl T, Berkó-Göttel B, Rissland J, Geißler T, Gärtner B, et al. Pooling
of samples for testing for SARS-CoV-2 in asymptomatic people. Lancet Infect
Dis. (2020) 20:1231–2. doi: 10.1016/S1473-3099(20)30362-5
65. Zhang J, Litvinova M, Wang W, Wang Y, Deng X, Chen X, et al. Evolving
epidemiology and transmission dynamics of coronavirus disease 2019 outside
Hubei province, China: a descriptive and modelling study. Lancet Infect Dis.
(2020) 20:793–802. doi: 10.1016/S1473-3099(20)30230-9
66. Baggett TP, Keyes H, Sporn N, Gaeta JM. COVID-19 outbreak at a large
homeless shelter in Boston: implications for universal testing. medRxiv
[Preprint]. (2020). doi: 10.1101/2020.04.12.20059618
67. Black JRM, Bailey C, Przewrocka J, Dijkstra KK, Swanton C. COVID-19: the
case for health-care worker screening to prevent hospital transmission. Lancet.
(2020) 395:1418–20. doi: 10.1016/S0140-6736(20)30917-X
68. Barak N, Ben-Ami R, Sido T, Perri A, Shtoyer A, Rivkin M, et al. Lessons from
applied large-scale pooling of 133,816 SARS-CoV-2 RT-PCR tests. Sci Transl
Med. (2021) 13. doi: 10.1101/2020.10.16.20213405
69. Tan JG, Omar A, Lee WB, Wong MS. Considerations for group testing: a
practical approach for the clinical laboratory. Clin Biochem Rev. (2020) 41:79.
doi: 10.33176/AACB-20-00007
70. Chu DKW, Pan Y, Cheng SMS, Hui KPY, Krishnan P, Liu Y, et al.
Molecular diagnosis of a novel coronavirus (2019-nCoV) causing an outbreak
of pneumonia. Clin Chem. (2020) 66:549–55. doi: 10.1093/clinchem/hvaa029
71. Kontou PI, Braliou GG, Dimou NL, Nikolopoulos G, Bagos PG. Antibody
tests in detecting SARS-CoV-2 infection: a meta-analysis. Diagnostics. (2020)
10:319. doi: 10.3390/diagnostics10050319
72. GeurtsvanKessel CH, Okba NMA, Igloi Z, Bogers S, Embregts CWE,
Laksono BM, et al. An evaluation of COVID-19 serological assays informs
future diagnostics and exposure assessment. Nat Commun. (2020) 11:3436.
doi: 10.1038/s41467-020-17317-y
73. Malinovsky Y, Zacks S. Proportional closeness estimation of probability
of contamination under group testing. Sequential Anal. (2018) 37:145–57.
doi: 10.1080/07474946.2018.1466518
74. Verdun CM, Fuchs T, Harar P, Elbrächter D, Fischer DS, Berner J, et al.
Group testing for SARS-CoV-2 allows for up to 10-fold efficiency increase
across realistic scenarios and testing strategies. medRxiv [Preprint]. (2020).
doi: 10.1101/2020.04.30.20085290
Conflict of Interest: The authors declare that the research was conducted in the
absence of any commercial or financial relationships that could be construed as a
potential conflict of interest.
Publisher’s Note: All claims expressed in this article are solely those of the authors
and do not necessarily represent those of their affiliated organizations, or those of
the publisher, the editors and the reviewers. Any product that may be evaluated in
this article, or claim that may be made by its manufacturer, is not guaranteed or
endorsed by the publisher.
Copyright © 2021 Verdun, Fuchs, Harar, Elbrächter, Fischer, Berner, Grohs,
Theis and Krahmer. This is an open-access article distributed under the terms
of the Creative Commons Attribution License (CC BY). The use, distribution or
reproduction in other forums is permitted, provided the original author(s) and the
copyright owner(s) are credited and that the original publication in this journal
is cited, in accordance with accepted academic practice. No use, distribution or
reproduction is permitted which does not comply with these terms.
Frontiers in Public Health | www.frontiersin.org 13 August 2021 | Volume 9 | Article 583377
... As such, widespread, scalable, and frequent testing is a defining challenge in combatting COVID-19 in the face of local, national, and global resource constraints. Pooled testing has recently arisen as a promising efficient scientific solution to the world-wide challenge of increasing COVID-19 testing capacity [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17] , encouraged in part by the finding that a single positive sample can be reliably detected by RT-qPCR in large pools 18 . ...
... Many recent works [8][9][10][12][13][14][15][16]61 focus on developing pooled testing methods for COVID-19. We will focus here on one-stage and two-stage approaches; multi-stage approaches can make robust lab implementation more difficult and can take longer to complete, which can make them less suitable for time-sensitive public health settings like COVID-19 testing. ...
... Since the flexibility of HYPER allows for many designs, we next compared different HYPER designs and their various tradeoffs. Specifically, we considered various choices for the number of pools (Fig. 2c, m = 32, 16,12) and the number of splits (Fig. 2d, q = 1, 2, 3). Similar to earlier studies of random assignment designs 11 , the HYPER designs with a smaller number of pools m are generally more efficient (especially when the prevalence is small) but slightly less sensitive. ...
Large scale screening is a critical tool in the life sciences, but is often limited by reagents, samples, or cost. An important recent example is the challenge of achieving widespread COVID-19 testing in the face of substantial resource constraints. To tackle this challenge, screening methods must efficiently use testing resources. However, given the global nature of the pandemic, they must also be simple (to aid implementation) and flexible (to be tailored for each setting). Here we propose HYPER, a group testing method based on hypergraph factorization. We provide theoretical characterizations under a general statistical model, and carefully evaluate HYPER with alternatives proposed for COVID-19 under realistic simulations of epidemic spread and viral kinetics. We find that HYPER matches or outperforms the alternatives across a broad range of testing-constrained environments, while also being simpler and more flexible. We provide an online tool to aid lab implementation: http://hyper.covid19-analysis.org. This paper proposes HYPER, a method for screening more people using fewer tests by testing pools formed via hypergraph factorization. HYPER is not only efficient but is also simple to implement, flexible, and has maximally balanced pools.
... This idea was then applied to many other areas: screening vaccines for contamination, building clone libraries for DNA sequences, data forensics for altered documents, modification tolerant digital signatures [8,6,13,14,15,16,17,18]. Currently, it is considered a promising scheme for saving time and resources in COVID-19 testing [5,23,24,25,29]. In fact, several countries, such as China, India, Germany and the United States, have adopted group testing as a way of saving time and resources [23]. ...
... In addition, in non-adaptive CGT, we can have more balanced sizes of the groups (items in each test), which is limited in some real applications. For COVID-19 screening, researchers are testing how many samples can be grouped together without compromising the detection of positive results [23,29]. ...
Combinatorial group testing (CGT) is used to identify defective items from a set of items by grouping them together and performing a small number of tests on the groups. Recently, group testing has been used to design efficient COVID-19 testing, so that resources are saved while still identifying all infected individuals. Due to test waiting times, a focus is given to non-adaptive CGT, where groups are designed a priori and all tests can be done in parallel. The design of the groups can be done using Cover-Free Families (CFFs). The main assumption behind CFFs is that a small number $d$ of positives are randomly spread across a population of $n$ individuals. However, for infectious diseases, it is reasonable to assume that infections show up in clusters of individuals with high contact (children in the same classroom within a school, households within a neighbourhood, students taking the same courses within a university, people sitting close to each other in a stadium). The general structure of these communities can be modeled using hypergraphs, where vertices are items to be tested and edges represent clusters containing high contacts. We consider hypergraphs with non-overlapping edges and overlapping edges (first two examples and last two examples, respectively). We give constructions of what we call structure-aware CFFs, which use the structure of the underlying hypergraph. We revisit old CFF constructions, boosting the number of defectives they can identify by taking the hypergraph structure into account. We also provide new constructions based on hypergraph parameters.
... 16,17 In addition to simple pooling testing (e.g., Dorfman testing or hierarchical group testing), more complicated methodologies such as nonhierarchical group testing or array testing have also been developed. [18][19][20] Recently, these pooling strategies have been adopted to efficiently detect SARS-CoV-2 by speeding up the tests and increasing the testing capacity, and have thus been recommended by the US FDA as well. 21 Several simulation studies tried to determine the optimal pooling strategy by taking into account relevant parameters such as pool size and positive rate (PR). ...
Background: This study aimed to compare the testing strategies for COVID-19 (i.e., individual, simple pooling, and matrix pooling) in terms of cost. Methods: We simulated the total expenditures of each testing strategy for running 10,000 tests. Three parameters were used: positive rate (PR), pool size, and test cost. We compared the total testing costs under two hypothetical scenarios in South Korea. We also simulated country-specific circumstances in India, South Africa, South Korea, the UK, and the USA. Results: At extreme PRs of 0.01% and 10%, simple pooling was the most economic option and resulted in cost reductions of 98.0% (pool size ≥80) and 36.7% (pool size = 3), respectively. At moderate PRs of 0.1%, 1%, 2%, and 5%, the matrix pooling strategy was the most economic option and resulted in cost reductions of 97.0% (pool size ≥88), 86.1% (pool size = 22), 77.9% (pool size = 14), and 59.2% (pool size = 7), respectively. In both hypothetical scenarios of South Korea, simple pooling costs less than matrix pooling. However, the preferable options for achieving cost savings differed depending on each country's cost per test and PRs. Conclusions: Both pooling strategies resulted in notable cost reductions compared with individual testing in most scenarios pertinent to real-life situations. The appropriate type of testing strategy should be chosen by considering the PR of COVID-19 in the community and the test cost while using an appropriate pooling size such as five specimens.
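As a rough illustration of how matrix (array) pooling trades tests against prevalence, the sketch below computes the expected number of tests per person for a square array in which every row and every column is pooled, and a sample is retested individually only when both its row pool and its column pool are positive. It assumes a perfect test and independent infections. It is not the cost model of the study summarized above, and the function name and parameters are hypothetical.

```python
def array_tests_per_person(prevalence: float, side: int) -> float:
    """Expected tests per person for a side x side array (hypothetical helper).

    Stage 1 uses 2 * side pooled tests for side ** 2 people.
    Stage 2 retests a sample when its row and its column are both positive.
    With q = 1 - prevalence, inclusion-exclusion gives
    P(row positive and column positive) = 1 - 2 * q**side + q**(2 * side - 1),
    because a row and a column together cover 2 * side - 1 samples.
    """
    if side < 2:
        raise ValueError("side must be at least 2")
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    q = 1.0 - prevalence
    retest_probability = 1.0 - 2.0 * q ** side + q ** (2 * side - 1)
    return 2.0 / side + retest_probability


# Illustrative numbers only: a 10 x 10 array at 1% prevalence
# needs roughly 0.22 tests per person under these assumptions.
example = array_tests_per_person(0.01, 10)
```

Multiplying the result by a per-test cost gives a simple cost comparison against individual testing, which corresponds to 1.0 tests per person.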
... 13 Even for the recent COVID-19 outbreak, many countries and regions have employed pooled testing to address the overwhelming need for rapid and mass community testing. 14,15 Although it has been reported that pooling samples into groups of 4−10 reduces the number of tests by 50−60%, 16 there are certain limitations. First, sensitivity is reduced because of the dilution from negative samples, 17 and second, selectivity is compromised due to a higher probability of cross-contamination. ...
... There is scarce data on pool testing as compared with individual testing, since this varies significantly from lab to lab. However, findings from simulation studies in large populations suggest that pool testing for population-wide screening, such as in health care workers and essential personnel, could be 8-10 times faster than individual testing (29,44). Labs could also use matrix pool testing, i.e., a two-dimensional array of rows and columns, with each sample included in a row pool and a column pool, to avoid pool deconvolution (45,46). ...
Background: In spite of the worth of pool testing in public health, data on the sensitivity and efficiency of real-time quantitative polymerase chain reaction (RT-qPCR) pool testing for the diagnosis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in middle and low-income countries are limited. Methods: We mixed single specimens of extracted RNA positive for the SARS-CoV-2 envelope (E) gene by RT-qPCR with negative specimens, in pools of 4 (n=89), 8 (n=92), 16 (n=102), and 32 (n=105) specimens each. We estimated the average change in cycle threshold (Ct) for each pool size and added it to the Ct values of the first 1,350 tests in our lab, to obtain dilution-corrected Ct values. We estimated pool sensitivity as the proportion of samples with dilution-corrected Ct >40, and used it in simulations of the efficiency (tests used/true case detected) of binary split pool testing. Results: We tested 388 pools. Average Ct changes were 2.21, 2.51, 3.27, and 3.94 cycles, for pools of 4, 8, 16, and 32 specimens, respectively. Corresponding pool tests sensitivities were 91.1%, 89.6%, 85.8% and 82.5%. Pool testing was substantially more efficient than individual testing. For prevalence of 0.5% to 2.0%, the efficiency of pools of ≥8 specimens was 30% to 280% higher, and the number of people tested was 4.4 to 13.9 times higher than those of individual testing. Conclusions: Binary split pool testing substantially increases the number of people tested and the number of true cases detected per test used. This strategy is key to curtail the transmission of SARS-CoV-2, by increasing efficiency in the identification and isolation of symptomatic and asymptomatic infected individuals.
Background: The rapid identification and isolation of individuals infected with SARS-CoV-2 are fundamental countermeasures for the efficient control of the COVID-19 pandemic, which has affected millions of people around the world. Real-time RT-PCR is one of the most commonly applied reference methods for virus detection, and the use of pooled testing has been proposed as an effective way to increase the throughput of routine diagnostic tests. However, the clinical applicability of different types of real-time RT-PCR tests in a given group size remains inconclusive due to inconsistent regional disease prevalence and test demands. Methods: In this study, the performance of one dual-target conventional and two point-of-care real-time RT-PCR tests in a 5-specimen pooled testing strategy for the detection of SARS-CoV-2 was evaluated. Results: We demonstrated the proof of concept that all of these real-time RT-PCR tests could feasibly detect SARS-CoV-2 from nasopharyngeal and oropharyngeal specimens that contain viral RNA loads in the range of 3.48 × 10^5 to 3.42 × 10^2 copies/ml through pooled testing in a group size of 5 with overall positive percent agreement (pooling vs. individual testing) ranging from 100% to 93.75%. Furthermore, the two POC real-time RT-PCR tests exhibited comparable sensitivity to that of the dual-target conventional one when clinical specimens were tested individually. Conclusion: Our findings support the feasibility of using real-time RT-PCR tests developed as a variety of platforms in routine laboratory detection of suspected COVID-19 cases through a pooled testing strategy that is beneficial to increasing the daily diagnostic capacity.
Large‐scale disease screening is a complicated process in which high costs must be balanced against pressing public health needs. When the goal is screening for infectious disease, one approach is group testing in which samples are initially tested in pools and individual samples are retested only if the initial pooled test was positive. Intuitively, if the prevalence of infection is small, this could result in a large reduction of the total number of tests required. Despite this, the use of group testing in medical studies has been limited, largely due to skepticism about the impact of pooling on the accuracy of a given assay. While there is a large body of research addressing the issue of testing errors in group testing studies, it is customary to assume that the misclassification parameters are known from an external population and/or that the values do not change with the group size. Both of these assumptions are highly questionable for many medical practitioners considering group testing in their study design. In this article, we explore how the failure of these assumptions might impact the efficacy of a group testing design and, consequently, whether group testing is currently feasible for medical screening. Specifically, we look at how incorrect assumptions about the sensitivity function at the design stage can lead to poor estimation of a procedure's overall sensitivity and expected number of tests. Furthermore, if a validation study is used to estimate the pooled misclassification parameters of a given assay, we show that the sample sizes required are so large as to be prohibitive in all but the largest screening programs.
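The two-stage procedure described above (test pools first, retest individuals only in positive pools) can be sketched as follows. The sketch assumes a perfect test whose accuracy does not depend on pool size, which is exactly the kind of assumption the abstract above calls into question, so the numbers are an optimistic baseline only. Function names and the search range for the pool size are hypothetical.

```python
def dorfman_tests_per_person(prevalence: float, pool_size: int) -> float:
    """Expected tests per person for two-stage pooling (hypothetical helper).

    Assumes a perfect test and independent infections. Each person costs
    1 / pool_size for the pooled test, plus one individual retest whenever
    the pool is positive, which has probability 1 - (1 - p) ** pool_size.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    if pool_size == 1:
        return 1.0  # individual testing, no second stage
    return 1.0 / pool_size + 1.0 - (1.0 - prevalence) ** pool_size


def best_pool_size(prevalence: float, max_pool_size: int = 32) -> int:
    """Pool size in 1..max_pool_size minimizing expected tests per person."""
    return min(
        range(1, max_pool_size + 1),
        key=lambda k: dorfman_tests_per_person(prevalence, k),
    )


# Illustrative numbers only: at 1% prevalence the best pool size in this
# idealized model is 11, with roughly 0.2 tests per person.
k = best_pool_size(0.01)
tests = dorfman_tests_per_person(0.01, k)
```

Replacing the perfect-test assumption with a sensitivity that decreases with pool size would change both the expected number of tests and the fraction of infections detected.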
Background: COVID-19 test sensitivity and specificity have been widely examined and discussed, yet optimal use of these tests will depend on the goals of testing, the population or setting, and the anticipated underlying disease prevalence. We model various combinations of key variables to identify and compare a range of effective and practical surveillance strategies for schools and businesses. Methods: We coupled a simulated data set incorporating actual community prevalence and test performance characteristics to a susceptible, infectious, removed (SIR) compartmental model, modeling the impact of base and tunable variables including test sensitivity, testing frequency, results lag, sample pooling, disease prevalence, externally-acquired infections, symptom checking, and test cost on outcomes including case reduction and false positives. Findings: Increasing testing frequency was associated with a non-linear positive effect on cases averted over 100 days. While precise reductions in cumulative number of infections depended on community disease prevalence, testing every 3 days versus every 14 days (even with a lower sensitivity test) reduces the disease burden substantially. Pooling provided cost savings and made a high-frequency approach practical; one high-performing strategy, testing every 3 days, yielded per person per day costs as low as $1.32. Interpretation: A range of practically viable testing strategies emerged for schools and businesses. Key characteristics of these strategies include high frequency testing with a moderate or high sensitivity test and minimal results delay. Sample pooling allowed for operational efficiency and cost savings with minimal loss of model performance.
Article
Pooling multiple swab samples prior to RNA extraction and real-time reverse-transcription (RT-PCR) analysis has been proposed as a strategy to reduce costs and increase throughput of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) tests. However, reports on practical large-scale group testing for SARS-CoV-2 have been scant. Key open questions concern reduced sensitivity due to sample dilution, the rate of false positives, the actual efficiency (number of tests saved by pooling), and the impact of infection rate in the population on assay performance. Here we report an analysis of 133,816 samples collected between April-September 2020 and tested by Dorfman pooling for the presence of SARS-CoV-2. We spared 76% of RNA extraction and RT-PCR tests, despite the frequently changing prevalence (0.5%-6%). We observed pooling efficiency and sensitivity that exceeded theoretical predictions, which resulted from the non-random distribution of positive samples in pools. Overall, our findings support the use of pooling for efficient large-scale SARS-CoV-2 testing.
Article
Objective: Rapid testing is paramount during a pandemic to prevent continued viral spread and excess morbidity and mortality. This study investigates whether testing strategies based on sample pooling can increase the speed and throughput of screening for SARS-CoV-2, especially in resource-limited settings. Methods: In a mathematical modelling approach conducted in May 2020, six different testing strategies were simulated based on key input parameters such as infection rate, test characteristics, population size, and testing capacity. The situations in five countries were simulated, reflecting a broad variety of population sizes and testing capacities. The primary study outcome measurements were time and number of tests required, number of cases identified, and number of false positives. Findings: The performance of all tested methods depends on the input parameters, i.e. the specific circumstances of a screening campaign. To screen one tenth of each country’s population at an infection rate of 1%, realistic optimised testing strategies enable such a campaign to be completed in ca. 29 days in the US, 71 in the UK, 25 in Singapore, 17 in Italy, and 10 in Germany. This is ca. eight times faster compared to individual testing. When infection rates are lower, or when employing an optimal, yet more complex pooling method, the gains are more pronounced. Pool-based approaches also reduce the number of false positive diagnoses by a factor of up to 100. Conclusions: The results of this study provide a rationale for adoption of pool-based testing strategies to increase speed and throughput of testing for SARS-CoV-2, hence saving time and resources compared with individual testing.
Preprint
Pooling multiple swab samples prior to RNA extraction and RT-PCR analysis was proposed as a strategy to reduce costs and increase throughput of SARS-CoV-2 tests. However, reports on practical large-scale group testing for SARS-CoV-2 have been scant. Key open questions concern reduced sensitivity due to sample dilution; the rate of false positives; the actual efficiency (number of tests saved by pooling); and the impact of infection rate in the population on assay performance. Here we report an analysis of 133,816 samples collected between April and September 2020 and tested by pooling for the presence of SARS-CoV-2. We spared 76% of RNA extraction and RT-PCR tests, despite the frequently changing prevalence rate (0.5%-6%). Surprisingly, we observed pooling efficiency and sensitivity that exceeded theoretical predictions, which resulted from the non-random distribution of positive samples in pools. Overall, the findings strongly support the use of pooling for efficient, large-scale, high-throughput SARS-CoV-2 testing.
Article
Recent reports suggest that 10-30% of SARS-CoV-2 infected patients are asymptomatic and that significant viral shedding may occur prior to symptom onset. Therefore, there is an urgent need to increase diagnostic testing capabilities to prevent disease spread. We developed P-BEST, a method for Pooling-Based Efficient SARS-CoV-2 Testing, which identifies all positive subjects within a large set of samples using a single round of testing. Each sample is assigned into multiple pools using a combinatorial pooling strategy based on compressed sensing designed for maximizing carrier detection. In our current study we pooled sets of 384 samples into 48 pools, providing both an 8-fold increase in testing efficiency and an 8-fold reduction in test costs. We successfully identified up to 5 positive carriers within sets of 384 samples. We then used P-BEST to screen 1115 healthcare workers using 144 tests. P-BEST provides an efficient and easy-to-implement solution for increasing testing capacity that can be easily integrated into diagnostic laboratories.
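The figures in this abstract can be checked with simple arithmetic, and the idea of single-round combinatorial pooling can be sketched in a few lines. The sketch below is not P-BEST: it uses a random pooling design and the simple COMP decoding rule (a sample is cleared as soon as one of its pools is negative) in place of the compressed-sensing design and decoder the abstract describes. The six pools per sample, the seed, the positive indices, and the assumption of error-free pool tests are all illustrative choices.

```python
import math
import random


def random_design(n_samples, n_pools, pools_per_sample, seed=0):
    """Assign each sample to a random set of distinct pools (illustrative design)."""
    rng = random.Random(seed)
    return [set(rng.sample(range(n_pools), pools_per_sample)) for _ in range(n_samples)]


def pool_results(design, positives, n_pools):
    """Noiseless pooled tests: a pool is positive if it contains any positive sample."""
    results = [False] * n_pools
    for sample in positives:
        for pool in design[sample]:
            results[pool] = True
    return results


def comp_decode(design, results):
    """COMP rule: keep only samples whose pools all tested positive."""
    return {s for s, pools in enumerate(design) if all(results[p] for p in pools)}


# Arithmetic from the abstract: 384 samples in 48 pools, 1115 people screened.
n_samples, n_pools = 384, 48
efficiency = n_samples / n_pools            # samples per test
sets_needed = math.ceil(1115 / n_samples)   # sets of 384 needed to cover 1115 people
tests_needed = sets_needed * n_pools

design = random_design(n_samples, n_pools, pools_per_sample=6)
true_positives = {7, 101, 250}              # hypothetical carriers
decoded = comp_decode(design, pool_results(design, true_positives, n_pools))
```

With error-free pool tests, COMP never misses a true positive, but it can flag extra samples; a practical scheme would need a design and decoder that also tolerate test errors.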
Article
The recent spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) exemplifies the critical need for accurate and rapid diagnostic assays to prompt clinical and public health interventions. Currently, several quantitative reverse transcription–PCR (RT–qPCR) assays are being used by clinical, research and public health laboratories. However, it is currently unclear whether results from different tests are comparable. Our goal was to make independent evaluations of primer–probe sets used in four common SARS-CoV-2 diagnostic assays. From our comparisons of RT–qPCR analytical efficiency and sensitivity, we show that all primer–probe sets can be used to detect SARS-CoV-2 at 500 viral RNA copies per reaction. The exception is the RdRp-SARSr (Charité) confirmatory primer–probe set, which has low sensitivity, probably due to a mismatch to circulating SARS-CoV-2 in the reverse primer. We did not find evidence for background amplification with pre-COVID-19 samples or for recent SARS-CoV-2 evolution decreasing sensitivity. Our recommendation for SARS-CoV-2 diagnostic testing is to select an assay that has high sensitivity and is regionally used, to ease comparability between outcomes. This is a comparative analysis of the performance of the primer–probe sets from four open-source molecular diagnostic assays for SARS-CoV-2 recommended by the World Health Organization.
Article
The world is entering a new era of the COVID-19 pandemic in which there is an increasing call for reliable antibody testing. To support decision making on the deployment of serology for either population screening or diagnostics, we present a detailed comparison of serological COVID-19 assays. We show that among the selected assays there is a wide diversity in assay performance in different scenarios and when correlated to virus neutralizing antibodies. The Wantai ELISA, which detects total immunoglobulins against the receptor binding domain of SARS-CoV-2, has the best overall characteristics to detect functional antibodies in different stages and severity of disease, including the potential to set a cut-off indicating the presence of protective antibodies. The large variety of available serological assays requires proper assay validation before deciding on deployment of assays for specific applications. SARS-CoV-2 is causing a global pandemic in which the implementation of serology can support decision making in different scenarios. Here, the authors compare the outcome of eight commercially available assays to virus neutralization and discuss their use in diagnostics and exposure assessment of SARS-CoV-2.
Article
Group testing, also known as pooled sample testing, was first proposed by Robert Dorfman in 1943. While sample pooling has been widely practiced in blood-banking, it is traditionally seen as anathema for clinical laboratories. However, the ongoing COVID-19 pandemic has re-ignited interest in group testing among clinical laboratories to mitigate supply shortages. We propose five criteria to assess the suitability of an analyte for pooled sample testing in general and outline a practical approach that a clinical laboratory may use to implement pooled testing for SARS-CoV-2 PCR testing. The five criteria we propose are: (1) the analyte concentrations in the diseased persons should be at least one order of magnitude (10 times) higher than in healthy persons; (2) sample dilution should not overly reduce clinical sensitivity; (3) the current prevalence must be sufficiently low for the number of samples pooled for the specific protocol; (4) there is no requirement for a fast turnaround time; and (5) there is an imperative need for resource rationing to maximise public health outcomes. The five key steps we suggest for a successful implementation are: (1) determination of when pooling takes place (pre-pre-analytical, pre-analytical, analytical); (2) validation of the pooling protocol; (3) ensuring an adequate infrastructure and archival system; (4) configuration of the laboratory information system; and (5) staff training. While pool testing is not a panacea to overcome reagent shortage, it may allow broader access to testing, but at the cost of reduced sensitivity and increased turnaround time.