Noisy Adaptive Group Testing:
Bounds and Algorithms
Jonathan Scarlett
Abstract
The group testing problem consists of determining a small set of defective items from a larger set of items based
on a number of possibly-noisy tests, and is relevant in applications such as medical testing, communication protocols,
pattern matching, and many more. One of the defining features of the group testing problem is the distinction between
the non-adaptive and adaptive settings: In the non-adaptive case, all tests must be designed in advance, whereas in
the adaptive case, each test can be designed based on the previous outcomes. While tight information-theoretic limits
and near-optimal practical algorithms are known for the adaptive setting in the absence of noise, surprisingly little is
known in the noisy adaptive setting. In this paper, we address this gap by providing information-theoretic achievability
and converse bounds under a widely-adopted symmetric noise model, as well as a slightly weaker achievability bound
for a computationally efficient variant. These bounds are shown to be tight or near-tight in a broad range of scaling
regimes, particularly at low noise levels. The algorithms used for the achievability results have the notable feature of
only using two or three stages of adaptivity.
Index Terms
Group testing, sparsity, information-theoretic limits, adaptive algorithms
I. INTRODUCTION
The group testing problem consists of determining a small subset $S$ of “defective” items within a larger set of items $\{1,\dots,p\}$, based on a number of possibly-noisy tests. This problem has a history in medical testing [1], and has regained significant attention following new applications in areas such as communication protocols [2], pattern matching [3], and database systems [4], and new connections with compressive sensing [5], [6]. In the noiseless setting, each test takes the form
Y = \bigvee_{j \in S} X_j,  (1)
where the test vector $X = (X_1,\dots,X_p) \in \{0,1\}^p$ indicates which items are included in the test, and $Y$ is the resulting observation. That is, the output indicates whether at least one defective item was included in the test. One wishes to design a sequence of tests $X^{(1)},\dots,X^{(n)}$, with $n$ ideally as small as possible, such that the outcomes can be used to reliably recover the defective set $S$ with probability close to one.
The author is with the Department of Computer Science & Department of Mathematics, National University of Singapore (e-mail:
scarlett@comp.nus.edu.sg). This work was supported by an NUS Startup Grant.
One of the defining features of the group testing problem is the distinction between the non-adaptive and adaptive settings. In the non-adaptive setting, every test must be designed prior to observing any outcomes, whereas in the adaptive setting, a given test $X^{(i)}$ can be designed based on the previous outcomes $Y^{(1)},\dots,Y^{(i-1)}$. It is an active area of research to determine the extent to which this additional freedom helps in reducing the number of tests. In the noiseless setting, a number of interesting results have been discovered along these lines:
- When the number of defectives $k := |S|$ scales as $k = O(p^{1/3})$, the minimal number of tests permitting vanishing error probability scales as $n = k\log_2\frac{p}{k}(1+o(1))$ in both the adaptive and non-adaptive settings [7], [8]. Hence, at least information-theoretically, there is no asymptotic adaptivity gain.
- For scalings of the form $k = \Theta(p^{\theta})$ with $\theta \in \big(\frac{1}{3},1\big)$, the behavior $n = k\log_2\frac{p}{k}(1+o(1))$ remains unchanged in the adaptive setting [7], but it remains open as to whether this can be attained non-adaptively. For $\theta$ close to one, the best known non-adaptive achievability bounds are far from this threshold.
- Even in the first case above with no adaptivity gain, the adaptive algorithms known to achieve the optimal threshold are practical, having low storage and computation requirements [9]. In contrast, in the non-adaptive case, only computationally intractable algorithms have been shown to attain the optimal threshold [8], [10].
- It has recently been established that there is a provable adaptivity gap under certain scalings of the form $k = \Theta(p)$, i.e., the linear regime [11], [12].
Despite this progress for the noiseless setting, there has been surprisingly little work on adaptivity in noisy settings;
the vast majority of existing group testing algorithms for random noise models are non-adaptive [13]–[16]. In this
paper, we address this gap by providing new achievability and converse bounds for noisy adaptive group testing,
focusing primarily on a widely-adopted symmetric noise model. Before outlining our contributions, we formally
introduce the setup.
A. Problem Setup
Except where stated otherwise, we let the defective set $S$ be uniform on the $\binom{p}{k}$ subsets of $\{1,\dots,p\}$ of cardinality $k$. As mentioned above, an adaptive algorithm iteratively designs a sequence of tests $X^{(1)},\dots,X^{(n)}$, with $X^{(i)} \in \{0,1\}^p$, and the corresponding outcomes are denoted by $\mathbf{Y} = (Y^{(1)},\dots,Y^{(n)})$, with $Y^{(i)} \in \{0,1\}$. A given test is allowed to depend on all of the previous outcomes.
Generalizing (1), we consider the following widely-adopted symmetric noise model:
Y = \Big( \bigvee_{j \in S} X_j \Big) \oplus Z,  (2)
where $Z \sim \mathrm{Bernoulli}(\rho)$ for some $\rho \in \big(0,\frac{1}{2}\big)$, and $\oplus$ denotes modulo-2 addition. In Section V, we will also consider other asymmetric noise models.
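To make the observation model concrete, the following Python sketch simulates a batch of noisy tests according to (2); the function and parameter names are illustrative rather than taken from any reference implementation.

```python
import numpy as np

def simulate_tests(X, S, rho, rng):
    """Simulate noisy group testing outcomes under the symmetric noise
    model (2): Y = OR_{j in S} X_j, flipped independently w.p. rho.

    X   : (n, p) binary test matrix (row i = test i)
    S   : set of defective item indices
    rho : crossover probability in (0, 1/2)
    """
    defective = np.zeros(X.shape[1], dtype=bool)
    defective[list(S)] = True
    U = (X[:, defective].sum(axis=1) > 0).astype(int)  # noiseless OR outcomes
    Z = rng.binomial(1, rho, size=U.shape)             # symmetric noise
    return U ^ Z

rng = np.random.default_rng(0)
p, k, n, rho = 1000, 10, 500, 0.11
S = set(rng.choice(p, size=k, replace=False))
X = rng.binomial(1, np.log(2) / k, size=(n, p))  # i.i.d. Bernoulli(nu/k) design
Y = simulate_tests(X, S, rho, rng)
```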
Given the tests and their outcomes, a decoder forms an estimate $\hat{S}$ of $S$. We consider the exact recovery criterion, in which the error probability is given by
P_e := \mathbb{P}[\hat{S} \ne S],  (3)
and is taken over the randomness of the defective set $S$, the tests $X^{(1)},\dots,X^{(n)}$ (if randomized), and the noisy outcomes $Y^{(1)},\dots,Y^{(n)}$.
As a stepping stone towards exact recovery results, we will also consider a less stringent partial recovery criterion, in which we allow for up to $d_{\max}$ false positives and up to $d_{\max}$ false negatives, for some $d_{\max} > 0$. That is, the error probability is
P_e(d_{\max}) := \mathbb{P}[d(S,\hat{S}) > d_{\max}],  (4)
where
d(S,\hat{S}) = \max\{|S \setminus \hat{S}|, |\hat{S} \setminus S|\}.  (5)
Understanding partial recovery is, of course, also of interest in its own right. However, the results of [8], [17] indicate that there is no adaptivity gain under this criterion, at least when $k = o(p)$ and $d_{\max} = o(k)$.
Except where stated otherwise, we assume that the noise level $\rho$ and the number of defectives $k$ are known. In Section IV, we will consider cases where $k$ is only approximately known.
B. Related work
Non-adaptive setting. The information-theoretic limits of group testing were first studied in the Russian literature
[13], [18], and have recently become increasingly well-understood [8], [10], [17], [19]–[21]. Among the existing
works, the results most relevant to the present paper are as follows:
- In the adaptive setting, it was shown by Baldassini et al. [7] that if the output $Y$ is produced by passing the noiseless outcome $U = \bigvee_{j\in S} X_j$ through a binary channel $P_{Y|U}$, then the number of tests for attaining $P_e \to 0$ must satisfy $n \ge \frac{1}{C} k\log\frac{p}{k}(1-o(1))$,¹ where $C$ is the Shannon capacity of $P_{Y|U}$ in nats. For the symmetric noise model (2), this yields
n \ge \frac{k\log\frac{p}{k}}{\log 2 - H_2(\rho)} (1-o(1)),  (6)
where $H_2(\rho) = \rho\log\frac{1}{\rho} + (1-\rho)\log\frac{1}{1-\rho}$ is the binary entropy function.
- In the non-adaptive setting with symmetric noise, it was shown that an information-theoretic threshold decoder attains the bound (6) for $k = o(p)$ under the partial recovery criterion with $d_{\max} = \Theta(k)$ and an arbitrarily small implied constant [8], [17]. For exact recovery, a more complicated bound was also given in [8] that matches (6) when $k = \Theta(p^{\theta})$ for sufficiently small $\theta > 0$.
Several non-adaptive noisy group testing algorithms have been shown to come with rigorous guarantees. We will use two of these non-adaptive algorithms as building blocks in our adaptive methods (a simplified sketch of the first is given after this list):
- The Noisy Combinatorial Orthogonal Matching Pursuit (NCOMP) algorithm checks, for each item, the proportion of tests it was included in that returned positive, and declares the item to be defective if this proportion exceeds a suitably-chosen threshold. This is known to provide optimal scaling laws for the regime $k = \Theta(p^{\theta})$ ($\theta \in (0,1)$) [14], [15], albeit with somewhat suboptimal constants.
¹Here and subsequently, the function $\log(\cdot)$ has base $e$.
- The method of separate decoding of items, also known as separate testing of inputs [13], [16], also considers the items separately, but uses all of the tests. Specifically, a given item's status is selected via a binary hypothesis test. This method was studied for $k = O(1)$ in [13], and for $k = \Theta(p^{\theta})$ in [16]; in particular, it was shown that the number of tests is within a factor $\log 2$ of the optimal information-theoretic threshold under exact recovery as $\theta \to 0$, and under partial recovery (with $d_{\max} = \Theta(k)$) for all $\theta \in (0,1)$.
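The sketch below illustrates the NCOMP-style rule from the first bullet: each item is judged by the fraction of its tests that returned positive. The threshold used here (the midpoint between the expected positive rates of non-defective and defective items under a Bernoulli($\nu/k$) design) is a plausible simplification, not the tuned constant of [14], [15].

```python
import numpy as np

def ncomp_decode(X, Y, rho, nu=np.log(2), k=None):
    """NCOMP-style decoder: declare item j defective if the fraction of
    positive outcomes among the tests containing j exceeds a threshold
    between the non-defective rate (~1/2 under a Bernoulli(log2/k)
    design) and the defective rate (1 - rho)."""
    n, p = X.shape
    included = X.sum(axis=0)                       # tests containing each item
    positives = (X * Y[:, None]).sum(axis=0)       # ... that returned positive
    rate = np.divide(positives, included, out=np.zeros(p), where=included > 0)
    q = 0.5 if k is None else 1 - (1 - nu / k)**k  # P[noiseless outcome = 1]
    nondef_rate = rho + q * (1 - 2 * rho)
    thresh = 0.5 * (nondef_rate + (1 - rho))       # midpoint threshold
    return set(np.flatnonzero(rate > thresh))

S_hat = ncomp_decode(X, Y, rho, k=k)  # X, Y, rho, k from the earlier snippet
```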
A different line of works has considered group testing with adversarial errors (e.g., see [22]–[24]); these are less
relevant to the present paper.
Adaptive setting. As mentioned above, adaptive algorithms have enjoyed a great deal of success in the noiseless setting [25], [26]. To our knowledge, the first algorithm that was proved to achieve $n = k\log_2\frac{p}{k}(1+o(1))$ for all $k = o(p)$ is Hwang's generalized binary splitting algorithm [9], [25]. More recently, there has been interest in algorithms that only use limited rounds of adaptivity [26]–[28], and among other things, it has been shown that the same guarantee can be attained using at most four stages [26].
In the noisy adaptive setting, the existing work is relatively limited. In [29], an adaptive algorithm called
GROTESQUE was shown to provide optimal scaling laws in terms of both samples and runtime. Our focus in
this paper is only on the number of samples, but with a much greater emphasis on the constant factors. In [30,
Ch. 4], noisy adaptive group testing algorithms were proposed for two different noise models based on the Z-channel
and reverse Z-channel, also achieving an order-optimal required number of tests with reasonable constant factors.
We discuss these noise models further in Section V.
C. Contributions
In this paper, we characterize both the information-theoretic limits and the performance of practical algorithms for noisy adaptive group testing, determining the asymptotic number of tests required for $P_e \to 0$ as $p \to \infty$. For the achievability part, we propose an adaptive algorithm whose first stage can be taken as any non-adaptive algorithm that comes with partial recovery guarantees, and whose second stage (and third stage in a refined version) improves this initial estimate. By letting the first stage use the information-theoretic threshold decoder of [8], we attain an achievability bound that is near-tight in many cases of interest, whereas by using separate decoding of items as per [13], [16], we attain a slightly weaker guarantee while still maintaining computational efficiency. In addition, we provide a novel converse bound showing that $\Omega(k\log k)$ tests are always necessary, and hence, the implied constant in any scaling of the form $n = \Theta\big(k\log\frac{p}{k}\big)$ with $k = \Theta(p^{\theta})$ must grow unbounded as $\theta \to 1$.
Our results are summarized in Figure 1, where we observe a considerable gain over the best known non-adaptive guarantees, particularly when the noise level $\rho$ is small. Although there is a gap between the achievability and converse bounds for most values of $\theta$, the converse has the notable feature of showing that $n = \frac{k\log\frac{p}{k}}{\log 2 - H_2(\rho)}(1+o(1))$ is not always achievable, as one might conjecture based on the noiseless setting. In addition, the gap between the (refined) achievability bound and the converse bound is zero or nearly zero in at least two cases: (i) $\theta$ is small; (ii) $\theta$ is close to one and $\rho$ is close to zero. The algorithms used in our upper bounds have the notable feature of only using two or three rounds of adaptivity, i.e., two in the simple version, and three in the refined version.
[Figure 1: two plots showing the asymptotic ratio of $k\log_2\frac{p}{k}$ to $n$ as a function of the value $\theta$ such that $k = \Theta(p^{\theta})$, with curves for Non-adaptive, Adaptive (Simple), Adaptive (Practical), Adaptive (Refined), and Converse.]
Figure 1: Asymptotic thresholds on the number of tests required for vanishing error probability under the noise levels $\rho = 0.11$ (Left) and $\rho = 10^{-4}$ (Right).
In addition to these contributions for the symmetric noise model, we provide the following results for other observation models:
- In the noiseless case, we recover the threshold $n = k\log_2\frac{p}{k}(1+o(1))$ for all $\theta \in (0,1)$ using a two-stage adaptive algorithm. Previously, the best known number of stages was four [26].
- For the Z-channel noise model (defined formally in Section V), we show that one can attain $n = \frac{1}{C} k\log\frac{p}{k}(1+o(1))$ for all $\theta \in (0,1)$, where $C$ is the Shannon capacity of the channel in nats. This matches the general converse bound given in [7], i.e., the generalized version of (6). As a result, we improve on the above-mentioned bounds of [30], which contain reasonable yet strictly suboptimal constant factors.
- For the reverse Z-channel noise model (defined formally in Section V), we prove a similar converse bound to the one mentioned above for the symmetric noise model, thus showing that one cannot match the converse bound of [7] for all $\theta \in (0,1)$.
The remainder of the paper is organized as follows. For the symmetric noise model, we present the simple version
of our achievability bound in Section II, the refined version in Section III, and the converse in Section IV. The
other observation models mentioned above are considered in Section V, and conclusions are drawn in Section VI.
II. ACHIEVABILITY (SIMPLE VERSION)
In this section, we formally state our simplest achievability results; a more complicated but powerful variant is given in Section III. Using a common two-stage approach, we provide achievability bounds for both a computationally intractable information-theoretic decoder and a computationally efficient decoder.
A. Information-theoretic decoder
The two-stage algorithm that we adopt is outlined informally in Algorithm 1; we describe the steps more precisely in the proof of Theorem 1 below. The high-level intuition is to use a non-adaptive algorithm with partial recovery guarantees, and then refine the solution by resolving the false negatives and false positives separately, i.e., Steps 2a and 2b. While these latter steps are stated separately in Algorithm 1, the tests that they use can be performed together in a single round of adaptivity, so that the overall algorithm is a two-stage procedure.
Algorithm 1: Two-stage algorithm for noisy group testing (informal).
1. Apply the information-theoretic threshold decoder of [8] (see Appendix A) to the ground set $\{1,\dots,p\}$ to find an estimate $\hat{S}_1$ of cardinality $k$ such that
\max\{|\hat{S}_1 \setminus S|, |S \setminus \hat{S}_1|\} \le \alpha_1 k  (7)
with high probability, for some small $\alpha_1 > 0$.
2a. Apply a variation of NCOMP [14] (see Appendix C) to the reduced ground set $\{1,\dots,p\} \setminus \hat{S}_1$ to exactly identify the false negatives $S \setminus \hat{S}_1$ from the first step. Let these items be denoted by $\hat{S}'_{2a}$.
2b. Test the items in $\hat{S}_1$ individually $\tilde{n}$ times, and let $\hat{S}'_{2b}$ contain the items that returned positive at least $\frac{\tilde{n}}{2}$ times.
The final estimate of the defective set is given by $\hat{S} := \hat{S}'_{2a} \cup \hat{S}'_{2b}$.
Our first main information-theoretic achievability result is as follows.
Theorem 1. Under the symmetric noisy group testing setup with crossover probability $\rho \in \big(0,\frac{1}{2}\big)$, with $k = \Theta(p^{\theta})$ for some $\theta \in (0,1)$, there exists a two-stage adaptive group testing algorithm such that
n \le \bigg( \frac{k\log\frac{p}{k}}{\log 2 - H_2(\rho)} + \frac{k\log k}{\frac{1}{2}\log\frac{1}{4\rho(1-\rho)}} \bigg) (1+o(1))  (8)
and such that $P_e \to 0$ as $p \to \infty$.
Proof. We study the guarantees of the three steps in Algorithm 1, and the number of tests used for each one.
Step 1. It was shown in [8] that, for an arbitrarily small constant $\alpha_1 > 0$, there exists a non-adaptive group testing algorithm returning some set $\hat{S}_1$ of cardinality $k$ such that
\max\{|\hat{S}_1 \setminus S|, |S \setminus \hat{S}_1|\} \le \alpha_1 k,  (9)
with probability approaching one, with the number of tests being at most
n_1 \le \frac{k\log\frac{p}{k}}{\log 2 - H_2(\rho)} (1+o(1)).  (10)
In Appendix A, we recap the decoding algorithm and its analysis. The non-adaptive test design for this stage is the ubiquitous i.i.d. Bernoulli design.
Step 2a. Let us condition on the first step being successful, in the sense that (9) holds. We claim that there exists a non-adaptive algorithm that, when applied to the reduced ground set $\{1,\dots,p\} \setminus \hat{S}_1$, returns $\hat{S}'_{2a}$ containing
precisely the set of (at most $\alpha_1 k$) defective items $S \setminus \hat{S}_1$ with probability approaching one, with the number of samples behaving as
n_2 = O\Big( \alpha_1 k \log\frac{p}{\alpha_1 k} \Big).  (11)
If the number of defectives $k_1 := |S \setminus \hat{S}_1|$ in the reduced ground set were known, this would simply be an application of the $O(k_1 \log p)$ scaling derived in [14] for the NCOMP algorithm. In Appendix C, we adapt the algorithm and analysis of [14] to handle the case that $k_1$ is only known up to a constant factor.
In fact, in the present setting, we only know that $k_1 \in [0, \alpha_1 k]$, so we do not even know $k_1$ up to a constant factor. To get around this, we apply a simple trick that is done purely for the purpose of the analysis: Instead of applying the modified NCOMP algorithm directly to $\{1,\dots,p\} \setminus \hat{S}_1$, we apply it to the slightly larger set in which $\alpha_1 k$ “dummy” defective items are included. Then, the number of defectives is in $[\alpha_1 k, 2\alpha_1 k]$, and is known up to a factor of two. We do not expect that this trick would ever be useful in practice, but it is convenient for the sake of the analysis.
Step 2b. Since we conditioned on the first step being successful, at most $\alpha_1 k$ of the $k$ items in $\hat{S}_1$ are non-defective. In the final step, we simply test each item in $\hat{S}_1$ individually $\tilde{n}$ times, and declare the item positive if and only if at least half of the outcomes are positive.
To study the success probability, we use a well-known Chernoff-based concentration bound for Binomial random variables: If $Z \sim \mathrm{Binomial}(N,q)$, then
\mathbb{P}[Z \le Nq'] \le e^{-N D_2(q' \| q)}, \quad q' < q,  (12)
where $D_2(q' \| q) = q'\log\frac{q'}{q} + (1-q')\log\frac{1-q'}{1-q}$ is the binary KL divergence function.
Fix an arbitrary item $j$, and let $\tilde{N}_{j,1}$ be the number of its $\tilde{n}$ tests that are positive. Since the test outcomes are distributed as $\mathrm{Bernoulli}(1-\rho)$ for defective $j$ and $\mathrm{Bernoulli}(\rho)$ for non-defective $j$, we obtain from (12) that
\mathbb{P}\big[\tilde{N}_{j,1} \le \tfrac{\tilde{n}}{2}\big] \le e^{-\tilde{n} D_2(\frac{1}{2} \| 1-\rho)} = e^{-\tilde{n} D_2(\frac{1}{2} \| \rho)}, \quad j \in S  (13)
\mathbb{P}\big[\tilde{N}_{j,1} \ge \tfrac{\tilde{n}}{2}\big] \le e^{-\tilde{n} D_2(\frac{1}{2} \| \rho)}, \quad j \notin S.  (14)
Hence, we obtain from the union bound over the $k$ items in $\hat{S}_1$ that
\mathbb{P}\big[\hat{S}'_{2b} \ne (S \cap \hat{S}_1)\big] \le k \cdot e^{-\tilde{n} D_2(\frac{1}{2} \| \rho)}.  (15)
For any $\eta > 0$, the right-hand side tends to zero as $p \to \infty$ under the choice
\tilde{n} = \frac{\log k}{D_2(\frac{1}{2} \| \rho)} (1+\eta),  (16)
which gives a total number of tests in Step 2b of
n_{2b} = \frac{k\log k}{D_2(\frac{1}{2} \| \rho)} (1+\eta).  (17)
The proof is concluded by noting that $\eta$ can be arbitrarily small, and writing $D_2(\frac{1}{2} \| \rho) = \frac{1}{2}\log\frac{1}{4\rho(1-\rho)}$.
A weakness of Theorem 1 is that it does not achieve the threshold $n = \frac{k\log\frac{p}{k}}{\log 2 - H_2(\rho)}(1+o(1))$ for any value of $\theta > 0$ (see Figure 1), even though such a threshold is achievable for sufficiently small $\theta$ even non-adaptively [8]. We overcome this limitation via a refined three-stage algorithm in Section III.
B. Practical decoder
Of the three steps given in Algorithm 1 and the proof of Theorem 1, the only one that is computationally challenging is the first, which uses an information-theoretic threshold decoder to identify $S$ up to a distance (cf. (5)) of $d(S,\hat{S}) \le \alpha_1 k$, for small $\alpha_1 > 0$. A similar approximate recovery result was also shown in [16] for separate decoding of items, which is computationally efficient. The asymptotic threshold on $n$ for separate decoding of items is only a $\log 2$ factor worse than the optimal information-theoretic threshold [16], and this fact leads to the following counterpart to Theorem 1.
Theorem 2. Under the symmetric noisy group testing setup with crossover probability $\rho \in \big(0,\frac{1}{2}\big)$, and $k = \Theta(p^{\theta})$ for some $\theta \in (0,1)$, there exists a computationally efficient two-stage adaptive group testing algorithm such that
n \le \bigg( \frac{k\log\frac{p}{k}}{\log 2 \cdot (\log 2 - H_2(\rho))} + \frac{k\log k}{\frac{1}{2}\log\frac{1}{4\rho(1-\rho)}} \bigg) (1+o(1))  (18)
and $P_e \to 0$ as $p \to \infty$.
The proof is nearly identical to that of Theorem 1, except that the required number of tests in the first stage is multiplied by $\frac{1}{\log 2}$ in accordance with [16]. For brevity, we omit the details.
III. ACHIEVABILITY (REFINED VERSION)
As mentioned previously, a weakness of Theorem 1 is that it only achieves the behavior $n \le \frac{k\log\frac{p}{k}}{\log 2 - H_2(\rho)}(1+o(1))$ (for which a matching converse is known [7]) in the limit as $\theta \to 0$, even though this can be achieved even non-adaptively for sufficiently small $\theta$ [8]. Since adaptivity provides extra freedom in the design, we should expect the corresponding bounds to be at least as good as the non-adaptive setting.
While we can simply take the better of Theorem 1 and the exact recovery result of [8], this is a rather unsatisfying solution, and it leads to a discontinuity in the asymptotic threshold (cf. Figure 1). It is clearly more desirable to construct an adaptive scheme that “smoothly” transitions between the two. In this section, we attain such an improvement by modifying Algorithm 1 in two ways. The resulting algorithm is outlined informally in Algorithm 2, and the modifications are as follows:
- In the first stage, instead of learning $S$ up to a distance of $\alpha_1 k$ for some constant $\alpha_1 \in (0,1)$, we learn it up to a distance of $k^{\gamma}$ for some $\gamma \in (0,1)$. The non-adaptive partial recovery analysis of [8] requires non-trivial modifications for this purpose; we provide the details in Appendix A.
- We split Step 2b of Algorithm 1 into two stages, one comprising Step 2b in Algorithm 2, and the other comprising Step 3. The former of these identifies most of the defective items, and the latter resolves the rest.
It is worth noting that, at least using our analysis techniques, neither of the above modifications alone is enough to
obtain a bound that is always at least as good as the non-adaptive exact recovery result of [8]. We will shortly see,
however, that the two modifications combined do suffice.
Algorithm 2: Three-stage algorithm for noisy group testing (informal).
1. Apply the information-theoretic threshold decoder of [8] (see Appendix A) to the ground set $\{1,\dots,p\}$ to find an estimate $\hat{S}_1$ of cardinality $k$ such that
\max\{|\hat{S}_1 \setminus S|, |S \setminus \hat{S}_1|\} \le k^{\gamma}  (19)
with high probability, where $\gamma \in (0,1)$.
2a. Apply a variation of NCOMP [14] (see Appendix C) to the reduced ground set $\{1,\dots,p\} \setminus \hat{S}_1$ to exactly identify the false negatives from the first step. Let these items be denoted by $\hat{S}'_{2a}$.
2b. Test each item in $\hat{S}_1$ individually $\check{n}$ times, and let $\hat{S}'_{2b} \subseteq \hat{S}_1$ contain the $k - \alpha_2 k$ items that returned positive the highest number of times, for some small $\alpha_2 > 0$.
3. Test the items in $\hat{S}_1 \setminus \hat{S}'_{2b}$ (of which there are $\alpha_2 k$) individually $\tilde{n}$ times, and let $\hat{S}'_3$ contain the items that returned positive at least $\frac{\tilde{n}}{2}$ times. The final estimate of the defective set is given by $\hat{S} := \hat{S}'_{2a} \cup \hat{S}'_{2b} \cup \hat{S}'_3$.
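The following sketch implements the item-level filtering of Steps 2b and 3 above; the helper test_item and all parameter choices are illustrative assumptions rather than part of the formal algorithm.

```python
import numpy as np

def test_item(j, reps, S, rho, rng):
    """Individually test item j `reps` times under symmetric noise (2):
    a test of {j} is positive w.p. 1-rho if j is defective, else rho."""
    base = 1 - rho if j in S else rho
    return rng.binomial(reps, base)  # number of positive outcomes

def refine(S1, n_check, n_tilde, alpha2, S, rho, rng):
    """Steps 2b and 3 of Algorithm 2: keep the k - alpha2*k items of S1
    with the most positives, then re-test the remaining alpha2*k items
    and keep those positive at least n_tilde/2 times."""
    S1 = list(S1)
    k = len(S1)
    counts = np.array([test_item(j, n_check, S, rho, rng) for j in S1])
    keep = np.argsort(-counts)[: k - int(alpha2 * k)]            # Step 2b
    S_2b = {S1[i] for i in keep}
    rest = set(S1) - S_2b
    S_3 = {j for j in rest
           if test_item(j, n_tilde, S, rho, rng) >= n_tilde / 2}  # Step 3
    return S_2b | S_3
```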
The following theorem characterizes the asymptotic number of tests required.
Theorem 3. Under the symmetric noisy group testing setup with crossover probability $\rho \in \big(0,\frac{1}{2}\big)$, under the scaling $k = \Theta(p^{\theta})$ for some $\theta \in (0,1)$, there exists a three-stage adaptive group testing algorithm such that
n \le \inf_{\gamma \in (0,1),\, \delta_2 \in (0,1)} \Big( \max\{n_{\mathrm{MI},1},\, n_{\mathrm{MI},2}(\gamma,\delta_2),\, n_{\mathrm{Conc}}(\gamma,\delta_2)\} + n_{\mathrm{Indiv}}(\gamma) \Big) (1+o(1))  (20)
and $P_e \to 0$ as $p \to \infty$, where:
- The standard mutual information based term is
n_{\mathrm{MI},1} = \frac{k\log\frac{p}{k}}{\log 2 - H_2(\rho)}.  (21)
- An additional mutual information based term is
n_{\mathrm{MI},2}(\gamma,\delta_2) = \frac{2}{(\log 2)(1-2\rho)\log\frac{1-\rho}{\rho}} \cdot \frac{1}{1-\delta_2} \cdot \big( (1-\theta) k\log p + 2(1-\gamma) k\log k \big).  (22)
- The term associated with a concentration bound is
n_{\mathrm{Conc}}(\gamma,\delta_2) = \frac{4\big(1 + \frac{1}{3}\delta_2(1-2\rho)\big)}{(\log 2)\,\delta_2^2\,(1-2\rho)^2} \cdot (1-\gamma) k\log k.  (23)
- The term associated with individual testing is
n_{\mathrm{Indiv}}(\gamma) = \frac{\gamma k\log k}{D_2(\rho \| 1-\rho)}.  (24)
While the theorem statement is somewhat complex, it is closely related to other simpler results on group testing (a numerical sketch evaluating (20) is given after this list):
- In the limit as $\gamma \to 0$, the term $\max\{n_{\mathrm{MI},1}, n_{\mathrm{MI},2}(\gamma,\delta_2), n_{\mathrm{Conc}}(\gamma,\delta_2)\}$ corresponds to the condition for exact recovery derived in [8]. Since $n_{\mathrm{Indiv}}(\gamma)$ becomes negligible as $\gamma \to 0$, this means that we have the above-mentioned desired property of being at least as good as the exact recovery result.
- Taking $\gamma \to 1$ and $\delta_2 \to 0$ in a manner such that $\frac{1-\gamma}{\delta_2^2} \to 0$, we recover a strengthened version of Theorem 1, with $D_2(\frac{1}{2} \| 1-\rho) = \frac{1}{2}\log\frac{1}{4\rho(1-\rho)}$ increased to $D_2(\rho \| 1-\rho)$.²
- The parameter $\delta_2$ controls the trade-off between the concentration behavior associated with $n_{\mathrm{Conc}}$ and the mutual information term associated with $n_{\mathrm{MI},2}$.
²By letting the first stage of Algorithm 2 use separate decoding of items [16], one can obtain a strengthened version of Theorem 2 with the same improvement. This result is omitted for the sake of brevity, as the main purpose of the refinements given in this section is to obtain a bound that is always at least as good as the non-adaptive information-theoretic bound of [8].
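To make the optimization in (20) concrete, the following sketch numerically evaluates the bracketed expression over a grid of $(\gamma,\delta_2)$ using the definitions (21)–(24); the example parameters and grid resolution are arbitrary illustrative choices.

```python
import numpy as np

def D2(a, b):  # binary KL divergence in nats
    return a * np.log(a / b) + (1 - a) * np.log((1 - a) / (1 - b))

def theorem3_bound(p, k, rho):
    """Numerically minimize the bracketed expression in (20) over a grid
    of (gamma, delta2), using the terms defined in (21)-(24)."""
    theta = np.log(k) / np.log(p)
    H2 = rho * np.log(1 / rho) + (1 - rho) * np.log(1 / (1 - rho))
    n_mi1 = k * np.log(p / k) / (np.log(2) - H2)                      # (21)
    best = np.inf
    for gamma in np.linspace(0.01, 0.99, 99):
        for d2 in np.linspace(0.01, 0.99, 99):
            n_mi2 = (2 / (np.log(2) * (1 - 2 * rho) * np.log((1 - rho) / rho))
                     / (1 - d2)
                     * ((1 - theta) * k * np.log(p)
                        + 2 * (1 - gamma) * k * np.log(k)))           # (22)
            n_conc = (4 * (1 + d2 * (1 - 2 * rho) / 3)
                      / (np.log(2) * d2**2 * (1 - 2 * rho)**2)
                      * (1 - gamma) * k * np.log(k))                  # (23)
            n_indiv = gamma * k * np.log(k) / D2(rho, 1 - rho)        # (24)
            best = min(best, max(n_mi1, n_mi2, n_conc) + n_indiv)
    return best

print(theorem3_bound(p=10**6, k=100, rho=0.11))
```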
A. Proof of Theorem 3
The proof follows similar steps to those of Theorem 1, considering the four steps of Algorithm 2 separately.
Step 1. We show in Appendix A that the approximate recovery result of [8] can be extended as follows: There exists a non-adaptive algorithm recovering an estimate $\hat{S}_1$ of cardinality $k$ such that $d(S,\hat{S}_1) \le k^{\gamma}$ with probability approaching one, provided that the number of tests $n_1$ satisfies
n_1 \ge \max\{n_{\mathrm{MI},1},\, n_{\mathrm{MI},2}(\gamma,\delta_2),\, n_{\mathrm{Conc}}(\gamma,\delta_2)\} \cdot (1+o(1))  (25)
for some $\delta_2 \in (0,1)$, under the definitions in (21)–(23). This algorithm and its corresponding estimate $\hat{S}_1$ constitute the first step.
Step 2a. The algorithm and analysis for this step are identical to those of Theorem 1: We use the variation of NCOMP given in Appendix C to identify all defective items in $\{1,\dots,p\} \setminus \hat{S}_1$ with probability approaching one, while only using $O(k^{\gamma}\log p) = o(k\log p)$ tests.
Step 2b. For this step, we need to show that the set $\hat{S}'_{2b}$ constructed in Algorithm 2 only contains defective items. Recall that this set is constructed by testing each item in $\hat{S}_1$ individually $\check{n}$ times, and keeping the items that returned positive the highest number of times. Since $\hat{S}'_{2b}$ contains $|\hat{S}_1| - \alpha_2 k$ items, requiring all of these items to be defective is equivalent to requiring that the set of $\alpha_2 k$ items with the smallest number of positive outcomes includes the $k^{\gamma}$ (or fewer) non-defective items in $\hat{S}_1$. For any $\zeta > 0$, the following two conditions suffice for this purpose:
- Event $\mathcal{A}_1$: All non-defective items in $\hat{S}_1$ return positive less than $\zeta\check{n}$ times;
- Event $\mathcal{A}_2$: At most $\alpha_2 k - k^{\gamma}$ defective items return positive less than $\zeta\check{n}$ times.
Here we assume that $k$ is sufficiently large so that $\alpha_2 k > k^{\gamma}$, which is valid since $\gamma < 1$ and $\alpha_2 > 0$ is constant.
Fix an arbitrary item $j$, and let $\check{N}_{j,1}$ be the number of its $\check{n}$ tests that are positive. Since the test outcomes are distributed as $\mathrm{Bernoulli}(1-\rho)$ for defective $j$ and $\mathrm{Bernoulli}(\rho)$ for non-defective $j$, we obtain from (12) that
\mathbb{P}\big[\check{N}_{j,1} \le \zeta\check{n}\big] \le e^{-\check{n} D_2(\zeta \| 1-\rho)}, \quad j \in S  (26)
\mathbb{P}\big[\check{N}_{j,1} \ge \zeta\check{n}\big] \le e^{-\check{n} D_2(\zeta \| \rho)}, \quad j \notin S.  (27)
Hence, we obtain from the union bound over the non-defective items in $\hat{S}_1$ that
\mathbb{P}[\mathcal{A}_1^c] \le k^{\gamma} \cdot e^{-\check{n} D_2(\zeta \| \rho)},  (28)
which is upper bounded by $\delta_3 > 0$ as long as
\check{n} \ge \frac{\log\frac{k^{\gamma}}{\delta_3}}{D_2(\zeta \| \rho)}.  (29)
Moreover, regarding the event $\mathcal{A}_2$, the average number of defective items that return positive less than $\zeta\check{n}$ times is upper bounded by $k e^{-\check{n} D_2(\zeta \| 1-\rho)}$ (recall that $|\hat{S}_1| = k$), and hence, Markov's inequality gives
\mathbb{P}[\mathcal{A}_2^c] \le \frac{k e^{-\check{n} D_2(\zeta \| 1-\rho)}}{\alpha_2 k - k^{\gamma}}.  (30)
This is upper bounded by $\frac{k/\log k}{\alpha_2 k - k^{\gamma}} \to 0$ as long as $\check{n} \ge \frac{\log\log k}{D_2(\zeta \| 1-\rho)}$. This requirement, in turn, behaves as $o(\log k)$ for any $\zeta < 1-\rho$. Hence, we are left with only the condition on $\check{n}$ in (29), and choosing $\zeta$ arbitrarily close to $1-\rho$ means that we only need the following to hold for arbitrarily small $\eta > 0$:
\check{n} \ge \frac{\gamma \log k}{D_2(1-\rho \| \rho)} (1+\eta),  (31)
since $\log\frac{k^{\gamma}}{\delta_3} = (\gamma\log k)(1+o(1))$ no matter how small $\delta_3$ is. Multiplying by $k$ (i.e., the number of items that are tested individually $\check{n}$ times) and noting that $D_2(1-\rho \| \rho) = D_2(\rho \| 1-\rho)$, we deduce that the number of tests in this stage is asymptotically at most $n_{\mathrm{Indiv}}(\gamma)$, defined in (24).
Step 3. This step is the same as Step 2b in Algorithm 1, but we are now working with $\alpha_2 k$ items rather than $k$ items. As a result, the number of tests required is $O(\alpha_2 k\log k)$, meaning that the coefficient of $k\log k$ can be made arbitrarily small by a suitable choice of $\alpha_2$.
IV. CONVERSE
To our knowledge, the best known existing converse bound for the symmetric noise model in the adaptive setting is that of Baldassini et al. [7], shown in (6). On the other hand, the achievability bound of Theorem 1 contains a $k\log k$ term, meaning that the gap between the achievability and converse grows unbounded as $\theta \to 1$ under the scaling $k = \Theta(p^{\theta})$. In this section, we provide a novel converse bound revealing that the $\Omega(k\log k)$ behavior is unavoidable.
There is a minor caveat to this converse result: We have not been able to prove it in the case that $S$ is known to have cardinality exactly $k$, but rather, only in the case that it is known to have cardinality either $k$ or $k-1$. We strongly conjecture that this distinction has no impact on the fundamental limits; we argue in Appendix B that Theorem 1 remains true even when $k$ is only known up to a multiplicative $1+o(1)$ term, and Theorem 3 remains true when $k$ is only known up to an additive $o(k^{\gamma})$ term. Since we assume that $k \to \infty$, these assumptions are much weaker than the assumption $|S| \in \{k-1, k\}$.
To make the model definition more precise, fix $k \le \frac{p}{2}$, and define
\mathcal{S}_{k,p} = \big\{ S \subseteq \{1,\dots,p\} : |S| = k \big\},  (32)
and similarly for $\mathcal{S}_{k-1,p}$. We consider the following distribution for the random defective set:
S \sim \mathrm{Uniform}(\mathcal{S}_{k,p} \cup \mathcal{S}_{k-1,p}).  (33)
Under this slightly modified model, we have the following.
Theorem 4. Consider the symmetric noisy group testing setup with crossover probability $\rho \in \big(0,\frac{1}{2}\big)$, $S$ distributed according to (33), and $k \to \infty$ with $k \le \frac{p}{2}$. For any adaptive algorithm, in order to achieve $P_e \to 0$, it is necessary that
n \ge \max\bigg\{ \frac{k\log\frac{p}{k}}{\log 2 - H_2(\rho)},\; \frac{k\log k}{\log\frac{1-\rho}{\rho}} \bigg\} (1-o(1)).  (34)
The first term is precisely (6), so our novelty is in deriving the second term. This result provides the first counter-example to the natural conjecture that the optimal number of tests is $\frac{k\log\frac{p}{k}}{\log 2 - H_2(\rho)}(1+o(1))$ whenever $k = \Theta(p^{\theta})$ with $\theta \in (0,1)$. Indeed, the $\Omega(k\log k)$ lower bound reveals that the constant pre-factor to $k\log\frac{p}{k}$ must grow unbounded as $\theta \to 1$.
It is interesting to observe the behavior of Theorems 1, 3, and 4 in the limit as $\rho \to 0$. As one should expect, under the scaling $k = \Theta(p^{\theta})$ for fixed $\theta \in (0,1)$, both the achievability and converse bounds (see (8) and (34)) tend towards the noiseless limit $k\log_2\frac{p}{k}(1+o(1))$ as $\rho \to 0$. Moreover, the achievability and converse bounds scale similarly with respect to $\rho$, in the sense that the $k\log k$ term is scaled by $\Theta\big(\frac{1}{\log\frac{1}{\rho}}\big)$ in both cases.
In fact, if we consider the refined achievability bound (Theorem 3), we can make a stronger claim. If we take $\gamma \to 1$ and $\theta \to 1$ simultaneously, then the bound in (20) is asymptotically equivalent to $n_{\mathrm{Indiv}}(1)$, since $n_{\mathrm{MI},1}$ scales as $k\log\frac{p}{k} \ll k\log k$, whereas the constant factors in $n_{\mathrm{MI},2}$ and $n_{\mathrm{Conc}}$ vanish (see (21)–(23)). Hence, we are only left with $n_{\mathrm{Indiv}}(1)$ in (24), and if $\rho$ is small, then the denominator $D_2(\rho \| 1-\rho) = \rho\log\frac{\rho}{1-\rho} + (1-\rho)\log\frac{1-\rho}{\rho}$ is approximately equal to $\log\frac{1}{\rho}$. The exact same statement is true for the denominator in (34), and hence, the achievability and converse bounds exhibit matching constant factors. Specifically, this statement holds when the order of the limits is first $n \to \infty$, then $\theta \to 1$, then $\rho \to 0$. This fact explains the near-identical behavior of the achievability and converse in Figure 1 for $\theta$ close to one in the low noise setting, $\rho = 10^{-4}$.
On the other hand, for fixed $\theta \in (0,1)$, the logarithmic decay of the $\Theta\big(\frac{1}{\log\frac{1}{\rho}}\big)$ factor to zero is quite slow, which explains the non-negligible deviation from the noiseless threshold (i.e., a straight line at height 1) in Figure 1, both in the high-noise and low-noise cases.
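As a quick check of the claimed low-noise behavior, the following lines compare the achievability denominator $D_2(\rho\|1-\rho)$ from (24) with the converse denominator $\log\frac{1-\rho}{\rho}$ from (34) as $\rho$ decreases; since $D_2(\rho\|1-\rho) = (1-2\rho)\log\frac{1-\rho}{\rho}$, the ratio approaches one.

```python
import numpy as np

def D2(a, b):  # binary KL divergence in nats
    return a * np.log(a / b) + (1 - a) * np.log((1 - a) / (1 - b))

for rho in [0.11, 1e-2, 1e-4, 1e-8]:
    ach = D2(rho, 1 - rho)          # denominator in (24)
    conv = np.log((1 - rho) / rho)  # denominator in (34)
    print(f"rho={rho:.0e}:  {ach:.4f} vs {conv:.4f}  (ratio {ach/conv:.4f})")
```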
Another interesting consequence of Theorem 4 is that in the linear regime $k = \Theta(p)$, one requires $n = \Omega(p\log p)$ in the presence of noise. This is in stark contrast to the noiseless setting, where individual testing trivially identifies $S$ with only $p$ tests. In the non-adaptive setting, establishing the necessity of $n = \Omega(p\log p)$ is straightforward:³ If a genie reveals $S \cup \{j\}$ to the decoder for some $j \notin S$, then the decoder can only identify the final non-defective $j$ by testing each item $\Omega(\log p)$ times. However, this argument does not extend to the adaptive setting.
3This argument is due to Sidharth Jaggi, whose insight is gratefully acknowledged.
Instead, the proof of Theorem 4 is inspired by that of a converse bound for the top-$m$ arm identification problem from the multi-armed bandit (MAB) literature [31]. Compared with the latter, our setting has a number of distinct features that are non-trivial to handle:
- In group testing, one does not necessarily test one item at a time, whereas in the MAB setting of [31], one pulls one arm at a time.
- In contrast with [31], we do not consider a minimax lower bound, but rather, a Bayesian lower bound for a given distribution on $S$. The latter is more difficult, in the sense that a Bayesian lower bound implies a minimax lower bound but not vice versa.
- In our setting, the status of each item is binary-valued (defective or non-defective), whereas the construction of a hard MAB problem in [31] consists of three distinct types of items (or “arms” in the MAB terminology), corresponding to high reward, medium reward, and low reward.
We now proceed with the proof.
A. Proof of Theorem 4
We assume without loss of optimality that any given test $X^{(i)}$ is deterministic given $Y^{(1)},\dots,Y^{(i-1)}$, and that the final estimate $\hat{S}$ is similarly deterministic given the test outcomes. To see that it suffices to consider this case, we note that
\mathbb{P}[\mathrm{error}] = \mathbb{E}\big[\mathbb{P}[\mathrm{error} \,|\, \mathbf{A}]\big] \ge \min_{A} \mathbb{P}[\mathrm{error} \,|\, \mathbf{A} = A],  (35)
where $\mathbf{A}$ denotes a randomized algorithm (i.e., combination of test design and decoder), and $A$ is a realization of $\mathbf{A}$ corresponding to a deterministic algorithm.
Suppose that after $S$ is randomly generated according to (33), a genie reveals $S \cup T$ to the decoder, where $T$ is a uniformly random set of non-defective items such that $|S \cup T| = 2k$ (i.e., $T$ has cardinality $2k - |S| \in \{k, k+1\}$). Hence, we are left with an easier group testing problem consisting of $2k$ items, $k-1$ or $k$ of which are defective. Since the prior distribution on $S$ in (33) is uniform, we see that conditioned on the ground set of size $2k$, the defective set $S$ is uniform on the $\binom{2k}{k} + \binom{2k}{k-1}$ possibilities.
Without loss of generality, assume that the $2k$ revealed items are $\{1,\dots,2k\}$, and hence, the new distribution of $S$ given the information from the genie is
S \sim \mathrm{Uniform}(\mathcal{S}_{k,2k} \cup \mathcal{S}_{k-1,2k}).  (36)
We first study the error probability conditioned on a given defective set $S \subset \{1,\dots,2k\}$ having cardinality $k$. For any such fixed choice, we denote probabilities and expectations by $\mathbb{P}_S$ and $\mathbb{E}_S$.
Fix $\epsilon \in (0,1)$, and for each $j \in S$, let $N_j$ be the (random) number of tests containing item $j$ and no other defective items. Since $\sum_{j\in S} N_j \le n$ with probability one, we have $\sum_{j\in S} \mathbb{E}_S[N_j] \le n$, meaning that at most $(1-\epsilon)k$ of the $j \in S$ have $\mathbb{E}_S[N_j] \ge \frac{n}{(1-\epsilon)k}$. For all other $j$, we have $\mathbb{E}_S[N_j] \le \frac{n}{(1-\epsilon)k}$, and Markov's inequality gives $\mathbb{P}_S\big[N_j \ge \frac{(1+\epsilon)n}{(1-\epsilon)k}\big] \le \frac{1}{1+\epsilon}$. We have therefore proved the following.
Lemma 1. For any $\epsilon \in (0,1)$, and any set $S \subset \{1,\dots,2k\}$ of cardinality $k$, there exist at least $\epsilon k$ items $j \in S$ such that $\mathbb{P}_S\big[N_j \ge \frac{(1+\epsilon)n}{(1-\epsilon)k}\big] \le \frac{1}{1+\epsilon}$.
The following lemma, consisting of a change of measure between the probabilities under two different defective sets, will also be crucial. Recalling that we are considering test designs that are deterministic given the past samples, we see that $N_j$ is a deterministic function of $\mathbf{Y}$, so we write the corresponding function as $n_j(\mathbf{y})$. Moreover, we let $\mathcal{Y}_S$ be the set of $\mathbf{y}$ sequences that are decoded as $S$, and we write $\mathbb{P}[\mathbf{y}]$ and $\mathbb{P}[\mathcal{Y}_S]$ as shorthands for $\mathbb{P}[\mathbf{Y} = \mathbf{y}]$ and $\mathbb{P}[\mathbf{Y} \in \mathcal{Y}_S]$, respectively.
Lemma 2. Given $S$ of cardinality $k$, for any $j \in S$, and any output sequence $\mathbf{y}$ such that $n_j(\mathbf{y}) \le \frac{(1+\epsilon)n}{(1-\epsilon)k}$, we have
\mathbb{P}_{S\setminus\{j\}}[\mathbf{y}] \ge \mathbb{P}_S[\mathbf{y}] \Big(\frac{\rho}{1-\rho}\Big)^{\frac{(1+\epsilon)n}{(1-\epsilon)k}}.  (37)
Consequently, if $j \in S$ is such that $\mathbb{P}_S\big[N_j \ge \frac{(1+\epsilon)n}{(1-\epsilon)k}\big] \le \frac{1}{1+\epsilon}$, then
\mathbb{P}_{S\setminus\{j\}}[\mathcal{Y}_S] \ge \Big( \mathbb{P}_S[\mathcal{Y}_S] - \frac{1}{1+\epsilon} \Big) \Big(\frac{\rho}{1-\rho}\Big)^{\frac{(1+\epsilon)n}{(1-\epsilon)k}}.  (38)
Proof. Again using the fact that the test designs are deterministic given the past samples, we can write
\mathbb{P}_S[\mathbf{y}] = \prod_{i=1}^{n} \mathbb{P}_S\big[y^{(i)} \,\big|\, y^{(1)},\dots,y^{(i-1)}\big]  (39)
= \prod_{i=1}^{n} \mathbb{P}_S\big[y^{(i)} \,\big|\, x^{(i)}\big],  (40)
where $x^{(i)} \in \{0,1\}^p$ is the $i$-th test. Note that (40) holds because $Y^{(i)}$ depends on the previous samples only through $X^{(i)}$. An analogous expression also holds for $\mathbb{P}_{S\setminus\{j\}}[\mathbf{y}]$.
Due to the “or” operation in the observation model (2), the only tests for which the outcome probability changes as a result of removing $j$ from $S$ are those for which $j$ was the unique defective item tested. We have at most $\frac{(1+\epsilon)n}{(1-\epsilon)k}$ such tests by assumption, and each of them causes the probability of $y^{(i)}$ (given $x^{(i)}$) to be multiplied or divided by $\frac{\rho}{1-\rho}$. Since $\rho < 0.5$, we deduce the lower bound in (37), corresponding to the case that all $\frac{(1+\epsilon)n}{(1-\epsilon)k}$ of them are multiplied by this factor.
To prove the second part, we write
\mathbb{P}_{S\setminus\{j\}}[\mathcal{Y}_S] \ge \mathbb{P}_{S\setminus\{j\}}\Big[ \mathbf{Y} \in \mathcal{Y}_S \cap \Big\{ N_j \le \frac{(1+\epsilon)n}{(1-\epsilon)k} \Big\} \Big]  (41)
\ge \mathbb{P}_S\Big[ \mathbf{Y} \in \mathcal{Y}_S \cap \Big\{ N_j \le \frac{(1+\epsilon)n}{(1-\epsilon)k} \Big\} \Big] \Big(\frac{\rho}{1-\rho}\Big)^{\frac{(1+\epsilon)n}{(1-\epsilon)k}}  (42)
\ge \Big( \mathbb{P}_S[\mathcal{Y}_S] - \frac{1}{1+\epsilon} \Big) \Big(\frac{\rho}{1-\rho}\Big)^{\frac{(1+\epsilon)n}{(1-\epsilon)k}},  (43)
where (42) follows from the first part of the lemma, and (43) follows by writing $\mathbb{P}[A \cap B] \ge \mathbb{P}[A] - \mathbb{P}[B^c]$.
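The change-of-measure inequality (37) can be sanity-checked numerically for a non-adaptive design: the log-likelihood ratio between $S\setminus\{j\}$ and $S$ only picks up factors from tests in which $j$ is the unique defective. A sketch with arbitrary illustrative parameters:

```python
import numpy as np

def log_likelihood(X, y, S, rho):
    """log P_S[y] for a fixed (non-adaptive) test matrix X under the
    symmetric noise model (2)."""
    defective = np.zeros(X.shape[1], dtype=bool)
    defective[list(S)] = True
    u = (X[:, defective].sum(axis=1) > 0).astype(int)  # noiseless outcomes
    return np.sum(np.where(y == u, np.log(1 - rho), np.log(rho)))

rng = np.random.default_rng(1)
k, rho = 10, 0.11
X = rng.binomial(1, 0.1, size=(200, 2 * k))
S = set(range(k)); j = 0
y = (X[:, :k].sum(axis=1) > 0).astype(int) ^ rng.binomial(1, rho, 200)
ratio = log_likelihood(X, y, S - {j}, rho) - log_likelihood(X, y, S, rho)
unique = np.sum(X[:, j] * (X[:, 1:k].sum(axis=1) == 0))  # j is unique defective
print(f"log ratio = {ratio:.2f} >= {unique} * log(rho/(1-rho)) = "
      f"{unique * np.log(rho / (1 - rho)):.2f}")
```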
The idea behind applying this lemma is that if a given $\mathbf{y}$ is decoded to $S$, then it cannot be decoded to $S\setminus\{j\}$; hence, if a given sequence $\mathbf{y}$ contributes to $\mathbb{P}_S[\mathrm{no\ error}]$, then it also contributes to $\mathbb{P}_{S\setminus\{j\}}[\mathrm{error}]$. We formalize this idea as follows. Recalling that $\mathcal{S}_{k,2k}$ is the set of all subsets of $\{1,\dots,2k\}$ of cardinality $k$, we have
\sum_{S' \in \mathcal{S}_{k-1,2k}} \mathbb{P}_{S'}[\mathrm{error}] \ge \sum_{S' \in \mathcal{S}_{k-1,2k}} \sum_{j \notin S'} \mathbb{P}_{S'}[\mathcal{Y}_{S' \cup \{j\}}]  (44)
= \sum_{S' \in \mathcal{S}_{k-1,2k}} \sum_{j \notin S'} \sum_{S \in \mathcal{S}_{k,2k}} \mathbb{1}\{S = S' \cup \{j\}\}\, \mathbb{P}_{S'}[\mathcal{Y}_S]  (45)
= \sum_{S' \in \mathcal{S}_{k-1,2k}} \sum_{j=1}^{2k} \sum_{S \in \mathcal{S}_{k,2k}} \mathbb{1}\{S = S' \cup \{j\}\}\, \mathbb{P}_{S'}[\mathcal{Y}_S]  (46)
= \sum_{S \in \mathcal{S}_{k,2k}} \sum_{j \in S} \sum_{S' \in \mathcal{S}_{k-1,2k}} \mathbb{1}\{S = S' \cup \{j\}\}\, \mathbb{P}_{S'}[\mathcal{Y}_S]  (47)
= \sum_{S \in \mathcal{S}_{k,2k}} \sum_{j \in S} \mathbb{P}_{S\setminus\{j\}}[\mathcal{Y}_S],  (48)
where (44) follows since $S'$ differs from $S' \cup \{j\}$, (45) follows since the indicator function is only equal to one for $S = S' \cup \{j\}$, (46) follows since the extra $j$ included in the middle summation (i.e., $j \in S'$) also make the indicator function equal zero, (47) follows by re-ordering the summations and noting that the indicator function equals zero when $j \notin S$, and (48) follows by only keeping the $S'$ for which the indicator function is one.
The following lemma is based on lower bounding (48) using Lemma 2.
Lemma 3. If $\frac{1}{|\mathcal{S}_{k,2k}|} \sum_{S \in \mathcal{S}_{k,2k}} \mathbb{P}_S[\mathrm{error}] \le \delta$ for some $\delta > 0$, then
\frac{1}{|\mathcal{S}_{k-1,2k}|} \sum_{S' \in \mathcal{S}_{k-1,2k}} \mathbb{P}_{S'}[\mathrm{error}] \ge \frac{\epsilon k}{2} \Big( 1 - 2\delta - \frac{1}{1+\epsilon} \Big) \Big(\frac{\rho}{1-\rho}\Big)^{\frac{(1+\epsilon)n}{(1-\epsilon)k}}  (49)
for any $\epsilon \in (0,1)$.
Proof. Since $\frac{1}{|\mathcal{S}_{k,2k}|} \sum_{S \in \mathcal{S}_{k,2k}} \mathbb{P}_S[\mathrm{error}] \le \delta$ and $|\mathcal{S}_{k,2k}| = \binom{2k}{k}$, there must exist at least $\frac{1}{2}\binom{2k}{k}$ defective sets $S \in \mathcal{S}_{k,2k}$ such that $\mathbb{P}_S[\mathrm{error}] \le 2\delta$. We lower bound the first summation in (48) by a summation over such $S$, and for each one, we lower bound the summation over $j \in S$ by the set of size at least $\epsilon k$ given in Lemma 1. For the choices of $S$ and $j$ that are kept in this lower bound, the summand $\mathbb{P}_{S\setminus\{j\}}[\mathcal{Y}_S]$ is lower bounded by $\big(1 - 2\delta - \frac{1}{1+\epsilon}\big)\big(\frac{\rho}{1-\rho}\big)^{\frac{(1+\epsilon)n}{(1-\epsilon)k}}$ by the second part of Lemma 2 (with $\mathbb{P}_S[\mathcal{Y}_S] = \mathbb{P}_S[\mathrm{no\ error}] \ge 1-2\delta$). Putting it all together, we obtain
\sum_{S' \in \mathcal{S}_{k-1,2k}} \mathbb{P}_{S'}[\mathrm{error}] \ge \frac{1}{2}\binom{2k}{k} \cdot \epsilon k \Big( 1 - 2\delta - \frac{1}{1+\epsilon} \Big) \Big(\frac{\rho}{1-\rho}\Big)^{\frac{(1+\epsilon)n}{(1-\epsilon)k}}.  (50)
Using the identity $\binom{2k}{k} = \binom{2k}{k-1} \cdot \frac{k+1}{k} \ge \binom{2k}{k-1}$, this yields
\frac{1}{\binom{2k}{k-1}} \sum_{S' \in \mathcal{S}_{k-1,2k}} \mathbb{P}_{S'}[\mathrm{error}] \ge \frac{\epsilon k}{2} \Big( 1 - 2\delta - \frac{1}{1+\epsilon} \Big) \Big(\frac{\rho}{1-\rho}\Big)^{\frac{(1+\epsilon)n}{(1-\epsilon)k}},  (51)
thus proving the lemma.
Observe that for sufficiently small $\delta$, we can choose $\epsilon$ to be arbitrarily small while still ensuring that $1 - 2\delta - \frac{1}{1+\epsilon} > 0$. Moreover, $\mathbb{P}[S \in \mathcal{S}_{k,2k}]$ and $\mathbb{P}[S \in \mathcal{S}_{k-1,2k}]$ are both bounded away from zero under the distribution in (33).
Figure 2: Z-channel (Left) and reverse Z-channel (Right).
Most importantly, the term $\frac{\epsilon k}{2}\big(\frac{\rho}{1-\rho}\big)^{\frac{(1+\epsilon)n}{(1-\epsilon)k}}$ appearing in (49) is lower bounded by $\delta' > 0$ as long as $n \le \frac{(1-\epsilon)k\log\frac{\epsilon k}{2\delta'}}{(1+\epsilon)\log\frac{1-\rho}{\rho}}$. Since $\epsilon$ may be arbitrarily small and $\log\frac{\epsilon k}{2\delta'} = (\log k)(1+o(1))$, we deduce that the following condition is necessary for attaining arbitrarily small error probability:
n \ge \frac{k\log k}{\log\frac{1-\rho}{\rho}} (1-\eta),  (52)
where $\eta > 0$ is arbitrarily small. This completes the proof of Theorem 4.
V. OTHER OBSERVATION MODELS
While we have focused on the symmetric noise model (2) for concreteness, most of our algorithms and analysis techniques can be extended to other observation models. In this section, we present some of the resulting bounds for three different models: The noiseless model (1), the Z-channel model,
P_{Y|U}(0|0) = 1, \quad P_{Y|U}(1|0) = 0,  (53)
P_{Y|U}(0|1) = \rho, \quad P_{Y|U}(1|1) = 1-\rho,  (54)
and the reverse Z-channel model,
P_{Y|U}(0|0) = 1-\rho, \quad P_{Y|U}(1|0) = \rho,  (55)
P_{Y|U}(0|1) = 0, \quad P_{Y|U}(1|1) = 1,  (56)
where in both cases we define $U = \bigvee_{j\in S} X_j$. That is, we pass the noiseless observation through the corresponding binary channel; see Figure 2 for an illustration. Under the Z-channel model, positive tests indicate with certainty that a defective item is included, whereas under the reverse Z-channel model, negative tests indicate with certainty that no defective item is included. While the two channels have the same capacity, it is interesting to ask whether one of the two is fundamentally more difficult to handle in the context of group testing.
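For reference, the common capacity of the two channels can be computed numerically by maximizing $I(U;Y)$ over the input distribution; the grid search below is a simple illustrative approach (a closed form for the Z-channel capacity also exists).

```python
import numpy as np

def mutual_info(q, P):
    """I(U;Y) in nats for input P[U=1]=q and 2x2 channel matrix
    P[u][y] = P(Y=y | U=u)."""
    pu = np.array([1 - q, q])
    joint = pu[:, None] * np.array(P)  # P(U=u, Y=y)
    py = joint.sum(axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = joint * np.log(joint / (pu[:, None] * py[None, :]))
    return np.nansum(terms)           # 0*log(0) terms contribute zero

rho = 0.11
z_channel = [[1.0, 0.0], [rho, 1 - rho]]      # (53)-(54)
rev_z_channel = [[1 - rho, rho], [0.0, 1.0]]  # (55)-(56)
qs = np.linspace(1e-3, 1 - 1e-3, 9999)
for name, P in [("Z", z_channel), ("reverse Z", rev_z_channel)]:
    C = max(mutual_info(q, P) for q in qs)
    print(f"{name}-channel capacity: {C:.4f} nats")
```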
A. Noiseless setting
In the noiseless setting, the final step of Algorithm 1 is much simpler: Simply test the items in $\hat{S}_1$ individually once each. This only requires $k$ tests, and succeeds with certainty, yielding the following.
Theorem 5. Under the scaling $k = \Theta(p^{\theta})$ for some $\theta \in (0,1)$, there exists a two-stage algorithm for noiseless adaptive group testing that succeeds with probability approaching one, with a number of tests bounded by
n \le k\log_2\frac{p}{k} (1+o(1)).  (57)
Moreover, there exists a computationally efficient two-stage algorithm that succeeds with probability approaching one, with a number of tests bounded by
n \le \frac{1}{\log 2}\, k\log_2\frac{p}{k} (1+o(1)).  (58)
The upper bound (57) is tight, as it matches the so-called counting bound, e.g., see [32]. To our knowledge, the minimum number of stages used to attain this bound previously for all $\theta \in (0,1)$ was four [26]. It is worth noting, however, that the algorithm of [26] has low computational complexity, unlike Algorithm 1.
The bound (57) does not contradict the converse bound of Mézard and Toninelli [28]; the latter states that any two-stage algorithm with zero error probability must have an average number of tests of $\frac{1}{\log 2}\, k\log_2\frac{p}{k}(1+o(1))$ or higher. In contrast, (57) corresponds to vanishing error probability and a fixed number of tests.
B. Z-channel model
Under the Z-channel model, the capacity-based converse bound of [7] turns out to be tight for all $\theta \in (0,1)$, as stated in the following.
Theorem 6. Under the noisy group testing model with Z-channel noise having parameter $\rho \in (0,1)$, and a number of defectives satisfying $k = \Theta(p^{\theta})$ for some $\theta \in (0,1)$, there exists a three-stage adaptive algorithm achieving vanishing error probability with
n \le \frac{k\log\frac{p}{k}}{C(\rho)} (1+o(1)),  (59)
where $C(\rho)$ is the capacity of the Z-channel in nats.
Proof. The analysis is similar to that of the symmetric noise model, so we omit most of the details.
In the first stage, we use i.i.d. Bernoulli testing with parameter $\nu > 0$ chosen to ensure that the induced distribution $P_U$ of $U = \bigvee_{j\in S} X_j$ equals the capacity-achieving input distribution of the Z-channel. Under this choice, a straightforward extension of the analysis of [8] (see the final part of Appendix A for details) reveals that we can find a set $\hat{S}_1$ of cardinality $k$ such that $d(S,\hat{S}_1) \le \alpha_1 k$ with $n$ satisfying (59), where $d$ is defined in (5), and $\alpha_1 > 0$ is arbitrarily small.
The second stage is similar to Steps 2a and 2b in Algorithm 2. The modifications required in Step 2a are stated in Appendix C, and Step 2b is in fact simpler: We include a given item in $\hat{S}'_{2b}$ if and only if any of its tests returned positive. Due to the nature of the Z-channel, no non-defectives are included in $\hat{S}'_{2b}$. On the other hand, the probability of a defective item returning negative on all $\check{n}$ tests is given by $\rho^{\check{n}}$, which is asymptotically vanishing if $\check{n} = \log\log k$ (say). Hence, by Markov's inequality, the number of defective items that fail to be placed in $\hat{S}'_{2b}$ is smaller than $\alpha_1 k$ with probability approaching one. Moreover, the required number of tests is $O(k\log\log k)$, which is asymptotically negligible.
In the third stage, as in Algorithm 2, we test each item individually $\tilde{n}$ times. Here, however, we let $\hat{S}'_3$ contain the items that returned positive in any test. There are again no false positives, and a given defective item is a false negative with probability $\rho^{\tilde{n}}$. By the union bound and the fact that there are at most $2\alpha_1 k$ items, of which at most $\alpha_1 k$ are defective, we readily deduce vanishing error probability as long as $\tilde{n} = O(\log k)$, meaning the total number of tests is $O(\alpha_1 k\log k)$. This is asymptotically negligible, since $\alpha_1$ is arbitrarily small.
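A minimal sketch of the Z-channel refinement stages just described, under the same kind of illustrative individual-testing helper as before: since positive outcomes are fully reliable under (53)–(54), an item is kept as soon as any of its individual tests returns positive.

```python
import numpy as np

def z_test_item(j, reps, S, rho, rng):
    """Individually test item j `reps` times under Z-channel noise
    (53)-(54): a non-defective item never tests positive; a defective
    item tests positive w.p. 1-rho per test."""
    return rng.binomial(reps, 1 - rho) if j in S else 0

def z_refine(S1, n_check, n_tilde, S, rho, rng):
    """Z-channel analog of Steps 2b and 3: keep any item of S1 with at
    least one positive outcome, then re-test the rest more heavily."""
    S_2b = {j for j in S1 if z_test_item(j, n_check, S, rho, rng) > 0}
    rest = set(S1) - S_2b
    S_3 = {j for j in rest if z_test_item(j, n_tilde, S, rho, rng) > 0}
    return S_2b | S_3
```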
This result shows that under Z-channel noise, the conjecture that the optimal (inverse) coefficient to $k\log\frac{p}{k}$ equals the channel capacity (e.g., see [7]) is true for all $\theta \in (0,1)$, in stark contrast to the symmetric noise model.
It is worth noting that the converse analysis of Section IV does not apply to the Z-channel model. This is because any analog of Lemma 2 is impossible: If there exists a test with outcome $y_i = 1$ where $j$ is the only defective included, then $\mathbb{P}_{S\setminus\{j\}}[\mathbf{y}] = 0$, meaning we cannot hope for an inequality of the form (37).
C. Reverse Z-channel model
Under the reverse Z-channel model, we have the following analog of the converse bound in Theorem 4.
Theorem 7. Consider the noisy group testing setup with reverse Z-channel noise having parameter $\rho \in (0,1)$, $S$ distributed according to (33), and $k \to \infty$ with $k \le \frac{p}{2}$. For any adaptive algorithm, in order to achieve $P_e \to 0$, it is necessary that
n \ge \max\bigg\{ \frac{k\log\frac{p}{k}}{C(\rho)},\; \frac{k\log k}{\log\frac{1}{\rho}} \bigg\} (1-o(1)),  (60)
where $C(\rho)$ is the capacity of the Z-channel in nats.
Proof. The first bound in (60) is the capacity-based bound from [7]. On the other hand, the second bound follows from a near-identical analysis to the proof of Theorem 4, with the only difference being that $\frac{\rho}{1-\rho}$ is replaced by $\rho$ in (37) and the subsequent equations that make use of (37).
We note that unlike the Z-channel, the cases where one of $\mathbb{P}_S[\mathbf{y}]$ and $\mathbb{P}_{S\setminus\{j\}}[\mathbf{y}]$ is zero and the other is non-zero are not problematic. Specifically, this only occurs when $\mathbb{P}_S[\mathbf{y}] = 0$, and in this case, any inequality of the form (37) is trivially true.
Interestingly, this result shows that reverse Z-channel noise is more difficult to handle than Z-channel noise by an arbitrarily large factor as $\theta$ gets closer to one, even though the two channels have the same capacity. In this sense, we deduce that at least under optimal adaptive testing, having reliable positive test outcomes is preferable to having reliable negative test outcomes.
VI. CONCLUSION
We have developed both information-theoretic limits and practical performance guarantees for noisy adaptive group testing. Some of the main implications of our results include the following:
- Under the scaling $k = \Theta(p^{\theta})$, for most $\theta \in (0,1)$, our information-theoretic achievability guarantees for the symmetric noise model are significantly better than the best known non-adaptive achievability guarantees, and similarly when it comes to practical guarantees.
- Our converse for the symmetric noise model reveals that $n = \Omega(k\log k)$ is necessary, and hence, the implied constant in $n = \Theta\big(k\log\frac{p}{k}\big)$ must grow unbounded as $\theta \to 1$. This phenomenon also holds true for the reverse Z-channel noise model, but not for the Z-channel noise model.
- Our bounds are tight or near-tight in several cases of interest, including small values of $\theta$ and low noise levels. Moreover, in the noiseless case, we obtain the optimal threshold using a two-stage algorithm; previously, the smallest known number of stages was four.
It is worth noting that our two-stage (or three-stage) algorithm and its analysis remain applicable when any non-adaptive algorithm is used in the first stage, as long as it identifies a suitably high fraction of the defective set. Hence, improved practical or information-theoretic guarantees for partial recovery in the non-adaptive setting immediately transfer to improved exact recovery guarantees in the adaptive setting.
APPENDIX
A. Non-Adaptive Partial Recovery Result with Vanishing Number of Errors
The analysis of [8] considers the case that the number of allowed errors scales as $d_{\max} = \Theta(k)$. In this section, we adapt the analysis therein to the case $d_{\max} = \Theta(k^{\gamma})$ for some $\gamma \in (0,1)$. This generalization is useful for the refined achievability bound given in Section III (cf. Theorem 3), and is also of interest in its own right.
1) Notation: Recall that $S$ is uniform on the set of subsets of $\{1,\dots,p\}$ having a given cardinality $k$. As in [8], we consider non-adaptive i.i.d. Bernoulli testing, where each item is placed in a given test with probability $\frac{\nu}{k}$ for some $\nu > 0$. We focus our attention on $\nu = \log 2$, though we will still write $\nu$ for the parts of the analysis that apply more generally. The test matrix is denoted by $\mathbf{X} \in \{0,1\}^{n\times p}$ (i.e., the $i$-th row is $X^{(i)}$), and the notation $\mathbf{X}_s$ denotes the sub-matrix obtained by keeping only the columns indexed by $s \subseteq \{1,\dots,p\}$.
Next, we recall some notation from [8]. It will prove convenient to work with random variables that are implicitly conditioned on a fixed value of $S$, say $s = \{1,\dots,k\}$. We write $P_{Y|X_s}$ for the conditional test outcome probability, where $X_s$ is the subset of the test vector $X$ indexed by $s$. Moreover, we write
P_{X_s Y}(x_s, y) := P_X^k(x_s)\, P_{Y|X_s}(y|x_s)  (61)
P_{\mathbf{X}_s \mathbf{Y}}(\mathbf{x}_s, \mathbf{y}) := P_X^{n\times k}(\mathbf{x}_s)\, P_{Y|X_s}^n(\mathbf{y}|\mathbf{x}_s),  (62)
where $P_{Y|X_s}^n(\cdot|\cdot)$ is the $n$-fold product of $P_{Y|X_s}(\cdot|\cdot)$, and $P_X^{(\cdot)}$ denotes the i.i.d. $\mathrm{Bernoulli}\big(\frac{\nu}{k}\big)$ distribution for a vector or matrix of the size indexed in the superscript. The random variables $(X_s, Y)$ and $(\mathbf{X}_s, \mathbf{Y})$ are distributed as
(X_s, Y) \sim P_{X_s Y}  (63)
(\mathbf{X}_s, \mathbf{Y}) \sim P_{\mathbf{X}_s \mathbf{Y}},  (64)
and the remaining entries of the measurement matrix are distributed as $\mathbf{X}_{s^c} \sim P_X^{n\times(p-k)}$, independent of $(\mathbf{X}_s, \mathbf{Y})$.
In our analysis, we consider partitions of the defective set $s$ into two sets $s_{\mathrm{dif}} \ne \emptyset$ and $s_{\mathrm{eq}}$. One can think of $s_{\mathrm{eq}}$ as corresponding to an overlap $s \cap \bar{s}$ between the true set $s$ and some incorrect set $\bar{s}$, with $s_{\mathrm{dif}}$ corresponding to the indices $s \setminus \bar{s}$ in one set but not the other. For a fixed defective set $s$, and a corresponding pair $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$, we write
P_{Y|X_{s_{\mathrm{dif}}} X_{s_{\mathrm{eq}}}}(y | x_{s_{\mathrm{dif}}}, x_{s_{\mathrm{eq}}}) := P_{Y|X_s}(y|x_s),  (65)
where $P_{Y|X_s}$ is the marginal distribution of (62). This form of the conditional observation probability allows us to introduce the marginal distribution
P_{Y|X_{s_{\mathrm{eq}}}}(y | x_{s_{\mathrm{eq}}}) := \sum_{x_{s_{\mathrm{dif}}}} P_X^{\ell}(x_{s_{\mathrm{dif}}})\, P_{Y|X_{s_{\mathrm{dif}}} X_{s_{\mathrm{eq}}}}(y | x_{s_{\mathrm{dif}}}, x_{s_{\mathrm{eq}}}),  (66)
where $\ell := |s_{\mathrm{dif}}|$. Using the preceding definitions, we introduce the information density [33]
\imath^n(\mathbf{x}_{s_{\mathrm{dif}}}; \mathbf{y} | \mathbf{x}_{s_{\mathrm{eq}}}) := \sum_{i=1}^{n} \imath\big(x^{(i)}_{s_{\mathrm{dif}}}; y^{(i)} \,\big|\, x^{(i)}_{s_{\mathrm{eq}}}\big)  (67)
\imath(x_{s_{\mathrm{dif}}}; y | x_{s_{\mathrm{eq}}}) := \log \frac{P_{Y|X_{s_{\mathrm{dif}}} X_{s_{\mathrm{eq}}}}(y | x_{s_{\mathrm{dif}}}, x_{s_{\mathrm{eq}}})}{P_{Y|X_{s_{\mathrm{eq}}}}(y | x_{s_{\mathrm{eq}}})},  (68)
where $(\cdot)^{(i)}$ denotes the $i$-th entry (respectively, row) of a vector (respectively, matrix). Averaging (68) with respect to $(X_s, Y)$ in (63) yields a conditional mutual information, which we denote by
I_{\ell} := I(X_{s_{\mathrm{dif}}}; Y | X_{s_{\mathrm{eq}}}),  (69)
where $\ell := |s_{\mathrm{dif}}|$; by symmetry, the mutual information for each $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ depends only on this quantity.
2) Choice of decoder: We use the same information-theoretic threshold decoder as that in [8]: Fix the constants $\{\gamma_{\ell}\}_{\ell=d_{\max}+1}^{k}$, and search for a set $\bar{s}$ of cardinality $k$ such that
\imath^n(\mathbf{X}_{s_{\mathrm{dif}}}; \mathbf{Y} | \mathbf{X}_{s_{\mathrm{eq}}}) \ge \gamma_{|s_{\mathrm{dif}}|}, \quad \forall (s_{\mathrm{dif}}, s_{\mathrm{eq}}) \text{ such that } |s_{\mathrm{dif}}| > d_{\max},  (70)
where $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ ranges over the partitions of $\bar{s}$. If multiple such $\bar{s}$ exist, or if none exist, then an error is declared. This decoder is inspired by analogous thresholding techniques from the channel coding literature [34], [35].
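To make the threshold test (70) concrete for the symmetric noise model, the following sketch computes the per-test information density (68) for one test, marginalizing the $\ell$ entries of $x_{s_{\mathrm{dif}}}$ as i.i.d. Bernoulli($\nu/k$) as in (66); this is a simplified illustration, not the full decoder implementation of [8].

```python
import numpy as np

def info_density(x_dif, x_eq, y, rho, nu_over_k):
    """Per-test information density (68) under symmetric noise:
    log P(y | OR of x_dif and x_eq) - log P(y | x_eq alone), with the
    unseen x_dif entries marginalized as i.i.d. Bernoulli(nu/k)."""
    u = int(x_dif.any() or x_eq.any())
    p_num = (1 - rho) if y == u else rho   # numerator of (68)
    if x_eq.any():                          # OR is 1 regardless of x_dif
        q1 = 1.0
    else:                                   # P[OR of l unseen entries = 1]
        q1 = 1 - (1 - nu_over_k) ** len(x_dif)
    p_den = q1 * ((1 - rho) if y == 1 else rho) \
            + (1 - q1) * ((1 - rho) if y == 0 else rho)
    return np.log(p_num / p_den)

# A candidate set's statistic in (70) is the sum of info_density over tests:
# i_n = sum(info_density(X[i, s_dif], X[i, s_eq], Y[i], rho, nu/k)
#           for i in range(n))
```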
3) Useful existing results: We build heavily on several intermediate results given in [17], stated as follows:
Initial bounds. Since the analysis is the same for any defective set $s$ of cardinality $k$, we assume without loss of generality that $s = \{1,\dots,k\}$. The initial non-asymptotic bound of [8] takes the form
P_e(d_{\max}) \le \mathbb{P}\bigg[ \bigcup_{(s_{\mathrm{dif}},s_{\mathrm{eq}})\,:\,|s_{\mathrm{dif}}| > d_{\max}} \bigg\{ \imath^n(\mathbf{X}_{s_{\mathrm{dif}}}; \mathbf{Y}|\mathbf{X}_{s_{\mathrm{eq}}}) \le \log\binom{p-k}{|s_{\mathrm{dif}}|} + \log\Big( \frac{k}{\delta_1}\binom{k}{|s_{\mathrm{dif}}|} \Big) \bigg\} \bigg] + \delta_1  (71)
for any $\delta_1 > 0$. A simple consequence of this non-asymptotic bound is the following: For any positive constants $\{\delta_{2,\ell}\}_{\ell=d_{\max}+1}^{k}$, if the number of tests is at least
n \ge \max_{\ell=d_{\max}+1,\dots,k} \frac{\log\binom{p-k}{\ell} + \log\big( \frac{k}{\delta_1}\binom{k}{\ell} \big)}{I_{\ell}(1-\delta_{2,\ell})},  (72)
and if each information density satisfies a concentration bound of the form
\mathbb{P}\big[ \imath^n(\mathbf{X}_{s_{\mathrm{dif}}}; \mathbf{Y}|\mathbf{X}_{s_{\mathrm{eq}}}) \le n(1-\delta_{2,\ell}) I_{\ell} \big] \le \psi_{\ell}(n, \delta_{2,\ell}),  (73)
for some functions $\{\psi_{\ell}\}_{\ell=d_{\max}+1}^{k}$, then
P_e(d_{\max}) \le \sum_{\ell=d_{\max}+1}^{k} \binom{k}{\ell} \psi_{\ell}(n, \delta_{2,\ell}) + \delta_1.  (74)
Characterization of mutual information. Under the symmetric noise model with crossover probability $\rho \in \big(0,\frac{1}{2}\big)$, the conditional mutual information behaves as follows:
- If $\frac{\ell}{k} \to 0$, then
I_{\ell} = e^{-\nu} \nu \frac{\ell}{k} (1-2\rho) \log\frac{1-\rho}{\rho}\, (1+o(1)).  (75)
- If $\frac{\ell}{k} \to \alpha \in (0,1]$, then
I_{\ell} = e^{-(1-\alpha)\nu} \big( H_2(e^{-\alpha\nu} \star \rho) - H_2(\rho) \big) (1+o(1)),  (76)
where $a \star \rho := a(1-\rho) + (1-a)\rho$ denotes binary convolution.
Concentration bounds. The following concentration bounds provide explicit choices for $\psi_{\ell}$ satisfying (73):
- For all $\ell$ and $\delta > 0$, we have
\mathbb{P}\big[ \imath^n(\mathbf{X}_{s_{\mathrm{dif}}}; \mathbf{Y}|\mathbf{X}_{s_{\mathrm{eq}}}) \le n(I_{\ell} - \delta) \big] \le 2\exp\bigg( -\frac{\delta^2 n}{4(8+\delta)} \bigg)  (77)
for all $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ with $|s_{\mathrm{dif}}| = \ell$.
- If $\frac{\ell}{k} \to 0$, then for any $\epsilon > 0$ and $\delta_2 > 0$ (not depending on $p$), the following holds for sufficiently large $p$:
\mathbb{P}\big[ \imath^n(\mathbf{X}_{s_{\mathrm{dif}}}; \mathbf{Y}|\mathbf{X}_{s_{\mathrm{eq}}}) \le n I_{\ell}(1-\delta_2) \big] \le \exp\bigg( -\frac{n \frac{\ell}{k} e^{-\nu}\nu\, \delta_2^2 (1-2\rho)^2}{2\big(1+\frac{1}{3}\delta_2(1-2\rho)\big)} (1-\epsilon) \bigg)  (78)
for all $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ with $|s_{\mathrm{dif}}| = \ell$.
With these tools in place, we proceed by obtaining an explicit bound on the number of tests for the case $d_{\max} = \Theta(k^{\gamma})$.
4) Bounding the error probability: We split the summation over $\ell$ in (74) into two terms:
T_1 := \sum_{\ell=d_{\max}+1}^{k/\log k} \binom{k}{\ell} \psi_{\ell}(n, \delta_2^{(1)}), \qquad T_2 := \sum_{\ell=k/\log k}^{k} \binom{k}{\ell} \psi_{\ell}(n, \delta_2^{(2)}),  (79)
where we have let $\delta_{2,\ell}$ equal a given value $\delta_2^{(1)} \in (0,1)$ for all $\ell$ in the first sum, and a different value $\delta_2^{(2)} \in (0,1)$ for all $\ell$ in the second sum.
To bound $T_1$, we consider $\psi_{\ell}(n,\delta_2)$ equaling the right-hand side of (78). Letting $c(\delta_2) = \frac{e^{-\nu}\nu\,\delta_2^2(1-2\rho)^2}{2(1+\frac{1}{3}\delta_2(1-2\rho))}$ for brevity (the factor $1-\epsilon$ in (78) is absorbed, since $\epsilon$ is arbitrarily small), we have
T_1 \le k \max_{\ell=d_{\max}+1,\dots,k/\log k} \binom{k}{\ell} e^{-n\cdot\frac{\ell}{k}\cdot c(\delta_2^{(1)})},  (80)
where we have upper bounded the summation defining $T_1$ by $k$ times the maximum. Re-arranging, we find that in order to attain $T_1 \le \delta_1$, it suffices that
n \ge \max_{\ell=d_{\max}+1,\dots,k/\log k} \frac{1}{c(\delta_2^{(1)})} \cdot \frac{k}{\ell} \cdot \bigg( \log\binom{k}{\ell} + \log\frac{k}{\delta_1} \bigg).  (81)
Writing $\log\binom{k}{\ell} = \ell\log\frac{k}{\ell}(1+o(1))$, this simplifies to
n \ge \max_{\ell=d_{\max}+1,\dots,k/\log k} \frac{1}{c(\delta_2^{(1)})} \cdot \bigg( k\log\frac{k}{\ell} + \frac{k}{\ell}\log\frac{k}{\delta_1} \bigg)  (82)
= \frac{1}{c(\delta_2^{(1)})} \cdot (1-\gamma) k\log k\, (1+o(1)),  (83)
since the maximum is achieved by the smallest value $\ell = d_{\max}+1 = \Theta(k^{\gamma})$, and for that value, the second term is asymptotically negligible compared to the first. Substituting the definition of $c(\cdot)$, we obtain the condition
n \ge \frac{2\big(1+\frac{1}{3}\delta_2(1-2\rho)\big)}{e^{-\nu}\nu\,\delta_2^2(1-2\rho)^2} \cdot (1-\gamma) k\log k\, (1+o(1)).  (84)
To bound $T_2$, we consider $\psi_{\ell}(n,\delta_2)$ equaling the right-hand side of (77) with $\delta = \delta_2 I_{\ell}$. Again upper bounding the summation by $k$ times the maximum, and defining $c'(\delta_2) = \frac{\delta_2^2}{4(8+\delta_2 I_{\ell})}$, we obtain
T_2 \le k \max_{\ell=k/\log k,\dots,k} \binom{k}{\ell} \cdot 2\exp\big( -c'(\delta_2) I_{\ell}^2 n \big).  (85)
It follows that in order to attain $T_2 \le \delta_1$, it suffices that
n \ge \frac{1}{c'(\delta_2^{(2)})\, I_{\ell}^2} \log\frac{2k\binom{k}{\ell}}{\delta_1}.  (86)
By the mutual information characterizations in (75)–(76), we have $c'(\delta_2^{(2)}) = \Theta(1)$ for any $\delta_2^{(2)} \in (0,1)$, and $I_{\ell}^2 = \Theta\big(\big(\frac{\ell}{k}\big)^2\big)$. By also writing $\log\frac{2k\binom{k}{\ell}}{\delta_1} = \Theta\big(\ell\log\frac{k}{\ell}\big)$, we find that (86) takes the form $n = \Omega\big(\frac{k^2}{\ell}\log\frac{k}{\ell}\big)$. The most stringent condition is then provided by the smallest value $\ell = \frac{k}{\log k}$, yielding $n = \Omega(k\cdot\log\log k\cdot\log k)$. Thus, $T_2$ vanishes for any scaling of the form $n = \Omega\big(k\log\frac{p}{k}\big)$, since $\log\frac{p}{k} = \Theta(\log p)$ in the sub-linear regime $k = \Theta(p^{\theta})$ with $\theta \in (0,1)$.
5) Characterizing the mutual information based condition (72): Recall that we require the number of tests to satisfy (72). For the values of $\ell$ corresponding to $T_1$ in (79), we have chosen $\delta_{2,\ell} = \delta_2^{(1)}$, and the mutual information characterization (75) yields the condition
n \ge \max_{\ell=d_{\max}+1,\dots,k/\log k} \frac{k\log\frac{p}{\ell} + k\log\frac{k}{\ell} + \frac{k}{\ell}\log\frac{k}{\delta_1}}{e^{-\nu}\nu(1-2\rho)\log\frac{1-\rho}{\rho}\,(1-\delta_2^{(1)})} (1+o(1)),  (87)
where we have applied $\log\binom{p-k}{\ell} = \ell\log\frac{p}{\ell}(1+o(1))$ and $\log\binom{k}{\ell} = \ell\log\frac{k}{\ell}(1+o(1))$ for $\ell = o(k)$. Writing $k\log\frac{p}{\ell} + k\log\frac{k}{\ell} = k\log\frac{p}{k} + k\log\frac{k^2}{\ell^2}$ and recalling that $k = \Theta(p^{\theta})$ and $d_{\max} = \Theta(k^{\gamma})$, we find that (87) simplifies to
n \ge \frac{(1-\theta)k\log p + 2(1-\gamma)k\log k}{e^{-\nu}\nu(1-2\rho)\log\frac{1-\rho}{\rho}\,(1-\delta_2^{(1)})} (1+o(1)),  (88)
since the maximum over $\ell$ is achieved by the smallest value, $\ell = d_{\max}+1 = \Theta(k^{\gamma})$.
For the $\ell$ values corresponding to $T_2$ in (79), the condition (72) was already simplified in [8]. It was shown that under the choice $\nu = \log 2$, the dominant condition is that of the highest value, $\ell = k$, and the resulting condition on the number of tests is
n \ge \frac{k\log\frac{p}{k}}{(\log 2 - H_2(\rho))(1-\delta_2^{(2)})} (1+o(1)).  (89)
6) Wrapping up: We obtain the final condition on $n$ by combining (84), (88), and (89). We take $\delta_2^{(2)}$ to be arbitrarily small, while renaming $\delta_2^{(1)}$ to $\delta_2$ and letting it remain a free parameter. Also recalling the choice $\nu = \log 2$, we obtain the following generalization of the partial recovery bound given in [8].

Theorem 8. Under the symmetric noise model (2), in the regime $k = \Theta(p^\theta)$ and $d_{\max} = \Theta(k^\gamma)$ with $\theta, \gamma \in (0,1)$, there exists a non-adaptive group testing algorithm such that $P_e \to 0$ as $p \to \infty$ with a number of tests satisfying
$$n \le \inf_{\delta_2 \in (0,1)} \max\big\{ n_{\mathrm{MI},1},\; n_{\mathrm{MI},2}(\gamma, \delta_2),\; n_{\mathrm{Conc}}(\gamma, \delta_2) \big\}\,(1+o(1)), \quad (90)$$
where $n_{\mathrm{MI},1}$, $n_{\mathrm{MI},2}$, and $n_{\mathrm{Conc}}$ are defined in (21)–(23).
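For intuition on the bound (90), the sketch below evaluates it numerically, assuming (as the derivation above suggests) that $n_{\mathrm{MI},1}$, $n_{\mathrm{MI},2}$, and $n_{\mathrm{Conc}}$ take the forms in (89), (88), and (84) with $\nu = \log 2$; since (21)–(23) are not restated in this appendix, that correspondence is an assumption of the sketch.

```python
import numpy as np

def H2(x):
    return -x * np.log(x) - (1 - x) * np.log(1 - x)  # binary entropy in nats

def tests_bound(p, theta, gamma, rho, grid=200):
    """Numerical evaluation of (90), ignoring the (1 + o(1)) factor."""
    k = p ** theta
    nu = np.log(2.0)
    c_mi = np.exp(-nu) * nu * (1 - 2 * rho) * np.log((1 - rho) / rho)
    n_mi1 = k * np.log(p / k) / (np.log(2.0) - H2(rho))   # (89), with delta_2^(2) -> 0
    best = np.inf
    for d2 in np.linspace(1e-3, 1 - 1e-3, grid):          # free parameter delta_2
        n_mi2 = k * ((1 - theta) * np.log(p)
                     + 2 * (1 - gamma) * np.log(k)) / (c_mi * (1 - d2))  # (88)
        n_conc = (2 * (1 + d2 * (1 - 2 * rho) / 3)
                  / (np.exp(-nu) * nu * d2 ** 2 * (1 - 2 * rho) ** 2)
                  * (1 - gamma) * k * np.log(k))          # (84)
        best = min(best, max(n_mi1, n_mi2, n_conc))
    return best

print(tests_bound(p=1e6, theta=0.3, gamma=0.5, rho=0.01))
```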
Variation for the Z-channel. For general $\gamma \in (0,1)$, the preceding analysis is non-trivial to extend to the Z-channel noise model, which we consider in Section V. However, it is relatively easy to obtain a partial recovery result for the case $d_{\max} = \Theta(k)$, and such a result suffices for our purposes. We outline the required changes here. We continue to assume that the test matrix $\mathsf{X}$ is i.i.d. Bernoulli, but now the probability of a given entry being one is $\frac{\nu}{k}$ for some $\nu > 0$ to be chosen later.

As was observed in [8], the analysis is considerably simplified by the fact that we do not need to consider the case $\ell/k \to 0$. This means that we can rely exclusively on (77), which is known to hold for any binary-output noise model [8]. Consequently, one finds that the only requirement on $n$ is that (72) holds, with the conditional mutual information $I_\ell = I(X_{s_{\mathrm{dif}}}; Y | X_{s_{\mathrm{eq}}})$ suitably modified to account for the different noise model. By some asymptotic simplifications and the fact that $\ell = \Theta(k)$ for all $\ell$ under consideration, this condition simplifies to
$$n \ge \max_{\ell > d_{\max}} \frac{\ell \log\frac{p}{k}}{I_\ell}\,(1+o(1)). \quad (91)$$
Next, we note that an early result of Malyutov and Mateev [13] (see also [36]) implies that $\frac{\ell}{I_\ell}$ is maximized at $\ell = k$. For completeness, we provide a short proof. Assuming without loss of generality that $s = \{1, \ldots, k\}$, and letting $X_j^{j'}$ denote the collection $(X_j, \ldots, X_{j'})$ for indices $1 \le j \le j' \le k$, we have
$$\frac{I_\ell}{\ell} = \frac{1}{\ell}\, I\big(X_{k-\ell+1}^{k}; Y \,\big|\, X_1^{k-\ell}\big) \quad (92)$$
$$= \frac{1}{\ell} \sum_{j=k-\ell+1}^{k} I\big(X_j; Y \,\big|\, X_1^{j-1}\big) \quad (93)$$
$$= \frac{1}{\ell} \sum_{j=k-\ell+1}^{k} \Big( H(X_j) - H\big(X_j \,\big|\, Y, X_1^{j-1}\big) \Big), \quad (94)$$
where (92) follows since $I_\ell = I(X_{s_{\mathrm{dif}}}; Y | X_{s_{\mathrm{eq}}})$ only depends on the sets $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ through their cardinalities, (93) follows from the chain rule for mutual information, and (94) follows since $X_j$ is independent of $X_1^{j-1}$. We establish the desired claim by observing that $\frac{I_\ell}{\ell}$ is decreasing in $\ell$: The term $H(X_j)$ is the same for all $j$, whereas the term $H(X_j | Y, X_1^{j-1})$ is smaller for higher values of $j$ because conditioning reduces entropy.
Using this observation, the condition in (91) simplifies to
$$n \ge \frac{k \log\frac{p}{k}}{I_k}\,(1+o(1)). \quad (95)$$
We can further replace $I_k = I(X_s; Y)$ by the capacity of the Z-channel upon optimizing the i.i.d. Bernoulli parameter $\nu > 0$. The optimal value is the one that makes $\mathbb{P}\big[\bigvee_{j \in s} X_j = 1\big]$ the same as $P_U^*(1)$, where $P_U^*$ is the capacity-achieving input distribution of the Z-channel $P_{Y|U}$ (in fact, this analysis applies to any binary channel $P_{Y|U}$).
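As an illustration of this last step, the sketch below numerically finds the capacity-achieving input probability $P_U^*(1)$ of a binary channel and the matching $\nu$ solving $1 - e^{-\nu} = P_U^*(1)$. The Z-channel orientation in the example (defective-containing tests flipped to negative with probability $\rho$) is an assumed convention, since the definition from Section V is not restated here; as just noted, the same computation applies to any binary channel.

```python
import numpy as np

def h(x):
    """Binary entropy in nats."""
    x = np.clip(x, 1e-12, 1 - 1e-12)
    return -x * np.log(x) - (1 - x) * np.log(1 - x)

def mutual_info(q, p1_given0, p1_given1):
    """I(U; Y) in nats for U ~ Bernoulli(q) through a binary channel."""
    py1 = (1 - q) * p1_given0 + q * p1_given1
    return h(py1) - (1 - q) * h(p1_given0) - q * h(p1_given1)

def capacity_input(p1_given0, p1_given1, grid=10**5):
    """Grid search for the capacity-achieving input probability P*_U(1)."""
    qs = np.linspace(0.0, 1.0, grid)
    return qs[np.argmax(mutual_info(qs, p1_given0, p1_given1))]

rho = 0.1
q_star = capacity_input(p1_given0=0.0, p1_given1=1 - rho)  # assumed Z-channel convention
nu_star = -np.log(1 - q_star)   # solves 1 - e^{-nu} = P*_U(1)
print(q_star, nu_star)
```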
B. Partial Recovery Result with Unknown k
In this section, we explain how to adapt the partial recovery analysis of [8] (as well as that of Appendix A for $d_{\max} = \Theta(k^\gamma)$) to the case that $k$ is only known to lie within a certain interval $\mathcal{K}$ of length $\Delta = o(d_{\max})$, where $d_{\max}$ is the partial recovery threshold. Specifically, we argue that for any defective set $s$ with $|s| \in \mathcal{K}$, there exists a decoder that knows $\mathcal{K}$ but not $|s|$, such that the error probability $\mathbb{P}[\hat{S} \ne s \,|\, S = s]$ vanishes under i.i.d. Bernoulli testing, with the same requirement on $n$ as in the case of known $|s|$. Of course, this also implies that $\mathbb{P}[\hat{S} \ne S]$ vanishes under any prior distribution on $S$ such that $|S| \in \mathcal{K}$ almost surely.
We consider the same non-adaptive setup as in Appendix A, denoting the test matrix by $\mathsf{X} \in \{0,1\}^{n \times p}$ and making extensive use of the information densities defined in (67)–(68). Since $k := |s|$ is unknown, we can no longer assume that the test matrix is i.i.d. with distribution $P_X \sim \mathrm{Bernoulli}\big(\frac{\nu}{k}\big)$, so we instead use $P_X \sim \mathrm{Bernoulli}\big(\frac{\nu}{k_{\max}}\big)$, with $k_{\max}$ equaling the maximum value in $\mathcal{K}$.
In the case of known $k$, we considered the decoder in (70), first proposed in [8]. In the present setting, we modify the decoder to consider all possible $k$, and to allow $s_{\mathrm{dif}} \cup s_{\mathrm{eq}}$ to be a strict subset of $s$. More specifically, the decoder is defined as follows. For any pair $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ such that $|s_{\mathrm{dif}} \cup s_{\mathrm{eq}}|$ equals some constant $k'$, let $\imath^n_{k'}(x_{s_{\mathrm{dif}}}; y | x_{s_{\mathrm{eq}}})$ be the information density corresponding to the case that the defective set equals $s_{\mathrm{dif}} \cup s_{\mathrm{eq}}$, with an explicit dependence on the cardinality $k'$. We consider a decoder that searches over all $s \subseteq \{1, \ldots, p\}$ whose cardinality is in $\mathcal{K}$, and seeks a set such that
$$\imath^n_{k'}(X_{s_{\mathrm{dif}}}; Y | X_{s_{\mathrm{eq}}}) \ge \gamma_{k',\ell}, \qquad \forall (s_{\mathrm{dif}}, s_{\mathrm{eq}}) \in \tilde{\mathcal{S}}_s, \quad (96)$$
where $\{\gamma_{k',\ell}\}$ is a set of constants depending on $k' := |s_{\mathrm{dif}} \cup s_{\mathrm{eq}}|$ and $\ell := |s_{\mathrm{dif}}|$, and $\tilde{\mathcal{S}}_s$ is the set of pairs $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ satisfying the following (a schematic implementation is sketched after this list):
1. $s_{\mathrm{dif}} \subseteq s$ and $s_{\mathrm{eq}} \subseteq s$ are disjoint;
2. The total cardinality $k' = |s_{\mathrm{dif}} \cup s_{\mathrm{eq}}|$ lies in $\mathcal{K}$;
3. The "distance" $\ell + k - k'$ exceeds $d_{\max}$. Specifically, if $s$ is the true defective set and $\hat{s}$ is some estimate of cardinality $k' \le k$ with $s \cap \hat{s} = s_{\mathrm{eq}}$ and $|s_{\mathrm{eq}}| = k' - \ell$, then we have $\ell + k - k'$ false negatives and $\ell$ false positives, so that $d(s, \hat{s}) = \ell + k - k'$ under the distance function in (5).

If multiple $s$ satisfy (96), then the one with the smallest cardinality $k := |s|$ is chosen, with any remaining ties broken arbitrarily. If none of the $s$ satisfy (96), an error is declared.
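As promised above, the following is a schematic, exponential-time Python rendering of the decoder; it is purely illustrative, and `info_density` and `gamma` are hypothetical placeholders for the information densities $\imath^n_{k'}$ and the thresholds $\gamma_{k',\ell}$.

```python
from itertools import combinations

def tilde_S(s, K, d_max):
    """Enumerate the pairs (s_dif, s_eq) in the set S~_s (conditions 1-3)."""
    s, k = sorted(s), len(s)
    for k_prime in K:                                    # condition 2
        if k_prime > k:
            continue
        for union in combinations(s, k_prime):           # s_dif and s_eq inside s
            for ell in range(1, k_prime + 1):
                if ell + k - k_prime <= d_max:           # condition 3
                    continue
                for s_dif in combinations(union, ell):   # condition 1 (disjoint split)
                    yield s_dif, tuple(x for x in union if x not in s_dif)

def passes_test(s, K, d_max, X, Y, info_density, gamma):
    """Threshold test (96) for a candidate set s."""
    return all(info_density(s_dif, s_eq, X, Y)
               >= gamma[len(s_dif) + len(s_eq), len(s_dif)]
               for s_dif, s_eq in tilde_S(s, K, d_max))

def decode(K, d_max, p, X, Y, info_density, gamma):
    """Return the smallest-cardinality s with |s| in K passing (96), else None."""
    for k in sorted(K):                                  # smallest cardinality first
        for s in combinations(range(p), k):              # remaining ties: arbitrary order
            if passes_test(s, K, d_max, X, Y, info_density, gamma):
                return s
    return None  # error declared
```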
Under this decoder, an error occurs if the true defective set $s$ fails the threshold test (96), or if some $s'$ with $|s'| \le |s|$ and $d(s, s') > d_{\max}$ passes it. By the union bound, the first of these events occurs with probability at most
$$P_e^{(1)}(s, d_{\max}) \le \sum_{(k',\ell)\,:\, k' \in \mathcal{K},\, \ell \le k' \le k,\, \ell + k - k' > d_{\max}} \binom{k}{k'} \binom{k'}{\ell}\, \mathbb{P}\big[ \imath^n_{k'}(X_{s_{\mathrm{dif}}}; Y | X_{s_{\mathrm{eq}}}) < \gamma_{k',\ell} \big], \quad (97)$$
where $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ is an arbitrary pair with $|s_{\mathrm{dif}}| = \ell$ and $|s_{\mathrm{dif}} \cup s_{\mathrm{eq}}| = k'$. Here the combinatorial terms arise from choosing $k'$ elements of $s$ to form $s_{\mathrm{dif}} \cup s_{\mathrm{eq}}$, and then choosing $\ell$ of those elements to form $s_{\mathrm{dif}}$.
As for the probability of some incorrect $s'$ passing the threshold test, we have the following. Let $s_{\mathrm{eq}} = s' \cap s$ and $s_{\mathrm{dif}} = s' \setminus s$. Since only sets with $|s'| \le |s|$ can cause errors, $k' := |s'| = |s_{\mathrm{eq}} \cup s_{\mathrm{dif}}|$ is upper bounded by $k$, and since only sets with $d(s, s') > d_{\max}$ can cause errors, we can also assume that this holds. Defining $\ell = |s_{\mathrm{dif}}|$, we can upper bound the probability of $s'$ passing the test (96) for all $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ by the probability of passing it for this specific pair $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$. By doing so, and summing over all possible $s'$, we find that the second error event is upper bounded as follows for any given $s$:
$$P_e^{(2)}(s, d_{\max}) \le \sum_{(k',\ell)\,:\, k' \in \mathcal{K},\, \ell \le k' \le k,\, \ell + k - k' > d_{\max}} \binom{p-k}{\ell} \binom{k}{k'-\ell}\, \mathbb{P}\big[ \imath^n_{k'}(X_{s_{\mathrm{dif}}}; Y | X_{s_{\mathrm{eq}}}) \ge \gamma_{k',\ell} \big], \quad (98)$$
where the combinatorial terms correspond to choosing $\ell$ elements of $\{1, \ldots, p\} \setminus s$ to form $s_{\mathrm{dif}}$, and choosing $k' - \ell$ elements of $s$ to form $s_{\mathrm{eq}}$.
Combining the above, the overall upper bound on the error probability given $s$ is
$$P_e(s) \le P_e^{(1)}(s, d_{\max}) + P_e^{(2)}(s, d_{\max}). \quad (99)$$
Upon substituting the upper bounds in (97) and (98), we obtain an expression that is nearly the same as that for the case that $k$ is known [8], except that we sum over a number of different $k'$, rather than only $k' = k$. We proceed by arguing that this does not affect the final bound, as long as $d_{\max} = \Theta(k^\gamma)$ for some $\gamma \in (0,1]$, and $\Delta = o(d_{\max})$ (recall that $\Delta$ is the largest possible difference between two candidate values of $k$).
The main additional difficulty here is that the information density $\imath_{k'}(x_{s_{\mathrm{dif}}}; y | x_{s_{\mathrm{eq}}}) = \log\frac{P_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}(y | x_{s_{\mathrm{dif}}}, x_{s_{\mathrm{eq}}})}{P_{Y|X_{s_{\mathrm{eq}}}}(y | x_{s_{\mathrm{eq}}})}$ is defined with respect to $(s_{\mathrm{eq}}, s_{\mathrm{dif}})$ of total cardinality $k'$, whereas the output variables $Y$ are distributed according to the true model in which there are $k$ defectives. The following lemma allows us to perform a change of measure to circumvent this issue.
Lemma 4. Fix a defective set $s$ of cardinality $k$, let $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ be disjoint subsets of $s$ with total cardinality $k' \le k$, and let $P^{(k)}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}$ be the conditional probability of $Y$ given the partial test vector $(X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}})$, in the case of a test vector with i.i.d. $\mathrm{Bernoulli}\big(\frac{\nu}{k_{\max}}\big)$ entries, where $k_{\max} = k(1+o(1))$. Similarly, let $P^{(k')}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}$ denote the conditional transition law when $s' = s_{\mathrm{dif}} \cup s_{\mathrm{eq}}$ is the true defective set. Then, if $|k - k'| \le \Delta = o(k)$, we have
$$\max_{x_{s_{\mathrm{dif}}}, x_{s_{\mathrm{eq}}}, y} \frac{P^{(k)}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}(y | x_{s_{\mathrm{dif}}}, x_{s_{\mathrm{eq}}})}{P^{(k')}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}(y | x_{s_{\mathrm{dif}}}, x_{s_{\mathrm{eq}}})} \le 1 + O\Big(\frac{\Delta}{k}\Big). \quad (100)$$
Consequently, the corresponding $n$-letter product distributions $P^{(k)}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}$ and $P^{(k')}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}$ for conditionally independent observations satisfy the following:
$$\max_{\mathbf{x}_{s_{\mathrm{dif}}}, \mathbf{x}_{s_{\mathrm{eq}}}, \mathbf{y}} \frac{P^{(k)}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}(\mathbf{y} | \mathbf{x}_{s_{\mathrm{dif}}}, \mathbf{x}_{s_{\mathrm{eq}}})}{P^{(k')}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}(\mathbf{y} | \mathbf{x}_{s_{\mathrm{dif}}}, \mathbf{x}_{s_{\mathrm{eq}}})} \le e^{O(\frac{n\Delta}{k})}. \quad (101)$$
Proof. First observe that if $x_{s_{\mathrm{dif}}}$ or $x_{s_{\mathrm{eq}}}$ contains an entry equal to one, then the ratio in (100) equals one, as $Y = 1$ with probability $1 - \rho$ in either case. Hence, it suffices to prove the claim for $x_{s_{\mathrm{dif}}}$ and $x_{s_{\mathrm{eq}}}$ having all entries equal to zero. In the denominator, we have
$$P^{(k')}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}(1 | x_{s_{\mathrm{dif}}}, x_{s_{\mathrm{eq}}}) = \rho, \quad (102)$$
since in this case $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ corresponds to the entire defective set. On the other hand, in the numerator, there are $k - k'$ additional defective items, and the probability of one or more of them being included in the test is $\epsilon := 1 - \big(1 - \frac{\nu}{k_{\max}}\big)^{k - k'} = O\big(\frac{\Delta}{k_{\max}}\big)$, where we applied the assumptions $|k - k'| \le \Delta = o(k)$ and $k_{\max} = k(1+o(1))$, along with some asymptotic simplifications. Therefore, we have
$$P^{(k)}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}(1 | x_{s_{\mathrm{dif}}}, x_{s_{\mathrm{eq}}}) = (1 - \epsilon)\rho + \epsilon(1 - \rho) \quad (103)$$
$$= \rho + \epsilon(1 - 2\rho). \quad (104)$$
The ratio of (104) to (102) evaluates to $1 + O(\epsilon)$, and similarly for the conditional probabilities of $Y = 0$ obtained by taking one minus the right-hand sides. Since $\epsilon = O\big(\frac{\Delta}{k}\big)$, this proves (100).

We obtain (101) by raising the right-hand side of (100) to the power of $n$, and applying $1 + \alpha \le e^\alpha$.
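A quick numerical check of the single-test ratio (100), computed directly via (102)–(104); the parameter values below are arbitrary illustrations.

```python
import numpy as np

def ratio(k, k_prime, nu, rho, k_max=None):
    """Ratio of (104) to (102) for the all-zeros partial test vector."""
    k_max = k if k_max is None else k_max        # k_max = k(1 + o(1))
    eps = 1 - (1 - nu / k_max) ** (k - k_prime)  # P(some extra defective is tested)
    return (rho + eps * (1 - 2 * rho)) / rho     # = 1 + O(eps) = 1 + O(Delta/k)

k, Delta = 10**4, 50
print(ratio(k, k - Delta, nu=np.log(2), rho=0.11))  # close to 1 + O(Delta/k)
```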
We now show how to use Lemma 4 to bound $P_e^{(1)}(s, d_{\max})$ and $P_e^{(2)}(s, d_{\max})$. Starting with the former, we observe that $Y$ in (97) is conditionally distributed according to $P^{(k)}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}$, and hence, (101) yields
$$\mathbb{P}\big[ \imath^n_{k'}(X_{s_{\mathrm{dif}}}; Y | X_{s_{\mathrm{eq}}}) < \gamma_{k',\ell} \big] \le e^{O(\frac{n\Delta}{k})} \cdot \mathbb{P}\big[ \imath^n_{k'}(X_{s_{\mathrm{dif}}}; \tilde{Y} | X_{s_{\mathrm{eq}}}) < \gamma_{k',\ell} \big], \quad (105)$$
where $\tilde{Y}$ is conditionally distributed according to $P^{(k')}_{Y|X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}}$.
For $P_e^{(2)}(s, d_{\max})$, we first note that a bound similar to (101) holds when we condition on $X_{s_{\mathrm{eq}}}$ alone; this is seen by simply moving the denominator to the right-hand side and averaging over $X_{s_{\mathrm{dif}}}$ on both sides. Since $Y$ in (98) is conditionally distributed according to $P^{(k)}_{Y|X_{s_{\mathrm{eq}}}}$, we obtain from (101) that
$$\mathbb{P}\big[ \imath^n_{k'}(X_{s_{\mathrm{dif}}}; Y | X_{s_{\mathrm{eq}}}) \ge \gamma_{k',\ell} \big] \le e^{O(\frac{n\Delta}{k})} \cdot \mathbb{P}\big[ \imath^n_{k'}(X_{s_{\mathrm{dif}}}; \tilde{Y} | X_{s_{\mathrm{eq}}}) \ge \gamma_{k',\ell} \big], \quad (106)$$
where $\tilde{Y}$ is conditionally distributed according to $P^{(k')}_{Y|X_{s_{\mathrm{eq}}}}$.
Next, observe that if the number of tests satisfies $n = O(k \log p)$, then we can simplify the term $e^{O(\frac{n\Delta}{k})}$ to $e^{O(\Delta \log p)}$. By doing so, and substituting (105) and (106) into (97)–(99), we obtain
$$P_e(s) \le e^{O(\Delta \log p)} \sum_{(k',\ell)\,:\, k' \in \mathcal{K},\, \ell \le k' \le k,\, \ell + k - k' > d_{\max}} \binom{k}{k'} \binom{k'}{\ell}\, \mathbb{P}\big[ \imath^n_{k'}(X_{s_{\mathrm{dif}}}; \tilde{Y} | X_{s_{\mathrm{eq}}}) < \gamma_{k',\ell} \big] \quad (107)$$
$$+\; e^{O(\Delta \log p)} \sum_{(k',\ell)\,:\, k' \in \mathcal{K},\, \ell \le k' \le k,\, \ell + k - k' > d_{\max}} \binom{p-k}{\ell} \binom{k}{k'-\ell}\, \mathbb{P}\big[ \imath^n_{k'}(X_{s_{\mathrm{dif}}}; \tilde{Y} | X_{s_{\mathrm{eq}}}) \ge \gamma_{k',\ell} \big]. \quad (108)$$
This bound is now of a similar form to that analyzed in [8], in the sense that the joint distributions of the tests and outcomes match those that define the information density. The only differences are the presence of additional $k'$ values beyond only $k' = k$, and the presence of the $e^{O(\Delta \log p)}$ terms. We conclude by explaining why these differences do not impact the final result as long as $\Delta = o(d_{\max})$ with $d_{\max} = \Theta(k^\gamma)$ for some $\gamma \in (0,1]$:

• The term $\binom{p-k}{\ell}$ satisfies $\log\binom{p-k}{\ell} = \ell \log\frac{p}{\ell}(1+o(1))$, and the assumption $|k - k'| \le \Delta = o(d_{\max}) = o(k)$ implies that the term $\binom{k'}{\ell}$ satisfies $\log\binom{k'}{\ell} = \ell \log\frac{k}{\ell}(1+o(1))$. On the other hand, the logarithm of $e^{O(\Delta \log p)}$ is $O(\Delta \log p)$, so it is dominated by the other combinatorial terms due to the fact that $\Delta = o(d_{\max})$ and $\ell = \Omega(d_{\max})$. Similarly, the term $\binom{k}{k'} = \binom{k}{k-k'}$ satisfies $\log\binom{k}{k'} = O(\Delta \log k)$, and is dominated by $\binom{k'}{\ell}$.

• The term $\binom{k}{k'-\ell}$ equals $\binom{k}{k-k'+\ell}$, which satisfies $\log\binom{k}{k-k'+\ell} = \log\binom{k}{\ell}\,(1+o(1))$ (by the assumption $\Delta = o(d_{\max})$), and hence, the asymptotic behavior for any $k'$ is the same as that of $\binom{k}{k-\ell}$, the term corresponding to $k' = k$. Similarly, the asymptotics of the tail probabilities of the information densities are unaffected by switching from $k$ to $k' = k(1+o(1))$.

• In [8], the number of values of $\ell$ being summed over is upper bounded by $k$, whereas here we can upper bound the number of pairs $(k', \ell)$ being summed over by $k\Delta$. Since $\Delta = o(k)$, this simplifies to $k^{1+o(1)}$. Since it is the logarithm of this term that appears in the final expression, this difference only amounts to a multiplication by $1 + o(1)$.
C. NCOMP with Unknown Number of Defectives

Chan et al. [14] showed that Noisy Combinatorial Orthogonal Matching Pursuit (NCOMP), used in conjunction with i.i.d. Bernoulli test matrices, ensures exact recovery of a defective set $S$ of cardinality $k$ with high probability under the scaling $n = O(k \log p)$, which in turn behaves as $O\big(k \log\frac{p}{k}\big)$ when $k = O(p^\theta)$ for some $\theta < 1$. However, the random test design and the decoding rule in [14] assume knowledge of $k$, meaning the result cannot immediately be used for our purposes in Step 2 of Algorithm 1. In this section, we modify the algorithm and analysis of [14] to handle the case that $k$ is only known up to a constant factor.

Suppose that $k \in [c_0 k_{\max}, k_{\max}]$ for some $k_{\max} = \Theta(p^\theta)$, where $c_0 \in (0,1)$ and $\theta \in (0,1)$ do not depend on $p$. We adopt a Bernoulli design in which each item is independently placed in each test with probability $\frac{\nu}{k_{\max}}$ for fixed $\nu > 0$. It follows that for a given test vector $X = (X_1, \ldots, X_p)$, we have
$$\mathbb{P}\Big[ \bigvee_{j \in S} X_j = 1 \Big] = 1 - \Big(1 - \frac{\nu}{k_{\max}}\Big)^k = \big(1 - e^{-c\nu}\big)(1 + o(1)) \quad (109)$$
for some $c \in [c_0, 1]$, and hence, the corresponding observation $Y$ satisfies
$$\mathbb{P}[Y = 1] = \big( (1-\rho)(1 - e^{-c\nu}) + \rho e^{-c\nu} \big)(1 + o(1)). \quad (110)$$
In contrast, for any $j \in S$, we have
$$\mathbb{P}[Y = 1 \,|\, X_j = 1] = 1 - \rho. \quad (111)$$
The idea of the NCOMP algorithm is the following: For each item $j$, consider the set of tests in which the item is included, and let $N'_j$ denote their total number. If $j$ is defective, we should expect a proportion of roughly $1 - \rho$ of these tests to be positive according to (111), whereas if $j$ is non-defective, we should expect the proportion to be roughly $(1-\rho)(1 - e^{-c\nu}) + \rho e^{-c\nu}$ according to (110). Hence, we set a threshold in between these two values, and declare $j$ to be defective if and only if the proportion of positive tests exceeds that threshold.
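A minimal Python sketch of the modified NCOMP rule just described, under the assumptions of this section. The value $\Delta = 0.15$ and the simulation parameters are illustrative choices of ours, not prescribed by the paper; $\Delta$ must merely be small enough that $\mathbb{P}[Y=1] \le 1 - \rho - 2\Delta$, as formalized in the analysis that follows.

```python
import numpy as np

def ncomp_decode(X, y, rho, Delta):
    """X: n-by-p binary test matrix, y: length-n outcome vector (0/1)."""
    n_j = X.sum(axis=0)                   # N'_j: number of tests including item j
    n_j1 = (X * y[:, None]).sum(axis=0)   # N'_{j,1}: positive tests including j
    frac = np.divide(n_j1, n_j, out=np.zeros_like(n_j, dtype=float), where=n_j > 0)
    return np.flatnonzero(frac >= 1 - rho - Delta)   # threshold rule

# Illustrative simulation under symmetric noise with flip probability rho.
rng = np.random.default_rng(0)
p, k_max, nu, rho = 2000, 20, np.log(2), 0.05
S = np.sort(rng.choice(p, size=15, replace=False))   # k = 15 lies in [c0*k_max, k_max]
n = 2500
X = (rng.random((n, p)) < nu / k_max).astype(int)    # Bernoulli(nu/k_max) design
y = (X[:, S].any(axis=1) ^ (rng.random(n) < rho)).astype(int)
S_hat = ncomp_decode(X, y, rho, Delta=0.15)
print(list(S), list(S_hat))
```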
We first study the behavior of $N'_j$. Under the above Bernoulli test design, we have $N'_j \sim \mathrm{Binomial}\big(n, \frac{\nu}{k_{\max}}\big)$, and hence, standard Binomial concentration [37, Ch. 4] gives
$$\mathbb{P}\Big[ N'_j \le \frac{n\nu}{2 k_{\max}} \Big] \le e^{-\Theta(1) \frac{n}{k_{\max}}} \quad (112)$$
$$\le \frac{1}{p^2}, \quad (113)$$
where (113) holds provided that $n = \Omega(k \log p)$ with a suitably-chosen implied constant (recall that $k = \Theta(k_{\max})$).

Next, we present the modified NCOMP decoding rule, and study its performance under the assumption that $N'_j = n'_j$ with $n'_j \ge \frac{n\nu}{2 k_{\max}}$, for each $j \in \{1, \ldots, p\}$. Observe that the gap between (110) and (111) behaves as $\Theta(1)$ for any $c \in [c_0, 1]$. Hence, for sufficiently small $\Delta > 0$, we have $\mathbb{P}[Y = 1] \le 1 - \rho - 2\Delta$. Accordingly, letting $N'_{j,1}$ be the number of the $N'_j$ tests including $j$ that returned positive, we declare $j$ to be defective if and only if $N'_{j,1} \ge (1 - \rho - \Delta) N'_j$. We then have the following:
• If $j$ is defective, then the probability of incorrectly declaring it to be non-defective given $N'_j = n'_j$ satisfies
$$\mathbb{P}\big[ N'_{j,1} < (1 - \rho - \Delta) n'_j \big] \le e^{-\Theta(1) n'_j} \le e^{-\Theta(1) \frac{n\nu}{2 k_{\max}}}, \quad (114)$$
where the first inequality is standard Binomial concentration, and the second holds for $n'_j \ge \frac{n\nu}{2 k_{\max}}$.

• Similarly, if $j$ is non-defective, the probability of incorrectly declaring it to be defective given $N'_j = n'_j$ satisfies
$$\mathbb{P}\big[ N'_{j,1} \ge (1 - \rho - \Delta) n'_j \big] \le e^{-\Theta(1) n'_j} \le e^{-\Theta(1) \frac{n\nu}{2 k_{\max}}}. \quad (115)$$
Combining these bounds with (113) and a union bound over the $p$ items, the overall error probability $P_e = \mathbb{P}[\hat{S} \ne S]$ of the modified NCOMP algorithm is upper bounded by
$$P_e \le \frac{1}{p} + p\, e^{-\Theta(1) \frac{n\nu}{2 k_{\max}}}. \quad (116)$$
Since $k_{\max} = \Theta(k)$, this vanishes when $n = \Omega(k \log p)$ with a suitably-chosen implied constant, thus establishing the desired result.
Z-channel noise. Under the Z-channel noise model introduced in Section V, the preceding analysis is essentially unchanged. It relied only on there being a constant gap between the probabilities $\mathbb{P}[Y = 1]$ and $\mathbb{P}[Y = 1 \,|\, X_j = 1]$, and this is still the case here: Equations (110) and (111) still hold true when $(1-\rho)(1 - e^{-c\nu}) + \rho e^{-c\nu}$ is replaced by $(1 - e^{-c\nu}) + \rho e^{-c\nu}$ in the former.
ACKNOWLEDGMENT
The author thanks Volkan Cevher, Sidharth Jaggi, Oliver Johnson, and Matthew Aldridge for helpful discussions,
and Leonardo Baldassini for sharing his PhD thesis [30].
REFERENCES
[1] R. Dorfman, "The detection of defective members of large populations," Ann. Math. Stats., vol. 14, no. 4, pp. 436–440, 1943.
[2] A. Fernández Anta, M. A. Mosteiro, and J. Ramón Muñoz, "Unbounded contention resolution in multiple-access channels," in Distributed Computing. Springer Berlin Heidelberg, 2011, vol. 6950, pp. 225–236.
[3] R. Clifford, K. Efremenko, E. Porat, and A. Rothschild, "Pattern matching with don't cares and few errors," J. Comp. Sys. Sci., vol. 76, no. 2, pp. 115–124, 2010.
[4] G. Cormode and S. Muthukrishnan, "What's hot and what's not: Tracking most frequent items dynamically," ACM Trans. Database Sys., vol. 30, no. 1, pp. 249–278, March 2005.
[5] A. Gilbert, M. Iwen, and M. Strauss, "Group testing and sparse signal recovery," in Asilomar Conf. Sig., Sys. and Comp., Oct. 2008, pp. 1059–1063.
[6] A. C. Gilbert, M. J. Strauss, J. A. Tropp, and R. Vershynin, "One sketch for all: Fast algorithms for compressed sensing," in Proc. ACM-SIAM Symp. Disc. Alg. (SODA), New York, 2007, pp. 237–246.
[7] L. Baldassini, O. Johnson, and M. Aldridge, "The capacity of adaptive group testing," in IEEE Int. Symp. Inf. Theory, July 2013, pp. 2676–2680.
[8] J. Scarlett and V. Cevher, "Phase transitions in group testing," in Proc. ACM-SIAM Symp. Disc. Alg. (SODA), 2016.
[9] D.-Z. Du and F. K. Hwang, Combinatorial Group Testing and Its Applications, ser. Series on Applied Mathematics. World Scientific, 1993.
[10] M. Aldridge, "The capacity of Bernoulli nonadaptive group testing," 2015, http://arxiv.org/abs/1511.05201.
[11] A. Agarwal, S. Jaggi, and A. Mazumdar, "Novel impossibility results for group-testing," 2018, http://arxiv.org/abs/1801.02701.
[12] M. Aldridge, "Individual testing is optimal for nonadaptive group testing in the linear regime," 2018, http://arxiv.org/abs/1801.08590.
[13] M. B. Malyutov and P. S. Mateev, "Screening designs for non-symmetric response function," Mat. Zametki, vol. 29, pp. 109–127, 1980.
[14] C. L. Chan, P. H. Che, S. Jaggi, and V. Saligrama, "Non-adaptive probabilistic group testing with noisy measurements: Near-optimal bounds with efficient algorithms," in Allerton Conf. Comm., Ctrl., Comp., Sep. 2011, pp. 1832–1839.
[15] C. L. Chan, S. Jaggi, V. Saligrama, and S. Agnihotri, "Non-adaptive group testing: Explicit bounds and novel algorithms," IEEE Trans. Inf. Theory, vol. 60, no. 5, pp. 3019–3035, May 2014.
[16] J. Scarlett and V. Cevher, "Efficient and near-optimal noisy group testing: An information-theoretic framework," https://arxiv.org/abs/1710.08704.
[17] ——, "Limits on support recovery with probabilistic models: An information-theoretic framework," IEEE Trans. Inf. Theory, vol. 63, no. 1, pp. 593–620, 2017.
[18] M. Malyutov, "The separating property of random matrices," Math. Notes Acad. Sci. USSR, vol. 23, no. 1, pp. 84–91, 1978.
[19] G. Atia and V. Saligrama, "Boolean compressed sensing and noisy group testing," IEEE Trans. Inf. Theory, vol. 58, no. 3, pp. 1880–1901, March 2012.
[20] M. Aldridge, L. Baldassini, and K. Gunderson, "Almost separable matrices," J. Comb. Opt., pp. 1–22, 2015.
[21] J. Scarlett and V. Cevher, "Converse bounds for noisy group testing with arbitrary measurement matrices," in IEEE Int. Symp. Inf. Theory, Barcelona, 2016.
[22] A. J. Macula, "Error-correcting nonadaptive group testing with $d^e$-disjunct matrices," Disc. App. Math., vol. 80, no. 2-3, pp. 217–222, 1997.
[23] H. Q. Ngo, E. Porat, and A. Rudra, "Efficiently decodable error-correcting list disjunct matrices and applications," in Int. Colloq. Automata, Lang., and Prog., 2011.
[24] M. Cheraghchi, "Noise-resilient group testing: Limitations and constructions," Disc. App. Math., vol. 161, no. 1, pp. 81–95, 2013.
[25] F. Hwang, "A method for detecting all defective members in a population by group testing," J. Amer. Stats. Assoc., vol. 67, no. 339, pp. 605–608, 1972.
[26] P. Damaschke and A. S. Muhammad, "Randomized group testing both query-optimal and minimal adaptive," in Int. Conf. Current Trends in Theory and Practice of Computer Science. Springer, 2012, pp. 214–225.
[27] A. J. Macula, "Probabilistic nonadaptive and two-stage group testing with relatively small pools and DNA library screening," J. Comb. Opt., vol. 2, no. 4, pp. 385–397, 1998.
[28] M. Mézard and C. Toninelli, "Group testing with random pools: Optimal two-stage algorithms," IEEE Trans. Inf. Theory, vol. 57, no. 3, pp. 1736–1745, 2011.
[29] S. Cai, M. Jahangoshahi, M. Bakshi, and S. Jaggi, "GROTESQUE: Noisy group testing (quick and efficient)," 2013, https://arxiv.org/abs/1307.2811.
[30] L. Baldassini, "Rates and algorithms for group testing," Ph.D. dissertation, Univ. Bristol, 2015.
[31] S. Kalyanakrishnan, A. Tewari, P. Auer, and P. Stone, "PAC subset selection in stochastic multi-armed bandits," in Int. Conf. Mach. Learn. (ICML), 2012.
[32] O. Johnson, "Strong converses for group testing from finite blocklength results," IEEE Trans. Inf. Theory, vol. 63, no. 9, pp. 5923–5933, Sept. 2017.
[33] Y. Polyanskiy, H. V. Poor, and S. Verdú, "Channel coding rate in the finite blocklength regime," IEEE Trans. Inf. Theory, vol. 56, no. 5, pp. 2307–2359, May 2010.
[34] A. Feinstein, "A new basic theorem of information theory," IRE Prof. Group. on Inf. Theory, vol. 4, no. 4, pp. 2–22, Sept. 1954.
[35] T. S. Han, Information-Spectrum Methods in Information Theory. Springer, 2003.
[36] M. B. Malyutov, "Search for sparse active inputs: A review," in Inf. Theory, Comb. and Search Theory, 2013, pp. 609–647.
[37] R. Motwani and P. Raghavan, Randomized Algorithms. Chapman & Hall/CRC, 2010.