Decoding Fingerprints Using the Markov Chain Monte Carlo Method

Teddy Furon, Arnaud Guyader, Frédéric Cérou
INRIA Rennes, IRMAR, Université de Rennes II
Rennes, France
teddy.furon@inria.fr
Abstract—This paper proposes a new fingerprinting decoder based on the Markov Chain Monte Carlo (MCMC) method. A Gibbs sampler generates groups of users according to the posterior probability that these users could have forged the sequence extracted from the pirated content. The marginal probability that a given user belongs to the collusion is then estimated by a Monte Carlo method. The users with the biggest empirical marginal probabilities are accused. This MCMC method can decode any type of fingerprinting code.

This paper is in the spirit of the 'Learn and Match' decoding strategy: it assumes that the collusion attack belongs to a family of models. The Expectation-Maximization algorithm estimates the parameters of the collusion model from the extracted sequence. This part of the algorithm is described for the binary Tardos code, exploiting the soft outputs of the watermarking decoder.

The experimental body considers some extreme setups where the fingerprinting code lengths are very small. It reveals that the weak link of our approach is the estimation part. This is a clear warning to the 'Learn and Match' decoding strategy.
I. INTRODUCTION
A. The Application
This paper deals with active fingerprinting, a.k.a. traitor
tracing. A robust watermarking technique embeds the user’s
codeword into the content to be distributed. When a pirated
copy of the content is scouted, the watermark decoder extracts
the message, which identifies the dishonest user. However, there might exist a group of dishonest users, a so-called collusion, who mix their personal versions of the content to forge the pirated copy. The extracted message then no longer corresponds to the codeword of one user, but is a mix of several codewords. The decoder aims at recovering some of these codewords to identify the colluders, while avoiding accusing innocent users.
A popular construction of the codewords is the Tardos fingerprinting code [1]. The recent literature on this type of binary probabilistic code aims at improving either the coding or the decoding side. The first topic uses theoretical arguments to fine-tune the code construction [2], [3]. Our paper has no contribution on this side: we use the classical code construction originally proposed by Tardos in [1] for benchmarking,
but our scheme readily works with these more recent codes.
As for the decoding side, we pinpoint the following features in the literature. Hard decoders work on the binary sequence extracted from the pirated content, whereas soft decoders use real-valued outputs from the watermarking layer [4]. This paper pertains to this latter trend. Some decoders compute a score per user, which is provably good whatever the collusion strategy, whereas other algorithms first estimate the collusion attack and adapt their scoring function. This paper pertains to this 'Learn and Match' trend [5]. Last but not least, information theory tells us that a joint decoder considering groups of users is potentially more powerful than a single decoder computing a score per user. Our approach is based on joint decoding [3].
B. The problem with joint decoding
Very few papers put into practice the principle of joint
decoding. The main difficulty is the matter of complexity. If we consider groups of $e$ people among $n$ users, then we need to browse $\binom{n}{e} = O(n^e)$ such groups, which is not tractable for a large database of users. The approach first proposed in [6] for pair decoding and generalized to bigger subsets in [5] is an iterative decoder. Iteration $t$ considers subsets of $t$ users, taken from a small pool of the $n_t$ most likely colluders. If $n_t = O(n^{1/t})$, then the number of subsets to be analyzed remains in $O(n)$, supposedly within computational resources. But this implies that more and more users are discarded, and this pruning is prone to missing colluders.
Another point is that the scoring function used in [5] is not optimal, for three reasons. First, the scoring is based on the likelihood that a group of users are all guilty, whereas it should be the likelihood that some of them are guilty. Second, the scoring relies on the way the codewords were generated and is extremely specific to the probabilistic nature of the Tardos code construction. Third, it is based on a pessimistic estimation of the collusion attack, artificially assuming a bigger collusion size $c_{\max}$. This scoring turns out to be less discriminative when $c_{\max}$ is much larger than the true collusion size $c$.
C. Our contributions
This paper presents a soft decoder in the vein of the 'Learn and Match' approach. The estimation part elegantly uses the E.-M. algorithm for a fast estimation of the collusion attack. After this first task, our decoding algorithm puts into practice the concept of joint decoding more directly than the previous methods [6], [5]. Specifically, there is no iterative pruning of the users. Instead of computing scores per user subset, the decoder generates typical subsets likely to be the real collusion. This is efficiently done by a Markov chain and a convenient representation of collusion. A Monte Carlo method computes statistics from these sampled subsets, such as the marginal probabilities of being a colluder. This Markov Chain Monte Carlo (MCMC) method is the core of our algorithm and the main contribution of this paper. It works with any code construction (probabilistic or error-correction based). This approach has also been used in biology for library screening [7] and in blind deconvolution [8].

Our main contributions are introduced first: the representation of collusion (Sect. II) and the MCMC decoder (Sect. III). The more technical details about the experimental setup (collusion model for the watermarking layer, estimation of its parameters with the E.-M. method, and the derivation of transition probabilities) are presented in the second part of the paper (Sect. IV and V). An experimental investigation shows state-of-the-art performance as well as the limits of our approach with very short codes.
II. THE REPRESENTATION OF COLLUSION
The keystone of our approach is the representation of a collusion by a limited list of colluder identities. Suppose there are $n$ users indexed from 1 to $n$, and denote $[n] \triangleq \{1, \ldots, n\}$. Depending on how short the code is, tracing collusions bigger than a given size, denoted $c_{\max}$, produces unreliable decisions. The collusion representation is a vector $\mathbf{s}$ of $c_{\max}$ integer components, each ranging from 0 to $n$. Some of them may take the value 0, which codes 'nobody'. We denote $s_0 \triangleq \|\mathbf{s}\|_0$ its $\ell_0$-norm, i.e. the number of non-zero components. This vector represents a collusion of size $s_0$ whose users are given by the non-zero components. Hence, no two non-zero components share the same value. We denote by $\mathcal{S}$ the set of all such vectors.

For example, with $c_{\max} = 5$, $\tilde{\mathbf{s}} = (6, 0, 3, 2, 0)$ represents the collusion of the $\tilde{s}_0 = 3$ following users: 2, 3, and 6.
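To make the representation concrete, here is a minimal Python sketch (NumPy assumed; the variable names are ours, not the paper's):

```python
import numpy as np

c_max = 5
s = np.array([6, 0, 3, 2, 0])     # collusion representation: 0 codes 'nobody'

s0 = np.count_nonzero(s)          # l0-norm ||s||_0: the collusion size
colluders = sorted(int(u) for u in s if u > 0)

print(s0)          # 3
print(colluders)   # [2, 3, 6]
```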
A. The Neighborhood of a collusion
We define the neighborhood $\mathcal{S}(\mathbf{s})$ of $\mathbf{s}$ as the set of collusions differing from it in at most one component of their representation. We have $\mathbf{s} \in \mathcal{S}(\mathbf{s})$, and the other neighbors have one more colluder, one fewer colluder, or just one different colluder. For instance, $\{(6,1,3,2,0), (6,0,3,0,0), (6,0,4,2,0)\} \subset \mathcal{S}(\tilde{\mathbf{s}})$.

The neighborhood is decomposed as $\mathcal{S}(\mathbf{s}) = \bigcup_{i=1}^{c_{\max}} \mathcal{S}_i(\mathbf{s})$, with

$$\mathcal{S}_i(\mathbf{s}) \triangleq \{\mathbf{s}' \in \mathcal{S} \mid \mathbf{s}'(k) = \mathbf{s}(k), \forall k \neq i\}. \quad (1)$$

The subsets $\mathcal{S}_i(\mathbf{s})$ are not disjoint. If $\mathbf{s}(i) = 0$, then $\mathcal{S}_i(\mathbf{s})$ is composed of $\mathbf{s}$ and some collusions of size $s_0 + 1$ (one user is added). The cardinality of $\mathcal{S}_i(\mathbf{s})$ equals $n + 1 - s_0$. If $\mathbf{s}(i) > 0$, then $\mathcal{S}_i(\mathbf{s})$ is composed of some collusions of size $s_0$, including $\mathbf{s}$ (user $\mathbf{s}(i)$ is replaced by someone else or kept), and one collusion of size $s_0 - 1$ (user $\mathbf{s}(i)$ is removed). In this case, $|\mathcal{S}_i(\mathbf{s})| = n + 2 - s_0$.
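A sketch of the enumeration of $\mathcal{S}_i(\mathbf{s})$ under the definitions above (our own helper, reused by the Gibbs sampler sketch in Sect. III-C):

```python
import numpy as np

def neighborhood_i(s, i, n):
    """Enumerate S_i(s) of eq. (1): change component i only, keep the others."""
    frozen = set(int(u) for u in s) - {int(s[i]), 0}   # non-zero values of the other slots
    candidates = [0] + [u for u in range(1, n + 1) if u not in frozen]
    out = []
    for u in candidates:               # u == s[i] reproduces s itself
        t = s.copy()
        t[i] = u
        out.append(t)
    return out

s = np.array([6, 0, 3, 2, 0])
print(len(neighborhood_i(s, 1, n=10)))   # s(1) = 0: n + 1 - s0 = 8
print(len(neighborhood_i(s, 0, n=10)))   # s(0) = 6: n + 2 - s0 = 9
```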
B. The prior probability of a collusion
We now cast a probabilistic model on the collusion representation. Our prior is as uninformative as possible. Having no information about the size of the collusion, we posit that all sizes up to $c_{\max}$ are equally probable:

$$P(s_0) = c_{\max}^{-1}, \quad \forall s_0 \in [c_{\max}]. \quad (2)$$

We also posit that the $\binom{n}{c}$ collusions of size $c$ are equally likely. Finally, the prior on $\mathbf{s}$ is given by:

$$P(\mathbf{s}) = P(\mathbf{s}, s_0) = P(\mathbf{s} \mid s_0) P(s_0) = \binom{n}{s_0}^{-1} c_{\max}^{-1}. \quad (3)$$

With this model, the prior distribution is not uniform: collusions of bigger size have lower probabilities.
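Eq. (3) translates directly into a log-prior, sketched below (Python 3.8+ for math.comb):

```python
import math

def log_prior(s, n, c_max):
    """Log of eq. (3): P(s) = 1 / (C(n, s0) * c_max)."""
    s0 = sum(1 for u in s if u > 0)
    return -(math.log(math.comb(n, s0)) + math.log(c_max))

print(log_prior([6, 0, 3, 2, 0], n=10, c_max=5))   # log(1 / (120 * 5))
```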
C. The posterior probability of a collusion
Once a pirated copy is scouted, a sequence $\mathbf{z}$ is extracted. This observation refines our knowledge of the collusion, which is reflected by the posterior probability $P(\mathbf{s} \mid \mathbf{z})$. By Bayes' rule:

$$P(\mathbf{s} \mid \mathbf{z}) = \frac{P(\mathbf{z} \mid \mathbf{s}) P(\mathbf{s})}{P(\mathbf{z})}. \quad (4)$$

Yet, we cannot compute this probability because $P(\mathbf{z})$ is unknown. This is not critical since this quantity is a common denominator to all posteriors: we can still compare posteriors or compute the ratio of two posteriors. The next difficulty is the conditional $P(\mathbf{z} \mid \mathbf{s})$, which in words is the probability that collusion $\mathbf{s}$ could have forged sequence $\mathbf{z}$. This is where we need a probabilistic model of the collusion process. Sect. IV-C presents our models and Sect. V-D gives the expressions of the conditional probabilities.

The Maximum A Posteriori decoder consists in finding the collusion with the biggest posterior. Yet, browsing all of them is computationally intractable for a large database of users. As an alternative, we can build a single decoder based on the marginal posterior probabilities: the probability that user $j$ is a colluder is given by

$$P(j \mid \mathbf{z}) = \sum_{\mathbf{s} \mid \exists i: \mathbf{s}(i) = j} P(\mathbf{s} \mid \mathbf{z}). \quad (5)$$

Again, browsing the collusions of all the summands is not practical for a large number of users.
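For intuition only, eq. (5) can be evaluated by brute force on a toy problem. The unnormalized log-posterior `log_post` (i.e. $\log P(\mathbf{z} \mid \mathbf{s}) + \log P(\mathbf{s})$) is a hypothetical stand-in to be supplied, and the enumeration is feasible only for tiny $n$ and $c_{\max}$:

```python
import itertools, math

def marginals_bruteforce(log_post, n, c_max):
    """Exact marginals P(j|z) by enumerating all collusions of size 1..c_max."""
    weights, subsets = [], []
    for c in range(1, c_max + 1):
        for sub in itertools.combinations(range(1, n + 1), c):
            subsets.append(set(sub))
            weights.append(math.exp(log_post(sub)))   # proportional to P(s|z)
    total = sum(weights)
    return {j: sum(w for w, sub in zip(weights, subsets) if j in sub) / total
            for j in range(1, n + 1)}
```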
III. THE MARKOV CHAIN MONTE CARLO METHOD
The key idea of our approach is to estimate the above
marginals with a Markov Chain Monte Carlo method.
Fig. 1. Illustration of the MCMC method for $K = 500$ and $n = 300$: [Up] The Markov chain: the $K \times n$ binary matrix indicating which users belong to the state $\mathbf{s}^{(t)}$ for $1 \leq t \leq K$. [Down] The Monte Carlo estimation: the empirical marginal probabilities, i.e. the means of the columns of the above binary matrix.
A. The Monte Carlo Method
Instead of computing the marginals with (5), a Monte Carlo method estimates their values: we draw $K$ collusions $\{\mathbf{s}_k\}_{k=1}^{K}$ according to distribution $P(\mathbf{s} \mid \mathbf{z})$, and the empirical marginals are given by:

$$\hat{P}(j \mid \mathbf{z}) = |\{\mathbf{s}_k \mid \exists i: \mathbf{s}_k(i) = j\}| / K, \quad (6)$$

which reads as the empirical frequency with which user $j$ belongs to the sampled collusions. The next subsections explain how we manage to sample collusions distributed as $P(\mathbf{s} \mid \mathbf{z})$. The difficulty lies in the fact that we cannot sample them directly because we only know $P(\mathbf{s} \mid \mathbf{z})$ up to the multiplicative constant $P(\mathbf{z})$.
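A sketch of the Monte Carlo estimator (6), given the sampled collusions:

```python
import numpy as np

def empirical_marginals(samples, n):
    """Eq. (6): frequency with which each user appears among the K samples."""
    counts = np.zeros(n + 1)               # index 0 unused ('nobody')
    for s in samples:                      # s: vector of c_max user ids, 0 = empty slot
        for j in set(int(u) for u in s if u > 0):
            counts[j] += 1
    return counts[1:] / len(samples)       # P_hat(j|z) for j = 1..n
```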
B. The Markov Chain
The collusions are indeed generated by a Markov chain. It is an iterative process with a state (here a collusion) taking value $\mathbf{s}^{(t)}$ at iteration $t$. The next iteration draws a new state $\mathbf{s}^{(t+1)}$ according to the transition probability $P(\mathbf{s}^{(t+1)} = \mathbf{s} \mid \mathbf{s}^{(t)})$ specified in the next section. The Markov chain is initialized by randomly selecting a collusion $\mathbf{s}^{(0)}$. The transition probabilities are carefully crafted so that the distribution of the state $\mathbf{s}^{(t)}$ converges to the targeted distribution $P(\mathbf{s} \mid \mathbf{z})$ as $t$ increases. After a burn-in period $T$, we assume that the Markov chain has forgotten $\mathbf{s}^{(0)}$ and that the states are correctly sampled from then on: the collusions $\{\mathbf{s}^{(t)}\}_{t=T}^{T+K}$ are then passed to the Monte Carlo part of the algorithm, which computes the empirical marginals thanks to (6). Fig. 1 illustrates the MCMC method. The colluders are the 6 users with the highest empirical marginals. One sees that innocent users sometimes belong to a collusion state $\mathbf{s}^{(t)}$, but only intermittently for any given innocent, so that eventually it does no harm.
C. The Gibbs sampler
Instead of computing the transition probabilities from $\mathbf{s}^{(t)}$ to any possible collusion, we restrict the transitions to the collusions of a subset of its neighborhood $\mathcal{S}(\mathbf{s}^{(t)})$ (see Sect. II-A). This is called a multi-stage Gibbs sampler with random scan [9, Alg. A.42]. At iteration $t+1$, an integer $i$ is first uniformly drawn in $[c_{\max}]$, indicating the subset $\mathcal{S}_i(\mathbf{s}^{(t)})$. Then, the following transition distribution is constructed: $\forall \mathbf{s} \in \mathcal{S}_i(\mathbf{s}^{(t)})$,

$$P(\mathbf{s}^{(t+1)} = \mathbf{s} \mid \mathbf{s}^{(t)}) := \frac{P(\mathbf{s} \mid \mathbf{z})}{\sum_{\mathbf{s}' \in \mathcal{S}_i(\mathbf{s}^{(t)})} P(\mathbf{s}' \mid \mathbf{z})} = \frac{P(\mathbf{z} \mid \mathbf{s}) P(\mathbf{s})}{\sum_{\mathbf{s}' \in \mathcal{S}_i(\mathbf{s}^{(t)})} P(\mathbf{z} \mid \mathbf{s}') P(\mathbf{s}')} \quad (7)$$

This choice guarantees that the stationary distribution of this Markov chain is $P(\mathbf{s} \mid \mathbf{z})$, which legitimates our approach [9, Sect. 10.2.1]. The unknown multiplicative constant $P(\mathbf{z})$ in (4) has disappeared in the ratio. This transition distribution only depends on the priors $P(\mathbf{s}')$ and $P(\mathbf{s})$ given by (3), and on the conditional probabilities $\{P(\mathbf{z} \mid \mathbf{s})\}_{\mathbf{s} \in \mathcal{S}_i(\mathbf{s}^{(t)})}$, which depend on the collusion process to be estimated (see Sect. IV-C and (17) or (18)). These latter quantities are functions of the codewords of the users listed in $\mathbf{s}$, whatever their construction. This decoding algorithm thus works with any fingerprinting code. The remainder of the paper applies this approach to Tardos codes.
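Putting the pieces together, a minimal sketch of the random-scan Gibbs sampler of eq. (7) and the surrounding MCMC decoder; it reuses `neighborhood_i`, `log_prior`, and `empirical_marginals` from the sketches above, and takes a hypothetical `log_lik(s)` returning $\log P(\mathbf{z} \mid \mathbf{s})$ (see (17)-(18)):

```python
import numpy as np

def gibbs_step(s, n, c_max, log_lik, rng):
    """One transition of eq. (7), restricted to the subset S_i(s)."""
    i = rng.integers(c_max)                 # random scan: pick one component
    cands = neighborhood_i(s, i, n)         # S_i(s)
    logw = np.array([log_lik(t) + log_prior(t, n, c_max) for t in cands])
    w = np.exp(logw - logw.max())           # P(z) cancels; stabilized normalization
    return cands[rng.choice(len(cands), p=w / w.sum())]

def mcmc_decoder(n, c_max, log_lik, T=400, K=2000, seed=0):
    rng = np.random.default_rng(seed)
    s = np.zeros(c_max, dtype=int)          # s^(0): a random initial collusion
    c0 = rng.integers(1, c_max + 1)
    s[:c0] = rng.choice(np.arange(1, n + 1), size=c0, replace=False)
    samples = []
    for t in range(T + K):
        s = gibbs_step(s, n, c_max, log_lik, rng)
        if t >= T:                          # discard the burn-in period
            samples.append(s.copy())
    return empirical_marginals(samples, n)
```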
IV. THE SETUP
This section briefly reviews the construction of the Tardos
code, describes the setup for hiding the user’s codeword into
the content, and presents the model of the collusion attack.
A. Tardos Code Construction
The binary code is composed of $n$ codewords of $m$ bits. The codeword $\mathbf{x}_j = (x_{j,1}, \cdots, x_{j,m})^T$ identifying user $j \in \mathcal{U} = [n]$ is composed of $m$ binary symbols independently drawn at code construction s.t. $P(x_{j,i} = 1) = p_i$, $\forall (j,i) \in [n] \times [m]$. At initialization, the auxiliary variables $\{p_i\}_{i=1}^{m}$ are independently and identically drawn according to a distribution $f(p): [0,1] \to \mathbb{R}^+$. This distribution is a parameter to be selected by the code designer. Tardos originally proposed $f_T(p) \triangleq \mathbb{1}_{[t,1-t]}(p)\, \kappa_t / \sqrt{p(1-p)}$ in [1], where $\mathbb{1}_{[a,b]}$ is the indicator function of the interval (i.e. $\mathbb{1}_{[a,b]}(p) = 1$ if $a \leq p \leq b$, 0 otherwise), and $\kappa_t$ is the constant s.t. $\int_0^1 f_T(p)\, dp = 1$. The cut-off parameter $t$ is usually set to $1/(300 c_{\max})$. The integer $c_{\max}$ is the maximum expected number of colluders.

The distribution $f$ is public, but both the code $\Xi = [\mathbf{x}_1, \ldots, \mathbf{x}_n]$ and the auxiliary sequence $\mathbf{p} = (p_1, \ldots, p_m)^T$ must be kept as secret parameters. A user possibly knows his codeword, but the only way to gain knowledge of some other codewords is to team up in a collusion.
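A sketch of the code construction; $f_T$ is sampled by the change of variable $p = \sin^2 r$ with $r$ uniform, which yields the arcsine density with cut-off (NumPy assumed; function names are ours):

```python
import numpy as np

def tardos_code(n, m, c_max, rng):
    """Draw p ~ f_T and the n x m binary code with P(x_ji = 1) = p_i."""
    t = 1.0 / (300 * c_max)                          # cut-off parameter
    r = rng.uniform(np.arcsin(np.sqrt(t)), np.arcsin(np.sqrt(1 - t)), size=m)
    p = np.sin(r) ** 2                    # density prop. to 1/sqrt(p(1-p)) on [t, 1-t]
    X = (rng.random((n, m)) < p).astype(np.uint8)
    return X, p

rng = np.random.default_rng(0)
X, p = tardos_code(n=1000, m=1024, c_max=8, rng=rng)
```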
B. The Modulation
We assume the content is divided into chunks. For instance,
a movie is split into blocks of frames. A watermarking tech-
nique embeds a binary symbol per chunk, sequentially hiding
the codeword into the content. This implies that the code
length $m$ is limited by the watermarking embedding rate times
the duration (or size) of the content. The robustness of the
watermarking technique strongly depends on the embedding
rate: the lower the embedding rate, the more robust the
watermark. Therefore, it is crucial to design a fingerprinting
decoder able to trace colluders despite a short code length.
The watermarking decoder retrieves the hidden bit $x$ per watermarked chunk of content. We assume that it first computes a statistic $z$, the so-called soft output, which ideally equals $1$ (resp. $-1$) if the hidden bit is '1' (resp. '0'). The decoder then thresholds the soft output at 0 to yield a hard output: $\hat{x} =$ '1' if $z \geq 0$, '0' otherwise. This is the case, for instance, with a spread-spectrum watermarking technique using an antipodal modulation (a.k.a. BPSK).
In this paper, the fingerprinting decoder (i.e. the accusation
process) is named soft decoder because it uses the soft outputs
of the watermarking decoder. This brings more information on
the colluders’ identity than the hard outputs.
C. The collusion Attack
The model of the collusion attack is taken from [5]. The
partition of content into chunks is not a secret. We assume
that the collusion attack sequentially processes all the chunks
in the same way: there is first a fusion of the versions of the
colluders into one chunk, and then a process (coarse source
compression, noise addition, filtering etc.) further distorts this
fused chunk. We foresee the following types of fusion.
1) Attacks of type I: The fusion is a signal processing operation that mixes $c$ chunks of content into one. This includes sample-wise average or median, sample patchwork, etc. The fusion aims at reducing the confidence in the decoded symbol, so we assume that $|z| \leq 1$.

This mixing is strongly driven by the number $k$ of chunks watermarked with symbol '1' (denoted '1'-chunk). For a collusion of size $c$, $k \in \{0, \ldots, c\}$ and we assume that $z$ can take $c+1$ values $\{\mu_c(k)\}_{k=0}^{c}$. In the manner of the marking assumption, we restrict the power of the collusion by enforcing $\mu_c(c) = -\mu_c(0) = 1$. In other words, there is no fusion when all chunks are watermarked with the same symbol.

The watermarking secret key prevents the colluders from determining the hidden symbol. Yet, they can group their $c$ chunks into two groups of exactly equal instances¹. This reveals the number of hidden '1' up to an ambiguity: it is $k$ or $c-k$. This implies a symmetry in the model: $\mu_c(k) = -\mu_c(c-k)$, $\forall k \in \{0, \ldots, c\}$.

Finally, the distortion on the fused chunk adds a noise on the soft output: $z = \mu_c(k) + n$. We assume that this noise is independent of $k$, and i.i.d. with $n \sim \mathcal{N}(0, \sigma^2)$.
We give two examples taken from [4].
• 'Average': $\mu_c(k) = 2k c^{-1} - 1$, $\forall k$. This is the case for instance if the soft decision is a linear process and the colluders fuse all their chunks with a sample-wise average.
• 'Average2': $\mu_c(k) = 0$, $\forall k \in [c-1]$, and $\mu_c(c) = -\mu_c(0) = 1$. This is the case for instance when the soft decision is a linear process and the colluders fuse only two different blocks with a sample-wise average.

¹In other words, the watermarking is deterministic: its result only depends on its inputs (the original chunk, the symbol to be embedded, and the secret key).
2) Attacks of type II: In this type, the colluders benefit from the division into blocks. At a given block index, they copy and paste one of their chunks. There is no fusion of blocks. The probability that they put a '1'-chunk when they have $k$ chunks watermarked with symbol '1' out of $c$ is denoted by $\theta_c(k)$. The marking assumption imposes that $\theta_c(c) = 1 - \theta_c(0) = 1$. The ambiguity on the number of '1'-chunks results in the symmetry $\theta_c(k) = 1 - \theta_c(c-k)$. After the distortion on the selected chunk, $z$ is distributed as $\mathcal{N}(-1, \sigma^2)$ with probability $1 - \theta_c(k)$, or as $\mathcal{N}(1, \sigma^2)$ with probability $\theta_c(k)$.

Here are two classical examples also used in [4], with a simulation sketch given after this list:
• 'Uniform': $\theta_c(k) = k c^{-1}$, $\forall k$. The colluders uniformly draw one of their chunks.
• 'Majority': $\theta_c(k) = 1$ if $k > c/2$, 0 otherwise ($\theta_c(c/2) = 1/2$ if $c$ is even). The colluders classify the chunks into two groups of the same instance and choose a chunk from the bigger group.
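A minimal simulation of these collusion models, mapping the colluders' codewords to the soft outputs $z$ ('average' for type I, 'uniform' and 'majority' for type II; function and parameter names are ours):

```python
import numpy as np

def collusion_attack(X_coll, attack, sigma, rng):
    """Soft outputs z for a c x m matrix of colluders' codewords."""
    c, m = X_coll.shape
    k = X_coll.sum(axis=0)                       # '1'-chunks per index
    if attack == "average":                      # type I: mu_c(k) = 2k/c - 1
        z = 2 * k / c - 1
    else:                                        # type II: copy-paste a chunk
        if attack == "uniform":                  # theta_c(k) = k/c
            theta = k / c
        elif attack == "majority":               # theta_c(k) = 1{k > c/2}, ties at 1/2
            theta = np.where(2 * k == c, 0.5, (k > c / 2).astype(float))
        z = np.where(rng.random(m) < theta, 1.0, -1.0)
    return z + rng.normal(0.0, sigma, m)         # distortion noise on the soft output
```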
V. THE ESTIMATION WITH THE E.-M. ALGORITHM
The first task of the proposed decoder is the estimation of the collusion attack. It amounts to guessing the type of the attack and its parameters, $(\boldsymbol{\mu}_c, \sigma)$ (type I) or $(\boldsymbol{\theta}_c, \sigma)$ (type II), from the observations $\mathbf{z}$ and the knowledge of the secret $\mathbf{p}$. The model is not identifiable if $c$ is unknown. For this reason, the estimation is done for all collusion sizes ranging from 1 to $c_{\max}$. This results in a set of $c_{\max}$ model parameters. This estimation task heavily relies on the collusion model (Sect. IV-C) and the Tardos code.
A. The Maximum Likelihood Estimator
The MLE (Maximum Likelihood Estimator) searches the maximum of the log-likelihood. For type I, the latter has the following expression:

$$\mathcal{L}^{(I)}(\boldsymbol{\mu}_{\tilde{c}}, \sigma) \triangleq \log P(\mathbf{z} \mid \mathbf{p}, \boldsymbol{\mu}_{\tilde{c}}, \sigma) = \sum_{i=1}^{m} \log \left( \sum_{k=0}^{\tilde{c}} P(k \mid p_i)\, f(z_i; \mu_{\tilde{c}}(k), \sigma) \right), \quad (8)$$

with $f(z; \mu, \sigma) \triangleq e^{-(z-\mu)^2 / 2\sigma^2} / \sqrt{2\pi\sigma^2}$. For type II, we have

$$\mathcal{L}^{(II)}(\boldsymbol{\theta}_{\tilde{c}}, \sigma) \triangleq \log P(\mathbf{z} \mid \mathbf{p}, \boldsymbol{\theta}_{\tilde{c}}, \sigma) = \sum_{i=1}^{m} \log \left( \sum_{k=0}^{\tilde{c}} P(k \mid p_i) \left[ \theta_{\tilde{c}}(k) f(z_i; 1, \sigma) + (1 - \theta_{\tilde{c}}(k)) f(z_i; -1, \sigma) \right] \right). \quad (9)$$

The final decision about the type of the attack is taken by comparing the values $\mathcal{L}^{(I)}(\boldsymbol{\mu}^\star_{\tilde{c}}, \sigma^\star_{I,\tilde{c}})$ and $\mathcal{L}^{(II)}(\boldsymbol{\theta}^\star_{\tilde{c}}, \sigma^\star_{II,\tilde{c}})$: the type giving the biggest likelihood is selected. Note that both types have the same number of parameters, so that neither is more prone than the other to overfitting, which justifies the comparison of their respective likelihoods.

Yet, these two optimizations are far from trivial because the functionals have plenty of local maxima. Under both attack types, the observation $\mathbf{z}$ is indeed a mixture of a fixed number of Gaussian distributions, a typical case where the Expectation-Maximization estimator is elegant and efficient, even if convergence to the global maximum is not ensured. The following two subsections are the application of [10, Tab. 3.1].
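A sketch of the likelihoods (8) and (9) used for the type decision, given some estimated parameters (NumPy/SciPy assumed; $P(k \mid p_i)$ is the binomial pmf):

```python
import numpy as np
from scipy.stats import binom, norm

def loglik_type1(z, p, mu, sigma):
    """Eq. (8): per index, a Gaussian mixture centered at mu[k], weighted by P(k|p_i)."""
    c = len(mu) - 1
    W = binom.pmf(np.arange(c + 1)[:, None], c, p[None, :])        # (c+1) x m
    F = norm.pdf(z[None, :], np.asarray(mu)[:, None], sigma)
    return np.log((W * F).sum(axis=0)).sum()

def loglik_type2(z, p, theta, sigma):
    """Eq. (9): mixture of two Gaussians at +/-1 with weights theta[k]."""
    c = len(theta) - 1
    W = binom.pmf(np.arange(c + 1)[:, None], c, p[None, :])
    th = np.asarray(theta)[:, None]
    mix = th * norm.pdf(z[None, :], 1.0, sigma) + (1 - th) * norm.pdf(z[None, :], -1.0, sigma)
    return np.log((W * mix).sum(axis=0)).sum()

# The attack type with the larger maximized log-likelihood is selected.
```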
B. E.M. Algorithm for type I Attacks
We introduce the unknown latent variables $\boldsymbol{\varphi} = (\varphi_1, \ldots, \varphi_m)$ that capture the number of '1'-chunks per index: $\forall i \in [m]$, $\varphi_i \in \{0, \ldots, \tilde{c}\}$. If $\tilde{c} = c$, the true value of $\varphi_i$ is $\sum_{k \in \mathcal{C}} x_{k,i}$. The log-likelihood function with these latent variables is now $\mathcal{L}^{(I)}(\boldsymbol{\mu}_{\tilde{c}}, \sigma) \triangleq \log P(\mathbf{z}, \boldsymbol{\varphi} \mid \mathbf{p}, \boldsymbol{\mu}_{\tilde{c}}, \sigma)$:

$$\mathcal{L}^{(I)}(\boldsymbol{\mu}_{\tilde{c}}, \sigma) = \sum_{i=1}^{m} \log \left( P(\varphi_i \mid p_i)\, f(z_i; \mu_{\tilde{c}}(\varphi_i), \sigma) \right), \quad (10)$$

with $P(\varphi \mid p) = \binom{\tilde{c}}{\varphi} p^{\varphi} (1-p)^{\tilde{c} - \varphi}$. The E.-M. iteratively refines the estimate $(\hat{\boldsymbol{\mu}}_{\tilde{c}}, \hat{\sigma})$ of the parameters and the $(\tilde{c}+1) \times m$ matrix $\mathbf{T}$ storing the conditional probabilities of $\boldsymbol{\varphi}$:

$$T_{k,i} \triangleq P(\varphi_i = k \mid z_i, p_i, \hat{\boldsymbol{\mu}}_{\tilde{c}}, \hat{\sigma}), \quad \forall k \in \{0, \ldots, \tilde{c}\}. \quad (11)$$

1) E-step: Given the current estimate of the model, the conditional probabilities of $\boldsymbol{\varphi}$ are updated via the Bayes rule:

$$\hat{T}_{k,i} = \frac{P(\varphi_i = k \mid p_i)\, f(z_i; \hat{\mu}_{\tilde{c}}(k), \hat{\sigma})}{\sum_{u=0}^{\tilde{c}} P(\varphi_i = u \mid p_i)\, f(z_i; \hat{\mu}_{\tilde{c}}(u), \hat{\sigma})} \quad (12)$$

2) M-step: Given the conditional probabilities, this step updates the model by finding the parameters that maximize the Q-function $\mathbb{E}_{\boldsymbol{\varphi} \mid \mathbf{z}, \mathbf{p}}[\mathcal{L}^{(I)}(\boldsymbol{\mu}_{\tilde{c}}, \sigma)]$. These have a closed-form expression in the case of Gaussian mixtures:

$$\hat{\mu}_{\tilde{c}}(k) = \left( \sum_{i=1}^{m} \hat{T}_{k,i} z_i \right) / \sum_{i=1}^{m} \hat{T}_{k,i}, \quad \forall k \in [\tilde{c} - 1] \quad (13)$$

$$\hat{\sigma}^2 = m^{-1} \sum_{k=0}^{\tilde{c}} \sum_{i=1}^{m} \hat{T}_{k,i} (z_i - \hat{\mu}_{\tilde{c}}(k))^2 \quad (14)$$

These two steps are iterated until the increase of the true log-likelihood $\mathcal{L}^{(I)}(\hat{\boldsymbol{\mu}}_{\tilde{c}}, \hat{\sigma})$ is no longer noticeable.
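A sketch of this E.-M. loop (eqs. (12)-(14)); the endpoints $\mu(0) = -1$ and $\mu(\tilde{c}) = 1$ stay fixed by the marking assumption, and the initialization at the 'Average' profile is our own choice:

```python
import numpy as np
from scipy.stats import binom, norm

def em_type1(z, p, c_tilde, max_iter=200, tol=1e-6):
    """E.-M. estimation of (mu, sigma) for a type-I collusion model."""
    m = len(z)
    W = binom.pmf(np.arange(c_tilde + 1)[:, None], c_tilde, p[None, :])  # P(k|p_i)
    mu = 2.0 * np.arange(c_tilde + 1) / c_tilde - 1.0
    sigma, prev = 0.5, -np.inf
    for _ in range(max_iter):
        F = norm.pdf(z[None, :], mu[:, None], sigma)
        ll = np.log((W * F).sum(axis=0)).sum()           # eq. (8), stopping rule
        if ll - prev < tol:
            break
        prev = ll
        T = W * F
        T /= T.sum(axis=0, keepdims=True)                # E-step, eq. (12)
        mu[1:c_tilde] = (T[1:c_tilde] @ z) / T[1:c_tilde].sum(axis=1)      # eq. (13)
        sigma = np.sqrt((T * (z[None, :] - mu[:, None]) ** 2).sum() / m)   # eq. (14)
    return mu, sigma
```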
C. E.M. Algorithm for type II Attacks
Under this type of attack, the latent variable $\zeta_i \in \{-\tilde{c}, \ldots, -1, 0, 1, \ldots, \tilde{c}\}$ captures the event that the colluders have $|\zeta_i|$ '1'-chunks and that they copy-paste a chunk with symbol $\mathrm{sg}(\zeta_i)$ at the $i$-th index ($\mathrm{sg}(k) = 1$ if $k > 0$, 0 otherwise). The log-likelihood function with these latent variables is now $\mathcal{L}^{(II)}(\boldsymbol{\theta}_{\tilde{c}}, \sigma) \triangleq \log P(\mathbf{z}, \boldsymbol{\zeta} \mid \mathbf{p}, \boldsymbol{\theta}_{\tilde{c}}, \sigma)$:

$$\mathcal{L}^{(II)}(\boldsymbol{\theta}_{\tilde{c}}, \sigma) = \sum_{i=1}^{m} \log \left( \pi(\zeta_i, p_i)\, f(z_i; 2\,\mathrm{sg}(\zeta_i) - 1, \sigma) \right), \quad (15)$$

with $\pi(\zeta_i, p_i) = \theta(|\zeta_i|)^{\mathrm{sg}(\zeta_i)} (1 - \theta(|\zeta_i|))^{1 - \mathrm{sg}(\zeta_i)} P(|\zeta_i| \mid p_i)$, and $P(|\zeta| \mid p) = \binom{\tilde{c}}{|\zeta|} p^{|\zeta|} (1-p)^{\tilde{c} - |\zeta|}$.

1) E-step: The estimates of the conditional probabilities $U_{k,i} \triangleq P(\zeta_i = k \mid z_i, p_i, \hat{\boldsymbol{\theta}}_{\tilde{c}}, \hat{\sigma})$ are updated as follows:

$$\hat{U}_{k,i} \propto \begin{cases} \hat{\theta}(k)\, P(k \mid p_i)\, f(z_i; 1, \hat{\sigma}) & \text{if } 0 \leq k \leq \tilde{c} \\ (1 - \hat{\theta}(|k|))\, P(|k| \mid p_i)\, f(z_i; -1, \hat{\sigma}) & \text{if } -\tilde{c} \leq k < 0 \end{cases}$$

such that $\sum_{k=-\tilde{c}}^{\tilde{c}} \hat{U}_{k,i} = 1$.

2) M-step: The estimate of the model is updated as follows:

$$\hat{\theta}(k) = \left( \sum_{i=1}^{m} \hat{U}_{k,i} \right) / \sum_{i=1}^{m} \left( \hat{U}_{k,i} + \hat{U}_{-k,i} \right), \quad \forall k \in [\tilde{c} - 1]$$

$$\hat{\sigma}^2 = m^{-1} \sum_{i=1}^{m} \sum_{k=-\tilde{c}}^{\tilde{c}} \hat{U}_{k,i} (z_i - 2\,\mathrm{sg}(k) + 1)^2 \quad (16)$$
D. The expression of the conditional probabilities
Once these $c_{\max}$ estimations of the collusion attacks are done, these models are used in the MCMC decoder via the conditional probabilities $P(\mathbf{z} \mid \mathbf{s})$. Denote $\boldsymbol{\kappa} = (\kappa_1, \ldots, \kappa_m)$ the sequence of numbers of symbols '1' in the codewords of collusion $\mathbf{s}$: $\kappa_i = \sum_{j \in \mathbf{s}} x_{j,i}$ (with the convention that $x_{0,i} = 0$). If, for the size $s_0$, a type I collusion attack has been diagnosed, then

$$P(\mathbf{z} \mid \mathbf{s}) = \prod_{i=1}^{m} f(z_i; \mu^\star_{s_0}(\kappa_i), \sigma^\star_{I,s_0}), \quad (17)$$

otherwise, for a type II attack:

$$P(\mathbf{z} \mid \mathbf{s}) = \prod_{i=1}^{m} \left[ \theta^\star_{s_0}(\kappa_i)\, f(z_i; 1, \sigma^\star_{II,s_0}) + (1 - \theta^\star_{s_0}(\kappa_i))\, f(z_i; -1, \sigma^\star_{II,s_0}) \right] \quad (18)$$
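A sketch of how these conditional probabilities plug into the MCMC decoder: a factory returning the `log_lik` function expected by the Gibbs sampler sketch of Sect. III-C (the `models` dictionary, holding one estimated model per size, is our own packaging):

```python
import numpy as np
from scipy.stats import norm

def make_log_lik(X, z, models):
    """log P(z|s) from eqs. (17)-(18); models[s0] = ('I', mu, sigma) or ('II', theta, sigma)."""
    def log_lik(s):
        users = [int(u) for u in s if u > 0]
        if not users:
            return -np.inf                   # empty collusion: ruled out
        kappa = X[np.array(users) - 1].sum(axis=0)   # kappa_i: symbols '1' per index
        kind, param, sigma = models[len(users)]
        if kind == "I":                      # eq. (17)
            return norm.logpdf(z, param[kappa], sigma).sum()
        mix = param[kappa] * norm.pdf(z, 1.0, sigma) \
              + (1.0 - param[kappa]) * norm.pdf(z, -1.0, sigma)
        return np.log(mix).sum()             # eq. (18)
    return log_lik
```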
VI. EXPERIMENTAL BODY
The experimental setup is the same as in [4] with ‘Uniform’,
‘Majority’, ‘Average’ and ‘Average2’ attacks (See examples
of Sect. IV-C.1 and IV-C.2). There are two scenarios:
(a) $m = 2048$, $c = 8$, $c_{\max} = 10$,
(b) $m = 1024$, $c = 6$, $c_{\max} = 8$.
The common parameters are $n = 10^4$, $T = 400$, $K = 2000$. The standard deviation of the noise is given by $\sigma = 10^{-\mathrm{SNR}/20}$, with SNR ranging from 2 to 10 dB. This is the signal-to-noise power ratio at the output of the watermark decoder. It should not be confused with the amount of noise on the content samples. These settings are very extreme in the sense that the codes are too short to be used in practice. We made this choice in order to show the limits of our method.
The set $\mathcal{A}$ is defined as $\mathcal{A} \triangleq \{j \mid \hat{P}(j \mid \mathbf{z}) > \tau\}$. A false negative occurs if this set is empty. In a single decoder, the user in $\mathcal{A}$ with the biggest empirical marginal is accused. In case of a tie where $d$ users have the maximum score, one user is randomly picked. This leads to a false positive with probability $d_i/d$, where $d_i$ is the number of innocents with this maximum score. In a joint decoder, the users in $\mathcal{A}$ are all accused. We record the number of caught colluders $|\mathcal{A} \cap \mathcal{C}|$ and the number of accused innocents $|\mathcal{A}| - |\mathcal{A} \cap \mathcal{C}|$.
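A sketch of the two accusation rules (the random tie-break of the single decoder is omitted; `np.argmax` keeps the first maximum):

```python
import numpy as np

def accuse(marginals, tau, joint=True):
    """A = {j : P_hat(j|z) > tau}; the joint decoder accuses all of A, the single one only the top."""
    A = np.flatnonzero(marginals > tau) + 1      # user ids are 1-based
    if joint or A.size == 0:
        return A
    best = A[np.argmax(marginals[A - 1])]
    return np.array([best])
```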
Under scenario (a), the comparison with the performance of the decoder proposed in [4] is mixed, as shown in Fig. 2 (down). The baseline of [4] is slightly better against the 'Majority' attack, whereas our decoder is better against the three other attacks. Yet, our decoder has a major drawback: the probability of false alarm is totally unacceptable at low SNR (i.e., at 2 dB in Fig. 2) and when facing the 'Uniform' attack. The same comment holds for scenario (b) (see Fig. 3).
Fig. 2. Scenario (a), 2250 experiments, SNR ∈ {2, 6, 10} dB. [Up] Probabilities of error for the single decoder: (solid) probability of false negative, (dashed) probability of false positive. [Down] Average number of caught colluders for the joint decoder: (black) average number of accused innocents, (dashed) the baseline from [4]. Colors: (blue) 'Uniform', (green) 'Majority', (red) 'Average', (pink) 'Average2'.
We suspect the 'Uniform' attack to be close to the worst attack at a given SNR: it has been proven that the achievable rate against 'Uniform' quickly converges to that of the worst-case attack when considering hard outputs [3].

Scenario (b) gives us another hint. Both the code length and the square of the collusion size are roughly halved. Since the code length should be of order $m \sim O(c^2)$, the performance of the decoder should be roughly the same under both scenarios. This is not the case: the probability of false accusation is bigger. This leads us to suspect that the estimation part is the Achilles' heel of our algorithm: it produces a bad-quality estimation because the code is too short, and this spoils the MCMC decoding part. A close look at the estimated collusion models reveals that the type of model and the variance of the noise on the soft outputs are almost always evaluated with high fidelity. Under the 'Uniform' attack, the problem indeed stems from the estimates of $\hat{\boldsymbol{\theta}}_{\tilde{c}}$ with $\tilde{c} > c$, which are prone to overfitting: these estimated models are quite different in nature from the 'Uniform' attack of size $\tilde{c}$.

To confirm this intuition with a last experiment, we bypass the estimation and give the MCMC method the collusion models exactly matching the 'Uniform' attack. This simulates a 'perfect' estimation. The results are shown in Fig. 3 (down) in light blue. While the number of caught colluders decreases a little, the accusation is now much more reliable, with a probability of false positive around $3 \cdot 10^{-3}$.
VII. CONCLUSION
The fingerprinting decoder studied in this paper shows some novelty: i) the accusation is probabilistic because of the randomness of the MCMC, ii) it resorts to a posteriori probabilities for groups of users more directly than previous approaches, and iii) the decoding part makes no assumption on the code construction.
Fig. 3. Scenario (b), 2715 experiments, SNR ∈ {2, 6, 10} dB. [Up] Probabilities of error for the single decoder: (solid) probability of false negative, (dashed) probability of false positive. [Down] Average number of caught colluders for the joint decoder: (black) average number of accused innocents, (dashed) the baseline from [4]. Colors: (blue) 'Uniform', (green) 'Majority', (red) 'Average', (pink) 'Average2', and (light blue) 'Uniform' attack with the true model.

It also has some drawbacks: i) the estimation part detailed here is specific to Tardos fingerprinting codes, ii) the complexity of the MCMC is quite high, in $O(Kmn)$, and iii) the probability of false alarm is not easy to control.

The main message is that the performance is limited by the quality of the estimation of the collusion strategy. This is a clear warning to the 'Learn and Match' decoding strategy, and this issue deserves more research effort.
REFERENCES
[1] G. Tardos, "Optimal probabilistic fingerprint codes," in Proc. of the 35th Annual ACM Symposium on Theory of Computing. San Diego, CA, USA: ACM, 2003, pp. 116–125.
[2] K. Nuida, S. Fujitsu, M. Hagiwara, T. Kitagawa, H. Watanabe, K. Ogawa, and H. Imai, "An improvement of discrete Tardos fingerprinting codes," Designs, Codes and Cryptography, vol. 52, no. 3, pp. 339–362, 2009.
[3] Y.-W. Huang and P. Moulin, "On the saddle-point solution and the large-coalition asymptotics of fingerprinting games," IEEE Transactions on Information Forensics and Security, vol. 7, no. 1, pp. 160–175, 2012.
[4] M. Kuribayashi, "Bias equalizer for binary probabilistic fingerprinting codes," in Proc. of the 14th Information Hiding Conference, ser. LNCS. Berkeley, CA, USA: Springer Verlag, May 2012.
[5] P. Meerwald and T. Furon, "Towards practical joint decoding of binary Tardos fingerprinting codes," IEEE Transactions on Information Forensics and Security, vol. 7, no. 4, pp. 1168–1180, August 2012.
[6] E. Amiri, "Fingerprinting codes: higher rates, quick accusations," Ph.D. dissertation, Simon Fraser University, Fall 2010.
[7] E. Knill, A. Schliep, and D. C. Torney, "Interpretation of pooling experiments using the Markov chain Monte Carlo method," J. Comput. Biol., vol. 3, no. 3, pp. 395–406, 1996.
[8] D. Ge, J. Idier, and E. Le Carpentier, "Enhanced sampling schemes for MCMC based blind Bernoulli-Gaussian deconvolution," Signal Processing, vol. 91, no. 4, pp. 759–772, 2011.
[9] C. Robert and G. Casella, Monte Carlo Statistical Methods. Springer Verlag, 2004.
[10] M. R. Gupta and Y. Chen, "Theory and use of the EM algorithm," Foundations and Trends in Signal Processing, vol. 4, no. 3, pp. 223–296, 2010.
Article
We construct binary codes for fingerprinting digital documents. Our codes for n users that are ϵ-secure against c pirates have length O ( c ² log( n /ϵ)). This improves the codes proposed by Boneh and Shaw [1998] whose length is approximately the square of this length. The improvement carries over to works using the Boneh--Shaw code as a primitive, for example, to the dynamic traitor tracing scheme of Tassa [2005]. By proving matching lower bounds we establish that the length of our codes is best within a constant factor for reasonable error probabilities. This lower bound generalizes the bound found independently by Peikert et al. [2003] that applies to a limited class of codes. Our results also imply that randomized fingerprint codes over a binary alphabet are as powerful as over an arbitrary alphabet and the equal strength of two distinct models for fingerprinting.