
Fast Privacy-Preserving Text Classification based on Secure Multiparty Computation

Amanda Resende, Davis Railsback, Rafael Dowsley, Anderson C. A. Nascimento and Diego F. Aranha

Abstract—We propose a privacy-preserving Naive Bayes classifier and apply it to the problem of private text classification. In this setting, a party (Alice) holds a text message, while another party (Bob) holds a classifier. At the end of the protocol, Alice will only learn the result of the classifier applied to her text input and Bob learns nothing. Our solution is based on Secure Multiparty Computation (SMC). Our Rust implementation provides a fast and secure solution for the classification of unstructured text. Applying our solution to the case of spam detection (the solution is generic, and can be used in any other scenario in which the Naive Bayes classifier can be employed), we can classify an SMS as spam or ham in less than 340 ms in the case where the dictionary size of Bob's model includes all words (n = 5200) and Alice's SMS has at most m = 160 unigrams. In the case with n = 369 and m = 8 (the average of a spam SMS in the database), our solution takes only 21 ms.

Index Terms—Privacy-Preserving Classification, Secure Multiparty Computation, Naive Bayes, Spam.

I. INTRODUCTION

Classification is a supervised learning technique in Machine Learning (ML) that has the goal of constructing a classifier given a set of training data with class labels. Decision Tree, Naive Bayes, Random Forest, Logistic Regression and Support Vector Machines (SVM) are some examples of classification algorithms. These algorithms can be used to solve many problems, such as: classifying an email/Short Message Service (SMS) as spam or ham (not spam) [5]; diagnosis of a medical condition (disease versus no disease) [63]; hate speech detection [55]; face classification [45]; fingerprinting identification [17]; and image categorization [29]. For the first three examples above, classification is binary, where there are only two class labels (yes or no); while the last three are multi-class, that is, there are more than two classes.

We consider the scenario in which there are two parties: one possesses the private data to be classified and the other party holds a private model used to classify such data. In such a scenario, the party holding the data (Alice) is interested in obtaining the classification result of such data against a model held by a second party (Bob) so that, at the end of the classification protocol, Alice knows solely the input data and the classification result, and Bob knows nothing beyond the model itself. This scenario is a very relevant one. There are many situations where a data owner is not comfortable sharing a piece of data that needs classification (think of psychological or health related data). Also, a machine learning model holder might not want to/cannot reveal the model in the clear for intellectual property issues or because the model reveals information about the data set used to train it. Thus, both parties have proper incentives to participate in a protocol providing the joint functionality of private classification.

A. Resende is with the Institute of Computing, University of Campinas, Campinas, Brazil. E-mail: amanda.resende@ic.unicamp.br.
D. Railsback and A. C. A. Nascimento are with the School of Engineering and Technology, University of Washington Tacoma, WA 98402, USA. E-mails: {drail, andclay}@uw.edu.
R. Dowsley is with the Faculty of Information Technology, Monash University, Australia. E-mail: rafael.dowsley@monash.edu.
D. F. Aranha is with the Department of Computer Science, Aarhus University, Aarhus, Denmark. E-mail: dfaranha@cs.au.dk.

Due to these concerns, mechanisms such as Secure Multiparty Computation (MPC) [20], Differential Privacy (DP) and Homomorphic Encryption (HE) can be used to build privacy-preserving solutions. MPC allows two or more parties to jointly compute a function over their private inputs without revealing any information to the other parties, whereas HE is an encryption scheme that allows performing computations on encrypted data without having to decrypt it. DP, in turn, is a technique that adds random noise to queries to prevent an adversary from learning information about any particular individual in the data set.

Our main goal is to propose protocols for privacy-preserving text classification. By carefully selecting cryptographic engineering optimizations, we improve upon previous results by Reich et al. [55] by over an order of magnitude, achieving, to the best of our knowledge, the fastest text-classification results in the available literature (21 ms for an average sample of our data set). More specifically, we propose a privacy-preserving Naive Bayes classification (PPNBC) based on MPC where, given a trained model, we classify/predict an example without revealing any additional information to the parties other than the classification result, which can be revealed to one specified party or both parties. We then apply our solution to a text classification problem: classifying SMSes as spam or ham.

A. Application to Private SMS Spam Detection

SMS is one of the most used telecommunication services in the world. It allows mobile phone users to send and receive short texts (of at most 160 7-bit characters). Due to advantages such as reliability (since the message reaches the mobile phone user), the low cost of sending an SMS (especially if bought in bulk), the possibility of personalization, and immediate delivery, SMS is a widely used communication medium for commercial purposes, and mobile phone users are flooded with unsolicited advertising.

SMSes are also used in scams, where someone tries to steal personal information, such as credit card details, bank account information, or social security numbers. Usually, the scammer sends an SMS with a link that invites a person to verify his/her account details, make a payment, or that claims that he/she has earned some amount of money and needs to use the link to confirm. In all cases, such SMSes can be classified as spam.

Machine learning classifiers can be used to detect whether an SMS is spam or not (ham). During the training phase, these algorithms learn a model from a data set of labeled examples, and later on, are used during the classification/prediction phase to classify unseen SMSes. In a Naive Bayes classifier, the model is based on the frequency with which each word occurs in the training data set. In the classification phase, based on these frequencies, the model predicts whether an unseen SMS is spam or not.

A concern with this approach relates to Alice's privacy, since she needs to make her SMSes available to the spam filtering service provider, Bob, who owns the model. SMSes may contain sensitive information that the user would not like to share with the service provider. Besides, the service provider also does not want to reveal the parameters the model uses (in Naive Bayes, the words and their frequencies) to spammers and competing service providers. Our privacy-preserving Naive Bayes classification based on MPC provides an extremely fast secure solution for both parties to classify SMSes as spam or ham without leaking any additional information, while maintaining essentially the same accuracy as the original algorithm performed in the clear. While our experimental treatment is focused on SMS messages, the same approach naturally generalizes to classifying short messages received over Twitter or instant messengers such as WhatsApp or Signal.

B. Our Contributions

These are the main contributions of this work:

• A privacy-preserving Naive Bayes classification (PPNBC) protocol: We propose the first privacy-preserving Naive Bayes classifier with private feature extraction; previous works assumed the features to be publicly known. It is based on secret sharing techniques from MPC. In our solution, given a trained model, it is possible to classify/predict an example without revealing any additional information to the parties other than the classification result, which can be revealed to one or both parties. We prove that our solution is secure in the Universal Composability (UC) framework [14], the gold standard framework for analyzing the security of cryptographic protocols, thus showing that it enjoys very strong security guarantees and can be arbitrarily composed, as well as used in realistic scenarios such as the Internet, without compromising the security of the solution.

• An efficient and optimized software implementation of the protocol: The proposed protocol is implemented in Rust using an up-to-date version of the RustLynx framework (https://bitbucket.org/uwtppml/rustlynx/src/master/).

• Experimental results for the case of SMS classification as spam/ham: The proposed protocol is evaluated in a use case for SMS spam detection, using a data set widely used in the literature. However, it is important to note that the solution is generic, and can be used in any other scenario in which the Naive Bayes classifier can be employed.

While the necessary building blocks already exist in the literature, the main novelty of our work lies in putting these building blocks together, optimizing their implementations using cryptographic engineering techniques, and obtaining the fastest protocol for private text classification to date.

C. Organization

This paper is organized as follows. In Section II, we define the notation used throughout this work, describe the necessary cryptographic building blocks, and briefly introduce the Naive Bayes classifier. In Sections III and IV, we describe our privacy-preserving Naive Bayes classification protocol and present its security analysis. In Section V, we describe the experimental results, from the training phase to the classification using our PPNBC, as well as the cryptographic engineering techniques used in our implementation. Finally, in Sections VI and VII, we present the related works and the conclusions.

II. PRELIMINARIES

As in most existing works on privacy-preserving machine learning based on MPC, we consider an honest-but-curious adversary (also known as a semi-honest adversary). In this model, each party follows the protocol specifications but, using his view of the protocol execution, may try to learn additional information other than his input and specified output.

A. Secure Computation Based on Additive Secret Sharing

Our solution is based on additive secret sharing over a ring Z_q = {0, 1, 2, ..., q − 1}. A value x is secret shared between Alice and Bob over Z_q by picking x_A, x_B ∈ Z_q uniformly at random with the constraint that x = x_A + x_B mod q. Alice receives the share x_A while Bob receives the share x_B. We denote this pair of shares by [[x]]_q. A secret shared value x can be opened towards one party by disclosing both shares to that party. Let [[x]]_q, [[y]]_q be secret shared values and c be a constant. Alice and Bob can locally and straightforwardly perform the following operations:

• Addition (z = x + y): Each party locally adds its shares of x and y modulo q in order to obtain a share of z. This operation will be denoted by [[z]]_q ← [[x]]_q + [[y]]_q.
• Subtraction (z = x − y): Each party locally subtracts its shares of x and y modulo q in order to obtain a share of z. This operation will be denoted by [[z]]_q ← [[x]]_q − [[y]]_q.
• Multiplication by a constant (z = cx): Each party multiplies its local share of x by c modulo q to obtain a share of z. This operation will be denoted by [[z]]_q ← c[[x]]_q.
• Addition of a constant (z = x + c): Alice adds c to her share x_A of x to obtain z_A, while Bob sets z_B = x_B, keeping his original share. This will be denoted by [[z]]_q ← [[x]]_q + c.
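As an illustration only (this is a sketch, not the paper's RustLynx code), the local operations above can be exercised in Rust over Z_q with q = 2^64, where wrapping arithmetic gives the reduction modulo q for free. The fixed offsets used to split values stand in for the uniformly random shares a real implementation would draw from a CSPRNG.

```rust
// Additive secret sharing over Z_{2^64}: split a value into two shares
// whose wrapping sum reconstructs it.
fn share(x: u64, r: u64) -> (u64, u64) {
    // r plays the role of the uniformly random share x_A.
    (r, x.wrapping_sub(r))
}

// Opening a sharing: add the two shares modulo 2^64.
fn open(xa: u64, xb: u64) -> u64 {
    xa.wrapping_add(xb)
}

fn main() {
    let (xa, xb) = share(42, 0xDEAD_BEEF);
    let (ya, yb) = share(100, 0x1234_5678);
    // Addition: each party adds its local shares.
    assert_eq!(open(xa.wrapping_add(ya), xb.wrapping_add(yb)), 142);
    // Multiplication by the public constant c = 3: both parties scale locally.
    assert_eq!(open(xa.wrapping_mul(3), xb.wrapping_mul(3)), 126);
    // Addition of a constant: only Alice adjusts her share.
    assert_eq!(open(xa.wrapping_add(8), xb), 50);
}
```

Choosing q = 2^64 makes the modular reduction implicit in the machine's integer overflow behavior, which is one reason power-of-two moduli are popular in MPC implementations.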


Unlike the above operations, secure multiplication of secret shared values (i.e., z = xy) cannot be done locally and requires interaction between Alice and Bob. To compute this operation highly efficiently, we use Beaver's multiplication triple technique [9], which consumes a multiplication triple (i.e., ([[u]]_q, [[v]]_q, [[w]]_q) such that u and v are chosen uniformly at random and w = uv) in order to compute the multiplication of [[x]]_q and [[y]]_q without leaking any information. We use a trusted initializer (TI) to generate multiplication triples and secret share them to Alice and Bob. In the trusted initializer model, the TI can pre-distribute correlated randomness to Alice and Bob during a setup phase, which is run before the protocol execution (possibly long before Alice and Bob get to know their inputs). The TI is not involved in any other part of the protocol execution and does not get to know the parties' inputs and outputs.² This model was used in many previous works, e.g., [58], [35], [34], [41], [60], [23], [24], [37], [2]. If a TI is not desirable or unavailable, Alice and Bob can securely simulate the TI at the cost of introducing computational assumptions in the protocol [25]. The TI is modeled by the ideal functionality F_TI^D. In addition to the multiplication triples, the TI also generates random values in Z_q and delivers them to Alice so that she can use them to secret share her inputs. If Alice wants to secret share an input x, she picks an unused random value r (note that Bob does not know r), and sends c = x − r to Bob. Her share x_A of x is then set to x_A = r, while Bob's share x_B is set to x_B = c. The secret sharing of Bob's inputs is done similarly, using random values that the TI only delivers to him.

Functionality F_TI^D

F_TI^D is parametrized by an algorithm D. Upon initialization, sample (D_A, D_B) ← D and deliver D_A to Alice and D_B to Bob.

The following straightforward extension of Beaver's idea performs the UC-secure multiplication of secret shared matrices X ∈ Z_q^(i×j) and Y ∈ Z_q^(j×k) [31], [25]. The protocol will be denoted by π_DMM and works as follows:

1) The TI chooses uniformly random U and V in Z_q^(i×j) and Z_q^(j×k), respectively, computes W = UV and pre-distributes secret sharings [[U]]_q, [[V]]_q, [[W]]_q (the secret sharings are done element-wise) to Alice and Bob.
2) Alice and Bob locally compute [[D]]_q ← [[X]]_q − [[U]]_q and [[E]]_q ← [[Y]]_q − [[V]]_q, and then open D and E.
3) Alice and Bob locally compute [[Z]]_q ← [[W]]_q + [[U]]_q E + D[[V]]_q + DE.
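The scalar case of this idea can be sketched as follows (a toy check, not the paper's implementation): correctness follows because w + ue + dv + de = uv + u(y − v) + (x − u)v + (x − u)(y − v) = xy. The triple and share offsets below are fixed constants purely for illustration; a real TI samples them uniformly at random.

```rust
// Opening a sharing over Z_{2^64}.
fn open(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}

// One Beaver multiplication on shares (xs, ys: input sharings;
// us, vs, ws: sharings of a triple with w = u * v). Returns output shares.
fn beaver_mul(
    xs: (u64, u64), ys: (u64, u64),
    us: (u64, u64), vs: (u64, u64), ws: (u64, u64),
) -> (u64, u64) {
    // Step 2: open d = x - u and e = y - v (these leak nothing,
    // since u and v are uniformly random masks).
    let d = open(xs.0.wrapping_sub(us.0), xs.1.wrapping_sub(us.1));
    let e = open(ys.0.wrapping_sub(vs.0), ys.1.wrapping_sub(vs.1));
    // Step 3: [z] = [w] + e[u] + d[v] + d*e (public d*e added to one share).
    let za = ws.0
        .wrapping_add(e.wrapping_mul(us.0))
        .wrapping_add(d.wrapping_mul(vs.0))
        .wrapping_add(d.wrapping_mul(e));
    let zb = ws.1
        .wrapping_add(e.wrapping_mul(us.1))
        .wrapping_add(d.wrapping_mul(vs.1));
    (za, zb)
}

fn main() {
    // Triple: u = 11, v = 13, w = 143, each split into two shares.
    let (us, vs, ws) = ((5u64, 6u64), (9u64, 4u64), (3u64, 140u64));
    // Inputs: x = 6 shared as (2, 4); y = 7 shared as (3, 4).
    let (za, zb) = beaver_mul((2, 4), (3, 4), us, vs, ws);
    assert_eq!(open(za, zb), 42); // 6 * 7
}
```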

B. Fixed Point Arithmetic

Many real-world applications of MPC require representing and operating on continuous data. This poses a challenge because the security of additive secret sharing depends on the fact that shares are uniformly random, a concept that only exists for samples of finite sets. For compatibility with MPC, continuous values need to be represented within a range of possible values. We use the mapping

Q(x) = 2^λ − ⌊2^a · |x|⌋  if x < 0
Q(x) = ⌊2^a · x⌋          if x ≥ 0

to represent real numbers in Z_q, where q = 2^λ, as fixed-point precision values in two's complement. The parameter a is the fractional accuracy, i.e., the number of bits used to represent negative powers of 2. This mapping preserves addition in Z_q straightforwardly, but a multiplication of two fixed-point values results in a fixed-point value with 2a fractional bits. To maintain the expected representation in Z_q, all products need to be truncated by a bit positions, requiring an additional MPC protocol [51]. In this paper, fixed-point values are only added together and multiplied by 0 or 1, thus a truncation protocol is not needed.

² It is a well-known fact that UC-secure MPC needs a setup assumption [15], [16]. A TI is one of the setup assumptions that allows obtaining UC-secure MPC. Other setup assumptions that enable UC-secure MPC include: a common reference string [15], [16], [52], the availability of a public-key infrastructure [6], signature cards [40], tamper-proof hardware [43], [30], [33], noisy channels between the parties [32], [36], and the random oracle model [39], [8], [22].
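The mapping Q can be sketched for λ = 64 as follows (an illustrative sketch; the paper's implementation details may differ). Note that 2^64 − m is exactly the wrapping negation of m, and that addition of encodings matches addition of the underlying reals.

```rust
// Fixed-point encoding into Z_{2^64} with `a` fractional bits,
// following Q(x): scale by 2^a, floor, and two's-complement negatives.
fn encode(x: f64, a: u32) -> u64 {
    let m = ((1u64 << a) as f64 * x.abs()).floor() as u64;
    if x < 0.0 { m.wrapping_neg() } else { m } // 2^64 - m for x < 0
}

fn main() {
    let a = 20;
    // Addition of encodings equals the encoding of the sum.
    let s = encode(1.5, a).wrapping_add(encode(-0.25, a));
    assert_eq!(s, encode(1.25, a));
}
```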

C. Cryptographic Building Blocks

Next, we present the cryptographic building blocks that are used in our PPNBC solution.

Secure Equality Test: To perform a secure equality test, we use a straightforward folklore protocol π_EQ. As input, Alice and Bob have bitwise secret sharings in Z_2 of the bitstrings X = {x_ℓ, ..., x_1} and Y = {y_ℓ, ..., y_1}. The protocol generates as output a secret sharing of 1 if X = Y and a secret sharing of 0 otherwise. The protocol π_EQ works as follows:

1) For i = 1, ..., ℓ, Alice and Bob locally compute [[r_i]]_2 ← [[x_i]]_2 + [[y_i]]_2 + 1.
2) Alice and Bob use secure multiplication (π_DMM) to compute a secret sharing of z = r_1 · ... · r_ℓ. They output the secret sharing [[z]]_2. (Note that if X = Y, then r_i = 1 in all ℓ positions, thus z = 1; otherwise some r_i = 0 and so z = 0.)

By performing the multiplications to compute z in a binary tree style with the values r_1, ..., r_ℓ in the ℓ leaves, the protocol π_EQ requires ⌈log(ℓ)⌉ rounds of communication and a total of 4(ℓ − 1) bits of data transfer. For batched inputs {X_1, ..., X_k}, {Y_1, ..., Y_k}, the number of communication rounds remains the same and the data transfer per round is scaled by k.
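The plaintext logic behind π_EQ can be checked as follows (in the protocol, the AND-tree over the r_i is computed on shares via secure multiplication; here it is computed in the clear purely as a correctness sketch). Over Z_2, x_i + y_i + 1 is x_i XOR y_i XOR 1, i.e., 1 exactly when the bits agree.

```rust
// Equality of two bitstrings as in pi_EQ: r_i = x_i XOR y_i XOR 1,
// then z = AND of all r_i.
fn eq_bits(x: &[u8], y: &[u8]) -> u8 {
    x.iter()
        .zip(y)
        .map(|(a, b)| a ^ b ^ 1) // r_i = 1 iff the i-th bits match
        .fold(1, |acc, r| acc & r)
}

fn main() {
    assert_eq!(eq_bits(&[1, 0, 1], &[1, 0, 1]), 1); // equal strings
    assert_eq!(eq_bits(&[1, 0, 1], &[1, 1, 1]), 0); // differ in one bit
}
```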

Secure Feature Extraction: To perform the feature extraction in a privacy-preserving way, we use the protocol π_FE from Reich et al. [55]. Alice has as input the set A = {a_1, ..., a_m} of unigrams occurring in her message and Bob has as input the set B = {b_1, ..., b_n} of unigrams that occur in his ML model. The elements of both sets are represented as bitstrings of size ℓ. The purpose of the protocol is to extract which words from Alice's message appear in Bob's set. Thus, at the end of the protocol, Alice and Bob have secret shares of a binary feature vector Y which represents which words in Bob's set appear in Alice's message. The binary feature vector Y = {y_1, ..., y_n} is defined as:

y_i = 1 if b_i ∈ A, and y_i = 0 otherwise.

The protocol π_FE works as follows:

1) Alice secret shares a_p with Bob for p = 1, ..., m, while Bob secret shares b_i with Alice for i = 1, ..., n. Both use bitwise secret sharings in Z_2. To secret share their inputs a_p and b_i, Alice and Bob use the method described in Section II-A.
2) For each a_p and each b_i, they execute the secure equality protocol π_EQ, which outputs a secret sharing of y′_ip = 1 if a_p = b_i, and y′_ip = 0 otherwise.
3) Alice and Bob locally compute the secret sharing [[y_i]]_2 ← Σ_{p=1}^{m} [[y′_ip]]_2.

The protocol requires ⌈log(ℓ)⌉ + 1 rounds of communication, 4(ℓ − 1) bits of data transfer for each call of π_EQ, and 1 bit for each input bit that is secret shared. π_FE requires m·n equality tests: the number of communication rounds remains the same as for a single execution, as all the tests can be done in parallel; the data transfer, however, is scaled by m·n. The total data transfer of π_FE is therefore 4·m·n·(ℓ − 1) + m + n bits.
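The feature-vector logic of π_FE can be sketched in the clear as follows (unigrams are shown as strings rather than ℓ-bit encodings, and the toy word lists are invented for illustration). Since Alice's unigrams form a set, at most one equality bit y′_ip per b_i is 1, so the Z_2 sum of Step 3 coincides with set membership.

```rust
// Feature vector of pi_FE in the clear: y_i = 1 iff Bob's unigram b_i
// occurs among Alice's (distinct) unigrams, computed as the XOR-sum
// of per-pair equality bits, mirroring the Z_2 sum in Step 3.
fn feature_vector(alice: &[&str], bob: &[&str]) -> Vec<u8> {
    bob.iter()
        .map(|b| alice.iter().fold(0u8, |acc, a| acc ^ ((a == b) as u8)))
        .collect()
}

fn main() {
    let alice = ["free", "prize", "now"];
    let bob = ["prize", "meeting", "free"];
    assert_eq!(feature_vector(&alice, &bob), vec![1, 0, 1]);
}
```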

Secure Conversion: To perform a secure conversion from a secret sharing in Z_2 to a secret sharing in Z_q, we use the secure conversion protocol π_2toQ presented by Reich et al. [55]. Alice and Bob have as input a secret sharing [[x]]_2 and, without learning any information about x, they must obtain a secret sharing [[x]]_q. The protocol π_2toQ works as follows:

1) For the input [[x]]_2, let x_A ∈ {0, 1} denote Alice's share of x and x_B ∈ {0, 1} denote Bob's share.
2) Define [[x_A]]_q as the shares (x_A, 0) and [[x_B]]_q as the shares (0, x_B).
3) Alice and Bob compute [[y]]_q ← [[x_A]]_q [[x_B]]_q.
4) They output [[z]]_q ← [[x_A]]_q + [[x_B]]_q − 2[[y]]_q.

The protocol π_2toQ requires 1 round of communication and a total of 4λ bits of data transfer, where λ is the bit length of q. For batched inputs {x_1, ..., x_k}, the number of communication rounds remains the same and the data transfer is scaled by k.
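The arithmetic behind π_2toQ is that x_A + x_B − 2·x_A·x_B equals x_A XOR x_B over the integers, so the Z_q result reconstructs the same bit. A plaintext check (the product of Step 3 uses a Beaver triple in the real protocol):

```rust
// pi_2toQ logic: lift Z_2 shares (xa, xb) of a bit to the value the
// Z_{2^64} sharing would reconstruct: xa + xb - 2*xa*xb = xa XOR xb.
fn convert(xa: u8, xb: u8) -> u64 {
    let (a, b) = (xa as u64, xb as u64);
    a.wrapping_add(b).wrapping_sub(2u64.wrapping_mul(a * b))
}

fn main() {
    // All four share patterns reconstruct the correct bit in Z_{2^64}.
    assert_eq!(convert(0, 0), 0);
    assert_eq!(convert(1, 0), 1);
    assert_eq!(convert(0, 1), 1);
    assert_eq!(convert(1, 1), 0); // shares of 0, since 1 XOR 1 = 0
}
```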

Secure Bit Extraction: The secure bit extraction protocol π_BTX takes a secret value [[x]]_q and a publicly known bit position α and returns a Z_2-sharing of the α-th bit of x, [[(x ≫ (α − 1)) ∧ 1]]_2. The protocol is based on a reduction of the protocol for full bit decomposition, modeled after a matrix representation of the carry look-ahead adder circuit, that was presented in [27]. π_BTX [1] works as follows:

1) For the secret shared value [[x]]_q = (x_A, x_B), Alice and Bob locally create bitwise sharings of the propagate signal

p^(α) ← x_A^(α) ⊕ x_B^(α), ..., p^(1) ← x_A^(1) ⊕ x_B^(1),

where x_A^(i) (resp. x_B^(i)) indicates the i-th bit of x_A (resp. x_B).

2) Alice and Bob use the secure multiplication of secret shared values to jointly compute the generate signal

g^(α) ← x_A^(α) ∧ x_B^(α), ..., g^(1) ← x_A^(1) ∧ x_B^(1).

3) Alice and Bob jointly compute the (α − 1)-th carry bit [[c^(α−1)]]_2 as the upper right entry of the matrix product

∏_{1 ≤ i ≤ α−1} ( [[p^(i)]]_2  [[g^(i)]]_2 ; 0  1 ),

where (p g; 0 1) denotes the 2×2 matrix with rows (p, g) and (0, 1).

4) Alice and Bob locally compute [[x^(α)]]_2 ← [[p^(α)]]_2 ⊕ [[c^(α−1)]]_2.

The protocol π_BTX requires ⌈log(α − 1)⌉ rounds and 2(α − 1) + 4 log(α − 1) − 4 bits of communication. Figure 1 shows an example circuit to compute the matrix composition phase for α = 17. For batched inputs {x_1, ..., x_k}, the number of communication rounds remains the same and the total data transfer is scaled by k.
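The carry logic of Steps 1–4 can be verified in the clear with the sketch below (plaintext only; the real protocol composes these matrices on secret shares via secure multiplication, in the tree layout of Figure 1). Here λ = 64 and α = 64, i.e., we extract the most-significant bit of the value reconstructed by the two shares.

```rust
// Compose two carry matrices of the form [[p, g], [0, 1]] over Z_2.
// [[a0,a1],[0,1]] * [[b0,b1],[0,1]] = [[a0&b0, (a0&b1)^a1], [0,1]].
fn compose(a: [u8; 2], b: [u8; 2]) -> [u8; 2] {
    [a[0] & b[0], (a[0] & b[1]) ^ a[1]]
}

// MSB of x_A + x_B mod 2^64, via propagate/generate signals.
fn msb(xa: u64, xb: u64) -> u8 {
    let ms: Vec<[u8; 2]> = (0..63)
        .map(|i| {
            let p = ((xa >> i) ^ (xb >> i)) & 1; // propagate bit i
            let g = ((xa >> i) & (xb >> i)) & 1; // generate bit i
            [p as u8, g as u8]
        })
        .collect();
    // Product with the highest index leftmost; upper-right entry = carry.
    let carry = ms.into_iter().rev().reduce(compose).map_or(0, |m| m[1]);
    let p63 = (((xa >> 63) ^ (xb >> 63)) & 1) as u8;
    p63 ^ carry // Step 4: bit = propagate XOR incoming carry
}

fn main() {
    // Shares of a negative value (two's complement): MSB must be 1.
    let x: u64 = (-5i64) as u64;
    assert_eq!(msb(123, x.wrapping_sub(123)), 1);
    // Shares of a small positive value: MSB must be 0.
    assert_eq!(msb(999, 17u64.wrapping_sub(999)), 0);
}
```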

Secure Comparison: To perform a secure comparison of secret shared integers, we use the protocol π_GEQ of Adams et al. [1]. As input, Alice and Bob hold secret shares in Z_q of integers x and y such that |x − y| < 2^(λ−1) (as integers). In particular, Alice and Bob can use this protocol with integers x and y in the range [−2^(λ−2), 2^(λ−2) − 1] (a negative value u is represented as 2^λ − |u|). The protocol returns a secret share in Z_2 of 1 if x ≥ y and of 0 otherwise. The protocol π_GEQ works as follows:

1) Alice and Bob locally compute the difference of x and y as [[diff]]_q ← [[x]]_q − [[y]]_q. Note that if y > x, then diff is negative.
2) Alice and Bob extract a Z_2-sharing [[MSB]]_2 of the most-significant bit (MSB) of diff using the protocol π_BTX.
3) Given that the most-significant bit of a secret shared value in Z_q is 1 if and only if the value is negative, the negation of the most-significant bit, [[z]]_2 ← 1 + [[MSB]]_2, is 1 if and only if x ≥ y.

The protocol π_GEQ has the same round and communication complexity as the protocol π_BTX for extracting the most-significant bit, i.e., α = λ. They differ solely in the computations done locally.
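The comparison logic reduces to a sign test, which can be sketched in the clear for λ = 64 (a plaintext check of the protocol's arithmetic, not the protocol itself, which operates on shares):

```rust
// pi_GEQ logic in the clear: x >= y iff the MSB of diff = x - y
// (two's complement over Z_{2^64}) is 0; the output is the negated MSB.
fn geq(x: i64, y: i64) -> u8 {
    let diff = (x as u64).wrapping_sub(y as u64);
    let msb = (diff >> 63) & 1;
    (1 ^ msb) as u8
}

fn main() {
    assert_eq!(geq(5, 3), 1);
    assert_eq!(geq(-4, 2), 0);
    assert_eq!(geq(7, 7), 1); // equality counts as x >= y
}
```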

D. Naive Bayes Classifiers

Naive Bayes is a statistical classifier based on Bayes' Theorem with an assumption of independence among features/predictors. It assumes that the presence (or absence) of a particular feature in a class is unrelated to the presence (or absence) of any other feature. Bayes' theorem is used as follows:

P(c|x) = P(x|c)P(c) / P(x),

where: (1) c is the class/category; (2) x is the feature vector of the test example; (3) P(c|x) is the posterior probability, i.e., given test example x, what is its probability of belonging to class c; (4) P(x|c) is known as the likelihood, i.e., given a class c, what is the probability of observing example x; (5) P(c) is the class prior probability; (6) P(x) is the predictor prior probability.

The predictor prior probability P(x) is the normalizing constant that ensures P(c|x) actually falls in the range [0, 1]. In our solution we will be comparing the probabilities of different classes to determine the most likely class of an example. The probabilities are not important per se; only their comparative values are relevant. As the denominator P(x) remains the same for all classes, it will be omitted and we will use

P(c|x) = P(x|c)P(c).

5

[Figure 1: A circuit to compute the (α − 1)-th matrix composition in ⌈log(α − 1)⌉ layers. The notation M_{i,j} indicates the composition of all matrices from M_i to M_j, inclusive.]

As per the assumption that the features x = (x_1, ..., x_d) are independent, we get

P(c|x) = P(c) ∏_{k=1}^{d} P(x_k|c).

Note that when executing Naive Bayes, since the probabilities are often very small numbers, multiplying them will result in even smaller numbers, which often results in underflows that can cause the model to fail. To solve that problem and also simplify operations (and consequently improve performance), we "convert" all multiplication operations into additions by using logarithms. Applying the logarithm we get

log(P(c|x)) = log( P(c) ∏_{k=1}^{d} P(x_k|c) ) = log(P(c)) + Σ_{k=1}^{d} log(P(x_k|c)).

To perform the classification, we then compute the argmax

ĉ = argmax_c [ log(P(c)) + Σ_{k=1}^{d} log(P(x_k|c)) ],

where argmax_c returns the class c that has the highest value for the test example x.
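The underflow motivation can be demonstrated concretely in floating point (an illustration only; the MPC protocol works over fixed-point encodings, not f64):

```rust
// Multiplying many tiny probabilities underflows f64 to zero,
// while summing their logarithms stays well within range.
fn main() {
    let p = 1e-20f64;
    let product: f64 = (0..20).fold(1.0, |acc, _| acc * p); // 1e-400
    assert_eq!(product, 0.0); // underflows: f64 bottoms out near 5e-324
    let log_sum: f64 = (0..20).map(|_| p.ln()).sum(); // about -921
    assert!(log_sum.is_finite());
}
```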

Naive Bayes classifiers differ mainly in the assumptions they make regarding the distribution of P(x|c). In Gaussian Naive Bayes, the assumption is that the continuous values associated with each class are distributed according to a Gaussian distribution. For discrete features, as in text classification, we can use Bernoulli Naive Bayes or multinomial Naive Bayes. In Bernoulli Naive Bayes, the features are Boolean variables, where 1 means that the word occurred in the text and 0 means that it did not. In multinomial Naive Bayes, the features are the frequencies of the words present in the document. In this work, we use multinomial Naive Bayes, and the frequencies of the words are determined during the training phase.

III. PRIVACY-PRESERVING NAIVE BAYES CLASSIFICATION

For the construction of our Privacy-Preserving Naive Bayes Classification (PPNBC) protocol π_PPNBC, Alice constructs her set A = {a_1, ..., a_m} of unigrams occurring in her message and Bob constructs his set B = {b_1, ..., b_n} of unigrams that occur in his ML model. Bob also has log(P(c_j)), that is, the logarithm of the probability of each class c_j, and a set of logarithms of probabilities {log(P(b_1|c_j)), ..., log(P(b_n|c_j)), log(1 − P(b_1|c_j)), ..., log(1 − P(b_n|c_j))}, that is, the logarithms of the probabilities of each word b_i occurring or not in a class c_j. All a_k and b_i are represented as bit strings of length ℓ. In our current implementation, we focus on binary classification. It is straightforward to generalize our protocols to the case of classification into more than two classes by using a secure argmax protocol. Our protocol π_PPNBC follows the description of Naive Bayes presented in Section II-D (using logarithms and not using the normalizing constant), and works as follows:

1) Alice and Bob execute the secure feature extraction protocol π_FE with inputs (a_1, ..., a_m) and (b_1, ..., b_n), respectively. The output consists of secret shared values [[y_1]]_2, ..., [[y_n]]_2 in Z_2, where y_i = 1 if the word b_i ∈ A and 0 otherwise;
2) They use the protocol π_2toQ to convert [[y_1]]_2, ..., [[y_n]]_2 to [[y_1]]_q, ..., [[y_n]]_q, containing secret sharings of the same values in Z_q;
3) For each class c_j:
   a) Using the method described in Section II-A, Bob creates secret shares of his inputs log(P(c_j)), log(P(b_1|c_j)), ..., log(P(b_n|c_j)), log(1 − P(b_1|c_j)), ..., log(1 − P(b_n|c_j)), which comprise the logarithm of the class probability and the set of logarithms of the conditional probabilities;
   b) For i = 1, ..., n, Alice and Bob run π_DMM to compute [[w_i]]_q ← [[y_i]]_q [[log(P(b_i|c_j))]]_q + (1 − [[y_i]]_q) [[log(1 − P(b_i|c_j))]]_q;
   c) Alice and Bob locally compute [[u_j]]_q ← [[log(P(c_j))]]_q + Σ_{i=1}^{n} [[w_i]]_q.
4) Alice and Bob use the protocol π_GEQ to compare the results of Step 3(c) for the two classes, getting as output a secret sharing of the output class [[c]]_2 (the secret sharing [[c]]_2 can afterwards be opened towards the party/parties that should receive the result of the classification).
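Steps 3 and 4 can be sketched in the clear as follows (a plaintext illustration of the scoring logic; the log-probabilities below are invented for the example, and the real protocol computes everything on fixed-point secret shares):

```rust
// Plaintext version of Step 3 of pi_PPNBC: score one class from the
// binary feature vector y, selecting log P(b_i|c) when y_i = 1 and
// log(1 - P(b_i|c)) when y_i = 0, starting from the log prior.
fn class_score(log_prior: f64, log_p: &[f64], log_1mp: &[f64], y: &[u8]) -> f64 {
    y.iter().enumerate().fold(log_prior, |acc, (i, &yi)| {
        acc + if yi == 1 { log_p[i] } else { log_1mp[i] }
    })
}

fn main() {
    // Feature vector from pi_FE: words 1 and 3 of Bob's model occur.
    let y = [1u8, 0, 1];
    // Made-up model parameters for two classes (spam, ham).
    let spam = class_score(-0.7, &[-0.2, -2.0, -0.3], &[-1.7, -0.1, -1.4], &y);
    let ham = class_score(-0.4, &[-2.5, -0.4, -2.2], &[-0.1, -1.1, -0.1], &y);
    // Step 4: secure comparison outputs 1 iff the spam score is higher.
    assert_eq!((spam >= ham) as u8, 1);
}
```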

IV. SECURITY

The concept of simulation is central in modern cryptography. It is used, for instance, to define zero-knowledge proofs, to appropriately analyze secure multi-party computation protocols, and is also behind the concept of semantic security for encryption schemes.

At a high level, in the simulation paradigm for security definitions and proofs, a comparison is made between a "real world" where the actual primitive being analyzed exists and the adversary tries to attack it, and an "ideal world" where there is an idealized primitive (also known as an ideal functionality) that performs the desired functionality and is secure by definition. If one can prove that for any possible action taken by any adversary in the real world there is a corresponding action that an ideal-world adversary (also known as the simulator) interacting with the ideal world can take such that the real and ideal worlds become indistinguishable, then the actual primitive securely realizes what is specified by the ideal functionality. In the case of semantic security for encryption schemes, for instance, one compares what can be learned by an adversary that receives a ciphertext in the real world with what can be learned by a simulator who receives nothing in the ideal world. An encryption scheme is semantically secure if the adversary cannot learn more information than the simulator. For a tutorial on the simulation proof technique, we refer to the work of Lindell [47]. Simulation-based security proofs generally offer stronger security guarantees than security proofs based on lists of properties (and multi-party cryptographic protocols proved secure according to simulation-based definitions are sequentially composable [12]).

The security model considered in this work is the Universal Composability (UC) framework of Canetti [14]. The UC framework is based on the simulation paradigm, and thus in the UC framework security is analyzed by comparing a real world with an ideal world. However, instead of considering the analyzed primitive isolated from the outside world (like other simulation-based definitions), the outside world is taken into account. By taking this additional factor into account, cryptographic protocols that are proven to be UC-secure offer much stronger security guarantees: any protocol that is proven UC-secure can be arbitrarily composed with other copies of itself and of other protocols (even with arbitrarily concurrent executions) while preserving security. That is an extremely useful property that allows the modular design of cryptographic protocols: consider the case in which one has previously proven that a protocol ρ UC-securely realizes an ideal functionality G. Now, consider that one has designed a new protocol π that uses ρ as a sub-protocol and wants to prove that π securely realizes the ideal functionality F. Due to the UC theorem [14], in the security proof that protocol π realizes functionality F, one can consider π using instances of the ideal functionality G (that is UC-realized by ρ) instead of instances of ρ; and this makes the modular design and security analysis of cryptographic protocols much easier. Moreover, UC-security is a necessity for cryptographic protocols running in complex environments such as the Internet. For these reasons, the UC framework is the gold standard for formally defining and analyzing the security of cryptographic protocols (it has been used by thousands of scientific works); and protocols that are UC-secure provide much stronger security guarantees than protocols proven secure according to other notions.

Here only a short overview of the UC framework for the specific case of protocols with two participants is presented. We refer interested readers to the full version of the original work of Canetti [13] and the book of Cramer et al. [20] for more details.

As mentioned before, the UC framework considers a real world and an ideal world. In the real world, Alice and Bob interact between themselves and with an adversary A and an environment Z. The environment Z captures all activities external to the protocol instance under consideration (i.e., it captures the outside world, everything other than the single protocol instance whose security is being analyzed), and is responsible for giving the inputs to and getting the outputs from Alice and Bob. The adversary A can corrupt either Alice or Bob, in which case he gains control over that participant. The network scheduling is assumed to be adversarial and thus A is responsible for delivering the messages between Alice and Bob. In the ideal world, there is an ideal functionality F that captures the perfect specification of the desired outcome of the computation. F receives the inputs directly from Alice and Bob, performs the computations locally following the primitive specification, and delivers the outputs directly to Alice and Bob. A protocol π executed between Alice and Bob in the real world UC-realizes the ideal functionality F if for every adversary A there exists a simulator S such that no environment Z can distinguish between: (1) an execution of the protocol π in the real world with the participants Alice and Bob, and the adversary A; and (2) an ideal execution with dummy parties (that only forward inputs/outputs), F, and S.

Simplifications: The messages of ideal functionalities are formally public delayed outputs, meaning that S is first asked whether they should be delivered or not (this is due to the modeling choice that the adversary controls the network scheduling). This detail, as well as the session identifiers, is omitted from the descriptions of the functionalities presented here for the sake of readability.

Simulation Strategy: As is standard in UC security proofs, in our proofs the simulator S interacting in the ideal world internally runs a copy of the adversary A and internally simulates an execution of the protocol π for A (using only the information that S can obtain in the ideal world). The simulator S forwards the messages that A and Z intend for each other, thus allowing them to communicate freely (note that in the real world A and Z can communicate freely, so S must also allow this communication). One goal of the simulator is to make the internally simulated execution of the protocol π and the real execution of π in the real world indistinguishable from the point of view of A and Z. Moreover, in the simulated execution of π in the ideal world, S needs to extract the inputs of the corrupted parties in order to forward them to F, and must also make sure that the outputs in the simulated execution of π and in the ideal functionality F match (note that the environment Z can see the inputs/outputs of the uncorrupted parties directly from the dummy parties, which simply forward inputs/outputs between F and Z). As long as the internally simulated and real executions of π are indistinguishable and the inputs/outputs of both worlds match, the environment will not be able to distinguish the real and ideal worlds.

In the case of our (sub-)protocols, all computations are performed over secret sharings and all protocol messages look uniformly random from the point of view of the receiver, with the single exception of the openings of the secret sharings. Nevertheless, the messages that open a secret sharing can be straightforwardly simulated using the outputs of the respective functionalities. In the ideal world, if π uses ρ as a sub-protocol and it has been previously shown that ρ UC-realizes functionality G, then using the UC theorem [14] it is possible to substitute the instances of ρ used in π by instances of G. And, in the ideal world, the simulator S has the leverage of being the one responsible for simulating all the ideal functionalities other than the one whose security is being analyzed. Using this leverage, the simulator S will easily be able to perform a perfect simulation in the case of our protocols.

As shown in [31], [25], the protocol π_DMM for secure matrix multiplication UC-realizes the distributed matrix multiplication functionality F_DMM in the trusted initializer model. The correctness follows trivially, as Z = XY = (U + D)(V + E) = UV + UE + DV + DE = W + UE + DV + DE, and therefore ⟦Z⟧_q ← ⟦W⟧_q + E⟦U⟧_q + D⟦V⟧_q + DE yields a secret sharing corresponding to Z = XY. The fact that the resulting shares are uniformly random, subject to the constraint that Z = XY, follows trivially from the fact that the pre-distributed multiplication triple has this property. The simulator S internally runs a copy of the adversary A and perfectly reproduces an execution of the real-world protocol for A: S simulates an execution of π_DMM with dummy inputs for the uncorrupted parties (note that from A's point of view the generated messages are indistinguishable from the messages in the real protocol execution, as the shares of U and V are uniformly random and unknown to A), and uses the leverage of being responsible for simulating the trusted initializer functionality F_TI^D for A in order to extract the shares of X and Y whenever a corrupted party announces its shares of D and E in the simulated protocol execution. Having extracted the inputs of the corrupted party, S can forward them to the distributed matrix multiplication functionality F_DMM. Given the knowledge of S about ⟦U⟧_q, ⟦V⟧_q, ⟦W⟧_q, D and E, by the end of the simulated execution it knows, for each corrupted party, the value that its share of the output should be, and therefore S can fix these values in F_DMM so that the sum of the uncorrupted parties' shares is compatible with the simulated execution of π_DMM. Given these facts, no environment Z can distinguish the real and ideal worlds.
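The correctness identity above (Z = W + UE + DV + DE) can be illustrated with a minimal scalar Beaver-triple multiplication over Z_{2^64}. The function names and fixed masks below are purely illustrative and are not RustLynx's actual API:

```rust
// Minimal sketch of Beaver-triple multiplication over Z_{2^64}.
// All names are illustrative, not taken from the paper's code.
use std::num::Wrapping;

type Share = Wrapping<u64>;

// Additively share a secret: s = s0 + s1 mod 2^64.
fn share(secret: u64, mask: u64) -> (Share, Share) {
    let s1 = Wrapping(mask);
    (Wrapping(secret) - s1, s1)
}

// Each party holds shares of x, y and of a triple (u, v, w) with w = u*v.
// They open d = x - u and e = y - v, then locally compute
// z_i = w_i + e*u_i + d*v_i (+ d*e on one party), so that z = x*y.
fn beaver_mul(
    x: (Share, Share), y: (Share, Share),
    u: (Share, Share), v: (Share, Share), w: (Share, Share),
) -> (Share, Share) {
    let d = (x.0 - u.0) + (x.1 - u.1); // opened value D
    let e = (y.0 - v.0) + (y.1 - v.1); // opened value E
    let z0 = w.0 + e * u.0 + d * v.0 + d * e; // party 0 adds D*E once
    let z1 = w.1 + e * u.1 + d * v.1;
    (z0, z1)
}

fn main() {
    // Trusted-initializer triple: w = u * v.
    let (u, v) = (6u64, 7u64);
    let u_sh = share(u, 1111);
    let v_sh = share(v, 2222);
    let w_sh = share(u.wrapping_mul(v), 3333);

    let x_sh = share(41, 4444);
    let y_sh = share(3, 5555);
    let z_sh = beaver_mul(x_sh, y_sh, u_sh, v_sh, w_sh);
    assert_eq!((z_sh.0 + z_sh.1).0, 123); // 41 * 3
    println!("ok");
}
```

The matrix version F_DMM follows the same identity with matrix-valued U, V, W, D and E.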

Functionality F_DMM

F_DMM is parametrized by the size q of the ring Z_q and the dimensions (i, j) and (j, k) of the matrices.

Input: Upon receiving a message from Alice/Bob with its shares of ⟦X⟧_q and ⟦Y⟧_q, verify whether the share of X is in Z_q^{i×j} and the share of Y is in Z_q^{j×k}. If not, abort. Otherwise, record the shares, ignore any subsequent message from that party and inform the other party about the receipt.

Output: Upon receipt of the shares from both parties, reconstruct X and Y from the shares, compute Z = XY and create a secret sharing ⟦Z⟧_q to distribute to Alice and Bob: a corrupt party fixes its share of the output to any chosen matrix, and the shares of the uncorrupted parties are then created by picking uniformly random values subject to the correctness constraint.

As proved by Reich et al. [55], the protocol π_EQ UC-realizes the functionality F_EQ. The correctness of π_EQ follows trivially from the fact that if x = y, then all r_i's are equal to 1 and therefore z = ∏_i r_i is also 1; and if x ≠ y, then for at least one index i we have r_i = 0, and therefore z = 0. As shown by Reich et al. [55], in the ideal world S can run an internal copy of A and a simulated execution of π_EQ with dummy inputs (from A's point of view the messages are indistinguishable from the ones in the real protocol execution). Since π_DMM is substituted by F_DMM using the UC composition theorem, and S is the one responsible for simulating F_DMM in the ideal world, S can leverage this fact in order to extract the share that any corrupted party has of the value x_i + y_i. Let the extracted value be denoted by v_{i,C}. S then picks uniformly random x_{i,C}, y_{i,C} ∈ {0, 1} such that x_{i,C} + y_{i,C} = v_{i,C} mod 2 and submits them to F_EQ as the corrupted party's shares of x_i and y_i (note that F_EQ's output only depends on the values x_i + y_i mod 2). S is also trivially able to fix in F_EQ the output share of the corrupted party so that it matches the one in the internally simulated instance of π_EQ. This is a perfect simulation strategy and no environment Z can distinguish the ideal and real worlds. Therefore π_EQ UC-realizes F_EQ.
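The correctness argument can be checked in the clear: per bit, r_i = 1 + x_i + y_i mod 2 equals 1 exactly when the bits agree, and the product of all r_i is the equality bit. A minimal plaintext sketch follows; the secure protocol computes the product on secret shares via π_DMM:

```rust
// Plaintext illustration of the pi_EQ correctness logic: for each bit,
// r_i = 1 + x_i + y_i mod 2 equals 1 iff x_i = y_i, and z = prod r_i
// equals 1 iff x = y. In the protocol the product is computed on
// secret shares via secure multiplication; here all is in the clear.
fn eq_bits(x: u16, y: u16, ell: u32) -> u8 {
    let mut z = 1u8;
    for i in 0..ell {
        let xi = ((x >> i) & 1) as u8;
        let yi = ((y >> i) & 1) as u8;
        let ri = (1 + xi + yi) % 2; // 1 iff the bits agree
        z *= ri; // logical AND over Z_2
    }
    z
}

fn main() {
    assert_eq!(eq_bits(0b1011, 0b1011, 14), 1);
    assert_eq!(eq_bits(0b1011, 0b1010, 14), 0);
    println!("ok");
}
```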

Functionality F_EQ

F_EQ is parametrized by the bit-length ℓ of the values being compared.

Input: Upon receiving a message from Alice/Bob with her/his shares of ⟦x_i⟧_2 and ⟦y_i⟧_2 for all i ∈ {1, . . . , ℓ}, record the shares, ignore any subsequent messages from that party and inform the other party about the receipt.

Output: Upon receipt of the inputs from both parties, reconstruct x and y from the bitwise shares. If x = y, then create and distribute to Alice and Bob the secret sharing ⟦1⟧_2; otherwise the secret sharing ⟦0⟧_2. Before the delivery of the output shares, a corrupt party may fix its share of the output to any constant value. In both cases the shares of the uncorrupted parties are then created by picking uniformly random values subject to the correctness constraint.

From the fact that π_EQ UC-realizes F_EQ, it follows straightforwardly that π_FE UC-realizes the functionality F_FE. The correctness is trivial to verify, and F_EQ does not reveal any information at all about the secret shared values. In the ideal world, S executes an internal copy of A and simulates an execution of the protocol π_FE for A. Note that in this internal simulation S can use the leverage of being responsible for simulating F_TI^D in order to extract all inputs of the corrupted party, which can then be forwarded to F_FE. S can also fix in F_FE the output shares of the corrupted party to match the ones in the internally simulated execution. No environment Z is able to distinguish the real and ideal worlds, and thus π_FE UC-realizes the functionality F_FE.

Functionality F_FE

F_FE is parametrized by the sizes m of Alice's set and n of Bob's set, and the bit-length ℓ of the elements.

Input: Upon receiving a message from Alice with her set A = {a_1, a_2, . . . , a_m} or from Bob with his set B = {b_1, b_2, . . . , b_n}, record the set, ignore any subsequent messages from that party and inform the other party about the receipt.

Output: Upon receipt of the inputs from both parties, define the binary feature vector x of length n by setting each element x_i to 1 if b_i ∈ A, and to 0 otherwise. Then create and distribute to Alice and Bob the secret sharings ⟦x_i⟧_2. Before the delivery of the output shares, a corrupt party may fix its share of the output to any constant value. In both cases the shares of the uncorrupted parties are then created by picking uniformly random values subject to the correctness constraint.

As proved by Reich et al. [55], the protocol π_2toQ UC-realizes the functionality F_2toQ. The correctness of π_2toQ is trivial to verify: as x = x_A + x_B mod 2, the value z = x_A + x_B − 2·x_A·x_B satisfies z = x for all possible values x_A, x_B ∈ {0, 1}. In the internal simulation of π_2toQ in the ideal world, S can use the fact that it is the one simulating F_DMM in order to extract the share of any corrupted party and fix the input to/output from F_2toQ appropriately, so that no environment Z can distinguish the real and ideal worlds. Hence π_2toQ UC-realizes F_2toQ.
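The conversion identity z = x_A + x_B − 2·x_A·x_B can be verified exhaustively over the four possible share values. A small sketch, with the cross term computed in the clear rather than with a Beaver triple as in the actual protocol:

```rust
// Sketch of the pi_2toQ conversion identity: if x = x_A XOR x_B with
// x_A, x_B in {0,1}, then x_A + x_B - 2*x_A*x_B equals x in Z_{2^64}.
// In the protocol the cross term x_A*x_B is computed securely via a
// multiplication triple; here we just check the identity in the clear.
use std::num::Wrapping;

fn convert(xa: u64, xb: u64) -> u64 {
    let (a, b) = (Wrapping(xa), Wrapping(xb));
    (a + b - Wrapping(2) * a * b).0
}

fn main() {
    for xa in 0..2u64 {
        for xb in 0..2u64 {
            assert_eq!(convert(xa, xb), xa ^ xb); // matches XOR in Z_2
        }
    }
    println!("ok");
}
```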

Functionality F_2toQ

F_2toQ is parametrized by the size q of the ring.

Input: Upon receiving a message from Alice/Bob with her/his share of ⟦x⟧_2, record the share, ignore any subsequent messages from that party and inform the other party about the receipt.

Output: Upon receipt of the inputs from both parties, reconstruct x, then create and distribute to Alice and Bob the secret sharing ⟦x⟧_q. Before the delivery of the output shares, a corrupt party may fix its share of the output to any constant value. In both cases the shares of the uncorrupted parties are then created by picking uniformly random values subject to the correctness constraint.

The bit extraction protocol π_BTX of [1] is a straightforward simplification of the bit decomposition protocol π_decompOPT from [27] and UC-realizes the bit extraction functionality F_BTX. Note that the simulator S can straightforwardly extract the bit-string of a corrupted party in an internal simulation of π_BTX with the adversary A by using the fact that it is responsible for simulating the F_DMM instances used to compute the generate signals. Thus S can forward the necessary inputs to F_BTX. It can also easily fix the output share of the corrupted party in F_BTX so that it matches the one in the internal simulation of protocol π_BTX. Therefore S has a perfect simulation strategy and Z cannot distinguish the ideal and real worlds.

Functionality F_BTX

F_BTX is parametrized by α. It receives bit-strings x_A = x_A^{(α)} · · · x_A^{(1)} and x_B = x_B^{(α)} · · · x_B^{(1)} from Alice and Bob, respectively, and returns a secret sharing of the α-th bit of x = x_A + x_B.

Input: Upon receiving a message from Alice with her bit-string x_A or from Bob with his bit-string x_B, record it, ignore any subsequent messages from that party and inform the other party about the receipt.

Output: Upon receipt of the inputs from both parties, compute x = x_A + x_B, extract the α-th bit x_α of x and distribute a new secret sharing ⟦x_α⟧_2 of the bit x_α. Before the output delivery, the corrupt party may fix its share of the output to any desired value. The shares of the uncorrupted parties are then created by picking uniformly random values subject to the correctness constraints.

Proceeding to the analysis of protocol π_GEQ, its correctness follows trivially. In the ideal world, the simulator S executes an internal copy of A interacting with an instance of protocol π_GEQ in which the uncorrupted parties use dummy inputs. Note that all the messages that A receives look uniformly random to him. Since π_BTX is substituted by F_BTX using the UC composition theorem, and S is responsible for simulating F_BTX in the ideal world, S can leverage this fact in order to extract the share that any corrupted party has of the secret shared value diff = x − y. Let the extracted value of the corrupted party be denoted by diff_C. The simulator then picks uniformly random values x_C, y_C in Z_q such that diff_C = x_C − y_C and submits these values to F_GEQ as the shares of the corrupted party for the secret shared values x and y (note that the result of F_GEQ only depends on the value of x − y mod q). S is also trivially able to fix the output share of the corrupted party in F_GEQ so that it matches the one in the internally simulated instance of π_GEQ. This is a perfect simulation strategy and no environment Z can distinguish the ideal and real worlds. Therefore π_GEQ UC-realizes F_GEQ.
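The underlying logic is that, under the promise |x − y| < 2^{λ−1}, the comparison reduces to the top bit of diff = x − y mod 2^λ, which is the bit π_BTX extracts on shares. A plaintext sketch for λ = 64:

```rust
// Sketch of the pi_GEQ logic: with |x - y| < 2^(lambda-1), the sign of
// diff = x - y mod 2^lambda is its top bit, which pi_BTX extracts on
// secret shares. Here we compute it in the clear for lambda = 64.
use std::num::Wrapping;

fn geq(x: u64, y: u64) -> u8 {
    let diff = (Wrapping(x) - Wrapping(y)).0;
    let sign = (diff >> 63) & 1; // bit lambda-1: 1 iff diff is "negative"
    (1 - sign) as u8 // output 1 iff x >= y
}

fn main() {
    assert_eq!(geq(10, 3), 1);
    assert_eq!(geq(3, 10), 0);
    assert_eq!(geq(7, 7), 1);
    println!("ok");
}
```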

Functionality F_GEQ

F_GEQ runs with Alice and Bob and is parametrized by the bit-length λ of the ring Z_q (i.e., q = 2^λ). It receives as input the secret shared values x and y, which are guaranteed to satisfy |x − y| < 2^{λ−1} (as integers).

Input: Upon receiving a message from Alice or Bob with its shares of ⟦x⟧_q and ⟦y⟧_q, record the shares, ignore any subsequent messages from that party and inform the other party about the receipt.

Output: Upon receipt of the inputs from both parties, reconstruct the values x and y, and compute diff = x − y. If diff represents a negative number, distribute a new secret sharing ⟦0⟧_2; otherwise a new secret sharing ⟦1⟧_2. Before the output delivery, the corrupt party may fix its share of the output to any desired value. The shares of the uncorrupted parties are then created by picking uniformly random values subject to the correctness constraints.

When we analyze the security of the final protocol π_PPNBC for privacy-preserving Naive Bayes classification, we can use the UC theorem [14] to substitute the instances of π_FE, π_2toQ, π_DMM and π_GEQ that are used as sub-protocols in π_PPNBC by instances of F_FE, F_2toQ, F_DMM and F_GEQ, respectively. Note that these ideal functionalities do not leak any information at all to the protocol participants: the functionalities only manipulate the secret sharings of the inputs in order to obtain secret sharings of the desired outputs, but no secret shared value is ever opened to any party. Thus, from the point of view of any protocol participant, all values seen throughout the execution of π_PPNBC look uniformly random. In the ideal world, S internally runs a copy of A and simulates an execution of the real-world protocol π_PPNBC for A. Using the leverage of being responsible for simulating F_TI^D, F_FE, F_2toQ, F_DMM and F_GEQ in this internal simulation of the protocol π_PPNBC for the adversary A, the simulator S is trivially able to extract all inputs of the corrupted party that it needs to forward to the functionality F_PPNBC in the ideal world. In other words, S can extract (a_1, . . . , a_m) in the case that Alice is corrupted; and it can extract (b_1, . . . , b_n) and (log(P(c_j)), log(P(b_1|c_j)), . . . , log(P(b_n|c_j)), log(1 − P(b_1|c_j)), . . . , log(1 − P(b_n|c_j))) for each class c_j in the case that Bob is corrupted. Knowing all values of the internal simulation, S can also trivially fix the share of the output that corresponds to the corrupt party in order to match the one from the internal execution. Therefore, it follows straightforwardly that π_PPNBC UC-realizes functionality F_PPNBC.

Functionality F_PPNBC

Input: Upon receiving a message from Alice with her inputs (a_1, . . . , a_m) or from Bob with his inputs (b_1, . . . , b_n) and (log(P(c_j)), log(P(b_1|c_j)), . . . , log(P(b_n|c_j)), log(1 − P(b_1|c_j)), . . . , log(1 − P(b_n|c_j))) for each class c_j, record the values, ignore any subsequent messages from that party and inform the other party about the receipt.

Output: Upon receipt of the inputs from both parties, locally perform the same computational steps as π_PPNBC using the secret sharings. Let ⟦c⟧_2 be the result. Before the delivery of the output shares, a corrupt party can fix the shares that it will get, in which case the other shares are adjusted accordingly so that they still sum to c. The output shares are delivered to the parties.

V. EXPERIMENTAL RESULTS

To evaluate the proposed protocol in a use case for spam detection, we use the SMS Spam Collection Data Set from the UC Irvine Machine Learning Repository.3 This database contains a set of tagged SMS messages that have been collected for SMS spam research. It contains 5574 SMS messages in English tagged as legitimate (ham) or spam. The files contain one message per line, where each line is composed of two columns: the first contains the label (ham or spam) and the second contains the raw text (see examples in Figure 2). The data set has 747 spam SMS messages and 4827 ham SMS messages; that is, 13.4% of the messages are spam and 86.6% are ham.

Table I shows the distribution of tokens/unigrams in the data

set. As we can see, the data set has a total of 81175 tokens.

When training a spam classiﬁer, techniques can be used to

reduce this set of tokens in order to improve the performance

of the protocol in terms of accuracy, runtime or other metrics.

A. Training phase

In the classiﬁcation phase, Bob already has the ML model.

The model was generated using the following steps:

1) Bob takes the SMS Spam Collection Data Set and parses

each line into unigrams. The letters are converted to

lower case and everything other than letters is deleted.

3 https://archive.ics.uci.edu/ml/datasets/sms+spam+collection


Fig. 2. Some examples of tagged SMS messages from the SMS Spam Collection Data Set.

TABLE I
TOKEN STATISTICS [5].

Tokens in Hams 63632

Tokens in Spams 17543

Total of Tokens 81175

Average Tokens per Ham 13.18

Average Tokens per Spam 23.48

Average Tokens per Message 14.56

2) To achieve higher accuracy and improve the runtime of the algorithm, we used stemming and stop-word removal. Stemming is the process of reducing inflected words to their roots, mapping a group of words to the same stem even if the stem itself is not a valid word in the language. For example, likes, liked, likely and liking reduce to the stem like; retrieval, retrieved and retrieves reduce to the stem retrieve; and trouble, troubled and troubles reduce to the stem troubl. Stop-word removal filters out words that can be considered irrelevant to the classification task, such as: the, a, an, in, to, for.

3) The remaining unigrams are inserted in a Bag of Words

(BoW). A BoW is created for the ham category and

another for the spam category. Each BoW contains the

unigrams and their corresponding frequency counters.

4) Based on the frequency counters, we remove the least frequent words in order to decrease the runtime of our privacy-preserving solution. We will return to this parameter later when we detail the trade-off between accuracy and efficiency.

5) Bob computes the logarithm of the class prior probability for each class c:

log(P(c)) = log( |training examples ∈ c| / |examples in the training set| ).  (1)

6) Bob computes the logarithm of the probability of each word given the class. To compute this probability we find the relative frequency of each word for a given class. For the class c and the word i:

log(P(i|c)) = log( |occurrences of word "i" in class c| / |words in class c| ).  (2)

However, as some words can have 0 occurrences, we use Laplace smoothing:

log(P(i|c)) = log( (|occurrences of word "i" in class c| + 1) / (|words in class c| + |V|) ),  (3)

where |V| is the size of the vocabulary, i.e., the number of unique words in the training data set regardless of class.

In Equations 1 and 3, before computing the logarithm we need to scale the result of the division in order to work with integers. That is, before inputting this model into our privacy-preserving protocol, we must convert every fixed-precision real number into an integer. To do so, we follow Section II-B and pick a value of a equal to 34. With the values of log(P(c)) and log(P(i|c)) computed, the model is generated. Note that only Bob is involved in training the model.
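Steps 5 and 6 can be sketched as follows. The scaling exponent a = 34 follows the text, while the helper names and the toy counts in the example are illustrative only:

```rust
// Sketch of Bob's model computation (Equations 1 and 3): class log
// priors and Laplace-smoothed per-word log likelihoods, scaled to
// integers for the secret-sharing domain. The helper names and toy
// counts are illustrative, not the paper's exact code.
use std::collections::HashMap;

const A: u32 = 34; // fixed-point scaling exponent (Section II-B)

fn log_prior(class_count: usize, total: usize) -> i64 {
    let p = class_count as f64 / total as f64;
    (p.ln() * (1u64 << A) as f64) as i64 // scaled log P(c)
}

fn log_likelihood(word_count: usize, class_tokens: usize, vocab: usize) -> i64 {
    // Laplace smoothing: (count + 1) / (tokens in class + |V|)
    let p = (word_count as f64 + 1.0) / (class_tokens + vocab) as f64;
    (p.ln() * (1u64 << A) as f64) as i64
}

fn main() {
    // Toy counts: 747 spam messages out of 5574, as in the data set.
    let prior_spam = log_prior(747, 5574);
    assert!(prior_spam < 0); // log of a probability < 1 is negative
    let mut spam_bow: HashMap<&str, usize> = HashMap::new();
    spam_bow.insert("free", 120); // illustrative frequency counter
    let ll = log_likelihood(spam_bow["free"], 10981, 5950);
    assert!(ll < 0);
    println!("ok");
}
```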

Table II shows the distribution of tokens/unigrams in the data set after the training phase. Compared to Table I, there is a reduction of over 30 thousand tokens, and fewer than 6000 unique tokens remain.

TABLE II
TOKEN STATISTICS AFTER THE TRAINING PHASE.

Tokens in Hams 38469

Tokens in Spams 10981

Total of Tokens 49450

Average Tokens per Ham 7.97

Average Tokens per Spam 14.7

Average Tokens per Message 8.87

Unique Tokens in Hams 5950

Unique Tokens in Spams 1883

Unique Tokens in the Data Set 5950

B. Cryptographic Engineering

Our solution to secure Naive Bayes classification is implemented in Rust using an up-to-date version of the RustLynx framework,4 which was used in [27] to achieve a state-of-the-art implementation of secure logistic regression training.

We now brieﬂy describe some of the engineering aspects of

our implementation.

4https://bitbucket.org/uwtppml/rustlynx/src/master/


a) Parallel Data Transfer: Instead of the atomic sending and receiving queues that might be utilized in a general-purpose multi-threaded network application, we associate each thread (for a fixed threadpool size) with a port in a given port range such that the i-th thread executing in the P_0 process only exchanges data with the i-th thread of the P_1 process. We base this choice on the observation that MPC operations are symmetric, so there is never an instance where the receiver does not know the length, intent, or timing of a message from the sender; these are the situations for which a more complex, and slower, messaging system would be necessary. An additional benefit of this structure is that the packets require no header to denote the sender, thread ID, or length of the body. Based on empirical testing of Z_{2^64} multiplication with an optimized threadpool size, this method yields a 6× improvement over an architecture with atomic queues.

b) Operations in Z_{2^λ}: RustLynx supports arithmetic over Z_{2^64}. This particular bit length is chosen because (1) it is sufficiently large to represent realistic data in fixed-point form and (2) it aligns with a primitive data type, meaning that modular arithmetic operations can be performed implicitly by permitting integer overflow.

c) Operations in Z_2: We represent Z_2 shares in groupings of 128 as the individual bits of Rust's unsigned 128-bit integer primitive. Doing so allows local operations on an entire group of secrets to share Arithmetic Logic Unit (ALU) cycles and to be loaded, copied, and iterated quickly. The downside of this design choice is that sending m Z_2 shares corresponds to 16 · ⌈m/128⌉ bytes of data transfer, which, in the worst case, is 15 bytes larger than the most compact possible representation of m bits (that is, using groups of 8). Based on empirical testing, the performance of the MPC primitives suffers significantly more from time wasted on local operations than from wasting a small amount of bandwidth. Hence, the largest available primitive data type was chosen to group Z_2 shares.
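The two representations above can be sketched as follows; the helper names are illustrative and not RustLynx's actual API:

```rust
// Sketch of the two share representations: Z_{2^64} shares
// reconstructed with native wrapping (overflow) arithmetic, and
// 128 Z_2 shares packed into one u128 so that a single XOR performs
// all 128 additions in Z_2 at once. Names are illustrative only.

// (b) Z_{2^64}: modular reduction is implicit integer overflow.
fn reconstruct_mod_2_64(s0: u64, s1: u64) -> u64 {
    s0.wrapping_add(s1)
}

// (c) Z_2: XOR of two packed words is 128 share additions in one ALU op.
fn add_packed_z2(x: u128, y: u128) -> u128 {
    x ^ y
}

// Sending m Z_2 shares costs 16 * ceil(m/128) bytes in this layout.
fn z2_transfer_bytes(m: usize) -> usize {
    16 * ((m + 127) / 128)
}

fn main() {
    assert_eq!(reconstruct_mod_2_64(u64::MAX - 4, 47), 42);
    assert_eq!(add_packed_z2(0b1010, 0b0110), 0b1100);
    assert_eq!(z2_transfer_bytes(200), 32); // vs 25 bytes with groups of 8
    println!("ok");
}
```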

C. Evaluation

We ran our experiments on Amazon Web Services (AWS) using two c5.9xlarge EC2 instances with 36 vCPUs and 72.0 GiB of memory, using 32 threads. The parties ran on separate machines (connected over a Gigabit Ethernet LAN), which means that the results in Table IV cover the communication time in addition to the computation time. All experiments were repeated one hundred times and averaged to minimize the variance caused by large thread counts.

We evaluate PPNBC using 5-fold cross-validation over the entire corpus of 5574 SMS messages. For each unigram in Alice's set A = {a_1, a_2, · · · , a_m} and each unigram in Bob's set B = {b_1, b_2, · · · , b_n}, we apply the hash function SHA-256 (and truncate the result) to transform each one into a bit-string of size ℓ = 14.
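The hashing step can be sketched as follows. The paper uses truncated SHA-256; the sketch below substitutes Rust's standard-library DefaultHasher so that it runs without external crates, which is an assumption for illustration only:

```rust
// Sketch of the feature-hashing step: each unigram is hashed and the
// digest truncated to ell = 14 bits. The paper uses SHA-256; here
// Rust's std DefaultHasher stands in so the sketch needs no crates.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const ELL: u32 = 14;

fn hash14(unigram: &str) -> u16 {
    let mut h = DefaultHasher::new();
    unigram.hash(&mut h);
    (h.finish() & ((1 << ELL) - 1)) as u16 // keep the low 14 bits
}

fn main() {
    // Equal tokens hash identically; the secure equality test pi_EQ
    // then compares these 14-bit strings bitwise on secret shares.
    assert_eq!(hash14("free"), hash14("free"));
    assert!(hash14("free") < (1 << ELL));
    println!("ok");
}
```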

We evaluated our solution for m ∈ {8, 16, 50, 160} and n ∈ {369, 484, 688, 5200}. Note that the value of n affects both the accuracy and the running time. The n values were defined based on the frequency of tokens appearing in the training data set: 688 tokens appeared more than 9 times; 484 tokens appeared more than 14 times; and 369 tokens appeared more than 19 times. We noticed a significant degradation of the False Positive Rate (FPR) when further reducing n. The values of m were defined based on the size of our messages. Note that m counts the number of tokens in the message, not the number of characters. Also, some messages in our data set consist of multiple SMSes concatenated. Our maximum value of m (160 tokens) is twice the length of the longest message found in our data set. We recall that a single SMS is limited to 160 7-bit characters. The average lengths found for SMS messages classified as ham or spam in our data set are shown in Table II.

We evaluate the proposed protocol in a use case for SMS spam detection; however, our PPNBC can be used in any other scenario in which the Naive Bayes classifier can be employed. It is important to note that designing a model to obtain the highest possible accuracy is not the focus of this paper. Instead, our goal is to demonstrate that a privacy-preserving Naive Bayes classifier based on MPC is feasible in practice. Despite this, as shown in Table III, the protocol achieves good results when compared to the best result presented by Almeida et al. [5], where the data set was introduced. They report an accuracy of 97.64%, a false positive rate (FPR) of 16.9% and a false negative rate (FNR) of 0.18% using an SVM classifier. In our best scenario (n = 5200), we achieve an accuracy of 96.8%, an FPR of 17.94% and an FNR of 0.87%. We remark that there is little variation in accuracy and FNR when using smaller values of n. Classifiers based on boosted decision trees (AdaBoost) and logistic regression, previously used in [55], achieved accuracies within 0.5% of the best accuracy achieved by our protocol.

TABLE III
ACCURACY RESULTS USING 5-FOLD CROSS-VALIDATION OVER THE CORPUS OF 5574 SMS MESSAGES. FPR IS THE FALSE POSITIVE RATE AND FNR IS THE FALSE NEGATIVE RATE.

Dictionary size FNR FPR Accuracy

n=369 0.79% 28.52% 95.5%

n=484 0.89% 22.22% 96.2%

n=688 0.87% 21.15% 96.4%

n=5200 0.87% 17.94% 96.8%

Table IV reports the runtime of our PPNBC for different values of m and n, where n is the size of the dictionary (the number of unigrams belonging to Bob's trained model) and m is the number of unigrams present in Alice's SMS. The feature vector extraction (Extr) runtime covers the time required to execute the protocols π_FE and π_2toQ in steps 1 and 2 of protocol π_PPNBC. The classification (Class) runtime covers the remaining steps of protocol π_PPNBC. The total runtime is Extr + Class. We can see that the classification runtime is independent of m and depends only on n; even for n = 5200 features/unigrams it takes only 48 ms. The feature vector extraction runtime depends on both m and n. For n = 5200 and m = 160, the feature extraction takes less than 290 ms, while for n = 369 and m = 8 it takes just 11 ms. The total runtime is at most 334 ms to classify an SMS with 160 unigrams using a dictionary with 5200 unigrams, and just 21 ms to classify an SMS with 8 unigrams using a dictionary with 369 unigrams.

As Table IV shows, the feature extraction is the most time-consuming part, because m × n secure equality tests of bit-strings are required, each based on secure multiplications. As discussed by Reich et al. [55], the number of secure equality tests could be reduced if Alice and Bob first map, using the same hash function, each of their bit-strings to t buckets, A_1, A_2, · · · , A_t for Alice and B_1, B_2, · · · , B_t for Bob. Then only the bit-strings belonging to A_i need to be compared with the bit-strings belonging to B_i.

To use buckets, each element of A and B is hashed, and the result is divided into two parts: the first q bits indicate which bucket the element belongs to, and the remaining r bits are stored; thus t = 2^q. To hide how many elements are mapped to each bucket, as this can leak information about the distribution of the elements, the empty spots of each bucket must be filled with dummy elements. Thus, letting s_1 be the size of each bucket A_i and s_2 the size of each bucket B_i, the feature extraction protocol needs t × s_1 × s_2 equality tests, which can be substantially smaller than the m × n needed previously. It is important to note that these dummy elements do not modify the accuracy (or any other metric) of the classification: when generating Bob's model, the probability of each dummy element occurring in each class is defined as 0, so it does not impact ∑_{k=1}^{d} log(P(x_k|c)). In our case, since the values of m and n are not large, there is no significant difference between using buckets or not. Therefore, we use the original version (without buckets), as in this case there is no possibility of information being leaked due to bucket overflow.
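The bucketing arithmetic can be sketched as follows; the example bucket parameters (q = 7, s_1 = 4, s_2 = 64) are purely illustrative and not taken from the paper:

```rust
// Sketch of the bucketing trade-off: splitting a hash into q bucket
// bits and r stored bits gives t = 2^q buckets, and padded bucket
// sizes s1, s2 yield t * s1 * s2 equality tests instead of m * n.
// The bucket-size choices below are illustrative only.
fn tests_without_buckets(m: usize, n: usize) -> usize {
    m * n
}

fn tests_with_buckets(q: u32, s1: usize, s2: usize) -> usize {
    let t = 1usize << q; // t = 2^q buckets
    t * s1 * s2
}

fn main() {
    let (m, n) = (160, 5200);
    // e.g. q = 7 => 128 buckets, padded to s1 = 4 and s2 = 64 elements.
    let plain = tests_without_buckets(m, n);
    let bucketed = tests_with_buckets(7, 4, 64);
    assert_eq!(plain, 832_000);
    assert_eq!(bucketed, 32_768);
    assert!(bucketed < plain);
    println!("ok");
}
```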

We remark that, for the sake of evaluating our solution, we

have selected values of n(the dictionary size) that directly

depend on the frequency of tokens. That is not necessary in

general.

Communication and Round Complexities. We provide an analysis of the communication and round complexities of our protocol. Let n be the dictionary length, m the example length, ℓ the hash output length, and q = 2^λ the ring order (recall that our sub-protocols use the binary field or a larger ring, depending on the operation). Our protocol π_PPNBC makes 1 call to the protocol π_FE, n calls to π_2toQ (which can be done in parallel) and one call to the protocol π_GEQ. Protocol π_FE requires ⌈log ℓ⌉ + 1 rounds and 4·m·n·(ℓ−1) + m + n bits of communication. Protocol π_2toQ requires 1 round and 4λ bits of communication. Protocol π_GEQ requires ⌈log(λ−1)⌉ rounds and 2(λ−1) + 4 log(λ−1) − 4 bits of communication. So, in total, protocol π_PPNBC requires ⌈log ℓ⌉ + ⌈log(λ−1)⌉ + 2 rounds and 4·m·n·(ℓ−1) + m + n + 4·λ·n + 2(λ−1) + 4 log(λ−1) − 4 bits of communication.
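The formulas above can be evaluated directly. A small sketch for the paper's parameters, reading log as ⌈log_2⌉ (an assumption where the text leaves the rounding implicit):

```rust
// Sketch evaluating the communication-cost formulas above for the
// paper's parameters (ell = 14, lambda = 64, n = 5200, m = 160).
fn clog2(x: usize) -> usize {
    (usize::BITS - (x - 1).leading_zeros()) as usize // ceil(log2 x)
}

fn rounds(ell: usize, lambda: usize) -> usize {
    clog2(ell) + clog2(lambda - 1) + 2
}

fn bits(m: usize, n: usize, ell: usize, lambda: usize) -> usize {
    4 * m * n * (ell - 1) + m + n      // pi_FE
        + 4 * lambda * n               // n parallel calls to pi_2toQ
        + 2 * (lambda - 1) + 4 * clog2(lambda - 1) - 4 // pi_GEQ
}

fn main() {
    assert_eq!(rounds(14, 64), 12); // ceil(log 14) = 4, ceil(log 63) = 6
    let total = bits(160, 5200, 14, 64);
    assert!(total > 4 * 160 * 5200 * 13); // dominated by the pi_FE term
    println!("total bits: {}", total);
}
```

This confirms that the π_FE term 4·m·n·(ℓ−1) dominates the communication cost, matching the observation that feature extraction is the most expensive phase.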

VI. RELATED WORK

Privacy-preserving versions of ML classifiers were first addressed by Lindell and Pinkas [48]. They used MPC to build a secure ID3 decision tree where the training set is distributed between two parties. Most of the literature on privacy-preserving ML focuses on the training phase, and includes secure training of ML algorithms such as Naive Bayes [65], [61], [54], decision trees [48], [11], [28], logistic regression [18], [51], [27], linear regression [26], [51], [2], neural networks [59], [51], [64] and SVM [62]. Regarding privacy-preserving classification/inference/prediction, most works focused on secure neural network inference, e.g., [7], [38], [51], [49], [57], [42], [64], [3], [56], [21], [50], [46], [53]. Far fewer works focus on privacy in the classification/prediction phase of other algorithms. De Hoogh et al. [28] present a protocol for privacy-preserving scoring of decision trees. Fritchman et al. [37] presented a solution for private scoring of tree ensembles. De Cock et al. [25] presented private scoring protocols for decision trees.

A few privacy-preserving Naive Bayes classification protocols have been proposed in the literature. We briefly review these previous proposals and point out differences to our work. Bost et al. [10] proposed privacy-preserving classification protocols for hyperplane-based classifiers, which include Naive Bayes as a special case. The Naive Bayes protocol proposed in [10] keeps the description of the features (the dictionary in our implementation) public. Moreover, the implementation in [10] is based on the Paillier cryptosystem, which makes it less efficient than the implementations described here. David et al. [23] presented protocols for privacy-preserving classification with hyperplane-based classifiers and Naive Bayes; again, the classifier features were assumed to be publicly known. Our solution guarantees the privacy of the dictionary. Khedr et al. [44] proposed a secure NB classifier based on Fully Homomorphic Encryption (FHE); the use of FHE makes their solution significantly slower than ours. To the best of our knowledge, we propose the first protocol that achieves privacy-preserving Naive Bayes classification with private features, consisting solely of modular additions and multiplications, and with formal security proofs in the UC (Universal Composability) model.
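To make the structure behind this claim concrete, the following plaintext sketch shows the computation that a Naive Bayes classifier reduces to (names, types and the fixed-point encoding of log-probabilities are illustrative assumptions, not the paper's actual code): with a binary feature vector, scoring is an inner product with log-probability vectors plus one comparison, i.e., exactly additions, multiplications and a single greater-or-equal test.

```rust
/// Returns true if the message is classified as spam.
/// features[i] is 1 if dictionary word i occurs in the message, else 0;
/// log_p_* hold (scaled) log-probabilities of each word per class.
fn nb_classify(features: &[i64], log_p_spam: &[i64], log_p_ham: &[i64],
               prior_spam: i64, prior_ham: i64) -> bool {
    let score = |logs: &[i64], prior: i64| {
        // Inner product of the feature vector with the log-probability vector.
        prior + features.iter().zip(logs).map(|(&f, &lp)| f * lp).sum::<i64>()
    };
    // In the secure protocol, this final comparison is performed on secret
    // shares by the greater-or-equal sub-protocol.
    score(log_p_spam, prior_spam) >= score(log_p_ham, prior_ham)
}

fn main() {
    // Toy model over a 3-word dictionary with hypothetical log-probabilities.
    let is_spam = nb_classify(&[1, 0, 1], &[-1, -5, -1], &[-4, -1, -4], -1, -1);
    println!("spam = {}", is_spam);
}
```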

Regarding privacy-preserving text classification, Costantino et al. [19] presented a proposal based on homomorphic encryption that takes 19 minutes to classify a tweet with 19 features, and 76 minutes to classify an email with 17 features. In addition to the high runtime, Bob learns which of his lexicon's words are present in Alice's tweets.

To the best of our knowledge, the work by Reich et al. [55] was the first to present solutions for privacy-preserving feature extraction and classification of unstructured texts based on secure multiparty computation. We present their results for dictionary sizes (number of features) equal to 50, 200, and 500 in Table V. The experimental setup of [55] is exactly the same as ours: computations were run on Amazon Web Services (AWS) using two c5.9xlarge EC2 instances with 36 vCPUs, 72.0 GiB of memory and 32 threads, connected over a Gigabit Ethernet local area network. Their runtimes were computed for an average of 21.7 words in Alice's text. We note that for 500 features, the best performing protocol proposed in [55] has a total runtime of 10.4 s, divided into 6.6 s for feature extraction and 3.8 s for the classification of the feature vector. Our proposed protocol has a runtime of 0.147 s using a larger number of features (688) and a much larger number of words in Alice's text (160).

TABLE IV
Total runtime in milliseconds (Total) needed to securely classify an SMS with our proposal, divided into the time needed for feature vector extraction (Extr) and the time for classification (Class). n is the size of the dictionary, that is, the number of unigrams belonging to Bob's trained model, and m is the number of unigrams present in Alice's SMS.

Dictionary   m=8                   m=16                  m=50                   m=160
size         Extr   Class  Total   Extr   Class  Total   Extr    Class  Total   Extr    Class  Total
n=369        11 ms  10 ms  21 ms   25 ms  10 ms  35 ms   102 ms  10 ms  112 ms  111 ms  10 ms  121 ms
n=484        12 ms  10 ms  22 ms   26 ms  10 ms  36 ms   103 ms  10 ms  113 ms  124 ms  10 ms  134 ms
n=688        20 ms  11 ms  33 ms   36 ms  11 ms  47 ms   106 ms  11 ms  117 ms  136 ms  11 ms  147 ms
n=5200       77 ms  48 ms  125 ms  89 ms  48 ms  137 ms  140 ms  48 ms  188 ms  286 ms  48 ms  334 ms

Since the protocols in [55] were implemented in Java and ours are implemented in Rust, one could wonder whether the difference in performance is solely due to the change in programming language. To check the effect of the choice of programming language, we have implemented the feature vector classification protocols proposed in [55] in Rust. The results are presented in Table VI. We can observe that the resulting runtimes are faster than the Java ones. However, they are still 4 times (Logistic Regression) and 6 times (AdaBoost) slower than our proposed classification protocol.

Like the protocols presented by Reich et al. [55], our solution leaks no information about Alice's words to Bob, nor about the words of Bob's model to Alice, and it classifies an SMS as ham or spam (even for a model with 5200 features) in less than 0.3 s in the worst case, and in less than 0.022 s for an average message of our data set, while using the same type of machines that they used. Our results include communication and computation times.

More recently, Al Badawi et al. proposed a protocol for privacy-preserving text classification based on fully homomorphic encryption [4]. They obtained a highly efficient, GPU-accelerated implementation that improves the state of the art of FHE-based inference by orders of magnitude. In their implementation, a GPU-equipped machine can compute the private classification of a text message in about 0.17 s. This time does not include the communication time to send the encrypted text from the client to the server and to receive the result. In a Gigabit Ethernet network, the ciphertext expansion resulting from the use of FHE would probably add between 0.3 s and 0.5 s to their total running time.

VII. CONCLUSION

Privacy-preserving machine learning protocols are powerful tools for performing operations on data while maintaining its privacy. To the best of our knowledge, we propose the first privacy-preserving Naive Bayes classifier with private feature extraction. No information is revealed about either Bob's model (including which words belong to it) or the words contained in Alice's SMS. Our Rust implementation provides a fast and secure solution for the classification of unstructured text. Applying our solution to the case of spam detection, we can classify an SMS as spam or ham in less than 340 ms in the case where the dictionary of Bob's model includes all words (n = 5200) and Alice's SMS has at most m = 160 unigrams. In the case with n = 369 and m = 8 (the average of a spam SMS in the database), our solution takes only 21 ms. Moreover, the accuracy is practically the same as performing the Naive Bayes classification in the clear. It is important to note that our solution can be used in any application where Naive Bayes can be used. We thus believe that our solution is practical for the privacy-preserving classification of unstructured text. To the best of our knowledge, it is the fastest SMC-based solution for private text classification.

Finally, we would like to point out that whenever Alice is provided with the output of the classification, she learns some information about Bob's model. This is unavoidable, but does not contradict our security definition: the same leakage is present in the ideal functionality used to define the security of our proposed classification protocol. One way to decrease this release of information would be to add differential privacy to the model, so that Alice would never be able to tell with certainty whether a word belongs to Bob's dictionary or not. That would decrease Alice's information about Bob's dictionary at the cost of reducing the accuracy of the model. We leave these questions as future work.

REFERENCES

[1] Samuel Adams, Chaitali Choudhary, Martine De Cock, Rafael Dowsley, David Melanson, Anderson Nascimento, Davis Railsback, and Jianwei Shen. Privacy-Preserving Training of Tree Ensembles over Continuous Data. IACR ePrint 2021/754, 2021.
[2] Anisha Agarwal, Rafael Dowsley, Nicholas D. McKinney, Dongrui Wu, Chin-Teng Lin, Martine De Cock, and Anderson C. A. Nascimento. Protecting privacy of users in brain-computer interface applications. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 27(8):1546–1555, Aug 2019.
[3] Nitin Agrawal, Ali Shahin Shamsabadi, Matt J. Kusner, and Adrià Gascón. QUOTIENT: Two-party secure neural network training and prediction. In Lorenzo Cavallaro, Johannes Kinder, XiaoFeng Wang, and Jonathan Katz, editors, ACM CCS 2019: 26th Conference on Computer and Communications Security, pages 1231–1247. ACM Press, November 11–15, 2019.
[4] Ahmad Al Badawi, Louie Hoang, Chan Fook Mun, Kim Laine, and Khin Mi Mi Aung. PrivFT: Private and fast text classification with homomorphic encryption. IEEE Access, 8:226544–226556, 2020.
[5] Tiago A. Almeida, José María Gómez Hidalgo, and Akebo Yamakami. Contributions to the study of SMS spam filtering: new collection and results. In ACM Symposium on Document Engineering, pages 259–262. ACM, 2011.
[6] Boaz Barak, Ran Canetti, Jesper Buus Nielsen, and Rafael Pass. Universally composable protocols with relaxed set-up assumptions. In 45th Annual Symposium on Foundations of Computer Science, pages 186–195, Rome, Italy, October 17–19, 2004. IEEE Computer Society Press.
[7] Mauro Barni, Pierluigi Failla, Riccardo Lazzeretti, Ahmad-Reza Sadeghi, and Thomas Schneider. Privacy-Preserving ECG Classification With Branching Programs and Neural Networks. IEEE Trans. Information Forensics and Security, 6(2):452–468, 2011.


TABLE V
Runtime for the secure text classification protocol proposed in [55] (Java framework), broken down in time needed for feature vector extraction (Extr) and time for feature vector classification (Class).

Java Implementation [55]          Runtime (sec)
                                  Extr  Class  Total
Ada; 50 trees; depth 1            0.8   6.4    7.2
Ada; 200 trees; depth 1           2.8   6.4    9.2
Ada; 500 trees; depth 1           6.6   6.7    13.3
Logistic regression (50 feat.)    0.8   3.7    4.5
Logistic regression (200 feat.)   2.8   3.7    6.5
Logistic regression (500 feat.)   6.6   3.8    10.4

TABLE VI
Runtime for the feature vector classification of a text from [55] implemented in Rust.

Rust Implementation               Class runtime (msec)
Ada; 50 trees; depth 1            62
Ada; 200 trees; depth 1           62
Ada; 500 trees; depth 1           66
Logistic regression (50 feat.)    38
Logistic regression (200 feat.)   39
Logistic regression (500 feat.)   41

[8] Paulo S. L. M. Barreto, Bernardo David, Rafael Dowsley, Kirill Morozov, and Anderson C. A. Nascimento. A framework for efficient adaptively secure composable oblivious transfer in the ROM. Cryptology ePrint Archive, Report 2017/993, 2017. http://eprint.iacr.org/2017/993.
[9] Donald Beaver. Commodity-Based Cryptography (Extended Abstract). In STOC, pages 446–455. ACM, 1997.
[10] Raphael Bost, Raluca Ada Popa, Stephen Tu, and Shafi Goldwasser. Machine Learning Classification over Encrypted Data. In NDSS. The Internet Society, 2015.
[11] Justin Brickell and Vitaly Shmatikov. Privacy-Preserving Classifier Learning. In Financial Cryptography, volume 5628 of Lecture Notes in Computer Science, pages 128–147. Springer, 2009.
[12] Ran Canetti. Security and composition of multiparty cryptographic protocols. Journal of Cryptology, 13(1):143–202, January 2000.
[13] Ran Canetti. Universally composable security: A new paradigm for cryptographic protocols. Cryptology ePrint Archive, Report 2000/067, 2000. http://eprint.iacr.org/2000/067.
[14] Ran Canetti. Universally Composable Security: A New Paradigm for Cryptographic Protocols. In FOCS, pages 136–145. IEEE Computer Society, 2001.
[15] Ran Canetti and Marc Fischlin. Universally composable commitments. In Joe Kilian, editor, Advances in Cryptology – CRYPTO 2001, volume 2139 of Lecture Notes in Computer Science, pages 19–40, Santa Barbara, CA, USA, August 19–23, 2001. Springer, Heidelberg, Germany.
[16] Ran Canetti, Yehuda Lindell, Rafail Ostrovsky, and Amit Sahai. Universally composable two-party and multi-party secure computation. In 34th Annual ACM Symposium on Theory of Computing, pages 494–503, Montréal, Québec, Canada, May 19–21, 2002. ACM Press.
[17] Raffaele Cappelli, Matteo Ferrara, and Davide Maltoni. Minutia Cylinder-Code: A New Representation and Matching Technique for Fingerprint Recognition. IEEE Trans. Pattern Anal. Mach. Intell., 32(12):2128–2141, 2010.
[18] Kamalika Chaudhuri and Claire Monteleoni. Privacy-preserving logistic regression. In NIPS, pages 289–296. Curran Associates, Inc., 2008.
[19] Gianpiero Costantino, Antonio La Marra, Fabio Martinelli, Andrea Saracino, and Mina Sheikhalishahi. Privacy-preserving text mining as a service. In ISCC, pages 890–897. IEEE Computer Society, 2017.
[20] Ronald Cramer, Ivan Damgård, and Jesper Buus Nielsen. Secure Multiparty Computation and Secret Sharing. Cambridge University Press, 2015.
[21] Anders P. K. Dalskov, Daniel Escudero, and Marcel Keller. Secure evaluation of quantized neural networks. Proceedings on Privacy Enhancing Technologies, 2020(4):355–375, October 2020.
[22] Bernardo David and Rafael Dowsley. Efficient composable oblivious transfer from CDH in the global random oracle model. In Stephan Krenn, Haya Shulman, and Serge Vaudenay, editors, CANS 20: 19th International Conference on Cryptology and Network Security, volume 12579 of Lecture Notes in Computer Science, pages 462–481, Vienna, Austria, December 14–16, 2020. Springer, Heidelberg, Germany.
[23] Bernardo David, Rafael Dowsley, Raj Katti, and Anderson C. A. Nascimento. Efficient unconditionally secure comparison and privacy preserving machine learning classification protocols. In International Conference on Provable Security, pages 354–367. Springer, 2015.
[24] Bernardo David, Rafael Dowsley, Jeroen van de Graaf, Davidson Marques, Anderson C. A. Nascimento, and Adriana C. B. Pinto. Unconditionally secure, universally composable privacy preserving linear algebra. IEEE Transactions on Information Forensics and Security, 11(1):59–73, 2016.
[25] Martine De Cock, Rafael Dowsley, Caleb Horst, Raj Katti, Anderson Nascimento, Wing-Sea Poon, and Stacey Truex. Efficient and private scoring of decision trees, support vector machines and logistic regression models based on pre-computation. IEEE Transactions on Dependable and Secure Computing, 16(2):217–230, 2019.
[26] Martine De Cock, Rafael Dowsley, Anderson C. A. Nascimento, and Stacey C. Newman. Fast, privacy preserving linear regression over distributed datasets based on pre-distributed data. In 8th ACM Workshop on Artificial Intelligence and Security (AISec), pages 3–14, 2015.
[27] Martine De Cock, Rafael Dowsley, Anderson C. A. Nascimento, Davis Railsback, Jianwei Shen, and Ariel Todoki. High Performance Logistic Regression for Privacy-Preserving Genome Analysis. To appear at BMC Medical Genomics. Available at https://arxiv.org/abs/2002.05377, 2021.
[28] Sebastiaan de Hoogh, Berry Schoenmakers, Ping Chen, and Harm op den Akker. Practical secure decision tree learning in a teletreatment application. In Nicolas Christin and Reihaneh Safavi-Naini, editors, FC 2014: 18th International Conference on Financial Cryptography and Data Security, volume 8437 of Lecture Notes in Computer Science, pages 179–194, Christ Church, Barbados, March 3–7, 2014. Springer, Heidelberg, Germany.
[29] Jia Deng, Alexander C. Berg, Kai Li, and Fei-Fei Li. What Does Classifying More Than 10,000 Image Categories Tell Us? In ECCV (5), volume 6315 of Lecture Notes in Computer Science, pages 71–84. Springer, 2010.
[30] Nico Döttling, Daniel Kraschewski, and Jörn Müller-Quade. Unconditional and composable security using a single stateful tamper-proof hardware token. In Yuval Ishai, editor, TCC 2011: 8th Theory of Cryptography Conference, volume 6597 of Lecture Notes in Computer Science, pages 164–181, Providence, RI, USA, March 28–30, 2011. Springer, Heidelberg, Germany.
[31] Rafael Dowsley. Cryptography Based on Correlated Data: Foundations and Practice. PhD thesis, Karlsruhe Institute of Technology, Germany, 2016.
[32] Rafael Dowsley, Jörn Müller-Quade, and Anderson C. A. Nascimento. On the possibility of universally composable commitments based on noisy channels. In SBSEG 2008, pages 103–114, Gramado, Brazil, September 1–5, 2008.

[33] Rafael Dowsley, Jörn Müller-Quade, and Tobias Nilges. Weakening the isolation assumption of tamper-proof hardware tokens. In Anja Lehmann and Stefan Wolf, editors, ICITS 15: 8th International Conference on Information Theoretic Security, volume 9063 of Lecture Notes in Computer Science, pages 197–213, Lugano, Switzerland, May 2–5, 2015. Springer, Heidelberg, Germany.
[34] Rafael Dowsley, Jörn Müller-Quade, Akira Otsuka, Goichiro Hanaoka, Hideki Imai, and Anderson C. A. Nascimento. Universally composable and statistically secure verifiable secret sharing scheme based on pre-distributed data. IEICE Transactions, 94-A(2):725–734, 2011.
[35] Rafael Dowsley, Jeroen van de Graaf, Davidson Marques, and Anderson C. A. Nascimento. A two-party protocol with trusted initializer for computing the inner product. In International Workshop on Information Security Applications, pages 337–350. Springer, 2010.
[36] Rafael Dowsley, Jeroen van de Graaf, Jörn Müller-Quade, and Anderson C. A. Nascimento. On the composability of statistically secure bit commitments. Journal of Internet Technology, 14(3):509–516, 2013.
[37] Kyle Fritchman, Keerthanaa Saminathan, Rafael Dowsley, Tyler Hughes, Martine De Cock, Anderson Nascimento, and Ankur Teredesai. Privacy-preserving scoring of tree ensembles: A novel framework for AI in healthcare. In Proc. of 2018 IEEE International Conference on Big Data, pages 2412–2421, 2018.
[38] Ran Gilad-Bachrach, Nathan Dowlin, Kim Laine, Kristin E. Lauter, Michael Naehrig, and John Wernsing. CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy. In ICML, volume 48 of JMLR Workshop and Conference Proceedings, pages 201–210. JMLR.org, 2016.
[39] Dennis Hofheinz and Jörn Müller-Quade. Universally composable commitments using random oracles. In Moni Naor, editor, TCC 2004: 1st Theory of Cryptography Conference, volume 2951 of Lecture Notes in Computer Science, pages 58–76, Cambridge, MA, USA, February 19–21, 2004. Springer, Heidelberg, Germany.
[40] Dennis Hofheinz, Jörn Müller-Quade, and Dominique Unruh. Universally composable zero-knowledge arguments and commitments from signature cards. In MoraviaCrypt 2005, 2005.
[41] Yuval Ishai, Eyal Kushilevitz, Sigurd Meldgaard, Claudio Orlandi, and Anat Paskin-Cherniavsky. On the power of correlated randomness in secure computation. In Theory of Cryptography, pages 600–620. Springer, 2013.
[42] Chiraag Juvekar, Vinod Vaikuntanathan, and Anantha Chandrakasan. GAZELLE: A Low Latency Framework for Secure Neural Network Inference. In USENIX Security Symposium, pages 1651–1669. USENIX Association, 2018.
[43] Jonathan Katz. Universally composable multi-party computation using tamper-proof hardware. In Moni Naor, editor, Advances in Cryptology – EUROCRYPT 2007, volume 4515 of Lecture Notes in Computer Science, pages 115–128, Barcelona, Spain, May 20–24, 2007. Springer, Heidelberg, Germany.
[44] Alhassan Khedr, P. Glenn Gulak, and Vinod Vaikuntanathan. SHIELD: Scalable Homomorphic Implementation of Encrypted Data-Classifiers. IEEE Trans. Computers, 65(9):2848–2858, 2016.
[45] Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, and Shree K. Nayar. Describable Visual Attributes for Face Verification and Image Search. IEEE Trans. Pattern Anal. Mach. Intell., 33(10):1962–1977, 2011.
[46] Nishant Kumar, Mayank Rathee, Nishanth Chandran, Divya Gupta, Aseem Rastogi, and Rahul Sharma. CrypTFlow: Secure TensorFlow inference. In 2020 IEEE Symposium on Security and Privacy, pages 336–353, San Francisco, CA, USA, May 18–21, 2020. IEEE Computer Society Press.
[47] Yehuda Lindell. How to simulate it - A tutorial on the simulation proof technique. Cryptology ePrint Archive, Report 2016/046, 2016. http://eprint.iacr.org/2016/046.
[48] Yehuda Lindell and Benny Pinkas. Privacy Preserving Data Mining. In CRYPTO, volume 1880 of Lecture Notes in Computer Science, pages 36–54. Springer, 2000.
[49] Jian Liu, Mika Juuti, Yao Lu, and N. Asokan. Oblivious Neural Network Predictions via MiniONN Transformations. In ACM Conference on Computer and Communications Security, pages 619–631. ACM, 2017.
[50] Pratyush Mishra, Ryan Lehmkuhl, Akshayaram Srinivasan, Wenting Zheng, and Raluca Ada Popa. Delphi: A cryptographic inference service for neural networks. In Srdjan Capkun and Franziska Roesner, editors, USENIX Security 2020: 29th USENIX Security Symposium, pages 2505–2522. USENIX Association, August 12–14, 2020.
[51] Payman Mohassel and Yupeng Zhang. SecureML: A System for Scalable Privacy-Preserving Machine Learning. In IEEE Symposium on Security and Privacy, pages 19–38. IEEE Computer Society, 2017.
[52] Chris Peikert, Vinod Vaikuntanathan, and Brent Waters. A framework for efficient and composable oblivious transfer. In David Wagner, editor, Advances in Cryptology – CRYPTO 2008, volume 5157 of Lecture Notes in Computer Science, pages 554–571, Santa Barbara, CA, USA, August 17–21, 2008. Springer, Heidelberg, Germany.
[53] Deevashwer Rathee, Mayank Rathee, Nishant Kumar, Nishanth Chandran, Divya Gupta, Aseem Rastogi, and Rahul Sharma. CrypTFlow2: Practical 2-party secure inference. In Jay Ligatti, Xinming Ou, Jonathan Katz, and Giovanni Vigna, editors, ACM CCS 20: 27th Conference on Computer and Communications Security, pages 325–342, Virtual Event, USA, November 9–13, 2020. ACM Press.
[54] Olivier Regnier-Coudert and John A. W. McCall. Privacy-preserving approach to bayesian network structure learning from distributed data. In GECCO (Companion), pages 815–816. ACM, 2011.
[55] Devin Reich, Ariel Todoki, Rafael Dowsley, Martine De Cock, and Anderson C. A. Nascimento. Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation. In NeurIPS, pages 3752–3764, 2019.
[56] M. Sadegh Riazi, Mohammad Samragh, Hao Chen, Kim Laine, Kristin E. Lauter, and Farinaz Koushanfar. XONN: XNOR-based oblivious deep neural network inference. In Nadia Heninger and Patrick Traynor, editors, USENIX Security 2019: 28th USENIX Security Symposium, pages 1501–1518, Santa Clara, CA, USA, August 14–16, 2019. USENIX Association.
[57] M. Sadegh Riazi, Christian Weinert, Oleksandr Tkachenko, Ebrahim M. Songhori, Thomas Schneider, and Farinaz Koushanfar. Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications. In AsiaCCS, pages 707–721. ACM, 2018.
[58] Ronald L. Rivest. Unconditionally secure commitment and oblivious transfer schemes using private channels and a trusted initializer. Preprint available at http://people.csail.mit.edu/rivest/Rivest-commitment.pdf, 1999.
[59] Reza Shokri and Vitaly Shmatikov. Privacy-Preserving Deep Learning. In ACM Conference on Computer and Communications Security, pages 1310–1321. ACM, 2015.
[60] Rafael Tonicelli, Anderson C. A. Nascimento, Rafael Dowsley, Jörn Müller-Quade, Hideki Imai, Goichiro Hanaoka, and Akira Otsuka. Information-theoretically secure oblivious polynomial evaluation in the commodity-based model. International Journal of Information Security, 14(1):73–84, 2015.
[61] Jaideep Vaidya, Murat Kantarcioglu, and Chris Clifton. Privacy-preserving Naïve Bayes classification. VLDB J., 17(4):879–898, 2008.
[62] Jaideep Vaidya, Hwanjo Yu, and Xiaoqian Jiang. Privacy-preserving SVM classification. Knowl. Inf. Syst., 14(2):161–178, 2008.
[63] Mike Voets, Kajsa Møllersen, and Lars Ailo Bongo. Replication study: Development and validation of deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. CoRR, abs/1803.04337, 2018.
[64] Sameer Wagh, Divya Gupta, and Nishanth Chandran. SecureNN: 3-party secure computation for neural network training. Proceedings on Privacy Enhancing Technologies, 2019(3):26–49, July 2019.
[65] Rebecca N. Wright and Zhiqiang Yang. Privacy-preserving Bayesian network structure computation on distributed heterogeneous data. In KDD, pages 713–718. ACM, 2004.