
Notes on Interrogating Random Quantum Circuits

Luís T. A. N. Brandão* and René Peralta*

May 29, 2020

Abstract

Consider a quantum circuit that, when fed a constant input, produces a fixed-length random bit-string in each execution. Executing it many times yields a sample of many bit-strings that contain fresh randomness inherent to the quantum evaluation. When the circuit is freshly selected from a special class, the output distribution of strings cannot be simulated in a short amount of time by a classical (non-quantum) computer. This quantum vs. classical gap of computational efficiency enables ways of inferring that an honest sample contains quantumly generated strings, and therefore fresh randomness. This possibility, initially proposed by Aaronson, has been recently validated in a “quantum supremacy” experiment by Google, using circuits with 53 qubits.

In these notes, we consider the problem of estimating information entropy (a quantitative measure of randomness), based on the sum of “probability values” (here called QC-values) of strings output by quantum evaluation. We assume that the sample of strings, claimed to have been produced by repeated evaluation of a quantum circuit, was in fact crafted by an adversary intending to induce us into over-estimating entropy. We analyze the case of a “collisional” adversary that can over-sample and possibly take advantage of observed collisions.

For diverse false-positive and false-negative rates, we devise parameters for testing the hypothesis that the sample has at least a certain expected entropy. This enables a client to certify the presence of entropy, after a lengthy computation of the QC-values. We also explore a method for low-budget clients to compute fewer QC-values, at the cost of more computation by a server. We conclude with several questions requiring further exploration.

Keywords: certifiable randomness, distinguishability, entropy estimation, gamma distribution, public randomness, quantum randomness, randomness beacons.

*National Institute of Standards and Technology (Gaithersburg, USA). ORCIDs: [0000-0002-4501-089X] and [0000-0002-2318-7563].

Opinions expressed in this paper are from the authors and are not to be construed as official or as views of the U.S. Department of Commerce. Certain commercial entities, equipment, or materials may be identified in this document in order to describe an experimental procedure or concept adequately. Such identification is not intended to imply recommendation or endorsement by NIST, nor is it intended to imply that the entities, materials, or equipment are necessarily the best available for the purpose.

Index of sections

1. Introduction .............................. 2
   1.1. System model ......................... 2
   1.2. Entropy of a sample .................. 4
   1.3. Organization ......................... 4
2. Exponential model ......................... 5
   2.1. The frequency-density representation . 5
   2.2. Summary statistics ................... 6
   2.3. Entropy per honest string ............ 6
   2.4. Sampling with vs. without replacement  6
3. Sums of QC-values ......................... 7
   3.1. Statistics of interest ............... 8
   3.2. CDFs of sums of i.i.d. variables ..... 8
   3.3. Testing honest sampling .............. 9
   3.4. Threshold vs. probability ............ 9
   3.5. Sample sizes vs. thresholds .......... 9
4. Low-budget clients ........................ 11
   4.1. Truncated QC-values .................. 11
   4.2. Sum of truncated QC-values (STQC) .... 12
5. Entropy estimation ........................ 12
   5.1. Overview ............................. 13
   5.2. The client ........................... 13
   5.3. The pseudo-fidelity adversary ........ 14
   5.4. The collisional adversary ............ 16
   5.5. Final randomness for applications .... 20
   5.6. Classes of adversaries ............... 20
6. Concluding remarks ........................ 21
Acknowledgments .............................. 22
References ................................... 22
A. Terminology ............................... 22
   A.1. Abbreviations ........................ 22
   A.2. Acronyms ............................. 23
   A.3. Symbols .............................. 23
B. Expected value and variance ............... 23
   B.1. Auxiliary primitives ................. 23
   B.2. Expected values ...................... 24
   B.3. Variances ............................ 24
   B.4. The chosen-count sampling case ....... 24
C. Sum of QC-values (SQCs) ................... 24
   C.1. CLT approximation .................... 24
   C.2. Exact Gamma distributions ............ 25
   C.3. A Gamma approximation ................ 27
D. Discretization of QC-values ............... 28
   D.1. Individual probabilities ............. 28
   D.2. Collisions ........................... 28
   D.3. Approximate sum of “entropies” ....... 30
E. Tables with more detail ................... 32

Page 1 of 35

List of Figures

Figure 1  QC-values upon uniform string sampling ....... 5
Figure 2  QC-values upon quantum string sampling ....... 5
Figure 3  Various PDFs of QC-values .................... 6
Figure 4  Various PDFs of SQC with m = 10 .............. 8
Figure 5  PDF approximations of QC-values .............. 9
Figure 6  PDF approximations of SQC .................... 9
Figure 7  Inverse CDFs of SQC with … ................... 10
Figure 8  Inverse CDFs of SQC with … ................... 10
Figure 9  Sample size vs. FN=FP, with … ................ 10
Figure 10 Sample size vs. FN=FP, with … ................ 11

List of Tables

Table 1  Statistics of QC-values ....................... 6
Table 2  Expected number of collisions ................. 7
Table 3  Statistics of SQCs of strings ................. 8
Table 4  Number of strings for SQC distinguishability .. 11
Table 5  TQC truncation thresholds ..................... 11
Table 6  Statistics of truncated QCs ................... 12
Table 7  Number of client-verified TQC values .......... 12
Table 8  Number of strings for SQC distinguishability .. 16
Table 9  Comparison pseudo-fidelity vs. collisional .... 19
Table 10 Gamma vs. Normal (CLT) approximations ......... 27
Table 11 Entropy approximations: Uniform, … ............ 31
Table 12 Entropy approximations: Uniform, … ............ 32
Table 13 Entropy approximations: Quantum, … ............ 32
Table 14 Sample size for SQC distinguishability ........ 33
Table 15 Sample size for STQC distinguishability ....... 34
Table 16 Statistics per bin and budget factor .......... 35

1. Introduction

The recent experimental proof of quantum supremacy using Noisy Intermediate-Scale Quantum (NISQ) devices showed that quantum circuits with 50+ qubits can now be sampled with significant fidelity [AABB+19]. Using this technology, it is possible, in principle, to generate a sample of bit-strings that can at a later time be “certified” as containing strings that were quantumly sampled [Aar19], implying it can be externally verified that the sample contained at least a minimum of fresh entropy.

We address the following two related questions: Under a claim that a sequence of bit-strings has been generated by sampling a given quantum circuit, how much entropy can be safely assumed to be contained in it? Given an entropy goal, how many strings should be sampled to enable a verification with high assurance? We consider an adversarial setting, as usual in cryptography, where the claimant tries to trick us into over-estimating entropy.

A metrological viewpoint. In the scope of the National Quantum Initiative Act [NQIA] from the U.S. Congress, the National Institute of Standards and Technology (NIST) is interested in the development of quantum computing and its applications. A potential application, within reach of the current or soon-to-be-reached state of the art, is the production of certifiable randomness based on evaluation of random quantum circuits [Aar19]. The Computer Security Division at NIST has a special interest in the area of randomness, which has an essential role in cryptography. “Certifiable” randomness in particular may be useful in the context of public randomness, such as that produced by randomness beacons [KBPB19].

In these notes we take the viewpoint of a metrology body [NIST] in doing a preliminary evaluation of the parameters of a potential application for obtaining certifiable randomness. We consider a cryptographic perspective, e.g., when asking what false-positive and false-negative rates should be considered in distinguishability experiments. This is a preliminary analysis and should be taken as such. We do not investigate here the complexity-theoretic basis for the assumption that sampling from certain quantum circuits can be done efficiently with a quantum computer but not classically [AC16]. However, we explore how certain attacks drastically reduce entropy from the set of bit-strings to be certified. The analysis thus illustrates the need to define appropriate safety margins for diverse parameters (e.g., the number of strings to sample), and rules out certain ranges thereof.

1.1. System model

1.1.1. The operator

We want to compare an honest operator of the quantum computer — who generates a sample with fresh entropy via an honest quantum evaluation, leading the sample to be accepted with a statistically high probability (i.e., a low false-negative rate) — vs. a malicious operator — who minimizes the amount of entropy in a sample crafted to still be acceptable with a not-too-low probability (i.e., a not-too-low false-positive rate).

In either case, we assume that the circuit received for quantum evaluation was unpredictable to the operator. It could for example be based on fresh public randomness, or have been provided by a client interested in the experiment. The operator then needs to output a sample of strings (supposedly by evaluation of the circuit) after a short amount of time, namely before being able to compute their output probabilities.

The honest case. The honest operator of the quantum computer repeatedly evaluates the quantum circuit, to probabilistically obtain output strings that, based on the computational model, inherently contain fresh entropy. It is assumed that sampling from such a probability distribution in a fast way is only possible by way of said quantum computation. Later, a classical super-computer performs a lengthy computation (e.g., of a few days) of the output probabilities of the output strings (e.g., see Ref. [PGNHW19] for an analysis of the complexity of simulating probability values for circuits with 54 qubits). Finally, a statistical analysis of those probabilities confirms that some strings must have been output by quantum evaluation, and, consequently, that the sample set contains entropy that was fresh at the time of the sample generation. Such randomness is then denoted as “certified” (or “certifiable”) randomness.

The adversarial case. In adversarial contexts (as with cryptographic applications), we are interested in scenarios where the operator of the quantum computer wants to trick us into accepting a maliciously produced sample. Therefore, we consider that the sampling may have been performed in a variety of ways, such as:

• uniformly at random from a defined set of strings;

• as a pseudo-randomly generated output computed from a fixed secret seed;

• using rejection sampling on the output of the circuit;

• a mix of the above and other unknown methods.

The malicious goal is to minimize the entropy of the sample, while having a not-too-low probability of it being a posteriori accepted by the statistical test performed by a client, who will compute and take into consideration the probability values of the strings in the sample. We consider concrete specifications of adversaries in Section 5.

The quantum computation. The operator can use a quantum computer to quantumly evaluate circuits that output strings of n bits. We denote by {0,1}^n the set of n-bit strings. There are N = 2^n such strings. The honest computer implementation is characterized by a fidelity parameter φ. The sampling is with probability φ from correct quantum evaluation of the circuit, and otherwise (i.e., with probability 1 − φ) uniform from {0,1}^n. As of this writing, both Google and IBM report having quantum computers with more than 50 qubits. As a reference in this work, we use the specification reported about the quantum computer of Google, which can evaluate circuits with 53 qubits at a fidelity of about 0.2 % [AABB+19].

When honestly evaluating the quantum circuit, with fidelity φ, the computer operator is not able to distinguish whether an output string was obtained by uniform selection (with probability 1 − φ) or by a correct circuit evaluation (with probability φ). A malicious operator can naturally decide to sample strings in arbitrary ways, but (in the considered model) cannot determine, before an expensive and lengthy classical computation, anything new about the probability that a given string would have been output from a correct circuit evaluation.

1.1.2. Circuits and probabilities

The circuits, with particular specifications [AABB+19, Fig. 3], are selected from a class with large cardinality, such that it is infeasible to precompute useful information about a non-negligible proportion of circuits.

QC-values. Each quantum Circuit C has its own probabilistic distribution of output strings upon quantum evaluation. We are interested in the set

    QCVALUES(C) = { Prob_C(x) : x ∈ {0,1}^n }    (1)

of “probabilities of occurrence” — here denoted as “QC-values” — of the output strings. For simplicity we used a set, assuming all probabilities are different; otherwise we could describe a list with possible repetitions.

Assumptions. The subsequent analysis in these notes is based on assumptions whose coverage we have not independently investigated. (Different assumptions may invalidate some of our estimates of security or entropy in adversarial settings.) An important notion is what we call a “short amount of time” — the time duration between the moment the adversary learns the circuit specification and the deadline for publishing a sample of output strings. At a high level, the assumptions are:

1. for all circuits in the class, the QC-values of the output strings fit an exponential model: the density of QC-values (real numbers between 0 and 1) is assumed well approximated by an “exponential” curve. More concretely, the “frequency density” is a normalized negative exponential function f(p) = N e^{-Np} of the “probability value” p [BISB+18].

2. a quantum computer can efficiently evaluate the circuit many times within a short amount of time;

3. without prior knowledge of (an approximation of) the probability values, classical computers cannot, within a short amount of time, simulate a circuit evaluation with the appropriate output distribution;

4. a computer (quantum or classical) cannot, within a short amount of time, compute a useful approximation of the probability values of the output strings;

5. a classical super-computer can calculate the QC-values (i.e., Prob_C(x) for any string x) after a moderately large amount of time, at a large-but-possible-in-practice computational cost.

The abilities and inabilities mentioned above depend on the number n of qubits. In these notes we focus on n = 53 qubits. We assume the computations referred to in Assumptions 3, 4 and 5 require resources exponential in n. Thus, n needs to be chosen such that the exponential cost is infeasible in a short amount of time, but feasible in a moderately large amount of time.

Towards certified randomness. Relying on the above, and based on a proposal by Aaronson [Aar19], here is, at a high level, a potential experiment to produce certified (i.e., externally verifiable) randomness:

1. The operator is given the specification of a quantum circuit freshly chosen at random from a given class. (For example, in the context of public randomness, the choice may be based on a timely output of a trusted randomness beacon.)

2. Soon thereafter, the operator evaluates the circuit many times and publishes the output strings.

3. Later, a classical supercomputer computes the “QC-values” corresponding to those sampled strings.

4. By statistically analyzing the “QC-values”, one then gains assurance (or not) that at least some strings were quantumly produced and thus have entropy.

For efficiency of execution we consider a sampling of many strings from the same circuit. Comparatively, the proposal by Aaronson uses one new circuit for each string, to enable, with respect to entropy estimation, a security reduction to a complexity-theoretic hardness assumption. It is an open problem what kind of reduction can be made for the case of sampling multiple strings from a single circuit. Section 2.4 considers tradeoffs between the two approaches, and mentions the possible intermediate case of several circuits with several strings each.
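The four steps above can be sketched as a client-side acceptance test. This is an illustrative sketch only: the threshold rule (a normal-quantile margin above the uniform-case mean, using the exponential-model statistics discussed later), and the names `qc_value` and `z`, are our assumptions, not parameters fixed in these notes.

```python
import math

def certify(sample, qc_value, N, z=2.33):
    """Sketch of the client's test: accept iff the sum of QC-values (SQC)
    exceeds a threshold placed z standard deviations above the mean that
    uniform (entropy-free) sampling would give. qc_value(x) stands for the
    lengthy classical computation of Prob(x); z is a hypothetical normal
    quantile controlling the false-positive rate."""
    m = len(sample)
    if len(set(sample)) != m:        # reject collisions (repeated strings)
        return False
    sqc = sum(qc_value(x) for x in sample)
    # Uniform case: SQC has mean m/N and standard deviation sqrt(m)/N
    # under the exponential model.
    threshold = m / N + z * math.sqrt(m) / N
    return sqc > threshold

# Toy usage with N = 2^10: quantum-looking values (mean 2/N) pass,
# uniform-looking values (mean 1/N) do not.
N = 2**10
honest = certify(list(range(100)), lambda x: 2 / N, N)    # True
forged = certify(list(range(100)), lambda x: 1 / N, N)    # False
```

The collision check mirrors the requirement, discussed in Section 2.4, that the client reject any sample with repeated strings from the same circuit.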

Random variables. For the initial statistical analysis in these notes, the main random variable of interest for each sampling experiment is the sum, across sampled strings, of the QC-values. Recall that these are the “probability values” that a correct evaluation of the quantum circuit — what we denote as quantum sampling with fidelity 1 — would output such strings. We denote by S the random variable corresponding to this Sum of QC-values (SQC). We use indices to indicate the type of sampling (U, Q, F, C) and its parameters (m, φ, q):

• Uniform: S_{U,m} (strings are sampled uniformly);

• Pure Quantum: S_{Q,m} (strings are obtained by correct quantum evaluation of the circuit);

• Fidelity: S_{F,m,φ} (each of m strings is obtained either, with probability φ, by correct quantum evaluation, or, with probability 1 − φ, uniformly);

• Chosen-count: S_{C,m,q} (q strings are sampled by correct quantum evaluation, and the other m − q are pseudo-randomly selected).

The index m may be omitted when it is 1. Note that the Pure Quantum and the Chosen-count sampling are only possible with a quantum computer with fidelity 1.
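For simulation purposes only, the QC-values under these sampling types can be mimicked classically by drawing directly from the exponential model of Section 2 (a real operator cannot do this, since the QC-values are unknown within a short amount of time); the rate N and the function names below are our own choices:

```python
import random

def qc_uniform(N):
    """QC-value of a uniformly sampled string: Exp(N) (Section 2.1.1)."""
    return random.expovariate(N)

def qc_quantum(N):
    """QC-value under fidelity-1 quantum sampling: Erlang(2, N),
    i.e. the sum of two independent Exp(N) draws (Section 2.1.2)."""
    return random.expovariate(N) + random.expovariate(N)

def qc_fidelity(N, phi):
    """Fidelity-phi sampling: correct evaluation with probability phi,
    otherwise a uniformly selected string."""
    return qc_quantum(N) if random.random() < phi else qc_uniform(N)

def sqc_chosen_count(N, m, q):
    """Chosen-count SQC: q quantum QC-values plus m - q uniform ones."""
    return (sum(qc_quantum(N) for _ in range(q))
            + sum(qc_uniform(N) for _ in range(m - q)))
```

These samplers are used only to check the statistics derived in the following sections by Monte Carlo simulation.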

Distinguishability. In these notes, we are focused on the problem of distinguishing honest Fidelity sampling from a quantum circuit vs. malicious sampling performed by an adversary with the goal of inducing over-estimation of the entropy in the sample. We propose parameters for sampling experiments, distinguishability thresholds, and entropy estimation, assuming an adversarial setting.

1.2. Entropy of a sample

The meaning of entropy can be elusive and subject to nuances. The measure of entropy of a string, or of a sequence of strings, only makes sense with respect to a probability distribution of outputs. For example, in the case of a uniform distribution over the set of n-bit strings we say that each occurring string has n bits of entropy. Conversely, a string obtained pseudo-randomly from a fixed a priori determined seed (whose bits are not counted) has overall entropy 0.

Typical interpretations of entropy relate to unpredictability, compressibility, and/or reproducibility. We focus our analysis on estimating Shannon entropy (the expected negative binary logarithm, −log2(p), of the probabilities) rather than minimum entropy (−log2 of the maximum probability).
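As a small illustration of the difference between the two measures, for a hypothetical 4-outcome distribution:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: expected value of -log2(p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def min_entropy(probs):
    """Min-entropy in bits: -log2 of the most likely outcome."""
    return -math.log2(max(probs))

# Hypothetical distribution: Shannon entropy averages the surprise of
# all outcomes; min-entropy only reflects the most likely one.
probs = [0.5, 0.25, 0.125, 0.125]
print(shannon_entropy(probs))  # 1.75 bits
print(min_entropy(probs))      # 1.0 bit
```

Min-entropy is never larger than Shannon entropy, which is why it is the more conservative measure in cryptographic contexts.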

Estimating entropy. In these notes, we estimate the entropy of a sample viewed as a vector of strings. We consider an adversarial setting where each string can be obtained from a different probability distribution. The distributions can have dependencies, such as those related to sampling without replacement, and/or from sorting the sequence based on some order relation. Even though the adversary has an incentive to minimize entropy, that goal is conditioned on the sample being accepted by the client with a certain minimum probability. For appropriately parametrized experiments, this requires the adversary to use some quantumly-obtained strings that have associated entropy. If we take into account the conditional form of the probability distributions, then we can consider the overall entropy of the sample as the sum of “entropies” of its consecutive strings, where for each string there is an underlying conditional probability distribution that takes into account the dependency on the previous strings. In fact, we will consider an adversary that selects strings one by one, with dependencies across each other, and argue that such an adversary is optimal (within stated constraints).

1.3. Organization

Section 2 analyzes the exponential model of QC-values. Section 3 discusses the distribution and statistics of “Sums of QC-values” (SQCs) for several sampling experiments, and determines optimal thresholds for distinguishability. Section 4 explores alternative parameters for settings where the client has a “low budget” for verifying QC-values computed by a distrusted server. Section 5 considers the estimation of entropy in the face of adversarial sampling. Section 6 concludes with suggested questions for followup. The Appendix contains auxiliary details. Section A defines abbreviations, acronyms and symbols. Section B derives formulas for the expected value and variance of several distributions. Section C analyzes several distributions of sums of QC-values. Section D considers the discretization of QC-values and the statistics in the face of collisions. Section E presents additional large tables.

2. Exponential model

2.1. The frequency-density representation

We consider the model [BISB+18] where the quantum circuit outputs strings whose frequency density (f, a continuous approximation) of QC-values is defined by an exponential distribution (Exp) with rate N:

    f(p) = Exp[N](p) = N e^{-Np}    (2)

2.1.1. U: Uniform sampling

The function f is not a probability density function (PDF) of the output strings, but rather a PDF of their QC-values when strings are sampled uniformly from the set {0,1}^n of all n-bit strings. The corresponding cumulative distribution function (CDF) is

    F(p) = 1 − e^{-Np}    (3)

In the continuous model we calculate statistics while integrating between 0 and infinity, but the contribution between 1 and infinity is negligible (≈ e^{-N}). The actual discrete QC-values (see a discretization in Appendix D), being probabilities, are between 0 and 1.

Figure 1 plots the frequency density curve (f), and its accumulator (integral, times N), along with a histogram of f. In the histogram, the height of each constant-width bin equals N times the integral of the curve between the limits of the bin. For example: 63.2 % is the fraction of strings with “QC-values” below 1/N. The scales of the axes are represented relative to N. The curve vanishes exponentially fast.

[Figure 1: QC-values upon uniform string sampling. Plot, with axes scaled relative to N, of the Exp[N] PDF f, the scaled CDF (N × “Exp[N] CDF”), and a histogram with bin width 1/N; the first six bin fractions are 0.632, 0.233, 0.086, 0.031, 0.012, 0.004.]
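The histogram fractions in Figure 1 follow directly from the CDF 1 − e^{-Np}: the mass in the bin [k/N, (k+1)/N] is e^{-k} − e^{-(k+1)}, independent of N. A quick check:

```python
import math

def uniform_bin_fraction(k):
    """Fraction of strings with QC-value in [k/N, (k+1)/N] under the
    exponential model: CDF((k+1)/N) - CDF(k/N) = e^-k - e^-(k+1)."""
    return math.exp(-k) - math.exp(-(k + 1))

print([round(uniform_bin_fraction(k), 3) for k in range(6)])
# [0.632, 0.233, 0.086, 0.031, 0.012, 0.004]
```

The first value, 1 − e^{-1} ≈ 0.632, is the 63.2 % fraction mentioned in the text.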

[Figure 2: QC-values upon quantum string sampling. Plot, with axes scaled relative to N, of the Erlang[2,N] PDF (N × f × p), the scaled CDF (N × “Erlang[2,N] CDF”), and a histogram with bin width 1/N; the first six bin fractions are 0.264, 0.330, 0.207, 0.108, 0.051, 0.023.]

2.1.2. Q: [pure] Quantum sampling

For more insight, Fig. 2 plots the density curve of frequency times QC-value, and its accumulator (times N). Since the QC-values are themselves the probabilities of the strings upon sampling by quantum circuit evaluation, the corresponding PDF (f_Q) and CDF (F_Q) of QC-values are as follows:

    f_Q(p) = N p f(p) = N^2 p e^{-Np}    (4)

    F_Q(p) = 1 − e^{-Np} (1 + Np)    (5)

In Fig. 2, the accumulator curve shows F_Q multiplied by N to enable a simultaneous view with f_Q. This is called an Erlang distribution, with “shape” 2 and “rate” N. Interestingly, this random variable corresponds to the sum of two exponential random variables with rate N. This stems from the additivity of the Erlang distribution, of which the exponential distribution is the special case with shape 1. (It would be interesting to explore whether this equivalence as a sum of two independent exponential variables may have a more insightful interpretation as a quantum phenomenon.)
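The claimed equivalence can be checked numerically: summing two independent draws from Exp(N) reproduces the Erlang CDF 1 − e^{-Np}(1 + Np). A small Monte Carlo sketch (with N normalized to 1):

```python
import math
import random

def erlang2_cdf(p, N):
    """CDF of the Erlang distribution with shape 2 and rate N."""
    return 1 - math.exp(-N * p) * (1 + N * p)

random.seed(1)
N, trials = 1.0, 200_000
# Sum of two independent Exp(N) draws, per the additivity property.
draws = [random.expovariate(N) + random.expovariate(N) for _ in range(trials)]
for p in (0.5, 1.0, 2.0, 4.0):
    empirical = sum(d <= p for d in draws) / trials
    assert abs(empirical - erlang2_cdf(p, N)) < 0.01
```

The tolerance of 0.01 is far above the Monte Carlo standard error for 200,000 trials.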

[Figure 3: Various PDFs of QC-values. Plot, with axes scaled relative to N, of the PDFs for Pure-Quantum, Fidelity 0.4, and Uniform sampling.]

2.1.3. F: Fidelity sampling (the practical case)

In practice, honest sampling uses a quantum computer characterized by a fidelity φ between 0 and 1. This fidelity is the probability that during the quantum computation all gates in the circuit function without fault. When one or more gates fail, the model assumes that the evaluation yields a uniformly random bit-string. The resulting PDF and CDF of QC-values are a mix of the uniform and the pure-quantum case, as follows:

    f_F(p) = φ f_Q(p) + (1 − φ) f(p) = (φ N p + 1 − φ) N e^{-Np}    (6)

    F_F(p) = φ F_Q(p) + (1 − φ) F(p) = 1 − (φ N p + 1) e^{-Np}    (7)

The Fidelity case (allowing a generic φ) generalizes both the Uniform (φ = 0) and the pure Quantum (φ = 1) cases. Figure 3 plots at once three PDFs of QC-values.
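Since the fidelity-φ distribution is a convex mix, its CDF is the same mix of the uniform and pure-quantum CDFs; at φ = 0 and φ = 1 it reduces exactly to those two cases. A minimal check (function names are ours):

```python
import math

def cdf_uniform(p, N):
    """Uniform-sampling CDF of QC-values: 1 - e^{-Np}."""
    return 1 - math.exp(-N * p)

def cdf_quantum(p, N):
    """Pure-quantum CDF of QC-values, Erlang(2, N): 1 - e^{-Np}(1 + Np)."""
    return 1 - math.exp(-N * p) * (1 + N * p)

def cdf_fidelity(p, N, phi):
    """Fidelity-phi CDF: convex mix of the two cases above."""
    return phi * cdf_quantum(p, N) + (1 - phi) * cdf_uniform(p, N)

# Endpoint checks: phi = 0 gives the uniform case, phi = 1 the quantum case.
assert cdf_fidelity(0.3, 1.0, 0.0) == cdf_uniform(0.3, 1.0)
assert cdf_fidelity(0.3, 1.0, 1.0) == cdf_quantum(0.3, 1.0)
```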

2.2. Summary statistics

From the PDFs of QC-values for each type of sampling (e.g., U, Q, F), we can derive statistics of interest, such as the expected value (E) and variance (Var). Table 1 shows the resulting formulas. The calculations are detailed in Appendix B. Two observations:

• Both E and Var in the pure Quantum case are twice the corresponding statistic of the uniform case.

• The less trivial result is the variance for Fidelity sampling, since for each individual sampled string the possible outcome as a uniformly sampled string (as p_U) is not independent of the possible outcome as a correct quantum circuit output (as p_Q).

Table 1: Statistics of QC-values

  Sampling type   Random variable   Expected value   Variance
  Uniform         p_U               1/N              1/N^2
  Pure Quantum    p_Q               2/N              2/N^2
  Fidelity        p_F               (1+φ)/N          (1+2φ−φ^2)/N^2
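The fidelity-case mean and variance can be checked by simulation against the closed forms (1+φ)/N and (1+2φ−φ²)/N², which follow from the mixture of the Exp(N) and Erlang(2, N) cases (a sketch under the exponential model, with N normalized to 1):

```python
import random
import statistics

random.seed(2)
N, phi, trials = 1.0, 0.4, 400_000

def qc_value():
    """One QC-value under fidelity-phi sampling: with probability phi a
    correct evaluation (Erlang(2, N)), otherwise uniform (Exp(N))."""
    if random.random() < phi:
        return random.expovariate(N) + random.expovariate(N)
    return random.expovariate(N)

xs = [qc_value() for _ in range(trials)]
mean_exact = (1 + phi) / N                    # mixture mean
var_exact = (1 + 2 * phi - phi**2) / N**2     # mixture variance
assert abs(statistics.fmean(xs) - mean_exact) < 0.01
assert abs(statistics.pvariance(xs) - var_exact) < 0.05
```

The variance exceeds the φ-weighted average of the uniform and quantum variances, reflecting the dependency noted in the second bullet above.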

2.3. Entropy per honest string

As a reference case, consider the expected entropy of an individual string quantumly-sampled with fidelity 1. Such entropy is slightly less than the number n of bits per string, since the quantumly-generated strings do not have a uniform distribution. From the exponential model for the frequency density of QC-values, we can in a first approximation consider a notion of differential entropy (using log base 2):

    H ≈ −∫_0^∞ (N · N e^{-Np}) · p · log2(p) dp    (8)

In the integral, the factor N · N e^{-Np} corresponds to the density-number of strings that have probability p. The approximate result (ignoring terms negligible in N) is

    H ≈ log2(N) − (1 − γ)/ln(2)    (9)

where γ ≈ 0.5772 is the Euler-Mascheroni constant.

For n = 53 this means about 52.39 bits of expected entropy per string. This is the continuous approximation of the Shannon entropy (10), which sums, across every string, the product of each discrete QC-value and its negative binary logarithm. Appendix D.1 considers a discretization.

    H = −Σ_{x ∈ {0,1}^n} Prob(x) log2(Prob(x))    (10)

The 52.39 bits of entropy per string are valid in the setting of honest fidelity-1 evaluation with replacement. That value changes when considering a sample composed of strings required to be distinct, and even more so if they may be selected adversarially. Appendix D.2 considers an adversary that outputs a sample only after observing the result of many quantum evaluations of the circuit, possibly observing repeated outputs.
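The 52.39-bit figure can be reproduced numerically from the Euler-Mascheroni constant:

```python
import math

GAMMA = 0.5772156649015329      # Euler-Mascheroni constant

def expected_entropy_bits(n):
    """Differential-entropy approximation of the expected Shannon entropy
    per honest fidelity-1 quantum-sampled n-bit string: n - (1 - gamma)/ln(2)."""
    return n - (1 - GAMMA) / math.log(2)

print(round(expected_entropy_bits(53), 2))   # 52.39
```

The deficit relative to n is about 0.61 bits, independent of n.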

2.4. Sampling with vs. without replacement

Independence vs. collisions. When sampling with replacement, the probability of string collisions becomes more significant as the sample size increases. When uniformly sampling with replacement from a set with N elements, a collision probability of about 50 % occurs when the sample size m is about sqrt(2 ln(2) N). For n = 53 this corresponds to about 111.7 million, i.e., ≈ 2^26.7 strings. For non-uniform distributions, such as for quantum string sampling, collisions are expected to start earlier and be more frequent.
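The 111.7 million figure is the standard birthday bound for a ~50 % collision probability, m ≈ sqrt(2 ln(2) N), evaluated at N = 2^53:

```python
import math

def birthday_sample_size(N):
    """Sample size at which uniform sampling with replacement from N
    elements reaches ~50% probability of at least one collision."""
    return math.sqrt(2 * math.log(2) * N)

m = birthday_sample_size(2**53)
print(round(m / 1e6, 1), round(math.log2(m), 1))   # 111.7 26.7
```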

Table 2 shows a few examples: the expected number of collisions is 1 when m ≈ 2^27 for uniform sampling, or when m ≈ 2^26.5 for quantum sampling with fidelity 1; for a fixed sample size of m strings, the expected number of collisions is about m^2/(2N) for uniform sampling, and about m^2/N for fidelity-1 quantum sampling.

Table 2: Expected number of collisions (n = 53)

  Sampling   Sample size   E[#collisions]
  Uniform    ≈ 2^27        1
  Quantum    ≈ 2^26.5      1
  Uniform    m             ≈ m^2/(2N)
  Quantum    m             ≈ m^2/N

Explicit removal of collisions. We require that the final sample does not contain collisions (i.e., repeated strings marked as output of the same circuit). In the honest case this equates to sampling without replacement. Thus, we require that the client rejects any sample containing any pair of equal strings claimed to have been generated from the same circuit. If a client does not check for collisions, then an adversary could simply produce a sample as a sequence of m copies of a pseudo-randomly generated string (with 0 entropy), and have a noticeable probability of its average probability value being as high as or higher than that of an honest string from quantum evaluation.

Despite the mentioned requirement, as an approximation we calculate statistics and thresholds while assuming a sampling with replacement. This is valid when the string length is sufficiently large. It is worth noticing an (impractical) extreme case where the approximation would not hold: sampling without replacement exactly N = 2^n strings, from a single circuit with n qubits, yields 0 entropy (since all possible strings are present), apart from the entropy contained in the ordering of the strings (which can be 0 if maliciously ordered).

Changing the circuit. A repetition is counted as a collision only when it happens within the same circuit. Thus, collisions in the final sample can be inherently avoided by requiring each string to be associated with a different circuit. However, there is an efficiency motivation for proposing sampling from a single circuit or only a few circuits, assuming that: (i) it is more efficient to reevaluate a circuit, compared to preparing a new circuit for first-time evaluation; (ii) it is more efficient to compute many QC-values for a single circuit, compared to a single QC-value for each of many circuits. If the sample size is large enough for it to be reasonable to approximate the sampling as being with replacement, then the statistics of QC-values are similar between the single-circuit and many-circuits cases.

An efficiency tradeoff. An alternative option to reduce the probability of collisions, while not substantially decreasing efficiency, is to partition the sampling across several circuits. Suppose the time taken to prepare a new circuit for first-time sampling is k times longer than the time it takes to repeat one evaluation (e.g., 10 ms vs. 1 µs, i.e., k = 10^4). Then, allowing a sampling of up to 10k strings per circuit would: (i) speed up the evaluation by about k times, compared to the case of one string per circuit; while (ii) only incurring a time increase factor of about 10 % compared to the case of sampling all strings from the same circuit. With respect to verification time, the complexity of verifying strings across several circuits increases with the number of circuits, if assuming that the computation/verification of several strings within each circuit increases sub-linearly with the number of strings (e.g., that verifying ten QC-values is less than 10 times costlier than verifying a single QC-value).
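The tradeoff arithmetic can be made explicit with a small calculation (time in units of one circuit evaluation; the values k = 10^4, m = 10^6 strings, and a batch size of 10k strings per circuit are our illustrative assumptions):

```python
import math

k, m = 10_000, 1_000_000             # prep/evaluation time ratio; total strings
circuits = math.ceil(m / (10 * k))   # batches of up to 10k strings per circuit
prep = circuits * k                  # total circuit-preparation time
evals = m                            # one time unit per evaluation
one_per_circuit = m * (k + 1)        # baseline: a new circuit for each string

speedup = one_per_circuit / (prep + evals)   # about k times faster
overhead = prep / evals                      # extra time vs. a single circuit
print(round(speedup), overhead)              # 9092 0.1
```

The preparation overhead is (m/10k) · k = m/10 evaluation-times, i.e., the stated ~10 % relative to the m evaluation-times of the single-circuit case.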

A gap tradeoff. The range of possibilities between one and many circuits may also allow tuning the gap between the time to sample strings and the time to compute all QC-values. Compared with the many-strings-from-a-single-circuit setting, increasing the number of circuits to simulate may substantially increase the time for computing QC-values, without significantly increasing the time to evaluate the corresponding strings.

A subtle adversarial issue. An adversary who is able to perform a very fast repeated evaluation of a quantum circuit could possibly produce many string collisions. The analysis of the frequency of collisions for each string would show which strings are likely to have higher QC-values, and could thus provide the adversary an advantage in skewing the SQC statistic, for example to adversarially affect an estimation of entropy. This effect can be significant if the number of qubits is small, namely when the number of possible strings is smaller than the number of evaluations the adversary is able to perform within the time window to publish a sample.
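As a rough illustration of how quickly collisions appear, the following stdlib-only Python sketch (ours, not from these notes) samples from a Porter-Thomas-like distribution and compares the observed number of colliding draws with a back-of-the-envelope prediction of about B(B−1)/N expected colliding pairs for budget B, i.e., twice the uniform birthday rate, since the expected collision probability per pair is about 2/N for such distributions. The parameter values are arbitrary toy choices.

```python
import random

def porter_thomas_probs(n_strings, rng):
    # Porter-Thomas-like output distribution: probabilities proportional
    # to i.i.d. Exp(1) weights (a standard model for random circuits).
    w = [rng.expovariate(1.0) for _ in range(n_strings)]
    s = sum(w)
    return [x / s for x in w]

def expected_vs_simulated_collisions(n_strings=1000, budget=60, trials=200, seed=1):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        probs = porter_thomas_probs(n_strings, rng)
        sample = rng.choices(range(n_strings), weights=probs, k=budget)
        total += budget - len(set(sample))   # number of colliding draws
    simulated = total / trials
    # Expected colliding pairs: C(budget, 2) * sum_x p_x^2, with
    # sum_x p_x^2 about 2/N for a Porter-Thomas distribution.
    predicted = budget * (budget - 1) / 2 * 2 / n_strings
    return simulated, predicted
```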

3. Sums of QC-values

In this section we consider the distribution of the Sum of QC-values (SQC). This distribution relates to the Heavy Output Generation (HOG) test [AC16], where one wants to find whether the SQC of generated outputs is heavy enough to make it reasonable to accept that some of the originating strings must have been quantumly obtained.

We denote by S_F the random variable SQC in the honest case where m strings are sampled by a quantum computation with Fidelity φ. The cases of Uniform sampling (S_U) and pure-Quantum sampling (S_Q) are special cases, with fidelity 0 and 1, respectively.

Page 7 of 35

Figure 4: Various PDFs of SQC with m = 10 (curves: Pure-Quantum; Fidelity 0.4; Uniform; Chosen-count with q = 4).

Another adversary of interest, with a quantum computer with fidelity 1, Chooses the number q of quantumly-sampled strings, and then uniformly samples the remaining m − q ones. We denote the corresponding random variable SQC as S_C, and denote the quotient q/m as the “pseudo-fidelity” of the experiment.

3.1. Statistics of interest

Table 3 shows the expected values and variances of the SQC of m strings, for samplings of type U, Q, F, and C. The results are computed in Appendix B. For the first three sampling types the mean and the variance are proportional to m, since each isolated circuit evaluation is assumed independent of the others. It is worth noting that the expected value of S_F is the same as that of S_C with q = φm, but the variance is slightly different.

Figure 4 shows the PDFs of SQCs for the U, Q, F, and C samplings of m strings.

3.2. CDFs of sums of i.i.d. variables

The distinguishability analysis that we are aiming for requires calculating points on the CDF curves. However, in comparison with the simple formula for the CDF of QC-values, the distributions of SQCs need to account for the new parameter m, which can specify an arbitrary number of strings. A common approach to handle the increase in complexity is to apply approximations that simplify the analysis and are provably correct in the limit of an increasing number of summed variables.

Table 3: Statistics of SQCs of m strings

Sampling type      Random variable   Expected value   Variance
Uniform            S_U               m/N              m/N^2
Pure Quantum       S_Q               2m/N             2m/N^2
Fidelity (φ)       S_F               (1+φ)m/N         (1+2φ−φ^2)m/N^2
Chosen-count (q)   S_C               (m+q)/N          (m+q)/N^2

Central Limit Theorem (CLT). When summing a large number of independently and identically distributed (i.i.d.) QC-values, the distribution of their sum can be approximated by a Normal distribution (11), having a Gaussian-shaped PDF with mean μ and standard deviation σ:

f(x) = (1 / (σ·√(2π))) · exp(−(1/2)·((x−μ)/σ)^2)    (11)

At a first approximation, SQCs can be analyzed based on the central limit theorem [BISB+18]. However, the approximation can have noticeable inaccuracy if the number of summed variables is small and/or when evaluating probabilities at the tails of the distribution.

Better approximations and exact formulas. Appendix C derives exact formulas for the PDFs and CDFs of the SQCs under the sampling experiments of interest. Even for large m, the formulas are amenable to direct computation in the Uniform, the pure-Quantum and the Chosen-count cases. For the general Fidelity case we can use an exact formula for not-too-large m, but when m gets larger we use a Gamma approximation that yields better results than the CLT.

For the upcoming analysis we are specifically interested in the ability to evaluate the CDF and its inverse. The Gamma distribution, with parameters derived in Appendix C.3, has a wide applicability, namely with the following formula being applicable to several scenarios:

CDF(x) ≈ P(k, x/θ)    (12)

where P denotes the [lower] incomplete gamma regularized function (further details in Appendix C.2). Particularly, the formula is:

• Correct for the Uniform, the pure-Quantum and the Chosen-count cases, provided k = m + q (with q = 0 for uniform and q = m for pure-quantum) and θ = 1/N;

• A better-than-the-CLT approximation for the general Fidelity case (F), provided k = μ²/σ² and θ = σ²/μ, where μ and σ² are the expected value and variance of S_F. The transformation of variables ensures that the expected value (kθ = μ) and the variance (kθ² = σ²) are as in the non-approximated case.

Figures 5 and 6 illustrate, for an example with fidelity 0.5, how much better the Gamma approximation is, compared with the CLT (Gaussian) approximation. (The Gaussian curve also extends to negative values.)

For not-too-large m, we can efficiently evaluate the PDF


Figure 5: PDF approximations of QC-values (fid = 0.5): real PDF, Gamma-approx. PDF, and CLT-approx. PDF.

Figure 6: PDF approximations of SQC (m = 5, fid = 0.5): real PDF, Gamma-approx. PDF, and CLT-approx. PDF.

of the fidelity case (F) as a binomial weighted sum, over all possible numbers q of quantumly sampled strings, of the chosen-count PDFs:

PDF_F(x) = Σ_{q=0..m} C(m,q) · φ^q · (1−φ)^(m−q) · PDF_{C,q}(x)    (13)
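The exact chosen-count case is also easy to compute directly: under the exponential model used in these notes, the SQC (in units of 1/N) is then a Gamma(m+q, 1) variable, whose CDF has a closed form for integer shape. The following Python sketch (our illustration, consistent with the statistics above) contrasts it with the CLT approximation, which is visibly off at the lower tail.

```python
import math
from statistics import NormalDist

def sqc_cdf_chosen_count(x_times_N, m, q):
    # Exact CDF of the SQC under chosen-count sampling: the sum of
    # (m - q) Exp(1) and q Gamma(2,1) variables is Gamma(m + q, 1);
    # its CDF is 1 - exp(-x) * sum_{i<k} x^i / i! for integer k.
    k = m + q
    x = x_times_N
    term, s = 1.0, 1.0
    for i in range(1, k):
        term *= x / i
        s += term
    return 1.0 - math.exp(-x) * s

def sqc_cdf_clt(x_times_N, m, q):
    # CLT (Gaussian) approximation with the same mean and variance.
    mean_var = m + q   # Gamma(m+q, 1) has mean and variance m + q
    return NormalDist(mean_var, math.sqrt(mean_var)).cdf(x_times_N)
```

At the center of the distribution both agree well, but at the lower tail the Gaussian badly overstates the probability.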

3.3. Testing honest sampling

Some terminology: We define distinguishability experiments where the baseline question is: are we in the presence of an honestly generated set of m strings [instead of some other malicious or faulty sampling within a well-defined range of behaviors]? In the scope of this experiment/question, we define the following terms: negative and positive respectively mean rejection and acceptance; false and true respectively mean incorrect and correct classification. We are particularly interested in the false-positive and false-negative probabilities:

• False negative [rate] (FN): a test rejects an honestly generated sample.

• False positive [rate] (FP): a test accepts a sample generated by a reference malicious or faulty process.

When the honest case is characterized by fidelity φ1, how many (m) strings should be sampled, and what threshold (T) on the SQCs should be set for an acceptance/rejection test of honest behavior? It depends on:

• what the reference malicious sampling procedure is;

• what FN and FP rates are set as a goal.

Selecting a reference malicious sampling. As a default reference, we sometimes measure FP with respect to the Uniform sampling case (i.e., with fidelity 0). However, the definition of FP should match a higher goal of distinguishability. For example, if intending to maximize the entropy estimate (see Section 5), then a more useful FP will refer to a malicious case that ensures a certain non-zero minimum of entropy. A useful reference for the malicious sampling is Chosen-count sampling, where the adversary samples with pseudo-fidelity φ2, for some φ2 < φ1, where φ1 is the claimed honest fidelity.

Selecting concrete FN and FP rates. For cryptography applications, the value 2^−40 is a common benchmark related to statistical security in the “one-shot” security scenario: what can the adversary do if having luck up to a 2^−40-likely event? When application goals do not indicate otherwise, we recommend 2^−40 as a minimum goal for both FN and FP. In specific applications it might be reasonable to allow less stringent security parameters, if explicitly justified.

3.4. Threshold vs. probability

Figure 7 shows, for several fidelity values φ ∈ {0, 0.001, 0.002, 0.005}, the inverse CDF of the SQC across m = 10^6 sampled strings. Each of these curves represents the SQC as a function of the FN rate, i.e., of the accumulated probability (horizontal axis) that the random variable S_F is at most equal to such threshold value (vertical axis). Figure 8 shows the same for sums across m = 10^7 strings.

For the uniform case (φ = 0) the curve is evaluated directly from its exact formula. For the other cases the curves are obtained from the Gamma approximation. For such high m, the curves look visually the same as if they were obtained from the CLT approximation.

The analysis of the curves illustrates the gap in SQC as the fidelity increases, across the plotted fidelity values. Also, comparing the two plots, it is easy to notice the increase of the gap with the increase of the number m of sampled strings, making it easier to distinguish between two distinct fidelities.

3.5. Sample sizes vs. thresholds

We can now find, for each intended upper bound on


Figure 7: Inverse CDFs of SQC with N = 2^53 and m = 10^6 (m/N ≈ 1.11E−10), for fidelities 0.005, 0.002, 0.001 and 0 (uniform).

Figure 8: Inverse CDFs of SQC with N = 2^53 and m = 10^7 (m/N ≈ 1.11E−9), for fidelities 0.005, 0.002, 0.001 and 0 (uniform).

Figure 9: Sample size m (squared-log scale) vs. log2(FN=FP) (sqrt scale), with Fid1 = 0.002, for Fid2/Fid1 ∈ {3/4, 1/2, 1/4, 0}.

the FN rate, what is the minimal number m of strings needed for an intended FP rate, i.e., for the probability that a defined malicious execution with lower pseudo-fidelity is accepted.

Example of FN vs. FP rates. Consider an honest sampling of m strings using fidelity φ1. If the intended FN rate is 0.2, then the threshold is set at T, i.e., the value satisfying Prob(S_{φ1} ≤ T) = 0.2. If the malicious reference is the uniform case, then the false-positive rate is FP = Prob(S_U ≥ T). If the malicious reference is the case of fidelity equal to half of the honest fidelity, then we get FP = Prob(S_{φ1/2} ≥ T).
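This FN-to-FP translation is easy to carry out numerically under the CLT approximation. The sketch below is ours: the per-string mean 1+φ and variance 1+2φ−φ², in units of 1/N, follow the SQC statistics of Section 3.1; it sets the threshold from the intended FN rate and then evaluates the resulting FP rate for a chosen malicious fidelity.

```python
from math import sqrt
from statistics import NormalDist

def fp_rate(m, fid_honest, fid_malicious, fn):
    # CLT sketch: the threshold T is set so that an honest SQC with
    # fidelity fid_honest falls below T with probability fn; returns
    # the probability that a lower-fidelity malicious SQC exceeds T.
    def dist(fid):
        mean = m * (1 + fid)                    # in units of 1/N
        var = m * (1 + 2 * fid - fid ** 2)
        return NormalDist(mean, sqrt(var))
    t = dist(fid_honest).inv_cdf(fn)
    return 1 - dist(fid_malicious).cdf(t)
```

For instance, with m = 10^6, honest fidelity 0.002 and FN = 0.2, the FP against the uniform case is much smaller than against a malicious fidelity of 0.001, matching the qualitative picture above.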

Figures 9 and 10, respectively for honest fidelities 0.002 and 0.01, plot curves of the sample size m (number of sampled strings) vs. the FN=FP rate ε, for a reference honest case with fidelity Fid1, and a malicious Chosen-count sampling with pseudo-fidelity Fid2, for Fid2/Fid1 ∈ {3/4, 1/2, 1/4, 0}. The scales of the axes of the plots are adjusted for a consolidated view of all plotted pseudo-fidelities, while FN=FP ranges within [2^−40, 2^−1]. Each curve, labeled with honest fidelity Fid1 and malicious pseudo-fidelity Fid2, shows m as a function of ε = FN = FP, such that there exists a distinguishability threshold T for which Prob(S_{Fid1} ≤ T) = Prob(S_{Fid2} ≥ T) = ε.

For simplicity of computation, we applied here the CLT approximation, which provides a simple formula for m based on the inverse-erf function ((51), (50)), as derived in Appendix C.1. It is worth noting that the analytic results are independent of the string-space size N (here assumed to be 2^53), if we assume a sampling with replacement.
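A minimal version of that computation, using the standard-library inverse normal CDF in place of inverse-erf (our sketch of the CLT-based calculation, not the notes' equation (51) itself; its outputs agree with the first entries of Table 4):

```python
from math import sqrt
from statistics import NormalDist

def sample_size(fid1, fid2, eps):
    # CLT-based estimate of the number m of strings needed so that a
    # threshold T exists with Prob(S_fid1 <= T) = Prob(S_fid2 >= T) = eps.
    # Per-string QC-values (in units of 1/N): mean 1 + fid,
    # variance 1 + 2*fid - fid^2 (exponential/Porter-Thomas model).
    z = NormalDist().inv_cdf(1 - eps)
    s1 = sqrt(1 + 2 * fid1 - fid1 ** 2)
    s2 = sqrt(1 + 2 * fid2 - fid2 ** 2)
    return (z * (s1 + s2) / (fid1 - fid2)) ** 2
```

For example, sample_size(0.002, 0.0, 2**-40) is about 5E+7, consistent with Table 4.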

Table 4 shows selected examples of sample size (m) vs. honest fidelity (Fid1) and pseudo-fidelity (Fid2). This is a sub-table of the more detailed Table 14 in Appendix E. For example, the table shows that about 50 million strings


Figure 10: Sample size m (squared-log scale) vs. log2(FN=FP) (sqrt scale), with Fid1 = 0.01, for Fid2/Fid1 ∈ {3/4, 1/2, 1/4, 0}.

Table 4: Number of strings for SQC distinguishability
(Selected entries from Table 14)

                 Number m of strings to sample
Fid1    FN=FP    Fid2/Fid1 = 0   Fid2/Fid1 = 1/10   Fid2/Fid1 = 1/2
0.002   2^−40    4.977E+7        6.146E+7           1.993E+8
        2^−20    2.273E+7        2.807E+7           9.102E+7
        2^−10    9.569E+6        1.182E+7           3.831E+7
        0.1      1.646E+6        2.032E+6           6.589E+6
0.01    2^−40    2.007E+6        2.480E+6           8.066E+6
        2^−20    9.165E+5        1.133E+6           3.684E+6
        2^−10    3.858E+5        4.767E+5           1.551E+6
        0.1      6.635E+4        8.199E+4           2.667E+5

need to be sampled in order to obtain FN=FP rates of 2^−40, if using a quantum computer with fidelity 0.002 and if the FP refers to a malicious uniform sampling. The number increases about 4-fold if, instead, FPs are measured with respect to a malicious sampling with pseudo-fidelity of about one half of the honest fidelity. The needed number of QC-values decreases about 25-fold if using instead a quantum computer with fidelity 0.01.

Why consider pseudo-fidelities? Distinguishing the honest vs. the malicious uniform case (fidelity 0) can be useful to ascertain that some of the strings resulted from a quantum-circuit evaluation. However, for the goal of estimating entropy (see Section 5), it is more relevant to distinguish the honest case from a malicious sampling with some positive pseudo-fidelity Fid2 > 0.

4. Low-budget clients

One drawback of the distinguishability experiment considered in the previous section is that it requires computing many QC-values. In this section we consider a different statistic that allows an interesting tradeoff: a powerful server computes more QC-values; a weak verifying client verifies fewer QC-values. The client verification can be a recomputation of QC-values followed by an equality check, but other verifications are conceivable.

In the next subsections we consider one approach for enabling a computationally cheaper verification by clients. Other strategies exist, and it would be interesting to investigate which ones may yield useful tradeoffs.

4.1. Truncated QC-values

Suppose the client wants to save work by verifying only (the sum of) a fraction ρ of the QC-values. This may be particularly useful in the setting of one string per circuit, as there the cost of computing QC-values is maximal per string. One approach is to set a threshold on the QC-value, and then check only (the sum of) those with a higher value.

Truncation thresholds. We denote as truncated QC-value (TQC), with respect to some truncation threshold t, the measure that is equal to the QC-value when the latter is not smaller than t, and is 0 otherwise. For each intended verification proportion ρ, it is useful to know the matching truncation threshold t. Table 5 shows, for several fidelities, the truncation threshold that corresponds to each of several verification proportions ρ. For notational simplicity we let τ = N·t.

Table 5: TQC truncation thresholds
(The indicated thresholds τ are to be divided by N.)

           Thresholds τ, per verification proportion (ρ)
Fidelity   ρ=1/2   ρ=1/4   ρ=1/10   ρ=1/100   ρ=1/1000
0.0        0.693   1.386   2.303    4.605     6.908
0.001      0.694   1.388   2.305    4.610     6.915
0.002      0.695   1.389   2.307    4.614     6.922
0.005      0.697   1.393   2.314    4.628     6.942
0.01       0.700   1.400   2.326    4.651     6.975
0.02       0.707   1.414   2.348    4.695     7.039
0.05       0.729   1.457   2.417    4.821     7.216
0.1        0.767   1.529   2.528    5.011     7.465
0.2        0.850   1.675   2.739    5.331     7.852
0.5        1.146   2.105   3.272    5.990     8.573
1.0        1.678   2.693   3.890    6.638     9.233


Since the threshold is measured with respect to the honest distribution of a single QC-value, we can find an exact solution based on the inverse CDF of the QC-value. In units of τ = N·t, and for claimed fidelity φ:

ρ = (1 + φ·τ)·e^(−τ)    (14)

τ = −1/φ − W_{−1}(−(ρ/φ)·e^(−1/φ))    (15)

where W_{−1} is the real lower branch of the LambertW function [DLMF].

For example, for a verification proportion of ρ = 1/2 in a case of claimed fidelity 1, the threshold should be set at 1.678/N. (It is interesting to notice that even though the expected value is 2/N, the median occurs at about 1.68/N.) As another example, for a verification proportion of ρ = 1/1000, the threshold should be set at about 9.233/N if the fidelity is 1, and at 6.922/N if the fidelity is 0.002.
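Under the exponential (Porter-Thomas) model used here, the proportion of QC-values above a threshold τ (in units of 1/N) at fidelity f is (1 + f·τ)·e^(−τ), which is decreasing in τ, so the truncation threshold can also be found by simple bisection without a LambertW implementation. A stdlib-only sketch (ours), reproducing entries of Table 5:

```python
import math

def truncation_threshold(fidelity, rho):
    # Solve (1 + fidelity * tau) * exp(-tau) = rho for tau by bisection.
    # The left-hand side is 1 at tau = 0 and decreases monotonically.
    f = lambda tau: (1 + fidelity * tau) * math.exp(-tau) - rho
    lo, hi = 0.0, 50.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For fidelity 0 this reduces to τ = −ln(ρ), matching the first row of Table 5.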

Consider the random variable Y, equal to the QC-value X if X > t and equal to 0 otherwise. (The indicator function used here is equal to 1 if its input is greater than t, and equal to 0 otherwise.) For the sampling of a single string we consider the cases U, Q and F, i.e., when X is obtained respectively from Uniform, pure-Quantum and Fidelity samplings. Table 6 shows the formulas for their expected values and variances. We use τ = N·t. Notice how replacing τ by 0 leads to the formulas in Table 1.

Table 6: Statistics of Truncated QCs

Statistic   Formula
E_U[Y]      (1+τ)·e^(−τ) / N
V_U[Y]      ((τ²+2τ+2)·e^(−τ) − ((1+τ)·e^(−τ))²) / N²
E_Q[Y]      (τ²+2τ+2)·e^(−τ) / N
V_Q[Y]      ((τ³+3τ²+6τ+6)·e^(−τ) − ((τ²+2τ+2)·e^(−τ))²) / N²
E_F[Y]      (1−φ)·E_U[Y] + φ·E_Q[Y]
V_F[Y]      (1−φ)·(V_U[Y] + E_U[Y]²) + φ·(V_Q[Y] + E_Q[Y]²) − E_F[Y]²

Legend: E (Expected value); F (Fidelity); Q (pure Quantum); τ (= N·t, where t is the truncation threshold); U (Uniform); V (Variance); Y (random variable: truncated QC-value).
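These closed forms follow from the exponential model of QC-values (Exp(1) for uniform sampling, Gamma(2,1) for pure-quantum, in units of 1/N) and are easy to sanity-check by Monte Carlo; the sketch below (our illustration) does so for the truncated mean.

```python
import math
import random

def truncated_qc_moments_mc(fidelity, tau, n=200_000, seed=7):
    # Monte-Carlo estimate of the truncated-QC mean and variance, in
    # units of 1/N: a QC-value is Exp(1) under uniform sampling and
    # Gamma(2,1) under pure-quantum; fidelity f mixes the two.
    rng = random.Random(seed)
    vals = []
    for _ in range(n):
        x = rng.expovariate(1.0)
        if rng.random() < fidelity:
            x += rng.expovariate(1.0)   # Gamma(2,1) = sum of two Exp(1)
        vals.append(x if x >= tau else 0.0)
    mean = sum(vals) / n
    var = sum((v - mean) ** 2 for v in vals) / n
    return mean, var

def truncated_mean_uniform(tau):
    # Closed form for the uniform case: E[X * 1(X >= tau)], X ~ Exp(1).
    return (1 + tau) * math.exp(-tau)
```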

4.2. Sum of truncated QC values (STQC)

Similarly to how we considered sums of QC-values, we are now interested in the sum of truncated QC-values. We use S′_E to denote the sum of TQC values (STQC) of m strings obtained in an experiment E. Their expected values and variances are obtained by simply multiplying by m the statistics of the single string.

We also consider the Chosen-count case, where an adversary with a fidelity-1 quantum computer chooses in advance the number q (out of m) of strings that it evaluates quantumly. The expected value and variance of the STQC are then given by the corresponding weighted sums from the uniform and the pure-quantum cases:

E[S′_C] = q·E_Q[Y] + (m−q)·E_U[Y]    (16)

V[S′_C] = q·V_Q[Y] + (m−q)·V_U[Y]    (17)

where E_U[Y], E_Q[Y], V_U[Y], V_Q[Y] are the single-string truncated statistics of Table 6.

It is now interesting to calculate the increase in the number of QC-values that a server needs to compute. Table 7 (with selected entries from Table 15 in Appendix E) shows, for several FN1 = FP2 rates, honest fidelities Fid1, fidelity proportions Fid2/Fid1, and verification proportions ρ, the number m_v of positive TQC-values to be verified by the client, when the number m of TQC-values computed by the server is higher by the corresponding proportion.

Table 7: Number of client-verified TQC values
(Selected entries from Table 15)

m_v: # client-verified TQC-values
(row groups: honest fidelity Fid1 = 0.002, then Fid1 = 0.01, each with the FN=FP rates of Table 4; column pairs: two verification proportions ρ, each shown for Fid2/Fid1 = 0 and Fid2/Fid1 = 1/2)

7.29E+6   2.92E+7   2.24E+6   8.99E+6
3.33E+6   1.33E+7   1.02E+6   4.10E+6
1.40E+6   5.61E+6   4.31E+5   1.73E+6
2.41E+5   9.65E+5   7.41E+4   2.97E+5
2.97E+5   1.19E+6   9.34E+4   3.78E+5
1.36E+5   5.45E+5   4.27E+4   1.73E+5
5.70E+4   2.29E+5   1.80E+4   7.27E+4
9.81E+3   3.95E+4   3.09E+3   1.25E+4

5. Entropy estimation

The previous section focused on distinguishing an honest sampling with a fixed fidelity φ1 from a malicious sampling with a lower pseudo-fidelity φ2, including 0. This section focuses directly on the goal of entropy estimation from a set of sampled strings, and, on that basis, on deciding parameters for a distinguishability test and which adversaries to consider. For simplicity, the discussion hereafter assumes a distinguishability test based on SQCs (as described in Section 3). The use of different statistics could change the estimated parameters.

This section is organized as follows: Section 5.1 gives an informal overview of the analysis; Section 5.2 discusses the client perspective; Section 5.3 defines the pseudo-fidelity adversary, for when the sampling budget of the adversary is not enough to observe collisions; Section 5.4 considers the collisional adversary, which takes into account the information obtained from observing collisions; Section 5.5 considers the post-processing of the sample and a possible hash-biasing attack.

5.1. Overview

The client. With respect to obtaining certifiable entropy, we consider two possible perspectives of the client. Either: (i) it knows how much entropy it wants, and with which assurance it wants to obtain it (FN and FP), and then defines corresponding parameters for a sampling experiment (e.g., the number m of strings to be sampled) and for an acceptance/rejection test (e.g., an SQC distinguishability threshold T); or (ii) it is given a sample of distinct strings claimed to have been sampled from a verifiably fresh circuit, and then, based on their QC-values and on an intended FP rate, it decides a lower bound for the entropy contained in the sample.

The adversary. For either of the above perspectives, the adversary tries to minimize the entropy contained in the published sample, conditioned on satisfying the admissibility criterion of the client (an SQC threshold) with a probability not smaller than FP. We denote the latter as the FP goal (or FP constraint).

The adversary of interest is assumed to have a quantum computer with fidelity 1. It produces a sample of m strings where only a small number q of them are actually obtained from quantum evaluation. The sample generation includes adversarial actions of rejection sampling, reordering and biasing, as ways to further reduce the entropy. The m − q non-quantumly-generated strings are obtained pseudo-randomly (with entropy 0), without dependency on the circuit specification.

A black-box model. The adversary is assumed to be unable to take advantage of knowledge of the circuit specification, apart from being able to obtain outputs from its evaluation. Therefore, we idealize the adversary as having black-box access to the circuit, being able to request its evaluation a number B of times. This substantiates the assumption of not being able to compute or estimate QC-values from the circuit specification alone, nor of simulating an evaluation with a correlated probability distribution. Nonetheless, this still allows the adversary to gather some information about QC-values, depending on the sampling budget B, by considering the multiplicity of each occurring string.

Entropy estimation. For concrete parameters of an experiment, we estimate the number of quantumly-generated strings, possibly from various probability distributions. We consider that the strings may have interdependencies, such as those related to sampling without replacement, rejection sampling, and ordering.

5.2. The client

Timing assumption. The continuing discussion retains the assumption that the adversary cannot compute the QC-values of strings by the deadline to publish them. This requires the circuit specification to not be known in advance by the adversary. For example, one may base this assumption on requiring that the circuit specification be pseudo-randomly generated from the timely output of a public-randomness beacon, if it is reasonable to assume that the beacon is not maliciously colluding with the adversary.

Two perspectives. Using a statistic such as the SQC or STQC described in the previous sections, we consider how to parametrize an experiment and perform an estimation of entropy. For simplicity, hereafter we focus on the SQC statistic. We consider two perspectives:

1. Decidability problem. Given a goal of obtaining at least a certain number of bits of entropy, with a FP rate of at most a chosen bound, and accepting a FN rate up to a chosen bound for an honest quantum-circuit operator with fidelity φ1, the client decides the number m of strings to request from the operator to publish as a sample, and which SQC distinguishability threshold T to use in order to accept or reject the sample. The client determines m and T assuming that an adversarial operator (with quantum fidelity 1, for a conservative estimate) will craft a sample with minimum entropy, subject to having a probability at least FP of having an SQC greater than the threshold T. The estimate depends on the sampling budget B assumed to be available to the adversary.

2. Estimation problem. Entropy can only be estimated retrospectively. Given a circuit specification and a published sequence of m strings, the client starts by computing its SQC. The individual QC-values are either computed by the client or received from an external trusted party that would have computed them. The client also defines, based on the time assumed to have been available to the adversary since learning the circuit specification, the assumed sampling budget B. The client has its own requirements of FN and FP, and assumes that the adversary may have performed a targeted attack based on those parameters. Thus, the client assumes the final sample includes only a “small” number of quantumly-generated strings, minimizing the overall entropy while still attempting to ensure a probability of at least FP of having an SQC at least T. The client estimates this entropy.


Interesting adversaries. We consider that the client is only interested in adversaries that publish samples whose probability of being accepted by the client is at least FP. Any adversary playing outside this constraint is ignored, since within the intended level of assurance the client will reject the sample. In particular, we ignore adversaries that would publish only a sequence of pseudo-random strings (with overall entropy 0), since such a sample would lead to acceptance with probability less than FP.

Number of quantumly-sampled strings. A key step of the analysis is determining the number q of strings that an optimal adversary has included (or will include) in the final sample, and the corresponding expected QC-values and entropies of those strings. The client also assumes that the adversary chooses those strings knowing the goal/parameters set by the client. In both perspectives, the estimates/parametrizations by the client are based on the assumption that, before the deadline to publish a sample, the adversary cannot compute anything about the QC-values of concrete strings, apart from probabilistic information based on the number of times each string has appeared when quantumly sampling a large (yet feasible) number B of strings from the circuit. The collisional adversary in Section 5.4 indeed considers such information when selecting which quantumly-generated strings to include in the final sample.

5.3. The pseudo-fidelity adversary

We focus first on a specific attack performed by what we denote as the “pseudo-fidelity” adversary. This is tailored to the case where the sampling budget of the adversary is not large enough to enable finding many collisions (repetitions of strings) before having to publish a sample. In this subsection we consider the decidability perspective, where the client decides the size m (number of strings) of the sample to be published.

5.3.1. Algorithm

The pseudo-fidelity adversary operates as follows:

1. Input. It receives four input parameters:

(a) B: (budget) # quantum evaluations of the circuit;
(b) m: # strings to publish in a sample;
(c) T: SQC threshold of acceptance by the client;
(d) FP: maximum false-positive rate.

2. Quantum over-sampling. It quantumly evaluates the circuit, with fidelity 1, in a black-box manner, B times. We call pre-sample the set of obtained distinct strings; the expected number D of distinct strings follows the auxiliary formulas in Section D.2.

3. Number of quantum strings. As a function of the input parameters (m, T, FP), the adversary calculates the minimum number q of quantumly-generated strings (besides the m − q strings to be pseudo-randomly generated) to include in the final sample, such that the client accepts with probability at least FP. Recalling, from Section 3, the notation S_C for the SQC random variable when doing chosen-count sampling, the condition is:

q = min { q′ : Prob(S_{C,q′} ≥ T) ≥ FP }    (18)

The simplifying assumption here is that the QC-values of these q strings are i.i.d., with the expected value and variance of the pure-quantum case, whereas in fact the budget B and the observation of collisions would enable inferring more detailed information.

4. Rejection sampling. Using a secret key known a priori, the adversary seeds a [cryptographic] pseudo-random permutation (PRP), thus defining a bijective mapping from the set of n-bit strings onto itself. To reduce the expected entropy in the final sample, it performs “rejection sampling” as follows: (i) it computes the PRP-output of every string in the pre-sample; (ii) it orders the D distinct strings, ascendingly, with respect to their PRP-output; (iii) it selects, in the corresponding PRP-lexicographic ordering, only the first q strings.

5. Positioning of strings. The adversary initializes a sample vector of length m (the sample size requested by the client) and pseudo-randomly selects q locations therein. It then places there the q quantumly-obtained strings selected in the previous step, in the respective devised order. Then, it pseudo-randomly generates m − q additional strings, distinct from the already quantumly-obtained strings, and positions them in the m − q free positions of the sample. This step adds no additional entropy. Note: Section 5.5 mentions a possible additional step of hash-biasing.

6. Output. The adversary outputs the sequence of m strings.
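Steps 4 and 5 can be sketched as follows (a toy Python illustration, ours: a keyed hash stands in for the keyed PRP used only for ordering, and the quantum pre-sample is simply given as an input list):

```python
import hashlib
import random

def craft_sample(pre_sample, m, q, key, rng):
    # Toy sketch of steps 4-5 of the pseudo-fidelity adversary.
    prp = lambda s: hashlib.sha256(key + s).digest()
    # (4) keep the q strings that come first in PRP order
    chosen = sorted(set(pre_sample), key=prp)[:q]
    # (5) place them, in that order, at q random positions of an m-vector
    positions = sorted(rng.sample(range(m), q))
    sample = [None] * m
    for pos, s in zip(positions, chosen):
        sample[pos] = s
    # fill the remaining slots with fresh pseudo-random (entropy-0) strings
    used = set(chosen)
    for i in range(m):
        while sample[i] is None:
            cand = rng.getrandbits(53).to_bytes(7, "big")
            if cand not in used:
                used.add(cand)
                sample[i] = cand
    return sample
```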

5.3.2. Statistics

We now summarize how the client, with a goal of obtaining a sample with a certain number of bits of entropy (e.g., 1024), should parametrize an experiment when assuming the quantum-computer operator is a pseudo-fidelity adversary.

Pre-sample size. We assume that the adversary can sample the quantum circuit at most B times in the allotted time window. For example, if one circuit evaluation counts as one cycle, then sampling the assumed budget of strings within the allotted time window would require a quantum computer with a frequency of about 1.12 MHz. For B much smaller than N, we assume for simplicity that the number D of obtained distinct strings is close to B (see Section D.2).

QC-values and pseudo-fidelity. With B much smaller than N, the expected QC-value in the pre-sample is very close to what was determined in (9), i.e., 2/N. Since the client wants to accept an honest sample with high probability (i.e., a low FN rate), the corresponding SQC threshold necessarily allows some probability of it also being achieved with a pseudo-fidelity slightly lower than the honest fidelity. When the adversary does rejection sampling to select q strings, then by definition the pseudo-fidelity is q/m. The client assumes the adversary uses the smallest pseudo-fidelity that will still pass the SQC threshold test with probability not lower than the FP rate. Section D.2 considers a more detailed approximation (124) for the expected QC-value, with a correction that depends on the budget factor B/N. However, compared with 2/N the correction is only significant when it is already relevant to consider the more sophisticated collisional adversary from Section 5.4, which takes advantage of collisions observed in the pre-sampling stage.

Entropy estimation. The original expected entropy per each of the obtained distinct strings in the pre-sample depends on the sampling budget B. Let h denote the expected entropy for a thought experiment where the adversary would now output a single string uniformly selected from the pre-sample. This is close to the value determined in (9) in Section 2.3 if B is much smaller than N, or up to log2(N) when B is so large that the pre-sample contains almost all N strings. However, the adversary has performed “rejection sampling” to reduce the expected entropy per string (and thus of the final sample).

An initial intuition is that the rejection sampling induces the selected PRP-outputs to start with about log2(D/q) zeros (meaning we can discount about log2(D/q) bits of entropy per string). Also, the ordered selection reduces the space of possible vectors by a factor q!, increasing the probability of each possible one by q!. This would lead to approximating the expected entropy as follows:

H ≈ q·(h − log2(D/q)) − log2(q!)    (19)

However, a lower value is obtained if we consider an iterated procedure, one string at a time (i.e., using q = 1), repeated the original q times. In the iterated case, at the i-th selection the entropy reduction from the above formula would be log2(D − i + 1), leading to an apparent overall reduction of log2 of the descending factorial of D of order q.

The Shannon entropy of a sampling procedure is the expected value of the negative logarithm (base 2) of the probability of obtained samples. This logarithm can be interpreted as a summation of logarithms, where each new logarithm applies to the probability of a new string being selected, conditioned on the strings already selected. The logarithm for the probability of the overall sample is itself a random variable, which has not only an expected value (the Shannon entropy) but also, for example, a variance. In practice it may be relevant to measure something like the minimum possible entropy, or a bound that the variable exceeds with overwhelming probability. Some details about this variable are considered in Section D.3.

For simplicity, and being conservative in the estimation of Shannon entropy, in the subsequent discussion we focus on the approximation obtained by iterating formula (19) one string at a time. We also plug in a minor correction (see Section D.3.1) to the a priori average entropy per string (i.e., before rejection sampling), due to the reduction of the set from which the i-th string is selected. In summary, we consider the following approximation:

k · h̄_β − log((βk) · (βk − 1) · … · (βk − k + 1))    (20)

where h̄_β is the average entropy per string expected for the pre-sample of distinct strings obtained after βk quantum evaluations of the circuit, and where the second term is the binary logarithm of the descending factorial of βk of order k, i.e., of (βk) · (βk − 1) · … · (βk − k + 1). A more accurate estimate could be based on simulation, as mentioned in Section D.3. For example, there we show that, for the fidelity and budget considered there, both formulas are a non-tight lower-bound of the expected entropy.
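As a small numeric sketch of approximation (20), taking the average per-string entropy h̄ as a given input and using illustrative (hypothetical) numbers, the descending-factorial term can be computed stably via lgamma:

```python
import math

def log2_descending_factorial(x, k):
    """log2 of x*(x-1)*...*(x-k+1), via lgamma to avoid huge intermediates."""
    return (math.lgamma(x + 1) - math.lgamma(x - k + 1)) / math.log(2)

def entropy_estimate(h_bar, beta, k):
    """Entropy approximation in the style of (20): k*h_bar minus the log2
    descending factorial of beta*k of order k. All inputs are assumptions
    supplied by the caller, not values taken from the paper's tables."""
    return k * h_bar - log2_descending_factorial(beta * k, k)

# Illustrative numbers only (hypothetical h_bar, beta, k):
print(entropy_estimate(h_bar=53.0, beta=4.0, k=1024))
```
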

Parametrization. Solving for k yields the (approximate) minimum number of quantumly-generated strings that an interesting adversary will use. The client assumes that the sample of strings will be produced by a chosen-count method, using exactly k quantumly generated strings (i.e., the corresponding pseudo-fidelity), selected (and ordered) upon rejection sampling from a set of βk strings. Thus, the client determines the remaining parameters, respectively, using the approximated equations (51), and (48) or (46), in Section C.1, as in the following examples.

5.3.3. Initial examples

Example 1. Let … and …, which implies … and … . Instantiating this in equation (19) for a goal of … bits of entropy and solving for k yields …, which rounded up is … (the assumed number of quantumly-generated strings that the adversary will include in the final sample).

When FN = FP, the needed sample size n can, by the CLT, be approximated as in (51) in Appendix C.1, using … . Let φ be the quantum fidelity claimed by the honest operator. Then:

• n ≈ …E+… for …

• n ≈ …E+… for …

The corresponding SQC thresholds can be obtained from (48) in Section C.1, using … and … . Comparing against Table 8, the values obtained for … are only slightly higher than (but quite close to) those obtained for … and … . This is expected, since … makes the ratio … very small, namely …E−6 and …E−7, respectively when … is 0.01 and 0.002. Therefore, it is expected that a small factor increase in … will yield a large factor increase in estimated entropy.

Table 8: Number of strings for SQC distinguishability (selected entries from Table 14)

…          …          …          …
4.98E+7    5.08E+7    8.85E+7    1.99E+8
1.65E+6    1.68E+6    2.93E+6    6.59E+6
2.01E+6    2.05E+6    3.57E+6    8.05E+6
6.63E+4    6.77E+4    1.18E+5    2.66E+5

Example 2. If the client wants … bits of entropy, then we get … . Using the same FN=FP rates …, and assuming that the adversary will use … quantumly-generated strings, leads to:

• n ≈ …E+… for …

• n ≈ …E+… for …

We note that the last example already exceeds the considered budget …, meaning that such a parametrization would not be suitable for a real experiment. The client should then attribute a higher budget to the adversary (e.g., possibly by giving more time for sampling). Consequently, the client should re-estimate the parameters and possibly also start taking into account the advantage that the adversary may obtain from observing collisions in the over-sampling stage.

5.3.4. Other examples

We now show a few examples of deriving the parameter k from Table 8 (which has a few selected entries from Table 14), based on the sample size n (total number of strings) calculated with respect to the SQC statistic, for several pseudo-fidelities and FN=FP rates.

Example 3. Let … . For FN=FP rates of …, one needs …E+… sampled strings to distinguish between the honest sampling with fidelity …, and a malicious chosen-count sampling with pseudo-fidelity equal to one hundredth of it (i.e., …). Such a malicious case contains only … (= …) strings generated from a correct circuit evaluation.

Example 4. Let … . Compared to Example 3, increasing n by a factor of about 78 %, to …E+…, provides the same FN=FP rates (…) but when FP refers instead to a malicious pseudo-fidelity equal to 1/4 of the honest fidelity. That corresponds to … quantumly generated strings, which is 40 times more than in Example 3.

Example 5. The cost in sample size is approximately linear in the base of the FN=FP rates. Let … . Consider an FP=FN goal of …, for cryptographic suitability. More than 49.7 million strings need to be obtained for any amount of entropy to be present in the malicious case, if supposing … . For example, for a pseudo-fidelity …, the total number of strings is about …E+…, meaning about 1 016 quantum-sampled strings. Overall the sample size is about 30 times larger than in Example 3, to improve the FP=FN rates from … to … .

Example 6. Let … . A substantial improvement is possible by increasing the honest fidelity reference. For example, for … (five times larger than in Example 3), the needed total number n of strings is about 2 million (i.e., about 25 times fewer) for the same … ratio as in Example 5. However, to satisfy the FP constraint the malicious adversary would (in this example) also be using a higher pseudo-fidelity, equal to … . One would then assume there are 205 quantum-sampled strings present in the sampled string set.

5.4. The collisional adversary

We consider now how the observation of collisions during over-sampling may give an advantage to the adversary.

Informal description. The adversary 𝒜 uses a quantum computer with fidelity 1 to evaluate the circuit "many" times, until obtaining "many" collisions. The output strings with the most collisions have a higher expected QC-value (and lower expected entropy) than those with fewer collisions. Based on this, fewer quantumly generated strings in the final sample can achieve a higher SQC. 𝒜 organizes the strings in bins, one for each multiplicity of occurrence (i.e., bin i becomes the set of strings that appeared, each, exactly i times). Depending on the tally of multiplicities, i.e., the vector of numbers of strings across bins, 𝒜 chooses from which sets of bins to select strings for the final sample, along with applying rejection sampling.

When a small sampling budget does not lead to collisions, the collisional adversary corresponds to the pseudo-fidelity adversary from Section 5.3.1. Conversely, in a theoretical extreme of a sampling budget being a very large exponential in the number of qubits, each string would get a multiplicity sufficiently apart from the multiplicities of other strings. From those multiplicities the adversary could estimate with high accuracy the QC-value of each string. This would enable a straightforward simulation of sampling from a circuit evaluation, while in fact only making a pseudo-random selection with overall zero entropy.

5.4.1. Algorithm

The collisional adversary operates as follows:

1. Input. As in Section 5.3.1, 𝒜 receives the input parameters: …; …; …; … .

2. Quantum over-sampling. 𝒜 uses its sampling budget to obtain a sequence of strings, called the pre-sample, which may have repetitions (i.e., collisions). 𝒜 organizes the strings into bins. Bin i is the set of strings that appeared with multiplicity i in the pre-sample.

Let |Bᵢ| denote the expected number of strings in bin i. For example, with … qubits we have: …; …; … . We abuse notation and also let |Bᵢ| denote the actual number of strings obtained with each multiplicity in a given experiment. We have:

Σᵢ i · |Bᵢ| = (the total sampling budget)    (21)

We use a prime in superscript (e.g., as in B′ᵤ) to indicate the union of bins of multiplicities larger than or equal to a certain value. For more general unions of bins we can simply use a set, instead of an integer, as index. For example: B_S = ∪_{i∈S} Bᵢ.

3. Number of quantum strings. The adversary decides, from the pre-sample with distinct strings in each bin i, totalling … distinct strings, how many (k) quantumly-obtained strings, and from which unions of bins, to include in the final sample. An adversarially optimal selection takes into account that the expected QC-value and entropy of quantumly sampled strings also vary across the bins. Typically, the strings with higher multiplicity are preferable in terms of having higher QC-value and lower entropy. However, an optimal decision can be more intricate considering the rejection sampling step ahead, which depends on the number of strings available in each bin i, and on the SQC threshold required by the client to accept a sample.

Concretely, as a function of the input parameters (…, …, …), and of the tally of collisions in the pre-sample, 𝒜 will determine a sequence of subsets (possibly only one) of multiplicities, from whose corresponding unions-of-bins to pseudo-randomly sample, and determine how many strings to select from each such union. In other words, 𝒜 needs to determine two (jointly optimal) non-empty same-length sequences:

• (S₁, S₂, …) (or simply S), where each Sⱼ is a subset of possible multiplicities.

• (k₁, k₂, …), where each kⱼ is a positive integer, such that Σⱼ kⱼ is the number of quantumly-obtained strings to be included in the final sample (not counting strings that, although obtained in the over-sampling step but not selected in the rejection-sampling step, may in the subsequent step be, by coincidence, pseudo-randomly selected).

Note on various options: The pseudo-fidelity adversary described in Section 5.3.1 uses a single union, i.e., all strings are selected from within the set of all pre-sampled distinct strings (regardless of multiplicity); conversely, the collisional adversary is allowed a more intricate choice across different unions of bins. We define that the optimal collisional adversary is one that makes an optimal choice of the vectors S and (k₁, k₂, …), for the purpose of minimizing entropy while satisfying the FP goal.

4. Rejection sampling. From each union set B_{Sⱼ}, the adversary uses a differently seeded pseudo-random number generator to obtain kⱼ strings by rejection sampling. Specifically, 𝒜 selects the lexicographically first kⱼ strings upon application of the pseudo-random permutation. See Section D.3.2 for a discussion of other options.

5. Positioning of strings. 𝒜 initializes a sample vector of length n (the sample size requested by the client) and pseudo-randomly selects a vector of distinct positions therein. 𝒜 then sequentially places in those positions the quantumly obtained strings selected in the previous step, which possibly came from various bins, in the respective devised order (namely, considering the lexicographic ordering respective to the pseudo-random permutation used for rejection sampling in each union of bins). Then, 𝒜 pseudo-randomly selects other strings, distinct from the already selected strings, to complete a final sample with n strings overall.

6. Output. 𝒜 outputs the sequence of n strings.
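As a concrete toy sketch of steps 2, 4 and 5 (with a seeded hash standing in for the pseudo-random permutation, and all names, sizes and parameters hypothetical), the adversary's bookkeeping can look as follows:

```python
import hashlib
import random
from collections import Counter

def bins_by_multiplicity(pre_sample):
    """Step 2: bin i holds the distinct strings that appeared exactly i times."""
    bins = {}
    for s, c in Counter(pre_sample).items():
        bins.setdefault(c, set()).add(s)
    return bins

def prp_rank(seed, s):
    """Pseudo-random ranking, a stand-in for a seeded PRP."""
    return hashlib.sha256((seed + s).encode()).hexdigest()

def select_from_union(bins, multiplicities, k, seed):
    """Step 4: rejection sampling = keep the k lexicographically first
    PRP-outputs from the union of the requested bins."""
    union = set().union(*(bins.get(i, set()) for i in multiplicities))
    return sorted(union, key=lambda s: prp_rank(seed, s))[:k]

def place_in_sample(quantum_strings, n, universe, rng):
    """Step 5: place the quantum strings at pseudo-random distinct positions,
    then fill the remaining positions with distinct pseudo-random strings."""
    positions = sorted(rng.sample(range(n), len(quantum_strings)))
    sample = [None] * n
    for pos, s in zip(positions, quantum_strings):
        sample[pos] = s
    fillers = iter(rng.sample(sorted(universe - set(quantum_strings)),
                              n - len(quantum_strings)))
    return [s if s is not None else next(fillers) for s in sample]

rng = random.Random(0)
universe = {format(i, "06b") for i in range(64)}      # toy 6-bit string space
pre_sample = [rng.choice(sorted(universe)) for _ in range(200)]  # has collisions
bins = bins_by_multiplicity(pre_sample)
chosen = select_from_union(bins, {i for i in bins if i >= 2}, k=4, seed="s1")
final = place_in_sample(chosen, n=16, universe=universe, rng=rng)
print(len(final), len(set(final)))
```

Here the choice of multiplicities at least 2 corresponds to the simple collisional choice discussed later in Section 5.4.2.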


5.4.2. Statistics per bins or unions of bins

Size of bins. Appendix D.2 shows experimentally obtained formulas for the expected bin sizes |Bᵢ|, as functions of the string space size N and the sampling budget factor β. One case of interest is the union of bins whose multiplicity is at least a certain value u. In those cases we use a prime in superscript (e.g., as in B′ᵤ) to indicate that the statistics refer to said union of bins. For example, the expected number |B′ᵤ| of distinct strings that appear with multiplicity at least u is approximately equal to:

…    (22)
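Appendix D.2 derives these formulas experimentally. As a rough, simplified sanity check (an assumed model of this sketch, not the paper's formula), one can treat each string's multiplicity as Poisson with mean equal to the budget divided by the string-space size:

```python
import math

def expected_bin_size_poisson(N, m, i):
    """Expected number of strings appearing exactly i times in m draws over
    a space of N strings, under an assumed Poisson(m/N) multiplicity model."""
    lam = m / N
    return N * math.exp(-lam) * lam ** i / math.factorial(i)

def expected_union_size_poisson(N, m, u):
    """Expected number of distinct strings with multiplicity at least u,
    i.e., the analogue of |B'_u| under the same assumed model."""
    lam = m / N
    head = sum(math.exp(-lam) * lam ** j / math.factorial(j) for j in range(u))
    return N * (1.0 - head)

# Small illustrative numbers (not the 53-qubit string space):
print(expected_bin_size_poisson(N=1000, m=2000, i=2))
print(expected_union_size_poisson(N=1000, m=2000, u=2))
```

This uniform-sampling model ignores the non-uniform QC-value distribution of quantum sampling, which is why the paper relies on experimentally obtained formulas instead.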

QC-values. Appendix D.2 shows experimentally obtained formulas for the expected QC-value (qᵢ) and its variance, for each bin i. When the budget is significantly smaller than the string space, the expected QC-value and the variance remain very close, respectively, to … and …, within each bin i. However, for larger sampling budgets those values start to noticeably differ. It is also relevant to consider the special case of the expected average QC-value in the union of bins with at least a certain multiplicity u. This is approximately equal to:

…    (23)

Entropy. The expected entropy per string decreases with the multiplicity. However, the entropy of each string in the pre-sample is indistinguishable across the strings within the same bin, i.e., before the deadline to publish a sample. Nonetheless, the adversary will still affect the probability distribution with which the strings appear in the final sample, by using rejection sampling. Technically, the entropy of a string as measured in the pre-sample is not the same as measured in the final sample. In particular, depending on the rejection sampling technique, the entropy in the final sample will also depend on the number kⱼ of strings to select from each union of bins.

By definition, an optimal collisional adversary is one that optimally selects the sequence of unions of bins in a way that minimizes the overall expected entropy in the final sample, while subject to the FP goal. For the purpose of these notes we find it sufficient to highlight the effect of a simple collisional choice (selecting strings from those that appeared at least twice in the pre-sample), which already outperforms the pseudo-fidelity attack when the sampling budget is sufficiently large, such as when considering … qubits.

Using a logic similar to the one used for the analysis of the pseudo-fidelity adversary, we can consider a first approximation of the entropy contributed by the first ordered sequence of k₁ strings selected from the first union of bins as being about:

k₁ · (h₁ − log(u₁/k₁)) − log k₁!    (24)

where h₁ is the expected a priori average entropy per string in the union of bins in S₁.

For a more conservative and still simple estimate we can consider the result of iteratively applying the previous formula, thus getting:

k₁ · h̄_{u₁} − log(u₁ · (u₁ − 1) · … · (u₁ − k₁ + 1))    (25)

where u₁ is the number of distinct strings in the first union of bins and u₁ · (u₁ − 1) · … · (u₁ − k₁ + 1) is the descending factorial of u₁ of order k₁. The expressions are similar in look to formulas (19) and (20), which considered a single selection pool of all pre-sampled distinct strings, but now we consider separate bins.

The corrections discussed in Section D.3 also apply. The entropy contributed by strings selected across several unions of bins also depends on the previous selections across other unions of bins. Concretely, the initial entropy hⱼ considered for the first string in a union of bins depends on the number of strings already selected, and from which unions of bins.

Asymptotically large budget. It is instructive to consider the asymptotic limit of large sampling budgets, i.e., when the budget m is a large enough exponential in the number of qubits. Each string x will tend to appear in an individual bin, with multiplicity i approximately equal to m · pₓ, where pₓ = Prob[x] is the QC-value of string x with respect to the circuit C. Thus, the adversary can estimate that the QC-value of each string is approximately i/m.

In fact, the asymptotically large budget would even allow an exact simulation of the circuit evaluation, as follows.

1. Pseudo-randomly simulate a binomial number k of strings to obtain from quantum evaluation. The binomial has parameters n and φ, to simulate how many strings, out of n, would be from correct quantum evaluation in an experiment with fidelity φ.

2. Pseudo-randomly simulate k uniform floating-point numbers between 0 and 1, as points in the inverse-CDF of the QC-values, and determine the correspondingly selected strings.

3. Output a sequence of n strings, composed of a pseudo-random positioning of the initial k strings, interleaved with n − k other pseudo-randomly selected strings simulating a uniform selection of distinct strings.


The above described sample would be cryptographically indistinguishable from an honest sample, implying not only that it would pass an SQC test with the same FP rate as the FN rate set for the honest case, but also that it would do so for any other practical statistical test.
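The three simulation steps can be sketched as follows, over a toy string space with an assumed exponential shape standing in for the estimated QC-values (all sizes and parameters here are illustrative, not the paper's):

```python
import bisect
import random

def simulate_sample(qc_values, n, phi, rng):
    """Simulate the three steps: binomial count, inverse-CDF selection,
    pseudo-random positioning plus distinct uniform fillers."""
    N = len(qc_values)
    # Step 1: binomial number k of "quantum" strings among the n outputs.
    k = sum(rng.random() < phi for _ in range(n))
    # Step 2: inverse-CDF sampling of k string indices per the QC-values.
    cdf, acc = [], 0.0
    for p in qc_values:
        acc += p
        cdf.append(acc)
    quantum = [bisect.bisect_left(cdf, rng.random() * acc) for _ in range(k)]
    # Step 3: place them at pseudo-random positions; fill the remaining
    # positions with distinct uniformly selected strings.
    sample = [None] * n
    for pos, s in zip(rng.sample(range(n), k), quantum):
        sample[pos] = s
    fillers = iter(rng.sample(range(N), n - k))
    return [s if s is not None else next(fillers) for s in sample]

rng = random.Random(7)
weights = [rng.expovariate(1.0) for _ in range(1024)]
total = sum(weights)
qc_values = [w / total for w in weights]  # stand-in for estimated QC-values
out = simulate_sample(qc_values, n=50, phi=0.2, rng=rng)
print(len(out))
```
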

5.4.3. Comparison of adversaries

The pseudo-fidelity adversary is a special case of the collisional adversary, obtained when using a single union of all bins and then selecting the minimum possible number of quantum strings. We conjecture that in the black-box evaluation model this is optimal for a sampling budget of the order …, since there is then no information gained from collisions. Obtaining a few collisions is possible in practice by evaluating the circuit a number of times at least of the order of the square-root of the string space. However, for … qubits, where a distinguishability from uniform with FP rate … and fidelity … already requires sampling approximately … times, it is conceivable that an adversary would in fact be able to sample more strings, say, up to …, within the allowed time for sampling.

The collisional adversary takes advantage of observed collisions. Below, we show some examples comparing the pseudo-fidelity adversary vs. a simple collisional adversary that selects strings only from those that appeared at least twice, i.e., with multiplicity at least 2. (For higher budgets an optimal collisional adversary may be more successful by using a variety of bins and their unions.)

Pseudo-fidelity vs. collisional. Consider parameters (FP, …) such that a pseudo-fidelity adversary is compelled to include k_f (e.g., 1024) quantumly-generated strings, and then pseudo-randomly obtain the remaining n − k_f strings.

We now ask: how many quantumly-generated strings would a collisional adversary actually include in the sample if its budget induces a large enough number of collisions? The answer depends on several quantities. For the given budget (implicit, not shown in the indices), let:

• qᵢ be the expected QC-value of strings in bin i, the values of which are determined in Section D.2.1;

• k_c be the number of strings that the collisional adversary will select from bin i;

• q_PRG,f and q_PRG,c be the analogous expected QC-values, but with respect to the set of all strings that the adversary will not select from bins with positive multiplicity, and of all strings not output by quantum evaluation. This is the set from which the adversary will select strings directly by pseudo-random generation.

Consider a simplified collisional adversary that will, for the same final sample size n, use fewer quantumly generated strings (k_c ≤ k_f), all from bin i. Then we have:

SQC_f = SQC_c    (26)

SQC_f = k_f · q_f + (n − k_f) · q_PRG,f    (27)

SQC_c = k_c · qᵢ + (n − k_c) · q_PRG,c    (28)

where SQC_a denotes the expected sum of QC-values, and a refers to either the pseudo-fidelity (f) or the collisional (c) adversary. The above system yields:

k_c ≈ k_f · (q_f − q_PRG,f) / (qᵢ − q_PRG,c)    (29)
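Under the small-budget approximations q_f ≈ 2·2⁻ⁿ, q_PRG ≈ 2⁻ⁿ and qᵢ ≈ (i+1)·2⁻ⁿ (assumptions of this sketch, chosen to be consistent with the entries of Table 9), relation (29) can be evaluated directly:

```python
def k_collisional(k_f, q_f, q_i, q_prg):
    """Number of bin-i quantum strings matching the pseudo-fidelity SQC,
    per relation (29): k_c = k_f * (q_f - q_prg) / (q_i - q_prg)."""
    return k_f * (q_f - q_prg) / (q_i - q_prg)

unit = 2.0 ** -53  # one over the 53-qubit string-space size (assumed unit)
for i in (1, 2):
    # Approximations: q_f ~ 2*unit, q_i ~ (i+1)*unit, q_prg ~ unit.
    print(i, k_collisional(k_f=1024, q_f=2 * unit,
                           q_i=(i + 1) * unit, q_prg=unit))
```

For k_f = 1024 this gives k_c = 1024/i, i.e., 1024 strings for multiplicity 1 and 512 for multiplicity 2, matching the k_c column of Table 9.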

Example 7. Consider a setting with … qubits. Table 9 shows, for two different budgets, the estimated entropy for collisional attacks of various degrees (i.e., the various multiplicities i they take advantage of). The calculation is for a case with … (i.e., when the SQC threshold requires a contribution of … from strings quantumly-generated by a pseudo-fidelity adversary).

The number |Bᵢ| of strings obtained with multiplicity i depends on the budget. For example, evaluating the circuit … times yields about … collisions. The expected entropy per string also varies with the budget and the multiplicity i, as determined in Section D.2.2. In the table, the precision shown for the QC-values and entropies was tailored in each case to highlight how small the correction is compared to the respective approximations.

Some observations:

1. For both budgets the expected QC-values of strings are still very close to the small-budget approximations, but they would become noticeably different for a high enough budget.

2. For the smaller budget, the attack with higher multiplicity is not possible if …, since then the required number of strings would be greater than the expected number of collisions.

3. The higher the budget, the lower the entropy.

Table 9: Comparison pseudo-fidelity vs. collisional (assuming …)

i    qᵢ (× 2⁻ⁿ)    k_c
(first budget)
1    1.999999      1024.0
2    2.999999      512.0
(second budget)
1    1.99998       1024.0
2    2.99998       512.0

The approximation is based on (25), i.e., the iterated application of (24), one string at a time, in each bin. The direct application of (24) would yield a higher estimate, by a factor of up to about 10 % in each case.


A more relaxed estimation. We have considered an adversary with a quantum computer with fidelity 1. But a client may want to assume that the adversary can only quantumly evaluate with a lower fidelity, e.g., … . Then, in a model where any faulty evaluation yields a uniformly random string, the previous formulas (9) for expected entropy per string output by the quantum computer would have to be adjusted. To achieve the same entropy reduction, the adversary would then need a higher sampling budget.

5.5. Final randomness for applications

What use may a client make of a list of millions of strings that may potentially include several hundreds or thousands of bits of fresh entropy? We do not explore in these notes the interesting use of information-theoretic randomness extractors. However, from a practical standpoint and considering cryptographic use, we recommend a cryptographic combination of the entropy into a bit-string with approximately full entropy. For example, this can be a 512-bit string with about 511 or 512 bits of entropy. (We propose to assume 511.37 bits, as expected for a random function with 512 bits of output.) For practical purposes, this is enough as a seed of a pseudo-random number generator that can then produce a much larger string indistinguishable from random (by whoever does not know the seed). A combination performed by direct application of an efficient cryptographic hash function is a candidate with merit, but susceptible to a bias attack.

Hash bias attack. If the adversary anticipates that the client will extract entropy from the sample by applying a fast-to-evaluate compressing function without secrets, then it can induce a further reduction in entropy. For example, consider that the client uses a cryptographic hash function, such as the Secure Hash Algorithm 3 (SHA-3) version with 512 bits of output, to hash the sample and use the result as the actual randomness output by the protocol. In that case, the adversary can try many modifications of one or two of the last pseudo-random strings in order to bias the hash of the sample, e.g., making it satisfy a secret predicate with small positive probability (depending on how many hash computations it can still perform within the deadline). This makes the hash output come from a set of reduced size.

For example, the adversary could induce the first 64 bits of the hash to be a certain secret known only to the adversary. This would reduce the entropy of the output by about 64 bits. In practice this is not problematic for applications that intend to use a seed not required to have more than, say, 400+ bits of entropy.

Nonetheless, we describe two possible mitigations:

• If the application allows the client to wait for the calculation of QC-values, then the client can include (at least a few of) the QC-values of the strings as part of the hash input. This can be impractical for some applications, due to the required delay in computing the QC-values.

• If a verifiable delay function is used as the hash, the adversary does not have enough time to compute it before it has to publish the sample of strings.

Alternative post-processing. Alternatively to plain hashing, the client can include a secret key (if one exists) as part of the hash input, to prevent the operator from limiting the size of the image space. A standard method for this approach is to use a hash-based message-authentication code. But if the secret is for one-time use, it is enough to prepend it to the rest of the sample before hashing. Actually, the entropy is not lost even if the client reveals the secret at this stage, as long as it was unpredictable by the adversary before it had to publish the sample of strings. In an application setting where it is preferable to also prevent the client from biasing the output, the secret can be committed in advance, before the circuit is sent to the quantum computer operator.
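As an illustrative sketch of this post-processing (an assumption-laden example, not a specification from these notes): hash the whole sample with SHA3-512, optionally prepending a one-time secret that the operator could not predict before publishing the sample:

```python
import hashlib
import secrets

def combine_entropy(sample, one_time_secret=b""):
    """Compress a list of output strings into a 512-bit seed. Prepending a
    one-time secret (unpredictable to the operator before publication)
    blocks grinding of the hash output by the operator."""
    h = hashlib.sha3_512()
    h.update(one_time_secret)
    for s in sample:
        h.update(len(s).to_bytes(4, "big"))  # length-prefix each string
        h.update(s)
    return h.digest()

secret = secrets.token_bytes(32)   # could be committed before sending the circuit
sample = [b"0110...", b"1011..."]  # placeholder strings, not real outputs
seed = combine_entropy(sample, secret)
print(len(seed))  # 64 bytes = 512 bits
```

The length-prefix makes the encoding unambiguous, so distinct samples cannot collide by mere concatenation.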

5.6. Classes of adversaries

Let an adversary be called "optimal" within a class if, while satisfying the FP constraint, it minimizes the entropy of the sample.

Security reduction. These notes do not provide a complexity-theoretic reduction. Such a reduction would have to rule out the existence of adversaries much stronger than those we consider here. Instead, we make the entropy estimation for a concrete efficient adversary, arguably optimal within an interesting class. We leave as an open problem investigating how large this class is. A reduction by Aaronson [Aar19] guarantees a minimum of a few bits of entropy, in the setting of one string per circuit. We consider, instead, repeated sampling from the same circuit, as described in Section 2.4.

Class "A" of adversaries. We define class "A" as the class of adversaries (parametrized by a sampling budget) for which the optimal adversary is a collisional one. We hypothesize that this class captures the range of adversaries of practical concern. We also hypothesize that class A includes the set of efficient adversaries that only access the quantum computer via a black-box interface. While we do not know of any efficient adversary, outside class A, that is better than the collisional adversary, we do not rule out that possibility. We hypothesize that any efficient adversary outside of this class would need to use a non-trivial (currently unknown) mathematical trick taking advantage of the circuit specification C. The intuition for this hypothesis is conveyed below by an analogy.


An analogy. We do not prove how general or restrictive class A is with respect to affecting the distribution of QC-values, nor do we attempt to relate it to a complexity-theoretic argument. However, we provide an intuitive argument by making an analogy with the properties of Carter-Wegman universal hashing [CW77].

• Universal hashing:

1. Member of a large class. The hash function is uniformly selected from a large family of hash functions, all with the same output range.

2. Equal distribution of output values. For each hash function, each possible output has the same number of pre-images.

3. Advantage in predicting or biasing the output values. Until the hash function is defined, the future hash output of any particular input cannot be predicted any better than guessing the output of a uniform selection over the range.

• Sampling from random quantum circuits:

1. Member of a large class. The random quantum circuit is (pseudo-)uniformly selected from a large class of circuits, all with the same output range.