State of the Art Report: Verified Computation
Jim Woodcock,
Mikkel Schmidt Andersen, Diego F. Aranha, Stefan Hallerstede,
Simon Thrane Hansen, Nikolaj Kühne Jakobsen, Tomas Kulik,
Peter Gorm Larsen, Hugo Daniel Macedo, Carlos Ignacio Isasa Martín,
Victor Alexander Mtsimbe Norrild
Aarhus University
31st October 2022
Executive Summary
This report describes the state of the art in verifiable computation. The problem being solved
is the following:
The Verifiable Computation Problem (VCP). Suppose we have two computing agents. The first agent is the verifier, and the second agent is the prover. The verifier wants the prover to perform a computation. The verifier sends a description of the computation to the prover. Once the prover has completed the task, the prover returns the output to the verifier. The output will contain a proof. The verifier can use this proof to check whether the prover computed the output correctly. The check is not required to verify the algorithm used in the computation. Instead, it is a check that the prover computed the output using the computation specified by the verifier. The effort required for the check should be much less than that required to perform the computation.
The problem is visualised in Fig. 1. This is a classic problem with many applications:
[Figure 1: Verifiable Computation. The verifier (1) sends (f, x) to the prover; the prover (2) sends back (y, proof); the verifier (3) checks the proof against the claim y = f(x).]
1. Delegation Verifiable computation can be used for delegating computation. Suppose that we have an honest prover that is efficient and runs in polynomial time (that is, the time taken is bounded by a polynomial in the length of the input). Suppose also that the verifier is super-efficient and checks run in nearly linear time (that is, the time taken is proportional to the length of the input, up to logarithmic factors). The prover computes a result for the verifier. The prover then interactively proves the correctness of the result. This means the verifier can check the result's correctness in nearly linear time instead of running the entire computation. The verifier does not need the computational power possessed by the prover. The prover's resources can be shared between many client verifiers.
2. Cloud computing (CC) Suppose we have a significant distributed computation on petabytes of data. The verifier outsources the computation and its massive dataset to a prover. The prover completes the computation and sends the results back to the verifier. The verifier wants to know that the prover executed the distributed computation correctly, and worries about faults that are sources of incorrect execution. These might include data corruption, communication errors, and hardware failures. The prover sends a proof that the results are free from such faults.
3. Information retrieval A verifier wants to make a query on a remote database. A
prover acts as the remote database server. The verifier wants assurance that the prover
has performed the query correctly.
4. Hardware supply chains Hardware Trojans are malicious circuits injected into chip designs by third parties. We cannot trust hardware supply chains where this threat exists. For example, an adversarial facility might be manufacturing the hardware under contract. We want a verifier to check, efficiently, assurances provided by the hardware that it executed correctly on a given input. This is a specific instance of a more general problem: verifying assertions where a third party supplies an untrusted execution substrate.
Although we describe a three-stage protocol in Fig. 1, the proof might be delivered over several
rounds of interaction.
The verifiable computation problem complements the program verification problem (PVP). Verification relies on helpful redundancy: we need two descriptions of the same thing and then compare one against the other. Program verification establishes that we have expressed a given computation correctly. We make the judgement by comparing it with a higher-level specification. In the verifiable computation problem, the computation f is given. We are not verifying f against a specification. Instead, we want to know whether the execution performed by the prover is consistent with the expression of f.
The literature surveyed in this state-of-the-art report presents the theory of probabilistic
proofs. A central result in this area is the Probabilistically Checkable Proof Theorem (PCPT).
The PCP theorem has a striking consequence: the proof of any valid mathematical assertion can be encoded so that the assertion's validity can be checked by inspecting only a constant number of points of the proof, even though the proof itself was produced elsewhere.
The practical consequence of PCP is in its application to the protocol in Fig. 1. Consider
the computation f, input x, and supposed output y. There is a proof and a randomised way
of inspecting it that guarantees the following. The verifier will accept the proof if y = f(x) is correct. If y ≠ f(x), the verifier will almost always reject the proof. The proof might need interaction between the prover and the verifier. The qualifier "almost always" encodes an error bound: with some probability bounded in the analysis, the verifier will incorrectly view a wrong answer as correct.
The verifiable computation protocol in Fig. 1 does not explicitly check the result y. It does less work than that. If the verifier were to check the result y = f(x) directly, then it would need to re-do the computation. That contradicts the problem statement and is not the intention.
So, PCPs allow a randomised verifier, with access to a purported proof, to probabilistically verify an input statement of the form y = f(x) by querying only a few proof bits. Zero-
Knowledge PCPs (ZK-PCPs) enhance standard PCPs. In a zero-knowledge proof (ZK), one
party can prove to another that a given statement is true. It does this without giving any additional information apart from the fact that the statement is indeed true.
There is a large body of literature devoted to probabilistically checkable proof protocols.
The original naive implementations of the PCP theory were very slow. Since then, there have
been orders of magnitude improvements in performance. Early tools used low-level represen-
tations of computations. Newer tools compile programs written in a high-level language down to these low-level protocol representations. Some publications report efficient verifiers that could tackle real-
world problems. But it appears that these systems are limited to smaller executions, mainly
due to the expense of the prover. Our initial impression is that these systems are limited to
special-purpose applications.
This state-of-the-art report surveys 128 papers from the literature comprising more than
4,000 pages. Other papers and books were surveyed but were omitted. The papers surveyed
were overwhelmingly mathematical. We have summarised the major concepts that form the
foundations for verifiable computation. The report contains two main sections. The first, larger
section covers the theoretical foundations for probabilistically checkable and zero-knowledge
proofs. The second section contains a description of the current practice in verifiable computation. Two further reports will cover (i) military applications of verifiable computation and
(ii) a collection of technical demonstrators. The first of these is intended to be read by those
who want to know what applications are enabled by the current state of the art in verifiable
computation. The second is for those who want to see practical tools and conduct experiments
themselves.
Contents
Acronyms
Glossary
1 Introduction & Overview
1.1 Organisation of the Report
2 Problem Statement
3 Theoretical Foundations
3.1 Security models for two-party computation
3.1.1 Trust in Multiparty Systems
3.1.2 Verifiability in Multiparty Systems
3.1.3 Honesty within Multiparty Computation
3.1.4 Secure Communication - a brief note
3.2 Interactive Proof Systems
3.2.1 The Sum-check Protocol
3.2.2 The Fiat-Shamir Heuristic
3.2.3 Proof of Knowledge and Zero-knowledge
3.2.4 Multi-Prover Interactive Proof Systems
3.3 Probabilistically Checkable Proofs
3.3.1 Proof Techniques and Insights
3.3.2 Linear PCP
3.3.3 PCPs of proximity
3.4 Polynomial Interactive Oracle Proofs
3.4.1 IOPs of Proximity
3.5 Computations as Polynomial Constraints
3.5.1 Polynomials
3.5.2 Polynomial Commitment Schemes
3.5.3 Arithmetic Circuits
3.5.4 Rank 1 Constraint Systems
3.5.5 Quadratic Arithmetic Programs
3.5.6 Algebraic Intermediate Representation
3.6 zk-SNARKs
3.6.1 Examples of constructions
3.7 zk-STARKs
3.7.1 Fast Reed-Solomon IOP of Proximity
3.8 Game theory approach
4 Practice in Verifiable Computations
4.1 Implementations: The Bottleneck
4.2 Surveying Implementations: A Selection of Tools
4.3 Experimental Evaluation of Tool Usability
4.4 Results
4.5 Discussion
5 Conclusions
6 References
Acronyms
AIR Algebraic Intermediate Representation
ARM Advanced RISC Machine
BCS Ben-Sasson-Chiesa-Spooner
BFS Byzantine Fault Tolerant NFS File System
BFT Byzantine Fault Tolerance
FFT Fast Fourier Transform
FHE Fully Homomorphic Encryption
FRI Fast Reed-Solomon Interactive Oracle Proof of Proximity
IOP Interactive Oracle Proof
IOPP Interactive Oracle Proof of Proximity
IP Interactive Proof
LPCP Linear Probabilistically Checkable Proof
MIP Multi-Prover Interactive Proof
NFS Network File System
NP Nondeterministic Polynomial time
PCP Probabilistically Checkable Proof
PCPP Probabilistically Checkable Proof of Proximity
PLONK Permutations over Lagrange-bases for Oecumenical Noninteractive Arguments of
Knowledge
PSPACE Polynomial Space
QAP Quadratic Arithmetic Programs
R1CS Rank-1 Constraint System
RCAS Reduced Cyber-Attack Surface
RISC-V Reduced Instruction Set Computer (RISC) 5
ROM Random Oracle Model
RS Reed-Solomon
RSA Rivest–Shamir–Adleman
SAT Circuit/Boolean Satisfiability
SLA Service Level Agreements
SNARG Succinct Non-Interactive Argument
SNARK Succinct Non-Interactive Argument of Knowledge
STARK Scalable Transparent Argument of Knowledge
TLS Transport Layer Security
TPM Trusted Platform Module
UAV Unmanned Air Vehicle
VCP Verifiable Computation Problem
ZK Zero Knowledge
Glossary
AM Arthur-Merlin. The set of decision problems that can be decided in polynomial time by
an Arthur–Merlin protocol
BPP Bounded-error Probabilistic Polynomial-time. The class of decision problems that probabilistic Turing machines can solve in polynomial time with an error probability bounded by 1⁄3 for all instances of the problem
CC Cloud Computing. The delivery over the internet of computing services, including data
storage and computing power, on-demand with pay-as-you-go pricing (definition from
NIST [90]; other interpretations exist)
CRH Collision-Resistant Hash function . A hash function with a very low probability of two
different inputs hashing to the same output
DTIME Deterministic Time. Referring to a deterministic Turing machine’s computational
resource or computation time. Also TIME
Entropy Entropy of a random variable. See Shannon entropy
FF Finite Field. F: A field that contains a finite number of elements. A field is a set with four basic operations: addition, subtraction, multiplication, and division, satisfying the rules of arithmetic. An example of a finite field is the integers modulo p, where p is a prime number
FHE Fully Homomorphic Encryption. An encryption scheme that enables computations to be run directly on data encrypted by the scheme without first decrypting it
HWT Hardware Token. A peripheral hardware device used to provide access to a protected
or restricted electronic source
IPT Interactive Polynomial Time. The class of problems solvable by an Interactive Proof
system in polynomial time
IP Interactive Proof/Interactive Proof System. An abstract machine that models proof as an
exchange of messages between a possibly untrustworthy Prover and an honest Verifier
MT Merkle Tree. A data structure in the form of a tree, usually with a branching factor of 2, in which each internal node holds a hash of all the information in its child nodes
MIP Multi-prover Interactive Proof. An Interactive proof system distributed across multiple
provers
NAV Non-Adaptive Verifier. A PCP verifier whose queries are all determined in advance by its input and randomness, independent of answers to earlier queries
NP Nondeterministic Polynomial-time. The class of decision problems that have proofs which
are verifiable in polynomial time by nondeterministic Turing machines
Oracle An abstract machine used in complexity theory and computability to study decision
problems. Equivalent to a black box addition to a Turing machine that can solve certain
problems in a single step
PCP Probabilistically Checkable Proof. A proof statement that can be checked using a ran-
domised verification algorithm to within a high probability of accuracy by examining only
a bounded number of letters of the proof
PCPT Probabilistically Checkable Proof Theorem. A theorem stating that each decision
problem of NP complexity can be rewritten as a probabilistically checkable proof
PVP Program Verification Problem. The problem of verifying that a computer program always
achieves the intended result as given by a higher level specification
PSPACE Polynomial Space. The class of all decision problems that can be solved by a
deterministic Turing machine using space which is polynomial in the size of the input
PTM Probabilistic Turing Machine. A nondeterministic Turing machine that uses a proba-
bility distribution to decide between alternative transitions
RO Random or Randomised Oracle/Random Oracle model. An oracle that responds to every
unique query with a random response chosen uniformly from its output domain. The
response to any particular query will be the same each time the query is submitted
SH Shannon entropy. The Shannon entropy of a given stochastic source is the average rate at
which the source produces information. The higher the Shannon entropy, the greater the information gained from a new value in the process
SNARG Succinct Non-interactive Argument. A proof construction that satisfies succinctness,
meaning that the proof size is asymptotically smaller than the statement size and the
witness size
TM Turing Machine. An abstract computing device which provides a model for reasoning
about computability and its limits
VCP Verifiable Computing Problem. A computational task problem involving two agents, a
relatively weak machine called the Verifier (or Client), and a more powerful Prover (or
Worker). The problem involves the Verifier delegating computational tasks to the Prover.
In return, the Verifier can expect the result of the task plus a proof by which it can verify
the result with less computational effort than would be needed to perform the task from
scratch
ZK Zero-Knowledge Proof. A proof construction in which the Prover can prove to a Verifier that a statement is true without having to impart any information apart from the fact that the statement is true
ZK-PCP Zero-Knowledge PCP. A PCP with an additional Zero-Knowledge guarantee be-
tween the Prover and the Verifier
ZK-SNARK Zero-Knowledge Succinct Non-Interactive Argument of Knowledge. A Zero-
Knowledge proof construction which does not involve any interaction between the Prover
and Verifier
1 Introduction & Overview
In this report, we review the state of the art in verified computation:
How can a single computer check computations carried out by other computers with
untrusted software and hardware?
We start by putting research in verified computation into context: where would it be useful?¹ A
typical application is in trusted cloud services. The use of cloud computing is now widespread.
The cloud is a model for enabling ubiquitous, convenient, on-demand network access to a shared
pool of configurable computing resources (e.g., networks, servers, storage, applications, and ser-
vices) that can be rapidly provisioned and released with minimal management effort or service
provider interaction [90]. Cloud providers offer computational power and data storage facilities
that significantly extend the capabilities of weaker devices. Clouds are large, complex systems
that are usually provided without convincing reasons why we should trust them. In practice,
they suffer from software errors, configuration issues, data corruption, hardware problems, and
malicious actors. As Walfish and Blumberg [122] point out, this raises two important research questions: (i) How can we trust results computed by a third party? (ii) How can we assure the integrity of data stored by a third party? We consider several answers to these questions, suggested by Walfish and Blumberg [122]. The answers are provided by replication, trusted hardware, and
remote attestation.
Replication The most obvious answer to both questions is replicating computations and
data over several cloud servers. For example, Canetti et al. [35] discuss practical delegation of
computation using multiple servers as an efficient and general solution to the problem. Their
protocol guarantees the correct answer produced by a replicated, efficiently computable func-
tion. The guarantee relies on at least one honest server, but we do not know which is the honest
server and which is the right answer. The protocol requires logarithmically many rounds and is based on any collision-resistant hash (CRH) family. (Collision resistance means that it is hard to find two inputs that hash to the same output.) The protocol uses Turing Machines (TMs) but can be adapted to other computational models. The protocol must be deterministic. This requires, in turn, deterministic versions of library calls such as malloc() and free(). The construction is not based on probabilistically checkable proofs (PCP, see Sect. 3.3) or fully homomorphic encryption (FHE, see Gentry [61]). Canetti et al.'s protocol does not rely on trusted hardware, nor does it require a complex transformation of a Turing Machine program to a
boolean circuit. The faults that must be guarded against in replicated computation are Byzan-
tine (see Lamport et al. [83]): they leave imperfect evidence of the fault’s occurrence. Castro
and Liskov [36] propose a novel replication algorithm, BFT, designed to handle Byzantine
faults. BFT is used to implement real services, performs well, is safe in asynchronous environ-
ments like the Internet, incorporates mechanisms to defend against Byzantine-faulty clients,
and can recover replicas. The recovery mechanism tolerates any number of faults over the sys-
tem’s lifetime, provided fewer than one-third of the replicas become faulty within a short time
interval. Their Byzantine-fault-tolerant NFS file system,² BFS, has a 24% performance penalty
compared with production implementations of the NFS protocol without replication. A major
drawback of replication algorithms is that they assume failures are unrelated. This assumption
is invalid for cloud services, where hardware and software platforms are often homogeneous.
¹ A future deliverable in the RCAS project describes some example military applications.
² Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystems
in 1984, allowing a user on a client computer to access files over a computer network much like local storage is
accessed.
Trusted hardware Another solution is to use trusted hardware under the cloud provider’s
control. Sadeghi et al. [106] show how to combine a trusted hardware token (HWT) with
secure function evaluation to compute arbitrary functions on encrypted data where the com-
putation leaks no information and is verifiable. Their work reduces the computation latency
usually experienced by pure cryptographic solutions based on fully homomorphic and verifiable
encryption.
Remote attestation Yet another solution is remote attestation (see Feng [53]), where one
system makes reliable statements about the software it is running on another system. The
remote system can then make authorisation decisions based on that information. Many remote
attestation schemes have been proposed for various computer architectures, including Intel,
RISC-V, and ARM.
Walfish and Blumberg [122] give an overview of an alternative to replication, trusted hardware,
and attestation: a technology where the cloud (or some other third party) returns the results
of the computation with proof that the results have been computed correctly. This technology
aims to make this proof inexpensive to check compared to the cost of redoing the computation.
If this technology is feasible, it would not need the assumptions about faults required for
replication, trusted hardware, or remote attestation. Either the proof is valid, or it is not.
Walfish and Blumberg call this proof-based verifiable computation. They give four insights into the
practicality of this research area:
1. Researchers have already built systems for a local computer to check the correctness of
a remote execution efficiently. For a comprehensive discussion of the state-of-the-art and
tool implementations, see Sect. 4.
2. There are diverse applications, such as computational complexity, cryptography and cryp-
tocurrencies, secure exploitation of untrusted hardware supply chains, and trustworthy
cloud computing. Additional applications, not suggested by Walfish and Blumberg, include
computationally enhanced UAV swarms and automated reviewing systems for mathemat-
ical journals and conferences.
3. Important foundations include probabilistically checkable proofs (PCPs) [5,6,113], in-
teractive proof systems [8,69,86], and argument systems [29], all of which have a sound
and well-understood mathematical basis.
4. Engineering these theories for practical application is a new interdisciplinary research
area.
We describe the underlying theories in detail in later sections of this report, but for now,
here is an informal motivating example due to Mordechai Rorvig and published in Quanta
magazine [105]:
If a million computer scientists had dinner together, they’d rack up an enormous
bill. And if one of them were feeling particularly thrifty and wanted to check if the
bill was correct, the process would be straightforward, if tedious: they’d have to go
through the bill and add everything up, one line at a time, to make sure that the
sum was equal to the stated total.
But in 1992, six computer scientists proved in two papers [both by Arora and his
colleagues [6,5]] that a radical shortcut was possible. There’s always a way to
reformat a bill—of any length—so that it can be checked with just a few queries.
More importantly, they found that this is true for any computation or even any
mathematical proof since both come with their own receipt, so to speak: a record
of steps that a computer or a mathematician must take.
We give a very brief overview of PCP; more details are in Sect. 3.3. A PCP for a language
consists of a probabilistic polynomial-time verifier with direct access to individual bits of a
bit-string. This string (which acts as an Oracle) represents a proof and will be only partially
accessed by the verifier. Queries to the oracle are locations on the bit-string and will be
determined by the verifier’s input and coin tosses (potentially, they might be determined by
answers to previous queries). The verifier must decide whether a given input belongs to the
language. The verifier will always accept an input that belongs to the language, given access
to the oracle (the bit string). On the other hand, if the input does not belong to the language,
then the verifier will reject with probability at least 1⁄2, no matter which oracle is used. One
can view PCP systems in terms of interactive proof systems (IP). The oracle string is the proof,
and the queries are the messages sent by the verifier. The prover is memoryless and cannot
adjust answers based on previous queries.
The PCP theorem is important. Scheideler [107] observes that Arora’s original proof of the
PCP theorem [5] is one of the most complicated proofs in the theory of computation. It has
been described by Wegener as the most important result in complexity theory since Cook’s
theorem [124],³ and by Goldreich as a culmination of impressive research rich in innovative
ideas [65]. Boneh (quoted in Rorvig’s Quanta article [105]) considers it very rare that such
deep algebraic tools from mathematics have made it into practice.
Both the PCP theorem and its proof have been simplified. Arora et al.’s original proof
was dramatically simplified by Dinur’s PCP construction [48]. Radhakrishnan and Sudan [103]
give an accessible commentary on Dinur’s proof. Zimand [127] presents a weaker variant of the
PCP Theorem that has a correspondingly simpler proof compared to Arora et al. In Zimand’s
simplification, the prover has only a limited time to compute each bit of the answer, in contrast
to the original prover being all-powerful. Song’s account of the theoretical setting for the PCP
theorem [112] contains a simplified version of the theorem and an accompanying proof. The
simplified theorem states that every NP statement with an exponentially long proof can be locally tested by looking at a constant number of bits. This is weaker than the original PCP theorem, since the original theorem validates much shorter proofs: in Arora et al.'s work, the PCP verifier deals with proofs of polynomial size, whereas in Song's weaker theorem, the verifier deals with proofs of exponential size. Despite this, it is interesting that exponentially sized proofs can be verified with a constant number of queries. Ben-Sasson [13] presents two variants of the PCP theorem. The first achieves a nearly optimal trade-off between the amount of information read from the proof and the certainty of the proof. If the verifier is willing to tolerate an error probability of 2^{-k}, it suffices to inspect O(k) bits of the proof. The second variant is very efficient in terms of the length of the proof.
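To make this trade-off concrete, here is the standard soundness-amplification calculation (a routine fact added for exposition; it is not taken from [13]). If a single constant-query check rejects a false proof with probability at least 1⁄2, then k independent repetitions reject it with probability at least

\[
1 - \left(\frac{1}{2}\right)^{k} = 1 - 2^{-k},
\]

so tolerating an error probability of 2^{-k} costs only O(k) queries in total.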
1.1 Organisation of the Report
The report consists of the following sections.
1. The Executive Summary at the beginning of the report describes the Verifiable Compu-
tation Problem. It outlines several applications and gives a very high overview of the
main foundation: the PCP theorem. A lecture given by Walfish inspires part of this description.⁴ The example of delegating computations is taken from Goldwasser et al. [70].
³ The Cook–Levin theorem, also known as Cook's theorem, proves that SAT, the Boolean satisfiability problem, is NP-complete. A problem is in NP when the correctness of each solution can be verified in polynomial time, and solutions can be found using brute-force search (formalised using a nondeterministic Turing machine). A problem is NP-complete if every problem in NP can be reduced to it in polynomial time. So Cook's theorem states that SAT is in the complexity class NP and any problem in NP can be reduced in polynomial time by a deterministic Turing machine to SAT.
⁴ “Introduction and Overview of Verifiable Computation”, a lecture by Prof. Michael Walfish during
2. In Sect. 2, we focus on a statement of the problem we are trying to solve with verifiable
computation.
3. The main part of this report is Sect. 3, where we comprehensively introduce the theory
of probabilistically checkable and zero-knowledge proofs. We set the scene in Sect. 3.1
by describing security models for two-party computation. We then review interactive
proof systems (Sect. 3.2), probabilistically checkable proofs (Sect. 3.3), and polynomial
interactive oracle proofs (Sect. 3.4). In Sect. 3.5, we discuss verifiable computation as a set
of polynomial constraints that can be verified efficiently. Zero-knowledge proof protocols
are used to prove that a party possesses certain information without revealing it and
without any interaction between the parties proving and verifying it. We discuss such
protocols in Sects 3.6 and 3.7. Finally, in Sect. 3.8, we review a game theoretic approach.
4. The actual state of the art lies in the practice of verifiable computations embodied in
usable tools. Usability includes the following considerations: the expressiveness of the
language describing the computation; the efficiency of creating (and checking) the proof;
and the (e.g., cryptographic) libraries on which the implementation relies. Addressing
these aspects of usability poses different challenges. We review the landscape of imple-
mentations in Sect. 4. A detailed report on the performance of practical tools on demonstrators and benchmarks will follow separately.
5. We conclude the report in Sect. 5 with a high-level summary of the work.
6. The final section contains an extensive bibliography of papers and books on verifiable
computation, all cited in the report.
2 Problem Statement
A verifier sends the specification of a computation P and input x to a prover. The specification of the computation might be a program text. The prover computes an output y and sends it back to the verifier V, which either accepts y with V(y) = 1 or rejects y with V(y) = 0. If y = P(x), then a correct prover should convince the verifier of this. The prover might answer some questions from the verifier or provide a certificate of correctness. If y ≠ P(x), the verifier should reject y with a certain probability. The problem is to provide a protocol that carries out this procedure. The protocol is subject to three requirements.
1. The verifier must benefit from using the protocol. It might be cheaper for the verifier to
follow the protocol than to directly compute P(x). The prover might be able to handle
computations that the verifier cannot. It might have access to data inaccessible to the
verifier.
2. We do not assume that the prover follows the protocol.
3. P should be general. We assume that the prover's running time is statically bounded in terms of the length of the input.
Example. Suppose a device D needs to compute the matrix product f(X) = Xᵀ ∗ X and D does not have the computational resources. Further, suppose another device S has those resources. Let us allocate the computation of f to S. Now we have the following scenario: D sends the pair (f, A) to S, which computes f(A) and returns the result B to D.
the Department of Computer Studies’ Winter School, held by Bar-Ilan University in January 2016. See
www.youtube.com/watch?v=qiusq9R8Wws.
Instead of the trusted device S that computes f, device D could also rely on an untrusted, possibly adversarial, device S′ to compute B′. Now, D needs to determine whether B′ equals the expected B. To ensure this, D requires that S′ sends a proof π along with the result B′, so that it now receives a tuple (B′, π). On reception, device D verifies that π is a proof of B′ = f(A). Our main interest focuses on the two roles S′ and D play in the interaction: device S′ is tasked with producing the proof and D with verifying it. We emphasise this by referring to S′ as the prover and D as the verifier.
The protocol described in the preceding paragraph has a severe problem: the proof π can be prohibitively long, rendering the protocol infeasible. For the protocol to become practical, the proof needs to be shortened. Looking at only an excerpt π′ of the proof, we are no longer certain that B′ = f(A). We can obtain this result only up to some probability: ∃π′ · Pr_{π′}[B′ = f(A)] ≥ c. The value c is called completeness. Of course, there is also a probability that B′ ≠ f(A). We can estimate this probability with a bound over all corresponding proofs: ∀π′ · Pr_{π′}[B′ ≠ f(A)] ≤ s. The value s is called soundness.
The challenge addressed by PCP is to keep the shortened proofs π′ as small as possible while maximising c and minimising s.
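As a concrete, self-contained illustration of probabilistic checking (far simpler than the PCP machinery, and ours rather than drawn from the surveyed literature), the sketch below uses Freivalds' classic algorithm to verify the matrix-product example above: D checks the claim B′ = AᵀA with a few matrix–vector products instead of recomputing the full product. A wrong B′ is rejected with probability at least 1⁄2 per trial, so k trials give soundness error at most 2^{-k}.

```python
# A minimal sketch of Freivalds' probabilistic check for B = A^T A.
# Each trial costs O(n^2) work, versus O(n^3) for recomputing A^T A.
import random

def freivalds_check(A, B, trials=20):
    """Accept iff B appears to equal A^T A; a wrong B is rejected
    with probability >= 1 - 2**(-trials)."""
    n = len(A)
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]  # random 0/1 vector
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        Ar = [sum(A[i][j] * r[j] for j in range(n)) for i in range(n)]
        AtAr = [sum(A[j][i] * Ar[j] for j in range(n)) for i in range(n)]  # A^T (A r)
        if Br != AtAr:
            return False   # caught a discrepancy: reject
    return True            # all trials consistent: accept

A = [[1, 2], [3, 4]]
B_good = [[10, 14], [14, 20]]   # equals A^T A
B_bad = [[10, 14], [14, 21]]
print(freivalds_check(A, B_good), freivalds_check(A, B_bad))  # expected: True False
```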
3 Theoretical Foundations
3.1 Security models for two-party computation
Security is essential for multiparty and outsourced computation. There are many strategies to
outsource computation securely. A natural approach is to rely on cryptography, which provides
strong security guarantees from formal analysis grounded in time-tested hardness assumptions.
Techniques not depending on cryptography can be more convenient or efficient in some contexts,
but they might leave room for attacks if the incentives are large enough. We introduce several
concepts: trust, verifiability, and secure communication. We discuss some of the foundational
works in provable outsourced computation. Our focus is on two-party computation. Here, one
entity is the client with a computational problem. The other entity (a cloud server) executes
the computation and provides results to the client.
Defining security requires an adversary model against which the security properties of a
system must be enforced. Different types of adversaries capture the different threat models for two-party computation. In general, there are three adversary models:
•Semi-honest adversaries follow the protocol specification but try to learn from a protocol
execution additional information held by other parties. This adversary model is also
called honest-but-curious and limits itself to confidentiality properties.
•Malicious adversaries can deviate arbitrarily from the protocol specification. Protecting
against these more powerful adversaries involves threats to integrity and authentication.
•Covert adversaries do not necessarily follow the protocol specification but try not to be
caught deviating from it. This model is a middle-ground between the previous ones and
captures realistic incentive structures supporting malicious behaviour.
The types of adversaries, non-adversarial (honest) parties, and the notion of honesty are further
discussed in Section 3.1.3.
3.1.1 Trust in Multiparty Systems
Trust is one of the top considerations when choosing a cloud provider for computation out-
sourcing. There are several definitions of trust in cloud computing [76,84,100]. We define
trust as
An expectation that the cloud carries out the computation without any malicious
intent, leakage of information, or purposefully returning wrong results.
Guaranteeing such a trusted relationship is complex. We present several ways trust is often
established.
A common way of determining trust is by considering the cybersecurity standard [41] that
the cloud provider follows. This might involve a third party’s accreditation of the cloud provider
based on periodic assessments. While this approach is often considered acceptable, there is no
guarantee that the notion of trust holds during the execution of the client’s request. In a
particularly sensitive computation, such as critical utility, military, or healthcare use, cloud
operators might be tempted by financial, ideological, or other incentives to temporarily violate
trust.
Another approach considers service-level agreements (SLAs) [1] for implementing requested
cloud services. This enables the client to monitor specific parameters defined within the SLA.
These parameters could be the type of a machine that carries out the computation, versions of
operating systems, used execution frameworks, etc. The client is informed about any changes
to the agreed parameters, or the client might monitor the parameters. The cloud provider
might be capable of spoofing some of the SLA parameters, or if the monitoring is random, the
client might take a chance and relax the parameters. This could break the trust between the
client and the cloud.
Another approach is reputation [25] using criteria defined by the client (often based on an
SLA). The client then monitors these criteria over time, analysing feedback from the client
and third parties. If the criteria metric (the cloud provider’s reputation) drops below some
threshold, the cloud provider is no longer considered trusted. One of the challenges of this
approach is the possibility of the cloud provider utilising malicious third parties to increase its
reputation. A server with an honestly gained reputation may still break trust if the reward is
significant enough.
These challenges have led to specialised cloud deployments. These include private clouds [55],
where the client’s organisation is responsible for operating the cloud platform. There may be
dedicated clusters with specific security hardware, such as custom Trusted Platform Modules
(TPMs) [7]. Both examples are expensive for the client, who is responsible for platform man-
agement, and in the latter case, also hardware components.
Whilst trust might be established, it cannot be fully guaranteed if the stakes for the out-
sourced computation are high enough. This also applies to zero trust within cloud environ-
ments [88]. As well as cloud components not trusting external clients and devices, they may
not trust each other. All access uses gateways handling authentication and authorisation. The
zero-trust frameworks pose a barrier to malicious entities, including the cloud provider. But
they do not guarantee the correctness of the result of the outsourced computation. This report
provides an overview of techniques that provide proof of the correctness of computation and
could be applied to cloud platforms without any prior establishment of trust.
3.1.2 Verifiability in Multiparty Systems
A potential solution to outsourced computation is verifiable computation [56]. The client dis-
patches the computational problem to the cloud and receives the result and proof of correctness
of the result. Different approaches can be used with varying degrees of complexity. The technical part of this report primarily discusses probabilistic and cryptographic methods, in which, as well as receiving a result with a proof, the data and the computed problem can stay hidden from the cloud. We discuss several other approaches in this section.
The first approach is to provide verifiability by multiple executions [9,37]. In this approach,
the same problem is distributed to various executors (cloud providers), executed, and then the
results are collected and compared. This approach poses several challenges. First, it requires
multiple cloud providers to be able to compute the given problem. Second, in the case of using only two cloud providers, it is not possible to determine which cloud provides the correct result (if any) without independent re-execution. Finally, without using more than two parties, the approach cannot detect collusion, where two cloud providers purposefully return incorrect results that agree with each other; an approach based on game theory has been proposed to resolve this and is discussed later in this report (Sect. 3.8).
Another approach is certifying the results by secure hardware elements [99]. In this approach,
the execution is carried out with trust bootstrapped using cryptography hardware such as TPM.
Any computation result is signed by keys stored within this TPM, where the client can verify
this signature. In this instance, there are several challenges. The first challenge is the inability to verify the result itself (without re-execution) using only its signature. Another, perhaps more critical, challenge is that trust in the physical components, such as TPMs, needs to be guaranteed, as we discussed in Sect. 3.1.1; this is often difficult, and if the stakes are high, trusted computing might provide a false sense of security.
The remainder of this report covers approaches based on cryptography and probability theory: interactive proof systems (Sect. 3.2), probabilistically checkable proofs (Sect. 3.3), interactive oracle proofs (Sect. 3.4), ZK-SNARKs (Sect. 3.6), ZK-STARKs (Sect. 3.7), and finally a game-theory-based approach (Sect. 3.8).
3.1.3 Honesty within Multiparty Computation
Within the principles of secure multiparty computation, several considerations exist for the
parties’ honesty. A frequent concern is the so-called honest majority, where most parties are
assumed to have honest intentions. They do not deviate from the protocol or gather secret
information. In this case, the honest majority can also be utilised to remove a dishonest party
from the computation [63].
In the semi-honest adversaries case, the party does not try to deviate from the protocol
specification. Instead, it tries to gather more information from the protocol execution than is
allowed. This could be by studying the protocol trace or messages exchanged during protocol
execution [58] and trying to compute additional information. Of course, there could be several
semi-honest and honest parties.
In the worst-case scenario (the malicious adversaries case), the party would willingly use
any attack vector to deviate from the protocol. This deviation could be changing the inputs
and outputs of the protocol, as well as aborting the protocol at any time. Security can still be
achieved, especially with an honest majority. The efficiency of secure multiparty computation
with malicious parties still requires improvements to reach everyday practicality [58,3]. The
malicious party could act overtly, i.e., not trying to avoid detection or a covert (covert adver-
saries case) way, making it more difficult to determine that the party is indeed malicious as
the honest majority might not have enough information to support the conclusion.
An important notion is the number of corrupted parties, which may be semi-honest or
malicious. The most common consideration used within research on protocols for secure multiparty computation is t < n, stating that the number of corrupted parties t could reach any number of parties within the computing environment, but fewer than n, the total number of parties involved within the multiparty computation. The other common model is t < n/2, where
there is an honest majority. A more robust model has a two-thirds honest majority: t < n/3. In
many protocol cases, an honest majority is a requirement [38]. However, some protocols are aimed explicitly at cases with a dishonest majority [80]. In both cases, current research is focused on
the efficiency of these protocols.
3.1.4 Secure Communication - a brief note
Secure communication is an essential part of any data exchange. This becomes especially
important in cases where the data is being sent over the network to remote parties. A typical
approach is to ensure that the traffic utilises network encryption; this could be by using public
key cryptography such as the TLS protocol. While the context of this report is outsourcing
high-stakes computation to potentially untrusted parties, some basic requirements should be met. The remote party can be expected to provide a valid TLS certificate before the computation is outsourced. This means that the certificate should be signed by a trusted certificate authority, limiting the remote entity's potential to fake its identity. The TLS protocol suite supported by the remote entity should also include recent cipher suites and consider cryptographic algorithms for post-quantum TLS [32]. This needs to be scrutinised every time the connection is established, with the client dropping support for older cipher suites as they become outdated and forcing the communication to use only the newer ones. While there could be other protocols based on custom encryption schemes, the approaches described in this report are compatible with a data exchange layer based on TLS. A minimal sketch of such a client-side policy follows.
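As an illustration (ours, not from the surveyed literature), the sketch below uses Python's standard ssl module to enforce the policy just described: certificate validation against trusted authorities and a floor on the protocol version. The host name prover.example.com is hypothetical.

```python
# A minimal sketch: refuse legacy TLS versions and unverified certificates
# before outsourcing computation. The host below is hypothetical.
import socket
import ssl

ctx = ssl.create_default_context()            # verifies certificates by default
ctx.minimum_version = ssl.TLSVersion.TLSv1_3  # drop outdated protocol versions

with socket.create_connection(("prover.example.com", 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname="prover.example.com") as tls:
        print(tls.version())                  # e.g. 'TLSv1.3'
```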
3.2 Interactive Proof Systems
Our study of verified computation and proof systems starts with a precise definition of proof.
An alphabet Σ, typically taken to be a field F, defines a context in which we define our objects. The set Σ∗ contains all finite sequences of elements of Σ, including the empty sequence, denoted by ε. Elements of Σ∗ serve as statements and proofs. The validity of a proof w for a statement x depends on whether (x, w) belongs to the relation R ⊆ Σ∗ × Σ∗. We define the language L_R to be the set of provable statements over a relation R:

L_R := { x ∈ Σ∗ | ∃w ∈ Σ∗ · (x, w) ∈ R }
Complexity classes divide languages into different categories depending on specific properties. A language L belongs to the complexity class NP (the Nondeterministic Polynomial time complexity class) iff there exists a nondeterministic Turing machine (which in our context we call the verifier) V_L that, given access to an instance x of length n and a proof π of size poly(n), either rejects (V_L(x, π) = 0) or accepts (V_L(x, π) = 1) the claim x ∈ L with perfect soundness and completeness.
• Perfect Completeness: x ∈ L ⇒ ∃π [V_L(x, π) = 1] (i.e., for every correct claim there exists a proof that will make the verifier accept).
• Perfect Soundness: x ∉ L ⇒ ∀π [V_L(x, π) = 0] (i.e., there exists no proof that can convince the verifier to accept a false claim).
A famous instance of NP is circuit/Boolean satisfiability (or simply SAT), which considers a Boolean formula or circuit C, where we want to check whether w is a satisfying assignment, C(w) = 1. An attractive property given by the Cook–Levin theorem [43] is that any NP problem can be reduced in polynomial time by a deterministic Turing machine to SAT. A toy verifier for SAT witnesses is sketched below.
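The following sketch (ours, purely illustrative) shows the NP verification pattern for SAT: checking a candidate witness w against a CNF formula C takes time linear in the formula size, even though finding w may be hard.

```python
# A minimal sketch of polynomial-time NP verification for SAT.
# Clauses use DIMACS-style signed integers: 1 means x1, -2 means NOT x2.
def check_sat(clauses, assignment):
    """Return True iff the assignment satisfies every clause, i.e. C(w) = 1."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

clauses = [[1, -2], [2, 3]]          # (x1 or not x2) and (x2 or x3)
w = {1: True, 2: True, 3: False}     # a candidate witness
print(check_sat(clauses, w))         # True: w satisfies C
```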
Classical proofs require the verifier to read the entire proof πto be convinced of a given
claim x∈L, which in many situations becomes unattractive for a weak computational client.
A well-studied question has therefore been whether we can get away with having the verifier
read fewer bits.
In the context of verifiable computation, we restrict ourselves to interactive proof systems (IP) between two parties. Interactive proof systems were introduced in [8,69]. For a function f : ∆ → τ, an interactive proof system is a protocol that allows a probabilistic Verifier to interact with a deterministic Prover to determine the validity of a statement y = f(x) on a common input x, based on an interaction of k rounds (exchanging 2k messages). The Verifier and the Prover interact using a sequence of messages t = (m_1, π_1, ..., m_k, π_k), called the transcript. In the i-th round of interaction, the Verifier challenges the Prover by sending a uniformly random message m_i to the Prover. The Prover replies to the Verifier with a message π_i. The Verifier is probabilistic, meaning that any challenges (messages) m_i sent by the Verifier may depend on some internal randomness r and the previous messages m_1, ..., m_{i−1}. The behaviour of the Verifier can be either adaptive or non-adaptive. The Verifier is non-adaptive (NAV) if its queries (challenges) depend only on the Verifier's inputs and its randomness. An adaptive verifier selects its challenges based on previous messages. At the end of the protocol, the Verifier either accepts or rejects the statement y = f(x) based on the transcript and its internal randomness.
Definition 3.1 (Interactive Proof). Let L ⊆ Σ∗ be any language. We say that (P, V) is an interactive proof system for the language L if there exists a k-message protocol between a polynomial-time verifier V and an unbounded prover P satisfying the following two properties:
• Completeness⁵: Pr[⟨P, V_r⟩(x) = 1 | x ∈ L] ≥ 2⁄3
• Soundness against all unbounded malicious provers P∗: Pr[⟨P∗, V_r⟩(x) = 1 | x ∉ L] ≤ 1⁄3
The notation ⟨−, −⟩ denotes interaction, and the subscript r denotes the number of times the verifier draws a random bit (or coin) to use during the verification. Figure 2 illustrates the interaction between a prover and a verifier. Instead of computing C(x), a malicious prover could try to guess the proof for an invalid computation. The soundness condition expresses how hard it is to “find” an invalid proof that would pass the verifier's test despite being invalid. For reasons of complexity, “finding” such a proof could only be achieved by resorting to randomised guessing. An example of an interactive proof system can be found in Sect. 3.2.1.
There exist two different branches of interactive proof systems: public-coin, where the random choices (coin tosses) of the Verifier are made public, and private-coin, where the random choices are kept secret. Any private-coin protocol can be transformed into a public-coin protocol using the technique from [71]. A public-coin interactive proof is often called an Arthur–Merlin (AM) game.
The complexity class IP denotes all languages for which an interactive proof system exists.
A fundamental measure of the efficiency of an interactive proof protocol is the round complexity
k. As illustrated in Fig. 2, the round complexity counts the number of interactions between
the prover and verifier.
It was proved in 1992 by Shamir [111] that every language in PSPACE has an interactive proof system (IP = PSPACE), extending the work in [86]. Argument systems are a relaxation of interactive proof systems, where soundness is weakened to computational soundness, meaning that soundness is only required to hold against a computationally bounded prover running in polynomial time [123].
Definition 3.2 (Argument System). An argument system (P, V) is defined similarly to an interactive proof system (P, V), with the following differences:
⁵ Constants like 2⁄3 used in the definition of IP stem from the original theoretical work. They are also sufficient for proving the main PCP results. For the later, more applied work on PCP, improved bounds have been derived.
[Figure 2: Interaction between prover and verifier. Both parties compile the function f into a circuit C. The Prover computes y = C(x), keeps a proof of the computation, and encodes the proof; it sends the output y to the Verifier. The Verifier then sends queries m_1, ..., m_k, receives responses π_1, ..., π_k, and finally accepts or rejects the proof.]
• The soundness condition is replaced by computational soundness: for every probabilistic polynomial-time machine P∗ and all sufficiently long x ∉ L, the verifier V rejects with probability at least 1⁄2.
Argument systems were introduced by Brassard et al. in 1986 [29] to obtain perfect zero-
knowledge protocols (see Section 3.2.3) for NP. Kilian uses the relaxation to build a constant-
round protocol with low communication complexity [81]. Limiting the prover’s computational
power seems necessary to attain soundness of argument systems [57,66]. Argument systems
use cryptographic primitives and other assumptions, which a super-polynomial time prover can
break. Such a prover might still be “efficient enough” to generate an invalid proof that might pass the test.⁶ Cryptographic primitives have been used to improve the situation and achieve
additional desirable properties that are unattainable for interactive proofs, such as re-usability
(i.e., the ability for the verifier to reuse the same “secret state” to outsource many computations
on the same input), and public verifiability. Some of this work might be relevant when studying
specific properties of PCP. A general insight here is that some properties of interest require
strong assumptions on provers that limit the usefulness of protocols supporting those properties.
3.2.1 The Sum-check Protocol
The Sum-Check protocol⁷ is an interactive proof system introduced by Lund, Fortnow, Karloff, and Nisan in [86]. The sum-check protocol is used in a brief account of the original proof of the PCP theorem in Subsection 3.3.1. The protocol's approach is very similar to how the classic
⁶ Super-polynomial time can be “just” not polynomial. For instance, the Adleman–Pomerance–Rumely primality test has super-polynomial time complexity (log n)^{O(log log log n)}, which is dominated by a polynomial only for very large n.
⁷ See the tutorial at semiotic.ai/articles/sumcheck-tutorial/ for an excellent account of the sum-check protocol.
television detective questions a suspect to detect whether they are lying. The detective starts
by asking for the whole story before digging into more minor details to look for contradictions.
The detective carefully selects the questions so that each communication round restricts the
range of future valid answers; this means that a lying witness will eventually be caught in a
contradiction. In the following, let F be a finite field (FF).
The Verifier performs a similar interaction with the Prover. In the original setting of the sum-check protocol, the problem of interest is encoded as the problem of checking a sum of values of a low-degree multivariate polynomial over an exponentially large hypercube.
The Prover takes as input a v-variate polynomial g : F^v → F of degree ≤ d in each variable, where d ≪ |F|. The goal of the Prover is to convince the Verifier that

β = Σ_{x_1∈{0,1}} Σ_{x_2∈{0,1}} · · · Σ_{x_v∈{0,1}} g(x_1, x_2, ..., x_v),    (1)

where β ∈ F. The Verifier has oracle access to the polynomial g and is given the claimed sum β. It is the Verifier's job to determine whether β equals the sum of the values g(x_1, ..., x_v) over all Boolean assignments.
The protocol proceeds in v rounds, where in each round i the Prover sends a univariate polynomial g_i to the Verifier. In the first round, i = 1, the Prover sends the polynomial g_1(X_1) to the Verifier, with the claim that g_1(X_1) = Σ_{x_2} Σ_{x_3} · · · Σ_{x_v} g(X_1, x_2, ..., x_v). In the following rounds, i > 1, the Verifier selects a random value r_{i−1} ∈ F and sends it to the Prover. The Prover then sends a polynomial g_i(X_i) to the Verifier, with the claim that g_i(X_i) = Σ_{x_{i+1},...,x_v} g(r_1, ..., r_{i−1}, X_i, x_{i+1}, ..., x_v). The Verifier checks each new claim by testing that g_{i−1}(r_{i−1}) = g_i(0) + g_i(1). In the final round, the Prover sends the polynomial g_v(X_v), claimed to be equal to g(r_1, ..., r_{v−1}, X_v). The Verifier then draws a final random value r_v and uses its oracle access to g to check that g_v(r_v) = g(r_1, ..., r_v).
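To make the message flow concrete, here is a toy, self-contained sketch of the protocol (our illustration, not from [86]). For clarity it merges the two parties into one loop: the local function g_i plays the honest Prover's message in round i, while the consistency checks and random challenges belong to the Verifier. The polynomial encoding and field size are assumptions of the sketch.

```python
# A toy sum-check run over F_p. A polynomial is a dict mapping exponent
# tuples to coefficients, e.g. {(1, 1): 3, (0, 1): 2} is 3*x1*x2 + 2*x2.
import random
from itertools import product

P = 2**61 - 1  # an illustrative prime modulus

def evaluate(poly, point):
    """Evaluate the multivariate polynomial at a point, mod P."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exps):
            term = term * pow(x, e, P) % P
        total = (total + term) % P
    return total

def sumcheck(poly, v, beta):
    """Return True iff the Verifier accepts the claim (1) for beta."""
    prefix, claim = [], beta % P
    for i in range(v):
        # Honest Prover's round-i message: g_i(X_i), the sum over the
        # remaining Boolean variables with r_1..r_{i-1} already fixed.
        def g_i(X):
            return sum(
                evaluate(poly, prefix + [X] + list(tail))
                for tail in product((0, 1), repeat=v - i - 1)
            ) % P
        # Verifier's consistency check: g_{i-1}(r_{i-1}) = g_i(0) + g_i(1).
        if (g_i(0) + g_i(1)) % P != claim:
            return False
        r = random.randrange(P)   # Verifier's random challenge r_i
        claim = g_i(r)
        prefix.append(r)
    # Final check via oracle access to g itself.
    return evaluate(poly, prefix) == claim

g = {(1, 1): 3, (0, 1): 2}    # g(x1, x2) = 3*x1*x2 + 2*x2
print(sumcheck(g, 2, 7))       # True:  sum over {0,1}^2 is 0 + 2 + 0 + 5 = 7
print(sumcheck(g, 2, 8))       # False: a wrong claim is rejected
```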
The sum-check protocol has, since its introduction, been refined into different efficient proof
protocols [120,125,118,115,44,68]. A significant advantage of the sum-check protocol is
that implementing the prover can, in specific settings, avoid using costly operations such as
the Fast Fourier Transform, which is common in other protocols. Examples include [115], which implements a linear-time prover, and [44], where the prover is a streaming algorithm.
The sum-check protocol has been generalised to cover univariate polynomials [19] and tensor
codes [89].
3.2.2 The Fiat-Shamir Heuristic
The Fiat-Shamir heuristic [54] allows the transformation of any public-coin interactive proof protocol I into a non-interactive, publicly verifiable protocol Q in the random oracle model (RO).
The random oracle model, presented in [11,34], is an idealised cryptographic model that gives all parties (the Prover and the Verifier) access to an entirely random function H : {0,1}^k → {0,1}^k between constant-sized bit strings. In practice, H is instantiated with a hash function, since no such ideal function exists. The function H takes an input and produces a random output chosen uniformly from the output domain. Random oracles are used, in theory, to obtain practical, efficient, and secure protocols; in practice, the resulting protocols are only heuristically secure, since no truly random hash function can be implemented. Examples of this gap include the work in [34], which shows that there are schemes that are secure in theory but for which any implementation of the random oracle results in an insecure scheme. The random oracle allows the Prover to compute the Verifier's random queries (challenges) on its own, by hashing the transcript so far. This eliminates the need for the Verifier to send messages to the Prover, making the protocol non-interactive.
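The following sketch (ours, with SHA-256 standing in for the oracle H) shows the core of the transformation: each challenge is derived by hashing the transcript so far, so a Verifier can later recompute and check the same challenges without ever having interacted.

```python
# A minimal sketch of the Fiat-Shamir transformation, assuming SHA-256
# stands in for the random oracle H. The prover derives each challenge
# by hashing the transcript so far, so no verifier messages are needed.
import hashlib

P = 2**61 - 1  # challenge space (illustrative)

def fiat_shamir_challenge(transcript: bytes) -> int:
    """Derive a pseudo-random challenge from the transcript via H."""
    digest = hashlib.sha256(transcript).digest()
    return int.from_bytes(digest, "big") % P

# Example: making a 2-round public-coin protocol non-interactive.
transcript = b"statement:y=f(x)"
for round_msg in (b"prover-message-1", b"prover-message-2"):
    transcript += round_msg
    r = fiat_shamir_challenge(transcript)   # replaces the Verifier's coin toss
    transcript += r.to_bytes(8, "big")      # the challenge joins the transcript
```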
Kilian [81] shows that 4-message argument systems for all of NP can be established by
combining any PCP with Merkle-hashing based on collision-resistant hash functions. Micali
[92] shows that applying the Fiat-Shamir transformation to Kilian's 4-message argument system
yields a succinct non-interactive argument (SNARG) in the random oracle model. The resulting
Kilian-Micali construction uses a Merkle-Tree (MT) [91] as the basis for a commitment scheme
that allows sending just a single hash value (i.e., the tree’s root) as commitment. The leaves of
the tree correspond to all the Prover’s evaluation points. Commitment schemes are discussed
in the context of polynomial commitment schemes in Section 3.5.2.
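As a small illustration of the commitment idea (ours, not the full construction from [91]), the sketch below hashes a list of the Prover's evaluations pairwise up to a single root; sending that one hash commits the Prover to every leaf, and any leaf can later be opened with a logarithmic-size authentication path.

```python
# A minimal Merkle-tree commitment sketch (SHA-256). The Prover commits
# to a list of evaluation points by sending only the root hash.
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Hash leaves pairwise up to a single root (power-of-two count assumed)."""
    level = [H(leaf) for leaf in leaves]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

evals = [b"eval-0", b"eval-1", b"eval-2", b"eval-3"]
commitment = merkle_root(evals)   # one hash value commits to all leaves
```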
3.2.3 Proof of Knowledge and Zero-knowledge
In a traditional interactive proof system with perfect completeness and soundness, an honest prover can always convince the verifier of a true statement, and no prover can persuade it of a false one. These notions can be relaxed to statistical properties, where they hold except with negligible probability, or to computational ones, where a cheating prover could in principle succeed but only by performing a computationally infeasible amount of work.
A variant of this concept is proofs of knowledge [69], where the prover claims to know (or to be able to compute) a particular piece of secret information, in a way that convinces the verifier. What it means precisely to know something is defined via an extractor: since the prover cannot simply output the secret knowledge itself, a knowledge extractor with access to the prover must be able to extract a witness of such knowledge [97].
Let x be a statement of a language L in NP as before, and W(x) a set of witnesses for x that should be accepted in the proof. Define the relation

S = {(x, w) : x ∈ L, w ∈ W(x)}.
A proof of knowledge for relation S is a two-party protocol between a prover and verifier in which the security notions now hold over the knowledge of the secret value:
• Knowledge Completeness: If (x, w) ∈ S, then the prover P who knows the witness w succeeds in convincing the verifier V of his knowledge. More formally, Pr[⟨P(x, w), V(x)⟩ = 1] = 1, i.e. a prover holding a witness can always convince the verifier in their interaction.
• Knowledge Soundness: requires that the success probability of a knowledge extractor E in extracting the witness, after interacting with a possibly malicious prover, must be at least as high as the success probability of the prover P in convincing the verifier. This property guarantees that if some prover can convince the verifier using some strategy, then the prover indeed knows the secret information.
The notion of zero-knowledge can be seen as an additional property that an interactive proof system, an interactive argument, or a proof of knowledge can have. It adds the requirement that, whatever strategy the verifier follows and whatever a priori knowledge it may have, the verifier learns nothing except that the statement claimed by the prover is true. This is achieved by requiring that the interaction between the prover and verifier can be efficiently simulated without interacting with the prover, assuming the prover's claim holds. Perfect zero-knowledge is a more robust notion of zero-knowledge that does not limit the verifier's power [57].
We can model a cheating verifier as a polynomial-time Turing machine V* that gets an auxiliary input δ of length at most polynomial in the size of the input x. The auxiliary input δ represents a priori information that the verifier may have collected from previous executions of the protocol; by allowing the verifier to use this information, we model a verifier that may try to cheat.
Definition 3.3 (Zero-Knowledge [45]). An interactive proof or argument system (P, V) for language L is zero-knowledge if for every polynomial-time verifier V* there is a simulator M running in expected probabilistic polynomial time such that the simulation is computationally indistinguishable (in polynomial time) from (P, V) on input x ∈ L and arbitrary δ.
As before, we can generalise the zero-knowledge notion to perfect (resp. statistical) zero-knowledge by replacing the requirement of computational indistinguishability with perfect indistinguishability (resp. indistinguishability up to negligible probability).
Remark. Falsifiable assumptions are cryptographic assumptions that can be formulated in terms of an interactive game between a challenger C and an adversary A such that C can determine whether A won at the end of the game, and an efficient A can only succeed with at most negligible probability. Intuitively, an efficient C can test whether an adversarial strategy breaks the assumption; this is why the majority of assumptions and constructions in cryptography are falsifiable. On the other hand, some cryptographic assumptions cannot be modelled this way and are consequently non-falsifiable. Knowledge assumptions are a clear example of this phenomenon.
An example of a simple proof of knowledge protocol that also happens to be zero-knowledge is Schnorr's proof of knowledge of a discrete logarithm [108]. The protocol is defined for a cyclic group G of order q with a generator element g ∈ G. The prover wants to prove knowledge of x = log_g y in a group where computing x given (g, y = g^x) is computationally infeasible, that is, a group where computing discrete logarithms is hard. The prover interacts with the verifier as follows:
1. The prover commits to randomness r by sending t = g^r to the verifier.
2. The verifier replies with a challenge c chosen at random.
3. The prover receives c and responds by sending s = r + cx mod q.
4. The verifier accepts if g^s = t·y^c.
The protocol can be made non-interactive using the Fiat-Shamir heuristic to derive the challenge as the hash c = H(g, y, t), as described in Section 3.2.2 above. It is a valid proof of knowledge because it has a knowledge extractor that extracts x by interacting with the prover twice, obtaining s_1 = r + c_1x and s_2 = r + c_2x for two distinct challenges, and computing x = (s_1 − s_2)/(c_1 − c_2).
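The following Python sketch runs the protocol end to end with the Fiat-Shamir challenge c = H(g, y, t), then illustrates the knowledge extractor by reusing one commitment with two distinct challenges, and finally the simulation idea behind Definition 3.3 by producing an accepting transcript without the secret. The tiny parameters (p = 23, q = 11, g = 2) are assumptions for illustration only and offer no security.

```python
import hashlib
import random

p, q, g = 23, 11, 2            # toy subgroup of order q = 11 in Z_23* (assumption)
x = random.randrange(1, q)     # the prover's secret x = log_g(y)
y = pow(g, x, p)               # the public value y = g^x

# Schnorr with a Fiat-Shamir challenge c = H(g, y, t).
r = random.randrange(1, q)
t = pow(g, r, p)                                   # 1. commitment t = g^r
c = int.from_bytes(hashlib.sha256(f"{g},{y},{t}".encode()).digest(), "big") % q
s = (r + c * x) % q                                # 3. response s = r + c*x mod q
assert pow(g, s, p) == (t * pow(y, c, p)) % p      # 4. check g^s = t * y^c

# Knowledge extractor: the same commitment r with two distinct challenges.
c1, c2 = 3, 7
s1, s2 = (r + c1 * x) % q, (r + c2 * x) % q
assert ((s1 - s2) * pow(c1 - c2, -1, q)) % q == x  # x = (s1 - s2)/(c1 - c2) mod q

# Simulator: choose c and s first, then back-compute t = g^s * y^(-c).
c_sim, s_sim = random.randrange(q), random.randrange(q)
t_sim = (pow(g, s_sim, p) * pow(y, q - c_sim, p)) % p
assert pow(g, s_sim, p) == (t_sim * pow(y, c_sim, p)) % p  # transcript verifies
```

The simulator step shows why the verifier learns nothing: accepting transcripts (t, c, s) can be produced without ever using x.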
3.2.4 Multi-Prover Interactive Proof Systems
The prover of an IP can be split into multiple entities, with the restriction that these entities cannot interact with one another while interacting with the verifier, to form a multi-prover interactive proof (MIP) [12]. The setup is reminiscent of the police procedure of isolating suspects and interrogating each of them separately [64]. The suspects are allowed to coordinate a strategy before they are separated; once separated, however, they can no longer interact. Like an interrogator, the verifier tries to determine whether the provers' stories are consistent with each other and with the claim being asserted.
A multiple-prover proof system is more expressive than a regular IP because each prover P_i is unaware of the messages sent to any other prover P_j with j ≠ i. It has been proved that two-prover systems are as robust and expressive as any multi-prover interactive proof system.
Many of the ideas of MIPs have been adapted into interactive oracle proofs and probabilistically checkable proofs, where a polynomial commitment scheme replaces the second prover.
3.3 Probabilistically Checkable Proofs
The first proof system introduced for NP in Section 3.2 has perfect soundness and completeness; however, the proof size is not constant, and the verifier must work much harder than what we would consider efficient and practical. Fortunately, research has shown that the performance of the verifier can be significantly improved if we are willing to settle for less-than-perfect soundness, using a probabilistic approach in which the verifier queries a random subset of the proof.
The Probabilistically Checkable Proof (PCP) theorem [6, 5, 48] revolutionised the field of verifiable computation by asserting that all NP statements have a PCP, meaning that they can be written in a format that allows an efficient (poly-logarithmic) probabilistic verifier V with oracle access to the proof π to probabilistically verify claims such as “x ∈ L” (for an input x and an NP-language L) by querying only a few bits of the proof π, with soundness error δ_s = 1/2. The soundness error can be reduced to 2^(−σ) by running the verifier σ times.
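As a quick arithmetic illustration of this amplification under independent runs:

```python
# Each independent run accepts a false statement with probability at most 1/2,
# so sigma runs all accepting occurs with probability at most 2^(-sigma).
sigma = 40
print(f"soundness error after {sigma} runs: {0.5 ** sigma:.2e}")  # ~9.09e-13
```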
An essential aspect of verifiable computation is the probabilistic verifier, which we will
describe next.
Definition 3.4 (Verifier). The Verifier is a probabilistic Turing machine (PTM) restricted by the following functions: l, q, r, t : N+ → N+. The Verifier has oracle access to a proof π of the statement x ∈ L with |x| = n, where the length of the proof is restricted by |π| ≤ l(n). The Verifier flips at most r(n) coins and queries at most q(n) locations of the proof to either accept (V_r(x, π) = 1) or reject (V_r(x, π) = 0) the statement x ∈ L in time ≤ t(n).
Given such a verifier, the complexity class PCP can be constructed.
Definition 3.5 (PCP class). The PCP class is defined as

PCP[ length = l(n), randomness = r(n), queries = q(n), time = t(n) ],   (2)
where l, q, r, t are defined as in Definition 3.4. A language L belongs to the PCP complexity class if there exists a PCP(l(n), r(n), q(n), t(n)) with (perfect) completeness 1 and soundness 1/2 for L, where
• Perfect Completeness: x ∈ L ⇒ ∃π · Pr[V_r(x, π) = 1] = 1
• Soundness: x ∉ L ⇒ ∀π · Pr[V_r(x, π) = 1] ≤ 1/2
That is, a proof of a true statement is always accepted, while a proof of a false statement is accepted with probability no greater than one half.
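To fix intuition for the parameters in Definitions 3.4 and 3.5, the sketch below shows the shape of such a verifier: it spends its random coins to pick q positions of the proof π and decides from those bits alone. The acceptance predicate is a placeholder assumption, not an actual PCP test.

```python
import random

def pcp_style_verifier(pi: list, q: int) -> bool:
    """Query q random positions of the proof pi and decide from those bits alone.
    The all-ones predicate is a stand-in for a real local test (assumption)."""
    positions = random.sample(range(len(pi)), q)  # consumes the r(n) random coins
    return all(pi[pos] == 1 for pos in positions)

pi = [1] * 1024                       # a proof of length l(n) = 1024
print(pcp_style_verifier(pi, q=3))    # reads only q = 3 bits of pi
```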
The PCP class is often written as PCP(r(n), q(n)), which records only the randomness r(n) and the query complexity q(n); the verifier is characterised by these two functions. Much research has studied the construction of PCPs, essentially looking for languages in the PCP class with minimal proof length, running time, randomness, and query complexity, while minimising the soundness error. The most significant costs in verifiable computing are typically the verifier's and prover's runtime, the number of queries q(n), and the creation of the circuit.
Definition 3.6 (PCP Theorem). The PCP theorem states that NP = PCP(O(log n), O(1)).
There have been various attempts to minimise the query complexity. Håstad [75] showed in 1997 that the query complexity in the PCP theorem can be reduced to 3 bits.
The PCP theorem is one of the most challenging theorems in theoretical computer science
and is considered one of the most important results in complexity theory [124]. The proof of
the PCP theorem is far beyond our scope here, but we point the reader to [48]. We will instead
highlight the significant steps of the proof.
3.3.1 Proof Techniques and Insights
The early work on the PCP theorem was motivated by complexity-theoretic considerations
[51,52]. In contrast, later work such as [75] focused on application