PRX QUANTUM 3, 010345 (2022)
Quantum Error Mitigation as a Universal Error Reduction Technique:
Applications from the NISQ to the Fault-Tolerant Quantum Computing Eras
Yasunari Suzuki,1,2,*,§ Suguru Endo,1,2,†,§ Keisuke Fujii,3,4,5 and Yuuki Tokunaga1,‡
1NTT Computer and Data Science Laboratories, NTT Corporation, Musashino 180-8585, Japan
2JST, PRESTO, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan
3Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan
4Center for Quantum Information and Quantum Biology, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Japan
5Center for Emergent Matter Science, RIKEN, Wako, Saitama 351-0198, Japan
(Received 19 March 2021; revised 15 October 2021; accepted 3 January 2022; published 18 March 2022)
In the early years of fault-tolerant quantum computing (FTQC), it is expected that the available code distance and the number of magic states will be restricted due to the limited scalability of quantum devices and the insufficient computational power of classical decoding units. Here, we integrate quantum error correction and quantum error mitigation into an efficient FTQC architecture that effectively increases the code distance and T-gate count at the cost of constant sampling overheads in a wide range of quantum computing regimes. For example, while we need 10^4 to 10^10 logical operations for demonstrating quantum advantages from optimistic and pessimistic points of view, we show that we can reduce the required number of physical qubits by 80% and 45% in each regime. From another perspective, when the achievable code distance is up to about 11, our scheme allows executing 10^3 times more logical operations. This scheme will dramatically alleviate the required computational overheads and hasten the arrival of the FTQC era.
DOI: 10.1103/PRXQuantum.3.010345
I. INTRODUCTION
Quantum computers are believed to be capable of implementing several tasks, such as factoring and Hamiltonian simulation, in exponentially smaller computational times than those of classical computers [1,2]. However, quantum systems generally interact with their environments, which leads to physical errors in the system that may destroy their quantum advantages. Since the physical error rates of quantum computers are still much higher than those of classical computers, it is vital to suppress these errors. As a solution, fault-tolerant quantum computing (FTQC) using quantum error-correcting codes has been studied
*yasunari.suzuki.gz@hco.ntt.co.jp
†suguru.endou.uc@hco.ntt.co.jp
‡yuuki.tokunaga.bf@hco.ntt.co.jp
§Y.S. and S.E. contributed equally to this work.
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.
[3–7]. The long-term FTQC allows executing conventional quantum algorithms such as Hamiltonian simulation algorithms [8]. According to the current state-of-the-art resource estimations [9,10], the logical quantum operation count will be on the order of 10^10 to observe clear quantum advantages based on computational complexity theory.
Towards the realization of the long-term FTQC, we will experience several intermediate regimes, as shown in Fig. 1, because high-level encoding is not allowed due to restrictions of quantum resources such as qubit and magic-state count [5,7]. Since quantum error correction (QEC) requires massive classical computation for repetitive error estimations, the available code distance would also be strongly limited in the near future [15–17]. As quantum technologies become mature, computational quantum supremacy [13] will be achieved in the logical space. We refer to the intermediate regime from the realization of logical quantum supremacy to the demonstration of long-term applications as an early FTQC regime. The number of physical qubits will go beyond one thousand in this region, and we anticipate that more than about 10^4 reliable logical operations on 10^2 logical qubits are available. Even at the beginning of the early FTQC regime, we may observe
[Figure 1: a schematic plot of logical error rates (10^−2 down to 10^−14) versus the number of logical quantum operations (10^2 to 10^12), divided into the classically tractable FTQC, early FTQC, and long-term FTQC regimes. Marked features include the classical-simulation boundary, the point from which speedup with heuristic applications may be observed, the point at which theoretically verified speedup is achieved, the hardware requirement for long-term quantum algorithms, and the N_e = 10^−3 and N_e = 1 lines.]
FIG. 1. Schematic picture representing the transitional period from the classically tractable FTQC regime towards the realization of the long-term FTQC. We are now in the classically tractable FTQC era due to the lack of physical qubits [11,12]. In the figure, the purple line indicates the hardware requirement for performing classically intractable tasks within a realistic time, whereas the blue line corresponds to the requirement for demonstrating quantum advantages with conventional long-term quantum algorithms. To estimate these lines, we refer to the quantum supremacy experiments [13] and the existing state-of-the-art resource estimations [9,10,14]. The early FTQC regime is defined as the region between these lines. In the main text, we assume that the number of error events during FTQC, N_e, is required to be smaller than 10^−3, which is shown as the dotted black line. Our technique allows FTQCs with a number of error events on the order of unity, N_e ∼ 1, shown as the solid black line, to execute applications that originally require a much smaller error-event count. For example, at the beginning and the end of the early FTQC regime, our technique allows simulating applications (white and yellow circles with black rims) with the relaxed hardware requirement (white and yellow circles with red rims).
a quantum speedup with heuristic quantum algorithms,
for example, with the variational quantum eigensolver
[18–20].
In this paper, to realize efficient and high-accuracy quantum computation in the early FTQC era, we propose a novel framework of FTQC, in which QEC and quantum error mitigation (QEM) are combined on an equal footing. While QEM has been considered an alternative error-minimization technique for noisy intermediate-scale quantum (NISQ) devices, owing to its low hardware overhead at the expense of a sampling cost, we show that, by integrating probabilistic error cancellation [21,22] into the FTQC framework, we can mitigate all the dominant types of errors in the logical space. We also note that our scheme can efficiently mitigate Pauli errors by virtually updating the quantum states with a classical memory called the Pauli frame [5]. In the conventional QEM formalism, the sampling cost of QEM increases exponentially with the number of physical error events [23,24]. Therefore, the sampling overheads of QEM become unrealistic in NISQ computing when the number of physical operations increases for a fixed error rate per quantum gate, and the number of error events that QEM can efficiently suppress is limited to the order of unity. In our framework, the sampling cost of QEM instead increases exponentially with the number of logical error events in the encoded space. Note that we can tune the number of logical error events by adjusting several parameters such as the code distance, distillation levels, and the precision of approximations in the Solovay-Kitaev decomposition. Thus, it is highly likely that we can find regions where the QEM techniques are the most effective, i.e., where the number of logical error events is on the order of unity. Accordingly, we can relax the hardware requirements at the cost of constant sampling overheads. Even after scalable FTQC is realized, taking QEM into account, we can optimize quantum computation by allocating computational resources at will to perform even more efficient quantum computing.
We need to overcome several fundamental difficulties to apply QEM in the logical space, because the costs and restrictions of logical operations and the dominant sources of errors differ from those in the NISQ formalism. We resolve these difficulties by giving solutions one by one. For example, solutions to the major problems are as follows. In FTQC, logical Clifford operations and Pauli measurements can be applied efficiently, while non-Clifford operations are costly because they involve a number of T-gate injection, distillation, and teleportation procedures [5,7]. These logical operations are affected by three types of logical errors: logical errors in each elementary gate operation due to restricted code distances, noise in non-Clifford logical gates deriving from a shortage of magic-state distillation processes, and errors induced in the Solovay-Kitaev decomposition [25,26]. We call the first two logical errors decoding errors and the last one approximation errors. We discuss what types of errors are present
when implementing logical operations, and provide a hierarchical way to mitigate noisy and costly operations with clean and less costly ones. To detect and correct physical errors during computation, we store the estimated errors in the Pauli frame instead of physically applying recovery operations [5]. This means that the actual physical states are almost never in the code space. We provide concrete procedures for a universal set of logical operations incorporating QEM, which are compatible with the Pauli frame. To apply probabilistic error cancellation, we need a good characterization of the noise model to construct QEM operations. We show that decoding errors can be efficiently characterized with gate-set tomography [27,28] on the code space. Note that the approximation errors of the Solovay-Kitaev algorithm can be characterized efficiently on classical computers. Finally, while probabilistic error cancellation is a QEM technique for mitigating errors in algorithms that calculate expectation values, many FTQC algorithms are sampling algorithms using phase estimation [9,10,29]. We show that probabilistic error cancellation is compatible with the phase estimation algorithm. See Appendix H for details.
We perform resource estimations of FTQC under realistic scenarios with and without QEM, and we show that our scheme can dramatically alleviate the required computational overheads in FTQC. We assume that the mean number of logical error events N_e is required to reach N_e = 10^−3, and that the sampling overhead of QEM is restricted to a reasonable level, i.e., within 10^2 times more samples for achieving a certain accuracy. We expect that at least 10^4 logical operations are required to demonstrate classically intractable applications. In this case, the required number of qubits is reduced to approximately one fifth with QEM compared to the original qubit count. We also expect that at least 10^10 logical operations are necessary to perform conventional long-term applications. The required number of qubits is reduced to 55% in this regime. From another perspective, our scheme can be used for increasing the number of available logical operations when the available code distance is strongly restricted. The lifetime of current superconducting qubits is up to about 1 ms, and a cycle of error estimation during FTQC must be sufficiently faster than the lifetime, i.e., about 1 μs [15,30]. To cope with this strong restriction, efficient implementations of classical error-decoding architectures have been studied. According to the recent state-of-the-art proposals [15–17], the available code distance would be limited to about 11 in the near future, even with simplified decoding algorithms. When the available code distance is limited to 11, our scheme enables 10^3 times more logical operations with the same hardware requirement. Thus, our technique can clearly accelerate the realization of applications in the early and long-term FTQC regimes. This improvement is illustrated by red arrows in Fig. 1. It is also worth noting that, to the best of our knowledge, these are the first examples in which the performance of useful quantum algorithms with clear quantum advantages is enhanced via QEM under realistic conditions, since QEM has previously been investigated for near-term heuristic quantum algorithms dependent on numerical optimization.
This paper is organized as follows. In Sec. II, we review probabilistic error cancellation and the architecture of fault-tolerant quantum computing. In Sec. III, we describe how to evaluate decoding errors and approximation errors. Then we show our novel FTQC architecture with an analytical argument of the cost of QEM and explain the effect of model-estimation errors. In Sec. IV, we numerically analyze the sampling cost of QEM for decoding errors and approximation errors and demonstrate that we can effectively increase the code distance and the number of T gates via QEM even when there are finite estimation errors. Finally, we conclude our paper with a discussion in Sec. V.
II. PRELIMINARIES
A. Quantum error mitigation and probabilistic error cancellation
Quantum processors are affected by a number of physical noise sources, which should be mitigated to obtain correct results. Here, for simplicity, we assume that the gate errors are Markovian, i.e., the noise process N for a gate is totally independent of other gate errors. In this case, we have

ρ_out = N_{N_G} ∘ U_{N_G} ∘ N_{N_G−1} ∘ U_{N_G−1} ∘ ··· ∘ N_1 ∘ U_1(ρ_in),  (1)

where ρ_out and ρ_in are the output and input quantum states, U_k and N_k denote the ideal and noisy parts of the process of the kth gate, and N_G is the number of gates. To ensure correct computations, it is necessary to mitigate the effect of N_k (k = 1, 2, ..., N_G) and obtain

ρ_out^ideal = U_{N_G} ∘ U_{N_G−1} ∘ ··· ∘ U_1(ρ_in).  (2)
QEM has been proposed as a method for suppressing errors without encoding, and it is especially useful for NISQ devices with a restricted number of qubits [21,22,31]. Generally speaking, QEM methods recover not the ideal density matrix ρ_out^ideal itself, but rather the ideal expectation value of an observable, ⟨M̂⟩_ideal = Tr(ρ_out^ideal M̂), via classical postprocessing. Note that QEM is not a scalable technique because it needs a number of circuit runs that increases exponentially with the number of error events in the quantum circuit [21,22].
Now let us explain the concept of probabilistic error cancellation, with which we can completely eliminate the bias from the expectation value of the observables, given complete information on the noise model [21,22]. (Later, we use this method to suppress errors in FTQC.) First, we identify the noise map N via either process or gate-set
tomography [27,28], and calculate the inverse N^−1. Then, by finding a set of processes {B_i} such that N^−1 = Σ_i η_i B_i, where η_i ∈ ℝ and Σ_i η_i = 1, we have

U = N^−1 ∘ N ∘ U = Σ_i η_i B_i ∘ N ∘ U.  (3)

Note that arbitrary operations can be represented as linear combinations of tensor products of single-qubit Clifford operations and Pauli measurements [22]. Here, we can rewrite Eq. (3) as

U = γ_Q Σ_i q_i sgn(η_i) B_i ∘ N ∘ U,  (4)

where γ_Q = Σ_i |η_i|, q_i = |η_i|/γ_Q, γ_Q ≥ 1, and sgn(η_i) is a parity, which takes ±1, corresponding to the operation B_i. We refer to γ_Q as the QEM cost because it is related to the sampling overhead.
Now let us suppose that we measure an observable M̂ and obtain

⟨M̂⟩_U = γ_Q Σ_i q_i ⟨μ̂_i^eff⟩.  (5)

Here, μ̂_i^eff = sgn(η_i) m̂_i, and m̂_i is a measurement outcome for a process B_i ∘ N ∘ U. We generate the process B_i with probability q_i and multiply the corresponding parity with the measurement result, which is denoted as μ̂^eff. Then, the expectation value of the random variable μ̂^mit = γ_Q μ̂^eff approximates the error-free expectation value ⟨M̂⟩_U. Note that, since Var[μ̂^mit] = γ_Q^2 Var[μ̂^eff], and a measurement outcome without QEM, which we denote μ̂^nmit, has a similar variance, the variance of the error-mitigated value is amplified by a factor of approximately γ_Q^2. Therefore, we need γ_Q^2 times more samples to achieve an accuracy similar to that before applying QEM.
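To make the sampling procedure concrete, the following sketch (not from the paper; a minimal single-qubit illustration with an assumed depolarizing rate p = 0.1) builds the quasiprobability decomposition of N^−1 for a depolarizing channel and Monte Carlo samples Eqs. (4) and (5). The noisy estimate of ⟨Z⟩ on |0⟩ is biased towards 1 − 4p/3, while the γ_Q-weighted, parity-corrected average recovers the ideal value +1.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.1  # depolarizing error probability (assumed for illustration)
probs = np.array([1 - p, p / 3, p / 3, p / 3])  # (I, X, Y, Z)

# Pauli-transfer eigenvalues of the noise: lam[a] = sum_g p_g * s(g, a),
# where s = +1 if Pauli g commutes with the axis Pauli a, -1 otherwise.
signs = np.array([[1, 1, 1, 1],    # I
                  [1, 1, -1, -1],  # X commutes with X, anticommutes with Y, Z
                  [1, -1, 1, -1],  # Y
                  [1, -1, -1, 1]]) # Z
lam = probs @ signs                 # [1, lx, ly, lz]

# Quasiprobabilities of the inverse channel N^{-1} = sum_g eta_g g(.)g
eta = signs.T @ (1 / lam) / 4
gamma = np.abs(eta).sum()           # QEM cost gamma_Q
q = np.abs(eta) / gamma             # sampling distribution over recovery Paulis

# Monte Carlo: state |0>, ideal gate = identity, observable Z.
# A net X or Y on |0> flips a Z-basis measurement outcome.
flips = np.array([0, 1, 1, 0])      # does (I, X, Y, Z) flip a Z measurement?
shots = 200_000
err = rng.choice(4, size=shots, p=probs)       # sampled noise Pauli
rec = rng.choice(4, size=shots, p=q)           # sampled recovery Pauli
outcome = 1 - 2 * ((flips[err] + flips[rec]) % 2)   # +/-1 measurement result
mitigated = gamma * np.sign(eta[rec]) * outcome
print(f"gamma_Q = {gamma:.4f}")
print(f"noisy     <Z> = {np.mean(1 - 2 * flips[err]):+.4f}")  # ~ 1 - 4p/3
print(f"mitigated <Z> = {mitigated.mean():+.4f}")             # ~ +1
```

The empirical spread of `mitigated` is roughly γ_Q times that of a raw outcome, which is exactly the γ_Q^2 variance amplification discussed above.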
In practice, we use probabilistic error cancellation for each gate in quantum circuits. The ideal process for the entire quantum circuit is described as ∏_{k=1}^{N_G} U_k. Denoting U_k = γ_Q^(k) Σ_{i_k} q_{i_k} sgn(η_{i_k}) B_{i_k} ∘ N_k ∘ U_k, we have

∏_{k=1}^{N_G} U_k = (∏_{k=1}^{N_G} γ_Q^(k)) Σ_{i_1 i_2 ··· i_{N_G}} (∏_{k=1}^{N_G} q_{i_k}) (∏_{k=1}^{N_G} sgn(η_{i_k})) ∏_{k=1}^{N_G} B_{i_k} ∘ N_k ∘ U_k.  (6)
From Eq. (6), we can see that, for each gate, a process B_{i_k} is generated with probability q_{i_k}, and the product of parities ∏_{k=1}^{N_G} sgn(η_{i_k}) is multiplied with the measurement results to obtain the outcome μ̂^eff. This procedure is repeated, and the product of the mean of the outcomes μ̂^eff and γ_Q^tot = ∏_{k=1}^{N_G} γ_Q^(k) approximates the correct expectation value. Note that here γ_Q^tot is the QEM cost for the entire quantum circuit. Let us assume that the cost for each gate is uniform and can be approximated as γ_Q^(k) = γ_Q = 1 + aε, with a and ε being a positive constant and the effective error rate, respectively. Now the QEM cost and the sampling overhead can be approximated as γ_Q^tot ≈ e^{aεN_G} = e^{(γ_Q−1)N_G} and (γ_Q^tot)^2, respectively, which increase exponentially with the mean number of error events in the quantum circuit, εN_G. Note that, for εN_G = O(1) and ε → 0, since ε^k N_G → 0 (k ≥ 2), the QEM cost can be exactly described as γ_Q^tot = e^{(γ_Q−1)N_G}.
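As a small numerical illustration (a = 2 and ε = 10^−4 are assumed values, not taken from the paper), the total QEM cost of Eq. (6) and its exponential approximation e^{(γ_Q−1)N_G} can be compared directly; they agree closely for small ε.

```python
import numpy as np

# Total QEM cost: gamma_tot = prod_k gamma_Q^(k), with a uniform per-gate
# cost gamma_Q = 1 + a*eps (a and eps are illustrative values).
a, eps = 2.0, 1e-4
gamma_gate = 1 + a * eps
for n_gates in [10**3, 10**4, 10**5]:
    gamma_tot = gamma_gate ** n_gates             # exact product
    approx = np.exp((gamma_gate - 1) * n_gates)   # e^{(gamma_Q - 1) N_G}
    overhead = gamma_tot ** 2                     # sampling overhead
    print(f"N_G={n_gates:>6}: gamma_tot={gamma_tot:.4g} "
          f"(approx {approx:.4g}), overhead={overhead:.4g}")
```

Note how the overhead stays modest while εN_G = O(1) but explodes once the mean number of error events grows, which is the regime-tuning argument made in the text.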
B. Fault-tolerant quantum computing
1. Stabilizer formalism
In the framework of FTQC, one prepares a redundant number of physical qubits and performs quantum computing in a code space defined as a subspace of the whole Hilbert space. By repetitively performing quantum error detection and correction, we can protect the logical qubits defined in the code space against physical errors. The state of the logical qubits is manipulated in a fault-tolerant manner with a set of logical operations.

The stabilizer formalism [3,32] is the most standard way to construct quantum error-correcting codes. Here, supposing that we construct k logical qubits with n physical qubits, a 2^k-dimensional code space C is specified by a subgroup of n-qubit Pauli operators called the stabilizer group. Let the n-qubit Pauli group be
G_n = {±1, ±i} × {I, X, Y, Z}^{⊗n},  (7)

where I is the identity operator and

X = (0 1; 1 0),  Y = (0 −i; i 0),  Z = (1 0; 0 −1)

are the Pauli operators. A set of Pauli operators S ⊂ G_n is called a stabilizer group if S is a commutative subgroup, the number of elements in S is 2^{n−k}, and −I ∉ S. We denote the set of n − k generators of a stabilizer group as G = (g_1, ..., g_{n−k}). The code space C is defined as the eigenspace with +1 eigenvalues for all the operators in the stabilizer group, i.e., C = {|ψ⟩ : ∀ s_i ∈ S, s_i|ψ⟩ = |ψ⟩}. In the code space, we can introduce a logical basis {|0_L⟩, |1_L⟩}^{⊗k} and logical Pauli operators {I_L, X_L, Y_L, Z_L}^{⊗k}. The code distance d is defined as the minimum number of physical qubits on which an arbitrary logical operator, except the logical identity I_L^{⊗k}, acts.
During a quantum computation, physical errors that occur in the encoded state are detected by using the (n − k) Pauli measurements P_s = (1/2)[I + (−1)^s g_i] for s ∈ {0, 1}. These measurements are called stabilizer measurements, and their binary outcomes s are called syndrome
values. The original state is restored by applying appropriate feedback operations that are estimated from the syndrome values. These stabilizer measurements are performed repeatedly during a computation. One repetition of the stabilizer measurements is called a code cycle of fault-tolerant quantum computing. If the effective error probability per physical qubit during a cycle is smaller than a certain threshold, we can estimate the Pauli operator that restores the original state with a failure probability that is exponentially small in the code distance d. Since the required number of physical qubits n increases polynomially with the code distance d in typical quantum error-correcting codes, we can exponentially decrease the error probability of logical qubits with a polynomial qubit overhead.
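As a minimal illustration of syndrome extraction (an illustrative toy code, not the surface code discussed later), the sketch below represents Pauli operators in binary symplectic form for the 3-qubit bit-flip repetition code, whose stabilizer generators are Z_1 Z_2 and Z_2 Z_3. Each single-qubit X error produces a distinct syndrome and is therefore correctable.

```python
import numpy as np

# Stabilizer generators of the 3-qubit bit-flip code in binary symplectic
# form (x-part | z-part): g1 = Z1 Z2, g2 = Z2 Z3.
gens = np.array([[0, 0, 0, 1, 1, 0],
                 [0, 0, 0, 0, 1, 1]])

def syndrome(err):
    """Syndrome bits: the symplectic product of the error with each
    generator is 1 exactly when the two Pauli operators anticommute."""
    x, z = err[:3], err[3:]
    gx, gz = gens[:, :3], gens[:, 3:]
    return (gx @ z + gz @ x) % 2

# Each single-qubit X error yields a distinct, nonzero syndrome.
for q in range(3):
    err = np.zeros(6, dtype=int)
    err[q] = 1
    print(f"X on qubit {q}: syndrome = {syndrome(err)}")
```

The three syndromes come out as [1 0], [1 1], and [0 1], so a decoder can map each syndrome back to a unique recovery Pauli; real decoders for surface codes do the same matching at much larger scale.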
2. Logical operations
We must not only correct physical errors but also update the logical quantum state to perform quantum computation. To this end, a universal set of logical operations should be performed in a fault-tolerant manner. According to the Solovay-Kitaev theorem [25,26], we can approximate arbitrary one- and two-qubit gates with a finite set of local operations. For example, the Hadamard gate H = (1/√2)(1 1; 1 −1), the controlled-NOT gate CNOT = |0⟩⟨0| ⊗ I + |1⟩⟨1| ⊗ X, and the T gate T = exp[i(π/8)Z] form a universal gate set. Several logical operations can be performed by transversally applying the same one- or two-qubit operations to the physical qubits. Since transversal operations increase the effective physical error rate per qubit during a cycle only by a constant factor, we can achieve transversal logical operations fault tolerantly. However, it is known that there is no stabilizer code for which the set of transversal gates is universal [33]. Thus, we need an additional technique to achieve fault-tolerant and universal quantum computing. The most promising solution is to create a quantum state called a magic state and perform non-transversal logical operations with gate teleportation [5]. For example, |A_L⟩ = T H |0_L⟩ = (1/√2)(e^{iπ/8}|0_L⟩ + e^{−iπ/8}|1_L⟩) is a typical magic state, and T-gate operations can be performed by consuming this state. This magic state encoded in a logical qubit can be constructed with a process called magic-state injection. While the infidelity of a magic state created by magic-state injection is generally larger than the logical error rate, we can create a high-fidelity magic state from several noisy magic states by using another quantum error-correcting code implemented on the logical space, a process called magic-state distillation. Since the application of T gates requires a longer time than the other operations, the number of T gates is the dominant factor affecting the computation time of FTQC.
Although we can estimate a Pauli operation for recovery from the syndrome values, we do not apply it immediately after estimation. Instead, we store the Pauli operations that should be applied to the physical qubits for recovery in a classical memory called the Pauli frame [5,34]. The stored operations are taken into account when logical measurements are performed; the outcome of a logical measurement is flipped according to the Pauli frame. A schematic figure is shown in Fig. 2. In the above construction of logical operations, the whole process, except for magic-state injection, consists only of Clifford operations and Pauli channels in the code space. Since a Pauli operator conjugated by a Clifford operator is also a Pauli operator, we can always track a recovery operator as a Pauli operator during a computation. In addition, when we apply a logical Pauli operator to a quantum state, we can perform it simply by updating the Pauli frame, since a logical Pauli operator is a transversal physical Pauli operation. As long as classical computers are reliable, this operation is effectively noiseless.

FIG. 2. Schematic figure of the Pauli frame. The recovery operations are not physically applied to quantum computers but rather are stored in the Pauli frame and efficiently updated after each Clifford gate operation. The measurement outcomes are flipped depending on the state of the Pauli frame.
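The frame-tracking idea can be sketched as follows (a hypothetical single-qubit example, dropping the signs that conjugation may introduce, since signs do not change which measurement outcomes get flipped): the pending recovery Pauli is conjugated through each Clifford gate, and a Z-basis outcome is flipped only if the final frame anticommutes with Z.

```python
# Conjugation table P -> C P C^dagger, up to sign, for C in {H, S}.
CONJ = {
    "H": {"I": "I", "X": "Z", "Y": "Y", "Z": "X"},
    "S": {"I": "I", "X": "Y", "Y": "X", "Z": "Z"},
}

def track_frame(frame, circuit):
    """Push the pending recovery Pauli through a list of Clifford gates."""
    for gate in circuit:
        frame = CONJ[gate][frame]
    return frame

def corrected_outcome(raw_outcome, frame):
    """A pending X or Y flips a Z-basis measurement outcome (0/1)."""
    return raw_outcome ^ (1 if frame in ("X", "Y") else 0)

frame = track_frame("X", ["H", "S"])   # X -(H)-> Z -(S)-> Z
print(frame)                            # prints "Z": no flip needed
print(corrected_outcome(0, frame))      # prints 0
```

No quantum operation is ever applied; only this classical record and the final outcome flip are needed, which is why Pauli recovery (and, later, Pauli-type QEM operations) is effectively noiseless.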
III. QUANTUM ERROR MITIGATION FOR FAULT-TOLERANT QUANTUM COMPUTING
In this section, we discuss how to integrate QEM into the FTQC architecture. Here, we consider two types of errors in FTQC: decoding errors, due to failures in the error estimation and the insufficiency of magic-state distillation, and approximation errors, due to the Solovay-Kitaev decomposition. In Sec. A, we explain how these errors in FTQC can be modeled. In Sec. B, we discuss how these errors can be canceled and evaluate their QEM costs. Probabilistic error cancellation requires the errors to be estimated in advance. In Sec. C, we also discuss the effect of estimation errors on probabilistic error cancellation and the characterization efficiency.
A. Errors in fault-tolerant quantum computing
1. Decoding error
Here, we describe the noise due to failures of error estimation in elementary logical operations, i.e., stabilizer measurements and magic-state distillation. The first obstacle to applying probabilistic error cancellation to FTQC is how to characterize an effective map of the noise due to failures of error estimation. If we suppose that the physical errors can be modeled as a stochastic physical Pauli map and assume that there are no errors on the ancillary qubits for syndrome measurements, we can define a logical noise map for decoding errors that is Markovian and a stochastic logical Pauli map. Yet, these assumptions do not hold in practice. Nevertheless, here we assume that we can define an effectively Markovian logical error map for each logical operation, and we also assume that this noise map is a stochastic logical Pauli map. It is known that, even if the noise is unitary, a noise map in the logical space of surface codes can be well approximated as stochastic Pauli noise when the code distance is sufficiently large [35]. Furthermore, the remaining coherent errors can be canceled by using pulse-optimization techniques. Thus, it is reasonable to suppose that the decoding errors due to failures of error estimation in surface codes are almost stochastic Pauli errors. In addition, we numerically verify that we can regard the decoding errors as Markovian errors even in the presence of measurement errors. See Appendix F for details. While we mainly describe and analyze the decoding errors in surface codes, a similar idea can be applied to the decoding errors due to insufficient magic-state distillation. As for the logical noise map on a prepared magic state due to insufficient magic-state distillation, we can twirl the noise map by logical Clifford operations, so it can also be assumed to be stochastic Pauli noise.
Under the above assumptions, we can describe a noise map for an l-qubit logical operation, N_dec, as the following stochastic Pauli noise:

N_dec(ρ) = Σ_{g ∈ {I_L,X_L,Y_L,Z_L}^{⊗l}} p_g g ρ g†,  (8)

where p_g ∈ ℝ, Σ_g p_g = 1, and p_g ≥ 0. The sum of the probabilities of the nonidentity logical operations is called the logical error probability p_dec, i.e., p_dec = Σ_{g ≠ I^{⊗l}} p_g. It is known that, when the physical error rate p is smaller than a value called the threshold p_th, the effective logical error probability decreases exponentially with respect to the code distance d. The effective logical error probability per syndrome-measurement cycle of surface codes, p_cyc, decreases as

p_cyc ≈ C_1 (C_2 p / p_th)^{(d+1)/2},  (9)

where C_1 and C_2 are constants [36]. While the constant values depend on the details of the error-correction scheme, C_1 ≈ 0.13 and C_2 ≈ 0.61 are expected in a typical construction of surface codes and noise model [36,37]. Suppose that a logical operation requires m cycles; then, the logical error probability for the logical operation can be approximated as

p_dec = 1 − (1 − p_cyc)^m ≈ m p_cyc.  (10)

Note that the number of cycles per logical gate increases at most linearly with the code distance d.
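Plugging representative numbers into Eqs. (9) and (10) gives a feel for how quickly p_dec falls with the code distance. The threshold value p_th = 10^−2, the physical error rate p = 10^−3, and m = d cycles per logical operation are assumptions made for illustration; only C_1 and C_2 come from the text.

```python
# Decoding error rate per logical operation, Eqs. (9) and (10).
C1, C2 = 0.13, 0.61   # constants quoted in the text [36,37]
pth, p = 1e-2, 1e-3   # assumed threshold and physical error rate

for d in [5, 7, 9, 11]:
    p_cyc = C1 * (C2 * p / pth) ** ((d + 1) / 2)   # Eq. (9)
    m = d                                          # assumed cycles per gate
    p_dec = 1 - (1 - p_cyc) ** m                   # Eq. (10), ~ m * p_cyc
    print(f"d={d:2d}: p_cyc={p_cyc:.2e}, p_dec={p_dec:.2e}")
```

Each increase of the distance by 2 multiplies p_cyc by C_2 p/p_th (here 0.061), which is the exponential suppression that QEM later trades against a constant sampling overhead.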
In order to apply probabilistic error cancellation, we need to know the logical error probabilities {p_g} in advance. While we can estimate {p_g} by using gate-set tomography in the logical space, the estimations are not exact. The effect of estimation errors is discussed in Sec. C, while the efficiency of our proposal, including noise characterization, is discussed in Appendix C.
2. Approximation error
Since we are only allowed to use a limited set of logical operations to achieve fault tolerance, we need to decompose an arbitrary unitary gate into a sequence of available gates. Any unitary operator can be decomposed into a product of CNOT gates and single-qubit gates. Thus, we need to approximate single-qubit gates with a given gate set to the desired accuracy. By using the improved Solovay-Kitaev algorithm [38], given a universal gate set such as {T, H, S} and a single-qubit gate U to be approximated, we can construct an approximated gate Ũ that satisfies ε = ||Ũ − U|| to an arbitrary accuracy ε as a sequence of the given gate set with length Õ[log(ε^−1)], with ||·|| being the operator norm. The error of the approximated map is given by

N_SK(ρ) = Ũ U† ρ U Ũ†.  (11)

Since this decomposition involves only single-qubit operations, this error channel can be efficiently and exactly evaluated in advance.
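A brute-force stand-in for the Solovay-Kitaev algorithm (not the algorithm of Ref. [38]; simply an exhaustive search over short {H, T} sequences, with the operator norm evaluated after aligning the global phase) illustrates how ε = ||Ũ − U|| is computed for a target rotation. The target angle θ = 0.3 is an arbitrary choice for illustration.

```python
import itertools
import numpy as np

# Gate set {H, T}; T = exp[i(pi/8)Z] as in the text.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
T = np.diag([np.exp(1j * np.pi / 8), np.exp(-1j * np.pi / 8)])
GATES = {"H": H, "T": T}

def dist(U, V):
    """Operator-norm distance ||U - V||, with V's global phase aligned."""
    tr = np.trace(U.conj().T @ V)
    phase = tr / abs(tr) if abs(tr) > 1e-12 else 1.0
    return np.linalg.norm(U - V / phase, 2)

theta = 0.3                                   # assumed target rotation angle
target = np.diag([np.exp(1j * theta / 2), np.exp(-1j * theta / 2)])

best_seq, best_eps = "", np.inf
for n in range(1, 9):                         # exhaustive search, length <= 8
    for seq in itertools.product("HT", repeat=n):
        V = np.eye(2)
        for g in seq:
            V = GATES[g] @ V
        eps = dist(target, V)
        if eps < best_eps:
            best_seq, best_eps = "".join(seq), eps
print(f"best sequence {best_seq!r}, eps = {best_eps:.3f}")
```

The Solovay-Kitaev algorithm replaces this exponential search with a recursive construction whose sequence length grows only polylogarithmically in 1/ε; the point here is just that ε, and hence the channel in Eq. (11), is computable purely classically.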
B. Quantum error mitigation for fault-tolerant quantum computing
1. Overview of our framework
Here, we show that decoding errors and approximation errors can be mitigated with probabilistic error cancellation. When we insert recovery operations for probabilistic error cancellation, it is assumed that the noise level of the recovery operations for QEM is much lower than that of the error-mitigated gates. In NISQ computing, for example, it is reasonable to assume that the error probabilities of two-qubit gates are much larger than those of single-qubit gates and measurements; therefore, the errors of two-qubit
gates can be mitigated by using single-qubit recovery operations. However, this is not a reasonable assumption in FTQC, since the operations that are noisy and time consuming differ from those of the NISQ architecture. More concretely, even Clifford operations involving only one logical qubit suffer from decoding errors.

Here, we show an architecture of FTQC that implements QEM with small overheads. The keys are two significant properties of the FTQC architecture: logical Pauli operations are error-free and instantaneous thanks to the Pauli frame, and the noise map of the decoding errors can be assumed to be stochastic Pauli noise. Thanks to these properties, we can mitigate errors in all the elementary logical operations simply by updating the Pauli frame. This means that error-mitigated Clifford operations and Pauli measurements are available for computation. Because they form a complete basis for mitigating arbitrary errors [22], we can also mitigate the approximation errors due to the Solovay-Kitaev decomposition. Since the approximation errors can be exactly known in advance, an unbiased estimator free from approximation errors can be obtained, as explained in Sec. 3.

To make our QEM procedure work, the accuracy and efficiency of the decoding-error estimation are vital. We show that the decoding errors can be estimated with gate-set tomography under an appropriate choice of the gauge, considering state-preparation and measurement errors. We also show that the cost of gate-set tomography is acceptable compared with the main computation of FTQC for estimating expectation values in Sec. C. In this section, we further show a refined gate-set tomography suited to our framework that significantly improves the estimation for logical Clifford gates.
2. Quantum error mitigation for decoding errors
We can express the inverse channel of the nonuniform depolarizing channel of Eq. (8) as a linear combination of Pauli operations:

N_dec^−1(ρ) = Σ_{g ∈ {I_L,X_L,Y_L,Z_L}^{⊗l}} η_g g ρ g†
            = γ_dec Σ_{g ∈ {I_L,X_L,Y_L,Z_L}^{⊗l}} q_g sgn(η_g) g ρ g†.  (12)

Refer to Appendix B for a concrete expression of each coefficient η_g, γ_dec, and q_g. Thus, we can suppress the errors by applying probabilistic error cancellation only with Pauli operators after the decoding processes. The QEM cost for decoding errors in the entire circuit can be expressed as γ_dec^tot = ∏_{k=1}^{N_dec} γ_dec^(k), where N_dec is the number of logical gates and γ_dec^(k) is the QEM cost of the kth operation.
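For a single logical qubit (l = 1), the coefficients in Eq. (12) can be computed directly from the Pauli probabilities {p_g}. The sketch below (the probabilities are illustrative, not from the paper) also checks that γ_dec ≈ 1 + 2p_dec to first order in the logical error probability.

```python
import numpy as np

# Coefficients of Eq. (12) for a single logical qubit (l = 1), computed
# from Pauli error probabilities (p_I, p_X, p_Y, p_Z).
def inverse_quasiprobs(probs):
    # Hadamard-type transform between Pauli probabilities and the
    # Pauli-transfer eigenvalues (lam_I, lam_X, lam_Y, lam_Z).
    W = np.array([[1, 1, 1, 1],
                  [1, 1, -1, -1],
                  [1, -1, 1, -1],
                  [1, -1, -1, 1]])
    lam = W @ probs                 # lam[0] = 1 for a trace-preserving map
    eta = W.T @ (1 / lam) / 4       # N^{-1} = sum_g eta_g g(.)g^dagger
    gamma = np.abs(eta).sum()       # gamma_dec
    q = np.abs(eta) / gamma         # sampling distribution q_g
    return eta, gamma, q

probs = np.array([1 - 3e-3, 1.5e-3, 0.5e-3, 1.0e-3])   # assumed p_g
eta, gamma, q = inverse_quasiprobs(probs)
p_dec = 1 - probs[0]
print(f"gamma_dec = {gamma:.6f}, 1 + 2*p_dec = {1 + 2 * p_dec:.6f}")
```

Since every recovery operation drawn from q is a Pauli, the whole mitigation step reduces to Pauli-frame bookkeeping, as described next.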
FIG. 3. Schematic figure of the Pauli frame incorporating QEM. If a QEM recovery operation is a Pauli operation, it is not directly applied to the quantum computer; rather, the Pauli frame is updated instead. The parity is also updated in accordance with the generated recovery operations of QEM. Here, we denote the parity corresponding to the QEM recovery operation as p_a in the figure. If a QEM recovery operation is not a Pauli operator, it is performed physically. Measurement outcomes are then postprocessed depending on the Pauli frame, the parity, and the QEM cost.
Note that probabilistic error cancellation usually applies the recovery operations of QEM immediately after the noisy gates [21,22]; however, because we perform only logical Pauli operations as the recovery operations for decoding errors, they can be done simply by updating the Pauli frame instead of directly applying them after noisy gates. Finally, the measurement result is postprocessed according to the state of the Pauli frame, the parity corresponding to the applied recovery operations, and the QEM cost. Thus, unlike in probabilistic error cancellation for NISQ devices, the logical noise due to decoding errors can be mitigated without any additional noise due to the recovery operation. A schematic figure is shown in Fig. 3.
Note that the information on the QEM cost and the parity is used only when the final measurement result is obtained; the outcome of a destructive logical Pauli measurement is flipped depending only on the state of the Pauli frame. Whether we can mitigate decoding errors of complicated logical operations such as magic-state preparation, gate teleportation, and adaptive Clifford gates by simply updating the Pauli frame is not trivial; therefore, we provide a concrete procedure for actual devices and Pauli frames in Appendix E.
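The bookkeeping described above can be sketched as follows (a minimal single-logical-qubit Monte Carlo sketch with a destructive $Z$-basis measurement, not the paper's device-level procedure; all names are illustrative). Sampled Pauli recovery operations never touch the hardware: an $X$ or $Y$ recovery merely flips the recorded measurement outcome through the Pauli frame, and each shot is reweighted by the accumulated sign and the QEM cost $\gamma$.

```python
import random

PAULIS = ("I", "X", "Y", "Z")

def mitigated_estimate(run_noisy_circuit, eta, shots=100000, rng=random):
    """Probabilistic error cancellation with a Pauli frame for a single
    logical qubit measured destructively in the Z basis.

    run_noisy_circuit() -> raw measurement outcome in {+1, -1};
    eta: dict mapping "I", "X", "Y", "Z" to the quasi-probabilities
    eta_g of Eq. (12).
    """
    gamma = sum(abs(v) for v in eta.values())
    q = {g: abs(v) / gamma for g, v in eta.items()}
    total = 0.0
    for _ in range(shots):
        # Sample a recovery Pauli with probability q_g (the fallback guards
        # against floating-point rounding in the cumulative sum).
        r, acc, recovery = rng.random(), 0.0, PAULIS[-1]
        for g in PAULIS[:-1]:
            acc += q[g]
            if r < acc:
                recovery = g
                break
        sign = 1.0 if eta[recovery] >= 0 else -1.0
        outcome = run_noisy_circuit()
        # Pauli-frame rule: the recovery is never applied physically; an X
        # or Y in the frame anticommutes with the measured Z, flipping the
        # outcome, while I or Z leaves it unchanged.
        if recovery in ("X", "Y"):
            outcome = -outcome
        # Reweight by the sign of eta_g and by the QEM cost gamma.
        total += sign * gamma * outcome
    return total / shots
```

Averaging the reweighted outcomes reproduces the noiseless expectation value, with a sampling variance enlarged by roughly $\gamma^2$.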
In the case of decoding errors in surface codes, by approximating the QEM cost to the first order of the logical
error, we have

$$\gamma_{\mathrm{dec}} \approx 1 + 2p_{\mathrm{dec}}. \quad (13)$$
Refer to Appendix B for details. Under the assumption that the logical error rate is the same for all the logical operations and $p_{\mathrm{dec}}N_{\mathrm{dec}} = O(1)$ with $p_{\mathrm{dec}} \to +0$, the QEM cost $\gamma_{\mathrm{dec}}^{\mathrm{tot}}$ for the entire quantum circuit can be shown to be exactly equal to $e^{2p_{\mathrm{dec}}N_{\mathrm{dec}}}$ on the basis of the argument in Sec. A. Thus, by using Eqs. (9) and (10), we obtain

$$\gamma_{\mathrm{dec}} - 1 = 2mC_1\left(\frac{C_2\,p}{p_{\mathrm{th}}}\right)^{(d+1)/2}, \quad (14)$$

which results in the total QEM sampling overhead

$$\Gamma_{\mathrm{L}}^{\mathrm{tot}} = e^{2(\gamma_{\mathrm{dec}}-1)N_{\mathrm{dec}}} = \exp\left[4mC_1 N_{\mathrm{dec}}\left(\frac{C_2\,p}{p_{\mathrm{th}}}\right)^{(d+1)/2}\right]. \quad (15)$$
Notice that Eq. (15) clearly shows a trade-off relationship between the sampling overhead and the code distance, i.e., the number of physical qubits.
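To make the trade-off of Eq. (15) concrete, the following sketch evaluates the total sampling overhead for illustrative parameter values (the constants $m$, $C_1$, and $C_2$ are placeholders, not values calibrated in this paper):

```python
import math

def sampling_overhead(d, N_dec, p, p_th, m=1, C1=1.0, C2=1.0):
    """Total QEM sampling overhead of Eq. (15) for a distance-d code:
    Gamma_L = exp[4 m C1 N_dec (C2 p / p_th)^((d + 1) / 2)]."""
    return math.exp(4 * m * C1 * N_dec * (C2 * p / p_th) ** ((d + 1) / 2))

# Illustrative regime: p = 0.1 * p_th and 10^6 logical operations.
for d in (11, 13, 15):
    print(d, sampling_overhead(d, N_dec=1e6, p=1e-3, p_th=1e-2))
```

With these placeholder values, moving from $d = 13$ to $d = 11$ trades a roughly 1.5-fold overhead for a roughly 55-fold one, while larger distances make the overhead negligible.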
Finally, although we mainly focus on the surface-code architecture in this paper, our scheme is also available for typical stabilizer-based error-correcting codes such as color codes [39] and concatenated Steane or Shor codes [3,40,41] because we assume only basic properties of stabilizer codes, i.e., Pauli recovery operations are estimated from syndrome measurements and non-Clifford operations are applied by generating and consuming magic states. Therefore, we can choose suitable quantum error-correcting codes according to the hardware connectivity.
3. Quantum error mitigation for approximation errors
Unlike decoding errors due to the failure of error correction, we cannot describe errors due to the Solovay-Kitaev decomposition as stochastic Pauli errors. Nevertheless, we can still apply probabilistic error cancellation with negligible overheads. Denoting $\mathcal{N}_{\mathrm{SK}}(\rho) = \tilde{U}U^{\dagger}\rho(\tilde{U}U^{\dagger})^{\dagger}$, we invert this approximation error by

$$\mathcal{N}_{\mathrm{SK}}^{-1} = \sum_i \eta_i \mathcal{B}_i^{(L)} = \gamma_{\mathrm{SK}} \sum_i q_i\, \mathrm{sgn}(\eta_i)\, \mathcal{B}_i^{(L)}, \quad (16)$$
where $\{\mathcal{B}_i^{(L)}\}$ denotes recovery operations in the logical space. Note that we can represent any map as a linear combination of Clifford operations and Pauli channels [22], and thus, we do not need T gates for mitigating approximation errors. Recovery operations are randomly chosen and applied immediately after each single-qubit logical operation if they are not Pauli operations. In the case of Pauli operations, we can again use the Pauli frame, and physical operations on quantum computers are not required, in a similar vein to QEM for decoding errors. Since a single-qubit logical unitary operation consists of several repetitions of Clifford gates and T-gate teleportation, the insertion of the recovery operation for probabilistic error cancellation negligibly increases the length of the quantum circuit. In the numerical simulations described in the next section, we verify that the QEM costs can be approximated with the following equation:

$$\gamma_{\mathrm{SK}} - 1 = \beta_1 e^{-\beta_2 N_T}, \quad (17)$$

where $\beta_1$ and $\beta_2$ are constants dependent on the quantum gate and $N_T$ is the number of available T gates.
The QEM cost due to approximation errors can also be represented as $\gamma_{\mathrm{SK}}^{\mathrm{tot}} = \prod_{k=1}^{N_{\mathrm{SK}}} \gamma_{\mathrm{SK}}^{(k)}$, where $N_{\mathrm{SK}}$ is the total number of recovery operations for mitigating approximation errors in the quantum circuit, with the cost $\gamma_{\mathrm{SK}}^{(k)}$ corresponding to the $k$th recovery operation. By assuming that the cost does not depend on gates, we have the following QEM sampling overhead:

$$\Gamma_{\mathrm{SK}}^{\mathrm{tot}} \approx \exp\left(2\beta_1 N_{\mathrm{SK}} e^{-\beta_2 N_T}\right). \quad (18)$$
This shows there is a trade-off relationship between the sampling overhead and the number of available T gates.
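Similarly, Eq. (18) can be explored numerically; in this sketch, $\beta_1$, $\beta_2$, and the gate counts are placeholder values chosen only for illustration:

```python
import math

def sk_overhead(N_SK, N_T, beta1=0.5, beta2=0.3):
    """Total sampling overhead of Eq. (18) for Solovay-Kitaev approximation
    errors: Gamma_SK ~ exp(2 beta1 N_SK exp(-beta2 N_T))."""
    return math.exp(2 * beta1 * N_SK * math.exp(-beta2 * N_T))

# A modest increase in the available T-gate budget N_T suppresses the
# overhead double-exponentially fast.
for n_t in (20, 40, 60):
    print(n_t, sk_overhead(N_SK=1e4, N_T=n_t))
```

Because $N_T$ enters through $e^{-\beta_2 N_T}$ inside the exponential, even a small increase in the T-gate budget collapses the overhead toward unity.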
C. Effect of estimation errors of the noise map
1. Effect of estimation errors on expectation values
While approximation errors can be exactly determined in advance, decoding errors have to be characterized. Since the decoding error probabilities are small, it is unavoidable that the characterization will contain finite and non-negligible estimation errors. Thus, we must account for QEM with estimation errors and the efficiency of the characterization of decoding errors.
Let us discuss how estimation errors affect the performance of QEM. Given a perfect characterization of the noise model $\mathcal{N}_k$ for the $k$th gate, we can realize the inverse operation $\mathcal{N}_k^{-1}$ with probabilistic error cancellation to achieve $\mathcal{N}_k^{-1}\mathcal{N}_k = \mathcal{I}$. If we obtain an incorrect estimation $\mathcal{N}'_k \neq \mathcal{N}_k$ for the error process, it leads to an estimation error $\Delta\mathcal{N}_k \equiv \mathcal{N}'^{-1}_k\mathcal{N}_k \neq \mathcal{I}$.

Now, denoting the ideal process of the $k$th gate as $\mathcal{U}_k$, the difference between the error-mitigated process and the error-free process for the entire quantum circuit can be described by the diamond norm:

$$\left\| \prod_{k=1}^{N_G} \Delta\mathcal{N}_k\,\mathcal{U}_k - \prod_{k=1}^{N_G} \mathcal{U}_k \right\|_{\diamond} \leq \varepsilon' N_G, \quad (19)$$

where we use the fact that the diamond norm is subadditive and we denote $\varepsilon' = \max_k \|\Delta\mathcal{N}_k - \mathcal{I}\|_{\diamond}$. Similarly,
the discrepancy of the noisy and ideal processes can be upper bounded as $\left\|\prod_{k=1}^{N_G}\mathcal{N}_k\,\mathcal{U}_k - \prod_{k=1}^{N_G}\mathcal{U}_k\right\|_{\diamond} \leq \varepsilon N_G$, where $\varepsilon = \max_k \|\mathcal{N}_k - \mathcal{I}\|_{\diamond}$. Because the deviation of the expectation values of an observable $M$ for two processes $\mathcal{E}_1$ and $\mathcal{E}_2$ with the input state $\rho$ can be described as $\delta M = \mathrm{Tr}\{M[\mathcal{E}_1(\rho) - \mathcal{E}_2(\rho)]\} \leq \|M\|\,\|\mathcal{E}_1 - \mathcal{E}_2\|_{\diamond}$ [42], where $\|\cdot\|$ is an operator norm, we have $\delta M_{\mathrm{QEM}} \leq \|M\|\varepsilon' N_G$ and $\delta M_{\mathrm{noise}} \leq \|M\|\varepsilon N_G$. Here, $\delta M_{\mathrm{QEM}}$ and $\delta M_{\mathrm{noise}}$ are the deviations of the observable with and without error mitigation.

Thus, we can see that QEM is beneficial when we can achieve $r < 1$ for

$$\varepsilon' = r\varepsilon. \quad (20)$$

Note that this discussion does not include sampling errors; i.e., $\delta M$ is the error of the expectation value given infinite samples.
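The bounds above admit a direct numerical reading (a minimal sketch with made-up values): QEM reduces the worst-case bias from $\|M\|\varepsilon N_G$ to $\|M\|\varepsilon' N_G = r\|M\|\varepsilon N_G$, so any characterization achieving $r < 1$ reduces the bias, at the price of the sampling overhead discussed earlier.

```python
def bias_bounds(M_norm, eps, N_G, r):
    """Worst-case biases following Eqs. (19) and (20):
    delta_M_noise <= ||M|| * eps * N_G        (no mitigation)
    delta_M_QEM   <= ||M|| * (r * eps) * N_G  (mitigation, eps' = r * eps)."""
    return M_norm * eps * N_G, M_norm * r * eps * N_G

# Illustrative: eps * N_G = O(1) regime with the noise map known to r = 0.1.
noise_bias, qem_bias = bias_bounds(M_norm=1.0, eps=1e-6, N_G=1e6, r=0.1)
print(noise_bias, qem_bias)
```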
2. Efficiency of characterization of decoding errors
Regarding causes of model-estimation errors, when we use gate-set tomography to characterize the noise model for decoding errors, we need to consider state-preparation and measurement (SPAM) errors and the finite statistical error arising from an insufficient number of samples. It has been shown that the effect of SPAM errors can be eliminated in the case of probabilistic error cancellation based on gate-set tomography [22]. While a general choice of the gauge is not compatible with the Pauli frame, we can modify the scheme of gate-set tomography so that this method is compatible with QEM with the Pauli frame. Refer to Appendix C for details.
To achieve an accuracy $r$ given in Eq. (20), we need to perform $N_{\mathrm{GST}} = O[(r\varepsilon)^{-2}]$ samplings for gate-set tomography [28,43]. Here, we show this efficiency is acceptable compared with the main part of FTQC; i.e., the time required for gate-set tomography corresponds to $O(r^{-2}n_q N_G)$ runs of the whole quantum logical circuit to obtain expectation values, where $n_q$ is the number of logical qubits. Let the time for a single run of the logical circuit of FTQC be $\tau$. The depth of the logical quantum circuit is estimated as $O(N_G n_q^{-1})$, and the time per gate can be roughly approximated as $\tau_{\mathrm{gate}} = O(\tau n_q N_G^{-1})$. Then, the time for gate-set tomography can be estimated as $\tau_{\mathrm{GST}} = O(\tau_{\mathrm{gate}} N_{\mathrm{GST}}) = O[\tau N_G^{-1} n_q (r\varepsilon)^{-2}]$. In a situation where QEM is useful, we have $\varepsilon N_G = O(1)$ [44]. Thus, we can conclude that to use QEM to decrease the logical error rate $p_{\mathrm{dec}}$ to $rp_{\mathrm{dec}}$, we need gate-set tomography as a precomputation that takes $\tau_{\mathrm{GST}} = O(\tau N_G n_q r^{-2})$, which is $\tau_{\mathrm{GST}}/\tau = O(N_G n_q r^{-2})$ times longer than a single circuit run of FTQC. The numbers of logical gates $N_G$ and logical qubits $n_q$ are expected to grow polynomially with the problem size, and FTQC circuits will be repeated on the order of $O(r^{-2})$ to make the statistical fluctuation of expectation values smaller than the reduced bias. Accordingly, while the estimation costs of the noise map cause another overhead to FTQC depending on the required accuracy, it is performed in a time that grows polynomially with the problem size and without requiring additional physical qubits. We remark that when we assume the noise properties of the quantum devices are uniform, we can perform the sampling for gate-set tomography in parallel. If we use all the logical qubits for characterization, the time for gate-set tomography is reduced to $\tau_{\mathrm{GST}} = O(\tau N_G r^{-2})$. Note that, in the scenario that we can fully parallelize the sampling procedure, i.e., when we have $O[(r\varepsilon)^{-2}]$ distinct uniform quantum gates, we have $\tau_{\mathrm{GST}} = O(\tau N_G^{-1})$.
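The scaling arguments above can be bundled into a small calculator (a sketch; constant factors are dropped and the regime $\varepsilon N_G = O(1)$ is assumed, as in the text; the function name and example values are illustrative):

```python
def gst_runs_relative_to_ftqc(N_G, n_q, r, parallel_over_qubits=False):
    """Gate-set tomography precomputation cost tau_GST / tau as a number of
    equivalent full-circuit runs (constant factors dropped), assuming the
    eps * N_G = O(1) regime: O(N_G n_q r^-2), or O(N_G r^-2) when all n_q
    logical qubits are characterized in parallel."""
    runs = N_G / r ** 2
    if not parallel_over_qubits:
        runs *= n_q
    return runs

# Illustrative: 10^6 logical gates, 100 logical qubits, target r = 0.1.
print(gst_runs_relative_to_ftqc(1e6, 100, 0.1))
print(gst_runs_relative_to_ftqc(1e6, 100, 0.1, parallel_over_qubits=True))
```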
To further make the characterization of noise more efficient, we propose an improved gate-set tomography for decoding errors of the Clifford process that is fast and compatible with the Pauli frame. See Appendix C for the details of this scheme. The number of measurements $N_{\mathrm{GST}}$ is reduced to $N_{\mathrm{GST}} = O(r^{-2}\varepsilon^{-1})$, which makes the cost of precomputation $O(n_q r^{-2})$. Thus, as long as $r$ is not too small, the time for characterization is expected to be relatively short. While our efficient gate-set tomography cannot be applied to the characterization of the T-gate preparation, several ways to reduce the costs for estimating errors of T gates can be considered. Since the error of the logical T gate depends on the physical T gate, and the process of injection and distillation is constructed from a circuit with few T gates, there may be an efficient way to numerically estimate the noise of the logical T gate from the characterization of the physical T gate and efficient simulation of quantum circuits dominated by Clifford gates [45]. There may also be a way to mitigate T-gate errors by temporally expanding the code distance or increasing the distillation depth for the T gate. The cost of gate-set tomography might also be reduced by utilizing long-sequence GST [43], i.e., repeating several T gates to amplify a small error rate to a large value. Reference [46] shows that if decoding errors of logical Clifford gates are negligible, one can reliably twirl the noise of the T gate and perform efficient process tomography on it by repeating T gates. Nevertheless, it is still an open problem whether there exists a more efficient gate-set tomography on the logical space with imperfect logical Clifford gates.
3. Effective increase in code distance by quantum error mitigation under estimation errors
We can regard QEM as effectively increasing the code distance. Suppose that we can effectively achieve an $r$ times smaller logical error rate $p_{\mathrm{eff}} = rp_{\mathrm{dec}}$ via QEM. Since the logical error rate is roughly approximated with the code distance as $p_{\mathrm{dec}}(d) = p(p/p_{\mathrm{th}})^{(d-1)/2}$, QEM effectively achieves a larger code distance $d'$ where $p_{\mathrm{eff}} = p_{\mathrm{dec}}(d')$ without increasing the number of physical qubits. The effective increase in the code distance via QEM can be
derived as

$$d' - d = \frac{2\ln r}{\ln(p/p_{\mathrm{th}})}. \quad (21)$$
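Equation (21) can be checked numerically (a minimal sketch with illustrative values of $r$, $p$, and $p_{\mathrm{th}}$):

```python
import math

def effective_distance_gain(r, p, p_th):
    """Effective code-distance increase d' - d = 2 ln(r) / ln(p / p_th)
    bought by an r-fold logical-error-rate reduction, per Eq. (21)."""
    return 2 * math.log(r) / math.log(p / p_th)

# A tenfold error-rate reduction at p = 0.1 * p_th is worth two units
# of code distance.
print(effective_distance_gain(r=0.1, p=1e-3, p_th=1e-2))
```

Setting $r = (p/p_{\mathrm{th}})^x$ indeed returns a gain of $2x$.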
Therefore, by setting $r = (p/p_{\mathrm{th}})^x$, we can effectively increase the code distance by $2x$. Note that, as discussed in the previous sections, we need $\exp[O(N_{\mathrm{dec}}p_{\mathrm{dec}})] = \exp[O(1)]$ times more repetitions to achieve the same precision as the error-free case. It