PRX QUANTUM 3, 010345 (2022)
Quantum Error Mitigation as a Universal Error Reduction Technique:
Applications from the NISQ to the Fault-Tolerant Quantum Computing Eras
Yasunari Suzuki,1,2,*,§ Suguru Endo,1,2,†,§ Keisuke Fujii,3,4,5 and Yuuki Tokunaga1
1NTT Computer and Data Science Laboratories, NTT Corporation, Musashino 180-8585, Japan
2JST, PRESTO, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan
3Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka,
Osaka 560-8531, Japan
4Center for Quantum Information and Quantum Biology, Institute for Open and Transdisciplinary Research
Initiatives, Osaka University, Japan
5Center for Emergent Matter Science, RIKEN, Wako, Saitama 351-0198, Japan
(Received 19 March 2021; revised 15 October 2021; accepted 3 January 2022; published 18 March 2022)
In the early years of fault-tolerant quantum computing (FTQC), it is expected that the available code
distance and the number of magic states will be restricted due to the limited scalability of quantum devices
and the insufficient computational power of classical decoding units. Here, we integrate quantum error
correction and quantum error mitigation into an efficient FTQC architecture that effectively increases the
code distance and T-gate count at the cost of constant sampling overheads in a wide range of quantum
computing regimes. For example, while we need 10^4 to 10^10 logical operations for demonstrating quantum
advantages from optimistic and pessimistic points of view, we show that we can reduce the required
number of physical qubits by 80% and 45% in each regime. From another perspective, when the achievable
code distance is up to about 11, our scheme allows executing 10^3 times more logical operations. This
scheme will dramatically alleviate the required computational overheads and hasten the arrival of the
FTQC era.
DOI: 10.1103/PRXQuantum.3.010345
Quantum computers are believed to be capable of imple-
menting several tasks such as factoring and Hamiltonian
simulations, in exponentially smaller computational times
than those of classical computers [1,2]. However, quantum
systems generally interact with their environments, which
leads to physical errors in the system that may destroy
their quantum advantages. Since the physical error rates
of quantum computers are still much higher than those
of classical computers, it is vital to suppress these errors.
As a solution, fault-tolerant quantum computing (FTQC)
using quantum error-correcting codes has been studied [37]. The long-term FTQC allows executing conventional quantum algorithms such as Hamiltonian simulation algorithms [8]. According to the current state-of-the-art resource estimations [9,10], the logical quantum operation count will be on the order of 10^10 to observe clear quantum advantages based on computational-complexity arguments.

§Y.S. and S.E. contributed to this work equally.

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.
Towards the realization of the long-term FTQC, we
experience several intermediate regimes as shown in Fig. 1
because high-level encoding is not allowed due to restric-
tions of quantum resources such as qubit and magic state
count [5,7]. Since quantum error correction (QEC) requires
massive classical computation for repetitive error estima-
tions, the available code distance would also be strongly
limited in the near future [15–17]. As quantum technolo-
gies become mature, computational quantum supremacy
[13] will be achieved in the logical space. We refer to
the intermediate regime from the realization of logical
quantum supremacy to the demonstration of long-term
applications as an early FTQC regime. The number of
physical qubits will go beyond one thousand in this region,
and we anticipate that more than about 10^4 reliable logical operations on 10^2 logical qubits are available. Even at
the beginning of the early FTQC regime, we may observe
[Figure 1 graphic: logical error rates (10^{-2} down to 10^{-12}) versus the number of logical quantum operations (10 to 10^{12} operations); regions labeled "Classically tractable" and "Long-term FTQC"; annotations mark the point from which a speedup with heuristic applications may be observed, the point at which a theoretically verified speedup is achieved, and the hardware requirement for long-term quantum algorithms.]
FIG. 1. Schematic picture representing the transitional period from the classically tractable FTQC regime towards the realization of
the long-term FTQC. We are now in the classically tractable FTQC era due to the lack of physical qubits [11,12]. In the figure, the
purple line indicates the hardware requirement for performing classically intractable tasks with a realistic time whereas the blue line
corresponds to the requirement for demonstrating quantum advantages with conventional long-term quantum algorithms. To estimate
these lines, we refer to the quantum supremacy experiments [13] and the existing state-of-the-art resource estimation [9,10,14]. The
early FTQC regime is defined as a region between these lines. In the main text, we assume that the number of error events during FTQC is suppressed to N_e = 10^{-3}, which is shown as the dotted black line. Our technique allows for FTQCs with a number of error events on the order of unity, N_e ~ 1, which is shown as a solid black line, to execute applications that originally require a
much smaller error-event count. For example, at the beginning and the end of the early FTQC regime, our technique allows simulating
applications (white and yellow circles with black rims) with the relaxed hardware requirement (white and yellow circles with red rims).
a quantum speedup with heuristic quantum algorithms,
for example, with the variational quantum eigensolver.
In this paper, to realize efficient and high-accuracy quan-
tum computation in the early FTQC era, we propose a
novel framework of FTQC, where QEC and quantum
error mitigation (QEM) are combined on an equal foot-
ing. While QEM has been considered to be an alternative
error-minimization technique for noisy intermediate-scale
quantum (NISQ) devices due to its low hardware over-
head at the expense of the sampling cost, we show that,
by integrating probabilistic error cancellation [21,22] into
the FTQC framework, we can mitigate all the dominant
types of errors in the logical space. We also note that
our scheme can efficiently mitigate Pauli errors by virtu-
ally updating the quantum states with a classical memory
called the Pauli frame [5]. In the conventional QEM for-
malism, the sampling cost of QEM increases exponentially
with the number of physical error events [23,24]. There-
fore, the sampling overheads of QEM become unrealistic
in NISQ computing when the number of physical oper-
ations increases for a fixed error rate per quantum gate;
and the number of error events that QEM can efficiently
suppress is limited to the order of unity. In our frame-
work, the sampling cost of QEM increases exponentially
with the number of logical error events in the encoded
space. Note that we can tune the number of logical error
events by adjusting several parameters such as the code
distance, distillation levels, and precision of approxima-
tions for Solovay-Kitaev decomposition. Thus, it is highly
likely that we can find regions where the QEM techniques
are the most effective, i.e., the number of logical error
events is of the order of unity. Accordingly, we can relax the
hardware requirement with constant sampling overheads.
Even after the scalable FTQC is realized, taking QEM into
account, we can optimize quantum computation by allo-
cating computation resources at will to perform even more
efficient quantum computing.
We need to overcome several fundamental difficul-
ties for applying QEM in the logical space because the
costs and restrictions of logical operations and dominant
sources of errors are different from the NISQ formalism.
We resolve these difficulties one by one. For example, solutions to major problems
are as follows. In FTQC, logical Clifford operations and
Pauli measurements can be efficiently applied while non-
Clifford operations are costly because they involve a number
of T-gate injection, distillation, and teleportation proce-
dures [5,7]. These logical operations are affected by three
types of logical errors: logical errors in each elemen-
tary gate operation due to restricted code distances, noise
in non-Clifford logical gates deriving from shortage of
magic state distillation processes, and errors induced in the
Solovay-Kitaev decomposition [25,26]. We call the first
two logical errors decoding errors and the last one approx-
imation errors. We discuss what types of errors are present
when implementing logical operations, and provide a hier-
archical way to mitigate noisy and costly operations with
clean and less costly ones. To detect and correct phys-
ical errors during computation, we store the estimated
errors in the Pauli frame instead of physically applying
recovery operations [5]. This means that actual physical
states are almost never in the code space. We provide con-
crete procedures for a universal set of logical operations
incorporating QEM, which are compatible with the Pauli
frame. To apply probabilistic error cancellation, we need
a good characterization of the noise model to construct
QEM operations. We show that decoding errors can be
efficiently characterized with gate-set tomography [27,28]
on the code space. Note that the approximation errors of
the Solovay-Kitaev algorithm can be characterized effi-
ciently on classical computers. Finally, while probabilistic
error cancellation is a QEM technique to mitigate errors
in the algorithms for calculating the expectation values,
many FTQC algorithms are sampling algorithms using
the phase estimation [9,10,29]. We show that probabilistic
error cancellation is compatible with the phase estimation
algorithm. See Appendix H for details.
We perform resource estimation of FTQC under real-
istic scenarios with and without QEM, and we show that
our scheme can dramatically alleviate the required com-
putational overheads in FTQC. We assume that the mean number of logical error events N_e is required to reach N_e = 10^{-3}, and the sampling overhead by QEM is restricted to a reasonable level, i.e., within 10^2 times greater samples for achieving a certain accuracy. We expect at least 10^4 logical operations are required to demonstrate classically intractable applications. In this case, the required
number of qubits is reduced to approximately one fifth
with QEM compared to the original qubit count. We also
expect that 10^10 logical operations are at least necessary to
perform conventional long-term applications. The required
number of qubits is reduced to 55% in this regime. From
another perspective, our scheme can be used for increas-
ing the number of available logical operations when the
available code distance is strongly restricted. The lifetime
of current superconducting qubits is up to about 1 ms, and
a cycle of error estimations during FTQC must be suffi-
ciently faster than the lifetime, i.e., about 1 μs [15,30]. To
cope with this strong restriction, an efficient implemen-
tation of classical error-decoding architectures has been
studied. According to the recent state-of-the-art propos-
als [15–17], the available code distance would be limited
up to about 11 in the near future even with simplified
decoding algorithms. When the available code distance
is limited up to 11, our scheme enables 10^3 times more
logical operations with the same hardware requirement.
Thus, our technique can clearly accelerate the realization
of applications in early and long-term FTQC regimes. This
improvement is illustrated by red arrows in Fig. 1. It is also
worth noting that, to the best of our knowledge, these are
the first examples where the performance of useful quan-
tum algorithms with clear quantum advantages is enhanced
via QEM under realistic conditions, whereas QEM has so far been investigated for near-term heuristic quantum algorithms dependent on numerical optimization.
This paper is organized as follows. In Sec. II, we review
probabilistic error cancellation and the architecture of
fault-tolerant quantum computing. In Sec. III, we describe
how to evaluate decoding errors and approximation errors.
Then we show our novel FTQC architecture with an ana-
lytical argument of the cost of QEM and explain the effect
of model estimation errors. In Sec. IV, we numerically ana-
lyze the sampling cost of QEM for decoding errors and
approximation errors and demonstrate that we can effec-
tively increase the code distance and the number of T
gates via QEM even when there are finite estimation errors.
Finally, we conclude our paper with a discussion in Sec. V.
A. Quantum error mitigation and probabilistic error cancellation
Quantum processors are affected by a number of phys-
ical noise sources, which should be mitigated to obtain
correct results. Here, for simplicity, we assume that the
gate errors are Markovian, i.e., the noise process $\mathcal{N}$ for a gate is totally independent of other gate errors. In this case, we have

$\rho_{\rm out} = \mathcal{N}_{N_G} \circ \mathcal{U}_{N_G} \circ \mathcal{N}_{N_G-1} \circ \mathcal{U}_{N_G-1} \circ \cdots \circ \mathcal{N}_1 \circ \mathcal{U}_1(\rho_{\rm in}), \quad (1)$

where $\rho_{\rm out}$ and $\rho_{\rm in}$ are the output and input quantum states, $\mathcal{U}_k$ and $\mathcal{N}_k$ denote the ideal and noisy part of the process of the $k$th gate, and $N_G$ is the number of gates. To ensure correct computations, it is necessary to mitigate the effect of $\mathcal{N}_k$ $(k = 1, 2, \ldots, N_G)$ and obtain

$\rho_{\rm out}^{\rm ideal} = \mathcal{U}_{N_G} \circ \mathcal{U}_{N_G-1} \circ \cdots \circ \mathcal{U}_1(\rho_{\rm in}). \quad (2)$
QEM has been proposed as a method for suppressing errors
without encoding, and it is useful especially for NISQ
devices with a restricted number of qubits [21,22,31].
Generally speaking, QEM methods recover not the ideal density matrix $\rho_{\rm out}^{\rm ideal}$ itself, but rather the ideal expectation value of an observable $\hat{M}$, $\langle \hat{M} \rangle_{\rm ideal} = \mathrm{Tr}(\rho_{\rm out}^{\rm ideal} \hat{M})$, via classical
postprocessing. Note that QEM is not a scalable technique
because it needs exponentially increasing circuit runs with
the number of error events in the quantum circuit [21,22].
Now let us explain the concept of probabilistic error
cancellation with which we can eliminate a bias from the
expectation value of the observables completely given the
complete information on the noise model [21,22]. (Later,
we use this method to suppress errors in FTQC.) First, we
identify the noise map $\mathcal{N}$ via either process or gate-set tomography [27,28], and calculate the inverse $\mathcal{N}^{-1}$. Then, by finding a set of processes $\{\mathcal{B}_i\}$ such that $\mathcal{N}^{-1} = \sum_i \eta_i \mathcal{B}_i$, where $\eta_i \in \mathbb{R}$ and $\sum_i \eta_i = 1$, we have

$\mathcal{U} = \mathcal{N}^{-1} \circ \mathcal{N} \circ \mathcal{U} = \sum_i \eta_i\, \mathcal{B}_i \circ \mathcal{N} \circ \mathcal{U}. \quad (3)$

Note that arbitrary operations can be represented as linear combinations of tensor products of single-qubit Clifford operations and Pauli measurements [22]. Here, we can rewrite Eq. (3) as

$\mathcal{U} = \gamma_Q \sum_i q_i\, \mathrm{sgn}(\eta_i)\, \mathcal{B}_i \circ \mathcal{N} \circ \mathcal{U}, \quad (4)$

where $\gamma_Q = \sum_i |\eta_i|$, $q_i = |\eta_i| / \gamma_Q$, $\gamma_Q \geq 1$, and $\mathrm{sgn}(\eta_i)$ is a parity, which takes $\pm 1$, corresponding to the operation $\mathcal{B}_i$. We refer to $\gamma_Q$ as the QEM cost because it is related to the sampling overhead.
Now let us suppose that we measure an observable $\hat{M}$ and obtain

$\langle \hat{\mu}^{\rm eff} \rangle = \sum_i q_i \langle \hat{\mu}_i^{\rm eff} \rangle = \gamma_Q^{-1} \langle \hat{M} \rangle_{\mathcal{U}}. \quad (5)$

Here, $\hat{\mu}_i^{\rm eff} = \mathrm{sgn}(\eta_i)\, \hat{m}_i$, and $\hat{m}_i$ is a measurement outcome for a process $\mathcal{B}_i \circ \mathcal{N} \circ \mathcal{U}$. We generate the process $\mathcal{B}_i$ with a probability $q_i$ and multiply the corresponding parity with the measurement result, which is denoted as $\hat{\mu}^{\rm eff}$. Then, the expectation value of the random variable $\hat{\mu}^{\rm mit} = \gamma_Q \hat{\mu}^{\rm eff}$ approximates the error-free expectation value $\langle \hat{M} \rangle_{\mathcal{U}}$. Note that since $\mathrm{Var}[\hat{\mu}^{\rm mit}] = \gamma_Q^2\, \mathrm{Var}[\hat{\mu}^{\rm eff}]$ and a measurement outcome without QEM, which we denote $\hat{\mu}^{\rm nmit}$, has a similar variance, the variance of the error-mitigated value is approximately amplified as $C_Q = \gamma_Q^2$. Therefore, we need to have $C_Q$ times more samples to achieve an accuracy similar to that before applying QEM.
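As a concrete illustration of this procedure, the following minimal sketch (our own toy example, not part of the paper's analysis) applies probabilistic error cancellation to a single-qubit depolarizing channel. The inverse channel is expanded in Pauli conjugations with quasiprobability weights $\eta_i$, and the weighted sum recovers the noise-free expectation value exactly at the cost $\gamma_Q = \sum_i |\eta_i| > 1$; the rotation angle and error rate are arbitrary illustrative choices.

```python
import numpy as np

# Pauli matrices and a single-qubit rotation
I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)

def conj(P, rho):
    # The channel rho -> P rho P†
    return P @ rho @ P.conj().T

def depolarize(rho, p):
    # Stochastic Pauli noise: identity with prob 1-p, else a uniform Pauli
    return (1 - p) * rho + (p / 3) * (conj(X, rho) + conj(Y, rho) + conj(Z, rho))

theta, p = 0.7, 0.02
U = np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * X   # Rx(theta)
rho_in = np.array([[1, 0], [0, 0]], dtype=complex)

# Quasiprobability decomposition of N^{-1}: depolarizing noise shrinks the
# Bloch vector by f = 1 - 4p/3; inverting that shrinkage gives eta_I > 1 and
# negative weights on the three Pauli conjugations.
f = 1 - 4 * p / 3
eta_I = (3 / f + 1) / 4
eta_P = (1 - eta_I) / 3                  # weight of each X, Y, Z conjugation
gamma = abs(eta_I) + 3 * abs(eta_P)      # QEM cost gamma_Q = (3/f - 1)/2

rho_noisy = depolarize(conj(U, rho_in), p)
noisy = np.trace(Z @ rho_noisy).real
ideal = np.trace(Z @ conj(U, rho_in)).real

# Exact error-mitigated average: sum_i eta_i <Z>_{B_i o N o U}
mitigated = sum(eta * np.trace(Z @ conj(P, rho_noisy)).real
                for eta, P in [(eta_I, I2), (eta_P, X), (eta_P, Y), (eta_P, Z)])

print(f"ideal={ideal:.6f} noisy={noisy:.6f} mitigated={mitigated:.6f} gamma={gamma:.4f}")
```

In a sampling implementation, one would instead draw $\mathcal{B}_i$ with probability $q_i = |\eta_i|/\gamma_Q$ and multiply outcomes by $\gamma_Q\,\mathrm{sgn}(\eta_i)$; the deterministic sum above shows the estimator is unbiased.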
In practice, we use probabilistic error cancellation for each gate in quantum circuits. The ideal process for the entire quantum circuit is described as $\bigcirc_{k=1}^{N_G} \mathcal{U}_k$. Denoting $\mathcal{U}_k = \gamma_Q^{(k)} \sum_{i_k} q_{i_k} \mathrm{sgn}(\eta_{i_k})\, \mathcal{B}_{i_k} \circ \mathcal{N}_k \circ \mathcal{U}_k$, we have

$\bigcirc_{k=1}^{N_G} \mathcal{U}_k = \gamma_Q^{\rm tot} \sum_{i_1, \ldots, i_{N_G}} \left[ \prod_{k=1}^{N_G} q_{i_k} \mathrm{sgn}(\eta_{i_k}) \right] \bigcirc_{k=1}^{N_G} \left( \mathcal{B}_{i_k} \circ \mathcal{N}_k \circ \mathcal{U}_k \right), \quad (6)$

where $\gamma_Q^{\rm tot} = \prod_{k=1}^{N_G} \gamma_Q^{(k)}$. From Eq. (6), we can see that, in each gate, a process $\mathcal{B}_{i_k}$ is generated with probability $q_{i_k}$, and the product of parities $\prod_{k=1}^{N_G} \mathrm{sgn}(\eta_{i_k})$ is multiplied with the measurement results to obtain the outcome $\hat{\mu}^{\rm eff}$. This procedure is repeated, and the product of the mean of the outcomes $\langle \hat{\mu}^{\rm eff} \rangle$ and $\gamma_Q^{\rm tot}$ approximates the correct expectation value. Note that here $\gamma_Q^{\rm tot}$ is the QEM cost for the entire quantum circuit. Let us assume the cost for each gate is uniform and can be approximated as $\gamma_Q^{(k)} \simeq 1 + a\varepsilon$, with $a$ and $\varepsilon$ being a positive constant value and the effective error rate, respectively. Now the QEM cost and sampling overhead can be approximated as $\gamma_Q^{\rm tot} \simeq e^{(\gamma_Q - 1) N_G} = e^{a \varepsilon N_G}$ and $C_Q^{\rm tot} = (\gamma_Q^{\rm tot})^2$, which increase exponentially with the mean number of error events in the quantum circuit $\varepsilon N_G$. Note that for $\varepsilon N_G = O(1)$ and $\varepsilon \to 0$, since $\varepsilon^k N_G \to 0$ $(k \geq 2)$, the QEM cost can be exactly described as $\gamma_Q^{\rm tot} = e^{a \varepsilon N_G}$.
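The exponential dependence on $\varepsilon N_G$ can be made concrete with a short numerical sketch (the values of $a$ and $\varepsilon$ below are our own illustrative assumptions, not parameters from the paper): the total sampling overhead stays modest as long as the mean number of error events $\varepsilon N_G$ is of order unity, and blows up beyond that.

```python
import numpy as np

def sampling_overhead(gamma_per_gate: float, n_gates: int) -> float:
    """Total overhead C_Q^tot = (gamma_Q^tot)^2 with gamma_Q^tot = gamma^N_G."""
    return gamma_per_gate ** (2 * n_gates)

a, eps = 2.0, 1e-4           # assumed cost slope and effective error rate
gamma = 1 + a * eps          # per-gate QEM cost, gamma^(k) ~ 1 + a*eps
for n_gates in (10**3, 10**4, 10**5):
    n_events = eps * n_gates                    # mean number of error events
    c_tot = sampling_overhead(gamma, n_gates)
    # c_tot tracks exp(2*a*eps*N_G): constant-factor overhead only while
    # eps*N_G = O(1), astronomically large once eps*N_G >> 1
    print(f"N_G={n_gates:>6}  eps*N_G={n_events:5.1f}  "
          f"overhead={c_tot:.3e}  ~exp(2a eps N_G)={np.exp(2 * a * n_events):.3e}")
```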
B. Fault-tolerant quantum computing
1. Stabilizer formalism
In the framework of FTQC, one prepares a redundant
number of physical qubits and performs quantum comput-
ing in a code space defined as a subspace of the whole
Hilbert space. By repetitively performing quantum error
detection and correction, we can protect the logical qubits
defined in the code space against physical errors. The
state of the logical qubits is manipulated in a fault-tolerant
manner with a set of logical operations.
The stabilizer formalism [3,32] is the most standard way
to construct quantum error-correcting codes. Here, supposing that we construct $k$ logical qubits with $n$ physical qubits, a $2^k$-dimensional code space $C$ is specified with a subgroup of $n$-qubit Pauli operators called the stabilizer group. Let the $n$-qubit Pauli group be

$\mathcal{G}_n = \{\pm 1, \pm i\} \times \{I, X, Y, Z\}^{\otimes n}, \quad (7)$

where $I$ is the identity operator and $X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ are Pauli operators. The set of
Pauli operators $\mathcal{S} \subset \mathcal{G}_n$ is called a stabilizer group if $\mathcal{S}$ is a commutative subgroup, the number of elements in $\mathcal{S}$ is $2^{n-k}$, and $-I \notin \mathcal{S}$. We denote the $(n-k)$-element generator set of a stabilizer group as $G = (g_1, \ldots, g_{n-k})$. The code space $C$ is defined as the eigenspace with $+1$ eigenvalues for all the operators in the stabilizer group, i.e., $C = \{ |\psi\rangle \mid \forall s_i \in \mathcal{S},\ s_i |\psi\rangle = |\psi\rangle \}$. In the code space, we can introduce a logical basis as $\{|0\rangle_L, |1\rangle_L\}^{\otimes k}$ and logical Pauli operators as $\{I_L, X_L, Y_L, Z_L\}^{\otimes k}$. The code distance $d$ is defined as the minimum number of physical qubits on which an arbitrary logical operator, except the logical identity $I_L^{\otimes k}$, acts.
During a quantum computation, physical errors that
occur in the encoded state are detected by using $(n-k)$ Pauli measurements $P_s = (1/2)[I + (-1)^s g_i]$ for $s \in \{0, 1\}$. These measurements are called stabilizer measurements and their binary outcomes $s$ are called syndrome
values. The original state is restored by applying appro-
priate feedback operations that are estimated from the
syndrome values. These stabilizer measurements are per-
formed repeatedly during a computation. One repetition
of the stabilizer measurements is called a code cycle of
fault-tolerant quantum computing. If the effective error
probability per physical qubit during a cycle is smaller than
a certain threshold, we can estimate the Pauli operator that
restores the original state with an exponentially small fail-
ure probability with the code distance $d$. Since the required number of physical qubits $n$ increases polynomially with the code distance $d$ in typical quantum error-correcting
codes, we can exponentially decrease the error probability
of logical qubits with a polynomial qubit overhead.
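The stabilizer machinery above can be checked numerically on the smallest nontrivial example, the three-qubit bit-flip repetition code ($n = 3$, $k = 1$); this is our own toy illustration of the formalism, not a code used in the paper.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def kron(*ops):
    return reduce(np.kron, ops)

# Three-qubit bit-flip code: stabilizer generators G = (Z1 Z2, Z2 Z3),
# so the stabilizer group has 2^(n-k) = 4 elements.
g1 = kron(Z, Z, I2)
g2 = kron(I2, Z, Z)
assert np.allclose(g1 @ g2, g2 @ g1)          # generators commute

# The code space is the simultaneous +1 eigenspace: span{|000>, |111>}
zero_L = np.zeros(8); zero_L[0] = 1.0          # |000>
one_L = np.zeros(8); one_L[7] = 1.0            # |111>
for s in (g1, g2, g1 @ g2):
    for psi in (zero_L, one_L):
        assert np.allclose(s @ psi, psi)       # s|psi> = |psi>

# A single bit flip X on qubit 1 anticommutes with g1, flipping its
# measurement outcome: the syndrome pattern identifies the error location.
err = kron(X, I2, I2) @ zero_L
syndrome = tuple(int(round((1 - err @ (g @ err)) / 2)) for g in (g1, g2))
print("syndrome after X on qubit 1:", syndrome)  # -> (1, 0)
```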
2. Logical operations
We must not only correct physical errors but also update
the logical quantum state for performing quantum com-
putation. To this end, a universal set of logical operations
should be performed in a fault-tolerant manner. According
to the Solovay-Kitaev theorem [25,26], we can approxi-
mate arbitrary one- and two-qubit gates with a finite set
of local operations. For example, the Hadamard gate $H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, the controlled-NOT (CNOT) gate $|0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes X$, and the $T$ gate $T = \exp[i(\pi/8)Z]$ form a universal gate set. Several logical operations can be performed
by transversally operating the same one- or two-qubit oper-
ations on physical qubits. Since transversal operations con-
stantly increase the effective physical error rate per qubit
during a cycle, we can fault tolerantly achieve transver-
sal logical operations. However, it is known that there is
no stabilizer code for which the set of transversal gates
is universal [33]. Thus, we need an additional technique
to achieve fault-tolerant and universal quantum comput-
ing. The most promising solution is to create a quantum
state called a magic state and perform nontransversal
logical operations with gate teleportation [5]. For exam-
ple, $|A\rangle_L = TH|0\rangle_L = (1/\sqrt{2})(e^{i\pi/8}|0\rangle_L + e^{-i\pi/8}|1\rangle_L)$ is a typical magic state, and $T$-gate operations can be per-
formed by consuming this state. This magic state encoded
in a logical qubit can be constructed with a process called
magic state injection. While the infidelity of a magic state
created by magic state injection is generally larger than the
logical error rate, we can create a high-fidelity magic state
from several noisy magic states by using another quantum
error-correcting code implemented on the logical space,
which is called magic state distillation. Since the application of $T$ gates requires a longer time than the other operations, the number of $T$ gates is the dominant factor
affecting the computation time of FTQC.
Although we can estimate a Pauli operation for recov-
ery from syndrome values, we do not directly apply it
immediately after estimation. Instead, we store the Pauli operations that should be applied to the physical qubits for recovery in a classical memory called the Pauli frame [5,34]. The stored operations will be taken into account when the logical measurements are performed; the outcome of a logical measurement is flipped according to the Pauli frame. A schematic figure is shown in Fig. 2.

FIG. 2. Schematic figure of the Pauli frame. The recovery operations are not physically applied to quantum computers but rather are stored in the Pauli frame and efficiently updated after each Clifford gate operation. The measurement outcomes are flipped depending on the state of the Pauli frame.
In the above construction of logical operations, the whole
process, except for magic state injection, consists only of
Clifford operations and Pauli channels in the code space.
Since a Pauli operator conjugated by a Clifford operator
is also a Pauli operator, we can always track a recov-
ery operator as a Pauli operator during a computation. In
addition, when we can apply a logical Pauli operator to a
quantum state, we can perform it simply by updating the
Pauli frame, since a logical Pauli operator is a transversal
physical Pauli operation. As far as classical computers are
reliable, this operation is effectively noiseless.
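A minimal sketch of such a tracker (our own illustration; production Pauli-frame units also handle phases and other measurement bases) stores the recovery Pauli as symplectic $(x, z)$ bit vectors, conjugates it through Clifford gates by the standard update rules, and flips Z-basis outcomes classically instead of touching the hardware.

```python
import numpy as np

class PauliFrame:
    """Track a recovery Pauli as (x, z) bit vectors over n qubits; a
    Z-basis measurement outcome on qubit q is flipped iff x[q] == 1."""
    def __init__(self, n):
        self.x = np.zeros(n, dtype=int)
        self.z = np.zeros(n, dtype=int)

    def record_recovery(self, q, pauli):
        # A correction requested by the decoder (or a QEM Pauli insertion)
        if pauli in ("X", "Y"): self.x[q] ^= 1
        if pauli in ("Z", "Y"): self.z[q] ^= 1

    def h(self, q):                            # H swaps X and Z
        self.x[q], self.z[q] = self.z[q], self.x[q]

    def s(self, q):                            # S maps X -> Y (adds a Z part)
        self.z[q] ^= self.x[q]

    def cnot(self, c, t):                      # X propagates c->t, Z t->c
        self.x[t] ^= self.x[c]
        self.z[c] ^= self.z[t]

    def measure_z(self, q, raw_outcome):
        # Classical postprocessing: flip the outcome instead of applying
        # the recovery Pauli on hardware
        return raw_outcome ^ int(self.x[q])

frame = PauliFrame(2)
frame.record_recovery(0, "Z")   # decoder requests a Z correction on qubit 0
frame.h(0)                      # the circuit later applies H on qubit 0
print(frame.measure_z(0, raw_outcome=0))  # Z became X under H -> flip: 1
```

Because every update is a constant-time classical bit operation, tracked recoveries are effectively noiseless, as noted above.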
In this section, we discuss how to integrate QEM into
the FTQC architecture. Here, we consider two types of
errors in FTQC: decoding errors due to failures in the error
estimation and insufficiency of magic state distillation and
approximation errors in the Solovay-Kitaev decomposi-
tion. In Sec. A, we explain how these errors in FTQC can
be modeled. In Sec. B, we discuss how these errors can be
canceled and evaluate their QEM costs. Probabilistic error
cancellation requires the errors to be estimated in advance.
In Sec. C, we also discuss the effect of estimation errors on probabilistic error cancellation and the characterization of the noise model.
A. Errors in fault-tolerant quantum computing
1. Decoding error
Here, we describe noise due to the failures of error
estimation in elementary logical operations, i.e., stabilizer
measurements and magic state distillation. The first obsta-
cle to applying probabilistic error cancellation to FTQC is
how to characterize an effective map of noise due to the
failures of error estimation. If we suppose that the physical
errors can be modeled as a stochastic physical Pauli map
and assume that there are no errors on the ancillary qubits
for syndrome measurements, we can define a logical noise
map for decoding errors that is Markovian and a logical
stochastic Pauli map. Yet, these assumptions do not hold in
practice. Nevertheless, here we assume that we can define
an effectively Markovian logical error map for each logical
operation and also assume that this noise map is a stochas-
tic logical Pauli map. It is known that even if noise is
unitary, a noise map in a logical space of surface codes can
be well approximated as stochastic Pauli noise when the
code distance is sufficiently large [35]. Furthermore, the
remaining coherent errors can be canceled by using pulse
optimization techniques. Thus, it is reasonable to suppose
that the decoding errors due to the failure of error estima-
tions in surface codes are almost stochastic Pauli errors.
In addition, we numerically verify that we can regard the
decoding errors as Markovian errors even in the presence
of measurement errors. See Appendix F for details. While
we mainly describe and analyze the decoding errors in the
surface codes, a similar idea can be applied to the decoding
errors due to insufficient magic state distillation. As for the
logical noise map on a prepared magic state due to insuf-
ficient magic state distillation, we can twirl the noise map
by logical Clifford operations, and it can also be assumed
to be a stochastic Pauli noise.
Under the above assumptions, we can describe a noise
map for a l-qubit logical operation Ndec as the following
stochastic Pauli noise:
$\mathcal{N}_{\rm dec}(\rho) = \sum_{g \in \{I_L, X_L, Y_L, Z_L\}^{\otimes l}} p_g\, g \rho g, \quad (8)$

where $p_g \in \mathbb{R}$, $\sum_g p_g = 1$, and $p_g \geq 0$. The sum of probabilities of nonidentity logical operations is called the logical error probability $p_{\rm dec}$, i.e., $p_{\rm dec} = \sum_{g \neq I_L^{\otimes l}} p_g$. It is known that when the physical error rate $p$ is smaller than
a value called the threshold $p_{\rm th}$, the effective logical error probability decreases exponentially with respect to the code distance $d$. For the effective logical error probability per syndrome-measurement cycle of surface codes $p_{\rm cyc}$, it decreases as

$p_{\rm cyc} \simeq C_1 \left( \frac{C_2\, p}{p_{\rm th}} \right)^{(d+1)/2}, \quad (9)$
where $C_1$, $C_2$ are constants [36]. While the constant values depend on the details of the error-correction schemes, $C_1 \simeq 0.13$ and $C_2 \simeq 0.61$ are expected in a typical construction of surface codes and the noise model [36,37]. Suppose that a logical operation requires $m$ cycles; then, the logical error probability for the logical operation can be approximated as

$p_{\rm dec} = 1 - (1 - p_{\rm cyc})^m \simeq m\, p_{\rm cyc}. \quad (10)$

Note that the number of cycles per logical gate increases at most linearly with the code distance $d$.
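Equations (9) and (10) can be evaluated directly. The sketch below uses the quoted constants $C_1 \simeq 0.13$ and $C_2 \simeq 0.61$; the physical error rate $p$ and threshold $p_{\rm th}$ are illustrative assumptions of ours, not values fixed by the text.

```python
# Numerical sketch of Eqs. (9)-(10) with the constants quoted from Ref. [36];
# p and p_th below are assumed illustrative values.
C1, C2 = 0.13, 0.61
p, p_th = 1e-3, 1e-2

def p_cyc(d):
    """Logical error probability per syndrome-measurement cycle, Eq. (9)."""
    return C1 * (C2 * p / p_th) ** ((d + 1) / 2)

def p_dec(d, m):
    """Logical error probability of an m-cycle logical operation, Eq. (10)."""
    return 1 - (1 - p_cyc(d)) ** m

# Each distance step of 2 multiplies p_cyc by C2*p/p_th; with m ~ d cycles
# per logical gate, p_dec still falls exponentially in d.
for d in (5, 7, 9, 11):
    print(f"d={d:2d}  p_cyc={p_cyc(d):.2e}  p_dec(m=d)={p_dec(d, d):.2e}")
```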
In order to apply probabilistic error cancellation, we
need to know the logical error probabilities {pg}in
advance. While we can estimate {pg}by using gate-set
tomography in the logical space, the estimations are not
exact. The effect of estimation errors is discussed in Sec. C,
while the efficiency of our proposal, including noise char-
acterization, is discussed in Appendix C.
2. Approximation error
Since we are only allowed to use a limited set of log-
ical operations for achieving fault tolerance, we need to
decompose an arbitrary unitary gate into a sequence of
available gates. Any unitary operator can be decomposed
into a product of CNOT gates and single-qubit gates. Thus,
we need to approximate single-qubit gates with a given
gate set to the desired accuracy. By using the improved Solovay-Kitaev algorithm [38], given a universal gate set such as $\{T, H, S\}$ and the single-qubit gate $U$ to be approximated, we can construct an approximated gate $\tilde{U}$, which satisfies $\varepsilon = \|\tilde{U} - U\|$ to an arbitrary accuracy $\varepsilon$, as a sequence of the given gate set with length $\tilde{O}[\log(1/\varepsilon)]$, with $\|\cdot\|$ being an operator norm. The error of the approximated map is given by

$\mathcal{N}_{\rm SK}(\rho) = \tilde{U} U^{\dagger} \rho\, U \tilde{U}^{\dagger}. \quad (11)$
Since this decomposition involves only single-qubit oper-
ations, this error channel can be efficiently and exactly
evaluated in advance.
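To illustrate how an approximation error $\varepsilon = \|\tilde{U} - U\|$ is evaluated classically, the sketch below brute-forces short $\{H, T\}$ words against a small target rotation. This is a toy stand-in for the Solovay-Kitaev algorithm, which finds such sequences far more efficiently; the target angle and maximum word length are arbitrary choices, and the global-phase fixing is the standard trace-alignment heuristic.

```python
import numpy as np
from itertools import product

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
T = np.diag([np.exp(1j * np.pi / 8), np.exp(-1j * np.pi / 8)])  # exp[i(pi/8)Z]
U_target = np.diag([np.exp(-1j * 0.05), np.exp(1j * 0.05)])     # Rz(0.1)

def phase_fixed_dist(V, U):
    """Spectral-norm distance ||V - U|| after aligning the global phase of V
    (trace-alignment heuristic)."""
    tr = np.trace(V.conj().T @ U)
    if abs(tr) > 1e-12:
        V = V * (tr / abs(tr))
    return np.linalg.norm(V - U, ord=2)

# Exhaustive search over all {H, T} words up to length 12
best = (np.inf, "")
for length in range(1, 13):
    for word in product("HT", repeat=length):
        V = np.eye(2, dtype=complex)
        for g in word:
            V = (H if g == "H" else T) @ V
        d = phase_fixed_dist(V, U_target)
        if d < best[0]:
            best = (d, "".join(word))
print(f"best eps = {best[0]:.4f} with sequence {best[1]}")
```

Since $\tilde{U}$ and $U$ are known $2 \times 2$ matrices, the channel $\mathcal{N}_{\rm SK}$ and its inverse can be computed to machine precision before the circuit is ever run, which is exactly what Eq. (11) exploits.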
B. Quantum error mitigation for fault-tolerant
quantum computing
1. Overview of our framework
Here, we show that decoding errors and approximation
errors can be mitigated with probabilistic error cancella-
tion. When we insert recovery operations for probabilistic
error cancellation, it is assumed that the noise level of the
recovery operations for QEM is much lower than that of
the error-mitigated gates. In NISQ computing, for exam-
ple, it is reasonable to assume that the error probabilities of
two-qubit gates are much larger than those of single-qubit
gates and measurements; therefore, the errors of two-qubit
gates can be mitigated by using single-qubit recovery oper-
ations. However, this is not a reasonable assumption in
FTQC, since the operations that are noisy and time con-
suming are different from those of NISQ architecture.
More concretely, even Clifford operations involving only
one logical qubit suffer decoding errors.
Here, we show an architecture of FTQC that imple-
ments QEM with small overheads. The keys are the two
significant properties of FTQC architecture: logical Pauli
operations are error-free and instantaneous due to the Pauli
frame, and the noise map of the decoding errors can be
assumed as stochastic Pauli noise. Thanks to these prop-
erties, we can mitigate errors in all the elementary logical
operations simply by updating the Pauli frame. This means
the error-mitigated Clifford operations and Pauli measure-
ments are available for computation. Because they form
a complete basis for mitigating arbitrary errors [22], we
can mitigate approximation errors due to the Solovay-
Kitaev decomposition. Since the approximation errors can
be exactly known in advance, an unbiased estimator free
from approximation error can be obtained, as explained in
Sec. 3.
To make our QEM procedure work, the accuracy and
efficiency of the decoding error estimation are vital. We
show that the decoding errors can be estimated with gate-
set tomography under an appropriate choice of the gauge,
considering state-preparation and measurement errors. We
also show that the cost of gate-set tomography is accept-
able compared with the main computation of FTQC for
estimating expectation values in Sec. C. In this section, we
further show a refined gate-set tomography suited to our
framework that significantly improves the estimation for
logical Clifford gates.
2. Quantum error mitigation for decoding errors
We can express the inverse channel of the nonuniform
depolarizing channel Eq. (8) as a linear combination of
Pauli operations. Thus, we can express the inverse channel as

$\mathcal{N}_{\rm dec}^{-1}(\rho) = \gamma_{\rm dec} \sum_g q_g\, \mathrm{sgn}(\eta_g)\, g \rho g. \quad (12)$

Refer to Appendix B for a concrete expression of each coefficient $\eta_g$, $\gamma_{\rm dec}$, and $q_g$. Thus, we can suppress the
errors by applying probabilistic error cancellation only
with Pauli operators after the decoding processes. The
QEM cost for decoding errors in the entire circuit can
be expressed as γtot
dec =Ndec
dec , where Ndec is the num-
ber of logical gates, and γ(k)
dec is a QEM cost of the kth
FIG. 3. Schematic figure for the Pauli frame incorporating QEM. If a QEM recovery operation is a Pauli operation, it is not directly applied to the quantum computer; rather, the Pauli frame is updated instead. The parity is also updated in accordance with the generated recovery operations of QEM. Here, we denote the parity corresponding to the QEM recovery operations as $p_a$ in the figure. If a QEM recovery operation is not a Pauli operator, it is performed physically. Measurement outcomes are then postprocessed depending on the Pauli frame, parity, and QEM cost.
Note that probabilistic error cancellation usually applies
the recovery operations of QEM immediately after the
noisy gates [21,22]; however, because we perform only
logical Pauli operations as the recovery operations for
decoding errors, they can be done simply by updating
the Pauli frame instead of directly applying them after
noisy gates. Finally, the measurement result is postpro-
cessed according to the state of the Pauli frame, the parity
corresponding to the applied recovery operations, and the
QEM cost. Thus, unlike in probabilistic error cancellation
for NISQ devices, the logical noise due to decoding errors
can be mitigated without any additional noise due to the
recovery operation. A schematic figure is shown in Fig. 3.
Note that the information on the QEM cost and the parity
is used only when the final measurement result is obtained;
the outcome of a destructive logical Pauli measurement
is flipped depending only on the state of the Pauli frame.
Whether we can mitigate decoding errors of complicated
logical operations such as magic state preparation, gate
teleportation, and adaptive Clifford gates by simply updat-
ing the Pauli frame is not trivial; therefore we provide a
concrete procedure for actual devices and Pauli frames in
Appendix E.
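The classical bookkeeping described above can be sketched as a small data structure; the class and method names below are ours (not from the paper or any FTQC software stack), Pauli products are tracked up to phase, and only destructive logical $Z$ measurements are shown:

```python
class PauliFrame:
    """Tracks the accumulated Pauli recovery per logical qubit, together with
    the parity (product of sgn(eta_g)) and the cumulative QEM cost of the
    sampled quasiprobability recoveries."""

    # Pauli multiplication table, ignoring phases (irrelevant for the frame).
    _MUL = {("I", "I"): "I", ("I", "X"): "X", ("I", "Y"): "Y", ("I", "Z"): "Z",
            ("X", "I"): "X", ("Y", "I"): "Y", ("Z", "I"): "Z",
            ("X", "X"): "I", ("Y", "Y"): "I", ("Z", "Z"): "I",
            ("X", "Y"): "Z", ("Y", "X"): "Z", ("X", "Z"): "Y",
            ("Z", "X"): "Y", ("Y", "Z"): "X", ("Z", "Y"): "X"}

    def __init__(self, n_qubits):
        self.frame = ["I"] * n_qubits
        self.parity = +1
        self.cost = 1.0

    def apply_recovery(self, qubit, pauli, sign, gamma):
        """Record a sampled Pauli recovery instead of executing it physically."""
        self.frame[qubit] = self._MUL[(self.frame[qubit], pauli)]
        self.parity *= sign
        self.cost *= gamma

    def reinterpret_z_outcome(self, qubit, raw_outcome):
        """Flip a destructive logical Z outcome if the frame anticommutes
        with Z, then weight it by the parity and the QEM cost."""
        flipped = -raw_outcome if self.frame[qubit] in ("X", "Y") else raw_outcome
        return self.parity * self.cost * flipped
```

For example, recording an $X$ recovery sampled with a negative quasiprobability weight, `apply_recovery(0, "X", -1, 1.02)`, leaves the device untouched but flips subsequent $Z$ outcomes on that qubit and carries the sign and cost into the final estimator.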
In the case of decoding errors in surface codes, by expanding the QEM cost to first order in the logical error rate, we have

\[
\gamma_{\mathrm{dec}} \approx 1 + 2 p_{\mathrm{dec}}. \quad (13)
\]
Refer to Appendix B for details. Under the assumption that the logical error rate is the same for all the logical operations and $p_{\mathrm{dec}} N_{\mathrm{dec}} = O(1)$ with $p_{\mathrm{dec}} \to +0$, the QEM cost $\gamma^{\mathrm{tot}}_{\mathrm{dec}}$ for the entire quantum circuit can be shown to be exactly equal to $e^{2 p_{\mathrm{dec}} N_{\mathrm{dec}}}$ on the basis of the argument in Sec. A. Thus, by using Eqs. (9) and (10), we obtain

\[
\gamma_{\mathrm{dec}} - 1 = 2 m C_1 \left( \frac{C_2\, p}{p_{\mathrm{th}}} \right)^{(d+1)/2}, \quad (14)
\]

which results in the total QEM sampling overhead

\[
\Gamma_L = e^{2(\gamma_{\mathrm{dec}} - 1) N_{\mathrm{dec}}} = \exp\left[ 4 m C_1 N_{\mathrm{dec}} \left( \frac{C_2\, p}{p_{\mathrm{th}}} \right)^{(d+1)/2} \right]. \quad (15)
\]

Notice that Eq. (15) clearly shows a trade-off relationship between the sampling overhead and the code distance, i.e., the number of physical qubits.
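To make the trade-off of Eq. (15) tangible, the following sketch evaluates the total sampling overhead for a few code distances; the constants $m$, $C_1$, $C_2$ and the rates $p$, $p_{\mathrm{th}}$ are illustrative placeholders, not values fitted in this work:

```python
import math

def sampling_overhead(d, n_dec, p, p_th=1e-2, m=1, c1=0.1, c2=1.0):
    """Total QEM sampling overhead of Eq. (15):
    Gamma_L = exp[4 m C1 N_dec (C2 p / p_th)^((d+1)/2)].
    All constants here are illustrative placeholders."""
    return math.exp(4 * m * c1 * n_dec * (c2 * p / p_th) ** ((d + 1) / 2))

# With p/p_th = 0.1 and N_dec = 10^6 logical gates, the overhead falls
# steeply as the code distance d grows.
for d in (5, 7, 9, 11):
    print(d, sampling_overhead(d, n_dec=1e6, p=1e-3))
```

With these placeholder numbers, the overhead is astronomically large at $d = 5$ but drops to only about 1.5 at $d = 11$, which is the trade-off the text describes.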
Finally, although we mainly focus on the surface code
architecture in this paper, our scheme is also available
for typical stabilizer-based error-correcting codes such as
color codes [39] and concatenated Steane or Shor codes
[3,40,41] because we assume only basic properties of sta-
bilizer codes, i.e., Pauli recovery operations are estimated
from syndrome measurements and non-Clifford opera-
tions are applied by generating and consuming magic
states. Therefore, we can choose suitable quantum error-
correcting codes according to the hardware connectivity.
3. Quantum error mitigation for approximation errors
Unlike decoding errors due to the failure of error correction, we cannot describe errors due to the Solovay-Kitaev decomposition as stochastic Pauli errors. Nevertheless, we can still apply probabilistic error cancellation with negligible overheads. Denote the approximation error as $\mathcal{N}_{\mathrm{SK}}(\rho) = \tilde{U} U^\dagger \rho (\tilde{U} U^\dagger)^\dagger$, where $U$ is the ideal unitary and $\tilde{U}$ is its Solovay-Kitaev approximation; we invert this approximation error by

\[
\mathcal{N}_{\mathrm{SK}}^{-1} = \sum_i \eta_i \mathcal{B}^{(L)}_i, \quad (16)
\]

where $\{\mathcal{B}^{(L)}_i\}$ denotes recovery operations in the logical space. Note that we can represent any map as a linear combination of Clifford operations and Pauli channels [22], and thus we do not need $T$ gates for mitigating approximation errors. Recovery operations are randomly chosen and
applied immediately after each single-qubit logical opera-
tion if they are not Pauli operations. In the case of Pauli
operations, we can again use the Pauli frame, and phys-
ical operations on quantum computers are not required,
in a similar vein to QEM for decoding errors. Since a
single-qubit logical unitary operation consists of several
repetitions of Clifford gates and T-gate teleportation, the
insertion of the recovery operation for probabilistic error
cancellation negligibly increases the length of the quantum
circuit. In the numerical simulations described in the next
section, we verify that the QEM cost can be approximated by the following equation:

\[
\gamma_{\mathrm{SK}} - 1 = \beta_1 e^{-\beta_2 N_T}, \quad (17)
\]

where $\beta_1$ and $\beta_2$ are constants dependent on the quantum gate and $N_T$ is the number of available $T$ gates.
The QEM cost due to approximation errors can also be represented as $\gamma^{\mathrm{tot}}_{\mathrm{SK}} = \prod_{k=1}^{N_{\mathrm{SK}}} \gamma^{(k)}_{\mathrm{SK}}$, where $N_{\mathrm{SK}}$ is the total number of recovery operations for mitigating approximation errors in the quantum circuit and $\gamma^{(k)}_{\mathrm{SK}}$ is the cost corresponding to the $k$th recovery operation. By assuming that the cost does not depend on the gates, we have the following QEM sampling overhead:

\[
\Gamma_{\mathrm{SK}} \approx \exp\!\left( 2 \beta_1 N_{\mathrm{SK}} e^{-\beta_2 N_T} \right). \quad (18)
\]

This shows that there is a trade-off relationship between the sampling overhead and the number of available $T$ gates.
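Analogously, the overhead of Eq. (18) can be evaluated numerically; $\beta_1$, $\beta_2$, $N_{\mathrm{SK}}$, and the $N_T$ values below are illustrative placeholders rather than the constants obtained in the numerical simulations:

```python
import math

def sk_overhead(n_sk, n_t, beta1=0.5, beta2=0.1):
    """QEM sampling overhead for approximation errors, Eq. (18):
    Gamma_SK ~ exp(2 beta1 N_SK exp(-beta2 N_T)).
    beta1 and beta2 are illustrative placeholders."""
    return math.exp(2 * beta1 * n_sk * math.exp(-beta2 * n_t))

# Spending more T gates per single-qubit rotation (larger N_T) shrinks the
# residual Solovay-Kitaev bias and hence the sampling overhead.
for n_t in (20, 40, 60, 80):
    print(n_t, sk_overhead(n_sk=1e3, n_t=n_t))
```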
C. Effect of estimation errors of the noise map
1. Effect of estimation errors on expectation values
While approximation errors can be exactly determined in advance, decoding errors have to be characterized. Since the decoding error probabilities are small, the characterization will unavoidably contain finite and non-negligible estimation errors. Thus, we need to consider QEM in the presence of estimation errors, as well as the efficiency of the characterization of decoding errors.
Let us discuss how estimation errors affect the performance of QEM. Given a perfect characterization of the noise model $\mathcal{N}_k$ for the $k$th gate, we can realize the inverse operation $\mathcal{N}_k^{-1}$ with probabilistic error cancellation to achieve $\mathcal{N}_k^{-1} \mathcal{N}_k = \mathcal{I}$. If we obtain an incorrect estimation $\mathcal{N}'_k \neq \mathcal{N}_k$ of the error process, it leads to an estimation error $\mathcal{N}'^{-1}_k \mathcal{N}_k \neq \mathcal{I}$.

Now, denoting the ideal process of the $k$th gate as $\mathcal{U}_k$, the difference between the error-mitigated process and the error-free process for the entire quantum circuit can be bounded with the diamond norm as

\[
\left\| \prod_{k=1}^{N_G} \mathcal{N}'^{-1}_k \mathcal{N}_k \mathcal{U}_k - \prod_{k=1}^{N_G} \mathcal{U}_k \right\|_\diamond \le \varepsilon' N_G, \quad (19)
\]

where we use the fact that the diamond norm is subadditive and we denote $\varepsilon' = \max_k \| \mathcal{N}'^{-1}_k \mathcal{N}_k - \mathcal{I} \|_\diamond$. Similarly, the discrepancy between the noisy and ideal processes can be upper bounded as $\| \prod_{k=1}^{N_G} \mathcal{N}_k \mathcal{U}_k - \prod_{k=1}^{N_G} \mathcal{U}_k \|_\diamond \le \varepsilon N_G$, where $\varepsilon = \max_k \| \mathcal{N}_k - \mathcal{I} \|_\diamond$. Because the deviation of the expectation values of an observable $M$ for two processes $\mathcal{E}_1$ and $\mathcal{E}_2$ with the input state $\rho$ can be bounded as $\delta M = \mathrm{Tr}\{ M [\mathcal{E}_1(\rho) - \mathcal{E}_2(\rho)] \} \le \| M \| \, \| \mathcal{E}_1 - \mathcal{E}_2 \|_\diamond$ [42], where $\| \cdot \|$ is an operator norm, we have $\delta M_{\mathrm{QEM}} \le \| M \| \varepsilon' N_G$ and $\delta M_{\mathrm{noise}} \le \| M \| \varepsilon N_G$. Here, $\delta M_{\mathrm{QEM}}$ and $\delta M_{\mathrm{noise}}$ are the deviations of the observable with and without error mitigation, respectively.

Thus, we can see that QEM is beneficial when we can achieve $r < 1$ for

\[
\varepsilon' = r \varepsilon. \quad (20)
\]

Note that this discussion does not include sampling errors; i.e., $\delta M$ is the error of the expectation value given infinite samples.
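A toy numerical check of this criterion (our construction, not from the paper): for a single-qubit depolarizing channel inverted with a misestimated error rate, the residual deviation from the identity, measured here by the Pauli-transfer eigenvalue, which tracks the diamond-norm distance up to a constant for this channel, is roughly the relative misestimation of the rate:

```python
def ptm_eigenvalue(p):
    """Pauli-transfer eigenvalue of a single-qubit depolarizing channel
    with error probability p."""
    return 1 - 4 * p / 3

p_true, p_est = 0.010, 0.009          # 10% misestimation of the error rate
f_true, f_est = ptm_eigenvalue(p_true), ptm_eigenvalue(p_est)

eps_noise = abs(f_true - 1)           # unmitigated distance from identity
eps_qem = abs(f_true / f_est - 1)     # residual after imperfect cancellation
r = eps_qem / eps_noise               # Eq. (20): here r is about 0.1, so
print(eps_noise, eps_qem, r)          # QEM still reduces the bias tenfold
```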
2. Efficiency of characterization of decoding errors
Regarding the causes of model estimation errors: when we use gate-set tomography to characterize the noise model for decoding errors, we must account for state-preparation and measurement (SPAM) errors and the finite statistical error arising from an insufficient number of samples. It has been
shown that the effect of SPAM errors can be eliminated in
the case of probabilistic error cancellation based on gate-
set tomography [22]. While the general choice of the gauge
is not compatible with the Pauli frame, we can modify
the scheme of gate-set tomography so that this method
is compatible with QEM with the Pauli frame. Refer to
Appendix C for details.
To achieve an accuracy $r$ given in Eq. (20), we need to perform $N_{\mathrm{GST}} = O[(r\varepsilon)^{-2}]$ samplings for gate-set tomography [28,43]. Here, we show that this cost is acceptable compared with the main part of FTQC; i.e., the time required for gate-set tomography corresponds to $O(r^{-2} n_q N_G)$ runs of the whole logical quantum circuit used to obtain expectation values, where $n_q$ is the number of logical qubits. Let the time for a single run of the logical circuit of FTQC be $\tau$. The depth of the logical quantum circuit is estimated as $O(N_G n_q^{-1})$, and the time per gate can be roughly approximated as $\tau_{\mathrm{gate}} = O(\tau n_q N_G^{-1})$. Then, the time for gate-set tomography can be estimated as $\tau_{\mathrm{GST}} = O(\tau_{\mathrm{gate}} N_{\mathrm{GST}}) = O[\tau N_G^{-1} n_q (r\varepsilon)^{-2}]$. In a situation where QEM is useful, we have $\varepsilon N_G = O(1)$ [44]. Thus, we can conclude that to use QEM to decrease the logical error rate $p_{\mathrm{dec}}$ to $r p_{\mathrm{dec}}$, we need gate-set tomography as a precomputation that takes $\tau_{\mathrm{GST}} = O(\tau N_G n_q r^{-2})$, i.e., $O(N_G n_q r^{-2})$ times longer than a single circuit run of FTQC. The numbers of logical gates $N_G$ and logical qubits $n_q$ are expected to grow polynomially with the problem size, and FTQC circuits will be repeated on the order of $O(r^{-2})$ times to make the statistical fluctuation of the expectation values smaller than the reduced bias. Accordingly, while the estimation cost of the noise map adds another overhead to FTQC that depends on the required accuracy, it takes a time that grows only polynomially with the problem size and requires no additional physical qubits. We remark that when we assume the noise properties of the quantum devices are uniform, we can perform the sampling for gate-set tomography in parallel. If we use all the logical qubits for characterization, the time for gate-set tomography is reduced to $\tau_{\mathrm{GST}} = O(\tau N_G r^{-2})$. Note that, in the scenario where we can fully parallelize the sampling procedure, i.e., when we have $O[(r\varepsilon)^{-2}]$ distinct uniform quantum gates, we have $\tau_{\mathrm{GST}} = O(\tau N_G^{-1} n_q)$.
To further make the characterization of noise more efficient, we propose an improved gate-set tomography for decoding errors of the Clifford process that is fast and compatible with the Pauli frame. See Appendix C for the details of this scheme. The number of measurements $N_{\mathrm{GST}}$ is reduced to $N_{\mathrm{GST}} = O(r^{-2} \varepsilon^{-1})$, which makes the cost of the precomputation $O(\tau n_q r^{-2})$. Thus, as long as $r$ is not too small, the time for characterization is expected to be relatively short. While our efficient gate-set tomography cannot be applied to the characterization of the $T$-gate preparation, several ways to reduce the cost of estimating the errors of $T$ gates can be considered. Since the error of the logical $T$ gate depends on the physical $T$ gate, and the injection and distillation process is constructed from circuits containing only a few $T$ gates, there may be an efficient way to numerically estimate the noise of the logical $T$ gate from the characterization of the physical $T$ gate together with efficient simulation of quantum circuits dominated by Clifford gates [45]. There may also be a way to mitigate $T$-gate errors by temporarily expanding the code distance or increasing the distillation depth for the $T$ gate. The cost of gate-set tomography might also be reduced by utilizing long-sequence GST [43], i.e., repeating several $T$ gates to amplify a small error rate to a large value. Reference [46] shows that if decoding errors of logical Clifford gates are negligible, one can reliably twirl the noise of the $T$ gate and perform efficient process tomography on it by repeating $T$ gates. Nevertheless, it is still an open problem whether there exists a more efficient gate-set tomography on the logical space with imperfect logical Clifford gates.
3. Effective increase in code distance by quantum error
mitigation under estimation errors
We can regard QEM as effectively increasing the code distance. Suppose that we can effectively achieve an $r$ times smaller logical error rate $p_{\mathrm{eff}} = r p_{\mathrm{dec}}$ via QEM. Since the logical error rate is roughly approximated in terms of the code distance as $p_{\mathrm{dec}}(d) = p (p/p_{\mathrm{th}})^{(d-1)/2}$, QEM effectively achieves a larger code distance $d'$ satisfying $p_{\mathrm{eff}} = p_{\mathrm{dec}}(d')$ without increasing the number of physical qubits. The effective increase in the code distance via QEM can be derived as

\[
d' - d = \frac{2 \ln r}{\ln(p/p_{\mathrm{th}})}. \quad (21)
\]
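Eq. (21) can be checked with a quick numerical example (the values of $p$ and $p_{\mathrm{th}}$ are illustrative):

```python
import math

def distance_gain(r, p, p_th):
    """Effective code-distance increase of Eq. (21):
    d' - d = 2 ln r / ln(p / p_th)."""
    return 2 * math.log(r) / math.log(p / p_th)

p, p_th = 1e-3, 1e-2                  # illustrative physical and threshold rates
x = 1.5
r = (p / p_th) ** x                   # bias reduction r = (p/p_th)^x
print(distance_gain(r, p, p_th))      # approximately 2x = 3.0
```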
Therefore, by setting $r = (p/p_{\mathrm{th}})^x$, we can effectively increase the code distance by $2x$. Note that, as discussed in the previous sections, we need $\exp[O(N_{\mathrm{dec}} p_{\mathrm{dec}})] = \exp[O(1)]$ times more repetitions to achieve the same precision as in the error-free case. It