Vertex Fault-Tolerant Emulators

Abstract

A $k$-spanner of a graph $G$ is a sparse subgraph that preserves its shortest path distances up to a multiplicative stretch factor of $k$, and a $k$-emulator is similar but not required to be a subgraph of $G$. A classic theorem by Thorup and Zwick [JACM '05] shows that, despite the extra flexibility available to emulators, the size/stretch tradeoffs for spanners and emulators are equivalent. Our main result is that this equivalence in tradeoffs no longer holds in the commonly-studied setting of graphs with vertex failures. That is: we introduce a natural definition of vertex fault-tolerant emulators, and then we show a three-way tradeoff between size, stretch, and fault-tolerance for these emulators that polynomially surpasses the tradeoff known to be optimal for spanners. We complement our emulator upper bound with a lower bound construction that is essentially tight (within $\log n$ factors of the upper bound) when the stretch is $2k-1$ and $k$ is either a fixed odd integer or $2$. We also show constructions of fault-tolerant emulators with additive error, demonstrating that these also enjoy significantly improved tradeoffs over those available for fault-tolerant additive spanners.
arXiv:2109.08042v1 [cs.DS] 16 Sep 2021
Greg Bodwin
University of Michigan
bodwin@umich.edu
Michael Dinitz
Johns Hopkins University
mdinitz@cs.jhu.edu
Yasamin Nazari
University of Salzburg
ynazari@cs.sbg.ac.at
Supported in part by NSF award CCF-1909111.
Supported in part by NSF award CCF-1909111 and by Austrian Science Fund (FWF) grant P 32863-N.
1 Introduction
Two well-studied objects in graph sparsification are spanners and emulators. Given a weighted input graph $G = (V, E, w)$, a $t$-spanner of $G$ is a subgraph $H$ of $G$ in which
$$\mathrm{dist}_G(u, v) \le \mathrm{dist}_H(u, v) \le t \cdot \mathrm{dist}_G(u, v) \qquad (1)$$
for all $u, v \in V$. Note that the first inequality, $\mathrm{dist}_G(u, v) \le \mathrm{dist}_H(u, v)$, is implied automatically by the fact that $H$ is a subgraph of $G$. The value $t$ is called the stretch of the spanner.
A $t$-emulator [19] is defined in the same way, except that $H$ is not required to be a subgraph of $G$. For emulators, the first inequality is not automatic, and it implies that any edge $(u, v)$ in the emulator $H$ but not in the input graph $G$ must have weight at least $\mathrm{dist}_G(u, v)$. In fact, it is easy to see that, without loss of generality, we may assign it weight exactly $\mathrm{dist}_G(u, v)$.
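As a minimal sketch of inequality (1), the following Python check tests whether a weighted graph $H$ is a $t$-emulator of $G$ by comparing all pairwise distances. The adjacency-dictionary representation, the helper names `dijkstra` and `is_t_emulator`, and the toy graphs are illustrative assumptions, not part of the paper; $G$ and $H$ are assumed to share the same node set.

```python
import heapq

def dijkstra(adj, src):
    """Shortest-path distances from src in a weighted undirected graph."""
    dist = {v: float("inf") for v in adj}
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def is_t_emulator(G, H, t):
    """Check dist_G <= dist_H <= t * dist_G for all pairs (inequality (1))."""
    for u in G:
        dist_G, dist_H = dijkstra(G, u), dijkstra(H, u)
        if any(not (dist_G[v] <= dist_H[v] <= t * dist_G[v]) for v in G):
            return False
    return True

# Toy input (hypothetical): the unit-weight path a - b - c.
G = {"a": {"b": 1}, "b": {"a": 1, "c": 1}, "c": {"b": 1}}
# H drops edge (b, c) and adds the non-graph edge (a, c) with weight dist_G(a, c) = 2.
H = {"a": {"b": 1, "c": 2}, "b": {"a": 1}, "c": {"a": 2}}
```

In this toy instance $H$ is not a subgraph of $G$, so it is an emulator but not a spanner; the pair $(b, c)$ has distance 3 in $H$ against 1 in $G$, so the check passes for $t = 3$ and fails for $t = 2$.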
1.1 Equivalence of Spanner/Emulator Tradeoffs
Both spanners and emulators have been studied extensively, and we have long had a complete understanding of the tradeoffs between spanner/emulator size (number of edges) and stretch. Specifically:
• Althöfer et al. [3] proved that for every positive integer $k$, every weighted graph $G = (V, E, w)$ has a $(2k-1)$-spanner with at most $O(n^{1+1/k})$ edges.
• On the lower bounds side, one can quickly verify that any unweighted input graph $G$ of girth $> 2k$ has no $(2k-1)$-spanner, except for $G$ itself. Under the Erdős girth conjecture [20], there are graphs of girth $> 2k$ with $\Omega(n^{1+1/k})$ edges. Thus, the upper bound of Althöfer et al. cannot be improved at all on these graphs.
• Thorup and Zwick [27] observed that essentially the same lower bound applies for emulators. For any two subgraphs $H_1, H_2$ of a graph of girth $> 2k$, they disagree on some pairwise distance by more than a factor of $2k-1$. This implies that $H_1, H_2$ need different representations as $(2k-1)$-emulators. There are $2^{\Omega(n^{1+1/k})}$ subgraphs of a girth conjecture graph, and so by a pigeonhole argument, one of these subgraphs requires an emulator on $\widetilde{\Omega}(n^{1+1/k})$ edges. (In fact, the same method gives a lower bound on the size of any data structure that approximately encodes graph distances, and hence this is often called an incompressibility argument.)
Thus, even though emulators are substantially more general objects than spanners, they do not
enjoy a meaningfully better tradeoff between size and stretch.
1.2 Fault-Tolerant Spanners
Spanners are commonly applied as a primitive in distributed computing, in which network nodes
or edges are prone to sporadic failures. This has motivated significant interest in fault-tolerant
spanners. Intuitively, a vertex fault-tolerant spanner is a subgraph that remains a spanner even
after a small set of nodes fails in both the spanner and the original graph. More formally, the
following definition was given by Chechik, Langberg, Peleg, and Roditty [17].
Definition 1 (VFT Spanner). Let $G = (V, E, w)$ be a weighted graph. A subgraph $H$ of $G$ is an $f$-vertex fault-tolerant ($f$-VFT) $t$-spanner of $G$ if, for all $F \subseteq V$ with $|F| \le f$, $H \setminus F$ is a $t$-spanner of $G \setminus F$.
After significant work following [17], we now completely understand the achievable bounds on fault-tolerant spanners: Bodwin and Patel [14] proved that every graph has an $f$-VFT $(2k-1)$-spanner with at most $O(f^{1-1/k} n^{1+1/k})$ edges (and the same bounds were shown to be achievable in polynomial time by [11, 18]), and Bodwin, Dinitz, Parter, and Williams [10] gave examples (under the girth conjecture) of graphs on which this bound cannot be improved in any range of parameters.
1.3 Fault-Tolerant Emulators
In this paper we ask a natural question: what if we add a fault-tolerance requirement to emulators?
Are stronger bounds possible than the ones known for spanners? Making progress on this requires
answers to two related questions:
1. How should we even define a fault-tolerant emulator? As we discuss shortly, there are two
different definitions that both seem plausible at first glance.
2. The lower bound on VFT spanners of [10] can also be generalized into an incompressibility
argument, like the one by Thorup and Zwick [27]. Since an emulator is just a different way
of compressing distances, why wouldn’t this lower bound apply to fault-tolerant emulators,
ruling out hope for a better size/stretch tradeoff?
These questions turn out to have some surprising answers. We first argue that, of the two
a priori reasonable definitions of fault-tolerant emulators, only one of them is actually sensible.
We then show that this definition escapes the incompressibility lower bound, and we design fault-
tolerant emulators that are sparser than the known lower bounds for fault-tolerant spanners by
poly(f) factors. We also discuss fault-tolerant emulators with additive stretch, and show that these
also enjoy substantial improvements in size/stretch tradeoff over fault-tolerant additive spanners.
1.3.1 VFT Emulator Definition
Before we can even discuss bounds or constructions, we need to define fault-tolerant emulators.
Following Definition 1, we get the following definition:
Definition 2 (VFT Emulator Template Definition). Let $G = (V, E, w)$ be a weighted graph. A graph $H$ is an $f$-vertex fault-tolerant ($f$-VFT) $t$-emulator of $G$ if, for all $F \subseteq V$ with $|F| \le f$, $H \setminus F$ is a $t$-emulator of $G \setminus F$.
However, there are two reasonable definitions of (non-faulty) t-emulators that we could plug
into this template definition. These definitions are functionally equivalent in the non-faulty setting,
but they give rise to two importantly different definitions of VFT emulators.
1. One natural possibility is to define a weighted graph $H$ to be a $t$-emulator of $G$ if it satisfies
$$\mathrm{dist}_G(u, v) \le \mathrm{dist}_H(u, v) \le t \cdot \mathrm{dist}_G(u, v)$$
for all nodes $u, v$. Plugging this into Definition 2, we get that a weighted graph $H$ is an $f$-VFT emulator of $G$ if, for any fault set $F \subseteq V$ with $|F| \le f$ and vertices $u, v \in V \setminus F$, we have
$$\mathrm{dist}_{G \setminus F}(u, v) \le \mathrm{dist}_{H \setminus F}(u, v) \le t \cdot \mathrm{dist}_{G \setminus F}(u, v).$$
2. Recall that in the non-faulty setting, we always set the weight of an emulator edge $\{u, v\}$ to be exactly $w(u, v) = \mathrm{dist}_G(u, v)$: we need $w(u, v) \ge \mathrm{dist}_G(u, v)$ in order to ensure that $\mathrm{dist}_G(u, v) \le \mathrm{dist}_H(u, v)$, and there is no benefit to setting $w(u, v) > \mathrm{dist}_G(u, v)$. In other words, we can define an emulator $H$ as an unweighted graph, where the weight of each edge $\{u, v\}$ simply becomes the corresponding distance $\mathrm{dist}_G(u, v)$ in the input graph. We then say that $H$ is a $t$-emulator if it satisfies the usual distance inequalities
$$\mathrm{dist}_G(u, v) \le \mathrm{dist}_H(u, v) \le t \cdot \mathrm{dist}_G(u, v)$$
after this reweighting. This is a subtle distinction, since there is no important difference from the previous one in the non-faulty setting. But passed through Definition 2, it gives an importantly different definition of VFT emulators:
Definition 3 (VFT Emulators). Let $G = (V, E, w)$ be a weighted graph, and let $H$ be a graph on vertex set $V$. For every fault set $F \subseteq V$ with $|F| \le f$ and every $u, v \notin F$ with $(u, v) \in E(H)$, we define a weight function $w_F$ where $w_F(u, v) = \mathrm{dist}_{G \setminus F}(u, v)$. We then define $\mathrm{dist}_{H \setminus F}(u, v)$ to be the $u \leadsto v$ shortest path distance in $H \setminus F$ under weight function $w_F$. We say that $H$ is an $f$-VFT $t$-emulator if
$$\mathrm{dist}_{G \setminus F}(u, v) \le \mathrm{dist}_{H \setminus F}(u, v) \le t \cdot \mathrm{dist}_{G \setminus F}(u, v)$$
for all $u, v \in V$ and for all $F \subseteq V$ with $|F| \le f$ and $u, v \notin F$.
In other words, for emulator edges in $H$, the edge weight in the post-fault graph $H \setminus F$ automatically updates to be equal to the shortest-path distance between the endpoints in the remaining graph $G \setminus F$.
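The weight-updating behavior of Definition 3 can be illustrated with a brute-force checker. This is only a sketch under assumptions: it enumerates every fault set of size at most $f$ (exponential time, so tiny inputs only), and the names `dist_from` and `is_vft_emulator`, the adjacency-dictionary input, and the toy path graph are hypothetical choices made for illustration.

```python
import heapq
from itertools import combinations

INF = float("inf")

def dist_from(adj, src, faults=frozenset()):
    """Dijkstra from src, ignoring every node in `faults`."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue
        for v, w in adj[u].items():
            if v not in faults and d + w < dist.get(v, INF):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def is_vft_emulator(G, H_edges, f, t):
    """Brute-force test of Definition 3 over all fault sets of size <= f."""
    nodes = sorted(G)
    for size in range(f + 1):
        for F in map(frozenset, combinations(nodes, size)):
            alive = [x for x in nodes if x not in F]
            dist_G = {u: dist_from(G, u, F) for u in alive}
            # Surviving emulator edges take weight w_F(a, b) = dist_{G \ F}(a, b).
            HF = {x: {} for x in alive}
            for a, b in H_edges:
                if a in HF and b in HF:
                    HF[a][b] = HF[b][a] = dist_G[a].get(b, INF)
            for u in alive:
                dist_H = dist_from(HF, u)
                for v in alive:
                    g, h = dist_G[u].get(v, INF), dist_H.get(v, INF)
                    if not (g <= h <= t * g):
                        return False
    return True

# Toy input (hypothetical): the unit-weight path u - x - v.
G = {"u": {"x": 1}, "x": {"u": 1, "v": 1}, "v": {"x": 1}}
H_edges = [("u", "v"), ("u", "x")]
```

On this toy input, `H_edges` passes with $f = 0$ and $t = 3$, but fails with $f = 1$: when $u$ fails, no emulator edge remains between $x$ and $v$ although they are still adjacent in $G \setminus F$.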
Our next task is to point out that the second definition is the natural one to study, both mathematically and because it captures applications of fault-tolerant emulators in distributed systems. Going forward, Definition 3 is the one we use.
1.3.2 Theoretical Motivation for the Second Definition
Although the first definition of VFT emulators may seem simpler, there is a pitfall when one attempts any construction under this definition. Imagine that we add an edge $(u, v)$ to an emulator $H$, where $(u, v)$ is not also an edge in $G$. Suppose we set its weight to $w(u, v) = \mathrm{dist}_G(u, v)$. Then after any set of vertex faults that stretches $\mathrm{dist}_G(u, v)$ at all, the $u \leadsto v$ distance will be smaller in $H$ than in $G$, violating the lower distance inequality! In general, one would always have to set emulator edge weights to be at least the maximum distance $\mathrm{dist}_{G \setminus F}(u, v)$ over all possible vertex fault sets $F$. This is an unnatural constraint, and it precludes most reasonable uses of emulator edges. For example, if $G$ is a path with three nodes $(u, x, v)$ and we create an emulator edge $(u, v)$, then if $F = \{x\}$ we will have $\mathrm{dist}_{G \setminus F}(u, v) = \infty$. Thus we are forced to set emulator edge weight $w(u, v) = \infty$, essentially disallowing this as an emulator edge at all.
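The pitfall can be seen on a small unweighted example. The sketch below is illustrative only: the helper `bfs_dist` and both toy graphs are assumptions introduced here, and distances are hop counts.

```python
from collections import deque

def bfs_dist(adj, src, dst, faults=frozenset()):
    """Hop distance from src to dst in an unweighted graph with `faults` removed."""
    seen = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return seen[u]
        for v in adj[u]:
            if v not in faults and v not in seen:
                seen[v] = seen[u] + 1
                queue.append(v)
    return float("inf")

# Hypothetical graph with two u-v routes: u - a - v (2 hops) and u - b - c - v (3 hops).
G = {"u": ["a", "b"], "a": ["u", "v"], "v": ["a", "c"],
     "b": ["u", "c"], "c": ["b", "v"]}
static_weight = bfs_dist(G, "u", "v")             # weight fixed before any fault
after_fault = bfs_dist(G, "u", "v", faults={"a"})  # true distance once a fails

# The three-node path from the text: removing x disconnects u from v.
P = {"u": ["x"], "x": ["u", "v"], "v": ["x"]}
path_after_fault = bfs_dist(P, "u", "v", faults={"x"})
```

Here a static emulator edge $(u, v)$ of weight `static_weight` would undercut the post-fault distance `after_fault`, which is exactly the violation of the lower inequality described above; on the three-node path the post-fault distance is infinite.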
The other issue is the incompressibility lower bounds from [10]. The lower bound on VFT spanners from [10] actually holds for all compression schemes: one cannot generally build a data structure on $o(f^{1-1/k} n^{1+1/k})$ bits that can report $(2k-1)$-approximate distances between all pairs of vertices under at most $f$ vertex faults. The first definition of VFT emulators functions as such a compression scheme, so it cannot achieve improved bounds.
Why can we hope for the second definition of VFT emulators to bypass this lower bound? The answer lies in the fact that our emulator definition updates its edge weights under faults. A VFT emulator cannot actually be represented by a data structure of size approximately equal to the number of edges in the emulator, since a static data structure would not have this updating behavior. In other words, since we assume that weight updates occur automatically, we are not charging ourselves for the extra information one would have to carry around in order to actually compute these updates. This means it is a priori possible that the second definition of fault-tolerant emulators can be significantly sparser than fault-tolerant spanners.
1.3.3 Practical Motivation for the Second Definition
Now we explain the practical motivation behind the second definition. Automatically updating edge
weights may seem at first like an incredibly strong and unrealistic assumption. Indeed, in some
of the contexts in which spanners and emulators are used this would not be reasonable, e.g., as a
preprocessing step for computing shortest paths [2, 19]. But spanners were originally designed for
use in distributed computing [23,24], and in distributed contexts, emulator edges typically represent
logical links rather than physical links. That is, each emulator edge is treated as if it represents
a path between the endpoints, since that is how packets/messages would actually travel between
the endpoints. An example of this is overlay networks, where one builds a logical network that
lives “on top of” another network (usually the Internet). Overlay networks have been extensively
studied, often either directly or indirectly using spanners, emulators, or related objects [4–6, 25,26].
In a logical link on top of an underlying network, packets are automatically rerouted post-failures using some routing protocol on the underlying topology. The vast majority of these routing protocols use shortest paths. So for a logical link $(u, v)$, we would actually expect its distance to “automatically” become $\mathrm{dist}_{G \setminus F}(u, v)$, where the seeming “magic” of the edge weight update is implemented by the underlying routing algorithm converging on new shortest paths.
So in applications of emulators to distributed computing, it is very reasonable to assume that edges take on weight equal to the remaining shortest path length. Note that this does not obviate the need for emulators: in an overlay network there will be a layer of routing in the overlay network itself (on top of the underlying network), so packets sent from $u$ to $v$ will follow shortest paths between $u$ and $v$ in the overlay. Thus, these packets will experience stretch according to the stretch of the weight-updating emulator.
1.4 Our Results
Our previous discussion explains why it is possible for VFT emulators to improve on the size/stretch
tradeoff available to VFT spanners. Our main results confirm this possibility; we construct VFT
emulators that polynomially surpass the lower bounds for VFT spanners.
1.4.1 Multiplicative Stretch
Our most general results (and main technical contributions) are in the multiplicative stretch setting,
where we prove the following upper bound.
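Theorem 1.1. For all $k \ge 1$ and $f \le n$, every $n$-node weighted graph $G = (V, E, w)$ admits an $f$-VFT $(2k-1)$-emulator $H$ with
$$|E(H)| \le \begin{cases} \widetilde{O}_k\left(f^{\frac{1}{2} - \frac{1}{2k}} n^{1+1/k} + fn\right) & \text{if } k \text{ is odd} \\ \widetilde{O}_k\left(f^{\frac{1}{2}} n^{1+1/k} + fn\right) & \text{if } k \text{ is even.} \end{cases}$$
Moreover, there is a randomized polynomial-time algorithm which constructs such an emulator with high probability.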
In the above theorem, $\widetilde{O}_k$ hides factors that are polylogarithmic in $n$, and also factors that are exponential in $k$.$^1$ We typically think of $f$ as being polynomial in $n$, and in this setting (and when $k$ is a constant at least 3), our emulators improve polynomially on VFT spanners.
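To make the size of the improvement concrete, the following worked comparison divides the VFT spanner bound $O(f^{1-1/k} n^{1+1/k})$ of [14] by the leading term of the emulator bound in Theorem 1.1. It is a back-of-the-envelope calculation that ignores the polylogarithmic and $\exp(k)$ factors hidden in $\widetilde{O}_k$ and assumes the additive $fn$ term is not dominant.

```latex
% Ratio of the VFT spanner bound to the leading emulator term of Theorem 1.1.
\[
\frac{f^{1-1/k}\, n^{1+1/k}}{f^{\frac12-\frac1{2k}}\, n^{1+1/k}} = f^{\frac12-\frac1{2k}} \quad (k \text{ odd}),
\qquad
\frac{f^{1-1/k}\, n^{1+1/k}}{f^{\frac12}\, n^{1+1/k}} = f^{\frac12-\frac1k} \quad (k \text{ even}).
\]
% Example: k = 3 gives f^{2/3} n^{4/3} versus f^{1/3} n^{4/3}, a saving of f^{1/3}.
% For k = 2 the exponent 1/2 - 1/k equals 0, so no polynomial saving in f appears.
```

This matches the remark that the improvement is polynomial when $f$ is polynomial in $n$ and $k$ is a constant at least 3.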
The algorithm we design to prove Theorem 1.1 starts from the basic greedy VFT spanner
algorithm of [10] (and its polytime extension in [18]), where we consider edges in nondecreasing
weight order and add an edge if there is a fault set that forces us to add it. To take advantage of
the power of emulators, though, we augment this with an extra “path sampling” step: intuitively,
when we decide to add a spanner edge, we also flip a biased coin for every k-path that it completes
to decide whether to also add an emulator edge between the endpoints of the path. These extra
emulator edges do not replace the added spanner edge (i.e., we do not add the emulator edge
instead of the spanner edge), but instead act to help protect future graph edges in the ordering,
making it less likely that we will need to add spanner edges downstream. Our two main lemmas
are roughly that 1) with high probability there are not too many k-paths in our final emulator, and
2) if the emulator has many edges then it has many k-paths. Combining these with an appropriate
parameter balance gives us Theorem 1.1.
The technical details of this analysis get surprisingly tricky, and it turns out that we actually
cannot consider all k-paths in the algorithm and analysis outlined above, but only a carefully
selected subset of them that we call “SALAD” paths. The technical details of these paths are
responsible for the $\exp(k)$ factors and the even/odd $k$ distinction (a similar even/odd distinction
appears in [12], for a similar technical reason).
We complement our emulator upper bound with a nearly matching lower bound, which is a
relatively straightforward extension of the edge-fault-tolerant lower bound for spanners from [10].
The case of $k = 2$ (stretch 3) is slightly different, so we handle it separately.
Theorem 1.2. For all positive integers $n, f$ with $f \le n$, there exists an unweighted $n$-node graph with $\Omega(f^{1/2} n^{3/2})$ edges for which any $f$-VFT 3-emulator must have at least $\Omega(f^{1/2} n^{3/2})$ edges.
Theorem 1.3. Assuming the Erdős girth conjecture, for all $k \ge 3$ and $f \le n$ there is an unweighted $n$-node graph in which every $f$-VFT $(2k-1)$-emulator has at least $\Omega\left(k^{-1} f^{\frac{1}{2} - \frac{1}{2k}} n^{1+1/k}\right)$ edges.
This lower bound matches our upper bound for constant odd $k$, and is off by only an $f^{1/(2k)}$ factor for constant even $k$. An easy folklore observation implies that any $f$-regular input graph requires $\Omega(fn)$ edges for an $f$-VFT emulator, so our $+fn$ terms cannot be removed either.
1.4.2 Additive Stretch
Spanners and emulators are also studied in the context of additive stretch: a $+k$-spanner/emulator $H$ of an input graph $G$ is one that satisfies the distance inequality
$$\mathrm{dist}_G(u, v) \le \mathrm{dist}_H(u, v) \le \mathrm{dist}_G(u, v) + k$$
for all nodes $u, v$. We have a complete understanding of the possibilities for additive emulators. It is known that every unweighted input graph has a $+2$-emulator on $O(n^{3/2})$ edges [2] and a $+4$-emulator on $O(n^{4/3})$ edges [19]. These emulators are optimal, both in the sense that neither size bound can be improved at all, and in the sense that no $+c$-emulator can achieve $O(n^{4/3-\varepsilon})$ edges, even when $c$ is an arbitrarily large constant [1]. For spanners, our understanding lags only slightly behind: all graphs have $+2$-spanners on $O(n^{3/2})$ edges [2, 21], $+4$-spanners on $\widetilde{O}(n^{7/5})$ edges [16], and $+6$-spanners on $O(n^{4/3})$ edges [8, 21, 29].
$^1$ When $k$ is super-constant, [11, 14, 18] already give an upper bound of $O(f n^{1+o(1)})$ for VFT spanners, which cannot be improved beyond $O(fn)$ even with emulators. Thus the most interesting remaining parameter regime is when $k = O(1)$.
Braunschvig, Chechik, Peleg, and Sealfon [15] were the first to introduce fault-tolerance to
additive spanners, via the natural extension of Definition 3.
Unfortunately, it turns out that the price of fault-tolerance for additive spanners with fixed error is untenably high. It is proved in [13] that, for any fixed constant $c$, there are graphs on which an $f$-VFT $+c$-spanner needs $n^{2-\Omega(1/f)}$ edges. In other words, tolerating one additional fault costs $\mathrm{poly}(n)$ in spanner size, and there is no way to tolerate $\Omega(\log n)$ faults in subquadratic size. Accordingly, constructions of VFT spanners of fixed size have to pay super-constant additive error of type $+O(f)$ [9, 13, 15, 22].
We define VFT additive emulators with similar weight-updating behavior as in the multiplicative setting, with the same motivation. We then show that these emulators actually avoid the undesirable size/fault-tolerance tradeoff suffered by VFT spanners. We show the following extensions of the $+2$ and $+4$ emulators:
Theorem 1.4. For all $f \le n$, every $n$-node unweighted graph $G = (V, E)$ admits an $f$-VFT $+2$-emulator $H$ with $|E(H)| = \widetilde{O}(f^{1/2} n^{3/2})$ edges. There is also a randomized polynomial-time algorithm which computes such an emulator with high probability.
Theorem 1.5. For all $f \le n$, every $n$-node unweighted graph $G = (V, E)$ admits an $f$-VFT $+4$-emulator $H$ with $|E(H)| = \widetilde{O}(f^{1/3} n^{4/3} + nf)$ edges. There is also a randomized polynomial-time algorithm which computes such an emulator with high probability.
The main point of these results is that the price of fault-tolerance for additive emulators is a multiplicative factor depending only on $f$, rather than the parameter $f$ appearing in the exponent of the dependence on $n$, as it does for VFT additive spanners. Moreover, the $f$-factors we obtain are essentially tight by our previous lower bound. Any $+2$-emulator is also a 3-emulator and hence by Theorem 1.2 must have size at least $\Omega(f^{1/2} n^{3/2})$, and any $+4$-emulator is also a 5-emulator and so by Theorem 1.3 must have size at least $\Omega(f^{1/3} n^{4/3})$.
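The two implications used here can be spelled out as follows. This is a short worked derivation, using only that distinct nodes in an unweighted graph are at distance at least 1; the same inequalities apply verbatim to $G \setminus F$ and $H \setminus F$ for every fault set $F$.

```latex
% Additive stretch implies multiplicative stretch in unweighted graphs:
% for u != v we have dist_G(u, v) >= 1, hence
\[
\mathrm{dist}_H(u,v) \le \mathrm{dist}_G(u,v) + 2 \le 3\,\mathrm{dist}_G(u,v),
\qquad
\mathrm{dist}_H(u,v) \le \mathrm{dist}_G(u,v) + 4 \le 5\,\mathrm{dist}_G(u,v).
\]
% Plugging k = 3 (stretch 2k - 1 = 5) into Theorem 1.3, with k^{-1} a constant:
\[
f^{\frac12 - \frac1{2k}}\, n^{1+1/k}\Big|_{k=3} = f^{\frac12 - \frac16}\, n^{1+\frac13} = f^{1/3}\, n^{4/3}.
\]
```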
1.5 Outline
We begin by proving Theorem 1.1 in the special case $k = 3$ in Section 2. This introduces the main ideas and approach that we use to prove Theorem 1.1 in general, but it also happens to avoid a few technical details that become necessary only when we move to larger $k$ (allowing us to replace the complicated SALAD paths with simpler “middle-heavy fault-avoiding” paths). We then prove
Theorem 1.1 in its full generality: in Section 3 we design an exponential-time algorithm which
proves existence of sparse fault-tolerant emulators for all k, and then in Section 4 we show how
to use ideas from [18] to make the algorithm polynomial-time without significant loss in emulator
sparsity. We then prove our lower bounds (Theorems 1.2 and 1.3) in Section 5, and we conclude
with our results on additive emulators in Section 6.
2 Warmup: $k = 3$ (Stretch 5)
We will warm up by proving the following special case of Theorem 1.1:
Theorem 2.1 (Theorem 1.1, $k = 3$). For all $f \le n$, every $n$-node weighted graph $G = (V, E, w)$ has an $f$-VFT 5-emulator $H$ with $|E(H)| \le \widetilde{O}(f^{1/3} n^{4/3}) + O(fn)$.
Our algorithm for 5-emulators is given in Algorithm 1. We incrementally build an emulator $H$ by starting with an empty graph and adding edges. We designate our added edges as spanner edges (which are always contained in the input graph) and emulator edges (which are not generally in the input graph). We then let $H^{(sp)}$ be the subgraph of $H$ containing only its spanner edges, and let $H^{(em)}$ be the subgraph of $H$ containing only its emulator edges.
The algorithm is defined with respect to a parameter $d$. Intuitively we can think of $d$ as (roughly) the desired average node degree in our final emulator: we will set $d = f^{1/3} n^{1/3}$.
Algorithm 1: Algorithm for $f$-VFT $(2k-1)$-emulators (here $k = 3$, stretch 5)
Input: Graph $G = (V, E, w)$, positive integer $f$;
Let $H \leftarrow (V, \emptyset, w)$ be the initially-empty emulator;
foreach edge $(u, v) \in E$ in order of nondecreasing weight $w(u, v)$ do
    if there is $F \subseteq V \setminus \{u, v\}$ of size $|F| \le f$ with $\mathrm{dist}_{H \setminus F}(u, v) > 5 \cdot w(u, v)$ then
        Add $(u, v)$ as a spanner edge to $H$;
        foreach $s \in N_{H^{(sp)} \setminus F}(u)$ and $t \in N_{H^{(sp)} \setminus F}(v)$ do
            Add $(s, t)$ as an emulator edge with probability $d^{-2}$;
return $H$;
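A direct, exponential-time rendering of Algorithm 1 in Python may help fix ideas. This is a sketch under several assumptions that are not in the paper: the witness fault set is found by brute-force enumeration, node labels are sortable, degenerate pairs ($s = t$, $s = v$, $t = u$) are skipped, ties in edge weight are broken by node labels, and the names `dist_from`, `post_fault_emulator`, `find_witness`, and `vft_5_emulator` are hypothetical. All edges of $H$ are reweighted to $\mathrm{dist}_{G \setminus F}$ as in Definition 3.

```python
import heapq
import random
from itertools import combinations

INF = float("inf")

def dist_from(adj, src, faults=frozenset()):
    """Dijkstra from src, ignoring every node in `faults`."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue
        for v, w in adj[u].items():
            if v not in faults and d + w < dist.get(v, INF):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def post_fault_emulator(G, edges, F):
    """Adjacency of H minus F, each edge reweighted to its distance in G minus F."""
    adj = {x: {} for x in G if x not in F}
    for a, b in edges:
        if a in adj and b in adj:
            adj[a][b] = adj[b][a] = dist_from(G, a, F).get(b, INF)
    return adj

def find_witness(G, edges, u, v, w, f):
    """Return a fault set F with |F| <= f forcing stretch > 5 on (u, v), else None."""
    others = [x for x in G if x not in (u, v)]
    for size in range(f + 1):
        for F in map(frozenset, combinations(others, size)):
            HF = post_fault_emulator(G, edges, F)
            if dist_from(HF, u).get(v, INF) > 5 * w:
                return F
    return None

def vft_5_emulator(G, f, d, seed=0):
    """Greedy spanner step plus path sampling; returns (spanner, emulator) edge sets."""
    rng = random.Random(seed)
    spanner, emulator = set(), set()
    for w, u, v in sorted((w, u, v) for u in G for v, w in G[u].items() if u < v):
        F = find_witness(G, spanner | emulator, u, v, w, f)
        if F is None:
            continue  # every allowed fault set already leaves stretch <= 5
        spanner.add((u, v))

        def nbrs(x):
            # Neighbours of x in H^(sp) minus F.
            return {b if a == x else a for a, b in spanner if x in (a, b)} - F

        for s in sorted(nbrs(u) - {v}):
            for t in sorted(nbrs(v) - {u}):
                if s != t and rng.random() < d ** -2:
                    emulator.add((min(s, t), max(s, t)))
    return spanner, emulator

# Toy input (hypothetical): path a - b - c - d whose middle edge is heaviest.
P = {"a": {"b": 1}, "b": {"a": 1, "c": 2}, "c": {"b": 2, "d": 1}, "d": {"c": 1}}
sp, em = vft_5_emulator(P, f=1, d=1)
```

With $d = 1$ every coin flip succeeds, so on this toy path the heaviest edge $(b, c)$ is added last and the single 3-path it completes yields the emulator edge $(a, d)$. The sketch is meant only to show the control flow; Section 4 of the paper describes how the fault-set search is made polynomial-time.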
We begin by proving correctness.
Lemma 2.2. The graph $H$ returned by Algorithm 1 is an $f$-VFT 5-emulator of $G$.
Proof. It is easy to see (and essentially standard) that we just need to prove that $\mathrm{dist}_{H \setminus F}(u, v) \le 5 \cdot w(u, v)$ for each edge $(u, v) \in E$ and possible fault set $F$: by considering shortest paths in $G \setminus F$, this suffices to imply that $\mathrm{dist}_{H \setminus F}(x, y) \le 5 \cdot \mathrm{dist}_{G \setminus F}(x, y)$ for all $x, y, F$ with $x, y \notin F$, and hence implies that $H$ is an $f$-VFT 5-emulator of $G$.
So let $(u, v) \in E$, and let $F \subseteq V \setminus \{u, v\}$ with $|F| \le f$. If $(u, v) \in E(H)$, then this is trivially true since then $\mathrm{dist}_{H \setminus F}(u, v) \le w(u, v) = \mathrm{dist}_{G \setminus F}(u, v)$. Otherwise, Algorithm 1 did not add $(u, v)$ to $H$. By the condition of the if statement, this implies that $\mathrm{dist}_{H \setminus F}(u, v) \le 5 \cdot w(u, v)$ as claimed.
We now move to the more difficult (and interesting) task of proving sparsity. We will assume for
convenience that all edge weights in the input graph Gare unique, so that we may unambiguously
refer to the heaviest edge among a set of edges. If not, the following argument still goes through
if we break ties between edge weights by the order in which the edges are considered by the
algorithm. We need to bound the number of spanner edges and the number of emulator edges in
the construction; our strategy is to count the number of instances of a particular structure in $H^{(sp)}$
called middle-heavy fault-avoiding 3-paths, and then we will use this counting in two different ways
to bound the number of spanner edges in H, and then the number of emulator edges in H.
2.1 Sparsity Analysis
We start with the definitions of the paths that we care about, and then prove some of their properties
and begin to count them.
Definition 4 (Middle-Heavy 3-Paths). A 3-path $\pi$ with node sequence $(s, u, v, t)$ is middle-heavy if its middle edge is its heaviest one; that is, $w(u, v) > w(s, u)$ and $w(u, v) > w(v, t)$.
When adding the edge $(u, v)$ to $H^{(sp)}$ creates a middle-heavy path $\pi$, we say that $\pi$ is completed by $(u, v)$ (i.e., after adding $(u, v)$, $\pi$ exists in $H^{(sp)}$).
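Definition 4 can be illustrated by enumerating middle-heavy 3-paths in a small weighted graph. The sketch below is an assumption-laden illustration: the function name `middle_heavy_3_paths` is hypothetical, only simple paths (four distinct nodes) are listed, and each undirected path is reported once per orientation.

```python
def middle_heavy_3_paths(adj):
    """List simple node sequences (s, u, v, t) whose middle edge is strictly heaviest."""
    paths = []
    for u in adj:
        for v, w_mid in adj[u].items():
            for s, w_left in adj[u].items():
                for t, w_right in adj[v].items():
                    distinct = len({s, u, v, t}) == 4
                    if distinct and w_mid > w_left and w_mid > w_right:
                        paths.append((s, u, v, t))
    return paths

# Toy inputs (hypothetical): the path a - b - c - d with two weightings.
heavy_middle = {"a": {"b": 1}, "b": {"a": 1, "c": 2},
                "c": {"b": 2, "d": 1}, "d": {"c": 1}}
heavy_end = {"a": {"b": 2}, "b": {"a": 2, "c": 1},
             "c": {"b": 1, "d": 1}, "d": {"c": 1}}
found = sorted(middle_heavy_3_paths(heavy_middle))
none_found = middle_heavy_3_paths(heavy_end)
```

Because Algorithm 1 considers edges in nondecreasing weight order, the middle edge of such a path is the last of its three edges to be added, which is why it is the edge that completes the path.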
For every edge $(u, v)$ added by the algorithm, there must exist some set $F_{(u,v)}$ with $|F_{(u,v)}| \le f$ such that $\mathrm{dist}_{H \setminus F_{(u,v)}}(u, v) > 5 \cdot w(u, v)$ (or else the algorithm would not have added $(u, v)$). If multiple such sets exist, choose one arbitrarily as $F_{(u,v)}$.
Definition 5 (Fault-Avoiding Paths). A path $\pi$ in $H^{(sp)}$ with heaviest edge $(u, v)$ is fault-avoiding if $\pi \cap F_{(u,v)} = \emptyset$.
We first prove an auxiliary counting lemma. Let $C_{(s,t)}$ count the number of middle-heavy fault-avoiding 3-paths from $s$ to $t$ in $H^{(sp)}$ at a given moment in the algorithm. Whenever we choose to add a spanner edge $(u, v)$, we define the set
$$\Psi(u, v) := \{(s, t) \in V \times V \mid (u, v) \text{ completes a middle-heavy fault-avoiding 3-path from } s \text{ to } t\}.$$
The following lemma gives a certain kind of control on the values that $C_{(s,t)}$ can reach:
Lemma 2.3. With high probability, whenever we add a new spanner edge $(u, v)$ in our algorithm, we have
$$\sum_{(s,t) \in \Psi(u,v)} C_{(s,t)} \le \widetilde{O}(f d^2)$$
where the values $C_{(s,t)}$ are defined just before $(u, v)$ is added to $H$.
Proof Sketch. We defer the full proof to Appendix A, since the details are technical and do not provide much additional insight. Intuitively, this lemma is true because the counter value $C_{(s,t)}$ corresponds to the number of different times we flipped a coin to decide whether or not to add $(s, t)$ as an edge (since $C_{(s,t)}$ is the number of middle-heavy 3-paths between $s$ and $t$, and for each such path we flip such a coin). Since each coin has bias $d^{-2}$ by the definition of the algorithm, if $\sum_{(s,t) \in \Psi(u,v)} C_{(s,t)}$ were much larger than $d^2$ then with high probability there would already be an emulator edge $(s, t)$ where $(s, t) \in \Psi(u, v)$. And if such an edge existed, the path $(u, s, t, v)$ would have stretch at most 5, and hence we would not have added $(u, v)$.
Making this formal requires union bounding over all possible fault sets $F$ in the definition of fault-avoiding rather than just considering $F_{(u,v)}$, which also causes the extra factor of $f$ in the lemma statement. This introduces significant extra notation but is a straightforward calculation, so we defer it to Appendix A.
We can now use Lemma 2.3 to bound the number of middle-heavy fault-avoiding 3-paths.
Lemma 2.4. With high probability, there are at most $d^2 |E(H^{(sp)})| + \widetilde{O}(fn^2)$ total middle-heavy fault-avoiding 3-paths in the final graph $H^{(sp)}$.
Proof. For each edge $(u, v)$ added to the emulator, let us split into two cases by the size of $\Psi(u, v)$. Notice that, since a middle-heavy fault-avoiding path completed by $(u, v)$ is uniquely determined by $(u, v)$ and its endpoints, the size of $\Psi(u, v)$ is the same as the number of middle-heavy fault-avoiding paths completed by $(u, v)$.
Case 1: $|\Psi(u, v)| \le d^2$. In this case, the edge $(u, v)$ completes at most $d^2$ new middle-heavy fault-avoiding 3-paths. Summing over all such edges, at most $d^2 |E(H^{(sp)})|$ middle-heavy fault-avoiding 3-paths can be completed by edges of this type.
Case 2: $|\Psi(u, v)| > d^2$. Assuming the high-probability event from Lemma 2.3 holds, we also have
$$\sum_{(s,t) \in \Psi(u,v)} C_{(s,t)} = \widetilde{O}(f d^2).$$
Thus, the average value of $C_{(s,t)}$ over the $> d^2$ node pairs in $\Psi(u, v)$ is $\widetilde{O}(f)$. So by Markov’s inequality, for at least half of the node pairs $(s, t) \in \Psi(u, v)$, we have $C_{(s,t)} = \widetilde{O}(f)$.
This implies that only $\widetilde{O}(fn^2)$ middle-heavy fault-avoiding 3-paths may be completed by edges from this case, by a straightforward amortization argument over all pairs $(s, t)$. Whenever a middle-heavy fault-avoiding 3-path $\pi = (s, u, v, t)$ is completed by an edge in this second case, let us say that $\pi$ is dispersed if $C_{(s,t)} = \widetilde{O}(f)$.
By the previous argument, at least half of all paths completed by edges in this case are dispersed, so it suffices to only count the dispersed paths. Moreover, by definition of $C_{(s,t)}$ every dispersed path from $s$ to $t$ is among the first $\widetilde{O}(f)$ middle-heavy 3-paths from $s$ to $t$; thus, summing over all choices of $s, t$ there are $\widetilde{O}(fn^2)$ dispersed paths in total.
Combining the two cases, we get at most $d^2 |E(H^{(sp)})| + \widetilde{O}(fn^2)$ middle-heavy fault-avoiding 3-paths in $H^{(sp)}$.
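The Markov step in Case 2 can be written out explicitly. The following is a short worked derivation; the symbol $\mu$ for the average counter value is introduced here for illustration and does not appear in the original text.

```latex
% Let mu be the average of C_{(s,t)} over the pairs in Psi(u, v).
\[
\mu := \frac{1}{|\Psi(u,v)|} \sum_{(s,t) \in \Psi(u,v)} C_{(s,t)}
\le \frac{\widetilde{O}(f d^2)}{d^2} = \widetilde{O}(f)
\qquad \text{since } |\Psi(u,v)| > d^2 .
\]
% Markov's inequality: the pairs with counter at least 2 mu contribute at least
% 2 mu each to a sum that equals |Psi(u, v)| mu in total, so
\[
\bigl|\{(s,t) \in \Psi(u,v) : C_{(s,t)} \ge 2\mu\}\bigr| \cdot 2\mu \le |\Psi(u,v)| \cdot \mu
\;\Longrightarrow\;
\bigl|\{(s,t) \in \Psi(u,v) : C_{(s,t)} \ge 2\mu\}\bigr| \le \tfrac12 |\Psi(u,v)| .
\]
```

Hence at least half of the pairs in $\Psi(u, v)$ have $C_{(s,t)} < 2\mu = \widetilde{O}(f)$, which is the dispersed half used in the counting argument.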
We now show how to use the above bound on middle-heavy fault-avoiding 3-paths to bound the
number of edges in our emulator. We first bound the number of emulator edges (edges which were
added by the path sampling and so might not be in E) in terms of the number of spanner edges
(edges from E), and then bound the number of spanner edges.
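Lemma 2.5 (Emulator Edge to Path Counting). With high probability, we have
$$\left|E\left(H^{(em)}\right)\right| \le O\left(\left|E\left(H^{(sp)}\right)\right| + \frac{\widetilde{O}(fn^2)}{d^2}\right).$$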
Proof. Let $\alpha = d^2 |E(H^{(sp)})| + \widetilde{O}(fn^2)$ be the bound on the number of middle-heavy fault-avoiding 3-paths in $H^{(sp)}$ which holds with high probability from Lemma 2.4. Consider the following two events.
• Let $A$ be the event that $H^{(sp)}$ has at most $\alpha$ middle-heavy fault-avoiding paths. We know from Lemma 2.4 that this holds with high probability.
• Whenever the algorithm considers adding some emulator edge, we call this an attempt. Let $X_i$ be an indicator random variable for the event that the $i$th attempt is successful (meaning that the emulator edge is actually added). If there is no $i$th attempt since the algorithm has terminated before $i$ attempts are made, then we set $X_i = 1$ with probability $d^{-2}$ and $X_i = 0$ with probability $1 - d^{-2}$. Note that $\mathbb{E}[X_i] = d^{-2}$ for all $i$. Moreover, note that $X_i$ and $X_j$ are independent for $i \ne j$. Let $X = \sum_{i=1}^{\alpha} X_i$, and let $B$ be the event that $X < 2 d^{-2} \alpha$. So if $B$ occurs, then of the first $\alpha$ attempts, at most $2 d^{-2} \alpha$ emulator edges are added. By linearity of expectations we know that $\mathbb{E}[X] = d^{-2} \alpha$. Moreover, we know that the $X_i$’s are independent and that $d^{-2} \alpha = \omega(\log n)$. Hence a standard Chernoff bound implies that $B$ occurs with high probability.
Since both $A$ and $B$ occur with high probability, a simple union bound implies that $A \cap B$ occurs with high probability. Note that every emulator edge is caused by some attempt, and that the number of attempts is precisely equal to the number of middle-heavy fault-avoiding 3-paths. Hence if both $A$ and $B$ occur, the number of emulator edges in $H$ is at most $O(d^{-2} \alpha)$, as claimed.
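For concreteness, the final bound $O(d^{-2}\alpha)$ can be expanded as follows. This is a worked calculation using only the definition of $\alpha$ and the choice $d = f^{1/3} n^{1/3}$ stated alongside Algorithm 1.

```latex
% Expanding d^{-2} alpha with alpha = d^2 |E(H^{(sp)})| + \tilde{O}(f n^2):
\[
d^{-2}\alpha
= d^{-2}\Bigl(d^{2}\,\bigl|E(H^{(sp)})\bigr| + \widetilde{O}(fn^2)\Bigr)
= \bigl|E(H^{(sp)})\bigr| + \frac{\widetilde{O}(fn^2)}{d^2}.
\]
% With d = f^{1/3} n^{1/3}, the second term becomes
\[
\frac{fn^2}{d^2} = \frac{fn^2}{f^{2/3}\, n^{2/3}} = f^{1/3}\, n^{4/3},
\]
% which matches the leading term of Theorem 2.1.
```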
Lemma 2.6 (Spanner Edge to Path Counting). Letting $m$ be the number of middle-heavy fault-avoiding 3-paths in $H^{(sp)}$, we have $\left|E\left(H^{(sp)}\right)\right| \le O\left(m^{1/3} n^{2/3} + nf\right)$.
Proof. Let $c > 0$ be a sufficiently small absolute constant and let $\delta$ be the average degree in $H^{(sp)}$. If $f > c\delta$ then we have $O(nf)$ edges in $H^{(sp)}$, and we are done. So assume in the following that $f \le c\delta$. We will pass from $H^{(sp)}$ to a subgraph $H' \subseteq H^{(sp)}$, and then to another subgraph $H'' \subseteq H'$. The first of these moves is simple: let $H'$ be a random induced subgraph of $H^{(sp)}$ obtained by independently keeping each node with probability $(c\delta)^{-1}$. For the second move, let us say that an edge $(u, v)$ in $H'$ is clean if none of the nodes in its associated fault set $F_{(u,v)}$ survive in $H'$. We define $H''$ as the subgraph of $H'$ that contains only its clean edges.
Let $m''$ be the number of middle-heavy 3-paths in $H''$ that are simple (do not repeat nodes). Our proof strategy is to bound the expected value of $m''$ from both below and above.
Lower Bound on $\mathbb{E}[m'']$. First, let us analyze the probability that a given edge $(u, v)$ in $H^{(sp)}$ survives in $H''$. The probability that $(u, v)$ survives in $H'$ is exactly $(c\delta)^{-2}$ (the event that $u, v$ each survive). Conditional on $(u, v)$ surviving in $H'$, it is clean iff none of the nodes in $F_{(u,v)}$ also survive. Since $|F_{(u,v)}| \le f \le c\delta$, $(u, v)$ is clean with constant probability. So $(u, v)$ survives in $H''$ with probability $\Omega((c\delta)^{-2})$, which implies
$$\mathbb{E}\left[\left|E(H'')\right|\right] = \left|E\left(H^{(sp)}\right)\right| \cdot \Omega\left((c\delta)^{-2}\right) = \Omega\left(n \delta^{-1} c^{-2}\right).$$
Meanwhile, the expected number of nodes that survive in $H''$ is exactly $\mathbb{E}[|V(H'')|] = n \delta^{-1} c^{-1}$.
Let us imagine that we start with an initially-empty graph on the vertex set $V(H'')$, and we add
the edges in $E(H'')$ one by one in order of increasing weight. For each added edge $(u,v)$ that is the
first edge incident to one of its endpoints $u$ or $v$, this edge does not create any new middle-heavy
3-paths. There are at most $|V(H'')|$ such edges. Any other edge $(u,v)$ creates at least one simple
middle-heavy 3-path in $H''$. Specifically, the 3-path $(s,u,v,t)$ in which it is the middle edge must
be middle-heavy by the order in which we are adding the edges, and it is simple since if $s = t$ then
we are forced to include $s \in F_{(u,v)}$, but then $s$ must not survive in $H'$ (since $(u,v)$ is clean). It
follows that
$$E[m''] \ge E\left[|E(H'')| - |V(H'')|\right] = E\left[|E(H'')|\right] - E\left[|V(H'')|\right] = \Omega\left(n\delta^{-1}c^{-2}\right) - n\delta^{-1}c^{-1} = \Omega\left(n\delta^{-1}c^{-2}\right)$$
by choice of small enough $c > 0$.
Upper Bound on $E[m'']$. We can relate $m$ and $m''$ as follows. We notice that every simple
middle-heavy 3-path $\pi$ in $H''$ must correspond to a fault-avoiding 3-path in $H^{(sp)}$. This holds
because if the middle edge $(u,v)$ of $\pi$ survives in $H''$, then it must be clean, implying that no nodes
in $F_{(u,v)}$ survive in $H''$.
Now let $q$ be a middle-heavy fault-avoiding 3-path in $H^{(sp)}$. We notice that $q$ must be simple,
since (as before) if $q = (s,u,v,s)$ then we would have to include $s \in F_{(u,v)}$ and so $q$ would not
be fault-avoiding. Since $q$ is simple it survives in $H'$ with probability exactly $(c\delta)^{-4}$, and thus it
survives in $H''$ with probability $\le (c\delta)^{-4}$. We therefore have $E[m''] \le O\left(m(c\delta)^{-4}\right)$.
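Putting It Together. By the previous two parts, we have $\Omega\left(n\delta^{-1}c^{-2}\right) \le E[m''] \le O\left(m(c\delta)^{-4}\right)$,
which implies that $\delta = O\left((m/(nc^2))^{1/3}\right)$, and thus $n\delta = O_c\left(m^{1/3}n^{2/3}\right)$. Since $\delta$ is defined as the
average degree in $H^{(sp)}$, this proves the lemma.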
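The rearrangement in the last step can be spelled out as follows. This is only a worked version of the algebra above, where $\lesssim$ hides absolute constants (our shorthand, not the paper's notation) and $c$ is treated as a fixed constant.

```latex
\frac{n}{c^{2}\delta} \;\lesssim\; \frac{m}{c^{4}\delta^{4}}
\;\Longrightarrow\; \delta^{3} \;\lesssim\; \frac{m}{c^{2}n}
\;\Longrightarrow\; \delta \;\lesssim\; \left(\frac{m}{c^{2}n}\right)^{1/3}
\;\Longrightarrow\; |E(H^{(sp)})| = \frac{n\delta}{2} \;\lesssim\; c^{-2/3}\, m^{1/3} n^{2/3}.
```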
Our size bound now follows by directly combining our previous three lemmas; see Appendix A.
Lemma 2.7. The emulator $H$ returned by Algorithm 1 has $|E(H)| = \widetilde{O}\left(f^{1/3}n^{4/3}\right) + O(fn)$ with
high probability.
3 Vertex Fault-Tolerant $(2k-1)$-Emulators
Our goal in this section is to prove Theorem 1.1. We start by defining several properties of certain
desired paths that let us generalize the algorithm.
3.1 SALAD Paths and Proof Overview
We begin by explaining, at a high level, the relationship between this argument for general $k$ and
the one given previously for $k = 3$. The core of our previous proof was a counting argument over
middle-heavy fault-avoiding 3-paths in $H^{(sp)}$. The core of our general argument will be a counting
argument over “SALAD” $k$-paths in $H^{(sp)}$. SALAD is an acronym for Simple, Alternating, Local,
Avoids faults, Dispersed. We will explain these five properties and their role in the analysis
momentarily, but first let us state our algorithm. This algorithm uses a notion of local paths,
which we define immediately after the algorithm and which has no analog in our simpler $k = 3$
case. We say that a path in $H^{(sp)}$ is completed by an edge $(u,v)$ if the path exists in $H^{(sp)}$ and
$(u,v)$ is the heaviest edge in the path (i.e., the path exists in $H^{(sp)}$ once $(u,v)$ has been added).
Algorithm 2: Algorithm for $f$-VFT $(2k-1)$-emulators
  Input: Graph $G = (V, E, w)$, positive integer $f$, odd positive integer $k$;
  Let $H \leftarrow (V, \emptyset, w)$ be the initially-empty emulator;
  foreach edge $(u,v) \in E$ in order of nondecreasing weight $w(u,v)$ do
    if there is $F \subseteq V \setminus \{u,v\}$ of size $|F| \le f$ with $\mathrm{dist}_{H \setminus F}(u,v) > (2k-1) \cdot w(u,v)$ then
      Add $(u,v)$ as a spanner edge to $H$;
      foreach local path $\pi$ in $H^{(sp)}$ with $j \le k$ edges completed by $(u,v)$ do
        Add the endpoints $(s,t)$ of $\pi$ as an emulator edge with probability $d^{-(j-1)}$;
  return $H$;
Stretch analysis of this algorithm is essentially the same as Lemma 2.2; we include it here for
completeness.
Lemma 3.1. The emulator $H$ returned by Algorithm 2 is an $f$-VFT $(2k-1)$-emulator.
Proof. Consider some $(u,v) \in E$ and $F \subseteq V \setminus \{u,v\}$ with $|F| \le f$. If $(u,v) \in E(H)$ then clearly
$\mathrm{dist}_{H \setminus F}(u,v) \le w(u,v)$. Otherwise, Algorithm 2 did not add $(u,v)$ to $H$, and so by the “if”
condition we know that $\mathrm{dist}_{H \setminus F}(u,v) \le (2k-1) \cdot w(u,v)$ as required.
We now define SALAD paths:
• Simple: $\pi$ does not repeat nodes. We implicitly required simplicity in our previous $k = 3$
  proof, since (as used in Lemma 2.6) a non-simple middle-heavy path of the form $(s,u,v,s)$ is
  not fault-avoiding. In our extension, it is more convenient to make the simplicity requirement
  explicit.
• Alternating: $\pi$ is alternating if every even-numbered edge in $\pi$ is heavier than the two
  adjacent odd-numbered edges. That is: if $\pi$ has edge sequence $(e_1, \dots, e_k)$, then for all $i$, we
  have $w(e_{2i}) > w(e_{2i-1})$ and $w(e_{2i}) > w(e_{2i+1})$. If $k$ is even, then $e_k$ only needs to be heavier
  than $e_{k-1}$.
  “Alternating” turns out to be the natural extension of “middle-heavy” to paths of length
  $k \ge 3$ (notice for $k = 3$, alternating and middle-heavy are the same). Roughly, our analysis
  will involve “splitting” paths over their heaviest edge and recursively analyzing the subpath
  on either side. Like for $k = 3$, this splitting process is most efficient when the heaviest edge is
  somewhere in the middle of the path (neither the first nor last one). An alternating path is
  one where the heaviest edge remains somewhere in the middle at every step of the recursion,
  until finally the path decomposes into individual edges. In fact, this is not quite true in the
  case where $k$ is even (due to the last edge), which is precisely why our bounds are a little
  worse for even $k$.
• Local: this is a new property that does not have an analog in our previous $k = 3$ argument.
  Let $b = \Theta(kd)$ be a parameter (it will be more convenient to specify the implicit constant
  later in the analysis). For each node $v$, we put the edges incident to $v$ in $H^{(sp)}$ into buckets
  $\{B_v^i\}$: the first $b$ edges incident to $v$ are in its first bucket $B_v^1$, the next $b$ edges are in the
  second bucket $B_v^2$, etc. We define $\pi$ to be local if, for any three-node contiguous subpath
  $(x,y,z) \subseteq \pi$, the edges $(x,y), (y,z)$ belong to the same bucket for $y$.
  Locality is necessary because we sample SALAD paths of all lengths $j \le k$. Our proof
  strategy from $k = 3$ works just fine to limit the emulator edges contributed by SALAD paths
  of length $k$. But it does not help us limit the emulator edges contributed by SALAD paths
  of shorter length $j < k$. By including locality explicitly, we gain an easy way to limit this
  quantity, at the price of a little more complexity in some of the downstream proofs.
• Avoids Faults: This is a slightly more stringent property than “fault-avoiding” used previously.
  Whenever we add a spanner edge $(u,v)$, let $F_{(u,v)}$ be a choice of fault set that forces
  $(u,v)$, just like in the $k = 3$ proof. We say that $\pi$ avoids faults if, for every edge $(u,v) \in \pi$
  (not just the heaviest one), we have $\pi \cap F_{(u,v)} = \emptyset$.
• Dispersed: this property showed up briefly in the $k = 3$ case, but we were able to bury it
  in the technical details of Lemma 2.4. Here, we need to bring it more to the forefront of the
  analysis. We will say that $\pi$ is a SALA path if it satisfies the first four properties described
  previously. Among the SALA paths, we will declare them either concentrated or dispersed as
  follows, and we will only use the dispersed ones in our analysis:
  - Notice that we can split $\pi$ into two (possibly empty) shorter SALA paths $\pi_1, \pi_2$ by
    removing its heaviest edge $(u,v)$. If either of $\pi_1, \pi_2$ is concentrated, then $\pi$ is concentrated
    as well. If $\pi_1, \pi_2$ are both dispersed, then we will say that $\pi$ is splittable, and it may still
    be concentrated or dispersed according to the following point:
  - Set a threshold parameter $\tau = \widetilde{O}(f)$. For all $1 \le j \le k$, among the splittable $j$-paths
    between each pair of endpoints $(s,t)$, the first $\tau^{\lceil (j-1)/2 \rceil}$ completed paths are dispersed and
    the rest are concentrated. If two $s \leadsto t$ splittable paths are completed by the same edge,
    and thus arise in $H^{(sp)}$ at the same time, then we pick an arbitrary order so that the
    “first” paths are unambiguous.
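To make the Alternating and Local definitions concrete, here is a minimal Python sketch. The function names, the list-of-weights representation of a path, and the `incident_order` dictionary are illustrative assumptions, not part of the paper. In particular, listing each node's neighbors in the order their edges were added to $H^{(sp)}$ is our assumed reading of “the first $b$ edges incident to $v$”.

```python
def is_alternating(weights):
    """weights[i] is the weight of edge e_{i+1} along the path (0-based list).

    Every even-numbered edge e_2, e_4, ... must be strictly heavier than its
    adjacent odd-numbered edges; a final even-numbered edge has only one neighbor.
    """
    for i in range(1, len(weights), 2):  # 0-based odd index = even-numbered edge
        if weights[i] <= weights[i - 1]:
            return False
        if i + 1 < len(weights) and weights[i] <= weights[i + 1]:
            return False
    return True


def bucket_of(incident_order, v, neighbor, b):
    """0-based bucket index at node v of the edge (v, neighbor).

    incident_order[v] lists v's neighbors in the (assumed) order in which the
    corresponding edges were added; buckets are consecutive groups of b edges.
    """
    return incident_order[v].index(neighbor) // b


def is_local(path, incident_order, b):
    """path is a list of nodes; check every contiguous triple (x, y, z)."""
    for x, y, z in zip(path, path[1:], path[2:]):
        if bucket_of(incident_order, y, x, b) != bucket_of(incident_order, y, z, b):
            return False
    return True
```

For example, with `b = 2` and `incident_order = {"y": ["a", "b", "c", "d"]}`, the path `["a", "y", "b"]` is local at `y`, while `["a", "y", "c"]` is not, since the edges to `a` and `c` fall in different buckets of `y`.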
The inclusion of locality among our properties actually significantly changes the structure of
the proof. Because we consider a more restricted kind of path, it gets much easier to control the
number of emulator edges (this is the whole point of locality):
Lemma 3.2. With high probability, $|E(H^{(em)})| \le \widetilde{O}\left(|E(H^{(sp)})| \cdot O(k)^k\right)$.
Proof. One generates a local $j$-path by picking an oriented edge to be the starting edge, and then
repeatedly extending the path by choosing 1 edge among the at most $b$ possible edges satisfying
the locality constraint. Hence there are at most $O\left(|E(H^{(sp)})| \cdot b^{j-1}\right)$ local $j$-paths in $H^{(sp)}$.
Each local $j$-path completed by a spanner edge $(u,v)$ is independently sampled as an emulator
edge with probability $d^{-(j-1)}$. Thus, the expected number of emulator edges contributed by local
$j$-paths is
$$O\left(|E(H^{(sp)})| \cdot \left(\frac{b}{d}\right)^{j-1}\right) \le |E(H^{(sp)})| \cdot O(k)^{k-1}.$$
Since the edges are sampled independently, by a standard Chernoff bound the number of emulator
edges contributed by local $j$-paths is, with high probability, at most
$$\widetilde{O}\left(|E(H^{(sp)})| \cdot O(k)^{k-1}\right).$$
The lemma now follows by unioning over all choices of $j \le k$.
On the other hand, it gets much harder to bound the number of spanner edges in H. We use
the following main technical lemma:
Lemma 3.3 (Counting Lemma). Let $c$ be a large enough absolute constant, suppose $H^{(sp)}$ has
average degree $\delta \ge cdk$, and also suppose $d \ge cf$. Then with high probability, $H^{(sp)}$ has at least $nd^k$
SALAD $k$-paths.
Before proving this lemma, we can do some simple algebra to show why it implies a bound on
spanner edges:
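Lemma 3.4. If we set parameter
$$d := \begin{cases} \max\left\{\mathrm{polylog}\, n \cdot f^{\frac{1}{2} - \frac{1}{2k}} n^{1/k},\ cf\right\} & \text{if } k \text{ odd} \\ \max\left\{\mathrm{polylog}\, n \cdot f^{\frac{1}{2}} n^{1/k},\ cf\right\} & \text{if } k \text{ even,} \end{cases}$$
with large enough polylogs, then with high probability, we have
$$|E(H^{(sp)})| \le \begin{cases} \widetilde{O}_k\left(f^{\frac{1}{2} - \frac{1}{2k}} n^{1+1/k} + fn\right) & \text{if } k \text{ odd} \\ \widetilde{O}_k\left(f^{\frac{1}{2}} n^{1+1/k} + fn\right) & \text{if } k \text{ even.} \end{cases}$$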
Proof, assuming Lemma 3.3. By definition of dispersion, for each node pair $(s,t)$, we can have only
$\widetilde{O}(f)^{\lceil (k-1)/2 \rceil}$ total $s \leadsto t$ SALAD $k$-paths, so there are at most $n^2 \cdot \widetilde{O}(f)^{\lceil (k-1)/2 \rceil}$ SALAD $k$-paths
in total. When $k$ is odd, this bound equals $n^2 \cdot \widetilde{O}(f)^{(k-1)/2}$. Based on the definition of $d$, and by
a choice of large enough polylog, this means that the number of SALAD $k$-paths in $H^{(sp)}$ is at most
$n^2 \cdot \widetilde{O}(f)^{(k-1)/2} < nd^k$.
If $k$ is even, there are at most $n^2 \cdot \widetilde{O}(f)^{\lceil (k-1)/2 \rceil} = n^2 \cdot \widetilde{O}(f)^{k/2}$ SALAD $k$-paths. Similarly, by
choosing a large enough polylogarithmic factor in the definition of $d$ for the even case, we also have
that the number of SALAD $k$-paths is strictly less than $nd^k$.
In both cases, by applying the counting lemma in contrapositive, we conclude that the average
degree in $H^{(sp)}$ is $\delta = O(dk)$. Thus $H^{(sp)}$ has $O(n\delta)$ edges, and by plugging in $d$ the claim
follows.
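The choice of $d$ can be checked directly. The following worked step only rearranges the inequality used in the proof, writing $\tau = \widetilde{O}(f)$ for the per-pair dispersion threshold as in the definition of Dispersed.

```latex
% Odd k: the path count n^2 \tau^{(k-1)/2} is below n d^k exactly when
n^{2}\,\tau^{(k-1)/2} < n\,d^{k}
\iff d^{k} > n\,\tau^{(k-1)/2}
\iff d > n^{1/k}\,\tau^{\frac{1}{2}-\frac{1}{2k}}.
% Even k: the exponent (k-1)/2 is replaced by k/2, which gives d > n^{1/k}\,\tau^{1/2}.
% In both cases \delta = O(dk) then yields |E(H^{(sp)})| = O(n\delta) = O(ndk).
```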
And now it is trivial to prove Theorem 1.1.
Proof of Theorem 1.1. The combination of Lemma 3.4 (which bounds the number of edges of $H^{(sp)}$)
and Lemma 3.2 (which relates the number of emulator edges added to $|E(H^{(sp)})|$) implies the
theorem.
So it just remains to prove our counting lemma, which is the main technical part of the proof.
3.2 Counting Lemma
Towards proving our counting lemma, our first task is to extend Lemma 2.3 from the $k = 3$ case.
We will define slightly more expressive variables: let $C^j_{(s,t)}$ count the number of local $s \leadsto t$ $j$-paths
at a given moment in the algorithm. We also define sets
$$\Psi^j(u,v) := \{(s,t) \in V \times V \mid (u,v) \text{ completes a SALA splittable } j\text{-path from } s \text{ to } t\}.$$
The following lemma controls the values that $C^j_{(s,t)}$ can reach:
Lemma 3.5. With high probability, for all spanner edges $(u,v)$ added to $H$ and all $1 \le j \le k$, just
before $(u,v)$ is added we have
$$\sum_{(s,t) \in \Psi^j(u,v)} C^j_{(s,t)} = \widetilde{O}\left(fd^{j-1}\right).$$
Proof. The proof is similar to Lemma 2.3. Intuitively, if $\sum_{(s,t) \in \Psi^j(u,v)} C^j_{(s,t)}$ is large enough, then with
high probability there will already be an emulator edge $(s,t)$ for some $(s,t) \in \Psi^j(u,v)$, which would
mean that we would not have actually added $(u,v)$ to $H$. To formalize this, though, we need to
analyze even edges that were not added to $H$ and take a union bound over all possible fault sets,
as in Lemma 2.3.
So we begin as in Lemma 2.3. Let $(u,v)$ be an edge in the input graph, and let $F \subseteq V \setminus \{u,v\}$
with $|F| \le f$. Consider the moment in the algorithm where we inspect $(u,v)$ and decide whether
or not to add it to the emulator (note: $(u,v), F$ are arbitrary; we may or may not actually add
$(u,v)$, and if we do, we do not necessarily have $F = F_{(u,v)}$). We use the following extensions of our
previous definitions:
• For a path $\pi$ in $H^{(sp)}$ that would be completed if we added $(u,v)$ to the emulator, we say
  that $\pi$ is $F$-avoiding if $\pi \cap F = \emptyset$.
• $\Psi^j(u,v,F)$ is the set of node pairs $(s,t) \in V \times V$ such that, if we added $(u,v)$ to the emulator,
  it would complete at least one new $F$-avoiding SALA $j$-path from $s$ to $t$.
• We say that $F$ is mass-avoiding for $(u,v)$ and $j$ if
  $$\sum_{(s,t) \in \Psi^j(u,v,F)} C^j_{(s,t)} > cfd^{j-1}\log n,$$
  where $c$ is some large enough absolute constant.
Note that the lemma statement is equivalent to the claim that, if $(u,v)$ is added to $H^{(sp)}$, then
$F_{(u,v)}$ is not mass-avoiding. We have set things up for general $(u,v), F$ because our proof strategy
is to take a union bound over all possible choices of $(u,v), F$, which will thus include $F_{(u,v)}$.
We say that a mass-avoiding $F$ is good for $(u,v)$ if (immediately prior to $(u,v)$ being considered
by the algorithm) there is some $(s,t) \in \Psi^j(u,v,F)$ such that $(s,t)$ is already an emulator edge in
$H$. Otherwise, we say that $F$ is bad for $(u,v)$.
We now prove that with high probability, every mass-avoiding $F$ is good for $(u,v)$. To see this,
consider some mass-avoiding $F$. Every local $j$-path $\pi$ which contributes to $C^j_{(s,t)}$ was completed
by some (even) edge, so by definition of the algorithm, when $\pi$ was completed we sampled $(s,t)$
as an emulator edge with probability $d^{-(j-1)}$. This is true for every such $s \leadsto t$ local $j$-path, so
the algorithm independently adds $(s,t)$ as an emulator edge with probability $d^{-(j-1)}$ at least $C^j_{(s,t)}$
times. These choices are also clearly independent for different pairs $(s,t)$ and $(s',t')$, and hence the
probability that $F$ is bad is at most
$$\prod_{(s,t) \in \Psi^j(u,v,F)} \left(1 - \frac{1}{d^{j-1}}\right)^{C^j_{(s,t)}} \le \exp\left(-\frac{\sum_{(s,t) \in \Psi^j(u,v,F)} C^j_{(s,t)}}{d^{j-1}}\right) \le \exp(-cf\log n) \le 1/n^{f+10},$$
where we used that $F$ is mass-avoiding and we set $c$ sufficiently large. There are at most $n^f$ possible
mass-avoiding sets $F$ (since $|F| \le f$), so a union bound over all of them implies that every
mass-avoiding set $F$ is good for $(u,v)$ with probability at least $1 - 1/n^{10}$. We can now do another union
bound over all $(u,v)$ to get that this holds for every $(u,v) \in E$ (whether added to $H^{(sp)}$ or not)
with probability at least $1 - 1/n^8$.
Now consider some $(u,v) \in H^{(sp)}$. By the above, if $F_{(u,v)}$ is mass-avoiding, then it must be
good. Hence there is some emulator edge $(s,t)$ with $(s,t) \in \Psi^j(u,v,F_{(u,v)})$, so by the definition
of $\Psi^j(u,v,F_{(u,v)})$ there is some $s \leadsto t$ SALA $j$-path in $H^{(sp)}$ that is completed by $(u,v)$. Note
that by the definition of a SALA path, we know that no vertices in this path are in $F_{(u,v)}$. Thus
immediately prior to adding $(u,v)$ to $H^{(sp)}$, it was the case that
$$\begin{aligned}
\mathrm{dist}_{H \setminus F_{(u,v)}}(u,v) &\le \mathrm{dist}_{H \setminus F_{(u,v)}}(u,s) + \mathrm{dist}_{H \setminus F_{(u,v)}}(s,t) + \mathrm{dist}_{H \setminus F_{(u,v)}}(t,v) \\
&\le \mathrm{dist}_{G \setminus F_{(u,v)}}(u,s) + \mathrm{dist}_{G \setminus F_{(u,v)}}(s,t) + \mathrm{dist}_{G \setminus F_{(u,v)}}(t,v) \\
&\le \mathrm{dist}_{G \setminus F_{(u,v)}}(u,s) + \mathrm{dist}_{G \setminus F_{(u,v)}}(s,u) + \mathrm{dist}_{G \setminus F_{(u,v)}}(u,v) + \mathrm{dist}_{G \setminus F_{(u,v)}}(t,v) + \mathrm{dist}_{G \setminus F_{(u,v)}}(t,v) \\
&\le (j-1) \cdot \mathrm{dist}_{G \setminus F_{(u,v)}}(u,v) + j \cdot \mathrm{dist}_{G \setminus F_{(u,v)}}(u,v) \\
&= (2j-1) \cdot w(u,v) \le (2k-1) \cdot w(u,v).
\end{aligned}$$
In the above inequalities we used the triangle inequality, the fact that $(u,v)$ is the heaviest edge in
the SALA $s \leadsto t$ $j$-path since edges are added in increasing weight order, and the fact that $(s,t)$ is
an emulator edge and so after the failure of $F_{(u,v)}$ must have weight $\mathrm{dist}_{G \setminus F_{(u,v)}}(s,t)$.
But this means that the algorithm would not have added $(u,v)$ due to fault set $F_{(u,v)}$, which
contradicts the definition of $(u,v)$ and $F_{(u,v)}$. Hence if $(u,v)$ is added then $F_{(u,v)}$ cannot be
mass-avoiding, which implies the lemma.
We will proceed by converting this to a bound on the number of paths that we need to declare
concentrated at each scale:
Lemma 3.6. With high probability, for every spanner edge $(u,v)$ and $1 \le j \le k$, the number of
concentrated splittable $j$-paths completed by $(u,v)$ is $\le \frac{d^{j-1}}{4k}$.
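Proof. Let $(u,v) \in E(H^{(sp)})$. We will say that a node pair $(s,t) \in \Psi^j(u,v)$ is saturated if, just
before $(u,v)$ is added, we have
$$C^j_{(s,t)} \ge \tau^{\lceil (j-1)/2 \rceil} - j\tau^{\lceil (j-3)/2 \rceil} = \Omega\left(\tau^{\lceil (j-1)/2 \rceil}\right)$$
(where the last equality is by choosing polylogs in $\tau$ large enough that $\tau \gg j$). We know from
Lemma 3.5 that with high probability, $\sum_{(s,t) \in \Psi^j(u,v)} C^j_{(s,t)} = \widetilde{O}\left(fd^{j-1}\right)$ just before $(u,v)$ is added.
Thus with high probability, just before $(u,v)$ is added the number of saturated pairs is at most
$$\widetilde{O}\left(\frac{fd^{j-1}}{\tau^{\lceil (j-1)/2 \rceil}}\right).$$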
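Meanwhile, by definition of dispersion, for any $1 \le h \le j$ there can be only $\tau^{\lceil (h-2)/2 \rceil}$ total $s \leadsto u$
SALAD $(h-1)$-paths, and there can be only $\tau^{\lceil (j-h-1)/2 \rceil}$ total $v \leadsto t$ SALAD $(j-h)$-paths. Thus,
for any pair $(s,t) \in \Psi^j(u,v)$, the number of splittable $s \leadsto t$ $j$-paths completed by $(u,v)$ is
$$\le \sum_{\text{even } h = 2}^{j} \tau^{\lceil (h-2)/2 \rceil} \cdot \tau^{\lceil (j-h-1)/2 \rceil} \le j\tau^{\lceil (j-3)/2 \rceil},$$
where we sum only over even $h$ because, due to the alternating property, the heaviest edge $(u,v)$
along a path must be even-numbered.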
So for each unsaturated node pair $(s,t) \in \Psi^j(u,v)$, all splittable $s \leadsto t$ paths completed by $(u,v)$
are dispersed. This is because by the definition of saturated and the definition of the counters, there
are less than $\tau^{\lceil (j-1)/2 \rceil} - j\tau^{\lceil (j-3)/2 \rceil}$ SALA $s \leadsto t$ $j$-paths before $(u,v)$ is added, and by the above
calculation, adding $(u,v)$ adds at most $j\tau^{\lceil (j-3)/2 \rceil}$ additional splittable SALA $s \leadsto t$ $j$-paths. Hence
all of these new paths can be dispersed, as we will still have at most $\tau^{\lceil (j-1)/2 \rceil}$ SALAD $s \leadsto t$ $j$-paths.
On the other hand, for each saturated node pair $(s,t) \in \Psi^j(u,v)$, we might need to declare all
of the new splittable $s \leadsto t$ SALA $j$-paths to be concentrated. We showed that there are
at most $j\tau^{\lceil (j-3)/2 \rceil}$ new such paths for each such $(s,t)$. Hence the total number of splittable paths
completed by $(u,v)$ that are declared concentrated is
$$\le \widetilde{O}\left(\frac{fd^{j-1}}{\tau^{\lceil (j-1)/2 \rceil}}\right) \cdot j\tau^{\lceil (j-3)/2 \rceil} = d^{j-1} \cdot \widetilde{O}\left(\frac{fj}{\tau}\right).$$
By setting the implicit polylogs in $\tau = \widetilde{O}(f)$ high enough, and using that $k \le \log n$ and $j = \widetilde{O}(1)$,
we can ensure that the latter term is at most $\frac{1}{4k}$, and the lemma follows.
Our next step towards our counting lemma roughly follows the proof of Lemma 2.6. We pass
from $H^{(sp)}$ to a random induced subgraph $G'$ by keeping each node independently with probability
$d^{-1}$, and deleting the others. Let us say that an edge $(u,v) \in E(G')$ is clean if:
• None of the nodes in its associated fault set $F_{(u,v)}$ survive in $G'$, and
• For every $1 \le j \le k$ and every simple concentrated splittable $j$-path $\pi$ completed by $(u,v)$, $\pi$
  does not survive in $G'$.
Lemma 3.7. For any edge $(u,v) \in H^{(sp)}$, assuming that the high probability event of Lemma 3.6
occurs, we have
$$\Pr\left[(u,v) \text{ is clean} \mid u, v \text{ both survive in } G'\right] \ge 1/2.$$
Proof. First, since $cf \le d$, by choice of large enough constant $c$, the probability that there is some
node from $F_{(u,v)}$ in $G'$ is at most
$$1 - \left(1 - \frac{1}{d}\right)^{f} \le 1/4.$$
Fix some value of $1 \le j \le k$. By Lemma 3.6, there are $\le \frac{d^{j-1}}{4k}$ simple concentrated splittable
$j$-paths completed by $(u,v)$. Conditional on $(u,v)$ itself surviving in $G'$, there are $j-1$ other
nodes in each such path $\pi$, so it survives with probability $d^{-(j-1)}$. So in expectation, $\le \frac{1}{4k}$ simple
concentrated splittable $j$-paths survive. By Markov's inequality the probability that any such paths
survive is at most $\frac{1}{4k}$. By a union bound over all choices of $j$, the probability that any concentrated
splittable $j$-path with $1 \le j \le k$ survives is at most $1/4$.
By a union bound on the previous two parts, $(u,v)$ is clean with probability at least $1/2$.
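The two numerical steps in this proof can be written out explicitly. The first line uses Bernoulli's inequality $(1-x)^f \ge 1 - fx$ for $0 \le x \le 1$, a standard fact that the paper does not name, together with the assumption $cf \le d$ and (as one sufficient choice) $c \ge 4$.

```latex
1 - \left(1 - \frac{1}{d}\right)^{f} \;\le\; \frac{f}{d} \;\le\; \frac{1}{c} \;\le\; \frac{1}{4},
\qquad
\sum_{j=1}^{k} \frac{1}{4k} \;=\; \frac{1}{4},
\qquad
\Pr\left[(u,v) \text{ not clean} \mid u, v \text{ survive in } G'\right] \;\le\; \frac{1}{4} + \frac{1}{4} \;=\; \frac{1}{2}.
```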
We now pass from $G'$ to another graph $G'' = (V'', E'')$ using the following two steps:
• Delete all edges from $G'$ that are not clean, and then
• $V''$ is defined by dividing each node $v \in V(G')$ into nodes $\{v_i\}$, where each node $v_i$ corresponds
  to one of the buckets $B_v^i$ originally associated to the node $v$, used in the definition of locality.
  Then $E''$ is defined as follows: for each edge $(u,v)$ which has survived (both endpoints are in
  $G'$ and it is clean), add an edge in $E''$ between the copies $(u_h, v_i)$ in $G''$, where the edge is in
  the $h$th bucket of $u$ and the $i$th bucket of $v$.
In the following let $\sigma$ be the number of SALAD $k$-paths in $H^{(sp)}$ and let $\sigma''$ be the number
of edge-simple alternating paths in $G''$ (i.e., each edge is used at most once). We will prove two
inequalities relating $\sigma$ to the expectation of $\sigma''$, analogous to the ones used internally in Lemma
2.6. The first is:
Lemma 3.8. $E[\sigma''] \le \frac{\sigma}{d^{k+1}}$.
Proof. For every path $\pi \in G''$, let $s(\pi)$ denote the corresponding path in $G'$ (the path which uses
the same edges). Note that $s(\pi)$ is also a path in $H^{(sp)}$, and that if $\pi \neq \pi'$, then $s(\pi) \neq s(\pi')$. We
first argue that if $\pi$ is an edge-simple alternating path in $G''$, then $s(\pi)$ is a SALAD path in $G'$.
• (Simple) Suppose towards contradiction that $s(\pi)$ is not (node-)simple in $G'$. Since $\pi$ is
  edge-simple so is $s(\pi)$, and hence $s(\pi)$ contains a cycle subpath $C$ of length $j \le k$. The edge $(u,v)$
  that completes $C$ must include a node in $C$ in its associated fault set $F_{(u,v)}$, since (exactly
  as in Lemma 2.6) there is a $u \leadsto v$ path of length $\le k \cdot w(u,v)$ going around the cycle $C$. So
  either none of the nodes in $F_{(u,v)}$ survive in $G'$ (in which case $C$ is not in $G'$), or else $(u,v)$
  is unclean and so the corresponding edge does not survive in $G''$. In either case, at least one
  edge of $\pi$ is missing from $G''$, contradicting the definition of $\pi$.
• (Alternating) Since $\pi$ is alternating in $G''$, clearly $s(\pi)$ is also alternating in $G'$ since their
  edges correspond.
• (Local) Due to the node-dividing step when we move from $G'$ to $G''$, any non-local path in
  $G'$ does not survive in $G''$. Hence $s(\pi)$ must be local.
• (Avoids Faults) If $s(\pi)$ contains both an edge $(u,v)$ and a node in its associated fault set
  $F_{(u,v)}$, then we will either delete this node when moving from $H^{(sp)}$ to $G'$, or else $(u,v)$ is
  unclean and so we will delete this edge when moving from $G'$ to $G''$. In either case, $\pi$ would
  not exist in $G''$, contradicting the definition of $\pi$.
• (Dispersed) By the previous four bullet points, $s(\pi)$ is SALA. We proceed using a proof by
  minimal counterexample. Suppose towards contradiction that $s(\pi)$ is not dispersed, and let
  $q \subseteq s(\pi)$ be the shortest subpath of $s(\pi)$ that is SALA but not dispersed. Thus $q$ is splittable,
  since the two subpaths that emerge after the split are both shorter than $q$.
  Let $(u,v)$ be the heaviest edge in $q$. If $q$ does not survive in $G'$, then clearly $\pi$ is not a path
  in $G''$, which is a contradiction. Hence $q$ is still a path in $G'$. But then by definition the edge
  $(u,v)$ is unclean in $G'$. Hence the equivalent edge would not exist in $G''$, contradicting the
  fact that $\pi$ is a path in $G''$. This completes the contradiction; thus, $s(\pi)$ must be dispersed.
So we have that every edge-simple alternating path in $G''$ corresponds to a SALAD path in $G'$, and this
correspondence is injective. Thus $\sigma''$ is at most the number of SALAD $k$-paths in $G'$. We can
upper bound the expectation of this quantity by observing that (by simplicity) any SALAD $k$-path
in $H^{(sp)}$ has $k+1$ distinct nodes, and each of these nodes survives in $G'$ with probability $d^{-1}$, so
each SALAD $k$-path in $H^{(sp)}$ survives in $G'$ with probability $d^{-(k+1)}$. Hence $E[\sigma''] \le \frac{\sigma}{d^{k+1}}$.
Our second inequality needs the following lemma, which was proved implicitly in [12] (it is a
generalization of the “intermediate counting lemma” of [12]; we include the proof here for
completeness).
Lemma 3.9. Any $n$-node graph with $m > kn$ edges has at least $m - kn$ edge-simple alternating
$k$-paths.
Proof. The weak counting lemma of [12] implies that any $n$-node graph with at least $kn$ edges has
at least 1 edge-simple alternating $k$-path. So consider the following process: pick an edge that is
contained in at least one edge-simple alternating $k$-path, remove it, and repeat until there are no
more edge-simple alternating $k$-paths. Since we remove the edge we find in each iteration, there
are at least as many edge-simple alternating $k$-paths as there are iterations. And there are at least
$m - kn$ iterations, since as long as at least $kn$ edges remain in the graph, there is still at least 1
edge-simple alternating $k$-path. Hence there are at least $m - kn$ edge-simple alternating $k$-paths
in total.
Using this, we prove:
Lemma 3.10. $E[\sigma''] = \Omega\left(\frac{n}{d}\right)$.
Proof. The number of buckets over all nodes in $H^{(sp)}$ is $O\left(|E(H^{(sp)})|/(kd)\right)$, with as small an
implicit constant as we like by setting the constant in the definition of $b$ to be as large as we need.
Each bucket survives as a node in $G''$ with probability $d^{-1}$, since a bucket survives if and only if
its corresponding node survives. So the expected number of nodes in $G''$ is at most
$$E[|V(G'')|] \le \frac{c|E(H^{(sp)})|}{kd^2}$$
for as small a constant $c$ as we want. Any particular edge in $H^{(sp)}$ survives in $G'$ with probability
$d^{-2}$, and then it is clean (and thus survives in $G''$) with probability $\ge 1/2$ by Lemma 3.7. So the
expected number of edges in $G''$ is at least
$$E[|E(G'')|] \ge \frac{|E(H^{(sp)})|}{2d^2}.$$
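By pushing $c$ sufficiently small, it follows from Lemma 3.9 that the expected number of edge-simple
alternating $k$-paths in $G''$ is
$$\begin{aligned}
E[\sigma''] &\ge E\left[|E(G'')| - k|V(G'')|\right] = E[|E(G'')|] - kE[|V(G'')|] \\
&\ge \frac{|E(H^{(sp)})|}{2d^2} - k \cdot \frac{c|E(H^{(sp)})|}{kd^2} = \Omega\left(\frac{|E(H^{(sp)})|}{d^2}\right) \\
&= \Omega\left(\frac{n\delta}{d^2}\right) = \Omega\left(\frac{kn}{d}\right) = \Omega\left(\frac{n}{d}\right),
\end{aligned}$$
where the last line is since we assume $H^{(sp)}$ has average degree $\delta \ge cdk$.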
Finally, putting the pieces together: by Lemmas 3.8 and 3.10, we have
$$\Omega\left(\frac{n}{d}\right) = E[\sigma''] \le \frac{\sigma}{d^{k+1}}$$
and so, rearranging, we get
$$\sigma = \Omega\left(nd^k\right),$$
which proves our counting lemma.
4 Polynomial Time
Unfortunately, Algorithm 2 does not run in polynomial time. This is because of the “if” condition:
we need to check whether there is some $F \subseteq V \setminus \{u,v\}$ with $|F| \le f$ such that $\mathrm{dist}_{H \setminus F}(u,v) >
(2k-1) \cdot w(u,v)$. The obvious way of doing this takes $\Omega(n^f)$ time in order to check all possible
fault sets, which is not polynomial if $f$ is superconstant (which is the interesting case, as we are
studying $f$-dependence). One might hope to design a polynomial-time algorithm to perform this
check, but even special cases of this problem are NP-hard: if the graph is unweighted then this is
equivalent to the Length-Bounded Cut problem, which is known to be NP-hard [7].
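The “obvious” exhaustive check can be sketched as follows. This is a minimal illustration under simplifying assumptions that are ours, not the paper's: the graph `adj` is a dictionary mapping every node to a list of `(neighbor, weight)` pairs, and edge weights are static, so it models spanner edges only and ignores the fact that an emulator edge's weight changes with the fault set. The function names are hypothetical.

```python
import heapq
from itertools import combinations


def dist_avoiding(adj, src, dst, removed):
    """Dijkstra distance from src to dst in `adj` with the nodes in `removed` deleted."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, x = heapq.heappop(heap)
        if x == dst:
            return d
        if d > dist[x]:
            continue  # stale heap entry
        for y, w in adj.get(x, ()):
            if y in removed:
                continue
            nd = d + w
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    return float("inf")


def exists_fault_set(adj, u, v, w_uv, k, f):
    """Try every fault set F of size at most f that avoids u and v.

    Returns True if some F pushes the u-v distance above (2k-1) * w_uv.
    The number of candidate sets grows like n^f, which is the bottleneck
    described in the text.
    """
    others = [x for x in adj if x != u and x != v]
    for size in range(f + 1):
        for faults in combinations(others, size):
            if dist_avoiding(adj, u, v, set(faults)) > (2 * k - 1) * w_uv:
                return True
    return False
```

For example, on a 4-cycle with unit weights, two vertex-disjoint paths of length 2 join opposite corners, so with $k = 2$ no single fault suffices but a fault set of size 2 does.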
The same problem arises in the context of VFT spanners, where the best-known algorithm is the
VFT-greedy algorithm (exactly Algorithm 2 but without adding emulator edges). This algorithm
was shown to produce spanners of existentially optimal size in [10, 14], but the same running time
issue was present. This issue was recently resolved by [11, 18], who showed how to slightly change
the greedy algorithm to make it polytime. We show that we can adapt the techniques of [18]
to the VFT emulator setting, obtaining a polynomial-time algorithm with size bounds that are
only polylogarithmically worse than the exponential time algorithm (and we have suppressed our
polylogs with $\widetilde{O}(\cdot)$ notation anyway).
The main idea of [18] was to replace the $n^f$-time check with an approximation algorithm for the
Length-Bounded Cut problem on unweighted graphs, and then show that the approximation
and the restriction to unweighted graphs did not cost us much even when applied to weighted
graphs. We follow this approach, replacing the line
“if there is $F \subseteq V \setminus \{u,v\}$ of size $|F| \le f$ with $\mathrm{dist}_{H \setminus F}(u,v) > (2k-1) \cdot w(u,v)$”
in Algorithm 2 with a new condition:
“if find-fault-set$(G, H, u, v, k, f)$ returns YES”
where find-fault-set is a subroutine described below. Intuitively, the algorithm find-fault-set
is supposed to be an approximate version of our previous check. We now give this algorithm
and prove the guarantees that it provides, and then show how changing Algorithm 2 to use
find-fault-set affects the proofs from Section 3.
4.1 The Distinguishing Algorithm
Consider the following algorithm.
Algorithm 3: find-fault-set$(G, H, u, v, k, f)$
  Consider all edges in $H$ and $G$ to be unweighted. So the weight of an emulator edge $(s,t)$
  under fault set $F$ is equal to the number of hops between $s$ and $t$ in $G \setminus F$;
  if $(u,v) \in E(H)$ then return NO;
  Initialize $F = \emptyset$;
  while $\mathrm{dist}_{H \setminus F}(u,v) \le 2k-1$ do
    Let $P$ be a shortest $u$-$v$ path in $H \setminus F$;
    Let $P'$ be the (possibly non-simple) path in $G \setminus F$ obtained by replacing every
    emulator edge in $P$ with a shortest path between its endpoints in $G \setminus F$;
    Add all vertices in $P' \setminus \{u,v\}$ to $F$;
  if $|F| \le (2k-2)f$ then return YES else return NO;
We claim that this is effectively a polynomial-time $(2k-2)$-approximation algorithm for the
problem of finding the smallest $F$ that makes $\mathrm{dist}_{H \setminus F}(u,v) > 2k-1$ (recall that in this algorithm
we use unweighted graph edges).
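The following Python sketch mirrors the steps of find-fault-set under assumptions of our own about data representation: `g_adj` is an unweighted adjacency dictionary for $G$, $H$ is passed as two separate edge lists (`spanner_edges` and `emulator_edges`), and YES/NO are mapped to `True`/`False`. It is meant as an illustration of the loop structure, not as the authors' implementation.

```python
import heapq
from collections import deque


def bfs_path(adj, src, dst, removed):
    """Shortest hop path from src to dst in `adj` minus `removed`, or None."""
    if src in removed or dst in removed:
        return None
    parent = {src: None}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        if x == dst:
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1]
        for y in adj.get(x, ()):
            if y not in removed and y not in parent:
                parent[y] = x
                queue.append(y)
    return None


def find_fault_set(g_adj, spanner_edges, emulator_edges, u, v, k, f):
    """Return True for YES and False for NO, following the steps of Algorithm 3."""
    in_h = {frozenset(e) for e in spanner_edges} | {frozenset(e) for e in emulator_edges}
    if frozenset((u, v)) in in_h:
        return False
    faults = set()
    while True:
        # Build H \ F. Each entry is (neighbor, hop weight, expansion into a G-path).
        h_adj = {}
        for a, b in spanner_edges:
            if a not in faults and b not in faults:
                h_adj.setdefault(a, []).append((b, 1, [a, b]))
                h_adj.setdefault(b, []).append((a, 1, [b, a]))
        for s, t in emulator_edges:
            p = bfs_path(g_adj, s, t, faults)  # emulator weight = hops in G \ F
            if p is not None:
                h_adj.setdefault(s, []).append((t, len(p) - 1, p))
                h_adj.setdefault(t, []).append((s, len(p) - 1, p[::-1]))
        # Dijkstra from u in H \ F, remembering how each node was reached.
        dist, back, heap = {u: 0}, {u: None}, [(0, u)]
        while heap:
            d, x = heapq.heappop(heap)
            if d > dist[x]:
                continue
            for y, w, expansion in h_adj.get(x, ()):
                if d + w < dist.get(y, float("inf")):
                    dist[y] = d + w
                    back[y] = (x, expansion)
                    heapq.heappush(heap, (d + w, y))
        if dist.get(v, float("inf")) > 2 * k - 1:
            break
        # Expand the found path P into P' and add its vertices (except u, v) to F.
        x = v
        while back[x] is not None:
            x, expansion = back[x]
            faults.update(expansion)
        faults -= {u, v}
    return len(faults) <= (2 * k - 2) * f
```

With three vertex-disjoint two-hop paths between `u` and `v`, $k = 2$ and $f = 1$, the loop collects three fault vertices, which exceeds $(2k-2)f = 2$, so the sketch answers NO; with $f = 2$ it answers YES.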
Lemma 4.1. find-fault-set takes polynomial time.
Proof. Each iteration of the while loop involves a shortest path computation for $u$-$v$, and then
a shortest path computation for each emulator edge to update its weight appropriately. Hence
each iteration takes polynomial time if we use a polynomial-time shortest-path algorithm. In each
iteration we add at least one node to $F$ (since the path $P$ must have at least two edges or else we
would have immediately returned NO), and hence the number of iterations is at most $O(n)$. Thus
the total running time is polynomial.
Lemma 4.2. If $\mathrm{dist}_{H \setminus F}(u,v) \le 2k-1$ for all $F \subseteq V \setminus \{u,v\}$ with $|F| \le (2k-2)f$, then
find-fault-set$(G, H, u, v, k, f)$ returns NO.
Proof. We prove the contrapositive. Suppose that find-fault-set$(G, H, u, v, k, f)$ returns YES.
Then by the definition of the algorithm, the set $F$ that it found has $\mathrm{dist}_{H \setminus F}(u,v) > 2k-1$ with
$|F| \le (2k-2)f$.
Lemma 4.3. If there exists an $F \subseteq V \setminus \{u,v\}$ with $|F| \le f$ such that $\mathrm{dist}_{H \setminus F}(u,v) > 2k-1$,
then find-fault-set$(G, H, u, v, k, f)$ returns YES.
Proof. Let $F \subseteq V \setminus \{u,v\}$ with $|F| \le f$ such that $\mathrm{dist}_{H \setminus F}(u,v) > 2k-1$, and let $F'$ be the fault
set created by the algorithm. We need to prove that $|F'| \le (2k-2)f$, since that will imply that the
algorithm returns YES. Let $P$ be a path that the algorithm found in some iteration, and let $P'$ be
the associated path in $G$ (so the algorithm added all vertices of $P'$ to $F'$ other than $u, v$).
Clearly the length of $P'$ is at most the length of $P$, since we just replaced every emulator edge
by a graph path of the exact same length, and we know from the algorithm that the length of $P$ is
at most $2k-1$. Hence $P'$ has at most $2k-2$ vertices other than $u, v$. Thus in every iteration $|F'|$
grows by at most $2k-2$.
Since $P$ has length at most $2k-1$ and $\mathrm{dist}_{H \setminus F}(u,v) > 2k-1$, it must be the case that $F \cap P' \neq \emptyset$.
On the other hand, if $P'_1$ and $P'_2$ are two different paths found by the algorithm (so they were found
in different iterations), then $P'_1 \cap P'_2 = \{u,v\}$ (since if $P'_1$ appeared in an earlier iteration then all
vertices other than $u, v$ in $P'_1$ were added to $F'$, so were not available for $P'_2$). Thus in every iteration
we add at least one vertex from $F$ to $F'$ that was not already in $F'$. Hence the number of iterations
is at most $|F| \le f$, and so $|F'| \le (2k-2)f$ as required.
4.2 Effect on the Analysis of Algorithm 2
We now discuss what changes from Section 3 if we replace the “if” condition in Algorithm 2 with
find-fault-set. We give only a sketch of this changed analysis, since it simply involves repeating
the analysis of Section 3 but keeping track of an extra $O(k)$ factor.
First, it is easy to see that correctness (Lemma 3.1) still holds thanks to the greedy ordering
and Lemma 4.3. If $(u,v) \notin E(H)$ then Lemma 4.3 implies that for every fault set $F$ of size at most
$f$, there is a $u$-$v$ path in $H \setminus F$ of length at most $2k-1$ in the unweighted version. But every
edge already in $H$ when we are considering adding $(u,v)$ has weight at most $w(u,v)$, and hence
this path must also have length at most $(2k-1) \cdot w(u,v)$ in the real $H$ (with weights).
As before, for any spanner edge $(u,v)$ added by the algorithm, let $F_{(u,v)}$ denote the fault set
that caused us to add it. The main difference in our new polynomial time algorithm is that rather
than knowing $|F_{(u,v)}| \le f$, we just know from Lemma 4.2 that $|F_{(u,v)}| \le (2k-2)f$. So we need to
trace how this change propagates through Section 3. The main change is that our definition of $d$
in Lemma 3.4 will be multiplied by $O(k)$. Since $k \le O(\log n)$, this means that the final bound on
$|E(H^{(sp)})|$ in Lemma 3.4 will be unchanged.
The bound on the number of emulator edges (Lemma 3.2) continues to hold without change,
since it simply involves counting the number of local $j$-paths. The statement of Lemma 3.3 is
changed only slightly, by replacing the assumption that $d \ge cf$ with the assumption that $d \ge cfk$.
The proof of the spanner edge lemma (Lemma 3.4) is also unchanged since it is just calculations
assuming the counting lemma (Lemma 3.3); we simply need to change $d$ to have an extra factor of
$O(k)$, but this is absorbed by the $\mathrm{polylog}(n)$ factors since $k \le O(\log n)$.
Now consider Lemma 3.5, the main “counter mass” lemma. We do not change the statement
as written, but note that there will be an extra factor of $k$ on the right hand side, which is
hidden by the $\widetilde{O}$ notation. Since $F_{(u,v)}$ could have size up to $(2k-2)f$, we need to union bound
over all possible fault sets of that size, rather than of size $f$. Hence we need to change our
definition of mass-avoiding to be
$$\sum_{(s,t) \in \Psi(u,v,F)} C^{j}_{(s,t)} > ckf d^{j-1} \log n$$
(i.e., we add an extra factor of $(2k-2)$ to the right hand side, which we simply write as $k$ due to
the inclusion of the constant $c$). Once we make this change, we can union bound over all fault sets
of size at most $(2k-2)f$ to get that every mass-avoiding $F$ (of size at most $(2k-2)f$) is good for
$(u,v)$ with high probability. The rest of the proof is unchanged: this means that if $F_{(u,v)}$ is
mass-avoiding then it is also good, but that would imply that the algorithm would not have added
$(u,v)$, since $F_{(u,v)}$ would not have actually forced us to (contradicting the definition of $F_{(u,v)}$).
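The following short calculation sketches why the extra factor of $k$ in the threshold is enough for the larger union bound. It mirrors the proof of Lemma 2.3 in Appendix A and is not taken from the paper: the sampling probability $d^{-(j-1)}$ for a completed $j$-path is assumed here by analogy with the $d^{-2}$ used there for 3-paths, and the constant 10 in the exponent and the count of at most $n^{(2k-2)f}$ fault sets are borrowed from the same proof.

```latex
% Probability that one fixed mass-avoiding F (modified threshold) is bad:
\[
  \Pr[F \text{ is bad}]
  \le \exp\!\left(-\frac{1}{d^{j-1}} \sum_{(s,t) \in \Psi(u,v,F)} C^{j}_{(s,t)}\right)
  \le \exp(-ckf \log n)
  \le n^{-((2k-2)f + 10)}
  \quad \text{for a sufficiently large constant } c.
\]
% Union bound over the fault sets of size at most (2k-2)f:
\[
  \Pr[\text{some mass-avoiding } F \text{ is bad}]
  \le n^{(2k-2)f} \cdot n^{-((2k-2)f + 10)} = n^{-10}.
\]
```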
Now consider Lemma 3.6, the bound on the number of concentrated splittable j-paths completed
by each spanner edge. This goes through without change, since the statement of Lemma 2.3 is
unchanged.
We then define $G'$, clean nodes, and $G''$ as before. The statement of Lemma 3.7 is unchanged,
and the proof only has to change by noticing that the probability that there is some node from $F_{(u,v)}$
in $G'$ is at most $1 - (1 - 1/d)^{(2k-2)f}$, since now $|F_{(u,v)}|$ can be up to $(2k-2)f$. But since $d$ is larger by
a factor of $O(k)$, this cancels out and the remaining calculation is identical. Lemmas 3.8, 3.9, and
3.10 are unchanged, since they just use the previous lemmas (whose statements are unchanged) as
black boxes. Hence they can be combined as before, proving Lemma 3.3 (the main counter lemma).
5 Lower Bound
In this section we show that our algorithm is optimal with respect to dependence on $f$. Interestingly,
our lower bound construction for $f$-VFT $(2k-1)$-emulators is essentially the same as the lower
bound for $f$-EFT (edge fault-tolerant) $(2k-1)$-spanners from [10] (the analysis is more complex,
though, as we must account for the extra power of emulator edges).
Conjecture 5.1 (Erdős girth conjecture [20]). For every positive integer $k$, there exists an infinite
family of graphs on $n$ nodes with $\Omega(n^{1+1/k})$ edges and girth $2k+2$.
We first prove our lower bound for the special case of $k = 2$ (stretch 3), then handle the general
case of $k \ge 3$.
Theorem 1.2. For all positive integers $n, f$ with $f \le n$, there exists an unweighted $n$-node graph
with $\Omega(f^{1/2} n^{3/2})$ edges for which any $f$-VFT 3-emulator must have at least $\Omega(f^{1/2} n^{3/2})$ edges.
Proof. Let $G' = (V', E')$ be a girth conjecture graph with $k = 2$, i.e., a graph with girth at least 6
and at least $\Omega(|V'|^{3/2})$ edges. This setting of the girth conjecture has actually been proved [28], so
we use this construction. We construct a new graph $G = (V, E)$ as follows: let $t = f/4$, let
$V = V' \times [t]$, and let $E = \{\{(u,i),(v,j)\} : \{u,v\} \in E',\ i,j \in [t]\}$. In other words, each edge $\{u,v\}$
of $G'$ is replaced by a complete bipartite graph between two sets of copies (of size $t$). For every
$u \in V'$, let $B_u = \{(u,i) : i \in [t]\}$ be the copies of $u$ in $V$. Note that
$$|E| = t^2 |E'| \ge \Omega\left(f^2 |V'|^{3/2}\right) = \Omega\left(f^2 \left(\frac{|V|}{f}\right)^{3/2}\right) = \Omega\left(f^{1/2} |V|^{3/2}\right),$$
so we simply need to show that every $f$-VFT $(2k-1)$-emulator for $G$ has at least $\Omega(|E|)$ edges.
In the spanner setting we would accomplish this by showing that every edge in $E$ must be in every
spanner, but in the emulator setting this is not true: emulator edges can be used to replace spanner
edges. Instead, we must show that the number of emulator edges in an emulator must be roughly
equal to the number of edges of $G$ which are not in the emulator.
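The blow-up construction described above can be sketched in a few lines of Python. This is only an illustration: the function name `blow_up` is hypothetical, and a 6-cycle is used as a tiny stand-in for a base graph of girth 6 (the proof itself relies on the dense girth-6 graphs of [28]).

```python
from itertools import product

def blow_up(base_edges, t):
    """Replace every base edge {u, v} by a complete bipartite graph
    between the t copies of u and the t copies of v."""
    edges = set()
    for u, v in base_edges:
        for i, j in product(range(t), repeat=2):
            edges.add(frozenset({(u, i), (v, j)}))
    return edges

# Hypothetical stand-in for the base graph G': a 6-cycle, which has girth 6.
base_edges = [(a, (a + 1) % 6) for a in range(6)]
t = 3
E = blow_up(base_edges, t)
num_edges = len(E)  # matches the count |E| = t^2 * |E'| from the proof
```

Each base edge contributes exactly $t^2$ edges, which is the identity $|E| = t^2|E'|$ used in the size calculation.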
Let $H = (V_H, E_H)$ be an $f$-fault-tolerant 3-emulator for $G$, and let $e = \{(u,i),(v,j)\}$ be some
edge in $E$. We say that an emulator edge $e' \in E_H$ protects $e$ if there is a 2-path between the
endpoints of $e'$ in $G$ that uses $e$ (such a path will intuitively correspond to a path of length 3 in $H$,
since the emulator edge will have length 2 under the fault sets that we care about). We say that $e$
is $\beta$-protected if there are at least $\beta$ emulator edges that protect $e$.
We first claim that if $e$ is not $f/4$-protected then $e \in E_H$. So suppose for contradiction that $e$
is not $f/4$-protected but $e \notin E_H$. Let $F$ consist of all endpoints of emulator edges that protect $e$
other than $(u,i)$ and $(v,j)$, together with all vertices in $B_u \cup B_v$ other than $(u,i)$ and $(v,j)$. Clearly
$|F| \le 4(f/4) = f$, and clearly $e \in G \setminus F$ and hence $\mathrm{dist}_{G \setminus F}((u,i),(v,j)) = 1$.
Since $G'$ has girth at least 6, it is easy to see that any path from $(u,i)$ to $(v,j)$ in $H \setminus F$ that
uses only edges in $G \setminus F$ (i.e., does not use emulator edges) must have length at least 5. This is
because any such path must essentially follow a $u \leadsto v$ path in $G'$ that does not use the edge $\{u,v\}$.
On the other hand, consider some path from $(u,i)$ to $(v,j)$ in $H \setminus F$ that does use an emulator
edge $e' = \{(s,i'),(t,j')\}$. Then by the definition of $F$, we know that $e'$ does not protect $e$. But then
it is easy to see that the length of this path is strictly larger than 3. Thus
$\mathrm{dist}_{H \setminus F}((u,i),(v,j)) > 3 \cdot \mathrm{dist}_{G \setminus F}((u,i),(v,j))$,
contradicting our assumption that $H$ is an $f$-VFT 3-emulator. Hence every
$e \in E \setminus E_H$ must be $f/4$-protected.
Now consider some emulator edge $e' = \{(s,i'),(t,j')\} \in E_H$. If $\mathrm{dist}_{G'}(s,t) \ge 3$ then $e'$ does not
protect any edges of $G$. If $\mathrm{dist}_{G'}(s,t) = 2$ then since $G'$ has girth larger than 4 there is a unique
2-path $s - x - t$ between $s$ and $t$ in $G'$, and so by definition $e'$ can only protect edges between $(s,i')$
and $B_x$ and between $(t,j')$ and $B_x$. Thus every emulator edge can protect at most $2t \le f/2$ edges
of $G$. Since we showed that any edge in $E$ which is not in $E_H$ must be $f/4$-protected, this implies
that
$$|E_H| = |E_H \cap E| + |E_H \setminus E| \ge |E_H \cap E| + \frac{1}{f/2} \cdot \frac{f}{4} \, |E \setminus E_H| \ge \Omega(|E|),$$
as claimed.
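The final inequality is a double-counting argument, which can be spelled out as follows. This is a sketch of the step rather than a quotation from the paper; the shorthand $x = |E \setminus E_H|$ is introduced here only for convenience.

```latex
% Let x = |E \setminus E_H|. Count pairs (e', e) in which an emulator edge
% e' \in E_H \setminus E protects an edge e \in E \setminus E_H.
% Every such e is f/4-protected; every e' protects at most f/2 edges.
\[
  \frac{f}{4}\, x \;\le\; \#\{(e', e)\} \;\le\; \frac{f}{2}\, |E_H \setminus E|
  \quad\Longrightarrow\quad
  |E_H \setminus E| \ge \frac{x}{2}.
\]
\[
  |E_H| = |E_H \cap E| + |E_H \setminus E|
        \ge (|E| - x) + \frac{x}{2}
        \ge \frac{|E|}{2}.
\]
```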
We now modify this lower bound to hold for $k \ge 3$. This involves using a different parameter
for $t$ (basically $\sqrt{f}$ rather than $f$) and generalizing the argument, but the basic construction is the
same.
Theorem 1.3. Assuming the Erdős girth conjecture, for all $k \ge 3$ and $f \le n$ there is an unweighted
$n$-node graph in which every $f$-VFT $(2k-1)$-emulator has at least $\Omega(k^{-1} f^{1/2 - 1/(2k)} n^{1+1/k})$ edges.
Proof. Let $G' = (V', E')$ be a graph from the girth conjecture (Conjecture 5.1). We construct a new
graph $G = (V, E)$ as follows: let $t = \sqrt{f}$, let $V := V' \times [t]$, and let
$E := \{\{(u,i),(v,j)\} : \{u,v\} \in E',\ i,j \in [t]\}$. In other words, each edge $\{u,v\}$ of $G'$ is replaced by a complete bipartite
graph between two sets of copies (of size $\sqrt{f}$). For every $u \in V'$, let $B_u = \{(u,i) : i \in [t]\}$ be the
copies of $u$ in $V$. Note that
$$|E| = t^2 |E'| \ge \Omega\left(f |V'|^{1+1/k}\right) = \Omega\left(f \left(\frac{|V|}{\sqrt{f}}\right)^{1+1/k}\right) = \Omega\left(f^{1/2 - 1/(2k)} |V|^{1+1/k}\right),$$
so we simply need to show that every $f$-VFT $(2k-1)$-emulator for $G$ has at least $\Omega(|E|/k)$ edges.
In the spanner setting we would accomplish this by showing that every edge in $E$ must be in every
spanner, but in the emulator setting this is not true: emulator edges can be used to replace spanner
edges. Instead, we must show that the number of emulator edges in an emulator must be roughly
equal to the number of edges of $G$ which are not in the emulator.
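For readers who want to check the exponent of $f$ in the edge count, the arithmetic is worked out below. It is a plain restatement of the calculation under the choice $t = \sqrt{f}$, so that $|V'| = |V|/\sqrt{f}$; nothing beyond the quantities already defined in the proof is assumed.

```latex
\[
  f \left(\frac{|V|}{\sqrt{f}}\right)^{1 + 1/k}
  = f^{\,1 - \frac{1}{2}\left(1 + \frac{1}{k}\right)} \, |V|^{1 + 1/k}
  = f^{\frac{1}{2} - \frac{1}{2k}} \, |V|^{1 + 1/k}.
\]
% Losing a further factor of k in the counting argument yields the
% \Omega(k^{-1} f^{1/2 - 1/(2k)} n^{1+1/k}) bound stated in Theorem 1.3.
```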
Let $H = (V_H, E_H)$ be an $f$-fault-tolerant $(2k-1)$-emulator for $G$, and let $e = \{(u,i),(v,j)\}$ be
some edge in $E$. We say that an emulator edge $\{(s,i'),(t,j')\} \in E_H$ protects $e$ if there is a simple
path of the form $s \leadsto u - v \leadsto t$ of length at most $k$ in $G'$. We say that $e$ is $\beta$-protected if there are
at least $\beta$ emulator edges that protect $e$.
We first claim that if $e$ is not $f/3$-protected then $e \in E_H$. So suppose for contradiction that
$e$ is not $f/3$-protected but $e \notin E_H$. Let $F$ consist of all endpoints of emulator edges that protect
$e$ (other than $(u,i)$ and $(v,j)$ if they happen to be such an endpoint) together with all vertices in
$B_u \cup B_v$ other than $(u,i)$ and $(v,j)$. Clearly $|F| \le 2(f/3) + 2t \le f$, and clearly $e \in G \setminus F$ and
hence $\mathrm{dist}_{G \setminus F}((u,i),(v,j)) = 1$.
Since $G'$ has girth at least $2k+2$, it is easy to see that any path from $(u,i)$ to $(v,j)$ in $H \setminus F$
that uses only edges in $G \setminus F$ (i.e., does not use emulator edges) must have length at least $2k+1$.
This is because any such path must essentially follow a $u \leadsto v$ path in $G'$ that does not use the edge
$\{u,v\}$.
On the other hand, consider some path from $(u,i)$ to $(v,j)$ in $H \setminus F$ that does use an emulator
edge $e' = \{(s,i'),(t,j')\}$. Then by the definition of $F$, we know that $e'$ does not protect $e$. Hence
the length of this path is at least
$$\mathrm{dist}_{G'}(u,s) + \mathrm{dist}_{G'}(s,t) + \mathrm{dist}_{G'}(t,v) \ge 2k+1.$$
Thus $\mathrm{dist}_{H \setminus F}((u,i),(v,j)) > (2k-1) \cdot \mathrm{dist}_{G \setminus F}((u,i),(v,j))$, contradicting our assumption that
$H$ is an $f$-VFT $(2k-1)$-emulator. Hence every $e \in E \setminus E_H$ must be $f/3$-protected.
On the other hand, consider some emulator edge $e' = \{(s,i'),(t,j')\} \in E_H$. In order to protect
any edge, it must be the case that $\mathrm{dist}_{G'}(s,t) \le k$. Since $G'$ has girth at least $2k+2$, there is only
one simple $s \leadsto t$ path in $G'$ of length at most $k$. By the definition of protection, any edge protected
by $e'$ must be between some node in $B_u$ and some node in $B_v$ where $u$ and $v$ are neighbors on this
path. Hence $e'$ can protect at most $k \cdot t^2 \le O(kf)$ edges of $G$. Since we showed that any edge in
$E$ which is not in $E_H$ must be $f/3$-protected, this implies that
$$|E_H| = |E_H \cap E| + |E_H \setminus E| \ge |E_H \cap E| + \frac{1}{O(kf)} \cdot \frac{f}{3} \, |E \setminus E_H| \ge \Omega(|E|/k),$$
as claimed.
6 Additive Emulators
In this section we consider emulators with purely additive stretch, proving Theorems 1.4 and 1.5. In
particular, we provide simple algorithms for constructing f-VFT +2-emulators and +4-emulators.
We start with the f-VFT +4-emulator; the algorithm for the +2-emulator will be very similar,
so we will just sketch how the analysis needs to be modified.
6.1 +4-emulator
We will prove the following theorem.
Theorem 1.5. For all $f \le n$, every $n$-node unweighted graph $G = (V, E)$ admits an $f$-VFT
+4-emulator $H$ with $|E(H)| = \widetilde{O}(f^{1/3} n^{4/3} + nf)$ edges. There is also a randomized polynomial-time
algorithm which computes such an emulator with high probability.
Algorithm. Let $d$ be a parameter which depends on $f$: if $f \le \sqrt{n}$ then we set $d = (fn)^{1/3}$, and
if $f > \sqrt{n}$ then we set $d = 2f$. We say that a node is light if its degree in $G$ is at most $d$, and
otherwise it is dense.
Initially our emulator $H$ is empty.
1. Every light node adds all of its incident edges to $H$.
2. For every dense node $v$, we arbitrarily choose $d$ of its neighbors in $G$ and add edges between
$v$ and each of these $d$ nodes to $H$.
3. Set $p = \frac{12 d \ln n}{n}$. For every pair of nodes $\{u,v\} \in V \times V$, add $\{u,v\}$ to $H$ as an emulator edge
independently with probability $p$.
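The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `choose_d` and `build_plus4_emulator` are hypothetical, the non-integer threshold $d$ is rounded up when picking neighbors in step 2, and $p$ is capped at 1 so that the sketch also runs on tiny inputs. Emulator edges are stored without weights, since their length depends on the fault set.

```python
import math
import random
from itertools import combinations

def choose_d(n, f):
    """Degree threshold d: (fn)^(1/3) if f <= sqrt(n), else 2f."""
    return (f * n) ** (1 / 3) if f <= math.sqrt(n) else 2 * f

def build_plus4_emulator(adj, f, rng):
    """adj maps each node to the set of its neighbours in G (undirected).
    Returns (spanner_edges, emulator_edges) as sets of frozensets."""
    n = len(adj)
    d = choose_d(n, f)
    spanner = set()
    for v, nbrs in adj.items():
        if len(nbrs) <= d:
            chosen = nbrs                          # step 1: light node keeps all edges
        else:
            chosen = sorted(nbrs)[: math.ceil(d)]  # step 2: an arbitrary choice of ~d neighbours
        spanner.update(frozenset({v, w}) for w in chosen)
    p = min(1.0, 12 * d * math.log(n) / n)         # step 3: sampling probability (capped at 1)
    emulator = {frozenset(pair) for pair in combinations(sorted(adj), 2)
                if rng.random() < p}
    return spanner, emulator

# Tiny usage example on a 5-node path; here p is capped at 1.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
spanner, emulator = build_plus4_emulator(adj, f=1, rng=random.Random(0))
```

On inputs this small the cap on $p$ makes every pair an emulator edge, so the example only exercises the control flow; the size and stretch guarantees concern large $n$.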
Size Analysis. It is easy to see that $H$ has $O(dn \log n)$ edges: the first step adds at most
$dn$ edges, the second step adds at most $dn$ edges, and a simple Chernoff bound implies that
with high probability the third step adds at most $O(p \binom{n}{2}) = O(dn \log n)$ edges. Hence the total
number of edges is at most $O(dn \log n)$. If $f \le \sqrt{n}$, then this means that we have at most
$O((fn)^{1/3} n \log n) = O(f^{1/3} n^{4/3} \log n)$ edges. If $f > \sqrt{n}$, then we have at most $O(fn \log n)$ edges.
Hence the total number of edges is at most $O((f^{1/3} n^{4/3} + nf) \log n)$.
Stretch Analysis. We now bound the stretch. More formally,
Lemma 6.1. Let $F \subseteq V$ with $|F| \le f$, and let $u, v \in V \setminus F$. Then
$$\mathrm{dist}_{H \setminus F}(u,v) \le \mathrm{dist}_{G \setminus F}(u,v) + 4$$
with probability at least $1 - \frac{1}{n^{3f}}$.
Proof. Let $P = (u = x_0, x_1, \ldots, x_{k-1}, x_k = v)$ be the shortest $u \leadsto v$ path in $G \setminus F$ (breaking ties
arbitrarily). If all edges of $P$ are in $E(H)$ then we are done. Otherwise, let $i$ be the smallest integer
such that $\{x_i, x_{i+1}\} \notin E(H)$, and let $j \le k$ be the largest integer such that $\{x_{j-1}, x_j\} \notin E(H)$.
Then clearly $x_i$ and $x_j$ are both dense vertices, or else all of their incident edges would be in $H$.
Hence they each have more than $d$ neighbors in $G$, and so have at least $d - f \ge d/2$ neighbors
from step 2 in $H \setminus F$. Let $N(x_i)$ and $N(x_j)$ denote these nodes.
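First, observe that if there is an emulator edge between some node $a \in N(x_i)$ and some node
$b \in N(x_j)$ then we are done. This is because then we would have that
$$\begin{aligned}
\mathrm{dist}_{H \setminus F}(u,v) &\le \mathrm{dist}_{H \setminus F}(u,x_i) + \mathrm{dist}_{H \setminus F}(x_i,a) + \mathrm{dist}_{H \setminus F}(a,b) + \mathrm{dist}_{H \setminus F}(b,x_j) + \mathrm{dist}_{H \setminus F}(x_j,v) \\
&\le i + 1 + \mathrm{dist}_{G \setminus F}(a,b) + 1 + (k-j) \\
&\le i + k - j + 2 + (1 + (j-i) + 1) \\
&= k + 4 \\
&= \mathrm{dist}_{G \setminus F}(u,v) + 4.
\end{aligned}$$
This is also true if $a = b$, i.e., if $N(x_i) \cap N(x_j) \ne \emptyset$. Hence we are already finished if
$N(x_i) \cap N(x_j) \ne \emptyset$, so we will assume without loss of generality that $N(x_i) \cap N(x_j) = \emptyset$.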
So if we can prove that the probability that such an edge $\{a,b\}$ exists is at least $1 - 1/n^{3f}$ then
we are finished. The number of possible such edges is at least $|N(x_i)| \cdot |N(x_j)| \ge (d/2)^2 = d^2/4$
(since $f \le d/2$ and $N(x_i) \cap N(x_j) = \emptyset$). Each possible edge is added to $H$ with probability $p$ in
the third step of our construction. Hence the probability that none of these edges are added is at
most:
$$(1-p)^{d^2/4} = \left(1 - \frac{12 d \ln n}{n}\right)^{d^2/4} \le \exp\left(-\frac{3 d^3 \ln n}{n}\right).$$
If $f \le \sqrt{n}$ then $d = (fn)^{1/3}$, and hence the probability that we fail to get an appropriate
emulator edge is at most:
$$\exp(-3f \ln n) \le n^{-3f}.$$
Similarly, if $f > \sqrt{n}$ then $d = 2f$, so the probability that we fail to get an appropriate emulator
edge is at most
$$\exp\left(-\frac{24 f^3 \ln n}{n}\right) \le \exp(-3f \ln n) \le n^{-3f}.$$
Therefore, with probability at least $1 - n^{-3f}$ we have added an emulator edge that satisfies the
stretch guarantee for $u, v$.
We now just need two union bounds over all possible fault sets and pairs $u, v$ to finish the
stretch analysis. In particular, the union bound is over all $\binom{n}{f} n^2 \le n^{f+2}$ choices of $F$ and $u, v$ to
get that the desired stretch bound holds with high probability for all possible $F, u, v$.
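The probability calculation in the proof of Lemma 6.1 can be checked numerically. The snippet below is a small sanity check, not part of the paper: the function name `failure_bound` and the sample values $n = 10^6$, $f = 10$ are chosen here purely for illustration, and only the regime $f \le \sqrt{n}$ is covered.

```python
import math

def failure_bound(n, f):
    """Value of (1 - p)^(d^2/4) in the regime f <= sqrt(n),
    where d = (fn)^(1/3) and p = 12 d ln(n) / n."""
    d = (f * n) ** (1 / 3)
    p = 12 * d * math.log(n) / n
    assert p < 1, "parameters too small for this illustration"
    return (1 - p) ** (d * d / 4)

n, f = 10**6, 10           # illustrative values with f <= sqrt(n)
bound = failure_bound(n, f)
target = n ** (-3 * f)     # the n^(-3f) bound claimed in the proof
```

For these values `bound` lies below `target`, consistent with the inequality $(1-p)^{d^2/4} \le \exp(-3f \ln n)$.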
6.2 +2-emulator
We now show that the same algorithm with a slightly different parameter setting and analysis leads
to a +2-emulator. More formally, we show the following theorem.
Theorem 1.4. For all $f \le n$, every $n$-node unweighted graph $G = (V, E)$ admits an $f$-VFT
+2-emulator $H$ with $|E(H)| = \widetilde{O}(f^{1/2} n^{3/2})$ edges. There is also a randomized polynomial-time
algorithm which computes such an emulator with high probability.
Algorithm. Let $d = (fn)^{1/2}$. Similar to the previous section, we say that a node is light if its
degree in $G$ is at most $d$, and otherwise it is dense.
We start with an empty emulator $H$.
1. Every light node adds all of its incident edges to $H$.
2. For every dense node $v$, we arbitrarily choose $d$ of its neighbors in $G$ and add edges between
$v$ and each of these nodes to $H$.
3. Set $p = \frac{6 d \ln n}{n}$. For every $\{u,v\} \in V \times V$, add $\{u,v\}$ to $H$ independently with probability $p$.
Size Analysis. The first step adds at most $dn$ edges, and the second step also adds at most $dn$
edges. A simple Chernoff bound implies that with high probability the third step adds at most
$O(p \binom{n}{2}) \le O(nd \log n)$ edges.
Stretch Analysis. For any pair of nodes $u, v$, consider the shortest path $P = (u = u_0, \ldots, u_i, \ldots, u_\ell = v)$
between $u$ and $v$ in $G \setminus F$. If all edges on the path are in $H$, we are done. Otherwise, let $x = u_i$ be the
first node on the path such that $(u_i, u_{i+1}) \notin E(H)$, and let $y = u_j$ be the furthest node on $P$ such
that $(u_{j-1}, u_j) \notin E(H)$. Then we know that $x$ is dense (as otherwise we would have added all of
its incident edges), and hence has $d$ edges incident to it from step 2 of the algorithm. At least
$d - f \ge d/2$ of these neighbors remain in $H \setminus F$. Let $N(x)$ denote these neighbors.
It is easy to see that with high probability there is an emulator edge added between $N(x)$ and
$y$ in the third step of the algorithm: the probability of not adding such an edge is at most
$$(1-p)^{d/2} = \left(1 - \frac{6 d \ln n}{n}\right)^{d/2} \le \exp\left(-\frac{3 d^2 \ln n}{n}\right).$$
Since $d = (fn)^{1/2}$, the probability that no edge is added between $N(x)$ and $y$ can be bounded by
$\exp(-3f \ln n) \le n^{-3f}$. Hence with probability at least $1 - 1/n^{3f}$ we have added an emulator edge $(z, y)$ for some
$z \in N(x)$.
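If we have such an edge, then by using it and the triangle inequality we have that
$$\begin{aligned}
\mathrm{dist}_{H \setminus F}(u,v) &\le \mathrm{dist}_{H \setminus F}(u,x) + \mathrm{dist}_{H \setminus F}(x,z) + \mathrm{dist}_{H \setminus F}(z,y) + \mathrm{dist}_{H \setminus F}(y,v) \\
&\le i + 1 + \mathrm{dist}_{G \setminus F}(z,y) + (\ell - j) \\
&\le \ell + 2 = \mathrm{dist}_{G \setminus F}(u,v) + 2,
\end{aligned}$$
where the last inequality uses $\mathrm{dist}_{G \setminus F}(z,y) \le 1 + (j - i)$.
This implies that with probability at least $1 - n^{-3f}$ the stretch guarantee holds for $u, v$ and $F$.
Now as before a union bound over all fault sets of size at most $f$ and all pairs $u, v$ ($n^{f+2}$ choices)
implies that the stretch guarantee holds with high probability for all such $u, v, F$.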
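On very small instances, the additive stretch guarantee can be checked by brute force over all fault sets. The sketch below is an illustration written for this purpose and is not from the paper: the function names are hypothetical, and it follows the convention used in the proofs above that an edge $\{a,b\}$ of $H$ surviving the faults $F$ has length $\mathrm{dist}_{G \setminus F}(a,b)$ (which is 1 for an edge of $G$). Its running time is exponential in $f$.

```python
import heapq
from collections import deque
from itertools import combinations

def bfs_dist(adj, src, faults):
    """Hop distances from src in G minus the faulty nodes."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in faults and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def is_vft_additive_emulator(adj, h_edges, f, beta):
    """Check dist_{H\\F}(u,v) <= dist_{G\\F}(u,v) + beta for all |F| <= f."""
    nodes = sorted(adj)
    for size in range(f + 1):
        for faults in map(set, combinations(nodes, size)):
            alive = [x for x in nodes if x not in faults]
            dg = {x: bfs_dist(adj, x, faults) for x in alive}
            wadj = {x: [] for x in alive}
            for a, b in h_edges:
                if a in dg and b in dg[a]:   # both endpoints alive and connected in G \ F
                    wadj[a].append((b, dg[a][b]))
                    wadj[b].append((a, dg[a][b]))
            for u in alive:
                dh = {u: 0}
                heap = [(0, u)]
                while heap:                  # Dijkstra in the weighted graph H \ F
                    dx, x = heapq.heappop(heap)
                    if dx > dh[x]:
                        continue
                    for y, w in wadj[x]:
                        if dx + w < dh.get(y, float("inf")):
                            dh[y] = dx + w
                            heapq.heappush(heap, (dx + w, y))
                for v, dguv in dg[u].items():
                    if dh.get(v, float("inf")) > dguv + beta:
                        return False
    return True

# Usage: a 4-cycle G and the 3-edge path 0-1-2-3 as candidate H.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
path_edges = [(0, 1), (1, 2), (2, 3)]
ok_without_faults = is_vft_additive_emulator(cycle, path_edges, f=0, beta=2)
ok_with_one_fault = is_vft_additive_emulator(cycle, path_edges, f=1, beta=2)
```

In this toy example the path is a +2-emulator of the cycle when there are no faults, but it fails once a single vertex may be removed, which illustrates why the constructions above keep many redundant edges around dense nodes.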
References
[1] Amir Abboud and Greg Bodwin. The 4/3 additive spanner exponent is tight. Journal of the
ACM (JACM), 64(4):1–20, 2017.
[2] Donald Aingworth, Chandra Chekuri, Piotr Indyk, and Rajeev Motwani. Fast estimation of
diameter and shortest paths (without matrix multiplication). SIAM Journal on Computing,
28(4):1167–1181, 1999.
[3] Ingo Althöfer, Gautam Das, David P. Dobkin, Deborah Joseph, and José Soares. On sparse
spanners of weighted graphs. Discrete & Computational Geometry, 9:81–100, 1993.
[4] Yair Amir, Claudiu Danilov, Stuart Goose, David Hedqvist, and Andreas Terzis. An overlay
architecture for high-quality voip streams. IEEE Transactions on Multimedia, 8(6):1250–1262,
2006. doi:10.1109/TMM.2006.884609.
[5] David Andersen, Hari Balakrishnan, Frans Kaashoek, and Robert Morris. Resilient overlay
networks. SIGOPS Oper. Syst. Rev., 35(5):131–145, October 2001. doi:10.1145/502059.502048.
[6] Amy Babay, Emily Wagner, Michael Dinitz, and Yair Amir. Timely, reliable, and cost-effective
internet transport service using dissemination graphs. In IEEE 37th International Conference
on Distributed Computing Systems (ICDCS), pages 1–12, 2017. doi:10.1109/ICDCS.2017.63.
[7] Georg Baier, Thomas Erlebach, Alexander Hall, Ekkehard Köhler, Heiko Schilling, and Martin
Skutella. Length-bounded cuts and flows. In Michele Bugliesi, Bart Preneel, Vladimiro Sassone,
and Ingo Wegener, editors, Automata, Languages and Programming, pages 679–690, Berlin,
Heidelberg, 2006. Springer Berlin Heidelberg.
[8] Surender Baswana, Telikepalli Kavitha, Kurt Mehlhorn, and Seth Pettie. Additive spanners
and (α,β)-spanners. ACM Transactions on Algorithms (TALG), 7(1):1–26, 2010.
[9] Davide Bilò, Fabrizio Grandoni, Luciano Gualà, Stefano Leucci, and Guido Proietti. Improved
purely additive fault-tolerant spanners. In Algorithms-ESA 2015, pages 167–178. Springer,
2015.
[10] Greg Bodwin, Michael Dinitz, Merav Parter, and Virginia Vassilevska Williams. Optimal
vertex fault tolerant spanners (for fixed stretch). In Artur Czumaj, editor, Proceedings of
the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New
Orleans, LA, USA, January 7-10, 2018, pages 1884–1900. SIAM, 2018.
[11] Greg Bodwin, Michael Dinitz, and Caleb Robelle. Optimal vertex fault-tolerant spanners
in polynomial time. In Proceedings of the Thirty-Second Annual ACM-SIAM Symposium on
Discrete Algorithms, SODA 2021, 2021.
[12] Greg Bodwin, Michael Dinitz, and Caleb Robelle. Partially optimal edge fault-tolerant
spanners. arXiv preprint arXiv:2102.11360, 2021.
[13] Greg Bodwin, Fabrizio Grandoni, Merav Parter, and Virginia Vassilevska Williams. Preserving
distances in very faulty graphs. In 44th International Colloquium on Automata, Languages,
and Programming (ICALP 2017), volume 80, page 73. Schloss Dagstuhl–Leibniz-Zentrum fuer
Informatik, 2017.
[14] Greg Bodwin and Shyamal Patel. A trivial yet optimal solution to vertex fault tolerant
spanners. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing,
PODC ’19, pages 541–543, New York, NY, USA, 2019. Association for Computing Machinery.
doi:10.1145/3293611.3331588.
[15] Gilad Braunschvig, Shiri Chechik, David Peleg, and Adam Sealfon. Fault tolerant additive
and (µ,α)-spanners. Theoretical Computer Science, 580:94–100, 2015.
[16] Shiri Chechik. New additive spanners. In Proceedings of the twenty-fourth annual ACM-SIAM
symposium on Discrete algorithms, pages 498–512. SIAM, 2013.
[17] Shiri Chechik, Michael Langberg, David Peleg, and Liam Roditty. Fault tolerant spanners for
general graphs. SIAM J. Comput., 39(7):3403–3423, 2010.
[18] Michael Dinitz and Caleb Robelle. Efficient and simple algorithms for fault-tolerant spanners.
In Proceedings of the 2020 ACM Symposium on Principles of Distributed Computing, PODC
’20, 2020.
[19] Dorit Dor, Shay Halperin, and Uri Zwick. All-pairs almost shortest paths. SIAM Journal on
Computing, 29(5):1740–1759, 2000.
[20] Paul Erdős. Extremal problems in graph theory. In Theory of Graphs and its Applications,
Proc. Sympos. Smolenice, 1964.
[21] Mathias Bæk Tejs Knudsen. Additive spanners: A simple construction. In Scandinavian
Workshop on Algorithm Theory, pages 277–281. Springer, 2014.
[22] Merav Parter. Vertex fault tolerant additive spanners. Distributed Computing, 30(5):357–372,
2017.
[23] David Peleg and Alejandro A. Schäffer. Graph spanners. Journal of Graph Theory,
13(1):99–116, 1989.
[24] David Peleg and Jeffrey D. Ullman. An optimal synchronizer for the hypercube. SIAM J.
Comput., 18(4):740–747, 1989.
[25] S. Savage, T. Anderson, A. Aggarwal, D. Becker, N. Cardwell, A. Collins, E. Hoffman, J. Snell,
A. Vahdat, G. Voelker, and J. Zahorjan. Detour: informed internet routing and transport.
IEEE Micro, 19(1):50–59, 1999. doi:10.1109/40.748796.
[26] Lakshminarayanan Subramanian, Ion Stoica, Hari Balakrishnan, and Randy H. Katz. Overqos:
An overlay based architecture for enhancing internet qos. In Proceedings of the 1st Conference
on Symposium on Networked Systems Design and Implementation - Volume 1, NSDI’04, page 6,
USA, 2004. USENIX Association.
[27] Mikkel Thorup and Uri Zwick. Approximate distance oracles. Journal of the ACM (JACM),
52(1):1–24, 2005.
[28] R. Wenger. Extremal graphs with no C4's, C6's, or C10's. Journal of Combinatorial
Theory, Series B, 52(1):113–116, 1991. URL:
https://www.sciencedirect.com/science/article/pii/0095895691900974,
doi:10.1016/0095-8956(91)90097-4.
[29] David P Woodruff. Additive spanners in nearly quadratic time. In International Colloquium
on Automata, Languages, and Programming, pages 463–474. Springer, 2010.
A Proofs Omitted from Section 2
Lemma 2.3. With high probability, whenever we add a new spanner edge $(u,v)$ in our algorithm,
we have
$$\sum_{(s,t) \in \Psi(u,v)} C_{(s,t)} \le \widetilde{O}(fd^2)$$
where the values $C_{(s,t)}$ are defined just before $(u,v)$ is added to $H$.
Proof. Let $(u,v)$ be an edge in the input graph, and let $F \subseteq V \setminus \{u,v\}$ with $|F| \le f$. Consider
the moment in the algorithm where we inspect $(u,v)$ and decide whether or not to add it to the
emulator (note: $(u,v), F$ are arbitrary; we may or may not actually add $(u,v)$, and if we do we do
not necessarily have $F = F_{(u,v)}$). We use the following extensions of our previous definitions:
For a path $\pi$ in $H^{(sp)}$ that would be completed if we added $(u,v)$ to the emulator, we say
that $\pi$ is $F$-avoiding if $\pi \cap F = \emptyset$.
$\Psi(u,v,F)$ is the set of node pairs $(s,t) \in V \times V$ such that, if we added $(u,v)$ to the emulator,
it would complete at least one new middle-heavy $F$-avoiding 3-path from $s$ to $t$.
We say that $F$ is mass-avoiding for $(u,v)$ if
$$\sum_{(s,t) \in \Psi(u,v,F)} C_{(s,t)} > cfd^2 \log n,$$
where $c$ is some large enough absolute constant.
Note that the lemma statement is equivalent to the claim that, if $(u,v)$ is added to $H^{(sp)}$, then
$F_{(u,v)}$ is not mass-avoiding. We have set up these definitions for general $(u,v), F$ because our proof
strategy is to take a union bound over all possible choices of $(u,v), F$, which will thus include $F_{(u,v)}$.
We say that a mass-avoiding $F$ is good for $(u,v)$ if (immediately prior to $(u,v)$ being considered
by the algorithm) there is some $(s,t) \in \Psi(u,v,F)$ such that $(s,t)$ is already an emulator edge in
$H$. Otherwise, we say that $F$ is bad for $(u,v)$.
We now prove that with high probability, every mass-avoiding $F$ is good for $(u,v)$. To see this,
consider some mass-avoiding $F$. Every middle-heavy fault-avoiding 3-path $\pi = (s, x, y, t)$ which
contributes to $C_{(s,t)}$ was completed by its middle edge $(x,y)$ (since the path is middle-heavy and
the algorithm considers edges in increasing weight order) and does not intersect $F_{(x,y)}$ (since it is
fault-avoiding). So, by definition of the algorithm, when $\pi$ was completed we sampled $(s,t)$ as an
emulator edge with probability $d^{-2}$. No two such paths share the same middle edge, and hence
we independently add $(s,t)$ as an emulator edge with probability $d^{-2}$ at least $C_{(s,t)}$ times. These
choices are also clearly independent for different pairs $(s,t)$ and $(s',t')$, and hence the probability
that $F$ is bad is at most
$$\prod_{(s,t) \in \Psi(u,v,F)} \left(1 - \frac{1}{d^2}\right)^{C_{(s,t)}} \le \exp\left(-\frac{\sum_{(s,t) \in \Psi(u,v,F)} C_{(s,t)}}{d^2}\right) \le \exp(-cf \log n) \le 1/n^{f+10},$$
where we used that $F$ is mass-avoiding and we set $c$ sufficiently large. There are at most $n^f$ possible
mass-avoiding sets $F$ (since $|F| \le f$), so a union bound over all of them implies that every
mass-avoiding set $F$ is good for $(u,v)$ with probability at least $1 - 1/n^{10}$. We can now do another union
bound over all $(u,v)$ to get that this holds for every $(u,v) \in E$ (whether added to $H^{(sp)}$ or not)
with probability at least $1 - 1/n^{8}$.
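The first inequality in the displayed bound, which replaces a product of terms $(1 - 1/d^2)^{C_{(s,t)}}$ by a single exponential, can be sanity-checked numerically. The values of `counts` and `d` below are hypothetical and chosen only for illustration; the helper name `bad_probability` is likewise not from the paper.

```python
import math

def bad_probability(counts, d):
    """Probability that none of the independent samples, each succeeding
    with probability 1/d^2, produced an emulator edge."""
    return math.prod((1 - 1 / d**2) ** c for c in counts)

counts = [40, 25, 35]   # hypothetical values of C_(s,t)
d = 5
exact = bad_probability(counts, d)
upper = math.exp(-sum(counts) / d**2)   # the exponential upper bound
```

The check relies only on the standard inequality $1 - x \le e^{-x}$, applied once per sample.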
Now consider some $(u,v) \in H^{(sp)}$. By the above, if $F_{(u,v)}$ is mass-avoiding, then it must be
good. Hence there is some emulator edge $(s,t)$ with $(s,t) \in \Psi(u,v,F_{(u,v)})$, which implies that
$(s,u), (v,t) \in H^{(sp)}$ and $s, t \notin F_{(u,v)}$. Thus immediately prior to adding $(u,v)$ to $H^{(sp)}$, it was the
case that
$$\begin{aligned}
\mathrm{dist}_{H \setminus F_{(u,v)}}(u,v) &\le \mathrm{dist}_{H \setminus F_{(u,v)}}(u,s) + \mathrm{dist}_{H \setminus F_{(u,v)}}(s,t) + \mathrm{dist}_{H \setminus F_{(u,v)}}(t,v) \\
&\le \mathrm{dist}_{G \setminus F_{(u,v)}}(u,s) + \mathrm{dist}_{G \setminus F_{(u,v)}}(s,t) + \mathrm{dist}_{G \setminus F_{(u,v)}}(t,v) \\
&\le \mathrm{dist}_{G \setminus F_{(u,v)}}(u,s) + \mathrm{dist}_{G \setminus F_{(u,v)}}(s,u) + \mathrm{dist}_{G \setminus F_{(u,v)}}(u,v) + \mathrm{dist}_{G \setminus F_{(u,v)}}(t,v) + \mathrm{dist}_{G \setminus F_{(u,v)}}(t,v) \\
&\le 5 \cdot \mathrm{dist}_{G \setminus F_{(u,v)}}(u,v) \\
&= 5 \cdot w(u,v).
\end{aligned}$$
In the above inequalities we used the triangle inequality, the fact that edges are added in increasing
weight order and $(u,s)$ and $(t,v)$ have already been added and so are lighter than $(u,v)$, and the fact
that $(s,t)$ is an emulator edge and so after the failure of $F_{(u,v)}$ must have weight $\mathrm{dist}_{G \setminus F_{(u,v)}}(s,t)$.
But this means that the algorithm would not have added $(u,v)$ due to fault set $F_{(u,v)}$, which
contradicts the definition of $(u,v)$ and $F_{(u,v)}$. Hence if $(u,v)$ is added then $F_{(u,v)}$ cannot be
mass-avoiding, which implies the lemma.
Lemma 2.7. The emulator $H$ returned by Algorithm 1 has $|E(H)| = \widetilde{O}(f^{1/3} n^{4/3}) + O(fn)$ with
high probability.
Proof. Let $m$ be the number of middle-heavy fault-avoiding 3-paths in the final graph $H^{(sp)}$. By
Lemma 2.4, we have that with high probability
$$m = d^2 \left|E\left(H^{(sp)}\right)\right| + \widetilde{O}(fn^2).$$
We condition on this event occurring in the remainder of the argument. There are two cases,
depending on which of these terms is larger.
Case 1: the term $d^2 |E(H^{(sp)})|$ is larger. In this case, by Lemma 2.5, the total number of
emulator edges is $O(|E(H^{(sp)})|)$, and so it suffices to bound the spanner edges. By Lemma 2.6,
the total number of spanner edges is
$$\left|E\left(H^{(sp)}\right)\right| = O\left(\left(d^2 \left|E\left(H^{(sp)}\right)\right|\right)^{1/3} n^{2/3} + nf\right).$$
If the latter term dominates then $|E(H^{(sp)})| = O(nf)$, so we are done. If the former term
dominates, then by rearranging we get $|E(H^{(sp)})| = O(nd)$, and so in this case $|E(H)| = O(nd)$.
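Case 2: the term $\widetilde{O}(fn^2)$ is larger. In this case, by Lemma 2.5, the total number of edges in
$H^{(em)}$ is $\widetilde{O}(fn^2 d^{-2})$. By Lemma 2.6, the total number of spanner edges in $H^{(sp)}$ is
$$\left|E\left(H^{(sp)}\right)\right| = O\left(\left(\widetilde{O}(fn^2)\right)^{1/3} n^{2/3} + nf\right) = \widetilde{O}\left(f^{1/3} n^{4/3}\right) + O(nf).$$
So in this case, the number of edges in the final emulator $H$ is
$$\widetilde{O}\left(fn^2 d^{-2} + f^{1/3} n^{4/3}\right) + O(nf).$$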
Putting It Together. In either case, the total number of edges in the final emulator $H$ is
$$\widetilde{O}\left(fn^2 d^{-2} + f^{1/3} n^{4/3}\right) + O(nd + nf).$$
Setting $d = f^{1/3} n^{1/3}$, we thus get $\widetilde{O}(f^{1/3} n^{4/3}) + O(nf)$ edges in total, as claimed.
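The choice $d = f^{1/3} n^{1/3}$ balances the $d$-dependent terms of the bound, which can be confirmed with a quick numeric check. The snippet is only an illustration: the helper name `size_terms` and the sample values of `n` and `f` are assumptions made here, and constants and logarithmic factors are ignored.

```python
def size_terms(n, f, d):
    """The terms f n^2 / d^2, f^(1/3) n^(4/3) and n d of the edge bound,
    with constants and log factors dropped."""
    return f * n**2 / d**2, f ** (1 / 3) * n ** (4 / 3), n * d

n, f = 10**6, 8          # illustrative values
d = (f * n) ** (1 / 3)   # the setting d = f^(1/3) n^(1/3)
a, b, c = size_terms(n, f, d)
```

With this setting all three terms coincide (here each is about $2 \cdot 10^8$), so none of them dominates the final bound.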