arXiv:cs/0607105v4 [cs.NA] 17 Sep 2009
Nearly-Linear Time Algorithms for Preconditioning and Solving
Symmetric, Diagonally Dominant Linear Systems∗
Daniel A. Spielman
Department of Computer Science
Program in Applied Mathematics
Yale University
Shang-Hua Teng
Department of Computer Science
Boston University
September 17, 2009
Abstract
We present a randomized algorithm that, on input a symmetric, weakly diagonally dominant
n-by-n matrix A with m non-zero entries and an n-vector b, produces an $\tilde{x}$ such that
$\|\tilde{x} - A^{\dagger} b\|_A \le \epsilon \|A^{\dagger} b\|_A$ in expected time
$m \log^{O(1)} n \log(1/\epsilon)$.
The algorithm applies subgraph preconditioners in a recursive fashion. These preconditioners
improve upon the subgraph preconditioners first introduced by Vaidya (1990). For any
symmetric, weakly diagonally-dominant matrix A with non-positive off-diagonal entries and
k ≥ 1, we construct in time $m \log^{O(1)} n$ a preconditioner B of A with at most
$2(n-1) + (m/k) \log^{O(1)} n$
non-zero off-diagonal entries such that the finite generalized condition number $\kappa_f(A,B)$ is
at most k. If the non-zero structure of the matrix is planar, then the condition number is at
most
$O((n/k) \log n \,(\log\log n)^2)$,
and the corresponding linear system solver runs in expected time
$O(n \log^2 n + n \log n \,(\log\log n)^2 \log(1/\epsilon))$.
Similar bounds are obtained on the running time of algorithms computing approximate
Fiedler vectors.
∗This paper is the last in a sequence of three papers expanding on material that appeared first under the title
“Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems” [ST04].
The second paper, “Spectral Sparsification of Graphs” [ST08c] contains algorithms for constructing sparsifiers
of graphs, which we use in this paper to build preconditioners. The first paper, “A Local Clustering Algorithm
for Massive Graphs and its Application to Nearly-Linear Time Graph Partitioning” [ST08b] contains graph
partitioning algorithms that are used to construct sparsifiers in the second paper.
This material is based upon work supported by the National Science Foundation under Grant Nos. 0325630,
0324914, 0634957, 0635102 and 0707522. Any opinions, findings, and conclusions or recommendations expressed in
this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
1 Introduction
We design an algorithm with nearly optimal asymptotic complexity for solving linear systems
in symmetric, weakly diagonally dominant (SDD0) matrices. The algorithm applies a classical
iterative solver, such as the Preconditioned Conjugate Gradient or the Preconditioned Chebyshev
Method, with a novel preconditioner that we construct and analyze using techniques from graph
theory. Linear systems in these preconditioners may be reduced to systems of smaller size in
linear time by use of a direct method. The smaller linear systems are solved recursively. The
resulting algorithm solves linear systems in SDD0 matrices in time almost linear in their number
of non-zero entries. Our analysis does not make any assumptions about the non-zero structure
of the matrix, and thus may be applied to the solution of the systems in SDD0 matrices that
arise in any application, such as the solution of elliptic partial differential equations by the
finite element method [Str86, BHV04], the solution of maximum flow problems by interior point
algorithms [FG04, DS08], or the solution of learning problems on graphs [BMN04, ZBL+03,
ZGL03].
Graph theory drives the construction of our preconditioners. Our algorithm is best un-
derstood by first examining its behavior on Laplacian matrices—the symmetric matrices with
non-positive off-diagonals and zero row sums. Each n-by-n Laplacian matrix A may be associ-
ated with a weighted graph, in which the weight of the edge between distinct vertices i and j is
$-A_{i,j}$ (see Figure 1). We precondition the Laplacian matrix A of a graph G by the Laplacian
matrix B of a subgraph H of G that resembles a spanning tree of G plus a few edges. The subgraph
H is called an ultra-sparsifier of G, and its corresponding Laplacian matrix is a very good
preconditioner for A: the finite generalized condition number $\kappa_f(A,B)$ is $\log^{O(1)} n$. Moreover,
it is easy to solve linear equations in B. As the graph H resembles a tree plus a few edges, we
may use partial Cholesky factorization to eliminate most of the rows and columns of B while
incurring only a linear amount of fill. We then solve the reduced system recursively.
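$\begin{pmatrix} 1.5 & -1.5 & 0 & 0 \\ -1.5 & 4 & -2 & -0.5 \\ 0 & -2 & 3 & -1 \\ 0 & -0.5 & -1 & 1.5 \end{pmatrix}$
[Graph on vertices 1, 2, 3, 4 with edges {1,2} of weight 1.5, {2,3} of weight 2, {2,4} of weight 0.5, and {3,4} of weight 1.]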
Figure 1: A Laplacian matrix and its corresponding weighted graph.
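The correspondence between a weighted graph and its Laplacian matrix can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy and 0-based vertex labels; the function name `laplacian_from_edges` and the edge list (read off the off-diagonal entries of the matrix in Figure 1) are ours, not the paper's.

```python
import numpy as np

def laplacian_from_edges(n, edges):
    """Return the n-by-n Laplacian of a weighted graph given as (i, j, weight) triples."""
    A = np.zeros((n, n))
    for i, j, w in edges:
        # Off-diagonal entries are the negated edge weights ...
        A[i, j] -= w
        A[j, i] -= w
        # ... and each diagonal entry is the weighted degree, so rows sum to zero.
        A[i, i] += w
        A[j, j] += w
    return A

# Edges of the Figure 1 graph, with vertices 1..4 relabelled 0..3 (our assumption).
FIGURE1_EDGES = [(0, 1, 1.5), (1, 2, 2.0), (1, 3, 0.5), (2, 3, 1.0)]
A = laplacian_from_edges(4, FIGURE1_EDGES)
```

The weight of the edge between i and j is recovered as `-A[i, j]`, which is the convention used throughout the paper.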
The technical meat of this paper lies in the construction of ultra-sparsifiers for Laplacian
matrices, which appears in Sections 7 through 10. In the remainder of the introduction, we
formally define ultra-sparsifiers, and the sparsifiers from which they are built. In Section 2,
we survey the contributions upon which we build, and mention other related work. We devote
Section 3 to recalling the basics of support theory, defining the finite condition number, and
explaining why we may restrict our attention to Laplacian matrices.
In Section 4, we state the properties we require of partial Cholesky factorizations, and we
present our first algorithms for solving equations in SDD0-matrices. These algorithms directly
solve equations in the preconditioners, rather than using a recursive approach, and take time
roughly $O(m^{5/4} \log^{O(1)} n)$ for general SDD0-matrices and $O(n^{9/8} \log^{1/2} n)$ for SDDM0-matrices
with planar non-zero structure. To accelerate these algorithms, we apply our preconditioners
in a recursive fashion. We analyze the complexity of these recursive algorithms in Section 5,
obtaining our main algorithmic results. In Section 6, we observe that these linear system solvers
yield efficient algorithms for computing approximate Fiedler vectors, when applied inside the
inverse power method.
We do not attempt to optimize the exponent of log n in the complexity of our algorithm.
Rather, we present the simplest analysis we can find of an algorithm of complexity $m \log^{O(1)} n \log(1/\epsilon)$.
We expect that the exponent of log n can be substantially reduced through advances in the
constructions of low-stretch spanning trees, sparsifiers, and ultra-sparsifiers. Experimental work is
required to determine whether a variation of our algorithm will be useful in practice.
1.1 Ultra-sparsifiers
To describe the quality of our preconditioners, we employ the notation $A \preceq B$ to indicate that
B − A is positive semi-definite. We define a SDDM0-matrix to be a SDD0-matrix with no
positive off-diagonal entries. When positive definite, the SDDM0-matrices are M-matrices and
in particular are Stieltjes matrices.
Definition 1.1 (Ultra-Sparsifiers). A (k,h)-ultra-sparsifier of an n-by-n SDDM0-matrix A with
2m non-zero off-diagonal entries is a SDDM0-matrix $A_s$ such that
(a) $A_s \preceq A \preceq k \cdot A_s$.
(b) $A_s$ has at most 2(n − 1) + 2hm/k non-zero off-diagonal entries.
(c) The set of non-zero entries of $A_s$ is a subset of the set of non-zero entries of A.
In Section 10, we present an expected $m \log^{O(1)} n$-time algorithm that on input a Laplacian
matrix A and a k ≥ 1 produces a (k,h)-ultra-sparsifier of A with probability at least 1 − 1/2n,
for
$h = c_3 \log_2^{c_4} n$, (1)
where $c_3$ and $c_4$ are some absolute constants. As we will use these ultra-sparsifiers throughout
the paper, we will define a k-ultra-sparsifier to be a (k,h)-ultra-sparsifier where h satisfies (1).
For matrices whose graphs are planar, we present a simpler construction of (k,h)-ultra-sparsifiers,
with $h = O(\log n \,(\log\log n)^2)$. This simple construction exploits low-stretch spanning
trees [AKPW95, EEST08, ABN08], and is presented in Section 9. Our construction of
ultra-sparsifiers in Section 10 builds upon the simpler construction, but requires the use of
sparsifiers. The following definition of sparsifiers will suffice for the purposes of this paper.
Definition 1.2 (Sparsifiers). A d-sparsifier of n-by-n SDDM0-matrix A is a SDDM0-matrix $A_s$
such that
(a) $A_s \preceq A \preceq (5/4) A_s$.
(b) $A_s$ has at most dn non-zero off-diagonal entries.
(c) The set of non-zero entries of $A_s$ is a subset of the set of non-zero entries of A.
(d) For all i, $\sum_{j \ne i} A_s(i,j)/A(i,j) \le 2\,|\{j : A(i,j) \ne 0\}|$.
In a companion paper [ST08c], we present a randomized algorithm Sparsify2 that produces
sparsifiers of Laplacian matrices in expected nearly-linear time. As explained in Section 3, this
construction can trivially be extended to all SDDM0-matrices.
Theorem 1.3 (Sparsification). On input an n × n Laplacian matrix A with 2m non-zero off-diagonal
entries and a p > 0, Sparsify2 runs in expected time $m \log(1/p) \log^{17} n$ and with
probability at least 1 − p produces a $c_1 \log^{c_2}(n/p)$-sparsifier of A, for $c_2 = 30$ and some absolute
constant $c_1 > 1$.
We parameterize this theorem by the constants $c_1$ and $c_2$ as we believe that they can be
substantially improved. In particular, Spielman and Srivastava [SS08] construct sparsifiers with
$c_2 = 1$, but these constructions require the solution of linear equations in Laplacian matrices,
and so cannot be used to help speed up the algorithms in this paper. Batson, Spielman and
Srivastava [BSS09] have proved that there exist sparsifiers that satisfy conditions (a) through
(c) of Definition 1.2 with $c_2 = 0$.
2 Related Work
In this section, we explain how our results relate to other rigorous asymptotic analyses of algo-
rithms for solving systems of linear equations. For the most part, we restrict our attention to
algorithms that make structural assumptions about their input matrices, rather than assump-
tions about the origins of those matrices.
Throughout our discussion, we consider an n-by-n matrix with m non-zero entries. When
m is large relative to n and the matrix is arbitrary, the fastest algorithms for solving linear
equations are those based on fast matrix multiplication [CW82], which take time approximately
$O(n^{2.376})$. The fastest algorithm for solving general sparse positive semi-definite linear systems
is the Conjugate Gradient. Used as a direct solver, it runs in time O(mn) (see [TB97, Theorem 28.3]).
Of course, this algorithm can be used to solve a system in an arbitrary matrix A in
a similar amount of time by first multiplying both sides by $A^T$. To the best of our knowledge,
every faster algorithm requires additional properties of the input matrix.
2.1 Special non-zero structure
In the design and analysis of direct solvers, it is standard to represent the non-zero structure
of a matrix A by an unweighted graph $G_A$ that has an edge between vertices i ≠ j if and only
if $A_{i,j}$ is non-zero (see [DER86]). If this graph has special structure, there may be elimination
orderings that accelerate direct solvers. If A is tri-diagonal, in which case $G_A$ is a path, then a
linear system in A can be solved in time O(n). Similarly, when $G_A$ is a tree, a linear system in
A can be solved in time O(n) (see [DER86]).
If the graph of non-zero entries $G_A$ is planar, one can use Generalized Nested Dissection
[Geo73, LRT79, GT87] to find an elimination ordering under which Cholesky factorization
can be performed in time $O(n^{1.5})$ and produces factors with at most $O(n \log n)$ non-zero entries.
We will exploit these results in our algorithms for solving planar linear systems in Section 4.
We recall that a planar graph on n vertices has at most 3n−6 edges (see [Har72, Corollary 11.1
(c)]), so m ≤ 6n.
2.2 Subgraph Preconditioners
Our work builds on a remarkable approach to solving linear systems in Laplacian matrices
introduced by Vaidya [Vai90]. Vaidya demonstrated that a good preconditioner for a Laplacian
matrix A can be found in the Laplacian matrix B of a subgraph of the graph corresponding to
A. He then showed that one could bound the condition number of the preconditioned system by
bounding the dilation and congestion of an embedding of the graph of A into the graph of B. By
using preconditioners obtained by adding edges to maximum spanning trees, Vaidya developed
an algorithm that finds ε-approximate solutions to linear systems in SDDM0-matrices with at
most d non-zero entries per row in time $O((dn)^{1.75} \log(1/\epsilon))$. When the graph corresponding
to A had special structure, such as having a bounded genus or avoiding certain minors, he
obtained even faster algorithms. For example, his algorithm for solving planar systems runs in
time $O((dn)^{1.2} \log(1/\epsilon))$.
As Vaidya’s paper was never published and his manuscript lacked many proofs, the task
of formally working out his results fell to others. Much of its content appears in the thesis
of his student, Anil Joshi [Jos97], and a complete exposition along with many extensions was
presented by Bern et al. [BGH+06]. Gremban, Miller and Zagha [Gre96, GMZ95] explain parts
of Vaidya’s paper as well as extend Vaidya’s techniques. Among other results, they find ways of
constructing preconditioners by adding vertices to the graphs. Maggs et al. [MMP+05] prove
that this technique may be used to construct excellent preconditioners, but it is still not clear if
they can be constructed efficiently.
The machinery needed to apply Vaidya’s techniques directly to matrices with positive off-
diagonal elements is developed in [BCHT04]. An algebraic extension of Vaidya’s techniques for
bounding the condition number was presented by Boman and Hendrickson [BH03b], and later
used by them [BH01] to prove that the low-stretch spanning trees constructed by Alon, Karp,
Peleg, and West [AKPW95], yield preconditioners for which the preconditioned system has condition
number at most $m 2^{O(\sqrt{\log n \log\log n})}$. They thereby obtained a solver for symmetric diagonally
dominant linear systems that produces ε-approximate solutions in time $m^{1.5+o(1)} \log(1/\epsilon)$.
Through improvements in the construction of low-stretch spanning trees [EEST08, ABN08] and
a careful analysis of the eigenvalue distribution of the preconditioned system, Spielman and
Woo [SW09] show that when the Preconditioned Conjugate Gradient is applied with the best
low-stretch spanning tree preconditioners, the resulting linear system solver takes time at most
$O(m n^{1/3} \log^{1/2} n \log(1/\epsilon))$. The preconditioners in the present paper are formed by adding edges
to these low-stretch spanning trees.
The recursive application of subgraph preconditioners was pioneered in the work of Joshi [Jos97]
and Reif [Rei98]. Reif [Rei98] showed how to recursively apply Vaidya’s preconditioners to solve
linear systems in SDDM0-matrices with planar non-zero structure and at most a constant number
of non-zeros per row in time $O(n^{1+\beta} \log^{O(1)}(\kappa(A)/\epsilon))$, for every β > 0. While Joshi’s analysis
is numerically much cleaner, he only analyzes preconditioners for simple model problems.
Our recursive scheme uses ideas from both these works, with some simplification. Koutis and
Miller [KM07] have developed recursive algorithms that solve linear systems in SDDM0-matrices
with planar non-zero structure in time $O(n \log(1/\epsilon))$.
2.3 Other families of matrices
Subgraph preconditioners have been used to solve systems of linear equations from a few other
families.
Daitch and Spielman [DS08] have shown how to reduce the problem of solving linear equa-
tions in symmetric M0-matrices to the problem of solving linear equations in SDDM0-matrices,
given a factorization of the M0-matrix of width 2 [EGB05]. These matrices, with the required
factorizations, arise in the solution of the generalized maximum flow problem by interior point
algorithms.
Shklarski and Toledo [ST08a] introduce an extension of support graph preconditioners, called
fretsaw preconditioners, which are well suited to preconditioning finite element matrices. Daitch
and Spielman [DS07] use these preconditioners to solve linear equations in the stiffness matrices
of two-dimensional truss structures in time $O(n^{5/4} \log n \log(1/\epsilon))$.
For linear equations that arise when solving elliptic partial differential equations, other tech-
niques supply fast algorithms. For example, Multigrid methods may be proved correct when
applied to the solution of some of these linear systems [BHM01], and Hierarchical Matrices run
in nearly-linear time when the discretization is nice [BH03a]. Boman, Hendrickson, and Vavasis
[BHV04] have shown that the problem of solving a large class of these linear systems may be
reduced to that of solving diagonally-dominant systems. Thus, our algorithms may be applied
to the solution of these systems.
3 Background and Notation
By log x, we mean the logarithm of x base 2, and by ln x the natural logarithm.
We define SDD0 to be the class of symmetric, weakly diagonally dominant matrices, and
SDDM0 to be the class of SDD0-matrices with non-positive off-diagonal entries. We define a
Laplacian matrix to be a SDDM0-matrix with zero row-sums.
Throughout this paper, we define the A-norm by
$\|x\|_A = \sqrt{x^T A x}$.
3.1 Preconditioners
For symmetric matrices A and B, we write
$A \preceq B$
if B − A is positive semi-definite. We recall that if A is positive semi-definite and B is symmetric,
then all eigenvalues of AB are real. For a matrix B, we let $B^{\dagger}$ denote the Moore-Penrose
pseudo-inverse of B, that is, the matrix with the same nullspace as B that acts as the inverse of B on
its image. We will use the following propositions, whose proofs are elementary.
Proposition 3.1. If A and B are positive semi-definite matrices such that for some α,β > 0,
$\alpha A \preceq B \preceq \beta A$
then A and B have the same nullspace.
Proposition 3.2. If A and B are positive semi-definite matrices having the same nullspace and
α > 0, then
$\alpha A \preceq B$
if and only if
$\alpha B^{\dagger} \preceq A^{\dagger}$.
The following proposition notes the equivalence of two notions of preconditioning. This
proposition is called the “Support Lemma” in [BGH+06] and [Gre96], and is implied by Theo-
rem 10.1 of [Axe85]. We include a proof for completeness.
Proposition 3.3. If A and B are symmetric matrices with the same nullspace and A is positive
semi-definite, then all eigenvalues of $AB^{\dagger}$ lie between $\lambda_{\min}$ and $\lambda_{\max}$ if and only if
$\lambda_{\min} B \preceq A \preceq \lambda_{\max} B$.
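Proof. We first note that $AB^{\dagger}$ has the same eigenvalues as $A^{1/2} B^{\dagger} A^{1/2}$. If for all x ∈ Image(A)
we have
$\lambda_{\min} x^T x \le x^T A^{1/2} B^{\dagger} A^{1/2} x$,
then by setting $z = A^{1/2} x$, we find that for all z ∈ Image(A),
$\lambda_{\min} z^T A^{\dagger} z \le z^T B^{\dagger} z$,
which is equivalent to $\lambda_{\min} A^{\dagger} \preceq B^{\dagger}$ and
$\lambda_{\min} B \preceq A$.
The other side is proved similarly.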
Following Bern et al. [BGH+06], we define the finite generalized condition number $\kappa_f(A,B)$
of matrices A and B having the same nullspace to be the ratio of the largest to smallest non-zero
eigenvalues of $AB^{\dagger}$. Proposition 3.3 tells us that $\lambda_{\min} B \preceq A \preceq \lambda_{\max} B$ implies
$\kappa_f(A,B) \le \lambda_{\max}/\lambda_{\min}$. One can use $\kappa_f(A,B)$ to bound the number of iterations taken by the Preconditioned
Conjugate Gradient algorithm to solve linear systems in A when using B as a preconditioner.
Given bounds on $\lambda_{\max}$ and $\lambda_{\min}$, one can similarly bound the complexity of the Preconditioned
Chebyshev method.
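The definition of the finite generalized condition number can be checked numerically on a small example. The sketch below is only an illustration, assuming NumPy and dense matrices; the helper names `laplacian` and `finite_condition_number` and the tolerance `tol` are ours. It uses the fact, noted in the proof of Proposition 3.3, that $AB^{\dagger}$ and a symmetrized product have the same eigenvalues, here with the square root of $B^{\dagger}$ taken on both sides.

```python
import numpy as np

def laplacian(n, edges):
    """Dense Laplacian of a weighted graph given as (i, j, weight) triples."""
    A = np.zeros((n, n))
    for i, j, w in edges:
        A[i, i] += w
        A[j, j] += w
        A[i, j] -= w
        A[j, i] -= w
    return A

def finite_condition_number(A, B, tol=1e-9):
    """Ratio of the largest to smallest non-zero eigenvalue of A B^+.

    Assumes A and B are symmetric PSD with the same nullspace. The eigenvalues are
    computed from the symmetric matrix (B^+)^{1/2} A (B^+)^{1/2}, which has the same
    non-zero spectrum as A B^+.
    """
    vals, vecs = np.linalg.eigh(B)
    inv_sqrt = np.array([1.0 / np.sqrt(v) if v > tol else 0.0 for v in vals])
    B_pinv_half = (vecs * inv_sqrt) @ vecs.T
    eig = np.linalg.eigvalsh(B_pinv_half @ A @ B_pinv_half)
    nonzero = eig[eig > tol]
    return nonzero.max() / nonzero.min()

# A: unit-weight cycle on 6 vertices; B: the spanning path obtained by dropping one edge.
n = 6
path = [(i, i + 1, 1.0) for i in range(n - 1)]
A = laplacian(n, path + [(0, n - 1, 1.0)])
B = laplacian(n, path)
kappa = finite_condition_number(A, B)
```

For this cycle-versus-path pair the missing edge is routed over a path of n − 1 unit edges, and `kappa` comes out as n = 6, which is the kind of stretch-driven bound that tree preconditioners give.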
3.2 Laplacians Suffice
When constructing preconditioners, we will focus our attention on the problem of preconditioning
Laplacian matrices.
Bern et al. [BGH+06, Lemma 2.5] observe that the problem of preconditioning SDDM0-matrices
is easily reduced to that of preconditioning Laplacian matrices. We recall the reduction
for completeness.
Proposition 3.4. Let A be a SDDM0-matrix. Then, A can be expressed as $A = A_L + A_D$ where
$A_L$ is a Laplacian matrix and $A_D$ is a diagonal matrix with non-negative entries. Moreover,
if $B_L$ is a Laplacian matrix such that $A_L \preceq B_L$, then $A \preceq B_L + A_D$. Similarly, if $B_L$ is a
Laplacian matrix such that $B_L \preceq A_L$, then $B_L + A_D \preceq A$.
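The decomposition in Proposition 3.4 is simple to compute: the diagonal part collects the row sums, and what remains has zero row sums. A minimal sketch, assuming NumPy and a dense input; the function name `split_sddm` and the example matrix are ours.

```python
import numpy as np

def split_sddm(A):
    """Split an SDDM0-matrix A into (A_L, A_D) with A = A_L + A_D.

    A_D is diagonal and holds the row sums of A (non-negative by weak diagonal
    dominance); A_L keeps the off-diagonal entries of A and has zero row sums.
    """
    A = np.asarray(A, dtype=float)
    A_D = np.diag(A.sum(axis=1))
    A_L = A - A_D
    return A_L, A_D

# Example: the first row has excess diagonal weight 2, the others have none.
A = np.array([[3.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])
A_L, A_D = split_sddm(A)
```

A preconditioner for A is then obtained by preconditioning `A_L` and adding `A_D` back, as the proposition states.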
So, any algorithm for constructing sparsifiers or ultra-sparsifiers for Laplacian matrices can
immediately be converted into an algorithm for constructing sparsifiers or ultra-sparsifiers of
SDDM0-matrices. Accordingly in Sections 9 and 10 we will restrict our attention to the problem
of preconditioning Laplacian matrices.
Recall that a symmetric matrix A is reducible if there is a permutation matrix P for which
$P^T A P$ is a block-diagonal matrix with at least two blocks. If such a permutation exists, one
can find it in linear time. A matrix that is not reducible is said to be irreducible. The problem
of solving a linear system in a reducible matrix can be reduced to the problems of solving linear
systems in each of the blocks. Throughout the rest of this paper, we will restrict our attention
to solving linear systems in irreducible matrices. It is well-known that a symmetric matrix is
irreducible if and only if its corresponding graph of non-zero entries is connected. We use this
fact in the special case of Laplacian matrices, observing that the weighted graph associated with
a Laplacian matrix A has the same set of edges as $G_A$.
Proposition 3.5. A Laplacian matrix is irreducible if and only if its corresponding weighted
graph is connected.
It is also well-known that the null-space of the Laplacian matrix of a connected graph is the
span of the all-1’s vector. Combining this fact with Proposition 3.4, one can show that the only
singular irreducible SDDM0-matrices are the Laplacian matrices.
Proposition 3.6. A singular irreducible SDDM0-matrix is a Laplacian matrix, and its nullspace
is spanned by the all-1’s vector.
To the extent possible, we will describe our algorithms for solving irreducible singular and
non-singular systems similarly. The one tool that we use for which this requires some thought is
the Cholesky factorization. As the Cholesky factorization of a Laplacian matrix is degenerate, it
is not immediately clear that one can use backwards and forwards substitutions on the Cholesky
factors to solve a system in a Laplacian. To handle this technicality, we note that an irreducible
Laplacian matrix A has a factorization of the form
$A = LDL^T$,
where L is lower-triangular and non-zero on its entire diagonal and D is a diagonal matrix with
ones on each diagonal entry, excluding the bottom right-most which is a zero. This factorization
may be computed by a slight modification of standard Cholesky factorization algorithms. The
pseudo-inverse of A can be written
$A^{\dagger} = \Pi L^{-T} D L^{-1} \Pi$,
where Π is the projection orthogonal to the all-1’s vector (see Appendix D).
When A is a Laplacian and we refer to forwards or backwards substitution on its Cholesky
factors, we will mean multiplying by $D L^{-1} \Pi$ or $\Pi L^{-T} D$, respectively, and remark that these
operations can be performed in time proportional to the number of non-zero entries in L.
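One way to realize the "slight modification" of Cholesky factorization is sketched below, purely as an illustration: run the usual column-by-column elimination on the first n − 1 columns and put a 1 in the last diagonal position of L. This particular variant, the dense NumPy representation, and the name `degenerate_ldl` are our assumptions; the paper does not spell out its modification in this section.

```python
import numpy as np

def degenerate_ldl(A):
    """Factor an irreducible Laplacian as A = L D L^T with D = diag(1, ..., 1, 0)."""
    n = A.shape[0]
    S = np.array(A, dtype=float)          # running Schur complement
    L = np.zeros((n, n))
    for j in range(n - 1):
        # For a connected graph every pivot among the first n-1 is positive.
        L[j:, j] = S[j:, j] / np.sqrt(S[j, j])
        S[j:, j:] -= np.outer(L[j:, j], L[j:, j])
    L[n - 1, n - 1] = 1.0                 # last pivot is zero; keep L non-singular
    D = np.eye(n)
    D[n - 1, n - 1] = 0.0
    return L, D

# The Laplacian from Figure 1.
A = np.array([[1.5, -1.5, 0.0, 0.0],
              [-1.5, 4.0, -2.0, -0.5],
              [0.0, -2.0, 3.0, -1.0],
              [0.0, -0.5, -1.0, 1.5]])
L, D = degenerate_ldl(A)
n = A.shape[0]
Pi = np.eye(n) - np.ones((n, n)) / n      # projection orthogonal to the all-1's vector
L_inv = np.linalg.inv(L)                  # dense inverse only for checking the formula
A_pinv = Pi @ L_inv.T @ D @ L_inv @ Pi
```

In an actual solver one would never form `L_inv`; multiplying by $D L^{-1} \Pi$ and $\Pi L^{-T} D$ is done by triangular substitution, which is what keeps the cost proportional to the number of non-zeros of L.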
4 Solvers
We first note that by Gremban’s reduction, the problem of solving an equation of the form
Ax = b for a SDD0-matrix A can be reduced to the problem of solving a system that is twice
as large in a SDDM0-matrix (see Appendix A). So, for the purposes of asymptotic complexity,
we need only consider the problem of solving systems in SDDM0-matrices.
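As a rough illustration of such a doubling reduction, the sketch below uses the form commonly attributed to Gremban: split the off-diagonal part of A into its negative and positive entries, and solve a 2n-by-2n system whose off-diagonal entries are all non-positive. The exact statement in Appendix A is not reproduced in this section, so the block layout, the function names `gremban_expand` and `solve_sdd`, and the use of a dense direct solve are our assumptions.

```python
import numpy as np

def gremban_expand(A, b):
    """Build a 2n-by-2n SDDM0 system from an SDD0 system A x = b."""
    A = np.asarray(A, dtype=float)
    D = np.diag(np.diag(A))
    off = A - D
    A_neg = np.minimum(off, 0.0)          # non-positive off-diagonal entries
    A_pos = np.maximum(off, 0.0)          # positive off-diagonal entries
    top = np.hstack([D + A_neg, -A_pos])
    bottom = np.hstack([-A_pos, D + A_neg])
    return np.vstack([top, bottom]), np.concatenate([b, -b])

def solve_sdd(A, b):
    """Solve A x = b through the doubled system; (x, -x) solves the doubled system."""
    A_hat, b_hat = gremban_expand(A, b)
    y = np.linalg.solve(A_hat, b_hat)     # stand-in for any SDDM0 solver
    n = len(b)
    return 0.5 * (y[:n] - y[n:])

# A strictly diagonally dominant example with positive off-diagonal entries.
A = np.array([[4.0, 1.0, -2.0],
              [1.0, 3.0, 1.0],
              [-2.0, 1.0, 5.0]])
b = np.array([1.0, 2.0, 3.0])
x = solve_sdd(A, b)
```

Substituting (x, −x) into the doubled system gives (Ax, −Ax) = (b, −b), which is why the answer can be read off from the difference of the two halves.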
To solve systems in an irreducible SDDM0-matrix A, we will compute an ultra-sparsifier
B of A, and then solve the system in A using a preconditioned iterative method. At each
iteration of this method, we will need to solve a system in B. We will solve a system in B
by a two-step algorithm. We will first apply Cholesky factorization repeatedly to eliminate
all rows and columns with at most one or two non-zero off-diagonal entries. As we stop the
Cholesky factorization before it has factored the entire matrix, we call this process a partial
Cholesky factorization. We then apply another solver on the remaining system. In this section,
we analyze the use of a direct solver. In Section 5, we obtain our fastest algorithms by solving
the remaining system recursively.
The application of partial Cholesky factorization to eliminate rows and columns with at most
2 non-zero off-diagonal entries results in a factorization of B of the form
$B = P L C L^T P^T$,
where C has the form
$C = \begin{pmatrix} I_{n-n_1} & 0 \\ 0 & A_1 \end{pmatrix}$,
P is a permutation matrix, L is non-singular and lower triangular of the form
$L = \begin{pmatrix} L_{1,1} & 0 \\ L_{2,1} & I_{n_1} \end{pmatrix}$,
and every row and column of $A_1$ has at least 3 non-zero off-diagonal entries.
We will exploit the properties of this factorization stated in the following proposition.
Proposition 4.1 (Partial Cholesky Factorization). If B is an irreducible SDDM0-matrix then,
(a) $A_1$ is an irreducible SDDM0-matrix and is singular if and only if B is singular.
(b) If the graph of non-zero entries of B is planar, then the graph of non-zero entries of $A_1$ is
as well.
(c) L has at most 3n non-zero entries.
(d) If B has 2(n−1+j) non-zero off-diagonal entries, then $A_1$ has dimension at most 2j −2
and has at most 2(3j − 3) non-zero off-diagonal entries.
(e) $B^{\dagger} = \Pi P^{-T} L^{-T} \begin{pmatrix} I_{n-n_1} & 0 \\ 0 & A_1^{\dagger} \end{pmatrix} L^{-1} P^{-1} \Pi$, where Π is the projection onto the span of B.
Proof. It is routine to verify that $A_1$ is diagonally dominant with non-positive off-diagonal
entries, and that planarity is preserved by elimination of rows and columns with 2 or 3 non-zero
entries, as these correspond to vertices of degree 1 or 2 in the graph of non-zero entries. It is
similarly routine to observe that these eliminations preserve irreducibility and singularity.
To bound the number of entries in L, we note that for each row and column with 1 non-zero
off-diagonal entry that is eliminated, the corresponding column in L has 2 non-zero entries,
and that for each row and column with 2 non-zero off-diagonal entries that is eliminated, the
corresponding column in L has 3 non-zero entries.
To bound $n_1$, the dimension of $A_1$, first observe that the elimination of a row and column
with 1 or 2 non-zero off-diagonal entries decreases both the dimension by 1 and the number of
non-zero entries by 2. So, $A_1$ will have $2(n_1 - 1 + j)$ non-zero off-diagonal entries. As each row
in $A_1$ has at least 3 non-zero off-diagonal entries, we have
$2(n_1 - 1 + j) \ge 3 n_1$,
which implies $n_1 \le 2j - 2$. The bound on the number of non-zero off-diagonal entries in $A_1$ follows
immediately.
Finally, (e) may be proved by verifying that the formula given for $B^{\dagger}$ satisfies all the axioms
of the pseudo-inverse (which we do in Appendix D).
We name the algorithm that performs this factorization PartialChol, and invoke it with
the syntax
(P,L,A1) = PartialChol(B).
We remark that PartialChol can be implemented to run in linear time.
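On the graph of a Laplacian, the elimination that PartialChol performs has a simple description: a degree-1 vertex is deleted, and a degree-2 vertex is deleted and replaced by a single edge between its two neighbours whose weight is the series combination of the two removed weights (the Schur complement). The sketch below tracks only this graph reduction, not the factors P and L; the function name `partial_eliminate`, the adjacency-dictionary representation, and the restriction to Laplacians (no extra diagonal weight) are our assumptions.

```python
from collections import deque

def partial_eliminate(n, edges):
    """Repeatedly eliminate vertices of degree at most 2 from a weighted graph.

    Returns (adj, order): the remaining graph as {vertex: {neighbour: weight}}
    and the elimination order. Each vertex is processed O(1) times.
    """
    adj = {v: {} for v in range(n)}
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
    queue = deque(v for v in adj if len(adj[v]) <= 2)
    order = []
    while queue:
        v = queue.popleft()
        if v not in adj or len(adj[v]) > 2:
            continue                      # already eliminated (queued twice)
        nbrs = list(adj[v].items())
        del adj[v]
        order.append(v)
        for u, _ in nbrs:
            del adj[u][v]
        if len(nbrs) == 2:
            (a, wa), (b, wb) = nbrs
            w = wa * wb / (wa + wb)       # Schur complement of a degree-2 vertex
            adj[a][b] = adj[a].get(b, 0.0) + w
            adj[b][a] = adj[b].get(a, 0.0) + w
        for u, _ in nbrs:
            if len(adj[u]) <= 2:          # degrees never increase, so re-check neighbours only
                queue.append(u)
    return adj, order

# K4 on vertices 0..3, with edge (0,1) subdivided by vertex 6 and a pendant path 3-4-5.
edges = [(0, 2, 1.0), (0, 3, 1.0), (1, 2, 1.0), (1, 3, 1.0), (2, 3, 1.0),
         (0, 6, 2.0), (6, 1, 2.0), (3, 4, 1.0), (4, 5, 1.0)]
core, order = partial_eliminate(7, edges)
```

In this example the graph has n − 1 + j edges with n = 7 and j = 3, and the surviving core has 4 = 2j − 2 vertices and 6 = 3j − 3 edges, matching the bounds of Proposition 4.1(d) with equality.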
4.1 One-Level Algorithms
Before analyzing the algorithm in which we solve systems in $A_1$ recursively, we pause to examine
the complexity of an algorithm that applies a direct solver to systems in $A_1$. While the results
in this subsection are not necessary for the main claims of our paper, we hope they will provide
intuition.
If we are willing to ignore numerical issues, we may apply the conjugate gradient algorithm
to directly solve systems in $A_1$ in $O(n_1 m_1)$ operations [TB97, Theorem 28.3], where $m_1$ is the
number of non-zero entries in $A_1$. In the following theorem, we examine the performance of the
resulting algorithm.
Theorem 4.2 (General One-Level Algorithm). Let A be an irreducible n-by-n SDDM0-matrix
with 2m non-zero off-diagonal entries. Let B be a $\sqrt{m}$-ultra-sparsifier of A. Let $(P, L, A_1)$ =
PartialChol(B). Consider the algorithm that solves systems in A by applying PCG with B as a
preconditioner, and solves each system in B by performing backward substitution on its partial
Cholesky factor, solving the inner system in $A_1$ by conjugate gradient used as an exact solver,
and performing forward substitution on its partial Cholesky factor. Then for every right-hand
side b, after
$O(m^{1/4} \log(1/\epsilon))$
iterations, comprising
$O(m^{5/4} \log^{2c_4} n \log(1/\epsilon))$
arithmetic operations, the algorithm will output an approximate solution $\tilde{x}$ satisfying
$\|\tilde{x} - A^{\dagger} b\|_A \le \epsilon \|A^{\dagger} b\|_A$. (2)
Proof. As $\kappa_f(A,B) \le \sqrt{m}$, we may apply the standard analysis of PCG [Axe85] to show that
(2) will be satisfied after $O(m^{1/4} \log(1/\epsilon))$ iterations. To bound the number of operations in
each iteration, note that B has at most $2(n-1) + O(\sqrt{m} \log^{c_4} n)$ non-zero off-diagonal entries.
So, Proposition 4.1 implies $m_1$ and $n_1$ are both $O(\sqrt{m} \log^{c_4} n)$. Thus, the time required to solve
each inner system in $A_1$ is at most $O(m_1 n_1) = O(m \log^{2c_4} n)$. As A is irreducible, m ≥ n − 1, so
this bounds the number of operations that must be performed in each iteration.
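The outer loop of the one-level algorithm is ordinary PCG in which each application of the preconditioner is a solve in B. The sketch below shows that structure on a toy Laplacian, measuring the error in the A-norm as in (2). It is an assumption-laden illustration rather than the algorithm of Theorem 4.2: B is simply a spanning path (not an ultra-sparsifier), and the solve in B uses a dense pseudo-inverse in place of partial Cholesky factorization plus an inner solver. All function names are ours.

```python
import numpy as np

def laplacian(n, edges):
    A = np.zeros((n, n))
    for i, j, w in edges:
        A[i, i] += w
        A[j, j] += w
        A[i, j] -= w
        A[j, i] -= w
    return A

def a_norm(A, x):
    """The A-norm sqrt(x^T A x), clipped at zero against round-off."""
    return np.sqrt(max(x @ A @ x, 0.0))

def pcg(A, b, solve_B, iters):
    """Preconditioned conjugate gradient; solve_B(r) should return B^+ r."""
    x = np.zeros_like(b)
    r = b.copy()
    z = solve_B(r)
    p = z.copy()
    rz = r @ z
    rz0 = rz
    for _ in range(iters):
        if rz <= 1e-28 * rz0:             # residual is negligible; stop early
            break
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = solve_B(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

n = 12
path = [(i, i + 1, 1.0) for i in range(n - 1)]
extra = [(0, 11, 1.0), (2, 7, 1.0), (4, 9, 1.0)]
A = laplacian(n, path + extra)            # graph: path plus three extra edges
B = laplacian(n, path)                    # subgraph preconditioner: the path alone
B_pinv = np.linalg.pinv(B, 1e-10)
rng = np.random.default_rng(0)
b = rng.standard_normal(n)
b -= b.mean()                             # right-hand side must be orthogonal to all-1's
x = pcg(A, b, lambda r: B_pinv @ r, iters=40)
x_star = np.linalg.pinv(A, 1e-10) @ b
rel_err = a_norm(A, x - x_star) / a_norm(A, x_star)
```

Because A differs from B by only three edges, $AB^{\dagger}$ has at most four distinct non-zero eigenvalues and PCG converges in a handful of iterations here; in general the iteration count is governed by $\kappa_f(A,B)$ as in the proof above.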
If m is much greater than n, we could speed up this algorithm by first applying Sparsify2
to compute a very good sparse preconditioner $A_s$ for A, using the one-level algorithm to solve
systems in $A_s$, and then applying this solver to A by iterative refinement.
When the graph of non-zero entries of A is planar, we may precondition using the
algorithm UltraSimple, presented in Section 9, instead of UltraSparsify. As the matrix
$A_1$ produced by applying partial Cholesky factorization to the output of UltraSimple is also
planar, we can solve the linear systems in $A_1$ by the generalized nested dissection algorithm of
Lipton, Rose and Tarjan [LRT79]. This algorithm uses graph separators to choose a good order
for Cholesky factorization. The Cholesky factorization is then computed in time $O(n_1^{3/2})$. The
resulting Cholesky factors only have $O(n_1 \log n_1)$ non-zero entries, and so each linear system in
$A_1$ may be solved in time $O(n_1 \log n_1)$, after the Cholesky factors have been computed.
Theorem 4.3 (Planar One-Level Algorithm). Let A be an n-by-n planar SDDM0-matrix with
m non-zero entries. Consider the algorithm that solves linear systems in A by using PCG with
the preconditioner
$B = \texttt{UltraSimple}(A, n^{3/4} \log^{1/3} n)$,
and solves systems in B by applying PartialChol to factor B into $P L [I, 0; 0, A_1] L^T P^T$, and
uses generalized nested dissection to solve systems in $A_1$. For every right-hand side b, this
algorithm computes an $\tilde{x}$ satisfying
$\|\tilde{x} - A^{\dagger} b\|_A \le \epsilon \|A^{\dagger} b\|_A$ (3)
in time
$O(n^{9/8} \log^{1/2} n \log(1/\epsilon))$.
Proof. First, recall that the planarity of A implies m ≤ 3n. Thus, the time taken by UltraSimple
is dominated by the time taken by LowStretch, which is $O(n \log^2 n)$.
By Theorem 9.1 and Theorem 9.5, the matrix B has at most $2(n-1) + 6 n^{3/4} \log^{1/3} n$ non-zero
off-diagonal entries and
$\kappa_f(A,B) = O(n^{1/4} \log^{2/3} n \log^2 \log n) \le O(n^{1/4} \log n)$.
Again, standard analysis of PCG [Axe85] tells us that the algorithm will require at most
$O(n^{1/8} \log^{1/2} n \log(1/\epsilon))$
iterations to guarantee that (3) is satisfied.
By Proposition 4.1, the dimension of $A_1$, $n_1$, is at most $6 n^{3/4} \log^{1/3} n$. Before beginning to
solve the linear system, the algorithm will spend
$O(n_1^{3/2}) = O((n^{3/4} \log^{1/3} n)^{3/2}) = O(n^{9/8} \log^{1/2} n)$
time using generalized nested dissection [LRT79] to permute and Cholesky factor the matrix $A_1$.
As the factors obtained will have at most $O(n_1 \log n_1) \le O(n)$ non-zeros, each iteration of the
PCG will require at most O(n) steps. So, the total complexity of the application of the PCG
will be
$O(n \cdot n^{1/8} \log^{1/2} n \log(1/\epsilon)) = O(n^{9/8} \log^{1/2} n \log(1/\epsilon))$,
which dominates the time required to compute the Cholesky factors and the time of the call to
UltraSimple.
5 The Recursive Solver
In our recursive algorithm for solving linear equations, we solve linear equations in a matrix A
by computing an ultra-sparsifier B, using partial Cholesky factorization to reduce it to a matrix
$A_1$, and then solving the system in $A_1$ recursively. Of course, we compute all of the necessary
ultra-sparsifiers and Cholesky factorizations just once at the beginning of the algorithm.
To specify the recursive algorithm for an n-by-n matrix, we first set the parameters
$\chi = c_3 \log^{c_4} n$, (4)
and
$k = (14\chi + 1)^2$, (5)
where we recall that $c_3$ and $c_4$ are determined by the quality of the ultra-sparsifiers we can
compute (see equation (1)), and were used to define a k-ultra-sparsifier.
We use the following algorithm BuildPreconditioners to build the sequence of preconditioners
and Cholesky factors. In Section 10, we define the routine UltraSparsify for weighted graphs,
and thus implicitly for Laplacian matrices. For general irreducible SDDM0-matrices A, we
express A as a sum of matrices $A_L$ and $A_D$ as explained in Proposition 3.4, and return $A_D$ plus
the ultra-sparsifier of the Laplacian matrix $A_L$.
BuildPreconditioners($A_0$),
1. i = 0.
2. Repeat
(a) i = i + 1.
(b) $B_i$ = UltraSparsify($A_{i-1}$, k).
(c) $(P_i, L_i, A_i)$ = PartialChol($B_i$).
(d) Set $\Pi_i$ to be the projection onto the span of $B_i$.
Until $A_i$ has dimension less than 66χ + 6.
3. Set ℓ = i.
4. Compute $Z_\ell = A_\ell^{\dagger}$.
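The control flow of BuildPreconditioners can be sketched as a small driver in Python. This is only a skeleton under stated assumptions: the routines `ultra_sparsify`, `partial_chol`, `dim`, and `pinv` are hypothetical callables supplied by the caller (none of them is implemented here), and the projections $\Pi_i$ are omitted for brevity.

```python
def build_preconditioners(A0, k, chi, ultra_sparsify, partial_chol, dim, pinv):
    """Return ([(B_1, P_1, L_1), ..., (B_l, P_l, L_l)], A_l, Z_l).

    Mirrors the repeat-until loop: sparsify, partially factor, and stop once the
    reduced matrix has dimension less than 66*chi + 6.
    """
    levels = []
    A = A0
    while True:
        B = ultra_sparsify(A, k)          # step 2(b)
        P, L, A = partial_chol(B)         # step 2(c): A becomes the next A_i
        levels.append((B, P, L))
        if dim(A) < 66 * chi + 6:         # "Until" test, checked after the body
            break
    Z = pinv(A)                           # step 4: pseudo-inverse of the last matrix
    return levels, A, Z

# Toy run with stand-ins: a "matrix" is just its dimension, and each level halves it.
levels, A_last, Z_last = build_preconditioners(
    1000, k=225, chi=1,
    ultra_sparsify=lambda A, k: A,
    partial_chol=lambda B: (None, None, B // 2),
    dim=lambda A: A,
    pinv=lambda A: ("pinv", A),
)
```

The toy stand-ins only exercise the loop and the stopping rule; with chi = 1 the threshold is 72, so the run above stops after four levels.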
We now make a few observations about the sequence of matrices this algorithm generates.
In the following, we let noff (A) denote the number of non-zero off-diagonal entries in the upper-
triangular portion of A, and let dim(A) denote the dimension of A.
Proposition 5.1 (Recursive Preconditioning). If A0 is a symmetric, irreducible, SDDM0-matrix,
and for each i the matrix Bi is a k-ultra-sparsifier of Ai−1, then
(a) For i ≥ 1, noff (Ai) ≤ (3χ/k) noff (Ai−1).
(b) For i ≥ 1, dim(Ai) ≤ (2χ/k) noff (Ai−1).
(c) For i ≥ 1, dim(Bi) = dim(Ai−1).
(d) Each of Bi and Ai is an irreducible SDDM0-matrix.
(e) Each Ai and Bi is a Laplacian matrix if and only if A0 is as well.
(f) If A0 is a Laplacian matrix, then each Πi is a projection orthogonal to the all-1’s vector.
Otherwise, each Πi is the identity.
Proof. Let ni be the dimension of Ai. Definition 1.1 tells us that
noff (Bi) ≤ n − 1 + h noff (Ai)/k = n − 1 + noff (Ai)χ/k.
Parts (a), (b), (d) and (e) now follow from Proposition 4.1. Part (c) is obvious, and part (f)
follows from Proposition 3.6.
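To get a feel for how quickly the recursion bottoms out, the following sketch iterates the worst-case bounds of parts (a) and (b) numerically. The helper name `recursion_profile` and the default values of c3 and c4 are hypothetical, chosen only for illustration; the paper leaves c3 and c4 as unspecified constants.

```python
import math

def recursion_profile(m, n, c3=1.0, c4=2.0):
    """Iterate the worst-case bounds of Proposition 5.1.

    m: bound on noff(A_0); n: dimension of A_0.
    c3, c4: hypothetical constants for chi = c3 * log(n)**c4.
    Returns (chi, k, levels); levels[i] is the bound on noff(A_i)
    for every level that is built before the recursion stops.
    """
    chi = c3 * math.log(n) ** c4      # equation (4)
    k = (14 * chi + 1) ** 2           # equation (5)
    levels = [float(m)]
    # Part (b): dim(A_i) <= (2 chi / k) noff(A_{i-1}).  Keep recursing
    # while that bound is still at least the threshold 66 chi + 6.
    while (2 * chi / k) * levels[-1] >= 66 * chi + 6:
        levels.append((3 * chi / k) * levels[-1])   # part (a)
    return chi, k, levels
```

Because 3χ/k is far below 1 for k = (14χ + 1)^2, the list `levels` shrinks geometrically and is typically very short.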
Our recursive solver will use each matrix Bi as a preconditioner for Ai−1. But rather than
solve systems in Bi directly, it will reduce these to systems in Ai, which will in turn be solved
recursively. Our solver will use the preconditioned Chebyshev method, instead of the preconditioned
conjugate gradient. This choice is dictated by the requirements of our analysis rather
than by common sense. Our preconditioned Chebyshev method will not take the preconditioner
Bi as input. Rather, it will take a subroutine solveBi that produces approximate solutions to
systems in Bi. So that we can guarantee that our solvers will be linear operators, we will fix
the number of iterations that each will perform, as opposed to allowing them to terminate upon
finding a sufficiently good solution. While this trick is necessary for our analysis, it may be
unnecessary in practice.¹
For concreteness, we present pseudocode for the variant of the preconditioned Chebyshev
algorithm that we will use. It is a modification of the pseudocode presented in [BBC+94, page
36], the difference being that it takes as input a parameter t determining the number of iterations
it executes (and some variable names have been changed).
x = precondCheby(A,b,t,f(·),λmin,λmax)
(0) Set x = 0.
(1) r = b
(2) d = (λmax + λmin)/2, c = (λmax − λmin)/2
(3) for i = 1,...,t,
    (a) z = f(r)
    (b) if i = 1,
            p = z
            α = 2/d
        else,
            β = (cα/2)^2
            α = 1/(d − β)
            p = z + βp
    (c) x = x + αp
    (d) r = b − Ax
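The idea of running a fixed number of Chebyshev steps can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses a standard three-term formulation of the Chebyshev recurrence (as found in common textbook presentations) rather than transcribing the pseudocode above line by line, it works on dense NumPy arrays, and the name `precond_cheby` and its argument order are hypothetical.

```python
import numpy as np

def precond_cheby(A, b, f, t, lam_min, lam_max):
    """Run exactly t steps of preconditioned Chebyshev iteration from x = 0.

    A: symmetric matrix (dense ndarray in this sketch)
    b: right-hand side (ndarray)
    f: callable applying the preconditioner to a residual vector
    lam_min, lam_max: bounds on the non-zero eigenvalues of f composed
        with A (they must differ, since their gap is divided by below)
    """
    theta = (lam_max + lam_min) / 2.0   # centre of the interval ("d" above)
    delta = (lam_max - lam_min) / 2.0   # half-width of the interval ("c" above)
    sigma = theta / delta
    rho = 1.0 / sigma
    x = np.zeros_like(b, dtype=float)
    r = b.astype(float)
    p = f(r) / theta                    # first update direction
    for _ in range(t):
        x = x + p
        r = r - A @ p
        rho_next = 1.0 / (2.0 * sigma - rho)
        p = rho_next * rho * p + (2.0 * rho_next / delta) * f(r)
        rho = rho_next
    return x
```

Since t is fixed in advance and every step is linear in r, the map from b to the returned x is a linear operator whenever f is, which is the property the analysis relies on.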
Proposition 5.2 (Linear Chebyshev). Let A be a positive semi-definite matrix and f be a
positive semi-definite, symmetric linear operator such that
$$\lambda_{\min} f^{\dagger} \preceq A \preceq \lambda_{\max} f^{\dagger}. \tag{6}$$
Let ε < 1 and let
$$t \ge \left\lceil \frac{1}{2}\sqrt{\frac{\lambda_{\max}}{\lambda_{\min}}}\,\ln\frac{2}{\epsilon} \right\rceil. \tag{7}$$
Then, the function precondCheby(A,b,t,f,λmin,λmax) is a symmetric linear operator in b with
the same nullspace as A. Moreover, if Z is the matrix realizing this operator, then
$$(1-\epsilon) Z^{\dagger} \preceq A \preceq (1+\epsilon) Z^{\dagger}.$$
¹One could obtain a slightly weaker analysis of this algorithm if one instead allowed the Chebyshev solvers to
terminate as soon as they found a sufficiently accurate solution. In an early version of this paper, we analyzed such
an algorithm using the analysis of the inexact preconditioned Chebyshev iteration by Golub and Overton [GO88].
This analysis was improved by applying a slight extension by Joshi [Jos97] of Golub and Overton’s analysis. The
idea of bypassing these analyses by forcing our solvers to be linear operators was suggested to us by Vladimir
Rokhlin.
Proof. By Proposition 3.1, condition (6) implies that f and A have the same nullspace. An
inspection of the pseudo-code reveals that the function computed by precondCheby can be
expressed as a sum of monomials of the form f(Af)i, from which it follows that this function is
a symmetric linear operator having the same nullspace as A. Let Z be the matrix realizing this
operator.
Standard analyses of the preconditioned Chebyshev algorithm [Axe85, Section 5.3] imply
that for all b in the range of A,
$$\left\| Zb - A^{\dagger} b \right\|_A \le \epsilon \left\| A^{\dagger} b \right\|_A.$$
Now, consider any non-zero eigenvalue λ and eigenvector b of AZ, so that
$$AZb = \lambda b.$$
Multiplying on the left by A† and using the fact that Z and A have the same nullspace, we
obtain
$$Zb = \lambda A^{\dagger} b.$$
Plugging this into the previous inequality, we find
$$\epsilon \left\| A^{\dagger} b \right\|_A \ge \left\| Zb - A^{\dagger} b \right\|_A = |\lambda - 1| \left\| A^{\dagger} b \right\|_A,$$
and so λ must lie between 1 − ε and 1 + ε. Applying Proposition 3.3, we obtain
$$(1-\epsilon) Z^{\dagger} \preceq A \preceq (1+\epsilon) Z^{\dagger}.$$
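The statement of Proposition 5.2 can be checked numerically on a small example. The sketch below is an assumption-laden illustration, not the paper's code: it takes f to be the identity, restricts to a positive definite A, uses a standard textbook form of the Chebyshev recurrence, and assembles the matrix Z column by column. The names `cheby_operator` and `check_proposition` are hypothetical.

```python
import math
import numpy as np

def cheby_operator(A, f, t, lam_min, lam_max):
    """Return the matrix Z such that Z @ b is t Chebyshev steps applied to b."""
    n = A.shape[0]
    theta, delta = (lam_max + lam_min) / 2, (lam_max - lam_min) / 2
    sigma = theta / delta
    Z = np.zeros((n, n))
    for col in range(n):
        b = np.eye(n)[:, col]            # apply the solver to each unit vector
        x, r, rho = np.zeros(n), b.copy(), 1 / sigma
        p = f(r) / theta
        for _ in range(t):
            x, r = x + p, r - A @ p
            rho_next = 1 / (2 * sigma - rho)
            p = rho_next * rho * p + (2 * rho_next / delta) * f(r)
            rho = rho_next
        Z[:, col] = x
    return Z

def check_proposition(A, eps):
    """Choose t as in inequality (7) and return Z, t and the eigenvalues of ZA."""
    lam = np.linalg.eigvalsh(A)
    lam_min, lam_max = lam[0], lam[-1]
    t = math.ceil(0.5 * math.sqrt(lam_max / lam_min) * math.log(2 / eps))
    Z = cheby_operator(A, lambda r: r, t, lam_min, lam_max)
    ZA = Z @ A
    mu = np.linalg.eigvalsh((ZA + ZA.T) / 2)   # ZA is symmetric up to rounding
    return Z, t, mu
```

For a positive definite A, the proposition predicts that Z is symmetric and that every entry of `mu` lies in [1 − ε, 1 + ε].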
We can now state the subroutine solveBi for i = 1,...,ℓ.
x = solveBi(b)
1. Set λmin = 1 − 2e^{−2}, λmax = (1 + 2e^{−2})k and t = ⌈1.33√k⌉.
2. Set $s = L_i^{-1} P_i^{-1} \Pi_i b$.
3. Write $s = \begin{pmatrix} s_0 \\ s_1 \end{pmatrix}$, where the dimension of s1 is the size of Ai.
4. Set y0 = s0, and
(a) if i = ℓ, set y1 = Zℓ s1
(b) else, set y1 = precondCheby(Ai,s1,solveBi+1,t,λmin,λmax).
5. Set $x = \Pi_i P_i^{-T} L_i^{-T} \begin{pmatrix} y_0 \\ y_1 \end{pmatrix}$.
We have chosen the parameters λmin, λmax, and t so that inequality (7) holds for ε = 2e^{−2}.
We note that we apply $L_i^{-T}$ and $L_i^{-1}$ by forward and backward substitution, rather than by
constructing the inverses. Similarly, Πi may be applied in time proportional to the length of b as
it is either the identity, or the operator that orthogonalizes with respect to the all-1’s vector. We
remark that the multiplications by Πi are actually unnecessary in our code, as solveBi will only
appear inside a call to precondCheby, in which case it is multiplied on either side by matrices
that implicitly contain Πi. However, our analysis is simpler if we include these applications of
Πi.
Lemma 5.3 (Correctness of solveBi). If A is an irreducible SDDM0-matrix and Bi ≼ Ai−1 ≼
kBi for all i ≥ 1, then for 1 ≤ i ≤ ℓ,
(a) The function solveBi is a symmetric linear operator.
(b) The function precondCheby(Ai−1,b,solveBi,t,λmin,λmax) is a symmetric linear operator
in b.
(c)
$$(1 - 2e^{-2})\, Z_i^{\dagger} \preceq A_i \preceq (1 + 2e^{-2})\, Z_i^{\dagger},$$
where for i ≤ ℓ − 1, Zi is the matrix such that
Zi s1 = precondCheby(Ai,s1,solveBi+1,t,λmin,λmax).
(d)
$$(1 - 2e^{-2})\,\mathrm{solveB}_i^{\dagger} \preceq B_i \preceq (1 + 2e^{-2})\,\mathrm{solveB}_i^{\dagger}.$$
Proof. We first prove (a) and (b) by reverse induction on i. The base case of our induction is
when i = ℓ, in which case BuildPreconditioners sets Zℓ = Aℓ†, and so
$$\mathrm{solveB}_\ell = \Pi_\ell P_\ell^{-T} L_\ell^{-T} \begin{pmatrix} I & 0 \\ 0 & Z_\ell \end{pmatrix} L_\ell^{-1} P_\ell^{-1} \Pi_\ell,$$
which is obviously a symmetric linear operator. Given that solveBi is a symmetric linear
operator, part (b) for Ai−1 follows from Proposition 5.2. Given that (b) holds for Ai and that
the call to precondCheby is realized by a symmetric matrix Zi, we then have that
$$\mathrm{solveB}_i = \Pi_i P_i^{-T} L_i^{-T} \begin{pmatrix} I & 0 \\ 0 & Z_i \end{pmatrix} L_i^{-1} P_i^{-1} \Pi_i$$
is a symmetric linear operator. We may thereby establish that (a) and (b) hold for all 1 ≤ i ≤ ℓ.
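We now prove properties (c) and (d), again by reverse induction. By construction Zℓ = Aℓ†,
so (c) holds for i = ℓ. To see that if (c) holds for i, then (d) does also, note that
$$(1 - 2e^{-2})\, Z_i^{\dagger} \preceq A_i$$
implies
$$(1 - 2e^{-2})\, A_i^{\dagger} \preceq Z_i,$$
by Proposition 3.2, which implies
$$(1 - 2e^{-2}) \begin{pmatrix} I & 0 \\ 0 & A_i^{\dagger} \end{pmatrix} \preceq \begin{pmatrix} I & 0 \\ 0 & Z_i \end{pmatrix},$$
which implies
$$\begin{aligned}
(1 - 2e^{-2})\, B_i^{\dagger} &= (1 - 2e^{-2})\, \Pi_i P_i^{-T} L_i^{-T} \begin{pmatrix} I & 0 \\ 0 & A_i^{\dagger} \end{pmatrix} L_i^{-1} P_i^{-1} \Pi_i && \text{(by Proposition 4.1 (e))} \\
&\preceq \Pi_i P_i^{-T} L_i^{-T} \begin{pmatrix} I & 0 \\ 0 & Z_i \end{pmatrix} L_i^{-1} P_i^{-1} \Pi_i = \mathrm{solveB}_i,
\end{aligned}$$
which by Proposition 3.2 implies (1 − 2e^{−2}) solveBi† ≼ Bi. The inequality Bi ≼ (1 + 2e^{−2}) solveBi†
may be established similarly.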
To show that when (d) holds for i then (c) holds for i − 1, note that (d) and Bi ≼ Ai−1 ≼ k·Bi
imply
$$(1 - 2e^{-2})\,\mathrm{solveB}_i^{\dagger} \preceq A_{i-1} \preceq k(1 + 2e^{-2})\,\mathrm{solveB}_i^{\dagger}.$$
So, (c) for i − 1 now follows from Proposition 5.2 and the fact that λmin, λmax and t have been
chosen so that inequality (7) is satisfied with ε = 2e^{−2}.
Lemma 5.4 (Complexity of solveBi). If A0 is an irreducible, n-by-n SDDM0-matrix with 2m
non-zero off-diagonal entries and each Bi is a k-ultra-sparsifier of Ai−1, then solveB1 runs in
time
O(n + m).
Proof. Let Ti denote the running time of solveBi. We will prove by reverse induction on i that
there exists a constant c such that
$$T_i \le c\left(\dim(B_i) + (\gamma\chi + \delta)\left(\mathrm{noff}(A_i) + \dim(A_i)\right)\right), \tag{8}$$
where
γ = 196 and δ = 15.
This will prove the lemma as dim(B1) = dim(A0) = n, and Proposition 5.1 implies
$$(\gamma\chi + \delta)\left(\mathrm{noff}(A_i) + \dim(A_i)\right) \le (\gamma\chi + \delta)\frac{5\chi m}{k} \le m\,\frac{5\gamma\chi^2 + 5\delta\chi}{(14\chi + 1)^2} = O(m).$$
To prove (8), we note that there exists a constant c so that steps 2 and 5 take time at most
c(dim(Bi)) (by Proposition 4.1), step 4a takes time at most c(dim(Aℓ)^2), and step 4b takes
time at most t(c · dim(Ai) + c · noff (Ai) + Ti+1), where t is as defined on step 1 of solveBi.
The base case of our induction will be i = ℓ, in which case the preceding analysis implies
$$\begin{aligned}
T_\ell &\le c\left(\dim(B_\ell) + \dim(A_\ell)^2\right) \\
&\le c\left(\dim(B_\ell) + (66\chi + 6)\dim(A_\ell)\right) && \text{(by step 2 of BuildPreconditioners)},
\end{aligned}$$
which satisfies (8). We now prove (8) is true for i < ℓ, assuming it is true for i + 1. We have
$$\begin{aligned}
T_i &\le c\,\dim(B_i) + t\left(c\cdot\dim(A_i) + c\cdot\mathrm{noff}(A_i) + T_{i+1}\right) \\
&\le c\left[\dim(B_i) + t\left(\dim(A_i) + \mathrm{noff}(A_i) + \dim(B_{i+1}) + (\gamma\chi + \delta)\left(\mathrm{noff}(A_{i+1}) + \dim(A_{i+1})\right)\right)\right] && \text{(by the induction hypothesis)} \\
&\le c\left[\dim(B_i) + t\left(2\dim(A_i) + \mathrm{noff}(A_i) + (\gamma\chi + \delta)\left(5\,\mathrm{noff}(A_i)\chi/k\right)\right)\right] && \text{(by Proposition 5.1)} \\
&\le c\left[\dim(B_i) + t\left(2\dim(A_i) + 6\,\mathrm{noff}(A_i)\right)\right],
\end{aligned}$$
as γχ^2 + δχ ≤ k. As
6t ≤ 6 · (1.33(14χ + 1) + 1) ≤ γχ + δ,
we have proved that (8) is true for i as well.
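For readers who want to verify the two numerical facts used at the end of this proof, they follow directly from k = (14χ + 1)^2, γ = 196, δ = 15 and t = ⌈1.33√k⌉ ≤ 1.33(14χ + 1) + 1. The short derivation below is a worked check, assuming only χ ≥ 0:

```latex
\begin{align*}
\gamma\chi^2 + \delta\chi
  &= 196\chi^2 + 15\chi
   \le 196\chi^2 + 28\chi + 1
   = (14\chi + 1)^2 = k, \\
6t
  &\le 6\bigl(1.33(14\chi + 1) + 1\bigr)
   = 111.72\,\chi + 13.98
   \le 196\chi + 15
   = \gamma\chi + \delta .
\end{align*}
```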
We now state and analyze our ultimate solver.
x = solve(A,b,ε)
1. Set λmin = 1 − 2e^{−2}, λmax = (1 + 2e^{−2})k and t = ⌈0.67√k ln(2/ε)⌉.
2. Run BuildPreconditioners(A).
3. x = precondCheby(A,b,solveB1,t,λmin,λmax).
Theorem 5.5 (Nearly Linear-Time Solver). On input an irreducible n-by-n SDDM0-matrix A
with 2m non-zero off-diagonal entries and an n-vector b, with probability at least 1 − 1/500,
solve(A,b,ε) runs in time
$$O\left(m \log^{c_4} m \,\log(1/\epsilon)\right) + m \log^{O(1)} m$$
and produces an x̃ satisfying
$$\left\| \tilde{x} - A^{\dagger} b \right\|_A \le \epsilon \left\| A^{\dagger} b \right\|_A.$$
Proof. By Proposition 5.1, the numbers noff (Ai) are geometrically decreasing, and ℓ ≤ log_{k/3χ} m.
So we may use Theorem 10.5 to show that the time required to build the preconditioners is at
most m log^{O(1)} m. If each Bi is a k-ultra-sparsifier of Ai−1, then the bound on the A-norm of
the output follows by an analysis similar to that used to prove Lemma 5.3. In this case, we may
use Lemma 5.4 to bound the running time of step 3 by
$$O(mt) = O\left(m\sqrt{k}\,\log(1/\epsilon)\right) = O\left(m \log^{c_4} n \,\log(1/\epsilon)\right).$$
The probability that there is some Bi that is not a k-ultra-sparsifier of Ai−1 is at most
$$\sum_i \frac{1}{2\dim(B_i)} \le \frac{\ell}{2(66\chi + 6)} \le \frac{\log_{k/3\chi} m}{2(66\chi + 6)} < 1/500,$$
assuming c3, c4 ≥ 1.
If the non-zero structure of A is planar, then by Theorem 9.5, we can replace all the calls
to UltraSparsify in the above algorithm with calls to UltraSimple. By Theorem 9.1, this
is like having (k,h)-ultra-sparsifiers with h = O(log n log^2 log n). Thus, the same analysis goes
through with χ = O(log n log^2 log n), and the resulting linear system solver runs in time
$$O\left(n \log^2 n + n \log n \,\log^2\log n \,\log(1/\epsilon)\right).$$
We remark that our analysis is very loose when m is much larger than n. In this case, the
first ultra-sparsifier constructed, B1, will probably have close to n edges, which could be much
lower than the bound proved in Proposition 5.1. While it is not necessary for the proof of our
theorem, one could remove this slack by setting B1 = Sparsify(A0, 1/2n) in this case.
6 Computing Approximate Fiedler Vectors
Fiedler [Fie73] was the first to recognize that the eigenvector associated with the second-smallest
eigenvalue of the Laplacian matrix of a graph could be used to partition a graph. From a result
of Mihail [Mih89], we know that any vector whose Rayleigh quotient is close to this eigenvalue
can also be used to find a good partition. We call such a vector an approximate Fiedler vector.
Definition 6.1 (Approximate Fiedler Vector). For a Laplacian matrix A, v is an ε-approximate
Fiedler vector if v is orthogonal to the all-1’s vector and
$$\frac{v^T A v}{v^T v} \le (1 + \epsilon)\lambda_2(A),$$
where λ2(A) is the second-smallest eigenvalue of A.
Our linear system solvers may be used to quickly compute ǫ-approximate Fiedler vectors.
We will prove that the following algorithm does so with probability at least 1 − p.
v = ApproxFiedler(A,ε,p)
1. Set λmin = 1 − 2e^{−2}, λmax = (1 + 2e^{−2})k and t = ⌈0.67√k ln(8/ε)⌉.
2. Set k = 8 ln(18(n − 1)/ε)/ε.
3. For a = 1,...,⌈log2 p⌉.
(a) Run BuildPreconditioners(A).
(b) Choose r0 to be a random unit vector orthogonal to the all-1’s vector.
(c) For b = 1,...,k
rb = precondCheby(A,rb−1,solveB1,t,λmin,λmax).
(d) Set va = rk.
4. Let a0 be the index of the vector minimizing $v_{a_0}^T A v_{a_0} / v_{a_0}^T v_{a_0}$.
5. Set v = va0.
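The structure of ApproxFiedler is that of inverse power iteration, repeated from several random starts. The sketch below illustrates this on a small dense Laplacian of a connected graph. It makes several assumptions that are not in the paper: an exact dense pseudo-inverse stands in for the call to precondCheby, every iterate is renormalised to avoid overflow, and the function name `approx_fiedler` and the parameters `trials` and `seed` are hypothetical.

```python
import math
import numpy as np

def approx_fiedler(L, eps, trials=3, seed=0):
    """Inverse power iteration for the Fiedler vector of a Laplacian L.

    L must be the Laplacian of a connected graph.  Returns the best vector
    found over all trials together with its Rayleigh quotient.
    """
    n = L.shape[0]
    rng = np.random.default_rng(seed)
    J = np.ones((n, n)) / n
    Z = np.linalg.inv(L + J) - J            # pseudo-inverse of L (stand-in solver)
    k = math.ceil(8 * math.log(18 * (n - 1) / eps) / eps)
    best, best_rq = None, math.inf
    for _ in range(trials):
        r = rng.standard_normal(n)
        r -= r.mean()                       # make r orthogonal to the all-1's vector
        r /= np.linalg.norm(r)
        for _ in range(k):
            r = Z @ r
            r /= np.linalg.norm(r)          # renormalise; not part of the pseudocode
        rq = r @ L @ r / (r @ r)            # Rayleigh quotient of the candidate
        if rq < best_rq:
            best, best_rq = r, rq
    return best, best_rq
```

With an exact solver the iteration converges to the Fiedler vector itself; the point of the analysis that follows is that an approximate linear solver still yields a vector whose Rayleigh quotient is within a factor 1 + ε of λ2.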
Theorem 6.2. On input a Laplacian matrix A with m non-zero entries and ε, p > 0, with
probability at least 1 − 1/p, ApproxFiedler(A,ε,p) computes an ε-approximate Fiedler vector of
A in time
$$m \log^{O(1)} m \,\log(1/p)\,\log(1/\epsilon)/\epsilon.$$
Our proof of Theorem 6.2 will use the following proposition.
Proposition 6.3. If Z is a matrix such that
(1 − ε)Z† ≼ A ≼ (1 + ε)Z†,
and v is a vector such that v^T Z† v ≤ (1 + ε)λ2(Z†), for some ε ≤ 1/5, then v is a 4ε-approximate
Fiedler vector of A.
Proof. We first observe that
λ2(Z†) ≤ λ2(A)/(1 − ǫ).
We then compute
vTAv ≤ (1 + ǫ)vTZ†v
≤ (1 + ǫ)(1 + ǫ)λ2(Z†)
≤ (1 + ǫ)(1 + ǫ)λ2(A)/(1 − ǫ)
≤ (1 + 4ǫ)λ2(A),
for ǫ ≤ 1/5.
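The last step of this chain uses an elementary inequality. As a worked check, for 0 < ε < 1 it can be verified as follows:

```latex
\begin{align*}
\frac{(1+\epsilon)^2}{1-\epsilon} \le 1 + 4\epsilon
  &\iff 1 + 2\epsilon + \epsilon^2 \le (1 + 4\epsilon)(1 - \epsilon)
        = 1 + 3\epsilon - 4\epsilon^2 \\
  &\iff 5\epsilon^2 \le \epsilon \\
  &\iff \epsilon \le 1/5 .
\end{align*}
```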
Proof of Theorem 6.2. As we did in the proof of Lemma 5.3 and Theorem 5.5, we can show that
precondCheby(A,b,solveB1,t,λmin,λmax) is a linear operator in b. Let Z denote the matrix
realizing this operator. As in the proof of Lemma 5.3, we can show that (1 − ε/4)Z† ≼ A ≼
(1 + ε/4)Z†.
By Proposition 6.3, it suffices to show that with probability at least 1/2 each vector va
satisfies
$$\frac{v_a^T Z^{\dagger} v_a}{v_a^T v_a} \le (1 + \epsilon/4)\lambda_2(Z^{\dagger}).$$
To this end, let 0 = µ1 ≤ µ2 ≤ ··· ≤ µn be the eigenvalues of Z†, and let 1 = u1,...,un be
corresponding eigenvectors. Let
$$r_0 = \sum_{i \ge 2} \alpha_i u_i,$$
and recall that (see e.g. [SST06, Lemma B.1])
$$\Pr\left[\, |\alpha_2| \ge \frac{2}{3\sqrt{n-1}} \,\right] \ge \frac{2}{\sqrt{2\pi}} \int_{2/3}^{\infty} e^{-t^2/2}\, dt \ge 0.504.$$
Thus, with probability at least 1/2, the call to BuildPreconditioners succeeds and
|α2| ≥ 2/(3√(n − 1)). In this case,
$$k \ge 8 \ln\left(8/(\alpha_2^2 \epsilon)\right)/\epsilon. \tag{9}$$
We now show that this inequality implies that rk satisfies
$$\frac{r_k^T Z^{\dagger} r_k}{r_k^T r_k} \le (1 + \epsilon/4)\mu_2.$$
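To see this, let j be the greatest index such that µj ≤ (1 + ε/8)µ2, and compute
$$r_k = Z^k r_0 = \sum_{i \ge 2} \frac{\alpha_i}{\mu_i^k}\, u_i,$$
so
$$\begin{aligned}
\frac{r_k^T Z^{\dagger} r_k}{r_k^T r_k}
&= \frac{\sum_{i \ge 2} \alpha_i^2/\mu_i^{2k-1}}{\sum_{i \ge 2} \alpha_i^2/\mu_i^{2k}} \\
&\le \frac{\sum_{j \ge i \ge 2} \alpha_i^2/\mu_i^{2k-1}}{\sum_{j \ge i \ge 2} \alpha_i^2/\mu_i^{2k}}
   + \frac{\sum_{i > j} \alpha_i^2/\mu_i^{2k-1}}{\sum_{i \ge 2} \alpha_i^2/\mu_i^{2k}} \\
&\le \mu_j + \frac{\sum_{i > j} \alpha_i^2/\mu_i^{2k-1}}{\alpha_2^2/\mu_2^{2k}} \\
&\le (1 + \epsilon/8)\mu_2 + \mu_2\, \frac{\sum_{i > j} \alpha_i^2 (\mu_2/\mu_i)^{2k-1}}{\alpha_2^2} \\
&\le (1 + \epsilon/8)\mu_2 + \mu_2\, \frac{\sum_{i > j} \alpha_i^2 \left(1/(1 + \epsilon/8)\right)^{2k-1}}{\alpha_2^2} \\
&\le (1 + \epsilon/8)\mu_2 + \mu_2 \sum_{i > j} \alpha_i^2\, \epsilon/8 && \text{(by inequality (9))} \\
&\le (1 + \epsilon/8)\mu_2 + \mu_2 (\epsilon/8) \\
&\le (1 + \epsilon/4)\mu_2.
\end{aligned}$$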
7 Laplacians and Weighted Graphs
We will find it convenient to describe and analyze our preconditioners for Laplacian matrices
in terms of weighted graphs. This is possible because of the isomorphism between Laplacian
matrices and weighted graphs. To an n-by-n Laplacian matrix A, we associate the graph with
vertex set {1,...,n} having an edge between vertices u and v of weight −A(u,v) for each u and
v such that A(u,v) is non-zero.
All the graphs we consider in this paper will be weighted. If u and v are distinct vertices in
a graph, we write ((u,v)) to denote an edge between u and v of weight 1. Similarly, if w > 0,
then we write w((u,v)) to denote an edge between u and v of weight w. A weighted graph is then
a pair G = (V,E) where V is a set of vertices and E is a set of weighted edges on V, each of
which spans a distinct pair of vertices. The Laplacian matrix LG of the graph G is the matrix
such that
$$L_G(u,v) = \begin{cases}
-w & \text{if there is an edge } w((u,v)) \in E, \\
0 & \text{if } u \ne v \text{ and there is no edge between } u \text{ and } v \text{ in } E, \\
\sum_{w((u,x)) \in E} w & \text{if } u = v.
\end{cases}$$
We recall that for every vector x ∈ IR^n,
$$x^T L_G x = \sum_{w((u,v)) \in E} w\,(x_u - x_v)^2.$$
For graphs G and H, we define the graph G + H to be the graph whose Laplacian matrix is
LG + LH.
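The correspondence between weighted edges and Laplacian matrices, and the quadratic-form identity above, can be made concrete with a few lines of Python. This is a minimal sketch under our own conventions: vertices are numbered from 0 rather than 1, edges are (u, v, w) triples, and the helper names `laplacian` and `quadratic_form` are hypothetical.

```python
import numpy as np

def laplacian(n, edges):
    """Laplacian of a weighted graph on vertices 0..n-1.

    edges: list of (u, v, w) triples with u != v and w > 0.
    """
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, v] -= w        # off-diagonal entries are minus the edge weight
        L[v, u] -= w
        L[u, u] += w        # diagonal entries are the weighted degrees
        L[v, v] += w
    return L

def quadratic_form(edges, x):
    """Sum of w * (x_u - x_v)**2 over all weighted edges."""
    return sum(w * (x[u] - x[v]) ** 2 for u, v, w in edges)
```

With this representation, the graph G + H is obtained simply by concatenating the two edge lists, since the Laplacian is additive in the edges.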
8 Graphic Inequalities, Resistance, and Low-Stretch Spanning Trees
In this section, we introduce the machinery of “graphic inequalities” that underlies the proofs in
the rest of the paper. We then introduce low-stretch spanning trees, and use graphic inequalities
to bound how well a low-stretch spanning tree preconditions a graph. This proof provides the
motivation for the construction in the next section.
We begin by overloading the notation ≼ by writing
G ≼ H and E ≼ F
if G = (V,E) and H = (V,F) are two graphs such that their Laplacian matrices, LG and LH
satisfy
LG ≼ LH.
Many facts that have been used in the chain of work related to this paper can be simply
expressed with this notation. For example, the Splitting Lemma of [BGH+06] becomes
A1 ≼ B1 and A2 ≼ B2 implies A1 + A2 ≼ B1 + B2.
We also observe that if B is a subgraph of A, then
B ≼ A.
We define the resistance of an edge to be the reciprocal of its weight. Similarly, we define
the resistance of a simple path to be the sum of the resistances of its edges. For example,
the resistance of the path w1((1,2)), w2((2,3)), w3((3,4)) is (1/w1 + 1/w2 + 1/w3). Of course, the
resistance of a trivial path with one vertex and no edges is zero. If one multiplies all the weights
of the edges in a path by α, its resistance decreases by a factor of α.
The next lemma says that a path of resistance r supports an edge of resistance r. This
lemma may be derived from the Rank-One Support Lemma of [BH03b], and appears in simpler
form as the Congestion-Dilation Lemma of [BGH+06] and Lemma 4.6 of [Gre96].
Lemma 8.1 (Path Inequality). Let e = w((u,v)) and let P be a path from u to v. Then,
e ≼ w resistance(P) · P.
Proof. After dividing both sides by w, it suffices to consider the case w = 1. Without loss of
generality, we may assume that e = ((1,k + 1)) and that P consists of the edges wi((i,i + 1)) for
1 ≤ i ≤ k. In this notation, the lemma is equivalent to
$$((1,k+1)) \preceq \left(\sum_i \frac{1}{w_i}\right)\left(w_1((1,2)) + w_2((2,3)) + \cdots + w_k((k,k+1))\right).$$
We prove this for the case k = 2. The general case follows by induction.
Recall Cauchy’s inequality, which says that for all 0 < α < 1,
$$(a + b)^2 \le a^2/\alpha + b^2/(1 - \alpha).$$
For k = 2, the lemma is equivalent to
$$(x_1 - x_3)^2 \le (1 + w_1/w_2)(x_1 - x_2)^2 + (1 + w_2/w_1)(x_2 - x_3)^2,$$
which follows from Cauchy’s inequality with α = w2/(w1 + w2).
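The Path Inequality is easy to test numerically: the matrix resistance(P) · L_P − L_e should be positive semi-definite, and scaling the path down by any factor below 1 should break the inequality. The sketch below does this for a unit-weight edge e; vertices are numbered from 0, and the helper names `laplacian` and `path_inequality_gap` as well as the `scale` parameter are hypothetical.

```python
import numpy as np

def laplacian(n, edges):
    """Laplacian of a weighted graph given as (u, v, w) triples."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, v] -= w
        L[v, u] -= w
        L[u, u] += w
        L[v, v] += w
    return L

def path_inequality_gap(weights, scale=1.0):
    """Smallest eigenvalue of scale * resistance(P) * L_P - L_e.

    P is the path 0, 1, ..., k whose i-th edge has weight weights[i],
    and e is a weight-1 edge joining the endpoints 0 and k.
    """
    k = len(weights)
    path = [(i, i + 1, w) for i, w in enumerate(weights)]
    resistance = sum(1.0 / w for w in weights)
    M = scale * resistance * laplacian(k + 1, path) - laplacian(k + 1, [(0, k, 1.0)])
    return np.linalg.eigvalsh(M)[0]
```

With `scale=1.0` the returned value should be zero up to rounding, reflecting that the inequality is tight on the potentials of the electrical flow along the path; with `scale` below 1 it becomes negative.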
Recall that a spanning tree of a weighted graph G = (V,E) is a connected subgraph of G
with exactly |V |−1 edges. The weights of edges that appear in a spanning tree are assumed to
be the same as in G. If T is a spanning tree of a graph G = (V,E), then for every pair of vertices
u,v ∈ V , T contains a unique path from u to v. We let T(u,v) denote this path. We now use
graphic inequalities to derive a bound on how well T preconditions G. This bound strengthens
a bound of Boman and Hendrickson [BH03b, Lemma 4.9].
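Lemma 8.2. Let G = (V,E) be a graph and let T be a spanning tree of G. Then,
$$T \preceq G \preceq \left(\sum_{e \in E} \frac{\mathrm{resistance}(T(e))}{\mathrm{resistance}(e)}\right) \cdot T.$$
Proof. As T is a subgraph of G, T ≼ G is immediate. To prove the right-hand inequality, we
compute
$$\begin{aligned}
E = \sum_{e \in E} e
&\preceq \sum_{e \in E} \frac{\mathrm{resistance}(T(e))}{\mathrm{resistance}(e)} \cdot T(e), && \text{by Lemma 8.1} \\
&\preceq \left(\sum_{e \in E} \frac{\mathrm{resistance}(T(e))}{\mathrm{resistance}(e)}\right) \cdot T, && \text{as } T(e) \preceq T.
\end{aligned}$$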
Definition 8.3. Given a tree T spanning a set of vertices V and a weighted edge e = w((u,v))
with u,v ∈ V, we define the stretch of e with respect to T to be
$$\mathrm{st}_T(e) = \frac{\mathrm{resistance}(T(e))}{\mathrm{resistance}(e)} = w \cdot \mathrm{resistance}(T(e)).$$
If E is a set of edges on V, then we define
$$\mathrm{st}_T(E) = \sum_{e \in E} \mathrm{st}_T(e).$$
With this definition, the statement of Lemma 8.2 may be simplified to
$$T \preceq G \preceq \mathrm{st}_T(E) \cdot T. \tag{10}$$
We will often use the following related inequality, which follows immediately from Lemma 8.1
and the definition of stretch.
$$w((u,v)) \preceq \mathrm{st}_T(w((u,v)))\, T(u,v) = w\, \mathrm{st}_T(((u,v)))\, T(u,v), \tag{11}$$
where we recall that T(u,v) is the unique path in T from u to v.
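To make the definition of stretch and inequality (10) concrete, the following sketch computes st_T(E) for a small graph by walking tree paths, and the accompanying check confirms T ≼ G ≼ st_T(E) · T through eigenvalues. It is a naive illustration only (one tree traversal per edge), not the O(m log n) algorithm referred to later in the paper; vertices are numbered from 0 and all helper names are hypothetical.

```python
import numpy as np

def laplacian(n, edges):
    """Laplacian of a weighted graph given as (u, v, w) triples."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, v] -= w
        L[v, u] -= w
        L[u, u] += w
        L[v, v] += w
    return L

def tree_path_resistance(n, tree_edges, u, v):
    """Resistance of the unique u-v path in a spanning tree (search from u)."""
    adj = {i: [] for i in range(n)}
    for a, b, w in tree_edges:
        adj[a].append((b, 1.0 / w))     # resistance is the reciprocal of weight
        adj[b].append((a, 1.0 / w))
    dist = {u: 0.0}
    stack = [u]
    while stack:
        a = stack.pop()
        for b, r in adj[a]:
            if b not in dist:
                dist[b] = dist[a] + r
                stack.append(b)
    return dist[v]

def total_stretch(n, tree_edges, edges):
    """st_T(E): sum over edges w((u,v)) of w * resistance(T(u,v))."""
    return sum(w * tree_path_resistance(n, tree_edges, u, v) for u, v, w in edges)
```

For the unit-weight 4-cycle with a path on three edges as its spanning tree, each tree edge has stretch 1 and the remaining edge has stretch 3, so st_T(E) = 6.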
9 Preconditioning with Low-Stretch Trees
In this section, we present a simple preconditioning algorithm, UltraSimple, that works by
simply adding edges to low-stretch spanning trees. This algorithm is sufficient to obtain all our
results for planar graphs. For arbitrary graphs, this algorithm might add too many additional
edges. We will show in Section 10 how these extra edges can be removed via sparsification.
9.1 Low-Stretch Trees
Low-stretch spanning trees were introduced by Alon, Karp, Peleg and West [AKPW95]. At
present, the construction of spanning trees with the lowest stretch is due to Abraham, Bartal
and Neiman [ABN08], who prove
Theorem 9.1 (Low Stretch Spanning Trees). There exists an O(m log n + n log^2 n)-time algorithm,
LowStretch, that on input a weighted connected graph G = (V,E), outputs a spanning
tree T of G such that
$$\mathrm{st}_T(E) \le c_{ABN}\, m \log n \,\log\log n \,(\log\log\log n)^3,$$
where m = |E|, for some constant cABN. In particular, stT(E) = O(m log n log^2 log n).
9.2 Augmenting Low-Stretch Spanning Trees
To decide which edges to add to the tree, we first decompose the tree into a collection of
subtrees so that no non-singleton subtree is attached to too many edges of E of high stretch.
In the decomposition, we allow subtrees to overlap at a single vertex, or even consist of just a
single vertex. Then, for every pair of subtrees connected by edges of E, we add one such edge
of E to the tree. The subtrees are specified by the subset of the vertices that they span.
Definition 9.2. Given a tree T that spans a set of vertices V, a T-decomposition is a decomposition
of V into sets W1,...,Wh such that V = ∪Wi, the graph induced by T on each Wi is a
tree, possibly with just one vertex, and for all i ≠ j, |Wi ∩ Wj| ≤ 1.
Given an additional set of edges E on V, a (T,E)-decomposition is a pair ({W1,...,Wh},ρ)
where {W1,...,Wh} is a T-decomposition and ρ is a map that sends each edge of E to a set or
pair of sets in {W1,...,Wh} so that for each edge (u,v) ∈ E,
(a) if ρ(u,v) = {Wi} then {u,v} ⊆ Wi, and
(b) if ρ(u,v) = {Wi,Wj}, then either u ∈ Wi and v ∈ Wj, or u ∈ Wj and v ∈ Wi.
We remark that as the sets Wi and Wj can overlap, it is possible that ρ(u,v) = {Wi,Wj},
u ∈ Wi and v ∈ Wi ∩ Wj.
We use the following tree decomposition theorem to show that one can always quickly find a
T-decomposition of E with few components in which the sum of stretches of the edges attached
to each component is not too big. As the theorem holds for any non-negative function η on the
edges, not just stretch, we state it in this general form.
Figure 2: An example of a tree decomposition. Note that sets W1 and W6 overlap, and that set
W5 is a singleton set and that it overlaps W4.
Theorem 9.3 (decompose). There exists a linear-time algorithm, which we invoke with the
syntax
({W1,...,Wh},ρ) = decompose(T,E,η,t),
that on input a set of edges E on a vertex set V, a spanning tree T on V, a function η : E → IR+,
and an integer $1 < t \le \sum_{e \in E} \eta(e)$, outputs a (T,E)-decomposition ({W1,...,Wh},ρ), such that
(a) h ≤ t,
(b) for all Wi such that |Wi| > 1,
$$\sum_{e \in E : W_i \in \rho(e)} \eta(e) \le \frac{4}{t} \sum_{e \in E} \eta(e).$$
For pseudo-code and a proof of this theorem, see Appendix C. We remark that when t ≥ n,
the algorithm can just construct a singleton set for every vertex.
For technical reasons, edges with stretch less than 1 can be inconvenient. So, we define
$$\eta(e) = \max(\mathrm{st}_T(e), 1) \quad \text{and} \quad \eta(E) = \sum_{e \in E} \eta(e). \tag{12}$$
The tree T should always be clear from context.
Given a (T,E)-decomposition, ({W1,...,Wh},ρ), we define the map
σ : {1,...,h} × {1,...,h} → E ∪ {undefined}
by setting
$$\sigma(i,j) = \begin{cases}
\arg\max_{e : \rho(e) = \{W_i, W_j\}} \mathrm{weight}(e)/\eta(e), & \text{if } i \ne j \text{ and such an } e \text{ exists} \\
\text{undefined} & \text{otherwise.}
\end{cases} \tag{13}$$
In the event of a tie, we let e be the lexicographically least edge maximizing weight(e)/η(e) such
that ρ(e) = {Wi,Wj}. Note that σ(i,j) is a weighted edge.
The map σ tells us which edge from E between Wi and Wj to add to T. The following
property of σ, which follows immediately from its definition, will be used in our analysis in this
and the next section.
Proposition 9.4. For every i,j such that σ(i,j) is defined and for every e ∈ E such that
ρ(e) = {Wi,Wj},
$$\frac{\mathrm{weight}(e)}{\eta(e)} \le \frac{\mathrm{weight}(\sigma(i,j))}{\eta(\sigma(i,j))}.$$
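The selection rule (13) and the property stated in Proposition 9.4 can be sketched as follows. The data layout is an assumption made for illustration: edges are (u, v, w) tuples, η and ρ are supplied as dictionaries, components are identified by integer ids, and "lexicographically least" is interpreted as the usual tuple order. The function name `choose_bridge_edges` is hypothetical.

```python
def choose_bridge_edges(edges, eta, rho):
    """Pick sigma(i, j) for every pair of components joined by an edge.

    edges: list of (u, v, w) tuples
    eta:   dict mapping each edge to eta(e) >= 1
    rho:   dict mapping each edge to a frozenset of one or two component ids
    Returns a dict {frozenset({i, j}): edge} whose values maximise w / eta(e).
    """
    sigma = {}
    for e in sorted(edges):             # visit edges in lexicographic order
        comps = rho[e]
        if len(comps) != 2:
            continue                    # edge inside one component: not a bridge
        score = e[2] / eta[e]
        best = sigma.get(comps)
        # Strict comparison keeps the earliest (least) edge among ties.
        if best is None or score > best[2] / eta[best]:
            sigma[comps] = e
    return sigma
```

By construction, every edge e with ρ(e) = {Wi, Wj} satisfies weight(e)/η(e) ≤ weight(σ(i,j))/η(σ(i,j)), which is exactly the inequality of Proposition 9.4.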
We can now state the procedure by which we augment a spanning tree.
F = AugmentTree(T,E,t),
E is a set of weighted edges,
T is a spanning tree of the vertices underlying E,
t is an integer.
1. Compute stT(e) for each edge e ∈ E.
2. Set ({W1,...,Wh},ρ) = decompose(T,E,η,t), where η(e) is as defined in (12).
3. Set F to be the union of the weighted edges σ(i,j) over all pairs 1 ≤ i < j ≤ h for which
σ(i,j) is defined, where σ(i,j) is as defined in (13).
A = UltraSimple(E,t)
1. Set T = LowStretch(E).
2. Set F = AugmentTree(T,E,t).
3. Set A = T ∪ F.
We remark that when t ≥ n, UltraSimple can just return A = E.
Theorem 9.5 (AugmentTree). On input a set of weighted edges E, a spanning subtree T, and
an integer 1 < t ≤ η(E), the algorithm AugmentTree runs in time O(mlogn), where m = |E|.
The set of edges F output by the algorithm satisfies
(a) F ⊆ E,
(b) |F| ≤ t2/2,
(c) If T ⊆ E, as happens when AugmentTree is called by UltraSimple, then (T ∪ F) ≼ E.
(d)
$$E \preceq \frac{12\,\eta(E)}{t} \cdot (T \cup F). \tag{14}$$
Moreover, if E is planar then A is planar and |F| ≤ 3t − 6.
Proof. In Appendix B, we present an algorithm for computing the stretch of each edge of E in
time O(mlogn). The remainder of the analysis of the running time is trivial. Part (a) follows
immediately from the statement of the algorithm. When T ⊆ E, T ∪F ⊆ E, so part (c) follows
as well.
To verify (b), note that the algorithm adds at most one edge to F for each pair of sets in
W1,...,Wh, and there are at most $\binom{t}{2} \le t^2/2$ such pairs. If E is planar, then F must be planar
as F is a subgraph of E. Moreover, we can use Lemma C.1 to show that the graph induced by
E on the sets W1,...,Wh is also planar. Thus, the number of pairs of these sets connected by
edges of E is at most the maximum number of edges in a planar graph with t vertices, 3t − 6.
We now turn to the proof of part (d). Set
$$\beta = 4\eta(E)/t. \tag{15}$$
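By Theorem 9.3, ρ and W1,...,Wh satisfy
$$\sum_{e : W_i \in \rho(e)} \eta(e) \le \beta, \quad \text{for all } W_i \text{ such that } |W_i| > 1. \tag{16}$$
Let $E^{\mathrm{int}}_i$ denote the set of edges e with ρ(e) = (Wi,Wi), and let $E^{\mathrm{ext}}_i$ denote the set of edges
e with |ρ(e)| = 2 and Wi ∈ ρ(e). Let $E^{\mathrm{int}} = \cup_i E^{\mathrm{int}}_i$ and $E^{\mathrm{ext}} = \cup_i E^{\mathrm{ext}}_i$. Also, let Ti denote
the tree formed by the edges of T inside the set Wi. Note that when |Wi| = 1, Ti and $E^{\mathrm{int}}_i$ are
empty.
We will begin by proving that when |Wi| > 1,
$$E^{\mathrm{int}}_i \preceq \left(\sum_{e \in E^{\mathrm{int}}_i} \eta(e)\right) T_i, \tag{17}$$
from which it follows that
$$E^{\mathrm{int}} \preceq \sum_{i : |W_i| > 1} \left(\sum_{e \in E^{\mathrm{int}}_i} \eta(e)\right) T_i. \tag{18}$$
For any edge $e \in E^{\mathrm{int}}_i$, the path in T between the endpoints of e lies entirely in Ti. So, by
(11) we have
e ≼ stT(e) · Ti ≼ η(e) · Ti.
Inequality (17) now follows by summing over the edges $e \in E^{\mathrm{int}}_i$.
We now define the map τ : E → E ∪ {undefined} by
$$\tau(e) = \begin{cases}
\sigma(i,j), & \text{if } |\rho(e)| = 2, \text{ where } \rho(e) = \{W_i, W_j\}, \text{ and} \\
\text{undefined} & \text{otherwise.}
\end{cases} \tag{19}$$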
To handle the edges bridging components, we prove that for each edge e with ρ(e) = (Wi,Wj),
$$e \preceq 3\eta(e)(T_i + T_j) + 3\,\frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))} \cdot \tau(e) \tag{20}$$
Let e = w((u,v)) be such an edge, with u ∈ Wi and v ∈ Wj. Let τ(e) = z((x,y)), with x ∈ Wi and
y ∈ Wj. Let ti denote the last vertex in Ti on the path in T from u to v (see Figure 3). If Ti is
empty, ti = u. Note that ti is also the last vertex in Ti on the path in T from x to y. Define tj
similarly. As Ti(u,x) ⊆ Ti(u,ti) ∪ Ti(ti,x), the tree Ti contains a path from u to x of resistance
Figure 3: In this example, e = w((u,v)) and τ(e) = z((x,y)).
at most
resistance(Ti(u,ti)) + resistance(Ti(ti,x)),
and the tree Tj contains a path from y to v of resistance at most
resistance(Tj(y,tj)) + resistance(Tj(tj,v)).
Furthermore, as Ti(u,ti) + Tj(tj,v) ⊆ T(u,v) and Ti(ti,x) + Tj(y,tj) ⊆ T(x,y), the sum of the
resistances of the paths from u to x in Ti and from y to v in Tj is at most
$$\begin{aligned}
\mathrm{resistance}(T(u,v)) + \mathrm{resistance}(T(x,y)) &= \mathrm{st}_T(e)/w + \mathrm{st}_T(\tau(e))/z \\
&\le \eta(e)/w + \eta(\tau(e))/z \\
&\le 2\eta(e)/w,
\end{aligned}$$
where the last inequality follows from Proposition 9.4. Thus, the graph
$$3\eta(e)(T_i + T_j) + 3w((x,y)) = 3\eta(e)(T_i + T_j) + 3\,\frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))} \cdot \tau(e)$$
contains a path from u to v of resistance at most
$$\frac{2}{3}\cdot\frac{1}{w} + \frac{1}{3}\cdot\frac{1}{w} = \frac{1}{w},$$
which by Lemma 8.1 implies (20).
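We will now sum (20) over every edge $e \in E^{\mathrm{ext}}_i$ for every i, observing that this counts every
edge in $E^{\mathrm{ext}}$ twice.
$$\begin{aligned}
E^{\mathrm{ext}} &= (1/2) \sum_i \sum_{e \in E^{\mathrm{ext}}_i} e \\
&\preceq \sum_i \sum_{e \in E^{\mathrm{ext}}_i} 3\eta(e)\, T_i + (1/2) \sum_i \sum_{e \in E^{\mathrm{ext}}_i} 3\,\frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))} \cdot \tau(e) \\
&= 3 \sum_i \left(\sum_{e \in E^{\mathrm{ext}}_i} \eta(e)\right) T_i + 3 \sum_{e \in E^{\mathrm{ext}}} \frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))} \cdot \tau(e) \\
&= 3 \sum_{i : |W_i| > 1} \left(\sum_{e \in E^{\mathrm{ext}}_i} \eta(e)\right) T_i + 3 \sum_{e \in E^{\mathrm{ext}}} \frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))} \cdot \tau(e),
\end{aligned} \tag{21}$$
as Ti is empty when |Wi| = 1.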
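We will now upper bound the right-hand side of (21). To handle boundary cases, we divide
$E^{\mathrm{ext}}$ into two sets. We let $E^{\mathrm{ext}}_{\mathrm{single}}$ consist of those $e \in E^{\mathrm{ext}}$ for which both sets in ρ(e) have
size 1. We let $E^{\mathrm{ext}}_{\mathrm{general}} = E^{\mathrm{ext}} - E^{\mathrm{ext}}_{\mathrm{single}}$ contain the rest of the edges in $E^{\mathrm{ext}}$. For $e \in E^{\mathrm{ext}}_{\mathrm{single}}$,
τ(e) = e, while for $e \in E^{\mathrm{ext}}_{\mathrm{general}}$, $\tau(e) \in E^{\mathrm{ext}}_{\mathrm{general}}$.
For $E^{\mathrm{ext}}_{\mathrm{single}}$, we have
$$\sum_{e \in E^{\mathrm{ext}}_{\mathrm{single}}} \frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))} \cdot \tau(e) = \sum_{e \in E^{\mathrm{ext}}_{\mathrm{single}}} \tau(e) = E^{\mathrm{ext}}_{\mathrm{single}}.$$
To evaluate the sum over the edges $e \in E^{\mathrm{ext}}_{\mathrm{general}}$, consider any $f \in E^{\mathrm{ext}}_{\mathrm{general}}$ in the image of
τ. Let i be such that $f \in E^{\mathrm{ext}}_i$ and |Wi| > 1. Then, for every e such that τ(e) = f, we have
$e \in E^{\mathrm{ext}}_i$. So, by Proposition 9.4,
$$\sum_{\substack{e \in E^{\mathrm{ext}} \\ \tau(e) = f}} \frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))}
= \sum_{\substack{e \in E^{\mathrm{ext}}_i \\ \tau(e) = f}} \frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))}
\le \sum_{e \in E^{\mathrm{ext}}_i} \frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))}
\le \sum_{e \in E^{\mathrm{ext}}_i} \frac{\eta(e)}{\eta(\tau(e))}
\le \sum_{e \in E^{\mathrm{ext}}_i} \eta(e) \le \beta. \tag{22}$$
Thus,
$$\sum_{e \in E^{\mathrm{ext}}} \frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))} \cdot \tau(e)
\preceq E^{\mathrm{ext}}_{\mathrm{single}} + \sum_{\substack{f \in \mathrm{image}(\tau) \\ f \in E^{\mathrm{ext}}_{\mathrm{general}}}} \beta \cdot f
\preceq \beta \cdot F.$$
Plugging this last inequality into (21), we obtain
$$E^{\mathrm{ext}} \preceq 3 \sum_{i : |W_i| > 1} \left(\sum_{e \in E^{\mathrm{ext}}_i} \eta(e)\right) T_i + 3\beta \cdot F.$$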
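Applying (18) and then (16), we compute
$$E = E^{\mathrm{ext}} + E^{\mathrm{int}} \preceq 3 \sum_{i : |W_i| > 1} T_i \left(\sum_{e \in E^{\mathrm{int}}_i} \eta(e) + \sum_{e \in E^{\mathrm{ext}}_i} \eta(e)\right) + 3\beta \cdot F \preceq 3\beta \cdot (T \cup F),$$
which by (15) implies the lemma.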
We now observe three sources of slack in Theorem 9.5, in decreasing order of importance.
The first is the motivation for the construction of ultra-sparsifiers in the next section.
1. In the proof of Theorem 9.5, we assume in the worst case that the tree decomposition
could result in each tree Ti being connected to t − 1 other trees, for a total of t(t − 1)/2
extra edges. Most of these edges seem barely necessary, as they could be included at a
small fraction of their original weight. To see why, consider the crude estimate at the end
of inequality (22). We upper bound the multiplier of one bridge edge f from Ti,
$$\sum_{\substack{e \in E^{\mathrm{ext}}_i \\ \tau(e) = f}} \frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))},$$
by the sum of the multipliers of all bridge edges from Ti,
$$\sum_{e \in E^{\mathrm{ext}}_i} \frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))}.$$
The extent to which this upper bound is loose is the factor by which we could decrease
the weight of the edge f in the preconditioner.
While we cannot accelerate our algorithms by decreasing the weights with which we include
edges, we are able to use sparsifiers to trade many low-weight edges for a few edges of
higher weight. This is how we reduce the number of edges we add to the spanning tree to
t log^{O(1)} n.
2. The number of edges added equals the number of pairs of trees that are connected. While
we can easily obtain an upper bound on this quantity when the graph has bounded genus,
it seems that we should also be able to bound this quantity when the graph has some nice
geometrical structure.
3. The constant 12 in inequality (14) can be closer to 2 in practice. To see why, first note
that the internal and external edges count quite differently: the external edges have three
times as much impact. However, most of the edges will probably be internal. In fact, if
one uses the algorithm of [ABN08] to construct the tree, then one can incorporate the
augmentation into this process to minimize the number of external edges. Another factor
of 2 can be saved by observing that the decomposeTree, as stated, counts the internal
edges twice, but could be modified to count them once.
10 Ultra-Sparsifiers
We begin our construction of ultra-sparsifiers by building ultra-sparsifiers for the special case in
which our graph has a distinguished vertex r and a low-stretch spanning tree T with the property
that for every edge e ∈ E − T, the path in T connecting the endpoints of e goes through r.
In this case, we will call r the root of the tree. All of the complications of ultra-sparsification
will be handled in this construction. The general construction will follow simply by using tree
splitters to choose the roots and decompose the input graph.
The algorithm RootedUltraSparsify begins by computing the same set of edges σ(i,j),
as was computed by UltraSimple. However, when RootedUltraSparsify puts one of these
edges into the set F, it gives it a different weight: ω(i,j). For technical reasons, the set F is
decomposed into subsets $F^b$ according to the quantities $\phi(f)$, which will play a role in the analysis
of RootedUltraSparsify analogous to the role played by $\eta(e)$ in the analysis of UltraSimple.
Each set of edges $F^b$ is sparsified, and the union of the edges of $E$ that appear in the resulting
sparsifiers is returned by the algorithm. The edges in $F^b$ cannot necessarily be sparsified
directly, as they might all have different endpoints. Instead, $F^b$ is first projected to a graph $H^b$
on vertex set $\{1,\dots,h\}$. After a sparsifier $H^b_s$ of $H^b$ is computed, it is lifted back to the original
graph to form $E^b_s$. Note that the graph $E_s$ returned by RootedUltraSparsify is a subgraph of
$E$, with the same edge weights.
We now prove that $F = \bigcup_{b=1}^{\lceil \log_2 \eta(E) \rceil} F^b$. Our proof will use the function $\eta$, which we recall
was defined in (12) and which was used to define the map $\sigma$.
Lemma 10.1. For φ as defined in (24), for every f = ψ(i,j)σ(i,j) ∈ F,
1 ≤ ψ(i,j) ≤ φ(f) ≤ η(E).
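Proof. Recall from the definitions of $\phi$ and $\psi$ that
\[
\phi(f) \ge \psi(i,j) = \frac{\sum_{e \in E : \rho(e) = \{W_i, W_j\}} \mathrm{weight}(e)}{\mathrm{weight}(\sigma(i,j))}. \tag{25}
\]
By definition $\sigma(i,j)$ is an edge in $E$ satisfying $\rho(\sigma(i,j)) = \{W_i, W_j\}$; so, the right-hand side of
the last expression is at least 1.
To prove the upper bound on $\phi(f)$, first apply Proposition 9.4 to show that
\[
\psi(i,j) = \frac{\sum_{e \in E : \rho(e) = \{W_i, W_j\}} \mathrm{weight}(e)}{\mathrm{weight}(\sigma(i,j))}
\le \frac{\sum_{e \in E : \rho(e) = \{W_i, W_j\}} \eta(e)}{\eta(\sigma(i,j))}
\le \eta(E),
\]
as $\eta$ is always at least 1. Similarly,
\begin{align*}
\mathrm{st}_T(f) = \frac{\omega(i,j)}{\mathrm{weight}(\sigma(i,j))}\, \mathrm{st}_T(\sigma(i,j))
&= \frac{\mathrm{st}_T(\sigma(i,j))}{\mathrm{weight}(\sigma(i,j))} \sum_{e \in E : \rho(e) = \{W_i, W_j\}} \mathrm{weight}(e) \\
&\le \frac{\eta(\sigma(i,j))}{\mathrm{weight}(\sigma(i,j))} \sum_{e \in E : \rho(e) = \{W_i, W_j\}} \mathrm{weight}(e) \\
&\le \sum_{e \in E : \rho(e) = \{W_i, W_j\}} \eta(e) \le \eta(E),
\end{align*}
where the second-to-last inequality follows from Proposition 9.4.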
$E_s$ = RootedUltraSparsify($E, T, r, t, p$)
Condition: for all $e \in E$, $r \in T(e)$. The parameter $t$ is a positive integer at most $\lceil \eta(E) \rceil$.
1. Compute $\mathrm{st}_T(e)$ and $\eta(e)$ for each edge $e \in E$, where $\eta$ is as defined in (12).
2. If $t \ge |E|$, return $E_s = E$.
3. Set $(\{W_1,\dots,W_h\}, \rho)$ = decompose($T, E, \eta, t$).
4. Compute $\sigma$, as given by (13), everywhere it is defined.
5. For every $(i,j)$ such that $\sigma(i,j)$ is defined, set
\[
\omega(i,j) = \sum_{e \in E : \rho(e) = \{W_i, W_j\}} \mathrm{weight}(e)
\quad\text{and}\quad
\psi(i,j) = \omega(i,j)/\mathrm{weight}(\sigma(i,j)). \tag{23}
\]
6. Set $F = \{\psi(i,j)\sigma(i,j) : \sigma(i,j) \text{ is defined}\}$.
7. For each $f = \psi(i,j)\sigma(i,j) \in F$, set
\[
\phi(f) = \max(\psi(i,j), \mathrm{st}_T(f)). \tag{24}
\]
8. For $b \in \{1,\dots,\lceil \log_2 \eta(E) \rceil\}$:
(a) Set
\[
F^b = \begin{cases}
\{f \in F : \phi(f) \in [1,2]\} & \text{if } b = 1, \\
\{f \in F : \phi(f) \in (2^{b-1}, 2^b]\} & \text{otherwise.}
\end{cases}
\]
(b) Let $H^b$ be the set of edges on vertex set $\{1,\dots,h\}$ defined by
\[
H^b = \{\omega(i,j)(\!(i,j)\!) : \psi(i,j)\sigma(i,j) \in F^b\}.
\]
(c) Set $H^b_s$ = Sparsify2($H^b, p$).
(d) Set
\[
E^b_s = \{\sigma(i,j) : \exists w \text{ such that } w(\!(i,j)\!) \in H^b_s\}.
\]
9. Set $E_s = \bigcup_b E^b_s$.
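The bucketing in step 8(a) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names `bucket_index` and `bucket_edges` are hypothetical, and the values $\phi(f)$ are assumed to be precomputed and supplied as a dictionary.

```python
import math
from collections import defaultdict


def bucket_index(phi: float) -> int:
    """Return b with phi in [1, 2] when b = 1, or phi in (2^(b-1), 2^b] when b > 1."""
    if phi < 1:
        raise ValueError("phi(f) is at least 1 by Lemma 10.1")
    if phi <= 2:
        return 1
    b = math.ceil(math.log2(phi))
    # Guard against floating-point rounding near exact powers of two.
    while phi > 2 ** b:
        b += 1
    while b > 1 and phi <= 2 ** (b - 1):
        b -= 1
    return b


def bucket_edges(phi_by_edge):
    """Group edge identifiers into the sets F^b according to phi(f)."""
    buckets = defaultdict(list)
    for f, phi in phi_by_edge.items():
        buckets[bucket_index(phi)].append(f)
    return dict(buckets)
```

For example, `bucket_edges({"f1": 1.0, "f2": 4.0, "f3": 5.0})` places `f1` in bucket 1, `f2` in bucket 2 and `f3` in bucket 3, since every edge in $F^b$ with $b > 1$ has $\phi(f)$ within a factor of 2 of $2^b$.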
It will be convenient for us to extend the domain of $\rho$ to $F$ by setting $\rho(f) = \rho(e)$ where
$e \in E$ has the same vertices as $f$, that is, when there exists $\gamma \in \mathbb{R}^+$ such that $f = \gamma e$. Define
$\beta = 4\eta(E)/t$.
Our analysis of RootedUltraSparsify will exploit the inequalities contained in the following
two lemmas.
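Lemma 10.2. For every $i$ for which $|W_i| > 1$,
\[
\sum_{f \in F : W_i \in \rho(f)} \mathrm{st}_T(f) \le \beta.
\]
Proof. Consider any $f \in F$, and let $f = \psi(i,j)\sigma(i,j)$. Note that the weight of $f$ is $\omega(i,j)$, and
recall that $\mathrm{st}_T(f) \le \eta(f)$. We first show that
\[
\sum_{e : \tau(e) = \sigma(i,j)} \eta(e) \ge \eta(f).
\]
By Proposition 9.4, and the definition of $\tau$ in (19),
\begin{align*}
\sum_{e : \tau(e) = \sigma(i,j)} \eta(e)
&\ge \frac{\eta(\sigma(i,j))}{\mathrm{weight}(\sigma(i,j))} \sum_{e : \tau(e) = \sigma(i,j)} \mathrm{weight}(e) \\
&= \frac{\eta(\sigma(i,j))}{\mathrm{weight}(\sigma(i,j))}\, \mathrm{weight}(f) \\
&= \max\left( \frac{\mathrm{weight}(f)}{\mathrm{weight}(\sigma(i,j))},\ \frac{\mathrm{st}_T(\sigma(i,j))}{\mathrm{weight}(\sigma(i,j))}\, \mathrm{weight}(f) \right) \\
&= \max(\psi(i,j), \mathrm{st}_T(f)) \\
&= \max(\phi(f), \mathrm{st}_T(f)) && \text{(by (24))} \\
&\ge \max(1, \mathrm{st}_T(f)) && \text{(by (25))} \\
&= \eta(f).
\end{align*}
We then have
\[
\sum_{e \in E : W_i \in \rho(e)} \eta(e) \ge \sum_{f \in F : W_i \in \rho(f)} \eta(f).
\]
The lemma now follows from the upper bound of $4\eta(E)/t$ imposed on the left-hand term by
Theorem 9.3.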
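Lemma 10.3. For every $i$ for which $|W_i| > 1$,
\[
\sum_{f \in F : W_i \in \rho(f)} \phi(f) \le 2\beta. \tag{26}
\]
Proof. For an edge $f \in F$, let $\psi(f)$ equal $\psi(i,j)$ where $f = \psi(i,j)\sigma(i,j)$. With this notation,
we may compute
\begin{align*}
\sum_{f \in F : W_i \in \rho(f)} \phi(f)
&\le \sum_{f \in F : W_i \in \rho(f)} \mathrm{st}_T(f) + \sum_{f \in F : W_i \in \rho(f)} \psi(f) \\
&\le \sum_{f \in F : W_i \in \rho(f)} \eta(f) + \sum_{f \in F : W_i \in \rho(f)} \psi(f) \\
&\le \beta + \sum_{f \in F : W_i \in \rho(f)} \psi(f),
\end{align*}
by Lemma 10.2. We now bound the right-hand term as in the proof of inequality (22):
\[
\sum_{f \in F : W_i \in \rho(f)} \psi(f)
= \sum_{e \in E^{ext}_i} \frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))}
\le \sum_{e \in E^{ext}_i} \frac{\eta(e)}{\eta(\tau(e))}
\le \sum_{e \in E^{ext}_i} \eta(e) \le \beta,
\]
by our choice of $\beta$ and Theorem 9.3.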
Lemma 10.4 (RootedUltraSparsify). Let $T$ be a spanning tree on a vertex set $V$, and let $E$
be a non-empty set of edges on $V$ for which there exists an $r \in V$ such that for all $e \in E$,
$r \in T(e)$. For $p > 0$ and $t$ a positive integer at most $\lceil \eta(E) \rceil$, let $E_s$ be the graph returned by
RootedUltraSparsify($E, T, r, t, p$). The graph $E_s$ is a subgraph of $E$, and with probability at
least $1 - \lceil \log_2 \eta(E) \rceil p$,
\[
|E_s| \le c_1 \log^{c_2}(n/p) \max(1, \lceil \log_2 \eta(E) \rceil)\, t, \tag{27}
\]
and
\[
E \preceq (3\beta + 126\beta \max(1, \log_2 \eta(E))) \cdot T + 120\beta \cdot E_s, \tag{28}
\]
where $\beta = 4\eta(E)/t$.
Proof. We first dispense with the case in which the algorithm terminates at line 2. If $t \ge |E|$,
then both (27) and (28) are trivially satisfied by setting $E_s = E$, as $\beta \ge 2$.
By Theorem 1.3 each graph $H^b_s$ computed by Sparsify2 is a $c_1 \log^{c_2}(n/p)$-sparsifier of $H^b$
according to Definition 1.2 with probability at least $1 - p$. As there are at most $\lceil \log_2 \eta(E) \rceil$ such
graphs $H^b$, this happens for all of these graphs with probability at least $1 - \lceil \log_2 \eta(E) \rceil p$. For
the remainder of the proof, we will assume that each graph $H^b_s$ is a $c_1 \log^{c_2}(n/p)$-sparsifier of
$H^b$. Recalling that $h \le t$, the bound on the number of edges in $E_s$ is immediate.
Our proof of (28) will go through an analysis of intermediate graphs. As some of these could
be multi-graphs, we will find it convenient to write them as sums of edges.
To define these intermediate graphs, let $r_i$ be the vertex in $W_i$ that is closest to $r$ in $T$. As in
Section 9, let $T_i$ denote the edges of the subtree of $T$ with vertex set $W_i$. We will view $r_i$ as the
root of tree $T_i$. Note that if $|W_i| = 1$, then $W_i = \{r_i\}$ and $T_i$ is empty. As distinct sets $W_i$ and
$W_j$ can overlap in at most one vertex, $\sum_i T_i \le T$. We will exploit the fact that for each $e \in E$
with $\rho(e) = \{W_i, W_j\}$, the path $T(e)$ contains both $r_i$ and $r_j$, which follows from the condition
$r \in T(e)$.
We now define the edge set $D^b$, which is a projection of $H^b$ to the vertex set $r_1,\dots,r_h$, and
$D^b_s$, which is an analogous projection of the sparsifier $H^b_s$. We set
\[
D^b = \sum_{(i,j) : \psi(i,j)\sigma(i,j) \in F^b} \omega(i,j)(\!(r_i, r_j)\!)
\quad\text{and}\quad
D^b_s = \sum_{w(\!(i,j)\!) \in H^b_s} w(\!(r_i, r_j)\!).
\]
As the sets $W_i$ and $W_j$ are allowed to overlap slightly, it could be the case that some $r_i = r_j$ for
$i \ne j$. In this case, $D^b$ would not be isomorphic to $H^b$.
Set
\[
F^b_s = \{\gamma\psi(i,j)\sigma(i,j) : \exists \gamma \text{ and } (i,j) \text{ so that } \gamma\omega(i,j)(\!(i,j)\!) \in H^b_s\}.
\]
The edge set $H^b$ can be viewed as a projection of the edge set $F^b$ to the vertex set $\{1,\dots,h\}$,
and the edge set $F^b_s$ can be viewed as a lift of $H^b_s$ back into a reweighted subgraph of $F^b$.
We will prove the following inequalities
\begin{align}
E &\preceq 3\beta \cdot T + 3F \tag{29} \\
F^b &\preceq 2\beta \cdot T + 2D^b \tag{30} \\
D^b &\preceq (5/4) D^b_s \tag{31} \\
D^b_s &\preceq 16\beta \cdot T + 2F^b_s \tag{32} \\
F^b_s &\preceq 8\beta \cdot E^b_s \tag{33}
\end{align}
Inequality (28) in the statement of the lemma follows from these inequalities and $F = \sum_b F^b$.
To prove inequality (29), we exploit the proof of Theorem 9.5. The edges $F$ constructed
in RootedUltraSparsify are the same as those chosen by UltraSimple, except that they are
reweighted by the function $\psi$. If we follow the proof of inequality (14) in Theorem 9.5, but
neglect to apply inequality (22), we obtain
\[
E \preceq 3\beta \cdot T + 3 \sum_{e \in E^{ext}} \frac{\mathrm{weight}(e)}{\mathrm{weight}(\tau(e))} \cdot \tau(e) = 3\beta \cdot T + 3F.
\]
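To prove inequality (30), consider any edge $w(\!(u,v)\!) = f \in F^b$. Assume $\rho(f) = \{W_i, W_j\}$,
$u \in W_i$ and $v \in W_j$. We will now show that
\[
f \preceq 2\,\mathrm{st}_T(f)(T_i + T_j) + 2w(\!(r_i, r_j)\!). \tag{34}
\]
As the path from $u$ to $v$ in $T$ contains both $r_i$ and $r_j$,
\[
\mathrm{resistance}(T(u, r_i)) + \mathrm{resistance}(T(r_j, v)) \le \mathrm{resistance}(T(u,v)) = \mathrm{st}_T(f)/w.
\]
Thus, the resistance of the path
\[
2\,\mathrm{st}_T(f)\,T(u, r_i) + 2w(\!(r_i, r_j)\!) + 2\,\mathrm{st}_T(f)\,T(r_j, v)
\]
is at most $1/w$, and so Lemma 8.1 implies that
\[
f \preceq 2\,\mathrm{st}_T(f)\,T(u, r_i) + 2w(\!(r_i, r_j)\!) + 2\,\mathrm{st}_T(f)\,T(r_j, v),
\]
which in turn implies (34). Summing (34) over all $f \in F^b$ yields
\begin{align*}
F^b &\preceq 2 \sum_i \Bigl( \sum_{f \in F : W_i \in \rho(f)} \mathrm{st}_T(f) \Bigr) T_i + 2D^b \\
&\preceq 2 \sum_{i : |W_i| > 1} \Bigl( \sum_{f \in F : W_i \in \rho(f)} \mathrm{st}_T(f) \Bigr) T_i + 2D^b && \text{as $T_i$ is empty when $|W_i| = 1$} \\
&\preceq 2 \sum_{i : |W_i| > 1} \beta \cdot T_i + 2D^b && \text{by Lemma 10.2} \\
&\preceq 2\beta \cdot T + 2D^b.
\end{align*}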
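The resistance bound invoked just before Lemma 8.1 in the proof of (34) can be checked directly. The short derivation below is a sketch that assumes the usual conventions: multiplying the weights of a path by $c$ divides its resistance by $c$, an edge of weight $2w$ has resistance $1/(2w)$, and resistances add along a path.

```latex
\begin{align*}
&\mathrm{resistance}\bigl(2\,\mathrm{st}_T(f)\,T(u,r_i) + 2w(\!(r_i,r_j)\!) + 2\,\mathrm{st}_T(f)\,T(r_j,v)\bigr) \\
&\qquad= \frac{\mathrm{resistance}(T(u,r_i)) + \mathrm{resistance}(T(r_j,v))}{2\,\mathrm{st}_T(f)} + \frac{1}{2w} \\
&\qquad\le \frac{\mathrm{st}_T(f)/w}{2\,\mathrm{st}_T(f)} + \frac{1}{2w}
 = \frac{1}{w}.
\end{align*}
```

The two tree segments together contribute at most $1/(2w)$ and the edge between $r_i$ and $r_j$ contributes exactly $1/(2w)$, which is why the factor 2 appears on every term of (34).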
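We now prove inequality (32), as it uses similar techniques. Let $f_s = w(\!(u,v)\!) \in F^b_s$. Then,
there exist $\gamma$ and $(i,j)$ so that $\gamma\omega(i,j)(\!(i,j)\!) \in H^b_s$, $u \in W_i$, and $v \in W_j$. Set $\gamma(f_s)$ to be this
multiplier $\gamma$. By part (c) of Definition 1.2, we must have $\omega(i,j)(\!(i,j)\!) \in H^b$ and $\psi(i,j)\sigma(i,j) \in F^b$.
Let $f = \psi(i,j)\sigma(i,j)$. Note that $f_s = \gamma(f_s) f$. The sum of the resistances of the paths from $r_i$
to $u$ in $T_i$ and from $v$ to $r_j$ in $T_j$ is
\[
\mathrm{resistance}(T(r_i, u)) + \mathrm{resistance}(T(v, r_j)) \le \mathrm{resistance}(T(u,v)) = \mathrm{st}_T(f)/\omega(i,j),
\]
as $\mathrm{weight}(f) = \omega(i,j)$. Thus, the resistance of the path
\[
2\,\mathrm{st}_T(f)\,T(r_i, u) + 2f + 2\,\mathrm{st}_T(f)\,T(v, r_j)
\]
is at most $1/\omega(i,j)$, and so Lemma 8.1 implies that
\[
\omega(i,j)(\!(r_i, r_j)\!) \preceq 2\,\mathrm{st}_T(f)(T_i + T_j) + 2f,
\]
and
\begin{align*}
\gamma(f_s)\omega(i,j)(\!(r_i, r_j)\!) &\preceq 2\gamma(f_s)\,\mathrm{st}_T(f)(T_i + T_j) + 2f_s \\
&\preceq 2\gamma(f_s)\phi(f)(T_i + T_j) + 2f_s && \text{(by (24))} \\
&\preceq 2^{b+1}\gamma(f_s)(T_i + T_j) + 2f_s && \text{(by $f \in F^b$).}
\end{align*}
Summing this inequality over all $f_s \in F^b_s$, we obtain
\[
D^b_s \preceq \sum_i 2^{b+1} \Bigl( \sum_{f_s \in F^b_s : W_i \in \rho(f_s)} \gamma(f_s) \Bigr) T_i + 2F^b_s.
\]
For all $i$ such that $|W_i| > 1$,
\begin{align*}
\sum_{f_s \in F^b_s : W_i \in \rho(f_s)} \gamma(f_s)
&\le 2\,\bigl| \{ f \in F^b : W_i \in \rho(f) \} \bigr| && \text{(part (d) of Definition 1.2)} \\
&\le 2 \sum_{f \in F^b : W_i \in \rho(f)} \phi(f)/2^{b-1} \\
&\le 4\beta/2^{b-1} && \text{(by Lemma 10.3)} \\
&= \beta/2^{b-3}. \tag{35}
\end{align*}
So,
\[
D^b_s \preceq \sum_i 16\beta \cdot T_i + 2F^b_s \preceq 16\beta \cdot T + 2F^b_s.
\]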
To prove inequality (33), let $f_s$ be any edge in $F_s$, let $f$ be the edge in $F$ such that $f_s = \gamma(f_s) f$,
and let $\sigma(i,j)$ be the edge such that $f_s = \gamma(f_s)\psi(i,j)\sigma(i,j)$. It suffices to show that
\[
\mathrm{weight}(f_s) \le 8\beta\, \mathrm{weight}(\sigma(i,j)). \tag{36}
\]
Set $b$ so that $f \in F^b$. By (35),
\[
\gamma(f_s) \le \beta/2^{b-3} \le 8\beta/\phi(f) = 8\beta/\max(\psi(i,j), \mathrm{st}_T(f)) \le 8\beta/\psi(i,j).
\]
As $\mathrm{weight}(f_s) = \gamma(f_s)\psi(i,j)\,\mathrm{weight}(\sigma(i,j))$, inequality (36) follows.
It remains to prove inequality (31). The only reason this inequality is not immediate from
part (a) of Definition 1.2 is that we may have $r_i = r_j$ for some $i \ne j$. Let $R = \{r_1,\dots,r_h\}$ and
$S = \{1,\dots,h\}$. Define the map $\pi : \mathbb{R}^R \to \mathbb{R}^S$ by $\pi(x)_i = x_{r_i}$. We then have for all $x \in \mathbb{R}^R$
\[
x^T L_{D^b} x = \pi(x)^T L_{H^b} \pi(x) \quad\text{and}\quad x^T L_{D^b_s} x = \pi(x)^T L_{H^b_s} \pi(x);
\]
so,
\[
x^T L_{D^b} x = \pi(x)^T L_{H^b} \pi(x) \le (5/4)\, \pi(x)^T L_{H^b_s} \pi(x) = (5/4)\, x^T L_{D^b_s} x.
\]
The algorithm UltraSparsify will construct a low-stretch spanning tree $T$ of a graph, choose
a root vertex $r$, apply RootedUltraSparsify to sparsify all edges whose path in $T$ contains $r$,
and then work recursively on the trees obtained by removing the root vertex from $T$. The root
vertex will be chosen to be a tree splitter, where we recall that a vertex $r$ is a splitter of a tree
$T$ if the trees $T_1,\dots,T_q$ obtained by removing $r$ each have at most two-thirds as many vertices
as $T$. It is well-known that a tree splitter can be found in linear time. By making the root a
splitter of the tree, we bound the depth of the recursion. This is critical both for bounding the
running time of the algorithm and for proving a bound on the quality of the approximation it
returns. For each edge $e$ such that $r \notin T(e)$, $T(e)$ is entirely contained in one of $T_1,\dots,T_q$.
Such edges are sparsified recursively.
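One standard way to find a tree splitter in linear time is sketched below. It is not taken from the paper: it computes a centroid, whose removal leaves components with at most half of the vertices, which is stronger than the two-thirds requirement. The function name `find_splitter` and the input format (a list of vertices and a list of undirected tree edges) are assumptions made for illustration.

```python
from collections import defaultdict


def find_splitter(vertices, tree_edges):
    """Return a vertex whose removal leaves components with at most n/2 vertices."""
    n = len(vertices)
    adj = defaultdict(list)
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative depth-first search from an arbitrary root, recording parents.
    root = vertices[0]
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                stack.append(w)

    # Subtree sizes, accumulated from the leaves upward.
    size = {u: 1 for u in order}
    for u in reversed(order):
        if parent[u] is not None:
            size[parent[u]] += size[u]

    # Removing u leaves its child subtrees plus the rest of the tree above u.
    for u in order:
        largest = n - size[u]
        for w in adj[u]:
            if parent[w] == u:
                largest = max(largest, size[w])
        if 2 * largest <= n:
            return u
    raise ValueError("input is not a tree on the given vertices")
```

Each vertex and edge is examined a constant number of times, so the sketch runs in time linear in the size of the tree. On the path 0, 1, 2, 3, 4 it returns the middle vertex 2.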
$U$ = UltraSparsify($G = (V,E), k$)
Condition: $G$ is connected.
1. $T$ = LowStretch($E$).
2. Set $t = 517 \cdot \max(1, \log_2 \eta(E)) \cdot \lceil \log_{3/2} n \rceil\, \eta(E)/k$ and $p = \left( 2 \lceil \log \eta(E) \rceil n^2 \right)^{-1}$.
3. If $t \ge \eta(E)$ then set $A = E - T$; otherwise, set $A$ = TreeUltraSparsify($E - T, t, T, p$).
4. $U = T \cup A$.
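The parameter choices in line 2 can be evaluated numerically with the following sketch. The function name `ultra_sparsify_parameters` is hypothetical, and the logarithm in the definition of $p$ is assumed to be base 2, which is what makes the failure probability $n \lceil \log_2 \eta(E) \rceil p$ equal to $1/(2n)$.

```python
import math


def ultra_sparsify_parameters(eta_E, n, k):
    """Return (t, p) as set in line 2 of UltraSparsify (assuming eta_E > 1, n > 1)."""
    if eta_E <= 1 or n <= 1:
        raise ValueError("this sketch assumes eta(E) > 1 and n > 1")
    log_eta = math.log2(eta_E)
    depth = math.ceil(math.log(n, 1.5))  # ceil(log_{3/2} n), the recursion-depth bound
    t = 517 * max(1.0, log_eta) * depth * eta_E / k
    # Assumption: the logarithm inside p is taken base 2.
    p = 1.0 / (2 * math.ceil(log_eta) * n * n)
    return t, p
```

With `eta_E = 1024`, `n = 100` and `k = 1`, the value of `t` exceeds `eta_E`, so line 3 would keep every edge; a much larger `k` is needed before TreeUltraSparsify is invoked.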
$A$ = TreeUltraSparsify($E', t', T', p$)
1. If $E' = \emptyset$, return $A = \emptyset$.
2. Compute a splitter $r$ of $T'$.
3. Set $E_r = \{\text{edges } e \in E' \text{ such that } r \in T'(e)\}$ and $t_r = \lceil t' \eta(E_r)/\eta(E') \rceil$.
4. If $t_r > 1$, set $A_r$ = RootedUltraSparsify($E_r, T', r, t_r, p$); otherwise, set $A_r = \emptyset$.
5. Set $T_1,\dots,T_q$ to be the trees obtained by removing $r$ from $T'$. Set $V_1,\dots,V_q$ to be the
vertex sets of these trees, and set $E_1,\dots,E_q$ so that $E_i = \{(u,v) \in E' : \{u,v\} \subseteq V_i\}$.
6. Set
\[
A = A_r \cup \bigcup_{i=1}^{q} \mathrm{TreeUltraSparsify}(E_i,\ t'\eta(E_i)/\eta(E'),\ T_i,\ p).
\]
Theorem 10.5 (Ultra-Sparsification). On input a weighted, connected $n$-vertex graph $G =
(V,E)$ and $k \ge 1$, UltraSparsify($E, k$) returns a set of edges $U = T \cup A \subseteq E$ such that $T$ is a
spanning tree of $G$, $U \subseteq E$, and with probability at least $1 - 1/(2n)$,
\[
U \preceq E \preceq kU, \tag{37}
\]
and
\[
|A| \le O\left( \frac{m}{k} \log^{c_2 + 5} n \right), \tag{38}
\]
where $m = |E|$. Furthermore, UltraSparsify runs in expected time $m \log^{O(1)} n$.
We remark that this theorem is very loose when m/k ≥ n. In this case, the calls made to
decompose by RootedUltraSparsify could have t ≥ n, in which case decompose will just return
singleton sets, and the output of RootedUltraSparsify will essentially just be the output of
Sparsify2 on Er. In this case, the upper bound in (38) can be very loose.
Proof. We first dispense with the case $t \ge \eta(E)$. In this case, UltraSparsify simply returns
the graph $E$, so (37) is trivially satisfied. The inequality $t \ge \eta(E)$ implies $k \le O(\log^2 n)$, so (38)
is trivially satisfied as well.
At the end of the proof, we will use the inequality $t < \eta(E)$. It will be useful to observe that
every time TreeUltraSparsify is invoked,
\[
t' = t\,\eta(E')/\eta(E).
\]
To apply the analysis of RootedUltraSparsify, we must have
\[
t_r \le \lceil \eta(E_r) \rceil.
\]
This follows from
\[
t_r = \lceil t' \eta(E_r)/\eta(E') \rceil = \lceil t\,\eta(E_r)/\eta(E) \rceil \le \lceil \eta(E_r) \rceil,
\]
as TreeUltraSparsify is only called if $t < \eta(E)$.
Each vertex of $V$ can be a root in a call to RootedUltraSparsify at most once, so this subroutine
is called at most $n$ times during the execution of UltraSparsify. Thus by Lemma 10.4,
with probability at least
\[
1 - n \lceil \log_2 \eta(E) \rceil p = 1 - 1/(2n),
\]
every graph $E_s$ returned by a call to RootedUltraSparsify satisfies (27) and (28). Accordingly,
we will assume both of these conditions hold for the rest of our analysis.
We now prove the upper bound on the number of edges in $A$. During the execution of
UltraSparsify, many vertices become the root of some tree. For those vertices $v$ that do not,
set $t_v = 0$. By (27),
\[
|A| = \sum_{r \in V : t_r > 1} |A_r| \le c_1 \log^{c_2}(n/p) \max(1, \lceil \log_2 \eta(E) \rceil) \sum_{r \in V : t_r > 1} t_r. \tag{39}
\]
As $\lceil z \rceil \le 2z$ for $z \ge 1$ and $E_{r_1} \cap E_{r_2} = \emptyset$ for each $r_1 \ne r_2$,
\[
\sum_{r \in V : t_r > 1} t_r = \sum_{r \in V : t_r > 1} \left\lceil \frac{\eta(E_r)}{\eta(E)}\, t \right\rceil
\le \sum_{r \in V : t_r > 1} \frac{2\eta(E_r)}{\eta(E)}\, t \le 2t.
\]
Thus,
\begin{align*}
(39) &\le 2 c_1 \log^{c_2}(n/p) \lceil \log_2 \eta(E) \rceil\, t \\
&\le 2 c_1 \log^{c_2}(n/p) \lceil \log_2 \eta(E) \rceil \cdot 517 \cdot \log_2 \eta(E) \cdot \lceil \log_{3/2} n \rceil\, \eta(E)/k \\
&\le O\left( \frac{m}{k} \log^{c_2 + 5} n \right),
\end{align*}
where the last inequality uses the bound $\eta(E) = O(m \log^2 n)$ implied by Theorem 9.1 and
$\log m = O(\log n)$.
We now establish (37). For every vertex $r$ that is ever selected as a tree splitter in line 2 of
TreeUltraSparsify, let $T_r$ be the tree $T'$ of which $r$ is a splitter, and let $E_r$ denote the set of
edges and $t_r$ be the parameter set in line 3. Observe that $\bigcup_r E_r = E - T$. Let
\[
\beta_r = 4\eta(E_r)/t_r,
\]
and note this is the parameter used in the analysis of RootedUltraSparsify in Lemma 10.4. If
$t_r > 1$, let $A_r$ be the set of edges returned by the call to RootedUltraSparsify. By Lemma 10.4,
RootedUltraSparsify returns a set of edges $A_r$ satisfying
\[
E_r \preceq (3\beta_r + 126\beta_r \max(1, \log_2 \eta(E_r))) \cdot T_r + 120\beta_r \cdot A_r. \tag{40}
\]
On the other hand, if $t_r = 1$ and so $A_r = \emptyset$, then $\beta_r = 4\eta(E_r)$. We know that (40) is satisfied in
this case because $E_r \preceq \eta(E_r) T_r$ (by (10)). If $t_r = 0$, then $E_r = \emptyset$ and (40) is trivially satisfied.
As $t_r = \lceil t\,\eta(E_r)/\eta(E) \rceil$,
\[
\beta_r \le 4\eta(E)/t.
\]
We conclude
\[
E_r \preceq 129\beta_r \max(1, \log_2 \eta(E_r)) \cdot T_r + 120\beta_r \cdot A_r
\preceq 516(\eta(E)/t) \max(1, \log_2 \eta(E_r))\, T_r + 120(\eta(E)/t)\, A_r.
\]
Adding $T$, summing over all $r$, and remembering $\eta(E_r) \le \eta(E)$, we obtain
\[
T + (E - T) \preceq T + 516(\eta(E)/t) \max(1, \log_2 \eta(E)) \sum_r T_r + 120(\eta(E)/t) A.
\]
As $r$ is always chosen to be a splitter of the tree input to TreeUltraSparsify, the depth of the
recursion is at most $\lceil \log_{3/2} n \rceil$. Thus, no edge of $T$ appears more than $\lceil \log_{3/2} n \rceil$ times in the
sum $\sum_r T_r$, and we may conclude
\begin{align*}
T + (E - T) &\preceq T + 516(\eta(E)/t) \max(1, \log_2 \eta(E)) \lceil \log_{3/2} n \rceil\, T + 120(\eta(E)/t) A \\
&\preceq 517(\eta(E)/t) \max(1, \log_2 \eta(E)) \lceil \log_{3/2} n \rceil\, T + 120(\eta(E)/t) A \\
&\preceq k(T + A) \\
&= kU,
\end{align*}
where the second inequality follows from $t \le \eta(E)$, and the third inequality follows from the
value chosen for $t$ in line 2 of UltraSparsify.
To bound the expected running time of UltraSparsify, first observe that the call to
LowStretch takes time $O(m \log^2 n)$. Then, note that the routine TreeUltraSparsify is
recursive, the recursion has depth at most $O(\log n)$, and all the graphs being processed by
TreeUltraSparsify at any level of the recursion are disjoint. The running time of TreeUltraSparsify
is dominated by the calls made to Sparsify2 inside RootedUltraSparsify. Each of these takes
nearly-linear expected time, so the overall expected running time of TreeUltraSparsify is
$O(m \log^{O(1)} n)$.
References
[ABN08] I. Abraham, Y. Bartal, and O. Neiman. Nearly tight low stretch spanning trees.
In Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer
Science, pages 781–790, Oct. 2008.
[AKPW95] Noga Alon, Richard M. Karp, David Peleg, and Douglas West. A graph-theoretic
game and its application to the k-server problem. SIAM Journal on Computing,
24(1):78–100, February 1995.
[Axe85] O. Axelsson. A survey of preconditioned iterative methods for linear systems of
algebraic equations. BIT Numerical Mathematics, 25(1):165–187, March 1985.
[BBC+94] R. Barrett, M. Berry, T. F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout,
R. Pozo, C. Romine, and H. Van der Vorst. Templates for the Solution of Linear
Systems: Building Blocks for Iterative Methods, 2nd Edition. SIAM, Philadelphia,
PA, 1994.
[BCHT04] Erik G. Boman, Doron Chen, Bruce Hendrickson, and Sivan Toledo. Maximum-
weight-basis preconditioners. Numerical linear algebra with applications, 11(8–
9):695–721, October/November 2004.
[BGH+06] M. Bern, J. Gilbert, B. Hendrickson, N. Nguyen, and S. Toledo. Support-graph
preconditioners. SIAM J. Matrix Anal. & Appl, 27(4):930–951, 2006.
[BH01] Erik Boman and B. Hendrickson. On spanning tree preconditioners. Manuscript,
Sandia National Lab., 2001.
[BH03a] Mario Bebendorf and Wolfgang Hackbusch. Existence of H-matrix approximants
to the inverse FE-matrix of elliptic operators with L∞-coefficients. Numerische
Mathematik, 95(1):1–28, July 2003.
[BH03b] Erik G. Boman and Bruce Hendrickson. Support theory for preconditioning. SIAM
Journal on Matrix Analysis and Applications, 25(3):694–717, 2003.
[BHM01] W. L. Briggs, V. E. Henson, and S. F. McCormick. A Multigrid Tutorial, 2nd
Edition. SIAM, 2001.
[BHV04] Erik G. Boman, Bruce Hendrickson, and Stephen A. Vavasis. Solving elliptic finite
element systems in near-linear time with support preconditioners. CoRR,
cs.NA/0407022, 2004.
[BMN04] M. Belkin, I. Matveeva, and P. Niyogi. Regularization and semi-supervised learning
on large graphs. Proc. 17th Conf. on Learning Theory, pages 624–638, 2004.
[BSS09] Joshua D. Batson, Daniel A. Spielman, and Nikhil Srivastava. Twice-Ramanujan
sparsifiers. In Proceedings of the 41st Annual ACM Symposium on Theory of
Computing, pages 255–262, 2009.
[CW82] D. Coppersmith and S. Winograd. On the asymptotic complexity of matrix
multiplication. SIAM Journal on Computing, 11(3):472–492, August 1982.
[DER86] I. S. Duff, A. M. Erisman, and J. K. Reid. Direct Methods for Sparse Matrices.
Oxford Science Publications, 1986.
[DS07] Samuel I. Daitch and Daniel A. Spielman. Support-graph preconditioners for
2-dimensional trusses. CoRR, abs/cs/0703119, 2007.
[DS08] Samuel I. Daitch and Daniel A. Spielman. Faster approximate lossy generalized flow
via interior point algorithms. In Proceedings of the 40th Annual ACM Symposium
on Theory of Computing, pages 451–460, 2008.
[EEST08] Michael Elkin, Yuval Emek, Daniel A. Spielman, and Shang-Hua Teng. Lower-
stretch spanning trees. SIAM Journal on Computing, 32(2):608–628, 2008.
[EGB05] Erik G. Boman, Doron Chen, Ojas Parekh, and Sivan Toledo. On factor width and
symmetric H-matrices. Linear Algebra and its Applications, 405(1):239–248, August
2005.
[FG04] A. Frangioni and C. Gentile. New preconditioners for KKT systems of network flow
problems. SIAM Journal on Optimization, 14(3):894–913, 2004.
[Fie73] Miroslav Fiedler. Algebraic connectivity of graphs. Czechoslovak Mathematical
Journal, 23(98):298–305, 1973.
[Geo73] J. A. George. Nested dissection of a regular finite element mesh. SIAM J. Numer.
Anal., 10:345–363, 1973.
[GMZ95] Keith D. Gremban, Gary L. Miller, and Marco Zagha. Performance evaluation of a
new parallel preconditioner. In Proceedings of the 9th International Symposium on
Parallel Processing, pages 65–69. IEEE Computer Society, 1995.
[GO88] G. H. Golub and M. Overton. The convergence of inexact Chebychev and Richardson
iterative methods for solving linear systems. Numerische Mathematik, 53:571–594,
1988.
[Gre96] Keith Gremban. Combinatorial Preconditioners for Sparse, Symmetric, Diagonally
Dominant Linear Systems. PhD thesis, Carnegie Mellon University, CMU-CS-96-123,
1996.
[GT87] John R. Gilbert and Robert Endre Tarjan. The analysis of a nested dissection
algorithm. Numerische Mathematik, 50(4):377–404, February 1987.
[Har72] Frank Harary. Graph Theory. Addison-Wesley, 1972.
[Jos97] Anil Joshi. Topics in Optimization and Sparse Linear Systems. PhD thesis, UIUC,
1997.
[KM07] Ioannis Koutis and Gary L. Miller. A linear work, o(n1/6) time, parallel algorithm
for solving planar Laplacians. In Proceedings of the 18th Annual ACM-SIAM Sym-
posium on Discrete Algorithms, pages 1002–1011, 2007.
[LRT79] Richard J. Lipton, Donald J. Rose, and Robert Endre Tarjan. Generalized nested
dissection. SIAM Journal on Numerical Analysis, 16(2):346–358, April 1979.
[Mih89] Milena Mihail. Conductance and convergence of Markov chains—A combinatorial
treatment of expanders. In 30th Annual IEEE Symposium on Foundations of Com-
puter Science, pages 526–531, 1989.
[MMP+05] Bruce M. Maggs, Gary L. Miller, Ojas Parekh, R. Ravi, and Shan Leung Maverick
Woo. Finding effective support-tree preconditioners. In Proceedings of the seven-
teenth annual ACM symposium on Parallelism in algorithms and architectures, pages
176–185, 2005.
[Rei98] John Reif. Efficient approximate solution of sparse linear systems. Computers and
Mathematics with Applications, 36(9):37–58, 1998.
[SS08] Daniel A. Spielman and Nikhil Srivastava. Graph sparsification by effective
resistances. In Proceedings of the 40th annual ACM Symposium on Theory of Computing,
pages 563–568, 2008.
[SST06] A. Sankar, D. A. Spielman, and S.-H. Teng. Smoothed analysis of the condition
numbers and growth factors of matrices. SIAM Journal on Matrix Analysis and
Applications, 28(2):446–476, 2006.
[ST04] Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph
partitioning, graph sparsification, and solving linear systems. In Proceedings of the
thirty-sixth annual ACM Symposium on Theory of Computing, pages 81–90, 2004.
Full version available at http://arxiv.org/abs/cs.DS/0310051.
[ST08a] Gil Shklarski and Sivan Toledo. Rigidity in finite-element matrices: Sufficient
conditions for the rigidity of structures and substructures. SIAM Journal on Matrix
Analysis and Applications, 30(1):7–40, 2008.
[ST08b] Daniel A. Spielman and Shang-Hua Teng. A local clustering algorithm for massive
graphs and its application to nearly-linear time graph partitioning. CoRR,
abs/0809.3232, 2008. Available at http://arxiv.org/abs/0809.3232.
[ST08c] Daniel A. Spielman and Shang-Hua Teng. Spectral sparsification of graphs. CoRR,
abs/0808.4134, 2008. Available at http://arxiv.org/abs/0808.4134.
[Str86] Gilbert Strang. Introduction to Applied Mathematics. Wellesley-Cambridge Press,
1986.
[SW09] Daniel A. Spielman and Jaeoh Woo. A note on preconditioning by low-stretch
spanning trees. CoRR, abs/0903.2816, 2009. Available at
http://arxiv.org/abs/0903.2816.
[TB97] L. N. Trefethen and D. Bau. Numerical Linear Algebra. SIAM, Philadelphia, PA,
1997.
[Vai90] Pravin M. Vaidya. Solving linear equations with symmetric diagonally dominant
matrices by constructing good preconditioners. Unpublished manuscript UIUC 1990.
A talk based on the manuscript was presented at the IMA Workshop on Graph
Theory and Sparse Matrix Computation, October 1991, Minneapolis, 1990.
[ZBL+03] Dengyong Zhou, Olivier Bousquet, Thomas Navin Lal, Jason Weston, and Bernhard
Schölkopf. Learning with local and global consistency. In Adv. in Neural Inf. Proc.
Sys. 16, pages 321–328, 2003.
[ZGL03] Xiaojin Zhu, Zoubin Ghahramani, and John D. Lafferty. Semi-supervised learning
using Gaussian fields and harmonic functions. In Proc. 20th Int. Conf. on Mach.
Learn., 2003.
A Gremban's reduction
Gremban [Gre96] (see also [MMP+05]) provides the following method for handling positive
off-diagonal entries. If $A$ is a SDD$_0$-matrix, then Gremban decomposes $A$ into $D + A_n + A_p$, where
$D$ is the diagonal of $A$, $A_n$ is the matrix containing all the negative off-diagonal entries of $A$,
and $A_p$ contains all the positive off-diagonals. Gremban then considers the linear system
\[
\hat{A} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \hat{b},
\quad\text{where}\quad
\hat{A} = \begin{pmatrix} D + A_n & -A_p \\ -A_p & D + A_n \end{pmatrix}
\quad\text{and}\quad
\hat{b} = \begin{pmatrix} b \\ -b \end{pmatrix},
\]
and observes that $x = (x_1 - x_2)/2$ will be the solution to $Ax = b$, if a solution exists. Moreover,
approximate solutions of Gremban's system yield approximate solutions of the original:
\[
\left\| \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} - \hat{A}^{\dagger} \hat{b} \right\| \le \epsilon \left\| \hat{A}^{\dagger} \hat{b} \right\|
\quad\text{implies}\quad
\left\| x - A^{\dagger} b \right\| \le \epsilon \left\| A^{\dagger} b \right\|,
\]
where again $x = (x_1 - x_2)/2$. Thus we may reduce the problem of solving a linear system in a
SDD$_0$-matrix into that of solving a linear system in a SDDM$_0$-matrix that is at most twice as
large and has at most twice as many non-zero entries.
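The construction can be illustrated with a small sketch in plain Python. The helper names `gremban_system` and `matvec` are hypothetical, and matrices are represented as dense lists of rows purely for readability; a practical implementation would use a sparse format.

```python
def gremban_system(A, b):
    """Build Gremban's doubled system (A_hat, b_hat) from a symmetric matrix A."""
    n = len(A)
    A_hat = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            a = A[i][j]
            if i == j or a < 0:
                # Diagonal and negative entries (D + A_n) go on the diagonal blocks.
                A_hat[i][j] = a
                A_hat[n + i][n + j] = a
            elif a > 0:
                # Positive off-diagonal entries (A_p) are negated on the off-diagonal blocks.
                A_hat[i][n + j] = -a
                A_hat[n + i][j] = -a
    b_hat = list(b) + [-bi for bi in b]
    return A_hat, b_hat


def matvec(M, x):
    """Dense matrix-vector product."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]
```

If $x$ solves $Ax = b$, then the stacked vector $(x, -x)$ solves the doubled system, and $(x_1 - x_2)/2$ recovers $x$. All off-diagonal entries of the doubled matrix are non-positive, as required.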
B Computing the stretch
We now show that given a weighted graph G = (V,E) and a spanning tree T of G, we can
compute stT(e) for every edge e ∈ E in O((m + n)logn) time, where m = |E| and n = |V |.
For each pair of vertices u,v ∈ V , let resistance(u,v) be the resistance of T(u,v), the path
in T connecting u and v. We first observe that for an arbitrary r ∈ V , we can compute
resistance(v,r) for all v ∈ V in O(n) time by a top-down traversal on the rooted tree obtained
from T with root r. Using this information, we can compute the stretch of all edges in Er=
{edges e ∈ E such that r ∈ T(e)} in time O(|Er|). We can then use tree splitters in the same
manner as in TreeUltraSparsify to compute the stretch of all edges in E in O((m + n)logn)
time.
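The first two steps of this procedure can be sketched as follows. The sketch assumes that the resistance of a tree edge is the reciprocal of its weight and that the stretch of an edge is its weight times the resistance of its tree path, consistent with the identity resistance(T(u,v)) = st_T(f)/w used in Section 10. The function names and the edge format `(u, v, weight)` are hypothetical.

```python
from collections import defaultdict


def resistances_to_root(tree_edges, root):
    """Return resistance(v, root) for every vertex v, by a top-down traversal of T."""
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, 1.0 / w))
        adj[v].append((u, 1.0 / w))
    dist = {root: 0.0}
    stack = [root]
    while stack:
        u = stack.pop()
        for v, r in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + r
                stack.append(v)
    return dist


def stretch_through_root(edges, dist):
    """Stretch of each edge whose tree path passes through the root.

    For such an edge, resistance(T(u, v)) = resistance(u, r) + resistance(r, v).
    """
    return {(u, v): w * (dist[u] + dist[v]) for u, v, w in edges}
```

The traversal takes O(n) time and each edge of $E_r$ is then handled in constant time. Edges whose tree path avoids the root are left to the recursive calls on the subtrees obtained by removing a splitter.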
C Decomposing Trees
The pseudo-code for decompose appears below. The algorithm performs a depth-first
traversal of the tree, greedily forming sets $W_i$ once they are attached to a sufficient number
of edges of $E$. While these sets are being created, the edges they are responsible for are stored
in $F_{sub}$, and the sum of the value of $\eta$ on these edges is stored in $w_{sub}$. When a set $W_i$ is formed,
the edges $e$ assigned to $W_i$ (those with $W_i \in \rho(e)$) are set to some combination of $F_{sub}$ and $F_v$.
We assume that some vertex r has been chosen to be the root of the tree. This choice is
used to determine which nodes in the tree are children of each other.
$(\{W_1,\dots,W_h\}, \rho)$ = decompose($T, E, \eta, t$)
Comment: $h$, $\rho$, and the $W_i$'s are treated as global variables.
1. Set $h = 0$.
2. For all $e \in E$, set $\rho(e) = \emptyset$.
3. Set $\phi = 2 \sum_{e \in E} \eta(e)/t$.
4. $(F, w, U)$ = sub($r$).
5. If $U \ne \emptyset$,
(a) $h = h + 1$.
(b) $W_h = U$.
(c) For all $e \in F$, set $\rho(e) = \rho(e) \cup \{W_h\}$.

$(F, w, U)$ = sub($v$)
1. Let $v_1,\dots,v_s$ be the children of $v$.
2. Set $w_{sub} = 0$, $F_{sub} = \emptyset$ and $U_{sub} = \emptyset$.
3. For $i = 1,\dots,s$
(a) $(F_i, w_i, U_i)$ = sub($v_i$).
(b) $w_{sub} = w_{sub} + w_i$, $F_{sub} = F_{sub} \cup F_i$, $U_{sub} = U_{sub} \cup U_i$.
(c) If $w_{sub} \ge \phi$,
i. $h = h + 1$.
ii. Set $W_h = U_{sub} \cup \{v\}$.
iii. For all $e \in F_{sub}$, set $\rho(e) = \rho(e) \cup \{W_h\}$.
iv. Set $w_{sub} = 0$, $F_{sub} = \emptyset$ and $U_{sub} = \emptyset$.
4. Set $F_v = \{(u,v) \in E\}$, the edges attached to $v$.
5. Set $w_v = \sum_{e \in F_v} \eta(e)$.
6. If $\phi \le w_v + w_{sub} \le 2\phi$,
(a) $h = h + 1$.
(b) Set $W_h = U_{sub} \cup \{v\}$.
(c) For all $e \in F_{sub} \cup F_v$, set $\rho(e) = \rho(e) \cup \{W_h\}$.
(d) Return $(\emptyset, 0, \emptyset)$.
7. If $w_v + w_{sub} > 2\phi$,
(a) $h = h + 1$.
(b) Set $W_h = U_{sub}$.
(c) For all $e \in F_{sub}$, set $\rho(e) = \rho(e) \cup \{W_h\}$.
(d) $h = h + 1$.
(e) Set $W_h = \{v\}$.
(f) For all $e \in F_v$, set $\rho(e) = \rho(e) \cup \{W_h\}$.
(g) Return $(\emptyset, 0, \emptyset)$.
8. Return $(F_{sub} \cup F_v,\ w_{sub} + w_v,\ U_{sub} \cup \{v\})$.

Proof of Theorem 9.3. As algorithm decompose traverses the tree $T$ once and visits each edge
in $E$ once, it runs in linear time.
In our proof, we will say that an edge $e$ is assigned to a set $W_j$ if $W_j \in \rho(e)$. To prove part
(a) of the theorem, we use the following observations: If $W_j$ is formed in step 3.c.ii or step 6.b,
then the sum of $\eta$ over edges assigned to $W_j$ is at least $\phi$, and if $W_j$ is formed in step 7.b, then
the sum of $\eta$ of edges incident to $W_j$ and $W_{j+1}$ (which is a singleton) is at least $2\phi$. Finally,
if a set $W_h$ is formed in line 5.b of decompose, then the sum of $\eta$ over edges assigned to $W_h$ is
greater than zero. But, at most one set is formed this way. As each edge is assigned to at most
two sets in $W_1,\dots,W_h$, we may conclude
\[
2 \sum_{e \in E} \eta(e) > (h - 1)\phi,
\]
which implies $t > h - 1$. As both $t$ and $h$ are integers, this implies $t \ge h$.
We now prove part (b). First, observe that steps 6 and 7 guarantee that when a call to
sub($v$) returns a triple $(F, w, U)$,
\[
w = \sum_{e \in F} \eta(e) < \phi.
\]
Thus, when a set $W_h$ is formed in step 3.c.ii, we know that the sum of $\eta$ over edges assigned to
$W_h$ equals $w_{sub}$ and is at most $2\phi$. Similarly, we may reason that $w_{sub} < \phi$ at step 4. If a set
$W_h$ is formed in step 6.b, the sum of $\eta$ over edges associated with $W_h$ is $w_v + w_{sub}$, and must
be at most $2\phi$. If a set $W_h$ is formed in step 7.b, the sum of $\eta$ over edges associated with $W_h$ is
$w_{sub}$, which we established is at most $\phi$. As the set formed in step 7.e is a singleton, we do not
need to bound the sum of $\eta$ over its associated edges.
Lemma C.1. Suppose $G = (V,E)$ is a planar graph, $\pi$ is a planar embedding of $G$, $T$ is a
spanning tree of $G$, and $t > 1$ is an integer. Let $(\{W_1,\dots,W_h\}, \rho)$ = decompose($T, E, \eta, t$) with
the assumption that in Step 1 of sub, the children $v_1,\dots,v_s$ of $v$ always appear in clockwise order
according to $\pi$. Then the graph $G_{\{W_1,\dots,W_h\}} = (\{1,\dots,h\}, \{(i,j) : \exists\, e \in E, \rho(e) = \{W_i, W_j\}\})$
is planar.
Proof. Recall that the contraction of an edge $e = (u,v)$ in a planar graph $G = (V,E)$ defines
a new graph $(V - \{u\},\ E \cup \{(x,v) : (x,u) \in E\} - \{(x,u) \in E\})$. Also recall that edge deletions
and edge contractions preserve planarity.
We first prove the lemma in the special case in which the sets $W_1,\dots,W_h$ are disjoint. For
each $j$, let $T_j$ be the graph induced on $T$ by $W_j$. As each $T_j$ is connected, $G_{\{W_1,\dots,W_h\}}$ is a
subgraph of the graph obtained by contracting all the edges in each subgraph $T_j$. Thus in this
special case $G_{\{W_1,\dots,W_h\}}$ is planar.
We now analyze the general case, recalling that the sets $W_1,\dots,W_h$ can overlap. However,
the only way sets $W_j$ and $W_k$ with $j < k$ can overlap is if the set $W_j$ was formed at step 3.c.ii,
and the vertex $v$ becomes part of $W_k$ after it is returned by a call to sub. In this situation, no
edge is assigned to $W_j$ for having $v$ as an end-point. That is, the only edges of form $(x,v)$ that
can be assigned to $W_j$ must have $x \in W_j$. So, these edges will not appear in $G_{\{W_1,\dots,W_h\}}$.
Accordingly, for each $j$ we define
\[
X_j = \begin{cases}
W_j - v & \text{if $W_j$ was formed at step 3.c.ii, and} \\
W_j & \text{otherwise.}
\end{cases}
\]
We have shown that $G_{\{W_1,\dots,W_h\}} = G_{\{X_1,\dots,X_h\}}$. Moreover, the sets $X_1,\dots,X_h$ are disjoint. Our
proof would now be finished, if only each subgraph of $G$ induced by a set $X_j$ were connected.
While this is not necessarily the case, we can make it the case by adding edges to $E$.
The only way the subgraph of $G$ induced on a set $X_j$ can fail to be connected is if $W_j$ is
formed at line 3.c.ii from the union of $v$ with a collection of sets $U_i$ for $i_0 \le i \le i_1$ returned by
recursive calls to sub. Now, consider what happens if we add edges of the form $(v_i, v_{i+1})$ to the
graph for $i_0 \le i < i_1$, whenever they are not already present. As the vertices $v_{i_0},\dots,v_{i_1}$ appear
in clockwise order around $v$, the addition of these edges preserves the planarity of the graph.
Moreover, their addition makes the induced subgraphs on each set $X_j$ connected, so we may
conclude that $G_{\{X_1,\dots,X_h\}}$ is in fact planar.
D The Pseudo-Inverse of a Factored Symmetric Matrix
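We recall that $B^{\dagger}$ is the pseudo-inverse of $B$ if and only if it satisfies
\begin{align}
B B^{\dagger} B &= B \tag{41} \\
B^{\dagger} B B^{\dagger} &= B^{\dagger} \tag{42} \\
(B B^{\dagger})^T &= B B^{\dagger} \tag{43} \\
(B^{\dagger} B)^T &= B^{\dagger} B. \tag{44}
\end{align}
We now prove that if $B = X C X^T$, where $X$ is a non-singular matrix and $C$ is symmetric,
then
\[
B^{\dagger} = \Pi X^{-T} C^{\dagger} X^{-1} \Pi,
\]
where $\Pi$ is the projection onto the span of $B$. We will prove that by showing that $\Pi X^{-T} C^{\dagger} X^{-1} \Pi$
satisfies axioms (41)–(44). Recall that $\Pi = B B^{\dagger} = B^{\dagger} B$ and that $\Pi B = B$.
To verify (41), we compute
\begin{align*}
B (\Pi X^{-T} C^{\dagger} X^{-1} \Pi) B &= B X^{-T} C^{\dagger} X^{-1} B \\
&= (X C X^T) X^{-T} C^{\dagger} X^{-1} (X C X^T) \\
&= X C C^{\dagger} C X^T \\
&= X C X^T \\
&= B.
\end{align*}
To verify (42), we compute
\begin{align*}
(\Pi X^{-T} C^{\dagger} X^{-1} \Pi) B (\Pi X^{-T} C^{\dagger} X^{-1} \Pi) &= \Pi X^{-T} C^{\dagger} X^{-1} B X^{-T} C^{\dagger} X^{-1} \Pi \\
&= \Pi X^{-T} C^{\dagger} X^{-1} X C X^T X^{-T} C^{\dagger} X^{-1} \Pi \\
&= \Pi X^{-T} C^{\dagger} C C^{\dagger} X^{-1} \Pi \\
&= \Pi X^{-T} C^{\dagger} X^{-1} \Pi.
\end{align*}
To verify (43), it suffices to verify that $B \Pi X^{-T} C^{\dagger} X^{-1} \Pi$ is symmetric, which we now do:
\begin{align*}
B (\Pi X^{-T} C^{\dagger} X^{-1} \Pi) &= \Pi B X^{-T} C^{\dagger} X^{-1} \Pi \\
&= \Pi (X C X^T)(X^{-T} C^{\dagger} X^{-1} \Pi) \\
&= \Pi X C C^{\dagger} X^{-1} \Pi \\
&= B^{\dagger} B X C C^{\dagger} X^{-1} B B^{\dagger} \\
&= B^{\dagger} X C X^T X C C^{\dagger} X^{-1} X C X^T B^{\dagger} \\
&= B^{\dagger} X C X^T X C C^{\dagger} C X^T B^{\dagger} \\
&= B^{\dagger} X C X^T X C X^T B^{\dagger} \\
&= B^{\dagger} B B B^{\dagger},
\end{align*}
which is symmetric as $B$ is symmetric.
As $B$ and $\Pi X^{-T} C^{\dagger} X^{-1} \Pi$ are symmetric, it follows that (44) is satisfied as well.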