
Improving Walker’s Algorithm to Run in Linear Time

Christoph Buchheim¹, Michael Jünger¹, and Sebastian Leipert²

¹ Universität zu Köln, Institut für Informatik,
Pohligstraße 1, 50969 Köln, Germany
{buchheim,mjuenger}@informatik.uni-koeln.de

² caesar research center,
Friedensplatz 16, 53111 Bonn, Germany
leipert@caesar.de

Abstract. The algorithm of Walker [5] is widely used for drawing trees of unbounded degree, and it is widely assumed to run in linear time, as the author claims in his article. However, the algorithm as presented clearly requires quadratic runtime. We explain the reasons for this and present a revised algorithm that creates the same layouts in linear time.

1 Introduction

Since Walker presented his article [5] on drawing rooted ordered trees of unbounded degree, this topic has been considered a solved problem of Automatic Graph Drawing. In 1979, Wetherell and Shannon [6] presented a linear time algorithm for drawing binary trees satisfying the following aesthetic requirements: the y-coordinate of a node corresponds to its level, so that the hierarchical structure of the tree is displayed; the left child of a node is placed to the left of the right child, i.e., the order of the children is displayed; finally, each parent node is centered over its children. Nevertheless, this algorithm showed some deficiencies. In 1981, Reingold and Tilford [2] improved the Wetherell-Shannon algorithm by adding the following feature: each pair of isomorphic subtrees is drawn identically up to translation, i.e., the drawing does not depend on the position of a subtree within the complete tree. They also made the algorithm symmetric: if all orders of children in a tree are reversed, the computed drawing is the reflection of the original one. The width of the drawing is not always minimized subject to these conditions, but it is close to the minimum in general. The algorithm of Reingold and Tilford runs in linear time, too.

Extending this algorithm to rooted ordered trees of unbounded degree in a straightforward way produces layouts where some subtrees of the tree may get clustered in a small space, even if they could be dispersed much better. This problem was solved in 1990 by the algorithm of Walker [5], which spaces out subtrees whenever possible. Unfortunately, the runtime of the algorithm presented in [5] is quadratic, contrary to the author’s assertion. In the present article, we close this gap by giving an adjustment of Walker’s algorithm that does not affect the computed layouts but yields linear runtime.

M.T. Goodrich and S.G. Kobourov (Eds.): GD 2002, LNCS 2528, pp. 344–353, 2002.
© Springer-Verlag Berlin Heidelberg 2002


In the next section, we establish the basic notation about trees and state the

aesthetic criteria guiding the algorithms to be dealt with. In Sect. 3, we explain

the Reingold-Tilford algorithm. In Sect. 4, we describe the idea of Walker’s

algorithm and point out the non-linear parts. We improve these parts in order

to get a linear time algorithm in Sect. 5.

2 Preliminaries

We define a (rooted) tree as a directed acyclic graph with a single source, called the root of the tree, such that there is a unique directed path from the root to any other node. The level of a node is the length of this path. For each edge (v, w), we call v the parent of w and w a child of v. If w₁ and w₂ are two different children of v, we say that w₁ and w₂ are siblings. Each node w on the path from the root to a node v is called an ancestor of v, while v is called a descendant of w. A leaf of the tree is a sink of the graph, i.e., a node without children. If v⁻ and v⁺ are two nodes such that v⁻ is not an ancestor of v⁺ and vice versa, the greatest distinct ancestors of v⁻ and v⁺ are defined as the unique ancestors w⁻ and w⁺ of v⁻ and v⁺, respectively, such that w⁻ and w⁺ are siblings. Each node v of a rooted tree T induces a unique subtree of T with root v.

In a binary tree, each node has at most two children. In an ordered tree, a certain order of the children of each node is fixed. The first (last) child according to this order is called the leftmost (rightmost) child. The left (right) sibling of a node v is its predecessor (successor) in the list of children of the parent of v. The leftmost (rightmost) descendant of v on level l is the leftmost (rightmost) node on level l belonging to the subtree induced by v. Finally, if v₁ is the left sibling of v₂, w₁ is the rightmost descendant of v₁ on some level l, and w₂ is the leftmost descendant of v₂ on the same level l, we call w₁ the left neighbor of w₂ and w₂ the right neighbor of w₁.
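The leftmost and rightmost descendants per level can be read off a level-order traversal of an ordered tree: they are the first and last node of each level. A minimal sketch, with an illustrative Node class not taken from the paper:

```python
# Sketch: leftmost/rightmost descendants of a node on each level of its
# subtree, via a level-order traversal that preserves child order.

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def extreme_descendants(v):
    """Labels of the leftmost and rightmost descendant of v on each level."""
    leftmost, rightmost = [], []
    level = [v]
    while level:
        leftmost.append(level[0].label)
        rightmost.append(level[-1].label)
        level = [c for u in level for c in u.children]  # next level, in order
    return leftmost, rightmost

# Example: root a with children b, c; b has a child d.
tree = Node("a", [Node("b", [Node("d")]), Node("c")])
print(extreme_descendants(tree))  # (['a', 'b', 'd'], ['a', 'c', 'd'])
```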

To draw a tree into the plane means to assign x- and y-coordinates to its

nodes and to represent each edge (v, w) by a straight line connecting the points

corresponding to v and w. When drawing a rooted tree, one usually requires the

following aesthetic properties:

(A1) The layout displays the hierarchical structure of the tree, i.e., the y-

coordinate of a node is given by its level.

(A2) The edges do not cross each other and nodes on the same level have a

minimal horizontal distance.

(A3) The drawing of a subtree does not depend on its position in the tree,

i.e., isomorphic subtrees are drawn identically up to translation.

If the trees to be drawn are ordered, we additionally require the following:

(A4) The order of the children of a node is displayed in the drawing.

(A5) The algorithm works symmetrically, i.e., the drawing of the reﬂection of

a tree is the reﬂected drawing of the original tree.

Here, the reﬂection of an ordered tree is the tree with reversed order of children

for each parent node. Usually, one tries to ﬁnd a layout satisfying (A1) to (A5)

with a small width, i.e., with a small range of x-coordinates.


3 Reingold and Tilford’s Algorithm

For ordered binary trees, the first linear time algorithm satisfying (A1) to (A5) was presented by Reingold and Tilford [2]. This algorithm is easy to describe informally: it draws the tree recursively in a bottom-up sweep. Leaves are placed at an arbitrary x-coordinate and at the y-coordinate given by their level. After drawing the subtrees induced by the children of a parent node independently, the right one is shifted so that it is placed as close to the right of the left subtree as possible.¹ Next, the parent is placed centrally above the children, that is, at the x-coordinate given by the average x-coordinate of the children, and at the y-coordinate given by its level. Finally, the edges are inserted.
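The shifting step can be sketched directly: compare, level by level, the rightmost x-coordinate of the left subtree with the leftmost x-coordinate of the right subtree, for all levels present in both, and move the right subtree far enough to keep a minimal distance. This is a naive sketch with assumed names, not the paper's linear-time version:

```python
# Sketch: the shift needed so the right subtree clears the left one, given
# per-level extreme x-coordinates (topmost level first). zip() truncates to
# the levels occurring in both subtrees.

def required_shift(left_subtree_right_xs, right_subtree_left_xs, min_dist=1.0):
    shift = 0.0
    for xl, xr in zip(left_subtree_right_xs, right_subtree_left_xs):
        shift = max(shift, xl + min_dist - xr)
    return shift

print(required_shift([0, 1, 2], [1, 1, 1]))  # 2.0
```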

The Reingold-Tilford algorithm obviously satisﬁes (A1) to (A5). The diﬃcult

task is how to perform the steps described above in linear time. The crucial

part of the algorithm is the shifting of the second subtree; solving the following

problems takes quadratic runtime in total, if a straightforward algorithm is used:

ﬁrst, the computation of the new position of this subtree, second, the shifting of

the subtree itself.

For the ﬁrst problem, deﬁne the left (right) contour of a tree as the sequence

of leftmost (rightmost) nodes in each level, traversed from the root to the highest

level. For an illustration, see Fig. 1, where nodes belonging to the contours are

shaded. To place the right subtree as close to the left one as possible, we have to compare the positions of the right contour of the left subtree with the positions of the left contour of the right subtree, for all levels occurring in both subtrees. Since each node belongs to the traversed part of the left contour of the right subtree for at most one subtree combination, the total number of such comparisons is linear for the complete tree. The runtime problem is how to traverse the contours without traversing (too many) nodes not belonging to the contours. To solve this

problem, Reingold and Tilford introduce threads. For each leaf of the tree that

has a successor in the same contour, the thread is a pointer to this successor.

See Fig. 1 again, where the threads are represented by dotted arrows. For every

node of the contour, we now have a pointer to its successor in the contour: either

it is the leftmost (rightmost) child, or it is given by the thread. Finally, to keep

the threads up to date, one has to add a new thread whenever two subtrees of

diﬀerent height are combined.
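The contour-following rule described above can be sketched as follows, under assumed field names (the thread pointer stored at a leaf is the contour successor it was given when two subtrees of different height were combined):

```python
# Sketch: following the right contour with threads. The successor of v on
# the contour is its rightmost child if it has children, otherwise the node
# its thread points to (or None at the end of the contour).

class Node:
    def __init__(self, children=(), thread=None):
        self.children = list(children)
        self.thread = thread  # contour successor of a leaf, if any

def next_right(v):
    """Successor of v on the right contour."""
    return v.children[-1] if v.children else v.thread

grandchild = Node()
tall = Node([grandchild])
short = Node(thread=grandchild)  # thread added when the subtrees were combined
root = Node([short, tall])
assert next_right(root) is tall
assert next_right(short) is grandchild  # the thread bridges the height gap
```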

For the second problem, the straightforward algorithm would shift all nodes

of the right subtree by the same value. Since this needs quadratic time in total,

Reingold and Tilford attach a new value mod(v) to each node v, which is called

its modiﬁer (this technique was presented by Wetherell and Shannon [6]). The

position of each node is preliminary in the bottom-up traversal of the tree.

When moving a subtree rooted at v, only mod(v) and a preliminary x-coordinate

prelim(v) are adjusted by the amount of shifting.

¹ For simplicity, we assume throughout this paper that all nodes have the same dimensions and that the minimal distance required between neighbors is the same for each pair of neighbors. Both restrictions can be relaxed easily, since we will always compare a single pair of neighbors.

Fig. 1. Combining two subtrees and adding a new thread t

The modifier of a node v is interpreted as a value to be added to all preliminary x-coordinates in the

subtree rooted at v, except for v itself. Thus, the real position of a node is its

preliminary position plus the aggregated modiﬁer modsum(v) given by the sum

of all modiﬁers on the path from the parent of v to the root. To compute all real

positions in linear time, the tree is traversed in a top-down fashion at the end.
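This final top-down pass can be sketched in a few lines; the field names prelim and mod follow the text, while the Node class itself is illustrative:

```python
# Sketch: the real x-coordinate of every node is its preliminary coordinate
# plus the sum of the modifiers of its ancestors, computed in one traversal.

class Node:
    def __init__(self, prelim=0.0, mod=0.0, children=()):
        self.prelim, self.mod = prelim, mod
        self.children = list(children)
        self.x = None

def second_walk(v, modsum=0.0):
    v.x = v.prelim + modsum
    for c in v.children:
        second_walk(c, modsum + v.mod)  # v's modifier shifts its whole subtree

left = Node(prelim=0.0)
right = Node(prelim=1.0)
root = Node(prelim=0.5, mod=2.0, children=[left, right])
second_walk(root)
print(root.x, left.x, right.x)  # 0.5 2.0 3.0
```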

When comparing contour nodes to compute the new position of the right

subtree, we need the real positions of these nodes, too. For runtime reasons, we cannot afford to sum up the modifiers along the paths to the root. Therefore, modifiers

are used in leaves as well. A modiﬁer of a leaf v with a thread to a node w

stores the diﬀerence between modsum(w) and modsum(v). Since new threads

are added after combining two subtrees, these modiﬁer sums can be computed

while traversing the contours of the two subtrees. We have to traverse not only

the inside contours but also the outside contours for computing the modiﬁer

sums, since v is a node of the outside contour. Now the aggregated modiﬁers can

be computed as the sums of modiﬁers along the contours instead of the paths

to the root.

4 Walker’s Algorithm

For drawing trees of unbounded degree, the Reingold-Tilford algorithm could

be adjusted easily by traversing the children from left to right, placing and

shifting the corresponding subtrees one after another. However, this violates

property (A5): the subtrees are placed as close to each other as possible and small

subtrees between larger ones are piled to the left; see Fig. 2(a). A simple trick

to avoid this effect is to add an analogous second traversal from right to left (see Fig. 2(b)) and to take average positions after that. This algorithm satisfies (A1)

to (A5), but smaller subtrees are usually clustered then; see Fig. 2(c).

To obtain a layout where smaller subtrees are spaced out evenly, as for example in Fig. 2(d), Walker [5] proposed the following procedure; see Fig. 3: the subtrees of the current root are processed one after another from left to right.

First, each child of the current root is placed as close to the right of its left sib-

ling as possible. As in Reingold and Tilford’s algorithm, the left contour of the

current subtree is then traversed top down in order to compare the positions of

its nodes to those of their left neighbors. Whenever two conflicting neighbors v⁻ and v⁺ are detected, forcing v⁺ to be shifted to the right by an amount of


Fig. 2. Extending the Reingold-Tilford algorithm to trees of unbounded degree

shift, we apply an appropriate shift to all smaller subtrees between the subtrees containing v⁻ and v⁺. More precisely, let w⁻ and w⁺ be the greatest distinct ancestors of v⁻ and v⁺. Notice that both w⁻ and w⁺ are children of the current root. Let subtrees be the number of children of the current root between w⁻ and w⁺, plus 1. Spacing out the subtrees means shifting the subtree rooted at the i-th child to the right of w⁻ by an amount of i·shift/subtrees, for i = 1, ..., subtrees. Observe that subtrees may be shifted several times by this algorithm, even while adding a single subtree. It is easy to see that this algorithm satisfies (A5).

Fig. 3. Spacing out the smaller subtrees
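Walker's spacing rule can be sketched directly. This is the immediate (and, as Sect. 5 discusses, quadratic-time) form, with positions and names chosen for illustration:

```python
# Sketch: when the subtree rooted at w+ must move right by `shift`, the i-th
# subtree to the right of w- moves by i*shift/subtrees. Children are
# identified by their position among the children of the current root.

def space_out(offsets, lo, hi, shift):
    """Shift the children at positions lo < i <= hi of the current root."""
    subtrees = hi - lo  # children strictly in between, plus 1
    for i in range(1, subtrees + 1):
        offsets[lo + i] += i * shift / subtrees
    return offsets

# Children 0..3; conflict between child 0 (w-) and child 3 (w+), shift = 3:
print(space_out([0.0] * 4, 0, 3, 3.0))  # [0.0, 1.0, 2.0, 3.0]
```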

Unfortunately, many parts of the algorithm presented in [5] do not run in

linear time. Some of them are easy to improve, for example by using Reingold

and Tilford’s ideas, and some require new ideas. All problems concern Walker’s procedure APPORTION; see pages 695–697 in [5]. In the following, we list the

critical parts. In the next section, we will explain how to change these in order

to obtain linear runtime.

Traversing the right contour: A recursive function GETLEFTMOST is used to find the leftmost descendant of a given node v on a given level l. If the level

ﬁnd the leftmost descendant of a given node v on a given level l. If the level


of v is l, the algorithm returns v. Otherwise, GETLEFTMOST is applied recursively to all children of v, from left to right. The aggregated runtime of GETLEFTMOST is not linear in general. To prove that, we present a series of trees Tₖ such that the number of nodes in Tₖ is n ∈ Θ(k²), but the total number of GETLEFTMOST calls is Θ(k³). Since k ∈ Θ(n^{1/2}), this shows that the total runtime of GETLEFTMOST is Ω(k³) = Ω(n^{3/2}). The tree Tₖ is defined as follows (see Fig. 4(a) for k = 3): Beginning at the root, there is a chain of 2k nodes, each of the last 2k − 1 being the right or only child of its predecessor. For i = 1, ..., k, the i-th node in this chain has another child to the left; this child is the first node of a chain of 2(k − i) + 1 nodes. The number of nodes in Tₖ is

  2k + ∑_{i=1}^{k} (2(k − i) + 1) = 2k + k(k − 1) + k ∈ Θ(k²) .

Now, for each i = 0, ..., k − 1, we have to combine two subtrees when visiting the node on the right contour of Tₖ on level i. In this combination, the highest common level of the subtrees is 2k − i − 1, and by construction of Tₖ we always have to apply GETLEFTMOST to every node of the right subtree up to this level. The number of these nodes is

  (k − i) + ∑_{j=0}^{k−i−1} 2j = (k − i) + (k − i)(k − i − 1) = (k − i)² ,

hence the total number of GETLEFTMOST calls for all combinations is

  ∑_{i=0}^{k−1} (k − i)² = ∑_{i=1}^{k} i² = k(k + 1)(2k + 1)/6 ∈ Θ(k³) .
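The two counting formulas above can be sanity-checked numerically; a small script (not part of the paper) evaluating both sides for small k:

```python
# Numeric check of the node count of T_k and the GETLEFTMOST call count.

for k in range(1, 30):
    nodes = 2 * k + sum(2 * (k - i) + 1 for i in range(1, k + 1))
    assert nodes == 2 * k + k * (k - 1) + k          # Θ(k²) nodes in T_k
    calls = sum((k - i) ** 2 for i in range(0, k))
    assert calls == k * (k + 1) * (2 * k + 1) // 6   # Θ(k³) GETLEFTMOST calls
print("formulas check out")
```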

Finding the ancestors and summing up modiﬁers: This part of the algorithm is

obviously quadratic. When adjusting the current subtree to the left subforest,

the greatest distinct ancestors of the possibly conﬂicting neighbors are computed

for each level by traversing the graph up to the current root, at the same time

computing the modiﬁer sums. Since the distance of the levels grows linearly, the

total number of steps is in Ω(n²).

Counting and shifting the smaller subtrees: When shifting the current subtree to the right because of a conflict with a subtree to the left, the procedure APPORTION also shifts all smaller subtrees in-between immediately. Furthermore, the number of these subtrees is computed by counting them one by one. Both actions have an aggregated runtime of Ω(n^{3/2}), as the following example shows. Let the tree Tₖ be constructed as follows (see Fig. 4(b) for k = 3): add k children to the root. The i-th child, counted i = 1, ..., k from left to right, is root of a chain of i nodes. Between each pair of these children, add k children as leaves. The leftmost child of the root has 2k + 5 children, and up to level k − 1, every rightmost child of the 2k + 5 children has again 2k + 5 children. The number of nodes of Tₖ is

  1 + ∑_{i=1}^{k} i + (k − 1)k + (k − 1)(2k + 5) ∈ Θ(k²) .

Furthermore, by construction of the left subtree, adding the i-th subtree chain for i = 2, ..., k results in a conflict with the left subtree on level i. Hence all (i − 1)(k + 1) − 1 smaller subtrees between the two conflicting ones are counted and shifted. Thus, the total number of counting and shifting steps is

  ∑_{i=2}^{k} ((i − 1)(k + 1) − 1) = (k + 1)k(k − 1)/2 − k + 1 ∈ Θ(k³) .

As in the last example, we derive that counting and shifting needs Ω(n^{3/2}) time in total.
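Likewise, a quick numeric check (not part of the paper) of the counting-and-shifting sum:

```python
# Numeric check of the closed form for the counting and shifting steps.

for k in range(2, 30):
    steps = sum((i - 1) * (k + 1) - 1 for i in range(2, k + 1))
    assert steps == (k + 1) * k * (k - 1) // 2 - k + 1
print("ok")
```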

(a) T₃  (b) T₃

Fig. 4. Examples proving the non-linear runtime of Walker’s algorithm

5 Improving Walker’s Algorithm

In this section, we explain how to improve the algorithm of Walker to run in

linear time without aﬀecting the computed layouts. For a closer look, see [1],

where we present the complete revised algorithm in a pseudocode style as well

as an experimental runtime comparison of both versions.

Traversing the contours and summing up modiﬁers: This can be done exactly

as in the case of binary trees by using threads; see Sect. 3. The fact that the left subforest is not a tree in general does not create any additional difficulty.

Finding the ancestors: The problem of finding the greatest distinct ancestors w⁻ and w⁺ of two nodes v⁻ and v⁺ can be solved by the algorithm of Schieber and Vishkin [3]. For each pair of nodes, this algorithm can determine the greatest distinct ancestors in constant time, after an O(n) preprocessing step. However,


in our application, a much simpler algorithm can be applied. First observe that we know the right ancestor w⁺ anyway; it is just the root of the current subtree. Furthermore, as v⁺ is always the right neighbor of v⁻ in our algorithm, the left one of the greatest distinct ancestors only depends on v⁻. Thus, for short, we call it the ancestor of v⁻ in the following. We use a node pointer ancestor(x) for each node x to save its ancestor and initialize it to x itself. Observe that this value is not correct for rightmost children, but we do not need the correct value w⁻ of ancestor(v⁻) until the right neighbor v⁺ of v⁻ is added, i.e., until the current root is the parent node of w⁻. Hence assume that we are placing the subtrees rooted at the children of v from left to right. Since tracing all ancestor(x) consumes too much time, we use another node pointer defaultAncestor. Our aim is to have the following property (*) for all nodes v⁻ on the right contour of the left subforest after each subtree addition: if ancestor(v⁻) is up to date, i.e., is a child of v, then it points to the correct ancestor w⁻ of v⁻; otherwise, the correct ancestor is defaultAncestor. We start with placing the first subtree, rooted at w, which does not require any ancestor calculations. After that, we set defaultAncestor to w. Since all pointers ancestor(x) of the left subtree either point to w or to a node of a higher level, the desired property (*) holds; see Fig. 5(a). After placing the subtree rooted at another child w′ of v, we distinguish two cases: if the subtree rooted at w′ is smaller than the left subforest, we can update ancestor(x) for all nodes x on its right contour by setting it to w′. By this, we obviously keep (*); see Fig. 5(b). Otherwise, if the new subtree is larger than the left subforest, we cannot do the same for runtime reasons. But now it suffices to set defaultAncestor to w′, since again all pointers ancestor(x) of the subtree induced by w′ either point to w′ or to a node of a higher level, and all other subtrees in the left subforest are hidden. Hence we have (*) again; see Fig. 5(c).

Fig. 5. Adjusting ancestor pointers when adding new subtrees: the pointer ancestor(x) is represented by a solid arrow if it is up to date and by a dashed arrow if it is expired. In the latter case, the defaultAncestor is used and drawn black. When adding a small subtree, all ancestor pointers ancestor(x) of its right contour are updated. When adding a large subtree, only defaultAncestor is updated
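The lookup implied by property (*) fits in one line; a sketch with illustrative dictionary-based fields (the node names are hypothetical):

```python
# Sketch: ancestor(v-) is trusted only if it is a child of the current root v;
# otherwise default_ancestor is the correct left ancestor.

def left_ancestor(v_minus, v, ancestor, parent, default_ancestor):
    a = ancestor[v_minus]
    return a if parent.get(a) == v else default_ancestor

# Current root "v" with children "w1", "w2"; "x" lies in w1's subtree.
parent = {"w1": "v", "w2": "v", "x": "w1"}
assert left_ancestor("x", "v", {"x": "w1"}, parent, "w2") == "w1"  # up to date
assert left_ancestor("x", "v", {"x": "x"}, parent, "w2") == "w2"   # expired
```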

352 Christoph Buchheim, Michael J¨unger, and Sebastian Leipert

Counting the smaller subtrees: For that, we just have to number the children of each node consecutively; then the number of smaller subtrees between the two greatest distinct ancestors w⁻ and w⁺ is the number of w⁺ minus the number of w⁻ minus 1. Hence it can be computed in constant time (after a linear time preprocessing step to compute all child numbers).
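A sketch of this constant-time count, with hypothetical child names:

```python
# Sketch: number each node's children once (linear preprocessing); the
# subtrees strictly between the greatest distinct ancestors are then counted
# by subtracting child numbers.

def number_children(children):
    return {c: i for i, c in enumerate(children)}

num = number_children(["w0", "w1", "w2", "w3", "w4"])
between = num["w4"] - num["w0"] - 1  # subtrees between w- = w0 and w+ = w4
print(between)  # 3
```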

[Figure omitted: example tree annotated with shift and change values.]

Fig. 6. Aggregating the shifts: the top number at node x indicates the value of shift(x), and the bottom number indicates the value of change(x)

Shifting the smaller subtrees: In order to get a linear runtime, we shift each subtree at most once while it is not the currently added subtree. The currently added subtree, however, may be shifted whenever it conflicts with a subtree to the left, using the fact that shifting a single subtree is done in constant time (recall that we only have to adjust prelim(w⁺) and mod(w⁺)). Furthermore, shifting the current subtree immediately is necessary to keep the right contour of the left subforest up to date. All shifts of non-current subtrees are performed in a single traversal after all subtrees of the current root have been placed. To memorize the shifts at the moment they arise, we use real numbers shift(x) and change(x) for each node x and set both to zero at the beginning. Assume that the subtree rooted at w⁺ is the subtree currently placed, and that a conflict with the subtree rooted at w⁻ forces the current subtree to move to the right by an amount of shift. Let subtrees be the number of subtrees between w⁻ and w⁺, plus 1. According to Walker’s idea, the i-th of these subtrees has to be moved by i·shift/subtrees. We save this by increasing shift(w⁺) by shift, decreasing change(w⁺) by shift/subtrees, and increasing change(w⁻) by shift/subtrees. The interpretation of this is the following: to the left of node w⁺, the nodes are shifted by an amount initialized to shift, but this amount starts decreasing by shift/subtrees per subtree at node w⁺ and ends decreasing at w⁻, where it is zero. The trick is to aggregate the shifts: since the decrease in the amount of shifting is linear, we can add all these decreases in one array; see Fig. 6 for an example. Finally, we execute all shifts in a single traversal of the children of the current root as follows; see Fig. 7: we use two real values shift and change to store the shifts and the decreases of shift per subtree, respectively, and set both to zero at the beginning. Then we traverse the children from right to left. When visiting child v, we move v to the right by shift (i.e., we increase prelim(v) and mod(v) by shift), increase change by change(v), and increase shift by shift(v) and by change. Then we go on to the left sibling of v. It is easy to see that this algorithm shifts each subtree by the correct amount.
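The deferred shifting can be sketched as follows, with assumed dictionary-based fields; move_subtree records a conflict as described above, and execute_shifts distributes the moves over the children of the current root in one right-to-left pass:

```python
# Sketch of the deferred shifting: conflicts are recorded in shift/change,
# and a single right-to-left pass over the children applies them.

def move_subtree(pos, w_minus, w_plus, amount, prelim, mod, shift, change):
    subtrees = pos[w_plus] - pos[w_minus]  # subtrees in between, plus 1
    change[w_plus] -= amount / subtrees
    change[w_minus] += amount / subtrees
    shift[w_plus] += amount
    prelim[w_plus] += amount               # the current subtree moves at once
    mod[w_plus] += amount

def execute_shifts(children, prelim, mod, shift, change):
    s = c = 0.0
    for v in reversed(children):
        prelim[v] += s
        mod[v] += s
        c += change[v]
        s += shift[v] + c

children = ["c0", "c1", "c2", "c3"]
pos = {v: i for i, v in enumerate(children)}
prelim, mod = ({v: 0.0 for v in children} for _ in range(2))
shift, change = ({v: 0.0 for v in children} for _ in range(2))
# Adding c3 causes a conflict with c0's subtree, forcing a move by 3.0:
move_subtree(pos, "c0", "c3", 3.0, prelim, mod, shift, change)
execute_shifts(children, prelim, mod, shift, change)
print([prelim[v] for v in children])  # [0.0, 1.0, 2.0, 3.0]
```

The in-between subtrees c1 and c2 receive 1·shift/3 and 2·shift/3, exactly as Walker's rule demands, yet each child is touched only once in the final pass.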

[Figure omitted: example tree annotated with the traversal values of shift and change.]

Fig. 7. Executing the shifts: the new numbers at node x indicate the values of shift and change before shifting x, respectively

References

1. C. Buchheim, M. Jünger, and S. Leipert. Improving Walker’s algorithm to run in linear time. Technical Report zaik2002-431, ZAIK, Universität zu Köln, 2002.

2. E. Reingold and J. Tilford. Tidier drawings of trees. IEEE Transactions on Software

Engineering, 7(2):223–228, 1981.

3. B. Schieber and U. Vishkin. On ﬁnding lowest common ancestors: Simpliﬁcation

and parallelization. In Proceedings of the Third Aegean Workshop on Computing,

volume 319 of Lecture Notes in Computer Science, pages 111–123, 1988.

4. K. Supowit and E. Reingold. The complexity of drawing trees nicely. Acta Infor-

matica, 18(4):377–392, 1983.

5. J. Walker II. A node-positioning algorithm for general trees. Software – Practice

and Experience, 20(7):685–705, 1990.

6. C. Wetherell and A. Shannon. Tidy drawings of trees. IEEE Transactions on

Software Engineering, 5(5):514–520, 1979.