Approximation Schemes for Broadcasting in Heterogenous Networks*

Samir Khuller¹, Yoo-Ah Kim², and Gerhard Woeginger³

¹ Department of Computer Science, University of Maryland, College Park, MD 20742, samir@cs.umd.edu

² Department of Computer Science, University of Maryland, College Park, MD 20742, ykim@cs.umd.edu

³ Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands, gwoegi@igi.tu-graz.ac.at

Abstract. We study the problem of minimizing the broadcast time for a set of processors in a cluster, where processor $p_i$ has transmission time $t_i$, which is the time taken to send a message to any other processor in the cluster. Previously, it was shown that the Fastest Node First method (FNF) gives a 1.5 approximate solution. In this paper we show that there is a polynomial time approximation scheme for the problems of broadcasting and multicasting in such a heterogenous cluster.

1 Introduction

Networks of Workstations (NOWs) are an extremely popular alternative to massively parallel machines and are widely used (for example, the Condor project at Wisconsin [17] and the Berkeley NOW project [16]). By simply using off-the-shelf PCs, a very powerful workstation cluster can be created, and this can provide a high amount of parallelism at relatively low cost. Since NOWs are put together over time, the machines tend to have different capabilities, and this leads to a heterogenous collection of machines rather than a homogenous collection in which all the machines have identical capabilities.

One fundamental operation that is used in such clusters is that of broadcast (this is a primitive in many message passing systems such as MPI [1,6,8]). In addition, it is used as a primitive in many parallel algorithms. The main objective of a broadcast operation is to quickly distribute the input data to the entire network for processing. Another situation arises when the system is performing a parallel search: the successful processor then needs to inform all other processors that the search has concluded successfully. Various models for heterogenous environments have been proposed in the literature. One general model is the one proposed by Bar-Noy et al. [3], where the communication costs between links are not uniform. In addition, the sender may engage in another communication before the current one is complete. An approximation factor with a guarantee of $O(\log k)$ is given for the operation of performing a multicast. Other popular models in the theory literature generally assume an underlying communication graph, with the property that only adjacent nodes in this graph may communicate.

* Research supported by NSF ITR Award CCR-0113192.

Broadcasting efficiently is an essential operation and many works are devoted to this (see [18,9,10,4,5] and references therein). In addition, for emergency notification an understanding of how to perform broadcast quickly is essential.

A simple model and algorithm was proposed by Banikazemi et al. [2]. In this model, heterogeneity among processors is modeled by a non-uniform speed of the sending processor. A heterogenous cluster is defined as a collection of processors $p_1, p_2, \ldots, p_n$ in which each processor is capable of communicating with any other processor. Each processor has a transmission time, which is the time required to send a message to any other processor in the cluster. Thus the time required for the communication is a function of only the sender. Each processor may send messages to other processors in order, and each processor may be receiving only one message at a time.

Thus a broadcast operation is implemented as a broadcast tree. Each node in the tree represents a processor of the cluster. The root of the tree is the source of the original message. The children of a node $p_i$ are the processors that receive the message from $p_i$. The completion time of a node is the time at which it completes receiving the message from its parent. The completion time of the children of $p_i$ is $c_i + j \times t_i$, where $c_i$ is the completion time of $p_i$, $t_i$ is the transmission time of $p_i$ and $j$ is the child number. In other words, the first child of $p_i$ has a completion time of $c_i + t_i$ ($j = 1$), the second child has a completion time of $c_i + 2t_i$ ($j = 2$), etc. See Figure 1 for an example.
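The completion-time rule for a broadcast tree can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function name `completion_times`, the dictionary-based tree representation, and the sample instance are all assumptions made for this example.

```python
def completion_times(t, children, root=0):
    """Completion time of every node in a broadcast tree.

    t        -- dict: node -> transmission time (hypothetical representation)
    children -- dict: node -> list of children in the order they are served
    The j-th child of p completes at c[p] + j * t[p].
    """
    c = {root: 0}            # the source holds the message at time zero
    stack = [root]
    while stack:
        p = stack.pop()
        for j, child in enumerate(children.get(p, []), start=1):
            c[child] = c[p] + j * t[p]
            stack.append(child)
    return c


# Illustrative instance (an assumption): source with transmission time 1,
# one processor with time 2 and five processors with time 3.
t = {0: 1, 1: 3, 2: 2, 3: 3, 4: 3, 5: 3, 6: 3}
children = {0: [1, 2, 3, 4], 1: [5], 2: [6]}
c = completion_times(t, children)
print(c, "broadcast time:", max(c.values()))
```

The broadcast time of a tree is simply the largest completion time over its nodes; for the tree above it is 4.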

[Figure 1: broadcast trees (a) FNF and (b) Optimal Solution]

Fig. 1. An example in which FNF does not produce an optimal solution. Transmission times of processors are inside the circles. Times at which nodes receive a message are also shown.

A commonly used method to find a broadcast tree is referred to as the “Fastest Node First” (FNF) technique [2]. This works as follows: In each iteration, the algorithm chooses a sender from the set of processors that have received the message (set $S$) and a receiver from the set of processors that have not yet received the message (set $R$). The algorithm picks the sender $s \in S$ so that $s$ can finish the transmission as early as possible, and chooses the receiver $r \in R$ as the processor with the minimum transmission time in $R$. Then $r$ is moved from $R$ to $S$ and the algorithm continues. The intuition is that sending the message to fast processors first is a more effective way to propagate the message quickly. This technique is very effective and easy to implement. In simulations it works extremely well and in fact frequently finds optimal solutions [2]. However, there are situations in which this method fails to find an optimal solution. A simple example is shown in Figure 1.
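The FNF rule described above can be sketched with a priority queue keyed by the time at which each informed processor could finish its next send. This is a minimal sketch under stated assumptions: the function name `fnf_completion_times` and the sample instance are hypothetical, and ties between senders are broken arbitrarily by the heap.

```python
import heapq


def fnf_completion_times(source_time, other_times):
    """Completion times of the non-source processors under Fastest Node First.

    Receivers are served in order of increasing transmission time; each is
    assigned to the informed processor that can finish a send earliest.
    """
    # Heap entries: (time the sender's next send would finish, its transmission time)
    senders = [(source_time, source_time)]
    completion = []
    for t_recv in sorted(other_times):
        finish, t_send = heapq.heappop(senders)
        completion.append(finish)
        heapq.heappush(senders, (finish + t_send, t_send))  # sender starts its next send
        heapq.heappush(senders, (finish + t_recv, t_recv))  # receiver becomes a sender
    return completion


# Illustrative instance (an assumption): source time 1, one processor with
# time 2 and five processors with time 3.
times = fnf_completion_times(1, [2, 3, 3, 3, 3, 3])
print(times, "broadcast time:", max(times))
```

On this instance FNF finishes at time 5, while a schedule of length 4 exists: the source first sends to a processor with transmission time 3 and then to the processor with time 2, and each of these two relays the message once while the source serves two more processors.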

Despite several non-trivial advances in the understanding of the fastest node first method by Liu [13] (see also work by Liu and Sheng [15] in SPAA 2000), it was not well understood how this algorithm performs in the worst case. For example, can we show that in all instances the FNF heuristic will find solutions close to optimal? Liu [13] (see also [15]) shows that if there are only two classes of processors, then FNF produces an optimal solution. In addition, if the transmission time of every slower processor is a multiple of the transmission time of every faster processor, then again the FNF heuristic produces an optimal solution. So, for example, if the transmission time of the fastest processor is 1 and the transmission times of all other processors are powers of 2, then the algorithm produces an optimal solution. It immediately follows that by rounding all transmission times to powers of 2 we can obtain a solution using FNF whose cost is at most twice the cost of an optimal solution⁴. However, this still did not explain the fact that this heuristic does much better in practice. Recently, Khuller and Kim [12] showed that the FNF heuristic actually produces an optimal solution for the problem of minimizing the sum of completion times. This property is used to show that the FNF method has a performance ratio of at most 1.5 when compared to the optimal solution for minimizing broadcast time. In addition, the performance ratio of FNF is at least $25/22$. As a corollary of the above approximation result, it is shown that if the transmission times of the fastest $n/2$ processors are in the range $[1 \ldots C]$ then FNF produces a solution with makespan at most $T_{OPT} + C$. It is also shown that the problem is NP-hard, so unless $P = NP$ there is no polynomial time algorithm for finding an optimal solution.

It was conjectured in [12] that there is a polynomial time approximation scheme (PTAS) for this problem. In this paper we prove this conjecture. However, this algorithm is not practical due to its high running time, albeit polynomial.

2 Problem Definition

We are given a set of processors $p_0, p_1, \ldots, p_n$ and there is one message to be broadcast to all the processors. Each processor $p_i$ can send a message to another processor with transmission time $t_i$ once it has received the message. Each processor can be either sending a message or receiving a message at any point of time. Without loss of generality, we assume that $t_1 \le t_2 \le \cdots \le t_n$ and $t_0 = 1$. Also we assume that $p_0$ has the message at time zero.

We define the completion time of processor $p_i$ to be the time when $p_i$ has received the message. Our objective is to find a schedule that minimizes $\max_i c_i$, where $c_i$ is the completion time of processor $p_i$. In other words, we want to find a schedule that minimizes the time required to send the message to all the processors.

⁴ One approach to obtain a PTAS might be rounding to powers of $(1+\epsilon)$. However, this does not work since their proof works only when the completion times of all processors are integers.

We recall the following definition from [12].

Definition 1. We define the number of fractional blocks as the number of (fractional) messages a processor can send by the given time. In other words, given a time $T$, the number of fractional blocks of processor $p_i$ is $(T - c_i)/t_i$.
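As a small illustration of the definition, the following Python sketch totals the fractional blocks of a schedule at a given time. The function name `total_fractional_blocks` and the sample numbers are assumptions; so is the clamping at zero for processors that have not yet received the message, which the definition above does not spell out.

```python
def total_fractional_blocks(T, completion, transmission):
    """Sum over processors of (T - c_i) / t_i, the fractional number of
    messages each processor could have sent by time T.

    Processors with c_i > T contribute zero (an assumption of this sketch).
    """
    return sum(max(0.0, (T - c) / t) for c, t in zip(completion, transmission))


# Hypothetical schedule: completion times 0, 1, 2 and transmission times 1, 2, 3.
blocks = total_fractional_blocks(4, [0, 1, 2], [1, 2, 3])
print(blocks)  # 4/1 + 3/2 + 2/3
```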

Our proof makes use of the following results from [12].

Theorem 1. [12] The Fastest Node First algorithm maximizes the total number of fractional blocks for any value $T$.

3 Approximation Scheme

We now describe a polynomial time approximation scheme for the problem of performing broadcast in the minimum possible time. Unfortunately, the algorithm has a very high running time when compared to the fastest node first heuristic.

We will assume that we know the broadcast time $T$ of the optimal solution. Since $t_0 = 1$, we know that the minimum broadcast time $T$ is between $1$ and $n$, and we can try all possible values of the form $(1+\epsilon)^j$ for some fixed $\epsilon > 0$ and $j = 1, \ldots, \lceil \log_{1+\epsilon} n \rceil$. In this guessing process we lose a factor of $(1+\epsilon)$.

Let $\epsilon' > 0$ be a fixed constant. We define a set of fast processors $F$ as all processors whose transmission time is at most $\epsilon' T$. Formally, $F = \{ p_j \mid t_j \le \epsilon' T \}$. Let $S$ be the set of remaining (slow) processors. We partition $S$ into collections of processors of similar transmission speeds. For $i = 1, \ldots, k$, define $S_i = \{ p_j \mid \epsilon' T (1+\epsilon')^{i-1} < t_j \le \epsilon' T (1+\epsilon')^{i} \}$, where $k$ is $\lceil \log(1/\epsilon') / \log(1+\epsilon') \rceil$.

We first send messages to processors in $F$ using FNF. We prove that there is a schedule with broadcast time at most $(1 + O(\epsilon'))T$ such that all processors in $F$ receive the message first. We then find a schedule for slow processors based on a dynamic programming approach.
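The guessing step and the fast/slow partition can be sketched as follows. This is a minimal illustration under stated assumptions: the names `candidate_times` and `split_fast_slow` are hypothetical, `eps_p` stands for $\epsilon'$ with $0 < \epsilon' < 1$, fast processors are those with transmission time at most $\epsilon' T$, slow processors are grouped into geometric classes of ratio $(1+\epsilon')$ above that threshold, and placing processors slower than the last class boundary into the last class is a choice made only for this sketch.

```python
def candidate_times(n, eps):
    """Guesses (1 + eps)^j for the optimal broadcast time, up to the first one >= n."""
    guesses, value = [], 1 + eps
    while True:
        guesses.append(value)
        if value >= n:
            return guesses
        value *= 1 + eps


def split_fast_slow(times, T, eps_p):
    """Split transmission times into the fast set and k geometric slow classes."""
    threshold = eps_p * T
    # Upper bounds eps_p * T * (1 + eps_p)^i of the slow classes, i = 1..k,
    # where k is the first index whose bound reaches T.
    bounds, bound = [], threshold
    while bound < T:
        bound *= 1 + eps_p
        bounds.append(bound)
    fast = [t for t in times if t <= threshold]
    classes = [[] for _ in bounds]
    for t in times:
        if t <= threshold:
            continue
        # First class whose upper bound covers t; slower processors go to the
        # last class (assumption of this sketch).
        i = next((idx for idx, b in enumerate(bounds) if t <= b), len(bounds) - 1)
        classes[i].append(t)
    return fast, classes


print(candidate_times(8, 1.0))
fast, classes = split_fast_slow([1, 2, 5, 6, 7.5, 8, 11, 20], T=10, eps_p=0.5)
print(fast, classes)
```

With $T = 10$ and $\epsilon' = 0.5$ the threshold is 5 and there are two slow classes, with upper bounds 7.5 and 11.25.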

Schedule for $F$: We use the FNF heuristic for the set $F$ to generate a partial broadcast schedule. Assume that the schedule for $F$ has a broadcast time of $T_{FNF}$. In this schedule every processor $p_j \in F$ becomes idle at some time between $T_{FNF} - t_j$ and $T_{FNF}$.

We will prove that there is a schedule with broadcast time at most $(1 + O(\epsilon'))T$ such that all processors in $F$ receive the message first, and then send it to the slow processors. The following lemma relates $T_{FNF}$ with the time taken by the optimal schedule to propagate the message to any $|F|$ processors.

Lemma 1. In any schedule, we need at least $T_{FNF} - 2\epsilon' T$ time units to have $|F|$ processors receive (any portion of) the message.

Proof. We prove this by contradiction. In any schedule let $N_1$ be the number of processors that have completely received the message by time $t$. In addition, let $N_2$ be the number of processors that have started, but not yet finished, receiving the message by time $t$. Suppose that at time $t < T_{FNF} - 2\epsilon' T$, we have $N_1 + N_2 \ge |F|$. First note that $N_2 \le N_1$, since each of the $N_2$ processors is getting the message from exactly one (distinct) processor among the $N_1$. This means we should have that $N_1 \ge |F|/2$.

If a schedule is able to complete sending the message to at least $|F|/2$ processors by time $t$, then the number of fractional blocks of this schedule is at least $|F|/2$. Since FNF maximizes fractional blocks by Theorem 1, we claim that FNF also has at least $|F|/2$ fractional blocks by time $t$. Let $t_F$ be the transmission time of the slowest processor in $F$. Notice that in additional time $t_F \le \epsilon' T$ all processors that had started receiving the message must have finished receiving it. In additional time $t_F \le \epsilon' T$, FNF most certainly can double the number of processors that have received the message. Thus before $T_{FNF}$, more than $|F|$ processors would have received the message; a contradiction, since $T_{FNF}$ is the earliest time at which the fastest $|F|$ processors receive the message in FNF.
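The timing behind the contradiction can be written as a short chain. This is a sketch using $t$ for the assumed time and $t_F \le \epsilon' T$ for the slowest transmission time in $F$; the annotations are explanatory labels, not notation from the paper.

```latex
\[
  t
  + \underbrace{t_F}_{\text{pending receives finish}}
  + \underbrace{t_F}_{\text{one doubling step}}
  \;\le\; t + 2\epsilon' T
  \;<\; T_{FNF}
\]
```

Both extra steps fit strictly before $T_{FNF}$ precisely because $t$ was assumed to be smaller than $T_{FNF} - 2\epsilon' T$.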

Lemma 2. There is a schedule in which all processors in $F$ receive the message no later than any processor in $S$ and the makespan of the schedule is at most $(1 + 3\epsilon')T$.

Proof. The main idea behind the proof is to show that an optimal schedule can be modified to have a certain form. Consider the set of processors of an optimal schedule that have received any portion of the message by time $T_{FNF} - 2\epsilon' T$. This consists of some fast processors $F_1$ and some slow processors $S_1$. Let $F_2 = F \setminus F_1$ and $S_2 = S \setminus S_1$. Note that $|F_2| \ge |S_1|$ since $|F_1| + |S_1| \le |F|$ (Lemma 1). In the FNF schedule, by time $T_{FNF}$ all processors in $F_1 \cup F_2$ have the message. We can now have $F_2$ send messages to all processors in $S_1$. Since $|F_2| \ge |S_1|$, each processor sends only one message and this will take additional time at most $\epsilon' T$. By time $T_{FNF} + \epsilon' T$, all processors in $F_1 \cup F_2$ certainly have the message, in addition to the processors in $S_1$. Notice that the optimal schedule now broadcasts the message to all remaining processors in additional time $T - (T_{FNF} - 2\epsilon' T)$. Thus we can also finish the broadcasting in additional time $T - (T_{FNF} - 2\epsilon' T)$. The broadcast time of this schedule is at most $T_{FNF} + \epsilon' T + T - (T_{FNF} - 2\epsilon' T) = (1 + 3\epsilon')T$.
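The three phases of the modified schedule add up as follows. The grouping and the labels under the braces are explanatory annotations added here, assuming the phases are FNF on the fast processors, one relay step to the slow processors already reached by the optimal schedule, and the remainder of the optimal schedule.

```latex
\begin{align*}
  \underbrace{T_{FNF}}_{\text{FNF on } F}
  + \underbrace{\epsilon' T}_{F_2 \text{ sends to } S_1}
  + \underbrace{T - (T_{FNF} - 2\epsilon' T)}_{\text{rest of the optimal schedule}}
  &= T + \epsilon' T + 2\epsilon' T \\
  &= (1 + 3\epsilon')\,T .
\end{align*}
```

The $T_{FNF}$ terms cancel, so the bound does not depend on how long FNF takes on the fast processors.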

Create all possible trees of $S$: For the processors in $S$, we will produce a set $\mathcal{T}$ of labeled trees $\tau$. A tree $\tau$ is any possible tree with broadcast time at most $T$ consisting of a subset of processors in $S$. Then we label a node in the tree as $i$ if the corresponding processor belongs to $S_i$ ($i = 1, \ldots, k$). We prove that the number of different trees is constant.

Lemma 3. The size of the set $\mathcal{T}$ of labeled trees is constant for fixed $\epsilon' > 0$.

Proof. First consider the size of a tree $\tau$ (that is, the number of processors in the tree). Let us denote it as $|\tau|$. Since the transmission time of processors in $S$ is greater than $\epsilon' T$, we need at least $\epsilon' T$ time units to double the number of processors that received the message. It means that given a processor as a root of the tree, within time $T$ we can have at most $2^{1/\epsilon'}$ processors receive the message. Therefore, $|\tau| \le 2^{1/\epsilon'}$. Now each node in the tree can have a different label $i$ ($i = 1, \ldots, k$). To obtain an upper bound on the number of different trees, given a tree $\tau$ we transform it to a complete binomial tree of size $2^{1/\epsilon'}$ by adding nodes labeled as $0$. Then the number of different trees is at most $(k+1)^{2^{1/\epsilon'}}$.
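To get a feeling for the size of this constant, the bound $(k+1)^{2^{1/\epsilon'}}$ can be evaluated for a few values of $\epsilon'$. The function name `tree_count_bound` is hypothetical, `eps_p` stands for $\epsilon'$ with $0 < \epsilon' < 1$, $k$ is taken as the smallest integer with $(1+\epsilon')^k \ge 1/\epsilon'$, and rounding $1/\epsilon'$ up to an integer is an assumption of this sketch.

```python
import math


def tree_count_bound(eps_p):
    """Evaluate (k + 1) ** (2 ** ceil(1 / eps_p)) for a given eps_p in (0, 1)."""
    # k: number of slow classes, smallest integer with (1 + eps_p)^k >= 1 / eps_p
    k = 1
    while (1 + eps_p) ** k < 1 / eps_p:
        k += 1
    max_tree_size = 2 ** math.ceil(1 / eps_p)  # at most 1/eps_p doublings within time T
    return (k + 1) ** max_tree_size


for eps_p in (0.5, 0.25):
    print(eps_p, tree_count_bound(eps_p))
```

The bound is 81 for $\epsilon' = 0.5$ and already $8^{16} = 281474976710656$ for $\epsilon' = 0.25$: constant for fixed $\epsilon'$, but growing very quickly as $\epsilon'$ shrinks, which is in line with the high running time mentioned earlier.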
