Algorithms for disk covering problems with the most points
ABSTRACT
Usually the covering problem requires all elements in a system to be covered. In some situations, it is very difficult to find a solution, or impossible to cover all given elements because of resource constraints. In this paper, we study the partial covering problem. This problem is also referred to as the robust k-center problem and can be applied in many fields. The partial covering problem becomes even harder when we need to determine which subset of all available elements shares the resources. Several approximation algorithms are proposed in this paper to cover the most elements. For some real-time systems, such as a battlefield communication system, the presented algorithms with polynomial-time complexity can be applied efficiently. The complexity analysis illustrates the improvement made by our algorithms over other algorithms for the partial covering problem in the literature. The experimental results show that the performance of our algorithms is much better than that of the existing 3-approximation algorithm for the robust k-center problem.
Bin Xiao
Department of Computing
Hong Kong Polytechnic University
Hung Hom, Kowloon, Hong Kong
csbxiao@comp.polyu.edu.hk
Qingfeng Zhuge, Yi He, Zili Shao, Edwin H.-M. Sha∗
Department of Computer Science
University of Texas at Dallas
Richardson, Texas 75083, USA
{qfzhuge, yxh011010, zxs015000, edsha}@utdallas.edu
KEY WORDS
Approximation algorithms, partial covering, k-center problem.
1 Introduction
On a battlefield, there are many communication units and some command centers. The command centers are responsible for successful information exchange among all mobile units. Because both the mobile units and the centers move, we try to cover the most mobile units with a limited number of control centers under transmission range constraints. This kind of problem can be analyzed as a clustering problem, that is, clustering a set of points into a few groups. Clustering algorithms have been explored and deployed in many fields [1, 2], such as data compression, information retrieval, database applications, image processing, facility location, and clustering nodes in ad hoc networks. Given that the number of groups is k, the clustering problem is also referred to as the k-center problem. We define the k-center problem as follows: Let S be a set of n objects, generally represented as points in a d-dimensional metric space. Given an integer k ≤ n, compute a k-clustering of S of the smallest possible size. In other words, the k-center problem is formulated as covering S by k congruent disks of the smallest possible size. We always assume that the k disks have the same radius r. In this paper, we only consider the plane (d = 2). However, all algorithms can be easily extended to arbitrary metrics.
∗This work is partially supported by TI University Program, NSF EIA-0103709 and Texas ARP 009741-0028-2001, USA.
Many heuristic algorithms [3, 4, 5] have been studied for the k-center problem when k ≥ 3. There are two major directions. In the first, the number of disks is fixed to k and the algorithms try to minimize the radius r of the disks. In the second, the radius of each disk is fixed to r and the algorithms look for the minimum number of disks that cover all points. For the first approach, Gonzalez [6] gave a 2-approximation algorithm for the k-center problem in any metric space with time complexity O(k · n). In the same paper, Gonzalez proved that there is no polynomial-time algorithm with an approximation factor smaller than 2 unless P=NP. Feder and Greene improved the complexity of the 2-approximation algorithm to O(n log k) [7]. Some (1 + ε)-approximation algorithms [7, 8] have been studied, but their running time is not polynomial. For the second approach, a polynomial-time approximation scheme can achieve approximation factors arbitrarily close to 1 [7]. Gonzalez proposed an 8-approximation algorithm for the fixed-size disk covering problem in [9]. Huang devised a 7-approximation algorithm for the 2-dimensional metric space, and a 21-approximation algorithm for the 3-dimensional metric space [10]. In [11], Franceschetti summarized the best known results in a table. However, to achieve a smaller approximation factor (α < 7), the polynomial running time becomes very large. For example, the algorithm in [11] for an approximation factor α = 6 requires a running time of O(k · n) with k ≈ 1016.
A topic studied in recent years for the k-center problem is covering only part of the points [12, 13]. In some cases, there is no solution that covers all n points in a plane by k disks with a fixed radius. Examples include building facilities to provide service within a fixed radius to a certain fraction of the population, or allocating command centers on a battlefield to support communications among mobile units. The problem defined in [12] is called the robust k-center problem, which is to cover at least p points (p ≤ n) by k disks with radius r. In this paper, we want to cover the most points (at least p points) among n points in a plane by k available disks. Those k disks are assumed to have the same radius r. This problem is NP-complete: when we set p = n, covering at least p points is the same as the k-center problem.
In order to cover as many points as possible (at least p out of n) with k disks, this paper makes the following contributions:
• Present a new 2-approximation algorithm, RKC2, for the robust k-center problem. In [12], the authors only presented the best result with a 3-approximation algorithm. When p is close to n, this new 2-approximation algorithm runs in polynomial time. (Section 2)
• Propose a greedy algorithm to cover part of the points. (Section 3.1)
• Modify the RKC2 algorithm into the RKCP2 algorithm for the purpose of covering the most points. (Section 3.2)
• Adapt the 3-approximation algorithm for the robust k-center problem in [12] into the RKCP3 algorithm, which is also applicable to the problem of covering the most points. (Section 3.3)
The rest of this paper is organized as follows. In Section 2 we introduce the RKC2 algorithm for the robust k-center problem and prove that it is a 2-approximation. Section 3 presents different heuristic algorithms to cover the most points and illustrates them with one example. Experimental results are presented in Section 4, and conclusions are drawn in Section 5.
2 New RKC2 Algorithm
The robust k-center problem defined in this paper only considers points in a plane (the dimension is 2). Let n be the number of all points and p be a given positive integer such that p ≤ n. If we have k identical disks with radius r, the robust k-center problem asks whether the k disks can cover at least p points. This problem is NP-complete. If the n points in the plane are the clients we want to serve and the centers of the k disks are the facility locations that cover p of the n points, then the question studied here is the same as the facility location problem with outliers defined in [12].
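The decision question above, whether a given placement of k disks covers at least p points, can be checked directly. The following is a minimal sketch; the function names `covered_count` and `is_robust_cover` and the sample coordinates are illustrative assumptions, not part of the original formulation.

```python
import math

def covered_count(points, centers, r):
    """Count the points lying within distance r of at least one center."""
    return sum(
        1 for q in points
        if any(math.dist(q, c) <= r for c in centers)
    )

def is_robust_cover(points, centers, r, p):
    """True if the disks of radius r placed at `centers` cover at least p points."""
    return covered_count(points, centers, r) >= p

# Hypothetical data: two tight groups plus one outlier, k = 2 disks.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (50, 50)]
centers = [(0, 0), (10, 10)]
print(covered_count(points, centers, 1.5))        # 4
print(is_robust_cover(points, centers, 1.5, 4))   # True
print(is_robust_cover(points, centers, 1.5, 5))   # False: the outlier is missed
```

Checking one placement is cheap; the difficulty of the problem lies in choosing the centers.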
Figure 1 illustrates how robust measures lead to better
clustering solutions. In the example, to cover all points
requires at least k = 4 disks. However, if we need to cover
the most points with only 2 disks, Figure 1(b) shows the
result to cover p = 13 points.
Figure 1. (a) To cover all points with k=4; (b) To cover p points with k=2, p=13.
Following the same definition of the approximation factor as in [12], we present our RKC2 algorithm below and prove that it is a 2-approximation of the optimal cost. For some cases, we can even determine that it is impossible to cover p points by k disks. Let V be the set of all points in a plane and |V| = n. We suppose that the set of points satisfies the triangle inequality. p is a given positive integer with p ≤ n. Let C be the set in our algorithm that contains the points chosen as the centers of the k disks. $T_i$ is a temporary point set that contains p points from V. Each disk has the same radius r. We use Flag below to record whether there is a solution for k disks to cover some p points of the point set V. dist(a, b) is the Euclidean distance between the points a and b. The algorithm RKC2 is as follows:
• Flag = No
• For i = 1, ..., $\binom{n}{p}$, do
  – Select p different points from V, which generates a new point set $T_i$ with $T_i \neq T_j$ (j = 1, ..., i − 1)
  – Arbitrarily select one point from $T_i$, let this point be $c_1$, C = {$c_1$}
  – For j = 2, ..., k, do
    ∗ For a point t ∈ $T_i$ and t ∉ C, let $d_j(t) = \min_{c_l \in C} \mathrm{dist}(t, c_l)$
    ∗ Let $d_j = \max_{t \in T_i} d_j(t)$
    ∗ Let $c_j$ be the point t at which $d_j(t)$ attains this maximum value
    ∗ C = C ∪ {$c_j$}
  – If $d_k \leq 4r$
    ∗ If k disks with centers in C and radius r can cover at least p points in V
        Return Yes
    ∗ Else
        Flag = Yes
• If Flag = No
    It is impossible to cover p points with k disks
• Return No
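The steps listed above can be transcribed almost literally. The sketch below is one possible reading, not a reference implementation: the names `farthest_first` and `rkc2`, the choice of the first point of each subset as $c_1$, the convention $d_k = 0$ when k = 1, and the sample data are all assumptions made for illustration. Because it enumerates every p-subset of V, it is only practical for small inputs or for p close to n.

```python
import math
from itertools import combinations

def farthest_first(T, k):
    """Pick k centers from T as in the inner loop; return (C, d_k)."""
    C = [T[0]]                       # c_1: arbitrary choice, here the first point
    d_k = 0.0                        # assumption: d_k = 0 when k = 1
    for _ in range(2, k + 1):
        # d_j(t) is the distance from t to its nearest center already in C;
        # the next center is the point that maximizes it.
        t, d_k = max(((t, min(math.dist(t, c) for c in C)) for t in T),
                     key=lambda pair: pair[1])
        C.append(t)
    return C, d_k

def rkc2(V, k, p, r):
    """Return (C, flag). C is the center list for a "Yes" answer, else None.

    flag mirrors Flag in the listing: it is True once some subset passed
    the d_k <= 4r test without yielding a cover of p points.
    """
    flag = False
    for T in combinations(V, p):     # all C(n, p) point sets T_i
        C, d_k = farthest_first(list(T), k)
        if d_k <= 4 * r:
            covered = sum(any(math.dist(v, c) <= r for c in C) for v in V)
            if covered >= p:
                return C, flag       # "Return Yes"
            flag = True
    return None, flag                # "Return No"

# Hypothetical instance: five nearby points and one outlier.
V = [(0, 0), (1, 0), (0, 1), (3, 0), (4, 0), (50, 50)]
print(rkc2(V, k=2, p=5, r=1.5))      # ([(0, 0), (4, 0)], False)
```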
We explain how this algorithm works. First, Flag is initialized to No. Whenever the above algorithm finds that a particular $T_i$ can be covered by the given k disks, Flag is set to Yes. In the outer loop, p different points from V are selected in every iteration. Those p points yield a new point set $T_i$. Because there are $\binom{n}{p}$ different point sets, this outer loop does not end until i = $\binom{n}{p}$, which means that all possible selections have been checked. Every selected point set $T_i$ includes only p points. Subsequently, one point is arbitrarily chosen from $T_i$ and becomes the center of one disk. We want to find the other k − 1 center positions for the disks. These k points form the center set C. The next center added to C is the point in $T_i$ that has the maximum distance from the points already in C. When there are k points in C, the RKC2 algorithm has found k centers for the p points chosen for $T_i$. $d_k$ is the distance from $c_k$ (the kth center) to the nearest of the other centers ($c_1$, ..., $c_{k-1}$), which is the largest such distance over all points in $T_i$. In other words, if the k disks have diameter $d_k$, they are enough to cover all points in $T_i$. If $d_k > 4r$, it is impossible for k disks with radius r to cover all points in $T_i$, which will be proved by Lemma 2.1. Otherwise, Flag is set to Yes to show that all points in $T_i$ can be covered by only k disks. Furthermore, if k disks with centers in C and radius r can cover p points in V, the RKC2 algorithm has reached a solution to the robust k-center problem and ends by returning Yes. Theorem 2.1 will prove it.
After $\binom{n}{p}$ cases are tested and Flag is still No, it is guaranteed that there is no way to cover p points from V by k disks with radius r. Since we still cannot find a way to cover p points, the algorithm returns No.
In the RKC2 algorithm, the process of generating a center set C from the given p points is similar to the greedy algorithm in [6], which has been proved to be a 2-approximation algorithm. The idea is to execute the main loop for one more iteration, and let $d_{k+1}$ be the resulting distance to the (k + 1)th center to be added. The new center set $C' = C \cup \{c_{k+1}\}$ then has k + 1 points in it. By the definition of $d_{k+1}$, the distance between any two points in $C'$ is at least $d_{k+1}$. Furthermore, any k-clustering must have two points of $C'$ in the same cluster because $|C'| = k + 1$. Thus, any k-clustering of $T_i$ must have radius no less than $d_{k+1}/2$. For the RKC2 algorithm, however, the radius of the k-clustering with centers at C is exactly $d_{k+1}$. Then we have the lemma below:
Lemma 2.1. Given p points in $T_i$, the RKC2 algorithm provides a factor 2 approximation to the minimum k-clustering of $T_i$. In other words, the radius of the disks with centers at the points in C is no larger than 2 times the radius of any k-clustering disks of $T_i$.
Theorem 2.1. Given a set V of n points from an arbitrary metric, an integer k ≤ n, and an integer p, the RKC2 algorithm is a 2-approximation algorithm for the robust k-center problem.
Proof. From Lemma 2.1, it is obvious that the RKC2 algorithm generates a 2-approximation solution when $d_k \leq 4r$ happens during the execution of the algorithm. This is because when $d_k \leq 4r$, the k disks with radius 2r can cover at least p points. If $d_k > 4r$ for every selection of p points from V ($\binom{n}{p}$ times), it is impossible for k disks with radius r to cover p points. That means no solution exists. Thus the RKC2 algorithm cannot find a center set C with radius 2r to cover p points.
Figure 2. To cover 2 points with a disk by the greedy algorithm.
Given p points in $T_i$, generating k centers requires O(p · k) time. A way to reduce the time complexity to O(p · log k) is shown in [7]. Since the RKC2 algorithm executes $\binom{n}{p}$ times for the different point sets $T_i$, the time complexity is $O(\binom{n}{p} \cdot p \log k)$. When we want to cover the most points, p should be the same as n and the time complexity for the robust k-center problem becomes O(n · log k), which is polynomial. Furthermore, when p is close to n, the RKC2 algorithm still runs in polynomial time: if p = n − c for a constant c, then $\binom{n}{p} = \binom{n}{c} = O(n^c)$.
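To make the dependence on p concrete, the number of point sets enumerated by the outer loop of RKC2 can be computed directly. This is a small illustration only; the helper name `rkc2_subset_count` and the choice n = 30 are assumptions.

```python
import math

def rkc2_subset_count(n, p):
    """Number of p-point subsets T_i that the outer loop of RKC2 enumerates."""
    return math.comb(n, p)

for n, p in [(30, 30), (30, 29), (30, 28), (30, 15)]:
    print(n, p, rkc2_subset_count(n, p))
# 30 30 1
# 30 29 30
# 30 28 435
# 30 15 155117520
```

The count stays small while p is within a constant of n and grows rapidly as p moves toward n/2.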
3 Different Algorithms to Cover the Most Points
Given n points in a plane, sometimes it is impossible for all points to be covered by k disks with radius r. Even if a solution exists, finding it requires exponential computation time with the methods known so far, since the k-center problem is NP-complete. In a real-time system, such as covering all communication units on a battlefield, this is too costly. Covering at least p points with k disks is also NP-complete. The reason is that when we set p = n, the robust k-center problem becomes the standard k-center problem. In this section, we show some polynomial-time algorithms that cover as many points as possible.
3.1 Greedy Algorithm
For n points in a plane to be covered by disks with radius r, there are at most $2\binom{n}{2}$ different cases for disk covering. This is because for any two points a and b, there are at most two center positions whose distance to both points is exactly r. In Figure 2, there are 2 different disk coverings for the points a and b, which are disk $o_1$ and disk $o_2$. Because disk $o_3$ covers the same points as $o_2$, it is not necessary to consider it any further. In other words, we can always move a disk so that it is uniquely represented by two points on its boundary circle. For the points c and d in Figure 2, there is only one case for disk covering since the distance between c and d is 2r (the diameter). A disk covering is said to be unique only if it differs from every other disk covering in at least one point. Let D be the set of different disk coverings and G be the disk set returned by the greedy algorithm, which has only k units. Below is the greedy algorithm to cover the most points.
• G = ∅
• For any two points, we generate two new disk units to cover them if possible. Suppose there are m such unique disk coverings with m ≤ $2\binom{n}{2}$. These disks form a disk set D = {$D_1$, $D_2$, ..., $D_m$}.
• For i = 1, ..., k, do
  – Select the disk that covers the maximum number of points from D. Let this disk be $D_i$, D = D − {$D_i$}, G = G ∪ {$D_i$}.
  – Remove the points covered by $D_i$ from all disks in D.
• Return G.
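The greedy steps above can be sketched as follows. The names `pair_centers` and `greedy_cover` are hypothetical, and three details are assumptions beyond the listing: a disk centered at each point is added so that isolated points can still be covered, a small tolerance `eps` absorbs rounding for points on a disk boundary, and the loop stops early once no remaining disk covers an uncovered point.

```python
import math
from itertools import combinations

def pair_centers(a, b, r):
    """Centers of the radius-r disks whose boundary passes through a and b."""
    d = math.dist(a, b)
    if d == 0 or d > 2 * r:
        return []                                   # no such disk exists
    mx, my = (a[0] + b[0]) / 2, (a[1] + b[1]) / 2   # midpoint of a and b
    h = math.sqrt(max(r * r - (d / 2) ** 2, 0.0))   # midpoint-to-center distance
    ux, uy = (b[1] - a[1]) / d, (a[0] - b[0]) / d   # unit vector perpendicular to ab
    if h == 0:
        return [(mx, my)]                           # d = 2r: a single disk
    return [(mx + h * ux, my + h * uy), (mx - h * ux, my - h * uy)]

def greedy_cover(points, k, r, eps=1e-9):
    """Return up to k disks as (center, set of newly covered point indices)."""
    candidates = list(points)                       # assumption: one disk per point
    for a, b in combinations(points, 2):
        candidates.extend(pair_centers(a, b, r))
    unique = {}                                     # one disk per distinct covered set
    for c in candidates:
        cov = frozenset(i for i, q in enumerate(points)
                        if math.dist(q, c) <= r + eps)
        unique.setdefault(cov, c)
    D = [(c, set(cov)) for cov, c in unique.items()]
    G = []
    for _ in range(k):
        if not D:
            break
        best = max(range(len(D)), key=lambda j: len(D[j][1]))
        center, cov = D.pop(best)
        if not cov:                                 # assumption: stop when nothing is left
            break
        G.append((center, cov))
        for _, other in D:                          # remove covered points from all disks
            other -= cov
    return G

points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (50, 50)]
G = greedy_cover(points, k=2, r=1.5)
print([len(cov) for _, cov in G])                   # [3, 2]
```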
There are at most $2\binom{n}{2}$ different disk coverings given n points. Constructing such disks consumes time $O(2\binom{n}{2})$. In one iteration, selecting the disk that covers the maximum number of points needs time $O(2\binom{n}{2})$. After including $D_i$ in the disk set G, the greedy algorithm refreshes the number of points covered by all other remaining disks in D. Thus it requires an extra time $O(2\binom{n}{2})$. In the greedy algorithm, we have k iterations. The computation complexity of the greedy algorithm is therefore $O(2\binom{n}{2}) + k \cdot O(2\binom{n}{2} + 2\binom{n}{2}) = O(2\binom{n}{2}) + O(4k\binom{n}{2}) = O(k \cdot n^2)$.
3.2 RKCP2 Algorithm
The robust k-center problem does not require covering all points with the given k disks. With a small change to the RKC2 algorithm that solves the robust k-center problem in Section 2, we can apply it to the case of covering the most points. Here we assume p = n, which means the algorithm tries to cover all available points. All points are in the point set V, and C is a point set of k units, which are the positions of the centers of the k disks. Below is the RKCP2 algorithm for covering the most points, derived from the 2-approximation algorithm RKC2 for the robust k-center problem.
• Arbitrarily select one point from V, let this point be $c_1$, C = {$c_1$}
• For i = 2, ..., k, do
  – For a point t ∈ V and t ∉ C, let $d_i(t) = \min_{c_l \in C} \mathrm{dist}(t, c_l)$
  – Let $d_i = \max_{t \in V} d_i(t)$
  – Let $c_i$ be the point t at which $d_i(t)$ attains this maximum value
  – C = C ∪ {$c_i$}
• Return C
The time complexity of the above RKCP2 algorithm can be reduced to O(n · log k) [7].
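The RKCP2 steps amount to a farthest-first traversal of V. The sketch below is a direct O(n · k) version that keeps, for every point, its distance to the nearest center chosen so far; it is not the O(n · log k) technique of [7]. The function name `rkcp2`, the choice of the first point as $c_1$, and the sample data are assumptions.

```python
import math

def rkcp2(V, k):
    """Farthest-first selection of k centers from V."""
    C = [V[0]]                                   # c_1: arbitrary, here the first point
    nearest = [math.dist(t, V[0]) for t in V]    # distance to the closest center so far
    for _ in range(2, k + 1):
        i = max(range(len(V)), key=lambda j: nearest[j])   # farthest point from C
        C.append(V[i])
        nearest = [min(d, math.dist(t, V[i])) for d, t in zip(nearest, V)]
    return C

V = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (50, 50)]
C = rkcp2(V, 3)
covered = sum(any(math.dist(v, c) <= 1.5 for c in C) for v in V)
print(C)        # [(0, 0), (50, 50), (11, 10)]
print(covered)  # 6
```

The centers returned are always input points, so the number of points covered with radius r has to be counted afterwards, as in the last lines of the example.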
Figure 3. (a) 30 points can be covered by 6 disks with radius 180m; (b) The result by the greedy algorithm.
3.3 RKCP3 Algorithm
In [12], the authors present a 3-approximation algorithm for the robust k-center problem, which can also be adapted to solve the problem of covering the maximum number of points. The input of that 3-approximation algorithm is a radius r for the k disks and a set V of n points from an arbitrary metric space. It computes a solution S such that cost(S) ≤ 3r. For the n points in V, it assumes that there exists an optimal solution O such that cost(O) = r.
For each point $v_i$ ∈ V, $G_i$ ($E_i$, resp.) denotes the set of points that are within distance r (3r, resp.) from $v_i$. The sets $G_i$ are referred to as disks of radius r and the sets $E_i$ as the corresponding expanded disks of radius 3r. The RKCP3 algorithm for covering the most points can be described as follows, with a small modification to the original 3-approximation algorithm in [12].
• Construct all disks and corresponding expanded disks.
• For i = 1, ..., k, do
  – Let $G_i$ be the heaviest disk, i.e., the disk that contains the most uncovered points.
  – Mark as covered all points in the corresponding expanded disk $E_i$.
  – Update all the disks and expanded disks, i.e., remove from them all covered points.
• Return {$G_1$, $G_2$, ..., $G_k$}.
The algorithm above has been proved to be a 3-approximation algorithm for the robust k-center problem in [12]. The time complexity for the algorithm should be O(k · n).
Figure 4. (a) The result by the RKCP2 2-approximation algorithm; (b) The result by the RKCP3 3-approximation algorithm.
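The RKCP3 steps listed in this section can be sketched as below. The function name `rkcp3`, the brute-force construction of the sets $G_i$ and $E_i$, and the early stop when no uncovered point remains are assumptions of this sketch rather than details taken from [12]. The example shows a point that is removed by an expanded disk without being counted as covered.

```python
import math

def rkcp3(V, k, r):
    """Return chosen disks as (center, sorted indices counted as covered)."""
    n = len(V)
    # G[i]: indices within r of v_i; E[i]: indices within 3r of v_i.
    G = [{j for j in range(n) if math.dist(V[i], V[j]) <= r} for i in range(n)]
    E = [{j for j in range(n) if math.dist(V[i], V[j]) <= 3 * r} for i in range(n)]
    uncovered = set(range(n))
    chosen = []
    for _ in range(k):
        # heaviest disk: the one containing the most uncovered points
        i = max(range(n), key=lambda j: len(G[j] & uncovered))
        if not (G[i] & uncovered):
            break                        # assumption: stop when nothing is left
        chosen.append((V[i], sorted(G[i] & uncovered)))
        uncovered -= E[i]                # the expanded disk removes its points
    return chosen

# Hypothetical data: point (3, 0) lies between the disk and its expanded disk.
V = [(0, 0), (1, 0), (0, 1), (3, 0)]
print(rkcp3(V, k=2, r=1.5))              # [((0, 0), [0, 1, 2])]
```

Only one of the two available disks is used here, and the point (3, 0) is dropped after the first decision even though a second disk could have covered it.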
3.4 Example for the Above Algorithms
To cover the most points in a plane, we show an example to which the different algorithms mentioned above are applied. Given 30 points randomly scattered in a square of 1,000 meters × 1,000 meters, an optimal solution in which all points are covered by 6 disks is shown in Figure 3(a). The results of the greedy algorithm are illustrated in Figure 3(b). The results of the RKCP2 and RKCP3 algorithms are shown in Figure 4(a) and Figure 4(b) respectively. In the greedy algorithm, the center position of a disk is determined by 2 points on its edge. For the RKCP3 algorithm, we show a bigger circle with radius 720 meters for every covering disk in Figure 4(b). The number of points covered is counted from those in the small disks (with radius 180 meters). All points that lie between a small disk and its corresponding big circle are removed from the system by the 3-approximation algorithm. Because of this property, point 7 is not covered by the algorithm: it is removed from the system right after the decision made for the first disk.
4 Simulation Results
We evaluate the performance of the different algorithms by covering different numbers of points scattered in an area with disks. The performance of each algorithm is measured by the number of points covered by the given disks and by how many disks are used. In the simulation, we compare the results of the different algorithms.
The positions of all points are randomly distributed in an area of 1,000 × 1,000 meters. The number of points varies from 20 to 279 during the simulation. All disks have the same radius of 180 meters. During the simulation, the center position of a disk can be anywhere inside the square area. The number of disks assigned depends on the total number of points. The relationship between the number of points and the number of disks follows Equation 1 below. D represents how many disks are available in the system, while P denotes the number of points inside the simulated square area.
$D = \left\lfloor \frac{P}{20} \right\rfloor + 3$ (1)
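As a quick check of Equation 1, the sketch below evaluates it for the point counts used in the simulation. It assumes that the brackets in Equation 1 denote the floor function, which is the reading consistent with the 16 disks stated for 270 points in this section; the function name `available_disks` is hypothetical.

```python
def available_disks(P):
    """Equation (1) under the floor reading: D = floor(P / 20) + 3."""
    return P // 20 + 3

for P in (20, 50, 100, 200, 270):
    print(P, available_disks(P))
# 20 4
# 50 5
# 100 8
# 200 13
# 270 16
```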
Table 1 lists the performance of the greedy, 2-approximation (RKCP2) and 3-approximation (RKCP3) algorithms under the model defined above. Inside the square, 20, 50, 100, 200 and 270 points are randomly scattered for one simulation event respectively. Column “# p” shows the number of points covered by a specific algorithm, and the following column “%” gives the percentage of covered points among all available points. During the simulation, the number of available disks changes according to the number of all points in the system, as defined in Equation 1. To cover as many points as possible, different algorithms consume different numbers of disks. Column “# d” gives the number of disks used by each of the three algorithms.
For the system with 270 points, there are 16 disks available. However, the greedy algorithm can cover all points with only 15 disks in some cases. The RKCP3 algorithm derived from the 3-approximation algorithm in [12] is not efficient because it uses only part of the available disks. We can see from Table 1 that when the system has 270 points, 4 disks with radius 720 meters (180 meters * 3) are probably enough to cover them all. In contrast, the 2-approximation algorithm always tries to cover points with all disks. Under a few circumstances, such as fewer points in the area and fewer disks available, the RKCP3 algorithm yields better performance than the RKCP2 algorithm, which can be seen from Table 1 for the system with 50 points. Most of the time, the greedy algorithm performs the best while the 3-approximation algorithm generates the worst results.
We show the simulation results of the different algorithms in Figure 5 when the number of points increases from 20 to 279. The curves in Figure 5 represent the number of points covered by the different algorithms. The straight line represents the total number of points in the simulation environment. The better an algorithm performs in covering points, the closer its curve is to the straight line. Thus, the greedy algorithm always yields a good performance, especially for a large number