All content in this area was uploaded by Haotian Wang on Jul 05, 2019

Content may be subject to copyright.

Trajectory-Based Multi-Hop Relay Deployment in Wireless Networks

Shilei Tian, Haotian Wang, Sha Li, Fan Wu, and Guihai Chen*

Shanghai Key Laboratory of Scalable Computing and Systems

Department of Computer Science and Engineering, Shanghai Jiao Tong University,

Shanghai 200240, China

{tianshilei,ltwbwht,Zoey.Lee}@sjtu.edu.cn, {fwu,gchen}@cs.sjtu.edu.cn

Abstract. In this paper, we identify a novel problem, Trajectory-Based Relay Deployment (TBRD), which aims to maximize user connection time as users roam through the target area while complying with relay resource constraints. To solve TBRD, we first propose the concept of Demand Nodes (DNs). Next, we design a Demand Node Generation (DNG) algorithm that transforms continuous historical user trajectories into a number of discrete DNs. By generating DNs, we convert the TBRD problem into a Demand Node Coverage (DNC) problem, which is NP-complete. We then design an approximation algorithm, named Submodular Iterative Deployment Algorithm (SIDA), to solve the DNC problem with an approximation factor of $1 - \frac{1}{\sqrt{e \cdot (1 - 1/k)}}$. Simulations on five real datasets show that our algorithm obtains high coverage for users in motion, leading to a better user experience.

1 Introduction

With the explosive growth of mobile users, wireless coverage has become an increasingly challenging problem. Due to transmission distance, the limited coverage of Access Points (APs), path loss, and so forth, the signal quality at some locations fails to provide satisfactory Internet access [3]. Deploying relays in multi-hop networks has become an effective method to improve wireless coverage and service quality [6].

In this paper, we investigate the relay deployment problem under the non-stationary user setting. Due to limited transmission power and path loss, the base station (BS) may fail to cover all users at all times. We aim to deploy a limited number of relays that keep users connected to the Internet for as long as possible while they move around.

* G. Chen is the corresponding author.

This work was supported in part by Program of International S&T Cooperation

(2016YFE0100300), the State Key Development Program for Basic Research of

China (973 project 2014CB340303), China NSF Projects (Nos. 61672348, 61672353,

61422208, and 61472252), and CCF-Tencent Open Research Fund.


Existing works design algorithms according to exact user locations and assume that users are stationary, which is not realistic in practice. As a result, once a user's location changes, the network performance is affected. In fact, the movements of users within an area are not completely random. They are strongly affected by people's social demands [4]. Hence some hot spots, that is, locations where users often pass or linger, can be inferred from the user trajectories. We therefore use the historical user trajectories to infer the tendency of user movement and deploy the relays accordingly.

In this paper, we first define the connectivity of the network. Since relays cannot access the Internet directly, each relay must have a path to the BS. Then we define the Trajectory-Based Relay Deployment (TBRD) problem, which aims to maximize user connection time as users roam through the target area while complying with relay resource constraints. We introduce the concept of Demand Nodes (DNs), which are virtual weighted nodes representing locations where users often pass or stay for a long time. Next, we propose a matrix-based trajectory representation and design the Demand Node Generation (DNG) algorithm. After that, the original TBRD problem is converted to a new problem called Demand Node Coverage (DNC). We say that a DN is covered if its distance to an AP is less than the coverage radius of the AP. The DNC problem is to maximize the total weight of DNs covered by the deployed relays and the BS.

The DNC problem is NP-complete, which can be shown by a reduction from a known NP-complete problem named budget set cover (BSC) [2]. To tackle this problem, we propose an approximation algorithm, named Submodular Iterative Deployment Algorithm (SIDA), which has an approximation ratio of $1 - \frac{1}{\sqrt{e \cdot (1 - 1/k)}}$, where $e$ is the mathematical constant and $k$ is the relay number constraint. Finally, we use real datasets to evaluate our algorithm. The simulation results indicate that our algorithm achieves high coverage.

The paper is organized as follows. The problem statement is given in Section 2. Section 3 describes the DNG algorithm. Section 4 presents SIDA. Simulations are presented in Section 5. Finally, Section 6 concludes this paper.

2 Problem Statement

2.1 System Model

In this model, a user can either communicate with the BS directly or connect to the BS with the help of relays. Since too many hops lead to a high delay, we limit the number of communication hops to 2.

2.2 Problem Deﬁnition

The user trajectory set is denoted by $T$. $P_B$ and $P_R$ represent the sets of BS and relay candidate positions, respectively. $k$ is the number of relays we can deploy.

Definition 1 (Communication Radius). Two APs can communicate with each other within a communication radius. We use $d_B$ and $d_R$ to denote the communication radius of the BS and of a relay, respectively.


Definition 2 (2-Hop Relay Connectivity). Given the AP candidate position set $P = P_B \cup P_R$, we generate a weighted graph $G = (P, E)$, where $(p_i, p_j) \in E$ if the distance between these two locations is less than the corresponding communication radius. The weight of each edge is set to 1. 2-hop relay connectivity means that in the induced subgraph $G[F]$ of the selected locations $F$, there always exists a path between any selected relay node and the selected BS node whose length is less than or equal to 2.
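As a minimal sketch of Definition 2, the graph construction and the 2-hop check might look as follows. The function names, the node labels `("B", i)` / `("R", j)`, and the choice of which radius applies to which pair (here $d_B$ for BS-relay pairs and $d_R$ for relay-relay pairs) are assumptions made for illustration, since the text only says "the corresponding communication radius".

```python
import math
from itertools import combinations

def build_graph(bs_sites, relay_sites, d_b, d_r):
    """Adjacency sets over nodes ("B", i) and ("R", j).

    Assumption: BS-relay pairs use d_b, relay-relay pairs use d_r,
    and BS-BS pairs are never linked (only one BS is selected).
    """
    pos = {("B", i): p for i, p in enumerate(bs_sites)}
    pos.update({("R", j): p for j, p in enumerate(relay_sites)})
    adj = {n: set() for n in pos}
    for u, v in combinations(pos, 2):
        if u[0] == "B" and v[0] == "B":
            continue
        radius = d_r if (u[0] == "R" and v[0] == "R") else d_b
        if math.dist(pos[u], pos[v]) < radius:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def has_two_hop_connectivity(adj, bs, relays):
    """True if every selected relay reaches bs within 2 hops in G[{bs} U relays]."""
    selected = set(relays)
    one_hop = {r for r in selected if bs in adj[r]}
    # A relay that is not adjacent to the BS needs a selected 1-hop relay as a bridge.
    return all(adj[r] & one_hop for r in selected - one_hop)

# Toy layout on a line: BS at 0, relays at 1000, 2000 and 5000.
adj = build_graph([(0, 0)], [(1000, 0), (2000, 0), (5000, 0)], d_b=1800, d_r=1200)
ok = has_two_hop_connectivity(adj, ("B", 0), [("R", 0), ("R", 1)])
```

In the toy layout, relay 1 is only reachable through relay 0, so selecting relay 1 alone would violate the constraint.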

Definition 3 (TBRD Problem). Given a set of trajectories $T$, BS candidate locations $P_B$, relay candidate locations $P_R$, and a relay number constraint $k$, the TBRD problem is to find a BS location $p_B \in P_B$ and relay locations $P_S \subset P_R$ that maximize user connection time, subject to $|P_S| = k$ and the induced subgraph $G[\{p_B\} \cup P_S]$ having 2-hop relay connectivity.

As we mentioned before, hot spots can be inferred from historical user trajectory.

We introduce a novel concept called Demand Node (DN) to represent them.

Definition 4 (Demand Node). Demand Nodes (DNs) are virtual weighted nodes representing the locations where users often pass or stay for a long time. They are at the centers of the grids generated by dividing the target area. The weight is the probability of a user appearing at the corresponding location. The larger the weight, the more likely it is that users will pass through or stay at the corresponding location.

Definition 5 (Coverage Radius). The coverage radius, denoted by $r_B$ for the BS and $r_R$ for a relay, is the distance threshold for the BS or relay. Internet connection is ensured for users only at DNs that lie within the coverage radius of an AP. We say that such a DN is covered by the corresponding AP.

Before we introduce the Demand Node Coverage (DNC) problem, we first give some definitions that are used throughout this paper. We use $D$ to denote the set of DNs and $W$ to denote the set of DN weights.

Definition 6 (Covered DNs Set). The covered DNs set $C(\cdot)$ is the set of DNs covered by a given AP. For a BS candidate location $p_B^i \in P_B$, $C(p_B^i) = \{d_j \mid \mathrm{dist}(d_j, p_B^i) \le r_B\}$, where $\mathrm{dist}(\cdot)$ denotes the Euclidean distance. For a relay candidate location $p_R^i \in P_R$, $C(p_R^i) = \{d_j \mid \mathrm{dist}(d_j, p_R^i) \le r_R\}$.

Definition 7 (Weight Function). The weight function $w(\cdot)$ is the sum of the weights of the covered DNs set. For an AP candidate location $p \in P_B \cup P_R$, $w(p) = \sum_{s_i \in C(p)} w_{s_i}$. For an AP candidate location set $P$, the DNs covered by $P$ are represented as $D_C = \bigcup_{p_i \in P} C(p_i)$, and $w(P) = \sum_{s_i \in D_C} w_{s_i}$.
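To make Definitions 6 and 7 concrete, the following sketch computes covered DN sets and the weight of a set of APs on a toy instance. The helper names `covered` and `weight`, the representation of an AP as a `(position, radius)` pair, and the toy coordinates are illustrative assumptions, not part of the original formulation.

```python
import math

def covered(site, radius, dns):
    """C(p): indices of the DNs whose Euclidean distance to `site` is <= radius."""
    return {j for j, d in enumerate(dns) if math.dist(d, site) <= radius}

def weight(aps, dns, weights):
    """w(P): total weight of the union of the covered DN sets of all APs in `aps`.

    Each AP is a (position, coverage_radius) pair; a DN covered by several
    APs is counted only once.
    """
    union = set()
    for site, radius in aps:
        union |= covered(site, radius, dns)
    return sum(weights[j] for j in union)

# Toy instance: three DNs on a line, one BS and one relay.
dns = [(0, 0), (3, 0), (6, 0)]
weights = [0.5, 0.3, 0.2]
bs = ((1, 0), 2.0)      # covers DNs 0 and 1
relay = ((4, 0), 2.0)   # covers DNs 1 and 2
total = weight([bs, relay], dns, weights)
```

Here the BS alone has weight 0.8 and the relay alone 0.5, but together they only reach 1.0, because DN 1 is shared.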

Definition 8 (Residual Weight). Given a selected AP candidate location set $S_A$, when we go on to select an AP candidate location set $S_B$, the residual weight of $S_B$ based on $S_A$ is defined as $w_R(S_A, S_B) = w(S_B) - w(S_A \cap S_B)$.
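As a short worked case, consider the form in which the residual weight is used by the greedy step later on, namely $S_A = S$ and $S_B = S \cup \{v\}$ for a single new candidate $v$. Since $S \cap (S \cup \{v\}) = S$, the definition reduces to the marginal gain of adding $v$:

```latex
w_R\bigl(S,\, S \cup \{v\}\bigr)
  = w\bigl(S \cup \{v\}\bigr) - w\bigl(S \cap (S \cup \{v\})\bigr)
  = w\bigl(S \cup \{v\}\bigr) - w(S)
```

For instance, with hypothetical values $w(S) = 0.8$ and $w(S \cup \{v\}) = 1.0$, the residual weight of adding $v$ is $0.2$.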

Assume the width of the target area is $w$ and the height is $h$. There is also a filter threshold $\theta$, which constrains the weight of each generated DN to be at least $\theta$. Now we can define the DNG problem.


Definition 9 (DNG Problem). Given a user trajectory set $T$, the width $w$ and height $h$ of the target area, and a filter threshold parameter $\theta$, the DNG problem is to generate a set of DNs $D$ and the corresponding weight set $W$. The weight of each DN is in the range $[\theta, 1]$.

Now we can define the Demand Node Coverage (DNC) problem.

Definition 10 (DNC Problem). Given a set of DNs $D$ and the corresponding weight set $W$, BS candidate locations $P_B$, relay candidate locations $P_R$, and a relay number constraint $k$, the DNC problem is to find a location $p_B \in P_B$ and a subset of relay candidate locations $P_S \subseteq P_R$ that maximize $w(F)$, where $F = \{p_B\} \cup P_S$ and $|P_S| = k$. The induced subgraph must comply with the 2-hop relay connectivity constraint.

3 Demand Node Generation

In this section, we show how to extract "hotspots", which we refer to as Demand Nodes (DNs), from user trajectories. The Demand Node Generation (DNG) algorithm consists of three major steps: (1) trajectory matrix generation; (2) prediction matrix; (3) filtering.

3.1 Trajectory Matrix Generation

Since the DNs depend on both the temporal and spatial information of the user trajectory, we segment each trajectory according to a fixed time span $t$ and record, in a binary matrix, the locations within the target area where the user appears during each segment. Fig. 1 illustrates the details of converting a trajectory into a binary matrix. Fig. 1(a) shows a trajectory in the area.

Fig. 1. An illustration of the processing of a trajectory: (a) original trajectory; (b) segments; (c) 6 × 6 grids; (d) binary matrix.

First, we divide the trajectory into a number of segments, each of which shows the trajectory of a user within the corresponding time span $t$, as shown in Fig. 1(b). Then, the target area is partitioned into small grids, whose centers are the candidate locations for the demand nodes. Fig. 1(c) shows the grid partition for the upper-left segment. Lastly, Fig. 1(d) shows the binary matrix of the trajectory for one time segment. The whole target area is viewed as a matrix whose entries represent the partitioned grids. If the trajectory passes through a grid, the corresponding entry of the matrix is set to 1; otherwise it is 0.

After the conversion, we obtain numerous binary matrices for the target area.
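The conversion can be sketched as follows, under some simplifying assumptions that are not stated in the text: a trajectory is given as timestamped samples `(time, x, y)`, a grid is marked as visited when a sample falls inside it (no interpolation between samples), and rows index the y direction. The function name and parameters are hypothetical.

```python
import math

def trajectory_to_matrices(samples, width, height, cell, span):
    """Turn (time, x, y) samples into one binary matrix per time span.

    Returns the matrices ordered by time segment; segments without any
    sample inside the target area produce no matrix.
    """
    rows, cols = math.ceil(height / cell), math.ceil(width / cell)
    matrices = {}
    for t, x, y in samples:
        if not (0 <= x < width and 0 <= y < height):
            continue  # sample lies outside the target area
        segment = int(t // span)
        matrix = matrices.setdefault(segment, [[0] * cols for _ in range(rows)])
        matrix[int(y // cell)][int(x // cell)] = 1
    return [matrices[s] for s in sorted(matrices)]

# A 30 x 20 area with 10 x 10 grids and a time span of 200.
samples = [(0, 5, 5), (100, 15, 5), (250, 25, 15)]
result = trajectory_to_matrices(samples, width=30, height=20, cell=10, span=200)
# result == [[[1, 1, 0], [0, 0, 0]], [[0, 0, 0], [0, 0, 1]]]
```

The first two samples fall into the first time span and mark two grids in the top row; the third sample starts a second matrix.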

3.2 Prediction Matrix

Since the value of a grid $x_{ij}$ is 0 or 1, we assume that the probability distributions of these grids are independent Bernoulli distributions, which can be written as $x_{ij} \sim p(x_{ij} \mid \mu_{ij}) = \mu_{ij}^{x_{ij}} (1 - \mu_{ij})^{1 - x_{ij}}$, where the parameter $\mu_{ij} \in [0, 1]$ is the probability of $x_{ij} = 1$.

We can estimate $\mu_{ij}$ by maximum likelihood estimation. However, this may lead to over-fitted results for small datasets [1]. To alleviate this problem, we first introduce a prior distribution $p(\mu_{ij} \mid a_{ij}, b_{ij})$ over the parameter $\mu_{ij}$, namely the beta distribution, which is easy to interpret and has useful analytical properties (it is conjugate to the Bernoulli likelihood). The posterior distribution of $\mu_{ij}$ is then obtained by Bayes' theorem:

$$p(\mu_{ij} \mid X_{ij}) = \frac{p(X_{ij} \mid \mu_{ij})\, p(\mu_{ij} \mid a_{ij}, b_{ij})}{\int p(X_{ij} \mid \mu_{ij})\, p(\mu_{ij} \mid a_{ij}, b_{ij})\, d\mu_{ij}}. \tag{1}$$

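Then, we estimate the value of $\mu_{ij}$ by maximizing the posterior distribution $p(\mu_{ij} \mid X_{ij})$. This posterior distribution has the form

$$p(\mu_{ij} \mid X_{ij}) \propto \mu_{ij}^{m + a_{ij} - 1} (1 - \mu_{ij})^{n - m + b_{ij} - 1}, \tag{2}$$

where $n$ is the number of observations in $X_{ij}$ and $m$ is the number of those observations equal to 1.

Finally, maximizing Eq. (2) with respect to $\mu_{ij}$, we obtain the maximum posterior solution given by $\mu_{ij} = \frac{m + a_{ij}}{n + a_{ij} + b_{ij}}$.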

3.3 Filtering

After the prediction matrix of the target area is determined, the DNs are placed at the centers of the grids with higher probabilities of being 1. In our model, a threshold $\theta$ is set, and the grids whose probabilities of being 1 are not less than $\theta$ become DNs.
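A minimal sketch of the filtering step, assuming square grids of side `cell` with the origin at the top-left corner of the area and rows indexing the y direction (these conventions and the function name are illustrative assumptions):

```python
def filter_demand_nodes(mu, cell, theta):
    """Keep grids with probability >= theta; return DN centers and weights."""
    dns, weights = [], []
    for i, row in enumerate(mu):
        for j, p in enumerate(row):
            if p >= theta:
                dns.append(((j + 0.5) * cell, (i + 0.5) * cell))  # grid center (x, y)
                weights.append(p)
    return dns, weights

mu = [[0.10, 0.30],
      [0.25, 0.05]]
dns, weights = filter_demand_nodes(mu, cell=10, theta=0.2)
# dns == [(15.0, 5.0), (5.0, 15.0)], weights == [0.30, 0.25]
```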

4 Submodular Iterative Deployment Algorithm (SIDA)

We now focus on selecting the locations for APs from the candidate location set.

It is clear that the weight function w(·) is a submodular function.
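For completeness, the submodularity claim can be spelled out as follows, assuming non-negative DN weights (which holds here because the weights are probabilities). For any location sets $A \subseteq B \subseteq P$ and any location $p \notin B$:

```latex
w(A \cup \{p\}) - w(A)
  = \sum_{s \in C(p) \setminus \bigcup_{q \in A} C(q)} w_s
  \;\ge\; \sum_{s \in C(p) \setminus \bigcup_{q \in B} C(q)} w_s
  = w(B \cup \{p\}) - w(B)
```

The inequality holds because the DNs newly covered by $p$ on top of $B$ form a subset of those newly covered on top of $A$. The same argument shows that $w(\cdot)$ is monotone, that is, $w(A) \le w(B)$.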

4.1 The SIDA

The main idea of SIDA is as follows. First, we construct an undirected graph $G = (P, E)$, where $P = P_B \cup P_R$. For any two nodes $p_i, p_j \in P$, $(p_i, p_j) \in E$ if $\mathrm{dist}(p_i, p_j)$ is less than the corresponding communication radius. Then, we scan each $p_B^i$ sequentially and generate a subgraph with its 2-hop neighbors. The following operations are carried out within this subgraph.


Algorithm 1: SIDA

Input: An instance of the DNC problem, ⟨P_B, P_R, k, w(·), w_R(·)⟩
Output: The final solution F

    F ← ∅;
    for b ∈ P_B do
        k′ ← k; S ← {b}; V_t ← {v : hop(v, b) ≤ 2, v ∈ P_R};
            // hop(v, b) is the least hop number from v to b, the same as below.
        while k′ > 0 and V_t ≠ S do
            j ← 0;
            while j ≤ ⌊k′/2⌋ do
                Find v ∈ V_t attaining max{w_R(S, S ∪ {v})}; S ← S ∪ {v}; j ← j + 1;
            for v ∈ S do
                if v is not connected with b then
                    V_d ← {u | u is a one-hop neighbor of v that is also a one-hop neighbor of b};
                    Find u ∈ V_d attaining max{w_R(S, S ∪ {u})}; S ← S ∪ {u};
            k′ ← k − |S|;
        if w(F) ≤ w(S) then
            F ← S;
    return F;

Next, we repeatedly select $\lfloor k/2 \rfloor$ candidate locations with maximum residual weight in the subgraph. For each selected candidate location $p_i$, we check whether it is a 1-hop or 2-hop neighbor of the BS. If it is a 2-hop neighbor, we check whether the selected locations already form a path from $p_i$ to the BS. If not, we select another location $p_j$ from the 1-hop neighbors of $p_i$ that brings the maximum residual weight while ensuring that $p_i \to p_j \to \mathrm{BS}$ is a path. In this way, the number of selected locations is at most $\lfloor k/2 \rfloor \times 2 \le k$. It is very likely that some relays are still available. Therefore, assuming that we have selected $g$ relays with $g < k$, we run the same procedure on this subgraph with $k = k - g$. We repeat this procedure, using $S$ to record all the selected locations, and it terminates once $|S| = k$ and $S$ is a feasible solution.

Finally, choose the solution with the maximum total weight. The details of

SIDA are shown in Algorithm 1.
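The greedy idea can be illustrated with a simplified variant. The sketch below is not Algorithm 1 itself: instead of selecting a batch of $\lfloor k'/2 \rfloor$ locations and repairing connectivity afterwards, it repairs each pick immediately by adding a bridging 1-hop relay when needed, and it compares candidates by their total marginal gain regardless of how many relays they consume. The function names and the dictionary-based inputs (`adj` for graph adjacency, `cover` for covered DN sets) are hypothetical.

```python
def weight(nodes, cover, weights):
    """w(.): total weight of the DNs covered by at least one node."""
    covered = set().union(*(cover[n] for n in nodes))
    return sum(weights[d] for d in covered)

def greedy_two_hop_deploy(bs_list, relays, adj, cover, weights, k):
    """For each BS, greedily add relays (with a bridge if needed); keep the best."""
    relays = set(relays)
    best, best_w = set(), -1.0
    for b in bs_list:
        one_hop = adj[b] & relays
        two_hop = set().union(*(adj[r] for r in one_hop)) & relays
        candidates = one_hop | two_hop
        chosen = {b}
        while len(chosen) - 1 < k:
            budget = k - (len(chosen) - 1)
            base = weight(chosen, cover, weights)
            pick, pick_gain = None, -1.0
            for v in sorted(candidates - chosen):
                if v in one_hop or adj[v] & chosen & one_hop:
                    options = [{v}]  # already connected within 2 hops
                else:
                    options = [{v, u} for u in sorted(adj[v] & one_hop)]  # v plus a bridge
                for add in options:
                    if len(add) > budget:
                        continue
                    gain = weight(chosen | add, cover, weights) - base
                    if gain > pick_gain:
                        pick, pick_gain = add, gain
            if pick is None:
                break  # nothing fits the remaining budget
            chosen |= pick
        total = weight(chosen, cover, weights)
        if total > best_w:
            best, best_w = chosen, total
    return best, best_w

# Toy instance: R2 is only reachable from B0 through R1.
adj = {"B0": {"R1", "R3"}, "R1": {"B0", "R2"}, "R2": {"R1"}, "R3": {"B0"}}
cover = {"B0": {0}, "R1": {1}, "R2": {2, 3}, "R3": {4}}
weights = {0: 0.4, 1: 0.1, 2: 0.3, 3: 0.3, 4: 0.2}
solution, value = greedy_two_hop_deploy(["B0"], ["R1", "R2", "R3"], adj, cover, weights, k=2)
```

With $k = 2$ the sketch selects R2 together with its bridge R1, since that pair adds more weight than either 1-hop relay alone; with $k = 1$ only a 1-hop relay fits the budget, and R3 is chosen.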

4.2 Performance Analysis

In this subsection, we analyze the performance guarantee of SIDA. We consider a BS location, its 2-hop neighbors, and the generated subgraph. We propose two lemmas for this subgraph.

Lemma 1. After each greedy iteration $l_i$, $i = 2, \ldots, t$, $t \le \lfloor k/2 \rfloor$, the inequality $w(G_i) - w(G_{i-1}) \ge \frac{1}{k}\bigl(w(OPT') - w(G_{i-1})\bigr)$ holds, where $G_i$ is the selected set after the $i$-th iteration and $OPT'$ is the optimal solution within the current subgraph.


Proof. First, we denote $w(G_i) - w(G_{i-1})$ as $W'_i$, which is the maximum residual weight in the $i$-th iteration according to the greedy strategy. Clearly, $w(OPT') - w(G_{i-1})$ is no more than the weight of the elements covered by $OPT'$ but not covered by $G_{i-1}$, i.e., $w(OPT') - w(G_{i-1}) \le w(OPT' \setminus G_{i-1})$. Since the size of the set $OPT' \setminus G_{i-1}$ is bounded by the budget $k$, the total weight of the DNs covered by $OPT' \setminus G_{i-1}$ and not covered by $G_{i-1}$ is at most $k W'_i$. Hence we get $w(OPT') - w(G_{i-1}) \le k W'_i$. Substituting $w(G_i) - w(G_{i-1})$ for $W'_i$ and multiplying both sides by $1/k$, we get the required inequality.

Lemma 2. After each iteration $l_i$, $i = 2, \ldots, t$, $t \le \lfloor k/2 \rfloor$, the inequality $w(G_i) \ge \bigl[1 - (1 - 1/k)^i\bigr]\, w(OPT')$ holds.

Proof. According to Lemma 1, we have:

$$k\bigl(w(G_i) - w(G_{i-1})\bigr) \ge w(OPT') - w(G_{i-1}) \;\Rightarrow\; \frac{w(G_i) - w(OPT')}{w(G_{i-1}) - w(OPT')} \le 1 - 1/k.$$

Therefore, letting $j = 1, 2, \ldots, i$ and multiplying these inequalities, we get:

$$\prod_{j=1}^{i} \frac{w(G_j) - w(OPT')}{w(G_{j-1}) - w(OPT')} \le (1 - 1/k)^i \;\Rightarrow\; \frac{w(G_i) - w(OPT')}{w(G_0) - w(OPT')} \le (1 - 1/k)^i.$$

Since $G_0 = \emptyset$, we have $w(G_0) = 0$, and thus $w(G_i) \ge \bigl[1 - (1 - 1/k)^i\bigr]\, w(OPT')$.

Theorem 1. SIDA achieves an approximation factor of $1 - \frac{1}{\sqrt{e \cdot (1 - 1/k)}}$ for the DNC problem.

Proof. For each BS candidate location, the algorithm iterates at least $\lfloor k/2 \rfloor$ times. Let $OPT$ be the optimal solution of the DNC problem. For the subgraph that contains $OPT$, we denote the set of locations selected by SIDA as $F_{OPT}$. Then, in light of Lemma 2, we get:

$$w(F_{OPT}) \ge \bigl[1 - (1 - 1/k)^{\lfloor k/2 \rfloor}\bigr]\, w(OPT) \ge \left[1 - \frac{1}{\sqrt{e \cdot (1 - 1/k)}}\right] w(OPT).$$
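The last inequality in the proof can be expanded as follows. This is a sketch of the omitted step, assuming $k \ge 2$ and using the standard facts $(1 - 1/k)^k \le 1/e$ and $\lfloor k/2 \rfloor \ge (k-1)/2$:

```latex
\left(1 - \tfrac{1}{k}\right)^{\lfloor k/2 \rfloor}
  \le \left(1 - \tfrac{1}{k}\right)^{(k-1)/2}
  = \left[\left(1 - \tfrac{1}{k}\right)^{k}\right]^{1/2}
    \left(1 - \tfrac{1}{k}\right)^{-1/2}
  \le \frac{1}{\sqrt{e}} \cdot \frac{1}{\sqrt{1 - 1/k}}
  = \frac{1}{\sqrt{e \cdot (1 - 1/k)}}
```

The first step uses $0 < 1 - 1/k < 1$, so a smaller exponent gives a larger value. As a numerical illustration with $k = 5$ (the value used in the simulations), $(1 - 1/5)^{2} = 0.64$ and $1/\sqrt{0.8\,e} \approx 0.678$, so the guaranteed ratio is about $0.32$ while the intermediate bound gives $0.36$.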

5 Simulations

In this section, we conduct extensive simulation experiments, implemented in C++, to evaluate our algorithm. We evaluate our entire procedure, including trajectory processing, DNG, and SIDA, on five real GPS datasets [5] from CRAWDAD: NCSU, KAIST, New York City, Orlando, and North Carolina State Fair. We randomly divide each dataset into a training group and a validation group.

The parameters of each dataset are shown in Table 1. We divide the map into grids of $g_B \times g_B$ and set the candidate BS locations to the centers of these grids. Similarly, the candidate relay locations are set to the centers of $g_R \times g_R$ grids. $d_B$ is set to $r_R + r_B$, and $d_R$ is set to $2 \times r_R$. The number of relays we can deploy is $k = 5$. For simplicity, both parameters $a_{ij}$ and $b_{ij}$ of the beta distribution are set to 5. The time slot is set to $t = 200$.

Table 1. Parameters of each dataset

Dataset     s     r_B    r_R    g_B    g_R    θ
KAIST       200   1200   600    3000   500    0.14
NCSU        200   1200   600    3000   500    0.21
New York    400   2400   1200   6000   1000   0.21
Orlando     300   1500   1000   5000   1000   0.20
Statefair   20    150    75     350    50     0.35

We repeated the partition of the validation set and ran the procedure 1000 times, and then took the average. The coverage performance on the five datasets is 95.10%, 85.83%, 62.52%, 85.44%, and 60.70%, respectively.

6 Conclusion

In this work, we have proposed the Trajectory-Based Relay Deployment (TBRD) problem in wireless networks, which aims to maximize user connection time as users roam through the target area while complying with relay resource constraints. We first transform the trajectories into a number of virtual weighted discrete Demand Nodes (DNs). In this way, the original TBRD problem is converted to an NP-complete problem called the Demand Node Coverage (DNC) problem, which is to maximize the total covered DN weight. Then, we design an approximation algorithm named Submodular Iterative Deployment Algorithm (SIDA) to solve the DNC problem, with an approximation ratio of $1 - \frac{1}{\sqrt{e \cdot (1 - 1/k)}}$. The simulation results on five real datasets show that our algorithm obtains high coverage performance and thus significantly improves the user experience. To the best of our knowledge, we are the first to consider user trajectories for relay deployment.

References

1. Bishop, C.M.: Pattern Recognition and Machine Learning. Springer (2006)

2. Hochba, D.S.: Approximation algorithms for NP-hard problems. ACM SIGACT News 28(2), 40–52 (Jun 1997)

3. Ma, L., Teymorian, A.Y., Cheng, X.: A hybrid rogue access point protection framework for commodity Wi-Fi networks. In: IEEE International Conference on Computer Communications (INFOCOM) (2008)

4. Musolesi, M., Hailes, S., Mascolo, C.: An ad hoc mobility model founded on social network theory. In: MSWiM. pp. 20–24 (2004)

5. Rhee, I., Shin, M., Hong, S., Lee, K., Kim, S., Chong, S.: CRAWDAD dataset ncsu/mobilitymodels (v. 2009-07-23). Downloaded from http://crawdad.org/ncsu/mobilitymodels/20090723/GPS (July 2009)

6. Wang, H., Tian, S., Gao, X., Wu, L., Chen, G.: Approximation designs for cooperative relay deployment in wireless networks. In: ICDCS. pp. 2270–2275 (2017)