Energy-Efficient Task Offloading Under E2E
Latency Constraints
Mohsen Tajallifar, Sina Ebrahimi, Mohammad Reza Javan, Nader Mokari,
and Luca Chiaraviglio,
Abstract
In this paper, we propose a novel resource management scheme that jointly allocates the transmit
power and computational resources in a centralized radio access network architecture. The network
comprises a set of computing nodes to which the requested tasks of different users are offloaded. The
optimization problem minimizes the energy consumption of task offloading while taking the end-to-end
latency, i.e., the transmission, execution, and propagation latencies of each task, into account. We aim to
allocate the transmit power and computational resources such that the maximum acceptable latency of
each task is satisfied. Since the optimization problem is non-convex, we divide it into two sub-problems,
one for transmit power allocation and another for task placement and computational resource allocation.
Transmit power is allocated via the convex-concave procedure. In addition, a heuristic algorithm is
proposed to jointly manage computational resources and task placement. We also propose a feasibility
analysis that finds a feasible subset of tasks. Furthermore, a disjoint method that separately allocates
the transmit power and the computational resources is proposed as the baseline of comparison. A lower
bound on the optimal solution of the optimization problem is also derived based on exhaustive search
over task placement decisions and utilizing Karush–Kuhn–Tucker conditions. Simulation results show
that the joint method outperforms the disjoint method in terms of acceptance ratio. Simulations also
show that the optimality gap of the joint method is less than 5%.
Index Terms
Mobile edge computing, task offloading, resource allocation, end-to-end latency, task placement.
I. INTRODUCTION
A. Background
In order to fulfill the requirements of 5G mobile networks, key enabling technologies such
as network function virtualization (NFV) and multi-access/mobile edge computing (MEC) are
introduced.

M. Tajallifar, S. Ebrahimi, and N. Mokari are with the Department of Electrical and Computer Engineering, Tarbiat Modares University, Tehran 14115-111, Iran (e-mail: nader.mokari@modares.ac.ir). M. Javan is with Shahrood University of Technology. L. Chiaraviglio is with the University of Rome Tor Vergata.

Fig. 1: A typical task offloading example and system model. (a) A typical task offloading example: the UE's task data is sent over the uplink to the base station (transmit latency $T^{\mathrm{tx}}$), forwarded through the transport network (propagation latency $T^{\mathrm{prop}}$), and processed at the execution server (execution latency $T^{\mathrm{exe}}$) before being returned over the downlink. (b) System model: UEs $1,\dots,K$ served by RRHs $1,\dots,U$ over radio links, fronthaul links to the BBU pool, and NFV-enabled nodes $n$ reachable over transport links $(m,m')$.

With NFV, the network functions (NFs) that traditionally used dedicated hardware
are implemented in applications running on top of commodity servers [1]. On the other hand,
MEC aims to support low-latency mobile services by bringing the remote servers closer to the
mobile users [2], [3]. Moreover, MEC enables the offloading of the computational burden of
users’ tasks to reduce the impact of the limited battery power of user equipment (UE). Note that
when executing servers are NFV-enabled, they are able to process various types of tasks. As a
result, a task is not restricted to being offloaded to a predetermined server.
A typical task offloading example is shown in Fig. 1(a). In task offloading, the non-processed
data of a task is sent from the UE to an executing server, which relieves the UE of the computational
burden of executing the task. As Fig. 1(a) shows, the user transmits the non-processed
data of the task over the wireless link to its serving base station, which results in transmit latency
$T^{\mathrm{tx}}$. Then, the received data is transmitted to an executing server. Executing servers are placed
at the base station and at distant nodes in the transport network. The data transmission through
the transport network adds the propagation latency $T^{\mathrm{prop}}$ to the offloading process. Finally, the
received data is processed at the executing server with execution latency $T^{\mathrm{exe}}$ and then is sent
back to the user over the downlink. Therefore, the end-to-end (E2E) latency of task offloading
is equal to the summation of $T^{\mathrm{tx}}$, $T^{\mathrm{prop}}$, and $T^{\mathrm{exe}}$ in both uplink and downlink.
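To make the latency decomposition concrete, the following minimal Python sketch combines the three components for one task; all parameter names and numbers are illustrative placeholders, not values from the paper.

```python
# Minimal sketch of the E2E latency decomposition described above.
# All numbers are illustrative placeholders, not values from the paper.

def e2e_latency(D_bits, R_bps, L_cycles, cpu_cycles_per_s, path_delay_s):
    """E2E latency = transmit + propagation (uplink + downlink) + execution."""
    t_tx = D_bits / R_bps                      # radio transmission latency T^tx
    t_prop = 2.0 * path_delay_s                # uplink + downlink propagation T^prop
    t_exe = L_cycles / cpu_cycles_per_s        # execution latency T^exe
    return t_tx + t_prop + t_exe

# Example: 0.1 Mbit task, 20 Mbps uplink rate, 1e6 CPU cycles, 1e8 cycles/s, 10 ms one-way path
print(e2e_latency(1e5, 20e6, 1e6, 1e8, 0.010))   # ~0.035 s
```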
B. Related Works
We classify the related works on task offloading into four categories and discuss their appli-
cability in practical scenarios.
1) Task offloading to multiple executing servers: In task offloading, a UE decides whether to
offload a task to a single executing server or to select an executing server out of multiple servers.
Offloading a task to one server in a set of executing servers in a multi-tier heterogeneous network
is considered in [4], [5]. Moreover, the authors in [6]–[8] propose to offload a user's task to one of the
executing servers at the base stations in a multi-cell network. Note that in the aforementioned works,
the executing servers are located at the edge of the radio access network and the computational
resources in the non-radio part of the network are not considered. In contrast, it is possible to
offload a task to any server in the network in [9], i.e., servers in radio access and non-radio parts of
the network. However, radio resources are not allocated in [9]. Note that, ignoring computational
resources in the non-radio part of the network or ignoring radio resource allocation results in an
inefficient task offloading.
2) Task placement and computational resource allocation: Task offloading is comprised of two
steps: i) task placement to select an executing server, and ii) computational resource allocation
that allocates the resources of the executing server to each task. In this context, various works
only focus on task placement with given computational resources for each task [4]–[6], [9]–[12],
while others include resource allocation as well [7], [13]–[21]. Note that the servers in the non-radio
part of the network are not involved in these works. As a result, computationally intensive tasks
with moderate sensitivity to latency may occupy the capacity of executing servers in the radio part
of the network while high-capacity servers in the non-radio part of the network are underutilized.
3) Joint Radio and Computational Resource Allocation: Extensive research has been conducted on joint
radio and computational resource allocation [7], [11], [13]–[27]. In these works, radio resources
including transmit power and/or bandwidth as well as computational resources are allocated to
each task. Energy-efficient resource allocation is performed in [11], [13], [17], [19], [21], [24],
[26], [27], and a weighted combination of consumed energy and latency is optimized in [7],
[14]–[16], [20], [22], [25]. Moreover, the impact of radio link quality without radio resource
allocation is taken into account by [5], [6], [8], [10], [28], [29]. In these works, the latency
of data transmission over radio links is taken into account, which impacts the optimal task
placement. Note that although joint optimization of radio and computational resources increases
the degrees of freedom in task offloading, the available computational resources in the radio
access network are very limited, which limits the acceptance ratio of the network.
4) Feasibility Analysis: When task offloading is subjected to a maximum acceptable latency,
sufficient resources are required in various parts of the network. In case of insufficient resources,
a feasibility analysis is needed to determine a feasible subset of the requested tasks. One approach
to cope with infeasibility is to make simplifying assumptions, e.g., assuming sufficient available
resources for task offloading [9], [22] or offloading a task when it is beneficial, i.e., when
offloading results in less energy consumption or latency [10]. In practice, however, the resources
are limited and tasks are subjected to execution deadlines. As a result, a feasibility analysis is
inevitable. The feasibility analysis is performed by introducing a binary optimization variable,
which is one when the task is accepted or zero when the task is rejected [4], [7], [12], [13], [15],
[19]–[21], [27]. Note that finding optimal binary variables results in combinatorial optimization
problems that are challenging and of high complexity.
C. Motivation
The performance of a task offloading method is mainly measured by its latency and energy
consumption. In practice, E2E latency comes from radio links, transport network links, and
execution at the servers; and the energy consumption is impacted by consumed transmit power
and computational resources.
Optimizing the performance of task offloading necessitates a joint optimization of all available
resources in the network. However, existing works optimize a subset of resources and focus only
on one part of the whole network. Moreover, the impact of E2E latency is not considered in the
literature. As a result, existing methods may not perform well in practice.
In this paper, we propose a task offloading method that optimizes the energy consumption in
terms of transmit power and computational resources under E2E latency constraints. Throughout
the paper, task offloading refers to the process of transmit power allocation over radio
links, task placement, i.e., selecting an executing server and the path to it, and computational resource
allocation. The proposed method jointly allocates required transmit power to tasks, places each
task in a proper NFV-enabled node, and allocates sufficient computational resources to the tasks.
With this joint method, high latency of radio links caused by weak radio channels is compensated
by a proper task placement and computational resource allocation. In contrast, high execution
latency caused by limited computational resources is compensated by consuming more transmit
power in radio links. As a result, more tasks are served, compared to a disjoint method wherein
transmit power allocation is independent of task placement and computational resource allocation.
NFV enables a general-purpose server to execute various tasks without needing a specialized
server for each task. Therefore, various tasks are dynamically offloaded to general-purpose
executing servers in a network of NFV-enabled nodes instead of offloading each task to a
respective specialized server. As a result, a task placement method is needed to determine an
executing server and its route for each task. In contrast to conventional routing methods that choose
a route to a predetermined server, our task placement method jointly determines an executing
server, the associated route to the executing server, and the required computational resources in
the executing server.
We assume a deadline for offloading each task, i.e., sending the task from the UE to the executing
server and sending the result back to the UE must be performed under a maximum acceptable latency constraint. As
a result, the sum of latencies over the radio link, the transport network links, and the execution at the executing
server must be less than the maximum acceptable latency. The feasibility of this E2E offloading method
depends on the available resources and location of executing servers in the network. For example,
when the available transmit power is low, the radio link latency is large, which may violate E2E
latency. In contrast, when the available computational resources at the executing server are low,
the execution latency is large, which may also violate the E2E latency constraint. Therefore, our
task offloading method includes a feasibility analysis that finds a set of feasible tasks.
The infeasibility of task offloading depends on the value of maximum acceptable latencies,
i.e., lower values of maximum acceptable latencies result in a larger number of infeasible tasks
and higher values result in a smaller number of infeasible tasks. Inspired by this fact and in
contrast to the existing works, we add a non-negative variable to each maximum acceptable
latency. Non-negative variables are zero for feasible tasks and are positive for infeasible tasks.
Therefore, the set of feasible tasks is obtained by solving an optimization problem that minimizes
the sum of non-negative variables, i.e., maximizes the number of feasible tasks.
Joint task offloading results in a non-convex problem due to the coupling of the optimization variables.
Moreover, the task placement is performed by obtaining binary variables, which makes the
optimization problem further complicated. To deal with the optimization problem, we decouple
transmit power allocation from task placement and computational resource allocation. Transmit
power allocation is performed via the well-known convex-concave procedure (CCP) and a
heuristic algorithm is proposed for task placement and computational resource allocation. CCP
and the heuristic algorithm are alternatively applied until convergence. Note that both CCP and
the heuristic algorithm preserve the monotonicity of convergence.
We also develop two baseline methods to evaluate the efficiency of our joint task offloading
method. The first is a disjoint method in which transmit power allocation is performed independent
of task placement. In doing so, the maximum acceptable E2E latency of each task is
divided into a radio latency constraint and a non-radio latency constraint. We allocate transmit
power under the radio latency constraint. Then, the task placement and computational resource
allocation are performed under the non-radio latency constraint.
The second baseline method achieves a lower bound on the optimal solution of the joint task
offloading optimization problem. The lower bound is achieved by relaxing some constraints in
the optimization problem, which comes from leveraging practical assumptions such as orthog-
onality of wireless channels in large-scale antenna array systems. The optimal solution is then
found by an exhaustive search over all feasible task placement candidates, finding the optimal
computational resource allocation for each placement candidate, and choosing the placement
candidate that results in the lowest objective value.
D. Contributions
In this paper, we develop an energy-efficient task offloading method that offloads the compu-
tational burden of a task from a UE to one of the executing servers in a network of NFV-enabled
nodes. In doing so, a task is offloaded by sending the non-processed data of the task from the UE
to a remote radio head (RRH) over a radio link, sending the data from the RRH toward the
executing server through a transport network, and sending the processed data back from the
executing server to the UE. We assume that each task is offloaded under a respective deadline, i.e.,
the E2E latency of task offloading is less than the maximum acceptable latency of the task.
The main contributions and achievements of this paper are as follows:
•We develop a joint task offloading method in a practical scenario, i.e., the proposed method
allocates the transmit power, finds an executing server and the route to it, and allocates
the computational resources in an energy-efficient manner. Moreover, the proposed method
takes the E2E latency of task offloading into account. By the proposed method, the impact
of weak radio links is compensated by placing the tasks in servers closer to UEs and
consuming more computational resources. In contrast, limited computational resources are
compensated by allocating more transmit power, resulting in an efficient and adaptive task
offloading method.
•We propose a novel method for task placement and computational resource allocation.
While the conventional routing methods find a route to a predetermined node, our proposed
method jointly finds the executing server, its associated route, and the required computational
resources in an energy-efficient manner.
•We find a lower bound on the objective function of the optimization problem in the feasibility
analysis, i.e., an upper bound on the acceptance ratio of the proposed method. The lower
bound is obtained by relaxing some of the constraints in the optimization problem, performing
an exhaustive search over all feasible task placement candidates, and finding the optimal
computational resource allocation by utilizing Karush-Kuhn-Tucker conditions.
•Simulation results show that the proposed joint method outperforms its disjoint counterpart
in terms of acceptance ratio. Moreover, the lower bound on the optimal solution is almost
tight because the joint method attains the lower bound in practical scenarios. Specifically,
the optimality gap of the joint method is less than 5%.
E. Organization
The rest of the paper is organized as follows. Section II introduces the system model. Section
III describes the optimization problem formulation. In Section IV, we propose joint task offload-
ing, while disjoint task offloading and the lower bound on optimal task offloading are proposed in
Sections V and VI, respectively. Simulation results are presented in Section VII and the paper
is concluded in Section VIII.
F. Notation
The notation used in this paper is given as follows. Vectors are denoted by bold lowercase
symbols. The operators $\|\cdot\|$ and $|\cdot|$ are the vector norm and the absolute value of a scalar, respectively.
$(\mathbf{a})^T$ is the transpose of $\mathbf{a}$ and $[a]^+=\max(a,0)$. $\mathcal{A}\setminus\{a\}$ discards the element $a$ from the set $\mathcal{A}$.
Finally, $\mathbf{a}\sim\mathcal{CN}(\mathbf{0},\boldsymbol{\Sigma})$ is a complex Gaussian vector with zero mean and covariance matrix $\boldsymbol{\Sigma}$.
II. SYSTEM MODEL
The structure of the radio access network, channel model, and signaling scheme as well as
NFV-enabled network, computational resources, and capacity of network links are described in
this section.
A. Radio Access Network (RAN)
We consider a centralized RAN architecture with a baseband unit (BBU) pool, which serves
a set of $U$ RRHs, each equipped with $M$ antennas. The set of all users is denoted by $\mathcal{K}$. Each
user is equipped with a single antenna and the total number of users is $K=|\mathcal{K}|$. The considered
model is shown in Fig. 1(b). It is assumed that each RRH is connected to the BBU pool through
a fronthaul link.

We assume that each user requests a single task. Task $k$ is represented by a triplet $\langle L_k, D_k, T_k\rangle$,
where $L_k$ is the load of task $k$ (i.e., the required CPU cycles), $D_k$ is the data size of task $k$ (in
terms of bits), and $T_k$ is the maximum acceptable latency of task $k$.

Each UE transmits the non-processed data of its task to its serving RRH through a wireless
link. We assume that each UE is served by a single RRH. The set of users served by RRH $u$
is $\mathcal{K}_u=\{k\in\mathcal{K}\,|\,J^k_u=1\}$, where $J^k_u$ is an indicator which equals 1 if UE $k$ is connected to
RRH $u$ (0 otherwise). In this paper, we assume that the UE-RRH assignment is given and fixed.

Focusing on the wireless link, we assume a narrow-band block fading channel model [21]. The
channel vector between UE $k$ and RRH $u$ is denoted by $\mathbf{h}_{u,k}$, where $\mathbf{h}_{u,k}=\sqrt{Q_{u,k}}\,\tilde{\mathbf{h}}_{u,k}$, in which
$Q_{u,k}$ represents the path loss between RRH $u$ and UE $k$ and small-scale fading is modeled as
$\tilde{\mathbf{h}}_{u,k}\sim\mathcal{CN}(\mathbf{0},\mathbf{I}_M)$. Similar to [16], [17], we assume that the channel state information (CSI)
is constant over the offloading time. As we show through simulations, this assumption is non-restrictive
in practical scenarios in sub-6 GHz bands. UE $k$ transmits a symbol $x_k\sim\mathcal{CN}(0,1)$
with transmit power $\rho_k$ toward its serving RRH. The transmit power of UE $k$ is constrained to
a maximum value, i.e., $\rho_k\le P^{\max}_k,\ \forall k$. The received signal vector at RRH $u$ is:

$\mathbf{y}_u=\sum_{k\in\mathcal{K}}\mathbf{h}_{u,k}\sqrt{\rho_k}\,x_k,\quad\forall u. \qquad (1)$
We assume maximum ratio combining (MRC) at the RRHs because of its simplicity. Nevertheless,
MRC is asymptotically optimal in massive MIMO systems [30]. Therefore, the combined
signal is:

$\mathbf{z}_u=\mathbf{F}^H_u\mathbf{y}_u,\quad\forall u, \qquad (2)$

where $\mathbf{F}_u=[\mathbf{f}_k],\ \forall k\in\mathcal{K}_u$, and $\mathbf{f}_k=\frac{\mathbf{h}_{u,k}}{\|\mathbf{h}_{u,k}\|}$. The estimated signal of UE $k$ is:

$z_k=\mathbf{f}^H_k\mathbf{h}_{u,k}\sqrt{\rho_k}\,x_k+\sum_{j\in\mathcal{K}\setminus\{k\}}\mathbf{f}^H_k\mathbf{h}_{u,j}\sqrt{\rho_j}\,x_j+\mathbf{f}^H_k\mathbf{n}_u,\quad\forall k\in\mathcal{K}_u,$

where $\mathbf{n}_u\sim\mathcal{CN}(\mathbf{0},\sigma^2_n\mathbf{I}_M)$ is the received noise vector at RRH $u$. Thus, the signal to interference
plus noise ratio (SINR) of UE $k$ is:

$\mathrm{SINR}_k=\dfrac{\|\mathbf{h}_{u,k}\|^2\rho_k}{\sum_{j\in\mathcal{K}\setminus\{k\}}\frac{|\mathbf{h}^H_{u,k}\mathbf{h}_{u,j}|^2}{\|\mathbf{h}_{u,k}\|^2}\rho_j+\sigma^2_n},\quad\forall k\in\mathcal{K}_u. \qquad (3)$

Hence, the achievable data rate of UE $k$ is $R_k=W\log_2(1+\mathrm{SINR}_k)$ bits per second (bps),^1
where $W$ is the radio access network bandwidth. The radio transmission latency of task $k$ in the
uplink is $T^{\mathrm{tx}}_k=\frac{D_k}{R_k}$.^2 The sum of data rates of UEs served by RRH $u$ is less than the capacity
of its fronthaul link, i.e., $\sum_{k\in\mathcal{K}_u}R_k\le B_{f,u},\ \forall u$. In this paper, similar to [10], [29], and [11],
we assume that the processed data size of task $k$ is small. Moreover, since the power budget of
RRHs is generally large, the radio transmission latency in the downlink is assumed negligible.

^1 For a wide-band channel model, the data rate of UE $k$ is the sum rate over all sub-carriers allocated to UE $k$.

^2 No buffering is assumed in the transport network routing. Therefore, the transmission time of tasks' data over the transport network links is not taken into account.
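As an illustration of how (3) is evaluated, the short Python sketch below computes the MRC SINR, the achievable rate, and the uplink latency for a few users served by one RRH; the random channel realizations and parameter values are placeholders chosen for the example only.

```python
import numpy as np

# Sketch of the MRC SINR in (3) and the resulting rate/uplink latency.
# Channel realizations and parameters below are illustrative placeholders.
rng = np.random.default_rng(0)
M, K, W, sigma2 = 32, 4, 20e6, 1e-13          # antennas, users on one RRH, bandwidth, noise power
H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
rho = np.full(K, 0.5)                          # transmit powers (W)

def sinr_mrc(H, rho, sigma2, k):
    hk = H[:, k]
    signal = np.linalg.norm(hk) ** 2 * rho[k]
    interf = sum(abs(hk.conj() @ H[:, j]) ** 2 / np.linalg.norm(hk) ** 2 * rho[j]
                 for j in range(H.shape[1]) if j != k)
    return signal / (interf + sigma2)

for k in range(K):
    Rk = W * np.log2(1.0 + sinr_mrc(H, rho, sigma2, k))   # achievable rate (bps)
    print(f"UE {k}: R_k = {Rk/1e6:.1f} Mbps, T_tx = {1e5 / Rk * 1e3:.2f} ms for D = 0.1 Mbit")
```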
B. NFV-enabled Network
The NFV-enabled network includes a graph $G=(\mathcal{N},\mathcal{E})$, where $\mathcal{N}$ and $\mathcal{E}$ are the set of nodes
and edges (or links), respectively. A typical node in $\mathcal{N}$ is denoted by $n$ while the BBU pool is
indicated by $\bar{n}$ (which is also a node in $\mathcal{N}$). The link between two nodes $m$ and $m'$ is denoted
by $(m,m')$. Each NFV-enabled node is comprised of an executing server and a routing device.
The processing capacity (i.e., the maximum CPU cycles per second that can be carried out) of the
executing server in NFV-enabled node $n$ is indicated by $\Upsilon_n$. Moreover, the capacity of link
$(m,m')$ is indicated by $B_{(m,m')}$ in terms of bps.

In this paper, we assume the full offloading scheme, i.e., the task of each user is completely
executed in an executing server in the NFV-enabled network. Therefore, there is a need for
placing each task in a proper executing server. A task placement decision consists of selecting
an NFV-enabled node $n$ and its associated path from $\bar{n}$ to $n$. We denote the $b$th path between
nodes $\bar{n}$ and $n$ as $p^b_n$, where $b\in\mathcal{B}_n=\{1,\dots,B_n\}$ and $B_n$ is the total number of paths between
nodes $\bar{n}$ and $n$. Note that a path between $\bar{n}$ and $n$ may comprise some intermediate nodes,
which only forward the tasks' data via their routing devices and do not deliver the data to their
executing servers. We define the decision variable $\xi^k_{p^b_n}$, which equals 1 when task $k$ is offloaded to
node $n$ and sent over path $p^b_n$ (0 otherwise). Each task is offloaded to one and only one node
and path when we have:

$\sum_{n\in\mathcal{N}}\sum_{b\in\mathcal{B}_n}\xi^k_{p^b_n}=1,\quad\forall k. \qquad (4)$

The indicator $I^{(m,m')}_{p^b_n}$ determines whether a link contributes to a path. The indicator is equal to 1 when
link $(m,m')$ contributes to path $p^b_n$ (0 otherwise). Moreover, the set of all links that contribute to
path $p^b_n$ is $\mathcal{E}_{p^b_n}=\{(m,m')\in\mathcal{E}\,|\,I^{(m,m')}_{p^b_n}=1\}$. The amount of computational resources allocated
to task $k$ is denoted by $\upsilon_k$ (in terms of CPU cycles per second). Note that the execution of each
task is performed at only one node. To ensure that the allocated computational resources do
not violate the processing capacity of that node, we should have:

$\sum_{k\in\mathcal{K}}\sum_{b\in\mathcal{B}_n}\upsilon_k\xi^k_{p^b_n}\le\Upsilon_n,\quad\forall n. \qquad (5)$

Since the data of task $k$ is sent over the network with rate $R_k$, the aggregated rates of all tasks
that pass through a link should not exceed its capacity, which is guaranteed by the following constraint:

$\sum_{k\in\mathcal{K}}\sum_{n\in\mathcal{N}}\sum_{b\in\mathcal{B}_n}I^{(m,m')}_{p^b_n}\xi^k_{p^b_n}R_k\le B_{(m,m')},\quad\forall (m,m')\in\mathcal{E}. \qquad (6)$

The execution latency of task $k$ is $T^{\mathrm{exe}}_k=\frac{L_k}{\upsilon_k}$. The processed data of task $k$ is sent back toward
the BBU pool (i.e., node $\bar{n}$). In this paper, we assume the paths of the uplink and downlink
are the same. Therefore, the overall propagation latency of task $k$ over the path $p^b_n$ is
twice the propagation latency of path $p^b_n$. Thus, the propagation latency of task $k$ is
$T^{\mathrm{prop}}_k=2\sum_{n\in\mathcal{N}}\sum_{b\in\mathcal{B}_n}\sum_{(m,m')\in\mathcal{E}_{p^b_n}}\xi^k_{p^b_n}\delta_{(m,m')}$, where $\delta_{(m,m')}$ is the propagation latency of link $(m,m')$.
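The bookkeeping behind (4)-(6) can be sketched as follows; the toy graph, capacities, and helper names below are illustrative assumptions rather than the paper's data.

```python
# Sketch of the placement bookkeeping in (4)-(6): node capacity, link capacity,
# and round-trip propagation latency of the selected path. Toy data only.

nodes = {"n_bar": 1e9, "n1": 1e9}                          # Upsilon_n (cycles/s)
links = {("n_bar", "n1"): {"cap": 0.4e9, "delay": 0.010}}  # B_(m,m') and delta_(m,m')
paths = {("n1", 1): [("n_bar", "n1")]}                     # path p^1_{n1} as a list of links

def placement_ok(path_links, R_k, v_k, node, used_cpu, used_rate):
    """Check (5) and (6) for one task added on top of existing allocations."""
    if used_cpu.get(node, 0.0) + v_k > nodes[node]:
        return False
    return all(used_rate.get(e, 0.0) + R_k <= links[e]["cap"] for e in path_links)

def t_prop(path_links):
    return 2.0 * sum(links[e]["delay"] for e in path_links)  # uplink + downlink

print(placement_ok(paths[("n1", 1)], 20e6, 5e8, "n1", {}, {}), t_prop(paths[("n1", 1)]))
```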
Table I summarizes the notation used in the paper.
TABLE I: Main Notation.

U, M, K : Number of RRHs, antennas, and users
W : Radio access network bandwidth
$\mathcal{K}$, $\mathcal{N}$, $\mathcal{E}$ : Set of all users, nodes, and links
$\mathcal{K}_u$ : Set of users served by RRH $u$
$P^{\max}_k$ : Power budget of UE $k$
$\mathbf{h}_{u,k}$ : Channel vector between user $k$ and RRH $u$
$L_k$, $D_k$, $T_k$ : Load, data size, and maximum acceptable latency of task $k$
$\xi^k_{p^b_n}$ : Decision variable for assignment of node $n$ and its associated path $p^b_n$ to task $k$
$\Upsilon_n$ : Processing capacity of node $n$
$B_{f,u}$ : Capacity of fronthaul link for RRH $u$
$B_{(m,m')}$, $\delta_{(m,m')}$ : Capacity and propagation latency of link $(m,m')$
$\Lambda_n$ : Computational energy efficiency coefficient of node $n$
$p^b_n$ : $b$th path between nodes $\bar{n}$ and $n$
$\upsilon_k$ : Computational resources allocated to task $k$
$\mathcal{B}_n$ : Set of paths between nodes $\bar{n}$ and $n$
$\rho_k$ : Allocated transmit power to UE $k$
$\mathcal{E}_{p^b_n}$ : Set of all links that contribute to path $p^b_n$
$\alpha_k$ : Non-negative variable of task $k$
$I^{(m,m')}_{p^b_n}$ : Indicator determining whether link $(m,m')$ contributes to path $p^b_n$
$R_k$ : Data rate of task $k$
$J^k_u$ : Indicator determining whether UE $k$ is assigned to RRH $u$
$T^{\mathrm{exe}}_k$ : Execution latency of task $k$
$T^{\mathrm{tx}}_k$ : Radio transmission latency of task $k$
$T^{\mathrm{prop}}_k$ : Propagation latency of task $k$
III. PROBLEM FORMULATION

In this section, we formulate the optimization problem of joint task offloading. Each task is offloaded
under its E2E latency constraint and in an energy-efficient manner. The objective function
is $E(\boldsymbol{\xi},\boldsymbol{\upsilon},\boldsymbol{\rho})=\sum_{k\in\mathcal{K}}\rho_k+\eta\sum_{n\in\mathcal{N}}\sum_{k\in\mathcal{K}}\sum_{b\in\mathcal{B}_n}\Lambda_n\xi^k_{p^b_n}\upsilon_k^3$, where $\boldsymbol{\xi}=[\xi^1_{p^1_1},\dots,\xi^K_{p^{B_N}_N}]^T$, $\boldsymbol{\upsilon}=[\upsilon_1,\dots,\upsilon_K]^T$, and $\boldsymbol{\rho}=[\rho_1,\dots,\rho_K]^T$ are the vectors of all $\xi^k_{p^b_n}$, $\upsilon_k$, and $\rho_k$, respectively; $\Lambda_n$
denotes the computational energy efficiency coefficient of node $n$ [18], and $\eta$ is a weight. Note
that the first term in $E$ is the transmit power consumption and the second term is the power
consumption of the executing servers. Therefore, the joint task offloading optimization problem is:

$\min_{\boldsymbol{\xi},\boldsymbol{\upsilon},\boldsymbol{\rho}}\ E(\boldsymbol{\xi},\boldsymbol{\upsilon},\boldsymbol{\rho})$
s.t. C1: $T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k+T^{\mathrm{tx}}_k\le T_k,\ \forall k$,
C2: $\sum_{k\in\mathcal{K}}\sum_{b\in\mathcal{B}_n}\upsilon_k\xi^k_{p^b_n}\le\Upsilon_n,\ \forall n$,
C3: $\sum_{k\in\mathcal{K}}\sum_{n\in\mathcal{N}}\sum_{b\in\mathcal{B}_n}I^{(m,m')}_{p^b_n}\xi^k_{p^b_n}R_k\le B_{(m,m')},\ \forall (m,m')\in\mathcal{E}$,
C4: $\sum_{k\in\mathcal{K}_u}R_k\le B_{f,u},\ \forall u$,
C5: $\rho_k\le P^{\max}_k,\ \forall k$,
C6: $\sum_{n\in\mathcal{N}}\sum_{b\in\mathcal{B}_n}\xi^k_{p^b_n}=1,\ \forall k$,   (7)

under variables $\boldsymbol{\xi}\in\{0,1\}$, $\boldsymbol{\upsilon}\ge\mathbf{0}$, $\boldsymbol{\rho}\ge\mathbf{0}$. Constraint C1 guarantees that the maximum acceptable
latency of task offloading is respected. Constraints C2 and C3 make sure that all tasks are
offloaded without violating the processing capacity of the nodes and the capacity of the links, respectively.
Constraint C4 ensures the capacity of the fronthaul links is respected. Constraint C5 guarantees the power budget
of the UEs, while constraint C6 makes sure that each task is offloaded to only one node and path.
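A small sketch of how a candidate solution of (7) would be scored and checked against C1; the dictionaries and numbers are hypothetical and only the local-variable names mirror the notation above.

```python
# Sketch: evaluate the objective of (7) and constraint C1 for a candidate solution.
# xi maps task k -> (node n, path b); all numbers are illustrative.

eta, Lambda = 1.0, {"n_bar": 1e-28, "n1": 1e-28}       # weight eta and coefficients Lambda_n
tasks = {0: {"L": 1e6, "D": 1e5, "T": 0.040}}          # L_k (cycles), D_k (bits), T_k (s)
xi = {0: ("n1", 1)}                                    # task 0 -> node n1 via its path 1
v, rho, R, Tprop = {0: 2e8}, {0: 0.3}, {0: 20e6}, {0: 0.020}

def objective(xi, v, rho):
    """E(xi, v, rho) = sum_k rho_k + eta * sum_n,k,b Lambda_n * xi * v_k^3."""
    return sum(rho.values()) + sum(eta * Lambda[xi[k][0]] * v[k] ** 3 for k in xi)

def c1_satisfied(k):
    """C1: T^tx_k + T^prop_k + T^exe_k <= T_k."""
    return tasks[k]["D"] / R[k] + Tprop[k] + tasks[k]["L"] / v[k] <= tasks[k]["T"]

print(objective(xi, v, rho), c1_satisfied(0))
```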
IV. JOINT TASK OFFLOADING (JTO)
In this section, we solve optimization problem (7). This problem is non-convex due to the integer
variable $\boldsymbol{\xi}$ and the coupled variables in C1-C4. Therefore, we solve (7) by decoupling transmit power
allocation from task placement and computational resource allocation. In doing so, transmit power
is allocated given the task placement and the allocated computational resources. Then, we perform task
placement and computational resource allocation given the allocated transmit powers. The proposed
approach needs a feasible initialization. However, it is likely for constraint C1 to make (7)
infeasible. Thus, we need to propose a feasibility analysis that finds a feasible subset of tasks.
A. Feasibility Analysis
The feasible set of (7) is extended by adding a non-negative variable $\alpha_k$ to the maximum acceptable
latency of task $k$. Thus, the feasibility problem is constructed by replacing the objective
function of (7) with the sum of non-negative variables, i.e., $\sum_{k=1}^{K}\alpha_k$ [31]. The constraints which
cause infeasibility are found by solving the feasibility problem and determining the constraints
with positive values of the non-negative variables. The feasibility problem is:

$\min_{\boldsymbol{\xi},\boldsymbol{\upsilon},\boldsymbol{\rho},\boldsymbol{\alpha}}\ \sum_{k\in\mathcal{K}}\alpha_k$
s.t. C1-a: $T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k+T^{\mathrm{tx}}_k\le T_k+\alpha_k,\ \forall k\in\mathcal{K}$,
C2-C6,   (8)

under variables $\boldsymbol{\xi}\in\{0,1\}$, $\boldsymbol{\upsilon}\ge\mathbf{0}$, $\boldsymbol{\rho}\ge\mathbf{0}$, $\boldsymbol{\alpha}\ge\mathbf{0}$. Note that non-negative variables are added
only to C1 because when C1 is eliminated, the optimization problem (7) is always feasible. Thus,
we seek the tasks whose maximum acceptable latencies are violated and eliminate them one
by one until a subset of feasible tasks remains. The solution to (8) not only provides the infeasible
constraints but also determines the level of infeasibility, i.e., constraints with larger values of the
non-negative variables need more resources to become feasible. Therefore, we first eliminate the
tasks with larger values of the non-negative variables.

Without loss of equivalence, we add the summation of the inequalities in C1-a as a new constraint
C7. Therefore, optimization problem (8) is restated as:

$\min_{\boldsymbol{\xi},\boldsymbol{\upsilon},\boldsymbol{\rho},\boldsymbol{\alpha}}\ \sum_{k\in\mathcal{K}}\alpha_k$
s.t. C1-a: $T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k+T^{\mathrm{tx}}_k\le T_k+\alpha_k,\ \forall k$,
C2-C6,
C7: $\sum_{k\in\mathcal{K}}(T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k+T^{\mathrm{tx}}_k-T_k)\le\sum_{k\in\mathcal{K}}\alpha_k$.   (9)

This optimization problem is equivalent to:

$\min_{\boldsymbol{\xi},\boldsymbol{\upsilon},\boldsymbol{\rho},\boldsymbol{\alpha}}\ \sum_{k\in\mathcal{K}}(T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k+T^{\mathrm{tx}}_k)$
s.t. C1-a and C2-C6,   (10)

in which the term $\sum_{k\in\mathcal{K}}T_k$ is removed from the objective because it is constant. We solve
(10) by decoupling transmit power allocation from task placement and computational resource
allocation. In other words, we solve (10) under variables $\boldsymbol{\upsilon},\boldsymbol{\xi},\boldsymbol{\alpha}$ with $\boldsymbol{\rho}$ fixed, and vice versa.
To perform task placement and computational resource allocation, we need an initial $\boldsymbol{\rho}=\boldsymbol{\rho}_0$
that satisfies C3 and C4; these constraints are satisfied with small values of $R_k$, i.e., small values of $\rho_k$.
Next, we solve the following optimization problem:

$\min_{\boldsymbol{\alpha},\boldsymbol{\upsilon},\boldsymbol{\xi}}\ \sum_{k\in\mathcal{K}}(T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k)$
s.t. C1-a, C2, C3, and C6,   (11)
by a heuristic method. As in Algorithm 1, we find the variables $\boldsymbol{\xi}$ and $\boldsymbol{\upsilon}$ that minimize the objective
of (11). Then, we set the non-negative variables so that C1 is feasible. In doing so, for task $k$,
we calculate the amount of unused computational resources at all nodes, formally expressed as
$\tilde{\Upsilon}^k_n=\Upsilon_n-\sum_{j\in\mathcal{K}\setminus\{k\}}\sum_{b\in\mathcal{B}_n}\upsilon_j\xi^j_{p^b_n}$. Moreover, the available capacity of link $(m,m')$ is
$\tilde{B}^k_{(m,m')}=B_{(m,m')}-\sum_{j\in\mathcal{K}\setminus\{k\}}\sum_{n\in\mathcal{N}}\sum_{b\in\mathcal{B}_n}I^{(m,m')}_{p^b_n}\xi^j_{p^b_n}R_j$. A task is placed in node $n$ only when there is
a feasible path between $\bar{n}$ and $n$, i.e., a path with sufficient capacity in all of its links. The set
of all such nodes is $\mathcal{N}_k$. For each $n\in\mathcal{N}_k$, we calculate $T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k$ when $\upsilon_k=\tilde{\Upsilon}^k_n$. Next,
we find the node and feasible path that give the smallest $T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k$, denoted by $n^\star$ and $b^\star$,
respectively. Note that from C1, the sufficient computational resource allocated to task $k$ is
$\upsilon_{\mathrm{temp}}=\frac{L_k}{T_k-T^{\mathrm{tx}}_k-T^{\mathrm{prop}}_k}$. When $\tilde{\Upsilon}^k_{n^\star}\ge\upsilon_{\mathrm{temp}}$, C1 is satisfied by setting $\upsilon_k=\upsilon_{\mathrm{temp}}$ and $\alpha_k=0$.
Otherwise, we set $\upsilon_k=\tilde{\Upsilon}^k_{n^\star}$ and $\alpha_k=T^{\mathrm{tx}}_k+T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k-T_k$. Next, the available computational
resources of the nodes and the available capacity of the links are updated, and this process is repeated for
all of the tasks. Note that we begin with the tasks that require lower resources, i.e., tasks with lower
values of $T_k$.
Algorithm 1: Heuristic Algorithm for Solving (11).
Input: $\boldsymbol{\rho}$
1: Sort the tasks: $T_{[1]}\le T_{[2]}\le\dots\le T_{[|\mathcal{K}|]}$
2: for $k=[1]:[|\mathcal{K}|]$ do
   % Find the feasible nodes according to the capacity of the paths terminating at each node
3:   $\mathcal{N}_k=\{n\in\mathcal{N}\,|\,\exists b: R_k\le\tilde{B}^k_{(m,m')}\ \forall (m,m')\in\mathcal{E}_{p^b_n}\}$
4:   $\tilde{\Upsilon}^k_n=\Upsilon_n-\sum_{j\in\mathcal{K}\setminus\{k\}}\sum_{b\in\mathcal{B}_n}\upsilon_j\xi^j_{p^b_n},\ \forall n$
   % Find the best node and its associated path
5:   $(n^\star,b^\star)=\arg\min_{n\in\mathcal{N}_k,\,b\in\mathcal{B}_n}\ T^{\mathrm{exe}}(\tilde{\Upsilon}^k_n)+T^{\mathrm{prop}}(p^b_n)$
6:   Set $\xi^k_{p^{b^\star}_{n^\star}}=1$ and $\xi^k_{p^b_n}=0,\ \forall (n,b)\ne(n^\star,b^\star)$
   % Update the computational resource allocation and the non-negative variable
7:   $\upsilon_{\mathrm{temp}}=\frac{L_k}{T_k-T^{\mathrm{tx}}_k-T^{\mathrm{prop}}_k}$
8:   if $\tilde{\Upsilon}^k_{n^\star}\ge\upsilon_{\mathrm{temp}}$ then
9:     Set $\upsilon^\star_k=\upsilon_{\mathrm{temp}}$ and $\alpha^\star_k=0$
10:  else
11:    Set $\upsilon^\star_k=\tilde{\Upsilon}^k_{n^\star}$ and $\alpha^\star_k=T^{\mathrm{tx}}_k+T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k(p^{b^\star}_{n^\star})-T_k$
Output: $\boldsymbol{\alpha}^\star,\boldsymbol{\xi}^\star,\boldsymbol{\upsilon}^\star$
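A compact Python rendering of the placement logic in Algorithm 1 is sketched below under simplifying assumptions (a single candidate path per node and no link-capacity bookkeeping); function and variable names are hypothetical.

```python
# Compact sketch of the heuristic in Algorithm 1 (simplified: one candidate path per
# node, link-capacity checks omitted). Data structures and values are illustrative.

def heuristic_placement(tasks, nodes, t_tx, t_prop):
    """tasks: k -> (L_k, T_k); nodes: n -> remaining capacity; t_prop: n -> round-trip delay."""
    free = dict(nodes)
    xi, v, alpha = {}, {}, {}
    for k in sorted(tasks, key=lambda k: tasks[k][1]):            # lower T_k first
        L_k, T_k = tasks[k]
        # best node if task k were given all remaining capacity there (line 5 of Algorithm 1)
        n_star = min(free, key=lambda n: L_k / max(free[n], 1e-9) + t_prop[n])
        v_temp = L_k / max(T_k - t_tx[k] - t_prop[n_star], 1e-9)  # just enough to meet C1
        if free[n_star] >= v_temp and T_k - t_tx[k] - t_prop[n_star] > 0:
            v[k], alpha[k] = v_temp, 0.0
        else:
            v[k] = free[n_star]
            alpha[k] = t_tx[k] + L_k / max(v[k], 1e-9) + t_prop[n_star] - T_k
        xi[k] = n_star
        free[n_star] -= v[k]
    return xi, v, alpha

tasks = {0: (1e6, 0.020), 1: (1e6, 0.050)}
print(heuristic_placement(tasks, {"n_bar": 1e9, "n1": 1e9},
                          t_tx={0: 0.005, 1: 0.005}, t_prop={"n_bar": 0.0, "n1": 0.02}))
```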
After solving (11), we allocate the transmit power by solving:

$\min_{\boldsymbol{\rho}}\ \sum_{k=1}^{K}T^{\mathrm{tx}}_k$
s.t. C1-a and C3-C5.   (12)

Note that in the heuristic method, we have $T^{\mathrm{tx}}_k+T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k=T_k+\alpha_k$. As a result, any
feasible solution to (12) does not increase $T^{\mathrm{tx}}_k$ because (12) is infeasible for larger values of $T^{\mathrm{tx}}_k$.
Hence, replacing (12) with its feasibility problem counterpart does not impact the decreasing
monotonicity of the objective function in (10). The feasibility problem of (12) is:

find $\boldsymbol{\rho}$
s.t. C1-a and C3-C5.   (13)

In solving (13), we note that the constraints C1-a, C3, and C4 are non-convex. Therefore, we
need to find a convexified version of (13). We use CCP [32] to convexify (13). In doing so, we
reformulate C1-a as:

$R_k\ge\dfrac{D_k}{T_k+\alpha_k-T^{\mathrm{prop},i}_k-T^{\mathrm{exe},i}_k}, \qquad (14)$
where $T^{\mathrm{exe},i}_k$ and $T^{\mathrm{prop},i}_k$ are the execution latency and propagation latency of task $k$ obtained
from the heuristic method in the $i$th iteration, respectively. In order to convexify (14), we need a
concave approximation of $R_k$ with respect to $\boldsymbol{\rho}$. The rate $R_k$ is:

$R_k=W\log_2\dfrac{\sum_{j\in\mathcal{K}}\frac{|\mathbf{h}^H_{u,k}\mathbf{h}_{u,j}|^2}{\|\mathbf{h}_{u,j}\|^2}\rho_j+\sigma^2_n}{\sum_{j\in\mathcal{K}\setminus\{k\}}\frac{|\mathbf{h}^H_{u,k}\mathbf{h}_{u,j}|^2}{\|\mathbf{h}_{u,j}\|^2}\rho_j+\sigma^2_n},\quad k\in\mathcal{K}_u, \qquad (15)$

which is equivalent to:

$R_k=\underbrace{W\log_2\Big(\sum_{u=1}^{U}\sum_{j\in\mathcal{K}_u}\tfrac{|\mathbf{h}^H_{u,k}\mathbf{h}_{u,j}|^2}{\|\mathbf{h}_{u,j}\|^2}\rho_j+\sigma^2_n\Big)}_{h_k(\boldsymbol{\rho})}-\underbrace{W\log_2\Big(\sum_{u=1}^{U}\sum_{j\in\mathcal{K}_u\setminus\{k\}}\tfrac{|\mathbf{h}^H_{u,k}\mathbf{h}_{u,j}|^2}{\|\mathbf{h}_{u,j}\|^2}\rho_j+\sigma^2_n\Big)}_{g_k(\boldsymbol{\rho})}. \qquad (16)$

Both $h_k(\boldsymbol{\rho})$ and $g_k(\boldsymbol{\rho})$ are concave functions of $\boldsymbol{\rho}$. Thus, we need to find a linear approximation
of $g_k(\boldsymbol{\rho})$, which is $\hat{g}_k(\boldsymbol{\rho})=g_k(\boldsymbol{\rho}_0)+\nabla g_k(\boldsymbol{\rho}_0)^T(\boldsymbol{\rho}-\boldsymbol{\rho}_0)$, where:

$[\nabla g_k(\boldsymbol{\rho})]_i=\begin{cases}\dfrac{W\sum_{u=1}^{U}J^i_u\frac{|\mathbf{h}^H_{u,k}\mathbf{h}_{u,i}|^2}{\|\mathbf{h}_{u,i}\|^2}}{\ln(2)\Big(\sum_{u=1}^{U}\sum_{j\in\mathcal{K}_u\setminus\{k\}}\frac{|\mathbf{h}^H_{u,k}\mathbf{h}_{u,j}|^2}{\|\mathbf{h}_{u,j}\|^2}\rho_j+\sigma^2_n\Big)}, & i\in\mathcal{K}\setminus\{k\},\\ 0, & i=k.\end{cases} \qquad (17)$
Next, we focus on the convex approximation of C3 and C4. To this aim, we find a convex
approximation of $R_k$, which is obtained by a linear approximation of $h_k(\boldsymbol{\rho})$. Thus, we have
$\hat{h}_k(\boldsymbol{\rho})=h_k(\boldsymbol{\rho}_0)+\nabla h_k(\boldsymbol{\rho}_0)^T(\boldsymbol{\rho}-\boldsymbol{\rho}_0)$, where:

$[\nabla h_k(\boldsymbol{\rho})]_i=\dfrac{W\sum_{u=1}^{U}J^i_u\frac{|\mathbf{h}^H_{u,k}\mathbf{h}_{u,i}|^2}{\|\mathbf{h}_{u,i}\|^2}}{\ln(2)\Big(\sum_{u=1}^{U}\sum_{j\in\mathcal{K}_u}\frac{|\mathbf{h}^H_{u,k}\mathbf{h}_{u,j}|^2}{\|\mathbf{h}_{u,j}\|^2}\rho_j+\sigma^2_n\Big)},\quad i\in\mathcal{K}. \qquad (18)$

Finally, the convexified version of (12) is:

find $\boldsymbol{\rho}$
s.t. C1-b: $h_k(\boldsymbol{\rho})-\hat{g}_k(\boldsymbol{\rho})\ge\dfrac{D_k}{T_k+\alpha_k-T^{\mathrm{prop},i}_k-T^{\mathrm{exe},i}_k},\ \forall k\in\mathcal{K}$,
C3-a: $\sum_{k\in\mathcal{K}}\sum_{n\in\mathcal{N}}\sum_{b\in\mathcal{B}_n}I^{(m,m')}_{p^b_n}\xi^k_{p^b_n}\big(\hat{h}_k(\boldsymbol{\rho})-g_k(\boldsymbol{\rho})\big)\le B_{(m,m')},\ \forall (m,m')\in\mathcal{E}$,
C4-a: $\sum_{k\in\mathcal{K}_u}\big(\hat{h}_k(\boldsymbol{\rho})-g_k(\boldsymbol{\rho})\big)\le B_{f,u},\ \forall u$,
C5: $\rho_k\le P^{\max}_k,\ \forall k$,   (19)
under variable $\boldsymbol{\rho}\ge\mathbf{0}$. Note that, based on CCP, any feasible solution of (19) is also feasible
for (13) [32]. The feasibility problem (8) is solved by alternately solving (11) and (19). Then,
we reject the task that makes (7) infeasible. According to Algorithm 2, we find the value of the
maximum non-negative variable. If this value is positive, its associated task is rejected, the set
of served tasks is updated, and (8) is solved for the updated set of tasks. This procedure continues
until all non-negative variables are zero. The output of Algorithm 2 is the feasible subset of tasks
$\mathcal{K}^\star$ as well as the solution of (8), i.e., the values of $\boldsymbol{\xi}^{\mathrm{ini}}$, $\boldsymbol{\rho}^{\mathrm{ini}}$, and $\boldsymbol{\upsilon}^{\mathrm{ini}}$, which are utilized as the
initialization for solving (7).
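The sketch below illustrates one CCP iteration of the convexified power sub-problem, in the spirit of (19) but trimmed: a single RRH is assumed, the link and fronthaul constraints C3-a/C4-a are dropped, and the total power is minimized instead of solving a pure feasibility problem (as in (21)). It uses cvxpy as a stand-in for the CVX/MOSEK toolchain mentioned in Section VII; all numerical values are placeholders.

```python
import numpy as np
import cvxpy as cp

# One CCP iteration of a trimmed version of (19): single RRH, no C3-a/C4-a,
# minimize total power subject to the linearized rate constraint C1-b.
rng = np.random.default_rng(1)
M, K, W, sigma2, Pmax = 32, 3, 20e6, 1e-13, 0.5
H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
A = np.abs(H.conj().T @ H) ** 2 / np.linalg.norm(H, axis=0) ** 2   # A[k, j] = |h_k^H h_j|^2 / ||h_j||^2
rate_req = np.full(K, 5e6)            # D_k / (T_k + alpha_k - T^prop_k - T^exe_k), placeholder
rho0 = np.full(K, 1e-3)               # previous CCP point rho_0

rho = cp.Variable(K, nonneg=True)
cons = [rho <= Pmax]
for k in range(K):
    mask = np.arange(K) != k
    h_k = W * cp.log(A[k] @ rho + sigma2) / np.log(2)              # h_k(rho), concave
    denom = A[k, mask] @ rho0[mask] + sigma2
    grad = np.zeros(K)
    grad[mask] = W / np.log(2) * A[k, mask] / denom                # gradient of g_k at rho_0
    g_hat = W * np.log(denom) / np.log(2) + grad @ (rho - rho0)    # linearized g_k, i.e., g_hat_k(rho)
    cons.append(h_k - g_hat >= rate_req[k])                        # C1-b (concave >= constant)
prob = cp.Problem(cp.Minimize(cp.sum(rho)), cons)
prob.solve()
print(prob.status, rho.value)
```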
B. Optimization
Given the feasible solution $\boldsymbol{\xi}^{\mathrm{ini}},\boldsymbol{\rho}^{\mathrm{ini}},\boldsymbol{\upsilon}^{\mathrm{ini}}$, and the set of accepted tasks $\mathcal{K}^\star$, we seek the
solution of (7). Similar to Algorithm 2, we decouple power allocation from task placement and
computational resource allocation. The optimization problem of task placement and computational
resource allocation is:

$\min_{\boldsymbol{\upsilon},\boldsymbol{\xi}}\ \sum_{n\in\mathcal{N}}\sum_{k\in\mathcal{K}}\sum_{b\in\mathcal{B}_n}\Lambda_n\xi^k_{p^b_n}\upsilon_k^3$
s.t. C1-C3 and C6,   (20)

which is non-convex. Note that the objective of (20) is an increasing function of $\upsilon_k$, and allocating
less computational resources to task $k$ decreases the power consumption. However, allocating less
computational resources increases the execution latency and may violate the E2E latency constraint.
Algorithm 2: JTO Feasibility Analysis for Solving (8).
Initialize: $\mathcal{K}=\{1,\dots,K\}$, $\boldsymbol{\xi}=\mathbf{0}$, $\boldsymbol{\rho}_0$: very small
1: repeat
2:   $i=0$
3:   repeat
     % Allocate transmit power and computational resources, and place the tasks
4:     Solve (11) via Algorithm 1 and return $\boldsymbol{\upsilon}^{i+1},\boldsymbol{\xi}^{i+1}$, and $\boldsymbol{\alpha}^{i+1}$
5:     Solve (19) and return $\boldsymbol{\rho}^{i+1}$
6:     $i=i+1$
7:   until $\sum_{k\in\mathcal{K}}\alpha^i_k-\sum_{k\in\mathcal{K}}\alpha^{i+1}_k\le\epsilon$ or $i\ge I_{\max}$
     % Discard the infeasible task
8:   $k^\star=\arg\max_{k\in\mathcal{K}}\alpha_k$
9:   if $\alpha_{k^\star}>0$ then
10:    $\mathcal{K}=\mathcal{K}\setminus\{k^\star\}$
11: until $\sum_{k\in\mathcal{K}}\alpha_k=0$
Output: $\boldsymbol{\xi}^{\mathrm{ini}}=\boldsymbol{\xi}^{i+1}$, $\boldsymbol{\rho}^{\mathrm{ini}}=\boldsymbol{\rho}^{i+1}$, $\boldsymbol{\upsilon}^{\mathrm{ini}}=\boldsymbol{\upsilon}^{i+1}$, and $\mathcal{K}^\star=\mathcal{K}$
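Structurally, the outer loop of Algorithm 2 can be sketched as follows; `place_and_allocate` and `allocate_power` are hypothetical callables standing in for Algorithm 1 and the convexified problem (19) (e.g., the two sketches given earlier).

```python
def jto_feasibility(tasks, place_and_allocate, allocate_power, eps=1e-6, i_max=100):
    """Outer loop of Algorithm 2: alternate the sub-problems, then drop the worst task."""
    served = set(tasks)
    while served:
        rho, alpha_prev = None, {k: float("inf") for k in served}
        for _ in range(i_max):
            xi, v, alpha = place_and_allocate(served, rho)      # role of Algorithm 1 / problem (11)
            rho = allocate_power(served, xi, v, alpha)          # role of the convexified problem (19)
            if sum(alpha_prev[k] for k in served) - sum(alpha.values()) <= eps:
                break
            alpha_prev = alpha
        worst = max(served, key=lambda k: alpha[k])
        if alpha[worst] <= 0:
            return served, xi, v, rho                           # every remaining task meets its deadline
        served.remove(worst)                                    # discard the most infeasible task
    return served, {}, {}, None                                 # nothing could be served
```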
As a result, we need to find nodes with smaller propagation latency to compensate for the increased
execution latency. In doing so, we find a subset of nodes with smaller propagation latency
than the current executing server and with sufficient capacity on the links terminating at those nodes.
This set of nodes is $\mathcal{N}'_k=\{n'\in\mathcal{N}\,|\,\exists b': R_k\le\tilde{B}^k_{(m,m')}\ \forall (m,m')\in\mathcal{E}_{p^{b'}_{n'}}\ \text{and}\ T^{\mathrm{prop}}_k(p^{b'}_{n'})\le T^{\mathrm{prop}}_k(p^b_n)\}$,
where we assume task $k$ was previously placed through path $p^b_n$. For each node in
$\mathcal{N}'_k$, we calculate the minimum computational resources that satisfy the E2E latency constraint,
i.e., $\upsilon_{\mathrm{temp}}=\frac{L_k}{T_k-T^{\mathrm{tx}}_k-T^{\mathrm{prop}}_k(p^{b'}_{n'})}$. When $\tilde{\Upsilon}^k_{n'}\ge\upsilon_{\mathrm{temp}}$ and $\Lambda_{n'}\upsilon^3_{\mathrm{temp}}\le\Lambda_n\upsilon^3_k$, we ensure that task
placement through $p^{b'}_{n'}$ and computational resource allocation $\upsilon_{\mathrm{temp}}$ are feasible and result in
lower power consumption. Therefore, we set $\upsilon_k=\upsilon_{\mathrm{temp}}$. Otherwise, we retain the previous $\upsilon_k$ for task $k$.
Algorithm 3 begins with the tasks with larger power consumption, i.e., larger $\Lambda_{n_k}\upsilon^3_k$, where $n_k$ denotes
the executing server of task $k$. This procedure is repeated for all accepted tasks.
The sub-problem of transmit power allocation, after convexification, is:

$\min_{\boldsymbol{\rho}}\ \sum_{k\in\mathcal{K}}\rho_k$
s.t. C1-c: $h_k(\boldsymbol{\rho})-\hat{g}_k(\boldsymbol{\rho})\ge\dfrac{D_k}{T_k-T^{\mathrm{prop},i}_k-T^{\mathrm{exe},i}_k},\ \forall k\in\mathcal{K}$,
C3-a, C4-a, and C5.   (21)

Based on CCP in Algorithm 4 and starting from $\boldsymbol{\rho}_0=\boldsymbol{\rho}^{\mathrm{ini}}$, an iterative solution of (21)
provides a sub-optimal transmit power allocation. Finally, optimization problem (7) is solved
Algorithm 3: Heuristic Algorithm for Solving (20).
Input: $\boldsymbol{\xi}^{\mathrm{ini}},\boldsymbol{\rho}^{\mathrm{ini}},\boldsymbol{\upsilon}^{\mathrm{ini}}$
1: Sort the tasks: $\Lambda_{[1]}\upsilon^3_{[1]}\le\Lambda_{[2]}\upsilon^3_{[2]}\le\dots\le\Lambda_{[K]}\upsilon^3_{[K]}$
2: for $k=[1]:[|\mathcal{K}|]$ do
   % Find the feasible nodes according to the capacity of the paths terminating at each node
3:   $\mathcal{N}'_k=\{n'\in\mathcal{N}\,|\,\exists b': R_k\le\tilde{B}^k_{(m,m')}\ \forall (m,m')\in\mathcal{E}_{p^{b'}_{n'}}\ \text{and}\ T^{\mathrm{prop}}_k(p^{b'}_{n'})\le T^{\mathrm{prop}}_k(p^b_n)\}$
4:   for $n'\in\mathcal{N}'_k$ do
5:     $\upsilon_{\mathrm{temp}}=\frac{L_k}{T_k-T^{\mathrm{tx}}_k-T^{\mathrm{prop}}_k(p^{b'}_{n'})}$
6:     $\tilde{\Upsilon}^k_{n'}=\Upsilon_{n'}-\sum_{j\in\mathcal{K}\setminus\{k\}}\sum_{b\in\mathcal{B}_{n'}}\upsilon_j\xi^j_{p^b_{n'}}$
7:     if $\tilde{\Upsilon}^k_{n'}\ge\upsilon_{\mathrm{temp}}$ and $\Lambda_{n'}\upsilon^3_{\mathrm{temp}}\le\Lambda_{n^\star}\upsilon^3_k$ then
8:       Set $\upsilon^\star_k=\upsilon_{\mathrm{temp}}$
9:       Set $\xi^{k\star}_{p^{b'}_{n'}}=1$ and $\xi^{k\star}_{p^b_n}=0,\ \forall (n,b)\ne(n',b')$
10:      break
Output: $\boldsymbol{\xi}^\star,\boldsymbol{\upsilon}^\star$
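The per-task improvement step of Algorithm 3 (lines 3-10) can be sketched as below; the dictionary-based data layout and the numbers in the usage line are illustrative assumptions.

```python
# Sketch of the improvement step in Algorithm 3: for one task k, try to move it to a node
# with smaller propagation latency if the move is feasible and cuts the cubic CPU power.
# Data layout (dicts of per-node capacity/delay) is illustrative, not the paper's.

def try_replace(k, L_k, T_k, t_tx, cur_node, cur_v, free_cpu, t_prop, Lambda):
    best = (Lambda[cur_node] * cur_v ** 3, cur_node, cur_v)
    for n2, delay in t_prop.items():
        if delay > t_prop[cur_node]:
            continue                                          # only nodes no farther than the current one
        slack = T_k - t_tx - delay
        if slack <= 0:
            continue
        v_temp = L_k / slack                                  # minimum CPU meeting C1 on node n2
        if free_cpu.get(n2, 0.0) >= v_temp and Lambda[n2] * v_temp ** 3 < best[0]:
            best = (Lambda[n2] * v_temp ** 3, n2, v_temp)     # lower execution power: accept
    return best[1], best[2]

print(try_replace(0, 1e6, 0.040, 0.005, "n1", 5e8,
                  free_cpu={"n_bar": 4e7, "n1": 1e8}, t_prop={"n_bar": 0.0, "n1": 0.02},
                  Lambda={"n_bar": 1e-28, "n1": 1e-28}))
```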
Algorithm 4: Power Allocation in JTO.
Input: $\boldsymbol{\rho}_0=\boldsymbol{\rho}^{\mathrm{ini}}$, $i=0$, $\epsilon=10^{-3}$, $I^{\rho}_{\max}=10^{2}$
1: repeat
   % Allocate power to the users
2:   Solve (21) and return $\boldsymbol{\rho}^{i+1}$
3:   $i=i+1$
4: until $\sum_{k\in\mathcal{K}}\rho^i_k-\sum_{k\in\mathcal{K}}\rho^{i+1}_k\le\epsilon$ or $i\ge I^{\rho}_{\max}$
Output: $\boldsymbol{\rho}^\star=\boldsymbol{\rho}^{i+1}$
via Algorithm 5, which alternately solves optimization problem (20) via Algorithm 3 and
optimization problem (21) via Algorithm 4.

From the implementation point of view, the BBU is responsible for gathering the required
information, performing the resource allocation, and sending the decisions to the associated entities.
Specifically, in JTO, the BBU needs to acquire the CSI of the UEs and the available computational resources
of the NFV-enabled nodes. The CSI of each UE is estimated at its serving RRH and is forwarded
through the fronthaul links with negligible latency. In addition, each NFV-enabled node sends its
available computational resources to the BBU through the transport network. After performing
JTO, the BBU transmits the values of the allocated powers to the RRHs. Next, the BBU forwards the received
data of the tasks as well as the obtained computational resource allocations to the associated NFV-enabled nodes
based on the task placement variables. In the downlink, the processed data of the tasks is sent to the BBU,
which in turn transmits the UEs' processed data to their serving RRHs.
Algorithm 5: JTO Optimization Algorithm for Solving (7).
Input: $\boldsymbol{\xi}^0=\boldsymbol{\xi}^{\mathrm{ini}}$, $\boldsymbol{\rho}^0=\boldsymbol{\rho}^{\mathrm{ini}}$, $\boldsymbol{\upsilon}^0=\boldsymbol{\upsilon}^{\mathrm{ini}}$, $\mathcal{K}^\star$, $i=0$
1: repeat
   % Place the tasks and allocate the computational resources
2:   Solve (20) via Algorithm 3 and return $\boldsymbol{\upsilon}^{i+1}$ and $\boldsymbol{\xi}^{i+1}$
   % Allocate the transmit power
3:   Solve (21) via CCP in Algorithm 4 and return $\boldsymbol{\rho}^{i+1}$
4:   $i=i+1$
5: until $E(\boldsymbol{\xi}^i,\boldsymbol{\upsilon}^i,\boldsymbol{\rho}^i)-E(\boldsymbol{\xi}^{i+1},\boldsymbol{\upsilon}^{i+1},\boldsymbol{\rho}^{i+1})\le\epsilon$ or $i\ge I_{\max}$
Output: $\boldsymbol{\xi}^\star,\boldsymbol{\rho}^\star,\boldsymbol{\upsilon}^\star$
C. Convergence analysis
In this subsection, we prove the convergence of Algorithms 2 and 5.
Theorem 1. Algorithm 2 is convergent.
Proof. We show that the objective value of (8), i.e., $\sum_{k\in\mathcal{K}}\alpha_k$, is non-increasing in each step of
Algorithm 2, and since the objective value is lower bounded by zero, Algorithm 2 is convergent.
In the $i$th iteration of Algorithm 2, Algorithm 1 sets $\alpha^{i+1}_k$ either equal to 0, when the E2E latency of task
$k$ is guaranteed, or equal to $T^{\mathrm{tx}}_k+T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k-T_k$, when the E2E latency is larger than its maximum
acceptable value. Therefore, we have $\alpha^{i+1}_k=[T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k+T^{\mathrm{tx}}_k-T_k]^+$. Hence, we need to show
that $\sum_{k\in\mathcal{K}}(T^{\mathrm{tx}}_k+T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k)$ does not increase after the $i$th iteration. Algorithm 1 offloads task $k$
such that $T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k$ in the objective of (11) is minimized (line 5 of Algorithm 1). As a result,
Algorithm 1 does not increase the objective value of (11), i.e., $\sum_{k\in\mathcal{K}}(T^{\mathrm{prop}}_k(\boldsymbol{\xi}^{i+1})+T^{\mathrm{exe}}_k(\boldsymbol{\upsilon}^{i+1}))\le\sum_{k\in\mathcal{K}}(T^{\mathrm{prop}}_k(\boldsymbol{\xi}^{i})+T^{\mathrm{exe}}_k(\boldsymbol{\upsilon}^{i}))$. Moreover, as discussed in Subsection IV-A, Algorithm 1 makes C1-a
active, i.e., $T^{\mathrm{tx}}_k(\boldsymbol{\rho}^{i})=T_k+\alpha^{i+1}_k-T^{\mathrm{exe}}_k(\boldsymbol{\upsilon}^{i+1})-T^{\mathrm{prop}}_k(\boldsymbol{\xi}^{i+1})$, and therefore, any feasible solution to
(13) does not increase the objective value of (12), i.e., $\sum_{k\in\mathcal{K}}T^{\mathrm{tx}}_k(\boldsymbol{\rho}^{i+1})\le\sum_{k\in\mathcal{K}}T^{\mathrm{tx}}_k(\boldsymbol{\rho}^{i})$, which
gives $\sum_{k\in\mathcal{K}}(T^{\mathrm{exe}}_k(\boldsymbol{\upsilon}^{i+1})+T^{\mathrm{prop}}_k(\boldsymbol{\xi}^{i+1})+T^{\mathrm{tx}}_k(\boldsymbol{\rho}^{i+1}))\le\sum_{k\in\mathcal{K}}(T^{\mathrm{exe}}_k(\boldsymbol{\upsilon}^{i})+T^{\mathrm{prop}}_k(\boldsymbol{\xi}^{i})+T^{\mathrm{tx}}_k(\boldsymbol{\rho}^{i}))$.
As a result, we have $\sum_{k\in\mathcal{K}}\alpha^{i+1}_k\le\sum_{k\in\mathcal{K}}\alpha^{i}_k$, that is, Algorithm 2 is convergent.
Note that Algorithm 2 may eliminate the task with the maximum non-negative variable. This
elimination is equivalent to removing the constraints of (8) associated with the eliminated task.
Note that eliminating a task increases the available capacity of the links in the transport network and
the available computational resources in the NFV-enabled nodes. As a result, the search space of Algorithm
1 increases, which may result in lower propagation and execution latencies. Moreover, eliminating
a task extends the feasible set of (13). Therefore, the data rates of users may increase, which in turn
may decrease $\sum_{k\in\mathcal{K}}T^{\mathrm{tx}}_k$. As a result, eliminating the task with the maximum non-negative variable
does not increase the objective of (8).
Theorem 2. Algorithm 5 is convergent.
Proof. Algorithm 5 solves (7) by alternating minimization of (20) and (21). Therefore, we need
to show that Algorithm 3 (which solves (20)) and Algorithm 4 (which solves (21)) do not
increase the objective value of (7). According to line 7 of Algorithm 3, the computational resource
allocation and task placement do not increase the objective value of (20). In addition, based on
[32], the convergence of Algorithm 4 is guaranteed and CCP does not increase the objective of (21).
As a result, the objective value of (7) is non-increasing in each iteration, and since $E(\boldsymbol{\xi},\boldsymbol{\upsilon},\boldsymbol{\rho})$
is lower bounded by zero, Algorithm 5 is convergent.
D. Summary of JTO
Herein we summarize JTO. We obtain a set of feasible tasks by solving (8). In doing so,
we decouple the power allocation from task placement and computational resource allocation,
which are performed by solving (13) and applying Algorithm 1, respectively. Then, we solve (7) for the
feasible tasks via Algorithm 5, which consists of the alternating minimization of (20) and (21) via
Algorithm 3 and Algorithm 4, respectively. A computational complexity (CC) analysis of JTO is
provided in [33] (not included here due to space limitations). Our analysis indicates that JTO is
of polynomial complexity, the same complexity order as state-of-the-art task offloading schemes.
V. DISJOINT TASK OFFLOADING (DTO)
In DTO, transmit power allocation is independent of task placement and computational resource
allocation. The transmit power is allocated under a radio latency constraint, i.e., $T^{\mathrm{tx}}_k\le T^{\mathrm{RAN}}_k$.
Then, the task placement and computational resource allocation are performed so that
$T^{\mathrm{prop}}_k+T^{\mathrm{exe}}_k\le T_k-T^{\mathrm{RAN}}_k$. The convexified sub-problem of the transmit power allocation is:

$\min_{\boldsymbol{\rho}}\ \sum_{k\in\mathcal{K}}\rho_k$
s.t. C1-d: $h_k(\boldsymbol{\rho})-\hat{g}_k(\boldsymbol{\rho})\ge\dfrac{D_k}{T^{\mathrm{RAN}}_k},\ \forall k\in\mathcal{K}$,
C4-a and C5.   (22)
According to the discussion on (7), a feasibility analysis is needed for (22). Similar
to JTO, the feasibility problem of (22) is:

find $\boldsymbol{\rho}$
s.t. C1-e: $h_k(\boldsymbol{\rho})-\hat{g}_k(\boldsymbol{\rho})\ge\dfrac{D_k}{T^{\mathrm{RAN}}_k+\alpha_k},\ \forall k\in\mathcal{K}$,
C4-a and C5,   (23)

which is solved via CVX. Next, the non-negative variables are updated as $\alpha_k=[T^{\mathrm{tx}}_k-T^{\mathrm{RAN}}_k]^+$
and the task with the maximum non-negative variable is eliminated. This procedure is repeated until
a feasible subset of tasks for transmit power allocation is obtained. After this step, (22) is solved
with the feasible subset of tasks. The transmit power allocation phase of DTO is provided in
Algorithm 6.
Algorithm 6: DTO Transmit Power Allocation.
Input: $\mathcal{K}=\{1,\dots,K\}$, $\boldsymbol{\alpha}^0$: very large, $\boldsymbol{\rho}_0$: very small, $T^{\mathrm{RAN}}_k\in(0,T_k)$
1: repeat
2:   $i=0$
3:   repeat
     % Allocate the transmit power to the users
4:     Solve (23) via CVX and set $\boldsymbol{\rho}^{i+1}=\boldsymbol{\rho}^\star$
     % Update the non-negative variables
5:     $\alpha^{i+1}_k=[T^{\mathrm{tx}}_k-T^{\mathrm{RAN}}_k]^+,\ \forall k\in\mathcal{K}$
6:     $i=i+1$
7:   until $\sum_{k\in\mathcal{K}}\alpha^i_k-\sum_{k\in\mathcal{K}}\alpha^{i+1}_k\le\epsilon$ or $i\ge I_{\max}$
8:   $k^\star=\arg\max_{k\in\mathcal{K}}\alpha_k$
     % Discard the infeasible task
9:   if $\alpha_{k^\star}>0$ then
10:    $\mathcal{K}=\mathcal{K}\setminus\{k^\star\}$
11: until $\sum_{k\in\mathcal{K}}\alpha_k=0$
    % Minimize the transmit power
12: Solve (22) via CCP in Algorithm 4 and return $\boldsymbol{\rho}^\star$
Output: $\boldsymbol{\rho}^\star$, $\mathcal{K}^{\mathrm{RAN}}=\mathcal{K}$
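As a small illustration of the latency split that DTO relies on, the slack update of line 5 in Algorithm 6 reduces to the following one-liner; the numbers are placeholders.

```python
# Sketch of the latency split used by DTO: a fixed radio budget T^RAN_k per task, with the
# remainder left for propagation + execution. Mirrors line 5 of Algorithm 6.

def dto_slack(D, R, T_RAN):
    """alpha_k = [T^tx_k - T^RAN_k]^+ for the radio phase of DTO."""
    return max(D / R - T_RAN, 0.0)

# Example: 0.1 Mbit at 8 Mbps against a 10 ms radio budget -> 2.5 ms of slack (infeasible)
print(dto_slack(1e5, 8e6, 0.010))
```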
Having obtained the transmit power $\boldsymbol{\rho}$, task placement and computational resource allocation are
performed, whose associated sub-problem is:

$\min_{\boldsymbol{\xi},\boldsymbol{\upsilon}}\ \sum_{n\in\mathcal{N}}\sum_{k\in\mathcal{K}}\sum_{b\in\mathcal{B}_n}\Lambda_n\xi^k_{p^b_n}\upsilon_k^3$
s.t. C1-f: $T^{\mathrm{prop}}_k+T^{\mathrm{exe}}_k\le T_k-T^{\mathrm{RAN}}_k,\ \forall k\in\mathcal{K}$,
C2, C3, and C6.   (24)

A feasibility analysis is also needed for solving (24). Similar to the transmit power allocation,
we introduce a set of non-negative variables. The resulting sub-problem is similar to (11) with C1-a
replaced by C1-f, and is solved by Algorithm 1. After obtaining a set of feasible tasks,
(24) is solved via Algorithm 3. The feasibility analysis and optimization of DTO are provided in
Algorithm 7. The CC of DTO is also analyzed in [33]. Our analysis shows that the CC of DTO is less
than that of JTO; however, both are of the same complexity order.
Algorithm 7: DTO Computational Resource Allocation and Task Placement.
Input: $\mathcal{K}^{\mathrm{RAN}}$, $\boldsymbol{\xi}=\mathbf{0}$
1: repeat
2:   $i=0$
3:   repeat
     % Allocate the computational resources and place the tasks
4:     Solve (24) via Algorithm 1 given $\boldsymbol{\upsilon}^i,\boldsymbol{\xi}^i$, and $\boldsymbol{\alpha}^i$, and return $\boldsymbol{\upsilon}^{i+1},\boldsymbol{\xi}^{i+1}$, and $\boldsymbol{\alpha}^{i+1}$
5:     $i=i+1$
6:   until $\sum_{k\in\mathcal{K}}\alpha^i_k-\sum_{k\in\mathcal{K}}\alpha^{i+1}_k\le\epsilon$ or $i\ge I_{\max}$
     % Find the task with the maximum non-negative variable
7:   $k^\star=\arg\max_{k\in\mathcal{K}}\alpha_k$
8:   if $\alpha_{k^\star}>0$ then
       % Discard the infeasible task
9:     $\mathcal{K}=\mathcal{K}\setminus\{k^\star\}$
10: until $\sum_{k\in\mathcal{K}}\alpha_k=0$
11: $i=0$
12: repeat
     % Allocate the computational resources and place the tasks
13:   Given $\boldsymbol{\upsilon}^i$ and $\boldsymbol{\xi}^i$, solve (24) via Algorithm 3 and return $\boldsymbol{\upsilon}^{i+1}$ and $\boldsymbol{\xi}^{i+1}$
14: until $\Psi(\boldsymbol{\xi}^i,\boldsymbol{\upsilon}^i,\boldsymbol{\rho}^\star)-\Psi(\boldsymbol{\xi}^{i+1},\boldsymbol{\upsilon}^{i+1},\boldsymbol{\rho}^\star)\le\epsilon$ or $i\ge I_{\max}$
Output: $\boldsymbol{\xi}^\star,\boldsymbol{\upsilon}^\star$
VI. LOWER BOUND ON OPTIMAL SOLUTION (LTO)
Since the optimization problem (8) is non-convex, we make some assumptions, without loss of
optimality, to resolve the non-convexity of (8). First, we note that it is very likely for the
fiber-optic links to have sufficient capacity for carrying the traffic of the UEs, which is the case for the
fronthaul links and any wired link in the transport network. As a result, we relax the constraints
C3 and C4 in (8). Note that the relaxation of C3 and C4 extends the feasible set of (8),
resulting in a lower bound on the optimal solution to (8). In addition, with a large number of
antenna elements at the RRHs, the channel vectors between different RRHs and a specific user are
approximately orthogonal, i.e., $|\mathbf{h}^H_{u,k}\mathbf{h}_{u,j}|\approx 0$ for all $j\ne k$ [30]. Therefore, the interference in the
wireless channels is negligible and (15) becomes:

$R_k=W\log_2\Big(1+\frac{\|\mathbf{h}_{u,k}\|^2}{\sigma^2_n}\rho_k\Big),\quad k\in\mathcal{K}_u. \qquad (25)$

The elimination of the interference increases $R_k$ for the same amount of power allocated to
each UE, which again results in a lower bound on the optimal solution to (8). Based on the fact
that $\min_{\boldsymbol{\alpha},\boldsymbol{\xi},\boldsymbol{\upsilon},\boldsymbol{\rho}}\sum_{k\in\mathcal{K}}\alpha_k=\min_{\boldsymbol{\alpha},\boldsymbol{\xi},\boldsymbol{\upsilon}}\min_{\boldsymbol{\rho}}\sum_{k\in\mathcal{K}}\alpha_k$, the optimal power allocation is the solution to:

$\min_{\boldsymbol{\rho}}\ \sum_{k\in\mathcal{K}}\alpha_k$
s.t. C1: $T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k+T^{\mathrm{tx}}_k\le T_k+\alpha_k,\ \forall k$,
C5: $\rho_k\le P^{\max}_k,\ \forall k$.   (26)
The data rate in (25) removes the cross-coupling impact of the power allocated to different users.
Hence, without loss of optimality, (26) is solved for each $\rho_k$ independently. The associated power
allocation problem is:

$\min_{\rho_k}\ T^{\mathrm{tx}}_k$
s.t. C1: $T^{\mathrm{exe}}_k+T^{\mathrm{prop}}_k+T^{\mathrm{tx}}_k\le T_k+\alpha_k,\ \forall k$,
C5: $\rho_k\le P^{\max}_k,\ \forall k$,   (27)

in which $\alpha_k$ in the objective is replaced with $T^{\mathrm{tx}}_k$. Note that minimizing $T^{\mathrm{tx}}_k$ is equivalent to
maximizing $\frac{R_k}{D_k}$. Since $R_k$ in (25) is increasing in $\rho_k$, the optimal solution of (27) is $\rho^\star_k=P^{\max}_k$.
Note that the feasibility of C1 is ensured by optimizing the other variables.

Next, we deal with the binary optimization variable $\boldsymbol{\xi}$. We propose an exhaustive search over
all possible values of $\boldsymbol{\xi}$ to avoid any performance loss due to the non-convexity of (8) stemming
from the binary $\boldsymbol{\xi}$. The number of all possible combinations of task placement decisions equals $|\mathcal{B}|^{|\mathcal{K}|}$,
where $|\mathcal{B}|=\sum_n|\mathcal{B}_n|$. Thus, we solve (8) for $\boldsymbol{\alpha}$ and $\boldsymbol{\upsilon}$ for each task placement decision and
select the decision that results in the lowest $\sum_k\alpha_k$ as the optimal decision. Note that the exhaustive
search may impose an excessive computational complexity. However, LTO is developed as a
baseline for performance evaluation and it is not supposed to work in real-time.
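The exhaustive search over placement decisions can be sketched as follows; `evaluate` is a hypothetical callable that solves (28) for a fixed placement (see the KKT-based sketch after (40) below), and `candidates` enumerates the (node, path) pairs.

```python
from itertools import product

# Sketch of the exhaustive search in LTO: every task is assigned one (node, path)
# candidate, giving |B|^|K| combinations to evaluate.

def lto_search(task_ids, candidates, evaluate):
    best_obj, best_xi = float("inf"), None
    for combo in product(candidates, repeat=len(task_ids)):      # |B|^|K| placements
        xi = dict(zip(task_ids, combo))
        obj = evaluate(xi)                                       # sum of alpha_k from (28)
        if obj < best_obj:
            best_obj, best_xi = obj, xi
    return best_obj, best_xi
```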
The optimization problem for solving $\boldsymbol{\alpha}$ and $\boldsymbol{\upsilon}$ is:

$\min_{\boldsymbol{\upsilon},\boldsymbol{\alpha}}\ \sum_{k\in\mathcal{K}}\alpha_k$
s.t. C1-a: $\frac{L_k}{\upsilon_k}\le\tilde{T}_k+\alpha_k,\ \forall k\in\mathcal{K}$,
C2: $\sum_{k\in\mathcal{K}_n}\upsilon_k\le\Upsilon_n,\ \forall n$,   (28)

where $\tilde{T}_k=T_k-T^{\mathrm{prop}}_k-T^{\mathrm{tx}}_k$ and $\mathcal{K}_n$ is the set of tasks to be executed at executing server $n$.
Problem (28) is convex in both $\boldsymbol{\alpha}$ and $\boldsymbol{\upsilon}$. As a result, the KKT conditions determine the optimal
solution. To derive the KKT conditions, we first write the Lagrangian function as follows:

$\mathcal{L}=\sum_{k\in\mathcal{K}}\Big(\alpha_k+\gamma_k\big(\tfrac{L_k}{\upsilon_k}-\tilde{T}_k-\alpha_k\big)-\eta_k\alpha_k-\mu_k\upsilon_k\Big)+\sum_{n\in\mathcal{N}}\lambda_n\Big(\sum_{k\in\mathcal{K}_n}\upsilon_k-\Upsilon_n\Big). \qquad (29)$

By differentiating the Lagrangian function with respect to $\alpha_k$ and $\upsilon_k$ we have:

$\frac{\partial\mathcal{L}}{\partial\alpha_k}=1-\gamma_k-\eta_k=0,\quad\forall k\in\mathcal{K}, \qquad (30)$

and

$\frac{\partial\mathcal{L}}{\partial\upsilon_k}=-\gamma_k\frac{L_k}{\upsilon_k^2}-\mu_k+\lambda_n=0,\quad\forall k\in\mathcal{K}_n. \qquad (31)$

In addition, the complementary slackness conditions are:

$\gamma_k\big(\tfrac{L_k}{\upsilon_k}-\tilde{T}_k-\alpha_k\big)=0,\quad\forall k\in\mathcal{K}, \qquad (32)$

$\lambda_n\Big(\sum_{k\in\mathcal{K}_n}\upsilon_k-\Upsilon_n\Big)=0,\quad\forall n\in\mathcal{N}, \qquad (33)$

$\eta_k\alpha_k=0,\quad\forall k\in\mathcal{K}, \qquad (34)$

$\mu_k\upsilon_k=0,\quad\forall k\in\mathcal{K}. \qquad (35)$

Constraint C1-a implies $\upsilon_k>0$. Hence, from (35) we have $\mu_k=0$, and condition (31) results
in:

$\upsilon_k=\sqrt{\frac{L_k}{\lambda_n}},\quad\forall k\in\mathcal{K}_n, \qquad (36)$

which implies $\lambda_n>0$. Thus, (33) gives:

$\sum_{k\in\mathcal{K}_n}\upsilon_k=\Upsilon_n,\quad\forall n\in\mathcal{N}. \qquad (37)$

On the other hand, when (7) is infeasible, we get $\alpha_k>0$. Thus, (34) leads to $\eta_k=0$ and
condition (30) results in $\gamma_k=1$. As a result, from (32) we get:

$\alpha_k=\frac{L_k}{\upsilon_k}-\tilde{T}_k,\quad\forall k\in\mathcal{K}. \qquad (38)$

Given $\alpha_k\ge 0$ and (36), the optimal non-negative variable is:

$\alpha_k=\big[\sqrt{L_k\lambda_n}-\tilde{T}_k\big]^+,\quad\forall k\in\mathcal{K}_n, \qquad (39)$

wherein $\lambda_n$ is found such that:

$\sum_{k\in\mathcal{K}_n}\frac{L_k}{\tilde{T}_k+\alpha_k}=\Upsilon_n,\quad\forall n\in\mathcal{N}. \qquad (40)$

Then, the optimal values of $\alpha_k$ and $\upsilon_k$ are found as in (39) and (36), respectively. Having the
optimal solution of (28) for all possible $\boldsymbol{\xi}$, the optimal solution of (8) is the one with the lowest
objective value in (28).
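A minimal numerical rendering of (36)-(40) for a single node, assuming the placement (and hence $\tilde{T}_k$) is fixed: bisection on $\lambda_n$ until the allocated CPU in (40) matches $\Upsilon_n$. The values passed at the end are toy numbers.

```python
import numpy as np

# Sketch of the KKT-based allocation in (36)-(40) for one node n: bisection on the
# multiplier lambda_n so that the allocated CPU exactly fills Upsilon_n. Values are toy.

def kkt_allocate(L, T_tilde, Upsilon, iters=100):
    """L, T_tilde: arrays over tasks placed on node n; returns (alpha, upsilon)."""
    L, T_tilde = np.asarray(L, float), np.asarray(T_tilde, float)
    if np.all(T_tilde > 0) and np.sum(L / T_tilde) <= Upsilon:
        return np.zeros_like(L), L / T_tilde          # capacity not binding: all deadlines met

    def used_cpu(lam):                                # sum_k L_k / (T_tilde_k + alpha_k), eq. (40)
        alpha = np.maximum(np.sqrt(L * lam) - T_tilde, 0.0)
        return np.sum(L / (T_tilde + alpha))

    lo, hi = 1e-12, 1.0
    while used_cpu(hi) > Upsilon:                     # grow hi until the capacity fits
        hi *= 10.0
    for _ in range(iters):                            # bisection: used_cpu is non-increasing in lam
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if used_cpu(mid) > Upsilon else (lo, mid)
    lam = hi
    alpha = np.maximum(np.sqrt(L * lam) - T_tilde, 0.0)   # eq. (39)
    return alpha, L / (T_tilde + alpha)               # upsilon_k per (38); equals (36) when alpha_k > 0

alpha, upsilon = kkt_allocate(L=[1e6, 2e6], T_tilde=[0.002, 0.004], Upsilon=0.8e9)
print(alpha, upsilon, upsilon.sum())
```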
VII. SIMULATION RESULTS
In this section, we evaluate the performance of JTO.^3 The setup of the simulation is presented
in Table II. We assume that $U=4$ RRHs are placed with an inter-site distance of 100 m and all
users are served in an area of 100 m radius with a given user-RRH assignment. The nodes in
the transport network are divided into three tiers based on their distance from the UEs: the local
tier, the regional tier, and the national tier. Although the number of serving nodes is very large,
there are some distant nodes in each tier that impose a large propagation latency. Hence, we only
incorporate the nodes with reasonable propagation latency in the transport network [7]. The network
graph $G$ consists of $N=6$ nodes: $\bar{n}$ at the local tier with zero propagation latency, three nodes at
the regional tier with relatively low propagation latency, and two distant nodes at the national tier.
For simplicity of comparison, we assume that all nodes have the same computational capacity
and all tasks are of the same size, load, and maximum acceptable latency, i.e., $D_k=D$, $L_k=L$,
and $T_k=T$, $\forall k$. Moreover, we assume equal propagation latency and capacity for the network
links. Note that the relatively low value of the link capacity (0.4 Gbps) is the amount of capacity
solely reserved for task offloading. Finally, the simulations are performed on a 3.30 GHz Core
i5 CPU with 16 GB RAM.
Fig. 2(a) reports the performance of the feasibility analysis in JTO, showing the acceptance
ratio versus $T$. The acceptance ratio is defined as the ratio of the tasks accepted by the feasibility
analysis over the total number of requested tasks. Note that the acceptance ratio increases
with increasing $T$. This is due to the fact that tasks with higher $T$ need less transmit power
and fewer computational resources to be served. Moreover, for higher $T$, a larger number of nodes are

^3 The simulation files are available online at IEEE DataPort with DOI: 10.21227/w5tv-yz53.
TABLE II: Simulation Setup.

$L_k$ = $10^6$ CPU cycles; $\delta_{(m,m')}$ = 10 ms
$M$ = 32 antennas; $\Lambda_n$ = $10^{-28}$ [18]
$D_k$ = 0.1 Mbits; Path loss = $128.1+37.6\log Q$ [21]
$\Upsilon_n$ = $10^9$ CPU cycles per second [6]; $U$ = 4
$P^{\max}_k$ = 0.5 Watt; ISD = 100 m
$B_{(m,m')}$ = 0.4 Gbps; $W$ = 20 MHz [21]
$B_{f,u}$ = 0.6 Gbps; Noise power = $-150$ dBm/Hz [21]
available for task offloading. In addition, we solve (8) by the alternate search method (ASM),
in which (8) is solved alternately for each variable. Note that the sub-problem of $\boldsymbol{\upsilon}$ is solved
by CVX and the sub-problem of $\boldsymbol{\xi}$ is solved by MOSEK (details are not provided due to space
limitations). The effectiveness of JTO against ASM is also shown in Fig. 2(a). Note that for
latencies smaller than 75 ms, JTO outperforms ASM. Moreover, the performance of both methods
is identical for low values of $T$. This is due to the fact that the set of accessible NFV-enabled
nodes for low values of $T$ is restricted to $\bar{n}$ and therefore, JTO is not able to offload the tasks
to more distant NFV-enabled nodes because their propagation latencies violate the E2E latency
constraints.

The acceptance ratio of JTO for different numbers of tasks is shown in Fig. 2(b). Since the
amount of available resources is limited, the acceptance ratio decreases with the increase in
the total number of tasks. Again, the superiority of JTO over ASM is observed.
Fig. 2: Acceptance ratio vs. $T$ and $K$. (a) Acceptance ratio vs. $T$ (ms) for $K=30$, comparing JTO (Alg. 2) and ASM (annotated gaps of 27% and 24%). (b) Acceptance ratio vs. $K$ for $T=40$ ms, comparing JTO (Alg. 2) and ASM.
The convergence of Algorithm 2 is shown in Fig. 3(a). As expected, the sum of non-negative
variables decreases in each iteration. Furthermore, Algorithm 2 converges faster than ASM,
which stems from the higher acceptance ratio of JTO.
Fig. 3: Convergence and acceptance ratio of the proposed methods. (a) Convergence of the admission control algorithm (sum of non-negative variables $\sum_k\alpha_k$ vs. number of iterations) for $T=20$ ms and $K=30$, comparing JTO (Alg. 2) and ASM. (b) Acceptance ratio of the joint vs. disjoint methods in terms of $T^{\mathrm{RAN}}$ (ms) for $T=30$ ms and $K=30$ users, comparing DTO (Alg. 6), DTO (Alg. 7), and JTO (Alg. 2).
The acceptance ratio of JTO is compared with that of DTO in Fig. 3(b). The acceptance ratio of
JTO and DTO is depicted for $T=30$ ms. For DTO, we obtain the acceptance ratio for different
values of $T^{\mathrm{RAN}}\in(0,T)$. Moreover, the acceptance ratio of the feasibility analysis in the transmit
power allocation phase of DTO, i.e., Algorithm 6, is depicted. The acceptance ratio of DTO is
increasing for small values of $T^{\mathrm{RAN}}$; that is, small values of $T^{\mathrm{RAN}}$ impose high rates on the
users, which is impossible due to either insufficient bandwidth or limited fronthaul capacity.
On the other hand, for larger values of $T^{\mathrm{RAN}}$, the acceptance ratio of Algorithm 6 is 1, but the
task placement and computational resource allocation restrict the number of accepted tasks.
Furthermore, JTO outperforms DTO for all considered values of $T^{\mathrm{RAN}}$.
Fig. 4(a) shows the average radio transmission latency, i.e., $\frac{1}{K}\sum_{k\in\mathcal{K}}T^{\mathrm{tx}}_k$, and the average
execution latency of the tasks, i.e., $\frac{1}{K}\sum_{k\in\mathcal{K}}T^{\mathrm{exe}}_k$, for different values of $D$ given $T=20$ ms.
The average radio transmission latency increases with increasing $D$, and subsequently the average
execution latency is decreased to maintain the maximum acceptable latency. Therefore, it is
inferred that JTO efficiently manages the transmit power and the computational resources.
Similarly, according to Fig. 4(b), the average execution latency increases with increasing $L$, and
subsequently this increase is compensated by a lower radio transmission latency.
In Fig. 5, we assume there are three classes of tasks (each including 10 tasks) with three
different maximum acceptable latencies, i.e., $T^{(1)}=10$ ms, $T^{(2)}=50$ ms, and $T^{(3)}=100$ ms. The
classes (1), (2), and (3) are considered as the sets of tasks with low, medium, and high latency
requirements, respectively. Moreover, we assume there are three nodes (shown by rectangles in Fig. 5):
a local node (i.e., $\bar{n}$) with zero propagation latency, a regional node with 20 ms propagation
latency, and a national node with 40 ms propagation latency. The propagation latencies are the
Fig. 4: Average radio transmission and execution latencies in JTO. (a) Average radio transmission and execution latencies (ms) vs. $D$ (Mbits) for $T=20$ ms and $K=30$. (b) Average radio transmission and execution latencies (ms) vs. $L$ ($\times 10^5$ CPU cycles) for $T=20$ ms and $K=30$.

Fig. 5: Placement of the different classes of tasks at three different tiers of nodes for $K=30$, for processing capacities (a) $C=1$, (b) $C=10$, (c) $C=20$, and (d) $C=50$ ($C$ is scaled by $10^9$ cycles per second). Task classes: low latency ($T^{(1)}=10$ ms), medium latency ($T^{(2)}=50$ ms), and high latency ($T^{(3)}=100$ ms).
summation of the uplink and downlink propagation latencies. Fig. 5 shows the task placement for
different values of the processing capacity of the nodes, $C=\Upsilon_n,\ \forall n$. When $C=1$, none of the
nodes is able to serve the tasks in class (1) due to their high resource utilization. However, the
tasks in class (2) are mainly served at the local node and class (3) tasks are placed at the regional
and national nodes. When $C=10$, some of the tasks in class (1) are placed at the local node.
Moreover, some tasks in classes (2) and (3) are served at the local node as well. Furthermore, the
national node does not serve any task because JTO places the tasks at the nearest nodes in order
to reduce the transmit power. When $C=20$, more tasks in class (1) are served at the local node
and the acceptance ratio reaches 1. Finally, when $C=50$, almost all of the tasks are placed at
the local node to reduce the transmit power consumption. Table III shows the acceptance ratio
of each class for different values of $C$. Note that the acceptance ratio of all classes increases
with increasing $C$. Moreover, the acceptance ratio of class (1) is lower than that of classes (2)
and (3). The reason is twofold: one cause is the high resource utilization by the tasks of this class and
the other is the limited number of available nodes for tasks with a low latency requirement (only
node $\bar{n}$ in this example).
TABLE III: Acceptance ratio of JTO for different task classes vs. processing capacity of nodes.

  Computational capacity          Maximum acceptable latency
  (10^9 CPU cycles/sec)      T^(1) = 10 ms   T^(2) = 50 ms   T^(3) = 100 ms
  C = 1                           0               0.5             0.9
  C = 10                          0.5             0.9             1
  C = 20                          1               1               1
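The placement behaviour summarized above can be illustrated with the simplified greedy sketch below, which assigns each task to the closest node that can still finish it before its deadline. It ignores the radio transmission latency and is not the heuristic proposed in the paper; the node tiers and parameters mirror the Fig. 5 setup only loosely.

```python
def place_tasks(tasks, nodes):
    """Greedy toy model of the behaviour observed in Fig. 5: each task goes to
    the node with the lowest propagation latency that still has enough spare
    capacity to meet the deadline. Transmission latency is ignored, and this
    is not the heuristic algorithm proposed in the paper."""
    accepted = {}
    for k, (L_cycles, T_max_s) in tasks.items():
        for name in sorted(nodes, key=lambda n: nodes[n]["prop_s"]):
            node = nodes[name]
            budget = T_max_s - node["prop_s"]        # time left for execution
            if budget <= 0:
                continue                             # deadline shorter than propagation
            needed = L_cycles / budget               # required CPU share (cycles/s)
            if needed <= node["free_cps"]:
                node["free_cps"] -= needed
                accepted[k] = name
                break
    return accepted

nodes = {  # capacities and propagation latencies loosely follow the Fig. 5 tiers
    "local":    {"prop_s": 0e-3,  "free_cps": 10e9},
    "regional": {"prop_s": 20e-3, "free_cps": 10e9},
    "national": {"prop_s": 40e-3, "free_cps": 10e9},
}
tasks = {0: (20e5, 10e-3), 1: (20e5, 50e-3), 2: (20e5, 100e-3)}  # (L cycles, deadline s)
print(place_tasks(tasks, nodes))
```

With ample capacity, all three tasks land on the local node, consistent with the C = 50 panel; shrinking "free_cps" pushes the medium- and high-latency tasks toward the regional and national tiers.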
Fig. 6 shows the acceptance ratio of LTO and JTO for different values of the maximum acceptable
latency T. Due to the high computational complexity of the exhaustive search in LTO, we consider
a simple network graph comprising two nodes connected by a single link. Moreover, the
total number of tasks |K| is 20. The acceptance ratio of both JTO and LTO is lower for larger
computational loads. Meanwhile, the acceptance ratio of JTO is almost the same as that of LTO for
different values of T and L.
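For intuition on the exhaustive-search component of LTO, the sketch below enumerates every placement decision (including rejection) on a toy two-node instance and reports the best achievable acceptance ratio. The node capacities, propagation latencies, and task parameters are assumed values, and the actual LTO bound additionally optimizes the transmit power and computational resources via KKT conditions.

```python
from itertools import product

def best_acceptance(tasks, nodes):
    """Brute-force the placement decision (node or reject) for a tiny instance
    and return the highest fraction of tasks served within their deadlines.
    Only the enumeration step of an LTO-style bound is illustrated here."""
    choices = list(nodes) + [None]               # None = task is rejected
    best = 0
    for assignment in product(choices, repeat=len(tasks)):
        load = {n: 0.0 for n in nodes}
        accepted = 0
        for (L_cycles, T_max_s), name in zip(tasks, assignment):
            if name is None:
                continue
            budget = T_max_s - nodes[name]["prop_s"]
            if budget <= 0:
                load[name] = float("inf")        # deadline already violated
                continue
            load[name] += L_cycles / budget      # CPU share this task needs
            accepted += 1
        if all(load[n] <= nodes[n]["cap_cps"] for n in nodes):
            best = max(best, accepted)
    return best / len(tasks)

# Two nodes connected by a single link, loosely matching the Fig. 6 setup.
nodes = {"n1": {"prop_s": 0.0, "cap_cps": 1e9}, "n2": {"prop_s": 5e-3, "cap_cps": 1e9}}
tasks = [(10e5, 10e-3)] * 6                      # (L cycles, deadline seconds)
print("best acceptance ratio:", best_acceptance(tasks, nodes))
```

The enumeration grows as (number of nodes + 1)^|K|, which is why the comparison with LTO is restricted to a small graph and a modest number of tasks.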
Fig. 6: Acceptance ratio of LTO and JTO vs. maximum acceptable latency T (ms), for L = 10 and L = 20.
VIII. CONCLUSIONS AND FUTURE WORK
In this paper, we considered an energy-efficient task offloading problem under E2E latency
constraints. We investigated the joint impact of radio transmission, propagation through the
transport network, and task execution on the end-to-end latency experienced by each task. Due to the non-
convexity of the optimization problem, we decoupled the transmit power allocation from task
placement and computational resource allocation. The transmit power allocation was solved by
adopting CCP to convexify the sub-problem. The task placement and computational resource
allocation were solved via our proposed heuristic method, which minimizes the sum of propagation
and execution latencies. Furthermore, to ensure the feasibility of the optimization problem, we
proposed a feasibility analysis that eliminates the tasks causing infeasibility. Simulation results
showed the superiority of JTO over both DTO and ASM. The performance of DTO depended
on the portion of the latency budget that must be met in the radio access network, i.e., T^RAN. However,
JTO achieved higher acceptance ratios for all considered values of T^RAN. As future work, we plan to
incorporate task scheduling into JTO. Moreover, the investigation of an innovative solution that
divides the required computational load of each task among several nodes will be an interesting
future research activity.