Conference Paper

Effective Communication: When to Pull Updates?

Pouya Agheli, Nikolaos Pappas, Petar Popovski, and Marios Kountouris
Communication Systems Department, EURECOM, Sophia Antipolis, France
Department of Computer and Information Science, Linköping University, Linköping, Sweden
Department of Electronic Systems, Aalborg University, Aalborg, Denmark
Email: pouya.agheli@eurecom.fr, nikolaos.pappas@liu.se, petarp@es.aau.dk, marios.kountouris@eurecom.fr
Abstract—We study a pull-based communication system where
a sensing agent updates an actuation agent using a query control
policy, which is adjusted according to the evolution of an observed
information source and the usefulness of each update for achieving
a specific goal. For that, a controller decides whether to pull an
update at each slot, predicting what is probably occurring at the
source and how much effective impact that update could have at
the endpoint. Thus, temporal changes in the source evolution could
modify the query arrivals so as to capture important updates.
The amount of impact is determined by a grade of effectiveness
(GoE) metric, which incorporates both freshness and usefulness
attributes of the communicated updates. Applying an iterative
algorithm, we derive query decisions that maximize the long-
term average GoE for the communicated packets, subject to cost
constraints. Our analytical and numerical results show that the
proposed query policy exhibits higher effectiveness than existing
periodic and probabilistic query policies for a wide range of query
arrival rates.
I. INTRODUCTION
The effectiveness problem in communication systems deals
with whether a message conveyed by a sender leads to
a desirable impact at the receiver for achieving a specific goal.
This problem was first articulated in [1] and has recently been
revived under the prism of goal-oriented semantic communica-
tion [2], [3]. In cyber-physical systems with interacting sensing
entities and actuation/monitoring agents, a message ought to
be generated and communicated if it can potentially have the
desired impact in the system. In this setting, the communication
goal determines the grade of effectiveness a message has
according to its usefulness or importance in fulfilling a set of
attributes required for achieving that goal. This approach has
the potential to enable system scalability and judicious use of
resources by avoiding the acquisition, processing, and transport
of information that turns out to be ineffective, irrelevant, or
useless.
In current networked intelligent systems, information transfer,
e.g., in the form of status update packets, over the network
is mainly done using a push-based communication model.
Packets are pushed toward the endpoint according to the
source’s decision, regardless of what the endpoint actually
needs and when it needs it. On the other hand, we have the pull-based
communication framework, where the endpoint requests and
controls the type of the generated information and its arrival
time [4]–[8]. However, existing query policies overlook the
evolution of the source (what is probably happening on the
sensing side) and the expected importance or usefulness of the
updates at the time of pulling them. As a result, query-based
policies, even if they are aligned with the receiver’s objectives, do
not necessarily result in high system effectiveness. To address
this challenge and to bridge the pull-based and push-based
worlds, we introduce a novel policy where the endpoint pulls
updates based on the statistics of the source and the previously
received updates. Thanks to this policy, the endpoint adapts
its query instances in a timely manner according to the received updates,
hence being able to capture updates (e.g., critical source events,
alarms, novelty, etc.) that have a high impact on achieving the
subscribed goal. In a way, our policy can be seen as an adaptive
query policy where requests tend to align with the most useful or
important system-wise source realizations.

The work of P. Agheli and M. Kountouris has received funding from the
European Research Council (ERC) under the European Union’s Horizon 2020
research and innovation programme (Grant agreement No. 101003431). The
work of N. Pappas is supported by the VR, ELLIIT, and the European Union
(ETHER, 101096526), and the work of P. Popovski was supported by the
Villum Investigator Grant “WATER” from the Velux Foundation, Denmark.

This paper falls within the realm of pull-based communication
methods, but with a query control policy, owing to which
the query controller decides in a timely manner to pull updates following
the source changes and the updates’ expected usefulness. Our
work leverages the concept of Just-in-Time systems [9] and
extends prior work on query age of information (QAoI) [4]–[8]
to systems where queries arrive at the right times to
request the source realizations (updates) with high usefulness
and significant effect at the endpoint. In this regard, a class
of optimal policies is derived to maximize the long-term
expected effectiveness of the update packets a sensing agent
sends to an acting agent, subject to a communication cost.
Our analytical and simulation results show that the proposed
query policy outperforms existing baseline approaches in terms
of effectiveness grade. Furthermore, we demonstrate that the
solution converges to a threshold-based query control scheme,
where the query controller can timely decide whether to pull an
update or not by merely looking up a table with the obtained
threshold boundaries for a given goal.
II. SYSTEM MODEL
We consider an end-to-end pull-based communication system
in which update packets are generated via a sensing agent
(SA) and transmitted to an actuation agent (AA) for taking
appropriate actions at the endpoint and fulfilling the subscribed
goal. The packet generation is done in the event of receiving
a query from a query controller at the AA. Here, the packets
contain status updates of a sensor observing an information
source/event or payload data from the application layer towards
the network. We assume the system operates in a slotted manner
with time slots $n$, for $n \in \mathbb{N}$. At the $n$-th time slot, the query
arrival indicator is denoted by $\alpha_n \in \{0, 1\}$, with $\alpha_n = 1$
indicating the arrival of a query, and $\alpha_n = 0$, otherwise.
Thus, the set of slots at which queries arrive is defined as
$\mathcal{N} = \{n \mid \forall n: \alpha_n = 1\}$. The update and query channels are
assumed to be error-prone and error-free, respectively, and the
whole duration of receiving a query and the consequent update
acquisition and its communication is normalized into one time
slot. Also, we denote by $x_n$ the update communicated at slot $n$,
which can be seen as some form of semantic representation.
In parallel with its acquisition, the $n$-th update is evaluated
based on its usefulness for satisfying the goal and is attributed
a rank of importance (value) at the source level in the form
of a meta-value $v_n$, $\forall n \in \mathcal{N}$ [10, Section III-A]. For the sake
of generality, we assume that $v_n$ is a random variable (r.v.)
that follows a discrete-time Markov process with finite state
space $\mathcal{V} = \{\nu_i \mid i = 1, 2, \ldots, |\mathcal{V}|\}$ and transition matrix
$\mathbf{P}_v = [p_{ij}]_{|\mathcal{V}| \times |\mathcal{V}|}$ with $p_{ij} = \Pr(v_{n+1} = \nu_j \mid v_n = \nu_i)$. From an
effectiveness viewpoint, a successfully received update has a
specific impact at the receiver or decision-making side. In the
proposed model, we consider that the goal is perfectly known
at both SA and AA, thus the same framework is employed to
measure the usefulness of updates at both ends.¹
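The Markov model of the meta-value can be illustrated with a short simulation. The following is a minimal sketch, not the authors' code: the function name `simulate_meta_values`, the seed, and the starting index are hypothetical choices, and the ten equally spaced levels with uniform transitions mirror the setup used later in Section V.

```python
import random

def simulate_meta_values(levels, P, n_slots, seed=0, start=0):
    """Sample a path v_1, ..., v_n of the meta-value Markov chain.

    levels : list of meta-value levels nu_i
    P      : row-stochastic matrix, P[i][j] = Pr(v_{n+1} = nu_j | v_n = nu_i)
    start  : index of the level assumed to be occupied before slot 1
    """
    for row in P:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of P must sum to 1")
    rng = random.Random(seed)
    state = start
    path = []
    for _ in range(n_slots):
        # Draw the next level index according to row `state` of P.
        state = rng.choices(range(len(levels)), weights=P[state])[0]
        path.append(levels[state])
    return path

# Ten equally spaced levels on [0, 1], rounded to two decimals,
# with uniform transition probabilities p_ij = 1/|V|.
levels = [round(i / 9, 2) for i in range(10)]
P_uniform = [[0.1] * 10 for _ in range(10)]
path = simulate_meta_values(levels, P_uniform, n_slots=1000)
```

With uniform rows, consecutive meta-values are independent; a matrix with heavier diagonal entries would model a slowly varying source instead.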
A. Query Control Policies
We consider that the query controller can apply two different
types of policies to pull updates from the SA in a pull-based
system, as follows.
1) Effect-agnostic policy: Under this policy, the queries
arrive with a controlled query rate, pursuing a specific schedule
or a stochastic process, e.g., Poisson, binomial, and Markovian
[5]–[8]. Therefore, the existing effect-agnostic policies pose
aleatoric uncertainty associated with the nature of random
updates by ignoring what is probably occurring at the source
during the time of pulling those updates.
2) Effect-aware policy: With this policy, the AA tries to
infer or predict the probable state of the source and the expected
usefulness of an update at the time of the decision, hence adapting
its query instances and pulling updates at the right time
slots. This can be done via building or updating a model of the
source’s process at the AA. Owing to this policy, although packet
generation and communication are requested by queries, what
happens at the source is also considered. Therefore, epistemic
uncertainty arises with the effect-aware policy, as decisions are
made based on probabilistic predictions instead of accurate
knowledge. This uncertainty could be reduced or harnessed
using, for instance, machine learning or prediction methods. We
delve into this query control problem in Section IV.
¹The analysis can be easily extended to the case where the goal is not shared,
but the AA, based on the received updates, may learn or estimate the probability
distribution of the updates’ usefulness at the endpoint.
III. GRADE OF EFFECTIVENESS METRIC
To model effectiveness, we advocate for a metric that involves
two system-level attributes: freshness and usefulness
of the successfully received updates. Freshness describes how
obsolete an update gets as time passes from the instant it is correctly
received at the AA. This attribute is commonly quantified
via the age of information (AoI) metric or its variants. Besides,
the usefulness signifies the rank of importance, as defined in
Section II, each correctly received packet offers at the endpoint.
With the above explanation in mind, we propose the grade of
effectiveness (GoE) metric for measuring the amount of impact
an update has at the endpoint.
A. GoE Formulation
The GoE for the update packet communicated at the $n$-th
time slot is denoted by $\mathrm{GoE}_n \in \mathbb{R}_0^+$ and modeled in the form
of a composite function $f: \mathbb{R}_0^+ \times \mathbb{R}_0^+ \to \mathbb{R}_0^+$ of the AoI, called
$\Delta_n$ for the $n$-th time slot, and the usefulness of that update,
i.e., $v_n$, to satisfy the subscribed goal. Thus, we can write²
$$\mathrm{GoE}_n = f\big(g_\Delta(\Delta_n),\, g_v(v_n)\big) \tag{1}$$
where $g_\Delta: \mathbb{R}_0^+ \to \mathbb{R}_0^+$ is a non-increasing penalty function, and
$g_v: \mathbb{R}_0^+ \to \mathbb{R}_0^+$ is a non-decreasing utility function. In (1),
$\Delta_n = n - m$ with $\Delta_0 = 1$, and $m = \max_{i:\, \alpha_i(1-\epsilon_i) = 1,\ i \le n} i$. Also,
$\epsilon_n = g_\epsilon(d(x_n, \hat{x}_n))$ depicts the update discrepancy, where $\hat{x}_n$
indicates the received update at the AA, $g_\epsilon: \mathbb{R}_0^+ \to \{0, 1\}$ is
the mapping function to the Boolean space, and $d: \mathbb{R} \to \mathbb{R}_0^+$
is a distance function. In (1), if we overlook the usefulness of
updates or their freshness, the GoE metric turns into the penalty
definition for the query AoI (QAoI), i.e., $\mathrm{GoE}_n = f(g_\Delta(\Delta_n))$,
or the utility formulation for the value of information (VoI), i.e.,
$\mathrm{GoE}_n = f(g_v(v_n))$, respectively.
B. The Pull-Based System’s GoE
Since the update acquisition and communication are done on
the condition of receiving queries in the pull-based system,
$\mathrm{GoE}_n^{(\mathrm{pull})}$ becomes limited to the slots at which the queries
arrive, i.e., $\forall n \in \mathcal{N}$. From (1), we can write
$$\mathrm{GoE}_n^{(\mathrm{pull})} = \mathrm{GoE}_n\big|_{\alpha_n = 1} \times \mathbf{1}(\alpha_n = 1) \tag{2}$$
where $\mathbf{1}(\cdot)$ denotes the indicator function.
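As an illustration of (1) and (2), the sketch below evaluates the GoE for one slot. It is a hedged example only: the function names are hypothetical, and the particular shapes chosen here (a truncated linear penalty `g_delta_linear`, an identity utility `g_v_linear`, and a product for `f`) are assumptions made for the example rather than forms stated in this section.

```python
DELTA_MAX = 10  # assumed AoI truncation level for this example

def g_delta_linear(delta):
    """Assumed non-increasing penalty: 1 at delta = 1, decreasing linearly."""
    return max(0.0, 1.0 - (delta - 1) / DELTA_MAX)

def g_v_linear(v):
    """Assumed non-decreasing utility: the meta-value itself."""
    return v

def f_product(penalty, utility):
    """Assumed composite function: product of the two attributes."""
    return penalty * utility

def goe(delta, v, f=f_product, g_delta=g_delta_linear, g_v=g_v_linear):
    """GoE of an update with AoI `delta` and meta-value `v`, as in (1)."""
    return f(g_delta(delta), g_v(v))

def goe_pull(delta, v, alpha, **kwargs):
    """Pull-based GoE, as in (2): nonzero only in slots where a query arrives."""
    return goe(delta, v, **kwargs) if alpha == 1 else 0.0

fresh_useful = goe_pull(delta=1, v=1.0, alpha=1)   # freshest, most useful update
stale_useful = goe_pull(delta=3, v=0.5, alpha=1)   # older, moderately useful update
no_query = goe_pull(delta=3, v=0.5, alpha=0)       # no query, so no contribution
```

Passing a different `f`, `g_delta`, or `g_v` changes the metric without touching the pull logic, which is how the QAoI-only and VoI-only special cases can be obtained.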
IV. QUERY CONTROL
In this section, we first formulate the query control problem
for the effect-aware policy, and we then propose a solution for
the defined decision/control problem.
A. Problem Formulation
The objective is to maximize the long-term expected GoE
via controlling the query arrivals, subject to a constraint on the
average communication cost $C_{\max}$, which cannot be surpassed.
In this sense, a class of optimal policies, named $\pi^*$, is derived
by solving an optimization problem as follows
$$\mathcal{P}_1: \max_{\pi}\ \limsup_{N \to \infty} \frac{1}{N}\, \mathbb{E}\Big[\sum_{n=1}^{N} \mathrm{GoE}_n^{(\mathrm{pull})} \,\Big|\, \mathrm{GoE}_0^{(\mathrm{pull})}\Big]$$
$$\text{s.t.}\ \limsup_{N \to \infty} \frac{1}{N}\, \mathbb{E}\Big[\sum_{n=1}^{N} g_c(\alpha_n c_n)\Big] \le C_{\max} \tag{3}$$
where $g_c: \mathbb{R}_0^+ \to \mathbb{R}_0^+$ is a non-decreasing function, and $c_n$
indicates the communication cost at the $n$-th time slot.

²The GoE metric can be seen as a particular case of the Semantics of
Information (SoI) metric [2], [11] and can also be defined in various forms
based on the scenario, including that utilized for active fault detection in [12].

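We cast $\mathcal{P}_1$ into an infinite-horizon constrained Markov
decision process (CMDP) based on the following definitions:
I) States: The state at the $n$-th slot depicts the GoE and
is denoted with a tuple $S_n = (\Delta_n, v_n)$. Without loss
of generality, we consider the AoI to be truncated by
the maximum value of $\Delta_{\max}$, which is large enough to
represent excessive staleness and meet $g_\Delta(\Delta_{\max} - 1) \le (1 + \varepsilon)\, g_\Delta(\Delta_{\max})$
with accuracy $\varepsilon$. Given this, $S_n$ is a
member of the state space $\mathcal{S} = \{\sigma_i \mid i = 1, 2, \ldots, |\mathcal{S}|\}$,
which is countable and finite with $|\mathcal{S}| = \Delta_{\max} \cdot |\mathcal{V}|$.
II) Actions: With regard to Section II, we define the action
space $\mathcal{A}$ with two possible outcomes: $\alpha_n = 1$ for pulling
an update, and $\alpha_n = 0$ for keeping silent.
III) Transition probabilities: The transition probability from
$S_n$ to $S_{n+1}$ under the action $\alpha_n$ is
$P(S_n, \alpha_n, S_{n+1}) = \Pr\big((\Delta_{n+1}, v_{n+1}) \mid (\Delta_n, v_n), \alpha_n\big)$, $\forall n$, where we define
$$\Pr\big((1, \nu_j) \mid (\Delta_n, \nu_i), \alpha_n\big) = p_{ij}\, \alpha_n \big(1 - p_\epsilon^{(n)}\big),\ \forall i, j,$$
$$\Pr\big((\min\{\Delta_n + 1, \Delta_{\max}\}, \nu_i) \mid (\Delta_n, \nu_i), \alpha_n\big) = \alpha_n\, p_\epsilon^{(n)} + (1 - \alpha_n),\ \forall i.$$
Otherwise, we have $P(S_n, \alpha_n, S_{n+1}) = 0$. Here, $\nu_i, \nu_j \in \mathcal{V}$,
and $p_\epsilon^{(n)} = \Pr(\epsilon_n = 1)$.
IV) Rewards: The reward of going from $S_n$ to $S_{n+1}$ in the
course of the action $\alpha_n$ is $R(S_n, \alpha_n, S_{n+1}) = \mathrm{GoE}_{n+1}^{(\mathrm{pull})}$.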
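The transition law in item III can be turned into explicit matrices, one per action. The sketch below is illustrative only: `build_transitions` and its return layout are hypothetical, a time-invariant error probability `p_err` is assumed in place of $p_\epsilon^{(n)}$, and the two-level meta-value chain in the usage lines is made-up example data.

```python
def build_transitions(P_v, delta_max, p_err):
    """Build P[alpha][s][s2] for states s = (delta, i), with delta in 1..delta_max.

    P_v   : meta-value transition matrix, P_v[i][j] = p_ij
    p_err : assumed constant probability that a pulled update is not received
    """
    n_v = len(P_v)
    states = [(d, i) for d in range(1, delta_max + 1) for i in range(n_v)]
    index = {s: k for k, s in enumerate(states)}
    n_s = len(states)
    P = [[[0.0] * n_s for _ in range(n_s)] for _ in range(2)]
    for (d, i), k in index.items():
        aged = index[(min(d + 1, delta_max), i)]
        # alpha = 0 (silent): the AoI grows by one (truncated), the value is kept.
        P[0][k][aged] += 1.0
        # alpha = 1 (pull), update lost: same aging move, with probability p_err.
        P[1][k][aged] += p_err
        # alpha = 1 (pull), update received: AoI resets to 1, value moves by p_ij.
        for j in range(n_v):
            P[1][k][index[(1, j)]] += P_v[i][j] * (1.0 - p_err)
    return states, index, P

# Example data: two meta-value levels, AoI truncated at 3, 20% loss probability.
states, index, P = build_transitions([[0.9, 0.1], [0.3, 0.7]], delta_max=3, p_err=0.2)
```

Each row of both matrices sums to one, which is a quick sanity check that the two cases in item III cover all outcomes.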
With the above definitions in mind, we give the following
proposition, which implies that the expected sum of GoE in (3) is
the same for all initial states, hence there exists an optimal
stationary policy for the defined problem.
Proposition 1. The modeled CMDP satisfies the weak accessibility
(WA) condition.
Proof. We can partition the state space $\mathcal{S}$ into two subsets
$\mathcal{S}_a = \{S_n \in \mathcal{S} \mid S_n = (1, \nu_n),\ \forall \nu_n \in \mathcal{V}\}$ and $\mathcal{S}_b = \mathcal{S} \setminus \mathcal{S}_a$. The
former subset contains all states with $\Delta_n = 1$, hence $|\mathcal{S}_a| = |\mathcal{V}|$.
The latter subset includes the rest of the states with $\Delta_n \ge 2$,
and $|\mathcal{S}_b| = (\Delta_{\max} - 1) \cdot |\mathcal{V}|$. In this regard, all states of $\mathcal{S}_b$ are
transient under any stationary policy, while every state of an
arbitrary pair of two states in $\mathcal{S}_a$ is accessible from the other
state. Given this and considering the WA condition as in [13,
Definition 4.2.2], the modeled CMDP is weakly accessible.
As the WA condition holds for the modeled CMDP, it can be
concluded that the optimal expected GoE remains the same for
all initial states, i.e., $\mathrm{GoE}_n^{(\mathrm{pull})}$, $\forall n$, is independent of $\mathrm{GoE}_0^{(\mathrm{pull})}$
[13, Proposition 4.2.3]. Moreover, Proposition 1 confirms the
existence of an optimal stationary policy $\pi^*$ for $\mathcal{P}_1$. This
optimal policy is unichain [13, Proposition 4.2.6]. To solve the
problem, we first relax its constrained form via defining a dual
problem, then we proceed with proposing an algorithm.
B. Dual Problem
We convert the constrained form of $\mathcal{P}_1$ to an unconstrained
one via writing the Lagrange function $\mathcal{L}(\mu; \pi)$ as follows
$$\mathcal{L}(\mu; \pi) = \limsup_{N \to \infty} \frac{1}{N}\, \mathbb{E}\Big[\sum_{n=1}^{N} \Big(\mathrm{GoE}_n^{(\mathrm{pull})} - \mu\, g_c(\alpha_n c_n)\Big)\Big] + \mu\, C_{\max} \tag{4}$$
where $\mu \ge 0$ indicates the Lagrange multiplier. Then, we can
summarize the Lagrange dual problem to be solved as
$$\mathcal{P}_2: \inf_{\mu \ge 0}\ \underbrace{\max_{\pi}\ \mathcal{L}(\mu; \pi)}_{:= h(\mu)} \tag{5}$$
in the form of an unconstrained MDP, where $h(\mu) = \mathcal{L}(\mu; \pi_\mu^*)$
shows the Lagrange dual function. Here, $\pi_\mu^*: \mathcal{S} \to \mathcal{A}$ appears
as the $\mu$-optimal policy and is derived from the dual problem
for a given $\mu$, as $\pi_\mu^* = \arg\max_{\pi} \mathcal{L}(\mu; \pi)$.
Since $\mathcal{S}$ has finite states, which ensures the growth condition
as in [14], and $\mathrm{GoE}_n \ge 0$, $\forall n$, from (1), $\mathcal{P}_1$ and $\mathcal{P}_2$ converge to
the same optimal values [14, Corollary 12.2]. Therefore, we can
find the class of optimal policies $\pi^*$ after applying an iterative
algorithm approach [15].
C. Iterative Algorithm
The iterative approach is illustrated in Algorithm 1 with two
inner and outer loops for deriving the $\mu$-optimal policy, i.e.,
$\pi_\mu^*$, and the optimal Lagrange multiplier, i.e., $\mu^*$, respectively.
1) Computing $\pi_\mu^*$: In the inner loop, with a given $\mu$ from
the outer loop, the query policy is iteratively updated taking an
optimal action which maximizes the expected utility (value)
$V_{\pi_\mu^*}^{(t)}(s)$ for the state $s \in \mathcal{S}$ at the $t$-th, $t \in \mathbb{N}$, iteration.
Under the form of the value iteration approach for the unichain
policy MDPs [16], the optimal value function is derived from
Bellman’s equation [17], as follows
$$V_{\pi_\mu^*}^{(t)}(s) = \max_{\alpha \in \mathcal{A}} \sum_{s' \in \mathcal{S}} P(s, \alpha, s') \Big[R_\mu(s, \alpha, s') + V_{\pi_\mu^*}^{(t-1)}(s')\Big]. \tag{6}$$
Consequently, the optimal policy for $s \in \mathcal{S}$ is updated by
$$\pi_\mu^*(s) \in \arg\max_{\alpha \in \mathcal{A}} \sum_{s' \in \mathcal{S}} P(s, \alpha, s') \Big[R_\mu(s, \alpha, s') + V_{\pi_\mu^*}^{(t-1)}(s')\Big] \tag{7}$$
where $R_\mu(s, \alpha, s') := R(s, \alpha, s') - \mu\, g_c(\alpha c)$ is the net reward.
The inner loop stops once the stopping/convergence criterion
$\max_{s \in \mathcal{S}} \big(V_{\pi_\mu^*}^{(t+1)}(s) - V_{\pi_\mu^*}^{(t)}(s)\big) \le \varepsilon_v$ is satisfied, where $\varepsilon_v$ indicates
the convergence accuracy. As the query policies are unichain
with aperiodic transition matrices, the above convergence criterion
is reached within a finite number of iterations [16, Theorem 8.5.4].
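The update in (6) and (7) can be sketched in a few lines. Note that this example swaps in relative value iteration with a span-based stopping rule, a common variant for average-reward MDPs that keeps the values bounded; it is not claimed to be the exact procedure of Algorithm 1. The function name, the argument layout, and the two-state toy MDP are all hypothetical.

```python
def relative_value_iteration(P, R, mu, cost, eps=1e-6, max_iter=10_000):
    """Return (policy, gain) for the net reward R - mu * cost.

    P[a][s][s2], R[a][s][s2] : transition probabilities and rewards
    cost[a]                  : per-slot cost g_c(a * c) of action a
    """
    n_a, n_s = len(P), len(P[0])
    V = [0.0] * n_s
    policy = [0] * n_s
    gain = 0.0
    for _ in range(max_iter):
        V_new = [0.0] * n_s
        for s in range(n_s):
            best_q, best_a = None, 0
            for a in range(n_a):
                # Bellman backup of (6) with the net reward R_mu.
                q = sum(P[a][s][s2] * (R[a][s][s2] - mu * cost[a] + V[s2])
                        for s2 in range(n_s))
                if best_q is None or q > best_q:
                    best_q, best_a = q, a
            V_new[s], policy[s] = best_q, best_a
        diff = [V_new[s] - V[s] for s in range(n_s)]
        gain = 0.5 * (max(diff) + min(diff))   # estimate of the average net reward
        span = max(diff) - min(diff)
        V = [x - V_new[0] for x in V_new]      # re-center so values stay bounded
        if span < eps:
            break
    return policy, gain

# Toy MDP: action 0 (silent) stays put with reward 0; action 1 (pull)
# moves to either state with probability 0.5 and earns reward 1 at cost 1.
P_toy = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]]]
R_toy = [[[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]
cost_toy = [0.0, 1.0]
policy_free, gain_free = relative_value_iteration(P_toy, R_toy, mu=0.0, cost=cost_toy)
policy_costly, gain_costly = relative_value_iteration(P_toy, R_toy, mu=2.0, cost=cost_toy)
```

In the toy case, a zero multiplier makes pulling optimal in every state, while a multiplier of 2 makes the net reward of pulling negative, so the policy stays silent.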
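2) Computing $\mu^*$: In order to find the optimal Lagrange
multiplier $\mu^{(l)}$ at the $l$-th, $l \in \mathbb{N}$, step of the outer loop,
according to the updated $\pi_\mu^*$ from the inner loop, we apply
the so-called bisection method with the stopping criterion
$|\mu^+ - \mu^-| \le \varepsilon_\mu$ and the search accuracy $\varepsilon_\mu$, as depicted in
Algorithm 1. From (4) and (5), the increase of $\mu$ continuously
increases the dual function $h(\mu)$ while decreasing the
net reward $\mathrm{GoE}_n^{(\mathrm{pull})} - \mu\, g_c(\alpha_n c_n)$ and the query arrival rate.
Thus, we search for the smallest value of $\mu$ that satisfies the
communication cost constraint $C_{\max}$. Since $\mathrm{GoE}_n^{(\mathrm{pull})}$ is independent of
$\mu\, g_c(\alpha_n c_n)$, one can verify that $h(\mu)$ is a Lipschitz continuous
function of $\mu$ with the Lipschitz constant of
$$C_{\max} - \limsup_{N \to \infty} \frac{1}{N}\, \mathbb{E}\Big[\sum_{n=1}^{N} g_c(\alpha_n c_n)\Big].$$
Thus, the outer loop converges to the optimal multiplier after a finite
number of iterations [18, pp. 294]. The optimal value is attained based
on a simple non-randomized stationary policy or a mix of two
non-randomized policies with a mixing probability $\eta$, which
can be obtained such that $\mathbb{E}\big[\sum_{n=1}^{N} g_c(\alpha_n c_n)\big] = N C_{\max}$ [19].

Algorithm 1: Solution for deriving $\pi^*$ and $\mu^*$.
Input: Known parameters $N \gg 1$, $c_n$, $\forall n$, $\varepsilon_\mu$, $C_{\max}$, $\eta$,
  states $\mathcal{S}$, and actions $\mathcal{A}$. Initial values $l \leftarrow 1$,
  $\mu^{(0)} \leftarrow 0$, $\mu^- \leftarrow 0$, $\mu^+ \gg 1$, $\pi_\mu^- \leftarrow 0$, and
  $\pi_\mu^+ \leftarrow 0$. The form of $g_c$.
1  Initialize $\pi_\mu^*(s)$, $\forall s \in \mathcal{S}$, via running Utility($\mu^{(0)}$).
2  if $\mathbb{E}\big[\sum_{n=1}^{N} g_c(\alpha_n c_n)\big] \le N C_{\max}$ then goto 10.
3  while $|\mu^+ - \mu^-| \ge \varepsilon_\mu$ do
     Step $l$: Outer loop (Bisection search)
4    Update $\mu^{(l)} \leftarrow \frac{\mu^+ + \mu^-}{2}$ and $\pi_\mu^* \leftarrow$ Utility($\mu^{(l)}$).
5    if $\mathbb{E}\big[\sum_{n=1}^{N} g_c(\alpha_n c_n)\big] \ge N C_{\max}$ then
6      $\mu^- \leftarrow \mu^{(l)}$, and $\pi_\mu^- \leftarrow$ Utility($\mu^-$).
7    else $\mu^+ \leftarrow \mu^{(l)}$, and $\pi_\mu^+ \leftarrow$ Utility($\mu^+$).
8    Reset $l \leftarrow l + 1$.
9  if $\mathbb{E}\big[\sum_{n=1}^{N} g_c(\alpha_n c_n)\big] < N C_{\max}$ then
     $\pi_\mu^*(s) \leftarrow \eta\, \pi_\mu^-(s) + (1 - \eta)\, \pi_\mu^+(s)$, $\forall s \in \mathcal{S}$.
10 return $\mu^* = \mu^{(l)}$ and $\pi^*(s) = \pi_\mu^*(s)$, $\forall s \in \mathcal{S}$.
Function Utility($\mu$):
Input: Known parameters $N \gg 1$, $\varepsilon_v$, states $\mathcal{S}$, and
  actions $\mathcal{A}$. Initial values $t \leftarrow 1$, $\pi_\mu^*(s) \leftarrow 0$,
  and $V_{\pi_\mu^*}^{(0)}(s) \leftarrow 0$, $\forall s \in \mathcal{S}$.
   Iteration $t$: Inner loop (Value iteration approach)
11 for state $s \in \mathcal{S}$ do
12   compute $V_{\pi_\mu^*}^{(t)}(s)$ from (6), then update $\pi_\mu^*(s)$ as
     in (7) with the optimal action from $V_{\pi_\mu^*}^{(t)}(s)$.
13 if $\max_{s \in \mathcal{S}} \big(V_{\pi_\mu^*}^{(t+1)}(s) - V_{\pi_\mu^*}^{(t)}(s)\big) \ge \varepsilon_v$ then
14   step up $t \leftarrow t + 1$, and goto 11.
15 return $\pi_\mu^*$.
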
Algorithm 1 has at most $\mathcal{O}(2LT\Delta_{\max}^2|\mathcal{V}|^2)$ arithmetic operations,
where $L$ and $T$ indicate the numbers of steps of the outer
and inner loops, respectively. Increasing the number of states or the
number of iterations of either loop increases the algorithm’s complexity.
Nevertheless, this complexity is manageable in real-world
scenarios owing to the following items:
• $\Delta_{\max}$ is set to a small value since the effectiveness of an
update at the endpoint saturates to a close to zero value
past a level of staleness, making that update not useful.
• As the usefulness of the updates can be normalized, i.e.,
$\nu_i \in [0, 1]$, $\forall \nu_i \in \mathcal{V}$, a large outcome space is not
necessarily needed for $\mathcal{V}$.
• The communication cost is usually fixed, thus reducing
the bisection search interval, hence $L$, after some trials.
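The outer loop and the mixing step can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: `avg_cost_of` is a hypothetical callable returning the long-run average cost of the $\mu$-optimal policy and is assumed to be non-increasing in $\mu$, the step-shaped toy cost stands in for a real inner solver, and `mixing_probability` solves the cost-matching condition for two given policies.

```python
def bisect_multiplier(avg_cost_of, c_max, mu_hi=10.0, eps_mu=1e-3):
    """Approximate the smallest multiplier whose mu-optimal policy meets c_max.

    avg_cost_of : callable mu -> average cost of the mu-optimal policy
                  (assumed non-increasing in mu)
    mu_hi       : assumed upper bound at which the constraint already holds
    """
    if avg_cost_of(0.0) <= c_max:
        return 0.0                      # constraint inactive, no penalty needed
    if avg_cost_of(mu_hi) > c_max:
        raise ValueError("mu_hi is too small to satisfy the cost constraint")
    mu_lo = 0.0
    while mu_hi - mu_lo > eps_mu:
        mu_mid = 0.5 * (mu_lo + mu_hi)
        if avg_cost_of(mu_mid) > c_max:
            mu_lo = mu_mid              # still too costly: raise the penalty
        else:
            mu_hi = mu_mid              # feasible: try a smaller penalty
    return mu_hi

def mixing_probability(cost_lo, cost_hi, c_max):
    """Probability eta of using the costlier policy so the mix meets c_max exactly."""
    return (c_max - cost_hi) / (cost_lo - cost_hi)

# Toy cost curve: deterministic policies make the cost jump at some multiplier.
def toy_cost(mu):
    return 0.5 if mu < 1.3 else 0.2

mu_star = bisect_multiplier(toy_cost, c_max=0.4)
eta = mixing_probability(cost_lo=0.5, cost_hi=0.2, c_max=0.4)
```

Because the average cost of deterministic policies changes in jumps, no single multiplier may hit $C_{\max}$ exactly, which is why a randomized mix of the two policies on either side of the jump is useful.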
D. Threshold-Based Query Control Model
Finding the optimal policy $\pi^*$ from Algorithm 1 gives us
a threshold criterion for every CMDP state, following
which maximizes the GoE of the system subject to the communication
cost. With this threshold in hand, the query controller decides
in a timely manner whether to pull an update based on the current AoI,
i.e., $\Delta_n$, and the meta-value of the latest correctly received
update at the query time, i.e., $v_m$ for $m = \max_{i:\, \alpha_i(1-\epsilon_i) = 1,\ i \le n} i$.
In this sense, the optimal action $\alpha_n^*$ for the $n$-th time slot can
be derived according to two alternative options, as given below.
Option I: The value of $\Delta_{\mathrm{th}}$ is a function of $v_m$, so the
threshold is $\Delta_{\mathrm{th}}^{(v_m)}$, hence
$$\alpha_n^* = \mathbf{1}\big(\Delta_n \ge \Delta_{\mathrm{th}}^{(v_m)} \mid v_m\big). \tag{8}$$
Option II: The level of $v_{\mathrm{th}}$ depends on $\Delta_n$. Thus, the
threshold is $v_{\mathrm{th}}^{(\Delta_n)}$, hence
$$\alpha_n^* = \mathbf{1}\big(v_m \le v_{\mathrm{th}}^{(\Delta_n)} \mid \Delta_n\big). \tag{9}$$
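The two options amount to a table lookup at query time, as the sketch below illustrates. The function names and table layout are hypothetical. The table entries encode only the two cost-independent rules reported with Table I (always pull when the last received meta-value is at most 0.44, and always pull once the AoI reaches 4); treating all remaining states as silent is a simplifying assumption, since those boundaries depend on the cost.

```python
LEVELS = [0.0, 0.11, 0.22, 0.33, 0.44, 0.56, 0.67, 0.78, 0.89, 1.0]
DELTA_MAX = 10

# Option I table: AoI threshold for each meta-value level (hypothetical fill).
delta_th = {v: (1 if v <= 0.44 else 4) for v in LEVELS}

# Option II table: meta-value threshold for each AoI (same boundary, other view).
v_th = {d: (0.44 if d < 4 else 1.0) for d in range(1, DELTA_MAX + 1)}

def pull_option_one(delta_n, v_m):
    """Pull (1) when the current AoI has reached the threshold for v_m."""
    return 1 if delta_n >= delta_th[v_m] else 0

def pull_option_two(delta_n, v_m):
    """Pull (1) when the last received meta-value is at or below the threshold."""
    return 1 if v_m <= v_th[delta_n] else 0

# Both views describe the same decision boundary over the whole state space.
decisions_one = {(d, v): pull_option_one(d, v) for d in v_th for v in LEVELS}
decisions_two = {(d, v): pull_option_two(d, v) for d in v_th for v in LEVELS}
```

Either table has only as many entries as there are meta-value levels or AoI values, so the controller's per-slot work is a single comparison.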
V. SIMULATION RESULTS
We assess the performance of the proposed effect-aware
query control policy within $N = 1000$ time slots and
compare it with three existing effect-agnostic query arrival
models, namely (i) a periodic model, and stochastic models
following (ii) a binomial and (iii) a Markovian process. For the
latter model, we have a Markov chain with two states of
“pulling an update” and “keeping silent”, in which the self-transition
probability of the latter state is 0.95, while the
one for the former state depends on the query rate. The default
query rate for the effect-agnostic policies is 0.8, unless
otherwise stated. Furthermore, we equally divide the interval
$[0, 1]$ into $|\mathcal{V}| = 10$ levels and initialize the importance set
$\mathcal{V} = \{0, 0.11, 0.22, 0.33, 0.44, 0.56, 0.67, 0.78, 0.89, 1\}$, each
indicating a normalized rank of importance with $p_{ii} = p_{ij} = \frac{1}{|\mathcal{V}|}$,
$\forall \nu_i, \nu_j \in \mathcal{V}$. The maximum acceptable AoI is $\Delta_{\max} = 10$,
and the probability of the update discrepancy is $p_\epsilon^{(n)} = 0.2$, $\forall n$.
Also, for Algorithm 1, we set $\varepsilon_v = \varepsilon_\mu = 10^{-3}$ and $\eta = 0.5$.
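For the Markovian query model, one way to tie the self-transition probability of the pulling state to the query rate is sketched below. This is an assumption made for illustration: the query rate is interpreted as the stationary probability of the pulling state, which the text does not state explicitly, and the function name is hypothetical.

```python
def pull_self_transition(query_rate, silent_stay=0.95):
    """Self-transition probability q of the pulling state.

    For a two-state chain, the stationary probability of pulling is
    (1 - silent_stay) / ((1 - silent_stay) + (1 - q)). Setting it equal to
    query_rate and solving for q gives the expression below.
    """
    if not 0.0 < query_rate <= 1.0:
        raise ValueError("query_rate must lie in (0, 1]")
    q = 1.0 - (1.0 - silent_stay) * (1.0 - query_rate) / query_rate
    if q < 0.0:
        raise ValueError("query_rate is too low for this silent_stay value")
    return q

def stationary_pull_probability(q, silent_stay=0.95):
    """Long-run fraction of slots spent in the pulling state."""
    leave_silent = 1.0 - silent_stay
    leave_pull = 1.0 - q
    return leave_silent / (leave_silent + leave_pull)

q_default = pull_self_transition(0.8)            # default query rate of 0.8
rate_check = stationary_pull_probability(q_default)
```

Under this reading, very low query rates are not reachable with a silent self-transition of 0.95, because the chain leaves the silent state too often; the function raises an error in that case.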
For performance evaluation, we define a net GoE (NGoE)
metric, which incorporates the GoE and the cost, as follows
$$\mathrm{NGoE}_n^{(\mathrm{pull})} = \exp\big(\mathrm{GoE}_n^{(\mathrm{pull})} - g_c(\alpha_n c_n)\big) = \exp\big(f(g_\Delta(\Delta_n), g_v(v_n)) - c_0 \alpha_n\big) \tag{10}$$
for the $n$-th slot, $\forall n \in \mathcal{N}$, where the exponential form is
arbitrarily used to guarantee positiveness. In (10), without loss
of generality, we employ a linear form for $g_\Delta$, $g_v$, and $g_c$, and a
multiplication form for $f$ according to (1). Besides, we consider
uniform cost $c_n = c_0$, $\forall n$, and unless otherwise specified, we
initialize $c_0 = 0.5$ and $C_{\max} = 0.4$.

Fig. 1. The effectiveness performance of different query control policies within
$N = 50$ slots. (Axes: $\mathrm{NGoE}_n^{(\mathrm{pull})}$ versus update time slots $n$; curves: effect-aware, effect-agnostic (periodic).)

Fig. 2. The CDF of the average NGoE provided over $N = 1000$ time slots.
(Axes: CDF of average NGoE versus average NGoE; curves: effect-aware (simulation), effect-aware (analytical), effect-agnostic (periodic), effect-agnostic (binomial), effect-agnostic (Markovian).)

Fig. 1 depicts a 50-slot snapshot of the status update system
according to the arrived queries and the updates’ NGoE,
comparing the NGoE of the proposed effect-aware policy
with that of the periodic effect-agnostic query control policy. The latter
policy is assumed to have 7-slot period intervals. We observe
that our effect-aware policy enables capturing updates with the
highest NGoE, of which the horizons are highlighted. However,
in some slots, the effect-aware policy does not pull useful
updates due to the probabilistic uncertainties of the CMDP
problem, while the periodic one can catch them by chance.
Fig. 2 presents the cumulative distribution function (CDF) of
the average NGoE provided by applying the effect-aware and
effect-agnostic policies over $N = 1000$ time slots. It is shown
that the effect-aware policy substantially boosts the effectiveness of
the system, thanks to its prediction of the updates’ usefulness
and through pulling significant updates, compared to the effect-agnostic
policies. Specifically, the effect-aware policy increases
the effectiveness by 91%, 47%, and 149% on average, as
compared to the binomial, Markovian, and periodic models,
respectively. This comes at the cost of a 16% higher average
transmission rate for the effect-aware policy than the others.

Fig. 3. The interplay between average NGoE and query rate for different query
control policies and cost coefficients (a) $c_0 = 0.1$ and (b) $c_0 = 0.5$.
(Axes: average NGoE versus controlled query rate; curves: effect-aware (simulation), effect-aware (analytical), effect-agnostic (periodic), effect-agnostic (binomial), effect-agnostic (Markovian), QAoI-aware, VoI-aware.)

Figs. 3 (a) and 3 (b) show the interplay between average
NGoE and controlled query rate for $c_0 = 0.1$ and $c_0 = 0.5$,
respectively, for effect-aware and effect-agnostic policies, as
well as for two special cases relevant to QAoI and VoI within
$N = 1000$ time slots. In the QAoI-aware case, decision policies
are obtained so as to maximize the long-term expected QAoI
regardless of the usefulness or importance of the updates,
whereas in the VoI-aware case, decision policies depend on the
expected VoI, without taking freshness into account. We can
see that the effect-aware policy outperforms the effect-agnostic
policies for all query rates and under both cost values.
Comparing Fig. 3 (a) with Fig. 3 (b), we infer that, for $c_0 = 0.1$,
the special cases offer almost the same performance as the
effect-aware policy considering both freshness and usefulness
attributes. This is because all policies can pull the same updates,
thus having identical performance. However, for $c_0 = 0.5$, the
effect-aware policy outperforms both special cases for all query
rates, even using fewer resources, owing to its effect-aware
query control. Also, since $\Delta_n \ge 1$, and $0 \le v_n \le 1$, $\forall n$,
bypassing the freshness attribute in the GoE metric leads to a
significant performance drop. Besides, the increase of the cost
decreases the offered average NGoE, and it does so faster
for the effect-agnostic policies. Indeed, useless or irrelevant
transmissions under the effect-agnostic policies result in a higher
waste of resources and a larger gap between the achieved NGoE values.

Fig. 4. The expected utility obtained in each iteration of Algorithm 1.
(Axes: expected utility versus iteration; curves: $c_0 = 0.1$, $c_0 = 0.5$, $c_0 = 1$.)

To study the convergence of the iterative approach for solving
the CMDP problem in Algorithm 1, we plot Fig. 4,
which illustrates the value (expected utility) obtained in each
iteration for different cost coefficients. It can be seen that the
policy converges to its final value after 130, 133, and 137
iterations for the cost coefficients $c_0 = 0.1$, $c_0 = 0.5$,
and $c_0 = 1$, respectively. We can also observe that the value provided
in each iteration decreases by increasing the cost, in line with
the formulation of the CMDP’s rewards in Section IV-A.
Finally, Table I demonstrates the threshold boundaries for the
threshold-based query control model discussed in Section IV-D
for cost coefficients $c_0 = 0.5$ and $c_0 = 1$. The term “Silent”
corresponds to $\alpha_n = 0$, and “Pull” indicates $\alpha_n = 1$. Either
Option I or Option II could be applied to find the threshold based
on $\Delta_{\mathrm{th}}^{(v_m)}$ or $v_{\mathrm{th}}^{(\Delta_n)}$, respectively. As in Table I, the query
controller pulls updates in case $v_m \le 0.44$, $\forall m$, regardless
of the current AoI. Also, updates are always pulled if $\Delta_n \ge 4$,
$\forall n$, independent of the degree of usefulness. For the above
conditions, we thus have fixed $\Delta_{\mathrm{th}} = 4$ and $v_{\mathrm{th}} = 0.44$
under any cost coefficient for the former and latter options,
respectively. In order to derive the optimal action for the other
conditions (CMDP’s states), the communication cost plays a
key role, hence resulting in variable threshold metrics. As the
cost increases, the controller should pull updates merely under
more critical conditions so as to increase the system’s NGoE.
TABLE I
ILLUSTRATION OF THE THRESHOLD BOUNDARIES FOR THE THRESHOLD-BASED QUERY CONTROL MODEL
$v_m$: 0–0.44 | 0.56 | 0.67 | 0.78 | 0.89 | 1
$c_0 = 0.5$, $\Delta_n$ = 1–2: Silent
$c_0 = 0.5$, $\Delta_n$ = 3: Silent
$c_0 = 0.5$, $\Delta_n$ = 4–10: Pull | Pull | Pull
$c_0 = 1$, $\Delta_n$ = 1–3: Silent
$c_0 = 1$, $\Delta_n$ = 4–10: Pull | Pull

VI. CONCLUSION
We have proposed an effect-aware query control policy
for pull-based communication systems, in which the query
controller decides in a timely manner whether to pull an update, depending
on the source’s evolution and the updates’ effectiveness at the
endpoint. We have considered the problem of GoE maximization
based on a CMDP with finite state spaces for the AoI
and the usefulness rank and have provided an algorithm to
find the class of optimal policies. Our results have shown that
the effect-aware query policy could provide significant gains in
terms of net GoE compared to effect-agnostic policies
for different query rates and communication costs.
REFERENCES
[1] C. E. Shannon and W. Weaver, The Mathematical Theory of Communi-
cation. University of Illinois Press, Urbana, 1949.
[2] M. Kountouris and N. Pappas, “Semantics-empowered communication
for networked intelligent systems,” IEEE Comm. Mag., vol. 59, no. 6,
2021.
[3] P. Popovski, O. Simeone, F. Boccardi, D. Gündüz, and O. Sahin,
“Semantic-effectiveness filtering and control for post-5G wireless
connectivity,” J. IISc, vol. 100, no. 2, 2020.
[4] B. Yin, S. Zhang, Y. Cheng, L. X. Cai, Z. Jiang, S. Zhou, and Z. Niu,
“Only those requested count: Proactive scheduling policies for minimizing
effective age-of-information,” in IEEE INFOCOM, 2019.
[5] F. Li, Y. Sang, Z. Liu, B. Li, H. Wu, and B. Ji, “Waiting but not aging:
Optimizing information freshness under the pull model,” IEEE/ACM
Transactions on Networking, vol. 29, no. 1, 2021.
[6] J. Holm, A. E. Kalør, F. Chiariotti, B. Soret, S. K. Jensen, T. B. Pedersen,
and P. Popovski, “Freshness on demand: Optimizing age of information
for the query process,” in IEEE ICC, 2021.
[7] F. Chiariotti, J. Holm, A. E. Kalør, B. Soret, S. K. Jensen, T. B. Pedersen,
and P. Popovski, “Query age of information: Freshness in pull-based
communication,” IEEE Trans. on Commun., vol. 70, no. 3, 2022.
[8] M. E. Ildiz, O. T. Yavascan, E. Uysal, and O. T. Kartal, “Query age of
information: Optimizing AoI at the right time,” in IEEE ISIT, 2022.
[9] C. E. Leiserson, “Systolic priority queues,” Proceedings of the Caltech
Conference On Very Large Scale Integration, 1979.
[10] P. Agheli, N. Pappas, and M. Kountouris, “Semantic filtering and
source coding in distributed wireless monitoring systems,” arXiv preprint
arXiv:2305.11549, 2023.
[11] N. Pappas and M. Kountouris, “Goal-oriented communication for real-
time tracking in autonomous systems,” in IEEE ICAS, 2021.
[12] G. J. Stamatakis, N. Pappas, A. Fragkiadakis, and A. Traganitis,
“Semantics-aware active fault detection in IoT,” in 20th WiOpt, 2022.
[13] D. Bertsekas, Dynamic programming and optimal control. Athena
Scientific, 2007, vol. 2.
[14] E. Altman, Constrained Markov decision processes. CRC press, 1999.
[15] M. Hatami, M. Leinonen, Z. Chen, N. Pappas, and M. Codreanu,
“On-demand AoI minimization in resource-constrained cache-enabled
IoT networks with energy harvesting sensors,” IEEE Transactions on
Communications, vol. 70, no. 11, 2022.
[16] M. L. Puterman, Markov decision processes: discrete stochastic dynamic
programming. John Wiley & Sons, 2014.
[17] R. Bellman, “On the theory of dynamic programming,” Proceedings of
The National Academy of Sciences, vol. 38, no. 8, 1952.
[18] G. Wood, Bisection global optimization methods, C. A. Floudas and P. M.
Pardalos, Eds. Springer Science & Business Media, 2009.
[19] F. J. Beutler and K. W. Ross, “Optimal policies for controlled Markov
chains with a constraint,” Journal of mathematical analysis and applica-
tions, vol. 112, no. 1, 1985.
... On the other hand, different communication metrics that take the content of the message into account, such as the AoI, Value of Information (VoI), and semantic metric, e.g., Bilingual Evaluation Understudy (BLEU) [16], structural similarity index measure (SSIM) [17] have been considered in different tasks including UGV control [18], image transmission and [19] and image classification [20]. The author in [21] introduced a pull-based communication system, where a sensing agent adjusts an actuation agent through a query control policy. The results show that the proposed agent is able to respond to changing information sources and is updated to be effective in achieving specific goals. ...
Preprint
Real-time three-dimensional (3D) scene representations serve as one of the building blocks that bolster various innovative applications, e.g., digital manufacturing, Virtual/Augmented/Extended/Mixed Reality (VR/AR/XR/MR), and the metaverse. Despite substantial efforts that have been made to real-time communications and computing, real-time 3D scene representations remain a challenging task. This paper investigates the tradeoff between timeliness and fidelity in real-time 3D scene representations. Specifically, we establish a framework to evaluate the impact of communication delay on the tradeoff, where the real-world scenario is monitored by multiple cameras that communicate with an edge server. To improve fidelity for 3D scene representations, we propose to use a single-step Proximal Policy Optimization (PPO) method that leverages the Age of Information (AoI) to decide if the received image needs to be involved in 3D scene representations and rendering. We test our framework and the proposed approach with different well-known 3D scene representation methods. Simulation results reveal that real-time 3D scene representation can be sensitively affected by communication delay, and our proposed method can achieve optimal 3D scene representation results.
Article
Deep reinforcement learning (DRL) has been a key machine learning technique in many 5G and 6G applications. DRL agents learn optimal (or sub-optimal) policies by interacting with the environment. However, this process often involves numerous uninformative and repetitive message transmissions between the DRL agent and its environment. In this paper, we address the problem of reducing interactions between the DRL agent and the environment, called goal-oriented DRL. Meanwhile, Terahertz (THz) bands and unmanned aerial vehicles (UAVs) are considered two of the main enablers of 6G. Therefore, we investigate the goal-oriented DRL problem in a THz-enabled UAV-aided network. We formulate it as an optimization problem with the goals of i) reducing interactions between the UAV (DRL agent) and IoT devices (environment), ii) maximizing the number of served IoT devices, and iii) ensuring fairness. The constraints include the movement characteristics of IoT devices, the maximum speed limitation of the UAV, the QoS requirements of the served IoT devices, and the limited uplink coverage of the THz-enabled UAV. This problem is a mixed-integer nonlinear programming optimization problem and is NP-hard. To address this problem, we employ the decoupling optimization method and an approach inspired by the self-triggered method from control engineering. Specifically, the problem is divided into two sub-problems. Then, we propose using supervised learning as a teacher for DRL to reduce the interactions. Our simulation results show that the goal-oriented DRL approach outperforms conventional methods by reducing interactions and maintaining good performance in terms of the number of served IoT devices and fairness.
Article
The problem of goal-oriented semantic filtering and timely source coding in multiuser communication systems is considered here. We study a distributed monitoring system in which multiple information sources, each observing a physical process, provide status update packets to multiple monitors having heterogeneous goals. Two semantic filtering schemes are first proposed as a means to admit or drop arrival packets based on their goal-dependent importance, which is a function of the intrinsic and extrinsic attributes of information and the probability of occurrence of each realization. Admitted packets at each sensor are then encoded and transmitted over block-fading wireless channels so that served monitors can timely fulfill their goals. A truncated error control scheme is derived, which allows transmitters to drop or retransmit undelivered packets based on their significance. Then, we formulate the timely source encoding optimization problem and analytically derive the optimal codeword lengths assigned to the admitted packets which maximize a weighted sum of semantic utility functions for all pairs of communicating sensors and monitors. Our analytical and numerical results provide the optimal design parameters for different arrival rates and highlight the improvement in timely status update delivery using the proposed semantic filtering, source coding, and error control schemes.
Article
We consider a resource-constrained IoT network, where multiple users make on-demand requests to a cache-enabled edge node to send status updates about various random processes, each monitored by an energy harvesting sensor. The edge node serves users’ requests by deciding whether to command the corresponding sensor to send a fresh status update or retrieve the most recently received measurement from the cache. Our objective is to find the best actions of the edge node to minimize the average age of information (AoI) of the received measurements upon request, i.e., average on-demand AoI, subject to per-slot transmission and energy constraints. First, we derive a Markov decision process model and propose an iterative algorithm that obtains an optimal policy. Then, we develop an asymptotically optimal low-complexity algorithm, termed relax-then-truncate, and prove that it is optimal as the number of sensors goes to infinity. Simulation results illustrate that the proposed relax-then-truncate approach significantly reduces the average on-demand AoI compared to a request-aware greedy policy and a weighted AoI policy, and also depict that it performs close to the optimal solution even for moderate numbers of sensors.
Article
Wireless connectivity has traditionally been regarded as an opaque data pipe carrying messages, whose context-dependent meaning and effectiveness have been ignored. Nevertheless, in emerging cyber-physical and autonomous networked systems, acquiring, processing, and sending excessive amounts of distributed real-time data, which ends up being stale or useless to the end user, will cause communication bottlenecks, increased latency, and safety issues. We envision a communication paradigm shift, which makes the semantics of information (i.e., the significance and usefulness of messages) the foundation of the communication process. This entails a goal-oriented unification of information generation, transmission, and reconstruction, by taking into account process dynamics, signal sparsity, data correlation, and semantic information attributes. We apply this structurally new, synergetic approach to a communication scenario where the destination is tasked with real-time source reconstruction for the purpose of remote actuation. Capitalizing on semantics-empowered sampling and communication policies, we show significant reduction in both reconstruction error and cost of actuation error, as well as in the number of uninformative samples generated.
Article
We study a pull-based status update communication model where a source node submits update packets to a channel with random transmission delay, at times requested by a remote destination node. The objective is to minimize the average query-age-of-information (QAoI), defined as the average age-of-information (AoI) measured at query instants that occur at the destination side according to a stochastic arrival process. In reference to a push-based problem formulation defined in the literature where the source decides to update or wait at will, with the objective of minimizing the time average AoI at the destination, we name this problem the Pull-or-Wait (PoW) problem. We identify the PoW problem in the case of a single query as a stochastic shortest path (SSP) problem with uncountable state and action spaces, which has not been solved in previous literature. We derive an optimal solution for this SSP problem and use it as a building block for the solution of the PoW problem under periodic query arrivals.
Article
Age of Information (AoI) has become an important concept in communications, as it allows system designers to measure the freshness of the information available to remote monitoring or control processes. However, its definition tacitly assumes that new information is used at any time, which is not always the case: the instants at which information is collected and used may be dependent on a certain query process, and resource-constrained environments such as most Internet of Things (IoT) use cases require precise timing to fully exploit the limited available transmissions. In this work, we consider a pull-based communication model in which the freshness of information is only important when the receiver generates a query: if the monitoring process is not using the value, the age of the last update is irrelevant. We optimize the Age of Information at Query (QAoI), a metric that samples the AoI at relevant instants, better fitting the pull-based resource-constrained scenario, and show how this can lead to very different choices. Our results show that QAoI-aware optimization can significantly reduce the average and worst-case perceived age for both periodic and stochastic queries.
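As a rough illustration of the distinction drawn in the abstract above, the following sketch contrasts the time-average AoI with the AoI sampled only at query instants. It is a minimal example under stated assumptions, not the cited authors' model: time is slotted, the age starts from an update generated at slot 0, and the function names, delivery times, and query times are all hypothetical.

```python
from typing import List, Tuple

# Each delivery is (delivery_slot, generation_slot); all values are illustrative.
Delivery = Tuple[int, int]


def aoi(t: int, deliveries: List[Delivery]) -> int:
    """Age at slot t: t minus the generation slot of the freshest update
    delivered by t. Assumes an initial update generated at slot 0."""
    freshest = max((g for d, g in deliveries if d <= t), default=0)
    return t - freshest


def average_aoi(horizon: int, deliveries: List[Delivery]) -> float:
    """Time-average AoI over slots 0 .. horizon-1."""
    return sum(aoi(t, deliveries) for t in range(horizon)) / horizon


def average_qaoi(queries: List[int], deliveries: List[Delivery]) -> float:
    """Average AoI measured only at the query slots."""
    return sum(aoi(q, deliveries) for q in queries) / len(queries)


if __name__ == "__main__":
    deliveries = [(3, 2), (8, 7)]  # updates arrive just in time for the queries
    queries = [3, 8]
    print("time-average AoI:", average_aoi(10, deliveries))
    print("AoI at queries:  ", average_qaoi(queries, deliveries))
```

With these hypothetical numbers the deliveries are timed to land at the query slots, so the age seen at queries is lower than the time-average age, which is the effect the query-aware metric is meant to capture.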