All content in this area was uploaded by Aresh Dadlani on Aug 31, 2020
Context-Aware Privacy Preservation in Network
Caching: An Information Theoretic Approach
Seyedeh B. Hassanpour, Abolfazl Diyanat, Ahmad Khonsari, Seyed P. Shariatpanahi, and Aresh Dadlani
Abstract—Caching has been recognized as a viable solution to surmount the limited capability of backhaul links in handling abundant network traffic. Although optimal approaches for minimizing the average delivery load exist, current caching strategies fail to prevent intelligent adversaries from obtaining valuable contextual information by inspecting the wireless communication links and thus violating users’ privacy. Grounded in information theory, this paper proposes a mathematical model for preserving privacy in a network caching system involving a server and a cache-aided end user. We then present an efficient content caching method that maximizes the degree of privacy preservation while maintaining the average delivery load at a given level. Given the Pareto optimal nature of the proposed 𝜖-constraint optimization approach, we also obtain the maximum privacy degree achievable under any given average delivery load. Numerical results and comparisons validate the correctness of our context-oriented privacy model.
Index Terms—Content caching, context-oriented privacy, error
probability bound, Pareto optimal, information theory.
I. INTRODUCTION
BACKHAUL congestion due to the ever-rising data traffic
growth is arguably the most prominent limiting factor
in 5G network performance. Network caching, where popular
contents are cached closer to the edge devices, has recently
been deemed as a feasible remedy to improve network capac-
ity and alleviate the traffic load during peak hours [1]–[3].
Despite the benefits of placing requested contents in close proximity to interested users, proactive caching makes it easier for adversary nodes to access the data and gather information on users’ preferences and location. Preserving privacy in caching systems is therefore a great challenge, as revealing user data usually violates users’ confidentiality agreements [4].
In general, privacy in networks can be preserved with respect to two distinct aspects. In data-oriented privacy, the integrity of the data is maintained while it is transmitted over the network, such that malicious nodes are denied access to on-the-fly content [5]. Existing works on data-oriented privacy mostly emphasize either proposing new encryption methods or adding extra layers of encryption in the network [6]. In contrast, context-oriented privacy refers to preserving the privacy of users and their requested files, which can be divulged by eavesdropping on certain features of the transmitted packets, such as the time and location of creation, usage pattern, and other file-related statistics that remain unguarded in conventional protection mechanisms [7], [8].
Fig. 1. A server with 𝑀 files, each of size 𝑁 bits, communicating with an end user with a cache size of 𝑁 bits in the presence of an adversary.
With regard to cache content placement (CCP), an optimal probabilistic caching strategy is investigated in [9] to maximize the cache hit probability and cache-aided throughput in wireless D2D caching networks. In [10], a hybrid caching scheme jointly optimized with the transmission schemes is proposed to handle the trade-off between the signal cooperation gain and the caching diversity gain. While neither
[9] nor [10] considers the presence of external adversaries, a handful of studies are devoted to CCP strategies that focus on context-oriented privacy. The authors in [11] presented a protocol enforcing fair subdivision of limited cache storage and context privacy provision. More recently, a wireless caching network wherein multiple cooperative adversarial nodes tap contextual information of users was studied in [12]. The authors aimed to maximize the probability of delivering all requested content within a given radius. To add randomness to the eavesdroppers’ estimates, which are based on the eavesdropped transmitted packets, they applied probabilistic caching and optimized the caching probabilities. Since their CCP optimization problem is non-convex, the authors adopted a genetic algorithm to find the caching that best misleads the eavesdroppers. The works in [11] and [12], however, do not address the trade-off between traffic load minimization and degree of context privacy.
The main contribution of this letter is an information-theoretic formulation of a new CCP protocol that characterizes the trade-off between the traffic load and the context privacy of the system. In particular, we consider a strong adversarial node with coverage over the entire area, so as to reduce the extra traffic overhead imposed by cooperation between eavesdroppers. Assuming that the adversary is equipped with the best estimator while overhearing the channel, we also derive analytical bounds on the estimation error in terms of the Fano-Kovalevskij inequality [13]. Finally, using the proposed 𝜖-constraint CCP optimization model, we numerically obtain the minimum average delivery rate over the channel for any desired level of privacy.
II. SYSTEM MODEL DESCRIPTION
Consider the content delivery system shown in Fig. 1, which comprises a server with $M$ files, denoted by the set $\mathcal{F} = \{f_k \mid 1 \le k \le M\}$, each of length $N$ bits, and a cache-enabled user with a storage capacity of $N$ bits. In the content placement phase (CPP), the server preloads a file or a mixture of files into the user’s cache using a given strategy until it is completely filled. As a result, we have the constraint $\sum_{k=1}^{M} |f_k^c| = N$ on the preloaded files, where $|f_k^c|$ denotes the size of the cached portion of file $f_k$. In the content delivery phase (CDP) that follows, the user requests a file from $\mathcal{F}$. The server then transfers only the part of the requested file that has not been cached earlier at the user’s device. We assume that the user requests file $f_k$ with probability $p_k$. Since the user requests exactly one file, we have $\sum_{k=1}^{M} p_k = 1$.
Fig. 2. The adversary best estimation error probability for $\mathcal{F} = \{A, B\}$. Axes: $P_A$ (horizontal) and $P_e$ (vertical). Labels: feasible region, $P_e = 1 - P_A$, $P_e = 0$. Legend: adversary without knowledge of the delivery load; adversary with knowledge of the delivery load in optimal strategy.
Moreover, we consider a passive adversary that eavesdrops on the communication between the server and the user. Without any prior knowledge of the requested file, the adversary attempts to identify the file from the set $\mathcal{F}$ in the CDP. We assume that the adversary is armed with the best estimator. Let $\hat{k}$ denote the adversary’s estimate of the index of the requested file and $P_e$ be the adversary’s estimation error in the maximum a posteriori probability (MAP) sense, defined as:
$$P_e = \Pr[\hat{k} \neq k], \quad 0 \le P_e \le 1. \tag{1}$$
We reserve the term adversary’s best estimation for the estimation with minimum $P_e$ among all possible estimations.
III. PROPOSED SECURE CCP APPROACH
In this section, we describe our approach towards characterizing the trade-off between privacy and delivery efficiency. For the sake of illustration, consider a server that stores two distinct files, namely $A$ and $B$, with request probabilities $P_A$ and $P_B$ ($P_A \ge P_B$), respectively. If the file popularity is known to the adversary, then he/she can select the most popular file as his/her estimate without having any information about the CDP. In this case, $P_e = 1 - P_A$, which corresponds to the dashed line in Fig. 2.
Intuitively, an efficient CCP strategy minimizes the total number of transferred bits by considering the popularity of the files requested by the user. The file popularity distribution $(p_k)$, however, is known to all entities in the system, including the adversary node. Additionally, the adversary is also aware that the server in traditional network caching preloads the most popular file into the user’s cache. In such a setting, observing the number of transmitted bits makes it trivial to guess the index of the requested file (solid line in Fig. 2). In what follows, we devise a CCP strategy that achieves maximum ambiguity (or degree of privacy) for a given delivery load.
A. Adversary Error Probability Bounds
To obtain the analytical upper and lower bounds on the adversary’s best estimation error, we adopt the Fano-Kovalevskij inequality for binary variables in our two-file scenario, whereas the results of [14] are used for $M > 2$ in the following theorem.
Theorem 1. For any estimator $\hat{k}$ such that $k \to Y \to \hat{k}$ is a Markov chain with $P_e = \Pr[\hat{k} \neq k]$, we have:
$$\Psi \le P_e \le \frac{H(k|Y)}{2} \tag{2}$$
with
$$\Psi \triangleq \begin{cases} H^{-1}\big(H(k|Y)\big), & \text{if } M = 2, \\ \dfrac{H(k|Y) - 1}{\log_2(M-1)}, & \text{if } M > 2, \end{cases} \tag{3}$$
where $M = |\mathcal{F}|$ and $Y$ is the random variable (r.v.) of the adversary’s observation, corresponding to the number of bits transmitted from the server to the user over the network during the CDP. The conditional entropy $H(k|Y)$ is the total ambiguity in $k$ given the observation $Y$, and $H^{-1}$ is the inverse of the binary entropy function $H(\vartheta) = -\vartheta \log_2(\vartheta) - (1-\vartheta)\log_2(1-\vartheta)$ restricted to $\vartheta \in [0, 1/2]$, where it is invertible.
Proof. See Appendix A.
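The bounds in (2) and (3) can be evaluated numerically once $H(k|Y)$ is known. The following Python sketch is illustrative only: the function names are hypothetical, the inverse of the binary entropy is computed by bisection on $[0, 1/2]$ (an implementation choice, not part of the letter), and a negative lower bound for $M > 2$ is clipped to zero because it carries no information.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2 (1 - x), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def inverse_binary_entropy(h, tol=1e-12):
    """Return x in [0, 1/2] with H(x) = h, by bisection (H is increasing there)."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("h must lie in [0, 1]")
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if binary_entropy(mid) < h:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def error_bounds(h_cond, num_files):
    """Return (Psi, H(k|Y)/2): the lower bound in (3) and the upper bound in (2)."""
    if num_files < 2:
        raise ValueError("at least two files are required")
    if num_files == 2:
        lower = inverse_binary_entropy(h_cond)
    else:
        lower = (h_cond - 1.0) / math.log2(num_files - 1)
    # Assumption of this sketch: clip a vacuous (negative) lower bound to zero.
    return max(lower, 0.0), h_cond / 2.0


if __name__ == "__main__":
    for h_cond, m in [(0.5, 2), (1.0, 2), (2.0, 5)]:
        lower, upper = error_bounds(h_cond, m)
        print(f"M = {m}, H(k|Y) = {h_cond}: {lower:.4f} <= P_e <= {upper:.4f}")
```

For $M = 2$ and $H(k|Y) = 1$ bit, both bounds meet at $P_e = 1/2$, which is the case of a fully ambiguous observation.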
With the bounds derived for the adversary’s best estimation error in Theorem 1, we now maximize the lower bound on $P_e$ in (3). The proposed approach thus guarantees that $P_e$ under any estimator, including the MAP estimator, is always at least a particular threshold $\xi \in \mathbb{R}_{++}$, i.e., $P_e = \Pr[\hat{k} \neq k] \ge \xi$. Since the lower bound in (3) is nondecreasing in $H(k|Y)$ for both $M = 2$ and $M > 2$, maximizing it is equivalent to maximizing the conditional entropy $H(k|Y)$. Indeed, this maximized lower bound on the error probability is what we technically define as the privacy degree.
B. Exemplary Case (𝑀=2)
In this subsection, we consider the scenario depicted in Fig. 2. We define the indicator r.v. $k$, which is zero if the user requests file $A$ (with probability $P_A$) and one otherwise (with probability $P_B = 1 - P_A$). We also let the r.v. $Z$ represent the number of bits of file $A$ stored in the cache, so that the remaining $N - Z$ cached bits belong to file $B$. For discrete values of $j$, where $0 \le j \le N$, the probability mass function (pmf) of $Z$ is given as $p_j^z \triangleq \Pr[Z = j]$. Upon completion of the CPP, when the user requests one of the files, the server transmits $Y$ bits to satisfy the demand, i.e., $Y = N - Z$ if file $A$ is requested and $Y = Z$ if file $B$ is requested. The term $H(k|Y)$ essentially captures the adversary’s ambiguity about the identity of the requested file when the size of the transmitted data is known and, through Theorem 1, bounds its error; it is computed as stated in Lemma 1.
Lemma 1. The term $H(k|Y)$ is calculated as follows:
$$H(k|Y) = H(k) + H(Z) - H(Y), \tag{4}$$
where $H(k)$, $H(Z)$, and $H(Y)$ are the entropies of $k$, $Z$, and $Y$, respectively.
Proof. See Appendix B.
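Lemma 1 can be checked numerically by computing $H(k|Y)$ in two ways: directly from the joint pmf of $(k, Y)$, and through the right-hand side of (4). The Python sketch below does this for an arbitrary caching pmf; the function names and the example pmf are hypothetical and chosen only for illustration, while $N = 7$ and $P_A = 0.7$ follow the setup of Section IV.

```python
import math


def entropy(pmf):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in pmf if p > 0.0)


def joint_pmf(p_z, p_a):
    """Pr[k = i, Y = j] for M = 2: Y = N - Z if A is requested, Y = Z if B is."""
    n = len(p_z) - 1
    row_a = [p_a * p_z[n - j] for j in range(n + 1)]      # k = 0 (file A)
    row_b = [(1.0 - p_a) * p_z[j] for j in range(n + 1)]  # k = 1 (file B)
    return row_a, row_b


def conditional_entropy_direct(p_z, p_a):
    """H(k|Y) = H(k, Y) - H(Y), computed from the joint pmf."""
    row_a, row_b = joint_pmf(p_z, p_a)
    p_y = [a + b for a, b in zip(row_a, row_b)]
    return entropy(row_a + row_b) - entropy(p_y)


def conditional_entropy_lemma1(p_z, p_a):
    """H(k|Y) = H(k) + H(Z) - H(Y), as stated in (4)."""
    row_a, row_b = joint_pmf(p_z, p_a)
    p_y = [a + b for a, b in zip(row_a, row_b)]
    return entropy([p_a, 1.0 - p_a]) + entropy(p_z) - entropy(p_y)


if __name__ == "__main__":
    # Hypothetical caching pmf over Z = 0, ..., 7 (entries sum to one).
    p_z = [0.05, 0.10, 0.05, 0.20, 0.10, 0.15, 0.05, 0.30]
    print(conditional_entropy_direct(p_z, 0.7))
    print(conditional_entropy_lemma1(p_z, 0.7))
```

Both functions return the same value up to floating-point rounding, as Lemma 1 predicts.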
By letting $\boldsymbol{p}_Z = [p_0^z, p_1^z, \ldots, p_j^z, \ldots, p_N^z]$ be the distribution of $Z$, we observe that designing the CCP strategy is equivalent to determining the vector $\boldsymbol{p}_Z$. Since $H(k)$ is independent of $\boldsymbol{p}_Z$, we arrive at the following optimization model from (4) to achieve maximum ambiguity without imposing any constraints on the delivery load:
$$\boldsymbol{p}_Z^* = \arg\max_{\boldsymbol{p}_Z} \; H(k|Y) = \arg\max_{\boldsymbol{p}_Z} \; \big[ H(Z) - H(Y) \big] \tag{5a}$$
$$\text{s.t.} \quad \sum_{j=0}^{N} p_j^z = 1, \quad 0 \le p_j^z \le 1, \quad 0 \le j \le N. \tag{5b}$$
Fig. 3. Depiction of the fraction of file $f_k \in \mathcal{F}$ in an $N$-bit cache using r.v. $Z_k$ (cache segments $Z_1, Z_2, \ldots, Z_M$).
Next, we prove in Proposition 1 that the uniform distribution is an optimal solution of (5), at the cost of transmitting $N/2$ bits on average.
Proposition 1. The uniform distribution is one of the optimal solutions of (5).
Proof. See Appendix C.
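As a quick numerical sanity check of Proposition 1, the sketch below evaluates the objective $H(Z) - H(Y)$ of (5a) for the uniform pmf, for a symmetric non-uniform pmf (which illustrates that the optimum is not unique), and for randomly drawn pmfs. The function names, the symmetric example, and the random sampling are assumptions made for illustration; they are not the fmincon-based procedure used in Section IV.

```python
import math
import random


def entropy(pmf):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in pmf if p > 0.0)


def objective(p_z, p_a):
    """H(Z) - H(Y), with Pr[Y = j] = P_A Pr[Z = N - j] + P_B Pr[Z = j]."""
    n = len(p_z) - 1
    p_y = [p_a * p_z[n - j] + (1.0 - p_a) * p_z[j] for j in range(n + 1)]
    return entropy(p_z) - entropy(p_y)


def random_pmf(size, rng):
    """Draw a pmf by normalizing independent uniform weights."""
    weights = [rng.random() for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


N, P_A = 7, 0.7
uniform = [1.0 / (N + 1)] * (N + 1)
# Hypothetical symmetric pmf: Pr[Z = j] = Pr[Z = N - j] for every j.
symmetric = [0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0]

rng = random.Random(0)
best_random = max(objective(random_pmf(N + 1, rng), P_A) for _ in range(1000))

if __name__ == "__main__":
    print("uniform:    ", objective(uniform, P_A))
    print("symmetric:  ", objective(symmetric, P_A))
    print("best random:", best_random)
```

The objective never exceeds zero, in line with $H(Z) \le H(Y)$ in Appendix C, and it equals zero (up to rounding) for the uniform and the symmetric pmf.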
To maximize the adversary’s ambiguity subject to a specific delivery load constraint, we now add a constraint to the model in (5). That is to say, we maximize the degree of privacy while keeping the number of bits transmitted in the CDP below some constant value $C$:
$$\boldsymbol{p}_Z^* = \arg\max_{\boldsymbol{p}_Z} \; H(k|Y) = \arg\max_{\boldsymbol{p}_Z} \; \big[ H(Z) - H(Y) \big] \tag{6a}$$
$$\text{s.t.} \quad \sum_{j=0}^{N} p_j^z = 1, \quad 0 \le p_j^z \le 1, \quad 0 \le j \le N, \tag{6b}$$
$$P_A (N - Z) + P_B Z \le C, \tag{6c}$$
where $P_A(N - Z) + P_B Z$ is the load delivered in the CDP, evaluated on average over the caching distribution (i.e., with $Z$ replaced by $\mathbb{E}[Z]$, as in Appendix D), and $C$ corresponds to the effective capacity of the communication link. Note that the augmented model in (6) is efficient in the Pareto optimality sense. Lemma 2 gives an optimal solution of this model for $C \ge N/2$. For $C < N/2$, we obtain the optimal solution numerically in Section IV.
Lemma 2. If $C \ge N/2$, then the uniform distribution is one of the optimal solutions for the augmented model in (6).
Proof. See Appendix D.
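The role of the threshold $N/2$ in Lemma 2 can be seen by evaluating the average delivery load of a caching pmf. The sketch below assumes, as in Appendix D, that the load constraint (6c) is read in expectation; the helper names and the small numerical tolerance are illustrative choices.

```python
def average_delivery_load(p_z, p_a):
    """Expected number of bits sent in the CDP: P_A (N - E[Z]) + P_B E[Z]."""
    n = len(p_z) - 1
    mean_z = sum(j * p for j, p in enumerate(p_z))
    return p_a * (n - mean_z) + (1.0 - p_a) * mean_z


def is_feasible(p_z, p_a, capacity, tol=1e-12):
    """Check the load constraint (6c) in expectation, up to a small tolerance."""
    return average_delivery_load(p_z, p_a) <= capacity + tol


N, P_A = 7, 0.7
uniform = [1.0 / (N + 1)] * (N + 1)
cache_a_only = [0.0] * N + [1.0]  # Z = N with probability one

if __name__ == "__main__":
    print("uniform load:     ", average_delivery_load(uniform, P_A))
    print("cache-A-only load:", average_delivery_load(cache_a_only, P_A))
    print("uniform feasible for C = 3.5:", is_feasible(uniform, P_A, 3.5))
    print("uniform feasible for C = 3.0:", is_feasible(uniform, P_A, 3.0))
```

With $N = 7$ and $P_A = 0.7$, the uniform pmf has an average load of $N/2 = 3.5$ bits regardless of $P_A$, whereas caching file $A$ entirely gives $P_B N = 2.1$ bits. The uniform pmf is therefore feasible for (6) exactly when $C \ge N/2$.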
C. General Case
We now extend our approach to an arbitrary value of $M$. Suppose that the user chooses file $f_k$ with probability $p_k$. We define the random process $\mathcal{Z} = \{Z_k\}$, where $Z_k$ denotes the fraction of file $f_k$ (in bits) cached at the user’s end. Due to the limited cache capacity, we have $\sum_{k=1}^{M} Z_k = N$, as depicted in Fig. 3. Consequently, the pmf of the delivery load $Y$ can be expressed as follows:
$$P_Y = \Pr[Y = j] = \sum_{k=1}^{M} p_k \times \Pr[Z_k = N - j]. \tag{7}$$
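Equation (7) translates directly into code. In the Python sketch below, the function name, the popularity vector, and the toy caching strategy ($M = 3$ files, $N = 2$ bits) are hypothetical; each row of the matrix holds the marginal pmf of one $Z_k$, indexed by $j = 0, \ldots, N$.

```python
def delivery_load_pmf(popularity, p_z_matrix):
    """Pr[Y = j] = sum_k p_k * Pr[Z_k = N - j], following (7)."""
    n = len(p_z_matrix[0]) - 1
    return [
        sum(p_k * row[n - j] for p_k, row in zip(popularity, p_z_matrix))
        for j in range(n + 1)
    ]


# Hypothetical example: the cache holds (Z_1, Z_2, Z_3) = (2, 0, 0) or (1, 1, 0),
# each with probability 1/2, so that Z_1 + Z_2 + Z_3 = N = 2 always holds.
popularity = [0.5, 0.3, 0.2]
p_z_matrix = [
    [0.0, 0.5, 0.5],  # Pr[Z_1 = 0], Pr[Z_1 = 1], Pr[Z_1 = 2]
    [0.5, 0.5, 0.0],  # marginal pmf of Z_2
    [1.0, 0.0, 0.0],  # marginal pmf of Z_3 (file 3 is never cached)
]

if __name__ == "__main__":
    print(delivery_load_pmf(popularity, p_z_matrix))
```

Only the marginal pmfs of the $Z_k$ enter (7); the dependence among the $Z_k$ matters for the joint entropy term of the objective, not for the distribution of $Y$.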
Fig. 4. Example of a file set $\mathcal{F}$ with $M = 12$ split into three subsets, $\{f_1, \ldots, f_5\}$, $\{f_6, \ldots, f_8\}$, and $\{f_9, \ldots, f_{12}\}$, such that the sum of popularity in each subset approximately equals $1/3$.
It should be noted that the r.v.s $Z_k$ are not independent and their distribution is given by the following matrix:
$$\boldsymbol{P}_Z = \begin{bmatrix} p_{10}^z & p_{11}^z & \cdots & p_{1N}^z \\ p_{20}^z & p_{21}^z & \cdots & p_{2N}^z \\ \vdots & \vdots & \ddots & \vdots \\ p_{M0}^z & p_{M1}^z & \cdots & p_{MN}^z \end{bmatrix}, \tag{8}$$
where element $p_{kj}^z \triangleq \Pr[Z_k = j]$. As a result, designing the CCP strategy corresponds to computing the matrix $\boldsymbol{P}_Z$. Hence, the optimal privacy-load trade-off can be formally characterized as the following optimization model:
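$$\boldsymbol{P}_Z^* = \arg\max_{\boldsymbol{P}_Z} \; H(Z_1, \ldots, Z_M) - H(Y) \tag{9a}$$
$$\text{s.t.} \quad \sum_{j=0}^{N} p_{kj}^z = 1, \quad 0 \le p_{kj}^z \le 1, \quad 1 \le k \le M, \; 0 \le j \le N, \tag{9b}$$
$$\sum_{k=1}^{M} p_k (N - Z_k) \le C, \tag{9c}$$
$$\sum_{k=1}^{M} Z_k = N. \tag{9d}$$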
Note that $H(k)$ is independent of the optimization variable and has therefore been removed from the cost function in (9a). Constraint (9c) ensures that the communication cost imposed on the network remains below a threshold value $C$. Thus, by controlling $C$ in (9c) and then attaining the corresponding privacy degree by solving the optimization problem, the trade-off between traffic load and privacy degree can be managed. Evidently, (9) reduces to (6) when $M = 2$.
D. Sub-Optimal Heuristic for Large $M$ and $N$
When the number of files ($M$) or the size of each file ($N$) is large, the optimization problem in (9) becomes computationally expensive to solve. For large file sizes, we can split the files into smaller portions called chunks, instead of splitting them at the bit level. As for a large number of files, in what follows, we present a heuristic that makes solving the problem feasible. Although this method yields a sub-optimal solution, this is the price paid for the reduction in computational complexity. The three steps of this library splitting heuristic are as follows:
• Step 1: Split the library $\mathcal{F}$ into $q$ equi-probable subsets, such that the aggregated popularity in each subset approximately equals $1/q$. Fig. 4 illustrates an example for $q = 3$.
• Step 2: Split the cache of each user into $q$ memory slots such that each slot contains approximately $N/q$ bits.
• Step 3: Define a sub-problem as the optimal caching of a subset into its corresponding memory slot. Subsequently, solve this optimization sub-problem once for each of the $q$ subsets.
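Step 1 of the heuristic above can be sketched as follows. The letter does not specify how the equi-probable subsets are formed, so this Python example assumes a simple greedy rule: files are kept in their given order and a subset is closed at the point where its cumulative popularity is closest to the next multiple of $1/q$. The function name and the popularity values are hypothetical.

```python
def split_library(popularity, q):
    """Greedy contiguous split into q subsets with aggregate popularity near 1/q.

    Returns a list of q lists of file indices (0-based). This is an assumed
    splitting rule for illustration, not the procedure used in the letter.
    """
    m = len(popularity)
    if not 1 <= q <= m:
        raise ValueError("q must satisfy 1 <= q <= number of files")
    total = sum(popularity)
    subsets, current, cumulative = [], [], 0.0
    for index, p in enumerate(popularity):
        current.append(index)
        cumulative += p
        remaining_files = m - index - 1
        remaining_subsets = q - len(subsets) - 1
        if remaining_subsets == 0:
            continue  # the last subset simply collects all remaining files
        boundary = (len(subsets) + 1) * total / q
        next_p = popularity[index + 1]
        # Close here if adding the next file would move us away from the boundary.
        closest = abs(cumulative - boundary) <= abs(cumulative + next_p - boundary)
        # Close here if every remaining subset needs one of the remaining files.
        must_close = remaining_files == remaining_subsets
        if closest or must_close:
            subsets.append(current)
            current = []
    subsets.append(current)
    return subsets


# Hypothetical popularity profile for M = 12 files (values sum to one).
popularity = [0.10, 0.08, 0.06, 0.05, 0.04, 0.12,
              0.11, 0.10, 0.09, 0.09, 0.08, 0.08]

if __name__ == "__main__":
    for subset in split_library(popularity, 3):
        mass = sum(popularity[i] for i in subset)
        print(subset, round(mass, 2))
```

For this profile and $q = 3$, the split groups the files as five, three, and four files, with aggregate popularities of roughly one third each. Steps 2 and 3 would then solve (9) on each subset with a cache slot of about $N/q$ bits.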
Fig. 5. (a) CDF of the solution of (5): probability of $Z$ versus $j$, simulation results against the theoretical formula. (b) Pareto-optimal curve for (5)-(6c): delivery load versus privacy degree $H(k|Y)$, numerical results. (c) Adversary estimation error for $M = 2$: adversary error probability $P_e$ versus $P_A$, for the proposed approach and for an adversary without knowledge of the delivery load. (d) The difference between the cost function for the optimal and sub-optimal problem ($\psi$) versus $q$.
IV. SIMULATION RESULTS
The proposed CCP model is implemented using Matlab, and a 95% confidence level is adopted to demonstrate the accuracy of the Monte Carlo simulation results. The setup comprises a server with library $\mathcal{F} = \{A, B\}$ and a user cache of size $N = 7$ bits. We generate 20000 samples ($0$ for requesting file $A$ and $1$ for requesting file $B$) with $P_A = 0.7$ and $P_B = 0.3$.
To validate Proposition 1, we use the fmincon function in Matlab to compute an optimal solution of (5) in Fig. 5(a). The plot shows the cumulative distribution function (CDF) of the resulting distribution, which is identical to that of the uniform distribution.
Fig. 5(b) plots the delivery load constraint ($C$) against the degree of privacy ($H(k|Y)$). As evident in the figure, $H(k|Y) = 0$ when the delivery load constraint is $P_B N$, whereas we achieve maximum privacy ($H(k|Y) = H(k)$, which is at most 1 bit) when $N/2$ or more bits are transferred in the CDP.
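The two end points of this trade-off can be reproduced with a few lines of code. The Python sketch below does not solve (6); it only evaluates the privacy degree and the average load along a hypothetical one-parameter family of caching pmfs that interpolates between caching file $A$ entirely ($t = 0$) and the uniform pmf ($t = 1$). The resulting points are feasible but not necessarily Pareto optimal. The values $N = 7$ and $P_A = 0.7$ follow the setup above; all function names are assumptions.

```python
import math


def entropy(pmf):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in pmf if p > 0.0)


def privacy_and_load(p_z, p_a):
    """Return (H(k|Y), average delivery load) for M = 2 and caching pmf p_z."""
    n = len(p_z) - 1
    p_b = 1.0 - p_a
    p_y = [p_a * p_z[n - j] + p_b * p_z[j] for j in range(n + 1)]
    privacy = entropy([p_a, p_b]) + entropy(p_z) - entropy(p_y)  # Lemma 1
    mean_z = sum(j * p for j, p in enumerate(p_z))
    load = p_a * (n - mean_z) + p_b * mean_z
    return privacy, load


def mixture(t, n):
    """(1 - t) * 'cache all of A' + t * uniform; an illustrative family only."""
    pmf = [t / (n + 1)] * (n + 1)
    pmf[n] += 1.0 - t
    return pmf


N, P_A = 7, 0.7
curve = [privacy_and_load(mixture(i / 10, N), P_A) for i in range(11)]

if __name__ == "__main__":
    for privacy, load in curve:
        print(f"load = {load:.3f} bits, H(k|Y) = {privacy:.4f} bits")
```

At $t = 0$ the load is $P_B N = 2.1$ bits and the privacy degree is zero; at $t = 1$ the load is $N/2 = 3.5$ bits and the privacy degree equals $H(k)$.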
We compare the adversary error probability ($P_e$) with respect to $P_A$ in Fig. 5(c). This figure depicts the significant increase in the lower bound of $P_e$ (as compared to Fig. 2), which approaches the upper bound $1 - P_A$. It is worth noting that at $P_A = 0.5$ and $P_A = 1$, the error probability in our approach coincides with that of an adversary with no knowledge of the delivery load. As the figure shows, the maximum difference between the two curves occurs roughly at $P_A = 0.7$, which implies that our proposed approach achieves a relatively lower degree of privacy there as compared to the reference points $P_A = 0.5$ and $P_A = 1$.
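For a given caching pmf, the error probability of a MAP adversary that observes the delivery load can be computed exactly as $P_e = 1 - \sum_{j} \max_i \Pr[k = i, Y = j]$. The Python sketch below applies this standard formula to the two extreme strategies discussed in the letter for $M = 2$; the function name is hypothetical, and the sketch is not the simulation code behind Fig. 5(c).

```python
def map_error(p_z, p_a):
    """Error probability of the MAP adversary observing Y, for M = 2.

    For each observed load j, the adversary picks the file with the larger
    joint probability Pr[k = i, Y = j]; P_e is one minus the total mass of
    these choices.
    """
    n = len(p_z) - 1
    correct = 0.0
    for j in range(n + 1):
        joint_a = p_a * p_z[n - j]      # file A requested, Y = N - Z = j
        joint_b = (1.0 - p_a) * p_z[j]  # file B requested, Y = Z = j
        correct += max(joint_a, joint_b)
    return 1.0 - correct


N = 7
uniform = [1.0 / (N + 1)] * (N + 1)
cache_a_only = [0.0] * N + [1.0]

if __name__ == "__main__":
    for p_a in (0.5, 0.7, 0.9):
        print(p_a, map_error(uniform, p_a), map_error(cache_a_only, p_a))
```

Under uniform caching the observation carries no information, so the adversary can do no better than guess the most popular file and $P_e = 1 - P_A$. When file $A$ is always cached in full, the load reveals the request and $P_e = 0$, matching the two lines of Fig. 2.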
Finally, Fig. 5(d) plots the difference between the optimal solution of (9) and the sub-optimal solution discussed in Section III-D for a library $\mathcal{F}$ with $M = 30$ files. We denote this difference by $\psi$ and scale it between the optimal ($\psi = 0$) and non-optimal ($\psi = 1$) solutions. When $q = 1$, the optimal solution is obtained from (9). If $q = 2$, for example, we divide the library into two subsets and solve the optimization problem for these two subsets independently. The figure shows that we can reduce the time complexity of our problem by a factor of 10 at the expense of obtaining a sub-optimal solution that differs from the optimal one by 20%.
V. CONCLUSION AND FUTURE WORK
In this paper, we have proposed a caching strategy that maximizes the adversary’s best estimation error while minimizing the average delivery load, which is important in terms of energy consumption and limited bandwidth in wireless links. In the presented approach, we have formulated an 𝜖-constraint optimization model to alter the statistical behavior of the server so as to misguide the adversary. Furthermore, we have maximized the Fano lower bound for the best adversary estimation using information theory to reduce the adversary’s access to useful contextual information. Simulation results also validate the effectiveness of our approach. This work can be further extended to investigate the same trade-off with multiple users, where the adversary must estimate both the transmitted file and the requesting user.
APPENDIX
A. Proof of Theorem 1
For the proof of the upper bound, we refer the reader to [13]. For the lower bound, we define the error event given estimator $\hat{k}$ as:
$$E = \begin{cases} 1 & \text{if } \hat{k} \neq k, \\ 0 & \text{if } \hat{k} = k. \end{cases} \tag{10}$$
$H(E, k|\hat{k})$ can be expanded as follows:
$$H(E, k|\hat{k}) = H(k|\hat{k}) + H(E|k, \hat{k}) = H(E|\hat{k}) + H(k|E, \hat{k}).$$
If the selected file index ($k$) and the estimated file index ($\hat{k}$) are known to the adversary, then he/she can determine the error without ambiguity, i.e., $H(E|k, \hat{k}) = 0$. Thus, we will have:
$$H(k|\hat{k}) = H(E|\hat{k}) + H(k|E, \hat{k}) \overset{(a)}{\le} H(E) + H(k|E, \hat{k}) \overset{(b)}{\le} H(P_e) + H(k|E, \hat{k}),$$
$$H(k|Y) \overset{(c)}{\le} H(k|\hat{k}) \le H(P_e) + H(k|E, \hat{k}). \tag{11}$$
Conditioning cannot increase entropy, so we have (a). Step (b) stems from the fact that $E$ is a binary r.v. with $\Pr[E = 1] = P_e$, so that $H(E) = H(P_e)$. For step (c), according to the Markov chain property of $k \to Y \to \hat{k}$, we have $H(k|Y) \le H(k|\hat{k})$. Therefore, we arrive at the inequality (11). We now simplify (11) for caching two and more files as below:
1) For $M = 2$: In this case, once the estimate $\hat{k}$ and the error indicator $E$ are known, the requested file is determined without ambiguity, i.e., $H(k|E, \hat{k}) = 0$. Using this fact and (11), we arrive at:
$$H(k|Y) \le H(P_e) \implies H^{-1}\big(H(k|Y)\big) \le P_e, \tag{12}$$
where $H(P_e) = -P_e \log_2(P_e) - (1 - P_e)\log_2(1 - P_e)$ and $H^{-1}(\cdot)$ is the inverse of $H$.
2) For $M > 2$: We can write (11) as:
$$H(k|Y) \le H(P_e) + H(k|E, \hat{k}) \overset{(a)}{\le} 1 + H(k|E, \hat{k}) \overset{(b)}{\le} 1 + P_e \log_2(M-1). \tag{13}$$
In (13), step (a) follows from the fact that $H(P_e) \le 1$. Step (b) is due to [14, Theorem 2.10.1], from which we conclude that:
$$H(k|E, \hat{k}) = \Pr\{E = 0\} \underbrace{H(k|E = 0, \hat{k})}_{\text{equal to zero}} + \underbrace{\Pr\{E = 1\}}_{P_e} H(k|E = 1, \hat{k}) \le P_e \log_2(M-1).$$
Rearranging the terms results in $\frac{H(k|Y) - 1}{\log_2(M-1)} \le P_e$. This completes the proof.
B. Proof of Lemma 1
$H(k|Y)$ can be written as follows [14, §2.2]:
$$H(k|Y) = H(k, Y) - H(Y). \tag{14}$$
To obtain the entropy $H(k|Y)$, we calculate the joint entropy, $H(k, Y)$, and $H(Y)$ separately. If file $A$ is requested and $N - j$ bits of file $A$ are stored in the cache, then the server should transmit the remaining $j$ bits in the CDP. To compute $H(k, Y)$, we need the joint distribution $\Pr[k = i, Y = j]$ for $i \in \{0, 1\}$ and $j \in \{0, 1, 2, \ldots, N\}$, as given below:
$$\begin{aligned} \Pr[k = i, Y = j] &= \Pr[Y = j \mid k = i] \times \Pr[k = i], \\ \Pr[Y = j \mid k = 0] &= \Pr[Z = N - j], \\ \Pr[Y = j \mid k = 1] &= \Pr[Z = j]. \end{aligned} \tag{15}$$
After some mathematical manipulations, we get:
$$\begin{aligned} H(k, Y) = {} & -P_A \log_2 P_A - P_B \log_2 P_B \\ & - P_A \sum_{j=0}^{N} \Pr[Z = N - j] \log_2 \Pr[Z = N - j] \\ & - P_B \sum_{j=0}^{N} \Pr[Z = j] \log_2 \Pr[Z = j] = H(k) + H(Z). \end{aligned} \tag{16}$$
Substituting (16) in (14) eventually yields (4), which completes the proof.
C. Proof of Proposition 1
We first suggest the uniform distribution as a possible solution and then prove that this solution maximizes the cost function in (5a) and satisfies (5b). For observation $Y$, we have:
$$\Pr[Y = j] = P_A \Pr[Z = N - j] + P_B \Pr[Z = j]. \tag{17}$$
Since conditioning cannot increase entropy [14], we have:
$$H(k|Y) \le H(k) \;\overset{\text{using Lemma 1}}{\Longrightarrow}\; H(Z) \le H(Y). \tag{18}$$
Now, we prove that for the uniform distribution, $H(Z) - H(Y)$ becomes zero and $H(k|Y)$ attains its maximum value. Furthermore, $Z$ achieves its maximum entropy ($H(Z) = \log_2(N+1)$). Using the definition of entropy and (17), we obtain the following:
$$H(Y) = -\sum_{j=0}^{N} \frac{P_A + P_B}{N+1} \log_2 \frac{P_A + P_B}{N+1} = \log_2(N+1). \tag{19}$$
According to (19), we have:
$$H(Z) - H(Y) = \log_2(N+1) - \log_2(N+1) = 0. \tag{20}$$
The uniform distribution is thus one of the optimal solutions of (5), though not the unique one. This completes the proof.
D. Proof of Lemma 2
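Suppose the r.v. $Z$ follows a uniform distribution. Hence,
$$\mathbb{E}[Z] = \sum_{j=0}^{N} j \, p_j^z = \sum_{j=0}^{N} \frac{j}{N+1} = \frac{N}{2}. \tag{21}$$
Writing the load constraint (6c) in terms of $\mathbb{E}[Z]$ for $P_A > P_B$ and using (21), we have:
$$\mathbb{E}[Z] = \frac{N}{2} \ge \frac{P_A N - C}{P_A - P_B} \implies C \ge \frac{N}{2}. \tag{22}$$
Hence, the uniform distribution satisfies the delivery load constraint whenever $C \ge N/2$. We already know that the uniform distribution is an optimal (and not unique) solution of (5), i.e., the model without any delivery load constraint. Since it remains feasible for (6) when $C \ge N/2$, it is also optimal for (6). This completes the proof.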
REFERENCES
[1] L. Li, G. Zhao, and R. S. Blum, “A survey of caching techniques in
cellular networks: Research issues and challenges in content placement
and delivery strategies,” IEEE Commun. Surv. Tutor., vol. 20, no. 3, pp.
1710–1732, 2018.
[2] S. Wang, T. Wang, and X. Cao, “In-network caching: An efficient content
distribution strategy for mobile networks,” IEEE Wireless Commun.,
vol. 26, no. 5, pp. 84–90, 2019.
[3] S. B. Hassanpour, A. Khonsari, S. P. Shariatpanahi, and A. Dadlani,
“Hybrid coded caching in cellular networks with D2D-enabled mobile
users,” in Proc. IEEE International Symposium on Personal, Indoor and
Mobile Radio Communications (PIMRC), 2019, pp. 1–6.
[4] J. Zhang, B. Chen, Y. Zhao, X. Cheng, and F. Hu, “Data security
and privacy-preserving in edge computing paradigm: Survey and open
issues,” IEEE Access, vol. 6, pp. 18209–18237, 2018.
[5] A. A. Zewail and A. Yener, “Device-to-device secure coded caching,”
IEEE Trans. Inf. Forensics Security, vol. 15, pp. 1513–1524, 2020.
[6] M. Mukherjee, R. Matam, L. Shu, L. Maglaras, M. A. Ferrag, N. Choudhury, and V. Kumar, “Security and privacy in fog computing: Challenges,” IEEE Access, vol. 5, pp. 19293–19304, 2017.
[7] A. Diyanat, A. Khonsari, and S. P. Shariatpanahi, “A dummy-based
approach for preserving source rate privacy,” IEEE Trans. Inf. Forensics
Security, vol. 11, no. 6, pp. 1321–1332, 2016.
[8] Y. Wang, Z. Tian, S. Su, Y. Sun, and C. Zhu, “Preserving location privacy
in mobile edge computing,” in Proc. IEEE International Conference on
Communications (ICC), 2019, pp. 1–6.
[9] Z. Chen, N. Pappas, and M. Kountouris, “Probabilistic caching in
wireless D2D networks: Cache hit optimal versus throughput optimal,”
IEEE Commun. Lett., vol. 21, no. 3, pp. 584–587, 2017.
[10] G. Zheng, H. A. Suraweera, and I. Krikidis, “Optimization of hybrid
cache placement for collaborative relaying,” IEEE Commun. Lett.,
vol. 21, no. 2, pp. 442–445, 2017.
[11] D. Andreoletti, C. Rottondi, S. Giordano, G. Verticale, and M. Tornatore,
“An open privacy-preserving and scalable protocol for a network-
neutrality compliant caching,” in Proc. IEEE International Conference
on Communications (ICC), 2019, pp. 1–6.
[12] F. Shi, L. Fan, X. Liu, Z. Na, and Y. Liu, “Probabilistic caching
placement in the presence of multiple eavesdroppers,” Wireless Com-
munications and Mobile Computing, vol. 2018, 2018.
[13] M. Feder and N. Merhav, “Relations between entropy and error proba-
bility,” IEEE Trans. Inf. Theory, vol. 40, no. 1, pp. 259–266, 1994.
[14] T. M. Cover and J. A. Thomas, Elements of Information Theory.
Hoboken, NJ, USA: Wiley, 2012.