All content in this area was uploaded by Brij B Gupta on Feb 06, 2018.
Available via license: CC BY-NC-ND 4.0. Content may be subject to copyright.
2169-3536 (c) 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2017.2787422, IEEE Access
Cognitive Privacy Middleware for Deep
Learning Mashup in Environmental IoT
Ahmed M. Elmisery, Mirela Sertovic, B. B. Gupta
Abstract— Data mashup is a web technology that combines information from multiple sources into a single web application. Mashup applications support new services, such as environmental monitoring. Different organizations utilize data mashup services to merge datasets from different Internet of Multimedia Things (IoMT) context-based services in order to leverage the performance of their data analytics. However, mashing up datasets from multiple sources is a privacy hazard, as it might reveal citizens' specific behaviors in different regions. In this paper, we present our efforts to build a cognitive-based middleware for private data mashup (CMPM) to serve a centralized environmental monitoring service. The proposed middleware is equipped with concealment mechanisms to preserve the privacy of the merged datasets from the multiple IoMT networks involved in the mashup application. In addition, we present an IoT-enabled data mashup service in which multimedia data is collected from various IoMT platforms and then fed into an environmental deep learning service in order to detect interesting patterns in hazardous areas. The viable features within each region are extracted using a multiresolution wavelet transform and then fed into a discriminative classifier to extract various patterns. We also provide a scenario for an IoMT-enabled data mashup service and experimentation results.
Index Terms—IoT Networks, Cloud Computing, Environmental Monitoring, Smart Cities, Big Data mashup, Multimedia data
—————————— ◆ ——————————
1 INTRODUCTION
Environmental hazards of natural origin that involve large extensions of land, such as earthquakes, tsunamis, volcano eruptions, landslides and forest fires, are common in countries like Chile and produce emergency scenarios in which roads are often saturated or damaged and power supplies are down, disrupting connectivity. These hazards can easily affect a large number of people and isolate them from their surrounding environment. While information and storage capabilities are becoming virtually limitless, in such situations, accessing the right information at the right time by the right organization is a crucial requirement for taking proper decisions and for publishing highly relevant information to the affected communities and the helpers in charge of handling the emergency situations [1]. Decision makers usually require access to highly accurate information servers and data applications to estimate the number of affected citizens in a certain region and the best available ways to support them.
Environmental monitoring is one of the areas that attract public concern. The advance of cloud computing and the Internet of Things has reshaped the manner in which sensed information is managed and accessed. Advances in sensor technologies have accelerated the emergence of environmental sensing services. These new services grasp the significance of new techniques for understanding the complexities and relations in the collected sensed information. In particular, they utilize portable sensing devices to extend the sensing range, and cloud-computing environments to analyse, in a productive form, the big amount of data collected by various Internet of Multimedia Things (IoMT) networks. Various kinds of sensors are being deployed in the environment as the physical foundation for most environmental sensing services. It is highly desirable to link the sensed data with external data collected from different services in order to increase the accuracy of the predictions [2]. In regions with environmental hazards, a large number of citizens make intensive observations about these regions using their mobile phones during their daily activities. This massive data is expected to be generated from different sources and published on various IoMT context-based services such as Facebook®, Waze® and Foursquare®. In such situations, it is beneficial to include such data in the decision-making process of environmental monitoring services. In this context, data mashup services appear as a promising tool to accumulate this data and manage it in an appropriate way.
Data mashup [3] is a web technology that combines information from multiple sources into a single web application for a specific task or request. Mashup technology was first introduced in [4] and has since created a new horizon for service providers to integrate their data and deliver highly customizable services to their customers [3]. Data mashup can be used to merge datasets from external IoMT context-based services to leverage the monitoring service from different perspectives, like providing more precise predictions and performance, and alleviat-
————————————————
• Department of Electronics Engineering, Universidad Tecnica Federico
Santa Maria, Chile. E-mail: ahmedmisery@gmail.com.
• Faculty of Humanities and Social Sciences, University of Zagreb, Croatia.
E-mail: msertovic@yahoo.com.
• National Institute of Technology Kurukshetra, India. E-mail: bbgupta.nitkkr@gmail.com.
This work was partially financed by the “Dirección General de Investi-
gación, Innovación y Postgrado” of Federico Santa María Technical
University- Chile, in the project Security in Cyber-Physical Systems for
Power Grids (UTFSM-DGIP PI.L.17.15), by Advanced Center for Elec-
trical and Electronic Engineering (AC3E) CONICYT-Basal Project
FB0008, by the Microsoft Azure for Research Grant (0518798).
ing cold start problems [5] for new environmental monitoring services. For this reason, providers of next-generation environmental monitoring services are keen to acquire accurate data mashup services for their systems. However, privacy is an essential concern for the application of mashup in IoMT-enabled environmental monitoring, as the generated insights obviously require the integration of different behavioural and neighbouring-environment data of citizens from multiple IoMT context-based services. This might reveal private citizens' behaviours that were not available before the data mashup. A serious privacy breach can occur if the same citizen is registered on multiple sites, since adversaries can try to deanonymize the citizen's identity by correlating the information contained in the mashed-up data with other information obtained from external public databases. These breaches prevent IoMT context-based services from revealing raw behavioural data of the citizens to each other or to the mashup service. Moreover, divulging citizens' data represents an infringement of the personal privacy laws that may apply in some countries where these sites operate. As a result, if citizens learn that their raw data are revealed to other parties, they will absolutely distrust the site. According to the survey results in [6, 7], users might leave a service provider because of privacy concerns.
We believe that environmental cognition services can be enriched by the extensive data collection infrastructures of IoMT-enabled data mashup services, especially in the domain of urban environmental monitoring. IoMT mashup techniques can be used to merge datasets from external IoMT networks to leverage the functionalities of an environmental deep learning service from different perspectives: providing more precise predictions and computation performance, improving reliability toward citizens, minimizing the impacts of environmental hazards on affected citizens, and providing an early response in cases where the event is inevitable. For this reason, providers of next-generation environmental cognition services are keen to utilize IoMT-enabled data mashup services for their systems. Effective multimedia mining is an essential requirement for IoMT-enabled data mashup services, since the extracted patterns obviously require the integration of different multimedia contents generated from multiple IoMT networks. These multimedia contents may contain random noise, which complicates the pattern discovery process. A serious decline in accuracy occurs when noisy data is present in the pile of contents that will be processed through the data mashup techniques. Handling this noisy data is a real challenge, since it is hard to distinguish from abnormal data, and it could prevent the environmental deep learning service from fully embracing the useful data extracted from the mashup service. Managing this problem will enable IoMT-enabled data mashup services to execute different recognition methods for identifying abnormal objects in an effective manner.
In this work, we propose a cognitive-based middleware for private data mashup (CMPM) that bears in mind the privacy issues related to mashing up multiple datasets from IoMT context-based services for environmental monitoring purposes. We focus on the stages related to dataset collection and processing and omit all aspects related to environmental monitoring itself, mainly because these stages are critical with regard to privacy as they involve different entities. We present two cognitive concealment algorithms to protect citizens' privacy while preserving the aggregates in the mashed-up datasets, in order to maximize usability and attain accurate insights. Using these algorithms, each party involved in the mashup is given complete control over the privacy of its dataset. In the rest of this paper, we will generically refer to behavioural and neighbouring-environment data as items. Section II describes some related work. Section III introduces the IoMT-enabled data mashup network scenario hosting our CMPM. Section IV introduces the proposed cognitive concealment algorithms used in our CMPM. Section V introduces the proposed anomaly detection solution used within the environmental cognition service. Section VI describes some experiments and results based on the concealment algorithms for IoT context-based services. Finally, Section VII includes conclusions and future work.
2 RELATED WORK
In practice, end-users have shown an increasing privacy concern when they share their behavioural and location data, especially when this data is shared with untrusted parties [8]. This happens for the following reasons. First, the behavioural and location data collected by the end-users are personal by nature; e.g., the end-users might decline to reveal their physical daily activities, along with the location and time where they perform such activities. Second, despite the apparently benign nature of the collected data, this data can be used to deduce private data that the end-users have not intentionally shared. For example, private personal information can be inferred from the brainwave data of users wearing popular wireless EEG headsets [9], such as the digits of PIN numbers, ATM card data, location of residence and other sensitive data. Third, the independent nature of data collection amplifies the need for a privacy-respecting technique when handling the gathered data, since the data can be collected by malignant third parties at any time or location without explicit consent from the end-users. Finally, there is ambiguity around third parties' practices and their obligations in issues related to data breaches caused by cyber-attacks or insider attacks. For example, a security researcher succeeded in detecting severe vulnerabilities in drug infusion pumps that allow an attacker to change the amount of injected drug to a fatal dose that could harm the users [10]. The manufacturers of the affected brands failed to patch the security lapses in their deployed products and sued the researcher. Cases like this hinder the wide acceptance of various monitoring services. Hence, there is a crucial need to preserve the privacy of end-users' sensitive data. Based on the results of a recent survey, an increasing demand for privacy protection has been a major concern for end-users who offer their data to untrusted third parties in order to receive
any value-added services [8]. The end-users insist that they need full control over the data collection process and cannot tolerate their data being stored in a remote location and accessible to different external parties.
For the review of related work, two fundamental research categories were identified: first, privacy preserving systems are discussed; then, vision-based environmental monitoring systems are shortly surveyed.
2.1 Privacy Preserving Systems
The majority of the literature addresses the problem of privacy on third-party services [11-16], since these are a potential source of leakage of personally identifiable information. However, only a few works have studied privacy for mashup services [17]. The work in [3] discussed a private data mashup system, in which the authors formalize the problem as achieving k-anonymity on the integrated data without revealing detailed information about this process or disclosing data from one party to another. In [18], a theoretical framework is proposed to preserve the privacy of customers and the commercial interests of merchants. Their system is a hybrid recommender that uses secure two-party protocols with a public key infrastructure to achieve the desired goals. In [19, 20], another method is suggested for privacy preservation on centralized services: adding uncertainty to the data with a randomized perturbation technique while attempting to make sure that the necessary statistical aggregates are not disturbed much. Hence, the server has no knowledge of the true values of individual data for each user. They demonstrate that this method does not essentially decrease the accuracy of the obtained results. However, recent research [21, 22] pointed out that these techniques do not provide the levels of privacy previously thought. In [22], it is pointed out that arbitrary randomization is not safe, because it is easy to breach the privacy protection it offers. The authors proposed random-matrix-based spectral filtering techniques to recover the original data from the perturbed data. Their experiments revealed that in many cases random perturbation techniques preserve very little privacy.
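As a concrete illustration of the randomized perturbation idea in [19, 20] (a minimal sketch under our own assumptions, not the authors' actual scheme), additive zero-mean noise hides individual values while leaving aggregates such as the mean almost unchanged:

```python
import random

def perturb(values, scale, seed=0):
    """Add zero-mean uniform noise so the server never sees true
    individual values; statistical aggregates survive roughly."""
    rng = random.Random(seed)
    return [v + rng.uniform(-scale, scale) for v in values]

true_values = [float(i % 10) for i in range(10000)]
noisy_values = perturb(true_values, scale=5.0)

true_mean = sum(true_values) / len(true_values)
noisy_mean = sum(noisy_values) / len(noisy_values)

# Each individual value may be off by up to 5.0, yet the mean
# over 10,000 records barely moves.
print(abs(true_mean - noisy_mean))
```

This also hints at the weakness noted in [22]: because the noise is generated independently of the data, spectral techniques can filter much of it back out.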
2.2 Deep Vision-based Environmental Monitoring Systems
Three main research studies based on machine vision were performed in order to monitor environmental hazards. Martinez-de Dios et al. [23, 24] proposed
a method, which computes a 3D perception model of for-
est fires from multispectral complementary views includ-
ing an aerial one. Infrared cameras in mid-infrared spec-
tral window and in the far infrared windows are used
with visible cameras. A statistical sensor fusion approach
using Kalman filtering is employed to merge measure-
ments from different sensors in order to obtain an overall
estimation. Telemetry sensors, GPS data and artificial
beacons or natural marks (such as a tree or a fire fighter
truck) are necessary for the calibration procedure. The
position of the fire front, the rate of spread and the maxi-
mum height of the flames are estimated. Experiments
were carried out on lands of up to 2.5 hectares. In [25], a
method was proposed in which 3D points are computed
from fire feature points matched using stereoscopic imag-
es. From these points, the geometrical characteristics of a
fire front like its position on the ground, its shape and its
surface are estimated. This method does not need refer-
ence marks on the working field of view. The use of sev-
eral stereovision systems allows obtaining a complete 3D
form of the fire front and the estimation of its volume, but
the technique presented in [26] is only at the laboratory
scale. In another work, the use of NIR stereovision sys-
tems was introduced to obtain fire measurement even in
the presence of smoke [27]. Experiments were carried out
indoors and outdoors on platforms of a maximum size of
about 0.5 hectare. Verstockt et al.[28, 29] have developed a
method using a series of cameras distributed around the
fire to compute a 3D model of fires and smokes. This
framework merges the single-view detection results of the
multiple cameras by homographic projection onto multi-
ple horizontal and vertical planes, which slice the scene.
The crossing of these slices creates a 3D grid of virtual
sensor points. The location of the fire, its size and its di-
rection of propagation are estimated with precision. This
procedure is limited to fire fronts no larger than 2 × 4 m².
One of the most important aspects in the extraction of fire
characteristics is the detection and extraction of the fire
region. The robustness of the measures is correlated with
the efficiency of the segmentation technique. This task is
very challenging when conducted in outdoor unstruc-
tured environment. The majority of the work in wildland
fire segmentation is conducted in the visible spectrum.
Little work was conducted in the NIR and other infrared
spectrums. In the visible spectrum, different colour spaces
such as RGB, YCbCr, CIE L*a*b*, YUV, HSI are used and it
is observed that no colour system seems to be more effec-
tive than another to characterize the fire. Concerning the
segmentation methods, it is difficult to determine the effi-
ciency of each of them in the specific case of wildland
fires because of the lack of a benchmark of these methods
on a standard database of wildfire images. However,
comparisons of different methods on a representative dataset revealed that the methods of Phillips et al. [30], Rossi et al. [31] and Collumeau et al. [32] outperform other methods. More recently, machine-learning-based techniques [33] were developed that have shown increased performance in fire front detection and segmentation. A benchmarking of different fire detection and fire segmentation methods is given in [33]. In the thermal infrared spectrum, the segmentation becomes easier, but thermal
infrared cameras are very expensive and have low resolu-
tions compared to their visible counterparts. The use of
NIR cameras seems to be promising. Ideally, a hybrid sys-
tem, which combines visible and infrared spectrums,
would perform better in urban fire detection and segmen-
tation.
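To make the colour-space discussion concrete, a widely used rule of thumb in the visible spectrum classifies a pixel as fire when it is red-dominant. The sketch below is illustrative only; the threshold value and function names are our assumptions, not parameters taken from the surveyed methods:

```python
def is_fire_pixel(r, g, b, red_threshold=190):
    """Classic RGB heuristic: flame pixels have a strong red channel
    and the channel ordering R > G > B."""
    return r > red_threshold and r > g > b

def segment_fire(pixels):
    """Return a binary fire mask over a list of (R, G, B) pixels."""
    return [is_fire_pixel(r, g, b) for (r, g, b) in pixels]

sample = [(230, 160, 40),   # bright flame-like pixel
          (90, 120, 200),   # sky-like pixel
          (200, 60, 30)]    # ember-like pixel
print(segment_fire(sample))  # → [True, False, True]
```

Real systems refine such rules per colour space and lighting condition, which is exactly why benchmarking on a standard wildfire dataset matters.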
Regarding the utilization of IoT in environmental monitoring systems, the authors in [34] give an overview of past work dealing with the use of aerial vehicles in the context of forest fires. The majority of this work deals with the collection of information and an aerial view of the fire propagation in order to help in firefighting [35,
36]. Fire detection using aerial vehicles was also conduct-
ed in other research [37, 38]. A pioneering work using low-cost aerial vehicles with on-board visible and infrared cameras in close-range fire detection experiments was conducted in [39-42]. Nominating the Internet of Multimedia Things for fire detection, which combines insights from ground-based and airborne sensors along with multimodal cooperative vision analytics, can permit better segmentation, detection, and monitoring of urban forest fires. Additionally, the utilization of the Internet of Multimedia Things allows collaborative modes in building complementary three-dimensional views, which will enable the extraction of 3D geometrical characteristics of fires at a larger scale.
A methodology for a data mashup service for IoMT-enabled collaborative monitoring was proposed in [17]. The authors consider the scenario where the IoMT-enabled data mashup service (MDMS) integrates datasets from multiple IoMT networks for the environmental cognition service; figure (1) illustrates the architecture supported in this work. The proposed architecture hosts an intelligent middleware for private data mashup (DIMPM), which enables connectivity to diverse IoT devices via varied sensing technologies. In doing so, the functionalities of the proposed architecture support a cloud-based infrastructure for environmental cognition services. The cloud environment promotes a service-oriented approach to big data management, providing a deep learning layer for analyzing the merged data. The architecture follows a layered approach, where the bottom layer is the environmental IoT devices, while the highest layer is the environmental cognition service.
Fig. 1. IoMT-enabled data mashup with Third Party Envi-
ronmental Cognition Service.
The data mashup process can be summarized as follows:
• The environmental deep learning service sends a query to the IoMT-enabled data mashup service to gather information related to a specific region, in order to leverage its predictions and performance.
• At the IoMT-enabled data mashup:
o The coordinator agent searches its cache to determine the providers that could satisfy this query, then transforms the query into appropriate sub-query languages suitable for each provider's database.
o The manager agent unit sends each sub-query to the candidate IoT providers to invite them to the data mashup process.
• Based on a prior agreement between the mashup provider and the data providers, each provider that agrees to offer purpose-specific datasets to the mashup process will:
o Forward the sub-query to its manager agent within the intelligent middleware for private data mashup.
o The manager agent rewrites the sub-query considering the privacy preferences of its host and produces a modified sub-query for the data that can be published. This step allows the manager agent to audit all issued sub-queries and prevent ones that could extract sensitive information.
o The resulting dataset is concealed to hide real data using the appropriate obfuscation algorithm, depending on the type of multimedia data.
o Finally, each provider submits its concealed data to the IoMT-enabled data mashup service, which in turn unites these results and performs further analysis on them.
The obtained information is delivered to the environmental cognition service. The environmental deep learning service uses these datasets to accomplish its data analytics goals.
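The steps above can be condensed into a minimal sketch (all function names and the toy concealment are our illustrative assumptions): the mashup service fans a region query out to the providers, each provider conceals its matching rows locally, and the service unites the concealed results:

```python
def provider_answer(dataset, region, conceal):
    """A provider answers a sub-query for one region, concealing
    every row before it leaves the provider's side."""
    return [conceal(row) for row in dataset if row["region"] == region]

def mashup(provider_datasets, region, conceal):
    """The mashup service unites the providers' concealed answers."""
    merged = []
    for dataset in provider_datasets:
        merged.extend(provider_answer(dataset, region, conceal))
    return merged

# Toy concealment: round values so exact readings are hidden.
def round_conceal(row):
    return {"region": row["region"], "value": round(row["value"])}

p1 = [{"region": "R1", "value": 3.4}, {"region": "R2", "value": 1.1}]
p2 = [{"region": "R1", "value": 2.7}]
print(mashup([p1, p2], "R1", round_conceal))
# → [{'region': 'R1', 'value': 3}, {'region': 'R1', 'value': 3}]
```

The key design point is that `conceal` runs on the provider side, so only obfuscated rows ever reach the mashup service.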
3 DATA MASHUPS IN IOT-ENABLED
ENVIRONMENTAL MONITORING
SCENARIO
We consider the scenario where the IoMT-enabled data mashup service (IoMT-enabled DMS) integrates various types of datasets from multiple IoMT context-based services for IoT-enabled environmental monitoring; figures (2) and (3) illustrate the scenario used in this work. We assume all the involved parties follow the semi-honest model, which is a realistic assumption because each party needs to accomplish some business goals and increase its revenues. We also assume all parties involved in the data mashup have a similar item set (activities' catalogue), but their user sets are not identical. Each IoT context-based service has its own ETL (Extract, Transform, Load) service that has the ability to learn behavioural and neighbouring-environment data of citizens. The data mashup process based on CMPM can be summarized as follows. The environmental deep learning service sends a query to the IoMT-enabled DMS to gather information related to behavioural and neighbouring-environment data of citizens in a specific region, in order to leverage its predictions and performance. The coordinator agent in the IoMT-enabled DMS looks up its providers' cache to determine the providers that could satisfy the query, then transforms the query of the environmental deep learning service into appropriate sub-query languages suitable for each provider's data-
base. The manager agent unit sends each sub-query to the candidate providers to invite them to the data mashup process. A provider that decides to participate in the process forwards the sub-query to its manager agent to refine it considering its privacy preferences. This step allows the manager agent to audit all issued sub-queries and prevent ones that could extract sensitive information. The resulting dataset is sent to the local concealment agent (LOA) to hide real participants' data using the appropriate concealment algorithm. Then, the synchronization agent at each provider, along with the coordinator agent, engages in a distributed joint process to identify frequent and partially frequent items in each dataset, and sends the joined results to the coordinator. The coordinator agent builds a virtualized schema for the datasets and submits it to each provider involved in the mashup process. Based on this virtualized schema, the providers incite their global concealment agent (GOA) to start the appropriate concealment algorithm on the locally concealed datasets. Finally, the providers submit all the resulting datasets to the IoMT-enabled DMS, which in turn unites these results and delivers them to the environmental cognition service. The environmental deep learning service uses these datasets to accomplish the required data analytics goals. We use anonymous pseudonym identities to alleviate providers' identity problems, as the database providers do not want to reveal their ownership of the data to competing providers; moreover, the IoMT-enabled DMS is keen to hide the identities of providers as a business asset.
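The distributed identification of frequent and partially frequent items can be sketched as follows (a simplification under our own assumptions: each provider shares only aggregate item counts, and the thresholds are illustrative):

```python
from collections import Counter

def local_counts(dataset):
    """Provider side: count item occurrences locally, so only
    aggregate counts ever leave the provider."""
    return Counter(item for record in dataset for item in record)

def joint_frequent(provider_counts, full_threshold, partial_threshold):
    """Coordinator side: merge the providers' counts and classify
    items as frequent or partially frequent."""
    total = Counter()
    for counts in provider_counts:
        total.update(counts)
    frequent = {i for i, n in total.items() if n >= full_threshold}
    partial = {i for i, n in total.items()
               if partial_threshold <= n < full_threshold}
    return frequent, partial

c1 = local_counts([["a", "b"], ["a"]])   # provider 1: a=2, b=1
c2 = local_counts([["a", "c"], ["b"]])   # provider 2: a=1, b=1, c=1
freq, part = joint_frequent([c1, c2], full_threshold=3, partial_threshold=2)
print(sorted(freq), sorted(part))  # → ['a'] ['b']
```

The coordinator sees totals (a=3, b=2, c=1) but never the providers' raw records, which is the point of running the counting step locally.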
The recommendation process based on the two-stage obfuscation algorithms can be summarized as follows. The IoMT-enabled DMS acts as an integrator that collects data from the lower-level IoMT context providers, processes it, and delivers the result to its upper-level environmental deep learning service. Each IoMT context provider is responsible for sensing and collecting various types of data from the physical world. This data can be textual data or multimedia data, and represents parameters, measurements and conditions in streets, specific regions and buildings, transportation and the air quality. That means IoMT devices can be utilized to monitor and collect data from everything. The process of merging the sensed data at each IoMT context provider can be summarized as follows. The IoMT context provider broadcasts a message to the devices' owners in its network to invite them to submit their personal profiles and multimedia data in order to start notifying the users about the status of a specific environmental hazard. Individual users who decide to respond to this request specify their privacy preferences to their IoMT context provider and then submit the requested data. More details about this process can be summarized as follows:
1. An IoMT context provider broadcasts a message to the devices' owners in its network to indicate its intention to start notifying the users of a specific region about the status of a specific environmental hazard in their surroundings. The provider's request may require textual and/or multimedia data.
2. Individual users who decide to participate in that request integrate all the textual and/or multimedia data that they collected for a specific region. In addition, each participant specifies its privacy preferences regarding textual and/or multimedia data. Finally, they submit the collected data and preferences to the requester.
3. In order to hide the identities and personal information of the participants' group from the IoMT context provider, each participant masks the list of items provided by responding users using anonymous indexes, which are linked to the actual item indexes through a secret map σ known only by them, as in Table 1. One important way to standardize this secret map is to use hashing functions with a group-generated key to mask the list of regions and users from the IoMT context provider.
Table 1. Secret map σ used by the participants

Anonymous Index or Hash Value | Region Index | Item Name | Data1 | Data2 | Image1
A1                            | R1           | …         | …     | …     | …
A2                            | R2           | …         | …     | …     | …
A3                            | R3           | …         | …     | …     | …

Because of this, the IoMT context provider will not be able to deal directly with item names, only with their hash values or anonymous indexes. Additionally, the users' data are also anonymized.
4. Each participant submits the collected data, together with the pseudonyms of the devices' owners who participated in the collection process, to the IoMT context provider.
5. The IoMT context provider inserts the pseudonyms into its user database and their data into its main database. The IoMT context provider updates its model using the received data, then produces a list LS = {l1, l2, …, ln} of anonymous indexes that users in the same cluster have chosen in the past.
6. The participants then retrieve their secret maps, so they are able to update their data: they can unmask the list LS using the shared secret map σ to get the final list.
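Step 3's group-keyed hashing and step 6's unmasking can be sketched with a keyed hash (HMAC); the key value, item names, and index length below are our illustrative assumptions, not values from the paper:

```python
import hashlib
import hmac

def anonymous_index(group_key: bytes, item_name: str) -> str:
    """Mask an item name with a keyed hash; only holders of the
    group-generated key can rebuild the secret map."""
    digest = hmac.new(group_key, item_name.encode(), hashlib.sha256)
    return digest.hexdigest()[:12]

group_key = b"group-generated-secret"   # shared by the participants only
items = ["park_R1", "bridge_R2", "school_R3"]

# The secret map links anonymous indexes back to real item names.
secret_map = {anonymous_index(group_key, name): name for name in items}

# The IoMT context provider sees only the anonymous indexes...
masked_list = list(secret_map)
# ...and a participant holding the key unmasks the returned list.
unmasked = [secret_map[a] for a in masked_list]
print(unmasked == items)  # → True
```

Using a keyed hash rather than a plain hash matters here: without the group key, the provider cannot run a dictionary attack over likely item names.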
4 PROPOSED COGNITIVE
CONCEALMENT ALGORITHMS
In the next sub-sections, we introduce the proposed cognitive algorithms used to preserve the privacy of the resulting datasets with minimum loss of accuracy. A closer look at the attack model proposed in [43] reveals that, if a set of behavioural and neighbouring-environment data of a certain citizen is fully distinguishable from the data of other citizens in the dataset with respect to some features, this citizen can be identified if an attacker correlates the revealed data with data from other publicly accessible databases. Therefore, it is highly desirable that the dataset contains at least a minimum number of items with a
2169-3536 (c) 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2017.2787422, IEEE Access
6 IEEE TRANSACTIONS ON JOURNAL NAME, MANUSCRIPT ID
similar feature vector to every real item released by each
participant. A real item in the released dataset can be de-
scribed by a certain number of features in a feature vector,
such as place of activity, type of activity, duration, time,
date and so on. Both implicit and explicit methods can be used to extract this information, to construct these feature vectors, and to maintain them. Additionally, the
data sparsity problem associated with ETL services can be
used to formulate some attacks as also shown in [43]. Be-
fore starting, we introduce a couple of relevant defini-
tions.
Fig.2: The Building Blocks of IoMT-enabled DMS and CMPM
Fig.3: IoMT-enabled data mashup service with Third Party
IoMT context-based services
Definition 1 (Dissimilarity measure): this metric measures the amount of divergence between two items with respect to their feature vectors. We use the notation $Dis(t_i, t_j)$ to denote the dissimilarity measure between items $t_i$ and $t_j$ based on the feature vector of each item. $Dis(t_i, t_j) < \delta \Rightarrow t_i \approx t_j$ [$t_i$ is similar to $t_j$], where $\delta$ is a user-defined threshold value.
Definition 2 (Affinity group): the set of items that are similar to item $t_i$ with respect to the $q$-th attribute $a_q$ of the feature vector; it is called the affinity group of $t_i$ and is denoted by $AG(t_i)$:
$AG(t_i) = \{ t_j \in D \mid t_i \approx t_j \wedge a = a_q \} = \{ t_j \in D \mid Dis(t_i, t_j) < \delta \}$
Definition 3 (K-Similar item group): Let $D_R$ be the real items dataset and $\hat{D}_R$ its locally concealed version. We say $\hat{D}_R$ satisfies the property of the k-similar item group (where $k$ is a defined value) provided that for every item $t_i \in D_R$ there exist at least $k-1$ other distinct fake items $t_{j_1}, \dots, t_{j_{k-1}} \in D_F$ forming an affinity group such that:
$FV(t_{j_l}) \approx FV(t_i), \quad \forall\, 1 \le l \le k-1$
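Definition 3 can be checked mechanically. The sketch below assumes Euclidean distance as one possible choice for the dissimilarity measure $Dis$ and a single threshold $\delta$; `isKSimilar` is a hypothetical helper, not part of the paper's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using FeatureVec = std::vector<double>;

// Definition 1: Euclidean distance as one possible dissimilarity measure Dis.
double dis(const FeatureVec& a, const FeatureVec& b) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) s += (a[i] - b[i]) * (a[i] - b[i]);
    return std::sqrt(s);
}

// Definition 3: every real item needs at least k-1 fake items within
// threshold delta (its affinity group) in the concealed dataset.
bool isKSimilar(const std::vector<FeatureVec>& realItems,
                const std::vector<FeatureVec>& fakeItems,
                int k, double delta) {
    for (const auto& r : realItems) {
        int similar = 0;
        for (const auto& f : fakeItems)
            if (dis(r, f) < delta) ++similar;
        if (similar < k - 1) return false;   // affinity group too small
    }
    return true;
}
```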
4.1 Local Concealment using Clustering Based
Obfuscation (CBO) Algorithm
Our motivation for proposing CBO is the limitations of the current anonymity models. The anonymity models proposed in the literature fail to provide overall anonymity because they do not consider matching items based on their feature vectors. CBO uses the feature vectors of the current real items to select fake items that are highly similar to the real items, creating a homogeneous concealed dataset. Using fake transactions to maintain privacy was presented in [3], [44], [45], where the authors considered adding fake transactions to anonymise the original data transactions. This approach has several advantages over other schemes, including that any off-the-shelf data analytics algorithm can be used to analyse the concealed data, and that it can provide a high theoretical privacy guarantee. The locally concealed dataset obtained using CBO should be indistinguishable from the original dataset in order to preserve privacy. The core idea of CBO is to split the dataset into two subsets: the first subset is modified to satisfy the k-similar item group definition, and the other subset is concealed by substituting real items with fake items based on a probabilistic approach. CBO creates a concealed dataset $\hat{D}$ as follows:
1. The sensitive items are suppressed from the dataset based on provider preferences; thereafter, we have the suppressed dataset $D$ as the real dataset.
2. A $P$ percent of the highest-frequency items in dataset $D$ is selected to form a new subset $D_R$. This step aims to reduce the number of substituted fake items inside the concealed dataset $\hat{D}$. Moreover, it maintains data quality by preserving the aggregates of highly frequent preferences.
3. CBO builds affinity groups for each real item $t_i \in D_R$ by adding fake items to form k-similar item groups. We implemented this task as a text-categorization problem based on the feature vectors of the real items, using a bag-of-words naive Bayes text classifier [46] extended to handle a vector of bags of words. The task continues until all items in $D_R$ belong to different affinity groups, which yields a new dataset $\hat{D}_R$.
4. For each $t_i \in D_r = D - D_R$, CBO selects a real item $t_i$ from the real item set $D_r$ with probability $p$ or a fake item $t_j$ from the candidate fake item set $D_F$ with probability $1-p$. The selected item $\hat{t}$ is added as a record to the concealed dataset $\hat{D}$. This method achieves the desired privacy guarantee because the type of the selected item and $p$ are unknown to external parties. The process continues until all real items in $D_r$ have been considered.
5. Finally, the concealed dataset $\hat{D}$ is merged with the subset $\hat{D}_R$ obtained in step 3.
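The probabilistic substitution of step 4 can be sketched as below; `concealBySubstitution` is a hypothetical helper over integer item identifiers, with `p` the secret probability of releasing a real item.

```cpp
#include <cassert>
#include <random>
#include <vector>

// CBO step 4 (sketch): each released record is drawn from the real set with
// probability p and from the candidate fake set with probability 1-p, so an
// outsider who sees one record cannot tell which pool it came from.
std::vector<int> concealBySubstitution(const std::vector<int>& realItems,
                                       const std::vector<int>& fakeItems,
                                       double p, unsigned seed) {
    std::mt19937 gen(seed);
    std::bernoulli_distribution pickReal(p);
    std::uniform_int_distribution<size_t> pickFake(0, fakeItems.size() - 1);
    std::vector<int> concealed;
    for (int r : realItems)
        concealed.push_back(pickReal(gen) ? r : fakeItems[pickFake(gen)]);
    return concealed;
}
```

At `p = 1` the released dataset equals the real one and at `p = 0` it is entirely fake; intermediate values trade accuracy for privacy.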
Analysis of Local Concealment using CBO
In terms of performance, CBO incurs supplementary storage and computation costs. The supplementary storage cost can be reduced by clustering the items in the resulting dataset into $C$ clusters and using the feature vectors of the top $N$ highest-rated items in each cluster for the CBO algorithm; the supplementary storage cost is then of order $O(CN)$. The computation cost of CBO is divided between the complexity of creating the affinity groups and that of adding the fake items. The overhead of creating the affinity groups clearly dominates, and it can be reduced by selecting lower values for $P$.
4.2 Global Concealment using Random Ratings
Generation (RRG) Algorithm
After executing CBO, the synchronization agents build a virtualized schema with the aid of the coordinator agent at the IoMT-enabled DMS; then the global concealment agent starts executing the RRG algorithm. The coordinator agent cannot learn the real items in the merged datasets, as they have already been concealed locally using the CBO algorithm. The main aim of RRG is to alleviate the data sparsity problem by filling the empty cells in a way that improves the accuracy of the predictions on the environmental monitoring side and increases the privacy attained by the providers. The RRG algorithm consists of the following steps:
1. The global concealment agent finds the number of majority frequent items $I_M$ and the number of items that are only partially frequent among all users, $I - I_M$, where $I$ denotes the total number of items in the merged datasets.
2. The global concealment agent randomly selects an integer $\sigma$ between 0 and 100, and then chooses a uniform random number $\rho$ over the range $[0, \sigma]$.
3. The global concealment agent selects $\rho$ percent of the partially frequent items in the merged datasets and uses KNN to predict the values of the empty cells for that percentage.
4. The remaining empty cells are filled with random values chosen using a distribution reflecting the frequent items in the merged datasets.
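The per-cell filling in steps 3 and 4 can be sketched as below for a single item column. This is a minimal illustration under stated simplifications: empty cells are encoded as -1, the column mean stands in for the KNN prediction of step 3, and a uniform draw around the mean stands in for the frequency-reflecting distribution of step 4; `fillColumn` is a hypothetical helper.

```cpp
#include <cassert>
#include <random>
#include <vector>

// RRG sketch on one item column: about rho percent of the empty cells
// (encoded as -1) get a neighbour-based estimate (the column mean stands in
// for the kNN prediction), the remainder get random values drawn around it.
std::vector<double> fillColumn(std::vector<double> col, double rho,
                               unsigned seed) {
    double sum = 0.0;
    int n = 0;
    for (double v : col)
        if (v >= 0) { sum += v; ++n; }
    double mean = n ? sum / n : 0.0;
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> rand01(0.0, 1.0);
    for (double& v : col) {
        if (v < 0)
            v = (rand01(gen) * 100.0 < rho) ? mean              // "kNN" fill
                                            : rand01(gen) * mean * 2.0;
    }
    return col;
}
```

Larger $\rho$ means more cells are filled by prediction rather than randomness, mirroring the accuracy/privacy trade-off discussed below.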
Analysis of Global perturbation using RRG
The privacy of the merged datasets is maintained because all processing is done on datasets previously concealed using CBO. The global concealment agent improves overall privacy and accuracy by increasing the density of the merged datasets through the filled cells. With increasing $\sigma$ values, RRG reduces the randomness in the frequencies, which may increase the accuracy of the predictions while decreasing the privacy level. RRG should therefore select $\sigma$ so as to achieve the required balance between privacy and accuracy.
5 PROPOSED ENVIRONMENTAL DEEP
LEARNING SERVICE
In this section, we propose a new service for anomaly detection. It uses a Markov-based segmentation approach to detect possible regions of interest and then feeds the features extracted within each region into a discriminative classifier to extract various patterns.
Fig. 4. Building Blocks of the Anomaly Detection System
Figure 4 depicts the basic flowchart of our approach,
which consists of four modules. Firstly, the noise present
in the captured images is eliminated with the help of a
noise removal and background subtraction processes. The
second module executes Markov based approach to seg-
ment the possible regions of interest. The third module
extracts the viable features within each region using a
multiresolution wavelet transform. Finally, the last module is a discriminative classifier that learns effective features from each region and classifies it as an anomalous or a normal region. In the next sub-sections, we introduce the various steps involved in our proposed anomaly detection service. Each step utilizes an effective technique that plays an important role in the system. The proposed anomaly detection system consists of the following steps:
Step 1: Noise Removal and Background Subtraction
Processes
The noise present in the captured images is eliminated
with the help of anisotropic diffusion combined with non-
local mean and Gaussian background process [47]. The
method successfully analyses the images according to
each and every pixel present without eliminating the im-
portant features such as line, interpretations, and edges.
Moreover, the anisotropic diffusion process can effectively analyse blurred images. This process is applied to images using the following equation:
$\frac{\partial I}{\partial t} = \mathrm{div}\big(c(x,y,t)\,\nabla I\big) = \nabla c \cdot \nabla I + c(x,y,t)\,\Delta I \quad (1)$
where $\mathrm{div}\big(c(x,y,t)\,\nabla I\big)$ represents the divergence operator applied to the diffusion coefficient $c(x,y,t)$ together with the image gradient operator $\nabla I$. Based on (1), anisotropic diffusion is applied to the image; a pixel corrupted with noise can then be replaced using a non-local approach [43], where the similarity between pixels of the image is determined using the pixel intensity and is defined as follows:
$v(i) = u(i) + n(i) \quad (2)$
where $v(i)$ is the observed value of pixel $i$ in the given image $I$, $u(i)$ is the "true" value of pixel $i$, and $n(i)$ is the noise mixed into the value of pixel $i$. The noise in the image is analysed under the assumption that $n(i)$ is an independent value drawn from a Gaussian distribution with variance $\sigma^2$ and mean $\mu$ equal to 0. Based on that, the similarity between neighbouring pixels is defined through the weights $w(i, j_1)$ and $w(i, j_2)$. The non-local mean value of each pixel [43] is then calculated as follows:
$NL(v)(i) = \sum_{j \in I} w(i,j)\, v(j) \quad (3)$
where $v$ is the image with noise and the weights $w(i,j)$ satisfy $0 \le w(i,j) \le 1$ and $\sum_{j} w(i,j) = 1$; they are defined as follows:
$w(i,j) = \frac{1}{Z(i)}\, e^{-d(i,j)/h^2} \quad (4)$
where $\sigma$ is the standard deviation of the noise and $Z(i)$ is a normalizing constant defined as follows:
$Z(i) = \sum_{j} e^{-d(i,j)/h^2} \quad (5)$
where $h$ is the weight-decay control parameter. After that, the neighbourhood similarity value $d(i,j)$ is defined using the weighted values of the pixels and is calculated as follows:
$d(i,j) = \big\| v(N_i) - v(N_j) \big\|^2_{2,F} \quad (6)$
where $N_i$ and $N_j$ denote square neighbourhoods centred at pixels $i$ and $j$.
Here $F$ is the neighbourhood filter applied to the squared difference of the neighbourhoods and is defined as follows:
$F = \frac{1}{M^2} \sum_{m=k}^{M} \frac{1}{(2m+1)^2} \quad (7)$
where $k$ is the distance between the weight and the centre of the neighbourhood filter and $M$ is the filter radius. $F$ yields higher values for pixels near the neighbourhood centre and lower values for pixels near the neighbourhood edge. Finally, these values are used to generate the final denoised image.
The background subtraction was performed [48] using the Gaussian model. The background model was constructed using the selective average method to eliminate the unwanted background pixel information, as follows:
$B_N(x,y) = \frac{1}{N} \sum_{i=1}^{N} I_i(x,y) \quad (8)$
where $B_N(x,y)$ is the intensity of pixel $(x,y)$ in the background model, $I_i(x,y)$ is the intensity of pixel $(x,y)$ in the $i$-th frame of the captured video, and $N$ is the number of video frames used to construct the background model. The background model is defined using a Gaussian mixture model as follows:
$p(x \mid \lambda) = \sum_{k=1}^{K} w_k\, g(x \mid \mu_k, \Sigma_k), \quad \forall k = 1, \dots, K \quad (9)$
where $x$ is a continuous-valued data vector, $w_k$ are the mixture weights, and $g(x \mid \mu_k, \Sigma_k)$ are the component Gaussian density functions [49]. After that, the probability value of each pixel is calculated:
$P(X_t) = \sum_{k=1}^{K} w_k\, N(X_t \mid \mu_k, \Sigma_k) \quad (10)$
where $N$ is the probability density function with mean vector $\mu$ and covariance $\Sigma$, and $w_k$ is the weight of the $k$-th Gaussian. The new pixel value $X_t$ is compared to each Gaussian; if the Gaussian is matched, i.e. $|X_t - \mu_k| < 2.5\,\sigma_k$, its parameters are updated in accordance with:
$w_{k,t} = (1-\alpha) \cdot w_{k,t-1} + \alpha \cdot M_{k,t}$
$\mu_t = (1-\rho) \cdot \mu_{t-1} + \rho \cdot X_t$
$\sigma_t^2 = (1-\rho) \cdot \sigma_{t-1}^2 + \rho \cdot (X_t - \mu_t)^{T}(X_t - \mu_t)$
$\rho = \alpha \cdot N(X_t \mid \mu_k, \sigma_k)$
where $\alpha$ is the learning rate for the Gaussian weight and $M_{k,t}$ is 1 for the matched component and 0 otherwise. The weights of the unmatched components are decayed as $w_{k,t} = (1-\alpha) \cdot w_{k,t-1}$. If none of the Gaussians matches the new pixel value, the lowest-weight component is replaced with a Gaussian centred at $X_t$. When the Gaussians are stored in a corresponding index in descending order of weight, the initial entries of this index will most probably represent the background. After eliminating the background pixels and noise, the images are fed into the next step.
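One update cycle of the per-pixel mixture can be sketched as below, assuming scalar intensities. `updateMixture` is a hypothetical helper: it simplifies $\rho$ to $\alpha$ rather than $\alpha N(X_t \mid \mu_k, \sigma_k)$, and it renormalises the weights after each update.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Gaussian { double w, mu, sigma2; };

// One online update of a per-pixel Gaussian mixture: a component matches
// when |x - mu| < 2.5*sigma; matched components move toward x, and all
// weights are renormalised into a distribution afterwards.
void updateMixture(std::vector<Gaussian>& mix, double x, double alpha) {
    double wsum = 0.0;
    for (auto& g : mix) {
        bool match = std::fabs(x - g.mu) < 2.5 * std::sqrt(g.sigma2);
        g.w = (1.0 - alpha) * g.w + alpha * (match ? 1.0 : 0.0);
        if (match) {
            double rho = alpha;   // simplification of rho = alpha*N(x|mu,sigma)
            g.mu = (1.0 - rho) * g.mu + rho * x;
            g.sigma2 = (1.0 - rho) * g.sigma2 + rho * (x - g.mu) * (x - g.mu);
        }
        wsum += g.w;
    }
    for (auto& g : mix) g.w /= wsum;   // keep the weights a distribution
}
```

A full implementation would also replace the lowest-weight component when no Gaussian matches, as described above.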
Step 2: Regions Segmentation
The segmentation is performed using a Markov random field [10] to ensure the effective extraction of meaningful regions. It uses the local image feature values, the prior probability, and the marginal distribution value of the image. At the start, the Markov random neighbouring values must be defined from the image in terms of both first- and second-order neighbourhoods. Then the initial probability value for each feature is set to 0 or 1. After that, the mean and variance of each pixel value are computed and labelled in the image. From the computed values,
the marginal distribution value is calculated according to Bayes' theorem. Finally, the probability value is calculated, and pixels with similar values are grouped into a particular cluster or region. This process is repeated until the prior probability value converges to its maximum. The extracted regions are fed into the next step.
Step 3: Features Extraction
The multiresolution wavelet transform was employed for feature extraction. At first, the segmented regions are divided into sub-regions [50] in all directions, and then the key elements of the scale descriptors are selected. This step starts by applying a Gaussian filter to the image to detect the key elements. The maximum and minimum values of the edges are determined using the equation $D(x, y, \sigma) = L(x, y, k_i\sigma) - L(x, y, k_j\sigma)$, where $D(x, y, \sigma)$ is the difference-of-Gaussian image, $L(x, y, k\sigma)$ is the convolution of the image, $L(x, y, k\sigma) = G(x, y, k\sigma) * I(x, y)$, and $I(x, y)$ is the Gaussian-blurred value. Detecting the key elements is accomplished using a Taylor series expansion, which is calculated as:
$D(x) = D + \frac{\partial D^{T}}{\partial x} x + \frac{1}{2} x^{T} \frac{\partial^2 D}{\partial x^2} x \quad (11)$
From the detected key elements and their locations, each key element is assigned a magnitude $m(x,y)$ and an orientation $\theta(x,y)$ in every direction as follows:
$m(x,y) = \sqrt{\big(L(x+1,y) - L(x-1,y)\big)^2 + \big(L(x,y+1) - L(x,y-1)\big)^2}$
$\theta(x,y) = \mathrm{atan2}\big(L(x,y+1) - L(x,y-1),\; L(x+1,y) - L(x-1,y)\big)$
Based on the extracted key elements, different features can be calculated, such as the mean, standard deviation, entropy, and variance. The extracted features are fed into the next step.
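The magnitude and orientation assignment above can be sketched for a single pixel as follows; `gradientAt` is a hypothetical helper operating on a smoothed image `L` indexed as `L[x][y]`.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Grad { double mag, theta; };

// Gradient magnitude and orientation of one pixel from its 4-neighbours,
// as used to describe each detected key element.
Grad gradientAt(const std::vector<std::vector<double>>& L, int x, int y) {
    double dx = L[x + 1][y] - L[x - 1][y];   // horizontal central difference
    double dy = L[x][y + 1] - L[x][y - 1];   // vertical central difference
    return { std::sqrt(dx * dx + dy * dy), std::atan2(dy, dx) };
}
```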
Step 4: Anomaly Detection
In the last step, the extracted features are used to train a support vector machine classifier to detect anomalies in the captured videos; the training stage reduces the misclassification error and increases the recognition rate. The training dataset is represented as $D = \{(x_i, y_i) \mid x_i \in \mathbb{R}^n,\ y_i \in \{-1, 1\}\}$. The output value of this stage is in $\{1, -1\}$, in which 1 represents a normal feature and -1 denotes an anomalous feature. The class a feature belongs to is determined by applying the hyperplane $w \cdot x - b = 0$, where $x$ represents the features in the training set, $w$ is the normal vector of the hyperplane, and $b$ is the hyperplane offset. Extreme learning neural networks [51] were utilized to reduce the cost of the maximum-margin classification, which in turn improves the anomaly detection process. At the testing stage, the extracted features are matched against the training features to detect the anomalous features. The accuracy of the proposed system was examined using the experimental results.
7 EXPERIMENTAL RESULTS
The proposed algorithms were implemented in C++; we used the message-passing interface (MPI) for a distributed-memory implementation of the RRG algorithm to mimic a distributed network of nodes. Since there are no publicly available datasets for environmental hazards in internet repositories, we constructed our own dataset from the video footage of the forest fires dataset provided by the National Protection and Rescue Directorate of Croatia, together with other fire videos from online social services such as YouTube. This dataset consists of 1020 fire and non-fire video clips: 130 forest fire clips, 260 indoor fire clips, 320 outdoor fire clips, and 310 non-fire clips. The resolution of the video clips was 480x360 pixels and each clip consists of 200 to 300 frames. Almost 1/4 of the video clips were used for testing while the remainder were used for training. The testing set contains 40 forest fire clips, 60 indoor fire clips, 80 outdoor fire clips, and 80 non-fire clips. The negative video clips contain flame-like lights, such as ambulance lights, flame-effect lights, and so on. Table 2 shows that the proposed techniques achieved real-time performance at this resolution. The most time-consuming part was the calculation of pixel intensity. To check the effect of video resolution on processing time, another video sequence with a resolution of 1280x720 pixels was tested. Tests showed that the complexity increased by more than 2 times for a 5.33-times larger frame size. The precision of the solution applied to these videos is about 94%.
Table 2: Performance of Proposed Anomaly Detection System
In order to evaluate the effect of our proposed concealment algorithms on the mashed-up datasets used in problem solving, a dataset pulled from the SportyPal® network was linked to another dataset containing behavioural and neighbouring-environment data of 8000 students at the University of Zagreb in Croatia over the period 2006 to 2008. For the purposes of this work, we measured two aspects of this dataset: the privacy breach levels and the accuracy of the results. We divided the dataset into a training set and a testing set. The training set is concealed and then used as a database for the monitoring service. To evaluate the accuracy of the generated predictions, we used the mean average error (MAE) metric proposed in [52]. To measure the privacy breach levels, we
Precision   Recall   True Negative   Accuracy   F-Measure   Processing Time (ms)
                     Rate                                   480x360   1280x720
0.8998      0.89     0.82            0.9543     0.9128      25.1      55.2
used mutual information as a measure of the notion of privacy breach of $D$ through $\hat{D}$.
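The accuracy metric can be sketched as below; this assumes MAE is the mean absolute difference between predicted and observed values, and the exact variant in [52] may differ.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Mean absolute error between predicted and true values, the accuracy
// metric used for the generated predictions.
double mae(const std::vector<double>& pred, const std::vector<double>& truth) {
    double s = 0.0;
    for (size_t i = 0; i < pred.size(); ++i)
        s += std::fabs(pred[i] - truth[i]);
    return s / pred.size();
}
```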
In the first experiment, we measure the relation between the quantity of real items in the concealed dataset and the privacy breach. We select $\theta$ in a range from 1.0 to 5.5 and increase the number of real items from 100 to 1000. We select a fake item set drawn from a uniform distribution as a baseline. As shown in Figure 5, our generated fake set reduces the privacy breach and performs much better than the uniform fake set. As the number of real items increases, the uniform fake set gets worse as more information is leaked, while our optimal fake set is not affected by this behaviour.
Fig. 5: Privacy breach for optimal and uniform fake sets
Fig. 6: MAE of the generated predictions vs. concealment rate
In the second experiment, we measured the relation between the quantity of fake items in the subset $D_1$ and the accuracy of the classification results. We select a set of real items from our dataset and split it into two subsets, $D_1$ and $D_2$. We conceal subset $D_2$ with a fixed value of $\theta$ to obtain the subset $\hat{D}_2$. We append to the subset $D_1$ either items from the optimal fake set or items from the uniform fake set. Thereafter, we gradually increase the percentage of real items in $D_1$ selected from our dataset from 0.1 to 0.9. Figure 6 shows the MAE values as a function of the concealment rate for the whole concealed dataset $\hat{D}$. The IoMT context-based service can select a concealment rate based on its privacy preferences. Hence, with a higher value for the concealment rate, more accurate predictions can be attained by the monitoring service. Adding items from the optimal fake set has a minor impact on the MAE of the results, without having to select a higher value for the concealment rate.
Fig. 7. MAE of the generated predictions for different $\sigma$ values
Finally, due to different levels of privacy concern among IoT context-based providers, they might select various values for the $\sigma$ parameter, which can affect the accuracy and privacy of the overall predictions. This probably influences their revenues, since the IoT-enabled CMS pays for the usage of their databases to achieve a certain prediction quality. To evaluate how the various privacy levels affect the accuracy of the predictions, we performed two experiments using our dataset. We varied the value of $\sigma$ from 0 to 100 to show how the different values of $\rho$ affect the accuracy and privacy of the results. Note that when the value of $\sigma$ is 0, all the partially frequent and infrequent items are selected and filled with random values chosen using a distribution reflecting the data in the merged datasets. Once we set the value of $\sigma$, we can randomly select the value of $\rho$ over the range $[0, \sigma]$; after calculating the MAE and privacy breach of the results, Figures 7 and 8 depict the outcome. As seen from Figure 7, accuracy improves with larger $\sigma$ values, as the size of the portion filled using KNN increases and the size of the randomized portion decreases. Although augmenting the value of $\sigma$ attains lower values of MAE, we still retain a decent accuracy level for the predictions. Accuracy losses result from errors in the predictions, such that the predicted items might not represent the true frequent items with regard to the infrequent items. There is also an error yielded by using KNN predictions
with different values for the K parameter. Thanks to these errors, we guarantee lower values of the privacy breach metric in the merged datasets, as shown in Figure 8. This can contribute to overcoming some privacy breaches that might happen due to the mashup process over various datasets from independent IoT context-based services [53]. We can conclude that the accuracy losses due to privacy concerns are small and that our proposed algorithms make it possible to offer accurate predictions.
Fig. 8. Privacy breach of the generated predictions for different $\rho$ values
8 CONCLUSIONS
In this work, we presented our ongoing work on building a cognitive-based middleware for private data mashup (CMPM) to serve a centralized IoT-enabled environmental monitoring service. We gave a brief overview of the mashup process and two concealment mechanisms. A novel anomaly detection solution was also presented in detail, which achieves promising results in terms of performance. The experiments were conducted on a real dataset and show that the accuracy of our solution is more than 94%. However, the ability to detect environmental hazards such as fires and to reduce false positives depends mainly on the image quality. Additionally, the experiments show that our approach reduces privacy breaches and attains accurate results. We encountered many challenges in building an IoMT-enabled data mashup service; as a result, we focused on the environmental monitoring service scenario. This allows us to move forward in building an integrated system while studying issues such as dynamic data release at a later stage, deferring certain issues such as the virtualized schema and auditing to a future research agenda.
ACKNOWLEDGMENTS
This work was partially financed by the “Dirección
General de Investigación, Innovación y Postgrado” of
Federico Santa María Technical University- Chile, in
the project Security in Cyber-Physical Systems for
Power Grids (UTFSM-DGIP PI.L.17.15), by Advanced
Center for Electrical and Electronic Engineering
(AC3E) CONICYT-Basal Project FB0008, by the Mi-
crosoft Azure for Research Grant (0518798).
REFERENCES
[1] T. Catarci, M. de Leoni, A. Marrella, M. Mecella, B. Salvatore, G.
Vetere, S. Dustdar, L. Juszczyk, A. Manzoor, and H.-L. Truong,
“Pervasive software environments for supporting disaster
responses,” IEEE Internet Computing, vol. 12, no. 1, pp. 26-37,
2008.
[2] J. San-Miguel-Ayanz, E. Schulte, G. Schmuck, A. Camia, P. Strobl,
G. Liberta, C. Giovando, R. Boca, F. Sedano, and P. Kempeneers,
“Comprehensive monitoring of wildfires in Europe: the European
forest fire information system (EFFIS),” 2012.
[3] T. Trojer, B. C. M. Fung, and P. C. K. Hung, “Service-Oriented
Architecture for Privacy-Preserving Data Mashup,” in Proceedings
of the 2009 IEEE International Conference on Web Services, 2009,
pp. 767-774.
[4] R. D. Hof, “Mix, Match, And Mutate,” BusinessWeek, 2005.
[5] M. d. Gemmis, L. Iaquinta, P. Lops, C. Musto, F. Narducci, and G.
Semeraro, “Preference Learning in Recommender Systems,” in
European Conference on Machine Learning and Principles and
Practice of Knowledge Discovery in Databases (ECML/PKDD),
Slovenia, 2009.
[6] L. F. Cranor, “'I didn't buy it for myself' privacy and ecommerce
personalization,” in Proceedings of the 2003 ACM workshop on
Privacy in the electronic society, Washington, DC, 2003.
[7] C. Dialogue, "Cyber Dialogue Survey Data Reveals Lost Revenue
for Retailers Due to Widespread Consumer Privacy Concerns,"
Cyber Dialogue, 2001.
[8] J. S. Olson, J. Grudin, and E. Horvitz, “A study of preferences for
sharing and privacy,” in CHI '05 extended abstracts on Human
factors in computing systems, Portland, OR, USA, 2005, pp. 1985-
1988.
[9] I. Martinovic, D. Davies, M. Frank, D. Perito, T. Ros, and D. Song,
“On the feasibility of side-channel attacks with brain-computer
interfaces,” in Proceedings of the 21st USENIX conference on
Security symposium, Bellevue, WA, 2012, pp. 34-34.
[10] D. Storm, “MEDJACK: Hackers hijacking medical devices to create
backdoors in hospital networks,” Computerworld, 2015.
[11] A. M. Elmisery, and D. Botvich, “Enhanced middleware for
collaborative privacy in IPTV recommender services,” Journal of
Convergence, vol. 2, no. 2, pp. 10, 2011.
[12] A. M. Elmisery, and D. Botvich, "Agent based middleware for
private data mashup in IPTV recommender services." pp. 107-111.
[13] A. M. Elmisery, and D. Botvich, “Multi-agent based middleware for
protecting privacy in IPTV content recommender services,”
Multimedia Tools and Applications, vol. 64, no. 2, pp. 249-275,
2012.
[14] A. M. Elmisery, “Private personalized social recommendations in an
IPTV system,” New Review of Hypermedia and Multimedia, vol.
20, no. 2, pp. 145-167, 2014/04/03, 2014.
[15] A. M. Elmisery, S. Rho, and D. Botvich, “A Fog Based Middleware
for Automated Compliance With OECD Privacy Principles in
Internet of Healthcare Things,” IEEE Access, vol. 4, pp. 8418-8441,
2016.
[16] A. M. Elmisery, S. Rho, and D. Botvich, “A distributed collaborative
platform for personal health profiles in patient-driven health social
network,” Int. J. Distrib. Sen. Netw., vol. 2015, pp. 11-11, 2015.
[17] A. M. Elmisery, S. Rho, M. Sertovic, K. Boudaoud, and S. Seo,
“Privacy aware group based recommender system in multimedia
services,” Multimedia Tools and Applications, vol. 76, no. 24, pp.
26103-26127, December 01, 2017.
[18] A. Esma, "Experimental Demonstration of a Hybrid Privacy-
Preserving Recommender System." pp. 161-170.
[19] H. Polat, and W. Du, “Privacy-Preserving Collaborative Filtering
Using Randomized Perturbation Techniques,” in Proceedings of the
Third IEEE International Conference on Data Mining, 2003, p. 625.
[20] H. Polat, and W. Du, “SVD-based collaborative filtering with
privacy,” in Proceedings of the 2005 ACM symposium on Applied
computing, Santa Fe, New Mexico, 2005, pp. 791-795.
[21] Z. Huang, W. Du, and B. Chen, “Deriving private information from
randomized data,” in Proceedings of the 2005 ACM SIGMOD
international conference on Management of data, Baltimore,
Maryland, 2005, pp. 37-48.
[22] H. Kargupta, S. Datta, Q. Wang, and K. Sivakumar, “On the Privacy
Preserving Properties of Random Data Perturbation Techniques,” in
Proceedings of the Third IEEE International Conference on Data
Mining, 2003, p. 99.
[23] J. R. Martinez-de Dios, B. C. Arrue, A. Ollero, L. Merino, and F.
Gómez-Rodríguez, “Computer vision techniques for forest fire
perception,” Image and vision computing, vol. 26, no. 4, pp. 550-
562, 2008.
[24] J. R. Martínez-de Dios, L. Merino, F. Caballero, and A. Ollero,
“Automatic forest-fire measuring using ground stations and
unmanned aerial systems,” Sensors, vol. 11, no. 6, pp. 6328-6353,
2011.
[25] L. Rossi, T. Molinier, A. Pieri, M. Akhloufi, Y. Tison, and F.
Bosseur, “Measurement of the geometric characteristics of a fire
front by stereovision techniques on field experiments,” Measurement
Science and Technology, vol. 22, no. 12, p. 125504, 2011.
[26] L. Rossi, T. Molinier, M. Akhloufi, Y. Tison, and A. Pieri,
“Estimating the surface and volume of laboratory-scale wildfire fuel
using computer vision,” IET Image Processing, vol. 6, no. 8, pp.
1031-1040, 2012.
[27] L. Rossi, T. Toulouse, M. Akhloufi, A. Pieri, and Y. Tison,
“Estimation of spreading fire geometrical characteristics using near
infrared stereovision,” in Proc. SPIE-IS&T, vol. 8650, p. 86500A,
2013.
[28] S. Verstockt, S. Van Hoecke, N. Tilley, B. Merci, B. Sette, P.
Lambert, C.-F. J. Hollemeersch, and R. Van de Walle, “FireCube: a
multi-view localization framework for 3D fire analysis,” Fire Safety
Journal, vol. 46, no. 5, pp. 262-275, 2011.
[29] S. Verstockt, “Multi-modal video analysis for early fire detection,”
Ph.D. dissertation, Ghent University, 2011.
[30] W. Phillips III, M. Shah, and N. da Vitoria Lobo, “Flame recognition
in video,” Pattern recognition letters, vol. 23, no. 1, pp. 319-327,
2002.
[31] R. Lucile, A. Moulay, and T. Yves, “Dynamic fire 3D modeling
using a real-time stereovision system,” Journal of Communication
and Computer, vol. 6, no. 10, pp. 54-61, 2009.
[32] J.-F. Collumeau, H. Laurent, A. Hafiane, and K. Chetehouna, "Fire
scene segmentations for forest fire characterization: a comparative
study." pp. 2973-2976.
[33] T. Toulouse, L. Rossi, T. Celik, and M. Akhloufi, “Automatic fire
pixel detection using image processing: a comparative analysis of
rule-based and machine learning-based methods,” Signal, Image and
Video Processing, vol. 10, no. 4, pp. 647-654, 2016.
[34] C. Yuan, Y. Zhang, and Z. Liu, “A survey on technologies for
automatic forest fire monitoring, detection, and fighting using
unmanned aerial vehicles and remote sensing techniques,” Canadian
journal of forest research, vol. 45, no. 7, pp. 783-792, 2015.
[35] C. C. Wilson, and J. B. Davis, “Forest fire laboratory at Riverside
and fire research in California: Past, present, and future,” 1988.
[36] V. G. Ambrosia, and T. Zajkowski, "Selection of appropriate class
UAS/sensors to support fire monitoring: experiences in the United
States," Handbook of Unmanned Aerial Vehicles, pp. 2723-2754:
Springer, 2015.
[37] V. Ambrosia, “Remotely piloted vehicles as fire imaging platforms:
The future is here,” Wildfire Magazine, vol. 11, no. 3, pp. 9-16,
2002.
[38] R. Charvat, R. Ozburn, S. Bushong, K. Cohen, and M. Kumar,
"SIERRA Team Flight of Zephyr UAS at West Virginia Wild Land
Fire Burn," Infotech@Aerospace 2012, p. 2544, 2012.
[39] A. Ollero, S. Lacroix, L. Merino, J. Gancet, J. Wiklund, V. Remuß,
I. V. Perez, L. G. Gutiérrez, D. X. Viegas, and M. A. G. Benitez,
“Multiple eyes in the skies: architecture and perception issues in the
COMETS unmanned air vehicles project,” IEEE robotics &
automation magazine, vol. 12, no. 2, pp. 46-57, 2005.
[40] J. Martínez-de-Dios, L. Merino, A. Ollero, L. Ribeiro, and X.
Viegas, “Multi-UAV experiments: application to forest fires,”
Multiple Heterogeneous Unmanned Aerial Vehicles, pp. 207-228,
2007.
[41] L. Merino, F. Caballero, J. R. Martínez-de-Dios, I. Maza, and A.
Ollero, “An unmanned aircraft system for automatic forest fire
monitoring and measurement,” Journal of Intelligent & Robotic
Systems, vol. 65, no. 1, pp. 533-548, 2012.
[42] L. Merino, J. R. Martínez-de Dios, and A. Ollero, "Cooperative
unmanned aerial systems for fire detection, monitoring, and
extinguishing," Handbook of Unmanned Aerial Vehicles, pp. 2693-
2722: Springer, 2015.
[43] A. Narayanan, and V. Shmatikov, “Robust De-anonymization of
Large Sparse Datasets,” in Proceedings of the 2008 IEEE
Symposium on Security and Privacy, 2008.
[44] J.-L. Lin, and J. Y.-C. Liu, “Privacy preserving itemset mining
through fake transactions,” in Proceedings of the 2007 ACM
symposium on Applied computing, Seoul, Korea, 2007, pp. 375-379.
[45] J.-L. Lin, and Y.-W. Cheng, “Privacy preserving itemset mining
through noisy items,” Expert Systems with Applications, vol. 36, no.
3, Part 1, pp. 5711-5717, 2009.
[46] D. D. Lewis, “Naive (Bayes) at Forty: The Independence
Assumption in Information Retrieval,” in Proceedings of the 10th
European Conference on Machine Learning, 1998, pp. 4-15.
[47] D. Tschumperlé, and L. Brun, "Non-local image smoothing by
applying anisotropic diffusion PDE's in the space of patches." pp.
2957-2960.
[48] M. Piccardi, "Background subtraction techniques: a review." pp.
3099-3104.
[49] A. Elgammal, D. Harwood, and L. Davis, “Non-parametric model
for background subtraction,” Computer Vision—ECCV 2000, pp.
751-767, 2000.
[50] T. Ojala, M. Pietikainen, and T. Maenpaa, “Multiresolution gray-
scale and rotation invariant texture classification with local binary
patterns,” IEEE Transactions on pattern analysis and machine
intelligence, vol. 24, no. 7, pp. 971-987, 2002.
[51] G. Feng, G.-B. Huang, Q. Lin, and R. Gay, “Error minimized
extreme learning machine with growth of hidden nodes and
incremental learning,” IEEE Transactions on Neural Networks, vol.
20, no. 8, pp. 1352-1357, 2009.
[52] J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl,
“Evaluating collaborative filtering recommender systems,” ACM
Trans. Inf. Syst., vol. 22, no. 1, pp. 5-53, 2004.
[53] P. Golle, F. McSherry, and I. Mironov, “Data collection with self-
enforcing privacy,” in Proceedings of the 13th ACM conference on
Computer and communications security, Alexandria, Virginia, USA,
2006, pp. 69-78.