Conference Paper

Comparison of De-Identification Techniques for Privacy Preserving Data Analysis in Vehicular Data Sharing

Authors:
  • Continental Automotive Technologies GmbH

Abstract

Vehicles are becoming interconnected and autonomous while collecting, sharing, and processing large amounts of personal and private data. When developing a service that relies on such data, ensuring privacy-preserving data sharing and processing is one of the main challenges. Often, several entities are involved in these steps and the interested parties are manifold. To ensure data privacy, a variety of de-identification techniques exist, each with unique peculiarities to be considered. In this paper, we show, using the example of a location-based service for weather prediction operated by an energy grid operator, how the different de-identification techniques can be evaluated. With this, we aim to provide a better understanding of state-of-the-art de-identification techniques and the pitfalls to consider during implementation. Finally, we find that the optimal technique for a specific service depends highly on the scenario's specifications and requirements.


... The most detailed analyses addressing re-identification risks were proposed in [29][30][31]. Brasher [29] discussed the limitations of anonymization and encouraged its conjunction with pseudonymization to reduce re-identification risks. Li et al. discussed the inference and de-anonymization risks within the driverless environment [30]. ...
... Li et al. discussed the inference and de-anonymization risks within the driverless environment [30]. Löbner et al. [31] evaluated the re-identification risks and their impact within the vehicular context through a real test bed scenario, though the efforts remain limited to some of the privacy-preserving techniques without reviewing them thoroughly. ...
... • Developing a significant reference point for academic research on ACS, public transportation operators, automobile manufacturers (OEMs), policymakers, and service providers acquiring or looking forward to deploying ACSs within their systems. [Flattened comparison table of related works by year: [33] 2022; Mulder and Vellinga [27] 2021; Löbner et al. [31] 2021; ENISA [32] 2021; Pattinson et al. [19] 2020; Krontiris et al. [21] 2020; Costantini et al. [13] 2020; Vallet [20] 2019; Ribeiro and Nakamura [28] 2019; Li et al. [30] 2019; Taeihagh and Lim [17,34] 2018; Veitas and Delaere [6] 2018; Bastos et al. [22] 2018; Ainsalu et al. [2] 2018; Karnouskos and Kerschbaum [25] 2018; Brasher [29] 2018; Collingwood [23] 2017; WP29 [35] 2014; Glancy [24] 2012; and this work. Column footnotes: (a) privacy challenges, (b) data subjects.] ...
Article
Full-text available
The fast evolution and prevalence of driverless technologies has facilitated the testing and deployment of automated city shuttles (ACSs) as a means of public transportation in smart cities. For their efficient functioning, ACSs require a real-time data compilation and exchange of information with their internal components and external environment. However, that nexus of data exchange comes with privacy concerns and data protection challenges. In particular, the technical realization of stringent data protection laws on data collection and processing are key issues to be tackled within the ACSs ecosystem. Our work provides an in-depth analysis of the GDPR requirements that should be considered by the ACSs’ stakeholders during the collection, storage, use, and transmission of data to and from the vehicles. First, an analysis is performed on the data processing principles, the rights of data subjects, and the subsequent obligations for the data controllers where we highlight the mixed roles that can be assigned to the ACSs stakeholders. Secondly, the compatibility of privacy laws with security technologies focusing on the gap between the legal definitions and the technological implementation of privacy-preserving techniques are discussed. In face of the GDPR pitfalls, our work recommends a further strengthening of the data protection law. The interdisciplinary approach will ensure that the overlapping stakeholder roles and the blurring implementation of data privacy-preserving techniques within the ACSs landscape are efficiently addressed.
... In the context of smart-city vehicular data sharing, electricity providers receive location data and weather data from cars that perform the measurements. With HE, the electricity supplier can compute on this data and still obtain the intended results without ever having access to the actual location of the vehicles [152]. ...
... One approach for vehicular MPC communication is described in [152]: a cooperative control strategy that incorporates efficient MPC, reduces latency, and integrates a secret function sharing scheme. The MPC can be performed using a separate map on different clusters, where the vehicles of each cluster together calculate the cluster's average energy demand with secure multiparty computation. ...
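The cluster-wise averaging described above can be sketched with additive secret sharing, a basic MPC building block. This is a minimal illustration, not the scheme of [152]; the field modulus, party count, and demand values are illustrative assumptions.

```python
import random

PRIME = 2**61 - 1  # field modulus for additive secret sharing (illustrative)

def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recombine additive shares into the original value."""
    return sum(shares) % PRIME

# Each vehicle in a cluster secret-shares its energy demand (in Wh)
demands = [1200, 900, 1500]
n = len(demands)
all_shares = [share(d, n) for d in demands]

# Party j locally sums the j-th share of every vehicle; no single party
# ever sees a raw demand value
partial_sums = [sum(vehicle_shares[j] for vehicle_shares in all_shares) % PRIME
                for j in range(n)]

total = reconstruct(partial_sums)
average = total / n
print(average)  # 1200.0
```

Each party learns only its own random-looking shares and the final aggregate, which is exactly the property the excerpt relies on for the average energy demand.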
Article
Full-text available
Smart cities, leveraging IoT technologies, are revolutionizing the quality of life for citizens. However, the massive data generated in these cities also poses significant privacy risks, particularly in de-anonymization and re-identification. This survey focuses on the privacy concerns and commonly used techniques for data protection in smart cities, specifically addressing geolocation data and video surveillance. We categorize the attacks into linking, predictive and inference, and side-channel attacks. Furthermore, we examine the most widely employed de-identification and anonymization techniques, highlighting privacy-preserving techniques and anonymization tools; while these methods can reduce the privacy risks, they are not enough to address all the challenges. In addition, we argue that de-identification must involve properties such as unlikability, selective disclosure and self-sovereignty. This paper concludes by outlining future research challenges in achieving complete de-identification in smart cities.
... In contrast, our paper aims at the technical realization of GDPR requirements. In addition to the system model, our work reflects on implementation considerations by selecting suitable privacy-preserving technologies (PPT) that are required for a technical realization in the vehicle, which can itself be challenging [19,28]. The concept of the Privacy Manager (cf. ...
Conference Paper
Cars are rapidly getting connected with their environment, allowing all kinds of mobility services based on the data from various sensors in the car. Data privacy is in many cases only ensured by legislation, i.e., the European General Data Protection Regulation (GDPR), but not technically enforced. Therefore, we present a system model for enforcing purpose limitation based on data tagging and attribute-based encryption. By encrypting sensitive data in a way that only services for a certain purpose can decrypt it, we ensure access control based on the purpose of a service. In this paper, we present and discuss our system model with the aim to improve technical enforcement of GDPR principles. CCS Concepts: • Security and privacy → Human and societal aspects of security and privacy; Privacy protections; Usability in security and privacy; • Computer systems organization → Special purpose systems.
... Inferences can be seen as "new" data, created through the combination of (personal) data of different types and sources. Inferences can also be targeted at de-identified data [73,49] when combining the existing data set with another set to re-identify users. The ethical issue is how these inferences should be treated under consideration of all circumstances, that is, the different entities involved (creator, data subjects), the type of data, as well as its purpose and processing. ...
Chapter
Full-text available
Enabling cybersecurity and protecting personal data are crucial challenges in the development and provision of digital service chains. Data and information are the key ingredients in the creation process of new digital services and products. While legal and technical problems are frequently discussed in academia, ethical issues of digital service chains and the commercialization of data are seldom investigated. Thus, based on outcomes of the Horizon2020 PANELFIT project, this work discusses current ethical issues related to cybersecurity. Utilizing expert workshops and encounters as well as a scientific literature review, ethical issues are mapped on individual steps of digital service chains. Not surprisingly, the results demonstrate that ethical challenges cannot be resolved in a general way, but need to be discussed individually and with respect to the ethical principles that are violated in the specific step of the service chain. Nevertheless, our results support practitioners by providing and discussing a list of ethical challenges to enable legally compliant as well as ethically acceptable solutions in the future.
Chapter
With federated learning, information among different clients can be accessed to train a central model that aims for an optimal use of data while keeping the clients' data local and private. But since its emergence in 2017, several threats against federated learning have been identified, such as gradient attacks or model poisoning attacks. Therefore, federated learning cannot be considered a standalone privacy-preserving machine learning technique. Thus, we analyse how and where local differential privacy can compensate for the drawbacks of federated learning while keeping its advantage of combining data from different sources. In this work, we analyse the different communication channels and entities in the federated learning architecture that may be attacked or may try to reveal data from other entities. Thereby, we evaluate where local differential privacy is helpful. Finally, for our spam and ham email classification model with local differential privacy, we find that setting a local F1-score threshold on the clients' level can reduce the consumption of privacy budget over several rounds and decrease the training time. Moreover, we find that the central model can achieve a significantly higher F1-score than those set on the local level for the clients.
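The client-side protection discussed above can be sketched as clipping each model update and adding Laplace noise locally before anything leaves the client. This is a minimal illustration of local differential privacy in federated learning; the clipping bound, epsilon, per-round noise scale, and toy update vectors are all assumptions, not the chapter's actual configuration.

```python
import math
import random

def clip(update, c):
    """Clip an update vector to L2 norm at most c, bounding its sensitivity."""
    norm = math.sqrt(sum(v * v for v in update))
    return [v * min(1.0, c / norm) for v in update] if norm > 0 else update

def randomize(update, c, epsilon):
    """Client-side local DP: clip, then add Laplace noise (scale 2c/epsilon).
    A Laplace sample is the difference of two i.i.d. exponential samples."""
    scale = 2 * c / epsilon
    return [v + random.expovariate(1 / scale) - random.expovariate(1 / scale)
            for v in clip(update, c)]

random.seed(0)
client_updates = [[0.5, -1.2], [0.3, 0.9], [-0.7, 0.1]]
noisy = [randomize(u, c=1.0, epsilon=2.0) for u in client_updates]

# The server only ever sees and averages the already-noised updates
avg = [sum(col) / len(noisy) for col in zip(*noisy)]
```

Because noise is added before transmission, neither the aggregation server nor an eavesdropper on the communication channel observes a raw client update.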
Article
Driver emotions play a vital role in driving safety and performance. Consequently, regulating driver emotions through empathic interfaces has been investigated thoroughly. However, the prerequisite - driver emotion sensing - is a challenging endeavor: Body-worn physiological sensors are intrusive, while facial and speech recognition only capture overt emotions. In a user study (N=27), we investigate how emotions can be unobtrusively predicted by analyzing a rich set of contextual features captured by a smartphone, including road and traffic conditions, visual scene analysis, audio, weather information, and car speed. We derive a technical design space to inform practitioners and researchers about the most indicative sensing modalities, the corresponding impact on users' privacy, and the computational cost associated with processing this data. Our analysis shows that contextual emotion recognition is significantly more robust than facial recognition, leading to an overall improvement of 7% using a leave-one-participant-out cross-validation.
Article
Full-text available
Anonymization is a practical solution for preserving users' privacy in data publishing. Data owners such as hospitals, banks, social network (SN) service providers, and insurance companies anonymize their users' data before publishing it to protect the privacy of users, whereas the anonymized data remains useful for legitimate information consumers. Many anonymization models, algorithms, frameworks, and prototypes have been proposed/developed for privacy preserving data publishing (PPDP). These models/algorithms anonymize users' data, which is mainly in the form of tables or graphs depending upon the data owners. It is of paramount importance to provide good perspectives of the whole information privacy area involving both tabular and SN data, and recent anonymization research. In this paper, we present a comprehensive survey of SN (i.e., graph) and relational (i.e., tabular) data anonymization techniques used in PPDP. We systematically categorize the existing anonymization techniques into relational and structural anonymization, and present an up-to-date thorough review of existing anonymization techniques and the metrics used for their evaluation. Our aim is to provide deeper insights into the PPDP problem involving both graphs and tabular data, the possible attacks that can be launched on the sanitized published data, the different actors involved in the anonymization scenario, and the major differences in the amount of private information contained in graphs and relational data, respectively. We present various representative anonymization methods that have been proposed to solve privacy problems in application-specific scenarios of SNs. Furthermore, we highlight the user re-identification methods used by malevolent adversaries to re-identify people uniquely from privacy-preserved published data. Additionally, we discuss the challenges of anonymizing both graphs and tabular data, and elaborate promising research directions.
To the best of our knowledge, this is the first work to systematically cover recent PPDP techniques involving both SN and relational data, and it provides a solid foundation for future studies in the PPDP field.
Conference Paper
Full-text available
Ubiquitous computing has also reached the insurance industry in the form of Usage Based Insurance models. Modern rates use sensor data to offer the user suitable pricing models adapted to his character. Our overview shows that insurance companies generally rely on driving behaviour to assess a user in risk categories. Based on the collected data, a new attack using kNN-DTW shows that, with the derived information, the identification of a driver in a group of all users of a vehicle is possible with more than 90% accuracy and therefore may represent a misuse of the data collection. Thus, motivated by the General Data Protection Regulation, questions regarding anonymisation become relevant. The suitability of standard methods known from Big Data is evaluated in the complex scenario Pay-How-You-Drive using both real-world and synthetic data. It shows that there are open questions considering the field of privacy-friendly Pay-How-You-Drive models.
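The kNN-DTW attack rests on dynamic time warping as the distance measure between driving time series, with a nearest-neighbour classifier on top. A minimal DTW sketch follows; the short integer sequences stand in for real speed or acceleration traces, which is an illustrative simplification.

```python
def dtw(a, b):
    """Dynamic time warping distance between two numeric sequences.
    Classic O(len(a) * len(b)) dynamic program over alignment costs."""
    inf = float("inf")
    n, m = len(a), len(b)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, or match
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# Warping absorbs the repeated sample, so these traces match perfectly
print(dtw([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

A kNN classifier using this distance then assigns an unlabeled trip to the driver whose recorded trips are closest, which is exactly why raw driving traces are so re-identifying.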
Article
Full-text available
Existing traffic flow forecasting approaches based on deep learning models achieve excellent success using large volumes of data gathered by governments and organizations. However, these datasets may contain a lot of users' private data, which challenges current prediction approaches, as user privacy has drawn public concern in recent years. Therefore, how to develop accurate traffic prediction while preserving privacy is a significant problem to be solved, and there is a trade-off between these two objectives. To address this challenge, we introduce a privacy-preserving machine learning technique named federated learning and propose a Federated Learning-based Gated Recurrent Unit neural network algorithm (FedGRU) for traffic flow prediction. FedGRU differs from current centralized learning methods and updates universal learning models through a secure parameter aggregation mechanism rather than directly sharing raw data among organizations. In the secure parameter aggregation mechanism, we adopt a Federated Averaging algorithm to reduce the communication overhead during the model parameter transmission process. Furthermore, we design a Joint Announcement Protocol to improve the scalability of FedGRU. We also propose an ensemble clustering-based scheme for traffic flow prediction by grouping the organizations into clusters before applying the FedGRU algorithm. Extensive case studies on a real-world dataset demonstrate that FedGRU can produce predictions that are merely 0.76 km/h worse than the state-of-the-art in terms of mean average error under the privacy preservation constraint, confirming that the proposed model develops accurate traffic predictions without compromising the data privacy.
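The secure parameter aggregation step builds on Federated Averaging. A minimal sketch of the weighted averaging itself follows; flat parameter lists stand in for real GRU weight tensors, and the client sizes are illustrative assumptions.

```python
def fed_avg(client_weights, client_sizes):
    """Federated Averaging: average client parameters weighted by the
    number of local samples each client trained on."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [sum(w[i] * s for w, s in zip(client_weights, client_sizes)) / total
            for i in range(n_params)]

# Two clients with 10 and 30 local samples; the larger client dominates
global_weights = fed_avg([[1.0, 2.0], [3.0, 4.0]], [10, 30])
print(global_weights)  # [2.5, 3.5]
```

Only these parameter vectors cross the network each round; the raw traffic records stay with the organizations that collected them.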
Article
Full-text available
A Vehicular Ad-hoc Network (VANET) is a type of Mobile Ad-hoc Network (MANET) that is used to provide communications between nearby vehicles, and between vehicles and fixed infrastructure on the roadside. VANET is not only used for road safety and driving comfort but also for infotainment. Communication messages in VANET can be used to locate and track vehicles. Tracking can be beneficial for vehicle navigation using Location Based Services (LBS). However, it can lead to threats to the location privacy of vehicle users, since it can profile them and track their physical location. Therefore, to successfully deploy LBS, user privacy is one of the major challenges that must be addressed. In this paper, we propose the Privacy-Preserving Fully Homomorphic Encryption over Advanced Encryption Standard (P2FHE-AES) scheme for LBS queries. This scheme is required for location privacy protection to encourage drivers to use this service without any risk of being pursued. It is implemented using Network Simulator (NS-2), Simulation of Urban Mobility (SUMO), and Cloud simulation (CloudSim). Analysis and evaluation results demonstrate that the P2FHE-AES scheme can preserve the privacy of drivers' future routes in an efficient and secure way. The results prove the feasibility and efficiency of the P2FHE-AES scheme in terms of query response time, query accuracy, throughput, and query overhead.
Article
Full-text available
The concept of cloud computing relies on central large datacentres with huge amounts of computational power. The rapidly growing Internet of Things with its vast amount of data showed that this architecture produces costly, inefficient and in some cases infeasible communication. Thus, fog computing, a new architecture with distributed computational power closer to the IoT devices was developed. So far, this decentralised fog-oriented architecture has only been used for performance and resource management improvements. We show how it could also be used for improving the users’ privacy. For that purpose, we map privacy patterns to the IoT / fog computing / cloud computing architecture. Privacy patterns are software design patterns with the focus to translate “privacy-by-design” into practical advice. As a proof of concept, for each of the used privacy patterns we give an example from a smart vehicle scenario to illustrate how the patterns could improve the users’ privacy.
Article
Full-text available
Incredible amounts of data are being generated by various organizations like hospitals, banks, e-commerce, retail and supply chain, etc. by virtue of digital technology. Not only humans but machines also contribute to data in the form of closed-circuit television streaming, web site logs, etc. Tons of data is generated every minute by social media and smart phones. The voluminous data generated from the various sources can be processed and analyzed to support decision making. However, data analytics is prone to privacy violations. One of the applications of data analytics is recommendation systems, which are widely used by e-commerce sites like Amazon and Flipkart for suggesting products to customers based on their buying habits, leading to inference attacks. Although data analytics is useful in decision making, it can lead to serious privacy concerns. Hence privacy-preserving data analytics became very important. This paper examines various privacy threats, privacy preservation techniques and models with their limitations, and also proposes a data-lake-based modernistic privacy preservation technique to handle privacy preservation in unstructured data.
Conference Paper
Full-text available
The large-scale monitoring of computer users' software activities has become commonplace, e.g., for application telemetry, error reporting, or demographic profiling. This paper describes a principled systems architecture---Encode, Shuffle, Analyze (ESA)---for performing such monitoring with high utility while also protecting user privacy. The ESA design, and its Prochlo implementation, are informed by our practical experiences with an existing, large deployment of privacy-preserving software monitoring. With ESA, the privacy of monitored users' data is guaranteed by its processing in a three-step pipeline. First, the data is encoded to control scope, granularity, and randomness. Second, the encoded data is collected in batches subject to a randomized threshold, and blindly shuffled, to break linkability and to ensure that individual data items get "lost in the crowd" of the batch. Third, the anonymous, shuffled data is analyzed by a specific analysis engine that further prevents statistical inference attacks on analysis results. ESA extends existing best-practice methods for sensitive-data analytics, by using cryptography and statistical techniques to make explicit how data is elided and reduced in precision, how only common-enough, anonymous data is analyzed, and how this is done for only specific, permitted purposes. As a result, ESA remains compatible with the established workflows of traditional database analysis. Strong privacy guarantees, including differential privacy, can be established at each processing step to defend against malice or compromise at one or more of those steps. Prochlo develops new techniques to harden those steps, including the Stash Shuffle, a novel scalable and efficient oblivious-shuffling algorithm based on Intel's SGX, and new applications of cryptographic secret sharing and blinding. We describe ESA and Prochlo, as well as experiments that validate their ability to balance utility and privacy.
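The shuffle-and-threshold stages of the ESA pipeline can be sketched in a few lines: a random permutation severs the link between user and report, and a minimum-count threshold keeps individual reports "lost in the crowd". This is a bare illustration; real Prochlo additionally encodes, encrypts, and randomizes the threshold, and the report values here are invented.

```python
import random
from collections import Counter

def shuffle_analyze(reports, threshold):
    """Shuffler: a random permutation breaks linkability between users and
    reports. Analyzer: only values reported by >= threshold users survive."""
    shuffled = reports[:]
    random.shuffle(shuffled)  # linkability is broken at this step
    counts = Counter(shuffled)
    return {value: count for value, count in counts.items()
            if count >= threshold}

# "appC" is reported by a single user, so it is suppressed entirely
reports = ["appA", "appA", "appB", "appA", "appC", "appB"]
result = shuffle_analyze(reports, threshold=2)
```

The analyzer thus only ever sees common, unattributed values, which is the core of the "lost in the crowd" guarantee described above.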
Article
Full-text available
Vehicular ad hoc networks may one day prevent injuries and reduce transportation costs by enabling new safety and traffic management applications, but these networks raise privacy concerns because they could enable applications to perform unwanted surveillance. Researchers have proposed privacy protocols, measuring privacy performance based on metrics such as k-anonymity. Because of the frequency and precision of location of queries in vehicular applications, privacy measurement may be improved by considering additional factors. This paper defines continuous network location privacy; presents KDT-anonymity, which is a composite metric including average anonymity set size, i.e., K, average distance deviation, i.e., D, and anonymity duration, i.e., T; derives formulas to calculate theoretical values of K, D, and T; evaluates five privacy protocols under realistic vehicle mobility patterns using KDT-anonymity; and compares KDT-anonymity with prior metrics.
Conference Paper
Full-text available
The growing popularity of location-based systems, allowing unknown/untrusted servers to easily collect huge amounts of information regarding users' location, has recently started raising serious privacy concerns. In this paper we introduce geo-indistinguishability, a formal notion of privacy for location-based systems that protects the user's exact location, while allowing approximate information -- typically needed to obtain a certain desired service -- to be released. This privacy definition formalizes the intuitive notion of protecting the user's location within a radius $r$ with a level of privacy that depends on $r$, and corresponds to a generalized version of the well-known concept of differential privacy. Furthermore, we present a mechanism for achieving geo-indistinguishability by adding controlled random noise to the user's location. We describe how to use our mechanism to enhance LBS applications with geo-indistinguishability guarantees without compromising the quality of the application results. Finally, we compare state-of-the-art mechanisms from the literature with ours. It turns out that, among all mechanisms independent of the prior, our mechanism offers the best privacy guarantees.
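The noise mechanism for geo-indistinguishability draws from the planar Laplace distribution, whose angular component is uniform and whose radial component follows a Gamma(2, 1/epsilon) distribution. The sketch below assumes illustrative coordinates, an illustrative epsilon (privacy per metre), and a flat-earth metre-to-degree conversion that is only a local approximation.

```python
import math
import random

def planar_laplace(lat, lon, epsilon):
    """Perturb a location with planar Laplace noise: a uniform angle plus
    a Gamma(2, 1/epsilon)-distributed radial distance (in metres)."""
    theta = random.uniform(0, 2 * math.pi)
    r = random.gammavariate(2, 1 / epsilon)  # expected distance: 2/epsilon m
    # approximate conversion from metres to degrees near the given latitude
    dlat = (r * math.sin(theta)) / 111_111
    dlon = (r * math.cos(theta)) / (111_111 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

random.seed(42)
# epsilon = 0.01 per metre => typical displacement around 200 m
noisy = planar_laplace(50.1109, 8.6821, epsilon=0.01)
```

Smaller epsilon means more noise and a stronger guarantee within any radius $r$, matching the definition in the abstract.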
Conference Paper
Full-text available
We continue a line of research initiated in [10,11] on privacy-preserving statistical databases. Consider a trusted server that holds a database of sensitive information. Given a query function f mapping databases to reals, the so-called true answer is the result of applying f to the database. To protect privacy, the true answer is perturbed by the addition of random noise generated according to a carefully chosen distribution, and this response, the true answer plus noise, is returned to the user. Previous work focused on the case of noisy sums, in which $f = \sum_i g(x_i)$, where $x_i$ denotes the $i$th row of the database and g maps database rows to [0,1]. We extend the study to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f. Roughly speaking, this is the amount that any single argument to f can change its output. The new analysis shows that for several particular applications substantially less noise is needed than was previously understood to be the case. The first step is a very clean characterization of privacy in terms of indistinguishability of transcripts. Additionally, we obtain separation results showing the increased value of interactive sanitization mechanisms over non-interactive ones.
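The calibration described here, noise scaled to the sensitivity of f, is the Laplace mechanism. A minimal sketch follows; the counting query and the epsilon value are illustrative examples, not from the paper.

```python
import random

def laplace_mechanism(true_answer, sensitivity, epsilon):
    """Return the true answer plus Laplace noise with scale b = sensitivity /
    epsilon. A Laplace(0, b) sample is the difference of two Exp(1/b) samples."""
    b = sensitivity / epsilon
    noise = random.expovariate(1 / b) - random.expovariate(1 / b)
    return true_answer + noise

# Counting query: adding or removing one row changes the count by at most 1,
# so the sensitivity of f is 1
random.seed(1)
noisy_count = laplace_mechanism(true_answer=42, sensitivity=1, epsilon=0.5)
```

A low-sensitivity query like a count therefore needs only a little noise, which is exactly the paper's observation that many applications require far less perturbation than a worst-case analysis of noisy sums would suggest.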
Conference Paper
Full-text available
Advances in sensing and tracking technology enable location-based applications but they also create significant privacy risks. Anonymity can provide a high degree of privacy, save service users from dealing with service providers' privacy policies, and reduce the service providers' requirements for safeguarding private information. However, guaranteeing anonymous usage of location-based services requires that the precise location information transmitted by a user cannot be easily used to re-identify the subject. This paper presents a middleware architecture and algorithms that can be used by a centralized location broker service. The adaptive algorithms adjust the resolution of location information along spatial or temporal dimensions to meet specified anonymity constraints based on the entities who may be using location services within a given area. Using a model based on automotive traffic counts and cartographic material, we estimate the realistically expected spatial resolution for different anonymity constraints. The median resolution generated by our algorithms is 125 meters. Thus, anonymous location-based requests for urban areas would have the same accuracy currently needed for E-911 services; this would provide sufficient resolution for wayfinding, automated bus routing services and similar location-dependent services.
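The adaptive spatial-resolution adjustment can be sketched as grid cloaking: coarsen the user's grid cell until it contains at least k users, then report the cell instead of the exact position. This is a simplification of the paper's middleware algorithms (which also cloak temporally); the coordinates, k, and starting cell size are illustrative assumptions.

```python
def cloak(user_pos, all_positions, k, cell=0.001):
    """Report the user's grid cell, doubling the cell side until at least
    k users fall inside it (k-anonymous spatial cloaking)."""
    while True:
        gx, gy = int(user_pos[0] // cell), int(user_pos[1] // cell)
        inside = [p for p in all_positions
                  if int(p[0] // cell) == gx and int(p[1] // cell) == gy]
        if len(inside) >= k:
            # return the cell origin and side length instead of the exact point
            return (gx * cell, gy * cell, cell)
        cell *= 2  # too few users: reduce spatial resolution and retry

# Four users in a city; cloak user 1 among at least 3 others
users = [(50.1101, 8.6803), (50.1109, 8.6821),
         (50.1123, 8.6899), (50.1150, 8.6900)]
region = cloak(users[1], users, k=3)
```

Sparser areas force larger cells, which is exactly the trade-off the paper quantifies with its 125-meter median resolution.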
Article
In the automotive industry, cars now coming off assembly lines are sometimes referred to as "rolling data centers" in acknowledgment of all the entertainment and communications capabilities they contain. The fact that autonomous driving systems are also well along in development does nothing to allay concerns about security. Indeed, it would seem the stakes of automobile cybersecurity are about to become immeasurably higher just as some of the underpinnings of contemporary cybersecurity are rendered moot.
Article
One of the most promising application areas of the Industrial Internet of Things (IIoT) is Vehicular Ad hoc NETworks (VANETs). VANETs are largely used by Intelligent Transportation Systems (ITS) to provide smart and safe road transport. To reduce the network burden, Software Defined Networks (SDNs) act as a remote controller. Motivated by the need for greener IIoT solutions, this paper proposes an energy-efficient end-to-end security solution for Software Defined Vehicular Networks (SDVN). Besides SDN's flexible network management and network performance, an energy-efficient end-to-end security scheme plays a significant role in providing green IIoT services. Thus, the proposed SDVN provides lightweight end-to-end security. The end-to-end security objective is handled at two levels: i) in the RSU-based Group Authentication (RGA) scheme, each vehicle in the RSU range receives a group id-key pair for secure communication, and ii) in the private Collaborative Intrusion Detection System (p-CIDS), the SDVN detects potential intrusions inside the VANET architecture using collaborative learning that guarantees privacy through a fusion of differential privacy and homomorphic encryption schemes. The SDVN is simulated using NS2 & Matlab, and the simulation results show higher energy efficiency through reduced end-to-end security communication cost and decentralized learning compared with other existing mechanisms. In addition, the p-CIDS detects intruders with an accuracy of 96.81% in the SDVN.
Article
Federated learning involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized. Training in heterogeneous and potentially massive networks introduces novel challenges that require a fundamental departure from standard approaches for large-scale machine learning, distributed optimization, and privacy-preserving data analysis. In this article, we discuss the unique characteristics and challenges of federated learning, provide a broad overview of current approaches, and outline several directions of future work that are relevant to a wide range of research communities.
Article
Autonomous vehicles benefit, in various aspects, from the cooperation of the Industrial Internet and Cyber-Physical Systems. Users in autonomous vehicles submit query contents to service providers. However, privacy concerns arise. Existing works on privacy preservation of query contents rely on location perturbation or k-anonymity, and suffer from insufficient protection of privacy or low query utility. To overcome these drawbacks, this paper proposes the notion of client-based personalized k-anonymity (CPkA). To measure the performance of CPkA, we present a privacy metric and a utility metric, and formulate two problems to establish optimal CPkA mechanisms. An approach, comprising two modules, to establish optimal mechanisms is proposed. The first module builds optimal in-group mechanisms. The second module computes an optimal grouping of query contents. These two modules are combined to establish optimal mechanisms. We employ real-life datasets and synthetic prior distributions to validate the effectiveness and efficiency of the established mechanisms.
Chapter
We consider the problem of designing scalable, robust protocols for computing statistics about sensitive data. Specifically, we look at how best to design differentially private protocols in a distributed setting, where each user holds a private datum. The literature has mostly considered two models: the “central” model, in which a trusted server collects users’ data in the clear, which allows greater accuracy; and the “local” model, in which users individually randomize their data, and need not trust the server, but accuracy is limited. Attempts to achieve the accuracy of the central model without a trusted server have so far focused on variants of cryptographic multiparty computation (MPC), which limits scalability.
Article
For privacy concerns to be addressed adequately in today's machine-learning (ML) systems, the knowledge gap between the ML and privacy communities must be bridged. This article aims to provide an introduction to the intersection of both fields with special emphasis on the techniques used to protect the data.
Article
Today’s artificial intelligence still faces two major challenges. One is that, in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated-learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning. We provide definitions, architectures, and applications for the federated-learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allowing knowledge to be shared without compromising user privacy.
Article
Privacy is one of the most important social and political issues in our information society, characterized by a growing range of enabling and supporting technologies and services. Amongst these are communications, multimedia, biometrics, big data, cloud computing, data mining, internet, social networks, and audio-video surveillance. Each of these can potentially provide the means for privacy intrusion. De-identification is one of the main approaches to privacy protection in multimedia contents (text, still images, audio and video sequences and their combinations). It is a process for concealing or removing personal identifiers, or replacing them by surrogate personal identifiers in personal information in order to prevent the disclosure and use of data for purposes unrelated to the purpose for which the information was originally obtained. Based on the proposed taxonomy inspired by the Safe Harbour approach, the personal identifiers, i.e., the personal identifiable information, are classified as non-biometric, physiological and behavioural biometric, and soft biometric identifiers. In order to protect the privacy of an individual, all of the above identifiers will have to be de-identified in multimedia content. This paper presents a review of the concepts of privacy and the linkage among privacy, privacy protection, and the methods and technologies designed specifically for privacy protection in multimedia contents. The study provides an overview of de-identification approaches for non-biometric identifiers (text, hairstyle, dressing style, license plates), as well as for the physiological (face, fingerprint, iris, ear), behavioural (voice, gait, gesture) and soft-biometric (body silhouette, gender, age, race, tattoo) identifiers in multimedia documents.
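The replacement of personal identifiers by surrogate identifiers described above can be sketched with keyed hashing (a minimal illustration, not a method from the cited survey; the key, the VIN example, and the 16-character truncation are assumptions): the surrogate is stable, so records remain linkable, but it cannot be reversed without the key.

```python
import hashlib
import hmac

SECRET_KEY = b"rotate-me-regularly"  # hypothetical key held by the data controller

def pseudonymize(identifier, key=SECRET_KEY):
    """Replace a personal identifier with a surrogate: an HMAC-SHA256
    digest that supports linkage but not re-identification without the key."""
    return hmac.new(key, identifier.encode(), hashlib.sha256).hexdigest()[:16]

vin = "WDB1110001234567"
print(pseudonymize(vin) == pseudonymize(vin))  # True: surrogate is stable
```

Note that pseudonymization alone does not anonymize: as the head of this page observes, it is typically combined with other techniques to reduce re-identification risk.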
Conference Paper
In vehicular ad hoc networks (VANETs), tracking of participants is an issue examined by many research groups. These groups have proposed several different countermeasures against tracking attacks, all of which appear to offer good protection. We pick out two very promising concepts, Mix Zones and Silent Periods, and examine them in a simulation environment to identify their actual strengths and weaknesses. Our simulation results show rather high success rates for attackers with relatively unsophisticated attack heuristics. Furthermore, we confirm the correlation between several influencing factors and the success rates of attacks, and study the connection to the common metrics k-anonymity and entropy.
Article
In recent years, location-based services have become very popular, mainly driven by the availability of modern mobile devices with integrated position sensors. Prominent examples are points of interest finders or geo-social networks such as Facebook Places, Qype, and Loopt. However, providing such services with private user positions may raise serious privacy concerns if these positions are not protected adequately. Therefore, location privacy concepts become mandatory to ensure the user’s acceptance of location-based services. Many different concepts and approaches for the protection of location privacy have been described in the literature. These approaches differ with respect to the protected information and their effectiveness against different attacks. The goal of this paper is to assess the applicability and effectiveness of location privacy approaches systematically. We first identify different protection goals, namely personal information (user identity), spatial information (user position), and temporal information (identity/position + time). Secondly, we give an overview of basic principles and existing approaches to protect these privacy goals. In a third step, we classify possible attacks. Finally, we analyze existing approaches with respect to their protection goals and their ability to resist the introduced attacks.
Conference Paper
Although the privacy threats and countermeasures associated with location data are well known, there has not been a thorough experiment to assess the effectiveness of either. We examine location data gathered from volunteer subjects to quantify how well four different algorithms can identify the subjects' home locations and then their identities using a freely available, programmable Web search engine. Our procedure can identify at least a small fraction of the subjects and a larger fraction of their home addresses. We then apply three different obscuration countermeasures designed to foil the privacy attacks: spatial cloaking, noise, and rounding. We show how much obscuration is necessary to maintain the privacy of all the subjects.
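The three obscuration countermeasures evaluated above can be sketched as follows (a minimal illustration, not the authors' implementation; the grid sizes, noise scale, and example coordinates are assumptions):

```python
import random

def round_coords(lat, lon, decimals=2):
    """Rounding: snap coordinates to a grid (~1.1 km at 2 decimals)."""
    return round(lat, decimals), round(lon, decimals)

def add_noise(lat, lon, scale=0.01):
    """Noise: perturb each coordinate by a uniform random offset."""
    return (lat + random.uniform(-scale, scale),
            lon + random.uniform(-scale, scale))

def spatial_cloak(lat, lon, cell=0.05):
    """Spatial cloaking: report only the corner of the grid cell
    containing the true point."""
    return (lat // cell) * cell, (lon // cell) * cell

home = (50.1109, 8.6821)  # example point in Frankfurt
print(round_coords(*home))
print(spatial_cloak(*home))
```

The experiment's question is then quantitative: how large must `decimals`, `scale`, or `cell` be before home-identification algorithms fail for all subjects.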
Secure computation for machine learning with SPDZ
  • Valerio Pastro
  • Mariana Raykova
Valerie Chen, Valerio Pastro, and Mariana Raykova. 2019. Secure computation for machine learning with SPDZ. arXiv preprint arXiv:1901.00329 (2019).
The Success Of Autonomous Vehicles Hinges On Smart Cities. Inrix Is Making It Easier To Build Them
  • Liane Yvkoff
Liane Yvkoff. 2020. The Success Of Autonomous Vehicles Hinges On Smart Cities. Inrix Is Making It Easier To Build Them. Forbes. https:
Monetizing car data-new service business opportunities to create new customer benefits
  • Michele Bertoncello
  • Gianluca Camplone
  • Paul Gao
  • Hans-Werner Kaas
  • Detlev Mohr
  • Timo Möller
  • Dominik Wee
Michele Bertoncello, Gianluca Camplone, Paul Gao, Hans-Werner Kaas, Detlev Mohr, Timo Möller, and Dominik Wee. 2016. Monetizing car data-new service business opportunities to create new customer benefits. McKinsey & Company (2016).
Big Data and B2B platforms: the next big opportunity for Europe -Report on market deficiencies and regulatory barriers affecting cooperative
  • Alexandra Campmas
  • Nadina Iacob
  • Felice Simonelli
  • Hien Vu
Alexandra Campmas, Nadina Iacob, Felice Simonelli, and Hien Vu. 2021. Big Data and B2B platforms: the next big opportunity for Europe -Report on market deficiencies and regulatory barriers affecting cooperative, connected and automated mobility.
Study on the Technical Evaluation of De-Identification Procedures for Personal Data in the Automotive Sector
  • Kai Rannenberg
  • Sebastian Pape
  • Frederic Tronnier
  • Sascha Löbner
Kai Rannenberg, Sebastian Pape, Frederic Tronnier, and Sascha Löbner. 2021. Study on the Technical Evaluation of De-Identification Procedures for Personal Data in the Automotive Sector. Technical Report. Goethe University Frankfurt. https://doi.org/10.21248/gups.63413
Privacy-preserving classification of personal text messages with secure multi-party computation: An application to hate-speech detection
  • Devin Reich
  • Ariel Todoki
  • Rafael Dowsley
  • Martine De Cock
  • Anderson Ca Nascimento
Devin Reich, Ariel Todoki, Rafael Dowsley, Martine De Cock, and Anderson CA Nascimento. 2019. Privacy-preserving classification of personal text messages with secure multi-party computation: An application to hate-speech detection. arXiv preprint arXiv:1906.02325 (2019).
Protecting Privacy when Disclosing Information: k-Anonymity and its Enforcement Through Generalization and Suppression
  • P Samarati
  • L Sweeney
P Samarati and L Sweeney. 1998. Protecting Privacy when Disclosing Information: k-Anonymity and its Enforcement Through Generalization and Suppression. Proceedings of the IEEE Symposium on Research in Security and Privacy (1998).
FEDLOC: Federated learning framework for data-driven cooperative localization and location data processing
  • Feng Yin
  • Zhidi Lin
  • Yue Xu
  • Qinglei Kong
  • Deshi Li
Feng Yin, Zhidi Lin, Yue Xu, Qinglei Kong, Deshi Li, Sergios Theodoridis, and Shuguang Cui. 2020. FEDLOC: Federated learning framework for data-driven cooperative localization and location data processing. https://doi.org/10.1109/ojsp.2020.3036276 arXiv:2003.03697