Conference Paper

Epistemic privacy

DOI: 10.1145/1376916.1376941
Conference: Proceedings of the Twenty-Seventh ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 2008), June 9-11, 2008, Vancouver, BC, Canada
Source: DBLP


We present a novel definition of privacy in the framework of offline (retroactive) database query auditing. Given information about the database, a description of sensitive data, and assumptions about users' prior knowledge, our goal is to determine whether answering a past user's query could have led to a privacy breach. According to our definition, an audited property A is private, given the disclosure of property B, if no user can gain confidence in A by learning B, subject to prior-knowledge constraints. Privacy is not violated if the disclosure of B causes a loss of confidence in A. The new notion of privacy is formalized using the well-known semantics for reasoning about knowledge, where logical properties correspond to sets of possible worlds (databases) that satisfy these properties. Database users are modelled either as possibilistic agents, whose knowledge is a set of possible worlds, or as probabilistic agents, whose knowledge is a probability distribution on possible worlds. We analyze the new privacy notion, show its relationship to the conventional approach, and derive criteria that allow the auditor to test privacy efficiently in some important cases. In particular, we prove characterization theorems for the possibilistic case, and study in depth the probabilistic case under the assumption that all database records are considered a priori independent by the user, as well as under more relaxed (or absent) prior-knowledge assumptions. In the probabilistic case we show that for certain families of distributions there is no efficient algorithm to test whether an audited property A is private given the disclosure of a property B, assuming P ≠ NP. Nevertheless, for many interesting families, such as the family of product distributions, we obtain algorithms that are efficient both in theory and in practice.
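
As a rough illustration of the probabilistic setting (not the paper's own algorithm), the following Python sketch enumerates all possible worlds of n independent binary records under a product prior and tests whether disclosing a property B could raise a user's confidence in an audited property A, i.e., whether Pr[A | B] > Pr[A]. The names (product_prior, is_private) and the example properties are illustrative assumptions.

from itertools import product

def product_prior(p):
    """Yield (world, probability) pairs for len(p) independent binary records,
    where p[i] is the prior probability that record i equals 1; each 'world'
    is one possible database."""
    for world in product((0, 1), repeat=len(p)):
        prob = 1.0
        for bit, pi in zip(world, p):
            prob *= pi if bit else 1.0 - pi
        yield world, prob

def is_private(A, B, p):
    """Brute-force test of the privacy notion for one probabilistic agent:
    A stays private under disclosure of B iff learning B does not increase
    the agent's confidence in A, i.e. Pr[A | B] <= Pr[A]."""
    pr_A = pr_B = pr_AB = 0.0
    for world, prob in product_prior(p):
        if A(world):
            pr_A += prob
        if B(world):
            pr_B += prob
            if A(world):
                pr_AB += prob
    if pr_B == 0.0:
        return True   # B is impossible under the prior, so nothing is learned
    return pr_AB / pr_B <= pr_A

# Three records, each 1 with prior probability 0.3.  Audited property A:
# "record 0 is 1"; disclosed property B: "at least two records are 1".
p = [0.3, 0.3, 0.3]
A = lambda w: w[0] == 1
B = lambda w: sum(w) >= 2
print(is_private(A, B, p))   # False: Pr[A | B] is about 0.71 > Pr[A] = 0.3

Here the disclosure raises the user's confidence in A from 0.3 to about 0.71, so the auditor would flag it as a breach; a disclosure that only lowers confidence in A remains private under this definition. Note that the enumeration is exponential in the number of records, which is exactly the cost the paper's criteria for product distributions avoid.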

  • ABSTRACT: The collection of digital information by governments, corporations, and individuals has created tremendous opportunities for knowledge- and information-based decision making. Driven by mutual benefits, or by regulations that require certain data to be published, there is a demand for the exchange and publication of data among various parties. Data in its original form, however, typically contains sensitive information about individuals, and publishing such data will violate individual privacy. The current practice in data publishing relies mainly on policies and guidelines as to what types of data can be published and on agreements on the use of published data. This approach alone may lead to excessive data distortion or insufficient protection. Privacy-preserving data publishing (PPDP) provides methods and tools for publishing useful information while preserving data privacy. Recently, PPDP has received considerable attention in research communities, and many approaches have been proposed for different data publishing scenarios. In this survey, we will systematically summarize and evaluate different approaches to PPDP, study the challenges in practical data publishing, clarify the differences and requirements that distinguish PPDP from other related problems, and propose future research directions.
    ACM Computing Surveys 42(4), June 2010. DOI: 10.1145/1749603.1749605
  • ABSTRACT: This paper proposes a way to generate a robot genome that contributes to defining the personality of a software robot or an artificial life form in a mobile phone. The personality should be both complex and feature-rich, yet still plausible by human standards for an emotional life form. However, it becomes increasingly difficult and time-consuming to ensure reliability, variability and consistency in the robot's personality when manually initializing values for the individual genes. To overcome this difficulty, this paper proposes a neural network algorithm for a genetic robot's personality (NNGRP) and an upgraded version of a previously introduced evolutionary algorithm for a genetic robot's personality (EAGRP). Robot genomes for heterogeneous personalities are generated via NNGRP and EAGRP and compared. The implementation is embedded into genetic robots in a mobile phone to verify the feasibility and effectiveness of each algorithm.
    Data & Knowledge Engineering 70(11):923-954, November 2011. DOI: 10.1016/j.datak.2011.06.002
  • ABSTRACT: Extending traditional access control and complementing emerging usage control, inference-usability confinement aims at customising sensitive data to be returned to a client in such a way that the manipulated items are still useful for the recipient but do not enable any usage beyond the intended ones. In the context of a logic-oriented information system, a confinement mechanism generates an inference-proof view of the actually stored instance(s) while interacting with a client. We survey our specific approach to policy-driven inference-usability confinement for a server-client architecture, discussing various parameters and the resulting confinement mechanisms. Basically, the confinement is achieved by enforcing an invariant of the following kind: at any point in time, the information content of the data available to a client does not violate any protection requirement expressed by a declarative confidentiality policy. In this context, the information content of data and, accordingly, the inference-proofness of such data crucially depend on the client's a priori knowledge, general reasoning capabilities and awareness of the confinement mechanism. (A minimal sketch of such an invariant-enforcing censor appears after this list.)
    International Journal of Computational Science and Engineering 7(1):17-37, March 2012. DOI: 10.1504/IJCSE.2012.046178
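
To make the invariant in the last item concrete, here is a minimal possibilistic sketch, assuming a toy model in which possible worlds are tuples of facts and both the policy's secrets and query answers are predicates over worlds. The names (Censor, entails) are hypothetical stand-ins for the logic-oriented machinery surveyed in that paper; this is a generic refusal-based censor, not the authors' mechanism.

def entails(knowledge, fact, worlds):
    """A fact is entailed if it holds in every world consistent with the knowledge."""
    consistent = [w for w in worlds if all(k(w) for k in knowledge)]
    return bool(consistent) and all(fact(w) for w in consistent)

class Censor:
    def __init__(self, worlds, policy, prior_knowledge):
        self.worlds = worlds                      # all possible database instances
        self.policy = policy                      # secrets the client must never infer
        self.knowledge = list(prior_knowledge)    # client's a priori knowledge

    def answer(self, query_result):
        # Release an answer only if the client's knowledge, extended by it,
        # still entails no secret; otherwise refuse, preserving the invariant.
        tentative = self.knowledge + [query_result]
        if any(entails(tentative, s, self.worlds) for s in self.policy):
            return "refused"
        self.knowledge = tentative
        return "released"

# Toy run: two binary facts; the secret is "fact 0 holds".
worlds = [(0, 0), (0, 1), (1, 0), (1, 1)]
censor = Censor(worlds, policy=[lambda w: w[0] == 1], prior_knowledge=[])
print(censor.answer(lambda w: w[0] == 1 or w[1] == 1))  # released: secret not yet entailed
print(censor.answer(lambda w: w[1] == 0))               # refused: together they would entail it

The second answer is refused because, combined with the first, it would leave (1, 0) as the only consistent world, in which the secret holds. A real confinement mechanism must additionally control what a policy-aware client can infer from the refusal itself, which this sketch ignores.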