Abstract

Content-based publish/subscribe provides a loosely-coupled and expressive form of communication for large-scale distributed systems. Confidentiality is a major challenge for publish/subscribe middleware deployed over multiple administrative domains. Encrypted matching allows confidentiality-preserving content-based filtering but has high performance overheads. It may also prevent the use of classical optimizations based on subscription containment. We propose a support mechanism that reduces the cost of encrypted matching, in the form of a prefiltering operator using Bloom filters and simple randomization techniques. This operator greatly reduces the number of encrypted subscriptions that must be matched against incoming encrypted publications. It leverages subscription containment information when available, but also ensures that containment confidentiality is preserved otherwise. We propose containment obfuscation techniques and provide a rigorous security analysis of the information leaked by Bloom filters in this case. We conduct a thorough experimental evaluation of prefiltering under a large variety of workloads. Our results indicate that prefiltering is successful at reducing the space of subscriptions to be tested in all cases. We show that while there is a tradeoff between prefiltering efficiency and information leakage when using containment obfuscation, it is practically possible to obtain good prefiltering performance while securing the technique against potential leakages.
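The prefiltering idea can be illustrated with a short sketch (the parameters and the attribute encoding below are illustrative, not the paper's exact construction; the actual scheme operates alongside encrypted matching): a subscription's equality constraints and a publication's attribute/value pairs are hashed into Bloom filters, and a subscription needs full encrypted matching only if all of its filter bits appear in the publication's filter.

```python
import hashlib

def bloom_bits(items, m=128, k=3):
    """Return the set of bit positions a Bloom filter of m bits
    and k hash functions would set for the given items."""
    bits = set()
    for item in items:
        for i in range(k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            bits.add(int(h, 16) % m)
    return bits

def may_match(pub_bits, sub_bits):
    """Prefilter test: a subscription can match a publication only if
    every bit of its filter is set in the publication's filter.
    False positives are possible; false negatives are not."""
    return sub_bits <= pub_bits

# A publication encodes its attribute/value pairs...
pub = bloom_bits(["symbol=GOOG", "volume=100"])
# ...and a subscription encodes its equality constraints.
sub_hit = bloom_bits(["symbol=GOOG"])
sub_miss = bloom_bits(["symbol=AAPL"])

print(may_match(pub, sub_hit))   # True: passed on to encrypted matching
print(may_match(pub, sub_miss))  # very likely False: skipped entirely
```

Only the subscriptions that survive this cheap set-inclusion test are handed to the expensive encrypted matching operator.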


... P2P architectures can be made highly resilient to failures and scale more cheaply than centralized architectures [4]. P2P-based applications have been successfully developed to support various services (e.g., file sharing [5], video streaming [6], pub-sub [7]). ...
... Figure 2 reports the theoretical bandwidth consumption of P3LS compared to receiving only the stream of interest for various parameters. To plot this graph, we study the performance improvement over k-anonymity for k ∈ [2,3,4,5,6,7], N ∈ [3,4,5,6,7,8,9,10,12], and α ∈ [0.1, 0.4, 0.8]. First, one can observe that plausible deniability provides a wide range of performance and adversary missing rate. ...
Conference Paper
Full-text available
Video consumption is one of the most popular Internet activities worldwide. The emergence of sharing videos directly recorded with smartphones raises important privacy concerns. In this paper we propose P3LS, the first practical privacy-preserving peer-to-peer live streaming system. To protect the privacy of its users, P3LS relies on k-anonymity when users subscribe to streams, and on plausible deniability for the dissemination of video streams. Specifically, plausible deniability during the dissemination phase ensures that an adversary is never able to distinguish a user's stream of interest from the fake streams through statistical analysis (i.e., an analysis of variance). We exhaustively evaluate P3LS and show that adversaries are not able to identify the real stream of a user with very high confidence. Moreover, P3LS consumes 30% less bandwidth than the standard k-anonymity approach where nodes fully contribute to the dissemination of k streams.
... Policy enforcement is based on fully homomorphic encryption (FHE) scheme [17]. In previous works, security approaches [8], [9], [18] have focused on the confidentiality of data. Privacy-protection approaches [19]- [21] have focused on preserving service privacy. ...
... A Bloom Filter is a simple, space-efficient, and randomized data structure for representing a set of strings compactly for efficient membership query [9]. In our framework, an access control policy is encoded into bloom filters and sent as part of access credentials when publishing an encrypted event. ...
... In the context of publish/subscribe systems, the security requirements mainly include the privacy of participant services, the data confidentiality and access control. There are extensive studies [9], [18]- [21], [27], [28] on these security issues of publish/subscribe services. ...
Article
The publish/subscribe paradigm provides a loosely-coupled and scalable communication model for large-scale IoT service systems, such as supervisory control and data acquisition (SCADA). Data confidentiality and service privacy are two crucial security issues for the publish/subscribe model deployed across different domains. Existing access control schemes for such a model cannot address both issues at the same time. In this paper, we propose a comprehensive access control framework (CACF) to bridge this gap. The design principle of the proposed framework is twofold: (a) a bi-directional policy matching scheme for protecting the privacy of IoT services; and (b) a fully homomorphic encryption scheme for encrypting published events to protect data confidentiality. We analyze the correctness and security of CACF. We prototype CACF based on Apache ActiveMQ, an open source message broker, and evaluate its performance. Experimental results indicate that our security system meets the latency requirements for very high quality SCADA services.
... Adding a port to an instance at a specific position in its position space E_tplate is performed via the construct addPort(label, position). As the Router instance has a Ring topology, a port is created at each of the four positions 0, 0.25, 0.5, 0.75 within the position space E_ring = [0, 1[ (see lines 5–8). For the four other instances, a port is set up at an arbitrary position in the position space E_clique of each component and the node in charge of that port will be considered the leader of that clique (lines 11, 15, 19, 23). ...
... More precisely, it was shown in [6] that P(B(f_e) = j) is given by the formula ...
... Starting from a random neighborhood, individual nodes repeatedly select a random neighbor q (line 2), exchange their current neighborhood with that of q (noted Γ(q), line 4), and use the gained information to select more similar neighbors (line 5). Similarly, when receiving a new neighborhood pushed to them, nodes update their local view with the new nodes they have just heard of (lines 6–8). The intuition behind this greedy procedure is that if A is similar to B, and B to C, C is likely to be similar to A as well. ...
Thesis
With the advent of the IoT, Smart Cities, and other large-scale extremely heterogeneous systems, distributed systems are becoming both more complex and pervasive every day and will soon be intractable with current approaches. To hide this growing complexity and facilitate the management of distributed systems at all stages of their life-cycle, this thesis argues for a holistic approach, where the function of a system is considered as a whole, moving away from the behavior of individual components. In parallel to this rise in abstraction levels, basic building blocks need to become more autonomous and able to react to circumstances, to alleviate developers and automate most of the low level operations. We propose three contributions towards this vision : 1) Pleiades: a holistic approach to build complex structures by assembly, easily programmable and supported by an efficient, self-organizing gossip-based run-time engine. 2) Mind-the-Gap: a gossip-based protocol to detect partitions and other large connectivity changes in MANETs, thanks to periodic opportunistic aggregations and a stochastic representation of the network membership. 3) HyFN: an extension to traditional gossip protocols that is able to efficiently solve the k-Furthest-Neighbors problem, which standard methods have been unable to up to now. We believe these three contributions demonstrate our vision is realistic and highlight its attractive qualities.
... As a direct consequence, the brokers' inability to perform computations over sensitive fields also precludes additional optimizations at the broker level, such as leveraging subscription containment. Although determining subscription containment can be seen as a confidentiality liability in some cases [Barazzutti et al. 2012; Barazzutti et al. 2015; Raiciu and Rosenblum 2006], depending on the level of security desired in the pub/sub application domain, it can also constitute a viable tool for improving performance. ...
... This scenario involves periods of high throughput publications and a large number of active subscriptions depending on the stock market activity, which are well suited for optimizations based on the containment support of ASPE. However, these optimizations decrease the confidentiality of the scheme [Barazzutti et al. 2015]. ...
... Another approach is to augment encrypted subscriptions and publications with compact structures that allow brokers to pre-filter subscriptions cheaply. The approach in [Barazzutti et al. 2012; Barazzutti et al. 2015] proposes to augment subscriptions and publications with Bloom filters [Broder et al. 2002] encoding equality constraints for subscriptions and attribute values for publications. By allowing group membership comparisons, the filters reveal when a publication is guaranteed not to match a given subscription. ...
Article
Full-text available
Publish/subscribe (pub/sub) is an attractive communication paradigm for large-scale distributed applications running across multiple administrative domains. Pub/sub allows event-based information dissemination based on constraints on the nature of the data rather than on pre-established communication channels. It is a natural fit for deployment in untrusted environments such as public clouds linking applications across multiple sites. However, pub/sub in untrusted environments leads to major confidentiality concerns stemming from the content-centric nature of the communications. This survey classifies and analyzes different approaches to confidentiality preservation for pub/sub, from applications of trust and access control models to novel encryption techniques. It provides an overview of the current challenges posed by confidentiality concerns and points to future research directions in this promising field.
... Defining the first variant is merely for efficiency purposes: if the ordering is exposed, i.e., the scheme mechanism permits comparing distances between encrypted points, we can leverage it. Furthermore, the structure of the encrypted subscriptions and how they are stored might vary (e.g., if the scheme permits evaluating subscription coverage, encrypted subscriptions could be organized in containment trees [Barazzutti et al., 2017]). Executing 1NN queries sequentially over each pair of encrypted points might, therefore, require reorganizing their storage. ...
... We observed this in previous work [Onica et al., 2015], and noted that it is actually desirable for stronger security reasons. A scheme might also permit evaluating whether an encrypted subscription covers another (i.e., the case where a publication matching the covering subscription will match all covered subscriptions), which can lead to considerable improvement by minimizing the number of evaluated subscriptions [Barazzutti et al., 2017]. However, coverage support can also leak information on the subscription domain [Raiciu and Rosenblum, 2006]. ...
... In future research work, we will further address the above problems. We plan to combine the two-strategy attribute-based authorization [31] and time-limited key management to realize more fine-grained access control and efficient key revocation, and further adopt a Bloom filter [32] to optimize the matching process to achieve fast authentication. ...
Preprint
Full-text available
With the dramatically increasing deployment of intelligent devices, the Internet of Things (IoT) has attracted more attention and developed rapidly. It effectively collects and shares data from the surrounding environment to achieve better IoT services. For data sharing, the publish-subscribe (PS) paradigm provides a loosely-coupled and scalable communication model. However, due to this loosely-coupled nature, it is vulnerable to many attacks, resulting in security threats to the IoT system, and it cannot by itself provide basic security mechanisms such as authentication and confidentiality to ensure data security. Thus, to protect system security and users' privacy, this paper presents a secure blockchain-based privacy-preserving access control scheme for PS systems, which adopts fully homomorphic encryption (FHE) to ensure the confidentiality of published events, and leverages the ledger to store the large volume of data events and to access cross-domain information. Finally, we analyze the correctness and security of our scheme; moreover, we deploy our proposed prototype system on two computers and evaluate its performance. The experimental results show that our PS system can efficiently achieve a balance between system cost and security requirements.
... Confidentiality in pub/sub systems has been widely studied [2,7,10,20,23]. Several works have been proposed to ensure publications' confidentiality based on encryption techniques [1,11,14,17,25,26]. ...
Conference Paper
Full-text available
User revocation is one of the main security issues in publish and subscribe (pub/sub) systems. Indeed, to ensure data confidentiality, the system should be able to remove malicious subscribers without affecting the functionalities and decoupling of authorised subscribers and publishers. Solutions to revoke a user exist, but they inevitably introduce high computation and communication overheads, which can ultimately affect the system's capabilities. In this paper, we propose a novel revocation technique for pub/sub systems that can efficiently remove compromised subscribers without requiring regeneration and redistribution of new keys or re-encryption of existing data with those keys. Our proposed solution is such that a subscriber's interest is not revealed to curious brokers and published data can only be accessed by the authorised subscribers. Finally, the proposed protocol is secure against collusion attacks between brokers and revoked subscribers.
Article
The publish/subscribe paradigm provides loosely coupled and scalable communication for the Internet of Things (IoT). In this paradigm, access control is an efficient approach to guaranteeing security. However, existing access control methods are not suitable for the publish/subscribe paradigm in the sensing layer of the IoT due to their coarse-grained controls and lack of self-configuration. To address these problems, in this paper, we propose a topic-centric access control model (TCAC) to realize fine-grained authorization for the sensing layer of the IoT. First, we use topics, a fundamental concept of the publish/subscribe paradigm, as the basic access control unit to dynamically authorize access according to the attributes of devices, users, and topics. Second, an administration model for TCAC is proposed to manage these attributes and configure access policies to effectively implement user-driven access controls. Finally, a healthcare case is used to demonstrate the security of the proposed TCAC. The results show that our model is dynamic, fine-grained, and user-driven.
Conference Paper
Online Social Networks (OSNs) are the core of many social interactions nowadays. Privacy is an important building block of free societies, and thus of OSNs. Therefore, OSNs should support privacy-enabled communication between the citizens that participate. OSNs function as group communication systems and can be built in centralized or distributed styles. In a centralized system, privacy is under the sole control of a single entity; if this entity is distributed, privacy can be improved as all participants contribute to it. Peer-to-peer-based group communication systems overcome this issue, at the cost of large messaging overhead. The message overhead is mainly caused by early message duplication due to disjunct routing paths. In this paper, we introduce ant colony optimization to reduce the messaging overhead in peer-to-peer-based group communication systems, bridging the gap between privacy and efficiency. To optimize disjunct routing paths, we apply our adapted privacy-sensitive ant colony optimization to encourage the re-use and aggregation of known paths. Our results indicate a 9–31% lower messaging overhead compared to the state of the art. Moreover, our ant colony optimization-based method reuses paths without leaking additional information, that is, we maintain the anonymity sets so that participants remain probably innocent.
Article
Full-text available
Content-Based Publish-Subscribe (CBPS) is an asynchronous messaging paradigm that supports a highly dynamic and many-to-many communication pattern based on the content of the messages themselves. In general, a CBPS system has three distinct parties working in a highly decoupled fashion: Content Publishers, Content Brokers, and Subscribers. The ability to seamlessly scale on demand has made CBPS systems the choice for distributing messages/documents produced by Content Publishers to many Subscribers through Content Brokers. Most current systems assume that Content Brokers are trusted for the confidentiality of the data published by Content Publishers and the privacy of the subscriptions, which specify their interests, made by Subscribers. However, with the increased use of technologies such as service-oriented architectures and cloud computing, essentially outsourcing the broker functionality to third-party providers, one can no longer assume this trust relationship to hold. The problem of providing privacy/confidentiality in CBPS systems is challenging, since the solution should allow Content Brokers to make routing decisions based on the content without revealing the content to them. The problem may appear unsolvable since it involves conflicting goals, but in this paper we propose a novel approach to preserve the privacy of the subscriptions made by Subscribers and the confidentiality of the data published by Content Publishers using cryptographic techniques when third-party Content Brokers are utilized to make routing decisions based on the content. We analyze the security of our approach to show that it is indeed sound and provide experimental results to show that it is practical.
Conference Paper
Full-text available
The publish/subscribe model offers a loosely-coupled communication paradigm where applications interact indirectly and asynchronously. Publisher applications generate events that are sent to interested applications through a network of brokers. Subscriber applications express their interest by specifying filters that brokers can use for routing the events. Supporting confidentiality of messages being exchanged is still challenging. First of all, it is desirable that any scheme used for protecting the confidentiality of both the events and filters should not require the publishers and subscribers to share secret keys. In fact, such a restriction is against the loose-coupling of the model. Moreover, such a scheme should not restrict the expressiveness of filters and should allow the broker to perform event filtering to route the events to the interested parties. Existing solutions do not fully address those issues. In this paper, we provide a novel scheme that supports (i) confidentiality for events and filters; (ii) filters can express very complex constraints on events even if brokers are not able to access any information on both events and filters; (iii) and finally it does not require publishers and subscribers to share keys.
Article
Full-text available
Many network solutions and overlay networks utilize probabilistic techniques to reduce information processing and networking costs. This survey article presents a number of frequently used and useful probabilistic techniques. Bloom filters and their variants are of prime importance, and they are heavily used in various distributed systems. This has been reflected in recent research and many new algorithms have been proposed for distributed systems that are either directly or indirectly based on Bloom filters. In this survey, we give an overview of the basic and advanced techniques, reviewing over 20 variants and discussing their application in distributed systems, in particular for caching, peer-to-peer systems, routing and forwarding, and measurement data summarization.
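As a concrete illustration of the basic structure this survey builds on, the following is a minimal Bloom filter with the classical false-positive approximation (the filter size, hash count, and hash construction are illustrative choices, not prescribed by the survey):

```python
import hashlib
import math

class BloomFilter:
    def __init__(self, m=1024, k=4):
        self.m, self.k = m, k
        self.bits = 0          # bit array packed into a Python int
        self.n = 0             # number of inserted elements

    def _positions(self, item):
        # Derive k positions from k salted hashes of the item.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p
        self.n += 1

    def __contains__(self, item):
        # May return True for an item never added (false positive),
        # but never False for an item that was added.
        return all(self.bits >> p & 1 for p in self._positions(item))

    def fp_rate(self):
        # Classical approximation: (1 - e^(-kn/m))^k
        return (1 - math.exp(-self.k * self.n / self.m)) ** self.k

bf = BloomFilter()
for word in ["cache", "route", "forward"]:
    bf.add(word)
print("route" in bf)             # True (membership is never missed)
print(round(bf.fp_rate(), 6))    # small for n=3, m=1024, k=4
```

The space efficiency comes from storing only m bits regardless of element size, at the price of the false-positive rate computed above.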
Conference Paper
Full-text available
For over fifty years, “record linkage” procedures have been refined to integrate data in the face of typographical and semantic errors. These procedures are traditionally performed over personal identifiers (e.g., names), but in modern decentralized environments, privacy concerns have led to regulations that require the obfuscation of such attributes. Various techniques have been proposed to resolve the tension, including secure multi-party computation protocols, however, such protocols are computationally intensive and do not scale for real world linkage scenarios. More recently, procedures based on Bloom filter encoding (BFE) have gained traction in various applications, such as healthcare, where they yield highly accurate record linkage results in a reasonable amount of time. Though promising, no formal security analysis has been designed or applied to this emerging model, which is of concern considering the sensitivity of the corresponding data. In this paper, we introduce a novel attack, based on constraint satisfaction, to provide a rigorous analysis for BFE and guidelines regarding how to mitigate risk against the attack. In addition, we conduct an empirical analysis with data derived from public voter records to illustrate the feasibility of the attack. Our investigations show that the parameters of the BFE protocol can be configured to make it relatively resilient to the proposed attack without significant reduction in record linkage performance.
Conference Paper
Full-text available
Cloud Computing resources are handled through control interfaces. It is through these interfaces that new machine images can be added, existing ones can be modified, and instances can be started or ceased. Effectively, a successful attack on a Cloud control interface grants the attacker complete power over the victim's account, with all the stored data included. In this paper, we provide a security analysis of the control interfaces of a large Public Cloud (Amazon) and a widely used Private Cloud software (Eucalyptus). Our research results are alarming: with regard to the Amazon EC2 and S3 services, the control interfaces could be compromised via novel signature wrapping and advanced XSS techniques. Similarly, the Eucalyptus control interfaces were vulnerable to classical signature wrapping attacks, and had nearly no protection against XSS. As a follow-up to those discoveries, we additionally describe countermeasures against these attacks, and introduce a novel "black box" analysis methodology for public Cloud interfaces.
Conference Paper
Full-text available
Service providers like Google and Amazon are moving into the SaaS (Software as a Service) business. They turn their huge infrastructure into a cloud-computing environment and aggressively recruit businesses to run applications on their platforms. To enforce security and privacy on such a service model, we need to protect the data running on the platform. Unfortunately, traditional encryption methods that aim at providing "unbreakable" protection are often not adequate because they do not support the execution of applications such as database queries on the encrypted data. In this paper we discuss the general problem of secure computation on an encrypted database and propose a SCONEDB (Secure Computation ON an Encrypted DataBase) model, which captures the execution and security requirements. As a case study, we focus on the problem of k-nearest neighbor (kNN) computation on an encrypted database. We develop a new asymmetric scalar-product-preserving encryption (ASPE) that preserves a special type of scalar product. We use ASPE to construct two secure schemes that support kNN computation on encrypted data; each of these schemes is shown to resist practical attacks of a different background knowledge level, at a different overhead cost. Extensive performance studies are carried out to evaluate the overhead and the efficiency of the schemes.
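The scalar-product-preserving property at the heart of ASPE can be shown in a toy 2-D sketch (the key matrix and points below are made-up values, and the full scheme adds further safeguards): database points p are encrypted as M^T p and queries q as M^{-1} q, so (M^T p) · (M^{-1} q) = p · q, and comparisons built on scalar products survive encryption.

```python
# Toy demonstration of scalar-product preservation under an
# invertible key matrix M (2x2 for simplicity; values are arbitrary).

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def transpose(M):
    return [[M[j][i] for j in range(len(M))] for i in range(len(M))]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

M = [[3.0, 1.0], [2.0, 5.0]]        # secret invertible key matrix
p = [4.0, 7.0]                       # database point
q = [2.0, 6.0]                       # query point

enc_p = mat_vec(transpose(M), p)     # "ciphertext" of p: M^T p
enc_q = mat_vec(inverse_2x2(M), q)   # "ciphertext" of q: M^-1 q

print(dot(p, q))          # 50.0
print(dot(enc_p, enc_q))  # 50.0 (up to floating-point error)
```

Because (M^T p)^T (M^{-1} q) = p^T M M^{-1} q = p^T q, the broker-side party can compute scalar products without ever seeing the plaintext coordinates.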
Conference Paper
Full-text available
One of the main challenges faced by content-based publish/subscribe systems is handling large amounts of dynamic subscriptions and publications in a multidimensional content space. To reduce subscription forwarding load and speed up content matching, subscription covering, subsumption and merging techniques have been proposed. In this paper we propose MICS, Multidimensional Indexing for Content Space, that provides an efficient representation and processing model for large numbers of subscriptions and publications. MICS creates a one-dimensional representation for publications and subscriptions using a Hilbert space filling curve. Based on this representation, we propose novel content matching and subscription management (covering, subsumption and merging) algorithms. Our experimental evaluation indicates that the proposed approach significantly speeds up subscription management operations compared to the naive linear approach.
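MICS linearizes the content space with a Hilbert curve. As a simpler stand-in for illustration, the sketch below uses Morton (Z-order) encoding, another space-filling curve with the same purpose; the bit width and coordinates are arbitrary, and the paper itself uses Hilbert ordering, which preserves locality better:

```python
def z_order(x, y, bits=8):
    """Interleave the bits of (x, y) into a single Z-order index.
    Nearby points in 2-D tend to receive nearby 1-D indices, which
    is what makes covering/merging checks cheaper in one dimension."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # x supplies even bits
        code |= ((y >> i) & 1) << (2 * i + 1)   # y supplies odd bits
    return code

# A 2-D publication/subscription point maps to one sortable integer.
print(z_order(3, 5))   # 39
print(z_order(0, 0))   # 0
```

Once every multidimensional point has a one-dimensional index, subscriptions can be kept sorted and range-scanned instead of compared pairwise against every publication.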
Conference Paper
Full-text available
Achieving expressive and efficient content-based routing in publish/subscribe systems is a difficult problem. Traditional approaches prove to be either inefficient or severely limited in their expressiveness and flexibility. We present a novel routing method, based on Bloom filters, which shows high efficiency while simultaneously preserving the flexibility of content-based schemes. The resulting implementation is a fast, flexible and fully decoupled content-based publish/subscribe system.
Article
Full-text available
Well adapted to the loosely coupled nature of distributed interaction in large-scale applications, the publish/subscribe communication paradigm has recently received increasing attention. With systems based on the publish/subscribe interaction scheme, subscribers register their interest in an event, or a pattern of events, and are subsequently asynchronously notified of events generated by publishers. Many variants of the paradigm have recently been proposed, each variant being specifically adapted to some given application or network model. This paper factors out the common denominator underlying these variants: full decoupling of the communicating entities in time, space, and synchronization. We use these three decoupling dimensions to better identify commonalities and divergences with traditional interaction paradigms. The many variations on the theme of publish/subscribe are classified and synthesized. In particular, their respective benefits and shortcomings are discussed both in terms of interfaces and implementations.
Conference Paper
Full-text available
Users of content-based publish/subscribe systems (CBPS) are interested in receiving data items with values that satisfy certain conditions. Each user submits a list of subscription specifications to a broker, which routes data items from publishers to users. When a broker receives a notification that contains a value from a publisher, it forwards it only to the subscribers whose requests match the value. However, in many applications, the data published are confidential, and their contents must not be revealed to brokers. Furthermore, a user's subscription may contain sensitive information that must be protected from brokers. Therefore, a difficult challenge arises: how to route publisher data to the appropriate subscribers without the intermediate brokers learning the plaintext values of the notifications and subscriptions. To that end, brokers must be able to perform operations on top of the encrypted contents of subscriptions and notifications. Such operations may be as simple as equality match, but often require more complex operations such as determining inclusion of data in a value interval. Previous work attempted to solve this problem by using one-way data mappings or specialized encryption functions that allow evaluation of conditions on ciphertexts. However, such operations are computationally expensive, and the resulting CBPS lack scalability. As fast dissemination is an important requirement in many applications, we focus on a new data transformation method called Asymmetric Scalar-product Preserving Encryption (ASPE) [1]. We devise methods that build upon ASPE to support private evaluation of several types of conditions. We also suggest techniques for secure aggregation of notifications, supporting functions such as sum, minimum, maximum and count. Our experimental evaluation shows that ASPE-based CBPS incurs 65% less overhead for exact-match filtering and 50% less overhead for range filtering compared to the state-of-the-art.
Article
Full-text available
Content-based publish-subscribe is an efficient communication paradigm that supports dynamic, many-to-many data dissemination in a distributed environment. A publish-subscribe system deployed over a wide-area network must handle information dissemination across distinct authoritative domains and heterogeneous platforms. Such an environment raises serious security concerns. This paper describes a practical scheme that preserves confidentiality against eavesdroppers for private content-based publish-subscribe systems over public networks. In this scheme, publications and subscriptions are encrypted, while the publish-subscribe infrastructure is able to make correct routing decisions based on encrypted publications and subscriptions. Plaintexts are not revealed in the infrastructure for the purpose of security and efficiency. This scheme efficiently supports interval-matching as a predicate function for subscriptions. The security of this scheme is analyzed, and further improved by several techniques.
Conference Paper
Full-text available
Real-world entities are not always represented by the same set of features in different data sets. Therefore matching and linking records corresponding to the same real-world entity distributed across these data sets is a challenging task. If the data sets contain private information, the problem becomes even harder due to privacy concerns. Existing solutions to this problem mostly follow two approaches: sanitization techniques and cryptographic techniques. The former achieves privacy by perturbing sensitive data at the expense of degrading matching accuracy. The latter, on the other hand, attains both privacy and high accuracy under heavy communication and computation costs. In this paper, we propose a method that combines these two approaches and enables users to trade off between privacy, accuracy and cost. Experiments conducted on real data sets show that our method has significantly lower costs than cryptographic techniques and yields much more accurate matching results compared to sanitization techniques, even when the data sets are perturbed extensively.
Article
Content-Based Publish/Subscribe (CBPS) is an interaction model where the interests of subscribers are stored in a content-based forwarding infrastructure to guide routing of notifications to interested parties. In this paper, we focus on answering the following question: Can we implement content-based publish/subscribe while keeping subscriptions and notifications confidential from the forwarding brokers? Our contributions include a systematic analysis of the problem, providing a formal security model and showing that the maximum level of attainable security in this setting is restricted. We focus on enabling provable confidentiality for commonly used applications and subscription languages in CBPS and present a series of practical provably secure protocols, some of which are novel and others adapted from existing work. We have implemented these protocols in SIENA, a popular CBPS system. Evaluation results show that confidential content-based publish/subscribe is practical: A single broker serving 1000 subscribers is able to route more than 100 notifications per second with our solutions.
Chapter
This chapter introduces PADRES, a publish/subscribe model with the capability to correlate events, uniformly access data produced in the past and future, balance the traffic load among brokers, and handle network failures. The new model can filter, aggregate, correlate and project any combination of historic and future data. A flexible architecture is proposed, consisting of distributed and replicated data repositories that can be provisioned in ways that trade off availability, storage overhead, query overhead, query delay, load distribution, parallelism, redundancy and locality. This chapter gives a detailed overview of the PADRES content-based publish/subscribe system. Several example applications are presented in detail that can benefit from the content-based nature of the publish/subscribe paradigm and take advantage of its scalability and robustness features.
Conference Paper
Content-based routing is widely used in large-scale distributed systems as it provides a loosely-coupled yet expressive form of communication: consumers of information register their interests by means of subscriptions, which are subsequently used to determine the set of recipients of every message published in the system. A major challenge of content-based routing is security. Although some techniques have been proposed to perform matching of encrypted subscriptions against encrypted messages, their computational cost is very high. To speed up that process, it was recently proposed to embed Bloom filters in both subscriptions and messages to reduce the space of subscriptions that need to be tested. In this article, we provide a comprehensive analysis of the information leaked by Bloom filters when implementing such a "prefiltering" strategy. The main result is that although there is a fundamental trade-off between prefiltering efficiency and information leakage, it is practically possible to obtain good prefiltering while securing the scheme against leakages with some simple randomization techniques.
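The prefiltering test described in this abstract can be sketched as a bitwise containment check: a subscription can only match a publication if every bit of the subscription's Bloom filter is also set in the publication's filter. Below is a minimal sketch; the filter size M, hash count K, and the SHA-256-based position derivation are illustrative assumptions, not the paper's actual implementation:

```python
import hashlib

M, K = 128, 3  # illustrative filter size (bits) and hash count

def bloom_bits(words):
    """Encode a set of strings as an M-bit integer, setting K positions per word."""
    bits = 0
    for w in words:
        d = hashlib.sha256(w.encode()).digest()
        for i in range(K):
            bits |= 1 << (int.from_bytes(d[4 * i:4 * i + 4], "big") % M)
    return bits

def may_match(sub_bits, pub_bits):
    """Prefilter: the subscription can only match if all of its bits are set
    in the publication's filter. False positives fall through to the expensive
    encrypted-matching stage; no true match is ever discarded."""
    return sub_bits & ~pub_bits == 0
```

A subscription on {"price", "stock"} prefilters positively against a publication carrying {"price", "stock", "volume"}, since its bit positions are a subset of the publication's, while unrelated subscriptions are almost always rejected before any encrypted matching is attempted.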
Conference Paper
By routing messages based on their content, publish/subscribe (pub/sub) systems remove the need to establish and maintain fixed communication channels. Pub/sub is a natural candidate for designing large-scale systems, composed of applications running in different domains and communicating via middleware solutions deployed on a public cloud. Such pub/sub systems must provide high throughput, filtering thousands of publications per second matched against hundreds of thousands of registered subscriptions with low and predictable delays, and must scale horizontally and vertically. As large-scale application composition may require complex publications and subscriptions representations, pub/sub system designs should not rely on the specific characteristics of a particular filtering scheme for implementing scalability. In this paper, we depart from the use of broker overlays, where each server must support the whole range of operations of a pub/sub service, as well as overlay management and routing functionality. We propose instead a novel and pragmatic tiered approach to obtain high-throughput and scalable pub/sub for clusters and cloud deployments. We separate the three operations involved in pub/sub and leverage their natural potential for parallelization. Our design, named StreamHub, is oblivious to the semantics of subscriptions and publications. It can support any type and number of filtering operations implemented by independent libraries. Experiments on a cluster with up to 384 cores indicate that StreamHub is able to register 150 K subscriptions per second and filter nearly 2 K publications against 100 K stored subscriptions, resulting in nearly 400 K notifications sent per second. Comparisons against a broker overlay solution show an improvement of two orders of magnitude in throughput when using the same number of cores.
Conference Paper
Data protection is a challenge when outsourcing medical analysis, especially if one is dealing with patient related data. While securing transfer channels is possible using encryption mechanisms, protecting the data during analyses is difficult as it usually involves processing steps on the plain data. A common use case in bioinformatics is when a scientist searches for a biological sequence of amino acids or DNA nucleotides in a library or database of sequences to identify similarities. Most such search algorithms are optimized for speed with less or no consideration for data protection. Fast algorithms are especially necessary because of the immense search space represented for instance by the genome or proteome of complex organisms. We propose a new secure exact term search algorithm based on Bloom filters. Our algorithm retains data privacy by using Obfuscated Bloom filters while maintaining the performance needed for real-life applications. The results can then be further aggregated using Homomorphic Cryptography to allow exact-match searching. The proposed system facilitates outsourcing exact term search of sensitive data to on-demand resources in a way which conforms to best practice of data protection.
Conference Paper
Content-based publish/subscribe is an appealing paradigm for building large-scale distributed applications. Such applications are often deployed over multiple administrative domains, some of which may not be trusted. Recent attacks in public clouds indicate that a major concern in untrusted domains is the enforcement of privacy. By routing data based on subscriptions evaluated on the content of publications, publish/subscribe systems can expose critical information to unauthorized parties. Information leakage can be avoided by means of privacy-preserving filtering, which is supported by several mechanisms for encrypted matching. Unfortunately, all existing approaches have in common a high performance overhead and the difficulty of using classical optimizations for content-based filtering, such as per-attribute containment. In this paper, we propose a novel mechanism that greatly reduces the cost of supporting privacy-preserving filtering based on encrypted matching operators. It is based on a pre-filtering stage that can be combined with containment graphs, if available. Our experiments indicate that pre-filtering is able to significantly reduce the number of encrypted matching operations for a variety of workloads, and therefore the costs associated with the cryptographic mechanisms. Furthermore, our analysis shows that the additional data structures used for pre-filtering have very limited impact on the effectiveness of privacy preservation.
Conference Paper
Private matching solutions allow two parties to find common data elements over their own datasets without revealing any additional private information. We propose a new concept involving an intermediate entity in the private matching process: we consider the problem of broker-based private matching where end-entities do not interact with each other but communicate through a third entity, namely the Broker, which only discovers the number of matching elements. Although introducing this third entity enables a complete decoupling between end-entities (which may even not know each other), this advantage comes at the cost of higher exposure in terms of privacy and security. After defining the security requirements dedicated to this new concept, we propose a complete solution which combines searchable encryption techniques together with counting Bloom filters to preserve the privacy of end-entities and provide the proof of the matching correctness, respectively.
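Counting Bloom filters, as used by the broker in the abstract above, generalize Bloom filters by replacing bits with counters, which supports removal and multiplicity estimates. A minimal illustrative sketch (the parameters are arbitrary, and this omits the searchable-encryption layer the paper combines it with):

```python
import hashlib

class CountingBloomFilter:
    """Counting Bloom filter: counters instead of bits, so elements can be
    removed and their multiplicity estimated (as an upper bound)."""

    def __init__(self, m=512, k=4):
        self.m, self.k = m, k
        self.counts = [0] * m

    def _positions(self, item):
        # Derive k counter positions from one SHA-256 digest (illustrative).
        d = hashlib.sha256(item.encode()).digest()
        return [int.from_bytes(d[4 * i:4 * i + 4], "big") % self.m
                for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.counts[p] += 1

    def remove(self, item):
        for p in self._positions(item):
            self.counts[p] -= 1

    def estimate(self, item):
        # The minimum counter over the item's positions upper-bounds
        # how many times it was inserted.
        return min(self.counts[p] for p in self._positions(item))
```

Because counters only ever over-count under collisions, the estimate is an upper bound: a broker summing such estimates learns (approximately) how many elements match without seeing the elements themselves.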
Conference Paper
The publish/subscribe model offers a loosely-coupled communication paradigm where applications interact indirectly and asynchronously. Publisher applications generate events that are forwarded to subscriber applications by a network of brokers. Subscribers register by specifying filters that brokers match against events as part of the routing process. Brokers might be deployed on untrusted servers where malicious entities can get access to events and filters. Supporting confidentiality of events and filters in this setting is still an open challenge. First of all, it is desirable that publishers and subscribers do not share secret keys, such a requirement being against the loose coupling of the model. Second, brokers need to route events by matching encrypted events against encrypted filters. This should be possible even with very complex filters. Existing solutions do not fully address these issues. This work describes the implementation of a novel scheme that supports (i) confidentiality for events and filters; (ii) filters that express very complex constraints on events even if brokers are not able to access any information on both events and filters; and (iii) does not require publishers and subscribers to share keys. We then describe an e-Health application scenario for monitoring patients with chronic diseases and show how our encryption scheme can be used to provide confidentiality of the patients' personal and medical data, and control who can receive the patients' data and under which conditions.
Conference Paper
Bloom filters provide a space- and time-efficient means to check the inclusion of an element in a set. In some applications it is beneficial if the set represented by the Bloom filter is only revealed to authorized parties. Particularly, operations data in supply chain management can be very sensitive, and Bloom filters can be applied to supply chain integrity validation. Despite the protection of the represented set, Bloom filter operations, such as the verification of set inclusion, need to remain feasible. In this paper we present privacy-preserving, publicly verifiable Bloom filters which offer both privacy for the represented set and public Bloom filter operations. We give security proofs in the standard model.
Conference Paper
Publish/Subscribe systems have become a prevalent model for delivering data from producers (publishers) to consumers (subscribers) distributed across wide-area networks while decoupling the publishers and the subscribers from each other. In this paper we present Meghdoot, which adapts content-based publish/subscribe systems to Distributed Hash Table based P2P networks in order to provide scalable content delivery mechanisms while maintaining the decoupling between the publishers and the subscribers. Meghdoot is designed to adapt to highly skewed data sets, which is typical of real applications. The experimental results demonstrate that Meghdoot balances the load among the peers and the design scales well with increasing number of peers, subscriptions and events.
Conference Paper
Online applications are vulnerable to theft of sensitive information because adversaries can exploit software bugs to gain access to private data, and because curious or malicious administrators may capture and leak data. CryptDB is a system that provides practical and provable confidentiality in the face of these attacks for applications backed by SQL databases. It works by executing SQL queries over encrypted data using a collection of efficient SQL-aware encryption schemes. CryptDB can also chain encryption keys to user passwords, so that a data item can be decrypted only by using the password of one of the users with access to that data. As a result, a database administrator never gets access to decrypted data, and even if all servers are compromised, an adversary cannot decrypt the data of any user who is not logged in. An analysis of a trace of 126 million SQL queries from a production MySQL server shows that CryptDB can support operations over encrypted data for 99.5% of the 128,840 columns seen in the trace. Our evaluation shows that CryptDB has low overhead, reducing throughput by 14.5% for phpBB, a web forum application, and by 26% for queries from TPC-C, compared to unmodified MySQL. Chaining encryption keys to user passwords requires 11--13 unique schema annotations to secure more than 20 sensitive fields and 2--7 lines of source code changes for three multi-user web applications.
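CryptDB's equality queries rely on deterministic encryption, where equal plaintexts yield equal ciphertexts so the untrusted server can compare them without decrypting. The toy sketch below uses a keyed HMAC token as a stand-in for a deterministic scheme; unlike CryptDB's actual onion construction it is not decryptable, but it shows how blind equality matching works:

```python
import hmac, hashlib

def det_token(key: bytes, value: str) -> str:
    """Deterministic per-column token: equal values give equal tokens, so the
    untrusted server can evaluate WHERE col = ? without knowing the key."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()

key = b"column-key"  # hypothetical per-column key held by the trusted proxy
rows = ["alice", "bob", "alice"]
tokens = [det_token(key, r) for r in rows]

# Server-side equality: count stored rows equal to the queried token.
query = det_token(key, "alice")
matches = sum(t == query for t in tokens)
print(matches)  # 2
```

The design cost is visible even in this sketch: the server learns which rows are equal to each other (frequency information), which is exactly the leakage CryptDB's adjustable, layered encryption tries to confine to the columns that need equality queries.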
Conference Paper
Two convincing paradigms have emerged for achieving scalability in widely distributed systems: publish/subscribe communication and role-based, policy-driven control of access to the system by applications. A strength of publish/subscribe is its many-to-many communication paradigm and loose coupling of components, so that publishers need not know the recipients of their data and subscribers need not know the number and location of publishers. But some data is sensitive, and its visibility must be controlled carefully for personal and legal reasons. We describe the requirements of several application domains where the event-based paradigm is appropriate yet where security is an issue. Typical are the large-scale systems required by government and public bodies for domains such as healthcare, police, transport and environmental monitoring. We discuss how a publish/subscribe service can be secured; firstly by specifying and enforcing access control policy at the service API, and secondly by enforcing the security and privacy aspects of these policies within the service network itself. Finally, we describe an alternative to whole-message encryption, appropriate for highly sensitive and long-lived data destined for specific domains with varied requirements. We outline our investigations and findings from several research projects in these areas.
Conference Paper
Privacy and confidentiality are crucial issues in content-based publish/subscribe (CBPS) networks. We tackle the problem of end-user privacy in CBPS. This problem raises a challenging requirement for handling encrypted data for the purpose of routing based on protected content and encrypted subscription information. We suggest a solution based on a commutative multiple encryption scheme in order to allow brokers to operate in-network matching and content based routing without having access to the content of the packets. This is the first solution that avoids key sharing among end-users and targets an enhanced CBPS model where brokers can also be subscribers at the same time.
Article
In this paper trade-offs among certain computational factors in hash coding are analyzed. The paradigm problem considered is that of testing a series of messages one-by-one for membership in a given set of messages. Two new hash-coding methods are examined and compared with a particular conventional hash-coding method. The computational factors considered are the size of the hash area (space), the time required to identify a message as a nonmember of the given set (reject time), and an allowable error frequency. The new methods are intended to reduce the amount of space required to contain the hash-coded information from that associated with conventional methods. The reduction in space is accomplished by exploiting the possibility that a small fraction of errors of commission may be tolerable in some applications, in particular, applications in which a large amount of data is involved and a core resident hash area is consequently not feasible using conventional methods. In such applications, it is envisaged that overall performance could be improved by using a smaller core resident hash area in conjunction with the new methods and, when necessary, by using some secondary and perhaps time-consuming test to “catch” the small fraction of errors associated with the new methods. An example is discussed which illustrates possible areas of application for the new methods. Analysis of the paradigm problem demonstrates that allowing a small number of test messages to be falsely identified as members of the given set will permit a much smaller hash area to be used without increasing reject time.
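Bloom's construction, as described above, trades a small false-positive rate for a much smaller hash area. It can be sketched in a few lines; this is an illustrative modern rendering (the filter parameters and the trick of deriving k positions from a single SHA-256 digest are conveniences, not the paper's original hashing scheme):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash-derived positions per element over m bits."""

    def __init__(self, m=256, k=4):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _positions(self, item):
        # Slice k 4-byte integers out of one digest to get k positions.
        digest = hashlib.sha256(item.encode()).digest()
        return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m
                for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = True

    def __contains__(self, item):
        # May return a false positive; never a false negative.
        return all(self.bits[p] for p in self._positions(item))
```

This is exactly the paradigm problem of the paper: a negative answer definitively rejects a message, while a (rare) false positive only costs a fall-through to the slower secondary test.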
Conference Paper
Third-party cloud computing represents the promise of outsourcing as applied to computation. Services, such as Microsoft's Azure and Amazon's EC2, allow users to instantiate virtual machines (VMs) on demand and thus purchase precisely the capacity they require when they require it. In turn, the use of virtualization allows third-party cloud providers to maximize the utilization of their sunk capital costs by multiplexing many customer VMs across a shared physical infrastructure. However, in this paper, we show that this approach can also introduce new vulnerabilities. Using the Amazon EC2 service as a case study, we show that it is possible to map the internal cloud infrastructure, identify where a particular target VM is likely to reside, and then instantiate new VMs until one is placed co-resident with the target. We explore how such placement can then be used to mount cross-VM side-channel attacks to extract information from a target VM on the same machine.
Conference Paper
We design an encryption scheme called Multi-dimensional Range Query over Encrypted Data (MRQED), to address the privacy concerns related to the sharing of network audit logs and various other applications. Our scheme allows a network gateway to encrypt summaries of network flows before submitting them to an untrusted repository. When network intrusions are suspected, an authority can release a key to an auditor, allowing the auditor to decrypt flows whose attributes (e.g., source and destination addresses, port numbers, etc.) fall within specific ranges. However, the privacy of all irrelevant flows is still preserved. We formally define the security for MRQED and prove the security of our construction under the decision bilinear Diffie-Hellman and decision linear assumptions in certain bilinear groups. We study the practical performance of our construction in the context of network audit logs. Apart from network audit logs, our scheme also has interesting applications for financial audit logs, medical privacy, untrusted remote storage, etc. In particular, we show that MRQED implies a solution to its dual problem, which enables investors to trade stocks through a broker in a privacy-preserving manner.
Article
We present, implement, and analyze a new scalable centralized algorithm, called OFT, for establishing shared cryptographic keys in large, dynamically changing groups. Our algorithm is based on a novel application of one-way function trees. In comparison with the top-down logical key hierarchy (LKH) method of Wallner et al., our bottom-up algorithm approximately halves the number of bits that need to be broadcast to members in order to rekey after a member is added or evicted. The number of keys stored by group members, the number of keys broadcast to the group when new members are added or evicted, and the computational efforts of group members, are logarithmic in the number of group members. Among the hierarchical methods, OFT is the first to achieve an approximate halving in broadcast length, an idea on which subsequent algorithms have built. Our algorithm provides complete forward and backward security: Newly admitted group members cannot read previous messages, and evicted members cannot read future messages, even with collusion by arbitrarily many evicted members. In addition, and unlike LKH, our algorithm has the option of being member contributory in that members can be allowed to contribute entropy to the group key. Running on a Pentium II, our prototype has handled groups with up to 10 million members. This algorithm offers a new scalable method for establishing group session keys for secure large-group applications such as broadcast encryption, electronic conferences, multicast sessions, and military command and control.
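The core of the one-way function tree can be sketched as follows: each internal node's key is computed from the one-way "blinded" keys of its two children, so a member holding its own leaf key plus the blinded keys along its path can recompute the root group key. A simplified illustration for a complete binary tree (the hash-based choices of the blinding and mixing functions are assumptions, not the paper's exact construction):

```python
import hashlib

def blind(key: bytes) -> bytes:
    """One-way blinding function g: safe to reveal g(k) without revealing k."""
    return hashlib.sha256(b"blind" + key).digest()

def mix(left_blinded: bytes, right_blinded: bytes) -> bytes:
    """Mixing function f combining the blinded keys of two siblings."""
    return hashlib.sha256(left_blinded + right_blinded).digest()

def group_key(leaf_keys):
    """Compute the root (group) key of a complete one-way function tree.
    The number of leaves must be a power of two in this simplified sketch."""
    level = leaf_keys
    while len(level) > 1:
        level = [mix(blind(level[i]), blind(level[i + 1]))
                 for i in range(0, len(level), 2)]
    return level[0]
```

When a member is evicted, its leaf key is replaced, which changes every key on the path to the root; only the blinded keys along that path need to be re-broadcast, which is where the halving of broadcast length relative to LKH comes from.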
Article
It is often necessary for two or more parties that do not fully trust each other to selectively share data. We propose a search scheme based on Bloom filters and Pohlig-Hellman encryption. A semi-trusted third party can transform one party's search queries to a form suitable for querying the other party's database, in such a way that neither the third party nor the database owner can see the original query. Furthermore, the encryption keys used to construct the Bloom filters are not shared with this third party. Provision can be made for third-party "warrant servers", as well as "censorship sets" that limit the data to be shared.
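The Pohlig-Hellman cipher underlying this scheme is commutative: encrypting under two keys in either order yields the same result, which is what allows a third party to transform one party's queries into the other's key space. A small-number illustration (the prime and key sizes are toy values for demonstration, far below secure parameters):

```python
import math

# Toy Pohlig-Hellman exponentiation cipher: E_k(x) = x^k mod p.
p = 2305843009213693951  # the Mersenne prime 2^61 - 1 (NOT a secure size)

def keygen(seed):
    """Pick an exponent coprime to p-1 so that it is invertible (decryptable)."""
    k = seed
    while math.gcd(k, p - 1) != 1:
        k += 1
    return k

def enc(k, x):
    return pow(x, k, p)

def dec(k, y):
    # The inverse exponent modulo p-1 undoes the encryption.
    return pow(y, pow(k, -1, p - 1), p)

a, b = keygen(65537), keygen(90001)
x = 123456789
# Commutativity: encrypting under a then b equals encrypting under b then a,
# because (x^a)^b = (x^b)^a = x^(ab) mod p.
assert enc(b, enc(a, x)) == enc(a, enc(b, x))
assert dec(a, enc(a, x)) == x
```

This commutativity is what lets each party layer its own encryption onto a query and strip only its own layer, without ever sharing keys.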
Article
A secure index is a data structure that allows a querier with a "trapdoor" for a word x to test in O(1) time whether the index contains x. The index reveals no information about its contents without valid trapdoors, and trapdoors can only be generated with a secret key. Secure indexes are a natural extension of the problem of constructing data structures with privacy guarantees such as those provided by oblivious and history-independent data structures. In this paper, we formally define a secure index and formulate a security model for indexes known as semantic security against adaptive chosen keyword attack (ind-cka). We also develop an efficient ind-cka secure index construction called z-idx using pseudo-random functions and Bloom filters, and show how to use z-idx to implement searches on encrypted data. This search scheme is the most efficient encrypted data search scheme currently known: it provides O(1) search time per document, and handles compressed data, variable length words, and boolean and certain regular expression queries. The techniques developed in this paper can also be used to build encrypted searchable audit logs, private database query schemes, accumulated hashing schemes, and secure set membership tests.
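The trapdoor idea can be illustrated with HMAC standing in for the pseudo-random function: the index stores PRF-derived Bloom filter positions per document, and a trapdoor for a word lets the server test membership without holding the secret key or seeing the word. This is a simplified sketch inspired by the z-idx construction, not its exact definition:

```python
import hmac, hashlib

M = 1024  # illustrative per-document Bloom filter size

def trapdoor(key: bytes, word: str) -> bytes:
    """PRF of the word under the secret key; only the key holder can produce it."""
    return hmac.new(key, word.encode(), hashlib.sha256).digest()

def positions(tdoor: bytes, doc_id: str, k: int = 4):
    # Second PRF layer keyed by the trapdoor and the document identifier,
    # so the same word maps to different positions in different documents.
    d = hmac.new(tdoor, doc_id.encode(), hashlib.sha256).digest()
    return [int.from_bytes(d[4 * i:4 * i + 4], "big") % M for i in range(k)]

def build_index(key: bytes, doc_id: str, words):
    """Index owner: insert each word's PRF-derived positions into the filter."""
    bits = set()
    for w in words:
        bits.update(positions(trapdoor(key, w), doc_id))
    return bits

def search(index_bits, tdoor: bytes, doc_id: str) -> bool:
    """Server-side test: needs only the trapdoor, never the key or the word."""
    return all(p in index_bits for p in positions(tdoor, doc_id))
```

Without a trapdoor the server sees only filter bits, and the per-document layer prevents it from correlating which documents share words, which is the intuition behind the ind-cka guarantee.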
Article
This paper presents SIENA, an event notification service that we have designed and implemented to exhibit both expressiveness and scalability. We describe the service's interface to applications, the algorithms used by networks of servers to select and deliver event notifications, and the strategies used.
Conference Paper
The use of cryptographic hash functions like MD5 or SHA-1 for message authentication has become a standard approach in many applications, particularly Internet security protocols. Though very easy to implement, these mechanisms are usually based on ad hoc techniques that lack a sound security analysis. We present new, simple, and practical constructions of message authentication schemes based on a cryptographic hash function. Our schemes, NMAC and HMAC, are proven to be secure as long as the underlying hash function has some reasonable cryptographic strengths. Moreover we show, in a quantitative way, that the schemes retain almost all the security of the underlying hash function. The performance of our schemes is essentially that of the underlying hash function. Moreover they use the hash function (or its compression function) as a black box, so that widely available library code or hardware can be used to implement them in a simple way, and replaceability of the underlying hash function is easily supported.
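The HMAC construction described above is available directly in standard libraries. A minimal example using Python's hmac module, where the key and message values are of course placeholders:

```python
import hmac, hashlib

key = b"shared-secret-key"
message = b"publication payload"

# The sender computes the tag over the message with the shared key.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

# The receiver recomputes the tag and compares in constant time,
# which avoids leaking information through timing side channels.
expected = hmac.new(key, message, hashlib.sha256).hexdigest()
assert hmac.compare_digest(tag, expected)

# Any modification of the message yields a different tag.
forged = hmac.new(key, b"tampered payload", hashlib.sha256).hexdigest()
assert not hmac.compare_digest(tag, forged)
```

Note the black-box property the paper emphasizes: swapping hashlib.sha256 for another hash function changes nothing else in the code.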
Article
When the probability of measuring a particular value of some quantity varies inversely as a power of that value, the quantity is said to follow a power law, also known variously as Zipf's law or the Pareto distribution. Power laws appear widely in physics, biology, earth and planetary sciences, economics and finance, computer science, demography and the social sciences. For instance, the distributions of the sizes of cities, earthquakes, solar flares, moon craters, wars and people's personal fortunes all appear to follow power laws. The origin of power-law behaviour has been a topic of debate in the scientific community for more than a century. Here we review some of the empirical evidence for the existence of power-law forms and the theories proposed to explain them.
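Power-law (Zipf/Pareto) distributions of this kind are commonly used to model skewed workloads, such as attribute popularity in pub/sub evaluations. Samples from a continuous power law can be generated by inverse-transform sampling; a small sketch (parameter choices are illustrative):

```python
import random

def pareto_sample(alpha, x_min, n, seed=42):
    """Draw n samples from a continuous power law p(x) ~ x^-alpha for x >= x_min,
    via inverse-transform sampling: x = x_min * (1 - u)^(-1 / (alpha - 1))."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

# Heavy tail: most samples sit near x_min, a few are very large.
samples = pareto_sample(alpha=2.5, x_min=1.0, n=10_000)
```

The heavy tail is what makes such workloads stressful to evaluate against: a handful of extremely popular values dominate, while the vast majority occur rarely.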
S. Liston, "The Cloud: data protection and privacy - Whose cloud is it anyway?" in 12th Global Symposium for Regulators, ser. GSR, 2012.
M. Kurpicz, "Towards efficient privacy-preserving SQL processing in untrusted clouds," Master's thesis, University of Neuchâtel, July 2013.
F. Kerschbaum, "Public-key encrypted Bloom filters with applications to supply chain integrity," in Data and Applications Security and Privacy XXV, ser. Lecture Notes in Computer Science, Y. Li, Ed., vol. 6818, 2011.