Article

Message-Locked Encryption for Lock-Dependent Messages

Authors: Martín Abadi, Dan Boneh, Ilya Mironov, Ananth Raghunathan, Gil Segev

Abstract

Motivated by the problem of avoiding duplication in storage systems, Bellare, Keelveedhi, and Ristenpart have recently put forward the notion of Message-Locked Encryption (MLE) schemes which subsumes convergent encryption and its variants. Such schemes do not rely on permanent secret keys, but rather encrypt messages using keys derived from the messages themselves. We strengthen the notions of security proposed by Bellare et al. by considering plaintext distributions that may depend on the public parameters of the schemes. We refer to such inputs as lock-dependent messages. We construct two schemes that satisfy our new notions of security for message-locked encryption with lock-dependent messages. Our main construction deviates from the approach of Bellare et al. by avoiding the use of ciphertext components derived deterministically from the messages. We design a fully randomized scheme that supports an equality-testing algorithm defined on the ciphertexts. Our second construction has a deterministic ciphertext component that enables more efficient equality testing. Security for lock-dependent messages still holds under computational assumptions on the message distributions produced by the attacker. In both of our schemes the overhead in the length of the ciphertext is only additive and independent of the message length.
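For context, the baseline MLE instantiation is convergent encryption, in which the key, the ciphertext and the tag are all deterministic functions of the message (and the public parameter); this determinism is what enables deduplication and is exactly what the lock-dependent-message setting puts under pressure. The sketch below is a minimal illustrative rendering of that baseline in Python (the SHA-256 XOR keystream and the function names are our own choices, not the paper's constructions, which instead randomize the ciphertext and attach a separate equality-testing algorithm):

import hashlib

def _keystream(key, length):
    # Toy stream cipher: hash key || counter until enough bytes are produced.
    out, ctr = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:length]

def mle_key(param, message):
    # K <- H(P, M): the key is "locked" to the message and the public parameter.
    return hashlib.sha256(param + message).digest()

def mle_encrypt(param, message):
    key = mle_key(param, message)
    ciphertext = bytes(m ^ s for m, s in zip(message, _keystream(key, len(message))))
    tag = hashlib.sha256(param + ciphertext).digest()   # deterministic tag enables dedup
    return ciphertext, tag

def mle_decrypt(param, key, ciphertext):
    return bytes(c ^ s for c, s in zip(ciphertext, _keystream(key, len(ciphertext))))

if __name__ == "__main__":
    P = b"public-parameter"
    M = b"identical file contents"
    c1, t1 = mle_encrypt(P, M)
    c2, t2 = mle_encrypt(P, M)
    assert (c1, t1) == (c2, t2)   # independent uploaders produce identical ciphertext and tag,
                                  # so the storage server can deduplicate without any key
    assert mle_decrypt(P, mle_key(P, M), c1) == M

Because everything above is a deterministic function of the message, an adversary who may choose the message distribution after seeing the public parameter (the lock-dependent setting studied here) has more to attack; the paper's two constructions are designed to remain secure in exactly that setting.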


... Conventional encryption is the most direct way to protect data confidentiality, but it disables deduplication because the same plaintext yields different ciphertexts under different keys. To enable data deduplication, other solutions, like convergent encryption (CE) [5] and message-locked encryption (MLE) [6], derive the encryption key from the plaintext itself and use it to generate the ciphertext. However, an external attacker can perform a brute-force attack to recover the user's private data when the file content has low min-entropy. ...
... In addition, due to the significant computational resources required for model training, methods based on machine learning or deep learning models cannot practically be applied to massive collected data to preserve privacy [11]. Another type of attack, the poison attack [6], tries to create a mismatch between the tag h(ch) of a chunk ch and the stored chunk ch′. Consequently, subsequent users holding ch will be directed to a fake copy of the chunk in the cloud after the duplicate check. ...
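Both attacks mentioned in these excerpts are easy to make concrete against a baseline deterministic scheme like the one sketched above. The self-contained snippet below (illustrative helper names, toy XOR-stream encryption) shows (a) the offline brute-force attack that recovers a low-min-entropy file from its deterministic tag, and (b) the usual countermeasure to the described tag/chunk mismatch in this simple hash-of-ciphertext setting: the server recomputes the tag from the uploaded chunk instead of trusting a client-supplied one:

import hashlib

def _det_encrypt(param, message):
    # Same deterministic, convergent-style encryption as in the sketch above.
    key = hashlib.sha256(param + message).digest()
    stream, ctr = b"", 0
    while len(stream) < len(message):
        stream += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    ciphertext = bytes(m ^ s for m, s in zip(message, stream))
    tag = hashlib.sha256(param + ciphertext).digest()
    return ciphertext, tag

def brute_force(param, observed_tag, candidates):
    # (a) Offline dictionary attack: feasible whenever the file content is predictable.
    for guess in candidates:
        if _det_encrypt(param, guess)[1] == observed_tag:
            return guess                      # plaintext recovered without any secret key
    return None

def server_store(storage, param, chunk, claimed_tag):
    # (b) Tag-consistency check: never trust a client-supplied tag, recompute it.
    if hashlib.sha256(param + chunk).digest() != claimed_tag:
        return False                          # poisoned upload rejected: tag/chunk mismatch
    storage.setdefault(claimed_tag, chunk)    # deduplicate by tag
    return True

if __name__ == "__main__":
    P = b"public-parameter"
    secret = b"salary=52000"                  # low-min-entropy content
    ct, tag = _det_encrypt(P, secret)
    guesses = [b"salary=%d" % s for s in range(50000, 60000)]
    assert brute_force(P, tag, guesses) == secret

    storage = {}
    assert server_store(storage, P, ct, tag)                      # honest upload accepted
    assert not server_store(storage, P, b"poisoned chunk", tag)   # mismatch detected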
Article
Full-text available
The forthcoming fog storage system should provide end users with secure, faster access to cloud services and minimise storage capacity using data deduplication. This method stores a single copy of data and provides a link to the cloud/fog owners. With client-side data deduplication, the system can reduce network bandwidth by performing a duplicate check. This solution, however, fails to cover user privacy or optimise the latency of real-time communications. Motivated by this, this magazine paper develops PrivAcy-preServing data deduplication in Fog stOraGe system (PASFOG), a data deduplication protocol implemented between cloud storage and users to mitigate brute-force and poison attacks. PASFOG is implemented in fog computing to reduce real-time delay and communication when performing duplicate checks. We also propose PrivAcy-preServing data dedupliCatiOn in blockchaIN-based Fog stOraGe system (PASCOINFOG), which utilises blockchain techniques to realise a reliable system. In PASCOINFOG, when users want to send chunks to the cloud/fog nodes, the fog nodes process the duplicate check and create a new block for the blockchain, reducing real-time latency and communication and protecting the cloud from attackers. The proposed protocols enhance user privacy and reduce real-time communication delay, which is crucial for consumer electronics applications such as cloud storage and IoT devices.
... In order to formalize the precise security definition for convergent encryption, Bellare, Keelveedhi and Ristenpart [8] introduced a cryptographic primitive named message locked encryption, and detailed several definitions to capture various security requirements. Abadi et al. [9] then strengthened the security definition in [8] by considering the plaintext distributions depending on the public parameters of the schemes. This model was later extended by Bellare and Keelveedhi [11] by providing privacy for messages that are both correlated and dependent on the public system parameters. ...
... Since message-locked encryption cannot resist brute-force attacks, in which files falling into a known set can be recovered, an architecture that provides secure deduplicated storage resisting brute-force attacks was put forward by Keelveedhi, Bellare and Ristenpart [10] and realized in a system offering server-aided encryption for deduplicated storage. In this paper, a technique similar to that of [9] is used to achieve secure deduplication with regard to the private cloud in the concrete construction. ...
Article
Attribute-based encryption (ABE) has been widely used in cloud computing where a data provider outsources his/her encrypted data to a cloud service provider, and can share the data with users possessing specific credentials (or attributes). However, the standard ABE system does not support secure deduplication, which is crucial for eliminating duplicate copies of identical data in order to save storage space and network bandwidth. In this paper, we present an attribute-based storage system with secure deduplication in a hybrid cloud setting, where a private cloud is responsible for duplicate detection and a public cloud manages the storage. Compared with the prior data deduplication systems, our system has two advantages. First, it can be used to confidentially share data with users by specifying access policies rather than sharing decryption keys. Second, it achieves the standard notion of semantic security for data confidentiality while existing systems only achieve it by defining a weaker security notion. In addition, we put forth a methodology to modify a ciphertext over one access policy into ciphertexts of the same plaintext but under other access policies without revealing the underlying plaintext.
... Bellare et al. [14] propose a theoretical framework of MLE, and provide formal definitions of privacy and tag consistency. The follow-up studies [2,12] further examine message correlation and parameter dependency of MLE. ...
Preprint
Rekeying refers to an operation of replacing an existing key with a new key for encryption. It renews security protection, so as to protect against key compromise and enable dynamic access control in cryptographic storage. However, it is non-trivial to realize efficient rekeying in encrypted deduplication storage systems, which use deterministic content-derived encryption keys to allow deduplication on ciphertexts. We design and implement REED, a rekeying-aware encrypted deduplication storage system. REED builds on a deterministic version of all-or-nothing transform (AONT), such that it enables secure and lightweight rekeying, while preserving the deduplication capability. We propose two REED encryption schemes that trade between performance and security, and extend REED for dynamic access control. We implement a REED prototype with various performance optimization techniques and demonstrate how we can exploit similarity to mitigate key generation overhead. Our trace-driven testbed evaluation shows that our REED prototype maintains high performance and storage efficiency.
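The rekeying idea REED builds on can be pictured with a much-simplified convergent all-or-nothing transform: the bulk of the output (the "package") stays deterministic in the content, so it still deduplicates, while only a tiny "stub" is wrapped under the current master key and is all that must be re-encrypted on rekeying. The sketch below is our own illustration of that idea (SHA-256 XOR keystream, a single master key, invented names), not REED's actual construction:

import hashlib

def _stream(seed, length):
    out, ctr = b"", 0
    while len(out) < length:
        out += hashlib.sha256(seed + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:length]

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def caont_encode(message):
    # Convergent AONT: the large package is deterministic in the message (deduplicable),
    # while the 32-byte stub is needed to invert the transform.
    seed = hashlib.sha256(message).digest()
    package = _xor(message, _stream(seed, len(message)))
    stub = _xor(seed, hashlib.sha256(package).digest())
    return package, stub

def caont_decode(package, stub):
    seed = _xor(stub, hashlib.sha256(package).digest())
    return _xor(package, _stream(seed, len(package)))

def wrap_stub(stub, master_key):
    # Only this tiny value is (re-)encrypted under the current master key (XOR wrap is its
    # own inverse, so calling it twice with the same key unwraps).
    return _xor(stub, hashlib.sha256(b"stub-wrap" + master_key).digest())

if __name__ == "__main__":
    msg = b"a large, deduplicable chunk of file data ..."
    package, stub = caont_encode(msg)
    wrapped = wrap_stub(stub, b"old master key")
    # Rekeying: unwrap with the old key, re-wrap with the new one; the package is untouched,
    # so the cost is independent of file size and deduplication on packages is preserved.
    wrapped = wrap_stub(wrap_stub(wrapped, b"old master key"), b"new master key")
    recovered_stub = wrap_stub(wrapped, b"new master key")
    assert caont_decode(package, recovered_stub) == msg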
... In Chifor et al. (2018), the authors focused on security issues in the IoT environment. They addressed these issues by proposing a lightweight secure authorization stack that incorporates cloud-connected devices for smart applications. The proposed security stack is suitable for integrating heterogeneous devices using existing OSs and frameworks without additional overhead. A new R-MLE method was developed by Abadi et al. (2013) to enhance the existing MLE. Generally, the MLE method works according to the public parameter, but the newly developed R-MLE method generates the file tags randomly. ...
Preprint
Full-text available
In the cloud computing environment, data redundancy and data integrity management are consequential issues. They result in huge space wastage and compromise data security and data accuracy. Data deduplication and data auditing are techniques used to efficiently reduce storage space and to verify the integrity of cloud-stored data. Most existing data deduplication and auditing algorithms focus on static data storage environments, which does not fit the dynamic nature of cloud storage. Integrating data deduplication with integrity verification is challenging, and existing systems suffer from faked audit results and data ownership leakage. Also, existing cloud data auditing schemes do not provide compensation to users if their data integrity is compromised. Blockchain technology, with its key features, provides security, trust and efficiency for handling data storage in the cloud. To solve the issues in existing auditing-integrated deduplication systems, this paper proposes a blockchain-incorporated secure cloud data storage system supporting data auditing and deduplication with fair remittance. The implementation results and security analysis confirm that the proposed system performs well in terms of data integrity verification and deduplication, with enhanced accuracy and security compared to the state of the art.
... One such nation that has endured the scars of conflict is Sudan, a country situated in northeastern Africa, whose recent history has been marred by civil strife and turmoil. In the wake of these tumultuous events, the Sudanese government has been faced with the daunting challenge of reconstructing its data repositories, a vital component of any modern nation's governance and administrative framework [2]. ...
Article
The aim of this research is to assess the role of cloud storage in rebuilding governmental data after conflict, with a deeper insight into the Sudanese case. To investigate this role in Sudan, the study employs a quantitative research methodology. The primary data collection instrument is a structured questionnaire designed to gather insights and perspectives from 80 key stakeholders involved in the cloud storage adoption and data reconstruction efforts in Sudan, including IT professionals and data management experts, senior-level government officials and policymakers, representatives from cloud service providers, and end-users and data consumers from various government agencies. The collected data underwent statistical analysis, including descriptive and inferential techniques, to identify patterns, trends, and relationships among the variables under investigation. Quantitative data was analyzed using the SPSS statistical software. The findings revealed that 18 participants (22.5%) agree that there is a high level of awareness and understanding of cloud storage technologies among government agencies in Sudan, while 28 participants (35%) disagree, which indicates the need to spread awareness and understanding of cloud storage technologies among government agencies in Sudan. Furthermore, a Pearson chi-square test was used to measure the role of cloud storage in rebuilding governmental data after conflict in Sudan; a significance value of 0.021, below the 0.05 threshold, suggests that cloud storage plays a significant role in rebuilding governmental data after conflict in Sudan.
... Abadi et al. [10] constructed MLE2 for lock-dependent messages based on MLE to improve the security of data de-duplication. Liu et al. ...
Article
Full-text available
In the current era of information explosion, users’ demand for data storage is increasing, and the cloud has become the first choice of users and enterprises. Cloud storage makes it easy for users to back up and share data, effectively reducing users’ storage expenses. However, because duplicate data from different users is stored multiple times, the storage utilization of cloud servers drops sharply. Duplicates can be removed directly when data is stored in plaintext, but cloud servers are only semi-trusted, so data usually must be encrypted before storage to protect user privacy. In this paper, we focus on how to achieve secure de-duplication and data recovery over ciphertexts for different users: duplicates are detected by testing, over the ciphertexts, whether the indexes of public-key searchable encryption match the corresponding trapdoors. For a duplicate file, the data user’s re-encryption key for the file is appended to the ciphertext chain table of the stored copy. The cloud server uses the re-encryption key to generate the specified transformed ciphertext, and the data user decrypts the transformed ciphertext with its private key to recover the file. Security analysis and experimental simulations show that the proposed scheme is secure and efficient.
... Compared with CE, the core idea of MLE is unchanged, so it is likewise unable to achieve semantic security [10,11]. Bellare et al. [12] proposed DupLESS, in which different owners of the same data and a trusted key server execute an oblivious pseudorandom function (OPRF) to generate the same encryption key. To improve security, Duan [13] proposed a scheme in which the key server is eliminated and a distributed key generation protocol is run by the clients before uploading the data. ...
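The server-aided approach of DupLESS can be sketched concretely: the client obtains a message-derived key through a blind protocol with the key server, so the key is still deterministic in the file (deduplication works) but offline brute force now requires the key server's participation. DupLESS instantiates this with an RSA-blind-signature-based oblivious PRF; the toy numbers below (the textbook modulus N = 3233 with e = 17, d = 2753) and the final hash-based KDF are purely illustrative, and nothing here is secure as written:

import hashlib
from math import gcd

# Toy textbook RSA key for the key server: N = 61 * 53, e*d = 1 mod phi(N).
N, e, d = 3233, 17, 2753

def _h(message):
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def client_blind(message, r):
    assert gcd(r, N) == 1
    return (_h(message) * pow(r, e, N)) % N      # the key server sees only this blinded value

def server_eval(x):
    return pow(x, d, N)                          # blind "signature": the blinding hides H(M)

def client_unblind(y, r):
    return (y * pow(r, -1, N)) % N               # removes r, leaving H(M)^d mod N

def derive_key(message, r):
    sig = client_unblind(server_eval(client_blind(message, r)), r)
    return hashlib.sha256(sig.to_bytes(2, "big")).digest()

if __name__ == "__main__":
    # Two owners of the same file blind with different randomness yet derive the same key,
    # so ciphertexts still deduplicate, while offline brute force now needs the key server.
    assert derive_key(b"same file", r=7) == derive_key(b"same file", r=11)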
Article
Full-text available
To accommodate the new demand for the deduplication of encrypted data, secure encrypted data deduplication technologies have been widely adopted by cloud service providers. Of particular concern at present is how deduplication can be applied to ciphertexts encrypted by a semantically secure symmetric encryption scheme. Avoiding the disadvantages of existing methods, in this article we propose a blockchain‐based secure encrypted data deduplication protocol supporting client‐side semantically secure encryption. In the proposed protocol, the smart contracts are deployed by the first file uploader, and subsequent uploaders then carry out an interactive proof of ownership for the same file with the help of the smart contracts, which execute a cloud data integrity auditing protocol. The smart contracts play the role of the trusted third party and therefore compensate for the impracticality of relying on a trusted third party in real scenarios. In addition, in the proposed protocol there is no need for other clients who have uploaded the same file to be online to help the current uploader obtain the encryption key. We also prove its security and evaluate its performance.
... In order to achieve semantic security, Abadi et al. proposed a fully randomized secure deduplication scheme on the basis of MLE [13], which has semantic security and can achieve cross-user secure deduplication. The core of this scheme is to randomize the data tags and to find duplicate data through the tags. ...
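The randomized tags mentioned here can be made concrete with a bilinear map; this is the general flavour of publicly testable randomized tags in this line of work (the concrete construction of Abadi et al. has further components that we omit). With a symmetric pairing e : G x G -> G_T, a generator g and a message hash h = H(M), a tag and its equality test can be written as

    T(M) = (g^r, g^{r·H(M)}),    with r chosen fresh at random per upload,

    Eq(T_1, T_2):  e(g^{r_1}, g^{r_2 h_2}) =? e(g^{r_2}, g^{r_1 h_1}),

which holds exactly when e(g, g)^{r_1 r_2 h_2} = e(g, g)^{r_1 r_2 h_1}, i.e. when H(M_1) = H(M_2) (for nonzero r_1, r_2). Identical messages therefore no longer produce identical tags, yet the server can still test tag equality publicly.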
Article
Full-text available
Convergent encryption has been widely used in secure deduplication technology, but because the data itself is predictable, directly using the hash value of the data as a key is vulnerable to brute-force attacks. To this end, researchers have proposed more secure key management methods; however, they have a limited scope of application and poor performance. Therefore, this paper proposes a hierarchical key management scheme based on threshold blind signatures. The convergent key is generated by multiple key servers, which ensures the key’s confidentiality and effectively avoids the threat of brute-force attacks. Moreover, key servers are divided into master key nodes and sub-key nodes, which reduces the interaction between key servers and improves the efficiency of system initialization. This architecture enables sub-key nodes to be distributed across multiple independent network domains and to interact with master key nodes through the Internet. On the one hand, it supports cross-domain deduplication; on the other hand, it brings the sub-key nodes closer to the end users, reducing communication delay and improving key generation efficiency. The experimental results show that the proposed scheme offers a greater performance improvement in system initialization and key generation than a fully distributed key management scheme.
... This complicates attempts to de-duplicate data, since identical data encrypted by different users (and sometimes by the same user with different keys or encryption algorithms) may end up as different ciphertexts [4], [5]. Therefore, how to actually implement de-duplication [6], [7], [8] over encrypted content has become a subject of recent research interest. Several types of data de-duplication processes have been developed in the literature in recent years. ...
Conference Paper
Data de-duplication can greatly reduce the storage and transfer overhead of cloud computing services, and it has clear applications in today's big-data-driven environment. Current data de-duplication systems are typically designed either to resist brute-force attacks or to improve efficiency and data availability, but not both. To reduce the disclosure of redundant data, in this paper we consider a 3-tier cross-domain architecture and propose an efficient and privacy-preserving big data de-duplication scheme for cloud storage (EPCDD); we are not aware of any existing scheme that also achieves accountability. EPCDD achieves both confidentiality and data de-duplication while countering brute-force attacks. In addition, we take accountability into consideration to offer stronger privacy safeguards than existing schemes. We show that, in terms of computation, communication and storage costs, EPCDD outperforms competing approaches. Through this project, users may also perform dynamic operations such as updates, data-leakage checks and restoration. On upload, a file is first sent to the local domain server for a duplicate check; if the file is not a duplicate it is stored in the cloud, while for a duplicate file the uploader obtains a proof of ownership. In this way, authentication and cross-domain de-duplication of big data in the cloud can both be accomplished. In addition, the duplicate search in EPCDD is index-based, keeping its time complexity low.
... Data users prefer to encrypt their data, especially the confidential ones, before uploading to CSPs for protecting data security and privacy, thus increasing the difficulty to identify duplicated data. Employing the data content to generate data-encryption keys with a certain algorithm ensures that all users of identical data can generate the same keys and produce identical encrypted data (Abadi, Boneh, Mironov, Raghunathan, & Segev, 2013; Bellare & Keelveedhi, 2015; Bellare, Keelveedhi, & Ristenpart, 2013). Therefore the CSP can easily detect duplication by comparing the encrypted data without knowing anything about the raw data. ...
Chapter
A cloud can be considered a distributed computing environment consisting of a set of virtualized and interconnected computers or nodes. A dynamically provisioned cloud is presented as consolidated computing resources governed by Service Level Agreements (SLAs) established through an agreement between the consumers and the service provider. The cloud consists of virtualized, distributed datacenters, and here applications are offered on demand, as a service. Every datacenter needs a highly reliable as well as inexpensive power supply. Usually, this is fulfilled by combining grid electricity, which ensures affordability, with backup diesel generators for emergencies, which ensure reliability. Unfortunately, this arrangement has flaws: it relies on increasingly unstable electricity prices, and the diesel generators produce a very high rate of carbon emissions, which leads to problems of environmental sustainability. We will look into the energy generation and consumption of the different components of datacenters. The power sources of these power-hungry datacenters are affecting environmental sustainability. Later in this chapter we will explore some solutions to reduce energy consumption, such as green cloud computing, the use of renewable energy sources, and so on.
... A Message-Locked Encryption (MLE) scheme consists of parameter-generation, key-generation, encryption, decryption and tag-generation algorithms. MLE can provide protection only for unpredictable data (9,10). Message correlation and dependence on the public parameters are two further dimensions along which MLE security is considered. ...
Article
Data deduplication is necessary for corporations to minimize the hidden charges linked with backing up their data on a public cloud platform. Inefficient data storage on its own can become wasteful, and such problems grow in the public cloud as distributed storage structures end up keeping multiple clones of a single account's data for collation or other purposes. Deduplication helps shrink costs by extending the usable benefit of a given volume of storage. Unfortunately, data deduplication raises several security concerns, so additional encoding is needed to protect the data. This paper presents a system for dynamic Information-Locking and Encoding with Convergent Encoding, in which the data is coded first and the resulting ciphertext is then encoded once more. Chunk-level deduplication is used to reduce disk usage. Identical segments are still encrypted into the same ciphertext, and the key cannot be recovered from the encrypted chunk data by an attacker. The content is also kept hidden from the cloud server. The focus of this work is to reduce disk storage while providing protection for online cloud deduplication.
... In our most efficient system, the ciphertext size and the encryption and decryption times scale linearly with the complexity of the access formula [14]. The first method is to avoid using tags that are derived deterministically from the message [16]. The authors designed a fully randomized scheme that supports equality testing over ciphertexts. ...
Article
Full-text available
The main objective of this system is to develop an experimental model to store data securely in a cloud computing environment and to provide a highly trustworthy data maintenance scheme to its users with an Advanced Cryptographic Standards (ACS) scheme. The system uses a novel proof-based trustworthy data management scheme to upload data to the server under proper cryptographic principles such as ACS; with it, the data owner and data users can maintain and retrieve data to and from the server. Authentication is the procedure that ensures cloud data is handled in a secure manner. Strong user authentication is the main requirement for cloud computing, as it reduces unauthorized access to data on the cloud. Data security is the most important issue in cloud computing. The role-based access control (RBAC) method controls access to computer or network resources based on the roles given to individual users within an organization. Roles are defined according to job skill, authority, and responsibility within the organization. In RBAC, roles can be easily created, changed, or discontinued as the needs of the organization evolve, without updating the privileges of every user. We propose a new secure cloud data framework that enhances data security and the cloud storage model by integrating dual-system encryption technology with a selective proof technique.
Article
This work provides a privacy-preserving multi-dimensional media sharing system, SMACD, for portable cloud computing scenarios, within the context of widespread media sharing enabled by cloud computing and devices. Attribute-based encryption is used to encrypt each media layer, ensuring media privacy and granular access control. To reduce the complexity of the access policy and adapt to the properties of multi-dimensional media, a multi-level access policy construction with a secret-sharing scheme is presented. For both intra-server and inter-server deduplication, decentralized key servers are offered. With cloud computing, databases and application software are moved to sizable data centers, where data and service management may not be entirely trustworthy. This special feature brings up a number of new, poorly understood security challenges. Because both user data and applications reside on provider premises, cloud computing raises data security problems, as neither is entirely contained on the user's machine. Although clouds often share a single architecture, they can have numerous customers with various needs. The solution offered by cloud providers is to encrypt the data; even so, there remains the possibility that the cloud service itself is unreliable.
Article
To implement encrypted data deduplication in a cloud storage system, users must encrypt files using special encryption algorithms (e.g., convergent encryption (CE)), which cannot provide strong protection. The confidentiality level of an outsourced file is determined either subjectively by the user or objectively by the number of owners of the file. Files owned by only a few users are considered strictly confidential and require strong protection. In this paper, we design, analyze and implement LSDedup, which attains high storage efficiency while providing strictly confidential files (SCFiles) with strong protection. LSDedup allows cloud users to securely interact with cloud servers to check the confidentiality level of an outsourced file. Users encrypt SCFiles using standard symmetric encryption algorithms to achieve a high security level, while less confidential files (LSFiles) are encrypted using CE so that cloud servers can perform deduplication. LSDedup is designed to prevent cloud servers from reporting a fake confidentiality level and fake users from claiming ownership of a file. Formal analysis is provided to justify its security. Besides, we implement an LSDedup prototype using Alibaba Cloud as backend storage. Our evaluations demonstrate that LSDedup works with existing cloud service providers’ APIs and incurs modest performance overhead.
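The dispatch LSDedup describes, strong non-deduplicable encryption for strictly confidential files and convergent encryption for the rest, boils down to a policy on the owner count. The snippet below sketches only that policy (the threshold value and function name are hypothetical); LSDedup's actual protocol additionally stops the server from lying about the level and verifies ownership:

import os
import hashlib

def choose_key(message, owner_count, threshold=3):
    # Hypothetical policy: few owners -> strictly confidential, many owners -> deduplicable.
    if owner_count <= threshold:
        return os.urandom(32)                    # SCFile: fresh random key, semantic security, no dedup
    return hashlib.sha256(message).digest()      # LSFile: convergent key, server can deduplicate

if __name__ == "__main__":
    m = b"contents of a widely shared report"
    assert choose_key(m, owner_count=10) == choose_key(m, owner_count=10)   # LSFiles deduplicate
    assert choose_key(m, owner_count=1) != choose_key(m, owner_count=1)     # SCFiles do not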
Article
Full-text available
It is non-trivial to provide semantic security for user data while achieving deduplication in cloud storage. Some studies deploy a trusted party to store deterministic tags for recording data popularity, then provide different levels of security for data according to popularity. However, deterministic tags are vulnerable to offline brute-force attacks. In this paper, we first propose a popularity-based secure deduplication scheme with fully random tags, which avoids the storage of deterministic tags. Our scheme uses homomorphic encryption (HE) to generate comparable random tags to record data popularity and then uses binary search in an AVL tree to accelerate the tag comparisons. Besides, we identify popularity-tampering attacks on existing schemes and design a proof of ownership (PoW) protocol against them. To achieve scalability and updatability, we introduce multi-key homomorphic proxy re-encryption (MKH-PRE) to design a multi-tenant scheme. Users in different tenants generate tags using different key pairs, and the cross-tenant tags can be compared for equality. Meanwhile, our multi-tenant scheme supports efficient key updates. We give a comprehensive security analysis and conduct performance evaluations based on both synthetic and real-world datasets. The results show that our schemes achieve efficient data encryption and key update, and have high storage efficiency.
Article
A Version Control System (VCS) plays an essential role in the software supply chain, as it manages code projects and enables efficient collaboration. For a private repository, where source code is a high-profile asset and needs to be protected, VCS security is extremely important. Traditional (unencrypted or encrypted) VCS solutions rely on a trusted service provider to host the code and enforce access control, which is not realistic enough for real-world threats. If the service provider peeps in or hackers break into the repository, control over read & write access to the sensitive code is totally lost. Therefore, we consider whether one can relax the assumption on the server by introducing a covert adversary, namely one that may act maliciously but will not misbehave if it would be caught doing so. However, protecting sensitive code and enforcing access control on a covert adversarial server is a challenging task. Existing encryption-based VCS solutions fail to address this challenge, as they offer limited access control functionality and introduce heavy key management or storage overhead. Moreover, the crucial feature of source-file compression was missing in encrypted and versioned storage. To address these problems, we introduce Gringotts, an end-to-end encrypted VCS tailored for read & write access control, version control and source file compression. We present a formal model and propose a scheme with detailed analysis. We also implement and evaluate Gringotts on the top-10 most-starred code projects on GitHub. The results demonstrate that Gringotts introduces low latency (less than 0.3 s) for commit encryption and decryption, and supports fine-grained access control and rich version control functionality with practical performance.
Article
Conventional encrypted deduplication approaches retain the deduplication capability on duplicate chunks after encryption by always deriving the key for encryption/decryption from the chunk content, but such a deterministic nature causes information leakage due to frequency analysis. We present TED, a tunable encrypted deduplication primitive that provides a tunable mechanism for balancing the trade-off between storage efficiency and data confidentiality. The core idea of TED is that its key derivation is based on not only the chunk content but also the number of duplicate chunk copies, such that duplicate chunks are encrypted by distinct keys in a controlled manner. In particular, TED allows users to configure a storage blowup factor, under which the information leakage quantified by an information-theoretic measure is minimized for any input workload. In addition, we extend TED with a distributed key management architecture, and propose two attack-resilient key generation schemes that trade between performance and fault tolerance. We implement an encrypted deduplication prototype TEDStore to realize TED in networked environments. Evaluation on real-world file system snapshots shows that TED effectively balances the trade-off between storage efficiency and data confidentiality, with small performance overhead.
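The "tunable" behaviour TED describes can be illustrated by deriving the key from the chunk content plus a bucketed copy index, so that duplicate copies of a chunk are spread over a bounded number of keys. This is our simplified reading of the idea with invented names, not TED's exact key-derivation rule or its key-manager architecture:

import hashlib

def tunable_key(chunk, copy_index, t):
    # Key depends on the chunk content AND a bucketed copy index:
    #   t = 1  -> plain convergent encryption (maximal dedup, maximal frequency leakage)
    #   t > 1  -> copies of the same chunk are spread over at most t keys, trading a bounded
    #             storage blowup for reduced leakage under frequency analysis.
    fingerprint = hashlib.sha256(chunk).digest()
    bucket = copy_index % t
    return hashlib.sha256(fingerprint + bucket.to_bytes(4, "big")).digest()

if __name__ == "__main__":
    chunk = b"a very popular chunk"
    keys = {tunable_key(chunk, i, t=4) for i in range(1000)}
    assert len(keys) == 4   # 1000 duplicate copies -> at most 4 distinct ciphertexts stored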
Article
Data deduplication at the network edge significantly improves communication efficiency in edge-assisted cloud storage systems. With the increasing concern about data privacy, secure deduplication has been proposed to provide data security while supporting deduplication. Since conventional secure deduplication schemes are mainly based on deterministic encryption, they are vulnerable to frequency analysis attacks. Some recent research has focused on this problem, where several works studied the trade-off between deduplication efficiency and resistance to frequency analysis attacks. However, no existing work can provide different deduplication efficiency and protection for different data chunks. In this paper, we propose a security-aware and efficient data deduplication scheme for edge-assisted cloud storage systems. It not only improves the efficiency of deduplication but also reduces information leakage caused by frequency analysis attacks. In particular, we first define the security level for chunks to measure the security needs of users. Then, we develop an encryption scheme with multiple levels of security for deduplication. It provides higher security protection for chunks with higher security levels, while sacrificing security to achieve higher deduplication efficiency for chunks with lower security levels. We also analyze the security of our proposed scheme. Evaluations on the real-world datasets show the efficiency of our design.
Article
Cloud Storage Providers generally maintain a single copy of the identical data received from multiple sources to optimize the space. They cannot deduplicate the identical data when the clients upload the data in the encrypted form. To address this problem, recently, Duplicateless Encryption for Simple Storage (DupLESS) scheme is introduced in the literature. Besides, the data stored in the cloud is unreliable due to the possibility of data losses in remote storage environments. The DupLESS scheme, on the other hand, keeps both the key and the data on a single storage server, which is unreliable if that server goes down. In essence, the existing related works aim to handle either secure-deduplication or reliability limited to either key reliability or the data reliability. Hence, there is a need to develop a secure-deduplication mechanism that is not vulnerable to any malicious activity, semantically secures both data and key, and achieves the reliability. To address these problems, this paper proposes the dualDup framework that (a) optimizes the storage by eliminating the duplicate encrypted data from multiple users by extending DupLESS concept, and (b) securely distributes the data and key fragments to achieve the privacy and reliability using Erasure Coding scheme. The proposed approach is implemented in Python on the top of the Dropbox datacenter and corresponding results are reported. Experiments are conducted in a realistic environment. The results demonstrate that the proposed framework achieves reliability with an average storage overhead of 66.66% corresponding to the Reed–Solomon(3,2) codes. We validated through security analysis that the proposed framework is secure from insider and outsider adversaries. Moreover, dualDup framework provides all the aspects of deduplication, attack mitigation, key security and management, reliability, and QoS features as compared to other state-of-the-art deduplication techniques.
Article
Data deduplication is of critical importance to reduce the storage cost for clients and to relieve the unnecessary storage pressure for cloud servers. While various techniques have been proposed for secure deduplication of identical files/blocks, the effective and secure deduplication solutions on fuzzy similar data (image, video, and others) which occupy a large portion in the real world across wide applications, remain open. In this paper, we propose a novel deduplication system, named Fuzzy Deduplication (FuzzyDedup), to implement the secure deduplication of similar data (i.e., similar files, chunks, or blocks). In particular, we leverage the similarity-preserving hash, a fuzzy extractor based on error-correcting codes, and the encryption with customized design to construct a fuzzy-style deduplication encryption scheme (FuzzyMLE), achieving the ciphertext-based deduplication for similar data. Besides, to defend against data ownership cheating attack and duplicate-faking attack, a fuzzy-style proof of ownership scheme (FuzzyPoW) is designed for the cloud server to securely verify a client in possession of the similar data. To further enhance security and efficiency, we also propose both server-aided and random-tag FuzzyMLE to make FuzzyDedup robust against off-line brute-force attack and to support tag randomization, respectively. Then, we design Hamming distance reduction and tag cutting optimization algorithms to improve the tag query efficiency of FuzzyDedup. In the end, we formally prove the security of our solution and conduct experiments on real-world datasets for performance evaluation. Experimental results exhibit the efficiency of FuzzyDedup in terms of computation cost and communication overhead.
Article
Distributed fog computing has received wide attention recently. It enables distributed computing and data management on the network nodes within the close vicinity of IoT devices. An important service of fog-cloud based systems is data deduplication. With the increasing concern of privacy, some privacy-preserving data deduplication schemes have been proposed. However, they cannot support lossless deduplication of encrypted similar data in the fog-cloud network. Meanwhile, no existing design can protect message equality information while resisting brute-force and frequency analysis attacks. In this paper, we propose a privacy-preserving and compression-based data deduplication system under the fog-cloud network, which supports lossless deduplication of similar data in the encrypted domain. Specifically, we first use the generalized deduplication technique and cryptographic primitives to implement secure deduplication over similar data. Then, we devise a two-level deduplication protocol that can perform secure and efficient deduplication at distributed fog nodes and the cloud. The proposed system can not only resist brute-force and frequency analysis attacks but also ensure that only the data operator can capture the message equality information. We formally analyze the security of our design. Performance evaluations demonstrate that our proposed design is efficient in computing, storage, and communication.
Article
Full-text available
Cloud storage is an ideal platform to accommodate massive data. However, with the increasing number of devices and their improved processing power, the amount of generated data is becoming gigantic. This calls for a cost‐effective way to outsource massively generated data to a remote server. Cloud service providers utilise deduplication techniques, which remove redundant data by aborting identical upload requests and deleting redundant files. However, current deduplication mechanisms mainly focus on the storage savings of the server and ignore the sustainable, long‐term financial interests of servers and users, which does not help expand outsourcing and deduplication services. Blockchain is an ideal solution for achieving an economical and incentive‐driven deduplication system. Though some current research studies have integrated deduplication with blockchain, they did not utilise blockchain as a financial tool. Meanwhile, they lack an arbitration mechanism to settle disputes between the server and the user, especially with Bitcoin payments, where a payment is not confirmed immediately and a dispute may occur. This makes it harder to achieve a fair and transparent incentive‐based deduplication service. In this work, we construct a deduplication system with financial incentives for the server and the user based on Bitcoin. The data owner pays the server via Bitcoin for outsourcing the file, but this fee can be offset by charging deduplication users a fee to acquire the deduplication service. The server and the user can thus both receive revenue from the deduplication service. Disputes over the fair distribution of incentives can be settled by our arbitration protocol, which uses chameleon hashes as arbitration tags. We give a concrete construction and security requirements for our proposed BDAI. The security analysis shows that BDAI is theoretically secure, and the performance evaluation shows that it is acceptably efficient for deduplication. Meanwhile, we evaluate and conclude that 1% of the outsourcing fee (or less) is a reasonable and preferable price for each deduplication user to pay as compensation to the data owner.
Article
In this paper, we introduce a concept of transparent integrity auditing and propose a concrete scheme based on the blockchain, which goes one step beyond existing public auditing schemes, since the auditing does not rely on third-party auditors while freeing users from heavy communication costs on auditing the data integrity. Then we construct a secure transparent deduplication scheme based on the blockchain that supports deduplication over encrypted data and enables users to attest the deduplication pattern on the cloud server. Such a scheme allows users to directly benefit from data deduplication and protects data content against anyone who does not own the data. Finally, we integrate the proposed transparent integrity auditing scheme and transparent deduplication scheme into one system, dubbed BLIND. We evaluate BLIND from security and efficiency, which demonstrates that BLIND achieves a strong security guarantee with high efficiency.
Chapter
A Hierarchical Key Assignment Scheme (HKAS) is a method to assign some private information and secret keys to a set of classes in a partially ordered hierarchy, so that the private information of a higher class together with some public information can be used to derive the keys of all classes lower down in the hierarchy. Historically, HKAS has been introduced to enforce multi-level access control, where it can be safely assumed that the public information is made available in some authenticated form. Subsequently, HKAS has found application in several other contexts where, instead, it would be convenient to certify the trustworthiness of public information. Such application contexts include key management for IoT and for emerging distributed data acquisition systems such as wireless sensor networks. In this paper, motivated by the need of accommodating this additional security requirement, we first introduce a new cryptographic primitive: Verifiable Hierarchical Key Assignment Scheme (VHKAS). A VHKAS is a key assignment scheme with a verification procedure that allows honest users to verify whether public information has been maliciously modified to induce an honest user to obtain an incorrect key. Then, we design and analyse VHKASs which are provably secure. Our solutions support key update for compromised secret keys by making a limited number of changes to public and private information.
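For the special case of a tree-shaped hierarchy, the key-assignment idea can be sketched with one-way, top-down derivation: each class key is computed from its parent's key and a public label, so a higher class can recompute every key below it but nothing above. This toy sketch ignores general partial orders, the public information real HKASs publish, and the verification layer that a VHKAS adds:

import hashlib
import hmac

def child_key(parent_key, child_label):
    # One-way, top-down derivation: a class key yields all descendant keys, never ancestors.
    return hmac.new(parent_key, b"derive:" + child_label, hashlib.sha256).digest()

def key_for_path(root_key, path):
    k = root_key
    for label in path:                 # e.g. ["dept-A", "team-3", "sensor-17"]
        k = child_key(k, label.encode())
    return k

if __name__ == "__main__":
    root = b"\x00" * 32
    k_team = key_for_path(root, ["dept-A", "team-3"])
    k_sensor = key_for_path(root, ["dept-A", "team-3", "sensor-17"])
    assert child_key(k_team, b"sensor-17") == k_sensor   # higher class derives lower-class keys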
Article
In this article, we propose the dual traceable distributed attribute-based encryption with subset keyword search system (DT-DABE-SKS, abbreviated as DT) to simultaneously realize data source trace (secure provenance), user trace (traitor trace) and flexible subset keyword search from polynomial interpolation. Leveraging non-interactive zero-knowledge proof technology, DT preserves privacy for both data providers and users in normal circumstances, but a trusted authority can disclose their real identities if necessary, for example when providers deceitfully upload false data or users maliciously leak their secret attribute keys. Next, we introduce the new concept of updatable and transferable message-locked encryption (UT-MLE) for block-level dynamic encrypted file update, where the owner does not have to download the whole ciphertext, decrypt, re-encrypt and upload it again for minor document modifications. In addition, the owner is permitted to transfer file ownership to other system customers with efficient computation in an authenticated manner. A nontrivial integration of DT and UT-MLE leads to the distributed ABSE with ownership transfer system (DTOT), which enjoys the above merits. We formally define DT, UT-MLE, and their security models. Then, the instantiations of DT and UT-MLE and the formal security proofs are presented. Comprehensive comparison and experimental analysis based on a real dataset affirm their feasibility.
Article
Data deduplication can efficiently eliminate data redundancies in cloud storage and reduce the bandwidth requirements of users. However, most previous schemes depend on the help of a trusted key server (KS) and are vulnerable and limited because they suffer from information leakage, poor resistance to attacks, large computational overhead, etc. In particular, if the trusted KS fails, the whole system stops working, i.e., it is a single point of failure. In this paper, we propose a Secure and Efficient data Deduplication scheme (named SED) in a JointCloud storage system, which provides global services via collaboration among various clouds. SED also supports dynamic data update and sharing without the help of a trusted KS. Moreover, SED can overcome the single point of failure that commonly occurs in classic cloud storage systems. According to the theoretical analyses, SED ensures semantic security in the random oracle model and has strong anti-attack ability, including resistance to brute-force and collusion attacks. Besides, SED can effectively eliminate data redundancies with low computational complexity and low communication and storage overhead. The efficiency and functionality of SED improve client-side usability. Finally, the comparison results show that the performance of our scheme is superior to that of existing schemes.
Article
It is observed that modern vehicles are becoming more and more powerful in computing, communications, and storage capacity. By interacting with other vehicles or with local infrastructures (i.e., fog) such as road-side units, vehicles and fog devices can collaboratively provide services like crowdsensing in an efficient and secure way. Unfortunately, it is hard to develop a secure and privacy-preserving crowdsensing report deduplication mechanism in such a system. In this paper, we propose a scheme, FVC-Dedup, to address this challenge. Specifically, we develop cryptographic primitives to realize secure task allocation and guarantee the confidentiality of crowdsensing reports. During report submission, we improve the message-locked encryption (MLE) scheme to realize privacy-preserving report deduplication and resist fake-duplicate attacks. Besides, we construct a novel signature scheme to achieve efficient signature aggregation and record the contributions of each participant fairly without knowing the crowdsensing data. The security analysis and performance evaluation demonstrate that FVC-Dedup can achieve secure and privacy-preserving report deduplication with moderate computing and communication overhead.
Chapter
Land use/land cover classification is essential for Earth exploration and scientific investigation. Earlier, land‐use and land cover changes were monitored with the assistance of the pixel‐based approach; now object‐based approaches have taken their place. In the pixel‐based approach, image pixels are examined to monitor the changes in land use and land cover. The newly developed method in the field of remote sensing is the family of object‐based change detection (OBCD) techniques. These approaches have entirely changed the study of remote sensing and satellite image processing. The pixel‐based approach makes a comparison based on changing pixels between two or a series of images. In contrast, the object‐based approach forms objects (classes), e.g., water, urban, agriculture, soil, etc., in the two images and makes a comparison between them. One important aspect of these techniques is how accurately they provide information about changes in the surrounding environment. This paper begins with the development of traditional pixel‐based methods and ends with the evolution of the latest object‐based change detection techniques. LANDSAT and PALSAR images are used to illustrate the changes in land use/land cover using pixel‐based and object‐based approaches.
Chapter
Deduplication has been extensively studied in terms of cloud data security and privacy. However, whether a deduplication scheme can be practically adopted is rarely investigated. Researchers have discussed the impact of economic facts and incentive mechanisms in motivating the acceptance of deduplication, but they either did not deeply explore this issue or failed to propose an effective solution. This chapter employs game theory to capture the interactions between stakeholders in two types of deduplication schemes: a server-controlled deduplication scheme (S-DEDU) and a client-controlled deduplication scheme. We propose a bounded discount–based incentive mechanism in S-DEDU, which attracts data users to participate. We also design an individualized discount–based incentive mechanism for motivating data holders to behave cooperatively and protecting data privacy at the same time. Furthermore, we conduct realistic dataset–based experiments to demonstrate the validity of our incentive mechanisms for motivating the adoption of deduplication. In the end, we summarize this chapter and propose prospective research outlooks.
Article
Cloud storage is a cost-effective platform to accommodate massive data at low cost. However, advances in cloud services propel data generation, which pushes storage servers to their limit. Deduplication is a popular technique used by most current cloud servers, which detects and deletes redundant data to save storage and bandwidth. For security, proof-of-ownership (PoW) can be used to guarantee ownership of data, so that no malicious user can pass deduplication easily or exploit the mechanism for malicious purposes. Generally, PoW is implemented for static data archives, where the data file is supposed to be read-only. However, to satisfy users’ needs for dynamic manipulation of data and to support real-time data services, efficient PoW for dynamic archives is required. Inspired by malleable signatures, which offer authentication even after the committed message changes, we propose the notion of bidirectional and malleable proof-of-ownership (BM-PoW) for the above challenge. Our proposed BM-PoW consists of a bidirectional PoW (B-PoW), a malleable PoW (M-PoW) and a dispute arbitration protocol (DAP). While B-PoW is proposed for the static setting, M-PoW caters specifically for dynamic manipulation of data. In addition, our arbitration protocol DAP achieves accountable redaction, which can arbitrate the originality of file ownership. We provide a security analysis of our proposal and a performance evaluation that suggests B-PoW is secure and efficient for large files in a static data archive. In addition, M-PoW achieves acceptable performance in the dynamic setting, where data is outsourced first and updated later in a dynamic data archive.
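Proof-of-ownership protocols of this kind are typically built from a Merkle tree over the file blocks: the verifier keeps (essentially) just the root, challenges a few random leaf positions, and the prover must return those blocks with their authentication paths. The sketch below shows only that generic challenge-response skeleton (in the spirit of Halevi et al.'s PoW), not the bidirectional or malleable machinery of BM-PoW:

import hashlib
import secrets

def _h(data):
    return hashlib.sha256(data).digest()

def build_tree(blocks):
    # Return all levels of the Merkle tree, leaves first (odd levels padded by duplication).
    level = [_h(b"leaf:" + b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(levels, index):
    # Authentication path for leaf `index`: the sibling hash at every level.
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        path.append((level[index ^ 1], index % 2))   # (sibling hash, am-I-the-right-child?)
        index //= 2
    return path

def verify(root, block, path):
    node = _h(b"leaf:" + block)
    for sibling, is_right in path:
        node = _h(sibling + node) if is_right else _h(node + sibling)
    return node == root

if __name__ == "__main__":
    blocks = [bytes([i]) * 64 for i in range(10)]          # the outsourced file, in blocks
    levels = build_tree(blocks)
    root = levels[-1][0]                                   # all the verifier needs to keep
    challenge = [secrets.randbelow(len(blocks)) for _ in range(3)]
    responses = [(blocks[i], prove(levels, i)) for i in challenge]   # computed by the prover
    assert all(verify(root, blk, path) for blk, path in responses)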
Chapter
In the present digital scenario, data is of prime significance for individuals and even more so for organizations. With the passage of time, the amount of data being produced increases exponentially, which poses a serious concern, as the huge amount of redundant data stored on the cloud imposes a severe load on the cloud storage systems themselves, which cannot be accepted. Therefore, a storage optimization strategy is a fundamental prerequisite for cloud storage systems. Data deduplication is a storage optimization strategy that deletes identical copies of redundant data, optimizes bandwidth, improves utilization of storage space, and hence minimizes storage cost. To guarantee security, the data stored on the cloud must be kept in encrypted form. Consequently, executing deduplication safely over the encrypted information in the cloud is a challenging job. This chapter discusses various existing data deduplication techniques, with a notion of securing the data on the cloud, that address this challenge.
Conference Paper
Full-text available
We initiate a study of randomness condensers for sources that are efficiently samplable but may depend on the seed of the condenser. That is, we seek functions Cond : {0,1}^n × {0,1}^d → {0,1}^m such that if we choose a random seed S ← {0,1}^d, and a source X = A(S) is generated by a randomized circuit A of size t such that X has min-entropy at least k given S, then Cond(X; S) should have min-entropy at least some k′ given S. The distinction from the standard notion of randomness condensers is that the source X may be correlated with the seed S (but is restricted to be efficiently samplable). Randomness extractors of this type (corresponding to the special case where k′ = m) have been implicitly studied in the past (by Trevisan and Vadhan, FOCS '00). We show that: Unlike extractors, we can have randomness condensers for samplable, seed-dependent sources whose computational complexity is smaller than the size t of the adversarial sampling algorithm A. Indeed, we show that sufficiently strong collision-resistant hash functions are seed-dependent condensers that produce outputs with min-entropy k′ = m − O(log t), i.e. logarithmic entropy deficiency. Randomness condensers suffice for key derivation in many cryptographic applications: when an adversary has negligible success probability (or negligible "squared advantage" [3]) for a uniformly random key, we can use instead a key generated by a condenser whose output has logarithmic entropy deficiency. Randomness condensers for seed-dependent samplable sources that are robust to side information generated by the sampling algorithm imply soundness of the Fiat-Shamir Heuristic when applied to any constant-round, public-coin interactive proof system.
Article
Full-text available
Farsite is a secure, scalable file system that logically functions as a centralized file server but is physically distributed among a set of untrusted computers. Farsite provides file availability and reliability through randomized replicated storage; it ensures the secrecy of file contents with cryptographic techniques; it maintains the integrity of file and directory data with a Byzantine-fault-tolerant protocol; it is designed to be scalable by using a distributed hint mechanism and delegation certificates for pathname translations; and it achieves good performance by locally caching file data, lazily propagating file updates, and varying the duration and granularity of content leases. We report on the design of Farsite and the lessons we have learned by implementing much of that design.
Article
Full-text available
As the volume of data increases, so does the demand for online storage services, from simple backup services to cloud storage infrastructures. Although deduplication is most effective when applied across multiple users, cross-user deduplication has serious privacy implications. Some simple mechanisms can enable cross-user deduplication while greatly reducing the risk of data leakage. Cloud storage refers to scalable and elastic storage capabilities delivered as a service using Internet technologies with elastic provisioning and usebased pricing that doesn't penalize users for changing their storage consumption without notice.
Conference Paper
Full-text available
We strengthen the foundations of deterministic public-key encryption via definitional equivalences and standard-model constructs based on general assumptions. Specifically we consider seven notions of privacy for deterministic encryption, including six forms of semantic security and an indistinguishability notion, and show them all equivalent. We then present a deterministic scheme for the secure encryption of uniformly and independently distributed messages based solely on the existence of trapdoor one-way permutations. We show a generalization of the construction that allows secure deterministic encryption of independent high-entropy messages. Finally we show relations between deterministic and standard (randomized) encryption.
Conference Paper
Full-text available
We present as-strong-as-possible definitions of privacy, and constructions achieving them, for public-key encryption schemes where the encryption algorithm is deterministic. We obtain as a consequence database encryption methods that permit fast (i.e. sub-linear, and in fact logarithmic, time) search while provably providing privacy that is as strong as possible subject to this fast search constraint. One of our constructs, called RSA-DOAEP, has the added feature of being length preserving, so that it is the first example of a public-key cipher. We generalize this to obtain a notion of efficiently-searchable encryption schemes which permit more flexible privacy to search-time trade-offs via a technique called bucketization. Our results answer much-asked questions in the database community and provide foundations for work done there.
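Bucketization, mentioned above, can be illustrated independently of the paper's actual constructions: store a short deterministic tag next to an otherwise randomized ciphertext, and restrict equality search to the matching bucket. The Python sketch below is a hypothetical illustration; in particular, the encrypt placeholder is not RSA-DOAEP or any scheme from the paper.

    import hashlib
    import os

    BUCKET_BITS = 8   # shorter tag = larger buckets = more privacy, slower search

    def bucket_tag(message: bytes) -> int:
        """Deterministic short tag derived from a hash of the message."""
        digest = hashlib.sha256(message).digest()
        return digest[0] % (1 << BUCKET_BITS)

    def encrypt(message: bytes) -> bytes:
        """Placeholder for a randomized encryption of the message (NOT a real scheme)."""
        return os.urandom(16) + message   # stand-in: nonce || payload

    index = {}   # tag -> list of ciphertexts in that bucket

    def store(message: bytes) -> None:
        index.setdefault(bucket_tag(message), []).append(encrypt(message))

    def candidates_for(query: bytes) -> list:
        """Equality search only needs to inspect one bucket of ciphertexts."""
        return index.get(bucket_tag(query), [])

    store(b"alice@example.com")
    store(b"bob@example.com")
    print(len(candidates_for(b"alice@example.com")))   # scans a single bucket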
Conference Paper
Full-text available
The study of deterministic public-key encryption was initiated by Bellare et al. (CRYPTO '07), who provided the "strongest possible" notion of security for this primitive (called PRIV) and constructions in the random oracle (RO) model. We focus on constructing efficient deterministic encryption schemes without random oracles. To do so, we propose a slightly weaker notion of security, saying that no partial information about encrypted messages should be leaked as long as each message is a-priori hard-to-guess given the others (while PRIV did not have the latter restriction). Nevertheless, we argue that this version seems adequate for many practical applications. We show equivalence of this definition to single-message and indistinguishability-based ones, which are easier to work with. Then we give general constructions of both chosen-plaintext (CPA) and chosen-ciphertext-attack (CCA) secure deterministic encryption schemes, as well as efficient instantiations of them under standard number-theoretic assumptions. Our constructions build on the recently-introduced framework of Peikert and Waters (STOC '08) for constructing CCA-secure probabilistic encryption schemes, extending it to the deterministic-encryption setting as well.
Conference Paper
Full-text available
POST is a cooperative, decentralized messaging system that supports traditional services like electronic mail (email), news, instant messaging, as well as collaborative applications such as shared calendars and whiteboards. Unlike existing implementations of such services, POST is highly resilient, secure, scalable and does not rely on dedicated servers. POST is built upon a peer-to-peer (p2p) overlay network, consisting of participants' desktop computers. We sketch POST's basic messaging infrastructure, which provides shared, secure, single-copy message storage, user-specific metadata, and notification. As an example application, we sketch how POST can be used to construct a cooperative, secure email service called ePOST.
Conference Paper
Full-text available
As the world moves to digital storage for archival purposes, there is an increasing demand for systems that can provide secure data storage in a cost-effective manner. By identifying common chunks of data both within and between files and storing them only once, deduplication can yield cost savings by increasing the utility of a given amount of storage. Unfortunately, deduplication exploits identical content, while encryption attempts to make all content appear random; the same content encrypted with two different keys results in very different ciphertext. Thus, combining the space efficiency of deduplication with the secrecy aspects of encryption is problematic. We have developed a solution that provides both data security and space efficiency in single-server storage and distributed storage systems. Encryption keys are generated in a consistent manner from the chunk data; thus, identical chunks will always encrypt to the same ciphertext. Furthermore, the keys cannot be deduced from the encrypted chunk data. Since the information each user needs to access and decrypt the chunks that make up a file is encrypted using a key known only to the user, even a full compromise of the system cannot reveal which chunks are used by which users.
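The property described above, identical chunks always encrypt to the same ciphertext, is easy to demonstrate in a toy sketch. The Python fragment below is an illustration only; it derives the key by hashing the chunk and uses a SHA-256 counter-mode keystream in place of a production cipher.

    import hashlib

    def convergent_key(chunk: bytes) -> bytes:
        """The key is derived deterministically from the chunk itself."""
        return hashlib.sha256(chunk).digest()

    def keystream(key: bytes, length: int) -> bytes:
        """Toy SHA-256 counter-mode keystream; a real system would use a vetted cipher."""
        out = b""
        counter = 0
        while len(out) < length:
            out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return out[:length]

    def convergent_encrypt(chunk: bytes) -> bytes:
        key = convergent_key(chunk)
        return bytes(a ^ b for a, b in zip(chunk, keystream(key, len(chunk))))

    c1 = convergent_encrypt(b"identical chunk")
    c2 = convergent_encrypt(b"identical chunk")
    assert c1 == c2   # same plaintext -> same ciphertext, so the server can deduplicate

In the system described above, each user would additionally store the per-chunk keys encrypted under a key known only to that user, which is why a full compromise of the server does not reveal which chunks belong to which users.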
Article
Motivated by applications in large storage systems, we initiate the study of incremental deterministic public-key encryption. Deterministic public-key encryption, introduced by Bellare, Boldyreva, and O’Neill (CRYPTO ’07), provides an alternative to randomized public-key encryption in various scenarios where the latter exhibits inherent drawbacks. A deterministic encryption algorithm, however, cannot satisfy any meaningful notion of security for low-entropy plaintext distributions, but Bellare et al. demonstrated that a strong notion of security can in fact be realized for relatively high-entropy plaintext distributions. In order to achieve a meaningful level of security, a deterministic encryption algorithm should be typically used for encrypting rather long plaintexts for ensuring a sufficient amount of entropy. This requirement may be at odds with efficiency constraints, such as communication complexity and computation complexity in the presence of small updates. Thus, a highly desirable property of deterministic encryption algorithms is incrementality: Small changes in the plaintext translate into small changes in the corresponding ciphertext. We present a framework for modeling the incrementality of deterministic public-key encryption. Our framework extends the study of the incrementality of cryptography primitives initiated by Bellare, Goldreich and Goldwasser (CRYPTO ’94). Within our framework, we propose two schemes, which we prove to enjoy an optimal tradeoff between their security and incrementality up to lower-order factors. Our first scheme is a generic method which can be based on any deterministic public-key encryption scheme, and, in particular, can be instantiated with any semantically secure (randomized) public-key encryption scheme in the random-oracle model. Our second scheme is based on the Decisional Diffie–Hellman assumption in the standard model. The approach underpinning our schemes is inspired by the fundamental “sample-then-extract” technique due to Nisan and Zuckerman (JCSS ’96) and refined by Vadhan (J. Cryptology ’04), and by the closely related notion of “locally computable extractors” due to Vadhan. Most notably, whereas Vadhan used such extractors to construct private-key encryption schemes in the bounded-storage model, we show that techniques along these lines can also be used to construct incremental public-key encryption schemes.
Article
Backup is cumbersome and expensive. Individual users almost never back up their data, and backup is a significant cost in large organizations. This paper presents Pastiche, a simple and inexpensive backup system. Pastiche exploits excess disk capacity to perform peer-to-peer backup with no administrative costs. Each node minimizes storage overhead by selecting peers that share a significant amount of data. It is easy for common installations to find suitable peers, and peers with high overlap can be identified with only hundreds of bytes. Pastiche provides mechanisms for confidentiality, integrity, and detection of failed or malicious peers. A Pastiche prototype suffers only 7.4% overhead for a modified Andrew Benchmark, and restore performance is comparable to cross-machine copy.
Conference Paper
We show that it is possible to guarantee meaningful security even for plaintext distributions that depend on the public key. We extend the previously proposed notions of security, allowing adversaries to adaptively choose plaintext distributions after seeing the public key, in an interactive manner. The only restrictions we make are that: (1) plaintext distributions are unpredictable (as is essential in deterministic public-key encryption), and (2) the number of plaintext distributions from which each adversary is allowed to adaptively choose is upper bounded by 2^p, where p can be any predetermined polynomial in the security parameter. For example, with p = 0 we capture plaintext distributions that are independent of the public key, and with p = O(s log s) we capture, in particular, all plaintext distributions that are samplable by circuits of size s. Within our framework we present both constructions in the random-oracle model based on any public-key encryption scheme, and constructions in the standard model based on lossy trapdoor functions (thus, based on a variety of number-theoretic assumptions). Previously known constructions heavily relied on the independence between the plaintext distributions and the public key for the purposes of randomness extraction. In our setting, however, randomness extraction becomes significantly more challenging once the plaintext distributions and the public key are no longer independent. Our approach is inspired by research on randomness extraction from seed-dependent distributions. Underlying our approach is a new generalization of a method for such randomness extraction, originally introduced by Trevisan and Vadhan (FOCS ’00) and Dodis (PhD Thesis, MIT, ’00).
Conference Paper
We formalize a new cryptographic primitive that we call message-locked encryption (MLE), where the key under which encryption and decryption are performed is itself derived from the message. MLE provides a way to achieve secure deduplication (space-efficient secure outsourced storage), a goal currently targeted by numerous cloud-storage providers. We provide definitions both for privacy and for a form of integrity that we call tag consistency. Based on this foundation, we make both practical and theoretical contributions. On the practical side, we provide ROM security analyses of a natural family of MLE schemes that includes deployed schemes. On the theoretical side the challenge is standard model solutions, and we make connections with deterministic encryption, hash functions secure on correlated inputs and the sample-then-extract paradigm to deliver schemes under different assumptions and for different classes of message sources. Our work shows that MLE is a primitive of both practical and theoretical interest.
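For concreteness, the following toy Python sketch shows the MLE syntax described above: the key is derived from the message (and the public parameters), encryption is performed under that derived key, and a tag computed from the ciphertext lets the server detect duplicates. This is an illustrative ROM-flavoured sketch with a SHA-256 counter-mode keystream, not one of the paper's formal constructions.

    import hashlib

    def _stream(key: bytes, n: int) -> bytes:
        """Toy counter-mode keystream from SHA-256 (illustration only)."""
        out, i = b"", 0
        while len(out) < n:
            out += hashlib.sha256(key + i.to_bytes(8, "big")).digest()
            i += 1
        return out[:n]

    def mle_key(params: bytes, message: bytes) -> bytes:
        """Key derivation: the key depends on the message and the public parameters."""
        return hashlib.sha256(params + message).digest()

    def mle_encrypt(params: bytes, key: bytes, message: bytes) -> bytes:
        return bytes(a ^ b for a, b in zip(message, _stream(key, len(message))))

    def mle_decrypt(params: bytes, key: bytes, ciphertext: bytes) -> bytes:
        return bytes(a ^ b for a, b in zip(ciphertext, _stream(key, len(ciphertext))))

    def mle_tag(params: bytes, ciphertext: bytes) -> bytes:
        """Tag generation: equal messages yield equal tags, enabling deduplication."""
        return hashlib.sha256(params + ciphertext).digest()

    P = b"public-params"
    M = b"some file contents"
    K = mle_key(P, M)
    C = mle_encrypt(P, K, M)
    assert mle_decrypt(P, K, C) == M
    assert mle_tag(P, C) == mle_tag(P, mle_encrypt(P, mle_key(P, M), M))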
Conference Paper
We introduce the notion of dual projective hashing. This is similar to Cramer-Shoup projective hashing, except that instead of smoothness, which stipulates that the output of the hash function looks random on NO instances, we require invertibility, which stipulates that the output of the hash function on NO instances uniquely determines the hashing key, and moreover, that there is a trapdoor which allows us to efficiently recover the hashing key. We show a simple construction of lossy trapdoor functions via dual projective hashing. Our construction encompasses almost all known constructions of lossy trapdoor functions, as given in the works of Peikert and Waters (STOC ’08) and Freeman et al. (PKC ’10). We also provide a simple construction of deterministic encryption schemes secure with respect to hard-to-invert auxiliary input, under an additional assumption about the projection map. Our construction clarifies and encompasses all of the constructions given in the recent work of Brakerski and Segev (Crypto ’11). In addition, we obtain a new deterministic encryption scheme based on LWE.
Conference Paper
Deterministic public-key encryption, introduced by Bellare, Boldyreva, and O’Neill (CRYPTO ’07), provides an alternative to randomized public-key encryption in various scenarios where the latter exhibits inherent drawbacks. A deterministic encryption algorithm, however, cannot satisfy any meaningful notion of security when the plaintext is distributed over a small set. Bellare et al. addressed this difficulty by requiring semantic security to hold only when the plaintext has high min-entropy from the adversary’s point of view.
Conference Paper
Disk-based deduplication storage has emerged as the new-generation storage system for enterprise data protection to replace tape libraries. Deduplication removes redundant data segments to compress data into a highly compact form and makes it economical to store backups on disk instead of tape. A crucial requirement for enterprise data protection is high throughput, typically over 100 MB/sec, which enables backups to complete quickly. A significant challenge is to identify and eliminate duplicate data segments at this rate on a low-cost system that cannot afford enough RAM to store an index of the stored segments and may be forced to access an on-disk index for every input segment. This paper describes three techniques employed in the production Data Domain deduplication file system to relieve the disk bottleneck. These techniques include: (1) the Summary Vector, a compact in-memory data structure for identifying new segments; (2) Stream-Informed Segment Layout, a data layout method to improve on-disk locality for sequentially accessed segments; and (3) Locality Preserved Caching, which maintains the locality of the fingerprints of duplicate segments to achieve high cache hit ratios. Together, they can remove 99% of the disk accesses for deduplication of real world workloads. These techniques enable a modern two-socket dual-core system to run at 90% CPU utilization with only one shelf of 15 disks and achieve 100 MB/sec for single-stream throughput and 210 MB/sec for multi-stream throughput.
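The Summary Vector described above behaves like a Bloom filter over segment fingerprints: it can declare a segment definitely new without touching disk, and only a (possible) hit falls back to the on-disk index. The Python sketch below is a generic Bloom-filter illustration under that reading, not the Data Domain implementation.

    import hashlib

    class SummaryVector:
        """Bloom-filter-style summary: no false negatives, occasional false positives."""

        def __init__(self, bits: int = 1 << 20, hashes: int = 4):
            self.bits = bits
            self.hashes = hashes
            self.vector = bytearray(bits // 8)

        def _positions(self, fingerprint: bytes):
            for i in range(self.hashes):
                h = hashlib.sha256(i.to_bytes(1, "big") + fingerprint).digest()
                yield int.from_bytes(h[:8], "big") % self.bits

        def add(self, fingerprint: bytes) -> None:
            for p in self._positions(fingerprint):
                self.vector[p // 8] |= 1 << (p % 8)

        def maybe_contains(self, fingerprint: bytes) -> bool:
            """False -> definitely a new segment, so the on-disk index lookup is skipped."""
            return all(self.vector[p // 8] & (1 << (p % 8)) for p in self._positions(fingerprint))

    sv = SummaryVector()
    fp = hashlib.sha256(b"segment data").digest()
    assert not sv.maybe_contains(fp)   # new segment: no disk access needed
    sv.add(fp)
    assert sv.maybe_contains(fp)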
Conference Paper
Public-key encryption schemes rely for their IND-CPA security on per-message fresh randomness. In practice, randomness may be of poor quality for a variety of reasons, leading to failure of the schemes. Expecting the systems to improve is unrealistic. What we show in this paper is that we can, instead, improve the cryptography to offset the lack of possible randomness. We provide public-key encryption schemes that achieve IND-CPA security when the randomness they use is of high quality, but, when the latter is not the case, rather than breaking completely, they achieve a weaker but still useful notion of security that we call IND-CDA. This hedged public-key encryption provides the best possible security guarantees in the face of bad randomness. We provide simple RO-based ways to make in-practice IND-CPA schemes hedge secure with minimal software changes. We also provide non-RO model schemes relying on lossy trapdoor functions (LTDFs) and techniques from deterministic encryption. They achieve adaptive security by establishing and exploiting the anonymity of LTDFs which we believe is of independent interest.
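The RO-based hedging mentioned above can be summarized as deriving the encryption coins from a hash of the public key, the message, and the available randomness, so that bad system randomness degrades security gracefully rather than catastrophically. The Python fragment below only illustrates that coin-derivation step; encrypt_with_coins is a hypothetical placeholder, not a scheme from the paper.

    import hashlib
    import os

    def derive_coins(public_key: bytes, message: bytes, randomness: bytes) -> bytes:
        """Hedge: coins = H(pk || m || r). If r is good, the coins are good; if r is
        bad, the coins still depend on the (hopefully high-entropy) message."""
        return hashlib.sha256(public_key + message + randomness).digest()

    def encrypt_with_coins(public_key: bytes, message: bytes, coins: bytes) -> bytes:
        """Toy stand-in for an IND-CPA scheme run on explicit coins (not a real scheme)."""
        pad = hashlib.sha256(coins + public_key).digest()
        return coins[:16] + bytes(a ^ b for a, b in zip(message, pad))

    pk = b"demo-public-key"
    msg = b"payload"
    r = os.urandom(32)                 # possibly poor-quality system randomness
    ct = encrypt_with_coins(pk, msg, derive_coins(pk, msg, r))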
Conference Paper
We present a (probabilistic) public key encryption (PKE) scheme such that when being implemented in a bilinear group, anyone is able to check whether two ciphertexts are encryptions of the same message. Interestingly, bilinear map operations are not required in the key generation, encryption or decryption procedures of the PKE scheme, but are only required when one wants to do an equality test (on the encrypted messages) between two ciphertexts that may be generated using different public keys. We show that our PKE scheme can be used in different applications such as searchable encryption and partitioning encrypted data. Moreover, we show that when being implemented in a non-bilinear group, the security of our PKE scheme can be strengthened from One-Way CCA to a weak form of IND-CCA.
Conference Paper
Tahoe is a system for secure, distributed storage. It uses capabilities for access control, cryptography for confidentiality and integrity, and erasure coding for fault-tolerance. It has been deployed in a commercial backup service and is currently operational. The implementation is Open Source.
Conference Paper
Users rarely consider running network file systems over slow or wide-area networks, as the performance would be unacceptable and the bandwidth consumption too high. Nonetheless, efficient remote file access would often be desirable over such networks---particularly when high latency makes remote login sessions unresponsive. Rather than run interactive programs such as editors remotely, users could run the programs locally and manipulate remote files through the file system. To do so, however, would require a network file system that consumes less bandwidth than most current file systems. This paper presents LBFS, a network file system designed for low-bandwidth networks. LBFS exploits similarities between files or versions of the same file to save bandwidth. It avoids sending data over the network when the same data can already be found in the server's file system or the client's cache. Using this technique in conjunction with conventional compression and caching, LBFS consumes over an order of magnitude less bandwidth than traditional network file systems on common workloads.
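LBFS finds "the same data" across files and versions by cutting them at content-defined boundaries chosen by a rolling hash, so a local edit only disturbs nearby chunk boundaries. The sketch below is a simplified, hypothetical illustration in Python using a polynomial rolling hash rather than LBFS's Rabin fingerprints.

    WINDOW = 16            # sliding-window size for the rolling hash
    MASK = (1 << 11) - 1   # cut when the low 11 hash bits are zero (~2 KiB average chunk)
    BASE = 257
    MOD = (1 << 31) - 1
    POWER = pow(BASE, WINDOW - 1, MOD)

    def chunks(data: bytes):
        """Content-defined chunking: boundaries depend on local content, so an
        insertion only changes the chunks around it, not every later chunk."""
        h, start = 0, 0
        for i, byte in enumerate(data):
            if i >= WINDOW:                       # slide the window: drop the oldest byte
                h = (h - data[i - WINDOW] * POWER) % MOD
            h = (h * BASE + byte) % MOD
            if i + 1 - start >= WINDOW and (h & MASK) == 0:
                yield data[start:i + 1]
                start = i + 1
        if start < len(data):
            yield data[start:]

    data = b"example data " * 500
    parts = list(chunks(data))
    assert b"".join(parts) == data
    # Chunks shared between two versions hash to the same value and need not be resent.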
Conference Paper
Backup is cumbersome and expensive. Individual users almost never back up their data, and backup is a significant cost in large organizations. This paper presents Pastiche, a simple and inexpensive backup system. Pastiche exploits excess disk capacity to perform peer-to-peer backup with no administrative costs. Each node minimizes storage overhead by selecting peers that share a significant amount of data. It is easy for common installations to find suitable peers, and peers with high overlap can be identified with only hundreds of bytes. Pastiche provides mechanisms for confidentiality, integrity, and detection of failed or malicious peers. A Pastiche prototype suffers only 7.4% overhead for a modified Andrew Benchmark, and restore performance is comparable to cross-machine copy.
Conference Paper
This paper addresses deterministic public-key encryption schemes (DE), which are designed to provide meaningful security when the only source of randomness in the encryption process comes from the message itself. We propose a general construction of DE that unifies prior work and gives novel schemes. Specifically, its instantiations include the first construction from any trapdoor function that has sufficiently many hardcore bits, and the first construction that provides "bounded" multi-message security (assuming lossy trapdoor functions).
Article
We put forward the notion of targeted malleability: given a homomorphic encryption scheme, in various scenarios we would like to restrict the homomorphic computations one can perform on encrypted data. We introduce a precise framework, generalizing the foundational notion of non-malleability introduced by Dolev, Dwork, and Naor (SICOMP '00), ensuring that the malleability of a scheme is targeted only at a specific set of "allowable" functions. In this setting we are mainly interested in the efficiency of such schemes as a function of the number of repeated homomorphic operations. Whereas constructing a scheme whose ciphertext grows linearly with the number of such operations is straightforward, obtaining more realistic (or merely non-trivial) length guarantees is significantly more challenging. We present two constructions that transform any homomorphic encryption scheme into one that offers targeted malleability. Our constructions rely on standard cryptographic tools and on succinct non-interactive arguments, which are currently known to exist in the standard model based on variants of the knowledge-of-exponent assumption. The two constructions offer somewhat different efficiency guarantees, each of which may be preferable depending on the underlying building blocks.
Conference Paper
Motivated by applications in large storage systems, we initiate the study of incremental deterministic public-key encryption. Deterministic public-key encryption, introduced by Bellare, Boldyreva, and O’Neill (CRYPTO ’07), provides a realistic alternative to randomized public-key encryption in various scenarios where the latter exhibits inherent drawbacks. A deterministic encryption algorithm, however, cannot satisfy any meaningful notion of security for low-entropy plaintext distributions, and Bellare et al. demonstrated that a strong notion of security can in fact be realized for relatively high-entropy plaintext distributions.
Conference Paper
The Farsite distributed file system provides availability by replicating each file onto multiple desktop computers. Since this replication consumes significant storage space, it is important to reclaim used space where possible. Measurement of over 500 desktop file systems shows that nearly half of all consumed space is occupied by duplicate files. We present a mechanism to reclaim space from this incidental duplication to make it available for controlled file replication. Our mechanism includes: (1) convergent encryption, which enables duplicate files to be coalesced into the space of a single file, even if the files are encrypted with different users' keys; and (2) SALAD, a Self-Arranging Lossy Associative Database for aggregating file content and location information in a decentralized, scalable, fault-tolerant manner. Large-scale simulation experiments show that the duplicate-file coalescing system is scalable, highly effective, and fault-tolerant.
Article
This paper presents LBFS, a network file system designed for low bandwidth networks. LBFS exploits similarities between files or versions of the same file to save bandwidth. It avoids sending data over the network when the same data can already be found in the server's file system or the client's cache. Using this technique, LBFS achieves up to two orders of magnitude reduction in bandwidth utilization on common workloads, compared to traditional network file systems
Article
We argue that the random oracle model ---where all parties have access to a public random oracle--- provides a bridge between cryptographic theory and cryptographic practice. In the paradigm we suggest, a practical protocol P is produced by first devising and proving correct a protocol P^R for the random oracle model, and then replacing oracle accesses by the computation of an "appropriately chosen" function h. This paradigm yields protocols much more efficient than standard ones while retaining many of the advantages of provable security. We illustrate these gains for problems including encryption, signatures, and zero-knowledge proofs.
Article
Encryption that is only semantically secure should not be used on messages that depend on the underlying secret key; all bets are off when, for example, one encrypts using a shared key K the value K. Here we introduce a new notion of security, KDM security, appropriate for key-dependent messages. The notion makes sense in both the public-key and shared-key settings. For the latter we show that KDM security is easily achievable within the random-oracle model. By developing and achieving stronger notions of encryption-scheme security it is hoped that protocols which are proven secure under "formal" models of security can, in time, be safely realized by generically instantiating their primitives.
Article
The random oracle model is a very convenient setting for designing cryptographic protocols. In this idealized model all parties have access to a common, public random function, called a random oracle. Protocols in this model are often very simple and efficient; also the analysis is often clearer. However, we do not have a general mechanism for transforming protocols that are secure in the random oracle model into protocols that are secure in real life. In fact, we do not even know how to meaningfully specify the properties required from such a mechanism. Instead, it is a common practice to simply replace — often without mathematical justification — the random oracle with a ‘cryptographic hash function’ (e.g., MD5 or SHA). Consequently, the resulting protocols have no meaningful proofs of security. We propose a research program aimed at rectifying this situation by means of identifying, and subsequently realizing, the useful properties of random oracles. As a first step, we introduce a new primitive that realizes a specific aspect of random oracles. This primitive, called oracle hashing, is a hash function that, like random oracles, ‘hides all partial information on its input’. A salient property of oracle hashing is that it is probabilistic: different applications to the same input result in different hash values. Still, we maintain the ability to verify whether a given hash value was generated from a given input. We describe constructions of oracle hashing, as well as applications where oracle hashing successfully replaces random oracles.
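The oracle-hashing interface described above, probabilistic hashing with public verification, can be written down in a few lines. The sketch below is only a salted-hash illustration of that interface (in the spirit of the random-oracle heuristic); it is not the standard-model oracle-hashing construction and does not claim the "hides all partial information" property outside that idealized setting.

    import hashlib
    import hmac
    import os

    def ohash(x: bytes) -> tuple:
        """Probabilistic hash: a fresh salt makes repeated hashes of x look different."""
        r = os.urandom(16)
        return r, hashlib.sha256(r + x).digest()

    def overify(x: bytes, r: bytes, y: bytes) -> bool:
        """Anyone holding x can check whether (r, y) was produced from x."""
        return hmac.compare_digest(hashlib.sha256(r + x).digest(), y)

    h1 = ohash(b"input")
    h2 = ohash(b"input")
    assert h1 != h2                      # probabilistic: hash values differ
    assert overify(b"input", *h1) and overify(b"input", *h2)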
Conference Paper
Encryption that is only semantically secure should not be used on messages that depend on the underlying secret key; all bets are off when, for example, one encrypts using a shared key K the value K. Here we introduce a new notion of security, KDM security, appropriate for key-dependent messages. The notion makes sense in both the public-key and shared-key settings. For the latter we show that KDM security is easily achievable within the random-oracle model. By developing and achieving stronger notions of encryption-scheme security it is hoped that protocols which are proven secure under “formal” models of security can, in time, be safely realized by generically instantiating their primitives.
Bellare, M., Rogaway, P.: Random oracles are practical: A paradigm for designing efficient protocols. In: Denning, D.E., Pyle, R., Ganesan, R., Sandhu, R.S., Ashby, V. (eds.) ACM Conference on Computer and Communications Security, pp. 62–73. ACM (1993)
Cramer, R. (ed.): TCC 2012. LNCS, vol. 7194. Springer, Heidelberg (2012)
Fuller, B., O'Neill, A., Reyzin, L.: A unified approach to deterministic encryption: New constructions and a connection to computational entropy. In: Cramer [15], pp. 582–599
Johansson, T., Nguyen, P.Q. (eds.): EUROCRYPT 2013. LNCS, vol. 7881. Springer, Heidelberg (2013)
Pointcheval, D., Johansson, T. (eds.): EUROCRYPT 2012. LNCS, vol. 7237. Springer, Heidelberg (2012)
Wagner, D. (ed.): CRYPTO 2008. LNCS, vol. 5157. Springer, Heidelberg (2008)
Bellare, M., Fischlin, M., O'Neill, A., Ristenpart, T.: Deterministic encryption: Definitional equivalences and constructions without random oracles. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157. Springer, Heidelberg (2008)
Boldyreva, A., Fehr, S., O'Neill, A.: On notions of security for deterministic encryption, and efficient constructions without random oracles. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157. Springer, Heidelberg (2008)
Bellare, M., Brakerski, Z., Naor, M., Ristenpart, T., Segev, G., Shacham, H., Yilek, S.: Hedged public-key encryption: How to protect against bad randomness. In: Matsui, M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 232–249. Springer, Heidelberg (2009)
Yang, G., Tan, C.H., Huang, Q., Wong, D.S.: Probabilistic public key encryption with equality test. In: Pieprzyk, J. (ed.) CT-RSA 2010. LNCS, vol. 5985, pp. 119–131. Springer, Heidelberg (2010)