A comprehensive study of Convergent and Commutative Replicated Data Types

Source: OAI

ABSTRACT Eventual consistency aims to ensure that replicas of some mutable shared object converge without foreground synchronisation. Previous approaches to eventual consistency are ad-hoc and error-prone. We study a principled approach: to base the design of shared data types on some simple formal conditions that are sufficient to guarantee eventual consistency. We call these types Convergent or Commutative Replicated Data Types (CRDTs). This paper formalises asynchronous object replication, either state based or operation based, and provides a sufficient condition appropriate for each case. It describes several useful CRDTs, including container data types supporting both add and remove operations with clean semantics, and more complex types such as graphs, monotonic DAGs, and sequences. It discusses some properties needed to implement non-trivial CRDTs.
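The state-based convergence condition the abstract refers to can be illustrated with the classic grow-only counter. This is a minimal sketch of my own, not code from the paper: each replica's state is a vector of per-replica counts, and merge takes the element-wise maximum, which is the join of a semilattice, so replicas converge no matter the order in which states are exchanged.

```python
# A minimal state-based CRDT sketch: a grow-only counter (G-Counter).
# Each replica keeps one slot per replica id and only ever increments
# its own slot; merge is element-wise max (commutative, associative,
# idempotent), which guarantees convergence.

class GCounter:
    def __init__(self, replica_id, n_replicas):
        self.replica_id = replica_id
        self.slots = [0] * n_replicas

    def increment(self):
        # A replica mutates only its own slot.
        self.slots[self.replica_id] += 1

    def value(self):
        return sum(self.slots)

    def merge(self, other):
        # Join of the semilattice: element-wise maximum.
        self.slots = [max(a, b) for a, b in zip(self.slots, other.slots)]

# Two replicas diverge, then exchange state in either order:
a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); a.increment()
b.increment()
a.merge(b); b.merge(a)
assert a.value() == b.value() == 3
```

Because merge is idempotent, delivering the same state twice is harmless, which is what lets state-based CRDTs tolerate message duplication and reordering.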



Available from: Carlos Baquero, Aug 29, 2015
  • Source
    • "To understand when these virtues and limitations are relevant in practice, we survey both practitioner accounts and academic literature, perform experimental analysis on modern cloud infrastructure, and analyze representative applications for their semantic requirements. Our experiences with a HAT prototype running across multiple geo-replicated datacenters indicate that HATs offer a one to three orders of magnitude latency decrease compared to traditional distributed serializability protocols, and they can provide acceptable semantics for a wide range of programs, especially those with monotonic logic and commutative updates [4] [57]. HAT systems can also enforce arbitrary foreign key constraints for multi-item updates and, in some cases, provide limited uniqueness guarantees. "
    ABSTRACT: To minimize network latency and remain online during server failures and network partitions, many modern distributed data storage systems eschew transactional functionality, which provides strong semantic guarantees for groups of multiple operations over multiple data items. In this work, we consider the problem of providing Highly Available Transactions (HATs): transactional guarantees that do not suffer unavailability during system partitions or incur high network latency. We introduce a taxonomy of highly available systems and analyze existing ACID isolation and distributed data consistency guarantees to identify which can and cannot be achieved in HAT systems. This unifies the literature on weak transactional isolation, replica consistency, and highly available systems. We analytically and experimentally quantify the availability and performance benefits of HATs--often two to three orders of magnitude over wide-area networks--and discuss their necessary semantic compromises.
  • Source
    • "However, G-Set ignores the intention of remove operations, LWW-element-Set does not scale since it uses the tombstone mechanism, and OR-Set requires a transparent mechanism of unique tag generation between different sites. Recently, SU-Set (Ibanez et al., 2012) was proposed as a CRDT for RDF graphs based on OR-Set (Shapiro et al., 2011) that supports the SPARQL 1.1 Update operation and guarantees consistency. SU-Set is designed to serve as the base for an RDF-Store CRDT that could be implemented in an RDF engine. "
    ABSTRACT: Commutative Replicated Data Type (CRDT) is a convergence technique, a new generation of approaches that ensure consistency maintenance of replicas in collaborative editors over Peer-to-Peer (P2P) networks. This technique has been successfully applied to different data representation types in scalable collaborative editing for linear and tree document structures and semi-structured data types, but not yet to the set data type while ensuring the Causality, Consistency and Intention (CCI) preservation criteria. In this paper, we propose srCE, a novel CRDT for a set structure, to facilitate collaborative and concurrent editing of Resource Description Framework (RDF) stores at large scale by different members of a virtual community. Our approach ensures the CCI model and is not tied to a specific case, so it can be applied to any document that conforms to a set structure. A prototype implementation using Friend of a Friend (FOAF) data sets with and without the srCE model illustrates significant improvement in scalability and performance.
    International Journal of Computer Applications in Technology 01/2013; 48(1):1-13. DOI:10.1504/IJCAT.2013.055562
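The trade-offs the excerpt above names for G-Set, LWW-element-Set, and OR-Set hinge on how removes interact with concurrent adds. This is a minimal OR-Set sketch of my own (assuming `uuid4` as the unique-tag generator the excerpt says each site needs), not code from the cited papers:

```python
# OR-Set sketch: every add is paired with a fresh unique tag, and a
# remove only tombstones the (element, tag) pairs the replica has
# observed. A concurrent add elsewhere carries an unseen tag, so the
# element survives the merge ("add wins" semantics).
import uuid

class ORSet:
    def __init__(self):
        self.adds = set()      # (element, unique_tag) pairs
        self.removes = set()   # observed pairs moved to tombstones

    def add(self, element):
        self.adds.add((element, uuid.uuid4()))

    def remove(self, element):
        # Tombstone only the pairs observed locally.
        self.removes |= {p for p in self.adds if p[0] == element}

    def contains(self, element):
        return any(p[0] == element and p not in self.removes
                   for p in self.adds)

    def merge(self, other):
        self.adds |= other.adds
        self.removes |= other.removes

r1, r2 = ORSet(), ORSet()
r1.add("x")
r2.merge(r1)          # r2 observes r1's add of "x"
r2.remove("x")        # r2 tombstones the observed tag
r1.add("x")           # concurrent re-add at r1 with a fresh tag
r1.merge(r2); r2.merge(r1)
assert r1.contains("x") and r2.contains("x")   # add wins
```

The tombstone set in this sketch grows without bound, which is exactly the scalability concern the excerpt raises for tombstone-based designs.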
  • Source
    • "Ensuring eventual consistency of a file system is complex [5], while ensuring eventual consistency of a set can be achieved in numerous ways with quite simple algorithms. For instance, [17] defines multiple replicated sets with different behaviors and performances. So we can imagine a file system as the set of absolute paths present in the file system. 1) A first layer contains the set of independent couples (path, type) which are elements present in the file system. "
    ABSTRACT: Collaborative working is increasingly popular, but it presents challenges due to the need for high responsiveness and support for disconnected work. To address these challenges, the data is optimistically replicated at the edges of the network, i.e. personal computers or mobile devices. This replication requires a merge mechanism that preserves the consistency and structure of the shared data subject to concurrent modifications. In this paper, we propose a generic design to ensure eventual consistency (every replica will eventually view the same data) and to maintain the specific constraints of the replicated data. Our layered design gives the application engineer complete control over system scalability and over the behavior of the replicated data in the face of concurrent modifications. We show that our design allows replication of complex data types with acceptable performance.
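The layered idea the excerpt above describes, representing a file system as a replicated set of (path, type) couples with a constraint layer on top, can be sketched roughly as follows. This is my own illustration under stated assumptions (a last-writer-wins rule in the base layer, and a view layer that hides entries whose parent directories are absent); the names are hypothetical, not the paper's:

```python
# Base layer: a replicated set of (path, type) couples with
# last-writer-wins conflict resolution. View layer: re-impose the
# file-system constraint that every visible entry has all its
# ancestor directories present.
import posixpath

class ReplicatedPathSet:
    def __init__(self):
        self.entries = {}  # path -> (timestamp, type or None=removed)

    def add(self, path, kind, ts):
        cur = self.entries.get(path)
        if cur is None or ts > cur[0]:
            self.entries[path] = (ts, kind)

    def remove(self, path, ts):
        self.add(path, None, ts)  # tombstone via LWW

    def merge(self, other):
        # Keep the newer entry per path; commutative across replicas.
        for path, (ts, kind) in other.entries.items():
            if path not in self.entries or ts > self.entries[path][0]:
                self.entries[path] = (ts, kind)

    def visible(self):
        # View layer: keep only entries whose ancestors are directories.
        live = {p: k for p, (ts, k) in self.entries.items() if k is not None}
        def ok(path):
            parent = posixpath.dirname(path)
            if parent in ("/", path):
                return True
            return live.get(parent) == "dir" and ok(parent)
        return {p for p in live if ok(p)}

r = ReplicatedPathSet()
r.add("/a", "dir", 1)
r.add("/a/f.txt", "file", 2)
r.remove("/a", 3)           # directory removed concurrently
assert r.visible() == set() # orphaned file hidden by the view layer
```

Separating the convergent base set from the constraint-checking view is what lets the base layer stay a simple CRDT while the application still sees a well-formed file system.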