A Low-Latency Non-Blocking Atomic Commitment
- Source available from: Ricardo Jimenez-Peris
ABSTRACT: Middleware platforms are becoming very popular among system developers. Due to their popularity, there is an increasing demand for dependable middleware support. In the past few years, several research efforts have concentrated on augmenting the dependability of middleware infrastructures, which has led to the definition of the FT-CORBA specification. Active replication is one of the main techniques that have been used to achieve some of the required dependability attributes, such as high availability. This kind of replication requires deterministic replicas that behave as state machines, which has traditionally been achieved by restricting replicas to be single-threaded. Unfortunately, single-threading is too restrictive for middleware servers, especially transactional ones, where it is not admissible to process requests sequentially. In this paper, we show how it is possible to remove this restriction. We present a deterministic scheduling algorithm for multithreaded replicas in a transactional framework. Determinism of multithreaded replicas is achieved with a combination of reliable total order multicast and a deterministic scheduler. The former guarantees that all the replicas see the external events in the same order. The latter ensures that all threads are scheduled in the same way at all replicas. One of the novelties of the approach is that determinism is achieved without resorting to inter-replica communication. Additionally, the paper addresses how to perform online recovery while maintaining replica determinism in order to keep a high level of availability. This paper consolidates results from two earlier papers.
- ABSTRACT: With the advent of cheaper hardware and the Internet, distributed systems have become pervasive. Due to this increasing dependence on distributed systems, there is a growing need to guarantee the availability and consistency of services in the presence of component failures and network partitions. However, a distributed system without fault tolerance is more vulnerable than its centralized counterpart. Fault tolerance plays a crucial role in guaranteeing the availability and consistency of the system in the event of failures and network partitions. Traditional fault-tolerant distributed systems have adopted ad hoc approaches and have been based on monolithic architectures. These monolithic systems prevent the reuse of their components and are difficult to maintain. The goal of this project is to propose a new approach for building fault-tolerant distributed systems: a component-based architecture in which components devoted to different aspects of fault tolerance can be woven together to achieve the desired degree of dependability.
- ABSTRACT: The increasingly pervasive use of clusters makes replication a central element of modern information systems. Replication, however, must nowadays serve a dual function: it must increase both the availability and the processing capacity of the application. Most existing data replication protocols cannot do this, as they improve availability at the cost of scalability. In this paper we present a protocol that achieves this dual goal. The contribution is to demonstrate that data replication need not severely affect overall scalability and that it can be efficiently implemented in a middleware layer without having to modify the underlying databases.