Bounded Version Vectors
José Bacelar Almeida Paulo Sérgio Almeida Carlos Baquero
Departamento de Informática, Universidade do Minho
Version vectors play a central role in update tracking un-
der optimistic distributed systems, allowing the detection
of obsolete or inconsistent versions of replicated data. Ver-
sion vectors do not have a bounded representation; they are
based on integer counters that grow indeﬁnitely as updates
occur. Existing approaches to this problem are scarce; the
mechanisms proposed are either unbounded or operate only
under speciﬁc settings. This paper examines version vec-
tors as a mechanism for data causality tracking and clariﬁes
their role with respect to vector clocks. Then, it introduces
bounded stamps and proves them to be a correct alternative
to integer counters in version vectors. The resulting mecha-
nism, bounded version vectors, represents the ﬁrst bounded
solution to data causality tracking between replicas subject
to local updates and pairwise symmetrical synchronization.
Keywords: Replication, causality, version vectors, up-
date tracking, bounded state.
1 Introduction
Optimistic replication is a critical technology in distributed
systems, in particular when improving availability of database
systems and adding support to mobility and partitioned
operation. Under optimistic replication, data replicas can
evolve autonomously by incorporating new updates into their
state. Thus, when contact can be established between two or
more replicas, mutual consistency must be evaluated and
potential divergence detected.
The classic mechanism for assessing divergence between
mutable replicas is provided by version vectors which,
since their introduction by Parker et al. [13], have been one
of the cornerstones of optimistic data management. Version
vectors associate to each replica a vector of integer coun-
ters that keeps track of the last update that is known to have
been originated in every other replica and in the replica it-
self. The mechanism is simple and intuitive but requires a
state of unbounded size, since each counter in the vector
can grow indeﬁnitely.
The potential existence of a bounded substitute to version
vectors has been overlooked by the community. A possible
cause is a frequent confusion of the roles played by version
vectors and vector clocks (e.g. [16, 17]), which have the
same representation [13, 4, 12], together with the existence
of a minimality result by Charron-Bost [3], stating that vector
clocks are the most concise characterization of causality
among process events.
In this article we show that a bounded solution is possi-
ble for the problem addressed by version vectors: the detec-
tion of mutual inconsistency between replicas subject to lo-
cal updates and pairwise symmetrical synchronization. We
present a mechanism, bounded stamps, that can be used to
replace integer counters in version vectors, stressing that
the minimality result that precludes bounded vector clocks
does not apply to version vectors.
1.1 On version vectors and vector clocks
Asynchronous distributed systems track causality and logical
time among communicating processes by means of
several mechanisms [11, 18], in particular vector clocks.
While being structurally equivalent to version vectors,
vector clocks serve a very distinct purpose. Vector clocks
track causality by establishing a strict partial order on the
events of processes that communicate by message pass-
ing, and are known to be the most concise solution to this
problem. Vector clocks, being a vector of integer counters,
are unbounded in size, but so is the number of events that
must be ordered and timestamped by them. In short, vector
clocks order an unlimited number of events occurring in a
given number of processes.
If we consider the role of version vectors, data causality,
there is always a limit to the number of possible relations
that can be established on the set of replicas. This
limit is independent of the number of update events that
are considered on any given run. For example, in a two
replica system with replicas $a$ and $b$ only four cases can
occur: $a = b$, $a < b$, $b < a$ and $a \parallel b$. If the two
replicas are already divergent, the inclusion of new update
events on any of the replicas does not change their mutual
divergence and the corresponding relation between them. In
short, version vectors order a given number of replicas,
according to an unlimited number of update events.
The existence of a limited number of relations is a nec-
essary but not sufﬁcient condition for the existence of a
bounded characterization mechanism. A relation, which
is a global abstraction, must be encoded and computed
through local operations on replica pairs without the need
for a global view. This is one of the important properties of
2 Data causality and version vectors
Data causality on a set of replicas can be assessed via set
inclusion of the sets of update events known to each replica.
Data causality is the pre-order defined by:
$$a \le b \iff U_a \subseteq U_b$$
where $U_a$ and $U_b$ are the sets of update events (globally
unique events) known to replicas $a$ and $b$.
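As an illustration only, the set-inclusion view of data causality can be sketched in a few lines of Python. The function name `relation` and the representation of update events as hashable identifiers are assumptions made for this sketch; they do not come from the paper. The four possible results correspond to the four cases of a two replica system discussed in Section 1.1.

```python
def relation(events_a: set, events_b: set) -> str:
    """Classify two replicas by set inclusion of their known update events.

    events_a / events_b are hypothetical sets of globally unique event ids.
    """
    a_le_b = events_a <= events_b  # every event known to a is known to b
    b_le_a = events_b <= events_a  # every event known to b is known to a
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "a obsolete"
    if b_le_a:
        return "b obsolete"
    return "divergent"


if __name__ == "__main__":
    print(relation({"u1"}, {"u1", "u2"}))
    print(relation({"u1", "u3"}, {"u1", "u2"}))
```

Note that once two replicas are divergent, adding further events to either set cannot make one a subset of the other unless they synchronize, which matches the remark that new updates do not change mutual divergence.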
When tracking data causality with version vectors in an $N$
replica system, one associates to each replica $a$ a vector
$v_a$ of $N$ integer counters. The order on version vectors is
the standard pointwise (coordinatewise) order:
$$v_a \le v_b \iff \forall i.\; v_a[i] \le v_b[i]$$
where $v_a[i]$ denotes component $i$ of vector $v_a$.
The operations on version vectors, formally presented in
Figure 1, are as follows:
Initialization: establishes the initial system state. All
vectors are initialized with zeroes.
Update: an update event in replica $a$ increments component
$a$ of its own vector.
Synchronization: synchronization of replicas $a$ and $b$ is
achieved by taking the pointwise join (greatest element)
of their two vectors.
Figure 1: Semantics of version vector operations.
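This classic mechanism encodes data causality because
comparing version vectors gives the same result as comparing
sets of known update events: for all runs and replicas $a$
and $b$, the vector of $a$ is pointwise less than or equal to
the vector of $b$ exactly when every update event known to
$a$ is also known to $b$.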
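The three operations and the pointwise order can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the formal semantics of Figure 1: the function names (`init`, `update`, `sync`, `leq`) and the list-of-lists representation of the global state are choices made for this sketch.

```python
from typing import List


def init(n: int) -> List[List[int]]:
    """Initial state: one zero vector of size n per replica."""
    return [[0] * n for _ in range(n)]


def update(vectors: List[List[int]], a: int) -> None:
    """An update event in replica a increments its own component."""
    vectors[a][a] += 1


def sync(vectors: List[List[int]], a: int, b: int) -> None:
    """Symmetric synchronization: both replicas take the pointwise maximum."""
    joined = [max(x, y) for x, y in zip(vectors[a], vectors[b])]
    vectors[a] = joined
    vectors[b] = list(joined)


def leq(v: List[int], w: List[int]) -> bool:
    """Pointwise (coordinatewise) order on version vectors."""
    return all(x <= y for x, y in zip(v, w))


if __name__ == "__main__":
    vs = init(3)
    update(vs, 0)      # replica 0 performs an update
    sync(vs, 0, 1)     # replicas 0 and 1 synchronize
    update(vs, 2)      # replica 2 updates independently
    print(vs)
    print(leq(vs[0], vs[2]), leq(vs[2], vs[0]))  # neither holds: divergent
```

Two replicas are mutually consistent when `leq` holds in both directions, one is obsolete when it holds in exactly one direction, and they are divergent when it holds in neither.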
Figure 2 shows a run with version vectors in a four
replica system. Updates are depicted by a mark on a single
replica, and synchronizations by two marks connected by a line.
2.1 Version vector slices
All operations over version vectors exhibit a pointwise
nature: a given vector position is only compared with, or
updated from, the same position in other vectors, since all
information about updates originated in replica $i$ is stored
in component $i$ of each version vector. This allows a
decomposition of the replicated system into slices, one per
replica, where each slice represents the updates that were
originated in a given replica. Slice $i$ of a replica system is
made up of the $i$th component of each version vector.
This means that data causality can be encoded by the
concatenation of the representation for each of the slices.
It also means that it is enough to concentrate on a
subproblem: encoding the distributed knowledge about a single
source of updates, and the corresponding version vector slice
(VVS). The source of updates increments its counter and all
other replicas keep potentially outdated copies of that
counter; this subproblem amounts to storing a distributed
representation of a total order.
Figure 2: Version Vectors: example run, depicting slice counters by a boxed digit.
Figure 3: VVS semantics for slice 0.
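The decomposition into slices can be made concrete with a small Python sketch. The helper names (`slice_of`, `from_slices`, `leq_by_slices`) and the sample vectors are hypothetical and serve only to show that comparing whole vectors is equivalent to comparing slice by slice.

```python
from typing import List


def slice_of(vectors: List[List[int]], i: int) -> List[int]:
    """Slice i: the i-th component of every replica's version vector."""
    return [v[i] for v in vectors]


def from_slices(slices: List[List[int]]) -> List[List[int]]:
    """Rebuild the version vectors by concatenating the slices (a transpose)."""
    return [list(column) for column in zip(*slices)]


def leq_by_slices(vectors: List[List[int]], a: int, b: int) -> bool:
    """Replica a <= replica b iff a's entry is <= b's entry in every slice."""
    n = len(vectors)
    return all(slice_of(vectors, i)[a] <= slice_of(vectors, i)[b] for i in range(n))


if __name__ == "__main__":
    vectors = [[2, 0, 1], [1, 0, 1], [2, 0, 0]]  # illustrative values only
    print(slice_of(vectors, 0))                   # knowledge about replica 0's updates
    print(leq_by_slices(vectors, 1, 0))
```

Because each slice is handled independently, a bounded encoding of one slice yields a bounded encoding of the whole vector by concatenation, which is the reduction the text relies on.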
For the remainder of the paper we will concentrate, for
notational convenience and without loss of generality, on
finding a bounded representation for slice 0. Figure 3
presents the semantics of version vectors restricted to slice
0; in the run presented in Figure 2 this slice is shown using
boxed digits.
3 Informal presentation
We now give an informal presentation of the mechanism
and give some intuition of how it works and how it ac-
complishes its purpose. Having shown that it is enough to
concentrate on a subproblem (a single source of updates)
and the corresponding slice of version vectors, we now
present the stamp that will replace, in each replica, the in-
teger counter of the corresponding version vector.
For problem size , i.e. assuming replicas, with
the “primary” where updates take place and
the “secondary” replicas, we represent a stamp by some-
It has a representation of bounded size, as it consists of
rows, each with at most symbols (letters here), taken
from a ﬁnite set . An example run consisting of four
replicas is presented in Figure 4.
A stamp is, in abstract, a vector of totally ordered sets.
Each of the components (rows in our notation) repre-
sents a total order, with the greatest element on the left (the
ﬁrst row above means ). In a stamp for replica
, row ( ) is what we call the principal
order (displayed with a gray background), while the other
rows are the cached orders. (Thus, the stamp above would
belong to replica .) The cached order in row repre-
sents the principal order of replica at some point in time,
propagated to replica (either directly or indirectly through
The greatest element of the principal order (on the left,
depicted in bold over gray) is what we call the principal
element. It represents the most recent update (in the
primary) known by the replica. In a representation using an
infinite totally ordered set instead of a finite one, nothing
more would be needed. This element can be thought of as
“corresponding” to the value of the integer counter in version vectors.
The left column in a stamp (depicted in bold) is what
we call the principal vector; it is made up of the greatest
element of each order (row). It represents the most recent
local knowledge about the principal element of each replica.
In a stamp, there is a relationship between the principal
order and the principal vector: the elements in the principal
vector are the same ones as in the principal order. In other
words, the set of elements in the principal vector is ordered
according to the principal order.
3.1 Comparison and synchronization as well-defined local operations
As we will show below, the mechanism is able to compare
two stamps by a local operation on the respective principal
orders. No global knowledge is used: not even a global
order on the set of symbols is assumed. For comparison
purposes the symbol set is simply an unordered set, with
elements that are ordered differently in different stamps.
As an example, comparing two stamps whose principal orders
are bc and ca involves looking only at bc and ca.
Figure 4: Bounded stamps: example run.
When synchronizing two stamps, in the positions of the
two principal elements, the resulting value will be the max-
imum of the two principal elements; the rest of the resulting
principal vector will be the pointwise maximum of the re-
spective values. The comparisons are performed according
to the principal orders of the two stamps involved.
It is important to notice that, in general, it is not possible
to take two arbitrary total orders and merge them into
a new total order. As such, it could be thought that computing
the maximum as mentioned above is ill-defined. As
we will show, several properties of the model can be exploited
that make these operations indeed possible and well-defined.
We will also show that it is possible to totally order
the elements in the resulting principal vector, i.e. to obtain
a new principal order.
3.2 Garbage collection for symbol reuse
The boundedness of the mechanism is only possible
through symbol reuse. When an update operation is per-
formed, instead of incrementing an integer counter, some
symbol is chosen to become the new principal element. By
using a finite set of symbols, an update will eventually
reuse a symbol that was already used in the past to repre-
sent some previous update that has been synchronized with
However, by reusing symbols, an obvious problem arises
that needs to be addressed: symbol reuse cannot compromise
the well-definedness of the comparison operations
described above. As an example, it would not be acceptable
that, due to reuse, the principal orders of two stamps
end up being abc and ca, as it would not be possible to
overcome the ambiguity between $a > c$ and $c > a$ and
to infer which one is the greater stamp.
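The kind of conflict described here can be detected mechanically. The following Python sketch is an illustration only: it assumes each total order is written as a string of distinct symbols with the greatest element on the left, and the function name `consistent` is hypothetical. It checks whether two such orders agree on the symbols they share, which is a necessary condition for merging them into one total order.

```python
def consistent(p: str, q: str) -> bool:
    """True if orders p and q (greatest symbol first, no repeated symbols)
    rank their common symbols in the same relative order."""
    common = set(p) & set(q)
    p_common = [s for s in p if s in common]
    q_common = [s for s in q if s in common]
    return p_common == q_common


if __name__ == "__main__":
    print(consistent("abc", "ca"))  # a and c are ordered in opposite ways
    print(consistent("bc", "ca"))   # only c is shared, so there is no conflict
```

The mechanism described in the paper does not run such a check at comparison time; rather, its symbol reuse discipline is meant to ensure that conflicting pairs like the first one never arise among principal orders.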
To address the problem, the mechanism implements a
distributed “garbage collection” of symbols. This is
accomplished through the extra information in the cached orders.
As we will show, any element in the principal order/vector
of any replica is also present in the primary replica (in some
principal or cached order). This is the key property towards
symbol reuse: when an update is performed, any symbol
which is not present in the primary replica is considered
“garbage” and can be (re)used for the new principal element.
As an example, in Figure 4, when the final update occurs,
a symbol can be reused for the new principal element because
it is not present in the primary replica. Notice that the
scheme only assures that the reused symbol does not occur
in the principal orders/vectors. In this example it still
occurs in some cached orders of other replicas, but this is
not a problem because those elements will not be used in
comparisons; the “old” occurrence will not be confused with
the “new” one.
Figure 5: Counter mode principal vectors.
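The choice of a reusable symbol at the primary can be sketched as below. This is a hedged Python illustration, not the paper's oracle: the stamp is assumed to be a list of rows (each a string of symbols), the symbol domain is an arbitrary finite alphabet, and picking the smallest free symbol is an assumption made only to keep the choice deterministic. The names `free_symbols` and `choose_new_symbol` are hypothetical.

```python
from typing import List


def free_symbols(primary_stamp: List[str], symbols: str) -> List[str]:
    """Symbols of the finite domain that occur in no row of the primary's stamp."""
    used = {s for row in primary_stamp for s in row}
    return sorted(set(symbols) - used)


def choose_new_symbol(primary_stamp: List[str], symbols: str) -> str:
    """Pick a 'garbage' symbol for the next update, using only local state."""
    free = free_symbols(primary_stamp, symbols)
    if not free:
        # The paper argues a sufficiently large domain rules this out;
        # the sketch simply reports it.
        raise ValueError("no free symbol: symbol domain too small")
    return free[0]


if __name__ == "__main__":
    # Hypothetical primary stamp with three rows (principal and cached orders).
    stamp = ["ba", "ab", "cb"]
    print(choose_new_symbol(stamp, "abcdefghi"))
```

The important point is that the decision looks only at the primary's own stamp; no global view of the other replicas is consulted.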
3.3 Synopsis of formal presentation
The formal presentation and proof of correctness will make
use of an unbounded mechanism which we call the counter
mode principal vectors (CMPV). This auxiliary mechanism
represents what the evolution of the principal vector would
be if we could afford to use integer counters. The mecha-
nism makes use of the total order on natural numbers and
does not encode orders locally. In Figure 5 we present part
of the run in Figure 4 using the counter mode mechanism.
The bulk of the proof consists in establishing several
properties of the CMPV model that allow the relevant comparison
operations to be computed in a well-defined way
using only local information. The key idea is that, exploiting
these properties, bounded stamps can be seen as an encoding
of CMPV using a finite set, where the principal
orders are used to encode the relevant order information.
Figure 6: Semantics of operations in CMPV.
4 Counter Mode Principal Vectors
Version Vector Slices (VVS) rely on an unbounded totally
ordered set — the integers. Their unbounded nature is ac-
tually a consequence of adopting a predetermined order re-
lation (and hence globally known) to capture data causality
among replicas. To overcome this, we enrich VVS in a
way that order judgments become, in a sense, local to each
replica. In this way, it will be possible to dynamically en-
code the causality order and open the perspective of bound-
ing the “counters” domain.
For a replica index , its local state in the CMPV model
is denoted by and deﬁned as the tuple where
is a vector of integers with size — the principal vector
for (see Figure 5). The value in position of vector
is denoted by and represents the knowledge of stamp
concerning the most recent update known by stamp .
The element plays a central role since it holds ’s view
about the more recent update — this is essentially the infor-
mation contained in VVS counters and we call it the prin-
cipal element for stamp .
Figure 6 deﬁnes the semantics of the operations in the
CMPV model. Symbol denotes the join operation under
integer ordering (i.e. taking the maximum element). Notice
that the order information is only required to perform the
synchronization operation. Moreover, comparisons are always
between principal elements or pointwise (between the
same position in two principal vectors). Occasionally, it will
be convenient to write for the result of the synchro-
nization on stamps and (i.e. the principal vector of
one of these stamps after synchronization).
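A counter mode sketch in Python may help fix ideas. It follows the informal description of synchronization in Section 3.1 (the positions of the two principal elements receive the maximum of the two principal elements; every other position receives the pointwise maximum). The zero initialization, the update rule that increments the primary's principal element, and all function names are assumptions of this sketch rather than a transcription of Figure 6.

```python
from typing import List


def cmpv_init(n: int) -> List[List[int]]:
    """Assumed initial state: every principal vector is all zeroes."""
    return [[0] * n for _ in range(n)]


def cmpv_update(vs: List[List[int]]) -> None:
    """Updates occur only at the primary (replica 0): bump its principal element."""
    vs[0][0] += 1


def cmpv_sync(vs: List[List[int]], x: int, y: int) -> None:
    """Symmetric synchronization of stamps x and y."""
    top = max(vs[x][x], vs[y][y])                 # max of the two principal elements
    joined = [max(a, b) for a, b in zip(vs[x], vs[y])]
    joined[x] = top                               # both principal positions get it
    joined[y] = top
    vs[x] = joined
    vs[y] = list(joined)


def cmpv_leq(vs: List[List[int]], x: int, y: int) -> bool:
    """Data causality for slice 0: compare principal elements only."""
    return vs[x][x] <= vs[y][y]


if __name__ == "__main__":
    vs = cmpv_init(3)
    cmpv_update(vs)
    cmpv_sync(vs, 0, 1)
    cmpv_update(vs)
    cmpv_sync(vs, 1, 2)
    print(vs)
    print(cmpv_leq(vs, 2, 0), cmpv_leq(vs, 0, 2))
```

In this sketch both stamps end a synchronization with the same principal vector, which is why the text can speak of the result of synchronizing two stamps as a single vector.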
A trace consists of a sequence of operations starting with
and followed by an arbitrary number of updates and syn-
chronizations. In the remainder, when stating properties in
the CMPV, we will leave implicit that they only refer to
reachable states, i.e. states that result from some trace of
operations. Induction over the traces is the fundamental
tool to prove invariance properties, such as the following
simple facts about CMPV.
Proposition 1. For every replica , and index ,
Proof. Simple induction on the length of traces.
Given stamps and we deﬁne their data causality
order under CMPV ( ) as the comparison of their princi-
By Figure 6 it can be seen that the computation of princi-
pal elements only depends upon principal elements. More-
over, if we restrict the impact of the operations to the princi-
pal element we recover the VVS semantics (Figure 3). This
observation leads immediately to the correctness of CMPV
as a data causality encoding for slice 0:
This result is not surprising since CMPV was deﬁned as a
semantics preserving extension of VVS.
Next we will show that the additional information con-
tained in the CMPV model makes it possible to avoid re-
lying on the integer order, and to replace it with a locally
encoded order. For this, we will use a non-trivial invariant
on the global state given by the following lemma. Its proof
is presented in the appendix since it requires an auxiliary
deﬁnition and some additional lemmata.
Lemma 2. For every stamp and and index ,
Proof. See appendix A.
Recall that the order information is only required to perform
the synchronization operation. Moreover, comparisons
are always between principal elements or pointwise
(between the same position in two principal vectors). In the
following we will show that these comparisons can be per-
formed without relying on integer order as long as we can
order the elements in the principal vector of each stamp in-
Comparison between principal elements reduces to a
Proposition 3. For every stamp , ,
Proof. If then, by Proposition 1(1) we have
that and so, by Lemma 2, .
If then, by Proposition 1(3) we have that
For a stamp , let us denote by the restriction of the
intrinsic integer order to the values contained in the princi-
pal vector :
iff and and
Using these orderings, we deﬁne new ones that are appro-
priate to perform the required comparisons. For stamps
and , let their combined order be deﬁned as:
iff and or
For convenience, we also deﬁne the corresponding join
The following proposition establishes the claimed prop-
erties for this ordering.
Proposition 4. For every stamp and and index ,
Proof. (1) Follows directly from Propositions 1 and 3.
(2) Let . When Proposition 3
guarantees that and, by Lemma 2, we have
and then , which establishes . The
case is trivial since, either (in which case
), or and so . Let
(that is, ). The proof proceeds as in the previous
Restricted orders can be explicitly encoded (e.g. by a
sequence) and can be easily manipulated. We now show
that when a synchronization is performed, all the elements
in the resulting principal vector were already present in the
more up-to-date stamp. This means that the restricted order
that results is a restriction of the one from the more up-to-
Proposition 5. Let and be stamps and .
If then, for all ,
Proof. For the pointwise join : if
then ; if then, by Lemma 2, .
Otherwise, note that the resulting principal element ( ) is
already in .
These observations, together with the fact that the global
state can only retain a bounded amount of integer values
(an obvious limit is ), open the way for a change in the
domain from the integers in the CMPV model to a finite set.
5 Bounded Stamps
A migration from the domain of integer counters in CMPV
to a ﬁnite set is faced with the following difﬁculty: the
update operation should be able to choose a value, that is
not present in any principal vector, for the new principal
element in the primary.
Adopting a set sufficiently large (e.g. with elements)
guarantees that such a choice exists under a global
view. The problem lies in making that choice using only
the information in the state of the primary. To overcome
this problem we make a new extension of the model that
allows the primary to keep track of all the values in use in
the principal vectors of all stamps.
We will present this new model parameterized by a set
(the symbol domain), a distinguished element
(the initial element), and an oracle for new symbols
(satisfying an axiom described below). For each
replica index , its local state in the bounded stamps model
is denoted by and deﬁned as where:
is the replica index;
is a vector of values from with size — the
is a vector of total orders, encoded as sequences,
representing the full bounded stamp.
This last component contains all the information in the
principal vector, the principal order and the cached orders.
Although the principal vector is redundant (as each component
is also present in the first position of each ), it
is kept in the model for notational convenience in describing
the operations and in establishing the correspondence
between the models.
The intuitive idea is that the state for each stamp keeps an
explicit representation of the restricted orders. More pre-
cisely, for stamp , the sequence contains precisely
the elements of ordered downward (ﬁrst element is ).
From that sequence one easily deﬁnes the restricted order
for stamp , what we call principal order to emphasize its
where denotes the sequence restricted to the elements
in , i.e. and . The combined order
and associated join are deﬁned precisely as in counter
mode, that is
The other sequences in keep information about (poten-
tially outdated) principal orders of other stamps — these are
called the cached orders.
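The sequence encoding of a total order, and the restriction of a sequence to a set of elements, can be illustrated with a short Python sketch. Representing an order as a list with the greatest element first follows the text; the function names `leq_in` and `restrict` and the sample symbols are assumptions of this sketch.

```python
from typing import List, Set


def leq_in(seq: List[str], a: str, b: str) -> bool:
    """a <= b in the total order encoded by seq (greatest element first).

    Both a and b are assumed to occur in seq.
    """
    return seq.index(a) >= seq.index(b)


def restrict(seq: List[str], elems: Set[str]) -> List[str]:
    """Sequence restriction: keep only the symbols in elems, preserving order."""
    return [s for s in seq if s in elems]


if __name__ == "__main__":
    order = ["c", "a", "b"]             # encodes c > a > b
    print(leq_in(order, "b", "c"))      # b is below c
    print(restrict(order, {"a", "c"}))  # order restricted to {a, c}
```

Restriction never reorders symbols, so a restricted sequence is always a valid encoding of the restricted order.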
Figure 7 gives the semantics for the operations in this
model. The oracle for new symbols is a function
that gives an element of satisfying the following axiom:
For every stamp ,
The argument in the oracle intends to emphasize
that the choice of the new symbol should be made based on
the primary local state.
Figure 7: Semantics of operations on BS model.
Data causality ordering under the Bounded Stamps
model is deﬁned by
The correctness of the proposed model follows from the
observation that, apart from the cached orders used for the
symbol reuse mechanism, it is actually an encoding of the
CMPV model. To formalize the correspondence between
both models, we introduce an encoding function that
maps each integer in the CMPV model into the correspond-
ing symbol (in ) in the state resulting from a given trace.
This map is deﬁned recursively on the traces.
Where is the number of update events in , is the
bounded stamp for the primary after trace , and
gives a canonical choice for the new principal element on
the primary after the update. When we discard the cached
orders, the semantics of operations given in Figure 7 are
precisely the ones in CMPV (Figure 6) affected by the en-
coding map. Moreover, the principal orders are encodings
for the restricted orders presented in the previous section.
Lemma 6. For an arbitrary trace , replicas index and
Proof. This results from a simple induction on the length
of traces. When the last operation was it is trivial. When
it was , the result follows from the induction hypothesis
and the axiom for the oracle . When it was ,
the result follows from induction hypothesis, the fact that,
since computes the required joins (Proposition 4), the
deﬁnitions of both models are the same, and the correctness
of the new restricted orders (Proposition 5).
As a simple consequence of the previous result, we can
state the following correctness result.
Proposition 7. For any arbitrary trace and replica in-
dexes and we have
Proof. Immediate from Lemma 6 and the deﬁnitions of
It remains to instantiate the parameters of the model. A
trivial but unbounded instantiation would be: set as the
integers, as value and . In this set-
ting, principal orders would be an explicit representation
of counter mode restricted orders. Obviously, we are interested
in bounded instantiations of . To show that such
instantiations exist, we introduce the following lemma that
puts in evidence the role of cached orders. Once again we
will postpone its proof to the appendix since it uses a
technique similar to that of the proof of Lemma 2.
Lemma 8. For every stamp there exists an such that
Proof. See appendix B.
We are now able to present a bounded instantiation for
the model. Let be a totally ordered set with ele-
ments (the total order is here only to avoid making non-
deterministic choices). We deﬁne:
Lemma 8 guarantees that satisfies the axiom. It follows
then that it acts as an encoding of the counter mode model
(Proposition 7). Thus we have constructed a bounded
model for the data causality problem in a slice, which gen-
eralizes, by concatenating slices, to the full data causality
problem addressed by version vectors.
6 Related Work
As far as bounded replacements for version vectors are
concerned, there is, to our knowledge, no previous solution to
the problem. The possible existence of a bounded substitute
for version vectors was mentioned in [1] while introducing
the version stamps concept. Version stamps allow the
characterization of data causality in settings where version
vectors cannot operate, namely when replicas can be created
and terminated autonomously.
There have been several approaches to version vector
compression. Update coalescing takes advantage of
the fact that several consecutive updates issued in isolation
in a single replica can be made equivalent to a single large
update. Update coalescing is intrinsic in bounded stamps
since sequence restriction in the update operation discards
non-propagated symbols. Dynamic compression can
effectively reduce the size of version vectors by removing
a common minimum from all entries (along each slice).
However, this technique requires distributed consensus on
all replicas and therefore cannot progress if one or more
replicas are unreachable. Unilateral version vector pruning
[16] avoids distributed consensus by allowing unilateral
deletion of inactive version vector entries, but relies
on some timing assumptions on the skew of physical clocks.
Lightweight version vectors [8] develop an integer encoding
technique that allows a gradual increase of integer
storage as counters increase. This technique is used in
conjunction with update coalescing to provide a dynamic size
representation. Hash histories [9] track data causality by
collecting hash fingerprints of contents. This representation
is independent of the number of replicas but grows in
proportion to the number of updates.
The minimality of vector clocks as a characterization
of Lamport causality [11], presented by Charron-Bost [3]
and recently re-addressed, indicates particular runs
where the full expressiveness of vector clocks is required.
However, there are cases in which smaller representations
can operate: Plausible Clocks [19] offer a bounded substitute
for vector clocks that is accurate in a large percentage
of situations and may be used in settings where deviations
only impact performance and not correctness; Resettable
Vector Clocks [2] allow a bounded implementation of vector
clocks under a specific communication pattern between
The collection of cached copies of the knowledge in
other replicas has been explored before in [5, 20] and used
for optimization of message passing strategies. This concept
is sometimes referred to as matrix clocks [15]. These
clocks are based on integer counters and are similar to our
intermediate “counter mode principal vector” representation.
7 Conclusions
Version vectors are the key mechanism in the detection of
inconsistency and obsolescence among optimistically repli-
cated data. This mechanism has been used extensively in
the design of distributed ﬁle systems [10, 7], in particu-
lar for data causality tracking among ﬁle copies. It is well
known that version vectors are unbounded due to their use
of counters; some approaches in the literature have tried to
address this problem.
We have drawn attention to the fact that causally
ordering a limited number of replicas does not require the
full expressive power of version vectors. Due to the limited
number of conﬁgurations among replicas, data causality
tracking does not necessarily imply the use of unbounded
mechanisms. As a consequence, Charron-Bost’s minimal-
ity of vector clocks cannot be transposed to version vectors.
We have noted that to ﬁnd a bounded alternative to
version vectors, it was enough to concentrate on a sub-
problem: keeping distributed knowledge about a total order
generated by a single entity.
The key to bounded stamps was defining an intermediate
unbounded mechanism and showing that it was possible to
perform comparisons without requiring a global total order;
this was the bulk of the proof of correctness. Bounded stamps
were then derived as an encoding into a finite set of symbols.
This required the definition of a non-trivial symbol
reuse mechanism that is able to progress even if an arbitrary
number of replicas ceases to participate in the exchanges.
This mechanism may have a broader applicability beyond
its current use (e.g. log dissemination and pruning) and be-
come a building block in other mechanisms for distributed
The construction of the mechanism was supported by a
simulator, which was used in the proof of correctness so
as to probe (and discard) tentative hypotheses. The simula-
tor was also turned into a model checker which proved the
correctness up to , giving some conﬁdence before
the full proof of correctness was attempted.
Bounded version vectors are obtained by replacing the
integer counters in version vectors with bounded stamps. They
represent the first bounded mechanism for detection of
obsolescence and mutual inconsistency in distributed systems.
References
 Paulo Sérgio Almeida, Carlos Baquero, and Victor Fonte.
Version stamps – decentralized version vectors. In Proceed-
ings of the 22nd International Conference on Distributed
Computing Systems (ICDCS), pages 544–551. IEEE Com-
puter Society, 2002.
 A. Arora, S. S .Kulkarni, and M. Demirbas. Resettable vec-
tor clocks. In 19th Symposium on Principles of Distributed
Computing (PODC’2000), Portland, 2000. ACM, 2000.
 Bernadette Charron-Bost. Concerning the size of logical
clocks in distributed systems. Information Processing Let-
ters, 39:11–16, 1991.
 Colin Fidge. Timestamps in message-passing systems that
preserve the partial ordering. In 11th Australian Computer
Science Conference, pages 55–66, 1989.
 Michael J. Fischer and A. Michael. Sacriﬁcing serializabil-
ity to attain high availability of data. In Proceedings of the
ACM Symposium on Principles of Database Systems, pages
70–75. ACM, 1982.
 V. K. Garg and C. Skawratananond. String realizers of
posets with applications to distributed computing. In Pro-
ceedings of the ACM Symposium on Principles of Dis-
tributed Computing (PODC’01), pages 72–80. ACM, 2001.
 Richard G. Guy, John S. Heidemann, Wai Mak, Thomas W.
Page, Gerald J. Popek, and Dieter Rothmeier. Implementa-
tion of the ﬁcus replicated ﬁle system. In USENIX Confer-
ence Proceedings, pages 63–71. USENIX, June 1990.
 Yun-Wu Huang and Philip Yu. Lightweight version vec-
tors for pervasive computing devices. In Proceedings of
the 2000 International Workshops on Parallel Processing,
pages 43–48. IEEE Computer Society, 2000.
 Brent ByungHoon Kang, Robert Wilensky, and John Kubi-
atowicz. The hash history approach for reconciling mutual
inconsistency. In Proceedings of the 23nd International
Conference on Distributed Computing Systems (ICDCS),
pages 670–677. IEEE Computer Society, 2003.
 James Kistler and M. Satyanarayanan. Disconnected opera-
tion in the coda ﬁle system. ACM Transaction on Computer
Systems, 10(1):3–25, February 1992.
 Leslie Lamport. Time, clocks and the ordering of events
in a distributed system. Communications of the ACM,
21(7):558–565, July 1978.
 Friedemann Mattern. Virtual time and global clocks in dis-
tributed systems. In Workshop on Parallel and Distributed
Algorithms, pages 215–226, 1989.
 D. Stott Parker, Gerald Popek, Gerard Rudisin, Allen
Stoughton, Bruce Walker, Evelyn Walton, Johanna Chow,
David Edwards, Stephen Kiser, and Charles Kline. Detec-
tion of mutual inconsistency in distributed systems. Trans-
actions on Software Engineering, 9(3):240–246, 1983.
 David Howard Ratner. Roam: A Scalable Replication Sys-
tem for Mobile and Distributed Computing. PhD thesis,
 Frédéric Ruget. Cheaper matrix clocks. In Proceedings of
the 8th International Workshop on Distributed Algorithms,
pages 355–369. Springer Verlag, LNCS, 1994.
 Yasushi Saito. Unilateral version vector pruning using
loosely synchronized clocks. Technical Report HPL-2002-
51, HP Labs, 2002.
 Yasushi Saito and Marc Shapiro. Optimistic replication.
Technical Report MSR-TR-2003-60, Microsoft Research,
 R. Schwarz and F. Mattern. Detecting causal relationships
in distributed computations: In search of the holy grail. Dis-
tributed Computing, 7(3):149–174, 1994.
 FJ Torres-Rojas and M. Ahamad. Plausible clocks: con-
stant size logical clocks for distributed systems. Distributed
Computing, 12(4):179–196, 1999.
 G. T. J. Wuu and A. J. Bernstein. Efﬁcient solutions to the
replicated log and dictionary problems. In Proceedings of
the ACM Symposium on Principles of Distributed Comput-
ing (PODC’84), pages 232–242. ACM, 1984.
A Proof of Lemma 2
The hypotheses of Lemma 2 concern two stamps (say and
) in which we can identify a conflict between
each stamp's knowledge: stamp has better knowledge
of the primary state ( ) but has an outdated
view of some other stamp (say ), i.e. .
Lemma 2 states that when this happens stamp already at-
tributes the value of to some other stamp (say — that is,
). To prove this result, we need
to strengthen this statement: not only but it is also pos-
sible to identify a flow of information between stamp and
. Moreover, this ﬂow of information (a sequence of syn-
chronization operations starting from to ) can be traced
in stamp ’s local state as a sequence of indexes enjoying
some properties. These sequences of indexes are called
delay paths and are defined as follows.
Definition 9 (Delay Path). A delay path between and
is a non-empty sequence of indexes such that,
for any stamp ,
3. for all ,
4. for all ,
5. for all .
We first state some simple facts concerning delay paths.
Proposition 10. Let be a delay path between
and . The following facts hold:
3. for all ,
Proof. The first three facts are immediate consequences
of the definition and Proposition 1. Regarding the last
fact, if occurred in a position , with , by condi-
tion (4) of delay paths we have ; but this contra-
dicts condition (3). Thus, only occurs in a singleton delay
path.
Some of the conditions on delay paths impose global
constraints on them that allow us to reason about global
state changes and their impact on the local states. The fol-
lowing lemma shows how such global constraints are used.
Lemma 11 (Pointwise-join Lemma). Let be
a non-empty sequence of indexes. If for some ,
2. for all , ,
3. for all and any stamp , if then
Then, for any stamp for which , there exists
such that and, for all ,
Proof. By induction on the length of the sequence
. For the base case (singleton sequence) we have
that . Since we have and the
remaining condition is vacuous. For the induction step, we
consider the following cases: If then we set
since . Otherwise, we know that
and, by (4), that . So we apply the induction hypoth-
esis to the sequence and set to the resulting
index plus 1.
We now show that the conditions in Lemma 2 are sufﬁ-
cient to establish the existence of delay paths.
Lemma 12. If and are two stamps and a position such
then there exists a delay path between and .
Proof. The proof is by induction on the length of the trace. If
the last operation was we use the singleton sequence
for the delay path and the conditions hold trivially. If the
last operation was consider the following cases:
:we pick the sequence that satisﬁes trivially
:after the update , which contradicts the
:if then , which contradicts the
hypothesis. If we use the same delay path that
comes from the IH, which is still valid after the update
because it does not contain position , since
:we use the same delay path from the IH, which
is still valid: (1,2,3) because and are not affected
by the update; (4) because only changes; (5) be-
cause even if for some we have , if ,
then due to (4).
If the last operation was (and let us assume, without loss
of generality, that is the more up-to-date stamp, i.e.
) we need to distinguish the following cases:
:we use the same delay path from the
IH, which is still valid: (1,2,3) because and are
not affected; (4) because can only increase; (5)
because for every , if , then either
is computed pointwise and follows from
the IH, or is either or and (by 4)
:stamps and become equal after the
synchronization and we pick the sequence for the
:in this case the stamp re-
sults from the synchronization of and and we have
. Consider the following two
When and . First, given that
and , we can apply the IH to and
on index and establish the existence of a delay path
for in . Then we preﬁx it by ,
obtaining , which is a suitable delay path
between and , given that: (1) holds by construc-
tion, (2) from the IH, (3) from the IH and
(since ); (4) from the IH and
; (5) from the IH and because for
every stamp , .
Otherwise, then either or ; applying
the IH to either or and in position gives us
a valid delay path for the resulting conﬁguration (all
conditions hold, including (5) as shown for the case
:in this case the stamp re-
sults from the synchronization of and .
When is either or , we have ; but
this means (as and ) that ;
therefore is a delay path.
Otherwise, ; this means that
and by the IH there exists a delay path between
and . Given that also , Lemma 11 estab-
lishes the existence of a sequence
(preﬁx of ) that is a delay path between and
for the following reasons. Positions and do not
appear in — because we are assum-
ing , and for , otherwise
we would have (condition (4) of
delay paths of which is a preﬁx) and then
, which contradicts Lemma 11. Thus, all
elements , with are computed pointwise (i.e.
), making conditions (2), (3) and (5) immedi-
ate consequences of Lemma 11. Condition (1) is triv-
ially observed ( is a preﬁx of ); and condition (4)
from the IH and because upon a join values can only increase.
We can ﬁnally state Lemma 2.
Lemma (2). For every stamp and , and every index
Proof. Direct from Lemma 12.
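
As an informal aid to the trace-based inductions above, the following Python sketch models local updates and pairwise symmetrical synchronization on plain integer version vectors, as described in the introduction. It is only an illustration under stated assumptions: it does not implement bounded stamps or delay paths, and the names `update`, `join` and `compare` are hypothetical, chosen for this example. It shows the two facts the case analysis relies on repeatedly: a join is computed pointwise, and entries never decrease under either operation.

```python
from typing import Dict

# Assumption: a version vector is a map from replica id to an integer counter.
VersionVector = Dict[str, int]


def update(vv: VersionVector, replica: str) -> VersionVector:
    """Local update: the replica increments its own counter."""
    new = dict(vv)
    new[replica] = new.get(replica, 0) + 1
    return new


def join(a: VersionVector, b: VersionVector) -> VersionVector:
    """Symmetrical synchronization: pointwise maximum of both vectors."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}


def leq(a: VersionVector, b: VersionVector) -> bool:
    """True when every entry of a is at most the matching entry of b."""
    return all(n <= b.get(r, 0) for r, n in a.items())


def compare(a: VersionVector, b: VersionVector) -> str:
    """Classify a relative to b: equal, before (obsolete), after, or concurrent."""
    ab, ba = leq(a, b), leq(b, a)
    if ab and ba:
        return "equal"
    if ab:
        return "before"
    if ba:
        return "after"
    return "concurrent"
```

For instance, two replicas that each perform one local update hold concurrent vectors; after synchronizing, both hold the same joined vector, which dominates each of the earlier ones.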
B Proof of Lemma 8
Lemma 8 says that each principal order is already contained
in some cached order on the primary. Note that Lemma 2
already states that every principal element belongs to
the primary principal vector, and delay paths were used to
show where it can be found. Now, we will show that it is
precisely in the primary cached order located in the position
pointed out by the delay path between and that we can
ﬁnd all the elements in . To prove this we need to reason
about cached orders along delay paths. This suggests an
extension of these to what we call principal delay paths.
Definition 13. A principal delay path for stamp is a
delay path between and that additionally
satisﬁes the following condition: for every and
any stamp ,
We now prove the existence of principal delay paths by
extending the proof of existence in Lemma 12. Here we
only go through the cases that are relevant for the additional condition.
Lemma 14. For every stamp there exists a principal delay path.
Consider the following additional arguments to the proof
of Lemma 12. If the last operation was (assume
:let . If is either or ,
we know that (since ). Let
. When , by condition (4), we
have or which determines that
. When is computed pointwise, the new
condition follows by the induction hypothesis.
:when and ,
let be the principal delay path for .
The new condition is verified for since,
the case is trivial (because
). For , the new condition is satisﬁed
since (Proposition 5).
in this case the primary re-
sults from the synchronization of and (i.e. is the
primary before synchronization). Since , then
is computed pointwise. By IH we get a principal
delay path to which we apply Lemma 11 to get a
new sequence where and never occur (cf. proof
of Lemma 12). The new condition follows by the induction hypothesis.
Lemma (8). For every stamp there exists a position
Proof. Let be the principal delay path for
(given by Lemma 14). Instantiating the new condition for
on we get that