All content in this area was uploaded by Carlos Baquero on Feb 26, 2013

Bounded Version Vectors

José Bacelar Almeida Paulo Sérgio Almeida Carlos Baquero

Departamento de Informática, Universidade do Minho

{jba,psa,cbm}@di.uminho.pt

Abstract

Version vectors play a central role in update tracking under optimistic distributed systems, allowing the detection of obsolete or inconsistent versions of replicated data. Version vectors do not have a bounded representation; they are based on integer counters that grow indefinitely as updates occur. Existing approaches to this problem are scarce; the mechanisms proposed are either unbounded or operate only under specific settings. This paper examines version vectors as a mechanism for data causality tracking and clarifies their role with respect to vector clocks. Then, it introduces bounded stamps and proves them to be a correct alternative to integer counters in version vectors. The resulting mechanism, bounded version vectors, represents the first bounded solution to data causality tracking between replicas subject to local updates and pairwise symmetrical synchronization.

Keywords: Replication, causality, version vectors, update tracking, bounded state.

1 Introduction

Optimistic replication is a critical technology in distributed systems, in particular when improving availability of database systems and adding support for mobility and partitioned operation [17]. Under optimistic replication, data replicas can evolve autonomously by incorporating new updates into their state. Thus, when contact can be established between two or more replicas, mutual consistency must be evaluated and potential divergence detected.

The classic mechanism for assessing divergence between mutable replicas is provided by version vectors which, since their introduction by Parker et al. [13], have been one of the cornerstones of optimistic data management. Version vectors associate with each replica a vector of integer counters that keeps track of the last update that is known to have originated in every other replica and in the replica itself. The mechanism is simple and intuitive but requires a state of unbounded size, since each counter in the vector can grow indefinitely.

The potential existence of a bounded substitute for version vectors has been overlooked by the community. A possible cause is a frequent confusion of the roles played by version vectors and vector clocks (e.g. [16, 17]), which have the same representation [13, 4, 12], together with the existence of a minimality result by Charron-Bost [3], stating that vector clocks are the most concise characterization of causality among process events.

In this article we show that a bounded solution is possible for the problem addressed by version vectors: the detection of mutual inconsistency between replicas subject to local updates and pairwise symmetrical synchronization. We present a mechanism, bounded stamps, that can be used to replace integer counters in version vectors, stressing that the minimality result that precludes bounded vector clocks does not apply to version vectors.

1.1 On version vectors and vector clocks

Asynchronous distributed systems track causality and logical time among communicating processes by means of several mechanisms [11, 18], in particular vector clocks [4, 12].

While being structurally equivalent to version vectors, vector clocks serve a very distinct purpose. Vector clocks track causality by establishing a strict partial order on the events of processes that communicate by message passing, and are known to be the most concise solution to this problem. Vector clocks, being a vector of integer counters, are unbounded in size, but so is the number of events that must be ordered and timestamped by them. In short, vector clocks order an unlimited number of events occurring in a given number of processes.

If we consider the role of version vectors, data causality, there is always a limit to the number of possible relations that can be established on the set of replicas. This limit is independent of the number of update events that are considered in any given run. For example, in a two replica system with replicas $a$ and $b$ only four cases can occur: $a = b$, $a < b$, $b < a$ and $a \parallel b$. If the two replicas are already divergent, the inclusion of new update events in either replica does not change their mutual divergence and the corresponding relation between them. In short, version vectors order a given number of replicas, according to an unlimited number of update events.

The existence of a limited number of relations is a necessary but not sufficient condition for the existence of a bounded characterization mechanism. A relation, which is a global abstraction, must be encoded and computed through local operations on replica pairs without the need for a global view. This is one of the important properties of version vectors.

2 Data causality and version vectors

Data causality on a set of replicas can be assessed via set inclusion of the sets of update events known to each replica. Data causality is the pre-order defined by:

$a \preceq b$ iff $E_a \subseteq E_b$

$E_a$ and $E_b$ being the sets of update events (globally unique events) known to replicas $a$ and $b$.

When tracking data causality with version vectors in an $N$ replica system, one associates with each replica $a$ a vector $v_a$ of $N$ integer counters. The order on version vectors is the standard pointwise (coordinatewise) order:

$v_a \le v_b$ iff $\forall i.\ v_a[i] \le v_b[i]$

where $v_a[i]$ denotes component $i$ of vector $v_a$.
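The set-inclusion definition of data causality can be illustrated with a minimal Python sketch. The names `Relation` and `compare_events` are hypothetical and not taken from the paper; the sketch only classifies two replicas into the four possible cases of a two replica system by comparing their sets of known update events.

```python
from enum import Enum
from typing import FrozenSet


class Relation(Enum):
    """The four possible outcomes when comparing two replicas (illustrative names)."""
    EQUIVALENT = "equivalent"  # both know exactly the same updates
    OBSOLETE = "obsolete"      # the first knows strictly fewer updates
    NEWER = "newer"            # the first knows strictly more updates
    DIVERGENT = "divergent"    # each knows an update the other lacks


def compare_events(events_a: FrozenSet[str], events_b: FrozenSet[str]) -> Relation:
    """Classify two replicas by inclusion of their sets of known update events."""
    a_le_b = events_a <= events_b  # subset test
    b_le_a = events_b <= events_a
    if a_le_b and b_le_a:
        return Relation.EQUIVALENT
    if a_le_b:
        return Relation.OBSOLETE
    if b_le_a:
        return Relation.NEWER
    return Relation.DIVERGENT
```

Once two replicas are divergent, adding further events to either set cannot make one a subset of the other, which is why the number of relations stays bounded while the number of events does not.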

The operations on version vectors, formally presented in

Figure 1, are as follows:

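Initialization ($\mathrm{Init}()$) establishes the initial system state. All vectors are initialized with zeroes.

Update ($\mathrm{Upd}(a)$) an update event in replica $a$ increments $v_a[a]$.

Synchronization ($\mathrm{Sync}(a, b)$) synchronization of $a$ and $b$ is achieved by taking the pointwise join (greatest element) of $v_a$ and $v_b$.

Operation $\mathrm{Init}()$: $v'_a[i] = 0$

Operation $\mathrm{Upd}(a)$: $v'_a[i] = v_a[i] + 1$ if $i = a$; $v'_a[i] = v_a[i]$ otherwise.

Operation $\mathrm{Sync}(a, b)$: $v'_a[i] = v'_b[i] = v_a[i] \sqcup v_b[i]$

Figure 1: Semantics of version vector operations (primed symbols denote the state after the operation; state not mentioned is unchanged).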

This classic mechanism encodes data causality because comparing version vectors gives the same result as comparing sets of known update events. For all runs and replicas $a$ and $b$:

$a \preceq b$ iff $E_a \subseteq E_b$ iff $v_a \le v_b$

Figure 2 shows a run with version vectors in a four replica system. Updates and synchronizations are depicted by distinct marks, a synchronization being shown as two marks connected by a line.
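The three version vector operations and the pointwise order can be sketched in a few lines of Python. This is a minimal illustration under the assumption that the whole system state is held in one list of vectors; the function names `init`, `update`, `sync` and `leq` are chosen here and do not come from the paper.

```python
from typing import List


def init(n: int) -> List[List[int]]:
    """Initial state: one all-zero vector of size n per replica."""
    return [[0] * n for _ in range(n)]


def update(vectors: List[List[int]], a: int) -> None:
    """An update event in replica a increments its own component."""
    vectors[a][a] += 1


def sync(vectors: List[List[int]], a: int, b: int) -> None:
    """Both replicas adopt the pointwise maximum of their vectors."""
    joined = [max(x, y) for x, y in zip(vectors[a], vectors[b])]
    vectors[a] = list(joined)
    vectors[b] = list(joined)


def leq(va: List[int], vb: List[int]) -> bool:
    """Standard pointwise order on version vectors."""
    return all(x <= y for x, y in zip(va, vb))
```

For instance, after `update(vs, 0)`, `sync(vs, 0, 1)`, `update(vs, 1)` and `update(vs, 2)` on a three replica system, replica 0 is obsolete with respect to replica 1, while replicas 1 and 2 are mutually divergent because neither `leq` direction holds.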

2.1 Version vector slices

All operations over version vectors exhibit a pointwise nature: a given vector position is only compared with, or updated from, the same position in other vectors, since all information about updates originated in replica $i$ is stored in component $i$ of each version vector. This allows a decomposition of the replicated system into $N$ slices, where each slice represents the updates that were originated in a given replica. Slice $i$ for an $N$ replica system is made up of the $i$th component of each version vector:

$(v_0[i], v_1[i], \ldots, v_{N-1}[i])$

This means that data causality in $N$ replicas can be encoded by the concatenation of the representation for each of the $N$ slices. It also means that it is enough to concentrate on a subproblem: encoding the distributed knowledge about a single source of updates, and the corresponding version vector slice (VVS). The source of updates increments its counter and all other replicas keep potentially outdated copies of that counter; this subproblem amounts to storing a distributed representation of a total order.

Figure 2: Version Vectors: example run, depicting slice counters by a boxed digit.

For the remainder of the paper we will concentrate, for notational convenience and without loss of generality, on finding a bounded representation for slice 0. Figure 3 presents the semantics of version vectors restricted to slice 0; in the run presented in Figure 2 this slice is shown using boxed counters.

Operation $\mathrm{Init}()$: $v'_a[0] = 0$

Operation $\mathrm{Upd}(a)$: $v'_a[0] = v_a[0] + 1$ if $a = 0$; $v'_a[0] = v_a[0]$ otherwise.

Operation $\mathrm{Sync}(a, b)$: $v'_a[0] = v'_b[0] = v_a[0] \sqcup v_b[0]$

Figure 3: VVS semantics for slice 0.
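The decomposition into slices can be made concrete with a short Python sketch. The helper names `slice_of` and `from_slices` are hypothetical; the sketch assumes the system state is a list of version vectors indexed by replica, and shows that the slices together carry exactly the same information as the vectors.

```python
from typing import List


def slice_of(vectors: List[List[int]], i: int) -> List[int]:
    """Slice i: component i of every replica's version vector."""
    return [v[i] for v in vectors]


def from_slices(slices: List[List[int]]) -> List[List[int]]:
    """Rebuild the version vectors by concatenating the slices (a transpose)."""
    n = len(slices)
    return [[slices[i][a] for i in range(n)] for a in range(n)]
```

Within one slice the entry of the source replica is always the largest, and every other entry is a possibly outdated copy of it, so a slice stores a distributed view of a single total order.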

3 Informal presentation

We now give an informal presentation of the mechanism and some intuition of how it works and how it accomplishes its purpose. Having shown that it is enough to concentrate on a subproblem (a single source of updates) and the corresponding slice of version vectors, we now present the stamp that will replace, in each replica, the integer counter of the corresponding version vector.

For problem size $N$, i.e. assuming $N$ replicas, with replica $0$ the “primary” where updates take place and replicas $1, \ldots, N-1$ the “secondary” replicas, we represent a stamp by something like

    c b a
    c a
    a
    c a

It has a representation of bounded size, as it consists of $N$ rows, each with at most $N$ symbols (letters here), taken from a finite set $\Sigma$. An example run consisting of four replicas is presented in Figure 4.

A stamp is, in abstract, a vector of totally ordered sets. Each of the $N$ components (rows in our notation) represents a total order, with the greatest element on the left (the first row above means $c > b > a$). In a stamp for replica $a$, row $a$ ($0 \le a < N$) is what we call the principal order (displayed with a gray background), while the other rows are the cached orders. (Thus, the stamp above would belong to a replica whose principal order is c a.) The cached order in row $b$ represents the principal order of replica $b$ at some point in time, propagated to replica $a$ (either directly or indirectly through several synchronizations).

The greatest element of the principal order (on the left, depicted in bold over gray) is what we call the principal element. It represents the most recent update (in the primary) known by the replica. In a representation using an infinite totally ordered set instead of $\Sigma$ nothing more would be needed. This element can be thought of as “corresponding” to the value of the integer counter in version vectors.

The left column in a stamp (depicted in bold) is what we call the principal vector; it is made up of the greatest element of each order (row). It represents the most recent local knowledge about the principal element of each replica (including itself).

In a stamp, there is a relationship between the principal order and the principal vector: the elements in the principal vector are the same ones as in the principal order. In other words, the set of elements in the principal vector is ordered according to the principal order.
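The parts of a stamp named above can be captured in a small Python data structure. This is a sketch only: the class `Stamp` and its method names are hypothetical, and the owner index is stored explicitly because it replaces the gray background used in the paper's drawings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stamp:
    index: int             # replica that owns the stamp (assumed explicit here)
    rows: List[List[str]]  # one total order per replica, greatest element first

    def principal_order(self) -> List[str]:
        """The row at the owner's own index."""
        return self.rows[self.index]

    def principal_element(self) -> str:
        """Greatest element of the principal order."""
        return self.rows[self.index][0]

    def principal_vector(self) -> List[str]:
        """Left column: the greatest element of every row."""
        return [row[0] for row in self.rows]

    def is_consistent(self) -> bool:
        """Principal vector and principal order hold the same set of symbols."""
        return set(self.principal_vector()) == set(self.principal_order())
```

As a usage example, the stamp drawn earlier can be built as `Stamp(1, [["c", "b", "a"], ["c", "a"], ["a"], ["c", "a"]])`, where the owner index 1 is an assumption made for illustration (index 3 gives the same principal order). Its principal vector is `["c", "c", "a", "c"]`, whose symbols are exactly those of the principal order `["c", "a"]`.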

3.1 Comparison and synchronization as well defined local operations

As we will show below, the mechanism is able to compare two stamps by a local operation on the respective principal orders. No global knowledge is used: not even a global order on the set of symbols is assumed. For comparison purposes $\Sigma$ is simply an unordered set, with elements that are ordered differently in different stamps. As an example, the comparison of

    b c
    c a
    c
    c

with

    c b a
    c a
    a
    c a

involves looking at the principal orders b c and c a, and gives that the second stamp is strictly smaller than the first.

Figure 4: Bounded stamps: example run.
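The local comparison can be sketched in Python as a membership test, following the claim made later in the paper that comparison between principal elements reduces to membership testing. The function names `leq` and `compare` and the string results are assumptions of this sketch; principal orders are given as lists with the greatest element first.

```python
from typing import Sequence


def leq(order_a: Sequence[str], order_b: Sequence[str]) -> bool:
    """a <= b when a's principal element occurs in b's principal order."""
    return order_a[0] in order_b


def compare(order_a: Sequence[str], order_b: Sequence[str]) -> str:
    """Compare two stamps of the same slice using only their principal orders."""
    a_le_b = leq(order_a, order_b)
    b_le_a = leq(order_b, order_a)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "less"
    if b_le_a:
        return "greater"
    # Assumed unreachable for stamps produced by a valid run of one slice.
    raise ValueError("principal orders do not belong to a reachable state")
```

In the example above, `compare(["b", "c"], ["c", "a"])` yields `"greater"`: the symbol c occurs in the order b c, whereas b does not occur in c a. No order on the symbols themselves is consulted.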

When synchronizing two stamps, in the positions of the two principal elements, the resulting value will be the maximum of the two principal elements; the rest of the resulting principal vector will be the pointwise maximum of the respective values. The comparisons are performed according to the principal orders of the two stamps involved.

It is important to notice that, in general, it is not possible to take two arbitrary total orders and merge them into a new total order. As such, it could be thought that computing the maximum as mentioned above is ill defined. As we will show, several properties of the model can be exploited that make these operations indeed possible and well defined. We will also show that it is possible to totally order the elements in the resulting principal vector, i.e. to obtain a new principal order.

3.2 Garbage collection for symbol reuse

The boundedness of the mechanism is only possible through symbol reuse. When an update operation is performed, instead of incrementing an integer counter, some symbol is chosen to become the new principal element. By using a finite set of symbols $\Sigma$, an update will eventually reuse a symbol that was already used in the past to represent some previous update that has been synchronized with other replicas.

However, by reusing symbols, an obvious problem arises that needs to be addressed: symbol reuse cannot compromise the well-definedness of the comparison operations described above. As an example, it would not be acceptable that, due to reuse, the principal orders of two stamps end up being a b c and c a, as it would not be possible to overcome the ambiguity between $a > c$ and $c > a$ and to infer which one is the greatest stamp.

To address the problem, the mechanism implements a distributed “garbage collection” of symbols. This is accomplished through the extra information in the cached orders. As we will show, any element in the principal order/vector of any replica is also present in the primary replica (in some principal or cached order). This is the key property towards symbol reuse: when an update is performed, any symbol which is not present in the primary replica is considered “garbage” and can be (re)used for the new principal element.

As an example, in Figure 4, when the final update occurs, symbol b can be used for the new principal element because it is not present in the primary replica:

    c
    c a
    c
    c

Notice that the scheme only assures that b does not occur in the principal orders/vectors. In this example b occurs in some cached orders of replicas

    c b a
    c a
    a
    c a

and

    c b a
    c a
    c
    c

but this is not a problem because those elements will not be used in comparisons; the “old” b will not be confused with the “new” b.

Figure 5: Counter mode principal vectors.
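The choice of a reusable symbol can be sketched as follows in Python. The function names `symbols_in` and `new_symbol` are hypothetical, and picking the smallest free symbol is just one deterministic choice assumed here; the alphabet passed in must be large enough for a free symbol to exist, a size this sketch does not attempt to establish.

```python
from typing import Iterable, List, Set


def symbols_in(rows: List[List[str]]) -> Set[str]:
    """All symbols held by a stamp, in its principal or cached orders."""
    return {symbol for row in rows for symbol in row}


def new_symbol(primary_rows: List[List[str]], alphabet: Iterable[str]) -> str:
    """Pick a symbol absent from the primary stamp, using only its local state."""
    used = symbols_in(primary_rows)
    for symbol in sorted(alphabet):
        if symbol not in used:
            return symbol  # considered garbage, so safe to reuse
    raise ValueError("alphabet too small: no free symbol in the primary stamp")
```

With the primary stamp shown above and an assumed alphabet of a, b and c, `new_symbol([["c"], ["c", "a"], ["c"], ["c"]], "abc")` returns `"b"`, matching the symbol reused in the example run.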

3.3 Synopsis of formal presentation

The formal presentation and proof of correctness will make use of an unbounded mechanism which we call the counter mode principal vectors (CMPV). This auxiliary mechanism represents what the evolution of the principal vector would be if we could afford to use integer counters. The mechanism makes use of the total order on natural numbers and does not encode orders locally. In Figure 5 we present part of the run in Figure 4 using the counter mode mechanism.

The bulk of the proof consists in establishing several properties of the CMPV model that allow the relevant comparison operations to be computed in a well-defined way using only local information. The key idea is that, exploiting these properties, bounded stamps can be seen as an encoding of CMPV using a finite set $\Sigma$, where the principal orders are used to encode the relevant order information.

Operation $\mathrm{Init}()$: $p'_a[i] = 0$

Operation $\mathrm{Upd}()$: $p'_0[i] = p_0[i] + 1$ if $i = 0$; $p'_0[i] = p_0[i]$ otherwise.

Operation $\mathrm{Sync}(a, b)$: $p'_a[i] = p'_b[i] = p_a[a] \sqcup p_b[b]$ if $i \in \{a, b\}$; $p'_a[i] = p'_b[i] = p_a[i] \sqcup p_b[i]$ otherwise.

Figure 6: Semantics of operations in CMPV ($p_a$ is the principal vector of replica $a$; primed symbols denote the state after the operation; state not mentioned is unchanged).

4 Counter Mode Principal Vectors

Version Vector Slices (VVS) rely on an unbounded totally ordered set: the integers. Their unbounded nature is actually a consequence of adopting a predetermined (and hence globally known) order relation to capture data causality among replicas. To overcome this, we enrich VVS in a way that order judgments become, in a sense, local to each replica. In this way, it will be possible to dynamically encode the causality order and open the perspective of bounding the “counters” domain.

For a replica index $a$, its local state in the CMPV model is defined as the tuple $(a, p_a)$ where $p_a$ is a vector of integers with size $N$, the principal vector for $a$ (see Figure 5). The value in position $i$ of vector $p_a$ is denoted by $p_a[i]$ and represents the knowledge of stamp $a$ concerning the most recent update known by stamp $i$. The element $p_a[a]$ plays a central role since it holds $a$'s view about the most recent update; this is essentially the information contained in VVS counters and we call it the principal element for stamp $a$.

Figure 6 defines the semantics of the operations in the CMPV model. Symbol $\sqcup$ denotes the join operation under integer ordering (i.e. taking the maximum element). Notice that the order information is only required to perform the synchronization operation. Moreover, comparisons are always between principal elements or pointwise (between the same position in two principal vectors). Occasionally, it will be convenient to use a dedicated notation for the result of the synchronization on stamps $a$ and $b$ (i.e. the principal vector of one of these stamps after synchronization).
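The CMPV operations can be sketched in Python as below, following the informal description of synchronization (maximum of the two principal elements in the two principal positions, pointwise maximum elsewhere). The function names are hypothetical and the whole system state is assumed to be held in one list of principal vectors, with replica 0 as the primary.

```python
from typing import List


def init(n: int) -> List[List[int]]:
    """Every principal vector starts as all zeroes."""
    return [[0] * n for _ in range(n)]


def update(p: List[List[int]]) -> None:
    """Only the primary (replica 0) updates: it increments its principal element."""
    p[0][0] += 1


def sync(p: List[List[int]], a: int, b: int) -> None:
    """Positions a and b take the larger principal element; others join pointwise."""
    top = max(p[a][a], p[b][b])
    merged = [top if i in (a, b) else max(x, y)
              for i, (x, y) in enumerate(zip(p[a], p[b]))]
    p[a] = list(merged)
    p[b] = list(merged)


def leq(p: List[List[int]], a: int, b: int) -> bool:
    """Data causality under CMPV compares principal elements only."""
    return p[a][a] <= p[b][b]
```

Running `update`, `sync(p, 0, 1)`, `update`, `sync(p, 0, 3)` and `sync(p, 1, 3)` on a four replica system leaves replicas 1 and 3 with the vector `[2, 2, 0, 2]`, which agrees with the values that survive from Figure 5.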

A trace consists of a sequence of operations starting with $\mathrm{Init}()$ and followed by an arbitrary number of updates and synchronizations. In the remainder, when stating properties in the CMPV, we will leave implicit that they only refer to reachable states, i.e. states that result from some trace of operations. Induction over the traces is the fundamental tool to prove invariance properties, such as the following simple facts about CMPV.

Proposition 1. For every replica , and index ,

1. ,

2. ,

3. .

Proof. Simple induction on the length of traces.

Given stamps $a$ and $b$ we define their data causality order under CMPV ($\le_{\mathrm{CMPV}}$) as the comparison of their principal elements:

$a \le_{\mathrm{CMPV}} b$ iff $p_a[a] \le p_b[b]$

By Figure 6 it can be seen that the computation of principal elements only depends upon principal elements. Moreover, if we restrict the impact of the operations to the principal element we recover the VVS semantics (Figure 3). This observation leads immediately to the correctness of CMPV as a data causality encoding for slice 0:

$v_a[0] \le v_b[0]$ iff $a \le_{\mathrm{CMPV}} b$

where $v_a[0]$ is the slice 0 counter of replica $a$. This result is not surprising since CMPV was defined as a semantics preserving extension of VVS.

Next we will show that the additional information con-

tained in the CMPV model makes it possible to avoid re-

lying on the integer order, and to replace it with a locally

encoded order. For this, we will use a non-trivial invariant

on the global state given by the following lemma. Its proof

is presented in the appendix since it requires an auxiliary

deﬁnition and some additional lemmata.

Lemma 2. For every stamp and and index ,

and implies

Proof. See appendix A.

Recall that the order information is only required to perform the synchronization operation. Moreover, comparisons are always between principal elements or pointwise (between the same position in two principal vectors). In the following we will show that these comparisons can be performed without relying on integer order as long as we can order the elements in the principal vector of each stamp individually.

Comparison between principal elements reduces to membership testing.

Proposition 3. For every stamp , ,

iff

Proof. If then, by Proposition 1(1) we have

that and so, by Lemma 2, .

If then, by Proposition 1(3) we have that

.

For a stamp , let us denote by the restriction of the

intrinsic integer order to the values contained in the princi-

pal vector :

iff and and

Using these orderings, we deﬁne new ones that are appro-

priate to perform the required comparisons. For stamps

and , let their combined order be deﬁned as:

iff and or

and

For convenience, we also deﬁne the corresponding join

operation as:

if

otherwise.

The following proposition establishes the claimed prop-

erties for this ordering.

Proposition 4. For every stamp and and index ,

1. iff

2. iff

Proof. (1) Follows directly from Propositions 1 and 3.

(2) Let . When Proposition 3

guaranties that and, by Lemma 2, we have

and then , which establishes . The

case is trivial since, either (in which case

), or and so . Let

(that is, ). The proof proceeds as in the previous

implication.

Restricted orders can be explicitly encoded (e.g. by a

sequence) and can be easily manipulated. We now show

that when a synchronization is performed, all the elements

in the resulting principal vector were already present in the

more up-to-date stamp. This means that the restricted order

that results is a restriction of the one from the more up-to-

date stamp.

Proposition 5. Let and be stamps and .

If then, for all ,

Proof. For the pointwise join : if

then ; if then, by Lemma 2, .

Otherwise, note that the resulting principal element ( ) is

already in .

These observations, together with the fact that the global state can only retain a bounded amount of integer values (an obvious limit is $N^2$, one per position of each of the $N$ principal vectors), open the way for a change in the domain from the integers in the CMPV model to a finite set.

5 Bounded Stamps

A migration from the domain of integer counters in CMPV

to a ﬁnite set is faced with the following difﬁculty: the

update operation should be able to choose a value, that is

not present in any principal vector, for the new principal

element in the primary.

Adopting a set sufﬁciently large (e.g. with ele-

ments) guaranties that such a choice exists under a global

view. The problem lies in making that choice using only

the information in the state of the primary. To overcome

this problem we make a new extension of the model that

allows the primary to keep track of all the values in use in

the principal vectors of all stamps.

We will present this new model parameterized by a set $\Sigma$ (the symbol domain), a distinguished element $\sigma_0 \in \Sigma$ (the initial element), and an oracle for new symbols $\mathrm{new}(\cdot)$ (satisfying an axiom described below). For each replica index $a$, its local state in the bounded stamps model is defined as $(a, p_a, o_a)$ where:

$a$ is the replica index;

$p_a$ is a vector of values from $\Sigma$ with size $N$, the principal vector;

$o_a$ is a vector of $N$ total orders, encoded as sequences, representing the full bounded stamp.

This last component contains all the information in the principal vector, the principal order and the cached orders. Although the principal vector is redundant (as each component $p_a[i]$ is also present in the first position of each $o_a[i]$), it is kept in the model for notational convenience in describing the operations and in establishing the correspondence between the models.

The intuitive idea is that the state for each stamp keeps an

explicit representation of the restricted orders. More pre-

cisely, for stamp , the sequence contains precisely

the elements of ordered downward (ﬁrst element is ).

From that sequence one easily deﬁnes the restricted order

for stamp , what we call principal order to emphasize its

explicit nature.

iff or

where denotes the sequence restricted to the elements

in , i.e. and . The combined order

and associated join are deﬁned precisely as in counter

mode, that is

iff or

The other sequences in keep information about (poten-

tially outdated) principal orders of other stamps — these are

called the cached orders.

Figure 7 gives the semantics for the operations in this

model. The oracle for new symbols is a function

that gives an element of satisfying the following axiom:

For every stamp ,

The argument in the oracle intends to emphasize

that the choice of the new symbol should be made based on

the primary local state.

Figure 7: Semantics of operations on BS model.

Data causality ordering under the Bounded Stamps

model is deﬁned by

iff

The correctness of the proposed model follows from the

observation that, apart from the cached orders used for the

symbol reuse mechanism, it is actually an encoding of the

CMPV model. To formalize the correspondence between

both models, we introduce an encoding function that

maps each integer in the CMPV model into the correspond-

ing symbol (in ) in the state resulting from a given trace.

This map is deﬁned recursively on the traces.

if ,

otherwise,

Where is the number of update events in , is the

bounded stamp for the primary after trace , and

gives a canonical choice for the new principal element on

the primary after the update. When we discard the cached

orders, the semantics of operations given in Figure 7 are

precisely the ones in CMPV (Figure 6) affected by the en-

coding map. Moreover, the principal orders are encodings

for the restricted orders presented in the previous section.

Lemma 6. For an arbitrary trace , replicas index and

:

1.

2. implies

3. iff

Proof. This results from a simple induction on the length

of traces. When the last operation was it is trivial. When

it was , the result follows from the induction hypothesis

and the axiom for the oracle . When it was ,

the result follows from induction hypothesis, the fact that,

since computes the required joins (Proposition 4), the

deﬁnitions of both models are the same, and the correctness

of the new restricted orders (Proposition 5).

As a simple consequence of the previous result, we can

state the following correctness result.

Proposition 7. For any arbitrary trace and replica in-

dexes and we have

iff

Proof. Immediate from Lemma 6 and the deﬁnitions of

and .

It remains to instantiate the parameters of the model. A trivial but unbounded instantiation would be: set $\Sigma$ as the integers, the initial element as value $0$, and the oracle as returning the successor of the current principal element of the primary. In this setting, principal orders would be an explicit representation of counter mode restricted orders. Obviously, we are interested in bounded instantiations of $\Sigma$. To show that such instantiations exist, we introduce the following lemma that highlights the role of cached orders. Once again we postpone its proof to the appendix since it uses a technique similar to the proof of Lemma 2.

Lemma 8. For every stamp there exists an such that

Proof. See appendix B.

We are now able to present a bounded instantiation for

the model. Let be a totally ordered set with ele-

ments (the total order is here only to avoid making non-

deterministic choices). We deﬁne:

and

Lemma 8 guaranties that satisﬁes the axiom. It fol-

lows then that it acts as an encoding of counter mode model

(Proposition 7). Thus we have constructed a bounded

model for the data causality problem in a slice, which gen-

eralizes, by concatenating slices, to the full data causality

problem addressed by version vectors.

6 Related Work

Regarding bounded replacements for version vectors there is, to our knowledge, no previous solution to the problem. The possible existence of a bounded substitute for version vectors was mentioned in [1] while introducing the version stamps concept. Version stamps allow the characterization of data causality in settings where version vectors cannot operate, namely when replicas can be created and terminated autonomously.

There have been several approaches to version vector compression. Update coalescing [14] takes advantage of the fact that several consecutive updates issued in isolation in a single replica can be made equivalent to a single large update. Update coalescing is intrinsic in bounded stamps since sequence restriction in the update operation discards non-propagated symbols. Dynamic compression [14] can effectively reduce the size of version vectors by removing a common minimum from all entries (along each slice). However, this technique requires distributed consensus on all replicas and therefore cannot progress if one or more replicas are unreachable. Unilateral version vector pruning [16] avoids distributed consensus by allowing unilateral deletion of inactive version vector entries, but relies on some timing assumptions on the physical clocks' skew.

Lightweight version vectors [8] develop an integer en-

coding technique that allows a gradual increase of integer

storage as counters increase. This technique is used in con-

junction with update coalescing to provide a dynamic size

representation. Hash histories [9] track data causality by

collecting hash ﬁngerprints of contents. This representa-

tion is independent of the number of replicas but grows in

proportion to the number of updates.

The minimality of vector clocks as a characterization of Lamport causality [11], presented by Charron-Bost [3] and recently re-addressed in [6], indicates particular runs where the full expressiveness of vector clocks is required. However there are cases in which smaller representations can operate: Plausible Clocks [19] offer a bounded substitute for vector clocks that is accurate in a large percentage of situations and may be used in settings where deviations only impact performance and not correctness; Resettable Vector Clocks [2] allow a bounded implementation of vector clocks under a specific communication pattern between processes.

The collection of cached copies of the knowledge in

other replicas has been explored before in [5, 20] and used

for optimization of message passing strategies. This con-

cept is sometimes referred to as matrix clocks [15]. These

clocks are based on integer counters and are similar to our

intermediate “counter mode principal vector” representa-

tion.

7 Conclusions

Version vectors are the key mechanism in the detection of

inconsistency and obsolescence among optimistically repli-

cated data. This mechanism has been used extensively in

the design of distributed ﬁle systems [10, 7], in particu-

lar for data causality tracking among ﬁle copies. It is well

known that version vectors are unbounded due to their use

of counters; some approaches in the literature have tried to

address this problem.

We have drawn attention to the fact that causally ordering a limited number of replicas does not require the full expressive power of version vectors. Due to the limited number of configurations among replicas, data causality tracking does not necessarily imply the use of unbounded mechanisms. As a consequence, Charron-Bost’s minimality of vector clocks cannot be transposed to version vectors.

We have noted that to ﬁnd a bounded alternative to

version vectors, it was enough to concentrate on a sub-

problem: keeping distributed knowledge about a total order

generated by a single entity.

The key to bounded stamps was defining an intermediate unbounded mechanism and showing that it was possible to perform comparisons without requiring a global total order; this was the bulk of the proof of correctness. Bounded stamps were then derived as an encoding into a finite set of symbols. This required the definition of a non-trivial symbol reuse mechanism that is able to progress even if an arbitrary number of replicas cease to participate in the exchanges. This mechanism may have a broader applicability beyond its current use (e.g. log dissemination and pruning) and become a building block in other mechanisms for distributed systems.

The construction of the mechanism was supported by a

simulator1, which was used in the proof of correctness so

as to probe (and discard) tentative hypotheses. The simula-

tor was also turned into a model checker which proved the

correctness up to , giving some conﬁdence before

the full proof of correctness was attempted.

Bounded version vectors are obtained by substituting bounded stamps for the integer counters in version vectors. They represent the first bounded mechanism for detection of obsolescence and mutual inconsistency in distributed systems.

References

[1] Paulo Sérgio Almeida, Carlos Baquero, and Victor Fonte.

Version stamps – decentralized version vectors. In Proceed-

ings of the 22nd International Conference on Distributed

Computing Systems (ICDCS), pages 544–551. IEEE Com-

puter Society, 2002.

[2] A. Arora, S. S .Kulkarni, and M. Demirbas. Resettable vec-

tor clocks. In 19th Symposium on Principles of Distributed

Computing (PODC’2000), Portland, 2000. ACM, 2000.

[3] Bernadette Charron-Bost. Concerning the size of logical

clocks in distributed systems. Information Processing Let-

ters, 39:11–16, 1991.

[4] Colin Fidge. Timestamps in message-passing systems that

preserve the partial ordering. In 11th Australian Computer

Science Conference, pages 55–66, 1989.

[5] Michael J. Fischer and A. Michael. Sacriﬁcing serializabil-

ity to attain high availability of data. In Proceedings of the

ACM Symposium on Principles of Database Systems, pages

70–75. ACM, 1982.

[6] V. K. Garg and C. Skawratananond. String realizers of

posets with applications to distributed computing. In Pro-

ceedings of the ACM Symposium on Principles of Dis-

tributed Computing (PODC’01), pages 72–80. ACM, 2001.

[7] Richard G. Guy, John S. Heidemann, Wai Mak, Thomas W.

Page, Gerald J. Popek, and Dieter Rothmeier. Implementation

of the Ficus replicated file system. In USENIX Conference

Proceedings, pages 63–71. USENIX, June 1990.

¹ http://gsd.di.uminho.pt/bvv/bvv-simulator.py

[8] Yun-Wu Huang and Philip Yu. Lightweight version vec-

tors for pervasive computing devices. In Proceedings of

the 2000 International Workshops on Parallel Processing,

pages 43–48. IEEE Computer Society, 2000.

[9] Brent ByungHoon Kang, Robert Wilensky, and John Kubi-

atowicz. The hash history approach for reconciling mutual

inconsistency. In Proceedings of the 23rd International

Conference on Distributed Computing Systems (ICDCS),

pages 670–677. IEEE Computer Society, 2003.

[10] James Kistler and M. Satyanarayanan. Disconnected operation

in the Coda file system. ACM Transactions on Computer

Systems, 10(1):3–25, February 1992.

[11] Leslie Lamport. Time, clocks and the ordering of events

in a distributed system. Communications of the ACM,

21(7):558–565, July 1978.

[12] Friedemann Mattern. Virtual time and global clocks in dis-

tributed systems. In Workshop on Parallel and Distributed

Algorithms, pages 215–226, 1989.

[13] D. Stott Parker, Gerald Popek, Gerard Rudisin, Allen

Stoughton, Bruce Walker, Evelyn Walton, Johanna Chow,

David Edwards, Stephen Kiser, and Charles Kline. Detection

of mutual inconsistency in distributed systems. IEEE Transactions

on Software Engineering, 9(3):240–246, 1983.

[14] David Howard Ratner. Roam: A Scalable Replication Sys-

tem for Mobile and Distributed Computing. PhD thesis,

1998. UCLA-CSD-970044.

[15] Frédéric Ruget. Cheaper matrix clocks. In Proceedings of

the 8th International Workshop on Distributed Algorithms,

pages 355–369. Springer Verlag, LNCS, 1994.

[16] Yasushi Saito. Unilateral version vector pruning using

loosely synchronized clocks. Technical Report HPL-2002-

51, HP Labs, 2002.

[17] Yasushi Saito and Marc Shapiro. Optimistic replication.

Technical Report MSR-TR-2003-60, Microsoft Research,

2003.

[18] R. Schwarz and F. Mattern. Detecting causal relationships

in distributed computations: In search of the holy grail.

Distributed Computing, 7(3):149–174, 1994.

[19] F. J. Torres-Rojas and M. Ahamad. Plausible clocks: constant

size logical clocks for distributed systems. Distributed

Computing, 12(4):179–196, 1999.

[20] G. T. J. Wuu and A. J. Bernstein. Efﬁcient solutions to the

replicated log and dictionary problems. In Proceedings of

the ACM Symposium on Principles of Distributed Comput-

ing (PODC’84), pages 232–242. ACM, 1984.

A Proof of Lemma 2

The hypotheses of Lemma 2 concern two stamps (say and

) in which we can identify some sort of conflict between

each stamp's knowledge: stamp has better knowledge

concerning the primary state ( ) but has an outdated

view of some other stamp (say ), i.e. .

Lemma 2 states that when this happens stamp already at-

tributes the value of to some other stamp (say — that is,

). In order to prove this result, it will be necessary

to reinforce this statement: not only but it is pos-

sible to identify a ﬂow of information between stamp and

. Moreover, this ﬂow of information (a sequence of syn-

chronization operations starting from to ) can be traced

in stamp ’s local state as a sequence of indexes enjoying

some properties. These sequences of indexes are called

delay paths and are defined as follows.

Definition 9 (Delay Path). A delay path between and

is a non-empty sequence of indexes such that,

for any stamp ,

1. ,

2. ,

3. for all ,

4. for all ,

5. for all .

Some simple facts concerning delay paths.

Proposition 10. Let be a delay path between

and . The following facts hold:

1. ,

2. ,

3. for all ,

4. .

Proof. The first three facts are immediate consequences

of the definition and Proposition 1. Regarding the last

fact, if occurred in a position , being , by condi-

tion (4) of delay paths we have ; but this contra-

dicts condition (3). Thus, only occurs in a singleton delay

path.

Some of the conditions on delay paths impose global

constraints that allow us to reason about global

state changes and their impact on local states. The

following lemma exposes the use of such global constraints.

Lemma 11 (Pointwise-join Lemma). Let be

a non-empty sequence of indexes. If for some ,

1. ,

2. for all , ,

3. for all and any stamp , if then

.

Then, for any stamp for which , there exists

such that and, for all ,

.

Proof. By induction on the length of the sequence

. For the base case (singleton sequence) we have

that . Since we have and the

remaining condition is vacuous. For the induction step, we

consider the following cases: If then we set

since . Otherwise, we know that

and, by (4), that . So we apply the induction hypoth-

esis to the sequence and set to the resulting

index plus 1.

We now show that the conditions in Lemma 2 are sufﬁ-

cient to establish the existence of delay paths.

Lemma 12. If and are two stamps and a position such

that

and

then there exists a delay path between and .

Proof. By induction on the length of the trace. If

the last operation was we use the singleton sequence

for the delay path and the conditions hold trivially. If the

last operation was consider the following cases:

:we pick the sequence that satisﬁes trivially

all conditions;

:after the update , which contradicts the

hypothesis;

:if then , which contradicts the

hypothesis. If we use the same delay path that

comes from the IH, which is still valid after the update

because it does not contain position , since

(Proposition 10).

:we use the same delay path from the IH, which

is still valid: (1,2,3) because and are not affected

by the update; (4) because only changes; (5) be-

cause even if for some we have , if ,

then due to (4).

If the last operation was (and let us assume, without loss

of generality, that is the more up-to-date stamp, i.e.

) we need to distinguish the following cases:

:we use the same delay path from the

IH, which is still valid: (1,2,3) because and are

not affected; (4) because can only increase; (5)

because for every , if , then either

is computed pointwise and follows from

the IH, or is either or and (by 4)

.

:stamps and become equal after the

synchronization and we pick the sequence for the

delay path;

:in this case the stamp re-

sults from the synchronization of and and we have

. Consider the following two

cases:

When and . First, given that

and , we can apply the IH to and

on index and establish the existence of a delay path

for in . Then we preﬁx it by ,

obtaining , which is a suitable delay path

between and , given that: (1) holds by construc-

tion, (2) from the IH, (3) from the IH and

(since ); (4) from the IH and

; (5) from the IH and because for

every stamp , .

Otherwise, then either or ; applying

the IH to either or and in position gives us

a valid delay path for the resulting conﬁguration (all

conditions hold, including (5) as shown for the case

).

:in this case the stamp re-

sults from the synchronization of and .

When is either or , we have ; but

this means (as and ) that ;

therefore is a delay path.

Otherwise, ; this means that

and by the IH there exists a delay path between

and . Given that also , Lemma 11 estab-

lishes the existence of a sequence

(preﬁx of ) that is a delay path between and

for the following reasons. Positions and do not

appear in — because we are assum-

ing , and for , otherwise

we would have (condition (4) of

delay paths of which is a preﬁx) and then

, which contradicts Lemma 11. Thus, all

elements , with are computed pointwise (i.e.

), making conditions (2), (3) and (5) immediate

consequences of Lemma 11. Condition (1) is trivially

observed ( is a prefix of ); and condition (4) follows

from the IH and because upon a join values can only

increase.

We can ﬁnally state Lemma 2.

Lemma (2). For every stamp and , and every index

,

and implies

Proof. Direct from Lemma 12.

B Proof of Lemma 8

Lemma 8 says that each principal order is already contained

in some cached order on the primary. Note that Lemma 2

already states that every principal element belongs to

the primary principal vector, and delay paths were used to

show where it can be found. Now, we will show that it is

precisely in the primary cached order located in the position

pointed out by the delay path between and that we can

ﬁnd all the elements in . To prove this we need to reason

about cached orders along delay paths. This suggests an

extension of these to what we call principal delay paths.

Definition 13. A principal delay path for stamp is a

delay path between and that additionally

satisﬁes the following condition: for every and

any stamp ,

implies or

and

We now prove the existence of principal delay paths by

extending the proof of existence in Lemma 12. Here we

only go through the cases that are relevant for the additional

condition.

Lemma 14. For every stamp there exists a principal

delay path.

Proof. (Sketch)

Consider the following additional arguments to the proof

of Lemma 12. If the last operation was (assume

):

:let . If is either or ,

we know that (since ). Let

. When , by condition (4), we

have or which determines that

. When is computed pointwise, the new

condition follows by the induction hypothesis.

:when and ,

let be the principal delay path for .

The new condition is verified for since,

the case is trivial (because

). For , the new condition is satisﬁed

since (Proposition 5).

in this case the primary re-

sults from the synchronization of and (i.e. is the

primary before synchronization). Since , then

is computed pointwise. By IH we get a principal

delay path to which we apply Lemma 11 to get a

new sequence where and never occur (cf. proof

of Lemma 12). The new condition follows by the in-

duction hypothesis.

Lemma (8). For every stamp there exists a position

such that

Proof. Let be the principal delay path for

(given by Lemma 14). Instantiating the new condition for

on we get that
