All content in this area was uploaded by Hitesh Tewari on Jul 13, 2021

Content may be subject to copyright.

Framework for a DLT Based COVID-19 Passport

Sarang Chaudhari†, Michael Clear‡, Philip Bradish‡, and Hitesh Tewari‡

†Indian Institute of Technology, Delhi, ‡Trinity College Dublin, Ireland

Abstract. Uniquely identifying individuals across the various networks they interact with on a daily basis remains a challenge for the digital world that we live in, and therefore the development of secure and efficient privacy-preserving identity mechanisms has become an important field of research. In addition, the popularity of decentralised decision-making networks such as Bitcoin has generated huge interest in using distributed ledger technology to store and securely disseminate end-user identity credentials. In this paper we describe a mechanism that allows one to store the COVID-19 vaccination details of individuals on a publicly readable, decentralised, immutable blockchain, and that makes use of a two-factor authentication system employing biometric cryptographic hashing techniques to generate a unique identifier for each user. Our main contribution is the employment of a provably secure input-hiding, locality-sensitive hashing algorithm over an iris extraction technique, which can be used to authenticate users and anonymously locate vaccination records on the blockchain, without leaking any personally identifiable information to the blockchain.

1 Introduction

Immunization is one of modern medicine’s greatest success stories. It is one of the

most cost-eﬀective public health interventions to date, averting an estimated 2 to

3 million deaths every year. An additional 1.5 million deaths could be prevented

if global vaccination coverage improves [13]. The current COVID-19 pandemic

which has resulted in millions of infections worldwide [3] has brought into sharp

focus the urgent need for a “passport” like instrument, which can be used to

easily identify a user’s vaccination record, travel history etc., as they traverse the

globe. However, such instruments have the potential to discriminate or create

bias against citizens [10] if they are not designed with the aim of protecting

the user’s identity and/or any personal information stored about them on the

system.

Given the large number of potential users of such a system and the involvement of many organizations in different jurisdictions, we need a system that is easy for end users to sign up to and that can be rolled out rapidly. The use of hardware devices such as smart cards or mobile phones for storing such data is going to be financially prohibitive for many users, especially those

This publication has emanated from research conducted with the ﬁnancial support

of Science Foundation Ireland under Grant Number 13/RC/2094 (Lero).

arXiv:2008.01120v7 [cs.CR] 18 Jan 2021


in developing countries. Past experience has shown that such “hardware tokens”

are sometimes prone to design ﬂaws that only come to light once a large number

of them are in circulation. Such ﬂaws usually require remedial action in terms

of software or hardware updates, which can prove to be very disruptive.

An alternative that avoids the above dilemma is an online passport mechanism. An obvious choice for implementing such a system is a blockchain, which provides a “decentralized immutable ledger” that can be configured so that only authorised entities can write to it (i.e. there is no requirement for a hard computation such as proof-of-work (PoW) to be carried out for monetary reward), while anyone can query it. However, one of the main concerns for such a system is: how does one preserve the privacy of users’ data on a public blockchain while providing a robust mechanism to link users to their data records securely? In other words, a key requirement is to avoid storing any personally identifiable information (PII) belonging to users on the blockchain.

In the subsequent sections we describe the key components of our system and the motivation that led us to use them. The three major components are the extraction of iris templates, a hashing mechanism to store them securely, and a blockchain. Finally, we present a formal description of our framework, which uses these components as building blocks. First, however, we briefly discuss related work and then present some preliminaries, with the definitions and notation used in the paper.

1.1 Related Work

There has been considerable work on biometric cryptosystems and cancellable biometrics, which aim to protect biometric data when stored for the purpose of authentication, cf. [11]. Biometric hashing is one such approach that can achieve the desired property of irreversibility, although without salting it does not achieve unlinkability. Research on biometric hashing that generates the same hash for different biometric templates from the same user is in its infancy, and existing work does not provide strong security assurances. Locality-sensitive hashing is the approach we explore in this paper. It has been applied to biometrics in existing work; for example, a recent paper by Dang et al. [4] applies a variant of SimHash, a hash function we use in this paper, to face templates. However, to the best of our knowledge, the technique of applying locality-sensitive hashing to a biometric template has not been employed in a system such as ours.

2 Preliminaries

2.1 Notation

A quantity is said to be negligible with respect to some parameter λ, written

negl(λ), if it is asymptotically bounded from above by the reciprocal of all poly-

nomials in λ.


For a probability distribution $D$, we denote by $x \xleftarrow{\$} D$ the fact that $x$ is sampled according to $D$. We overload the notation for a set $S$, i.e. $y \xleftarrow{\$} S$ denotes that $y$ is sampled uniformly from $S$. Let $D_0$ and $D_1$ be distributions. We denote by $D_0 \underset{C}{\approx} D_1$ and $D_0 \underset{S}{\approx} D_1$ the facts that $D_0$ and $D_1$ are computationally indistinguishable and statistically indistinguishable respectively.

We use the notation $[k]$ for an integer $k$ to denote the set $\{1, \ldots, k\}$.

Vectors are written in lowercase boldface letters.

The abbreviation PPT stands for probabilistic polynomial time.

2.2 Entropy

The entropy of a random variable is the average “information” conveyed by the

variable’s possible outcomes. A formal deﬁnition is as follows.

Definition 1. The entropy $H(X)$ of a discrete random variable $X$ which takes on the values $x_1, \ldots, x_n$ with respective probabilities $\Pr[X = x_1], \ldots, \Pr[X = x_n]$ is defined as

$$H(X) := -\sum_{i=1}^{n} \Pr[X = x_i] \log \Pr[X = x_i]$$

In this paper, the logarithm is taken to be base 2, and therefore we measure the amount of entropy in bits.
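As an illustration of Definition 1, the following minimal Python sketch computes the entropy in bits of a distribution given as a list of probabilities, and a plug-in estimate from observed samples. The function names `shannon_entropy` and `empirical_entropy` are our own and are not part of the framework.

```python
import math
from collections import Counter


def shannon_entropy(probabilities):
    """Entropy in bits of a discrete distribution given as probabilities."""
    # Outcomes with probability 0 contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def empirical_entropy(samples):
    """Plug-in entropy estimate (in bits) from a sequence of observed outcomes."""
    counts = Counter(samples)
    total = len(samples)
    return shannon_entropy(count / total for count in counts.values())


fair_coin = shannon_entropy([0.5, 0.5])        # one full bit per toss
biased_coin = shannon_entropy([0.9, 0.1])      # strictly less than one bit
uniform_byte = shannon_entropy([1 / 256] * 256)  # eight bits
```

A uniformly random bit carries exactly one bit of entropy, while any bias reduces it; this is the sense in which redundancy in iris feature vectors lowers their entropy below their length.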

3 Iris Template Extraction

Iris biometrics is considered one of the most reliable techniques for implementing identification systems. For the ID of our system (discussed in the overview of our framework in Section 6), we need an algorithm that provides consistent iris templates, with low intra-class variability and high inter-class variability. This requirement is essential because we expect the iris templates for the same subject to be similar, as they are then hashed using the technique described in Section 4. Iris-based biometric techniques have received considerable attention in the last decade. One of the most successful techniques was put forward by John Daugman [5], but most of the current best-in-class techniques are patented and hence unavailable for open-source use. For this paper, we have used the work of Libor Masek [7], which is an open-source implementation of a reasonably reliable iris recognition technique. Those deploying our work independently can always opt for other commercial biometric solutions.

Masek’s technique works on grey-scale eye images, which are processed to extract a binary template. First, a segmentation algorithm based on the Hough transform localises the iris and pupil regions and isolates the eyelids, eyelashes and reflections, as shown in Figures 1a and 1b. The segmented iris region is then normalised, i.e. unwrapped into a rectangular block of constant polar dimensions, as shown in Figure 1c. The iris features are extracted from the normalised image by one-dimensional Log-Gabor filters to produce a bit-wise iris template and mask, as shown in Figure 1d. We denote the complete algorithm by Iris.ExtractFeatureVector, which takes a scanned image as input and outputs a binary feature vector $\mathbf{fv} \in \{0,1\}^n$ (this algorithm is called upon by our framework in Section 6). This data is then used for matching, with the Hamming distance as the matching metric. We used the CASIA-Iris-Interval database [2] for Figure 1 and for some preliminary testing.

Fig. 1: Masek’s Iris Template Extraction Algorithm, panels (a) to (d)

Table 1 shows the performance of Masek’s technique as reported in his original thesis [8]. The algorithm performs well at a threshold of 0.4, where the false acceptance rate (FAR) is 0.005 and the false rejection rate (FRR) is 0. We use these values as a baseline when presenting our results: they let us compare the FAR and FRR obtained when a biometric template is first hashed and the Hamming distance is measured between hashes against those obtained by measuring Hamming distances directly between the original biometric templates. We would expect the performance of our approach to improve as other methods become better at extracting consistent biometric templates.

Threshold   FAR (%)   FRR (%)
0.20        0.000     74.046
0.25        0.000     45.802
0.30        0.000     25.191
0.35        0.000     4.580
0.40        0.005     0.000
0.45        7.599     0.000
0.50        99.499    0.000

Table 1: FAR and FRR for the CASIA-a Data Set

As mentioned above, we rely on the open-source MATLAB code by Libor Masek. For each input image, the algorithm produces a binary template, which contains the iris information, and a corresponding noise mask, which marks the bits of the template that fall in corrupt areas of the iris pattern. The extracted iris templates and their corresponding masks are each 20 × 480 binary matrices. In the original work, only those bits in the iris pattern that correspond to ‘0’ bits in the noise masks of both iris patterns are used in the calculation of the Hamming distance. To this end, a combined mask is calculated and both templates are masked with it. Finally, the algorithm calculates the bitwise XOR of the masked templates, from which the distance is obtained. The steps below provide an overview of the algorithm used by Masek:

Extraction:
    (template1, mask1) = createiristemplate(image1)
    (template2, mask2) = createiristemplate(image2)

Matching:
    c_mask = mask1 ∨ mask2
    masked_template1 = template1 ∧ (¬c_mask)
    masked_template2 = template2 ∧ (¬c_mask)
    distance = masked_template1 ⊕ masked_template2
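The matching steps can be sketched in Python as follows. This is an illustrative sketch rather than Masek’s MATLAB code: templates and masks are flat lists of bits, a mask bit of 1 is assumed to mean “corrupt”, and the two masks are combined so that only bits that are 0 in both masks are used, as stated in the text. Normalising the XOR count by the number of usable bits, and returning 1.0 when no bits are usable, are our assumptions, made so that the result is comparable with fractional thresholds such as those in Table 1.

```python
def combined_mask(mask1, mask2):
    """A bit is unusable if either noise mask flags it (1 = corrupt)."""
    return [a | b for a, b in zip(mask1, mask2)]


def apply_mask(template, mask):
    """Zero out the flagged bits: template AND (NOT mask)."""
    return [t & (1 - m) for t, m in zip(template, mask)]


def masked_hamming_distance(template1, mask1, template2, mask2):
    """Fraction of usable bits on which the two templates disagree."""
    c_mask = combined_mask(mask1, mask2)
    masked1 = apply_mask(template1, c_mask)
    masked2 = apply_mask(template2, c_mask)
    differing = sum(a ^ b for a, b in zip(masked1, masked2))
    usable = c_mask.count(0)
    # Assumption: treat "no usable bits" as the worst possible distance.
    return differing / usable if usable else 1.0


# Toy 8-bit example (made-up data, not taken from any iris database).
template1, mask1 = [1, 0, 1, 1, 0, 0, 1, 0], [0, 0, 0, 1, 0, 0, 0, 0]
template2, mask2 = [1, 1, 1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 1, 0]
distance = masked_hamming_distance(template1, mask1, template2, mask2)
```

In the toy example, two of the eight positions are masked out and the templates differ in one of the remaining six.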

There are two major issues that we need to deal with before being able to

use these templates in our system, speciﬁcally, template masking and conversion

to linear vector.

3.1 Template Masking

The above matching technique requires two pairs of iris patterns and their corresponding masks to calculate the Hamming distance. However, for the application we are targeting, at any time during verification the system has to match the extracted template of an individual (i.e. template and mask) against a hashed template stored on the blockchain. This means that we cannot incorporate the above matching algorithm into our system. We have two choices to mitigate this problem and obtain the masked template for the remaining steps:

1. We can calculate the masked template independently for each sample, i.e.

   masked_template = template ∧ (¬mask)

   An underlying assumption for this method is that the masks for an individual are approximately the same in every sample. This assumption is not too far-fetched, as was clear from the preliminary analysis of our database. We will refer to these as type 1 templates.

2. We can maintain a global mask, defined as

   global_mask = mask_1 ∨ ... ∨ mask_l

   for all extracted mask_i of all image_i belonging to the training database. At the time of verification, we generate the masked template as

   masked_template = template ∧ (¬global_mask)

   This method has the added difficulty of finding the global mask, and it also discards more data from the iris patterns than directly using the respective mask_i for each template_i. However, it helps maintain consistency among the masked templates. We will refer to these as type 2 templates.

In a follow-up to this paper, we will use and provide results for templates of

both types based on experiments we are conducting at the time of writing.
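The two masking options of this section can be sketched as follows. This is a minimal illustration on flat bit lists, assuming that a mask bit of 1 marks a corrupt position and that the global mask flags every position that is corrupt in at least one training mask (which is what makes it discard more data than a per-sample mask). The function names are hypothetical.

```python
def mask_template(template, mask):
    """Zero out flagged bits: template AND (NOT mask)."""
    return [t & (1 - m) for t, m in zip(template, mask)]


def global_mask(masks):
    """Flag every position that is flagged in at least one training mask."""
    result = [0] * len(masks[0])
    for mask in masks:
        result = [g | m for g, m in zip(result, mask)]
    return result


# Toy data: three 4-bit masks from a hypothetical training set.
training_masks = [[0, 1, 0, 0], [0, 0, 0, 1], [0, 1, 0, 0]]
template = [1, 1, 1, 1]

type1 = mask_template(template, training_masks[0])            # per-sample mask
type2 = mask_template(template, global_mask(training_masks))  # shared mask
```

The type 2 template keeps fewer bits than the type 1 template, but every type 2 template drops the same positions, which is the consistency property mentioned above.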

3.2 Conversion to Linear Vector

For our use case, we need a one-dimensional input stream which can be fed into the hashing algorithm discussed in the subsequent sections. To convert the masked template matrices into linear feature vectors, we have two naive choices: concatenating either the row vectors or the column vectors. Before deciding on the type of conversion, let us look at an important factor, namely rotational inconsistencies in the iris templates.

Rotational inconsistencies are introduced by rotations of the camera, head tilts and rotations of the eye within the eye socket. The normalisation process does not compensate for these. In order to account for rotational inconsistencies, when the Hamming distance of two templates is calculated, one template is shifted left and right bit-wise, and a number of Hamming distance values are calculated from successive shifts. This bit-wise shifting in the horizontal direction corresponds to a rotation of the original iris region. This method was suggested by Daugman [5], and corrects misalignment in the normalised iris pattern caused by rotational differences during imaging. From the calculated Hamming distance values, only the lowest is taken, since this corresponds to the best match between the two templates. For this reason, column-wise conversion is the most logical choice, as it allows us to easily rotate the binary linear feature vectors. Shifting the linear vector by 20 bits corresponds to shifting the iris template by one column (recall that the dimension of the iris templates is 20 × 480).
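The following sketch illustrates why column-wise flattening makes rotation compensation simple: rotating the matrix by one column is the same as rotating the flat vector by as many bits as the matrix has rows (20 for a 20 × 480 template). The helper names and the use of a cyclic shift are assumptions made for illustration; the example uses a tiny 2 × 4 matrix.

```python
def flatten_columnwise(matrix):
    """Concatenate the columns of a bit matrix into one flat list."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(cols) for r in range(rows)]


def rotate_columns(vector, rows, shift):
    """Cyclically rotate a column-wise flattened vector by `shift` columns.

    Positive shifts move columns to the right, negative shifts to the left.
    """
    k = (shift * rows) % len(vector)
    return vector[-k:] + vector[:-k] if k else list(vector)


def min_shifted_distance(v1, v2, rows, max_shift):
    """Lowest normalised Hamming distance over all shifts in [-max_shift, max_shift]."""
    best = 1.0
    for shift in range(-max_shift, max_shift + 1):
        shifted = rotate_columns(v1, rows, shift)
        distance = sum(a ^ b for a, b in zip(shifted, v2)) / len(v2)
        best = min(best, distance)
    return best


matrix = [[1, 0, 0, 1],
          [0, 1, 0, 0]]
# The same matrix with every column moved one place to the right (cyclically).
rotated = [[1, 1, 0, 0],
           [0, 0, 1, 0]]

flat = flatten_columnwise(matrix)
flat_rotated = flatten_columnwise(rotated)
best = min_shifted_distance(flat, flat_rotated, rows=2, max_shift=1)
```

Because the two toy matrices differ only by a one-column rotation, the minimum over shifts is zero even though the unshifted vectors disagree in several positions.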


3.3 Wrap-up

Putting it all together, we define the steps of our algorithm Iris.ExtractFeatureVector that we call upon later.

– Iris.ExtractFeatureVector(image):
  • (template, mask) = createiristemplate(image), where createiristemplate is Masek’s open-source algorithm.
  • Obtain the masked template (either type 1 or type 2 as defined in Section 3.1).
  • Convert the masked template to a linear vector as in Section 3.2.
  • Output the binary linear vector $\mathbf{fv} \in \{0,1\}^n$.

Note that the parameter $n$ is a global system parameter measuring the length of the binary feature vectors output by Iris.ExtractFeatureVector.

4 Locality-sensitive Hashing

To preserve the privacy of individuals on the blockchain, the biometric data has to be encrypted before being written to the ledger. Hashing is a good alternative for achieving this, but cryptographic hash functions such as SHA-256 and SHA-3 cannot be used, since the biometric templates extracted above can differ across scans of the same individual, and these hash functions would then produce completely different hashes. We therefore seek a hash function that generates “similar” hashes for similar biometric templates. This prompts us to explore Locality-Sensitive Hashing (LSH), which has exactly this property. Various LSH techniques have been researched to identify whether files (i.e. byte streams) are similar based on their hashes. TLSH is a well-known LSH function that exhibits high performance and matching accuracy but does not provide a sufficient degree of security for our application. Below we assess another type of LSH function, which does not have the same runtime performance as TLSH but, as we shall see, exhibits provable security for our application and is therefore a good choice for adoption in our framework.
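The reason a cryptographic hash is unsuitable here can be seen in a small experiment. The sketch below, which uses Python’s standard `hashlib`, flips a single bit in a made-up 256-bit “scan” and measures what fraction of the SHA-256 digest bits change. The input bytes are arbitrary illustrative data, not an iris template.

```python
import hashlib


def bit_difference_fraction(a: bytes, b: bytes) -> float:
    """Fraction of bit positions in which two equal-length byte strings differ."""
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (8 * len(a))


# Two "scans" of 256 bits that differ in exactly one bit.
scan1 = bytes([0b10110010] * 32)
scan2 = bytearray(scan1)
scan2[0] ^= 0b00000001
scan2 = bytes(scan2)

input_difference = bit_difference_fraction(scan1, scan2)
digest_difference = bit_difference_fraction(
    hashlib.sha256(scan1).digest(), hashlib.sha256(scan2).digest()
)
```

The inputs differ in 1 of 256 bits, whereas the two digests are expected to differ in roughly half of their bits, so closeness of scans is not reflected in closeness of SHA-256 hashes.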

4.1 Input Hiding

In the cryptographic deﬁnition of one-way functions, it is required that it is hard

to ﬁnd any preimage of the function. However, we can relax our requirements for

many applications because it does not matter if for example a random preimage

can be computed as long as it is hard to learn information about the speciﬁc

preimage that was used to compute the hash. In this section, we introduce a

property that captures this idea, a notion we call input hiding.

Input hiding means that if we choose some preimage $x$ and give the hash $h = H(x)$ to an adversary, it is either computationally hard or information-theoretically impossible for the adversary to learn $x$ or any partial information about $x$. This is captured in the following formal definition.


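Definition 2. A hash function family $\mathcal{H}$ with domain $\mathcal{X} := \{0,1\}^n$ and range $\mathcal{Y} := \{0,1\}^m$ is said to be input hiding if for all randomly chosen hash functions $H \xleftarrow{\$} \mathcal{H}$, all $i \in [n]$, all randomly chosen inputs $x \xleftarrow{\$} \mathcal{X}$ and all PPT adversaries $\mathcal{A}$ it holds that

$$\left| \Pr[x_i = 0 \wedge \mathcal{A}(i, H(x)) \to 1] - \Pr[x_i = 1 \wedge \mathcal{A}(i, H(x)) \to 1] \right| \le \mathrm{negl}(\lambda)$$

where $\lambda$ is the security parameter.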

4.2 Our Variant of SimHash

Random projection hashing, proposed by Charikar [1], preserves the cosine distance between two vectors in the output of the hash, such that two hashes are probabilistically similar depending on the cosine distance between the two preimage vectors. This hash function is called SimHash. We describe a slight variant of SimHash here, which we call S3Hash. In our variant, the random vectors are sampled from the finite field $\mathbb{F}_3 = \{-1, 0, 1\}$. Suppose we choose a hash length of $m$ bits. For our purposes, the input vectors to the hash are binary vectors in $\{0,1\}^n$ for some $n$. First we choose $m$ random vectors $\mathbf{r}_i \xleftarrow{\$} \{-1,0,1\}^n$ for $i \in \{1, \ldots, m\}$. Let $R = \{\mathbf{r}_i\}_{i \in \{1,\ldots,m\}}$ be the set of these random vectors. The hash function $\mathrm{S3Hash}_R : \{0,1\}^n \to \{0,1\}^m$ is thus defined as:

$$\mathrm{S3Hash}_R(\mathbf{x}) = (\mathrm{sgn}(\langle \mathbf{x}, \mathbf{r}_1 \rangle), \ldots, \mathrm{sgn}(\langle \mathbf{x}, \mathbf{r}_m \rangle)) \qquad (1)$$

where $\mathrm{sgn} : \mathbb{Z} \to \{0,1\}$ returns 0 if its integer argument is negative and returns 1 otherwise. Note that the notation $\langle \cdot, \cdot \rangle$ denotes the inner product between the two specified vectors. Let $\mathbf{x}_1, \mathbf{x}_2 \in \{0,1\}^n$ be two input vectors. It holds for all $i \in \{1, \ldots, m\}$ that

$$\Pr[h^{(1)}_i = h^{(2)}_i] = 1 - \frac{\theta(\mathbf{x}_1, \mathbf{x}_2)}{\pi}$$

where $h^{(1)} = \mathrm{S3Hash}_R(\mathbf{x}_1)$, $h^{(2)} = \mathrm{S3Hash}_R(\mathbf{x}_2)$ and $\theta(\mathbf{x}_1, \mathbf{x}_2)$ is the angle between $\mathbf{x}_1$ and $\mathbf{x}_2$. Therefore the similarity of the inputs is preserved in the similarity of the hashes.
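Equation (1) translates directly into code. The following pure-Python sketch samples ternary projection vectors, evaluates S3Hash on binary inputs, and compares the hashes of a slightly perturbed input and of an unrelated input. The parameters (n = 512, m = 256, 10 flipped bits), the fixed seed and the helper names are illustrative assumptions, and Python’s `random` module is used only for demonstration; a deployment would need a cryptographically secure source of randomness.

```python
import random


def sample_projections(n, m, rng):
    """Sample m random ternary vectors r_i from {-1, 0, 1}^n."""
    return [[rng.choice((-1, 0, 1)) for _ in range(n)] for _ in range(m)]


def sgn(value):
    """Return 0 for negative integers and 1 otherwise, as in Equation (1)."""
    return 0 if value < 0 else 1


def s3hash(x, projections):
    """Hash a binary vector x to m bits: one sign bit per inner product <x, r_i>."""
    return [sgn(sum(r_j for r_j, x_j in zip(r, x) if x_j)) for r in projections]


def hamming_fraction(h1, h2):
    """Fraction of positions in which two hashes differ."""
    return sum(a != b for a, b in zip(h1, h2)) / len(h1)


rng = random.Random(2021)
n, m = 512, 256
R = sample_projections(n, m, rng)

x1 = [rng.randint(0, 1) for _ in range(n)]
x2 = list(x1)
for j in rng.sample(range(n), 10):   # a "second scan": 10 of 512 bits flipped
    x2[j] ^= 1
x3 = [rng.randint(0, 1) for _ in range(n)]  # an unrelated input

d_similar = hamming_fraction(s3hash(x1, R), s3hash(x2, R))
d_unrelated = hamming_fraction(s3hash(x1, R), s3hash(x3, R))
```

Since the angle between `x1` and `x2` is small while `x1` and `x3` are unrelated, `d_similar` should come out well below `d_unrelated`, which is the locality-sensitive behaviour the framework relies on.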

An important question is: Is this hash function suitable for our application? The answer is in the affirmative because it can be proved that the function information-theoretically obeys the input-hiding property that we defined in Section 4.1. We recall that this property means that if we choose some binary vector $\mathbf{x} \in \{0,1\}^n$ and give the hash $h = \mathrm{S3Hash}_R(\mathbf{x})$ to an adversary, it is either computationally hard or information-theoretically impossible for the adversary to learn $\mathbf{x}$ or any partial information about $\mathbf{x}$. This property is sufficient in our application since we only have to ensure that no information is leaked about the user’s iris template. We now prove that our variant locality-sensitive hash function S3Hash is information-theoretically input hiding.

Theorem 1. Let $X$ denote the random variable corresponding to the domain of the hash function. If $H(X) \ge m + \lambda$ then S3Hash is information-theoretically input hiding, where $\lambda$ is the security parameter and $H(X)$ is the entropy of $X$.

Proof. The random vectors in $R$ can be thought of as vectors of coefficients corresponding to a set of $m$ linear equations in $n$ unknowns on the left hand side, and on the right hand side we have the $m$ elements, one for each equation, which are the components of the hash, i.e. $(h_1, \ldots, h_m)$. Now the inner product is evaluated over the integers and the sgn function maps an integer to an element of $\{0,1\}$ depending on its sign. The random vectors are chosen to be ternary. Suppose we choose a finite field $\mathbb{F}_p$ where $p \ge 2(m+1)$ is a prime. Since there will be no overflow when evaluating the inner product in this field, a solution in this field is also a solution over the integers. We are interested only in the binary solutions. Because $m < n$, the system is underdetermined. Since there are $n - m$ degrees of freedom in a solution, it follows that there are $2^{n-m}$ binary solutions and each one is equally likely. Now let $r$ denote the redundancy of the input space, i.e. $r = n - H(X)$. The number of the $2^{n-m}$ solutions that are valid inputs is $2^{n-m-r}$. If $2^{n-m-r} \ge 2^{\lambda}$, then the probability of an adversary choosing the “correct” preimage is negligible in the security parameter $\lambda$. For this condition to hold, it is required that $n - m - r \ge \lambda$ (recall that $r = n - H(X)$), which follows if $H(X) \ge m + \lambda$ as hypothesized in the statement of the theorem. It follows that information-theoretically an unbounded adversary has a negligible advantage in the input hiding definition.

Our initial estimates suggest that the entropy of the distribution of binary feature vectors output by Iris.ExtractFeatureVector is greater than $m + \lambda$ for parameter choices such as $m = 256$ and $\lambda = 128$. A more thorough analysis, however, is deferred to future work.
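To make the condition of Theorem 1 concrete, the following worked calculation plugs in the example parameters above. It assumes that the feature vector is the full flattened 20 × 480 template, so that $n = 9600$; this value of $n$ is our assumption and is not fixed by the text.

```latex
\begin{align*}
n &= 20 \times 480 = 9600, \qquad m = 256, \qquad \lambda = 128 \\
H(X) &\ge m + \lambda = 384 \\
r &= n - H(X) \le 9600 - 384 = 9216 \\
n - m - r &\ge 9600 - 256 - 9216 = 128 = \lambda
\end{align*}
```

Under this assumption, the theorem tolerates feature vectors in which all but 384 of the 9600 bits are redundant.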

4.3 Evaluation

We have run experiments with S3Hash applied to feature vectors obtained using our Iris.ExtractFeatureVector algorithm. The distance measure we use is the Hamming distance. The results of these experiments are shown in Table 2. At a threshold of 0.3, for example, the FAR is 26.23 and the FRR is 30.79, which is close to the point at which the two rates are equal; we take this as an indication that our approach shows promise. We hope to make further improvements in future work.
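For readers who wish to reproduce this kind of evaluation, the sketch below shows how FAR and FRR can be computed from lists of Hamming distances at a given threshold. It assumes that a comparison is accepted when the distance is less than or equal to the threshold, and it reports both rates as percentages; the distance values are made up for illustration and are not the data behind Table 2.

```python
def far_frr(genuine_distances, impostor_distances, threshold):
    """Return (FAR, FRR) in percent for a given decision threshold.

    genuine_distances:  distances between hashes of the same person
    impostor_distances: distances between hashes of different people
    """
    false_accepts = sum(d <= threshold for d in impostor_distances)
    false_rejects = sum(d > threshold for d in genuine_distances)
    far = 100.0 * false_accepts / len(impostor_distances)
    frr = 100.0 * false_rejects / len(genuine_distances)
    return far, frr


# Made-up distances, for illustration only.
genuine = [0.21, 0.27, 0.29, 0.33]
impostor = [0.28, 0.31, 0.36, 0.40, 0.45]

far, frr = far_frr(genuine, impostor, threshold=0.30)
```

Raising the threshold accepts more comparisons, so the FAR can only increase and the FRR can only decrease, which is the trade-off visible across the rows of Table 2.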

5 Blockchain

A blockchain is used in the system for immutable storage of individuals’ vaccina-

tion records. The blockchain we employ is a permissioned ledger to which blocks

can only be added by authorized entities or persons such as hospitals, primary

health care centers, clinicians etc. Such entities have to obtain a public-key cer-

tiﬁcate from a trusted third party and store it on the blockchain as a transaction

before they are allowed to add blocks to the ledger. The opportunity to add a

new block is controlled in a round robin fashion, thereby eliminating the need

to perform a computationally intensive PoW process. Any transactions that are broadcast to the P2P network are signed by the entity that created the transaction, and can be verified by all other nodes by downloading the public key of the signer from the ledger itself. An example of distributed ledger technology that fulfills the above requirements is MultiChain [9].

Threshold   FAR (%)   FRR (%)
0.25        3.99      64.26
0.26        6.35      57.35
0.27        9.49      49.63
0.28        13.92     43.79
0.29        19.60     36.47
0.30        26.23     30.79
0.31        34.01     25.10
0.32        42.30     19.33
0.33        50.91     16.00
0.34        59.04     11.94
0.35        67.05     8.53

Table 2: FAR & FRR for the CASIA-Iris-Interval Data Set

5.1 Interface

We now describe an abstract interface for the permissioned blockchain that captures the functionality we need. Consider a set of parties $\hat{\mathcal{P}}$. A subset of parties $\mathcal{P} \subset \hat{\mathcal{P}}$ is authorized to write to the blockchain. Each party $P \in \mathcal{P}$ has a secret key $sk_P$ which it uses to authenticate itself and gain permission to write to the blockchain. How a party acquires authorization is beyond the scope of this paper. For our purposes, the permissioned blockchain consists of the following algorithms:

– Blockchain.Broadcast($P$, $sk_P$, tx): On input a party identifier $P$ that identifies the sending party, a secret key $sk_P$ for party $P$ and a transaction tx (whose form is described below), broadcast the transaction tx to the peer-to-peer network for inclusion in the next block. The transaction will be included iff $P \in \mathcal{P}$.
– Blockchain.AnonBroadcast($sk_P$, tx): On input a secret key $sk_P$ for a party $P$ and a transaction tx, anonymously broadcast the transaction tx to the peer-to-peer network for inclusion in the next block. The transaction will be included iff $P \in \mathcal{P}$.
– Blockchain.GetNumBlocks(): Return the total number of blocks currently in the blockchain.
– Blockchain.RetrieveBlock(blockNo): Retrieve and return the block at index blockNo, which is a non-negative integer between 0 and Blockchain.GetNumBlocks() − 1.

A transaction has the form (type, payload, party, signature). A transaction in an anonymous broadcast is of the form (type, payload, ⊥, ⊥). The payload of a transaction is interpreted and parsed depending on its type. In our application, there are two permissible types: ’rec’ (a record transaction, which consists of a pair (ID, record)) and ’hscan’ (a biometric hash transaction, which consists of a hash of an iris feature vector). This will become clear from context in the formal description of our framework in the next section, which uses the above interface as a building block. Finally, a block is a pair (hash, transactions) consisting of the hash of the block and a set of transactions $\{tx_i\}_{i \in [\ell]}$.
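The abstract interface can be mocked in memory for experimentation. The sketch below is a toy stand-in, not a real distributed ledger and not MultiChain’s API: authorization is modelled by a dictionary from party identifiers to secret keys, each accepted transaction is placed in its own block, and block hashes are computed with SHA-256 over the previous hash and the transaction’s textual representation. All class and method names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Transaction:
    type: str                          # 'rec' or 'hscan'
    payload: Any
    party: Optional[str] = None        # None plays the role of ⊥
    signature: Optional[bytes] = None  # None plays the role of ⊥


@dataclass
class Block:
    hash: str
    transactions: List[Transaction]


class InMemoryBlockchain:
    def __init__(self, authorized: Dict[str, bytes]):
        self._authorized = dict(authorized)  # party identifier -> secret key
        self._blocks: List[Block] = []

    def _append(self, tx: Transaction) -> None:
        # Simplification: one transaction per block, chained by hash.
        previous = self._blocks[-1].hash if self._blocks else ""
        digest = hashlib.sha256((previous + repr(tx)).encode()).hexdigest()
        self._blocks.append(Block(digest, [tx]))

    def broadcast(self, party: str, secret_key: bytes, tx: Transaction) -> bool:
        """Include tx only if the party is authorized and presents its key."""
        if party not in self._authorized or self._authorized[party] != secret_key:
            return False
        self._append(tx)
        return True

    def anon_broadcast(self, secret_key: bytes, tx: Transaction) -> bool:
        """Include tx without recording who sent it."""
        if secret_key not in self._authorized.values():
            return False
        self._append(Transaction(tx.type, tx.payload, None, None))
        return True

    def get_num_blocks(self) -> int:
        return len(self._blocks)

    def retrieve_block(self, block_no: int) -> Block:
        return self._blocks[block_no]
```

A real deployment would replace the key comparison with signature verification against certificates stored on the ledger, as described at the start of this section.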

6 Our Framework

6.1 Overview

In this section we provide a formal description of our proposed framework, which makes use of the building blocks presented in the previous sections. Our proposed system utilises a two-factor authentication mechanism to uniquely identify an individual on the blockchain. The parameters required to recreate an identifier are based on information that “one knows” and biometric information that “one possesses”.

Fig. 2: Algorithm Workﬂow

Figure 2 describes the overall algorithm that we employ in our proposed system. When a user presents themselves to an entity or organisation participating in the system, they are asked for their DoB (dd/mm/yyyy) and Gender (male/female/other). In addition, the organisation captures a number of scans of the user’s iris, and creates a hash $H_1(\mathbf{fv})$ from the feature vector extracted from the “best” biometric scan data. Our system combines the user’s DoB and Gender with $H_1(\mathbf{fv})$ to generate a unique 256-bit identifier (ID) for the user:

$$ID = H_2(\mathrm{DoB} \,\|\, \mathrm{Gender} \,\|\, H_1(\mathbf{fv})) \qquad (2)$$
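Equation (2) can be sketched as follows, using SHA-256 from Python’s standard `hashlib` as a stand-in for $H_2$ (it produces a 256-bit output, but the text does not name a specific function). The textual encoding of the iris hash and the `|` separator between fields are our assumptions; the equation itself only specifies concatenation.

```python
import hashlib


def derive_id(dob: str, gender: str, iris_hash) -> bytes:
    """Derive a 256-bit identifier from DoB, Gender and the iris hash H1(fv).

    iris_hash is a sequence of bits (0/1), e.g. the output of the LSH function.
    """
    # Assumption: encode the bits as a string of '0'/'1' characters and join
    # the three fields with '|' so that field boundaries are unambiguous.
    iris_text = "".join(str(bit) for bit in iris_hash)
    message = "|".join([dob, gender, iris_text]).encode("utf-8")
    return hashlib.sha256(message).digest()


example_hash = [1, 0, 1, 1, 0, 0, 1, 0]  # toy 8-bit stand-in for H1(fv)
user_id = derive_id("01/02/1990", "female", example_hash)
```

The identifier is reproducible only by someone who knows the date of birth and gender and can present an iris hash that is identical to the stored one, which is why the matching step described next works with the stored hashes rather than the freshly computed one.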

The algorithm tries to match the calculated hash $H_1(\mathbf{fv})$ with existing “anonymous” hashes that are stored on the blockchain. It may get back a set of hashes that are somewhat “close” to the calculated hash. In that case the algorithm combines each returned hash ($\mathrm{Match}_i$) with the user’s DoB and Gender, as in Equation (2), to produce a candidate identifier $\widetilde{ID}$. It then tries to match $\widetilde{ID}$ with an ID in a vaccination record transaction on the blockchain.

If a match is found then the user is already registered on the system and has at least one vaccination record. At this point we may just wish to retrieve the user’s records or add an additional record, e.g. when a booster dose has been administered to the user. However, if we go through the set of returned matches and cannot match $\widetilde{ID}$ to an existing ID in a vaccination record on the blockchain, i.e. this is the first time the user is presenting to the service, then we store the iris scan hash data $H_1(\mathbf{fv})$ as an anonymous record on the blockchain, and subsequently the ID and COVID-19 vaccination details for the user as a separate transaction. In each case the transaction is broadcast at a random interval on the blockchain peer-to-peer (P2P) network for it to be verified by other nodes in the system, and eventually added to a block on the blockchain. Uploading the two transactions belonging to a user at random intervals is intended to place them in separate blocks on the blockchain, so that an attacker is not easily able to identify the relationship between the two.
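The lookup described above can be sketched as follows. The sketch scans the stored anonymous hashes for ones that are close to the freshly computed hash, derives a candidate identifier from each close match, and checks whether that identifier appears in a vaccination record. The threshold value, the SHA-256 stand-in for $H_2$, the field encoding and all function names are illustrative assumptions; the data is made up.

```python
import hashlib


def hamming_fraction(h1, h2):
    """Fraction of positions in which two equal-length bit sequences differ."""
    return sum(a != b for a, b in zip(h1, h2)) / len(h1)


def derive_id(dob, gender, iris_hash):
    """Stand-in for Equation (2): SHA-256 over DoB, Gender and an iris hash."""
    iris_text = "".join(str(bit) for bit in iris_hash)
    return hashlib.sha256("|".join([dob, gender, iris_text]).encode()).digest()


def find_user_id(dob, gender, scan_hash, stored_hashes, record_ids, threshold):
    """Return the ID of an already-registered user, or None if no match exists."""
    for match in stored_hashes:
        if hamming_fraction(scan_hash, match) <= threshold:
            # The candidate ID is derived from the stored hash, not the new
            # scan, so small differences between scans do not change the ID.
            candidate_id = derive_id(dob, gender, match)
            if candidate_id in record_ids:
                return candidate_id
    return None


# Toy ledger contents: two anonymous hashes and one vaccination record ID.
enrolled = [1, 0, 1, 1, 0, 0, 1, 0]
other = [0, 1, 0, 0, 1, 1, 0, 1]
stored_hashes = [other, enrolled]
record_ids = {derive_id("01/02/1990", "female", enrolled)}

fresh_scan = [0, 0, 1, 1, 0, 0, 1, 0]  # differs from `enrolled` in one bit
found = find_user_id("01/02/1990", "female", fresh_scan,
                     stored_hashes, record_ids, threshold=0.3)
```

If `find_user_id` returns `None`, the user is treated as new: the fresh hash is stored as an anonymous transaction and the derived ID is stored with the vaccination record in a separate transaction.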

Figure 3 shows a blockchain in which there are three anonymous transactions

(i.e. Hash of Scan Data) and three COVID-19 vaccination record transactions

stored on the blockchain pertaining to diﬀerent users. The reader is referred to

Section 4 for more details on how the hash is calculated in our system. Note

that the storage of the anonymous hash data has to be carried out only once

per registered user in the system.

6.2 Formal Description

We present a formal description of our framework in Figure 4 and Figure 5. Note

that the algorithms described in these ﬁgures are intended to formally describe

the fundamental desired functionality of our framework and are so described for

ease of exposition and clarity; in particular, they are naive and non-optimized,

speciﬁcally not leveraging more eﬃcient data structures as would a real-world

implementation.

Fig. 3: Blockchain Structure

Let $\mathcal{H}$ be a family of collision-resistant hash functions. The algorithms in Figure 4 are stateful (local variables that contain retrieved information from the blockchain are shared and accessible to all algorithms). Furthermore, the parameters tuple params generated in Setup is an implicit argument to all other algorithms.

The algorithms invoked by the algorithms in Figure 4 can be found in Figure 5.

7 Conclusions and Future Work

In this paper we have detailed a framework to build a global vaccination passport

using a distributed ledger. The main contribution of our work is to combine a

Locality-sensitive hashing mechanism with a blockchain to store the vaccination

records of users. A variant of the SimHash LSH function is used to derive an

identiﬁer that leaks no personal information about an individual. The only way

to extract a user’s record from the blockchain is by the user presenting themselves

in person to an authorised entity, and providing an iris scan along with other

personal data in order to derive the correct user identifier. However, our research has raised many additional challenges and research questions whose resolution requires further investigation and experimentation, which we leave for future work. First and foremost, we need to improve the accuracy of the Iris.ExtractFeatureVector algorithm (e.g. by deciding whether to use type 1 or type 2 template masking) and to accurately compute the entropy of the feature vectors.

Furthermore, our variant of the SimHash algorithm, referred to in this paper as S3Hash, requires further analysis and evaluation, especially with respect to the domain of the random vectors $\mathbf{r}_i$, which are restricted to be ternary in this paper. Additionally, we must choose a suitable blockchain. Finally, the overall protocol would benefit from a thorough security analysis in which not just privacy but other security properties are tested.

The blockchain-based mechanism that we have proposed can also be used as

a generalised healthcare management system [6, 12], with the actual data being

stored off-chain for the purpose of efficiency. Once a user’s identifier has been

recreated, it can be used to pull all records associated with the user, thereby

retrieving their full medical history. We are in the middle of developing a prototype

implementation of the system and hope to present the results of our evaluation

in a follow-on paper. In the future, we hope to trial the system in

the field, with a view to rolling it out on a larger scale. Our implementation

will be made open source.

Algorithm Setup(1^λ)
    r_i ←$ {−1, 0, 1}^n for i ∈ {1, ..., m}
    R ← {r_1, ..., r_m}
    H_1 ←$ H
    H_2 ← S3Hash_R
    numBlocks ← 0
    ids ← ∅
    records ← ∅
    hscans ← ∅
    Sync()
    Return params := (H_1, H_2)

Algorithm AddRecord(P, sk_P, dob, gender, scan, record)
    ID ← Authenticate(dob, gender, scan)
    If ID = ⊥:
        ID ← Enroll(P, sk_P, dob, gender, scan, record)
        Return
    payload ← (ID, record)
    σ ← Sign(sk_P, payload)
    tx ← ('rec', payload, P, σ)
    Blockchain.Broadcast(P, sk_P, tx)
    records ← records ∪ {(ID, record)}

Algorithm FetchRecords(dob, gender, scan)
    ID ← Authenticate(dob, gender, scan)
    If ID = ⊥:
        Return ∅
    results ← {record : (ID, record) ∈ records}
    Return results

Fig. 4: Our Framework for a COVID-19 Passport
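
The control flow of AddRecord and FetchRecords in Fig. 4 can be sketched against an in-memory stand-in for the ledger. Everything below is an assumption made for illustration: the Node class, the use of SHA-256 for H_1, the "|" field separator, an HMAC in place of Sign (it is not a digital signature), and exact-match authentication on a precomputed hscan string rather than an iris scan. The delayed anonymous broadcast of hscan performed by Enroll is omitted, and add_record returns the identifier purely for convenience.

```python
import hashlib
import hmac
from typing import List, Optional, Set, Tuple


def h1(dob: str, gender: str, hscan: str) -> str:
    """Stand-in for H_1: SHA-256 over the three fields (assumption)."""
    return hashlib.sha256("|".join((dob, gender, hscan)).encode()).hexdigest()


def sign(sk: bytes, payload: Tuple[str, str]) -> str:
    """Placeholder for Sign(sk_P, payload): an HMAC, NOT a real signature."""
    return hmac.new(sk, repr(payload).encode(), hashlib.sha256).hexdigest()


class Node:
    """Hypothetical local state of one authorised entity."""

    def __init__(self) -> None:
        self.ledger: List[tuple] = []  # stand-in for the blockchain
        self.ids: Set[str] = set()
        self.records: Set[Tuple[str, str]] = set()

    def authenticate(self, dob: str, gender: str, hscan: str) -> Optional[str]:
        """Simplified: exact match only, no Dist/THRESHOLD search."""
        ident = h1(dob, gender, hscan)
        return ident if ident in self.ids else None

    def _broadcast(self, provider: str, sk: bytes, ident: str, record: str) -> None:
        """Build the signed 'rec' transaction and append it to the ledger."""
        payload = (ident, record)
        self.ledger.append(("rec", payload, provider, sign(sk, payload)))
        self.records.add(payload)

    def add_record(self, provider: str, sk: bytes, dob: str, gender: str,
                   hscan: str, record: str) -> str:
        ident = self.authenticate(dob, gender, hscan)
        if ident is None:
            # Unknown user: enrol them. Enrolment stores this first record,
            # so exactly one 'rec' transaction is broadcast on either path.
            ident = h1(dob, gender, hscan)
            self.ids.add(ident)
        self._broadcast(provider, sk, ident, record)
        return ident

    def fetch_records(self, dob: str, gender: str, hscan: str) -> Set[str]:
        ident = self.authenticate(dob, gender, hscan)
        if ident is None:
            return set()
        return {rec for (i, rec) in self.records if i == ident}


if __name__ == "__main__":
    node = Node()
    node.add_record("clinic", b"demo-key", "1990-01-01", "F", "1011", "dose 1")
    node.add_record("clinic", b"demo-key", "1990-01-01", "F", "1011", "dose 2")
    print(sorted(node.fetch_records("1990-01-01", "F", "1011")))
```

Note that the ledger only ever sees the derived identifier and the record, never the date of birth, gender, or scan-derived hash used to compute it.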

References

1. M. S. Charikar. Similarity estimation techniques from rounding algorithms. In

Proceedings of the 34th Annual ACM Symposium on Theory of Computing, pages

380–388, 2002.

2. Chinese Academy of Sciences’ Institute of Automation (CASIA) - Iris Image

Database. http://biometrics.idealtest.org/.


3. COVID-19 Dashboard by the Center for Systems Science and Engineering (CSSE)

at Johns Hopkins University (JHU). https://coronavirus.jhu.edu/map.html.

4. T. M. Dang, L. Tran, T. D. Nguyen, and D. Choi. FEHash: Full entropy hash for face

template protection. In Proceedings of the IEEE/CVF Conference on Computer

Vision and Pattern Recognition (CVPR) Workshops, June 2020.

5. J. Daugman. Statistical richness of visual phase information: Update on recognizing

persons by iris patterns. International Journal of Computer Vision, 45:25–38, 2001.

6. M. Hanley and H. Tewari. Managing lifetime healthcare data on the blockchain.

In 2018 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced

& Trusted Computing, Scalable Computing & Communications, Cloud & Big

Data Computing, Internet of People and Smart City Innovation, pages 246–251,

Guangzhou, 2018.

7. L. Masek. Recognition of human iris patterns for biometric identification. Final

Year Project, The School of Computer Science and Software Engineering, The

University of Western Australia, 2003.

8. L. Masek and P. Kovesi. MATLAB source code for a biometric identification system

based on iris patterns. The School of Computer Science and Software Engineering,

The University of Western Australia, 2003.

9. Multichain. https://www.multichain.com/.

10. A. L. Phelan. COVID-19 immunity passports and vaccination certificates: scientific,

equitable, and legal challenges. The Lancet, 395(10237):1595–1598, 2020.

11. C. Rathgeb and A. Uhl. A survey on biometric cryptosystems and cancelable

biometrics. EURASIP J. Information Security, 2011:3, 2011.

12. H. Tewari. Blockchain research beyond cryptocurrencies. IEEE Communications

Standards Magazine, 3(4):21–25, Dec. 2019.

13. World Health Organization. https://www.who.int/news-room/facts-in-pictures/detail/immunization.


Algorithm Authenticate(dob, gender, scan)
    Sync()
    fv ← Iris.ExtractFeatureVector(scan)
    hscan ← H_2(fv)
    ID ← H_1(dob ‖ gender ‖ hscan)
    If ID ∈ ids:
        Return ID
    For each h ∈ hscans:
        d ← Dist(hscan, h)
        If d < THRESHOLD:
            ĨD ← H_1(dob ‖ gender ‖ h)
            If ĨD ∈ ids:
                Return ĨD
    Return ⊥

Algorithm Enroll(P, sk_P, dob, gender, scan, initRecord)
    Sync()
    fv ← Iris.ExtractFeatureVector(scan)
    hscan ← H_2(fv)
    ID ← H_1(dob ‖ gender ‖ hscan)
    payload ← (ID, initRecord)
    σ ← Sign(sk_P, payload)
    tx ← ('rec', payload, P, σ)
    Blockchain.Broadcast(P, sk_P, tx)
    t ←$ {1, ..., 100}
    tx′ ← ('hscan', hscan)
    Queue execution of Blockchain.AnonBroadcast(sk_P, tx′) after time t
    ids ← ids ∪ {ID}
    hscans ← hscans ∪ {hscan}
    records ← records ∪ {(ID, initRecord)}
    Return ID

Algorithm Sync()
    newNumBlocks ← Blockchain.GetNumBlocks()
    If newNumBlocks > numBlocks:
        For numBlocks ≤ i < newNumBlocks:
            block ← Blockchain.RetrieveBlock(i)
            (hash, transactions) ← block
            For each tx ∈ transactions:
                (type, payload, ·, ·) ← tx
                If type = 'rec':
                    (ID, record) ← payload
                    ids ← ids ∪ {ID}
                    records ← records ∪ {(ID, record)}
                Else if type = 'hscan':
                    hscan ← payload
                    hscans ← hscans ∪ {hscan}
        numBlocks ← newNumBlocks

Fig. 5: Additional Algorithms Used By Our Framework
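
The two less obvious steps in Fig. 5, the fallback search in Authenticate and the incremental replay in Sync, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the prototype implementation: hashed scans are modelled as equal-length bit strings, Dist is taken to be Hamming distance, THRESHOLD is an arbitrary value, H_1 is modelled with SHA-256 and a "|" separator, and the chain is a plain list of blocks, each a list of transaction tuples. LocalState, h1, dist, sync and authenticate are hypothetical names. Unlike the pseudocode, authenticate here expects the caller to have run sync first and takes the hashed scan directly.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

THRESHOLD = 3  # hypothetical cutoff chosen only for this example


def h1(dob: str, gender: str, hscan: str) -> str:
    """Stand-in for H_1: SHA-256 over the three fields (assumption)."""
    return hashlib.sha256("|".join((dob, gender, hscan)).encode()).hexdigest()


def dist(a: str, b: str) -> int:
    """Hamming distance between two equal-length bit strings (assumed Dist)."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


@dataclass
class LocalState:
    """Local view of the ledger, mirroring the variables set up in Setup."""
    num_blocks: int = 0
    ids: Set[str] = field(default_factory=set)
    records: Set[Tuple[str, str]] = field(default_factory=set)
    hscans: Set[str] = field(default_factory=set)


def sync(state: LocalState, chain: List[List[tuple]]) -> None:
    """Replay only the blocks appended since the previous call."""
    for block in chain[state.num_blocks:]:
        for tx in block:
            kind, payload = tx[0], tx[1]
            if kind == "rec":
                ident, record = payload
                state.ids.add(ident)
                state.records.add((ident, record))
            elif kind == "hscan":
                state.hscans.add(payload)
    state.num_blocks = len(chain)


def authenticate(state: LocalState, dob: str, gender: str, hscan: str) -> Optional[str]:
    """Return the enrolled ID for this user, or None (the pseudocode's ⊥)."""
    ident = h1(dob, gender, hscan)
    if ident in state.ids:
        return ident  # the fresh scan hashed to exactly the enrolled value
    # Otherwise try every published hashed scan that is close to the fresh one.
    for h in state.hscans:
        if len(h) == len(hscan) and dist(hscan, h) < THRESHOLD:
            candidate = h1(dob, gender, h)
            if candidate in state.ids:
                return candidate
    return None
```

The fallback loop is what tolerates noise between scans: a fresh hash that differs from the enrolled one in a few bits still recovers the enrolled identifier, while a wrong date of birth or gender yields a candidate that is not in ids.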