Content uploaded by Hitesh Tewari
Author content
All content in this area was uploaded by Hitesh Tewari on Jul 13, 2021
Content may be subject to copyright.
Framework for a DLT Based COVID-19 Passport
Sarang Chaudhari†, Michael Clear‡, Philip Bradish‡, and Hitesh Tewari‡
†Indian Institute of Technology, Delhi, ‡Trinity College Dublin, Ireland
Abstract. Uniquely identifying individuals across the various networks
they interact with on a daily basis remains a challenge for the digital
world that we live in, and therefore the development of secure and effi-
cient privacy preserving identity mechanisms has become an important
field of research. In addition, the popularity of decentralised decision
making networks such as Bitcoin has seen a huge interest in making use
of distributed ledger technology to store and securely disseminate end
user identity credentials. In this paper we describe a mechanism that
allows one to store the COVID-19 vaccination details of individuals on
a publicly readable, decentralised, immutable blockchain, and makes use
of a two-factor authentication system that employs biometric crypto-
graphic hashing techniques to generate a unique identifier for each user.
Our main contribution is the employment of a provably secure input-
hiding, locality-sensitive hashing algorithm over an iris extraction tech-
nique, that can be used to authenticate users and anonymously locate
vaccination records on the blockchain, without leaking any personally
identifiable information to the blockchain.
1 Introduction
Immunization is one of modern medicine’s greatest success stories. It is one of the
most cost-effective public health interventions to date, averting an estimated 2 to
3 million deaths every year. An additional 1.5 million deaths could be prevented
if global vaccination coverage improves [13]. The current COVID-19 pandemic
which has resulted in millions of infections worldwide [3] has brought into sharp
focus the urgent need for a “passport” like instrument, which can be used to
easily identify a user’s vaccination record, travel history etc., as they traverse the
globe. However, such instruments have the potential to discriminate or create
bias against citizens [10] if they are not designed with the aim of protecting
the user’s identity and/or any personal information stored about them on the
system.
Given the large number of potential users of such a system and the involve-
ment of many organizations in different jurisdictions, we need to design a system
that is easy to sign up to for end users, and for it to be rolled out at a rapid rate.
The use of hardware devices such as smart cards or mobile phones for storing
such data is going to be financially prohibitive for many users, especially those
This publication has emanated from research conducted with the financial support
of Science Foundation Ireland under Grant Number 13/RC/2094 (Lero).
arXiv:2008.01120v7 [cs.CR] 18 Jan 2021
2 Sarang Chaudhari†, Michael Clear‡, Philip Bradish‡, and Hitesh Tewari‡
in developing countries. Past experience has shown that such “hardware tokens”
are sometimes prone to design flaws that only come to light once a large number
of them are in circulation. Such flaws usually require remedial action in terms
of software or hardware updates, which can prove to be very disruptive.
An alternative to the above dilemma is an online passport mechanism. An
obvious choice for the implementation of such a system is a blockchain, that
provides a “decentralized immutable ledger” which can be configured in a man-
ner such that it can be only written to by authorised entities (i.e. there is no
requirement for a hard computation such as proof-of-work (PoW) to be carried
out for monetary reward), but can be queried by anyone. However, one of the
main concerns for such a system is based on: How does one preserve the privacy
of user’s data on a public blockchain while providing a robust mechanism to
link users to their data records securely? In other words, one of the key require-
ments is to avoid having any personally identifiable information (PII) belonging
to users stored on the blockchain.
In the subsequent sections we describe some of the key components of our
system and the motivation that led us to use them. The three major compo-
nents are - extraction of iris templates, a hashing mechanism to store them
securely and a blockchain technology. Finally, we present a formal description
of our framework which uses the aforementioned components as building blocks.
However, first we will briefly discuss some related work and then briefly some
preliminaries with definitions and notation that are used in the paper.
1.1 Related Work
There has been considerable work on biometric cryptosystems and cancellable
biometrics, which aims to protect biometric data when stored for the purpose of
authentication cf. [11]. Biometric hashing is one such approach that can achieve
the desired property of irreversibility, albeit without salting it does not achieve
unlinkability. Research in biometric hashing for generating the same hash for
different biometric templates from the same user is at an infant stage and existing
work does not provide strong security assurances. Locality-sensitive hashing is
the approach we explore in this paper, which has been applied to biometrics in
existing work; for example a recent paper by Dang et al. [4] applies a variant of
SimHash, a hash function we use in this paper, to face templates. However the
technique of applying locality-sensitive hashing to a biometric template has not
been employed, to the best of our knowledge, in a system such as ours.
2 Preliminaries
2.1 Notation
A quantity is said to be negligible with respect to some parameter λ, written
negl(λ), if it is asymptotically bounded from above by the reciprocal of all poly-
nomials in λ.
Framework for a DLT Based COVID-19 Passport 3
For a probability distribution D, we denote by x←$Dthe fact that xis
sampled according to D. We overload the notation for a set Si.e. y←$Sdenotes
that yis sampled uniformly from S. Let D0and D1be distributions. We denote
by D0≈
CD1and the D0≈
SD1the facts that D0and D1are computationally
indistinguishable and statistically indistinguishable respectively.
We use the notation [k] for an integer kto denote the set {1, . . . , k}.
Vectors are written in lowercase boldface letters.
The abbreviation PPT stands for probabilistic polynomial time.
2.2 Entropy
The entropy of a random variable is the average “information” conveyed by the
variable’s possible outcomes. A formal definition is as follows.
Definition 1. The entropy H(X)of a discrete random variable Xwhich takes
on the values x1, . . . , xnwith respective probabilities PrhX=x1i,...,PrhX=
xniis defined as
H(X) := −
n
X
i=1
PrhX=xiilog PrhX=xii
In this paper, the logarithm is taken to be base 2, and therefore we measure the
amount of entropy in bits.
3 Iris Template Extraction
Iris biometrics is considered one of the most reliable techniques for implementing
identification systems. For the ID of our system (discussed in the overview of
our framework in Section 6), we needed an algorithm that can provide us with
consistent iris templates, which will have not only low intra-class variability,
but also show high inter-class variability. This requirement is essential because
we would expect the iris templates for the same subject to be similar, as this
would then be hashed using the technique described in the section 4. Iris based
biometric techniques have received some good attention in the last decade. One
of the most successful technique was put forward by John Daugman [5], but
most of the current best-in-class techniques are patented and hence unavailable
for an open-source use. For the purpose of writing this paper, we have used the
work of Libor Masek [7] which is an open-source implementation of a reasonably
reliable iris recognition technique. Users can always opt for other commercial
biometric solutions when trying to deploy our work independently.
Masek’s technique works on grey-scale eye images, which are processed in
order to extract the binary template. First the segmentation algorithm, based
on a Hough Transform is used to localise the iris and pupil regions and also isolate
the eyelid, eyelash and reflections as shown in Figures 1a and 1b. The segmented
4 Sarang Chaudhari†, Michael Clear‡, Philip Bradish‡, and Hitesh Tewari‡
(a) (b)
(c)
(d)
Fig. 1: Masek’s Iris Template Extraction Algorithm
iris region is then normalised i.e, unwrapped into a rectangular block of constant
polar dimensions as shown in Figure 1c. The iris features are extracted from
the normalised image by one-dimensional Log-Gabor filters to produce a bit-
wise iris template and mask as shown in Figure 1d. We denote the complete
algorithm by Iris.ExtractFeatureVector which takes a scanned image as input and
outputs a binary feature vector fv ∈ {0,1}n(this algorithm is called upon by
our framework in Section 6). This data is then used for matching, where the
Hamming distance is used as the matching metric. We have used CASIA-Iris-
Interval database [2] in Figure 1 and for some preliminary testing.
Table 1 shows the performance of Masek’s technique as reported by him
in his original thesis [8]. This algorithm performs quite well for a threshold of
0.4 where the false acceptance rate (FAR) is 0.005 and the false rejection rate
(FRR) is 0. These values are used when we present our results to compare the
performance of our technique when a biometric template is first hashed and then
the hamming distance is measured to calculate the FAR and FRR, as opposed to
directly measuring the hamming distances in the original biometric templates.
One would assume the performance of our work to get better with the increase
in efficiency of extracting consistent biometric templates by other methods.
As aforementioned, we rely on the open-source MATLAB code by Libor
Masek. For each input image, the algorithm produces a binary template which
Framework for a DLT Based COVID-19 Passport 5
Threshold FAR FRR
0.20 0.000 74.046
0.25 0.000 45.802
0.30 0.000 25.191
0.35 0.000 4.580
0.40 0.005 0.000
0.45 7.599 0.000
0.50 99.499 0.000
Table 1: FAR and FRR for the CASIA-a Data Set
contains the iris information, and a corresponding noise mask which corresponds
to corrupt areas within the iris pattern, and marks bits in the template as cor-
rupt. These extracted iris templates and their corresponding masks are 20 ×480
binary matrices each. In the original work, only those bits in the iris pattern
that correspond to ‘0’ bits in the noise masks of both iris patterns were used
in the calculation of Hamming distance. A combined mask is then calculated
and both the templates are masked with it. Finally, the algorithm calculates the
bitwise XOR i.e. distance between the masked templates. The steps given below
provides an overview for the algorithm used by Libor:
Extraction:
(template1, mask1) = createiristemplate(image1)
(template2, mask2) = createiristemplate(image2)
Matching:
c mask =mask1∧mask2
masked template1=template1∧(¬c mask)
masked template2=template2∧(¬c mask)
distance =masked template1⊕masked template2
There are two major issues that we need to deal with before being able to
use these templates in our system, specifically, template masking and conversion
to linear vector.
3.1 Template Masking
The above matching technique requires one to have 2 pairs of iris patterns and
their corresponding masks to calculate the Hamming distance. However for the
application we are targeting, at any time during verification, the system would
have to match the extracted template of an individual (i.e. template and mask)
against a hashed template stored on the blockchain. This means that we cannot
6 Sarang Chaudhari†, Michael Clear‡, Philip Bradish‡, and Hitesh Tewari‡
incorporate the above matching algorithm into our system. We have two choices
to mitigate this problem and obtain the masked template for the remaining
steps:
1. We can calculate the masked template independently for each sample i.e.
masked template =template ∧(¬mask)
An underlying assumption for this method is that the masks for an individual
would be approximately the same in every sample. This assumption is not
too far-fetched as was clear from the preliminary analysis of our database.
We will refer to these as type1templates.
2. We can maintain a global mask which can be the defined as
global mask =mask1∧. . . ∧maskl
for all extracted maskiof all imageibelonging to the training database. And
at the time of verification, we generate the masked template as
masked template =template ∧(¬global mask)
This method has some added difficulty in finding the global mask and it also
discards more data from the iris patterns as opposed to directly using the re-
spective maskifor each templatei. But it helps in maintaining a consistency
among the masked templates. We will refer to these as type2templates.
In a follow-up to this paper, we will use and provide results for templates of
both types based on experiments we are conducting at the time of writing.
3.2 Conversion to Linear Vector
For our use case, we need a one-dimensional input stream which can be fed into
the hashing algorithm discussed in the subsequent sections. For converting those
masked template matrices into linear feature vectors, we have two naive choices
of concatenating either the row vectors or the column vectors. Before deciding
the type of conversion, let us look at an important key factor, which is rotational
inconsistencies in the iris templates.
Rotational inconsistencies are introduced due to rotations of the camera, head
tilts and rotations of the eye within the eye socket. The normalisation process
does not compensate these. In order to account for rotational inconsistencies,
when the Hamming distance of two templates is calculated, one template is
shifted left and right bit-wise, and a number of Hamming distance values are
calculated from successive shifts. This bit-wise shifting in the horizontal direction
corresponds to the rotation of the original iris region. This method was suggested
by Daugman [5], and corrects misalignment in the normalised iris pattern caused
by rotational differences during imaging. From the calculated Hamming distance
values, only the lowest is taken, since this corresponds to the best match between
two templates. Due to this, column-wise conversion seems like the most logical
choice as this would allow us to easily rotate the binary linear feature vectors.
Shifting the linear vector by 20 bits will correspond to shifting the iris template
once (recall that the dimension of iris templates is 20 ×480).
Framework for a DLT Based COVID-19 Passport 7
3.3 Wrap-up
Putting it all together, we define the steps of our algorithm Iris.ExtractFeatureVector
that we call upon later.
– Iris.ExtractFeatureVector(image):
•(template, mask) = createiristemplate(image) where createiristemplate
is Masek’s open source algorithm.
•Obtain masked template (either type1or type2as defined in Section 3.1).
•Convert masked template to linear vector as in Section 3.2.
•Output binary linear vector fv ∈ {0,1}n
Note that the parameter nis a global system parameter measuring the length
of the binary feature vectors outputted by Iris.ExtractFeatureVector.
4 Locality-sensitive Hashing
To preserve the privacy of individuals on the blockchain, the biometric data has
to be encrypted before being written to the ledger. Hashing is a good alter-
native to achieve this, but techniques such as SHA-256 and SHA-3 cannot be
used, since the biometric templates that we extracted above can show differences
across various scans for the same individual. Hence using those hash functions
would produce completely different hashes. Therefore, we seek a hash function
that generates “similar” hashes for similar biometric templates. This prompts
us to explore Locality-Sensitive Hashing (LSH), which has exactly this property.
Various LSH techniques have been researched to identify whether files (i.e. byte
streams) are similar based on their hashes. TLSH is a well-known LSH function
that exhibits high performance and matching accuracy but, does not provide a
sufficient degree of security for our application. Below we assess another type of
LSH function, which does not have the same runtime performance as TLSH, but
as we shall see, exhibits provable security for our application and therefore is a
good choice for adoption in our framework.
4.1 Input Hiding
In the cryptographic definition of one-way functions, it is required that it is hard
to find any preimage of the function. However, we can relax our requirements for
many applications because it does not matter if for example a random preimage
can be computed as long as it is hard to learn information about the specific
preimage that was used to compute the hash. In this section, we introduce a
property that captures this idea, a notion we call input hiding.
Input hiding means that if we choose some preimage xand give the hash
h=H(x) to an adversary, it is either computationally hard or information-
theoretically impossible for the adversary to learn xor any partial information
about x. This is captured in the following formal definition.
8 Sarang Chaudhari†, Michael Clear‡, Philip Bradish‡, and Hitesh Tewari‡
Definition 2. A hash function family Hwith domain X:= {0,1}nand range
Y:= {0,1}mis said to be input hiding if for all randomly chosen hash functions
H←$H, all i∈[n], all randomly chosen inputs x←$Xand all PPT adversaries
Ait holds that
Prhxi= 0 ∧ A(i, H(x)) →1i−
Prhxi= 1 ∧ A(i, H(x)) →1i
≤negl(λ)
where λis the security parameter.
4.2 Our Variant of SimHash
Random projection hashing, proposed by Charikar [1], preserves the cosine dis-
tance between two vectors in the output of the hash, such that two hashes are
probabilistically similar depending on the cosine distance between the two preim-
age vectors. This hash function is called SimHash. We describe a slight variant
of SimHash here, which we call S3Hash. In our variant, the random vectors that
are used are sampled from the finite field of F3={−1,0,1}. Suppose we choose
a hash length of mbits. Now for our purposes, the input vectors to the hash
are binary vectors in {0,1}nfor some n. First we choose mrandom vectors
ri←${−1,0,1}nfor i∈ {1, . . . , m}. Let R={ri}i∈{1,...,m}be the set of these
random vectors. The hash function S3HashR:{0,1}n→ {0,1}mis thus defined
as:
S3HashR(x) = (sgn(hx,r1i),...,sgn(hx,rmi)) (1)
where sgn :Z→ {0,1}returns 0 if its integer argument is negative and returns
1 otherwise. Note that the notation h·,·i denotes the inner product between the
two specified vectors. Let x1,x2∈ {0,1}nbe two input vectors. It holds for all
i∈ {1, . . . , m}that Pr[h(1)
i=h(2)
i] = 1 −θ(x1,x2)
πwhere h(1) =S3HashR(x1),
h(2) =S3HashR(x2) and θ(x1,x2) is the angle between x1and x2. Therefore the
similarity of the inputs is preserved in the similarity of the hashes.
An important question is: Is this hash function suitable for our application?
The answer is in the affirmative because it can be proved that the function
information-theoretically obeys a property we call input-hiding that we defined
in Section 4.1. We recall that this property means that if we choose some binary
vector x∈ {0,1}nand give the hash h=S3HashR(x) to an adversary, it is either
computationally hard or information-theoretically impossible for the adversary
to learn xor any partial information about x. This property is sufficient in our
application since we only have to ensure that no information is leaked about
the user’s iris template. We now prove that our variant locality-sensitive hash
function S3Hash is information-theoretically input hiding.
Theorem 1. Let Xdenote the random variable corresponding to the domain of
the hash function. If H(X)≥m+λthen S3Hash is information-theoretically
input hiding where λis the security parameter and H(X)is the entropy of X.
Framework for a DLT Based COVID-19 Passport 9
Proof. The random vectors in Rcan be thought of as vectors of coefficients
corresponding to a set of mlinear equation in nunknowns on the left hand side
and on the right hand side we have the melements, one for each equation, which
are components of the hash i.e. (h1, . . . , hm). Now the inner product is evaluated
over the integers and the sgn function maps an integer to an element of {0,1}
depending on its sign. The random vectors are chosen to be ternary. Suppose
we choose a finite field Fpwhere p≥2(m+ 1) is a prime. Since there will be no
overflow when evaluating the inner product in this field, a solution in this field is
also a solution over the integers. We are interested only in the binary solutions.
Because m < n, the system is underdetermined. Since there are n−mdegrees
of freedom in a solution, it follows that there are 2n−mbinary solutions and
each one is equally likely. Now let rdenote the redundancy of the input space
i.e. r=n−H(X). The fraction of the 2n−msolutions that are valid inputs
is 2n−m−r. If 2n−m−r>2λ, then the probability of an adversary choosing the
“correct” preimage is negligible in the security parameter λ. For this condition
to hold, it is required that n−m−r > λ (recall that r=n−H(X)), which
follows if H(X)≥m+λas hypothesized in the statement of the theorem. It
follows that information-theoretically an unbounded adversary has a negligible
advantage in the input hiding definition.
Our initial estimates suggest that the entropy of the distribution of binary
feature vectors outputted by Iris.ExtractFeatureVector is greater than m+λfor
parameter choices such as m= 256 and λ= 128. A more thorough analysis
however is deferred to future work.
4.3 Evaluation
We have ran experiments with S3Hash applied to feature vectors obtained using
our Iris.ExtractFeatureVector algorithm. The distance measure we use is the ham-
ming distance. The results of these experiments are shown in Table 2. Results
for a threshold of 0.3 in particular indicates that our approach shows promise.
We hope to make further improvements in future work.
5 Blockchain
A blockchain is used in the system for immutable storage of individuals’ vaccina-
tion records. The blockchain we employ is a permissioned ledger to which blocks
can only be added by authorized entities or persons such as hospitals, primary
health care centers, clinicians etc. Such entities have to obtain a public-key cer-
tificate from a trusted third party and store it on the blockchain as a transaction
before they are allowed to add blocks to the ledger. The opportunity to add a
new block is controlled in a round robin fashion, thereby eliminating the need
to perform a computationally intensive PoW process. Any transactions that are
broadcast to the P2P network are signed by the entity that created the transac-
tion, and can be verified by all other nodes by downloading the public key of the
10 Sarang Chaudhari†, Michael Clear‡, Philip Bradish‡, and Hitesh Tewari‡
Threshold FAR FRR
0.25 3.99 64.26
0.26 6.35 57.35
0.27 9.49 49.63
0.28 13.92 43.79
0.29 19.60 36.47
0.3 26.23 30.79
0.31 34.01 25.10
0.32 42.30 19.33
0.33 50.91 16.00
0.34 59.04 11.94
0.35 67.05 8.53
Table 2: FAR & FRR for the CASIA-Iris-Interval Data Set
signer from the ledger itself. An example of distributed ledger technology that
fulfills the above requirements is MultiChain [9].
5.1 Interface
We now describe an abstract interface for the permissioned blockchain that cap-
tures the functionality we need. Consider a set of parties ˆ
P. A subset of parties
P⊂ˆ
Pare authorized to write to the blockchain. Each party P ∈ Phas a se-
cret key skPwhich it uses to authenticate itself and gain permission to write to
the blockchain. How a party acquires authorization is beyond the scope of this
paper. For our purposes, the permissioned blockchain consists of the following
algorithms:
–Blockchain.Broadcast(P,skP,tx): On input a party identifier Pthat identifies
the sending party, a secret key skPfor party Pand a transaction tx (whose
form is described below), then broadcast the transaction tx to the peer-to-
peer network for inclusion in the next block. The transaction will be included
iff P ∈ P.
–Blockchain.AnonBroadcast(skP,tx) : On input a secret key skPfor a party P
and a transaction tx, then anonymously broadcast the transaction tx to the
peer-to-peer network for inclusion in the next block. The transaction will be
included iff P ∈ P.
–Blockchain.GetNumBlocks(): Return the total number of blocks currently in
the blockchain.
–Blockchain.RetrieveBlock(blockNo): Retrieve and return the block at index
blockNo, which is a non-negative integer between 0 and Blockchain.GetNumBlocks()−
1.
A transaction has the form (type,payload,party,signature). A transaction in an
anonymous broadcast is of the form (type,payload,⊥,⊥). The payload of a trans-
action is interpreted and parsed depending on its type. In our application, there
Framework for a DLT Based COVID-19 Passport 11
are two permissible types: ’rec’ (a record transaction which consists of a pair (ID,
record)) and ’hscan’ (biometric hash transaction which consists of a hash of an
iris feature vector). This will become clear from context in our formal descrip-
tion of our framework in the next section which makes use of the above interface
as a building block. The final point is that a block is a pair (hash,transactions)
consisting of the hash of the block and a set of transactions {txi}i∈[`].
6 Our Framework
6.1 Overview
In this section we provide a formal description of our proposed framework which
makes use of the building blocks presented in the previous sections. Our proposed
system utilises a two-factor authentication mechanism to uniquely identify an
individual on the blockchain. The parameters required to recreate an identifier
are based on information that “one knows” and biometric information that “one
possess”.
Fig. 2: Algorithm Workflow
Figure 2 describes the overall algorithm that we employ in our proposed
system. When a user presents themselves to an entity or organisation partic-
12 Sarang Chaudhari†, Michael Clear‡, Philip Bradish‡, and Hitesh Tewari‡
ipating in the system, they are asked for their DoB(dd/mm/yyyy) and Gen-
der(male/female/other). In addition, the organization captures a number of
scans of the user’s iris, and creates a hash H1(fv) from the feature vector ex-
tracted from the “best” biometric scan data. Our system can combine the user’s
DoB and Gender with H1(fv) to generate a unique 256-bit identifier (ID) for
the user:
ID =H2(DoB || Gender || H1(fv)) (2)
The algorithm tries to match the calculated hash H1(fv) with existing “anony-
mous” hashes that are stored on the blockchain. It may get back a set of hashes
that are somewhat “close” to the calculated hash. In that case the algorithm
concatenates each returned hash (Matchi) with the user’s DoB and Gender
to produce f
ID. It then tries to match f
ID with an ID in a vaccination record
transaction on the blockchain.
If a match is found then the user is already registered on the system and
has at least one vaccination record. At this point we may just wish to retrieve
the user’s records or add an additional record, e.g. when a booster dose has
been administered to the user. However if we go through the set of returned
matches and cannot match f
ID to an existing ID in a vaccination record on the
blockchain, i.e. this is the first time the user is presenting to the service, then we
store the iris scan hash data H1(fv) as an anonymous record on the blockchain,
and subsequently the ID and COVID-19 vaccination details for the user as
a separate transaction. In each case the transaction is broadcast at a random
interval on the blockchain peer-to-peer (P2P) network for it to be verified by
other nodes in the system, and eventually added to a block on the blockchain.
Uploading the two transactions belonging to a user at random intervals ensures
that the transactions are stored on separate blocks on the blockchain, and an
attacker is not easily able to identify the relationship between the two.
Figure 3 shows a blockchain in which there are three anonymous transactions
(i.e. Hash of Scan Data) and three COVID-19 vaccination record transactions
stored on the blockchain pertaining to different users. The reader is referred to
Section 4 for more details on how the hash is calculated in our system. Note
that the storage of the anonymous hash data has to be carried out only once
per registered user in the system.
6.2 Formal Description
We present a formal description of our framework in Figure 4 and Figure 5. Note
that the algorithms described in these figures are intended to formally describe
the fundamental desired functionality of our framework and are so described for
ease of exposition and clarity; in particular, they are naive and non-optimized,
specifically not leveraging more efficient data structures as would a real-world
implementation.
Let Hbe a family of collision-resistant hash functions. The algorithms in
Figure 4 are stateful (local variables that contain retrieved information from
Framework for a DLT Based COVID-19 Passport 13
Fig. 3: Blockchain Structure
the blockchain are shared and accessible to all algorithms). Furthermore, the
parameters tuple params generated in Setup is an implicit argument to all other
algorithms.
The algorithms invoked by the algorithms in Figure 4 can be found in Fig-
ure 5.
7 Conclusions and Future Work
In this paper we have detailed a framework to build a global vaccination passport
using a distributed ledger. The main contribution of our work is to combine a
Locality-sensitive hashing mechanism with a blockchain to store the vaccination
records of users. A variant of the SimHash LSH function is used to derive an
identifier that leaks no personal information about an individual. The only way
to extract a user’s record from the blockchain is by the user presenting themselves
in person to an authorised entity, and providing an iris scan along with other
personal data in order to derive the correct user identifier. However, our research
has raised many additional challenges and research questions whose resolution
require further investigation and experimentation, intended for future work. First
and foremost, we need to improve the accuracy of the Iris.ExtractFeatureVector
algorithm (e.g: deciding whether to use type1or type2template masking) and
to accurately compute the entropy of the feature vectors.
Furthermore, our variant of the SimHash algorithm, referred to in this paper
as S3Hash, requires further analysis and evaluation, especially with respect to
the domain of the random vectors ri, which are restricted to be ternary in this
paper. Additionally, we must choose a suitable blockchain. Finally, the overall
protocol would benefit from a thorough security analysis where not just privacy
but other security properties are tested.
The blockchain based mechanism that we have proposed can also be used as
ageneralised healthcare management system [6, 12] with the actual data being
stored off-chain for the purpose of efficiency. Once a user’s identifier has been
recreated it can be used to pull all records associated with the user, thereby re-
14 Sarang Chaudhari†, Michael Clear‡, Philip Bradish‡, and Hitesh Tewari‡
Algorithm Setup(1λ)
ri←${−1,0,1}nfor i∈ {1,...,m}for i∈[m].
R← {r1,...,rm}
H1←$H
H2←S3HashR
numBlocks ←0
ids ← ∅
records ← ∅
hscans ← ∅
Sync()
Return params := (H1, H2)
Algorithm AddRecord(P,skP,dob,gender,scan,record)
ID ←Authenticate(dob,gender,scan)
If ID =⊥:
ID ←Enroll(P,skP,dob,gender,scan,record)
Return
payload ←(ID,record)
σ←Sign(skP,payload)
tx ←(’rec’,payload,P, σ)
Blockchain.Broadcast(P,skP,tx)
records ←records ∪ {(ID,record)}
Algorithm FetchRecords(dob,gender,scan)
ID ←Authenticate(dob,gender,scan)
If ID =⊥:
Return ∅
results ← {record : (ID,record)∈records}
Return results
Fig. 4: Our Framework for a COVID-19 Passport
trieving their full medical history. We are in the middle of developing a prototype
implementation of the system and hope to present the results of our evaluation
in a follow-on paper. At some time in the future, we hope to trial the system in
the field, with the hope of rolling it out on a larger scale. Our implementation
will be made open source.
References
1. S. Charikar. Similarity estimation techniques from rounding algorithms. In Pro-
ceedings of the 34th Annual ACM Symposium on Theory of Computing, pages
380–388, 2002.
2. Chinese Academy Sciences’ Institute of Automation (CASIA) - Iris Image
Database. http://biometrics.idealtest.org/.
Framework for a DLT Based COVID-19 Passport 15
3. COVID-19 Dashboard by the Center for Systems Science and Engineering (CSSE)
at Johns Hopkins University (JHU). https://coronavirus.jhu.edu/map.html.
4. T. M. Dang, L. Tran, T. D. Nguyen, and D. Choi. Fehash: Full entropy hash for face
template protection. In Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition (CVPR) Workshops, June 2020.
5. J. Daugman. Statistical richness of visual phase information: Update on recognizing
persons by iris patterns. International Journal of Computer Vision, 45:25–38, 2001.
6. M. Hanley and H. Tewari. Managing lifetime healthcare data on the blockchain.
In 2018 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced
& Trusted Computing, Scalable Computing & Communications, Cloud & Big
Data Computing, Internet of People and Smart City Innovation, pages 246–251,
Guangzhou, 2018.
7. L. Masek. Recognition of human iris patterns for biometric identification. Final
Year Project, The School of Computer Science and Software Engineering, The
University of Western Australia, 2003.
8. L. Masek and P.Kovesi. Matlab source code for a biometric identification system
based on iris patterns. The School of Computer Science and Software Engineering,
The University of Western Australia, 2003.
9. Multichain. https://www.multichain.com/.
10. A. L. Phelan. Covid-19 immunity passports and vaccination certificates: scientific,
equitable, and legal challenges. The Lancet, 395(10237):1595 – 1598, 2020.
11. C. Rathgeb and A. Uhl. A survey on biometric cryptosystems and cancelable
biometrics. EURASIP J. Information Security, 2011:3, 2011.
12. H. Tewari. Blockchain research beyond cryptocurrencies. IEEE Communications
Standards Magazine, 3(4):21–25, Dec. 2019.
13. World Health Organization. https://www.who.int/news-room/
facts-in-pictures/detail/immunization.
16 Sarang Chaudhari†, Michael Clear‡, Philip Bradish‡, and Hitesh Tewari‡
Algorithm Authenticate(dob,gender,scan)
Sync()
fv ←Iris.ExtractFeatureVector(scan)
hscan ←H2(fv)
ID ←H1(dob kgender khscan)
If ID ∈ids:
Return ID
For each h∈hscans:
d←Dist(hscan, h)
If d < THRESHOLD:
f
ID ←H1(dob kgender kh)
If
f
ID ∈ids:
Return
f
ID
Return ⊥
Algorithm Enroll(P,skP,dob,gender,scan,initRecord)
Sync()
fv ←Iris.ExtractFeatureVector(scan)
hscan ←H2(fv)
ID ←H1(dob kgender khscan)
payload ←(ID,initRecord)
σ←Sign(skP,payload)
tx ←(’rec’,payload,P, σ)
Blockchain.Broadcast(P,skP,tx)
t←${1,...,100}
tx0←(’hscan’,hscan)
Queue execution of Blockchain.AnonBroadcast(skP,tx0)
after time t
ids ←ids ∪ {ID}
hscans ←hscans ∪ {hscan}
records ←records ∪ {(ID,initRecord)}
Return ID
Algorithm Sync()
newNumBlocks ←Blockchain.GetNumBlocks()
If newNumBlocks >numBlocks:
For numBlocks ≤i < newNumBlocks:
block ←Blockchain.RetrieveBlock(i)
(hash,transactions)←block
For each tx ∈transactions:
(type,payload,·,·)←tx
If type = ’rec’:
(ID,record)←payload
ids ←ids ∪ {ID}
records ←records ∪ {(ID,record)}
Else if type = ’hscan’:
hscan ←payload
hscans ←hscans ∪ {hscan}
numBlocks ←newNumBlocks
Fig. 5: Additional Algorithms Used By Our Framework