All content in this area was uploaded by Diego F. Aranha on Dec 17, 2017
A Secure and Efficient Implementation of the
Quotient Digital Signature Algorithm (qDSA)
Armando Faz-Hernández, Hayato Fujii, Diego F. Aranha, and Julio López*
Institute of Computing, University of Campinas.
1251 Albert Einstein, Cidade Universitária. Campinas, São Paulo, Brazil.
{armfazh,dfaranha,jlopez}@ic.unicamp.br, hayato@lasca.ic.unicamp.br
Abstract. Digital signatures provide a means to publicly authenticate
messages sent over an insecure channel. Recently, the Quotient Digital
Signature Algorithm (qDSA) was introduced, aiming at key compatibility
with the Diffie-Hellman X25519 function. Due to the novelty of qDSA,
there remains a need for an optimized implementation that allows iden-
tifying the real impact of this new algorithm. In this work, we focus on
the secure and efficient implementation of qDSA. By leveraging the use
of precomputation on the right-to-left Joye’s algorithm, we reduced the
running time of signature generation by 30–35%, and the running time
of the verification procedure by 19%. In addition, for increased security,
we show a verification method that validates qDSA signatures unequivocally.
All of these improvements were included in an optimized software library
targeting 32-bit ARM and 64-bit Intel architectures. The improved
performance achieved on these platforms positions qDSA as
a competitive alternative for deploying digital signatures efficiently and
securely.
Keywords: qDSA · Digital Signatures · Elliptic Curve Cryptography ·
Secure Software · Montgomery Curves
1 Introduction
Digital signatures are public-key cryptographic schemes used to authenticate
messages sent over a public channel; thus, anyone with the knowledge of the
signer’s public-key is able to verify whether a signed message comes from a
reliable source. Digital signatures also provide other security services such as
data integrity, authentication, and non-repudiation. One of the most relevant
applications of digital signatures is the certification of public keys in the Public-
Key Infrastructure (PKI). In this scenario, a trusted authority issues and signs a
digital certificate that binds a public key to its owner; then, whenever an entity
*The authors acknowledge support during the development of this research from
Intel and FAPESP under project “Secure Execution of Cryptographic Algorithms”
(grant 14/50704-7), and from LG Electronics Inc. under project “Efficient and Secure
Cryptography for IoT”. The fourth author was partially supported by a research
productivity grant from CNPq.
claims to be the owner of a public key, the digital certificate must be presented;
therefore, anybody with the knowledge of the authority’s public key is able to
verify the signature of the certificate that attests this relationship.
In the last decades, several digital signature algorithms have been standard-
ized. In 1998, the National Institute of Standards and Technology (NIST) ap-
proved the use of the Digital Signature Algorithm (DSA) [24] and the RSA
digital signature [34]. Later in 2000, NIST also adopted the use of a digital sig-
nature algorithm that relies on the computational intractability of the elliptic
curve discrete logarithm problem; this method is known as the Elliptic Curve
Digital Signature Algorithm (ECDSA) [17,25]. Since their standardization, these
algorithms have been widely used in secure communication protocols, such as
the Transport Layer Security (TLS) protocol [33].
More recently, cutting-edge cryptographic research is in pursuit of efficient
digital signature algorithms. The introduction of the Edwards Digital Signa-
ture Algorithm (EdDSA) [2] is an example of the latest progress. EdDSA uses
Edwards curves, which belong to a special family of elliptic curves whose point
addition formulas are more efficient than the formulas used for an arbitrary curve
in the short Weierstrass model. Ed25519 [18] is an instance of EdDSA addressing
the 128-bit security level. Particularly, Ed25519 uses an Edwards curve derived
from the Montgomery curve known as Curve25519 [1]. This latter curve was
intended to accelerate the key exchange protocol leading to the Diffie-Hellman
X25519 function [41]. Although Ed25519 and X25519 can be used in conjunc-
tion benefiting from the common prime field arithmetic, the keys used in each
protocol are not entirely compatible.
To make this compatibility possible, novel alternatives were derived such
as the XEdDSA signature scheme [30]. In the past few months, an alternative
approach was proposed by Renes and Smith [32], who introduced a new signature
scheme based on Curve25519. They named this scheme the Quotient Digital
Signature Algorithm (qDSA) because scalar point multiplications are performed
on an algebraic variety generated by the quotient of an algebraic curve.
The most salient properties of qDSA are: first, it allows X25519 keys to be
used (without modification) for signing; and second, elliptic curve operations
are performed using only the x-coordinate of points (as provided by
Montgomery elliptic curves). On the other hand, given a qDSA signature, it
is easy to obtain a second signature that also passes the verification procedure.
Although this fact does not represent an attack per se, it opens the door to
a misuse of the cryptographic scheme that could potentially become an effective
attack [7,8]. Therefore, there is a need for methods that allow qDSA
signatures to be verified unequivocally.
Contributions. In view of the current scenario, our main contribution focuses on
the secure and efficient software implementation of qDSA. On the security side,
we provide a verification method that validates (without ambiguity) the correct
signature of a message, and we also analyze the overheads on space and time
introduced by our approach. On the efficiency side, we show a technique that ac-
celerates the key generation, signing, and verification procedures. This speedup
was achieved as a consequence of employing precomputed look-up tables during
the evaluation of the right-to-left Joye’s algorithm [19], using a similar approach
to the one introduced by Oliveira et al. [29]. Due to the novelty of qDSA, there
is a need for an optimized implementation beyond the one developed by qDSA’s
authors [32]. For this reason, we focus on the development of a software library
that supports both 32-bit ARM processors (Cortex M4, Cortex A7 and Cor-
tex A15 micro-architectures) and 64–bit Intel processors (Haswell and Skylake
micro-architectures). For all of these architectures, we use optimized prime field
arithmetic and elliptic curve operations leading to an efficient and secure im-
plementation of the qDSA signature scheme. The source code is available at:
http://github.com/armfazh/qdsa-space17.
Regarding the scalar point multiplication algorithm presented in [29], it re-
quires the use of points that are in small subgroups of the elliptic curve, i.e. low-
order points. An attacker can leverage the use of low-order points to weaken the
security of an implementation; for example, by means of side-channel attacks [9],
or by exploiting vulnerabilities in insecure implementations, like the ones found
in some cryptographic currencies [37]. For this reason and as a side result, we
describe a technique that avoids low-order points during the calculation of scalar
point multiplications.
The remainder of this document is organized as follows. In Sect. 2, we review
the qDSA scheme and the parameters used in our implementation. In Sect. 3, we
show how to accelerate the calculation of fixed-point multiplications. In Sect. 4,
we present a new verification procedure. In Sect. 5, we report the results of the
performance benchmark of our software library. Finally, in Sect. 6, we offer
some concluding remarks.
2 The Quotient Digital Signature Algorithm
The Quotient Digital Signature Algorithm (qDSA) is a Schnorr-like signature
scheme [35] that operates over a Kummer variety $\mathcal{K}$. This variety comes from
the quotient of an elliptic (or hyper-elliptic) curve $E$ as $\mathcal{K} = E/\langle\pm 1\rangle$, i.e. for the
case of elliptic curves, the points $P, -P \in E$ are mapped to a single element in
$E/\langle\pm 1\rangle$. Although this mapping does not preserve the group structure of $E$, it is
still possible to compute multiplications by integers. When qDSA is instantiated
with elliptic curves, the resulting Kummer variety is a one-dimensional projective
space $\mathbb{P}^1(\mathbb{F}_p)$, also known as the x-line (see [6,32] for more details).
In this section, we revisit elliptic curve operations on Montgomery curves;
then, we detail the qDSA signature scheme together with the instance generated
from Curve25519’s parameters.
2.1 Arithmetic of Montgomery Curves
Let $\mathbb{F}_p$ be a prime field. A Montgomery elliptic curve is defined over $\mathbb{F}_p$ as:
$$E_{A,B}/\mathbb{F}_p : By^2 = x^3 + Ax^2 + x, \tag{1}$$
where $A, B \in \mathbb{F}_p$, $A^2 \neq 4$, and $B \neq 0$. The set of solutions of this equation
forms a commutative group having as identity the element $\mathcal{O}$, which is known
as the point at infinity. Hence, given two points $P$ and $Q$, we can obtain a third
point $R$ such that $R = P + Q$. The inverse of a point $P = (x, y)$ is obtained as
$-P = (x, -y)$. For these curves, the order of the group is always divisible by
four [22]. Given an $n$-bit integer $k$ and a point $P$, the scalar point multiplication
is defined as $kP = \mathrm{sgn}(k) \sum_{i=0}^{n-1} 2^i k_i P$, where $k_i$ is the $i$-th bit of $|k|$.
For adding points, Montgomery found efficient formulas that operate over
the x-coordinate of points [22]. In order to apply these operations, the elliptic
curve must be embedded in a projective space. Let $\mathbb{P}^2(\mathbb{F}_p)$ be a projective space
of dimension two; then the projective representation of a point $P = (x_P, y_P)$
is $(\lambda X_P : \lambda Y_P : \lambda Z_P)$, such that $\lambda \neq 0$, $x_P = X_P/Z_P$, and $y_P = Y_P/Z_P$.
Montgomery noted that, in the projective space, a point addition can be calculated
using only the x-coordinate of the points. Therefore, the following function maps
elliptic curve points to elements in the Kummer variety $E/\langle\pm 1\rangle$:
$$\begin{aligned}
E &\to E/\langle\pm 1\rangle \cong \mathbb{P}^1(\mathbb{F}_p) \\
(X_P : Y_P : Z_P) &\mapsto (X_P : Z_P) \\
\mathcal{O} &\mapsto (1 : 0).
\end{aligned} \tag{2}$$
Let $P = (X_P : Z_P)$ and $Q = (X_Q : Z_Q)$ be two points mapped into the Kummer
variety. Montgomery devised a formula for computing differential additions
(dadd); thus, given $P$, $Q$, and $R = P - Q$ (all in projective coordinates) the
differential addition formula computes $P +_R Q = (X_{P +_R Q} : Z_{P +_R Q})$ as follows:
$$\begin{aligned}
X_{P +_R Q} &= Z_R (X_P X_Q - Z_P Z_Q)^2, \\
Z_{P +_R Q} &= X_R (X_P Z_Q - Z_P X_Q)^2.
\end{aligned} \tag{3}$$
For the particular case when the points to be added are equal, we have a point
doubling (doub) denoted as $2P = (X_{2P} : Z_{2P})$ and calculated as follows:
$$\begin{aligned}
X_{2P} &= (X_P^2 - Z_P^2)^2, \\
Z_{2P} &= 4 X_P Z_P (X_P^2 + A X_P Z_P + Z_P^2).
\end{aligned} \tag{4}$$
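As an illustration of Eqs. (3) and (4), the following Python sketch evaluates both formulas on the x-line. It assumes Curve25519's parameters ($p = 2^{255} - 19$, $A = 486662$, $x_G = 9$); the names `dadd`, `doub`, and `same_point` are chosen here for illustration, and Python integers give no constant-time guarantees.

```python
# Sketch of x-only arithmetic, Eqs. (3) and (4).
# Assumption: Curve25519 parameters; not constant-time.
p = 2**255 - 19
A = 486662

def dadd(P, Q, R):
    """Differential addition, Eq. (3): returns P + Q given R = P - Q."""
    (XP, ZP), (XQ, ZQ), (XR, ZR) = P, Q, R
    X = ZR * (XP * XQ - ZP * ZQ) ** 2 % p
    Z = XR * (XP * ZQ - ZP * XQ) ** 2 % p
    return (X, Z)

def doub(P):
    """Point doubling, Eq. (4)."""
    X, Z = P
    X2 = (X * X - Z * Z) ** 2 % p
    Z2 = 4 * X * Z * (X * X + A * X * Z + Z * Z) % p
    return (X2, Z2)

def same_point(P, Q):
    """Projective equality (X_P : Z_P) = (X_Q : Z_Q)."""
    return (P[0] * Q[1] - P[1] * Q[0]) % p == 0

G = (9, 1)
G2 = doub(G)              # 2G
G3 = dadd(G2, G, G)       # 3G = 2G + G, with difference G
G4 = doub(G2)             # 4G
```

Reaching the same multiple along two different routes (for example, $4G$ as `doub(G2)` or as `dadd(G3, G, G2)`) gives a quick consistency check of the two formulas.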
Based on (3) and (4), Montgomery also introduced an algorithm for computing
scalar point multiplications. The well-known Montgomery ladder algorithm
(Alg. 1) computes the x-coordinate of $kP$, given the x-coordinate of $P$ and an
$n$-bit integer scalar $k$. The cost of Alg. 1 is mainly determined by the number
of operations performed in each iteration; hence, the Montgomery ladder algorithm
takes one doubling operation and one differential addition per bit of $k$.

Algorithm 1 Montgomery Ladder Algorithm.
Input: $k \in \mathbb{Z}$ such that $k > 0$, and $P = (X_P : Z_P)$.
Output: $kP = (X_{kP} : Z_{kP})$.
1: Let $(k_{n-1} = 1, \ldots, k_0)_2$ be the binary representation of $k$.
2: Initialize $Q_0 \leftarrow 2P$, $Q_1 \leftarrow P$.
3: for $i \leftarrow n-2$ to $0$ do
4:   $(Q_0, Q_1) \leftarrow \mathrm{cswap}(k_i \oplus k_{i+1}, Q_0, Q_1)$
5:   $(Q_0, Q_1) \leftarrow (\mathrm{doub}(Q_0), \mathrm{dadd}(Q_0, Q_1, P))$  // $Q_0 \leftarrow 2Q_0$, $Q_1 \leftarrow Q_0 +_P Q_1$
6: end for
7: $(Q_0, Q_1) \leftarrow \mathrm{cswap}(k_0, Q_0, Q_1)$
8: return $Q_0$  // Return also $Q_1$ for y-coordinate recovery.

Algorithm 1 uses an auxiliary function $\mathrm{cswap}(b, U, V)$, which interchanges
the values of $U$ and $V$ whenever $b = 1$; otherwise, the values are not modified. Since
this function could introduce timing variability in its execution, cswap must
be securely implemented by adding countermeasures that prevent, for example,
timing attacks [4,20]. Consequently, we implemented cswap using Boolean
operations; thus, assuming $U$ and $V$ are $n$-bit strings, cswap is computed as
follows:
$$(U', V') = \mathrm{cswap}(b, U, V) = \big( (\neg M \wedge U) \oplus (M \wedge V),\ (M \wedge U) \oplus (\neg M \wedge V) \big), \tag{5}$$
where $M$ is an $n$-bit mask initialized to $(111\ldots1)_2$, i.e. $n$ ones, if $b = 1$; otherwise
$M = (000\ldots0)_2$, i.e. $n$ zeros.
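The mask-based swap of Eq. (5) and the ladder of Algorithm 1 can be sketched together in Python. This is only a minimal illustration under stated assumptions: Curve25519's parameters, a fixed 256-bit mask width, and the helper names `cswap_pt` and `x_affine`, which are introduced here. Python's arbitrary-precision integers do not provide constant-time behaviour, so the sketch shows the data flow rather than a hardened implementation.

```python
# Sketch of Eq. (5) and Algorithm 1 (assumed Curve25519 parameters).
p = 2**255 - 19
A = 486662
ELL = 2**252 + 27742317777372353535851937790883648493
MASK = (1 << 256) - 1          # assumed word size: 256 bits

def cswap(b, U, V):
    """Eq. (5): swap U and V when b = 1 using only Boolean operations."""
    M = -b & MASK              # all ones if b = 1, all zeros if b = 0
    notM = M ^ MASK
    return ((notM & U) ^ (M & V), (M & U) ^ (notM & V))

def cswap_pt(b, P, Q):
    """Apply cswap coordinate-wise to two projective points."""
    X0, X1 = cswap(b, P[0], Q[0])
    Z0, Z1 = cswap(b, P[1], Q[1])
    return (X0, Z0), (X1, Z1)

def dadd(P, Q, R):
    (XP, ZP), (XQ, ZQ), (XR, ZR) = P, Q, R
    return (ZR * (XP * XQ - ZP * ZQ) ** 2 % p,
            XR * (XP * ZQ - ZP * XQ) ** 2 % p)

def doub(P):
    X, Z = P
    return ((X * X - Z * Z) ** 2 % p,
            4 * X * Z * (X * X + A * X * Z + Z * Z) % p)

def ladder(k, P):
    """Algorithm 1: x-only Montgomery ladder for k > 0."""
    n = k.bit_length()
    Q0, Q1 = doub(P), P
    prev = 1                               # k_{n-1} = 1
    for i in range(n - 2, -1, -1):
        bit = (k >> i) & 1
        Q0, Q1 = cswap_pt(bit ^ prev, Q0, Q1)
        Q0, Q1 = doub(Q0), dadd(Q0, Q1, P)  # both use the old Q0
        prev = bit
    Q0, Q1 = cswap_pt(k & 1, Q0, Q1)
    return Q0

def x_affine(P):
    """Affine x-coordinate X/Z (Z must be nonzero)."""
    return P[0] * pow(P[1], p - 2, p) % p

G = (9, 1)
```

Since $G$ has order $\ell$, `ladder(ELL, G)` should land on the point at infinity (its $Z$ coordinate is zero), and `ladder(ELL - 1, G)` should return a point whose affine x-coordinate is again 9.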
2.2 Instantiating qDSA with Montgomery Curves
Domain Parameters of qDSA. Given an integer number $N$, the size of public
keys is fixed to $N$ bits and the signature's size is $2N$ bits. The following set
represents the domain parameters of the signature scheme:
$$\mathcal{D} = \{N, p, E_{A,B}, \ell, G, H\}, \tag{6}$$
where: $p$ is a large prime number such that $N \approx \log_2(p)$; $E_{A,B}$ is a Montgomery
elliptic curve defined over $\mathbb{F}_p$ that has a large prime subgroup of order $\ell$;
$G$ is a point of order $\ell$; and $H$ is a hash function producing $2N$-bit digests.
A qDSA Instance. Due to the performance features offered by the elliptic curve
named Curve25519 [1], it can also be used to produce an efficient instance of
qDSA; thus, $\mathcal{D}$ is specified as:
– Since $p = 2^{255} - 19$, we have $N = 256$.
– Curve25519 is defined over $\mathbb{F}_p$ as $E_{486662,1}$.
– This curve forms a group of order $8\ell$, where
  $$\ell = 2^{252} + 27742317777372353535851937790883648493 \tag{7}$$
  is a prime number.
– A point $G = (x_G, y_G)$ of order $\ell$ is fixed as $x_G = 9$ and $y_G = \sqrt{39420360} \in \mathbb{F}_p$
  such that $y_G$ is odd.
– Regarding the cryptographic hash function, the authors of qDSA selected
  an extendable-output function belonging to the Secure Hash Algorithm v3
  (SHA3) standard [26]; therefore, they selected $H$ as the SHAKE128 function,
  fixing its output size to 512 bits.
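A few of these parameters can be sanity-checked with a short Python sketch. It assumes nothing beyond the values listed above; the variable names are illustrative, and the hashed byte string is an arbitrary placeholder.

```python
import hashlib

# Domain parameters of the Curve25519 instance, as listed above.
p = 2**255 - 19
A, B = 486662, 1
ELL = 2**252 + 27742317777372353535851937790883648493
xG = 9

# Right-hand side of Eq. (1) at x = 9; y_G is defined as its square root.
rhs = (xG**3 + A * xG**2 + xG) % p

# Euler's criterion: rhs is a square in F_p iff rhs^((p-1)/2) = 1.
rhs_is_square = pow(rhs, (p - 1) // 2, p) == 1

# SHAKE128 with its output length fixed to 512 bits (64 bytes).
digest = hashlib.shake_128(b"placeholder input").digest(64)
```

The value `rhs` reproduces the constant 39420360 under the square root in the definition of $y_G$, and `ELL.bit_length()` gives the table size $n = \lfloor\log_2(\ell)\rfloor + 1$ used later in Sect. 3.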
2.3 Digital Signature Operations
The qDSA scheme consists of three algorithms: key generation (Alg. 2), signature
generation (Alg. 3), and signature verification (Alg. 4). This latter procedure
requires an auxiliary function (Alg. 5), which will be revisited in Sect. 4.
Algorithm 2 Key generation.
Input: $\mathcal{D}$, the domain parameters.
Output: $(d_0, d_1) \in \{0,1\}^{2N}$ is a private key, and $x_Q \in \mathbb{F}_p$ is a public key.
1: $d \xleftarrow{\$} \{0,1\}^N$
2: $(h_{2N-1}, \ldots, h_0)_2 \leftarrow H(d)$
3: $d_0 \leftarrow (h_{2N-1}, \ldots, h_N)_2$
4: $d_1 \leftarrow (h_{N-1}, \ldots, h_0)_2$
5: $Q = (X_Q : Z_Q) \leftarrow d_0 G$  // Alg. 7.
6: $x_Q \leftarrow X_Q / Z_Q$
7: return $(d_0, d_1)$ and $x_Q$

Algorithm 3 Signature generation.
Input: $(d_0, d_1)$ and $x_Q$ are the signer's keys; and $M \in \{0,1\}^*$ is a message.
Output: $(x_R \| s)$ is the signature of $M$, where $x_R \in \mathbb{F}_p$ and $s \in \{0,1\}^N$.
1: $r \leftarrow H(d_1 \| M) \bmod \ell$
2: $R = (X_R : Z_R) \leftarrow rG$  // Alg. 7.
3: $x_R \leftarrow X_R / Z_R$
4: $h \leftarrow H(x_R \| x_Q \| M)$
5: $s \leftarrow r - h d_0 \bmod \ell$
6: return $(x_R \| s)$

Algorithm 4 Signature verification.
Input: $x_Q$ is the public key of the signer, $(x_R \| s)$ is a signature, and $M \in \{0,1\}^*$ is a message.
Output: True, if the signature is valid; otherwise, False.
1: $Q \leftarrow (x_Q : 1)$
2: $h \leftarrow H(x_R \| x_Q \| M) \bmod \ell$
3: $R_0 \leftarrow sG$  // Alg. 7.
4: $R_1 \leftarrow hQ$  // Alg. 1.
5: return $\mathrm{Check}(x_R, R_0, R_1)$  // Alg. 5.

Algorithm 5 Check $x_R \in \{x(P \pm Q)\}$.
Input: $x_R \in \mathbb{F}_p$, and $(P, Q)$ are elliptic curve points in projective coordinates.
Output: True, if $x_R \in \{x(P \pm Q)\}$; otherwise, False.
1: Let $f(x) \leftarrow f_2 x^2 + f_1 x + f_0$ such that the $f_i$ are defined as in Equation (10).
2: if $f(x_R) = 0$ then
3:   return True
4: else
5:   return False
6: end if
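To make the data flow of Algorithms 2 to 5 concrete, the following Python sketch strings them together. It is a toy illustration under several assumptions that are not taken from the paper: byte strings are 32-byte little-endian encodings (`enc`), hash outputs are decoded as little-endian integers, a plain non-constant-time ladder stands in for both Alg. 1 and Alg. 7, and the check of Alg. 5 is done in affine coordinates using the coefficients of Eq. (10). It is not interoperable with any reference implementation.

```python
import hashlib

p = 2**255 - 19
A = 486662
ELL = 2**252 + 27742317777372353535851937790883648493
XG = 9

def H(*parts):
    # SHAKE128, 512-bit digest; little-endian decoding is an assumption.
    return int.from_bytes(hashlib.shake_128(b"".join(parts)).digest(64), "little")

def enc(x):
    # Hypothetical 32-byte little-endian encoding.
    return x.to_bytes(32, "little")

def dadd(P, Q, R):
    (XP, ZP), (XQ, ZQ), (XR, ZR) = P, Q, R
    return (ZR * (XP * XQ - ZP * ZQ) ** 2 % p, XR * (XP * ZQ - ZP * XQ) ** 2 % p)

def doub(P):
    X, Z = P
    return ((X * X - Z * Z) ** 2 % p, 4 * X * Z * (X * X + A * X * Z + Z * Z) % p)

def ladder(k, x1):
    # Plain x-only ladder (not constant-time); keeps R1 - R0 = P throughout.
    D, R0, R1 = (x1, 1), (1, 0), (x1, 1)
    for i in reversed(range(k.bit_length())):
        if (k >> i) & 1:
            R0, R1 = dadd(R0, R1, D), doub(R1)
        else:
            R0, R1 = doub(R0), dadd(R0, R1, D)
    return R0

def affine(P):
    return P[0] * pow(P[1], p - 2, p) % p

def keygen(d):
    # Alg. 2: d is an N-bit random string (32 bytes here).
    h = H(d)
    d0, d1 = h >> 256, h & (2**256 - 1)
    return (d0, d1), affine(ladder(d0, XG))

def sign(sk, xQ, msg):
    # Alg. 3.
    d0, d1 = sk
    r = H(enc(d1), msg) % ELL
    xR = affine(ladder(r, XG))
    h = H(enc(xR), enc(xQ), msg)
    return xR, (r - h * d0) % ELL

def check(xR, x0, x1):
    # Alg. 5 with the coefficients of Eq. (10), in affine coordinates.
    f2 = (x0 - x1) ** 2
    f1 = -2 * (x0 * x1 + 1) * (x0 + x1) - 4 * A * x0 * x1
    f0 = (x0 * x1 - 1) ** 2
    return (f2 * xR * xR + f1 * xR + f0) % p == 0

def verify(xQ, msg, sig):
    # Alg. 4.
    xR, s = sig
    h = H(enc(xR), enc(xQ), msg) % ELL
    R0, R1 = ladder(s, XG), ladder(h, xQ)
    if R0[1] == 0 or R1[1] == 0:     # reject the point at infinity
        return False
    return check(xR, affine(R0), affine(R1))
```

A round trip would look like `sk, xQ = keygen(seed)`, `sig = sign(sk, xQ, msg)`, `verify(xQ, msg, sig)`, where `seed` is 32 bytes from a cryptographic random source.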
By analyzing the elliptic curve operations required by qDSA, we noted
that the running time is dominated by the computation of scalar point
multiplications. Consequently, we focused on the acceleration of this operation. Notice
that a multiple of the base point $G$ is calculated in each qDSA operation. Since
$G$ is fixed for the entire scheme, we can precompute a table that stores some
multiples of $G$. Hence, a scalar multiplication algorithm can be modified to look
up in the table and retrieve multiples of $G$ for calculating $kG$; this scenario
is commonly known as a fixed-point multiplication, and it will be addressed in
the next section.
3 Accelerating Fixed-Point Multiplications
In the open literature, there exist specialized algorithms that accelerate the cal-
culation of fixed-point multiplications. In the general setting, the most used
algorithm is the Comb technique [21], which arranges the bits of $k$ in a matrix
form, then the point multiplication algorithm interprets bit-columns as indexes
to look up in the precomputed table. Several fixed-point multiplication algo-
rithms were derived from the Comb technique, for example [10,11,14,15], among
others.
Comb-based algorithms have in common that indexes are directly derived
from the bits of the scalar. This implies that when the scalar is secret, every
access to the look-up table must be protected; otherwise, an attacker could
extract some bits of the scalar by correlating variations in the latency of access
to the cache memory. This kind of attack is known as a cache attack [40], which
in practice has been a successful method for recovering secret keys from insecure
implementations of table-based algorithms.
A common countermeasure to protect look-up table queries consists of using
a uniform access pattern. Hence, even if the latency of cache memory accesses
varies, the attacker will not be able to determine from which
part of the table the requested entry was retrieved. However, in some cases
the cost of adding countermeasures negatively impacts the performance of
point multiplication. A desirable solution for this scenario would be an algorithm
that uses non-secret indexes for accessing the look-up table. In the following
section, we will show an algorithm that satisfies these conditions.
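The uniform access pattern mentioned above is commonly realized by reading every table entry and selecting the wanted one with a mask. The Python sketch below illustrates the idea only; the function name `uniform_lookup` and the 256-bit entry width are assumptions, and Python itself offers no timing guarantees.

```python
def uniform_lookup(table, index, nbits=256):
    """Return table[index] while touching every entry of the table.

    Assumes 0 <= index < len(table) < 2**63 and entries below 2**nbits.
    """
    full = (1 << nbits) - 1
    result = 0
    for j, entry in enumerate(table):
        diff = j ^ index
        # nonzero = 1 iff diff != 0, computed without a data-dependent branch.
        nonzero = ((diff | -diff) >> 63) & 1
        mask = -(1 - nonzero) & full       # all ones only when j == index
        result |= entry & mask
    return result

example_table = [11, 22, 33, 44]
```

The cost of this approach grows linearly with the table size, which is the overhead that a scheme with non-secret indexes avoids.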
3.1 A Fixed-Point Multiplication Algorithm with Non-Secret
Indexes
In 2007, Joye presented right-to-left algorithms to compute scalar point multipli-
cations [19]. As their name suggests, these algorithms scan the bits of the scalar
from the least- to the most-significant bit, unlike conventional methods such as
the double-and-add algorithm or the Montgomery ladder algorithm. Moreover,
Joye's algorithm uses a regular execution pattern of elliptic curve operations
without resorting to dummy operations; these features help prevent
timing attacks [20] and fault-based attacks [42,3]. Joye's algorithm has been
applied to the implementation of both Weierstrass curves [13] and Koblitz binary
curves [28,38].
More recently, Oliveira et al. [29] adapted Joye's right-to-left algorithm
to use precomputed look-up tables with the purpose of accelerating fixed-point
multiplications (see Algorithm 6). The central operation of Algorithm 6 is to
add some precomputed multiples of $G$ into two accumulators, namely $Q_0$ and $Q_1$.
The bits of the scalar $k$ determine which accumulator must be updated in such
a way that, at the $i$-th iteration, Algorithm 6 accumulates the point $2^i G$ into $Q_0$
using a differential addition (with $Q_1$ as the difference) whenever $k_i \oplus k_{i-1} = 0$;
otherwise, it accumulates $2^i G$ into $Q_1$ also using a differential addition (but
this time with $Q_0$ as the difference). Observe that the main loop of Algorithm 6
consists only of differential additions; no point doublings are required there.
Notice that in either case, one operand of the differential addition is known in
advance. Hence, assuming $Q$ is the known point, the differential addition can be
calculated saving one multiplication (as proposed in [29]). Let $R = P - Q$
and $\mu = (x_Q + 1)(x_Q - 1)^{-1} \in \mathbb{F}_p$; then, we denote by dadd* the following
formula:
$$\begin{aligned}
X_{P +_R Q} &= Z_R [(X_P + Z_P) + \mu (X_P - Z_P)]^2, \\
Z_{P +_R Q} &= X_R [(X_P + Z_P) - \mu (X_P - Z_P)]^2.
\end{aligned} \tag{8}$$

Algorithm 6 Right-to-left fixed-point multiplication algorithm (cf. [29]).
Input: $(k, G, S)$, where $k \in \mathbb{Z}_\ell$ and $k \neq 0$; $G$ is a point of order $\ell$; and $S$ is a point of order 4 such that $S \notin \langle G \rangle$.
Precomputation: A look-up table storing $(\mu_0, \ldots, \mu_{n-1})$ as defined in Eq. (9).
Output: $8kG = (X_{8kG} : Z_{8kG})$.
1: Let $(k_{n-1}, \ldots, k_0)_2$ be the $n$-bit binary representation of $k$ such that $n = \lfloor \log_2(\ell) \rfloor + 1$.
2: Initialize $Q_0 \leftarrow S$, $Q_1 \leftarrow G - S$, and define $k_{-1} = 0$.
3: for $i \leftarrow 0$ to $n-1$ do
4:   $(Q_0, Q_1) \leftarrow \mathrm{cswap}(k_i \oplus k_{i-1}, Q_0, Q_1)$
5:   $Q_0 \leftarrow \mathrm{dadd}^*(\mu_i, Q_0, Q_1)$  // $Q_0 \leftarrow Q_0 +_{Q_1} 2^i G$
6: end for
7: $Q_1 \leftarrow \mathrm{doub}(Q_1)$
8: $Q_1 \leftarrow \mathrm{doub}(Q_1)$
9: $Q_1 \leftarrow \mathrm{doub}(Q_1)$
10: return $Q_1$
To compute scalar point multiplications, Algorithm 6 requires a precomputed
table storing one entry per bit of the scalar. Let $n = \lfloor \log_2(\ell) \rfloor + 1$; then the
look-up table will store the values $(\mu_0, \ldots, \mu_{n-1})$, where $\mu_i$ is defined as:
$$\mu_i = (x_i + 1)(x_i - 1)^{-1} \in \mathbb{F}_p, \quad \text{such that } (x_i, y_i) = 2^i G. \tag{9}$$
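The relation between Eq. (8) and the general differential addition of Eq. (3), as well as the table of Eq. (9), can be sketched in Python. The sketch assumes Curve25519's parameters and uses illustrative names (`dadd_star`, `build_table`, `same_point`); the table is built with field inversions for clarity, whereas a real implementation would ship it as precomputed constants.

```python
p = 2**255 - 19
A = 486662

def inv(a):
    return pow(a % p, p - 2, p)

def doub(P):
    X, Z = P
    return ((X * X - Z * Z) ** 2 % p, 4 * X * Z * (X * X + A * X * Z + Z * Z) % p)

def dadd(P, Q, R):
    """General differential addition, Eq. (3)."""
    (XP, ZP), (XQ, ZQ), (XR, ZR) = P, Q, R
    return (ZR * (XP * XQ - ZP * ZQ) ** 2 % p, XR * (XP * ZQ - ZP * XQ) ** 2 % p)

def dadd_star(mu, P, R):
    """Eq. (8): P + Q given R = P - Q, where Q enters only through mu."""
    (XP, ZP), (XR, ZR) = P, R
    s = XP + ZP
    d = mu * (XP - ZP)
    return (ZR * (s + d) ** 2 % p, XR * (s - d) ** 2 % p)

def build_table(xG, n):
    """Eq. (9): mu_i = (x_i + 1)/(x_i - 1), where x_i = x(2^i G)."""
    table, P = [], (xG, 1)
    for _ in range(n):
        x = P[0] * inv(P[1]) % p
        table.append((x + 1) * inv(x - 1) % p)
        P = doub(P)
    return table

def same_point(P, Q):
    return (P[0] * Q[1] - P[1] * Q[0]) % p == 0

G = (9, 1)
G2 = doub(G)
G3 = dadd(G2, G, G)
table = build_table(9, 4)      # only the first four entries, for illustration
```

Both routes agree projectively: `dadd_star(table[0], G2, G)` and `dadd(G2, G, G)` represent $3G$, and `dadd_star(table[1], G3, G)` and `dadd(G3, G2, G)` represent $5G$. The coordinates differ by a common nonzero factor, which is why the comparison is done with `same_point`.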
Remark 3.1. To retrieve a point from the look-up table, the index used is actually
a counting variable, and most importantly, this index is not derived from the
secret scalar. Thus, a query is performed by directly choosing the corresponding
value from the table. This enables a faster execution in contrast to Comb-based
methods, which require secure (and sometimes costly) look-up table accesses.
By using Oliveira et al.'s algorithm, we expect an increase in the performance
of fixed-point multiplications. Note that in each iteration, only one differential
addition is processed, in contrast with the (left-to-right) Montgomery ladder
and Joye's right-to-left algorithm, which require an extra point doubling
per iteration. Before applying Oliveira et al.'s algorithm to the calculation of
fixed-point multiplications, in the following section we introduce a set of
modifications to avoid the use of low-order points.
3.2 Circumventing the Use of Low-Order Points
Attention is required during the initialization of the accumulators $Q_0$ and $Q_1$
in Algorithm 6, since the formula for differential point addition is not complete.
This means that for adding $P +_R Q$ such that $R = P - Q$, the differential addition
formula fails whenever $R \in \{\mathcal{O}, (0,0)\}$.
We recall that the goal of Algorithm 6 in Oliveira et al.'s work [29] is to
calculate the point $8kG$ required by the Diffie-Hellman X25519 function. For
this reason, Algorithm 6 initializes accumulators with $Q_0 \leftarrow S$ and $Q_1 \leftarrow G - S$
such that $S \notin \langle G \rangle$. For the case of Curve25519, $S$ was chosen as a point of
order four (i.e. $4S = \mathcal{O}$). Thus, Algorithm 6 will compute $S + kG$, and after
applying three consecutive point doublings, the point $S$ will vanish resulting in
$8kG$. Although this procedure is correct, some vulnerabilities could appear due
to a misuse of low-order points [9,37]. Therefore, it is imperative to protect the
implementation against this potential threat.
To avoid the use of low-order points, we present the following technique.
It relies on the observation that if the order of
$G$ is odd, as in the case of Curve25519, then the point $S$ is no longer
required. Notice that replacing $S$ by $\mathcal{O}$ in Algorithm 6 causes a failure when the
least-significant bit of $k$ is zero; nonetheless, it always computes the correct point
multiplication whenever $k$ is odd. This observation indicates that Algorithm 6
with $S = \mathcal{O}$ computes scalar point multiplications only for odd scalars.
Therefore, we introduce a modification in Algorithm 6 that supports even and odd
scalars, and avoids using low-order points.
Let $\ell$ be the order of $G$. The key observation is that if $\ell$ is odd, then the
map $a \mapsto \ell - a$ on $\{1, \ldots, \ell - 1\}$ is a bijection between the disjoint
sets of even and odd elements.
Proposition 3.1. Let $\ell$ be an odd number. For any value $a$ such that $0 < a < \ell$,
define $b = \ell - a$; if $a$ is even, then $b$ is odd.
Proof. First, note that $b$ is bounded as $0 < b < \ell$. Since $a < \ell$, then $b = \ell - a > 0$.
Suppose $b \geq \ell$; then by the definition of $b$ we have that $\ell - a \geq \ell$, i.e. $a \leq 0$, which
is a contradiction, since $a > 0$; thus, $0 < b < \ell$. Now, since $\ell$ is odd and $a$ is even,
there exist some $i, j \in \mathbb{Z}$ such that $b = \ell - a = 2i + 1 - 2j = 2(i - j) + 1$,
showing that $b$ is odd. $\square$
Using this proposition, we can calculate $kG$ as $k'G$, for $k' = \ell - k$, whenever
the scalar $k$ is even. Note that if this operation were computed using points in
the affine space, then the point $k'G$ would have to be inverted to obtain $kG$. Fortunately,
this is not required since we are operating with elements in the Kummer variety,
which maps $kG$ and $k'G$ to the same element in $E/\langle\pm 1\rangle$. All of these observations
led to Algorithm 7, which supports both even and odd scalars, and does not
require low-order points in the computation of the fixed-point multiplication.
Among the changes made, Algorithm 7 starts by computing $r = \ell - k$ and
then selects the scalar between $r$ and $k$. This selection could introduce timing
variability in its execution, and consequently, it must be processed using a regular
execution pattern. This task can be achieved using the cswap function as shown
in line 2 of Algorithm 7. Thus, after the conditional swap, $r$ is
odd, which allows the main loop to start from the second iteration.
Finally, we apply Algorithm 7 to compute multiples of $G$ during the qDSA
signature scheme. Since the fixed-point multiplication appears in all operations
of the qDSA scheme, we improve the running time of the entire scheme. Sect. 5
reveals the impact on performance obtained by our software implementation.
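Algorithm 7 Our proposed right-to-left fixed-point multiplication algorithm
without using low-order points.
Input: $(k, G)$, where $k \in \mathbb{Z}_\ell$ and $k \neq 0$; and $G$ is a point of odd order $\ell$.
Precomputation: A look-up table storing $(\mu_0, \ldots, \mu_{n-1})$ as defined in Eq. (9).
Output: $kG = (X_{kG} : Z_{kG})$.
1: $r \leftarrow \ell - k$
2: $(k, r) \leftarrow \mathrm{cswap}(k_0, k, r)$
3: Let $(r_{n-1}, \ldots, r_0 = 1)_2$ be the $n$-bit binary representation of $r$ such that $n = \lfloor \log_2(\ell) \rfloor + 1$.
4: Initialize $Q_0 \leftarrow G$, $Q_1 \leftarrow G$.
5: for $i \leftarrow 1$ to $n-1$ do
6:   $(Q_0, Q_1) \leftarrow \mathrm{cswap}(r_i \oplus r_{i-1}, Q_0, Q_1)$
7:   $Q_0 \leftarrow \mathrm{dadd}^*(\mu_i, Q_0, Q_1)$  // $Q_0 \leftarrow Q_0 +_{Q_1} 2^i G$
8: end for
9: $(Q_0, Q_1) \leftarrow \mathrm{cswap}(r_{n-1}, Q_0, Q_1)$
10: return $Q_1$  // Return also $Q_0$ for y-coordinate recovery.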
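Algorithm 7 can be traced step by step with the following Python sketch. It is a reading of the pseudocode above under stated assumptions (Curve25519's parameters, a 256-bit mask for cswap, and a table built on the fly with field inversions); the names `fixed_point_mul`, `cswap_pt`, and `x_affine` are introduced here and the code is not constant-time. In the loop, the two accumulators always hold multiples of $G$ whose scalars sum to $2^i$, which is why the difference required by dadd* is available in $Q_1$.

```python
p = 2**255 - 19
A = 486662
ELL = 2**252 + 27742317777372353535851937790883648493
N = ELL.bit_length()           # n = floor(log2(ell)) + 1 = 253
MASK = (1 << 256) - 1
XG = 9

def inv(a):
    return pow(a % p, p - 2, p)

def doub(P):
    X, Z = P
    return ((X * X - Z * Z) ** 2 % p, 4 * X * Z * (X * X + A * X * Z + Z * Z) % p)

def dadd_star(mu, P, R):
    (XP, ZP), (XR, ZR) = P, R
    s, d = XP + ZP, mu * (XP - ZP)
    return (ZR * (s + d) ** 2 % p, XR * (s - d) ** 2 % p)

def cswap(b, U, V):
    M = -b & MASK
    notM = M ^ MASK
    return ((notM & U) ^ (M & V), (M & U) ^ (notM & V))

def cswap_pt(b, P, Q):
    X0, X1 = cswap(b, P[0], Q[0])
    Z0, Z1 = cswap(b, P[1], Q[1])
    return (X0, Z0), (X1, Z1)

def build_table():
    table, P = [], (XG, 1)
    for _ in range(N):
        x = P[0] * inv(P[1]) % p
        table.append((x + 1) * inv(x - 1) % p)   # mu_i of Eq. (9)
        P = doub(P)
    return table

TABLE = build_table()

def fixed_point_mul(k):
    """Algorithm 7: returns kG on the x-line for 0 < k < ell."""
    r = ELL - k
    k, r = cswap(k & 1, k, r)            # afterwards r is odd
    Q0 = Q1 = (XG, 1)
    prev = 1                             # r_0 = 1
    for i in range(1, N):
        bit = (r >> i) & 1
        Q0, Q1 = cswap_pt(bit ^ prev, Q0, Q1)
        Q0 = dadd_star(TABLE[i], Q0, Q1)  # index i is public, not secret
        prev = bit
    Q0, Q1 = cswap_pt(prev, Q0, Q1)
    return Q1

def x_affine(P):
    return P[0] * inv(P[1]) % p
```

For small scalars the result can be compared against repeated doubling, for example `x_affine(fixed_point_mul(4))` against `x_affine(doub(doub((XG, 1))))`.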
4 A New qDSA Signature Verification Method
Given an alleged signature $(x_R \| s)$ of a message $M$, the qDSA signature
verification procedure must determine whether $x_R$ is the x-coordinate of $R_0 + R_1$,
where $R_0 = sG$ and $R_1 = hQ$ for $h$ defined as in Algorithm 4. For that purpose,
the authors of qDSA provided Algorithm 5, which checks a weaker relation. Such
a method accepts the signature whenever $f(x_R) = 0$, where $f$ is the quadratic
polynomial $f(x) = f_2 x^2 + f_1 x + f_0$, such that:
$$\begin{aligned}
f_2 &= (x_{R_0} - x_{R_1})^2, \\
f_1 &= -2 (x_{R_0} x_{R_1} + 1)(x_{R_0} + x_{R_1}) - 4A\, x_{R_0} x_{R_1}, \\
f_0 &= (x_{R_0} x_{R_1} - 1)^2.
\end{aligned} \tag{10}$$
This method works since one of the roots of $f$ is $x_R$; however, one disadvantage
of this approach is that there is another value $x'$ that also passes the
verification procedure. Specifically, $x'$ is the other root of $f$ and corresponds to the
x-coordinate of $R_0 - R_1$. Therefore, $M$ has another valid signature $(x' \| s)$.
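The two roots of $f$ can be observed on a small example. The Python sketch below assumes Curve25519's parameters and picks $R_0 = 2G$ and $R_1 = G$ purely for illustration, so that $R_0 + R_1 = 3G$ and $R_0 - R_1 = G$; the function names are hypothetical.

```python
p = 2**255 - 19
A = 486662

def doub(P):
    X, Z = P
    return ((X * X - Z * Z) ** 2 % p, 4 * X * Z * (X * X + A * X * Z + Z * Z) % p)

def dadd(P, Q, R):
    (XP, ZP), (XQ, ZQ), (XR, ZR) = P, Q, R
    return (ZR * (XP * XQ - ZP * ZQ) ** 2 % p, XR * (XP * ZQ - ZP * XQ) ** 2 % p)

def x_affine(P):
    return P[0] * pow(P[1], p - 2, p) % p

def f(x, x0, x1):
    """Quadratic polynomial of Eq. (10) built from x(R0) = x0 and x(R1) = x1."""
    f2 = (x0 - x1) ** 2
    f1 = -2 * (x0 * x1 + 1) * (x0 + x1) - 4 * A * x0 * x1
    f0 = (x0 * x1 - 1) ** 2
    return (f2 * x * x + f1 * x + f0) % p

G = (9, 1)
x_R0 = x_affine(doub(G))                 # R0 = 2G
x_R1 = 9                                 # R1 = G
x_sum = x_affine(dadd(doub(G), G, G))    # x(R0 + R1) = x(3G)
x_dif = 9                                # x(R0 - R1) = x(G)
```

Both `f(x_sum, x_R0, x_R1)` and `f(x_dif, x_R0, x_R1)` evaluate to zero although `x_sum` and `x_dif` differ, which is the ambiguity described above.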
Although this relaxed verification method offers little direct adversarial
advantage, it carries a high risk of introducing a misuse of the cryptographic
scheme, such as the ones reported in [7,8,16]. To avoid potential issues in future
implementations, we looked for an efficient method that verifies the qDSA signature
of a message unequivocally.
4.1 Unequivocal Techniques for Signature Verification
Let $x_S$ and $x_D$ be the x-coordinates of $R_0 + R_1$ and $R_0 - R_1$, respectively. Given
an alleged signature $(x_R \| s)$, we look for a relation that allows us to
determine whether $x_R = x_S$ from the coordinates of $R_0$ and $R_1$, instead of verifying
whether $x_R \in \{x_S, x_D\}$ as Algorithm 5 does. Thus, inspired by Montgomery's
insights [22], we derive the following equivalences:
$$x_S + x_D = \beta/\alpha, \tag{11}$$
$$x_S \times x_D = \gamma/\alpha, \tag{12}$$
$$x_S - x_D = \delta/\alpha, \tag{13}$$
such that $\alpha$, $\beta$, $\gamma$ and $\delta$ are defined as follows¹:
$$\begin{aligned}
\alpha &= (x_{R_0} - x_{R_1})^2, \\
\beta &= 2 (x_{R_0} x_{R_1} + 1)(x_{R_0} + x_{R_1}) + 4A\, x_{R_0} x_{R_1}, \\
\gamma &= (x_{R_0} x_{R_1} - 1)^2, \\
\delta &= -4B\, y_{R_0} y_{R_1}.
\end{aligned} \tag{14}$$
The coefficients of $f$ can be derived by solving Eq. (11) for $x_D$ and substituting
the result into Eq. (12), which results in a second-degree polynomial in
$x_S$. Thus, $f$ can also be written as $f(x) = \alpha x^2 - \beta x + \gamma$. We note that solving
Eq. (11) for $x_S$ and substituting this into Eq. (12) yields a second-degree
polynomial in $x_D$ that has the same coefficients as $f$. This means that
both $x_S$ and $x_D$ are roots of $f$. Therefore, $f$ does not help to distinguish
between $x_S$ and $x_D$.
Our key idea is to obtain a (linear) polynomial that has a zero at $x_S$. To that
end, we start by solving Eq. (13) for $x_D$ and substituting this into Eq. (12); thus
we obtain $g_0(x) = \alpha x^2 - \delta x - \gamma$, which has $x_S$ as a root. Analogously, we apply
the same procedure, but this time solving for $x_S$, and we obtain
$g_1(x) = \alpha x^2 + \delta x - \gamma$, which has $x_D$ as a root. So far, we have that
$g_0 \neq g_1$, which means that by using $g_0$, we are now able to distinguish between
$x_S$ and $x_D$, since $g_0(x_S) = 0$ and $g_0(x_D) \neq 0$. However, $g_0$ has zeros at $x_S$ and
at $-x_D$. Now, using $f(x) = (x - x_S)(x - x_D)$ and $g_0(x) = (x - x_S)(x + x_D)$ (both up to
the leading coefficient $\alpha$), we show how to unequivocally identify $x_S$. Note that
$f(x_S) = 0$ and $g_0(x_S) = 0$; therefore, we define:
$$\begin{aligned}
h_0(x) &= (f + g_0)/x = 2\alpha x - \delta - \beta, \quad \text{and} \\
h_1(x) &= f - g_0 = (\delta - \beta) x + 2\gamma,
\end{aligned} \tag{15}$$
such that $x_S$ is a zero of both $h_0$ and $h_1$. Listing 4.1 shows a SageMath [31]
computer script that validates the formulas used in this section. In summary,
either $h_0$ or $h_1$ helps to determine the validity of an alleged signature.
Our signature verification method proceeds as follows: given $(x_R \| s)$, it
calculates $\alpha$, $\beta$, and $\delta$ from the coordinates of $R_0$ and $R_1$; then, it declares a
signature as valid if $h_0(x_R) = 0$ (alternatively, it calculates $\gamma$ instead of $\alpha$ and
accepts the signature if $h_1(x_R) = 0$). We have shown two relations that allow
a signature to be verified unequivocally.
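The behaviour of $f$, $h_0$, and $h_1$ can be checked numerically over $\mathbb{F}_p$. The Python sketch below is an illustration under assumptions: it uses Curve25519 ($A = 486662$, $B = 1$), plain affine chord-and-tangent arithmetic, a square-root routine for $p \equiv 5 \pmod 8$, and the toy choice $R_0 = 2G$, $R_1 = G$. All function names are introduced here.

```python
p = 2**255 - 19
A, B = 486662, 1

def inv(a):
    return pow(a % p, p - 2, p)

def sqrt_mod(a):
    """A square root of a modulo p, valid because p = 5 (mod 8)."""
    c = pow(a, (p + 3) // 8, p)
    if c * c % p != a % p:
        c = c * pow(2, (p - 1) // 4, p) % p    # multiply by sqrt(-1)
    if c * c % p != a % p:
        raise ValueError("not a square")
    return c

def add(P, Q):
    """Affine addition on B*y^2 = x^3 + A*x^2 + x, for P != +-Q."""
    (x1, y1), (x2, y2) = P, Q
    lam = (y2 - y1) * inv(x2 - x1) % p
    x3 = (B * lam * lam - A - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def dbl(P):
    x1, y1 = P
    lam = (3 * x1 * x1 + 2 * A * x1 + 1) * inv(2 * B * y1) % p
    x3 = (B * lam * lam - A - 2 * x1) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

G = (9, sqrt_mod(39420360))
R0, R1 = dbl(G), G                        # toy stand-ins for sG and hQ
xS = add(R0, R1)[0]                       # x(R0 + R1)
xD = add(R0, (R1[0], -R1[1] % p))[0]      # x(R0 - R1)

# Coefficients of Eq. (14).
(x0, y0), (x1, y1) = R0, R1
alpha = (x0 - x1) ** 2 % p
beta = (2 * (x0 * x1 + 1) * (x0 + x1) + 4 * A * x0 * x1) % p
gamma = (x0 * x1 - 1) ** 2 % p
delta = -4 * B * y0 * y1 % p

def f(x):
    return (alpha * x * x - beta * x + gamma) % p

def h0(x):
    return (2 * alpha * x - delta - beta) % p

def h1(x):
    return ((delta - beta) * x + 2 * gamma) % p
```

With these values, $f$ vanishes at both `xS` and `xD`, while `h0` and `h1` vanish only at `xS`.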
4.2 Trade-off Analysis of Our Signature Verification Method
In contrast to the original signature verification procedure, our method requires
calculating the $\delta$ term, which implies knowledge of the y-coordinate of both
$R_0 = sG$ and $R_1 = hQ$.
¹ To avoid inversions, these terms can also be calculated using projective coordinates.
QQ = Rationals()
R.<x1,y1,x2,y2,A,B> = PolynomialRing(QQ,6,"x1,y1,x2,y2,A,B")
# Work modulo the curve equation for both points (x1,y1) and (x2,y2).
I = R.ideal([
    B*y1**2-x1**3-A*x1**2-x1,
    B*y2**2-x2**3-A*x2**2-x2 ])
FQuo = Frac(R.quotient(I))
# x below is SageMath's predefined symbolic variable.
evaluate = lambda F,X: FQuo(F.subs(x=X).rational_simplify())

def addMontgomery(X1,Y1,X2,Y2):
    # Affine point addition on the Montgomery curve B*y^2 = x^3 + A*x^2 + x.
    global A, B
    Xs = B*((Y1-Y2)/(X1-X2))**2-A-X1-X2
    Ys = (2*X1+X2+A)*(Y2-Y1)/(X2-X1)-B*(Y2-Y1)**3/(X2-X1)**3-Y1
    return Xs,Ys

xs,ys = addMontgomery(x1,y1,x2,y2)     # sum
xd,yd = addMontgomery(x1,y1,x2,-y2)    # difference

alpha = (x1-x2)**2
betta = 2*(x1*x2+1)*(x1+x2)+4*A*x1*x2
gamma = (x1*x2-1)**2
delta = -4*B*y1*y2

relAdd = FQuo(xs+xd)
relPro = FQuo(xs*xd)
relDif = FQuo(xs-xd)
# Verifying Relations
assert( relAdd == betta/alpha )
assert( relPro == gamma/alpha )
assert( relDif == delta/alpha )
# Renes&Smith's f polynomial and testing its zeros
f = alpha*x**2-betta*x+gamma
assert( evaluate(f,xs) == evaluate(f,xd) == 0 )
# Defining g0 and g1 and testing their zeros
g0 = alpha*x**2-delta*x-gamma
g1 = alpha*x**2+delta*x-gamma
assert( evaluate(g0, xs) == evaluate(g0,-xd) == 0 )
assert( evaluate(g1,-xs) == evaluate(g1, xd) == 0 )
# Defining h0 and h1 and testing their zeros
h0 = 2*alpha*x-delta-betta
h1 = (delta-betta)*x+2*gamma
assert( evaluate(h0,xs) == evaluate(h1,xs) == 0 )
Listing 4.1: SageMath script for the validation of formulas in $\mathbb{Q}$.
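One can use the Okeya-Sakurai method [27] for recovering the y-coordinate
of $R_0 = sG$ and $R_1 = hQ$. This technique requires some auxiliary points, namely
$R_2 = (s+1)G$ and $R_3 = (h+1)Q$, which are also computed by the Montgomery
ladder algorithm (Alg. 1). Thus, following Theorem 2 of [27], we have:
$$\begin{aligned}
y_{R_0} &= [(x_{R_0} x_G + 1)(x_{R_0} + x_G + 2A) - 2A - (x_{R_0} - x_G)^2 x_{R_2}]\,(2B y_G)^{-1}, \\
y_{R_1} &= [(x_{R_1} x_Q + 1)(x_{R_1} + x_Q + 2A) - 2A - (x_{R_1} - x_Q)^2 x_{R_3}]\,(2B y_Q)^{-1};
\end{aligned} \tag{16}$$
then, $\delta$ can be written as $\delta = -4B y_{R_0} y_{R_1} = (B y_G y_Q)^{-1} T$, where $T$ is:
$$\begin{aligned}
T = -&[(x_{R_0} x_G + 1)(x_{R_0} + x_G + 2A) - 2A - (x_{R_0} - x_G)^2 x_{R_2}] \\
\times\, &[(x_{R_1} x_Q + 1)(x_{R_1} + x_Q + 2A) - 2A - (x_{R_1} - x_Q)^2 x_{R_3}].
\end{aligned} \tag{17}$$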
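The recovery formula of Eq. (16) can be exercised on a toy case. The Python sketch below assumes Curve25519 ($A = 486662$, $B = 1$) and takes $s = 2$, so that $R_0 = 2G$ and $R_2 = 3G$ are obtained with plain affine arithmetic and the recovered $y_{R_0}$ can be compared with the true one. The helper names are illustrative, not part of the paper's library.

```python
p = 2**255 - 19
A, B = 486662, 1

def inv(a):
    return pow(a % p, p - 2, p)

def sqrt_mod(a):
    """A square root of a modulo p, valid because p = 5 (mod 8)."""
    c = pow(a, (p + 3) // 8, p)
    if c * c % p != a % p:
        c = c * pow(2, (p - 1) // 4, p) % p
    if c * c % p != a % p:
        raise ValueError("not a square")
    return c

def add(P, Q):
    (x1, y1), (x2, y2) = P, Q
    lam = (y2 - y1) * inv(x2 - x1) % p
    x3 = (B * lam * lam - A - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def dbl(P):
    x1, y1 = P
    lam = (3 * x1 * x1 + 2 * A * x1 + 1) * inv(2 * B * y1) % p
    x3 = (B * lam * lam - A - 2 * x1) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def okeya_sakurai_y(x_k, x_k1, P):
    """Eq. (16): y(kP) from x(kP), x((k+1)P) and the affine base point P."""
    x, y = P
    num = ((x_k * x + 1) * (x_k + x + 2 * A) - 2 * A - (x_k - x) ** 2 * x_k1) % p
    return num * inv(2 * B * y) % p

G = (9, sqrt_mod(39420360))
R0 = dbl(G)            # sG with s = 2
R2 = add(R0, G)        # (s + 1)G
y_recovered = okeya_sakurai_y(R0[0], R2[0], G)
```

Only the x-coordinates of $R_0$ and $R_2$ enter the formula, together with the full affine base point, which matches what the ladder outputs.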
Algorithm 8 Unequivocal qDSA Verification Procedure.
Input: $(x_R \| s)$ is a signature, $M \in \{0,1\}^*$ is a message, and $(x_Q \| y_Q^{(0)})$ is the public key of the signer.
Constants: $(x_G, y_G)$ are the affine coordinates of the generator $G \in E_{A,B}$.
Output: True, if the signature is valid; otherwise, False.
1: $h \leftarrow H(x_R \| x_Q \| M) \bmod \ell$
2: $Q \leftarrow (x_Q : 1)$, $R_0 \leftarrow sG$, $R_1 \leftarrow hQ$
3: $\{y', y''\} \leftarrow \pm\sqrt{B^{-1}(x_Q^3 + A x_Q^2 + x_Q)} \in \mathbb{F}_p$.
4: Set $y_Q \leftarrow y'$, if $y' \equiv y_Q^{(0)} \pmod 2$; otherwise, $y_Q \leftarrow y''$.
5: Calculate $\alpha$, $\beta$, and $\delta$ as in Eq. (14).
6: if $h_0(x_R) = 0$ then  // $h_0$ as defined in Eq. (15).
7:   return True
8: else
9:   return False
10: end if
The most important observation here is that the verifier must know $y_Gy_Q$.
There are several alternatives for obtaining this value:
– The simplest one is to append $y_Gy_Q$ (or $(By_Gy_Q)^{-1}$) to the public key;
the calculation of $\delta$ is then straightforward, but the size of the public
key doubles.
– Alternatively, the public key could contain an extra bit $y_Q^{(0)}$, which is
the least-significant bit of $y_Q$. The verification procedure then calculates
$\{y', y''\} = \pm\sqrt{B^{-1}(x_Q^3 + Ax_Q^2 + x_Q)}$; if $y' \equiv y_Q^{(0)} \pmod 2$,
it sets $y_Q \leftarrow y'$; otherwise, it assigns $y_Q \leftarrow y''$. After that,
it calculates $y_Gy_Q$. Note that $y_G$ must also be known; fortunately, it is a
fixed parameter of the scheme. This method has the advantage that the public key
size does not increase significantly; for example, using Curve25519,
$(x_Q \parallel y_Q^{(0)})$ fits in 256 bits. However, the cost of verification
increases by one square root and a few multiplications. This approach is
summarized in Algorithm 8.
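The second alternative can be illustrated with a short Python sketch of steps 3
and 4 of Algorithm 8 on Curve25519. The function names (compress, decompress)
and the choice of storing the parity bit in bit 255 of the encoded key are
assumptions made for this example only; the text states just that the pair fits
in 256 bits.

```python
p = 2**255 - 19
A, B = 486662, 1

def sqrt_mod(a):
    """One square root of a modulo p; this shortcut relies on p = 5 (mod 8)."""
    a %= p
    r = pow(a, (p + 3) // 8, p)
    if r * r % p != a:
        r = r * pow(2, (p - 1) // 4, p) % p  # multiply by a square root of -1
    if r * r % p != a:
        raise ValueError("x is not the x-coordinate of a curve point")
    return r

def compress(x, y):
    """Pack x_Q (255 bits) and the least-significant bit of y_Q (assumed layout)."""
    return x | ((y & 1) << 255)

def decompress(pk):
    """Recover (x_Q, y_Q) from the 256-bit encoding produced by compress()."""
    x, bit = pk & ((1 << 255) - 1), pk >> 255
    rhs = (x**3 + A * x**2 + x) * pow(B, p - 2, p) % p
    y = sqrt_mod(rhs)              # y'; the other root is y'' = p - y'
    if (y & 1) != bit:
        y = (p - y) % p            # p is odd, so y' and y'' differ in parity
    if (y & 1) != bit:
        raise ValueError("parity bit inconsistent with y = 0")
    return x, y

# Example with the generator's x-coordinate standing in for x_Q.
xQ = 9
yQ = sqrt_mod(xQ**3 + A * xQ**2 + xQ)
assert decompress(compress(xQ, yQ)) == (xQ, yQ)
assert decompress(compress(xQ, p - yQ)) == (xQ, p - yQ)
```

The single modular square root in decompress() is the extra verification cost
mentioned above.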
We remark that, to verify a qDSA signature unambiguously, the verification
method must receive as inputs the y-coordinate of G (which is a fixed
parameter) and the y-coordinate of Q.
5 Performance Results and Comparisons
We focused on developing a software library that supports the 32-bit ARM
architecture, which is designed for embedded devices, and the 64-bit Intel
architecture, which is widely deployed from commodity computers to high-end
servers. To measure execution times, we used the clock cycle counter available
in each architecture. In addition, on Intel processors, the Intel Turbo Boost,
Intel SpeedStep, and Intel Hyper-Threading technologies were disabled to obtain
stable and reproducible measurements.
5.1 Performance of Prime Field Arithmetic
For the arithmetic operations over $\mathbb{F}_{2^{255}-19}$, we use an optimized
library for ARM Cortex M4 processors taken from [12]; for the 64-bit Intel
processors, we use the optimized library available in [29]. Table 1 summarizes
the clock cycle measurements of the arithmetic operations.
Table 1. Latency (in clock cycles) of the arithmetic operations on
$\mathbb{F}_{2^{255}-19}$. The last two columns list the ratio of the latency
between squaring and multiplication (S/M), and the ratio between inversion and
multiplication (I/M).

Architecture  Microarchitecture  Processor Model  Add  Mul  Sqr     Inv     Sqrt   S/M    I/M
32-bit        Cortex M4          Teensy 3.2        85  278  250  66,637  132,416  0.90  239.7
32-bit        Cortex A7          Odroid XU4        49  290  233  63,095  132,785  0.80  217.6
32-bit        Cortex A15         Odroid XU4        36  225  139  41,978   97,242  0.62  186.6
64-bit        Intel Haswell      Core i7-4770       8   64   48  14,925   29,344  0.75  233.2
64-bit        Intel Skylake      Core i7-6700K      6   48   39  11,090   22,598  0.81  231.0
The 32-bit implementation of the integer multiplier uses the full consecutive
operand-caching technique [36], which in turn relies on multiply-and-accumulate
instructions (UMLAL/UMAAL). These instructions were scheduled so as to reduce
the number of carry values that arise during the evaluation of the product. The
64-bit implementation of the integer multiplier follows the operand-scanning
technique, which maps well to the MULX instruction. For Skylake, the latency of
the multiplier was reduced further by using the newer integer addition
instructions (ADCX/ADOX).
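For readers unfamiliar with operand scanning, the following Python sketch models
its data flow on 64-bit limbs: each limb of the first operand is multiplied by
every limb of the second operand, and the row is accumulated into the result
with a running carry. It is only an illustration under our own naming
(to_limbs, from_limbs, mul_operand_scanning); it does not model the MULX,
ADCX, or ADOX instructions, nor the modular reduction, and it is not the code
of the library in [29].

```python
MASK64 = (1 << 64) - 1

def to_limbs(n, count=4):
    """Split a non-negative integer into `count` little-endian 64-bit limbs."""
    return [(n >> (64 * i)) & MASK64 for i in range(count)]

def from_limbs(limbs):
    """Reassemble an integer from little-endian 64-bit limbs."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))

def mul_operand_scanning(a, b):
    """Row-wise (operand-scanning) product of two limb vectors."""
    r = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):           # scan the limbs of the first operand
        carry = 0
        for j, bj in enumerate(b):       # one 64x64 -> 128-bit product per step
            t = ai * bj + r[i + j] + carry   # always fits in 128 bits
            r[i + j] = t & MASK64            # low half stays in place
            carry = t >> 64                  # high half moves to the next limb
        r[i + len(b)] = carry            # this limb has not been written yet
    return r

x = 2**255 - 20
y = 0x123456789ABCDEF0FEDCBA98765432100123456789ABCDEF
assert from_limbs(mul_operand_scanning(to_limbs(x), to_limbs(y))) == x * y
```

Each inner step corresponds to one widening multiplication followed by
additions with carry, which is the pattern that the instructions named above
are designed to accelerate.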
5.2 Performance of Our Optimized Implementation of qDSA
First, we highlight the acceleration introduced by the right-to-left
fixed-point multiplication algorithm presented in Sect. 3. To that end, we
measured the percentage improvement that Algorithm 7 brings to the execution
time of the qDSA operations. Table 2 shows the timings obtained on a Cortex
M4 and on an Intel Haswell processor.
The timings of the qDSA operations were significantly reduced; the impact was
most evident in the key generation and signing procedures, which achieved,
respectively, a 35-40% and a 30-34% reduction in execution time. Likewise, the
verification procedure was accelerated by 19%.
Regarding memory footprint, the last row of Table 2 shows the overhead
introduced by the use of precomputation. The code size of our implementation
(including the 8 KB table stored in ROM) increased by around 36% and 44% on the
64-bit and 32-bit platforms, respectively. Recall that computations aided by
precomputation always involve a trade-off between space and time; hence, the
best approach will depend on several engineering aspects.
Table 2. Performance comparison of the qDSA operations when the Montgomery
ladder algorithm (Alg. 1) is replaced by the right-to-left fixed-point
multiplication algorithm (Alg. 7). For each processor, the third column shows
the percentage improvement achieved. Entries are in $10^3$ clock cycles, except
the last row.

Processor               ARM Cortex M4                Intel Haswell
Scalar point mult.   Alg. 1   Alg. 7  Savings    Alg. 1   Alg. 7  Savings
Key Generation        927.9    604.9    34.8%     171.5    103.8    39.5%
Signing             1,059.1    736.2    30.5%     197.3    130.1    34.1%
Verification        1,746.2  1,422.8    18.5%     347.3    279.5    19.5%
Code size (bytes)    20,898   30,058   -43.8%    30,037   41,000   -36.4%
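The timing savings in Table 2 follow from the listed latencies as
(Alg. 1 - Alg. 7) / Alg. 1. The short Python sketch below recomputes them from
the table entries; the dictionary layout and the function name savings are our
own choices for illustration.

```python
# Latencies from Table 2, in 10^3 clock cycles: (Alg. 1, Alg. 7).
table2 = {
    "ARM Cortex M4": {
        "Key Generation": (927.9, 604.9),
        "Signing": (1059.1, 736.2),
        "Verification": (1746.2, 1422.8),
    },
    "Intel Haswell": {
        "Key Generation": (171.5, 103.8),
        "Signing": (197.3, 130.1),
        "Verification": (347.3, 279.5),
    },
}

def savings(ladder, fixed_point):
    """Percentage of execution time saved by Alg. 7 with respect to Alg. 1."""
    return 100.0 * (ladder - fixed_point) / ladder

for cpu, rows in table2.items():
    for operation, (alg1, alg7) in rows.items():
        print(f"{cpu:14s} {operation:15s} {savings(alg1, alg7):5.1f}%")
```

On the Cortex M4 the absolute gain is close to 323 thousand cycles for all three
operations, which is consistent with only the fixed-point multiplication being
replaced in each of them.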
Table 3. Summary of the performance of our optimized implementation. Entries
show the latency, in $10^3$ clock cycles, of each qDSA operation.

                                  ARM (32-bit)                  Intel (64-bit)
qDSA Operation          Cortex M4  Cortex A7  Cortex A15     Haswell  Skylake
Key Generation              604.9      538.8       366.5       103.8     86.8
Signing                     736.2      652.1       422.7       130.1    114.6
Verification (Alg. 4)     1,422.8    1,271.7       870.6       279.5    231.1
Verification (Alg. 8)     1,555.2    1,404.4       967.8       309.6    253.5
The optimized prime field arithmetic, in conjunction with the fixed-point
multiplication algorithm, considerably reduced the execution time in comparison
to the original implementation given by qDSA's authors [32]. Table 3 summarizes
the timings of our qDSA implementation measured on several ARM and Intel
platforms.
Table 3 also shows the latency of the proposed verification method (Alg. 8)
described in Sect. 4. Recall that our method must calculate one square root and
a few multiplications to recover the y-coordinate of the public key. Using our
method incurs an overhead of 8% to 10% in the execution time. This timing
penalty is compensated by the security benefits that our verification method
provides; moreover, it prevents some issues that could appear in future
applications of qDSA.
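The overhead of Alg. 8 can be recomputed from Table 3. The Python sketch below
reports the extra cycles of Alg. 8 over Alg. 4 both as a share of the Alg. 8
latency and relative to Alg. 4, next to the square-root latency of Table 1. The
text does not state which baseline underlies the quoted range, so both are
shown; the variable names are ours.

```python
# Verification latencies from Table 3 (10^3 clock cycles).
alg4 = {"Cortex M4": 1422.8, "Cortex A7": 1271.7, "Cortex A15": 870.6,
        "Haswell": 279.5, "Skylake": 231.1}
alg8 = {"Cortex M4": 1555.2, "Cortex A7": 1404.4, "Cortex A15": 967.8,
        "Haswell": 309.6, "Skylake": 253.5}
# Square-root latencies from Table 1, converted to 10^3 clock cycles.
sqrt_kcycles = {"Cortex M4": 132.416, "Cortex A7": 132.785,
                "Cortex A15": 97.242, "Haswell": 29.344, "Skylake": 22.598}

for cpu in alg4:
    extra = alg8[cpu] - alg4[cpu]
    share_of_alg8 = 100.0 * extra / alg8[cpu]
    relative_to_alg4 = 100.0 * extra / alg4[cpu]
    print(f"{cpu:10s} extra={extra:6.1f}  sqrt={sqrt_kcycles[cpu]:7.3f}  "
          f"{share_of_alg8:4.1f}% of Alg. 8  {relative_to_alg4:4.1f}% over Alg. 4")
```

On every platform the extra cost is within about one thousand cycles of the
square-root latency, which matches the statement that the additional work is
dominated by one square root.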
In Table 4, we compare the performance of qDSA with other digital signature
algorithms. As can be seen, qDSA's signing procedure outperforms the RSA and
DSA signature schemes. In addition, qDSA generates signatures as fast as ECDSA
does, while qDSA's verification procedure is faster than ECDSA's verification.
This positions qDSA as a more efficient alternative for deploying digital
signatures in contrast with standardized signature algorithms.
From the comparison table, one can observe that, in both architectures, the
calculation of Ed25519 signatures is approximately twice as fast as the
calculation of qDSA signatures. One of the reasons for this performance gap lies in
Table 4. Performance comparison of qDSA and other digital signature schemes.

Signature                  32-bit ARM Cortex A7         64-bit Intel Haswell
Scheme     Instance       Sign/sec     Verify/sec      Sign/sec    Verify/sec
RSA (a)    2048               41.3        1,596.9         1,618        36,576
DSA (a)    2048              146.3          137.9         2,071         1,883
ECDSA (a)  P-256             940.5          250.7        25,344        10,198
EdDSA      Ed25519     3,414.6 (b)    1,840.9 (b)    48,701 (c)    17,167 (c)
qDSA (d)   Curve25519      2,148.0        1,001.6        25,109        12,109

(a) Timings taken using the OpenSSL library (v.1.0.2) [39].
(b) Moon's implementation [23] using the prime field arithmetic from [12].
(c) Moon's implementation [23] compiled for 64-bit architectures.
(d) This work.
the properties of the elliptic curve model used by each scheme, which imposes
certain limitations on the point multiplication algorithms.
On Edwards curves, the point addition formula is complete and unified. This
allows point additions to be associated in many different ways, as in comb-based
algorithms; consequently, fixed-point multiplication algorithms for Edwards
curves have more degrees of freedom in their construction. For example, they
can use larger look-up tables, a property reflected in state-of-the-art
implementations of Ed25519: Moon's implementation [23] uses a look-up table of
24 KB, whereas Chou's implementation [5] increases the look-up table size to
30 KB for a further speedup.
On the other hand, the point addition formula for Montgomery curves is not
complete, and the differential point addition depends on the coordinates of an
auxiliary third point. These facts restrict point multiplication algorithms to
being, in fact, addition-chain evaluations; for example, the Montgomery ladder
algorithm (Alg. 1) or Joye's right-to-left algorithm [19]. With the
introduction of precomputation in the right-to-left method, the look-up table
size now depends on the size of $\ell$ (the order of the main elliptic curve
subgroup), since the look-up table stores the sequence
$(\mu_0, \ldots, \mu_{n-1})$, where $n = \lfloor \log_2(\ell) \rfloor + 1$. Thus,
for the case of Curve25519, the look-up table used in our implementation is not
larger than 8 KB, which is a third of the table size used in Ed25519's
implementations.
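The 8 KB bound can be reproduced with a few lines of Python. The sketch assumes
that each table entry is a single field element stored in 32 bytes, which is our
assumption for illustration; the value of the subgroup order is the standard one
for Curve25519.

```python
# Order of the prime-order subgroup of Curve25519.
ell = 2**252 + 27742317777372353535851937790883648493

n = ell.bit_length()        # equals floor(log2(ell)) + 1
ENTRY_BYTES = 32            # assumed: one element of F_p per table entry
table_bytes = n * ENTRY_BYTES

print(n, table_bytes)       # number of entries and table size in bytes
assert table_bytes <= 8 * 1024
```

Under this assumption the table holds one entry per bit of the scalar, so its
size is fixed by the curve rather than by a tunable window parameter.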
Alternatively, qDSA can also be implemented using Edwards curves (through
a birational equivalence with Montgomery curves [6]) to obtain a performance
closer to that of Ed25519; note, however, that our implementation uses a
smaller look-up table, which is a relevant factor when targeting
memory-constrained architectures. We leave the Edwards approach as future
work.
6 Closing Remarks
The novel Quotient Digital Signature Algorithm was designed to provide key
compatibility with Diffie-Hellman functions based on Montgomery
curves. These curves are also employed for performing the signature operations
of qDSA; hence, the implementation of qDSA benefits from reusing the prime
field and the elliptic curve arithmetic that support the Diffie-Hellman protocol.
Like other elliptic-curve-based schemes, the performance-critical operation of
qDSA is the calculation of scalar point multiplications. To address this issue,
we revisited the fixed-point multiplication proposed by Oliveira et al. [29].
One advantage of this algorithm is the use of precomputed tables, which reduces
the execution time of point multiplications. However, this algorithm operates
with low-order points during its computation, and it must be recalled that an
improper use of these points could open the door to vulnerabilities.
For that reason, and with the aim of providing not only an efficient but also
a secure implementation, we showed modifications to the algorithm of Oliveira
et al. that circumvent the use of low-order points. We noticed that whenever k
is odd, the x-coordinate of kP can be calculated without requiring low-order
points; when k is even, the x-coordinate of −kP is calculated instead. In both
cases, the resulting x-coordinate is the same, since on the Kummer variety
scalar multiplication does not depend on the sign of the scalar. Our
observations led to Algorithm 7, which computes fixed-point multiplications on
Montgomery curves faster and does not require low-order points.
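The sign-independence used above can be observed with a plain x-only Montgomery
ladder. The Python sketch below follows the ladder step of RFC 7748 (without
scalar clamping) on Curve25519; mapping an even k to ℓ − k, which is odd
because ℓ is odd, is one possible way to realize the computation of −kP and is
an assumption of this example. The sketch does not reproduce Algorithm 7, and
the names x_ladder and odd_representative are ours.

```python
p = 2**255 - 19
a24 = 121665                                   # (A - 2) / 4 for A = 486662
ell = 2**252 + 27742317777372353535851937790883648493

def x_ladder(k, x1):
    """x-coordinate of k*P from x1 = x(P); requires 0 <= k < 2^255.
    Branches on scalar bits, so it is not constant-time."""
    x2, z2, x3, z3 = 1, 0, x1, 1               # (infinity, P)
    for t in reversed(range(255)):
        bit = (k >> t) & 1
        if bit:
            x2, z2, x3, z3 = x3, z3, x2, z2
        a, b = (x2 + z2) % p, (x2 - z2) % p
        c, d = (x3 + z3) % p, (x3 - z3) % p
        da, cb = d * a % p, c * b % p
        aa, bb = a * a % p, b * b % p
        e = (aa - bb) % p
        x3 = (da + cb) ** 2 % p                # differential addition
        z3 = x1 * (da - cb) ** 2 % p
        x2 = aa * bb % p                       # doubling
        z2 = e * (aa + a24 * e) % p
        if bit:
            x2, z2, x3, z3 = x3, z3, x2, z2
    return x2 * pow(z2, p - 2, p) % p

def odd_representative(k):
    """Return an odd scalar k' with x(k'P) = x(kP) for P of order ell."""
    k %= ell
    return k if k & 1 else ell - k             # (ell - k)P = -kP

k = 2 * 0x1234567890ABCDEF                     # arbitrary even example scalar
k_odd = odd_representative(k)
assert k_odd & 1 == 1
assert x_ladder(k, 9) == x_ladder(k_odd, 9)    # x(kG) = x(-kG)
```

The equality holds because P and −P share the same x-coordinate, so an
x-only algorithm is free to pick whichever sign yields an odd scalar.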
Additionally, we derived a new method to verify qDSA signatures unequivocally.
Our method was inspired by Montgomery's work and revealed that the public key
must contain not only the x-coordinate of Q, but also its y-coordinate; with
this information, the verifier is able to validate signatures unequivocally.
This requirement introduces a trade-off between time and space. On the one
hand, if the public key contains both coordinates, then the verification
procedure remains as efficient as the original method; however, the size of the
public key doubles. On the other hand, to avoid increasing the size of the
keys, the y-coordinate can be encoded in a single bit; nonetheless, the
execution time of the verification procedure then increases by 8-10% with
respect to the original method. We remark that opting for either alternative
enables the unequivocal verification of qDSA signatures, which further protects
against potential vulnerabilities and the misuse of the original method.
According to the timings obtained in the performance benchmark, it can be
concluded that, for the evaluated platforms, qDSA is a competitive alternative
for deploying digital signatures.
Acknowledgments. The authors thank the anonymous reviewers of the SPACE 2017
conference for their comments on this research project.
References
1. Bernstein, D.J.: Curve25519: New Diffie-Hellman Speed Records. In Yung, M.,
Dodis, Y., Kiayias, A., Malkin, T., eds.: Public Key Cryptography - PKC 2006:
9th International Conference on Theory and Practice in Public-Key Cryptography,
New York, NY, USA, April 24-26, 2006. Proceedings, Berlin, Heidelberg, Springer
Berlin Heidelberg (April 2006) 207–228 https://doi.org/10.1007/11745853_14.
2. Bernstein, D.J., Duif, N., Lange, T., Schwabe, P., Yang, B.Y.: High-speed high-
security signatures. Journal of Cryptographic Engineering 2(2) (September 2012)
77–89 http://dx.doi.org/10.1007/s13389-012-0027-1.
3. Biehl, I., Meyer, B., Müller, V.: Differential Fault Attacks on Elliptic Curve Cryp-
tosystems. In Bellare, M., ed.: Advances in Cryptology — CRYPTO 2000: 20th An-
nual International Cryptology Conference Santa Barbara, California, USA, August
20–24, 2000 Proceedings, Berlin, Heidelberg, Springer Berlin Heidelberg (August
2000) 131–146 https://doi.org/10.1007/3-540-44598-6_8.
4. Brumley, D., Boneh, D.: Remote Timing Attacks Are Practical. In: Proceedings of
the 12th Conference on USENIX Security Symposium, USENIX Association (August
2003) 1–13
https://www.usenix.org/conference/12th-usenix-security-symposium/remote-timing-attacks-are-practical.
5. Chou, T.: Sandy2x: New Curve25519 Speed Records. In Dunkelman, O.,
Keliher, L., eds.: Selected Areas in Cryptography - SAC 2015: 22nd International
Conference, Sackville, NB, Canada, August 12-14, 2015, Revised Selected
Papers, Cham, Springer International Publishing (August 2016) 145–160
http://dx.doi.org/10.1007/978-3-319-31301-6_8.
6. Costello, C., Smith, B.: Montgomery curves and their arithmetic. Journal of Cryp-
tographic Engineering (Special Issue on Montgomery Arithmetic) (March 2017)
1–14 http://dx.doi.org/10.1007/s13389-017-0157-6.
7. Egele, M., Brumley, D., Fratantonio, Y., Kruegel, C.: An Empirical Study of
Cryptographic Misuse in Android Applications. In: Proceedings of the 2013
ACM SIGSAC Conference on Computer & Communications Security. CCS ’13,
New York, NY, USA, ACM (2013) 73–84
http://doi.acm.org/10.1145/2508859.2516693.
8. Fahl, S., Harbach, M., Muders, T., Baumgärtner, L., Freisleben, B., Smith, M.:
Why Eve and Mallory Love Android: An Analysis of Android SSL (in)Security.
In: Proceedings of the 2012 ACM Conference on Computer and Communications
Security. CCS ’12, New York, NY, USA, ACM (2012) 50–61
http://doi.acm.org/10.1145/2382196.2382205.
9. Fan, J., Gierlichs, B., Vercauteren, F.: To Infinity and Beyond: Combined Attack on
ECC Using Points of Low Order. In Preneel, B., Takagi, T., eds.: Cryptographic
Hardware and Embedded Systems – CHES 2011: 13th International Workshop,
Nara, Japan, September 28 – October 1, 2011. Proceedings, Berlin, Heidelberg,
Springer Berlin Heidelberg (October 2011) 143–159
https://doi.org/10.1007/978-3-642-23951-9_10.
10. Faz-Hernández, A., Longa, P., Sánchez, A.H.: Efficient and secure algorithms for
GLV-based scalar multiplication and their implementation on GLV–GLS curves
(extended version). Journal of Cryptographic Engineering 5(1) (Apr 2015) 31–52
https://doi.org/10.1007/s13389-014-0085-7.
11. Feng, M., Zhu, B.B., Zhao, C., Li, S.: Signed MSB-Set Comb Method for Ellip-
tic Curve Point Multiplication. In Chen, K., Deng, R., Lai, X., Zhou, J., eds.:
Information Security Practice and Experience: Second International Conference,
ISPEC 2006, Hangzhou, China, April 11-14, 2006. Proceedings, Berlin,
Heidelberg, Springer Berlin Heidelberg (April 2006) 13–24
https://doi.org/10.1007/11689522_2.
12. Fujii, H., Aranha, D.F.: Curve25519 for the Cortex-M4 and Beyond. In: Progress
in Cryptology – LATINCRYPT 2017: 5th International Conference on Cryptology
and Information Security in Latin America 2017, Proceedings. Lecture Notes in
Computer Science, Springer International Publishing (September 2017)
13. Goundar, R.R., Joye, M., Miyaji, A., Rivain, M., Venelli, A.: Scalar multiplication
on Weierstraß elliptic curves from Co-Z arithmetic. Journal of Cryptographic En-
gineering 1(2) (Aug 2011) 161 http://dx.doi.org/10.1007/s13389-011-0012-0.
14. Hamburg, M.: Fast and compact elliptic-curve cryptography. Cryptology ePrint
Archive, Report 2012/309 (May 2012) http://eprint.iacr.org/2012/309.
15. Hedabou, M., Pinel, P., Bénéteau, L.: A comb method to render ECC resistant
against Side Channel Attacks. Cryptology ePrint Archive, Report 2004/342 (De-
cember 2004) http://eprint.iacr.org/2004/342.
16. Jager, T., Schwenk, J., Somorovsky, J.: Practical Invalid Curve Attacks on TLS-
ECDH. In Pernul, G., Y A Ryan, P., Weippl, E., eds.: Computer Security –
ESORICS 2015: 20th European Symposium on Research in Computer Security,
Vienna, Austria, September 21-25, 2015, Proceedings, Part I, Cham, Springer
International Publishing (2015) 407–425
https://doi.org/10.1007/978-3-319-24174-6_21.
17. Johnson, D., Menezes, A., Vanstone, S.: The Elliptic Curve Digital Signature
Algorithm (ECDSA). International Journal of Information Security 1(1) (August
2001) 36–63 http://dx.doi.org/10.1007/s102070100002.
18. Josefsson, S., Liusvaara, I.: Edwards-Curve Digital Signature Algorithm (EdDSA).
RFC 8032 (January 2017) https://dx.doi.org/10.17487/rfc8032.
19. Joye, M.: Highly Regular Right-to-Left Algorithms for Scalar Multiplication. In
Paillier, P., Verbauwhede, I., eds.: Cryptographic Hardware and Embedded Sys-
tems - CHES 2007: 9th International Workshop, Vienna, Austria, September 10-13,
2007. Proceedings, Berlin, Heidelberg, Springer Berlin Heidelberg (2007) 135–147
http://dx.doi.org/10.1007/978-3-540-74735-2_10.
20. Kocher, P.C.: Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS,
and Other Systems. In Koblitz, N., ed.: Advances in Cryptology — CRYPTO ’96:
16th Annual International Cryptology Conference Santa Barbara, California, USA
August 18–22, 1996 Proceedings. Springer Berlin Heidelberg, Berlin, Heidelberg
(1996) 104–113 https://doi.org/10.1007/3-540-68697-5_9.
21. Lim, C.H., Lee, P.J.: More Flexible Exponentiation with Precomputation. In
Desmedt, Y.G., ed.: Advances in Cryptology — CRYPTO ’94: 14th Annual In-
ternational Cryptology Conference Santa Barbara, California Proceedings, Berlin,
Heidelberg, Springer Berlin Heidelberg (August 1994) 95–107
https://doi.org/10.1007/3-540-48658-5_11.
22. Montgomery, P.L.: Speeding the Pollard and Elliptic Curve Methods of
Factorization. Mathematics of Computation 48(177) (January 1987) 243–264
http://dx.doi.org/10.2307/2007888.
23. Moon, A.: Implementations of a fast Elliptic-curve Digital Signature Algorithm.
https://github.com/floodyberry/ed25519-donna (March 2012)
24. NIST: Digital Signature Standard (DSS). Technical Report FIPS 186-1, National
Institute for Standards and Technology (December 1998)
25. NIST: Digital Signature Standard (DSS). Technical Report FIPS 186-2, National
Institute of Standards and Technology (January 2000)
http://csrc.nist.gov/publications/fips/archive/fips186-2/fips186-2.pdf.
26. NIST: SHA-3 Standard: Permutation-Based Hash and Extendable-Output Func-
tions. Technical Report FIPS-202, National Institute of Standards and Technology
(August 2015) http://dx.doi.org/10.6028/NIST.FIPS.202.
27. Okeya, K., Sakurai, K.: Efficient Elliptic Curve Cryptosystems from a Scalar Mul-
tiplication Algorithm with Recovery of the y-Coordinate on a Montgomery-Form
Elliptic Curve. In Koç, Ç.K., Naccache, D., Paar, C., eds.: Cryptographic Hard-
ware and Embedded Systems — CHES 2001: Third International Workshop Paris,
France, May 14–16, 2001 Proceedings, Berlin, Heidelberg, Springer Berlin
Heidelberg (September 2001) 126–141
http://dx.doi.org/10.1007/3-540-44709-1_12.
28. Oliveira, T., Aranha, D.F., López, J., Rodríguez-Henríquez, F.: Fast Point Multi-
plication Algorithms for Binary Elliptic Curves with and without Precomputation.
In Joux, A., Youssef, A., eds.: Selected Areas in Cryptography – SAC 2014: 21st
International Conference, Montreal, QC, Canada, August 14-15, 2014, Revised Se-
lected Papers, Cham, Springer International Publishing (August 2014) 324–344
http://dx.doi.org/10.1007/978-3-319-13051-4_20.
29. Oliveira, T., López, J., Hışıl, H., Faz-Hernández, A., Rodríguez-Henríquez, F.: How
to (pre-)compute a ladder. In: Selected Areas in Cryptography – SAC 2017: 24th
International Conference, Ottawa, Ontario, Canada, August 16 - 18, 2017, Revised
Selected Papers, Springer International Publishing (August 2017)
30. Perrin, T.: The XEdDSA and VXEdDSA Signature Schemes. Technical
report, Open Whisper Systems (October 2016)
https://whispersystems.org/docs/specifications/xeddsa/xeddsa.pdf.
31. The Sage Developers: SageMath, the Sage Mathematics Software System (Version
7.6). (2017) http://www.sagemath.org.
32. Renes, J., Smith, B.: qDSA: Small and Secure Digital Signatures with Curve-based
Diffie-Hellman Key Pairs. In: Advances in Cryptology – ASIACRYPT 2017: 23rd
International Conference on the Theory and Application of Cryptology and
Information Security, Hong Kong, China, December 3-7, 2017, Proceedings.
(December 2017)
33. Rescorla, E., Dierks, T.: The Transport Layer Security (TLS) Protocol Version
1.2. RFC 5246 (August 2008) https://dx.doi.org/10.17487/rfc5246.
34. Rivest, R.L., Shamir, A., Adleman, L.: A method for obtaining digital signatures
and public-key cryptosystems. Communications of the ACM 21(2) (February 1978)
120–126 http://doi.org/10.1145/359340.359342.
35. Schnorr, C.P.: Efficient signature generation by smart cards. Journal of Cryptology
4(3) (January 1991) 161–174 http://dx.doi.org/10.1007/BF00196725.
36. Seo, H., Kim, H.: Consecutive Operand-Caching Method for Multiprecision Mul-
tiplication, Revisited. Journal of Information and Communication Convergence
Engineering 13(1) (Mar 2015) 27–35
http://dx.doi.org/10.6109/jicce.2015.13.1.027.
37. Spagni, R.: Disclosure of a Major Bug in CryptoNote Based Currencies.
Announcement on
https://getmonero.org/2017/05/17/disclosure-of-a-major-bug-in-cryptonote-based-currencies.html
(May 2017)
38. Taverne, J., Faz-Hernández, A., Aranha, D.F., Rodríguez-Henríquez, F., Hanker-
son, D., López, J.: Speeding scalar multiplication over binary elliptic curves using
the new carry-less multiplication instruction. Journal of Cryptographic Engineer-
ing 1(3) (Sep 2011) 187 https://doi.org/10.1007/s13389-011-0017-8.
39. The OpenSSL Project: OpenSSL: The Open Source toolkit for SSL/TLS.
www.openssl.org (April 2003)
40. Tromer, E., Osvik, D.A., Shamir, A.: Efficient Cache Attacks on AES, and
Countermeasures. Journal of Cryptology 23(1) (January 2010) 37–71
http://dx.doi.org/10.1007/s00145-009-9049-y.
41. Turner, S., Langley, A., Hamburg, M.: Elliptic Curves for Security. RFC 7748
(January 2016) https://dx.doi.org/10.17487/rfc7748.
42. Yen, S.M., Joye, M.: Checking before output may not be enough against fault-
based cryptanalysis. IEEE Transactions on Computers 49(9) (Sep 2000) 967–970
https://doi.org/10.1109/12.869328.