All content in this area was uploaded by Qasem Abu Al-Haija on Dec 14, 2013
Efficient Algorithms & Architectures for Elliptic Curve
Crypto-Processor Over GF(P) Using New Projective
Coordinates Systems
Eng Qasem1982@yahoo.com, tawalbeh@just.edu.jo
Jordan University of Science and Technology, Department of Computer Engineering,
Jordan, Irbid 22110, P. O. Box 3030
Abstract—Elliptic Curve Cryptography (ECC) is a public-
key cryptosystem that is considered among the most important
schemes in information security. ECC computations operate on points of
the curve and suffer from the modular inversion operation, which is
well known to be very expensive. Using projective coordinates instead
of affine coordinates to represent the points on the elliptic curve
replaces each inversion with several multiplications. Many types of
projective coordinates have been proposed for the elliptic curve
E: $y^2 = x^3 + ax + b$, defined over prime finite fields GF(p).
In this paper, we study new projective coordinate systems
to achieve higher performance. The selected coordinates were
tested using parallel multipliers to obtain maximum gain.
The experiments showed competitive results when using Tripling
Oriented and Montgomery curves. Montgomery curves gave
the best results regarding area and time under the Area*Time (AT)
measure. These findings make these curves a good choice for
efficient EC crypto-processor design. The results of the FPGA
implementation of the EC design using these curves are also presented
in this paper.
Index Terms—Elliptic Curves, Public Key Cryptography, GF(p),
Montgomery Curves, Modular Inversion, Projective Coordinates
I. INTRODUCTION
Protecting private data during transmission over the Internet
or any non-secure channel is very important. To keep this
data confidential, many mechanisms are used. Cryptographic
algorithms can be used to provide security services such as
secrecy, integrity and authentication [1].
Cryptographic algorithms can be classified into two types
based on the number of keys used. Symmetric-key algorithms
use one private key shared between the sender and
the receiver. Public-key (asymmetric) algorithms, on the other hand,
use two keys: one is kept private for the decryption
operation, and the other is made public for all users. Generally
speaking, public-key algorithms are very powerful
and secure, but they need high computational power compared
with symmetric-key algorithms [1], [3].
Elliptic Curve Cryptography (ECC) is one of the most
secure public-key cryptosystems. It provides a level of security
equal to or higher than that of other known crypto-algorithms
with a shorter key size. Like other public-key crypto-algorithms,
ECC involves modular arithmetic operations such
as division and multiplication over finite fields, which are time
consuming, especially for large operands.
The modular division determines the efficiency of the whole
ECC architecture since it is the longest-delay operation. Using
projective coordinates to define the points on the Elliptic
Curves eliminates the division operation.
On the other hand, implementations of ECC using
standard curves have been in use for a long time, and a vast
amount of research has investigated the security
and efficiency of these implementations. New algorithms and
hardware architectures for new elliptic curves that use new
projective coordinates [29] are needed in order to find the
best alternatives in terms of performance and security.
In this paper, the equations for the ECC operations (point
addition/doubling) are derived for two elliptic curves: Tripling Oriented
Doche-Icart-Kohel curves and Montgomery curves [15].
Then, we propose the data flows describing these operations,
followed by hardware designs that implement the best of
these data flows over GF(p) using selected projective coordinate
systems. The designs exploit maximum parallelism to achieve
higher performance. The experimental results for the FPGA
implementations of the proposed architectures were obtained
and compared with other designs.
The next section presents some related work in this area.
Section 3 shows the related algorithms for ECC over GF (p).
The mathematical equations for the new curves are derived in
Section 4. Section 5 presents the data flows and hardware
designs for the curves computations. Section 6 shows the
experimental results, followed by conclusions in Section 7.
II. LITERATURE REVIEW
Many hardware units have been proposed to perform Elliptic
Curve Cryptography computations [15], [29] over the prime
finite fields GF(p). To avoid modular inversion, most of these
solutions use projective coordinates and replace the
inversion by several multiplications.
The author in [6] proposed a new design and implementation
of an elliptic curve cryptographic core to realize
point scalar multiplication operations over GF(p). The
design makes use of projective coordinates together with
scalable Montgomery multipliers for data sizes of up to 256 bits
for encryption/decryption. It uses four multiplier cores together
with the ordinary projective coordinates, which outperform
implementations with Jacobian coordinates.
Journal of Information Assurance and Security 6 (2011) 063-072
Received August 10, 2010 1554-1010 $ 03.50 Dynamic Publishers, Inc.
Qasem Abu Al-Haija' and Lo'ai Tawalbeh
The authors in [7] analyzed the impact of exploiting the
parallelism available in two common Elliptic Curves for
ECC defined using projective coordinates. They assumed that
point-multiplication is implemented using the m-ary algorithm
instead of the popular binary algorithm. Point multiplication
is implemented using scalable multipliers in order to replicate
the design for varying-size security keys. They used Jacobian,
modified Jacobian, and Chudnovsky-Jacobian projective coor-
dinates.
The authors in [8] combined the inherited parallelism in
both levels: computations of upper scalar multiplication level
and lower point operations level. This work was proposed to
reduce the delay and improve the security against the simple
power attack.
Faster addition and doubling algorithms on elliptic curves
were proposed in [9]. The authors used explicit formulas
(and register allocations) for group operations on an Edwards
curve in projective form. Their algorithm for doubling uses only 3 field
multiplications (M) and 4 squaring operations (S). If small
curve parameters are chosen, then the algorithm for mixed
addition uses only 9M + 1S, and the algorithm for non-mixed
addition uses only 10M + 1S.
On the other hand, some researchers prefer to use affine
coordinates to represent the points on the elliptic curve, which
requires efficient algorithms and hardware implementations to
compute modular inversion. Tawalbeh in [10], [12] presented a
unified inversion algorithm and architecture for both GF(p)
and GF(2^k). All comparisons between field elements, which are
usually expensive and time-consuming, were replaced by
counters that keep track of the difference between the field elements. The
proposed architecture uses a scheduling method to reduce the
number of hardware resources without significantly increasing
the total execution time. The proposed hardware can be used
to compute ECC operations in an efficient way.
In [13], the author proposed a new hardware architecture
to compute GF(p) Montgomery inversion with scalability
features. The architecture consists of two parts: a computing
unit and a memory unit to hold the data. The computing unit
performs all the arithmetic operations on a word-by-word basis
to achieve the scalability of the design.
III. RELATED ALGORITHMS FOR ECC OVER GF(p)
Let E be an elliptic curve defined over GF(p), where
E: $y^2 = x^3 + ax + b \pmod{p}$, with $a, b \in GF(p)$ satisfying
$4a^3 + 27b^2 \neq 0 \pmod{p}$. Let $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ be two points on the
curve E. The point at infinity, denoted by $\infty$, is also considered to
be on the curve. Adding the two distinct points $P_1 + P_2$ gives
$P_3$, where $P_3$ is on E and is given by:
Point Addition ($x_1 \neq x_2$): $P_3 = P_1 + P_2 = (x_3, y_3)$, where:
$$x_3 = m^2 - x_1 - x_2, \qquad y_3 = m(x_1 - x_3) - y_1 \tag{1}$$
where: $m = (y_2 - y_1)/(x_2 - x_1)$
For the case $P_1 = P_2$, the operation is called point doubling and
$P_3$ will be:
Point Doubling: adding a point to itself ($y_1 \neq 0$): $P_3 = P_1 + P_1 = 2P_1 = (x_3, y_3)$, where:
$$x_3 = m^2 - 2x_1, \qquad y_3 = m(x_1 - x_3) - y_1 \tag{2}$$
where: $m = \frac{dy}{dx} = (3x_1^2 + a)/(2y_1)$
The point addition and point doubling algorithms are combined
to perform the basic operation in ECC, which is scalar point
multiplication kP, where k is a constant and P is a point on
the curve [24], [25]. Let us consider the Right-to-Left scalar point
multiplication algorithm [8], [15], [20] shown below:
Define: n: number of bits in k; k_i: the i-th bit of k.
Input: k and P (a point on the elliptic curve).
Output: Q = kP (another point on the curve).
Right to Left Algorithm
1. Q := 0 (the point at infinity);
2. for i = 0 to n-1 do
3.     if k_i = 1 then Q := Q + P;
4.     P := P + P;
5. return Q;
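The Right-to-Left algorithm above can be sketched in software as follows. This is a minimal illustration, not the hardware design of this paper: it uses affine coordinates with Equations (1) and (2) on a short Weierstrass curve, and the toy parameters (p = 97, a = 2, b = 3, base point (0, 10)) and all function names are assumptions chosen only to keep the example small.

```python
# Toy curve y^2 = x^3 + 2x + 3 over GF(97); parameters assumed for illustration.
P_MOD, A_COEF, B_COEF = 97, 2, 3
INF = None  # the point at infinity, i.e. the "0" of step 1


def on_curve(pt):
    """Check that pt satisfies the curve equation (INF counts as on the curve)."""
    if pt is INF:
        return True
    x, y = pt
    return (y * y - (x * x * x + A_COEF * x + B_COEF)) % P_MOD == 0


def add(p1, p2):
    """Affine addition (Equation (1)) and doubling (Equation (2))."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF  # P + (-P), including doubling a point with y = 0
    if p1 == p2:
        m = (3 * x1 * x1 + A_COEF) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        m = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (m * m - x1 - x2) % P_MOD
    y3 = (m * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult_rtl(k, point):
    """Right-to-Left double-and-add: scan the bits of k from the LSB upward."""
    q = INF
    while k > 0:
        if k & 1:                  # step 3: k_i = 1, so Q := Q + P
            q = add(q, point)
        point = add(point, point)  # step 4: P := P + P
        k >>= 1
    return q
```

Note that the two operations inside the loop are independent of each other within one iteration, which is what allows point addition and point doubling to run in parallel in hardware.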
As mentioned earlier, projective coordinates are used to avoid
inversions. In this paper, we study the use of the
following three projective coordinates:
1) $(X/Z, Y/Z)$.
2) $(X/Z, Y/Z^2)$.
3) $(X/Z^2, Y/Z^3)$.
for the following two curves used in ECC:
1) Tripling Oriented Doche-Icart-Kohel:
$$Y^2 = X^3 + 3a(X + 1)^2 \tag{3}$$
and
2) Montgomery Curves:
$$bY^2 = X^3 + aX^2 + X \tag{4}$$
IV. MATHEMATICAL EQUATIONS
In this section, we derive the point addition/doubling equations
for the two Elliptic Curves mentioned before for the three suggested
projective coordinates [15]. These equations will be used in the Scalar
point multiplication operations in ECC. The implementation of the
ECC operations will use parallel multipliers to increase the efficiency.
A. Point Addition
All computations below assume that $P_1 = (X_1, Y_1)$, $P_2 = (X_2, Y_2)$
and $P_3 = P_1 + P_2 = (X_3, Y_3) = (X_1, Y_1) + (X_2, Y_2)$,
which will be calculated according to Equation (1) using projective
coordinates.
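• Using Projection $(X/Z, Y/Z)$
According to Equation (1), we have to compute the slope (m):
$$m = \frac{y_2 - y_1}{x_2 - x_1} \Rightarrow M = \frac{\frac{Y_2}{Z_2} - \frac{Y_1}{Z_1}}{\frac{X_2}{Z_2} - \frac{X_1}{Z_1}} = \frac{Y_2 Z_1 - Y_1 Z_2}{X_2 Z_1 - X_1 Z_2}$$
Let: $A = Y_2 Z_1 - Y_1 Z_2$, $B = X_2 Z_1 - X_1 Z_2 \Rightarrow M = \frac{A}{B}$
Using Equation (1), we substitute the new values for
x, y, and m, and get the following:
$$X_3' = \frac{A^2}{B^2} - \frac{X_1}{Z_1} - \frac{X_2}{Z_2} = \frac{Z_1 Z_2 A^2 - (X_1 Z_2 + X_2 Z_1) B^2}{Z_1 Z_2 B^2}$$
$$Y_3' = \frac{A}{B}\left[\frac{X_1}{Z_1} - \frac{Z_1 Z_2 A^2 - (X_1 Z_2 + X_2 Z_1) B^2}{Z_1 Z_2 B^2}\right] - \frac{Y_1}{Z_1} = \frac{A\left[X_1 Z_2 B^2 - Z_1 Z_2 A^2 + (X_1 Z_2 + X_2 Z_1) B^2\right] - Y_1 Z_2 B^3}{Z_1 Z_2 B^3}$$
By multiplying $X_3'$ by $\frac{B}{B}$ so that both coordinates share the denominator $Z_1 Z_2 B^3$, we get:
$$X_3 = B\left[Z_1 Z_2 A^2 - (X_1 Z_2 + X_2 Z_1) B^2\right]$$
$$Y_3 = A\left[X_1 Z_2 B^2 - Z_1 Z_2 A^2 + (X_1 Z_2 + X_2 Z_1) B^2\right] - Y_1 Z_2 B^3$$
$$Z_3 = Z_1 Z_2 B^3$$
For simplicity, we denote the additions and multiplications
with symbols to facilitate extracting the parallelism, as follows:
$\alpha_1 = Y_2 Z_1,\ \alpha_2 = Y_1 Z_2,\ \alpha_3 = X_2 Z_1,\ \alpha_4 = X_1 Z_2$
$\Rightarrow A = \alpha_1 - \alpha_2,\ B = \alpha_3 - \alpha_4,\ C = \alpha_3 + \alpha_4$
$\alpha_5 = A^2,\ \alpha_6 = B^2,\ \alpha_7 = Z_1 Z_2,\ \alpha_8 = \alpha_2 B$
$\alpha_9 = C \alpha_6,\ \alpha_{10} = \alpha_4 \alpha_6,\ \alpha_{11} = \alpha_7 \alpha_5,\ \alpha_{12} = \alpha_6 \alpha_7$
$\Rightarrow D = \alpha_{11} - \alpha_9,\ E = \alpha_{10} - \alpha_{11} + \alpha_9$
$\alpha_{13} = B \alpha_{12},\ \alpha_{14} = \alpha_8 \alpha_6,\ \alpha_{15} = D B,\ \alpha_{16} = E A$
$\Rightarrow X_3 = \alpha_{15},\ Y_3 = \alpha_{16} - \alpha_{14},\ Z_3 = \alpha_{13}$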
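As a sanity check on the schedule above, the following sketch evaluates the sixteen products of the $(X/Z, Y/Z)$ addition in order and compares the result, mapped back to affine form, against Equation (1). The prime 97, the sample points and the helper names `lift` and `to_affine` are assumptions made for this illustration; the code is a software model, not the hardware data flow itself.

```python
P_MOD = 97  # small prime, assumed for illustration only


def add_affine(p1, p2):
    """Equation (1): affine addition of two points with x1 != x2."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (m * m - x1 - x2) % P_MOD
    y3 = (m * (x1 - x3) - y1) % P_MOD
    return x3, y3


def add_projective(P1, P2):
    """Addition in the (X/Z, Y/Z) projection, following the alpha schedule."""
    (X1, Y1, Z1), (X2, Y2, Z2) = P1, P2
    a1, a2, a3, a4 = Y2 * Z1, Y1 * Z2, X2 * Z1, X1 * Z2
    A, B, C = a1 - a2, a3 - a4, a3 + a4
    a5, a6, a7, a8 = A * A, B * B, Z1 * Z2, a2 * B
    a9, a10, a11, a12 = C * a6, a4 * a6, a7 * a5, a6 * a7
    D, E = a11 - a9, a10 - a11 + a9
    a13, a14, a15, a16 = B * a12, a8 * a6, D * B, E * A
    return a15 % P_MOD, (a16 - a14) % P_MOD, a13 % P_MOD


def lift(pt, z):
    """Represent affine (x, y) as (X, Y, Z) with x = X/Z and y = Y/Z."""
    x, y = pt
    return x * z % P_MOD, y * z % P_MOD, z % P_MOD


def to_affine(P):
    """Map (X, Y, Z) back to (X/Z, Y/Z); the only inversion in the flow."""
    X, Y, Z = P
    z_inv = pow(Z, -1, P_MOD)
    return X * z_inv % P_MOD, Y * z_inv % P_MOD
```

The four groups of products correspond to four sequential multiplication levels, each of which can be served by four parallel multipliers; the single inversion is deferred to `to_affine`.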
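• Using Projection $(X/Z, Y/Z^2)$
Using the same procedure applied above, we compute the
slope (m):
$$m = \frac{y_2 - y_1}{x_2 - x_1} \Rightarrow M = \frac{\frac{Y_2}{Z_2^2} - \frac{Y_1}{Z_1^2}}{\frac{X_2}{Z_2} - \frac{X_1}{Z_1}} = \frac{Y_2 Z_1^2 - Y_1 Z_2^2}{Z_1 Z_2 (X_2 Z_1 - X_1 Z_2)}$$
Let: $A = Y_2 Z_1^2 - Y_1 Z_2^2$, $B = X_2 Z_1 - X_1 Z_2 \Rightarrow M = \frac{A}{Z_1 Z_2 B}$
Using Equation (1) to substitute the new values for x, y,
and m:
$$X_3' = \frac{A^2}{Z_1^2 Z_2^2 B^2} - \frac{X_1}{Z_1} - \frac{X_2}{Z_2} = \frac{A^2 - Z_1 Z_2 B^2 \left[X_1 Z_2 + X_2 Z_1\right]}{(Z_1 Z_2 B)^2}$$
Let $C = X_1 Z_2 + X_2 Z_1 \Rightarrow X_3' = \frac{A^2 - Z_1 Z_2 C B^2}{(Z_1 Z_2 B)^2}$
$$Y_3' = \frac{A}{Z_1 Z_2 B}\left[\frac{X_1}{Z_1} - \frac{A^2 - Z_1 Z_2 C B^2}{(Z_1 Z_2 B)^2}\right] - \frac{Y_1}{Z_1^2} = \frac{A\left[(X_1 Z_2 + C) Z_1 Z_2 B^2 - A^2\right] - Y_1 Z_1 (Z_2 B)^3}{(Z_1 Z_2 B)^3}$$
Assume: $D = X_1 Z_2 + C = 2 X_1 Z_2 + X_2 Z_1$
$$\Rightarrow Y_3' = \frac{A\left[Z_1 Z_2 D B^2 - A^2\right] - Y_1 Z_1 (Z_2 B)^3}{(Z_1 Z_2 B)^3}$$
Let: $E = Z_1 Z_2 D B^2 - A^2 \Rightarrow Y_3' = \frac{A E - Y_1 Z_1 (Z_2 B)^3}{(Z_1 Z_2 B)^3}$
and after more simplifications (multiply $Y_3'$ by $\frac{Z_1 Z_2 B}{Z_1 Z_2 B}$):
$$X_3 = A^2 - Z_1 Z_2 C B^2$$
$$Y_3 = Z_1 Z_2 B A E - Y_1 Z_2^2 (Z_1 Z_2 B)^2 B^2$$
$$Z_3 = (Z_1 Z_2 B)^2$$
and finally, using the same symbols as above:
$\alpha_1 = X_1 Z_2,\ \alpha_2 = X_2 Z_1,\ \alpha_3 = Z_1 Z_2,\ \alpha_4 = Z_2^2$
$\Rightarrow B = \alpha_2 - \alpha_1,\ C = \alpha_2 + \alpha_1,\ D = 2\alpha_1 + \alpha_2$
$\alpha_5 = Z_1^2,\ \alpha_6 = B \alpha_3,\ \alpha_7 = B^2,\ \alpha_8 = Y_1 \alpha_4$
$\alpha_9 = \alpha_6 \alpha_6,\ \alpha_{10} = B \alpha_6,\ \alpha_{11} = \alpha_8 \alpha_7,\ \alpha_{12} = Y_2 \alpha_5$
$\Rightarrow A = \alpha_{12} - \alpha_8,\ Z_3 = \alpha_9$
$\alpha_{13} = \alpha_{11} \alpha_9,\ \alpha_{14} = A^2,\ \alpha_{15} = A \alpha_6,\ \alpha_{16} = D \alpha_{10}$
$\Rightarrow E = \alpha_{16} - \alpha_{14}$
$\alpha_{17} = E \alpha_{15},\ \alpha_{18} = C \alpha_{10}$
$\Rightarrow X_3 = \alpha_{14} - \alpha_{18},\ Y_3 = \alpha_{17} - \alpha_{13}$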
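• Using Projection $(X/Z^2, Y/Z^3)$
For the last projection, the slope (m) is:
$$m = \frac{y_2 - y_1}{x_2 - x_1} \Rightarrow M = \frac{\frac{Y_2}{Z_2^3} - \frac{Y_1}{Z_1^3}}{\frac{X_2}{Z_2^2} - \frac{X_1}{Z_1^2}} = \frac{Y_2 Z_1^3 - Y_1 Z_2^3}{Z_1 Z_2 (X_2 Z_1^2 - X_1 Z_2^2)}$$
Let: $A = Y_2 Z_1^3 - Y_1 Z_2^3$ and $B = X_2 Z_1^2 - X_1 Z_2^2$
$\Rightarrow M = \frac{A}{Z_1 Z_2 B}$
and by substituting the values for x, y, and m from Equation (1),
we get:
$$X_3' = \frac{A^2}{Z_1^2 Z_2^2 B^2} - \frac{X_1}{Z_1^2} - \frac{X_2}{Z_2^2} = \frac{A^2 - B^2\left[X_1 Z_2^2 + X_2 Z_1^2\right]}{(Z_1 Z_2 B)^2}$$
Let $C = X_1 Z_2^2 + X_2 Z_1^2 \Rightarrow X_3' = \frac{A^2 - C B^2}{(Z_1 Z_2 B)^2}$
$$Y_3' = \frac{A}{Z_1 Z_2 B}\left[\frac{X_1}{Z_1^2} - \frac{A^2 - C B^2}{(Z_1 Z_2 B)^2}\right] - \frac{Y_1}{Z_1^3} = \frac{A\left[(X_1 Z_2^2 + C) B^2 - A^2\right] - Y_1 (Z_2 B)^3}{(Z_1 Z_2 B)^3}$$
Assume: $D = X_1 Z_2^2 + C = 2 X_1 Z_2^2 + X_2 Z_1^2$
$$\Rightarrow Y_3' = \frac{A\left[D B^2 - A^2\right] - Y_1 (Z_2 B)^3}{(Z_1 Z_2 B)^3}$$
Let: $E = D B^2 - A^2 \Rightarrow Y_3' = \frac{A E - Y_1 (Z_2 B)^3}{(Z_1 Z_2 B)^3}$
More simplification results in:
$$X_3 = A^2 - C B^2$$
$$Y_3 = A E - Y_1 (Z_2 B)^3$$
$$Z_3 = Z_1 Z_2 B$$
To get the final formulas, we use the same symbols for additions
and multiplications:
$\alpha_1 = Z_1^2,\ \alpha_2 = Z_2^2,\ \alpha_3 = Y_1 Z_2,\ \alpha_4 = Y_2 Z_1$
$\alpha_5 = X_1 \alpha_2,\ \alpha_6 = X_2 \alpha_1,\ \alpha_7 = \alpha_2 \alpha_3,\ \alpha_8 = \alpha_1 \alpha_4$
$A = \alpha_8 - \alpha_7,\ B = \alpha_6 - \alpha_5,\ C = \alpha_6 + \alpha_5,\ D = 2\alpha_5 + \alpha_6$
$\alpha_9 = A^2,\ \alpha_{10} = B^2,\ \alpha_{11} = Z_1 Z_2,\ \alpha_{12} = B \alpha_7$
$\alpha_{13} = C \alpha_{10},\ \alpha_{14} = B \alpha_{11},\ \alpha_{15} = D \alpha_{10},\ \alpha_{16} = \alpha_{12} \alpha_{10}$
$\Rightarrow X_3 = \alpha_9 - \alpha_{13},\ Z_3 = \alpha_{14},\ E = \alpha_{15} - \alpha_9$
$\alpha_{17} = A E$
$\Rightarrow Y_3 = \alpha_{17} - \alpha_{16}$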
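The seventeen products of the $(X/Z^2, Y/Z^3)$ schedule can be checked the same way. The sketch below evaluates them in the listed order and compares the result with Equation (1); the prime 97, the sample points and the helper names are assumptions for illustration only.

```python
P_MOD = 97  # small prime, assumed for illustration only


def add_affine(p1, p2):
    """Equation (1): affine addition of two points with x1 != x2."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (m * m - x1 - x2) % P_MOD
    return x3, (m * (x1 - x3) - y1) % P_MOD


def add_projective_z2z3(P1, P2):
    """Addition in the (X/Z^2, Y/Z^3) projection, following the alpha schedule."""
    (X1, Y1, Z1), (X2, Y2, Z2) = P1, P2
    a1, a2, a3, a4 = Z1 * Z1, Z2 * Z2, Y1 * Z2, Y2 * Z1
    a5, a6, a7, a8 = X1 * a2, X2 * a1, a2 * a3, a1 * a4
    A, B, C, D = a8 - a7, a6 - a5, a6 + a5, 2 * a5 + a6
    a9, a10, a11, a12 = A * A, B * B, Z1 * Z2, B * a7
    a13, a14, a15, a16 = C * a10, B * a11, D * a10, a12 * a10
    X3, Z3, E = a9 - a13, a14, a15 - a9
    a17 = A * E
    Y3 = a17 - a16
    return X3 % P_MOD, Y3 % P_MOD, Z3 % P_MOD


def lift(pt, z):
    """Represent affine (x, y) as (X, Y, Z) with x = X/Z^2 and y = Y/Z^3."""
    x, y = pt
    return x * z * z % P_MOD, y * z ** 3 % P_MOD, z % P_MOD


def to_affine(P):
    """Map (X, Y, Z) back to (X/Z^2, Y/Z^3)."""
    X, Y, Z = P
    z_inv = pow(Z, -1, P_MOD)
    return X * z_inv ** 2 % P_MOD, Y * z_inv ** 3 % P_MOD
```

Compared with the $(X/Z, Y/Z)$ schedule, this one needs a fifth sequential multiplication level ($\alpha_{17}$), which is consistent with the later observation that $(X/Z, Y/Z)$ gives the best point addition implementation.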
B. Point Doubling
This subsection shows the computations of the point doubling
operation according to Equation (2) in three different projective
coordinates using Tripling Oriented Doche-Icart-Kohel curves
and Montgomery curves. All computations below assume that
$X_1 = X_2 = X$, $Y_1 = Y_2 = Y$.
1) Tripling Oriented Doche-Icart-Kohel Curves: The equation
for the Tripling Oriented Doche-Icart-Kohel elliptic curve with
coefficients in GF(p) is given by:
$$E: y^2 = x^3 + 3a(x + 1)^2 \tag{5}$$
To derive the equations of the point doubling operation according to
Equation (2), we need to find the slope (m), where $m = \frac{dy}{dx}$:
$$2y\,\frac{dy}{dx} = 3x^2 + 3a\,(2(x + 1)) \Rightarrow m = \frac{dy}{dx} = \frac{3\left[x^2 + 2a(x + 1)\right]}{2y}$$
This equation will be used in all 3 projections for this curve.
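• Using Projection $(X/Z, Y/Z)$
Here we substitute $(x, y) \Rightarrow (x \to X/Z,\ y \to Y/Z)$
and denote $m \to M$. Then:
$$M = \frac{3\left[\frac{X^2}{Z^2} + \frac{2a(X + Z)}{Z}\right]}{\frac{2Y}{Z}} = \frac{3\left[X^2 + 2aZ(X + Z)\right]}{2YZ}$$
Let $A = 3\left[X^2 + 2aZ(X + Z)\right] = 3\left[X^2 + 2aZX + 2aZ^2\right] \Rightarrow M = \frac{A}{2YZ}$
This will be used to get $(X_3, Y_3, Z_3)$.
Using Equation (2), we substitute the new values
for x, y, and m. We get:
$$X_3' = \frac{A^2}{4Y^2Z^2} - \frac{2X}{Z} = \frac{A^2 - 8XZY^2}{4Z^2Y^2}$$
$$Y_3' = \frac{A}{2YZ}\left[\frac{X}{Z} - \frac{A^2 - 8XZY^2}{4Z^2Y^2}\right] - \frac{Y}{Z} = \frac{A\left[12XZY^2 - A^2\right] - 8Z^2Y^4}{8Z^3Y^3}$$
To match the denominators of $X_3'$ and $Y_3'$ with the
projection used, we multiply $X_3'$ by $\frac{2YZ}{2YZ}$ to get:
$$X_3 = 2YZ\left[A^2 - 8XZY^2\right]$$
$$Y_3 = A\left[12XZY^2 - A^2\right] - 8Z^2Y^4$$
$$Z_3 = 8Z^3Y^3$$
To simplify the computations of $X_3, Y_3, Z_3$, we denote the
additions and multiplications with symbols:
$\alpha_1 = X^2,\ \alpha_2 = Z^2,\ \alpha_3 = XZ,\ \alpha_5 = Y^2$
$\Rightarrow A = 3\left[\alpha_1 + 2a\alpha_3 + 2a\alpha_2\right]$
$\alpha_4 = YZ,\ \alpha_6 = \alpha_3 \alpha_5,\ \alpha_7 = A^2,\ \alpha_{10} = \alpha_2 \alpha_5$
$\Rightarrow B = \alpha_7 - 8\alpha_6,\ C = 12\alpha_6 - \alpha_7$
$\alpha_8 = B \alpha_4,\ \alpha_9 = A C,\ \alpha_{11} = \alpha_{10} \alpha_5,\ \alpha_{12} = \alpha_{10} \alpha_4$
$\Rightarrow X_3 = 2\alpha_8,\ Y_3 = \alpha_9 - 8\alpha_{11},\ Z_3 = 8\alpha_{12}$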
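The closed forms for $X_3$, $Y_3$, $Z_3$ in this projection can be checked against the affine expressions they were derived from. The sketch below only tests that the projective result, divided through by $Z_3$, agrees with Equation (2) evaluated with the slope of Equation (5); because that agreement is purely algebraic, the sample coordinates are arbitrary. The prime 97, the curve parameter a = 5 and the function names are assumptions for illustration.

```python
P_MOD = 97   # small prime, assumed for illustration only
A_PARAM = 5  # curve parameter a of Equation (5), assumed value


def double_affine(x, y):
    """Equation (2) with the slope m = 3[x^2 + 2a(x + 1)] / (2y)."""
    m = 3 * (x * x + 2 * A_PARAM * (x + 1)) * pow(2 * y, -1, P_MOD) % P_MOD
    x3 = (m * m - 2 * x) % P_MOD
    y3 = (m * (x - x3) - y) % P_MOD
    return x3, y3


def double_projective(X, Y, Z):
    """Closed forms for the (X/Z, Y/Z) projection; no inversion is needed."""
    A = 3 * (X * X + 2 * A_PARAM * Z * X + 2 * A_PARAM * Z * Z)
    X3 = 2 * Y * Z * (A * A - 8 * X * Z * Y * Y)
    Y3 = A * (12 * X * Z * Y * Y - A * A) - 8 * Z * Z * Y ** 4
    Z3 = 8 * Z ** 3 * Y ** 3
    return X3 % P_MOD, Y3 % P_MOD, Z3 % P_MOD


def to_affine(P):
    """Map (X, Y, Z) back to (X/Z, Y/Z)."""
    X, Y, Z = P
    z_inv = pow(Z, -1, P_MOD)
    return X * z_inv % P_MOD, Y * z_inv % P_MOD
```

Scaling the input by any nonzero Z leaves the affine result unchanged, which is the property that lets the inversion be postponed to the end of the scalar multiplication.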
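• Using Projection $(X/Z, Y/Z^2)$
In this case, we substitute $(x, y) \Rightarrow (x \to X/Z,\ y \to Y/Z^2)$, and
follow exactly the same procedure mentioned above, and we
finally get:
$B = X + Z$
$\alpha_1 = X^2,\ \alpha_2 = Z B,\ \alpha_4 = Y^2$
$\Rightarrow A = 3\left[\alpha_1 + 2a\alpha_2\right]$
$\alpha_3 = Z A,\ \alpha_5 = X \alpha_4,\ \alpha_9 = \alpha_4 \alpha_4$
$\alpha_6 = A \alpha_3,\ \alpha_7 = Y \alpha_3,\ \alpha_{10} = Y \alpha_9$
$\Rightarrow X_3 = \alpha_6 - 8\alpha_5,\ C = 12\alpha_5 - \alpha_6$
$\alpha_8 = C \alpha_7,\ \alpha_{11} = Z \alpha_4$
$\Rightarrow Y_3 = 2\alpha_8 - 16\alpha_{10},\ Z_3 = 4\alpha_{11}$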
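• Using Projection $(X/Z^2, Y/Z^3)$
In this case, we substitute $(x, y) \Rightarrow (x \to X/Z^2,\ y \to Y/Z^3)$, and
by following the same steps as in the previous projections, we
end up with:
$\alpha_1 = X^2,\ \alpha_2 = Z^2,\ \alpha_5 = Y^2$
$\Rightarrow X + \alpha_2$
$\alpha_3 = \alpha_2\left[X + \alpha_2\right],\ \alpha_6 = X \alpha_5,\ \alpha_7 = \alpha_5 \alpha_5$
$\Rightarrow A = \alpha_1 + 2a\alpha_3$
$\alpha_4 = A^2,\ \alpha_9 = Y Z$
$\Rightarrow X_3 = 9\alpha_4 - 8\alpha_6,\ Z_3 = 2\alpha_9,\ B = 12\alpha_6 - 9\alpha_4$
$\alpha_8 = A B$
$\Rightarrow Y_3 = 3\alpha_8 - 8\alpha_7$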
2) Montgomery Curves: The equation for Montgomery elliptic
curves over GF(p) is defined by:
$$E: by^2 = x^3 + ax^2 + x \tag{6}$$
and the slope (m), where $m = \frac{dy}{dx}$, will be:
$$2by\,\frac{dy}{dx} = 3x^2 + 2ax + 1 \Rightarrow m = \frac{dy}{dx} = \frac{3x^2 + 2ax + 1}{2by}$$
Using this slope equation, we follow the same procedure used to
find the doubling equations for the Tripling Oriented Doche-Icart-Kohel
curves in all three projections. To avoid repetition, we
present only the results for each projection. For all cases, $c = 2b$.
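• Using Projection $(X/Z, Y/Z)$
$\alpha_1 = X^2,\ \alpha_2 = XZ,\ \alpha_3 = Z^2,\ \alpha_4 = Y^2$
$\Rightarrow A = \left[3\alpha_1 + 2a\alpha_2 + \alpha_3\right]$
$\alpha_5 = \alpha_2 \alpha_4,\ \alpha_6 = A^2,\ \alpha_7 = YZ,\ \alpha_8 = \alpha_3 \alpha_4$
$\Rightarrow B = c^2 \alpha_5$
$\Rightarrow D = \alpha_6 - 2B,\ E = 3B - \alpha_6$
$\alpha_9 = D \alpha_7,\ \alpha_{10} = A E,\ \alpha_{11} = \alpha_4 \alpha_8,\ \alpha_{12} = \alpha_8 \alpha_7$
$\Rightarrow X_3 = c\alpha_9,\ Y_3 = \alpha_{10} - c^3 \alpha_{11},\ Z_3 = c^3 \alpha_{12}$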
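• Using Projection $(X/Z, Y/Z^2)$
$\alpha_1 = X^2,\ \alpha_2 = XZ,\ \alpha_3 = Z^2,\ \alpha_4 = Y^2$
$\Rightarrow A = \left[3\alpha_1 + 2a\alpha_2 + \alpha_3\right]$
$\alpha_5 = A^2,\ \alpha_6 = X \alpha_4,\ \alpha_7 = YZ,\ \alpha_8 = \alpha_4 \alpha_4$
$\alpha_9 = Z \alpha_5,\ \alpha_{10} = A \alpha_7,\ \alpha_{11} = Y \alpha_8,\ \alpha_{12} = Z \alpha_4$
$\Rightarrow B = 3c^2 \alpha_6 - \alpha_9,\ X_3 = \alpha_9 - 2c^2 \alpha_6,\ Z_3 = c^2 \alpha_{12}$
$\alpha_{13} = B \alpha_{10}$
$\Rightarrow Y_3 = c\alpha_{13} - c^4 \alpha_{11}$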
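• Using Projection $(X/Z^2, Y/Z^3)$
$\alpha_1 = X^2,\ \alpha_2 = Z^2,\ \alpha_3 = Y^2$
$\alpha_4 = \alpha_2 \alpha_2,\ \alpha_5 = X \alpha_2,\ \alpha_6 = \alpha_3 \alpha_3$
$\Rightarrow A = \left[3\alpha_1 + 2a\alpha_5 + \alpha_4\right]$
$\alpha_7 = A^2,\ \alpha_8 = X \alpha_3,\ \alpha_9 = YZ$
$\Rightarrow B = 3c^2 \alpha_8 - \alpha_7,\ X_3 = \alpha_7 - 2c^2 \alpha_8,\ Z_3 = c\alpha_9$
$\alpha_{10} = B A$
$\Rightarrow Y_3 = \alpha_{10} - c^3 \alpha_6$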
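The ten-product schedule of the last Montgomery projection can be exercised with a short sketch. It checks only that the projective result, mapped back through $(X/Z^2, Y/Z^3)$, agrees with Equation (2) evaluated with the Montgomery slope given above; the sample coordinates are arbitrary, and the prime 97, the parameters a = 5 and b = 3, and the function names are assumptions for illustration.

```python
P_MOD = 97    # small prime, assumed for illustration only
A_PARAM = 5   # curve parameter a, assumed value
B_PARAM = 3   # curve parameter b, assumed value
C = 2 * B_PARAM  # c = 2b, as defined in the text


def double_affine(x, y):
    """Equation (2) with the slope m = (3x^2 + 2ax + 1) / (2by)."""
    m = (3 * x * x + 2 * A_PARAM * x + 1) * pow(C * y, -1, P_MOD) % P_MOD
    x3 = (m * m - 2 * x) % P_MOD
    y3 = (m * (x - x3) - y) % P_MOD
    return x3, y3


def double_projective_z2z3(X, Y, Z):
    """Doubling in the (X/Z^2, Y/Z^3) projection, following the alpha schedule."""
    a1, a2, a3 = X * X, Z * Z, Y * Y
    a4, a5, a6 = a2 * a2, X * a2, a3 * a3
    A = 3 * a1 + 2 * A_PARAM * a5 + a4
    a7, a8, a9 = A * A, X * a3, Y * Z
    B = 3 * C * C * a8 - a7
    X3 = a7 - 2 * C * C * a8
    Z3 = C * a9
    a10 = B * A
    Y3 = a10 - C ** 3 * a6
    return X3 % P_MOD, Y3 % P_MOD, Z3 % P_MOD


def to_affine(P):
    """Map (X, Y, Z) back to (X/Z^2, Y/Z^3)."""
    X, Y, Z = P
    z_inv = pow(Z, -1, P_MOD)
    return X * z_inv ** 2 % P_MOD, Y * z_inv ** 3 % P_MOD
```

Multiplications by the constants $c^2$ and $c^3$ are kept outside the $\alpha$ products here, matching the way the schedule above counts only the ten $\alpha$ terms.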
V. DATA FLOWS
This section presents all the data flows that describe the equations
for point addition/doubling obtained in the previous section.
These diagrams were used to implement the point addition/doubling
operations in hardware using adders and multipliers.
A. Point Additions Data Flows
The point addition operation depends only on the points (X1, Y1,
Z1), (X2, Y2, Z2) and the projection used in the system; it
does not depend on the curve. Based on that, we have three different
data flows for point addition, one for each of the three
projections. The Figures below show these data flows.
B. Point Doubling Data Flows
The point doubling operation depends on the curve and the
projection being used. This is because the slope is the implicit
derivative of the curve itself, and using a different projection will
change the computations of the point doubling.
The following Figures show the point doubling data flows for
Tripling Oriented Doche-Icart-Kohel and Montgomery curves for each
projection.
The point addition operation is independent of curve formula, but
it depends on the projection used. From the Figures, we notice that
the projection (X/Z, Y/Z) gives the best implementation for point
addition operation.
Also, we can use these Figures to find the critical path delay and
the required area for each implementation. For example, the critical
path delay $T_{CPD}$ for point addition with the best implementation
will be:
$$T_{CPD} = 4\,T_{MUL} + 3\,T_{ADR}$$
Fig. 1. Point Addition operation using the projection $(X/Z, Y/Z)$.
Fig. 2. Point Addition operation using the projection $(X/Z, Y/Z^2)$.
Fig. 3. Point Addition operation using the projection $(X/Z^2, Y/Z^3)$.
Fig. 4. Montgomery Curves: Point Doubling operation using the projection $(X/Z, Y/Z)$.
Fig. 5. Montgomery Curves: Point Doubling operation using the projection $(X/Z, Y/Z^2)$.
Fig. 6. Montgomery Curves: Point Doubling operation using the projection $(X/Z^2, Y/Z^3)$.
Fig. 7. Tripling Oriented Curve: Point Doubling operation using the projection $(X/Z, Y/Z)$.
Fig. 8. Tripling Oriented Curve: Point Doubling operation using the projection $(X/Z, Y/Z^2)$.
Fig. 9. Tripling Oriented Curve: Point Doubling operation using the projection $(X/Z^2, Y/Z^3)$.
Moreover, the point multiplication contains two main operations
(point doubling, and point addition). These operations can be per-
formed in parallel, which means that it takes at least 4 sequential
multiplications determined by the point addition operation to do one
iteration of the loop using any one of the curves with all projections.
C. Comparisons & Analysis
Table 1 summarizes and compares the results extracted from the
Figures to compute point doubling operation for the two curves with
three different projective coordinates.
The curves are compared in terms of the following parameters:
1) Parallel Multipliers (PM), and Sequential Multipliers required
(SM).
2) Parallel Adders (PA), and Sequential Adders required (SA).
3) The number of idle components (Idle).
4) The Total number of multiplication operations (TM).
5) The Total number of addition operations (TA).
As can be seen from Table 1, the two curves give results
comparable to those of the standard curves (Short Weierstrass). The
best performance is obtained when using the two curves with
the projection (X/Z, Y/Z).
The area of the design can be found by estimating the area of the
multipliers and adders used in the system. Table 2 shows the Area
(A) comparison for the two curves using the projection (X/Z, Y/Z),
where N is the original number of bits used.
On the other hand, the computation time can be estimated as the
number of cycles multiplied by the clock cycle period. The number
of clock cycles for all designs depends completely on the operand
size. We assume the operands are up to 256 bits, which gives a
high level of security in ECC when compared to other well-known
public-key cryptosystems such as RSA [4]. Table 3 shows the Time
(T) comparison for the two curves using the projection (X/Z, Y/Z).
TABLE I
Comparing the point doubling results for the two curves
Curve Name            Formula                   Projection     PM  PA  SM  SA  Idle    TM  TA
Tripling Oriented     Y^2 = X^3 + 3a(X+1)^2     X/Z, Y/Z       4   2   3   3   2A      12  6
                                                X/Z, Y/Z^2     3   2   4   4   1M,3A   11  6
                                                X/Z^2, Y/Z^3   3   2   4   4   3M,3A   9   6
Montgomery curves     bY^2 = X^3 + aX^2 + X     X/Z, Y/Z       4   2   3   3   2A      12  7
                                                X/Z, Y/Z^2     4   2   4   3   3M,2A   13  5
                                                X/Z^2, Y/Z^3   3   2   4   3   2M,2A   10  5
Short Weierstrass     Y^2 = X^3 + aX + b        X/Z, Y/Z       4   2   4   3   2M,1A   14  5
(Standard Curves)                               X/Z, Y/Z^2     3   2   4   4   3M,3A   15  5
                                                X/Z^2, Y/Z^3   3   2   4   4   4M,3A   14  5
TABLE II
Area comparison for the two curves using the projection (X/Z, Y/Z)
Curve Name            Adders' Area   Multipliers' Area   Total Area
Montgomery Curves     2(8N+34)       4(4N+15)            32N + 128
Tripling Oriented     2(8N+42)       4(4N+19)            32N + 160
TABLE III
Time comparison for the two curves using the projection (X/Z, Y/Z)
Curve Name            Adders' Time   Multipliers' Time   Total Time
Montgomery Curves     3(8N+34)       3(4N+15)^2          48N^2 + 384N + 777
Tripling Oriented     3(8N+42)       3(4N+19)^2          48N^2 + 456N + 1083
Finally, let us take the Area*Time (AT) product as a measure that
combines area and time. Lower AT values indicate a better design and
higher performance. Table 4 shows the AT characteristics for the
two curves using the projection (X/Z, Y/Z). The values in Table
4 are plotted in Figure 10. As the Figure shows, the AT
characteristics of the two curves are very close.
TABLE IV
AT characteristics for the two curves
Curve Name            AT
Montgomery Curves     1536N^3 + 18432N^2 + 74016N + 99456
Tripling Oriented     1536N^3 + 22272N^2 + 107616N + 173280
Fig. 10. AT characteristics for the two different curves
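The AT polynomials in Table 4 are the products of the Total Area and Total Time entries of Tables 2 and 3. The short sketch below reproduces them by polynomial multiplication; the coefficient lists are copied from those tables as printed, and the helper names `poly_mul` and `evaluate` are introduced here only for illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def evaluate(poly, n):
    """Evaluate a coefficient list (lowest degree first) at operand size n."""
    return sum(coef * n ** k for k, coef in enumerate(poly))


# Total Area (Table 2) and Total Time (Table 3), lowest degree first.
AREA = {"Montgomery": [128, 32], "Tripling Oriented": [160, 32]}
TIME = {"Montgomery": [777, 384, 48], "Tripling Oriented": [1083, 456, 48]}

# AT = Area * Time for each curve, as in Table 4.
AT = {name: poly_mul(AREA[name], TIME[name]) for name in AREA}

if __name__ == "__main__":
    for name, poly in AT.items():
        print(name, poly, evaluate(poly, 256))
```

Both products share the leading term $1536N^3$, so the gap between the two curves comes only from the lower-order terms and shrinks in relative size as N grows.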
D. Proposed Design
The proposed design uses a sequential multiplier with a redundant
Carry Save Adder (CSA) as the basic building block. The basic idea
behind using a CSA is to add three binary vectors
using an array of 1-bit adders (full adders) without propagating the
carry [15]. As shown in Figure 11, the output is represented by two
binary vectors: a carry vector (Vc) and a sum vector (Vs).
Fig. 11. The internal design of the Carry Save Adder-CSA
In this implementation, the multiplication is done by
the shift-and-add sequential multiplication procedure. The execution
takes n cycles, and each cycle delay is the sum of the delay
of digit multiplication (one digit of the multiplier multiplied by the
whole multiplicand), the delay of addition, and the register delay, which
yields the execution time (T):
$$T = n\,(T_{digmult} + T_{add} + T_{reg}) \tag{7}$$
The execution time can be reduced if a redundant adder is used (a CSA,
for example). The sequential multiplier with a redundant CSA adder is
shown in Figure 12. From the Figure we can notice the following:
• Two registers are used: one is a shift register that
holds the Multiplier (Y), and the other holds the
Multiplicand (X). The shift register performs a 1-bit shift right
each clock cycle.
• The Multiple Generator circuit generates the multiples by
performing the AND operation between the Least Significant Bit
(LSB) of Y and the whole vector of X.
• Each generated multiple is extended to be compatible with the
CS Adder by using the sign extension circuit.
• The CSA adder uses [3:2] reduction adders, and the result
of the multiplication is in redundant form with two vectors
(Ps, Pc). The final result can be converted to conventional
form by using a Carry Lookahead Adder (CLA).
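The behavior described in the list above can be modeled in a few lines of software. This is a bit-level sketch under simplifying assumptions, not the VHDL design: operands are unsigned Python integers (so the sign extension circuit is omitted), the multiple is shifted left by the cycle index instead of shifting the accumulated product right, and the function names `csa` and `sequential_multiply` are hypothetical.

```python
def csa(a, b, c):
    """[3:2] carry-save adder on integers used as bit vectors.

    Returns (vs, vc) with a + b + c == vs + vc. Each bit position is an
    independent full adder, so no carry propagates along the word.
    """
    vs = a ^ b ^ c                               # per-bit sum
    vc = ((a & b) | (a & c) | (b & c)) << 1      # per-bit carry, weight 2
    return vs, vc


def sequential_multiply(x, y, n):
    """Shift-and-add multiplication of x by the n-bit multiplier y.

    The running product is kept in redundant form (ps, pc); a single
    carry-propagate addition at the end plays the role of the CLA.
    """
    ps, pc = 0, 0
    for i in range(n):
        multiple = x if (y & 1) else 0           # AND of LSB(Y) with all of X
        ps, pc = csa(ps, pc, multiple << i)      # accumulate without carry chain
        y >>= 1                                  # shift register: 1-bit right
    return ps + pc                               # convert to conventional form
```

Because the per-cycle addition in this model has no carry chain, the $T_{add}$ term of Equation (7) becomes the delay of one full adder regardless of the operand width, which is the motivation for the redundant form.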
Fig. 12. The internal design of the Sequential Multiplier
VI. EXPERIMENTAL RESULTS
This section presents the experimental results obtained by
implementing the best designs for Elliptic Curve computations over
GF(p) with the best projection [15], [29]. The proposed best designs
(Montgomery and Tripling Oriented curves with the projection (X/Z, Y/Z)) were
described in VHDL, simulated using ModelSim, and synthesized
using the Xilinx synthesis tool, with the XC5VLX30 Virtex-5 FPGA chip
as the target [15], [21].
Tables 5 and 6 show the critical path delays (in nano-seconds)
and the Area (as a number of logic gates) of the best designs
to compute Elliptic Curve point doubling using Montgomery and
Tripling Oriented curves, respectively, for the precision range from
16 to 512 bits.
TABLE V
Area and Delay design results for Montgomery curves
Precision (bits)    16     32     64     128     256     512
Area (gates)        2284   4076   7660   14828   29164   57836
Time (ns)           12.7   13.4   14.2   14.0    13.8    14.4
TABLE VI
Area and Delay design results for Tripling Oriented curves
Precision (bits)    16     32     64     128     256     512
Area (gates)        2356   4148   7732   14900   29236   57908
Time (ns)           13.0   13.7   14.5   14.3    14.2    14.6
From Tables 5 and 6, we can see that the minimum delay occurs
at a precision of 16 bits. As the operand precision increases, the
delay increases slightly and saturates at higher precisions. The
area increases linearly with the operand size.
The time needed to compute one sequential multiplication can be
computed by:
$$T_{mult} = (\text{cycles/bit}) \cdot n \cdot \text{clock period} \tag{8}$$
The computation of the scalar multiplication operation (kP) using the
RTL algorithm (double-and-add method) [8] requires 4 sequential
multiplications for point addition, 3 sequential multiplications for
point doubling, and an inversion operation to transform the results
from projective to affine coordinates. The time needed to compute
the modular inversion is estimated by $T_{inv} = 3\,T_{mult}$ [26].
Based on the experimental results, we
can approximate the time to perform the point scalar multiplication
operation to be around 20 ms. Comparisons with other designs
could be inconsistent because of the use of different designs,
Galois fields, and device technologies [15]-[19]. For example, the
FPGA implementation of the Elliptic Curve processor over GF(2^m)
presented in [14] takes about 80.3 ms to compute a scalar point
multiplication at an operand size of 163 bits, which is very slow compared
to our design.
The design proposed in [27] operates at 66 MHz. Our design
performs about 10 times more point multiplication operations. The
crypto-processor in [20] performs 700 point multiplications per
second on a 192-bit prime field. Our design is about 1.6 times faster
than this design [20].
On the other hand, the Area can be approximated from Tables 5
and 6 as follows (it increases linearly with the operand size):
$AREA_{Mon} \approx 112.9\,n + 460 = O(n)$ gates
$AREA_{Tri} \approx 113.1\,n + 540 = O(n)$ gates
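The linear trend can be checked directly from the tabulated gate counts with an ordinary least-squares fit. The sketch below is only an illustration: the data lists are copied from Tables 5 and 6, and `linear_fit` is a helper written for this example. For these particular data points the fit gives a slope of 112 gates per bit for both curves, with intercepts of 492 (Montgomery) and 564 (Tripling Oriented), in line with the O(n) approximations above.

```python
BITS = [16, 32, 64, 128, 256, 512]
AREA_MONTGOMERY = [2284, 4076, 7660, 14828, 29164, 57836]   # Table 5
AREA_TRIPLING = [2356, 4148, 7732, 14900, 29236, 57908]     # Table 6


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    for name, area in (("Montgomery", AREA_MONTGOMERY),
                       ("Tripling Oriented", AREA_TRIPLING)):
        slope, intercept = linear_fit(BITS, area)
        print(f"{name}: area ~ {slope:.1f} * n + {intercept:.0f} gates")
```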
VII. CONCLUSIONS
In this paper, we proposed efficient curves (Doche-Icart-Kohel
curves and Montgomery curves) to be used in Elliptic Curve
Cryptography computations over GF(p) using new projective coordinate
systems. The obtained data flows showed the required number of
multiplications and the level of parallelism that can be extracted to
achieve higher performance.
Comparisons between the proposed curves using different
projective coordinates were carried out to identify the best
curve/projection combination to implement. The best results
are obtained when using Montgomery or Tripling Oriented curves
with the (X/Z, Y/Z) projection.
Hardware designs that implement the best combinations were
proposed. The designs use Carry Save Adders to avoid carry
propagation and thus reduce the critical path delay. The designs were
described in VHDL and simulated, then synthesized for the
target Virtex-5 FPGA chip to obtain Area and Delay results. The
experimental results were found to be very competitive with the results
of many other designs in this field.
Acknowledgement
The authors would like to thank Jordan University of Science
and Technology and the Scientific Research Support Fund at the
Ministry of Higher Education in Jordan for supporting this research.
References
[1]Menezes, A.J., P.C. Van Oorschot, And S.A. Vanstone, ”Handbook Of
Applied Cryptography”, CRC Press, Boca Raton, Florida, 1996.
[2]Arto Salomaa, ”Public Key Cryptography”, Springer-Verlag Berlin
Heidelberg, Second, Enlarged Edition With 22 Figures, Printed In
Germany, May 1996.
[3]Wade Trappe, And Lawrence C. Washington, ”Introduction To Cryptog-
raphy With Coding Theory,” By Prentice Hall, 2002, 1: 1-176.
[4]Darrel Hankerson, Alfred Menezes, Scott Vanstone, ”Guide To Elliptic
Curve Cryptography,” Springer-Verlag New York, Inc., 175 Fifth
Avenue, New York, NY 10010, USA, 2004.
[5]E. Savas, C. K. Koc ”The Montgomery Modular Inverse - Revisited”
IEEE- Transactions On Computers, July 2000, 49.
[6]Adnan Gutub, ”Efficient Utilization of Scalable Multipliers in Parallel
to Compute GF(p) Elliptic Curve Cryptographic Operations”, Kuwait
Journal of Science & Engineering (KJSE), December 2007, 34(2): 165-
182.
[7]Hakim Khali, MIEEE And Ahcene Farah,”Cost-Effective Implementa-
tions Of GF(P) Elliptic Curve Cryptography Computations,” Ijcsns In-
ternational Journal Of Computer Science And Network Security, August
2007, 7(8).
[8]Adnan Gutub, Mohammad Ibrahim, and Turki Al-Somani, ”Paralleliz-
ing GF(P) Elliptic Curve Cryptography Computations for Security and
Speed”, IEEE International Symposium on Signal Processing and its
Applications in conjunction with the International Conference on In-
formation Sciences, Signal Processing and their Applications (ISSPA),
Sharjah, United Arab Emirates, February 12-15, 2007.
[9]Daniel J. Bernstein And Tanja Lange, ”Faster Addition And Doubling
On Elliptic Curves,” Springer Berlin / Heidelberg, Supported In Part By
The European Commission Through The Ist Programme,November 05,
2007, 4833.
[10]L. Tawalbeh, ”A Novel Unified Algorithm And Hardware Architecture
For Integrated Modular Division And Multiplication In GF(P) And
GF(2N) Suitable For Public-Key Cryptography.”, Ph.D. Thesis, School Of
Electrical Engineering And Computer Science, Oregon State University,
October 28, 2004.
[11]Gerald Lai, ”Analysis Of Modular Inverse Gf(P) Implementations,” IEEE
Trans. Inform. School Of Electrical Engineering And Computer Science,
Oregon State University, Corvallis, Oregon 97331, 2004.
[12]L. Tawalbeh And A. Tenca, ”An Algorithm And Hardware Architecture
For Integrated Modular Division And Multiplication In GF(P) And
GF(2N),” In the IEEE 15th International Conference on Application-
specific Systems, Architectures and Processors (ASAP) Sept., 2004.
[13]Adnan Gutub, ”High Speed Hardware Architecture to Compute GF(p)
Montgomery Inversion with Scalability Features”, IET (IEE) Proceedings
Computers and Digital Techniques, July 2007, 1: 389-396.
[14]G. Orlando and C. Paar, ”Implementation of elliptic curve cryptographic
Coprocessor over GF(2m) on FPGA,” in Cryptographic Hardware and
Embedded Systems CHES 2000, C . K. Koc and C. Paar, eds., Lecture
Notes in Computer Science, Springer, Berlin, Germany, 2000, 1(2162):
25-40.
[15]Qasem Saleh Abu Al-Haija , ”Efficient Algorithms For Elliptic Curve
Cryptography Using New Coordinates System”, Master Thesis, Computer
Engineering Department, Jordan University of Science and Technology,
discussed in 28/Dec/2009.
[16]Mathias Schmalisch and Dirk Timmermann, ”Comparison Of Algo-
rithms For Elliptic Curve Cryptography Over Finite Fields Of GF(2m),”
IASTED International Conference on Communication, Network, and
Information Security; New York, NY; USA; 10-12 Dec. 2003: 136-140.
[17]Turki F. Al-Somani, ”Performance Evaluation of Elliptic Curve Projective
Coordinates with Parallel GF(p) Field Operations and Side-Channel
Atomicity,” JOURNAL OF COMPUTERS, JANUARY 2010, 5(1).
[18]N. P. Smart, ”A comparison of different finite fields for use in elliptic
curve Crypto-systems,” University of Bristol,Department Of Computer
Science, CSTR Press 00-007, June 2000.
[19]Andrew Odlyzko, ”Discrete logarithms: The past and the future,” AT&T
Labs - Research July 19, 1999.
[20]Berna Ors and C.Koc, ”FPGA Implementation of an Elliptic Curve
Crypto-system over GF(3m),”, Proceedings of the International Confer-
ence on Reconfigurable Computing and FPGAs, 2008.
[21]XILINX Company, ”Virtex-5 Family Overview,” www.xilinx.com, Prod-
uct Specification, DS100 (v5.0) February 6, 2009.
[22]Stallings, W. Cryptography and Network Security: Principles and Prac-
tice, Second Edition, Prentice Hall Inc., New Jersey, 1999.
[23]Okada, Torii, Itoh, and Takenaka, ”Implementation of Elliptic Curve
Cryptographic Coprocessor over GF(2m) on an FPGA”, Workshop on
Cryptographic Hardware and Embedded Systems, CHES 2000, Mas-
sachusetts, August 2000.
[24]Diffie, and Hellman, ”New Directions on Cryptography”, IEEE Trans. on
Information Theory, November 1976, 22: 644-654.
[25]Kobayashi, and Morita, ”Fast Modular Inversion Algorithm to Match Any
Operation Unit”, IEICE Trans. Fundamentals, May 1999, E82-A(5): 733-
740.
[26]Loai Tawalbeh, et al., ”An Efficient Hardware Architecture of a Scalable
Elliptic Curve Crypto-Processor Over GF(2n)”, Proceedings of the SPIE
(The International Society for Optical Engineering), Sep 2005, 5910: 216-
226.
[27]Hans Eberle, Nils Gura, Sheueling Chang Shantz, Vipul Gupta, Leonard
Rarick, and Shreyas Sundaram, ”A public-key cryptographic processor
for RSA and ECC”, In ASAP 04: Proceedings of the Application-
Specific Systems, Architectures and Processors, 15th IEEE International
Conference on (ASAP04), Texas, USA, 2004: 98-110.
[28]Akashi Satoh and Kohji Takano, ”A scalable dual-field elliptic curve
cryptographic processor”. IEEE Transactions on Computers, 2003: 449-
460.
[29]Qasem Abu Al-Haija and Mohammad Al-Khatib, ”Parallel Hardware
Algorithms and Designs for Elliptic Curves Cryptography to Improve
Point Operations Computations”. Accepted for publication at Journal of
Information Assurance and Security (JIAS), By Dynamic Publishers Inc.,
USA, April 2010, Vol.4, Issue 1, Paper6: (588-594).