Available via license: CC BY 4.0
arXiv:2107.08473v1 [cs.DS] 18 Jul 2021
Elliptic Curve Fast Fourier Transform (ECFFT) Part I:
Fast Polynomial Algorithms over all Finite Fields
Eli Ben-Sasson∗ Dan Carmon∗ Swastik Kopparty† David Levit∗
July 20, 2021
Abstract
Over finite fields F_q containing a root of unity of smooth order n (smoothness means n is the product
of small primes), the Fast Fourier Transform (FFT) leads to the fastest known algebraic algorithms
for many basic polynomial operations, such as multiplication, division, interpolation and multi-point
evaluation. These operations can be computed by constant fan-in arithmetic circuits over F_q of
quasi-linear size; specifically, O(n log n) for multiplication and division, and O(n log^2 n) for
interpolation and evaluation.
However, the same operations over fields with no smooth order root of unity suffer from an asymptotic
slowdown, typically due to the need to introduce “synthetic” roots of unity to enable the FFT. The
classical algorithm of Schönhage and Strassen [SS71] incurred a multiplicative slowdown factor of log log n
on top of the smooth case. Recent remarkable results of Harvey, van der Hoeven and Lecerf [HvdHL17,
HvdH19a] dramatically reduced this multiplicative overhead to exp(log*(n)).
We introduce a new approach to fast algorithms for polynomial operations over all large finite fields.
The key idea is to replace the group of roots of unity with a set of points L ⊂ F_q suitably related to
a well-chosen elliptic curve group over F_q (the set L itself is not a group). The key advantage of this
approach is that elliptic curve groups can be of any size in the Hasse–Weil interval
[q + 1 − 2√q, q + 1 + 2√q] and thus can have subgroups of large, smooth order, which an FFT-like
divide and conquer algorithm can exploit. Compare this with multiplicative subgroups over F_q, whose
order must divide q − 1. By analogy, our method extends the standard, multiplicative FFT in a similar
way to how Lenstra’s elliptic curve method [Len87] extended Pollard’s p − 1 algorithm [Pol74] for
factoring integers.
For polynomials represented by their evaluation over subsets of L, we show that multiplication,
division, degree-computation, interpolation, evaluation and Reed–Solomon encoding (also known as
low-degree extension) with fixed evaluation points can all be computed with arithmetic circuits of size
similar to what is achievable with the classical FFTs when the field size q is special. For several
problems, this yields the asymptotically smallest known arithmetic circuits even in the standard
monomial representation of polynomials.
The efficiency of the classical FFT follows from using the 2-to-1 squaring map to reduce the evaluation
set of roots of unity of order 2^k to similar groups of size 2^{k−i}, i > 0. Our algorithms operate similarly,
using isogenies of elliptic curves with kernel size 2 as 2-to-1 maps to reduce L of size 2^k to sets of size
2^{k−i} that are, like L, suitably related to elliptic curves, albeit different ones.
1 Introduction
The rocket fuel that powers modern fast algorithms for polynomial algebra is the Fast Fourier Transform
(FFT). The original FFT, due to Cooley–Tukey [CT65]¹, is a divide-and-conquer algorithm that evaluates
a polynomial P(X) = Σ_{i<n} a_i X^i ∈ C[X], given by its sequence of coefficients (a_0, ..., a_{n−1}), on the nth
roots of unity in C. It does so using O(n log n) arithmetic operations over C whenever n is an integer power
of 2, or more generally, when n is a smooth number, that is, a product of O(1)-sized primes. This immediately
enables O(n log n) time multiplication of polynomials of degree < n/2: by evaluation at the nth roots of
unity, pointwise multiplication of these evaluations, and then interpolation from the nth roots of unity via
the inverse FFT (iFFT) algorithm. Polynomial multiplication turns out to be the crucial operation for a
wide variety of other algorithmic problems of polynomial algebra. See the books [vzGG13, BCS97] for a
taste of the impact of the FFT on computer algebra.
∗ StarkWare Industries Ltd. {eli,dancar,david}@starkware.co
† Department of Mathematics and Department of Computer Science, University of Toronto. Research supported in part by
NSF grants CCF-1540634 and CCF-1814409, at Rutgers University. swastik.kopparty@gmail.com
¹ The history of this algorithm is much longer, and dates back to Gauss, see [HJB85].
Over finite fields F_q, these ideas generalize to some extent [Pol71]. Define M_q(n) to be the number of F_q
operations needed for the fastest algorithm over F_q which takes as input the coefficients of two polynomials
in F_q[X] of degree < n, and returns the coefficients of their product. Using the same FFT algorithm, if F_q
contains an nth root of unity for smooth n, we have M_q(n) = O(n log n). More generally, we get the same
upper bound on M_q(n) even if a bounded degree extension field F_{q^{O(1)}} contains such a root of unity which
generates a multiplicative subgroup of smooth order. However, most finite fields are not “special” in this
way, which raises the following well-known open problem:
Open Question 1: Does the bound M_q(n) = O(n log n) hold for all prime powers q and all n?
Until recently, the best general upper bound on M_q(n) was the classical result of Schönhage and Strassen [SS71]
(see also Schönhage [Sch77] and Cantor–Kaltofen [CK91]), who showed that:
M_q(n) = O(n log n log log n).
This algorithm involves introducing a synthetic root of unity and recursively running FFTs over more general
rings. The algorithm is inspired by, and closely mirrors, the classical (Boolean) algorithm of Schönhage and
Strassen for integer multiplication, which shows that M_Z(n), the Boolean circuit complexity of multiplying
two n-bit integers presented in base 2, satisfies:
M_Z(n) ≤ O(n log n log log n).
Remark 1.1 (Computational Model). Unless explicitly specified otherwise, we use the word “algorithm”
to mean an algebraic algorithm that uses only field operations and field constants. In particular, we do
not consider any precision issues or the cost of computing the constants used by the computation. This
computational model is more commonly known as an arithmetic circuit or a straight-line program. When we
refer to the running time of such an algorithm, we mean the size of the straight-line program or arithmetic
circuit, which means we assign unit computational cost to each arithmetic operation over the ambient field.
As in the case of C, the best known algorithms for a wide variety of algorithmic problems of polynomial
algebra over F_q depend on polynomial multiplication over F_q, and thus their running time depends on M_q(n).
Of particular interest are the following classical results.
1. Horowitz [Hor72b, Hor72a] gave an algorithm for polynomial interpolation at n points with preprocessing
in time O(M_q(n) log^2 n). Here we are given a subset B of F_q of size n and a function f : B → F_q, and
after doing arbitrary preprocessing of B, we want to compute the coefficients of the unique polynomial
of degree < n that interpolates f.
2. In the above mentioned paper, Horowitz [Hor72b] presented a fast algorithm for evaluating all elementary
symmetric polynomials over n variables on a specific input (α_1, ..., α_n) in time O(M_q(n) log n).
3. Subsequently, Borodin and Moenck [BM74] improved Horowitz’s algorithm and gave an algorithm for
polynomial interpolation at n points without preprocessing in time O(M_q(n) log n).
4. Along the way, Borodin and Moenck [BM74] also showed how to do multi-point evaluation of degree
< n polynomials at n arbitrary points in time O(M_q(n) log n).
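To make the dependence on polynomial multiplication concrete, the following minimal Python sketch computes all elementary symmetric polynomials of (α_1, ..., α_n) by a divide-and-conquer product tree: they are the coefficients of ∏_i (1 + α_i X). This is an illustration in the spirit of item 2, not a transcription of the algorithm in [Hor72b]; the prime `P = 101`, the function names, and the schoolbook `poly_mul` are assumptions made for the example. Swapping `poly_mul` for a multiplication routine of cost M_q(n) would give a recursion of the form T(n) = 2T(n/2) + M_q(n).

```python
# Sketch only: P = 101 and all function names are illustrative assumptions.
P = 101

def poly_mul(a, b):
    """Schoolbook product of coefficient lists (lowest degree first) mod P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

def product_tree_root(alphas):
    """Coefficients of prod_i (1 + alpha_i * X), computed by divide and conquer."""
    if len(alphas) == 1:
        return [1, alphas[0] % P]
    mid = len(alphas) // 2
    return poly_mul(product_tree_root(alphas[:mid]),
                    product_tree_root(alphas[mid:]))

def elementary_symmetric(alphas):
    """Return [e_0, e_1, ..., e_n] mod P; e_k is the coefficient of X^k."""
    return product_tree_root(alphas)
```

For instance, `elementary_symmetric([1, 2, 3])` expands (1 + X)(1 + 2X)(1 + 3X) and reads off the coefficients.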
In recent years, there have been some remarkable advances in our understanding of the complexity of
multiplying polynomials over finite fields. These advances closely track breakthroughs on the fundamental
problem of understanding the complexity of multiplying integers in the Boolean circuit or (multi-tape)
Turing Machine model. The starting point for all these recent advances was the result of Fürer [Für07] (see
also [DKSS08]) who showed that M_Z(n) = O(n log n · 2^{O(log* n)}). Soon after, Harvey, van der Hoeven and
Lecerf [HvdHL17] simplified and improved the constant in the exponent in Fürer’s bound on M_Z(n), while
also developing an F_q-analogue of this algorithm to show that M_q(n) = O(n log n · 2^{O(log* n)}). Harvey and
van der Hoeven [HvdH19a] further improved the constant in the exponent in the bound on M_q(n).
Finally, Harvey and van der Hoeven [HvdH21] proved the breakthrough M_Z(n) = O(n log n), settling a
long-standing conjecture. There they discussed the reasons why their results do not extend to a similar bound
on M_q(n). Nevertheless, their results do imply (via Kronecker substitutions, see Section 1.2 of [HvdH19a])
that multiplication of degree n polynomials over F_q, for n = q^{O(1)}, can be done in time
O(n log q (log n + log log q)) in the Turing machine model, which seems to be as good a bound as one can
hope to deduce in the Turing Machine model from the conjectured M_q(n) = O(n log n).
Returning to M_q(n), Harvey and van der Hoeven showed in [HvdH19b, Theorem 9.2], which is a
companion paper to [HvdH21], that under a number theoretic conjecture on the least prime in arithmetic
progressions, M_q(n) is indeed O(n log n).
Summarizing, the recent wave of results comes extremely close to answering Open Question 1
unconditionally, but we are not quite there yet.
1.1 Our Results
The main contribution of our paper is a new approach to fast polynomial algorithms via a new polynomial
representation that works over all large finite fields. The approach is very closely related to the classical FFT
algorithm, but instead of working with subgroups of F_q of smooth order (be they multiplicative or additive),
it works with elliptic curve groups with large, smooth order subgroups, which exist for all F_q.
Our approach is unrelated to all the recent results mentioned above, and unconditionally yields some
new results that would follow if M_q(n) = O(n log n) were true.
The new representation for polynomials suggested here is essentially the evaluation tables of the
polynomials at carefully chosen subsets of F_q. These sets are related to some subgroup of some elliptic curve
over F_q. This is the analogue of taking multiplicative/additive subgroups of large, smooth order, which is
only possible when q is special: either a power of a constant prime or such that q − 1 is divisible by a large
smooth factor.
In the classical multiplicative subgroup based FFT, we can convert the evaluation table representation
into the standard coefficient representation in time O(n log n) via the classical inverse FFT. Unfortunately,
in our elliptic curve group case, we do not know how to do this conversion as fast. What we can do
instead is to quickly extend the evaluation of the polynomial on our chosen subset S to another subset S′
of F_q. This is the analogue of using a combination of FFT and inverse-FFT (with some scaling) to use
the evaluations of some low degree polynomial at a multiplicative subgroup S to deduce the evaluations
of that low degree polynomial at some coset of S. In fact, the way we compute the low degree extension
to the subset S′ is also a combination of some FFT-like transform (which we call the ECFFT) and the
inverse transform. It just so happens that the intermediate representation, i.e., the result of our
iFFT-analogue, is not the standard monomial expansion of the polynomial, but some other representation. In
this respect, our approach resembles the additive FFT-like transforms of [LCH14], which also lead to
non-monomial representations supporting fast operations (see also [GM10, Can89]); however, their algorithms
have S, S′ being additive subgroups of F_q, and require F_q to have constant characteristic to have O(n log n)
running time.
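The classical analogue mentioned above (inverse FFT, scaling, then FFT) can be sketched in a few lines of Python. The sketch below works in F_17, where 17 − 1 = 16 is a power of 2, takes S to be the 8th roots of unity, and extends evaluations from S to the coset 3·S. The constants `P`, `OMEGA`, the shift 3, and all function names are assumptions chosen for illustration; this is the classical multiplicative procedure, not the ECFFT itself.

```python
# Sketch only: the field F_17, OMEGA = 9 (an element of order 8) and the
# function names are illustrative assumptions.
P = 17
OMEGA = 9

def evaluate(coeffs, x):
    """Horner evaluation of sum_i coeffs[i] * x^i mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def fft(coeffs, omega):
    """Values at 1, omega, omega^2, ...; omega must have order len(coeffs)."""
    n = len(coeffs)
    if n == 1:
        return [coeffs[0] % P]
    even = fft(coeffs[0::2], omega * omega % P)
    odd = fft(coeffs[1::2], omega * omega % P)
    out = [0] * n
    x = 1
    for i in range(n // 2):
        out[i] = (even[i] + x * odd[i]) % P            # point omega^i
        out[i + n // 2] = (even[i] - x * odd[i]) % P   # point -omega^i
        x = x * omega % P
    return out

def inverse_fft(values, omega):
    """Recover coefficients from values at the powers of omega."""
    n_inv = pow(len(values), P - 2, P)
    omega_inv = pow(omega, P - 2, P)
    return [v * n_inv % P for v in fft(values, omega_inv)]

def extend_to_coset(values_on_s, shift):
    """From values on S = <OMEGA>, return the values on shift * S."""
    coeffs = inverse_fft(values_on_s, OMEGA)
    scaled = [c * pow(shift, i, P) % P for i, c in enumerate(coeffs)]
    return fft(scaled, OMEGA)
```

Here the intermediate representation is the monomial coefficient vector; in the elliptic curve setting described above, the corresponding intermediate object is a different, non-monomial representation.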
We systematically exploit the above-mentioned fast algorithm for extending polynomial evaluations on
special sets to develop fast algorithms² for a variety of polynomial computation problems, giving the following
results, defined formally in Section 6:
1. When polynomials of degree less than n over F_q, n ≤ q^{O(1)}, are represented as evaluations over special
sets, the following three operations can all be done in time O(n log n):
(a) addition,
(b) multiplication, and
(c) degree computation³.
Note that addition trivially takes O(n) time for polynomials evaluated on any set of points, as does
multiplication of polynomials whose degrees sum to less than n; the crux here is that polynomials
can still be multiplied in quasi-linear time even if their product has degree above n, by extending the
evaluations to a larger set, supporting higher degrees. Degree computation is also non-trivial, as the
polynomials are not represented directly by their coefficients.
As far as we know, this is the only known representation of polynomials that allows all the above three
operations to be computed in O(n log n) algebraic operations for general q and n ≤ q^{O(1)}.
A folklore question, which was recently resolved by the breakthrough on integer multiplication [HvdH21],
asked to find a representation of integers that supports addition, multiplication and comparison in
O(n log n) time. Our result can be viewed as a positive answer to the analogous question for
polynomials over arbitrary finite fields.
2. We develop fast algorithms for other basic operations on these representations, such as division with
remainder and Chinese remaindering, modulo fixed polynomials.
3. Converting between our new representation and the standard representation of polynomials by their
monomial coefficients (in both directions) can be done in time O(n log^2 n).
Armed with these tools for working with polynomials in the new representation, we get the following new
results for classical problems that have nothing to do with the new representation. All these results improve
on the state of the art by a multiplicative exp(log* n) factor, and are consequences of the conjectured bound
M_q(n) = O(n log n). See Section 7 for the formal statements.
1. We give an O(n log^2 n) time algorithm to evaluate all n elementary symmetric polynomials on n
inputs, provided n ≤ q^{O(1)}. It was not known how to do this in general for all n ≤ q^{O(1)} even for the
computation of just the n/2-th elementary symmetric polynomial.
2. Given an arbitrary set B of n points, we give an O(n log^2 n) time algorithm for interpolating a
polynomial (and representing it in the standard monomial basis) from its evaluation on B (we allow
preprocessing based on B).
3. We give an O(n log^2 n) time algorithm for multi-point evaluation of a degree < n polynomial at an
arbitrary set B of n points (here, too, we allow preprocessing based on B).
4. Combining the above two results, we get an O(n log^2 n) time algorithm for computing low-degree
extensions of a function evaluated at n arbitrary points to n other arbitrary points. The two sets of points
are assumed to be known in advance, and preprocessed to derive constants used by the algorithm.
We believe this representation will have further uses in the development of fast algorithms for polynomial
algebra. The most compelling question here is whether these methods can improve the bound on M_q(n)
itself. It is also interesting to see if we can do away with the need for preprocessing in the above algorithms.
² We remind the reader that the model of computation is algebraic circuits (and for one problem, algebraic decision trees),
where the circuit may depend arbitrarily on n and q. The preprocessing cost of setting up this circuit for a given n or q, which
in our case involves searching for a suitable elliptic curve, is not included in the complexity bounds. Under standard number
theoretic heuristics, this preprocessing can be done by a randomized Turing machine in O(n · poly(log n, log q)) time. Details
will appear in [BCKL21].
³ The formal model for this is Algebraic Decision Tree (since the output is an integer), and by “running time” for this model
we mean the depth of this tree.
Further applications in Part II [BCKL21]: The applications of FFT-like divide and conquer for
polynomials are not limited to the design of fast algorithms. In a sequel to this paper (which is oriented
towards applied cryptography), we explore applications of the Elliptic Curve based Fast Fourier Transforms
to interactive oracle proofs (IOPs), IOPs of proximity (IOPPs) for algebraic geometry codes and scalable
transparent arguments of knowledge (STARK) systems, generalizing the use of the standard FFT in PCPPs
for Reed–Solomon codes [BS08], the FRI protocol for proving proximity to Reed–Solomon codes [BBHR18],
and the STARK protocol and analogous transparent IOP based proof systems for verifying general
computation [BBHR19, BCR+19, COS20, Sta21]. Because of applications of the latter two to cryptography in
the real world, where the natural field of definition of the problems is specified by external sources, there is
a natural need to prove computational integrity statements about computations of length n executed over
specific finite fields with q ≫ n. For example, the q used in the ECDSA algorithm that is part of the Bitcoin
standard is such that q − 1 has no large smooth factor, and this is also the case for any q which is a “safe
prime”, which means that (q − 1)/2 is a large prime. Indeed, such examples were the original motivation for
looking for generalizations of FFTs to all fields, and resolving it requires a deeper scrutiny of the ECFFT,
used here only as a “black-box”, and several other ideas.
1.2 ECFFT – Informal Explanation
The standard FFT algorithm exploits the structure of the group of 2^k-th roots of unity and its subgroups,
using the squaring map x ↦ x^2 to simultaneously (i) project the group of size n to a subgroup of half the size
and (ii) split a polynomial of degree n into two polynomials of half the degree, expressed using the squaring
map.
Let n = 2^k, and suppose we are working in a field F which contains all n of the nth roots of 1. Let L^(0) ⊆ F
denote all the nth roots of 1, assuming we wish to represent polynomials of degree < n by evaluating them on
L^(0). Let ψ(X) = X^2 be the squaring map. For each i, let L^(i+1) = ψ(L^(i)). Thus L^(i) is the set of
(n/2^i)-th roots of unity in F and ψ is a 2-to-1 map of degree 2 from L^(i) onto L^(i+1). Thus far we have
described how ψ is used to “compress” an evaluation set L^(i) to a smaller evaluation set L^(i+1) of half the
size. Simultaneously, ψ can be used to “split” a polynomial presented in the standard monomial basis thus:
\[
P(X) = \sum_{i<n} a_i X^i
     = \sum_{i<n/2} a_{2i}\,\psi(X)^i + X \cdot \sum_{i<n/2} a_{2i+1}\,\psi(X)^i
     = P_0(\psi(X)) + X \cdot P_1(\psi(X)).
\]
The FFT evaluates P on L^(0) by recursively evaluating both P_0(Y) and P_1(Y) on y ∈ ψ(L^(0)) = L^(1) and
then combining the results using O(n) operations via the formula above. The running time F(n) satisfies
the recursive formula F(n) = 2 · F(n/2) + O(n), leading to O(n log n) running time.
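The recursion just described can be written down almost verbatim. The Python sketch below is a minimal illustration over F_17 (an assumed example field in which all 2^k-th roots of unity exist for k ≤ 4); the name `fft_on_set` and the dictionary-based interface are choices made for the example. It carries the evaluation set explicitly, so the roles of L^(i), ψ, P_0 and P_1 are visible.

```python
# Sketch only: F_17 and the function name are illustrative assumptions.
P = 17

def fft_on_set(coeffs, points):
    """Evaluate sum_i coeffs[i] * X^i on `points`.

    Assumes `points` is the set of n-th roots of unity mod P with
    n = len(coeffs) a power of two, so that squaring is 2-to-1 on it.
    Returns a dict mapping each point x to P(x).
    """
    n = len(coeffs)
    if n == 1:
        return {points[0]: coeffs[0] % P}
    # L^(i+1) = psi(L^(i)) with psi(X) = X^2; sorted() is only for determinism.
    next_points = sorted({x * x % P for x in points})
    p0 = fft_on_set(coeffs[0::2], next_points)   # even-index coefficients
    p1 = fft_on_set(coeffs[1::2], next_points)   # odd-index coefficients
    # Combine: P(x) = P0(psi(x)) + x * P1(psi(x)), O(1) work per point.
    return {x: (p0[x * x % P] + x * p1[x * x % P]) % P for x in points}
```

Each call makes two recursive calls on inputs of half the size and then does a constant amount of field arithmetic per point, which is the recurrence F(n) = 2 · F(n/2) + O(n) stated above (the set and sorting bookkeeping is not counted as field arithmetic).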
The essential elements we preserve in our ECFFT are the usage of degree-2 maps ψ^(i) that are 2-to-1
maps on special sets of points L^(i) of size n/2^i, along with the ability to express a polynomial P(X) of degree
< n in terms of two other polynomials P_0(ψ^(i)(X)), P_1(ψ^(i)(X)) of degree < n/2, such that the value of
P(x), x ∈ L^(i), can be obtained “locally” from the values of P_0(ψ^(i)(x)), P_1(ψ^(i)(x)). Thus, we use such maps
and sets of points to describe new FFTrees. An FFTree (see Definition 3.3) is an “FFT-inspired” object that is
a layered binary tree whose nodes residing at the ith layer are labeled by the members of L^(i), and such that
the 2-to-1 map ψ^(i) defines directed edges from two elements s_0, s_1 ∈ L^(i) to t = ψ^(i)(s_0) = ψ^(i)(s_1) ∈ L^(i+1).
So far we have listed similarities between the FFT and our new ECFFT, so let us now describe the
differences. First, our set L^(i) is not a multiplicative group, and in fact it is not a group at all (soon, in
Section 1.3, we’ll explain what L^(i) actually is). But examining the classical FFT, we could do its first step
using any degree-2 polynomial ψ(X) which is 2-to-1 on some set of points L^(0) (mapping it to an arbitrary
set of points L^(1) of size n/2). The group structure is useful for knowing, recursively, that we can find further
2-to-1 maps from L^(1) to L^(2) and so on. A second point of difference is that our 2-to-1 maps may vary
with i, whereas the classical FFT uses only squaring⁴ to move from L^(i) to L^(i+1). Finally, the maps ψ^(i) we
use are not degree-2 polynomials but rather degree-2 rational maps, ratios of two polynomials of degree at
most 2. We show that any such map is just as good for the purpose of splitting a polynomial into two
subpolynomials of half the degree (see Lemma 3.1), and using rational maps rather than polynomials gives us
more degrees of freedom when searching for 2-to-1 maps on special sets of points. These points, and the way
they are obtained, are our next, and main, point in this intuitive description of the ECFFT.
⁴ When n is factored into different prime factors (say, n = 2^a · 3^b) one would also use different maps in the FFT (say, squaring
and cubing) to move between L^(i) and L^(i+1), and varying maps are also used in additive FFTs [GM10, Can89, LCH14].
1.3 Elliptic Curves as a Source for FFTrees over Arbitrary Finite Fields
Elliptic curves are a vast topic of study, with wide-ranging impact across mathematics (e.g., [Wil95]), and
we shall not attempt to describe their importance here. An elliptic curve E over the finite field F_q is defined
by a suitable polynomial C(X, Y) ∈ F_q[X, Y], and the solutions (x, y) ∈ F_q^2 of C(X, Y) = 0 are the points of
interest (the description here is intentionally simplified, see Section 4.1 for a formal and accurate definition).
Elliptic curves have some remarkable properties that have led to a number of significant and surprising
applications in theoretical computer science. A small selection of notable examples includes: Lenstra’s elliptic
curve method for factoring integers [Len87]; Schoof’s deterministic algorithm for finding square roots modulo
a prime [Sch85]; cryptosystems, starting with Miller’s EC Diffie–Hellman (ECDH) key exchange [Mil86] and
Koblitz’s EC integrated encryption scheme (ECIES) [Kob87, ABR99] and including Vanstone’s EC digital
signature algorithm (ECDSA) [Van92] and applications based on pairings, such as Joux’s one-round 3-way
key agreement [Jou04] and the Boneh–Franklin identity based encryption protocol [BF03].
We remark that Lenstra’s method for factoring integers using elliptic curves [Len87] in particular was
a major inspiration for this paper. Lenstra’s method is a generalization of Pollard’s p − 1 algorithm for
factoring [Pol74]: The p − 1 method only works when, for some prime factor p, the multiplicative group F_p^×
has a special property, which is only true for few primes p. Lenstra’s method extends the p − 1 method to all
possible p’s by replacing the group F_p^× with elliptic curves. Very similarly, the standard FFT works inside
F_q only when the field has special roots of unity, which is true only for sporadic q, and this paper extends
core applications of FFT to all prime powers q by replacing the group F_q^× with elliptic curves.
The main properties of elliptic curves that we use are:
• The number of points on the curve E can be nearly any number in the range [q + 1 − 2√q, q + 1 + 2√q]
(see Section 4.1.4 for a precise discussion of the number of points).
• These points form an abelian group, called, appropriately, an elliptic curve group. Varying over curves,
and acknowledging the previous point, elliptic curve groups could be of nearly any size in
[q + 1 − 2√q, q + 1 + 2√q]. In particular, we can find subgroups G of elliptic curve groups of size n = 2^k
for n = O(√q) (see Theorem 4.4 and Claim 4.6).
• If H < G are subgroups of an elliptic curve E over F_q, there is an |H|-to-1 map φ (called an isogeny)
with kernel H from the points of the curve E to points on a different curve E′ over F_q. Thus, the
image of G under the isogeny is of size |G|/|H|.
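The first two properties can be checked by brute force on a tiny example. The Python sketch below counts points on every nonsingular curve y^2 = x^3 + ax + b over F_23 and records the group orders that occur. The prime 23 is an assumption chosen because 23 − 1 = 2 · 11 has almost no 2-power part; the short Weierstrass form is used here only for simplicity (the paper works with a more general form), and all function names are illustrative.

```python
# Sketch only: P = 23, the short Weierstrass form and all names are
# illustrative assumptions; point counting is by brute force.
P = 23  # P - 1 = 2 * 11, so F_23^* has no subgroup of order 4, 8, 16, ...

def legendre(a):
    """1 if a is a nonzero square mod P, -1 if a nonsquare, 0 if a = 0."""
    a %= P
    if a == 0:
        return 0
    return 1 if pow(a, (P - 1) // 2, P) == 1 else -1

def curve_order(a, b):
    """Number of points of y^2 = x^3 + a*x + b over F_P, including infinity."""
    return 1 + sum(1 + legendre(x**3 + a * x + b) for x in range(P))

def all_orders():
    """Set of group orders over all nonsingular (a, b)."""
    orders = set()
    for a in range(P):
        for b in range(P):
            if (4 * a**3 + 27 * b**2) % P != 0:  # discriminant is nonzero
                orders.add(curve_order(a, b))
    return orders

def two_adic_valuation(n):
    """Largest v such that 2^v divides n (n > 0)."""
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return v
```

Comparing `max(two_adic_valuation(n) for n in all_orders())` with `two_adic_valuation(P - 1)` shows, on this toy field, how elliptic curve groups can supply 2-power-order subgroups that the multiplicative group lacks.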
The observations above give us nearly all that we need. We can find a set of points G^(0) inside a curve
E^(0) that is a group of size 2^k ≤ O(√q) irrespective of the exact nature of q, and we have at our disposal
isogenies that “compress” groups of points G^(i) to groups G^(i+1) half the size via 2-to-1 maps φ^(i), where the
new group G^(i+1) belongs to a different curve E^(i+1). The only remaining gap is that elements in the groups
G^(i) are pairs (x, y) ∈ F_q^2 whereas we are interested in univariate polynomials and evaluation sets over F_q.
The final ingredient is to pick curves represented in a certain format (extended Weierstrass form) such that
suitably shifting and then projecting G^(i) to the x coordinate gives a set L^(i) ⊂ F_q that is the same size as
G^(i) and, crucially, the isogeny map φ^(i) gives rise to a degree-2 rational map that is 2-to-1 from L^(i) onto
L^(i+1) (see Proposition 4.1 and Theorem 4.9).
Remark 1.2. The degree-2 (or higher degree) maps so obtained are generalizations of Lattès maps [Lat18]
(see [Sil07]). Lattès maps are the rational maps arising from the x-coordinate mapping of isogenies from an
elliptic curve to itself. The rational maps that underlie the FFTree construction arise from the x-coordinate
mapping of isogenies from an elliptic curve E to some other elliptic curve E′, which may or may not equal
E.
Summarizing, the abundance of elliptic curve groups of various sizes over any large finite field assures us
that we’ll find a subgroup of smooth size; isogenies and their projections give 2-to-1 degree-2 rational maps
from sets of size 2^k (in F_q) to sets of size 2^{k−1} for all needed k, and thereby we have the needed FFTree
structure which leads to efficient FFT-like running times for all finite fields.
Organization of paper. The following Section 2 gives notation. Section 3 defines and discusses (i) the
FFTree data structure and (ii) the polynomial decomposition lemma (using rational maps); these two
ingredients are needed to abstract and generalize the classical FFT algorithm to arbitrary sets of points and
maps. Section 4 instantiates FFTrees and decomposition maps using elliptic curves and projections of
isogenies, showing that the necessary data structures exist over all large finite fields. Section 5 defines the
way we represent polynomials for efficient operations, namely by evaluating them over the special sets of
points that arise from the previously defined FFTrees. Section 6 presents fast algorithms for fundamental
operations applied to polynomials that are represented in this special way. Finally, Section 7 uses these
efficient algorithms to efficiently solve “classical” problems about polynomials, like interpolation, evaluation
over general sets of points, and computation of elementary symmetric polynomials.
2 Notation
2.1 Functions and Polynomials
For g : D → R a function and S ⊂ R, denote by g^{−1}(S) the set of g-preimages of S, namely g^{−1}(S) =
{x : g(x) ∈ S}, and for u ∈ R let g^{−1}(u) = g^{−1}({u}). Likewise, for D′ ⊂ D we let g(D′) = {g(x) : x ∈ D′}.
For a set A ⊆ F_q, we define the vanishing polynomial of A to be the polynomial Z(X) ∈ F_q[X] given by:
\[
Z(X) = \prod_{\alpha \in A} (X - \alpha).
\]
We define:
\[
B(X) \operatorname{rem} A(X)
\]
to be the unique polynomial with degree < deg(A) which is congruent to B(X) mod A(X).
When B(X), A(X) are coprime polynomials, we define
\[
(B(X))^{-1}_{A(X)}
\]
to be the unique polynomial C(X) with degree < deg(A) such that B(X) · C(X) ≡ 1 (mod A(X)).
2.2 Projective Space
We denote by P^n(F_q) (or simply P^n) the n-dimensional projective space over F_q; only P^1 and P^2 will
appear in the paper. Points in P^n are given by homogenized coordinates [x_1 : x_2 : ··· : x_{n+1}] where at least
one x_i is non-zero, and with the equivalence relation
\[
[x_1 : x_2 : \cdots : x_{n+1}] \sim [c x_1 : c x_2 : \cdots : c x_{n+1}], \quad \forall c \neq 0.
\]
Points in the affine space F_q^n are given by affine coordinates (x_1, ..., x_n), and in this paper we equate
such points with their standard embedding into projective space, i.e.
\[
(x_1, \ldots, x_n) = [x_1 : \cdots : x_n : 1].
\]
Thus, P^n is the disjoint union of F_q^n and a copy of P^{n−1} “at infinity”, i.e. with an additional x_{n+1} = 0
coordinate. In particular, P^1(F_q) = F_q ∪ {∞}, where ∞ denotes the unique point at infinity, [1 : 0].
We will refer to the two coordinates of the affine plane F_q^2 as x and y. For a point P ∈ F_q^2, we will denote
its x, y coordinates by P_x, P_y, respectively. For a point P ∈ P^2, the coordinates P_x, P_y will only be defined
if it is an affine point, according to the above notation.
2.3 Rational functions
Rational functions over F_q are quotients R(X) = P(X)/Q(X) where P(X), Q(X) ∈ F_q[X] are coprime
polynomials and Q is non-zero. Rational functions form a field, denoted by F_q(X).
Rational functions can be considered as maps from P^1 to itself, where zeros of Q are mapped to ∞
and are called poles of the rational function, with multiplicity equal to their multiplicity as zeros of Q.
Depending on whether deg(P) − deg(Q) is positive, negative, or zero, the point ∞ is either a pole of
multiplicity deg(P) − deg(Q), a zero of multiplicity deg(Q) − deg(P), or mapped to the ratio between the
leading coefficients of P and Q, correspondingly.
The degree of R is defined as deg(R) := max(deg(P), deg(Q)), and is equal to both the total number of
zeros and the total number of poles of R, including at ∞, counted with multiplicity.
3 Polynomial decompositions and FFTrees
In this section we show that any rational map can be used to decompose a polynomial into lower degree
polynomials, in a way similar to how the squaring map is used in FFTs (see Lemma 3.1). We then define a
generalized notion of FFT-like sets of evaluation points (Section 3.2). In the next section we shall instantiate
both of these—rational maps and FFTrees—using elliptic curve groups.
3.1 Polynomial decompositions based on rational functions
Let V_d be the F_q-linear subspace of F_q[X] consisting of polynomials of degree strictly less than d. A
crucial component in the standard FFT is the decomposition of a polynomial P(X) = Σ_{i<d} a_i X^i ∈ V_d into
two polynomials in V_{d/2}, one containing the terms of even degree and the other containing the terms of odd
degree:
\[
P(X) = \sum_{i<d/2} a_{2i} X^{2i} + X \cdot \sum_{i<d/2} a_{2i+1} X^{2i}
     = P_0(X^2) + X \cdot P_1(X^2). \tag{1}
\]
The results of this section generalize this partition by replacing X^2 with any rational function. Later, we
shall instantiate the results of this section with rational functions coming from projections of isogenies of
elliptic curves. We state the decomposition lemma next; its proof appears in Appendix A.
Lemma 3.1 (Decomposition). Let ψ(X) ∈ F_q(X) be a rational map given by:
\[
\psi(X) = \frac{u(X)}{v(X)},
\]
where u(X), v(X) ∈ F_q[X] are relatively prime polynomials. Let δ = deg(ψ) = max{deg(u), deg(v)}. Let d
be a multiple of δ. Then for every P(X) ∈ V_d, there is a unique tuple:
\[
(P_0(X), P_1(X), \ldots, P_{\delta-1}(X)) \in (V_{d/\delta})^{\delta}
\]
such that:
\[
P(X) = \left( \sum_{i=0}^{\delta-1} X^i \cdot P_i(\psi(X)) \right) \cdot v(X)^{\frac{d}{\delta}-1}. \tag{2}
\]
The next statement says that, as in the case of the standard FFT, moving between the two representations
of Eqs. (1) and (2) is done via a set of δ-local invertible linear transformations.
Lemma 3.2 (Locality and invertibility). Let t ∈ F_q. Keeping the notation of the previous lemma, suppose
ψ^{−1}(t) = {s_0, ..., s_{δ−1}} is a set of elements of F_q of size exactly δ. Then the transformation
\[
M_t \colon \mathbb{F}_q^{\delta} \to \mathbb{F}_q^{\delta}, \qquad
(P(s_0), \ldots, P(s_{\delta-1})) \mapsto (P_0(t), \ldots, P_{\delta-1}(t)) \tag{3}
\]
is linear and invertible.
Proof. The assumption t ∈ F_q and, in particular, t ≠ ∞, implies v(s_j) ≠ 0 for each s_j. The relationship
between the P(s_j) and the P_i(t) is captured by the following system of linear equations:
\[
P(s_j) = \left( \sum_{i=0}^{\delta-1} s_j^i \cdot P_i(t) \right) \cdot v(s_j)^{\frac{d}{\delta}-1}.
\]
Inspection shows that the underlying matrix is a nonsingular Vandermonde matrix with rows scaled by
nonzero scalars.
For the rest of this paper, we will focus on the δ= 2 case, although everything generalizes to larger δ.
We briefly instantiate the above lemmas in this case, to expose the similarity to the classical FFT.
Let $\psi(X)$ be a degree 2 rational function. Suppose $d$ is even. Fix any $P(X) \in V_d$, and consider the two polynomials $P_0(X), P_1(X)$ given by Lemma 3.1. Then we have the following decomposition that resembles the classical FFT case of Eq. (1):
\[ P(X) = \bigl( P_0(\psi(X)) + X P_1(\psi(X)) \bigr) \cdot v(X)^{\frac{d}{2}-1}, \]
and so, for any $s \in \mathbb{F}_q$:
\[ P(s) = \bigl( P_0(\psi(s)) + s P_1(\psi(s)) \bigr) \cdot v(s)^{\frac{d}{2}-1}. \tag{4} \]
Let $s_0, s_1, t \in \mathbb{F}_q$ be such that $\psi(s_0) = \psi(s_1) = t$ with $s_0 \neq s_1$. Then Lemma 3.2 implies that the values $P(s_0), P(s_1)$ determine $P_0(t), P_1(t)$ and vice versa (this uses the fact that $s_0 \neq s_1$), and the transformation between the two pairs of values is computed by multiplication by an invertible $2 \times 2$ matrix, whose coefficients depend only on the values of $s_0$, $s_1$, $v(s_0)$, and $v(s_1)$.
Thus, when we have a degree 2 rational function ψthat is 2-to-1 from Sto ψ(S) = T, finding evaluations
of a polynomial P(X) at the points of Sis equivalent to finding evaluations of P0(X) and P1(X) at the
points of T.
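To make the $\delta = 2$ case concrete, the following minimal sketch solves the $2 \times 2$ system of Eq. (4) over a prime field $\mathbb{F}_p$. The function names, the prime $p = 97$ and the map $\psi(X) = (X^2+1)/X$ are illustrative assumptions, not taken from the paper.

```python
# Sketch of the delta = 2 local transform over F_p (p prime).
# All names and the concrete numbers below are hypothetical illustrations.

def local_forward(p0_t, p1_t, s, v_s, d, p):
    """Eq. (4): P(s) from P0(t), P1(t), where t = psi(s) and v_s = v(s)."""
    return (p0_t + s * p1_t) * pow(v_s, d // 2 - 1, p) % p

def local_inverse(y0, y1, s0, s1, v0, v1, d, p):
    """Recover (P0(t), P1(t)) from y0 = P(s0), y1 = P(s1), with s0 != s1."""
    e = d // 2 - 1
    z0 = y0 * pow(pow(v0, e, p), -1, p) % p   # undo the scaling by v(s0)^e
    z1 = y1 * pow(pow(v1, e, p), -1, p) % p   # undo the scaling by v(s1)^e
    p1_t = (z0 - z1) * pow((s0 - s1) % p, -1, p) % p
    p0_t = (z0 - s0 * p1_t) % p
    return p0_t, p1_t

# Example with psi(X) = (X^2 + 1)/X over F_97, so v(X) = X and psi(2) = psi(49).
p, d = 97, 4
s0, s1 = 2, 49                      # 2 * 49 = 98 = 1 mod 97, so s1 = 1/s0
y0 = local_forward(64, 83, s0, s0, d, p)
y1 = local_forward(64, 83, s1, s1, d, p)
```

The inverse direction is just Gaussian elimination on a scaled $2 \times 2$ Vandermonde system, which is why $s_0 \neq s_1$ and $v(s_j) \neq 0$ are needed.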
3.2 FFTrees
We now define FFTrees, a structure abstracting out relevant properties of evaluation sets and maps
between them, which suffice to simulate an FFT-like algorithm.
Definition 3.3 (FFTrees). Let $q$ be a prime power, and let $k$ be an integer. An FFTree over $\mathbb{F}_q$ of depth $k$ is a collection of subsets $L^{(0)}, L^{(1)}, \ldots, L^{(k)} \subseteq \mathbb{F}_q$ along with degree 2 rational functions $\psi^{(i)}(X) \in \mathbb{F}_q(X)$ such that:
1. $|L^{(i)}| = 2^{k-i}$.
2. $\psi^{(i)}(L^{(i)}) = L^{(i+1)}$ (and so $\psi^{(i)}$ is a 2-to-1 map from $L^{(i)}$ to $L^{(i+1)}$).
Let $\mathcal{F}$ denote the rooted, layered, binary tree, whose layers are indexed by $i \in \{0, 1, \ldots, k\}$. The set of vertices in layer $i$ is $L^{(i)}$. The root of $\mathcal{F}$ is the unique element of $L^{(k)}$. The leaves of $\mathcal{F}$ are all the vertices in $L^{(0)}$. For each $i < k$, the parent of the vertex $s \in L^{(i)}$ of the $i$-th layer is the vertex $\psi^{(i)}(s) \in L^{(i+1)}$ of the $(i+1)$-st layer.
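The two conditions of Definition 3.3 are easy to check mechanically. The sketch below does so for a toy tree that is an assumption made here for illustration: the classical multiplicative case over $\mathbb{F}_{17}$, with $\psi^{(i)}(X) = X^2$ on the subgroup of order 8. It is not the elliptic curve construction of Section 4.

```python
# Sketch: verify the FFTree conditions of Definition 3.3 over F_p.
# Each map psi = u/v is given as a pair of Python callables (u, v); this
# encoding and the function name are hypothetical choices for illustration.

def is_fftree(layers, maps, p):
    k = len(layers) - 1
    if len(maps) != k:
        return False
    # Condition 1: |L^(i)| = 2^(k-i).
    if any(len(set(layer)) != 2 ** (k - i) for i, layer in enumerate(layers)):
        return False
    # Condition 2: psi^(i) maps L^(i) onto L^(i+1), exactly 2-to-1.
    for i, (u, v) in enumerate(maps):
        image = []
        for s in layers[i]:
            if v(s) % p == 0:          # psi(s) would be the point at infinity
                return False
            image.append(u(s) * pow(v(s) % p, -1, p) % p)
        if set(image) != set(layers[i + 1]):
            return False
        if any(image.count(t) != 2 for t in set(image)):
            return False
    return True

# Toy instance: squaring on the order-8 subgroup of F_17^*.
p = 17
L0 = [x for x in range(1, p) if pow(x, 8, p) == 1]
layers = [L0]
for _ in range(3):
    layers.append(sorted({x * x % p for x in layers[-1]}))
maps = [(lambda x: x * x, lambda x: 1)] * 3
```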
Because of the decomposition lemma, evaluations of a polynomial on L(i)can be deduced from evaluations
of 2 related lower degree polynomials on L(i+1), and this serves as the basis for fast “divide and conquer”
algorithms.
Our eventual use of FFTrees will be as follows. We will first fix an FFTree over $\mathbb{F}_q$. We will use $L$ to denote $L^{(0)}$. Let $K = |L| = 2^k$. Then for any $n \le K$, polynomials of degree $< n$ will be represented by evaluations at specific subsets of $L$ of size $O(n)$. The FFTree structure will then enable fast algorithms for working with these representations.
Thus any given FFTree will be useful for working with polynomials of degree up to $2^k - 1$. Therefore it is interesting to find FFTrees with as large depth $k$ as possible.
In the next section, we use elliptic curves to show the existence of FFTrees over $\mathbb{F}_q$ with depth $\Omega(\log q)$.
4 FFTrees from Elliptic Curves
In this section we prove the existence of FFTrees of depth $\Omega(\log q)$ in any finite field $\mathbb{F}_q$. Specifically, we show that there exist FFTrees over $\mathbb{F}_q$ whose base set $L^{(0)}$ has size $\Omega(\sqrt{q})$. We start by recounting the
necessary definitions and results regarding elliptic curves. In Section 4.2 we then prove our main results
about existence of FFTrees using rational maps that are projections of isogenies.
4.1 Background on elliptic curves and isogenies
In this subsection we provide a brief overview of the necessary definitions and theorems regarding elliptic
curves. Further details and proofs can be found in most basic texts on the subject. Except where specifically
noted, all results can be found in [Sil09] or [Was08].
4.1.1 Elliptic curve in Weierstrass form
An elliptic curve $E$ is a smooth, projective, algebraic curve of genus 1, with a special marked point $O$, defined over a field. In this paper all curves will be defined over the finite field $\mathbb{F}_q$. Every elliptic curve can be presented in extended Weierstrass form as the set of planar points $(x, y) \in \mathbb{F}_q^2$ satisfying a cubic equation
\[ Y^2 + a_1 XY + a_3 Y = X^3 + a_2 X^2 + a_4 X + a_6 \tag{5} \]
or equivalently
\[ F(X, Y) := Y^2 + a_1 XY + a_3 Y - X^3 - a_2 X^2 - a_4 X - a_6 = 0 \tag{6} \]
parameterized by $a_1, a_2, a_3, a_4, a_6$, together with the marked point $O = [0 : 1 : 0] \in \mathbb{P}^2(\mathbb{F}_q)$, called its point at infinity.
4.1.2 The group law
The points of an elliptic curve $E$ form an abelian group, in which $O$ is the neutral element, and any three distinct points $P, Q, R \in E$ satisfy $P + Q + R = O$ iff they are collinear. If $P = Q \neq R$, the condition is that the tangent to $E$ at $P$ passes through $R$, and if $P = Q = R$ the condition is that the tangent at $P$ to $E$ is doubly tangent at the point.
Lines passing through $O$ are either the line at infinity (which is doubly tangent to $E$ at $O$), or lines of the form $X = c$. Thus $P + Q = O$, i.e. $P = -Q$, iff their coordinates satisfy $P_x = Q_x$ and $P_y \neq Q_y$; or $P = Q \neq O$ and the line $X = P_x$ is tangent to $E$ at $P$; or $P = Q = O$. Note that in both affine cases, we also have $Q_y = -a_1 P_x - a_3 - P_y$, since $P_y, Q_y$ are the two (not necessarily distinct) roots of a monic quadratic in $y$ with linear coefficient $a_1 P_x + a_3$.
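The negation formula $Q_y = -a_1 P_x - a_3 - P_y$ can be checked by brute force on a small curve. The sketch below uses $y^2 + xy + y = x^3$ over $\mathbb{F}_{11}$, a curve chosen here purely for illustration (its discriminant is $-26 \not\equiv 0 \pmod{11}$); the function names are likewise hypothetical.

```python
# Sketch: negation on a curve in extended Weierstrass form over F_p.
# Coefficients are passed as a tuple a = (a1, a2, a3, a4, a6).

def on_curve(P, a, p):
    a1, a2, a3, a4, a6 = a
    x, y = P
    lhs = y * y + a1 * x * y + a3 * y
    rhs = x ** 3 + a2 * x * x + a4 * x + a6
    return (lhs - rhs) % p == 0

def negate(P, a, p):
    """-P has the same x-coordinate and y' = -a1*x - a3 - y."""
    a1, _, a3, _, _ = a
    x, y = P
    return (x, (-a1 * x - a3 - y) % p)

p = 11
a = (1, 0, 1, 0, 0)                      # y^2 + x*y + y = x^3
points = [(x, y) for x in range(p) for y in range(p) if on_curve((x, y), a, p)]
```

Since $P$ and $-P$ share an $x$-coordinate, this is exactly why the $x$-projection of Section 4.1.3 identifies $P$ with $-P$.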
4.1.3 Isogenies and x-projection
For a curve $E$ in extended Weierstrass form, let $\pi : E \to \mathbb{P}^1$ denote the projection to the $x$-coordinate, defined by $\pi(O) = \infty \in \mathbb{P}^1$ and $\pi(P) = P_x \in \mathbb{F}_q$ for $P \in E \setminus \{O\}$. Additionally, as noted in Section 4.1.2, for any $P, Q \in E$, $\pi(P) = \pi(Q)$ if and only if $P = \pm Q$, thus the preimages $\pi^{-1}(\pi(P)) = \{\pm P\}$ are either sets of size two, or a singleton $\{P\}$ when $2P = O$. In particular, it follows that for any subset $C \subset E$ such that $C$ is disjoint from $-C = \{-P : P \in C\}$, the map $\pi|_C$ is 1-to-1 from $C$ to $\mathbb{F}_q$.
Let E, E ′be elliptic curves over the same field. An isogeny between the curves is a rational map
φ:E→E′satisfying φ(O) = O′, where O′is the neutral element of E′. We follow [Was08, Chapters
2.9, 12.2] to give an algebraic, rather than geometric, description of isogenies. When E, E′are in extended
Weierstrass form, φcan be expressed in a standard form:
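Proposition 4.1. Let $\phi : E \to E'$ be an isogeny between two curves in extended Weierstrass form. Then, in coordinates, we may write
\[ \phi(x, y) = (\psi(x), \xi(x, y)), \]
where $\psi : \mathbb{P}^1 \to \mathbb{P}^1$ is a rational function. Equivalently, if $\pi : E \to \mathbb{P}^1$, $\pi' : E' \to \mathbb{P}^1$ are the $x$-projection maps in each curve, then there exists a unique rational function $\psi$ such that the diagram
\[ \begin{array}{ccc} E & \xrightarrow{\ \phi\ } & E' \\ \downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \pi'} \\ \mathbb{P}^1 & \xrightarrow{\ \psi\ } & \mathbb{P}^1 \end{array} \]
is commutative, i.e. $\pi' \circ \phi = \psi \circ \pi$.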
This fact appears to be folklore, and is most commonly discussed only in the special case of curves in
short Weierstrass form E:y2=x3+Ax +B, where ξ(x, y) can also be expressed as ytimes a rational
function—see [Was08, Chapter 2.9] for a discussion of this case. When focusing only on the x-coordinate,
the same proof is valid also for the extended Weierstrass form. For completeness, a full proof of this fact is
included in Appendix B.1.
Definition 4.2. Let φ:E→E′be an isogeny between two curves in extended Weierstrass form, and let ψ
be as in Proposition 4.1. We define deg φ:= deg ψ, i.e. the degree of the isogeny φis defined to be equal to
the degree of ψas a rational function. The isogeny φis called separable if the derivative (in x) of ψis not
identically zero.
The term d-isogeny is shorthand for degree disogeny.
An important property of isogenies is that they are also group homomorphisms, with finite kernels. If φ
is separable, then |ker φ|= deg φ. The converse is also true, and is a crucial part of our construction:
Proposition 4.3 ([Sil09, III.4.12]).Let Ebe an elliptic curve and let H < E be a finite subgroup of E.
There is a unique elliptic curve E′and a separable |H|-isogeny φ:E→E′with ker φ=H.
See also [Vél71] for an explicit construction of such isogenies. We will apply the proposition for groups $H$ with $|H| = 2$, but all our results generalize to larger $H$. In this case $\phi$ is a 2-isogeny, meaning $\psi$ is a degree 2 rational function.
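As a concrete instance of Proposition 4.1 with $|H| = 2$, the sketch below uses the classical 2-isogeny with kernel $\{O, (0,0)\}$ on curves of the special form $y^2 = x^3 + ax^2 + bx$, whose image curve is $Y^2 = X^3 - 2aX^2 + (a^2 - 4b)X$. Restricting to this special form, and the parameters $p = 13$, $a = 1$, $b = 3$, are assumptions made for illustration; the paper works with general extended Weierstrass forms.

```python
# Sketch: a 2-isogeny phi and its x-map psi(x) = (x^2 + a*x + b)/x over F_p.
# Requires b != 0 and a^2 - 4b != 0 mod p so that both curves are nonsingular.

def affine_points(f, p):
    """All (x, y) in F_p^2 with y^2 = f(x)."""
    return [(x, y) for x in range(p) for y in range(p) if (y * y - f(x)) % p == 0]

def two_isogeny(a, b, p):
    def psi(x):
        return (x * x + a * x + b) * pow(x, -1, p) % p
    def phi(P):
        x, y = P
        inv_x2 = pow(x * x % p, -1, p)
        return (y * y * inv_x2 % p, y * (b - x * x) * inv_x2 % p)
    return phi, psi

p, a, b = 13, 1, 3
E  = lambda x: x ** 3 + a * x * x + b * x
E2 = lambda x: x ** 3 - 2 * a * x * x + (a * a - 4 * b) * x
phi, psi = two_isogeny(a, b, p)
# Points outside the kernel: the kernel point (0, 0) is sent to O and is skipped.
pts = [P for P in affine_points(E, p) if P[0] != 0]
```

Here $\psi(x) = \psi(b/x)$, reflecting that $\phi$ is 2-to-1: the points $P$ and $P + (0,0)$ have the same image.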
4.1.4 Group size and structure
The group $E$ is abelian, and it is always of rank at most 2, i.e. it is isomorphic to a product of at most 2 cyclic groups
\[ E \simeq \mathbb{Z}/m_1\mathbb{Z} \times \mathbb{Z}/m_2\mathbb{Z} \]
with $m_1 \mid m_2$ and $m_1 \cdot m_2 = |E|$.
Hasse's theorem states that for every elliptic curve $E$, the order of the group $|E|$ belongs to a range of length $4\sqrt{q}$ centered at $q + 1$, that is,
\[ q - 2\sqrt{q} + 1 \le |E| \le q + 2\sqrt{q} + 1. \]
By a theorem of Deuring [Deu41], any number in this range is indeed attainable as the size of an elliptic
curve, in the case where qis prime. Waterhouse [Wat69, Theorem 4.1] provides the complete characterization
of achievable sizes for the prime power case. We will require a much weaker form, about possible factors of
|E|. The following is the simplest case of Waterhouse’s theorem:
Theorem 4.4. Let N=q+ 1 −tbe an integer such that |t| ≤ 2√qand tis coprime to q. Then there exists
an elliptic curve E/Fqwith |E|=N.
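For small prime $q$ the Hasse bound can be observed directly by counting points. The sketch below counts points on short Weierstrass curves $y^2 = x^3 + ax + b$ by brute force; the restriction to short form over an odd prime field and the function names are assumptions made for illustration.

```python
# Sketch: brute-force point counting over F_p (p an odd prime) and the Hasse check.

def count_points(a, b, p):
    """|E(F_p)| for y^2 = x^3 + a*x + b, including the point at infinity."""
    roots = {}                                   # value -> number of square roots
    for y in range(p):
        roots[y * y % p] = roots.get(y * y % p, 0) + 1
    return 1 + sum(roots.get((x ** 3 + a * x + b) % p, 0) for x in range(p))

def in_hasse_interval(n, q):
    """True iff |q + 1 - n| <= 2*sqrt(q), tested exactly with integers."""
    t = q + 1 - n
    return t * t <= 4 * q

def is_nonsingular(a, b, p):
    return (4 * a ** 3 + 27 * b * b) % p != 0
```

For example, `count_points(1, 1, 5)` counts the points of $y^2 = x^3 + x + 1$ over $\mathbb{F}_5$.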
4.2 An FFT-friendly sequence of rational maps coming from elliptic curves
As noted in Section 3, the depth (or size) of an FFTree limits the degrees of the polynomials which it can be used to evaluate. Thus, we would like to find the largest FFTree possible: if smaller degrees are sufficient, we can always use a subtree instead. We will denote by $\hat{K}_q$ the largest possible size of an FFTree which can be obtained by our method. More rigorously, we define:
Definition 4.5. Let $q$ be a prime power. Define $\hat{K}_q$ to be the largest power of 2 such that there exists an elliptic curve $E$ defined over $\mathbb{F}_q$ whose size satisfies $\hat{K}_q \mid |E|$ and $|E| > 2\hat{K}_q$.
We claim that $\hat{K}_q$ is in fact fairly large with respect to $q$:
Claim 4.6. Let $q \ge 7$ be a prime power. Then $\hat{K}_q > \sqrt{q}$. Equivalently, for any $K = 2^k \le 2\sqrt{q}$, there exists an elliptic curve $E$ defined over $\mathbb{F}_q$ with $K \mid |E|$ and $|E| > 2K$. If $q$ is even, then $\hat{K}_q \ge q/4$.
Before we prove Claim 4.6, having defined and bounded $\hat{K}_q$, we are now able to precisely state the main theorem of this section:
Theorem 4.7 (Existence of large FFTrees). Let $q$ be a prime power, and let the integer $k$ be such that $K = 2^k \le \hat{K}_q$; in particular, one may take $K$ to be any power of two up to $2\sqrt{q}$ for $q \ge 7$. Then there exists an FFTree over $\mathbb{F}_q$ with depth $k$.
We now proceed with building up the infrastructure towards proving Theorem 4.7.
Proof of Claim 4.6. We will ignore at first the condition $|E| > 2K$. By Theorem 4.4, it is enough to show that there exists an integer $t$ such that $K \mid q + 1 - t$, $|t| \le 2\sqrt{q}$, and $t$ is coprime to $q$.
Since the closed interval $[q - 2\sqrt{q} + 1,\ q + 2\sqrt{q} + 1]$ has length at least $2K$, it must contain at least two integers $q + 1 - a$, $q + 1 - (a + K)$ which are both divisible by $K$. Note that at least one of $a, a + K$ must be coprime to the characteristic $p$ of $\mathbb{F}_q$: indeed, if $p \neq 2$, this follows since their difference $K = 2^k$ is not divisible by $p$, whereas if $p = 2$, then $K \mid q + 1 - a$ implies both $a, a + K$ are odd and thus coprime to $q$. In fact, for $p = 2$ the choice $a = 1$ simply works, yielding a curve of size $q$ (also known as an “anomalous” curve) and showing $\hat{K}_q \ge q/4$. Thus we can always choose at least one of $a, a + K$ as our candidate for $t$, for which a corresponding curve exists.
Finally, to assert $|E| > 2K$, note that $|E| > q - 2\sqrt{q}$ and $2K \le 4\sqrt{q}$, thus for all $q \ge 36$ we get
\[ |E| > q - 2\sqrt{q} \ge 6\sqrt{q} - 2\sqrt{q} \ge 2K \]
as claimed. The finitely many cases of $7 \le q < 36$ can be manually checked to verify that indeed for each such $q$ there is an elliptic curve $E$ with size exactly $3\hat{K}_q$.
Remark 4.8. Claim 4.6 is false for $q = 2, 4, 5$: since $2\sqrt{q}$ is not much smaller than $q$ for these prime powers, for the largest $K$ below $2\sqrt{q}$, we have $q + 2\sqrt{q} + 1 < 3K$, and therefore no curve has order divisible by $K$ and greater than $2K$.
See also [SS17] for an overview of practical algorithms for finding such curves. We note that restricting
the size of Kfurther, e.g. K≤√qor even K=o(√q), greatly increases the number of possible curves, and
similarly decreases the difficulty of finding one.
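For prime $q$, Deuring's theorem (quoted in Section 4.1.4) says every integer in the Hasse interval is a curve order, so $\hat{K}_q$ can be computed from the interval alone. The sketch below does this; it is valid only under the assumption that $q$ is prime, and the function name is hypothetical.

```python
import math

# Sketch: K-hat_q of Definition 4.5 for PRIME q, assuming (by Deuring's theorem)
# that every integer N with |q + 1 - N| <= 2*sqrt(q) is the order of some curve.

def khat_prime(q):
    t_max = math.isqrt(4 * q)             # largest integer t with t^2 <= 4q
    hi = q + 1 + t_max                    # top of the Hasse interval
    best, K = 0, 1
    while K <= hi:
        lo = max(q + 1 - t_max, 2 * K + 1)    # need N in the interval and N > 2K
        if (hi // K) * K >= lo:               # largest multiple of K <= hi
            best = K
        K *= 2
    return best
```

For instance, `khat_prime(7)` evaluates to 4 (realized by a curve of order 12), while `khat_prime(5)` is only 2, in line with Remark 4.8.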
Starting from a curve as guaranteed by Claim 4.6, we now construct a chain of curves and isogenies with
useful properties.
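Theorem 4.9. For any prime power $q$ and any $1 < K = 2^k \le \hat{K}_q$, there exist elliptic curves $E_0, E_1, \ldots, E_k$ over $\mathbb{F}_q$ in extended Weierstrass form, a subgroup $G_0 \subseteq E_0$ of size $K$, 2-isogenies $\phi_i : E_i \to E_{i+1}$ and rational functions $\psi^{(i)} : \mathbb{P}^1 \to \mathbb{P}^1$ of degree 2, such that the following diagram is commutative:
\[ \begin{array}{ccccccc} E_0 & \xrightarrow{\ \phi_0\ } & E_1 & \xrightarrow{\ \phi_1\ } & \cdots & \xrightarrow{\ \phi_{k-1}\ } & E_k \\ \downarrow{\scriptstyle \pi_0} & & \downarrow{\scriptstyle \pi_1} & & & & \downarrow{\scriptstyle \pi_k} \\ \mathbb{P}^1 & \xrightarrow{\ \psi^{(0)}\ } & \mathbb{P}^1 & \xrightarrow{\ \psi^{(1)}\ } & \cdots & \xrightarrow{\ \psi^{(k-1)}\ } & \mathbb{P}^1 \end{array} \tag{7} \]
where:
• $\pi_i$ are the projection maps to the $x$-coordinate of each curve;
• $\ker(\phi_i) \subseteq G_i := \phi_{i-1} \circ \cdots \circ \phi_0(G_0)$ for all $i$; and
• $G_0$ has a coset $C$ such that $C \neq -C$ (as elements of the quotient group $E_0/G_0$).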
Remark 4.10. The existence of the coset $C$ with $C \neq -C$ will be crucial in the derivation of Theorem 4.11, i.e. in the construction of the FFTree structure.
Proof. By the definition of $\hat{K}_q$, there exists an elliptic curve $E_0$ over $\mathbb{F}_q$ with exactly $N$ points, where $K \mid N$ and $N > 2K$. Since $E_0$ is abelian, it has a subgroup of any order dividing $N$, in particular of order $K$. However, since we want to ensure the existence of a coset $C$ with $C \neq -C$, we may need to choose $G_0$ more carefully (see footnote 5). The proof that an appropriate $G_0$ exists is technical and not of particular importance, and the interested reader may find it in Appendix B.2. We note that the condition that $N > 2K$ is required exactly to ensure the existence of such $G_0$ and $C$.
Having constructed $G_0$, we choose inside it a subgroup of size 2, and use Proposition 4.3 to find a new Weierstrass curve $E_1$ and 2-isogeny $\phi_0 : E_0 \to E_1$ whose kernel is the subgroup. Thus $G_1 = \phi_0(G_0)$ is a subgroup of $E_1$ of order $2^{k-1}$, and we continue iteratively, at step $i$ constructing $E_{i+1}$ and $\phi_i$ such that the kernel of $\phi_i$ is a size 2 subgroup of $G_i$, the image of $G_0$ in $E_i$, which is of size $2^{k-i}$. The iteration stops at $E_k$, where the image $G_k$ of $G_0$ becomes a singleton.
By Proposition 4.1 and Definition 4.2, having written all curves $E_i$ in extended Weierstrass forms, we find that there exist rational functions $\psi^{(i)}$, of degrees equal to $\deg \phi_i = 2$, which complete the commutative diagram as claimed.
Focusing on the bottom row of (7), we obtain Theorem 4.7 as a direct corollary of Theorem 4.9. The
following theorem is an equivalent reformulation of Theorem 4.7, directly recalling the definition of the
FFTree.
Theorem 4.11. Let $q$ be a prime power, and let $k$ be such that $K = 2^k \le \hat{K}_q$. There exist subsets $L^{(0)}, L^{(1)}, \ldots, L^{(k)} \subseteq \mathbb{F}_q$ and degree 2 rational functions $\psi^{(i)}(X) = \frac{u^{(i)}(X)}{v^{(i)}(X)} \in \mathbb{F}_q(X)$ such that:
1. $|L^{(i)}| = 2^{k-i}$.
2. $\psi^{(i)}$ is a 2-to-1 map from $L^{(i)}$ onto $L^{(i+1)}$.
Footnote 5. As the proof shows, this is in fact only an issue when $N/K = 4$; in other cases any choice of $G_0$ works.
Proof. The case $K = 1$ is trivial. If $K > 1$, apply Theorem 4.9 to find $E_i, \phi_i, \psi^{(i)}$ and $G_0$ as above, and let $C$ be a coset of $G_0$ such that $C \neq -C$. For each $i$ define $C_i$ to be the image of $C$ in $E_i$, i.e. $C_i = \phi_{i-1} \circ \cdots \circ \phi_1 \circ \phi_0(C)$. Since $\ker(\phi_{i-1} \circ \cdots \circ \phi_0) < G_0$, by the third isomorphism theorem, the map $\phi_{i-1} \circ \cdots \circ \phi_0$ induces an embedding $E_0/G_0 \hookrightarrow E_i/G_i$ which maps distinct cosets of $G_0$ to distinct cosets of $G_i$, and $C$ to $C_i$. In particular $C \neq -C$ as cosets of $G_0$ implies $C_i \neq -C_i$ as cosets of $G_i$. Define $L^{(i)} = \pi_i(C_i)$. Note that since $C_i, -C_i$ are cosets, $C_i \neq -C_i$ means they are disjoint, and thus $\pi_i$ is a 1-to-1 map from $C_i$ onto $L^{(i)}$. In particular $|L^{(i)}| = |C_i| = 2^{k-i}$.
Finally, since the diagram is commutative and $\phi_i$ is a 2-to-1 map from $C_i$ onto $C_{i+1}$, $\psi^{(i)}$ is a 2-to-1 map from $L^{(i)}$ onto $L^{(i+1)}$.
Remark 4.12. Not to miss the forest for the trees, we clarify some features of this elliptic curve based construction. A careful examination of the proof in Appendix B.2 shows that $C = -C$ holds for at most 4 different cosets. The rest of the cosets appear in pairs $\{C^{(j)}, -C^{(j)}\}$, each pair projecting through $\pi_0$ to a different (and disjoint) $L^{(0)}_j = \pi_0(C^{(j)}) = \pi_0(-C^{(j)})$. Thus, our construction actually yields at least $\frac{N}{2K} - 2$ different FFTrees, with pairwise disjoint vertices from all trees at every fixed level, but with the same rational functions $\psi^{(i)}$ across all trees.
Thus, there exists not only a single FFTree, but an entire FFForest of disjoint FFTrees all sharing the same maps. The algorithms in Section 6 will all be described for the case of a single FFTree and subsets of its vertices, but we note that many of them can also be applied without additional complexity on sets taken from two (or $O(1)$) different FFTrees belonging to the same FFForest. Note that the total number of leaves in this FFForest is $\Omega(q)$, or, more accurately, $\frac{q}{2} - O(\sqrt{q} + K)$.
5 Representing polynomials via FFTrees
In this section, we show how to use FFTrees to get a nice representation for polynomials that supports
fast operations.
We begin by fixing an FFTree for the rest of this section. Thus we have sets L(0), L(1) ,...,L(k)⊆Fq,
and degree-2 rational functions ψ(i):L(i)→L(i+1) . We let L=L(0) and let K=|L|= 2k. Also recall the
associated binary tree Fwhose set of leaves is L.
All the data structures and algorithms for polynomials that we describe will be in the context of this
FFTree. While the exact details of how this FFTree is obtained are not important for anything in this section,
it will be helpful to recall the parameters of FFTrees that are achievable via Theorem 4.7.
5.1 Evaluation tables
We shall represent polynomials by their evaluations on various special sets of points, so we introduce a
special notation that will emphasize the sets of evaluation points used. Concretely, an evaluation table is
specified by the following data:
•a set S⊆Fq,
•a function f:S→Fq.
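We denote the associated evaluation table by $\langle f \wr S \rangle$, pronounced “$f$ on $S$”.
For a polynomial or rational function $P(X) \in \mathbb{F}_q(X)$ with $P(X)$ defined on $S$, we define the associated evaluation table $\langle P \wr S \rangle$ to be the evaluation table $\langle P|_S \wr S \rangle$, where $P|_S$ is the function from $S$ to $\mathbb{F}_q$ given by evaluation of $P$. Looking ahead, we shall use evaluation tables for operations like
• Adding, multiplying and dividing, as in this example: given $\langle f \wr S \rangle$, $\langle g \wr S \rangle$, $\langle h \wr S \rangle$, $\langle P \wr S \rangle$ for some $P(X) \in \mathbb{F}_q[X]$, we can compute $\left\langle \frac{f + P(X) g}{h} \wr S \right\rangle$.
• Restricting an evaluation table $\langle f \wr S \rangle$ to a subset $S_0 \subseteq S$, denoting the restricted table by $\langle f \wr S_0 \rangle$.
• Partitioning a set $S$ into $S = S_0 \cup S_1$, and “splitting” $\langle f \wr S \rangle$ into $\langle f_0 \wr S_0 \rangle$ and $\langle f_1 \wr S_1 \rangle$, as well as doing the inverse operation of forming the combined evaluation table $\langle f \wr S \rangle = \langle f_0 \wr S_0 \rangle \cup \langle f_1 \wr S_1 \rangle$, where $f : S \to \mathbb{F}_q$ is given by:
\[ f|_{S_0} = f_0, \qquad f|_{S_1} = f_1. \]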
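The operations listed above map naturally onto a dictionary keyed by evaluation point. The sketch below is one possible encoding over a prime field $\mathbb{F}_p$; the dictionary representation and all function names are assumptions made for illustration, not the paper's data structure.

```python
# Sketch: evaluation tables <f ≀ S> as dicts {s: f(s) mod p} over F_p.

def make_table(f, S, p):
    return {s: f(s) % p for s in S}

def restrict(table, S0):
    """<f ≀ S0> for a subset S0 of the table's point set."""
    return {s: table[s] for s in S0}

def split(table, S0):
    """Split <f ≀ S> into <f0 ≀ S0> and <f1 ≀ S1>, where S1 = S minus S0."""
    t0 = restrict(table, S0)
    t1 = {s: y for s, y in table.items() if s not in t0}
    return t0, t1

def combine(t0, t1):
    """Inverse of split; the two point sets must be disjoint."""
    if t0.keys() & t1.keys():
        raise ValueError("S0 and S1 must be disjoint")
    return {**t0, **t1}

def f_plus_Pg_over_h(f, g, h, P, p):
    """Pointwise <(f + P*g)/h ≀ S>; h must be nonzero on S."""
    return {s: (f[s] + P[s] * g[s]) * pow(h[s], -1, p) % p for s in f}
```

Every operation touches each point a constant number of times, which is the reason such pointwise steps are counted as $O(n)$ in the running-time analyses of Section 6.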
5.2 Basic sets
We now identify some important subsets of L.
Definition 5.1 (Basic sets). We define a basic set to be a subset $S$ of $L$ which is the set of all descendants in $L$ of some vertex of $\mathcal{F}$.
Equivalently, it is a set of size $2^a$ for some integer $a$, such that if we let $g$ denote the composed function $\psi^{(a-1)} \circ \psi^{(a-2)} \circ \cdots \circ \psi^{(1)} \circ \psi^{(0)}$, then $S = g^{-1}(u)$ for some $u \in L^{(a)}$.
We have the following important property of basic sets: they can be partitioned into two basic sets of equal size.
Lemma 5.2. Any basic set $S$ of size $2^a \ge 2$ can be partitioned into two basic sets $S_0 \cup S_1$, where each $S_i$ has size $2^{a-1}$.
The proof is immediate from Definitions 5.1 and 3.3: if $S$ is the set of all descendants in $L$ of the vertex $u \in \mathcal{F}$, then letting $\{u_0, u_1\}$ be the children of $u$, we can take $S_i$ to be the set of all descendants in $L$ of $u_i$.
We shall call $S_0$ and $S_1$ the moieties of $S$. Note that the two moieties are equivalent, and can be labeled $S_0, S_1$ or $S_1, S_0$ interchangeably.
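Basic sets and moieties can be computed directly from the second characterization in Definition 5.1. The sketch below does so on a toy tree assumed here for illustration (squaring on the order-8 subgroup of $\mathbb{F}_{17}^*$); the helper names are hypothetical.

```python
# Sketch: basic sets as preimages of a vertex under the composed maps.
# Each psi = u/v is a pair of callables (u, v), evaluated over F_p.

def push(x, maps, p):
    """Apply the given maps in order: psi^(0) first, then psi^(1), and so on."""
    for num, den in maps:
        x = num(x) * pow(den(x) % p, -1, p) % p
    return x

def basic_set(leaves, maps, a, u, p):
    """All leaves s with psi^(a-1)(...psi^(0)(s)) = u, for a vertex u in L^(a)."""
    return sorted(s for s in leaves if push(s, maps[:a], p) == u)

def moieties(leaves, maps, a, u, p):
    """Group the basic set below u (a >= 1) by the child of u each leaf passes through."""
    halves = {}
    for s in basic_set(leaves, maps, a, u, p):
        halves.setdefault(push(s, maps[:a - 1], p), []).append(s)
    return sorted(halves.values())

p = 17
leaves = [x for x in range(1, p) if pow(x, 8, p) == 1]
maps = [(lambda x: x * x, lambda x: 1)] * 3
```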
The following property of sets and polynomials with respect to moieties of basic sets will also prove to
be important in the paper, especially for algorithms related to modular arithmetic:
Definition 5.3. Let $S$ be a basic set, and let $A \subset \mathbb{F}_q$ be an arbitrary set. We say $A$ is half-disjoint from $S$ if it is disjoint from at least one moiety of $S$. Similarly, we say a polynomial $P(X)$ is half-disjoint from $S$ if its set of zeros is disjoint from at least one moiety of $S$.
We now consider representations of polynomials by evaluation tables. Since nonzero polynomials of degree $< n$ cannot vanish in $n$ points, we immediately get the following fundamental fact. For distinct polynomials $P(X), Q(X) \in \mathbb{F}_q[X]$ with $\deg(P), \deg(Q) < n$, and a set $S$ with $|S| = n$, we have that
\[ \langle P \wr S \rangle \neq \langle Q \wr S \rangle. \]
Thus, for a fixed set $S$ with $|S| = n$, $\langle P \wr S \rangle$ is a way of representing a polynomial $P$ with degree $< n$. The key to our fast algorithms for working with such a representation is to choose $S$ to be a basic set.
We now define a standard representation for polynomials (in the context of the fixed FFTree). This standard representation will support fast operations, and will be used when we describe applications to classical problems.
For each $a \le k$, we arbitrarily pick a basic set $U_a$ with size $2^a$ such that:
\[ U_0 \subseteq U_1 \subseteq \cdots \subseteq U_k = L. \]
We will call this $U_a$ the standard basic set of size $2^a$.
For a polynomial $P(X)$ and an integer $a$ with $2^a > \deg(P)$, we define the standard representation of $P$ at scale $a$, denoted $\langle P \rangle_a$, to be $\langle P \wr U_a \rangle$.
For a polynomial $P(X)$, we define the standard representation of $P$ to be $\langle P \rangle_{a_0}$, where $a_0$ is the smallest integer with $2^{a_0} > \deg(P)$.
This standard representation will be our data structure for representing polynomials. In the next section,
we show how the FFTree enables fast operations for this representation of polynomials.
6 Fast polynomial algorithms from FFTrees
As in the previous section, we assume that we have fixed an FFTree. Again, the exact details of how this FFTree is obtained are not important for anything in this section, but it will be helpful to recall the parameters of FFTrees that are achievable via Theorem 4.7.
In this section we give a number of fast algorithms for working with polynomials $P(X)$ represented using evaluation tables $\langle P \wr S \rangle$, where $S$ is a basic set. Inspection will reveal that nearly all of these algorithms can be converted to arithmetic circuits over $\mathbb{F}_q$ with constant fan-in and size that matches the proclaimed running time (the only exception is the computation of polynomial degree, which outputs an integer, not a field element). Thus, henceforth when we say an algorithm “runs in time $t(n)$” we shall allow it to receive advice that will be explicitly stated, and also mean that it can be computed by an arithmetic circuit over $\mathbb{F}_q$ with $t(n)$ gates (and constant fan-in). In particular, we assume each basic arithmetic operation ($+, -, \times, /$) over $\mathbb{F}_q$ has constant computational cost. While the algorithms of this section use division of elements in $\mathbb{F}_q$ for clarity, by inspecting the details it can be seen that they can be reformulated to avoid division by taking advice in a different form (for example, taking $\left\langle \frac{1}{f} \wr S \right\rangle$ as advice instead of $\langle f \wr S \rangle$).
Algorithmic notations. We use the notation $\mathrm{ALG}_{P_1, P_2, \ldots}(I_1, I_2, \ldots)$ for our algorithms/circuits. ALG is the name of the algorithm, the subscript elements $P_1, P_2, \ldots$ denote fixed parameters that affect constants of the algorithm/circuit, and the inputs $(I_1, I_2, \ldots)$ are given inside the parentheses, and are variables. In particular, any data which depends only on $P_1, P_2, \ldots$ can be assumed to be included as part of the circuit, or given by a precomputation advice, and our running times exclude the time required to obtain these parameters and constants. Furthermore, $q$ and the FFTree that we fixed are always assumed to be part of the fixed parameters of the algorithm.
Directory of algorithms Below we give a list of the algorithms in this section.
1. $\mathrm{EXTEND}_{S,S'}$, which does low degree extension of polynomial evaluations from a basic set $S$ to another basic set $S'$. EXTEND is the basis for all the remaining algorithms in this section.
2. MULT, which multiplies polynomials in the new representation (allowing for the possibility of the
degree growing). Addition is trivially done in linear time so we do not explicitly describe it.
3. MEXTEND, a version of EXTEND for monic polynomials of known, fixed degree.
4. DEGREE, which computes the degree of a polynomial given in the new representation.
5. REDC, which performs Montgomery reduction—a technical operation that helps with the remaining
operations.
6. MOD, which performs modular reduction, reducing a given polynomial in the new representation
modulo a fixed polynomial.
7. DIV, which finds the quotient after division by a fixed polynomial.
8. ENTER and EXIT, which convert between the new representation and the standard monomial repre-
sentation.
9. CRT which computes one direction of the Chinese Remainder Theorem, constructing a polynomial
from its residues modulo two fixed and relatively prime polynomials. (The other direction of the CRT
can be done by MOD.)
6.1 Low degree extension
Our first primitive extends the evaluation of Pfrom one basic set to another basic set of the same size
in time O(nlog n) (i.e., via an arithmetic circuit over Fqwith constant fan-in and O(nlog n) gates). In
other words, the algorithm performs Reed–Solomon encoding in quasi-linear time, as long as the message
is provided by the evaluation of Pon a basic set, and is encoded by evaluating Pon a constant collection
of basic sets. Such low-degree extensions are often used to produce interactive proofs and interactive oracle
proofs.
Theorem 6.1 (Low-degree extension). For any two basic sets $S, S' \subset \mathbb{F}_q$ with $|S| = |S'| = n$, there is an algorithm that runs in time $O(n \log n)$, denoted $\mathrm{EXTEND}_{S,S'}$, which when given as input:
• $\langle P \wr S \rangle$, where $P(X) \in \mathbb{F}_q[X]$ with $\deg(P) < n$,
outputs $\langle P \wr S' \rangle$.
For the proof of this theorem (and only for this proof ) we need a generalization of basic sets:
Definition 6.2 ($i$-basic sets). We define an $i$-basic set to be a subset $S$ of $L^{(i)}$ which is the set of all descendants in $L^{(i)}$ of some vertex of $\mathcal{F}$.
Equivalently, an $i$-basic set is a subset $S$ of $L^{(i)}$ of size $2^a$ for some integer $a$, such that if we let $g$ denote the function
\[ \psi^{(a+i-1)} \circ \psi^{(a+i-2)} \circ \cdots \circ \psi^{(i+1)} \circ \psi^{(i)}, \]
then $S = g^{-1}(u)$ for some $u \in L^{(a+i)}$.
Notice that 0-basic sets are simply basic sets per Definition 5.1. In our proof, stated next, we shall use the property that for every $i$-basic set $S$, $\psi^{(i)}(S)$ is an $(i+1)$-basic set $T$ of size $|S|/2$, and we say $T$ lies above $S$ and is induced by $\psi^{(i)}$.
Proof of Theorem 6.1. We give a more general algorithm $\mathrm{EXTEND}_{S,S',i}$ to solve the analogous extension problem where $S$ and $S'$ are $i$-basic sets with $|S| = |S'| = n$. The algorithm $\mathrm{EXTEND}_{S,S'}$ claimed in Theorem 6.1 is obtained by fixing $i = 0$, i.e., $\mathrm{EXTEND}_{S,S'}(\langle \pi \wr S \rangle) = \mathrm{EXTEND}_{S,S',0}(\langle \pi \wr S \rangle)$.
The $\mathrm{EXTEND}_{S,S',i}$ algorithm uses the map $\psi^{(i)}$ to reduce the extension problem for $i$-basic sets of size $n = 2^a$ to two analogous extension problems for $(i+1)$-basic sets of size $n/2$, and then proceeds recursively, by induction on $a$.
Let $T = \psi^{(i)}(S)$, $T' = \psi^{(i)}(S')$ be the $(i+1)$-basic sets above $S, S'$, respectively, which are induced by $\psi^{(i)}$. By Lemma 3.1, there are unique polynomials $P_0(X), P_1(X)$ of degree $< n/2$ with:
\[ P(X) = \bigl( P_0(\psi^{(i)}(X)) + X P_1(\psi^{(i)}(X)) \bigr) \cdot (v^{(i)}(X))^{\frac{n}{2}-1}. \tag{8} \]
$\mathrm{EXTEND}_{S,S',i}$ first computes $\langle P_0 \wr T \rangle$ and $\langle P_1 \wr T \rangle$. (Since $|T| = n/2$, these uniquely determine $P_0(X)$ and $P_1(X)$.) Then it runs $\mathrm{EXTEND}_{T,T',i+1}$ on these to get $\langle P_0 \wr T' \rangle$ and $\langle P_1 \wr T' \rangle$, and combines the results to get $\langle P \wr S' \rangle$.
The algorithm takes as advice $\left\langle (v^{(i)}(X))^{\frac{n}{2}-1} \wr S \right\rangle$, which can be precomputed since it only depends on $S$ and $\psi^{(i)}$, along with whatever advice is needed in the recursive calls.
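Algorithm $\mathrm{EXTEND}_{S,S',i}$:
Input: an evaluation table $\langle \pi \wr S \rangle$
1. If $n = 1$ (recall that $n = |S| = |S'|$), then
(a) Let $S = \{s\}$ and $S' = \{s'\}$.
(b) Define $\pi' : S' \to \mathbb{F}_q$ by $\pi'(s') = \pi(s)$.
(c) Return $\langle \pi' \wr S' \rangle$.
2. Let $T = \psi^{(i)}(S)$, $T' = \psi^{(i)}(S')$ be the sets that lie above $S$ and $S'$ respectively.
3. For each $t \in T$:
(a) Define $s_0, s_1$ to be the $\psi^{(i)}$-preimages of $t$ (noticing they are distinct because $S$ is a basic set).
(b) Compute $(\pi_0(t), \pi_1(t)) = M_t(\pi(s_0), \pi(s_1))$, where $M_t$ is defined in Eq. (3).
4. Form the evaluation tables $\langle \pi_0 \wr T \rangle$ and $\langle \pi_1 \wr T \rangle$.
5. Let $\langle \pi'_0 \wr T' \rangle$ and $\langle \pi'_1 \wr T' \rangle$ be the evaluation tables returned by:
\[ \mathrm{EXTEND}_{T,T',i+1}(\langle \pi_0 \wr T \rangle), \qquad \mathrm{EXTEND}_{T,T',i+1}(\langle \pi_1 \wr T \rangle). \]
6. For each $s' \in S'$, define $\pi'(s')$ by
\[ \pi'(s') = \bigl( \pi'_0(\psi^{(i)}(s')) + s' \cdot \pi'_1(\psi^{(i)}(s')) \bigr) \cdot v^{(i)}(s')^{\frac{n}{2}-1}. \tag{9} \]
7. Return $\langle \pi' \wr S' \rangle$.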
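Correctness: Suppose $P(X) \in \mathbb{F}_q[X]$ is a polynomial of degree $< n$. We want to show that $\mathrm{EXTEND}_{S,S',i}(\langle P \wr S \rangle)$ returns $\langle P \wr S' \rangle$.
The main claim is that when the input $\langle \pi \wr S \rangle$ is $\langle P \wr S \rangle$, the functions $\pi_0, \pi_1 : T \to \mathbb{F}_q$ computed by the algorithm satisfy:
\[ \langle \pi_0 \wr T \rangle = \langle P_0 \wr T \rangle, \qquad \langle \pi_1 \wr T \rangle = \langle P_1 \wr T \rangle, \]
where $P_0, P_1$ are as in Equation (8). This is trivially correct for $a = 0$ (i.e., when $n = 1$) so we focus henceforth on larger values of $n = 2^a$.
Take any $t$ in $T$, and take $s_0, s_1 \in S$ with $\psi^{(i)}(s_0) = \psi^{(i)}(s_1) = t$. Using the fact that $\pi(s_0) = P(s_0)$ and $\pi(s_1) = P(s_1)$, and the definition of $M_t$ from Eq. (3), Lemma 3.2 implies that $\pi_0(t) = P_0(t)$ and $\pi_1(t) = P_1(t)$, which proves the main claim.
By induction on $a$, we conclude that $\mathrm{EXTEND}_{T,T',i+1}$ on $\langle P_0 \wr T \rangle$ and $\langle P_1 \wr T \rangle$ returns $\langle P_0 \wr T' \rangle$ and $\langle P_1 \wr T' \rangle$. Thus
\[ \langle \pi'_0 \wr T' \rangle = \langle P_0 \wr T' \rangle, \qquad \langle \pi'_1 \wr T' \rangle = \langle P_1 \wr T' \rangle. \]
Using this along with Equations (9) and (4), we get that
\[ \langle \pi' \wr S' \rangle = \langle P \wr S' \rangle, \]
as desired. This completes the proof of correctness.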
Running time: By inspection, we see that our algorithm uses $O(n)$ arithmetic operations over $\mathbb{F}_q$ to reduce an instance of EXTEND of size $n$ to two instances of size $n/2$. (Recall that the algorithm fixes various constants, like $v(s)^{n/2-1}$ and the values of the matrix $M_t$.) Thus the total running time $F(n)$ of this algorithm satisfies the recursion:
\[ F(n) \le 2F(n/2) + O(n). \]
We conclude the running time (or circuit size) is $O(n \log n)$ and this completes our proof.
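The recursion can be sketched in a few lines of Python over a prime field. Everything concrete below is an assumption made for illustration: the encoding of each $\psi^{(i)}$ as a pair of callables, and the toy map $\psi(X) = (X^2+1)/(2X)$ over $\mathbb{F}_{17}$, which is squaring conjugated by the Möbius map $z \mapsto (z+1)/(z-1)$ rather than a map coming from an elliptic curve. The sketch also recomputes $v(s)^{n/2-1}$ on the fly instead of taking it as advice.

```python
# Sketch of EXTEND over F_p. vals[j] = P(S[j]) with deg P < len(S); the result
# lists P on S2. maps[0] = (u, v) is psi at the current level, maps[1:] the rest.

def extend(vals, S, S2, maps, p):
    n = len(S)
    if n == 1:
        return [vals[0] % p]                     # Step 1: constant polynomial
    u, v = maps[0]
    e = n // 2 - 1
    psi = lambda x: u(x) * pow(v(x) % p, -1, p) % p
    scale = lambda x: pow(v(x) % p, e, p)        # v(x)^(n/2 - 1)

    pairs = {}                                   # t -> [(s0, P(s0)), (s1, P(s1))]
    for s, y in zip(S, vals):
        pairs.setdefault(psi(s), []).append((s, y))
    T, pi0, pi1 = [], [], []
    for t, ((s0, y0), (s1, y1)) in pairs.items():    # Step 3: apply M_t
        z0 = y0 * pow(scale(s0), -1, p) % p
        z1 = y1 * pow(scale(s1), -1, p) % p
        b = (z0 - z1) * pow((s0 - s1) % p, -1, p) % p
        T.append(t)
        pi0.append((z0 - s0 * b) % p)
        pi1.append(b)

    T2 = list(dict.fromkeys(psi(s) for s in S2))     # T' = psi(S'), deduplicated
    q0 = dict(zip(T2, extend(pi0, T, T2, maps[1:], p)))   # Step 5
    q1 = dict(zip(T2, extend(pi1, T, T2, maps[1:], p)))
    return [(q0[psi(s)] + s * q1[psi(s)]) * scale(s) % p for s in S2]   # Eq. (9)

# Toy instance over F_17: images under m(z) = (z+1)/(z-1) of two cosets of the
# order-4 subgroup {1, 4, 16, 13}; psi(m(z)) = m(z^2) for psi = (X^2+1)/(2X).
p = 17
m = lambda z: (z + 1) * pow(z - 1, -1, p) % p
maps = [(lambda x: x * x + 1, lambda x: 2 * x)] * 2
S  = [m(3 * h % p) for h in (1, 4, 16, 13)]
S2 = [m(6 * h % p) for h in (1, 4, 16, 13)]
P = lambda x: (1 + 2 * x + 3 * x * x + 4 * x ** 3) % p
```

Calling `extend([P(s) for s in S], S, S2, maps, p)` then yields the values of `P` on `S2`. Each level does $O(n)$ field operations plus two half-size calls, matching the recursion $F(n) \le 2F(n/2) + O(n)$.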
Remark 6.3.The EXTEND algorithm as described is defined for S, S′which are basic sets of the same size
in the same FFTree. However, we note that it works just as well when S, S ′are basic sets of the same size
from two different FFTrees in the same FFForest (see also Remark 4.12).
6.2 Multiplication
We give a quick application of the previous algorithm to multiplication of polynomials in the new repre-
sentation.
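Theorem 6.4 (Multiplication). Let $S$ be a basic set with $|S| = n$. Let $S_0 \subseteq S$ be a moiety of $S$. There is an algorithm $\mathrm{MULT}_{S,S_0}$, which when given as input:
• $\langle P \wr S_0 \rangle$, where $P(X) \in \mathbb{F}_q[X]$ with $\deg(P) < n/2$, and
• $\langle Q \wr S_0 \rangle$, where $Q(X) \in \mathbb{F}_q[X]$ with $\deg(Q) < n/2$,
runs in time $O(n \log n)$ and computes $\langle P \cdot Q \wr S \rangle$.
Proof. The algorithm is basically immediate given EXTEND. Let $S_1$ be the other moiety of $S$. We first run $\mathrm{EXTEND}_{S_0,S_1}$ on $\langle P \wr S_0 \rangle$ and $\langle Q \wr S_0 \rangle$ to get $\langle P \wr S_1 \rangle$ and $\langle Q \wr S_1 \rangle$. Combining these, we get $\langle P \wr S \rangle$ and $\langle Q \wr S \rangle$, and by pointwise multiplication we get $\langle P \cdot Q \wr S \rangle$. The running time comes from two invocations of EXTEND and $O(n)$ other operations, and is thus $O(n \log n)$.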
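The control flow of MULT is short enough to sketch directly. To keep the example self-contained, the sketch swaps in a naive $O(n^2)$ Lagrange evaluation as a stand-in for EXTEND; this stand-in is not the paper's algorithm, works for arbitrary point sets, and is used only to show how the pieces fit together. All names are hypothetical.

```python
# Sketch of MULT over F_p with a Lagrange stand-in for EXTEND.

def lagrange_extend(vals, S, S2, p):
    """Stand-in for EXTEND: evaluate the degree < len(S) interpolant on S2."""
    def basis(j, x):
        num = den = 1
        for i, si in enumerate(S):
            if i != j:
                num = num * (x - si) % p
                den = den * (S[j] - si) % p
        return num * pow(den, -1, p) % p
    return [sum(y * basis(j, x) for j, y in enumerate(vals)) % p for x in S2]

def mult(P_on_S0, Q_on_S0, S0, S1, p):
    """Return <P*Q ≀ S> as a dict, where S is the disjoint union of S0 and S1."""
    P_on_S1 = lagrange_extend(P_on_S0, S0, S1, p)    # first EXTEND call
    Q_on_S1 = lagrange_extend(Q_on_S0, S0, S1, p)    # second EXTEND call
    prod0 = [a * b % p for a, b in zip(P_on_S0, Q_on_S0)]
    prod1 = [a * b % p for a, b in zip(P_on_S1, Q_on_S1)]
    return dict(zip(S0 + S1, prod0 + prod1))
```

Since $\deg(PQ) < n$, the $n$ values on $S$ determine the product uniquely, which is why no further interpolation is needed.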
6.3 Monic polynomial extension
As noted before, for a set $S$ of size $n$, the linear space of all possible evaluation tables $\langle P \wr S \rangle$ is in one-to-one correspondence with the space of all polynomials $P(X)$ of degree $< n$. It is also interesting to note that these spaces are in one-to-one correspondence with the set of all monic polynomials of degree exactly $n$. In fact, if $Z(X)$ is the vanishing polynomial of $S$, and $P(X), Q(X)$ are polynomials with $\deg(Q) < n = \deg(P)$ and $P$ is monic, then $\langle P \wr S \rangle = \langle Q \wr S \rangle$ if and only if $P(X) = Q(X) + Z(X)$.
This property allows us to easily adapt the EXTEND algorithm into an extension algorithm for monic
polynomials, which we call MEXTEND.
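Theorem 6.5 (Monic polynomial extension). For any two basic sets $S, S' \subset \mathbb{F}_q$ with $|S| = |S'| = n$, there is an algorithm that runs in time $O(n \log n)$, denoted $\mathrm{MEXTEND}_{S,S'}$, which when given as input:
• $\langle P \wr S \rangle$, where $P(X) \in \mathbb{F}_q[X]$ is monic with $\deg(P) = n$,
outputs $\langle P \wr S' \rangle$.
Proof. Let $Z(X)$ be the vanishing polynomial of $S$. As noted above, for such polynomials $P(X)$, we have $\langle P \wr S \rangle = \langle P - Z \wr S \rangle$, and $\deg(P(X) - Z(X)) < n$. By the properties of EXTEND it thus follows that
\[ \mathrm{EXTEND}_{S,S'}(\langle P \wr S \rangle) = \mathrm{EXTEND}_{S,S'}(\langle P - Z \wr S \rangle) = \langle P - Z \wr S' \rangle \]
and adding $\langle Z \wr S' \rangle$ pointwise yields
\[ \mathrm{MEXTEND}_{S,S'}(\langle P \wr S \rangle) := \mathrm{EXTEND}_{S,S'}(\langle P \wr S \rangle) + \langle Z \wr S' \rangle = \langle P \wr S' \rangle \]
as needed. The algorithm takes $\langle Z \wr S' \rangle$ as advice, calls EXTEND once and does an additional $O(n)$ operations, thus runs in time $O(n \log n)$.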
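A small numerical check of this proof is sketched below. As before, a naive Lagrange evaluation is swapped in for EXTEND (an assumption made to keep the example self-contained; it is not the paper's $O(n \log n)$ routine), and $\langle Z \wr S' \rangle$ is recomputed on the fly rather than taken as advice.

```python
# Sketch of MEXTEND over F_p: extend the table as if it had degree < n,
# then add the vanishing polynomial Z of S evaluated on S'.

def lagrange_extend(vals, S, S2, p):
    """Stand-in for EXTEND: evaluate the degree < len(S) interpolant on S2."""
    def basis(j, x):
        num = den = 1
        for i, si in enumerate(S):
            if i != j:
                num = num * (x - si) % p
                den = den * (S[j] - si) % p
        return num * pow(den, -1, p) % p
    return [sum(y * basis(j, x) for j, y in enumerate(vals)) % p for x in S2]

def vanishing_at(S, x, p):
    """Z(x) for Z(X) = product of (X - s) over s in S."""
    z = 1
    for s in S:
        z = z * (x - s) % p
    return z

def mextend(vals, S, S2, p):
    low = lagrange_extend(vals, S, S2, p)            # this is <P - Z ≀ S'>
    return [(y + vanishing_at(S, x, p)) % p for y, x in zip(low, S2)]
```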
This algorithm can replace EXTEND in applications where the polynomials are known to be monic and of
known degrees, with more efficient run times. For example, it can be used to multiply two monic polynomials
of degree n/2, represented as evaluation tables on a set of size n/2, with the product similarly being a monic
polynomial of degree n, represented as an evaluation table on a set of size n. If we were instead to multiply
such polynomials using the standard EXTEND algorithm, we would have to represent each polynomial by
its values on a set of size n, and their product on a set of size 2n, and use extensions from nto 2ninstead
of extensions from n/2 to n, which would more than double the required run-time.
6.4 Degree Computation
The next operation we describe is that of computing the degree of a polynomial Prepresented by its
evaluation on a basic set.
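Theorem 6.6 (Degree Computation). Let $S$ be a basic set of size $|S| = n$. There is an algorithm $\mathrm{DEGREE}_S$, which when given as input:
• $\langle P \wr S \rangle$, where $P(X) \in \mathbb{F}_q[X]$ with $\deg(P) < n$,
runs in time $O(n \log n)$ and computes $\deg(P)$.
Proof. Let $S_0, S_1$ be the moieties of $S$, and let $Z_0(X)$ be the vanishing polynomial of $S_0$. The algorithm we give will assume that $\langle Z_0 \wr S_1 \rangle$ is given as advice: this is a fixed precomputation that depends only on $S$.
Algorithm $\mathrm{DEGREE}_S$:
Input: An evaluation table $\langle \pi \wr S \rangle$
1. If $|S| = 1$ with $S = \{s\}$, then
• if $\pi(s) \neq 0$ return 0; else, return $-\infty$.
2. Let $\langle g \wr S_1 \rangle = \mathrm{EXTEND}_{S_0,S_1}(\langle \pi \wr S_0 \rangle)$.
3. If $\langle g \wr S_1 \rangle = \langle \pi \wr S_1 \rangle$, then return $\mathrm{DEGREE}_{S_1}(\langle \pi \wr S_1 \rangle)$.
4. Otherwise, using $\langle \pi \wr S_1 \rangle$, $\langle g \wr S_1 \rangle$ and $\langle Z_0 \wr S_1 \rangle$, compute:
\[ \left\langle \frac{\pi - g}{Z_0} \wr S_1 \right\rangle, \]
and return $\frac{n}{2} + \mathrm{DEGREE}_{S_1}\left( \left\langle \frac{\pi - g}{Z_0} \wr S_1 \right\rangle \right)$.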
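Correctness: The case $n = 1$ is trivial, and when $P$ is the zero polynomial one sees by inspection that the result
is $-\infty$, as required.
Suppose $n > 1$. Let $P(X)$ be a polynomial with $0 \le \deg(P) < n$. Let us consider the execution of the
above algorithm on input $\langle P \wr S \rangle$.
• Case 1: $\deg(P) < n/2$. Then by the defining property of $\mathrm{EXTEND}_{S_0,S_1}$, we have that
\[ \mathrm{EXTEND}_{S_0,S_1}(\langle P \wr S_0 \rangle) = \langle P \wr S_1 \rangle. \]
Thus in the execution of the algorithm, we will have $\langle g \wr S_1 \rangle = \langle P \wr S_1 \rangle$, and thus in Step 3 the algorithm
will return
\[ \mathrm{DEGREE}_{S_1}(\langle P \wr S_1 \rangle), \]
which equals $\deg(P)$ by induction, as desired.
• Case 2: $\deg(P) \ge n/2$. Let $P(X) = R(X) + Z_0(X) \cdot Q(X)$, where $\deg(R) < n/2$. Thus $\deg(P) = n/2 + \deg(Q)$.
By the above relation between $P$ and $R$, we have
\[ \langle R \wr S_0 \rangle = \langle P \wr S_0 \rangle = \langle \pi \wr S_0 \rangle. \]
By the defining property of $\mathrm{EXTEND}_{S_0,S_1}$, we get that $\langle g \wr S_1 \rangle = \langle R \wr S_1 \rangle$. Thus
\[ \left\langle \frac{\pi - g}{Z_0} \wr S_1 \right\rangle = \left\langle \frac{P - R}{Z_0} \wr S_1 \right\rangle = \langle Q \wr S_1 \rangle. \]
Since $Q$ is a nonzero polynomial with $\deg(Q) < n/2 = |S_1|$, it cannot vanish on all of $S_1$, so the equality test in
Step 3 fails and Step 4 is executed. By induction, Step 4 returns
\[ n/2 + \mathrm{DEGREE}_{S_1}(\langle Q \wr S_1 \rangle) = n/2 + \deg(Q) = \deg(P), \]
as desired.
Running time: The algorithm calls one instance of EXTEND on an instance of size $O(n)$, does $O(n)$
operations, and makes one recursive call to itself on an instance of size $n/2$. Thus the running time $F(n)$
satisfies:
\[ F(n) \le O(n \log n) + F(n/2), \]
and thus $F(n) \le O(n \log n)$, as claimed.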
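The control flow of the DEGREE algorithm can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the field is the toy prime field $\mathbb{F}_{97}$, the set $S$ is an arbitrary list of distinct points split into its first and second halves (not a genuine basic set with its moieties), and EXTEND is replaced by naive quadratic-time Lagrange interpolation. The names `extend`, `degree` and `evaluate` are hypothetical.

```python
P_MOD = 97  # toy prime field F_97, chosen only for illustration


def extend(src, vals, dst, p=P_MOD):
    """Naive stand-in for EXTEND: values on dst of the unique polynomial
    of degree < len(src) taking the values vals on src (Lagrange form)."""
    out = []
    for x in dst:
        total = 0
        for i, (xi, yi) in enumerate(zip(src, vals)):
            num = den = 1
            for j, xj in enumerate(src):
                if j != i:
                    num = num * (x - xj) % p
                    den = den * (xi - xj) % p
            total = (total + yi * num * pow(den, -1, p)) % p
        out.append(total)
    return out


def degree(S, vals, p=P_MOD):
    """Degree of the polynomial of degree < len(S) with table vals on S.
    len(S) must be a power of two; the zero polynomial gives -infinity."""
    vals = [v % p for v in vals]
    n = len(S)
    if n == 1:                              # Step 1
        return 0 if vals[0] != 0 else float("-inf")
    half = n // 2
    S0, S1 = S[:half], S[half:]             # stand-ins for the moieties
    g = extend(S0, vals[:half], S1, p)      # Step 2
    if g == vals[half:]:                    # Step 3: deg(P) < n/2
        return degree(S1, vals[half:], p)
    quotient = []                           # Step 4: (pi - g) / Z_0 on S1
    for x, v, gx in zip(S1, vals[half:], g):
        z0 = 1
        for s in S0:
            z0 = z0 * (x - s) % p           # Z_0(x), nonzero since x is not in S0
        quotient.append((v - gx) * pow(z0, -1, p) % p)
    return half + degree(S1, quotient, p)


def evaluate(coeffs, S, p=P_MOD):
    """Evaluation table of sum(coeffs[i] * X**i) on S, by Horner's rule."""
    table = []
    for x in S:
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % p
        table.append(acc)
    return table
```

In the actual algorithm $\langle Z_0 \wr S_1 \rangle$ is precomputed advice and EXTEND runs in quasi-linear time; the sketch recomputes both naively, so it reproduces the logic of the steps but not the stated running time.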
6.5 Modular and Montgomery Reduction
6.5.1 Modular Reduction—theorem statement
The goal of this section is to present an algorithm that computes the remainder of the division of an
input polynomial $P$ (in the new representation) by a fixed polynomial $A$:
Theorem 6.7 (Modular Reduction). Let $S$ be a basic set of size $n$, and let $A(X) \in \mathbb{F}_q[X]$ be a polynomial
of degree at most $n/2$ which is half-disjoint from $S$, i.e. $A(X)$ has no zeroes in at least one moiety of $S$.
There is an algorithm running in time $O(n \log n)$, denoted $\mathrm{MOD}_{S,A}$, which when given as input:
• $\langle P \wr S \rangle$, where $P(X) \in \mathbb{F}_q[X]$ with $\deg(P) < n$,
computes $\langle Q \wr S \rangle$, where $Q(X) \in \mathbb{F}_q[X]$ is given by:
\[ Q(X) = P(X) \operatorname{rem} A(X). \]
Before presenting the proof and the algorithm, we introduce an auxiliary algorithm, which we call Montgomery
reduction, inspired by Montgomery’s [Mon85] algorithm for “modulo-free” modular multiplication,
which we also describe briefly.
6.5.2 Montgomery Reduction
Montgomery’s algorithm for multiplication is motivated by the observation that while the operation $a \bmod N$
for a generic (odd) integer $N$ might be computationally expensive, the operation $a \bmod R$ where
$R = 2^r > N$ is very efficient in computing systems based on binary representations.
In Montgomery’s method, each residue $x \pmod{N}$ is represented instead by $xR \bmod N$. To get the
representation of the product $xy$, i.e. $xyR \bmod N$, we first multiply the two representations to get an integer
equivalent to $xyR^2 \pmod{N}$, and then apply the reduction algorithm REDC, which efficiently maps an
integer $t$ to $tR^{-1} \bmod N$, without explicitly computing the division by $N$. The reduction algorithm relies
on having the constant number $(-N^{-1}) \bmod R$ as advice.
The representation $xR \bmod N$ can be transformed back to $x \bmod N$ by simply applying reduction.
In the other direction, $x \bmod N$ can be transformed into $xR \bmod N$ by performing the full Montgomery
multiplication (i.e. integer multiplication + reduction) between $x \bmod N$ and the constant $R^2 \bmod N$, which
is again given as advice.
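For readers less familiar with the integer version, the following Python sketch follows the description above: REDC uses only masking and shifting by $r$ bits, with $(-N^{-1}) \bmod R$ and $R^2 \bmod N$ as precomputed advice. The function names are hypothetical, and the code is meant as a minimal illustration of the standard textbook procedure rather than an optimized implementation.

```python
def montgomery_setup(N, r):
    """Precompute the advice: (-N^{-1}) mod R and R^2 mod N, where R = 2**r."""
    R = 1 << r
    if N % 2 == 0 or R <= N:
        raise ValueError("need odd N and R = 2**r > N")
    n_neg_inv = (-pow(N, -1, R)) % R
    r2 = (R * R) % N
    return n_neg_inv, r2


def redc(t, N, r, n_neg_inv):
    """Return t * R^{-1} mod N for 0 <= t < R*N, using masks and shifts
    by r bits in place of a division by N."""
    mask = (1 << r) - 1
    m = ((t & mask) * n_neg_inv) & mask   # m = t * (-N^{-1}) mod R
    u = (t + m * N) >> r                  # exact: t + m*N is divisible by R
    return u - N if u >= N else u         # u < 2N, so one subtraction suffices


def to_montgomery(x, N, r, n_neg_inv, r2):
    """Map x mod N to x*R mod N: Montgomery-multiply by R^2 mod N."""
    return redc((x % N) * r2, N, r, n_neg_inv)


def montgomery_multiply(a, b, N, r, n_neg_inv):
    """Given a = x*R mod N and b = y*R mod N, return x*y*R mod N."""
    return redc(a * b, N, r, n_neg_inv)


def modmul(x, y, N, r):
    """Compute x*y mod N by a round trip through the Montgomery representation."""
    n_neg_inv, r2 = montgomery_setup(N, r)
    xm = to_montgomery(x, N, r, n_neg_inv, r2)
    ym = to_montgomery(y, N, r, n_neg_inv, r2)
    return redc(montgomery_multiply(xm, ym, N, r, n_neg_inv), N, r, n_neg_inv)
```

The round trip in `modmul` is wasteful for a single product; the representation pays off when many multiplications are chained before converting back.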
For our purposes, we want to perform modular arithmetic of polynomials. We observe that the vanishing
polynomial $Z(X)$ of a basic set $S$ is a natural analogue to the radix $R = 2^r$, as arithmetic operations on the
tables $\langle P \wr S \rangle$ are equivalent to arithmetic operations on polynomials modulo $Z(X)$. Thus, we can attempt
to create a version of REDC which transforms $\langle P \wr S \rangle$ into $\langle P \cdot Z^{-1} \wr S \rangle$, and then use this algorithm to
perform general modular operations, such as MOD. In fact, we apply REDC directly only inside MOD.
Theorem 6.8 (Montgomery Reduction). Let $S$ be a basic set with $|S| = n$. Let $S_0 \subseteq S$ be a moiety of $S$.
Let $A(X) \in \mathbb{F}_q[X]$ be a polynomial of degree at most $n/2$ having no zeroes in $S_0$. Let $Z_0(X)$ be the vanishing
polynomial of $S_0$.
There is an algorithm running in time $O(n \log n)$, denoted $\mathrm{REDC}_{S,S_0,A}$, which when given as input:
• $\langle P \wr S \rangle$, where $P(X) \in \mathbb{F}_q[X]$ satisfies $\deg(P) < n$,
computes $\langle Q \wr S \rangle$, where $Q(X) \in \mathbb{F}_q[X]$ is a polynomial such that
• $Q(X) \equiv P(X) \cdot Z_0(X)^{-1} \pmod{A(X)}$, and
• $\deg(Q) \le \max(\deg(P) - n/2, \deg(A) - 1) < n/2$.
Remark 6.9. If $\deg(P) < n/2 + \deg(A)$, then it follows that $\deg(Q) < \deg(A)$, and therefore
\[ Q(X) = \left( P(X) \cdot (Z_0(X))^{-1}_{A(X)} \right) \operatorname{rem} A(X), \]
where the inverse is taken modulo $A(X)$.
However, the last identity is not true in general when $n/2 + \deg(A) \le \deg(P) < n$, since $Q$ might not be of
degree less than $\deg(A)$.
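Proof. Let $S_1$ be the other moiety of $S$. The algorithm uses the values of $\langle Z_0 \wr S_1 \rangle$, $\langle A \wr S_0 \rangle$, $\langle A \wr S_1 \rangle$, which
depend only on $S$, $S_0$ and $A$.
Algorithm $\mathrm{REDC}_{S,S_0,A}$:
Input: an evaluation table $\langle \pi \wr S \rangle$
1. From $\langle \pi \wr S_0 \rangle$ and $\langle A \wr S_0 \rangle$, compute $\left\langle \frac{\pi}{A} \wr S_0 \right\rangle$.
2. Let $\langle g \wr S_1 \rangle = \mathrm{EXTEND}_{S_0,S_1}\left( \left\langle \frac{\pi}{A} \wr S_0 \right\rangle \right)$.
3. From $\langle \pi \wr S_1 \rangle$, $\langle g \wr S_1 \rangle$, $\langle A \wr S_1 \rangle$, $\langle Z_0 \wr S_1 \rangle$, compute:
\[ \langle h_1 \wr S_1 \rangle = \left\langle \frac{\pi - gA}{Z_0} \wr S_1 \right\rangle. \]
4. Compute:
\[ \langle h_0 \wr S_0 \rangle = \mathrm{EXTEND}_{S_1,S_0}(\langle h_1 \wr S_1 \rangle). \]
5. Return $\langle h_0 \wr S_0 \rangle \cup \langle h_1 \wr S_1 \rangle$.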
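Proof of correctness: Let $P(X)$ be a polynomial of degree $< n$. We will analyze the above algorithm
when its input $\langle \pi \wr S \rangle$ is taken to be $\langle P \wr S \rangle$. Let $G(X) \in \mathbb{F}_q[X]$ be the unique polynomial of degree $< n/2$
interpolating $\frac{\pi}{A}$ on $S_0$; namely:
\[ \langle G \wr S_0 \rangle = \left\langle \frac{\pi}{A} \wr S_0 \right\rangle. \]
Then by the defining property of EXTEND, we get that:
\[ \mathrm{EXTEND}_{S_0,S_1}\left( \left\langle \frac{\pi}{A} \wr S_0 \right\rangle \right) = \mathrm{EXTEND}_{S_0,S_1}(\langle G \wr S_0 \rangle) = \langle G \wr S_1 \rangle. \]
Thus in Step 2 of the algorithm, we will have
\[ \langle g \wr S_1 \rangle = \langle G \wr S_1 \rangle. \]
By definition of $G(X)$, we have that $\frac{P(X)}{A(X)} - G(X)$ vanishes on $S_0$. Therefore $P(X) - G(X)A(X) \in \mathbb{F}_q[X]$
vanishes on $S_0$, and so $Z_0(X)$ divides $P(X) - G(X)A(X)$. Let $H(X) \in \mathbb{F}_q[X]$ be given by:
\[ H(X) = \frac{P(X) - G(X)A(X)}{Z_0(X)}. \]
Note that
\[ \deg(H) \le \max\{\deg(P), \deg(A) + \deg(G)\} - \deg(Z_0) \le \max(\deg(P) - n/2, \deg(A) - 1) < n/2. \tag{10} \]
The second inequality follows from the fact that $\deg(G) < \deg(Z_0) = n/2$, and the final inequality from the
assumptions $\deg(P) < n$ and $\deg(A) \le n/2$.
We have
\[ \langle h_1 \wr S_1 \rangle = \left\langle \frac{\pi - gA}{Z_0} \wr S_1 \right\rangle = \left\langle \frac{P - GA}{Z_0} \wr S_1 \right\rangle = \langle H \wr S_1 \rangle. \]
Thus in Step 4, $\mathrm{EXTEND}_{S_1,S_0}$ yields
\[ \langle h_0 \wr S_0 \rangle = \langle H \wr S_0 \rangle, \]
and so the algorithm returns:
\[ \langle h_0 \wr S_0 \rangle \cup \langle h_1 \wr S_1 \rangle = \langle H \wr S_0 \rangle \cup \langle H \wr S_1 \rangle = \langle H \wr S \rangle, \]
and we have already shown in Eq. (10) that $H(X) = Q(X)$ is of the claimed degree.
Finally, from the definition of $H(X)$ we get
\[ H(X) Z_0(X) = P(X) - G(X)A(X) \equiv P(X) \pmod{A(X)}, \]
which after dividing by $Z_0(X)$ is equivalent to
\[ H(X) \equiv P(X) \cdot Z_0(X)^{-1} \pmod{A(X)}. \]
This completes the proof of correctness.
Running time: The algorithm does $O(n)$ operations and invokes EXTEND twice on instances of size $n/2$.
Thus the total running time is $O(n \log n)$.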
6.5.3 Modular Reduction—algorithm and proof
Proof of Theorem 6.7. Let $S_0, S_1$ be the moieties of $S$, and suppose without loss of generality that $A(X)$
has no zeros in $S_0$ (otherwise, it has no zeros in $S_1$ by assumption, and we may swap the labeling of the
moieties).
Let $C(X) = Z_0(X)^2 \operatorname{rem} A(X)$, which has degree $\deg(C) < \deg(A)$. The algorithm uses the values of
$\langle C \wr S \rangle$, which depend only on $A(X)$, $S$ and $S_0$, as well as values used internally by the $\mathrm{REDC}_{S,S_0,A}$ sub-circuit.
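Algorithm $\mathrm{MOD}_{S,A}$:
Input: an evaluation table $\langle \pi \wr S \rangle$
1. From $\langle \pi \wr S \rangle$, compute
\[ \langle h \wr S \rangle = \mathrm{REDC}_{S,S_0,A}(\langle \pi \wr S \rangle). \]
2. From $\langle h \wr S \rangle$ and $\langle C \wr S \rangle$, compute
\[ \langle h \cdot C \wr S \rangle. \]
3. Compute:
\[ \langle g \wr S \rangle = \mathrm{REDC}_{S,S_0,A}(\langle h \cdot C \wr S \rangle). \]
4. Return $\langle g \wr S \rangle$.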
Proof of correctness: Suppose $P(X) \in \mathbb{F}_q[X]$ with $\deg(P) < n$. We will analyze the above computation
when its input $\langle \pi \wr S \rangle$ is taken to be $\langle P \wr S \rangle$. By Theorem 6.8 about REDC, we get that Step 1 computes
$\langle h \wr S \rangle = \langle H \wr S \rangle$, where $H(X)$ is a polynomial with $\deg H(X) < n/2$ and satisfying
\[ H(X) \equiv P(X) \cdot Z_0(X)^{-1} \pmod{A(X)}. \]
Thus Step 2 computes
\[ \langle H \cdot C \wr S \rangle, \]
where $H(X) \cdot C(X)$ is a polynomial with
\[ \deg(H \cdot C) = \deg(H) + \deg(C) < n/2 + \deg(A) \]
and
\[ H(X) \cdot C(X) \cdot Z_0(X)^{-1} \equiv P(X) \cdot Z_0(X)^{-1} \cdot Z_0(X)^2 \cdot Z_0(X)^{-1} \equiv P(X) \pmod{A(X)}. \]
Thus in Step 3, as noted in Remark 6.9, the algorithm returns $\langle Q \wr S \rangle$, where
\[ Q(X) = \left( H \cdot C \cdot (Z_0(X))^{-1}_{A(X)} \right) \operatorname{rem} A(X) = P(X) \operatorname{rem} A(X), \]
as desired.
Running time: The algorithm invokes REDC twice on instances of size $n$, and does $O(n)$ other operations.
Thus the running time is $O(n \log n)$.
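The two-pass structure of MOD (REDC, multiply by $C$, REDC again) can be checked numerically with a small Python sketch. As before, this rests on assumptions that are not part of the paper: the field is the toy field $\mathbb{F}_p$ for a small prime $p$, $S$ is any list of distinct points whose first half plays the role of $S_0$, EXTEND is replaced by naive Lagrange interpolation, and the advice tables are built from coefficient-form arithmetic. All function names are hypothetical.

```python
def extend(src, vals, dst, p):
    """Naive stand-in for EXTEND (Lagrange interpolation, quadratic time)."""
    out = []
    for x in dst:
        total = 0
        for i, (xi, yi) in enumerate(zip(src, vals)):
            num = den = 1
            for j, xj in enumerate(src):
                if j != i:
                    num = num * (x - xj) % p
                    den = den * (xi - xj) % p
            total = (total + yi * num * pow(den, -1, p)) % p
        out.append(total)
    return out


def poly_eval(c, x, p):
    acc = 0
    for coef in reversed(c):          # Horner's rule, low-degree-first coefficients
        acc = (acc * x + coef) % p
    return acc


def poly_mul(a, b, p):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out


def poly_rem(a, b, p):
    """Remainder of a modulo b (b[-1] must be nonzero), schoolbook division."""
    a = [c % p for c in a]
    inv = pow(b[-1], -1, p)
    for i in range(len(a) - len(b), -1, -1):
        f = a[i + len(b) - 1] * inv % p
        for j, bj in enumerate(b):
            a[i + j] = (a[i + j] - f * bj) % p
    return a[:len(b) - 1]


def advice(S, A, p):
    """Tables <A|S>, <Z_0|S_1> and <C|S> with C = Z_0^2 rem A."""
    half = len(S) // 2
    z0 = [1]
    for s in S[:half]:
        z0 = poly_mul(z0, [-s % p, 1], p)
    c = poly_rem(poly_mul(z0, z0, p), A, p)
    return ([poly_eval(A, x, p) for x in S],
            [poly_eval(z0, x, p) for x in S[half:]],
            [poly_eval(c, x, p) for x in S])


def redc(S, vals, a_on_s, z0_on_s1, p):
    half = len(S) // 2
    S0, S1 = S[:half], S[half:]
    ratio = [v * pow(a, -1, p) % p for v, a in zip(vals[:half], a_on_s[:half])]  # Step 1
    g = extend(S0, ratio, S1, p)                                                # Step 2
    h1 = [(v - gx * a) * pow(z, -1, p) % p                                      # Step 3
          for v, gx, a, z in zip(vals[half:], g, a_on_s[half:], z0_on_s1)]
    h0 = extend(S1, h1, S0, p)                                                  # Step 4
    return h0 + h1                                                              # Step 5


def mod(S, vals, A, p):
    a_on_s, z0_on_s1, c_on_s = advice(S, A, p)
    h = redc(S, vals, a_on_s, z0_on_s1, p)
    hc = [x * y % p for x, y in zip(h, c_on_s)]
    return redc(S, hc, a_on_s, z0_on_s1, p)
```

For instance, with $p = 97$, $S = \{1, \ldots, 8\}$ and $A(X) = X^3 + 2X + 5$ (which has no zero among $1, 2, 3, 4$), the table returned by `mod` can be compared against a direct evaluation of `poly_rem(P, A, p)` on $S$.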
Remark 6.10.As noted earlier in this section, we have no further direct applications of REDC in this paper,
and all calls to it are mediated by calls to MOD. Nonetheless, we note that it may hold individual interest
for real-world applications, as it is naturally more than twice as fast as MOD, due to MOD containing two
calls to REDC. Thus, applying REDC directly might be more efficient in certain situations.
6.6 Division
We give a quick application of the previous algorithm to finding the quotient of an input polynomial P
(in the new representation) by a fixed polynomial A.
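Theorem 6.11 (Division). Let $S$ be a basic set with $|S| = n$.
Let $A(X)$ be a polynomial with degree at most $n/2$ having no zeroes in $S$.
There is an algorithm $\mathrm{DIV}_S$, which when given as input:
• $\langle P \wr S \rangle$, where $P(X) \in \mathbb{F}_q[X]$ with $\deg(P) < n$,
runs in time
\[ O(n \log n) \]
and computes $\langle Q \wr S \rangle$, where $Q(X)$ is the quotient when $P(X)$ is divided by $A(X)$.
Proof. The algorithm is basically immediate given MOD. Letting $\langle R \wr S \rangle = \mathrm{MOD}_{S,A}(\langle P \wr S \rangle)$, the algorithm
returns $\left\langle \frac{P - R}{A} \wr S \right\rangle$; the pointwise division is well defined because $A$ has no zeroes in $S$.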
6.7 Exiting to Standard Polynomial Representation
The next computation transforms a polynomial represented by its evaluation on a basic set to the set of
coefficients that form the standard representation as $\sum_i a_i X^i$.
Theorem 6.12 (Exit to Standard Polynomial Representation). Let $S$ be a basic set with $|S| = n$.
There is an algorithm $\mathrm{EXIT}_S$, which when given as input:
• $\langle P \wr S \rangle$, where $P(X) \in \mathbb{F}_q[X]$ with $\deg(P) < n$,
runs in time
\[ O(n \log^2 n) \]
and computes the coefficients $a_i$ of $P(X)$ in the standard expansion
\[ P(X) = \sum_{i=0}^{n-1} a_i X^i. \]
Proof. Let $S_0, S_1$ be the moieties of $S$, and note that we may assume without loss of generality that $0 \notin S_0$.
Thus $X^{n/2}$ has no roots in $S_0$, and in particular an algorithm $\mathrm{MOD}_{S,X^{n/2}}$ exists.
On input $\langle P \wr S \rangle$, the algorithm will compute $\langle U \wr S_0 \rangle$ and $\langle V \wr S_0 \rangle$, where $P(X) = U(X) + X^{n/2} \cdot V(X)$
with $\deg(U), \deg(V) < n/2$, in time $O(n \log n)$. Then by recursively calling $\mathrm{EXIT}_{S_0}$ on these two smaller
instances and combining the results in the obvious way, we get the coefficients of $P(X)$ in time $O(n \log^2 n)$.
The algorithm uses as advice the values $\langle X^{n/2} \wr S_0 \rangle$, which depend only on $S$, as well as auxiliary values
used by the MOD algorithm (namely, $\langle Z_0 \wr S_1 \rangle$ and $\langle Z_0^2 \operatorname{rem} X^{n/2} \wr S \rangle$).
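Algorithm $\mathrm{EXIT}_S$:
Input: an evaluation table $\langle \pi \wr S \rangle$
1. If $|S| = 1$ with $S = \{s\}$, return $(\pi(s))$.
2. Let $\langle u \wr S \rangle = \mathrm{MOD}_{S,X^{n/2}}(\langle \pi \wr S \rangle)$.
3. Let
\[ (a_0, a_1, \ldots, a_{\frac{n}{2}-1}) = \mathrm{EXIT}_{S_0}(\langle u \wr S_0 \rangle). \]
4. From $\langle \pi \wr S_0 \rangle$, $\langle u \wr S_0 \rangle$ and $\langle X^{n/2} \wr S_0 \rangle$, compute:
\[ \langle v \wr S_0 \rangle = \left\langle \frac{\pi - u}{X^{n/2}} \wr S_0 \right\rangle. \]
5. Let
\[ (b_0, b_1, \ldots, b_{\frac{n}{2}-1}) = \mathrm{EXIT}_{S_0}(\langle v \wr S_0 \rangle). \]
6. Return
\[ (a_0, a_1, \ldots, a_{\frac{n}{2}-1}, b_0, b_1, \ldots, b_{\frac{n}{2}-1}). \]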
Correctness: Suppose $P(X) \in \mathbb{F}_q[X]$ with $\deg(P) < n$. We will analyze what the above algorithm does
when its input $\langle \pi \wr S \rangle$ is taken to be $\langle P \wr S \rangle$.
If $n = 1$ the algorithm is clearly correct.
Now assume $n > 1$. Write $P(X) = U(X) + X^{n/2} \cdot V(X)$, where $\deg(U), \deg(V) < n/2$.
Then $U(X) = P(X) \operatorname{rem} X^{n/2}$. By properties of MOD, we get that Step 2 computes $\langle u \wr S \rangle = \langle U \wr S \rangle$.
Thus $\langle u \wr S_0 \rangle = \langle U \wr S_0 \rangle$.
Also note that $V(X) = \frac{P(X) - U(X)}{X^{n/2}}$. Then
\[ \langle v \wr S_0 \rangle = \left\langle \frac{\pi - u}{X^{n/2}} \wr S_0 \right\rangle = \left\langle \frac{P - U}{X^{n/2}} \wr S_0 \right\rangle = \langle V \wr S_0 \rangle. \]
By induction, we get that the algorithm correctly computes the coefficients of $U(X)$ and $V(X)$, and by
concatenating them together, it computes the coefficients of $P(X)$, as desired.
Running time: The algorithm makes one call to MOD on an instance of size $n$ and two recursive calls to
EXIT on instances of size $n/2$. Thus the running time $F(n)$ satisfies the recurrence:
\[ F(n) \le 2F(n/2) + O(n \log n), \]
and thus
\[ F(n) \le O(n \log^2 n). \]
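For completeness, the recurrence can be unrolled explicitly. The following worked step assumes that $n$ is a power of 2 and writes $c$ for the constant hidden in the $O(n \log n)$ term; both are notational assumptions made only for this derivation.

```latex
% Level k of the recursion has 2^k subproblems of size n/2^k.
\begin{align*}
F(n) &\le \sum_{k=0}^{\log_2 n - 1} 2^k \cdot c\,\frac{n}{2^k}\,\log_2\!\frac{n}{2^k} \;+\; n \cdot F(1) \\
     &= c\,n \sum_{k=0}^{\log_2 n - 1} (\log_2 n - k) \;+\; O(n) \\
     &= c\,n \cdot \frac{\log_2 n\,(\log_2 n + 1)}{2} \;+\; O(n)
      \;=\; O(n \log^2 n).
\end{align*}
```

The same computation applies verbatim to the other recurrences of the form $F(n) \le 2F(n/2) + O(n \log n)$ appearing in this paper.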
6.8 Entering from Standard Polynomial Representation
The next algorithm is the inverse of EXIT: it transforms a polynomial given in standard representation
to its evaluation over a basic set.
Theorem 6.13 (Entering from Standard Polynomial Representation). Let $S$ be a basic set with $|S| = n$.
There is an algorithm $\mathrm{ENTER}_S$, which when given as input:
• $a_0, a_1, \ldots, a_{n-1} \in \mathbb{F}_q$,
runs in time
\[ O(n \log^2 n) \]
and computes $\langle P \wr S \rangle$, where
\[ P(X) = \sum_{i=0}^{n-1} a_i X^i. \]
Proof. If $|S| = 1$, the task is trivial.
Otherwise, let $S_0, S_1$ be the moieties of $S$. The algorithm is based on writing the polynomial $P(X)$ as:
\[ P(X) = U(X) + X^{n/2} \cdot V(X), \]
where $\deg(U), \deg(V) < n/2$, and finding the evaluation tables of $U, V$ on both $S_0, S_1$. A priori, this seems
like reducing an ENTER instance of size $n$ to 4 ENTER instances of size $n/2$ (leading to a quadratic running
time), but in fact this can be done by 2 recursive calls to ENTER and 2 invocations of EXTEND.
The algorithm below takes $\langle X^{n/2} \wr S \rangle$ as advice. This can be precomputed since it only depends on $S$.
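Algorithm $\mathrm{ENTER}_S$:
Input: $(a_0, a_1, \ldots, a_{n-1}) \in \mathbb{F}_q^n$.
1. If $|S| = 1$ with $S = \{s\}$
• Define $g : S \to \mathbb{F}_q$ by $g(s) = a_0$
• Return $\langle g \wr S \rangle$
2. Let $\langle u_0 \wr S_0 \rangle = \mathrm{ENTER}_{S_0}(a_0, \ldots, a_{\frac{n}{2}-1})$.
3. Let $\langle u_1 \wr S_1 \rangle = \mathrm{EXTEND}_{S_0,S_1}(\langle u_0 \wr S_0 \rangle)$.
4. Let $\langle v_0 \wr S_0 \rangle = \mathrm{ENTER}_{S_0}(a_{n/2}, \ldots, a_{n-1})$.
5. Let $\langle v_1 \wr S_1 \rangle = \mathrm{EXTEND}_{S_0,S_1}(\langle v_0 \wr S_0 \rangle)$.
6. Let
\[ \langle \pi \wr S \rangle = \left\langle u_0 + X^{n/2} v_0 \wr S_0 \right\rangle \cup \left\langle u_1 + X^{n/2} v_1 \wr S_1 \right\rangle. \]
7. Return $\langle \pi \wr S \rangle$.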
Correctness: The correctness follows immediately from the discussion preceding the algorithm.
Running time: This algorithm makes two recursive calls to ENTER on instances of half the size and
two invocations of EXTEND on instances of half the size, along with $O(n)$ other operations. Thus the running
time $F(n)$ satisfies:
\[ F(n) \le 2F(n/2) + O(n \log n) + O(n), \]
and thus $F(n) \le O(n \log^2 n)$, as claimed.
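The saving from four recursive calls to two is easy to see in code. The sketch below mirrors Steps 1 to 7 in Python under the same illustrative assumptions as before: a toy prime field, an arbitrary list of distinct points split into halves in place of a basic set and its moieties, and naive Lagrange interpolation in place of EXTEND. The names `extend`, `enter` and `direct_eval` are hypothetical, and the advice $\langle X^{n/2} \wr S \rangle$ is recomputed on the fly with `pow`.

```python
def extend(src, vals, dst, p):
    """Naive stand-in for EXTEND (Lagrange interpolation, quadratic time)."""
    out = []
    for x in dst:
        total = 0
        for i, (xi, yi) in enumerate(zip(src, vals)):
            num = den = 1
            for j, xj in enumerate(src):
                if j != i:
                    num = num * (x - xj) % p
                    den = den * (xi - xj) % p
            total = (total + yi * num * pow(den, -1, p)) % p
        out.append(total)
    return out


def enter(S, coeffs, p):
    """Evaluation table on S of sum(coeffs[i] * X**i).
    Requires len(coeffs) == len(S), a power of two."""
    n = len(S)
    if n == 1:                                   # Step 1
        return [coeffs[0] % p]
    half = n // 2
    S0, S1 = S[:half], S[half:]
    u0 = enter(S0, coeffs[:half], p)             # Step 2: low half U on S0
    u1 = extend(S0, u0, S1, p)                   # Step 3: U on S1
    v0 = enter(S0, coeffs[half:], p)             # Step 4: high half V on S0
    v1 = extend(S0, v0, S1, p)                   # Step 5: V on S1
    # Step 6: P = U + X^{n/2} * V, pointwise on each half
    low = [(u + pow(x, half, p) * v) % p for x, u, v in zip(S0, u0, v0)]
    high = [(u + pow(x, half, p) * v) % p for x, u, v in zip(S1, u1, v1)]
    return low + high                            # Step 7


def direct_eval(coeffs, S, p):
    """Reference evaluation by Horner's rule, used only for checking."""
    table = []
    for x in S:
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % p
        table.append(acc)
    return table
```

Both recursive calls use $S_0$ only; the values on $S_1$ come from EXTEND, which is valid because $U$ and $V$ have degree below $n/2 = |S_0|$.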
6.9 Chinese Remaindering
The following operation receives as input two polynomials $P, Q$ and computes the polynomial $R$ whose
remainders modulo two relatively prime polynomials $A, B$ are $P$ and $Q$, respectively.
Theorem 6.14 (Chinese Remaindering). Let $S$ be a basic set with $|S| = n$. Let $S_0 \subseteq S$ be a moiety of $S$.
Let $A(X), B(X)$ be relatively prime polynomials with degrees at most $n/2$. Suppose that both $A(X)$ and
$B(X)$ are half-disjoint from $S$; the moieties having no zeroes of $A$ and $B$ may be the same moiety for both
or a different one for each.
There is an algorithm $\mathrm{CRT}_{S,S_0,A,B}$, which when given as input:
• $\langle P \wr S_0 \rangle$, where $P(X) \in \mathbb{F}_q[X]$ with $\deg(P) < n/2$, and
• $\langle Q \wr S_0 \rangle$, where $Q(X) \in \mathbb{F}_q[X]$ with $\deg(Q) < n/2$,
runs in time
\[ O(n \log n) \]
and computes $\langle R \wr S \rangle$ where $R$ is the unique polynomial of degree $< \deg(A) + \deg(B)$ such that $R \equiv P \pmod{A}$
and $R \equiv Q \pmod{B}$.
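Proof. By the usual proof of the Chinese Remainder Theorem, the desired $R(X)$ is of the form:
\[ ((P(X) \cdot G(X)) \operatorname{rem} A(X)) \cdot B(X) + ((Q(X) \cdot H(X)) \operatorname{rem} B(X)) \cdot A(X), \]
where $G(X) = (B(X)^{-1})_{A(X)}$ and $H(X) = (A(X)^{-1})_{B(X)}$ depend only on $A$ and $B$, and have degrees
$\deg(G), \deg(H) < n/2$.
Thus the algorithm simply extends $\langle P \wr S_0 \rangle$ and $\langle Q \wr S_0 \rangle$ to find $\langle P \wr S \rangle$ and $\langle Q \wr S \rangle$. Then, using $\langle G \wr S \rangle$
and $\langle H \wr S \rangle$ as advice (which can be precomputed, since they only depend on $A$, $B$ and $S$), as well as $\langle A \wr S \rangle$
and $\langle B \wr S \rangle$, we compute:
\[ \mathrm{MOD}_{S,A}(\langle P \cdot G \wr S \rangle) \cdot \langle B \wr S \rangle + \mathrm{MOD}_{S,B}(\langle Q \cdot H \wr S \rangle) \cdot \langle A \wr S \rangle, \]
which is the desired output. Note that $\deg(P \cdot G), \deg(Q \cdot H) < n$, as MOD requires.
The run-time comes from two invocations of EXTEND, two invocations of MOD, and $O(n)$ other operations,
and is thus $O(n \log n)$ overall.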
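The "usual proof" invoked above amounts to the following short verification, spelled out here with the same symbols $G$ and $H$ (the inverses of $B$ modulo $A$ and of $A$ modulo $B$, respectively).

```latex
\begin{align*}
R &= \bigl((P\,G) \operatorname{rem} A\bigr)\cdot B \;+\; \bigl((Q\,H) \operatorname{rem} B\bigr)\cdot A, \\
R &\equiv P\,G\,B \equiv P \pmod{A}
   && \text{since } A \mid \bigl((Q\,H)\operatorname{rem} B\bigr)\cdot A \text{ and } G\,B \equiv 1 \pmod{A}, \\
R &\equiv Q\,H\,A \equiv Q \pmod{B}
   && \text{since } B \mid \bigl((P\,G)\operatorname{rem} A\bigr)\cdot B \text{ and } H\,A \equiv 1 \pmod{B}, \\
\deg(R) &< \deg(A) + \deg(B)
   && \text{since each remainder has degree below that of its modulus.}
\end{align*}
% Uniqueness: two such R differ by a multiple of AB (A, B coprime)
% of degree < deg(AB), hence by zero.
```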
7 Applications to classical problems
The previous Section 6 presented fast algorithms (and arithmetic circuits) for manipulating polynomials
represented by their evaluations on basic sets. This section uses those results to efficiently solve “classical”
problems of algebraic computation, in which the polynomials are represented in the “classical” way, as sums
of monomials. In all cases below we shall transition to a representation of polynomials by their evaluations
on basic sets, and this will result in running times, over any polynomially large field, that are as good as
those of special, classical-FFT-friendly, finite fields.
7.1 Elementary Symmetric Polynomial Evaluation
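Theorem 7.1 (Evaluating Elementary Symmetric Polynomials). Let $t < n < q^{O(1)}$. There is an arithmetic
circuit over $\mathbb{F}_q$ of size
\[ O(n \log^2 n) \]
which takes as input variables $\alpha_1, \ldots, \alpha_n$ and computes
\[ \mathrm{Sym}_{n,t}(\alpha_1, \ldots, \alpha_n) := \sum_{J \subseteq [n],\, |J| = t} \; \prod_{j \in J} \alpha_j. \]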
Proof. We follow the classical approach of computing elementary symmetric polynomials as coefficients of a
certain product, except that we work with polynomials in the new representation.
The idea is to compute the coefficients, in the standard monomial representation, of the polynomial
\[ P(X) = \prod_{i=1}^{n} (X - \alpha_i) = \sum_{i=0}^{n} (-1)^{n-i} \cdot \mathrm{Sym}_{n,n-i}(\alpha_1, \ldots, \alpha_n) X^i. \]
We do this by first computing $\langle P \wr S \rangle$ for a big enough basic set $S$, and then running $\mathrm{EXIT}_S(\langle P \wr S \rangle)$ to compute
the coefficients of $P$. Details follow.
By adding some dummy 0 inputs $\alpha_i$ and increasing $n$ by at most a factor 2, we may assume that $n$ is a
power of 2. Next, we claim that $\mathbb{F}_q$ can be assumed to contain a basic set of size at least $2n$. Indeed, this
can be done by replacing $\mathbb{F}_q$ by an $O(1)$-degree extension of $\mathbb{F}_q$ which is sufficiently large, of size $O(n^2)$, as
needed for a basic set of size at least $2n$ to exist in $\mathbb{F}_q$ (cf. Section 4.2). Moving to a larger $q$ increases the
number of arithmetic operations by a factor of at most $O(1)$ because we assume $n < q^{O(1)}$.
Let $m = \log_2(2n)$ and fix arbitrary basic sets $U_0 \subseteq U_1 \subseteq \ldots \subseteq U_m \subseteq \mathbb{F}_q$ with $|U_j| = 2^j$. We shall compute
$\langle P \wr U_m \rangle$ in a bottom-up manner by computing products of terms $P_i(X) := X - \alpha_i$, $i \in [n]$, of increasing
size. We start by computing
\[ \langle P_1 \wr U_1 \rangle, \ldots, \langle P_n \wr U_1 \rangle, \]
which takes time $O(1)$ for each term $P_i$ (and total time $O(n)$).
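Let $Q(X) = \prod_{i=i_0}^{i_0+2^j-1} P_i(X)$ and assume, inductively, that we have already computed $\langle Q' \wr U_j \rangle$ and
$\langle Q'' \wr U_j \rangle$ where
\[ Q'(X) = \prod_{i=i_0}^{i_0+2^{j-1}-1} P_i(X), \qquad Q''(X) = \prod_{i=i_0+2^{j-1}}^{i_0+2^j-1} P_i(X). \]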
We shall now compute $\langle Q \wr U_{j+1} \rangle$ as follows:
• Compute $\langle Q' \wr U_{j+1} \rangle$ using the EXTEND⁶ algorithm
• Compute $\langle Q'' \wr U_{j+1} \rangle$ using the EXTEND algorithm
• Pointwise multiply the two to obtain $\langle Q \wr U_{j+1} \rangle = \langle Q' \cdot Q'' \wr U_{j+1} \rangle$
Since EXTEND runs in time $O(n \log n)$ and pointwise multiplication runs in time $O(n)$, the running time
$F(n)$ for this algorithm satisfies:
\[ F(n) \le 2F(n/2) + O(n \log n) \le O(n \log^2 n). \]
Finally, once we have $\langle P \wr U_m \rangle$, we can find its standard monomial expansion using $\mathrm{EXIT}_{U_m}$, which also
runs in time $O(n \log^2 n)$.
The desired output $\mathrm{Sym}_{n,t}$ is one of the coefficients in this standard monomial expansion, and is thus
computed in time $O(n \log^2 n)$, as claimed.
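The bottom-up product can be illustrated by a short Python sketch. Note the swap: the paper builds the tree on evaluation tables using EXTEND and pointwise products, whereas the sketch below works with coefficient lists and schoolbook multiplication over a toy prime field, so it shows the tree structure and the sign convention $(-1)^t$ but not the quasi-linear running time. All names are hypothetical.

```python
from itertools import combinations


def poly_mul(a, b, p):
    """Schoolbook product of coefficient lists (low degree first) mod p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out


def product_tree_root(alphas, p):
    """Coefficients of prod_i (X - alpha_i), built by pairing up neighbours."""
    layer = [[-a % p, 1] for a in alphas]        # leaves P_i(X) = X - alpha_i
    while len(layer) > 1:
        nxt = [poly_mul(layer[i], layer[i + 1], p)
               for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:                       # odd leftover moves up unchanged
            nxt.append(layer[-1])
        layer = nxt
    return layer[0]


def elementary_symmetric(alphas, t, p):
    """Sym_{n,t}(alphas) mod p, read off as (-1)^t times the X^{n-t} coefficient."""
    n = len(alphas)
    coeffs = product_tree_root(alphas, p)
    sign = -1 if t % 2 else 1
    return sign * coeffs[n - t] % p


def brute_force_sym(alphas, t, p):
    """Direct sum over all t-subsets, used only as a reference."""
    total = 0
    for subset in combinations(alphas, t):
        term = 1
        for a in subset:
            term = term * a % p
        total = (total + term) % p
    return total
```

The brute-force helper enumerates all $\binom{n}{t}$ subsets and is only practical for very small inputs; it is included so the tree-based value can be cross-checked.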
7.2 Multipoint evaluation over general sets of points
Previously we evaluated polynomials over basic sets in quasi-linear time (see Theorem 6.13). The next
result shows that evaluating polynomials over general sets of points can also be done in (slightly worse)
quasi-linear time.
Theorem 7.2 (Multipoint polynomial evaluation). Assume $n < q$. Given any set $B$ of $m$ points in $\mathbb{F}_q$,
there exists an arithmetic circuit over $\mathbb{F}_q$ (that depends on $B$) of size
\[ O(n \log^2 n + m \log^2 m) \]
which takes as input $(a_0, \ldots, a_{n-1}) \in \mathbb{F}_q^n$ and computes $\langle P \wr B \rangle$ for $P(X) = \sum_{i=0}^{n-1} a_i X^i$.
Remark 7.3. Note that if $n \ge q$ then we can first reduce $P(X)$ modulo $X^q - X$ trivially in $n$ steps and get
back to the case $n < q$. Moreover, if $m > n$, then we can partition $B$ into $O(m/n)$ sets of size at most $O(n)$,
getting a run-time of $O(m \log^2 n)$, whereas if $m < n$, then we can decompose $P$ into $O(n/m)$ polynomials of
degree at most $O(m)$, getting a run-time of $O(n \log^2 m)$.
⁶ Note that all polynomials computed in this algorithm are monic and of degrees equal to powers of 2. Thus, as noted in
Section 6.3, it is natural to extend and multiply these polynomials using MEXTEND instead of EXTEND, allowing us to take
$m = \log_2(n)$, start from evaluations at $U_0$, and cut down the running time by a factor of 2.
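Proof. Let $B = \{b_1, \ldots, b_m\}$. The idea of the algorithm is based on the fact that
\[ P(b_i) = P(X) \operatorname{rem} (X - b_i). \]
To find $P(X) \operatorname{rem} (X - b_i)$, we start with $P(X) \operatorname{rem} \prod_{i=1}^{m} (X - b_i)$, and successively compute
$P(X) \operatorname{rem} \prod_{i \in I} (X - b_i)$ for smaller and smaller sets $I \subseteq [m]$. Details follow.
The algorithm starts by running $\mathrm{ENTER}_{U_a}$ to find
\[ \langle P \wr U_a \rangle, \]
where $a = \log_2 n + O(1)$. This step runs in $O(n \log^2 n)$ time.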
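Next, we tweak $B$ and $U_a$ until they are of similar sizes, specifically, $2^{a-2} < |B| \le 2^{a-1}$.
In the case $|B| \le 2^{a-2}$, let $a' = \lceil \log_2 m \rceil + 1 < a$. We wish to assume that $B$ is half-disjoint from $U_a$: if it
is not the case, we may simply split $B$ into two parts that are each half-disjoint, e.g. $B \cap U_{a-1}$ and $B \setminus U_{a-1}$.
Then, assuming half-disjointness, we may run $\mathrm{MOD}_{U_a, \prod_{b \in B}(X-b)}(\langle P \wr U_a \rangle)$ in $O(n \log n)$ time to obtain
\[ \left\langle P \operatorname{rem} \prod_{b \in B} (X - b) \wr U_a \right\rangle. \]
The resulting polynomial will have degree strictly less than $|B| \le 2^{a'-1}$, and we may restrict its evaluation
table $\left\langle P \operatorname{rem} \prod_{b \in B}(X - b) \wr U_a \right\rangle$ to
\[ \left\langle P \operatorname{rem} \prod_{b \in B} (X - b) \wr U_{a'-1} \right\rangle \]
at no cost while maintaining the fact that it represents $P \operatorname{rem} \prod_{b \in B}(X - b)$. We then continue to evaluate this
polynomial on $B$, replacing $a$ with $a'$, and noting that $2^{a'-2} < |B| \le 2^{a'-1}$, and $a' \le \min(\log_2(n), \log_2(m)) + O(1)$.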
In the case $|B| > 2^{a-1}$, split $B$ arbitrarily into $l$ disjoint parts, each of size at most $2^{a-1} = O(n)$, and
proceed on each part separately. As in the previous case we further require that each part be half-disjoint
from $U_a$, and observe that again this requires at most one additional part (e.g. by taking one of the parts
equal to $B \cap U_{a-1}$), and can be achieved using only $l = O(\frac{m}{n} + 1)$ parts; note that this bound also
covers the previous case (with 1 or 2 parts). The complexity of the remaining work done on each part will
be multiplied by $l$ to obtain the total complexity. We continue now with $B$ denoting a single part, of size at
most $2^{a-1}$, and half-disjoint from $U_a$. Again we have $a \le \min(\log_2(n), \log_2(m)) + O(1)$. As in the previous
case, the next step is to run $\mathrm{MOD}_{U_a, \prod_{b \in B}(X-b)}(\langle P \wr U_a \rangle)$ and restrict to $U_{a-1}$, yielding
\[ \left\langle P \operatorname{rem} \prod_{b \in B} (X - b) \wr U_{a-1} \right\rangle \]
in $O(n \log n)$ time.
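We can now get the desired result by applying the following recursive step, for $j = a-1, a-2, \ldots, 1$:
Suppose $A(X)$ is a product of $\le 2^j$ different linear factors. Then we may write $A(X) = A'(X) \cdot A''(X)$,
where $\deg(A'), \deg(A'') \le 2^{j-1}$, and both $A', A''$ are half-disjoint from $U_j$. Then given $\langle P \operatorname{rem} A \wr U_j \rangle$, we
can compute
\[ \langle P \operatorname{rem} A' \wr U_j \rangle = \mathrm{MOD}_{U_j, A'}(\langle P \operatorname{rem} A \wr U_j \rangle) \]
\[ \langle P \operatorname{rem} A'' \wr U_j \rangle = \mathrm{MOD}_{U_j, A''}(\langle P \operatorname{rem} A \wr U_j \rangle) \]
in time $O(|U_j| \log |U_j|)$, and then restrict the tables to $\langle P \operatorname{rem} A' \wr U_{j-1} \rangle$, $\langle P \operatorname{rem} A'' \wr U_{j-1} \rangle$.
At layer $j$ of the recursion we perform $2^{a-j}$ $\mathrm{MOD}_{U_j}$ operations, taking a total run time of $O(|U_a| \log |U_j|)$,
and summing over all layers $j$ we get a run time of
\[ O(|U_a| \log^2 |U_a|) = O(\min(n \log^2 n, m \log^2 m)) \]
per part. Multiplying by the number of parts $l = O(\frac{m}{n} + 1)$ and adding the $O(n \log^2 n)$ from ENTER, we get
that the total run time is
\[ O(n \log^2 n + m \log^2 m), \]
as claimed.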
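The descent through smaller and smaller moduli is the classical remainder-tree idea, and its shape can be seen in the following Python sketch. The sketch deliberately swaps the representation: it stores polynomials as coefficient lists over a toy prime field and uses schoolbook division, where the paper keeps evaluation tables on the sets $U_j$ and applies MOD. It also rebuilds each subproduct from scratch, which a real implementation would precompute. The function names are hypothetical.

```python
def poly_eval(c, x, p):
    acc = 0
    for coef in reversed(c):          # Horner's rule, low-degree-first coefficients
        acc = (acc * x + coef) % p
    return acc


def poly_mul(a, b, p):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out


def poly_rem(a, b, p):
    """Remainder of a modulo b (b[-1] must be nonzero), schoolbook division."""
    a = [c % p for c in a]
    inv = pow(b[-1], -1, p)
    for i in range(len(a) - len(b), -1, -1):
        f = a[i + len(b) - 1] * inv % p
        for j, bj in enumerate(b):
            a[i + j] = (a[i + j] - f * bj) % p
    return a[:len(b) - 1]


def subproduct(points, p):
    """Coefficients of prod_{b in points} (X - b)."""
    poly = [1]
    for b in points:
        poly = poly_mul(poly, [-b % p, 1], p)
    return poly


def multipoint_eval(coeffs, points, p):
    """Values of sum(coeffs[i] * X**i) at each of the points, via a remainder tree."""
    def descend(rem, pts):
        # Invariant: rem = P rem prod_{b in pts} (X - b).
        if len(pts) == 1:
            return [poly_eval(rem, pts[0], p)]   # rem is the constant P(b)
        mid = len(pts) // 2
        left, right = pts[:mid], pts[mid:]
        return (descend(poly_rem(rem, subproduct(left, p), p), left)
                + descend(poly_rem(rem, subproduct(right, p), p), right))

    return descend(poly_rem(coeffs, subproduct(points, p), p), points)
```

Each recursive call reduces the current remainder modulo the product of the linear factors for half of the points, exactly as $A = A' \cdot A''$ is split in the recursive step above.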
7.3 Interpolation from general evaluation sets
In Section 6.7 we showed how to interpolate in quasi-linear time from evaluations on basic sets. The
following result, the converse of the previous Theorem 7.2, obtains quasi-linear running time (with somewhat
worse parameters) for interpolating from general evaluation sets.
Theorem 7.4 (Polynomial interpolation from general evaluation sets). Let $B \subseteq \mathbb{F}_q$ be a set of $m$ points.
There is an arithmetic circuit over $\mathbb{F}_q$ (depending on $B$) of size
\[ O(m \log^2 m) \]
which takes as input an evaluation table $\langle \pi \wr B \rangle$ and computes the coefficients $a_i$ of the unique polynomial of
degree $< m$:
\[ P(X) = \sum_{i=0}^{m-1} a_i X^i, \]
such that $\langle P \wr B \rangle = \langle \pi \wr B \rangle$.
Proof Sketch. Since this algorithm is roughly the opposite of the previous algorithm, instead of applying
MOD in each recursive step as done above, we use CRT to do fast Chinese remaindering to compute $\langle P \wr U \rangle$
for a basic set $U$, followed by calling $\mathrm{EXIT}(\langle P \wr U \rangle)$ to get the desired standard polynomial representation.
The running time and analysis are similar to that of the previous Theorem 7.2.
Acknowledgements
Some of this research was done while SK was visiting StarkWare in 2019. SK is grateful to StarkWare
for the warm hospitality and the electrifying atmosphere.
References
[ABR99] Michel Abdalla, Mihir Bellare, and Phillip Rogaway. DHAES: An encryption scheme
based on the diffie-hellman problem. Cryptology ePrint Archive, Report 1999/007, 1999.
https://eprint.iacr.org/1999/007.
[BBHR18] Eli Ben-Sasson, Iddo Bentov, Yinon Horesh, and Michael Riabzev. Fast reed-solomon interactive
oracle proofs of proximity. In Ioannis Chatzigiannakis, Christos Kaklamanis, Dániel Marx, and
Donald Sannella, editors, ICALP, volume 107 of LIPIcs, pages 14:1–14:17. Schloss Dagstuhl -
Leibniz-Zentrum für Informatik, 2018.
[BBHR19] Eli Ben-Sasson, Iddo Bentov, Yinon Horesh, and Michael Riabzev. Scalable zero knowledge with
no trusted setup. In Alexandra Boldyreva and Daniele Micciancio, editors, CRYPTO, volume
11694 of Lecture Notes in Computer Science, pages 701–732. Springer, 2019.
[BCKL21] Eli Ben-Sasson, Dan Carmon, Swastik Kopparty, and David Levit. Elliptic Curve Fast Fourier
Transform Part II: FRI and STARK over all finite fields. In preparation, 2021.
[BCR+19] Eli Ben-Sasson, Alessandro Chiesa, Michael Riabzev, Nicholas Spooner, Madars Virza, and
Nicholas P. Ward. Aurora: Transparent succinct arguments for R1CS. In Yuval Ishai and Vincent
Rijmen, editors, Advances in Cryptology - EUROCRYPT 2019 - 38th Annual International
Conference on the Theory and Applications of Cryptographic Techniques, Darmstadt, Germany,
May 19-23, 2019, Proceedings, Part I, volume 11476 of Lecture Notes in Computer Science,
pages 103–128. Springer, 2019.
[BCS97] Peter Bürgisser, Michael Clausen, and M. Amin Shokrollahi. Algebraic Complexity Theory.
Springer-Verlag, Berlin, 1997.
[BF03] Dan Boneh and Matthew K. Franklin. Identity-based encryption from the weil pairing. SIAM
J. Comput., 32(3):586–615, 2003.
[BM74] Allan Borodin and R. Moenck. Fast modular transforms. J. Comput. Syst. Sci, 8(3):366–386,
1974.
[BS08] Eli Ben-Sasson and Madhu Sudan. Short PCPs with polylog query complexity. SIAM J. Comput,
38(2):551–607, 2008.
[Can89] David G Cantor. On arithmetical algorithms over finite fields. Journal of Combinatorial Theory,
Series A, 50(2):285–300, 1989.
[CK91] Cantor and Kaltofen. On fast multiplication of polynomials over arbitrary algebras. ACTAINF:
Acta Informatica, 28, 1991.
[COS20] Alessandro Chiesa, Dev Ojha, and Nicholas Spooner. Fractal: Post-quantum and transparent
recursive proofs from holography. In Anne Canteaut and Yuval Ishai, editors, Advances in
Cryptology - EUROCRYPT 2020 - 39th Annual International Conference on the Theory and
Applications of Cryptographic Techniques, Zagreb, Croatia, May 10-14, 2020, Proceedings, Part
I, volume 12105 of Lecture Notes in Computer Science, pages 769–793. Springer, 2020.
[CT65] J. M. Cooley and J. W. Tukey. An algorithm for the machine calculation of complex fourier
series. Math. Comp., 19:297, 1965.
[Deu41] Max Deuring. Die typen der multiplikatorenringe elliptischer funktionenkörper. Abhandlungen
aus dem Mathematischen Seminar der Universität Hamburg, 14(1):197–272, Dec 1941.
[DKSS08] Anindya De, Piyush P. Kurur, Chandan Saha, and Ramprasad Saptharishi. Fast integer
multiplication using modular arithmetic. In STOC ’08: proceedings of the 39th Annual
ACM Symposium on Theory of Computing, Victoria, British Columbia, Canada, May 17–20,
2008, pages 499–506. ACM Press, 2008.
[Für07] Martin Fürer. Faster integer multiplication. In STOC’07, pages 57–66, 2007.
[GM10] Shuhong Gao and Todd Mateer. Additive fast fourier transforms over finite fields. IEEE
Transactions on Information Theory, 56(12):6265–6272, 2010.
[HJB85] Michael T Heideman, Don H Johnson, and C Sidney Burrus. Gauss and the history of the fast
fourier transform. Archive for history of exact sciences, 34(3):265–277, 1985.
[Hor72a] E. Horowitz. Errata: A fast method for interpolation with preconditioning. Information
Processing Letters, 1(5):216, October 1972.
[Hor72b] Ellis Horowitz. A fast method for interpolation using preconditioning. Information Processing
Letters, 1(4):157–163, June 1972.
[HvdH19a] David Harvey and Joris van der Hoeven. Faster polynomial multiplication over finite fields using
cyclotomic coefficient rings. Journal of Complexity, 54:101404, 2019.
[HvdH19b] David Harvey and Joris van der Hoeven. Polynomial multiplication over finite fields in time
O(nlog n). Technical report, HAL, 2019. http://hal.archives-ouvertes.fr/hal-02070816.
[HvdH21] David Harvey and Joris van der Hoeven. Integer multiplication in time o (n log n). Annals of
Mathematics, 193(2):563–617, 2021.
[HvdHL17] David Harvey, Joris van der Hoeven, and Grégoire Lecerf. Faster polynomial multiplication over
finite fields. Journal of the ACM (JACM), 63(6):1–23, 2017.
[Jou04] Antoine Joux. A one round protocol for tripartite diffie-hellman. J. Cryptology, 17:263–276,
2004.
[Kob87] N. Koblitz. Elliptic curve cryptosystems. Mathematics of Computation, 48:203–209, 1987.
[Lat18] S. Lattès. Sur l’itération des substitutions rationnelles et les fonctions de Poincaré. C. R. Acad.
Sci., Paris, 166:26–28, 1918.
[LCH14] Sian-Jheng Lin, Wei-Ho Chung, and Yunghsiang S. Han. Novel polynomial basis and its
application to reed-solomon erasure codes. In 2014 IEEE 55th Annual Symposium on Foundations of
Computer Science, pages 316–325, 2014.
[Len87] H. W. Lenstra. Factoring integers with elliptic curves. Annals of Mathematics, 126(3):649–673,
1987.
[Mil86] Victor S. Miller. Use of elliptic curves in cryptography. In Hugh C. Williams, editor, Advances
in Cryptology — CRYPTO ’85 Proceedings, pages 417–426, Berlin, Heidelberg, 1986. Springer
Berlin Heidelberg.
[Mon85] Peter L. Montgomery. Modular multiplication without trial division. Mathematics of
Computation, 44(170):519–521, 1985.
[Pol71] John M Pollard. The fast fourier transform in a finite field. Mathematics of computation,
25(114):365–374, 1971.
[Pol74] J. M. Pollard. Theorems on factorization and primality testing. Mathematical Proceedings of
the Cambridge Philosophical Society, 76(3):521–528, 1974.
[Pos11] Alexey Pospelov. Faster polynomial multiplication via discrete fourier transforms. In Alexander S.
Kulikov and Nikolay K. Vereshchagin, editors, CSR, volume 6651 of Lecture Notes in
Computer Science, pages 91–104. Springer, 2011.
[Sch77] A. Schönhage. Fast multiplication of polynomials over fields of characteristic 2. Acta Inf.,
7(4):395–398, 1977.
[Sch85] René Schoof. Elliptic curves over finite fields and the computation of square roots mod p.
Mathematics of Computation, 44(170):483–494, 1985.
[Sil07] Joseph H Silverman. The arithmetic of dynamical systems, volume 241. Springer Science &
Business Media, 2007.
[Sil09] Joseph H. Silverman. The Arithmetic of Elliptic Curves. Graduate texts in mathematics.
Springer, Dordrecht, 2nd edition, 2009.
[SS71] Arnold Schönhage and Volker Strassen. Schnelle multiplikation großer zahlen. Computing,
7(3-4):281–292, 1971.
[SS17] Igor E. Shparlinski and Andrew V. Sutherland. Finding elliptic curves with a subgroup of
prescribed size. International Journal of Number Theory, 13(1):133–152, February 2017.
[Sta21] StarkWare. ethstark documentation. Cryptology ePrint Archive, Report 2021/582, 2021.
https://eprint.iacr.org/2021/582.
[Van92] Scott Vanstone. Responses to nist’s proposal. Communications of the ACM, pages 50–52, 7
1992.
[Vél71] Jacques Vélu. Isogénies entre courbes elliptiques. Comptes-Rendus de l’Académie des Sciences,
Série I, 273:238–241, juillet 1971.
[vzGG13] Joachim von zur Gathen and Jürgen Gerhard. Modern Computer Algebra (3. ed.). Cambridge
University Press, 2013.
[Was08] Lawrence C. Washington. Elliptic Curves: Number Theory and Cryptography, Second Edition.
Chapman & Hall/CRC, 2 edition, 2008.
[Wat69] William C. Waterhouse. Abelian varieties over finite fields. Annales scientifiques de l’École
Normale Supérieure, Ser. 4, 2(4):521–560, 1969.
[Wil95] Andrew Wiles. Modular elliptic curves and fermat’s last theorem. Annals of Mathematics,
141(3):443–551, 1995.
A Proof of the decomposition lemma 3.1
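Lemma 3.1 (Decomposition). Let $\psi(X) \in \mathbb{F}_q(X)$ be a rational map given by:
\[ \psi(X) = \frac{u(X)}{v(X)}, \]
where $u(X), v(X) \in \mathbb{F}_q[X]$ are relatively prime polynomials. Let $\delta = \deg(\psi) = \max\{\deg(u), \deg(v)\}$. Let $d$
be a multiple of $\delta$. Then for every $P(X) \in V_d$, there is a unique tuple:
\[ (P_0(X), P_1(X), \ldots, P_{\delta-1}(X)) \in (V_{d/\delta})^{\delta} \]
such that:
\[ P(X) = \left( \sum_{i=0}^{\delta-1} X^i \cdot P_i(\psi(X)) \right) \cdot v(X)^{\frac{d}{\delta}-1}. \tag{2} \]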
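Proof. For general $P_i(Y) \in V_{d/\delta}$, where
\[ P_i(Y) = \sum_{j=0}^{d/\delta-1} a_{ij} Y^j, \]
consider the polynomial
\[ P(X) = \left( \sum_{i=0}^{\delta-1} X^i P_i(\psi(X)) \right) \cdot v(X)^{\frac{d}{\delta}-1}. \]
Observe that
\begin{align*}
P(X) &= \sum_{i=0}^{\delta-1} X^i P_i(u(X)/v(X)) \cdot v(X)^{\frac{d}{\delta}-1}
      = \sum_{i=0}^{\delta-1} X^i \sum_{j=0}^{d/\delta-1} a_{ij} (u(X)/v(X))^j \cdot v(X)^{\frac{d}{\delta}-1} \\
     &= \sum_{i=0}^{\delta-1} \sum_{j=0}^{d/\delta-1} a_{ij} X^i u(X)^j \cdot v(X)^{\frac{d}{\delta}-1-j}, \tag{11}
\end{align*}
and thus $P(X) \in V_d$. We shall use the following claim, proved below:
Claim A.1. For every choice of $P_0(X), P_1(X), \ldots, P_{\delta-1}(X) \in V_{d/\delta}$, not all $P_i$ being zero, the polynomial:
\[ P(X) = \left( \sum_{i=0}^{\delta-1} X^i P_i(\psi(X)) \right) \cdot v(X)^{\frac{d}{\delta}-1} \]
is nonzero.
Together with the fact that the dimension of $(V_{d/\delta})^{\delta}$ equals the dimension of $V_d$, the lemma follows.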
Proof of Claim A.1. Reordering the right hand side of Eq. (11) gives
\[ P(X) = \sum_{j=0}^{d/\delta-1} Q_j(X) u(X)^j v(X)^{d/\delta-1-j}, \]
where $Q_j(X) = \sum_{i=0}^{\delta-1} a_{ij} X^i$ is a polynomial of degree $< \delta$.
Since $\deg(\psi) = \delta$, we have that either $\deg(u(X)) = \delta$ or $\deg(v(X)) = \delta$. Suppose $\deg(u(X)) = \delta$, the
other case being similar.
The assumption that not all $P_i(X)$ are zero implies that not all $Q_j(X)$ are zero, so let $j_0$ be the minimal
integer such that $Q_{j_0}(X)$ is a nonzero polynomial. Then $P(X)$ is divisible by $u(X)^{j_0}$, and
\[ \frac{P(X)}{u(X)^{j_0}} = \sum_{j=j_0}^{d/\delta-1} Q_j(X) u(X)^{j-j_0} v(X)^{d/\delta-1-j}. \]
Finally, we observe this polynomial is nonzero modulo $u(X)$, since modulo $u(X)$ it equals:
\[ Q_{j_0}(X) \cdot v(X)^{d/\delta-1-j_0}, \]
$v(X)$ is invertible modulo $u(X)$ (since $v(X)$ is relatively prime to $u(X)$), and $Q_{j_0}(X)$ is nonzero modulo
$u(X)$ because it is a nonzero polynomial of degree strictly less than $\delta$. This implies that $P(X)$ is a nonzero
polynomial, completing the proof of the claim.
B Proofs from Section 4
B.1 Proof of Proposition 4.1
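Proposition 4.1. Let $\varphi : E \to E'$ be an isogeny between two curves in extended Weierstrass form. Then,
in coordinates, we may write
\[ \varphi(x, y) = (\psi(x), \xi(x, y)), \]
where $\psi : \mathbb{P}^1 \to \mathbb{P}^1$ is a rational function. Equivalently, if $\pi : E \to \mathbb{P}^1$, $\pi' : E' \to \mathbb{P}^1$ are the $x$-projection
maps in each curve, then there exists a unique rational function $\psi$ such that the diagram
\[
\begin{array}{ccc}
E & \xrightarrow{\ \varphi\ } & E' \\
\Big\downarrow{\scriptstyle \pi} & & \Big\downarrow{\scriptstyle \pi'} \\
\mathbb{P}^1 & \xrightarrow{\ \psi\ } & \mathbb{P}^1
\end{array}
\]
is commutative.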
Proof. As noted in Section 4.1.2, $\pi'(Q) = \pi'(-Q)$ for all points $Q \in E'$. In fact, this equality holds for all the
points of $E'(\overline{\mathbb{F}}_q)$, the set of all solutions of the curve equation in the algebraic closure of $\mathbb{F}_q$ (i.e., considering
all solutions over all the finite field extensions of $\mathbb{F}_q$). The composition $\pi' \circ \varphi : E \to \mathbb{P}^1$ can be represented
as an element of $\mathbb{F}_q(X)[Y]/F(X, Y)$, where $F(X, Y) = 0$ is the equation that defines $E$ (see Eq. (6)). Notice
that $\mathbb{F}_q(X)[Y]/F(X, Y)$ is a degree 2 extension field of $\mathbb{F}_q(X)$ because $F$ is a degree 2 polynomial in $Y$ with
coefficients in $\mathbb{F}_q(X)$, so we can write $\pi' \circ \varphi(X, Y) = \psi(X) + Y \cdot \chi(X)$ for some $\psi, \chi \in \mathbb{F}_q(X)$. We know that
$\varphi$ is a group homomorphism, so $\pi'(\varphi(-Q)) = \pi'(-\varphi(Q)) = \pi'(\varphi(Q))$ for all points $Q \in E(\overline{\mathbb{F}}_q)$. In particular,
since $Q$ and $-Q$ have the same $x$ coordinate but different $y$ coordinates (unless $Q = -Q$), we get $\chi(x) = 0$
for every $x$ coordinate of a point in $E(\overline{\mathbb{F}}_q)$, except for at most 4 points (see [Sil09, Exercise 3.7] or [Was08,
Example 2.5]). Since there are infinitely many such points over the algebraic closure $\overline{\mathbb{F}}_q$, we conclude that $\chi$
is the constant 0 function and $\pi' \circ \varphi(x, y) = \psi(x)$, or equivalently $\pi' \circ \varphi = \psi \circ \pi$.
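A concrete instance of the proposition is the doubling map $\varphi = [2]$, an isogeny from $E$ to itself. The sketch below makes simplifying assumptions that are not from the paper: it uses a short Weierstrass curve $y^2 = x^3 + ax + b$ over a small prime field rather than the extended Weierstrass form, and the parameters and function names are hypothetical. It compares the $x$-coordinate of $2P$ obtained from the tangent-line group law (which uses $y$) with the standard rational function $\psi(x) = \frac{x^4 - 2ax^2 - 8bx + a^2}{4(x^3 + ax + b)}$ (which does not).

```python
def x_of_double_via_group_law(x, y, a, p):
    """x-coordinate of 2P computed with the tangent slope; depends on y."""
    lam = (3 * x * x + a) * pow(2 * y, -1, p) % p
    return (lam * lam - 2 * x) % p


def psi(x, a, b, p):
    """x-coordinate of 2P as a rational function of x alone."""
    num = (x**4 - 2 * a * x * x - 8 * b * x + a * a) % p
    den = 4 * (x**3 + a * x + b) % p
    return num * pow(den, -1, p) % p


def check_doubling(a, b, p):
    """Compare both formulas on every affine point with y != 0; return the number checked."""
    assert (4 * a**3 + 27 * b**2) % p != 0, "curve must be nonsingular"
    checked = 0
    for x in range(p):
        rhs = (x**3 + a * x + b) % p
        for y in range(1, p):  # y = 0 gives 2-torsion points, where 2P is the point at infinity
            if y * y % p == rhs:
                assert x_of_double_via_group_law(x, y, a, p) == psi(x, a, b, p)
                checked += 1
    return checked


# Hypothetical parameters: y^2 = x^3 + 2x + 3 over F_101.
points_checked = check_doubling(2, 3, 101)
```

The points with $y = 0$ are skipped because they satisfy $Q = -Q$; these are the exceptional points mentioned in the proof.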
B.2 Existence of an appropriate $G_0$ in the proof of Theorem 4.9
Recall that we have constructed a curve $E_0$ of size $N$ with $K \mid N$ and $N > 2K$, where $K$ is a power of 2.
Our goal in this section is to show that there exists a subgroup $G_0 < E_0$ which is of size $K$, and such that
there exists a coset $C$ of $G_0$ with $C \neq -C$.
As noted in Section 4.1.4, $E_0$ is of rank at most 2, and there is an isomorphism
\[ \tau : E_0 \leftrightarrow \mathbb{Z}/(m_1 2^{l_1} \mathbb{Z}) \times \mathbb{Z}/(m_2 2^{l_2} \mathbb{Z}) \]
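where $m_1, m_2$ are odd with $m_1 \mid m_2$, $l_1 \le l_2$, $m_1 m_2 2^{l_1 + l_2} = N$ and in particular $l_1 + l_2 \ge k$. A subgroup $G_0$
of size $K$ will necessarily be of the form
\[ G_0 = \tau^{-1}\Bigl( (m_1 2^{l_1 - k_1} \mathbb{Z})/(m_1 2^{l_1} \mathbb{Z}) \times (m_2 2^{l_2 - k_2} \mathbb{Z})/(m_2 2^{l_2} \mathbb{Z}) \Bigr) \simeq \mathbb{Z}/2^{k_1}\mathbb{Z} \times \mathbb{Z}/2^{k_2}\mathbb{Z} \]
with $k_1 \le l_1$, $k_2 \le l_2$ and $k_1 + k_2 = k$, and the quotient $E_0/G_0$ is then isomorphic to
\[ E_0/G_0 \simeq \mathbb{Z}/(m_1 2^{l_1 - k_1} \mathbb{Z}) \times \mathbb{Z}/(m_2 2^{l_2 - k_2} \mathbb{Z}). \]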
We wish to ensure that this group contains an element $C$ such that $C \neq -C$, or equivalently, $2C \neq 0$.
This is clearly the case for any choice of $k_1, k_2$, except if $m_1 = m_2 = 1$ and $l_1 - k_1, l_2 - k_2 \le 1$, which
are the cases where $E_0/G_0$ is isomorphic to either the trivial group, $\mathbb{Z}/2\mathbb{Z}$, or $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$. But since
$m_1 m_2 2^{l_1 - k_1 + l_2 - k_2} = \frac{N}{K} > 2$, this happens only when $N = 4K$ and for the choice $k_1 = l_1 - 1$ and $k_2 = l_2 - 1$.
But, by the assumption $K > 1$ and by $l_2 \ge l_1$, we find $l_2 \ge 2$, thus we may choose instead $k_1 = l_1$ and
$k_2 = l_2 - 2$, to obtain $E_0/G_0 \simeq \mathbb{Z}/4\mathbb{Z}$, which indeed contains an element $C$ with $C \neq -C$.
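The case analysis above can be checked exhaustively for small parameters. The following sketch is not part of the paper: the function names are hypothetical, and it assumes $K = 2^k$ with $k \ge 1$, as the text suggests. It searches for a split $k_1 + k_2 = k$ whose quotient $\mathbb{Z}/n_1\mathbb{Z} \times \mathbb{Z}/n_2\mathbb{Z}$ has an element $C$ with $2C \neq 0$, which holds exactly when $n_1 > 2$ or $n_2 > 2$.

```python
def quotient_orders(m1, m2, l1, l2, k1, k2):
    """Orders (n1, n2) of the two cyclic factors of E0/G0 for a given split (k1, k2)."""
    return m1 * 2 ** (l1 - k1), m2 * 2 ** (l2 - k2)


def has_point_with_2c_nonzero(n1, n2):
    """Z/n1 x Z/n2 contains C with 2C != 0 iff some factor has order greater than 2."""
    return n1 > 2 or n2 > 2


def find_split(m1, m2, l1, l2, k):
    """Return (k1, k2) with k1 <= l1, k2 <= l2, k1 + k2 = k and a good quotient, else None.

    Larger k1 is tried first, so the choice k1 = l1 from the text is preferred.
    """
    for k1 in range(min(l1, k), -1, -1):
        k2 = k - k1
        if 0 <= k2 <= l2 and has_point_with_2c_nonzero(
            *quotient_orders(m1, m2, l1, l2, k1, k2)
        ):
            return k1, k2
    return None


def all_small_cases_ok(max_l=5):
    """Exhaustively check small parameters satisfying N > 2K and K > 1."""
    for m1 in (1, 3):
        for m2 in (1, 3, 9):
            if m2 % m1:
                continue  # require m1 | m2
            for l1 in range(max_l):
                for l2 in range(l1, max_l):
                    for k in range(1, l1 + l2 + 1):  # K = 2^k > 1 and K | N
                        n_over_k = m1 * m2 * 2 ** (l1 + l2 - k)
                        if n_over_k > 2 and find_split(m1, m2, l1, l2, k) is None:
                            return False
    return True


# The borderline case N = 4K with m1 = m2 = 1, l1 = 1, l2 = 2, k = 1:
borderline = find_split(1, 1, 1, 2, 1)
```

In the borderline case, the split $k_1 = l_1 - 1$, $k_2 = l_2 - 1$ gives quotient orders $(2, 2)$ and fails, while the search lands on $k_1 = l_1$, $k_2 = l_2 - 2$, matching the choice made in the text.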