# Lazy Evaluation in Infinite-Dimensional Function Spaces with Wavelet Basis

## Abstract

Vectors in numerical computation, i.e., arrays of numbers, often represent continuous functions. We would like to reflect this with types. One apparent obstacle is that spaces of functions are typically infinite-dimensional, while the code must run in finite time and memory. We argue that this can be overcome: even in an infinite-dimensional space, the vectors can in practice be stored in finite memory. However, dual vectors (corresponding essentially to distributions) require infinite data structure. The distinction is usually lost in the finite dimensional case, since dual vectors are often simply represented as vectors (by implicitly choosing a scalar product establishing the correspondence). However, we shall see that an explicit type-level distinction between functions and distributions makes sense and allows directly expressing useful concepts such as the Dirac distribution, which are problematic in the standard finite-resolution picture. The example implementation uses a very simple local basis that corresponds to a Haar Wavelet transform.
Justus Sagemüller
Faculty of Engineering and Science
Western Norway University of Applied Science
Bergen, Norway
jsag@hvl.no
Olivier Verdier
Faculty of Engineering and Science
Western Norway University of Applied Science
Bergen, Norway
olivier.verdier@hvl.no
Department of Mathematics
KTH-Royal Institute of Technology
Stockholm, Sweden
olivierv@kth.se
CCS Concepts • Mathematics of computing → Approximation; Distribution functions; Arbitrary-precision arithmetic; Discretization; Continuous functions; Coding theory; Integral equations; • Computing methodologies → Representation of mathematical functions; Image compression; • Applied computing → Mathematics and statistics.

Keywords wavelet, multiresolution, lazy evaluation
FHPNC ’19, August 18, 2019, Berlin, Germany
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-6814-8/19/08…\$15.00
https://doi.org/10.1145/3331553.3342615
ACM Reference Format:
Justus Sagemüller and Olivier Verdier. 2019. Lazy Evaluation in Infinite-Dimensional Function Spaces with Wavelet Basis. In Proceedings of the 8th ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing (FHPNC ’19), August 18, 2019, Berlin, Germany. ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3331553.3342615
## 1 Introduction

Consider the unit interval $D^1 = [-1, 1] \subset \mathbb{R}$. This paper discusses functions on that domain, but the methods are written so as to generalise straightforwardly to unbounded multidimensional domains.

The set $\mathbb{R}^{D^1}$ of functions $D^1 \to \mathbb{R}$ is a vector space, but it is, in a sense, too big. To wit, this set contains “pathological” functions that have in every point, even in points very close to each other, a completely unrelated value. (A classical example is the function that assigns each rational number the value 1, each irrational 0.) Pathological functions like that are largely irrelevant for real-world applications; yet not only does $D^1 \to \mathbb{R}$ contain such functions, in a cardinality sense most of them are pathological. Therefore, for the purposes of an efficient computer implementation, sets of all functions on some domain are quite impractical.
It is helpful and established practice in maths and science to consider subspaces containing only “better behaved” functions. For instance,

• $C^0(D^1)$: continuous functions. These can be characterised thus: to obtain any function value $f(x)$ with at least precision $\varepsilon$, one can instead consider $f(\tilde{x})$, where $\tilde{x}$ merely needs to be close enough to $x$, i.e. within a distance $\delta_{x,\varepsilon}$. By making the range of inputs, i.e. the $\delta$, sufficiently small, one can also limit the range $\varepsilon$ of results. So, unlike for general functions, it is not necessary to know the argument exactly; it is enough to know it with good accuracy: then the function value will also be known accurately.
• $C^1(D^1)$: continuously differentiable functions. Whilst continuous functions merely guarantee that $f(\tilde{x})$ does not deviate too much from $f(x)$ at nearby $\tilde{x}$, differentiability also makes some tangible statement of how it deviates.
• $L^2(D^1)$: square-integrable functions. These do not restrict what the function does at individual points, but they guarantee that the behaviour in a whole interval can be “summarised” by the integral over it. Functions that are indistinguishable by any such integration over small intervals are considered equivalent.

Figure 1. Example of how a function $f : D^1 \to \mathbb{R}$ is represented in PCM (plotted: $f(x)$ over $x \in [-1, 1]$ together with its PCM sampling for $n = 12$).
Remark 1.1. Continuity has a straightforward physical interpretation. One could argue that a function representing an observable value on a continuous domain must be continuous, at least almost everywhere, as a physical setup is never completely exact.
## 2 The Space of PCM-Sampled Functions

An extremely common way of representing continuous or $L^2$ functions in computer programs is what we will in the following call the PCM representation¹. The idea is to divide the domain into equal-sized segments, and store in each of these segments only one function value. That can be done with a simple array, or list.
    newtype PCM_D¹ y = PCM_D¹ {
      getPCMSampling :: [y] }

where the y values correspond to equally-spaced samples of the represented function, i.e.

$$ f\bigl(-1 + \tfrac{2}{n} \cdot i\bigr), \qquad i \in \bigl[\tfrac{1}{2} \,..\, n - \tfrac{1}{2}\bigr]. $$

¹ The abbreviation “PCM” stands for pulse code modulation. The term is used in digital signal processing (DSP) for a signal in time domain that is sampled on equal time intervals with a digital value proportional to the analogue voltage it represents. We use “PCM” as a shorthand for general equal-spaced sampling. In applications of numerical differential equations, this might rather be referred to as a finite differences (FD) representation.
There is a body of theory, the Shannon-Nyquist formalism, that makes PCM a quite solid method for representing a subspace of $L^2$, the subspace of bandlimited functions. We will not discuss this aspect here.

The space of PCM-sampled functions is a vector space, and many applications make use of this fact all the time: array languages like Matlab and NumPy implement addition, multiplication and more on arrays, and provided that all functions are sampled with the same resolution, this corresponds perfectly to the conceptually intended extensional definition:

    h = PCM_D¹ \$ zipWith (+)
          (getPCMSampling f) (getPCMSampling g)

$$ h(x) = f(x) + g(x). $$
This does not hold up anymore if $f$ and $g$ are sampled at different resolutions. Often, this is simply prevented altogether (made a runtime error or, somewhat better, a type error with length-indexed vector types).
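The restriction to equal resolutions can be made concrete with a small self-contained sketch. The names PcmD1, samplePcm and addPcm are illustrative ASCII stand-ins, not part of the code shown in this paper, and reporting a resolution mismatch as Nothing is an assumption made here for brevity.

```haskell
-- Illustrative ASCII stand-in for the PCM_D¹ type above.
newtype PcmD1 y = PcmD1 { getPcmSampling :: [y] }
  deriving (Eq, Show)

-- Sample a function on [-1,1] at the midpoints of n equal segments,
-- i.e. at x = -1 + (2k+1)/n for k = 0 .. n-1.
samplePcm :: Int -> (Double -> y) -> PcmD1 y
samplePcm n f = PcmD1
  [ f (-1 + (2 * fromIntegral k + 1) / fromIntegral n) | k <- [0 .. n-1] ]

-- Pointwise addition that refuses operands of different resolution
-- (zipWith alone would silently truncate to the shorter list).
addPcm :: Num y => PcmD1 y -> PcmD1 y -> Maybe (PcmD1 y)
addPcm (PcmD1 xs) (PcmD1 ys)
  | length xs == length ys = Just (PcmD1 (zipWith (+) xs ys))
  | otherwise              = Nothing
```

For example, addPcm (samplePcm 4 id) (samplePcm 4 (const 1)) yields the samples [0.25, 0.75, 1.25, 1.75], whereas combining a 4-sample with an 8-sample signal yields Nothing.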
In some applications this is not a problem: for example

• In DSP, the input resolution is fixed by a physical ADC, and subsequent calculations (LTI filters etc.) simply leave it as-is or perhaps double the sampling. Nevertheless, it can be desirable to combine signals from different sources with different sampling rates, which then requires suitable resampling.
• Explicit integrators for hyperbolic PDEs iterate a resolution-preserving propagator-transformation involving the pointwise value of the old state and a numerical approximation (e.g. finite differences) of its spatial derivatives. This calculation will depend on the given resolution, but there is no need to precompute and store it in something like a matrix that would need to be valid for any of the possible resolutions: even if a change of resolution should be required, the transformation can just be recomputed from scratch.
Many other applications, however, require coefficients to be stored before the required resolution is even known. This is necessary both for implicit methods / inverse problems (i.e. where there is no explicitly programmed way to deduce the result from an input, but rather an input needs to be “guessed” that will match the desired output²) and for calculations which simply require a stored-coefficient form for efficiency. Doing that in a general manner which would work regardless of what resolution turns out to be necessary in the end would amount to choosing an infinite resolution.

Experience with Haskell suggests that the problem of computing infinitely many coefficients should be solvable with lazy evaluation, provided only finitely many of them actually need to be evaluated in the end. However, a data structure like PCM_D¹ is not suitable for this, since adding new coefficients to the end of the list would change the meaning of the already calculated ones (they would be squeezed to the left of the domain). What is required is some form of progressive resolution.

² In practice, implicit methods use heuristics to choose a suitable resolution. A common approach is to just match the data count of the input to that of the intended output, which is reasonable enough since inversion is clearly dependent on some notion of isomorphism. However, even an isomorphism does not necessarily preserve the highest, locally required resolution, but can “concentrate” data in given spots. This can again be taken into account with more heuristics to adaptively refine the mesh, or the extra detail can just be smoothed out (which can be useful to keep computation effort limited in nonlinear PDE solvers; however, it is mathematically not really a correct solution then).

## 3 Multiscale Resolution

The main usefulness of spaces like $L^2$ is that the infinite dimensionality can be managed easily if one only needs finite-width integrals over the function, as these average out small-scale fluctuations. To benefit from this, we must find a way to compute the integral without evaluating all the small-scale structure. That is the idea behind multiscale or wavelet methods. These are often derived as some orthonormal basis of $L^2$, but we can also give a construction more from first principles.

To directly enable the $O(1)$ evaluation of large-scale integrals, one might consider the following representation of D¹ → y functions:

    data PreIntg_D¹ y = PreIntg
      { offset :: y
      , lSubstructure :: PreIntg_D¹ y
      , rSubstructure :: PreIntg_D¹ y
      }

The idea is to decompose a function into a constant offset (proportional to the integral $\int_{D^1} \mathrm{d}x\, f(x)$) plus finer-grained fluctuations in each half of the domain, which are in turn recursively represented by the same type.

$$ f_{(y_0, f_l, f_r)}(x) = y_0 + \begin{cases} f_l(x_l) & \text{if $x$ on left} \\ f_r(x_r) & \text{if $x$ on right} \end{cases} $$

Here $x_l = 2x + 1$ and $x_r = 2x - 1$ are the coordinates rescaled to the left and right half, respectively.

The data structure PreIntg_D¹ defined above is a binary tree which always has infinite size. This can be handled in a language with lazy evaluation like Haskell, but only if all that is ever requested from the function are integrals over finite-extent subintervals. Pointwise evaluation would recurse infinitely.

Note also that it is not really a function from D¹ to y if it cannot be evaluated at any individual point in D¹. In practice, for any given real-world measured function, there will be only finitely many data points available at any given moment, so at a sufficiently small scale one would eventually store only the offset, and assume that any smaller fluctuations are negligible.
This eventual cutoff can be implemented by wrapping the substructure fields in Maybe. Here we define instead a specific constructor for the zero function (to improve on the semantics of the generic Nothing constructor). We now obtain a conventional tree with finite depth. Note that it can now also be strictly evaluated (here enforced by the exclamation marks).

    data PreIntg_D¹ y
      = PreIntgZero
      | PreIntg !y !(PreIntg_D¹ y)
                   !(PreIntg_D¹ y)
Pointwise function evaluation is then readily implemented recursively:

    -- A Num y constraint is assumed here, as needed for the literal 0.
    evalPreIntg_D¹ :: Num y
      => PreIntg_D¹ y -> D¹ -> y
    evalPreIntg_D¹ PreIntgZero _ = 0
    evalPreIntg_D¹ (PreIntg y0 l r) x
      = y0 + if x<0
              then evalPreIntg_D¹ l (2*x+1)
              else evalPreIntg_D¹ r (2*x-1)

Here, 2*x+1 or 2*x-1 “zoom in” onto the left or right half subinterval, depending on where x lies.
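To see the recursion at work, the following self-contained sketch evaluates a small hand-built tree. It is only an illustration: the ASCII names PreIntg, evalPreIntg and exampleTree are stand-ins chosen here, and points of the domain are represented as plain Double values rather than a dedicated D¹ type.

```haskell
-- Finite-depth pre-integrated tree (ASCII stand-in for PreIntg_D¹).
data PreIntg y
  = PreIntgZero
  | PreIntg !y !(PreIntg y) !(PreIntg y)

-- Points of the domain [-1,1] are plain Doubles in this sketch.
evalPreIntg :: Num y => PreIntg y -> Double -> y
evalPreIntg PreIntgZero _ = 0
evalPreIntg (PreIntg y0 l r) x
  | x < 0     = y0 + evalPreIntg l (2*x + 1)  -- zoom into the left half
  | otherwise = y0 + evalPreIntg r (2*x - 1)  -- zoom into the right half

-- Offset 1 everywhere, plus 0.5 on the left half, plus a further 0.25
-- on the leftmost quarter only.
exampleTree :: PreIntg Double
exampleTree =
  PreIntg 1
    (PreIntg 0.5 (PreIntg 0.25 PreIntgZero PreIntgZero) PreIntgZero)
    PreIntgZero
```

With this tree, evaluation at -0.75, -0.25 and 0.5 gives 1.75, 1.5 and 1 respectively: each level contributes its offset only where the point falls into its subinterval.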
One downside of the PreIntg_D¹ type is that it is redundant: the offset already fixes what the integral over the complete function should be, but there is nothing preventing the subinterval functions from contributing their own part to the integral. One solution is to use a type which represents functions having vanishing integrals. This means there can be no global offset; instead the highest-level structure is the offset difference between the domain halves. This changes nothing about the shape of the data structure:

    data HaarUnbiased y
      = HaarZero
      | HaarUnbiased !y !(HaarUnbiased y)
                        !(HaarUnbiased y)
Here, the y value now represents the difference in offset between the left and right halves, or, by our convention, the offset in the right half and, implicitly, the negated offset in the left half (which must be the same for the integral to vanish). One can support functions with nonzero integral by simply adding an absolute offset with a wrapper type at the top level:

    data Haar_D¹ y = Haar_D¹
      { global_offset :: !y
      , variation :: HaarUnbiased y }

The name “Haar” indicates that the basis functions of this data type (meaning those functions represented when exactly one of the fields of type ℝ in the HaarUnbiased structure is 1, all others zero) are exactly the unnormalised Haar wavelets.
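Pointwise evaluation of this type is not spelled out in the text. A minimal sketch, following the sign convention just stated (the stored value is added on the right half and subtracted on the left), could look as follows. The ASCII names HaarD1, evalHaar and motherWavelet are stand-ins chosen here, and domain points are plain Double values.

```haskell
data HaarUnbiased y
  = HaarZero
  | HaarUnbiased !y !(HaarUnbiased y) !(HaarUnbiased y)

data HaarD1 y = HaarD1
  { globalOffset :: !y
  , variation    :: HaarUnbiased y }

-- Evaluate the zero-mean part: subtract the stored difference on the
-- left half, add it on the right half, then descend.
evalHaarUnbiased :: Num y => HaarUnbiased y -> Double -> y
evalHaarUnbiased HaarZero _ = 0
evalHaarUnbiased (HaarUnbiased d l r) x
  | x < 0     = evalHaarUnbiased l (2*x + 1) - d
  | otherwise = evalHaarUnbiased r (2*x - 1) + d

evalHaar :: Num y => HaarD1 y -> Double -> y
evalHaar (HaarD1 y0 v) x = y0 + evalHaarUnbiased v x

-- The top-level basis function: the unnormalised Haar mother wavelet.
motherWavelet :: HaarD1 Double
motherWavelet = HaarD1 0 (HaarUnbiased 1 HaarZero HaarZero)
```

Evaluating motherWavelet gives -1 on the left half and 1 on the right half of the domain, which is the familiar step shape of the Haar wavelet.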
Figure 2. Example of how a function $f : D^1 \to \mathbb{R}$ is decomposed into a constant offset, plus a step function (Haar wavelet) for the offset difference between left and right half, plus local fluctuations in each of the halves (plotted over $x \in [-1, 1]$: $f$, the offset, $\delta_{lr}$, $f_l$ and $f_r$).
## 4 Integration

Whether using traditional orthonormal-basis methods, or the domain-decomposition approach introduced in section 3, the numerical representations are obtained from integrating the function on an interval. If the function is given as an analytic expression, it might be possible to calculate this integral exactly, but this is typically impossible, and one resorts to numerical approximations. All such approximations amount to some weighted average of the sample points:

$$ \int_{D^1} \mathrm{d}x\, f(x) \approx \sum_i w_i \cdot f(x_i) $$

with $x_i \in D^1$, $\sum_i w_i = 1$ (the integral being understood as normalised by the size of the domain, i.e. as a mean value). The choice of the evaluation points $x_i$ and weights $w_i$ is subject to considerations of efficiency and accuracy which we will not discuss here.
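Crucially for our purposes, the calculation can be split up across the domain just like the recursive HaarUnbiased data structure is:

$$ \int_{D^1} \mathrm{d}x\, f(x) = \tfrac{1}{2} \int_{D^1} \mathrm{d}x\, f\bigl(\tfrac{x-1}{2}\bigr) + \tfrac{1}{2} \int_{D^1} \mathrm{d}x\, f\bigl(\tfrac{x+1}{2}\bigr) $$

Observe that $\tfrac{x-1}{2}$ only evaluates on the left, $\tfrac{x+1}{2}$ only on the right half of the domain.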
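The splitting identity follows from cutting the domain at $0$ and rescaling each half back to $D^1$. Spelled out as a routine substitution (with $x = \tfrac{u-1}{2}$ on the left half, $x = \tfrac{u+1}{2}$ on the right half, and $\mathrm{d}x = \tfrac{1}{2}\mathrm{d}u$ in both cases):

```latex
\begin{align*}
\int_{-1}^{1} \mathrm{d}x\, f(x)
  &= \int_{-1}^{0} \mathrm{d}x\, f(x) + \int_{0}^{1} \mathrm{d}x\, f(x) \\
  &= \tfrac{1}{2} \int_{-1}^{1} \mathrm{d}u\, f\bigl(\tfrac{u-1}{2}\bigr)
   + \tfrac{1}{2} \int_{-1}^{1} \mathrm{d}u\, f\bigl(\tfrac{u+1}{2}\bigr).
\end{align*}
```

The same relation holds for the mean value over the domain, since both sides are then simply divided by the length of $D^1$.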
This allows constructing the HaarUnbiased tree in a single pass with bottom-up propagation of the partial integrals, to obtain the offset estimates at each level without redundant computation. The choice of numerical approximation only occurs at the smallest level; the simplest possibility is to only evaluate it at a single point in the middle and give that full weight (rectangular method). Thus the Haar representation can be obtained in $O(n \cdot \log n)$ from a function on the interval:

    homsampleHaar_D¹ :: ( VectorSpace y
                        , Fractional (Scalar y) )
      => PowerOfTwo -> (D¹ -> y) -> Haar_D¹ y
    homsampleHaar_D¹ (TwoToThe 0) f
      = Haar_D¹ (f 0) HaarZero
    homsampleHaar_D¹ (TwoToThe i) f
      = case homsampleHaar_D¹ (TwoToThe \$ i-1)
               <\$> [ f . \x -> (x-1)/2
                   , f . \x -> (x+1)/2 ] of
          [Haar_D¹ y0l sfl, Haar_D¹ y0r sfr]
            -> Haar_D¹ ((y0l+y0r)/2)
                 \$ HaarUnbiased ((y0r-y0l)/2) sfl sfr

In DSP, this algorithm is called a fast wavelet transform, except that it normally starts out with a PCM-sampled array instead of a function. An advantage of the present approach is that it is not necessary to select one global maximum resolution (here the PowerOfTwo parameter); instead, a heuristic can be added that refines the resolution locally until the function is satisfactorily approximated.
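The uniform-resolution sampling can be tried out in a simplified, self-contained form. The sketch below is not the library code: it fixes the value type to Double, replaces the PowerOfTwo parameter by a plain Int counting the number of halvings, and uses ASCII names (HaarD1, homsampleHaar) chosen here.

```haskell
data HaarUnbiased
  = HaarZero
  | HaarUnbiased !Double !HaarUnbiased !HaarUnbiased
  deriving (Eq, Show)

data HaarD1 = HaarD1 !Double HaarUnbiased
  deriving (Eq, Show)

-- Sample f on [-1,1] with the given number of halvings, using the
-- midpoint (rectangular) rule at the finest level. The means of the two
-- halves propagate upwards as offset (average) and difference.
homsampleHaar :: Int -> (Double -> Double) -> HaarD1
homsampleHaar depth f
  | depth <= 0 = HaarD1 (f 0) HaarZero
  | otherwise  =
      let HaarD1 y0l sfl = homsampleHaar (depth - 1) (\x -> f ((x - 1) / 2))
          HaarD1 y0r sfr = homsampleHaar (depth - 1) (\x -> f ((x + 1) / 2))
      in  HaarD1 ((y0l + y0r) / 2)
                 (HaarUnbiased ((y0r - y0l) / 2) sfl sfr)
```

For the identity function, homsampleHaar 1 id gives a global offset of 0 and a top-level difference of 0.5 (the midpoint samples are -0.5 and 0.5). Going to depth 2 leaves these two numbers unchanged and only adds a difference of 0.25 in each half, which illustrates how refinement extends the tree without altering its upper levels.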
Remark 4.1. Similar adaptive resolution strategies often dramatically improve performance in real-world applications, such as physics/engineering simulations (finite element or finite volume methods, where they correspond to adaptive mesh refinement). It is also the main principle behind image compression formats which use quantization on a wavelet expansion. The reason is that images or solutions to nonlinear differential equations are often quite smooth in most of the domain, but include sharp edges / transients / shocks confined to a much smaller area.
## 5 Square-Integrable Functions and Beyond

The Haar_D¹ structure as given in section 3, with its strict spine and thus finite depth for every tree, cannot represent every D¹ function.³ It can approximate arbitrarily well any $L^2$ function (in the $L^2$-norm sense). Namely, given a function f, the sequence

    [homsampleHaar_D¹ (TwoToThe n) f | n<-[0..]]

converges to f. This is much the same for a PCM representation: improving the resolution will allow it to match an $L^2$ or continuous function ever better.

However, unlike with the PCM array representation, this progressive refinement of resolution does not change the top-level structure but only adds sub-branches at ever deeper levels in the trees. It is alluring to consider allowing the trees to be infinite, if only we drop the strictness annotations in the data structure. Would that then allow representing any $L^2$ function exactly?

It does not quite work this way, at least not practically:

• evalPreIntg_D¹ would recurse infinitely. So even if the infinite tree would theoretically represent the desired function, it would not be possible to evaluate it as such in finite time.
• homsampleHaar_D¹, as it stands, would not be able to provide even the top-level node (i.e. the global integral), without first going into the local branches that are needed⁴ to compute it.

³ Perhaps most strikingly, all these functions are discontinuous. Like with PCM, this could be “fixed” through interpolation post-processing, but in neither case does that make it possible to fit any function exactly.
Mathematically speaking, an infinite tree would correspond to an infinite sum over ever smaller Haar wavelets. Infinite sums are possible to compute, provided they converge. What this means for practical computer applications is generally that the sum is not evaluated completely, but only the finite partial sum that suffices to achieve the required precision (this also applies to other convergent sequences, e.g. the Taylor expansion of an analytic function). In other words, if the results need not be exact, then a finite cutoff of the converging-sum infinite tree is also sufficient, which is why we suggest keeping the Haar_D¹ structure strict/finite.

On the other hand, the coefficients in an infinite tree are in principle not constrained in a way that would require convergence. Thus they can also represent things that are not $D^1 \to \mathbb{R}$ functions at all.
This is not as unreasonable as it may seem. In fact, particularly in physics, “limit functions” that are defined by a not-really-convergent sequence are quite commonly used, albeit often with a lack of mathematical explanation. The best known is the “Dirac function”, informally defined as

$$ \delta : \mathbb{R} \to \mathbb{R} \cup \{\infty\}, \qquad \delta(x) = \begin{cases} \infty & \text{if } x = 0 \\ 0 & \text{otherwise.} \end{cases} $$

The idea is that the integral should come out as 1, and crucially, for any other function $g$,

$$ \int_{\mathbb{R}} \mathrm{d}x\, \delta(x) \cdot g(x) = g(0) $$

should hold. This allows rewriting pointwise evaluations as integrals and vice versa. The integral over the product is also known as the $L^2$ scalar product.
⁴ This could be overcome if we assume there is some other way of obtaining the target function’s integral over a whole interval. However, if that is possible, e.g. because the function is given by an analytical expression, then one does not really need to resort to a numerical representation!

The above integral equation is quite tractable in a limit sense: consider a sequence of ever narrower and higher box functions

$$ \delta_{[n]}(x) = \begin{cases} n & \text{if } -\tfrac{1}{2n} < x < \tfrac{1}{2n} \\ 0 & \text{else.} \end{cases} $$

Then

$$ \int_{\mathbb{R}} \mathrm{d}x\, \delta_{[n]}(x) \cdot g(x) = n \cdot \int_{-1/(2n)}^{1/(2n)} \mathrm{d}x\, g(x). $$

If $g$ is continuous, then it will on the ever-smaller integration interval eventually behave like the constant $g_0 = g(0)$, and

$$ n \cdot \int_{-1/(2n)}^{1/(2n)} \mathrm{d}x\, g_0 = g_0 = g(0). $$

This shows that the idea is sensible. But $\delta$ as a function is not: the sequence $(\delta_{[n]})_n$ does not converge, neither pointwise nor in the square-integral sense.
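The failure of convergence in the square-integral sense can be checked directly. For any box function of width $w$ and height $h$, with integral $A = h\,w$, the squared $L^2$ norm is

```latex
\[
  \int_{\mathbb{R}} \mathrm{d}x\, \bigl(\mathrm{box}_{w,h}(x)\bigr)^2
  \;=\; h^2\, w \;=\; \frac{A^2}{w}
  \;\longrightarrow\; \infty
  \qquad (w \to 0,\ A \text{ fixed}).
\]
```

A sequence of boxes with fixed integral and shrinking width therefore has unbounded $L^2$ norm, and an unbounded sequence cannot converge in $L^2$.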
What does work about it really is just the contraction / scalar product with $g$.

By analogy, we propose that it would be useful to also consider infinite-depth trees, not as a representation for functions on D¹ but just for contracting against such functions:

    data CoHaarUnbiased y
      = CoHaarZero
      | CoHaarUnbiased !y (CoHaarUnbiased y)
                          (CoHaarUnbiased y)
    data CoHaar_D¹ y
      = CoHaar_D¹ !y (CoHaarUnbiased y)
Even though this is now non-strict, the following is guaranteed to terminate⁵ because Haar_D¹ has only finite depth and will thus eventually terminate the recursion:

    (<.>^) :: CoHaar_D¹ ℝ -> Haar_D¹ ℝ -> ℝ
    CoHaar_D¹ q0 qFluct <.>^ Haar_D¹ f0 fFluct
      = q0 * f0 + qFluct~<.>^fFluct
     where CoHaarZero ~<.>^ _ = 0
           _ ~<.>^ HaarZero = 0
           CoHaarUnbiased δq ql qr
               ~<.>^ HaarUnbiased δf fl fr
            = δq * δf + ql~<.>^fl + qr~<.>^fr
This looks similar enough to a scalar product: corresponding entries in both of the tree structures are multiplied, and the results summed together, like one would also do with the arrays of a PCM representation. In both cases there are some constant factors missing to make it the actual $L^2$ scalar product; we will ignore that here since there is anyway no interpretation of CoHaar_D¹ as a type of functions on D¹ anymore.

Rather, it has an interpretation as a type of functions on Haar_D¹ ℝ, i.e. (seeing those as D¹-functions) higher-order functions or functionals, specifically linear functionals. (Linearity is important because these form themselves a vector space, the dual space.)

⁵ This is in striking analogy with the Agda programming language, in which data types are strict by default but there is also co-data (coinductive types) allowing for infinite streams.

The Dirac distribution is a very particular example of a linear functional that can indeed be implemented as a value of CoHaar_D¹ ℝ. What it does (evaluation at a single point) is a special case of evaluation over a whole interval and averaging: essentially also what the elements of the $\delta_{[n]}$ sequence do, but because there is no explicit integral there is no need to scale up the height to infinity as the width is reduced.
    boxDistribution :: (D¹, D¹)   -- ^ Target interval
                    -> ℝ          -- ^ Total weight
                    -> CoHaar_D¹ ℝ
    boxDistribution (D¹ l, D¹ r) y
      | l > r  = boxDistribution (D¹ r, D¹ l) y
    boxDistribution (D¹ (-1), D¹ 1) y
      = CoHaar_D¹ y zeroV
    boxDistribution (D¹ l, D¹ r) y
      | l<0, r>0   -- intersecting both halves of domain
         = CoHaar_D¹ y \$ CoHaarUnbiased (wr-wl) lstru rstru
      | l<0        -- target intersects only left half
         = CoHaar_D¹ y \$ CoHaarUnbiased (-wl) lstru zeroV
      | otherwise  -- target intersects only right half
         = CoHaar_D¹ y \$ CoHaarUnbiased wr zeroV rstru
     where CoHaar_D¹ wl lstru = boxDistribution
                 (D¹ \$ l*2 + 1, D¹ \$ min 0 r*2 + 1)
                 (y * if r>0 then l/(l-r) else 1)
           CoHaar_D¹ wr rstru = boxDistribution
                 (D¹ \$ max 0 l*2 - 1, D¹ \$ r*2 - 1)
                 (y * if l<0 then r/(r-l) else 1)
The tree generated this way will in general have infinite depth in order to select the desired interval with arbitrary delimiters, i.e. this really requires the non-strict data structure. However, the distribution will only narrow in on this selection provided that the function on which we evaluate has any structure at that level. When the function is eventually constant, only the top-level coefficient is evaluated (as that corresponds to the integral, which is what is sought here). Furthermore, boxDistribution itself only builds up the infinitely fine resolution where it is actually required, i.e., at the boundaries of the target interval: on those subdivisions that are fully in the interval, again only the top-level coefficient of the function needs to be evaluated, whereas outside of the target the result will simply be zero. Thus, the tree has only two long branches, and <.>^ has only a complexity of $O(\log n)$ in the resolution of the function which the distribution is contracted against. (Compare this with a PCM implementation, where a box distribution would need to contain $O(n)$ nonzero entries, all of which would need to be evaluated.)

Finally, all of this works even if the target “interval” actually has zero width:

    dirac :: D¹ -> CoHaar_D¹ ℝ
    dirac x0 = boxDistribution (x0,x0) 1

That implementation of the Dirac distribution does indeed evaluate functions of arbitrary resolution at one point. We have tested this with QuickCheck:

    testProperty "Dirac eval of Haar function"
      \$ \f p -> dirac p<.>^f ~= evalHaarFunction f p

There, the QuickCheck Arbitrary instance generates arbitrarily deep tree structures, and picks any point on D¹ for evaluation. The ~= operator checks equality up to floating-point inaccuracy (in our test suite, the relative error is set to $10^{-9}$).
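The agreement between Dirac contraction and pointwise evaluation can also be reproduced in a stripped-down, self-contained sketch. This is an illustration only: the types are specialised to Double, the names (Haar, CoHaar, contract, evalHaar, exampleFunction) are ASCII stand-ins chosen here, and the full-domain case is matched with inequalities rather than the exact pattern used above.

```haskell
-- Strict, finite trees: functions on [-1,1].
data HaarUnbiased = HaarZero
                  | HaarUnbiased !Double !HaarUnbiased !HaarUnbiased
data Haar = Haar !Double HaarUnbiased

-- Lazy, possibly infinite trees: linear functionals on Haar.
data CoHaarUnbiased = CoHaarZero
                    | CoHaarUnbiased !Double CoHaarUnbiased CoHaarUnbiased
data CoHaar = CoHaar !Double CoHaarUnbiased

evalHaar :: Haar -> Double -> Double
evalHaar (Haar y0 v) x0 = y0 + go v x0
  where
    go HaarZero _ = 0
    go (HaarUnbiased d l r) x
      | x < 0     = go l (2*x + 1) - d
      | otherwise = go r (2*x - 1) + d

-- Contraction of a functional with a function; the recursion stops as
-- soon as the finite Haar tree runs out.
contract :: CoHaar -> Haar -> Double
contract (CoHaar q0 q) (Haar f0 f) = q0 * f0 + go q f
  where
    go CoHaarZero _ = 0
    go _ HaarZero   = 0
    go (CoHaarUnbiased dq ql qr) (HaarUnbiased df fl fr)
      = dq * df + go ql fl + go qr fr

boxDistribution :: (Double, Double) -> Double -> CoHaar
boxDistribution (l, r) y
  | l > r                = boxDistribution (r, l) y
  | l <= (-1) && r >= 1  = CoHaar y CoHaarZero
  | l < 0 && r > 0       = CoHaar y (CoHaarUnbiased (wr - wl) lstru rstru)
  | l < 0                = CoHaar y (CoHaarUnbiased (-wl) lstru CoHaarZero)
  | otherwise            = CoHaar y (CoHaarUnbiased wr CoHaarZero rstru)
  where
    -- Lazy pattern bindings: each sub-distribution is built only on demand.
    CoHaar wl lstru = boxDistribution (l*2 + 1, min 0 r * 2 + 1)
                        (y * if r > 0 then l / (l - r) else 1)
    CoHaar wr rstru = boxDistribution (max 0 l * 2 - 1, r*2 - 1)
                        (y * if l < 0 then r / (r - l) else 1)

dirac :: Double -> CoHaar
dirac x0 = boxDistribution (x0, x0) 1

exampleFunction :: Haar
exampleFunction =
  Haar 1 (HaarUnbiased 2 (HaarUnbiased 0.5 HaarZero HaarZero) HaarZero)
```

With these definitions, contract (dirac (-0.75)) exampleFunction and evalHaar exampleFunction (-0.75) both give -1.5, even though dirac builds a conceptually infinite tree: only the levels that exampleFunction actually possesses are ever forced.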
Note that the behaviour of both evalHaarFunction and dirac is, strictly speaking, undefined at the discontinuities created by the Haar representation, but the implementations shown here are in agreement. At any rate this seems safe as long as the Haar_D¹ function is an approximation of a continuous function, because then the jumps have only very small height.
## 6 Tensor Products and Linear Maps

One of the most salient aspects of the dual space implementation is that it allows for a storable implementation of arbitrary linear mappings.

    newtype LinearMap v w = LinearMap
      (TensorProduct (DualVector v) w)

The TensorProduct for a parameterised type like Haar_D¹ and CoHaar_D¹ (generally, for any functor in the category of vector spaces⁶) is simply given by instantiating the parameter with the right factor space.

    type instance Scalar y ~ ℝ
      => TensorProduct (CoHaar_D¹ ℝ) w = CoHaar_D¹ w

So specifically, LinearMap (Haar_D¹ ℝ) (Haar_D¹ ℝ) is represented by a distribution of functions, i.e. by values of the type CoHaar_D¹ (Haar_D¹ ℝ). This type is important because it would be the type of the identity linear mapping, which is required for Haar_D¹ to be a member of an actual category and is a prerequisite for generalising several linear algebra algorithms from the finite-Euclidean case to infinite-dimensional spaces like Haar_D¹ ℝ. And practically speaking, if id is defined then it is easy to sample/convert any linear function (defined as a Haskell function) into a tensor-based linear mapping.
id is another reason why CoHaar_D¹ must be non-strict: the identity mapping needs to use an infinite tree in order to properly handle functions with arbitrarily high resolution. Concretely,

    id :: LinearMap (Haar_D¹ ℝ) (Haar_D¹ ℝ)
    id = LinearMap \$ CoHaar_D¹
           (Haar_D¹ 1 zeroV)
           (fmap (\δ -> Haar_D¹ 0 δ) idUnbiased)
     where idUnbiased :: TensorProduct (CoHaarUnbiased ℝ)
                                       (HaarUnbiased ℝ)
           idUnbiased = CoHaarUnbiased
             (HaarUnbiased 1 zeroV zeroV)
             (fmap (\l -> HaarUnbiased 0 l zeroV) idUnbiased)
             (fmap (\r -> HaarUnbiased 0 zeroV r) idUnbiased)

⁶ They are in fact also functors in the Hask category, but we recommend keeping that instance a private implementation detail because fmapping a nonlinear function is not invariant under the choice of basis, i.e. it is not safe with respect to refactoring to another representation.
## 7 Outlook

Although the Haar-wavelet-expansion type presented in this paper provides a useful starting point for numerical calculations on infinite-dimensional spaces, the fact that the represented functions are inherently discontinuous step functions limits its usefulness for actual numerical applications. The step functions certainly are not differentiable.

Even for pointwise evaluation alone, the piecewise constant structure means that it is relatively inefficient at approximating continuous functions; namely, the discretisation error $\varepsilon = \|f_\text{exact} - f_\text{Haar}\|$ reduces proportionally to the step size $\delta$, i.e. inversely proportionally to the required tree size. The adaptiveness of resolution can somewhat mitigate this (regions with small gradients have low $\varepsilon$ to begin with, so it is sufficient to focus on those with stronger gradient or even discontinuity); however, this is limited unless the function really is constant on most of the domain.
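The first-order behaviour can be made explicit with a standard estimate. Assume, for the purpose of this sketch, that $f$ is continuously differentiable and that each finest cell of width $\delta$ carries the value $f(x_c)$ sampled at its midpoint $x_c$ (as midpoint sampling produces). Then for any $x$ in that cell, by the mean value theorem,

```latex
\[
  \bigl| f(x) - f(x_c) \bigr|
  \;\le\; \max_{\xi} \bigl| f'(\xi) \bigr| \cdot | x - x_c |
  \;\le\; \frac{\delta}{2} \, \max_{\xi} \bigl| f'(\xi) \bigr| .
\]
```

Halving the error bound thus requires halving $\delta$, which at uniform resolution doubles the number of leaves in the tree.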
By contrast, piecewise linear functions can scale $\varepsilon \propto \delta^2$, cubic ones $\varepsilon \propto \delta^4$ and so on. Thus it would be desirable to combine such a higher-order local model with the tree-based multiscale structure. Comparison with wavelet theory suggests that this may not be as straightforward as it is to employ polynomial interpolation for a PCM sampling or for a finite elements model. Namely, the simplest piecewise-linear orthogonal wavelet is not a hat function but the rather complicated Strömberg wavelet.

However, our approach has the advantage that it does not actually rely on $L^2$-orthogonality, but uses domain decomposition and direct value readoff for its sampling process. Therefore it is plausible that the “mother wavelet” can be kept much more basic. In particular, a very simple way to construct a continuous function from a Haar-based one is through integration. Because the complete integral can always be evaluated in $O(1)$, this would also still allow efficient random-access pointwise evaluation of the continuous function, unlike the integral of a PCM-sampled function (which would be $O(n)$ for single-point evaluation).

Another important generalisation will be multi-dimensional domains. In fact, Haar_D¹ already supports those in a sense because a function vector space on a product domain is isomorphic to the tensor product of the function spaces on the factor domains, i.e. Haar_D¹ (Haar_D¹ ℝ) represents functions on $D^1 \times D^1$. However, this would largely circumvent the locality properties of the Haar expansions (since nearby points in y-direction would lie in completely different trees of the decomposition in x-direction). An efficient implementation would probably need to intersperse the direction-splittings, to give a kind of kd-tree structure in space and a Morton Z-order of the leaves.

In summary: we have presented a data structure that can express function types in a way that better represents the mathematical (functional-analysis) notion of such an infinite-dimensional space than the mainstream numerical expansions do. It combines features of techniques from established numerical schemes (wavelets from multiscale analysis, tree backbones from Barnes-Hut style simulations, parametricity/tensor-product from numerical linear algebra), and we expect that it can be extended to be of similar practical use while being more mathematically general and transparent.