
Lazy Evaluation in Infinite-Dimensional Function Spaces with Wavelet Basis

Justus Sagemüller

Faculty of Engineering and Science

Western Norway University of Applied Science

Bergen, Norway

jsag@hvl.no

Olivier Verdier

Faculty of Engineering and Science

Western Norway University of Applied Science

Bergen, Norway

olivier.verdier@hvl.no

Department of Mathematics

KTH-Royal Institute of Technology

Stockholm, Sweden

olivierv@kth.se

Abstract

Vectors in numerical computation, i.e., arrays of numbers, often represent continuous functions. We would like to reflect this with types. One apparent obstacle is that spaces of functions are typically infinite-dimensional, while the code must run in finite time and memory.

We argue that this can be overcome: even in an infinite-dimensional space, the vectors can in practice be stored in finite memory. However, dual vectors (corresponding essentially to distributions) require an infinite data structure. The distinction is usually lost in the finite-dimensional case, since dual vectors are often simply represented as vectors (by implicitly choosing a scalar product establishing the correspondence). However, we shall see that an explicit type-level distinction between functions and distributions makes sense and allows directly expressing useful concepts such as the Dirac distribution, which are problematic in the standard finite-resolution picture.

The example implementation uses a very simple local basis that corresponds to a Haar wavelet transform.

CCS Concepts • Mathematics of computing → Approximation; Distribution functions; Arbitrary-precision arithmetic; Discretization; Continuous functions; Coding theory; Integral equations; • Computing methodologies → Representation of mathematical functions; Image compression; • Applied computing → Mathematics and statistics.

Keywords wavelet, multiresolution, lazy evaluation

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
FHPNC '19, August 18, 2019, Berlin, Germany
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-6814-8/19/08…$15.00
https://doi.org/10.1145/3331553.3342615

ACM Reference Format:
Justus Sagemüller and Olivier Verdier. 2019. Lazy Evaluation in Infinite-Dimensional Function Spaces with Wavelet Basis. In Proceedings of the 8th ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing (FHPNC '19), August 18, 2019, Berlin, Germany. ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3331553.3342615

1 Introduction

Consider the unit interval D¹ = [−1, 1] ⊂ ℝ. This paper discusses functions on that domain, but the methods are written so as to straightforwardly generalise to unbounded multidimensional domains.

The set ℝ^D¹ of functions D¹ → ℝ is a vector space, but it is, in a sense, too big. To wit, this set contains "pathological" functions that take at every point, even at points very close to each other, a completely unrelated value. (A classical example is the function that assigns each rational number the value 1, each irrational 0.) Pathological functions like that are largely irrelevant for real-world applications; yet not only does D¹ → ℝ contain such functions, in a cardinality sense most of its elements are pathological. Therefore, for the purposes of an efficient computer implementation, sets of all functions on some domain are quite impractical.

It is helpful and established practice in maths and science to consider subspaces containing only "better behaved" functions. For instance,

• C⁰(D¹): continuous functions. These can be characterised thus: to obtain any function value f(x) with at least precision ε, one can instead consider f(x̃) where x̃ needs to be merely close enough to x, i.e. within a distance δ_{x,ε}. By making the range of inputs, i.e. the δ, sufficiently small, one can also limit the range ε of results. So, unlike for general functions, it is not necessary to know the argument exactly; it is enough to know it with good accuracy: then the function value will also be known accurately.

• C¹(D¹): continuously differentiable functions. Whilst continuous functions merely guarantee that f(x̃) does not deviate too much from f(x) at nearby x̃, differentiability also makes some tangible statement of how it deviates.

Figure 1. Example of how a function f: D¹ → ℝ is represented in PCM.

• L²(D¹): square-integrable functions. These do not restrict what the function does at individual points, but they guarantee that the behaviour in a whole interval can be "summarised" by the integral over it. Functions that are indistinguishable by any such integration over small intervals are considered equivalent.

Remark 1.1. Continuity has a straightforward physical interpretation. One could argue that a function representing an observable value on a continuous domain must be continuous, at least almost everywhere, as a physical setup is never completely exact.

2 The Space of PCM-Sampled Functions

An extremely common way of representing continuous or L² functions in computer programs is what we will in the following call the PCM representation¹. The idea is to divide the domain into equal-sized segments, and store in each of these segments only one function value. That can be done with a simple array, or list.

newtype PCM_D¹ y = PCM_D¹ {
    getPCMSampling :: [y] }

¹The abbreviation "PCM" stands for pulse code modulation. The term is used in digital signal processing for a signal in time domain that is sampled on equal time intervals with a digital value proportional to the analogue voltage it represents. We use "DSP" as a shorthand for general equal-spaced sampling. In applications of numerical differential equations, this might rather be referred to as a finite differences (FD) representation.

where the y values correspond to equally-spaced samples of the represented function, i.e.

  [ f(−1 + (2/n)·i) | i ← [1/2 .. n − 1/2] ].
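For concreteness, the sampling above can be sketched as follows (a minimal, Double-specialised illustration; the name `pcmSample` and the plain-Double domain are assumptions for this sketch, not the paper's API):

```haskell
-- Midpoint PCM sampling of f on D¹ = [-1,1] at resolution n.
-- Hypothetical helper, specialised to Double for illustration.
pcmSample :: Int -> (Double -> Double) -> [Double]
pcmSample n f =
  [ f (-1 + 2 / fromIntegral n * i)
  | i <- [0.5, 1.5 .. fromIntegral n - 0.5] ]
```

E.g. `pcmSample 2 id` yields the midpoints of the two segments, [-0.5, 0.5].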

There is a body of theory, the Shannon–Nyquist sampling formalism, that makes PCM a quite solid method for representing a subspace of L², namely the subspace of bandlimited functions. We will not discuss this aspect here.

The space of PCM-sampled functions is a vector space, and many applications make use of this fact all the time: array languages like Matlab and NumPy implement addition, multiplication and more on arrays, and provided that all functions are sampled with the same resolution, this corresponds perfectly to the conceptually intended extensional addition of functions, i.e.

h = PCM_D¹ $ zipWith (+)
      (getPCMSampling f) (getPCMSampling g)

⇔ h(x) = f(x) + g(x).

This does not hold up anymore if f and g are sampled at different resolutions. Often, this is simply prevented altogether (made a runtime error or, somewhat better, a type error with indexed-length vector types).

In some applications this is not a problem. For example:

• In DSP, the input resolution is fixed by a physical ADC, and subsequent calculations (LTI filters etc.) simply leave it as-is or perhaps double the sampling. Nevertheless, it can be desirable to combine signals from different sources with different sampling rates, which then requires suitable resampling.

• Explicit integrators for hyperbolic PDEs iterate a resolution-preserving propagator transformation involving the pointwise value of the old state and a numerical approximation (e.g. finite differences) of its spatial derivatives. This calculation will depend on the given resolution, but there is no need to precompute and store it in something like a matrix that would need to be valid for any of the possible resolutions: even if a change of resolution should be required, the transformation can just be recomputed from scratch.
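The resampling mentioned in the DSP case can be sketched, under strong simplifying assumptions, as nearest-neighbour upsampling by sample duplication (the helpers here are hypothetical; real applications would use a proper interpolation filter):

```haskell
-- Duplicate each sample: a coarse signal at resolution n becomes resolution 2n.
upsample2 :: [y] -> [y]
upsample2 = concatMap (\y -> [y, y])

-- Combine a coarse signal with one sampled at twice the resolution.
addMixed :: Num y => [y] -> [y] -> [y]
addMixed coarse fine = zipWith (+) (upsample2 coarse) fine
```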

Many other applications, however, require coefficients to be stored before the required resolution is even known. This is necessary both for implicit methods / inverse problems – i.e. where there is no explicitly programmed way to deduce the result from an input, but rather an input needs to be "guessed" that will match the desired output² – and for calculations which simply require a stored-coefficient form

²In practice, implicit methods use heuristics to choose a suitable resolution. A common approach is to just match the data count of the input to that of the intended output, which is reasonable enough since inversion is clearly dependent on some notion of isomorphism – however, even an isomorphism does not necessarily preserve the highest, locally required resolution, but can "concentrate" data in given spots. This can again be taken into account with more heuristics to adaptively refine the mesh, or the extra detail can just be smoothed out (which can be useful to keep computation effort limited in nonlinear PDE solvers; however, it is mathematically not really a correct solution then).


for efficiency. Doing that in a general manner which would work regardless of what resolution turns out to be necessary in the end would amount to choosing an infinite resolution.

Experience with Haskell suggests that the problem of computing infinitely many coefficients should be solvable with lazy evaluation, provided only finitely many of them actually need to be evaluated in the end. However, a data structure like PCM_D¹ is not suitable for this, since adding new coefficients to the end of the list would change the meaning of the already calculated ones (they would be squeezed to the left of the domain). What is required is some form of progressive resolution.

3 Multiscale Resolution

The main usefulness of spaces like L² is that the infinite dimensionality can be managed easily if one only needs finite-width integrals over the function, as these average out small-scale fluctuations. To benefit from this, we must find a way to compute the integral without evaluating all the small-scale structure. That is the idea behind multiscale or wavelet methods. These are often derived as some orthonormal basis of L², but we can also give a construction more from first principles.

To directly enable the O(1) evaluation of large-scale integrals, one might consider the following representation of D¹ → y functions:

data PreIntg_D¹ y = PreIntg
  { offset :: y
  , lSubstructure :: PreIntg_D¹ y
  , rSubstructure :: PreIntg_D¹ y
  }

The idea is to decompose a function into a constant offset (proportional to the integral ∫_{D¹} dx f(x)) plus finer-grained fluctuations in each half of the domain, which are in turn recursively represented by the same type.

  f_{(y₀,f_l,f_r)}(x) = y₀ + f_l(x_l)   if x on the left half,
  f_{(y₀,f_l,f_r)}(x) = y₀ + f_r(x_r)   if x on the right half.

The data structure PreIntg_D¹ defined above is a binary tree which always has infinite size. This can be handled in a language with lazy evaluation like Haskell, but only if all that is ever requested from the function are integrals over finite-extent subintervals. Pointwise evaluation would recurse infinitely.
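To illustrate the point about laziness: an infinite tree of this shape is unproblematic as long as only finitely much of it is demanded. The following self-contained sketch (ASCII names and a Double specialisation are assumptions for illustration) builds an infinite tree and queries only its top-level integral:

```haskell
-- An infinite PreIntg-style tree; under lazy evaluation, finite queries
-- such as the total integral still terminate.
data PreIntg y = PreIntg { offset :: y
                         , lSub   :: PreIntg y
                         , rSub   :: PreIntg y }

-- Infinite tree representing the constant function c on D¹:
constTree :: Double -> PreIntg Double
constTree c = PreIntg c (constTree 0) (constTree 0)

-- Total integral over D¹ = [-1,1] (length 2): O(1), never forces the subtrees.
totalIntegral :: PreIntg Double -> Double
totalIntegral t = 2 * offset t
```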

Note also that it is not really a function from D¹ to y if it cannot be evaluated at any individual point in D¹. In practice, for any given real-world measured function, there will be only finitely many data points available at any given moment, so at a sufficiently small scale one would eventually store only the offset, and assume that any smaller fluctuations are negligible.

This eventual cutoff can be implemented by wrapping the substructure fields in Maybe. Here we define instead a specific constructor for the zero function (to improve the semantics of the generic Nothing constructor). We now obtain a conventional tree with finite depth. Note that it can now also be strictly evaluated (here enforced by the exclamation marks).

data PreIntg_D¹ y
  = PreIntgZero
  | PreIntg !y !(PreIntg_D¹ y)
               !(PreIntg_D¹ y)

Pointwise function evaluation is then readily implemented

recursively:

evalPreIntg_D¹ :: AdditiveGroup y
     => PreIntg_D¹ y -> D¹ -> y
evalPreIntg_D¹ PreIntgZero _ = 0
evalPreIntg_D¹ (PreIntg y0 l r) x
   = y0 + if x<0
      then evalPreIntg_D¹ l (2*x+1)
      else evalPreIntg_D¹ r (2*x-1)

Here, 2*x+1 or 2*x-1 "zoom in" onto the left or right half subinterval, depending on where x lies.

One downside of the data type PreIntg_D¹ is that it is redundant: the offset already fixes what the integral over the complete function should be, but there is nothing preventing the subinterval functions from contributing their own part to the integral. One solution is to use a type which represents functions having vanishing integrals. This means there can be no global offset; instead, the highest-level structure is the offset difference between the domain halves. This changes nothing about the data structure, just about its meaning:

data HaarUnbiased y
  = HaarZero
  | HaarUnbiased !y !(HaarUnbiased y)
                    !(HaarUnbiased y)

Here, the y value now represents the difference in offset between the left and right halves, or, by our convention, the offset in the right half and, implicitly, the negated offset in the left half (which must be the same for the integral to vanish). One can support functions with nonzero integral by simply adding an absolute offset with a wrapper type at the top level:

data Haar_D¹ y = Haar_D¹
  { global_offset :: !y
  , variation :: HaarUnbiased y }

The name "Haar" indicates that the basis functions of this data type (meaning those functions represented when exactly one of the fields of type ℝ in the HaarUnbiased ℝ structure is 1, all others zero) are exactly the unnormalised Haar wavelets.
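Pointwise evaluation of this representation combines the global offset with the sign-flipped fluctuation offsets along the path to x. Here is a self-contained sketch, specialised to Double and with ASCII names (the paper's `evalHaarFunction` presumably works along these lines, but the details are assumptions of this sketch):

```haskell
data HaarUnbiased = HaarZero
                  | HaarUnbiased !Double HaarUnbiased HaarUnbiased

data HaarD1 = HaarD1 { globalOffset :: !Double
                     , variation    :: HaarUnbiased }

evalHaar :: HaarD1 -> Double -> Double
evalHaar (HaarD1 y0 fluct) x = y0 + go fluct x
 where go HaarZero _ = 0
       go (HaarUnbiased d l r) x'
         | x' < 0    = -d + go l (2*x' + 1)  -- left half: negated offset
         | otherwise =  d + go r (2*x' - 1)  -- right half: offset d
```

For example, with f = HaarD1 1 (HaarUnbiased 0.5 HaarZero HaarZero), evalHaar f yields 0.5 on the left half and 1.5 on the right.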

Figure 2. Example of how a function f: D¹ → ℝ is decomposed into a constant offset, plus a step function (Haar wavelet) for the offset difference between left and right half, plus local fluctuations in each of the halves.

4 Integration

Whether using traditional orthonormal-basis methods, or the domain-decomposition approach introduced in section 3, the numerical representations are obtained from integrating the function on an interval. If the function is given as an analytic expression, it might be possible to calculate this integral exactly, but this is typically impossible, and one resorts to numerical approximations. All such approximations amount to some weighted average of the sample points:

  ∫_{D¹} dx f(x) ≈ Σ_i w_i · f(x_i)

with x_i ∈ D¹ and Σ_i w_i = 1. The choice of the evaluation points x_i and weights w_i is subject to considerations of efficiency and accuracy which we will not discuss here.

Crucially for our purposes, the calculation can be split up across the domain just like the recursive HaarUnbiased data structure is:

  ∫_{D¹} dx f(x) = (1/2)·∫_{D¹} dx f((x−1)/2) + (1/2)·∫_{D¹} dx f((x+1)/2).

Observe that (x−1)/2 only evaluates f on the left half of the domain, (x+1)/2 only on the right half.

This allows constructing the HaarUnbiased tree in a single pass with bottom-up propagation of the partial integrals, to obtain the offset estimates at each level without redundant computation. The choice of numerical approximation only occurs at the smallest level; the simplest possibility is to evaluate the function at a single point in the middle and give that full weight (rectangular method). Thus the Haar representation can be obtained in O(n·log n) from a function on the interval:

homsampleHaar_D¹ :: ( VectorSpace y
                    , Fractional (Scalar y) )
       => PowerOfTwo -> (D¹ -> y) -> Haar_D¹ y
homsampleHaar_D¹ (TwoToThe 0) f
    = Haar_D¹ (f 0) HaarZero
homsampleHaar_D¹ (TwoToThe i) f
    = case homsampleHaar_D¹ (TwoToThe $ i-1)
             <$> [ f . \x -> (x-1)/2
                 , f . \x -> (x+1)/2 ] of
        [Haar_D¹ y0l sfl, Haar_D¹ y0r sfr]
           -> Haar_D¹ ((y0l+y0r)/2)
               $ HaarUnbiased ((y0r-y0l)/2) sfl sfr

This algorithm is in DSP called a fast wavelet transform, except that it normally starts out with a PCM-sampled array instead of a function-to-be-sampled. One advantage of our approach is that it is not necessary to select one global maximum resolution (here the PowerOfTwo parameter); instead, a heuristic can be added that refines the resolution locally until the function is satisfactorily approximated.
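The single-pass construction can be condensed into a self-contained sketch (Double specialisation, ASCII names; the resolution exponent i mirrors the paper's PowerOfTwo parameter, all other details are assumptions of this sketch):

```haskell
data HaarUnbiased = HaarZero
                  | HaarUnbiased !Double HaarUnbiased HaarUnbiased
data HaarD1 = HaarD1 !Double HaarUnbiased

-- Sample f on D¹ with 2^i leaf intervals, rectangular rule at the leaves.
homsample :: Int -> (Double -> Double) -> HaarD1
homsample 0 f = HaarD1 (f 0) HaarZero
homsample i f = HaarD1 ((y0l + y0r)/2)
                       (HaarUnbiased ((y0r - y0l)/2) sl sr)
 where HaarD1 y0l sl = homsample (i-1) (f . \x -> (x-1)/2)
       HaarD1 y0r sr = homsample (i-1) (f . \x -> (x+1)/2)
```

E.g. homsample 1 (\x -> x) has global offset 0 (the mean of x over [−1,1]) and top fluctuation coefficient 0.5.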

Remark 4.1. Similar adaptive-resolution strategies often dramatically improve performance in real-world applications, such as physics/engineering simulations (finite element or finite volume methods, where they correspond to adaptive mesh refinement). It is also the main principle behind image compression formats which use quantization on a wavelet expansion. The reason is that images or solutions to nonlinear differential equations are often quite smooth in most of the domain, but include sharp edges / transients / shocks confined to a much smaller area.

5 Square-Integrable Functions and Beyond

The Haar_D¹ structure as given in section 3, with its strict spine and thus finite depth for every tree, cannot represent every D¹ function.³ It can, however, approximate any L² function arbitrarily well (in the L²-norm sense). Namely, given a function f, the sequence

[homsampleHaar_D¹ (TwoToThe n) f | n<-[0..]]

converges to f. This is much the same as for a PCM representation: improving the resolution will allow it to match an L² or continuous function ever better.

However, unlike with the PCM array representation, this progressive refinement of resolution does not change the top-level structure but only adds sub-branches at ever deeper levels in the trees. It is alluring to consider allowing the trees to have infinite depth, which can readily be had in Haskell

³Perhaps most strikingly, all these functions are discontinuous. Like with PCM, this could be "fixed" through interpolation post-processing, but in neither case does that make it possible to fit every function exactly.


if only we drop the strictness annotations in the data structure. Would that then allow representing any L² function exactly?

It does not quite work this way, at least not practically:

• evalPreIntg_D¹ would recurse infinitely. So even if the infinite tree would theoretically represent the desired function, it would not be possible to evaluate it as such in finite time.

• homsampleHaar_D¹, as it stands, would not be able to provide even the top-level node (i.e. the global integral), without first going into the local branches that are needed⁴ to compute it.

Mathematically speaking, an infinite tree would correspond to an infinite sum over ever smaller Haar wavelets. Infinite sums are possible to compute, provided they converge. What this means for practical computer applications is generally that the sum is not evaluated completely, but only the finite partial sum that suffices to achieve the required precision (this also applies to other convergent sequences, e.g. the Taylor expansion of an analytic function). In other words, if the results need not be exact, then a finite cutoff of the converging-sum infinite tree is also sufficient, which is why we suggest keeping the Haar_D¹ structure strict/finite.

On the other hand, the coefficients in an infinite tree are in principle not constrained in a way that would require convergence. Thus they can also represent things that are not D¹ → ℝ functions at all.

This is not as unreasonable as it may seem. In fact, particularly in physics, "limit functions" that are defined by a not-really-convergent sequence are quite commonly used, albeit often with a lack of mathematical explanation. Best known is the "Dirac function", informally defined as

  δ: ℝ → "ℝ ∪ {∞}"
  δ(x) = "∞"  if x = 0,
         0    otherwise.

The idea is that the integral should come out as 1, and crucially, for any other function g,

  ∫_ℝ dx δ(x)·g(x) = g(0)

should hold. This allows rewriting pointwise evaluations as integrals and vice versa. The integral over the product is also known as the L² scalar product.

The above integral equation is quite tractable in a limit sense: consider a sequence of ever narrower and higher box functions

  δ[n](x) = n   if −1/(2n) < x < 1/(2n),
            0   else.

Then

  ∫_ℝ dx δ[n](x)·g(x) = n · ∫_{−1/(2n)}^{1/(2n)} dx g(x).

If g is continuous, then on the ever-smaller integration interval it will eventually behave like the constant g₀ = g(0), and

  n · ∫_{−1/(2n)}^{1/(2n)} dx g₀ = g₀ = g(0).

⁴This could be overcome if we assume there is some other way of obtaining the target function's integral over a whole interval. However, if that is possible, e.g. because the function is given by an analytical expression, then one does not really need to resort to a numerical representation!

This shows that the idea is sensible. But δ as a function is not: the sequence (δ[n])_n does not converge, neither pointwise nor in the square-integral sense.

What does work about it is really just the contraction / scalar product with g.
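This limit behaviour is easy to check numerically. The following sketch (hypothetical helpers, not from the paper; a midpoint quadrature with a fixed fine subdivision stands in for the exact integral) contracts a unit-mass box of height n and width 1/n against a continuous g:

```haskell
-- n * integral of g over (-1/(2n), 1/(2n)): the contraction <δ[n], g>.
contractBox :: Int -> (Double -> Double) -> Double
contractBox n g = fromIntegral n * midpointIntegral (-w/2) (w/2) g
 where w = 1 / fromIntegral n

-- Simple midpoint-rule quadrature with 1000 subintervals.
midpointIntegral :: Double -> Double -> (Double -> Double) -> Double
midpointIntegral a b f =
    sum [ f (a + h * (fromIntegral k + 0.5)) * h | k <- [0 .. m-1] ]
  where m = 1000 :: Int
        h = (b - a) / fromIntegral m
```

contractBox n cos for n = 1, 10, 100 approaches cos 0 = 1, even though the boxes themselves diverge.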

By analogy, we propose that it would be useful to also consider infinite-depth trees, not as a representation for functions on D¹, but just for contracting against such functions:

data CoHaarUnbiased y
  = CoHaarZero
  | CoHaarUnbiased !y (CoHaarUnbiased y)
                      (CoHaarUnbiased y)
data CoHaar_D¹ y
  = CoHaar_D¹ !y (CoHaarUnbiased y)

Even though this is now non-strict, the following is guaranteed to terminate⁵, because Haar_D¹ has only finite depth and will thus eventually terminate the recursion:

(<.>^) :: CoHaar_D¹ ℝ -> Haar_D¹ ℝ -> ℝ
CoHaar_D¹ q0 qFluct <.>^ Haar_D¹ f0 fFluct
    = q0 * f0 + qFluct~<.>^fFluct
 where CoHaarZero ~<.>^ _ = 0
       _ ~<.>^ HaarZero = 0
       CoHaarUnbiased δq ql qr
         ~<.>^ HaarUnbiased δf fl fr
           = δq * δf + ql~<.>^fl + qr~<.>^fr

This looks similar enough to a scalar product: corresponding entries in both of the tree structures are multiplied, and the results summed together – like one would also do with the arrays of a PCM representation. In both cases there are some constant factors missing to make it the actual L² scalar product; we will ignore that here, since there is no interpretation of CoHaar_D¹ ℝ as a type of functions on D¹ anymore.

Rather, it has an interpretation as a type of functions on Haar_D¹ ℝ, i.e. (seeing those as D¹-functions) higher-order functions or functionals, specifically linear functionals. (Linearity is important because these themselves form a vector space, the dual space.)

⁵This is in striking analogy to the Agda programming language, in which data types are strict by default but there is also co-data (coinductive types) allowing for infinite streams.
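As a down-to-earth check of that linear-functional reading, here is a finite sketch of the contraction with one tree shape standing in for both HaarUnbiased and its dual (Double specialisation; the names are assumptions of this sketch): the pairing distributes over coefficient-wise addition of function trees, as a linear functional must.

```haskell
data Tree = Zero | Node !Double Tree Tree

-- The contraction, structured exactly like the tilde-variant of (<.>^) above.
pair :: Tree -> Tree -> Double
pair Zero _ = 0
pair _ Zero = 0
pair (Node q ql qr) (Node f fl fr) = q*f + pair ql fl + pair qr fr

-- Coefficient-wise addition of function trees.
addT :: Tree -> Tree -> Tree
addT Zero t = t
addT t Zero = t
addT (Node a al ar) (Node b bl br) = Node (a+b) (addT al bl) (addT ar br)
```

With this, pair q (addT f g) == pair q f + pair q g for any trees q, f, g.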


The Dirac distribution is a very particular example of a linear functional that can indeed be implemented as a value of CoHaar_D¹ ℝ. What it does – evaluation at a single point – is a special case of evaluation over a whole interval and averaging: essentially also what the elements of the δ[n] sequence do, but because there is no explicit integral, there is no need to scale up the height to infinity as the width is reduced.

boxDistribution :: (D¹, D¹) -- ^ Target interval
                -> ℝ        -- ^ Total weight
                -> CoHaar_D¹ ℝ
boxDistribution (D¹ l, D¹ r) y
  | l > r  = boxDistribution (D¹ r, D¹ l) y
boxDistribution (D¹ (-1), D¹ 1) y
   = CoHaar_D¹ y zeroV
boxDistribution (D¹ l, D¹ r) y
  | l<0, r>0   -- intersecting both halves of the domain
     = CoHaar_D¹ y $ CoHaarUnbiased (wr-wl) lstru rstru
  | l<0        -- target intersects only the left half
     = CoHaar_D¹ y $ CoHaarUnbiased (-wl) lstru zeroV
  | otherwise  -- target intersects only the right half
     = CoHaar_D¹ y $ CoHaarUnbiased wr zeroV rstru
 where CoHaar_D¹ wl lstru = boxDistribution
          (D¹ $ l*2 + 1, D¹ $ min 0 r*2 + 1)
          (y * if r>0 then l/(l-r) else 1)
       CoHaar_D¹ wr rstru = boxDistribution
          (D¹ $ max 0 l*2 - 1, D¹ $ r*2 - 1)
          (y * if l<0 then r/(r-l) else 1)

The tree generated this way will in general have infinite depth, in order to select the desired interval boundaries exactly; i.e. this really requires the non-strict data structure. However, the distribution will only narrow in on this selection provided that the function on which we evaluate has any structure at that level. When the function is eventually constant, only the top-level coefficient is evaluated (as that corresponds to the integral, which is what is sought here).

Furthermore, boxDistribution itself only builds up the infinitely fine resolution where it is actually required, i.e., at the boundaries of the target interval: on those subdivisions that are fully in the interval, again only the top-level coefficient of the function needs to be evaluated, whereas outside of the target the result will simply be zero. Thus, the tree has only two long branches, and <.>^ has a complexity of only O(log n) in the resolution of the function which the distribution is contracted against. (Compare this with a PCM implementation, where a box distribution would need to contain O(n) nonzero entries, all of which would need to be evaluated.)

Finally, all of this works even if the target "interval" actually has zero width:

dirac :: D¹ -> CoHaar_D¹ ℝ
dirac x0 = boxDistribution (x0,x0) 1

That implementation of the Dirac distribution does indeed evaluate functions of arbitrary resolution at one point. We have tested this with QuickCheck:

testProperty "Dirac eval of Haar function"
  $ \f p -> dirac p<.>^f ~= evalHaarFunction f p

There, the QuickCheck Arbitrary instance generates arbitrarily deep tree structures, and picks any point on D¹ for evaluation. The ~= operator checks equality up to floating-point inaccuracy (in our test suite, the relative error is set to 10⁻⁹).

Note that the behaviour of both evalHaarFunction and dirac is strictly speaking undefined at the discontinuities created by the Haar representation, but the implementations shown here are in agreement. At any rate this seems safe as long as the Haar_D¹ function is an approximation of a continuous function, because then the jumps have only very small height.

6 Tensor Products and Linear Maps

One of the most salient aspects of the dual-space implementation is that it allows for a storable implementation of arbitrary linear mappings.

newtype LinearMap v w = LinearMap
  (TensorProduct (DualVector v) w)

The TensorProduct for a parameterised type like Haar_D¹ and CoHaar_D¹ – generally, for any functor in the category of vector spaces⁶ – is simply given by instantiating the parameter with the right factor space.

type instance Scalar y ~ ℝ
  => TensorProduct (CoHaar_D¹ ℝ) w = CoHaar_D¹ w

So specifically, LinearMap (Haar_D¹ ℝ) (Haar_D¹ ℝ) is represented by a distribution of functions, i.e. by values of the type CoHaar_D¹ (Haar_D¹ ℝ). This type is important because it would be the type of the identity linear mapping, which is required for Haar_D¹ ℝ to be a member of an actual category, and a prerequisite for generalising several linear algebra algorithms from the finite-Euclidean case to infinite-dimensional spaces like Haar_D¹ ℝ. And practically speaking, if id is defined, then it is easy to sample/convert any linear function (defined as a Haskell function) into a tensor-based linear mapping.

id is another reason why CoHaar_D¹ must be non-strict: the identity mapping needs to use an infinite tree in order to properly handle functions with arbitrarily high resolution. Concretely,

id :: LinearMap (Haar_D¹ ℝ) (Haar_D¹ ℝ)
id = LinearMap $ CoHaar_D¹
      (Haar_D¹ 1 zeroV)
      (fmap (\δ -> Haar_D¹ 0 δ) idUnbiased)
 where idUnbiased :: TensorProduct (CoHaarUnbiased ℝ)
                                   (HaarUnbiased ℝ)
       idUnbiased = CoHaarUnbiased
          (HaarUnbiased 1 zeroV zeroV)
          (fmap (\l -> HaarUnbiased 0 l zeroV) idUnbiased)
          (fmap (\r -> HaarUnbiased 0 zeroV r) idUnbiased)

⁶They are in fact also functors in the Hask category, but we recommend keeping that instance a private implementation detail, because fmapping a nonlinear function is not invariant under the choice of basis, i.e. it is not safe with respect to refactoring to other representations.

7 Outlook

Although the Haar-wavelet-expansion type presented in this paper provides a useful starting point for numerical calculations on infinite-dimensional spaces, the fact that the represented functions are inherently discontinuous step functions limits its usefulness for actual numerical applications. The step functions certainly are not differentiable.

Even for pointwise evaluation alone, the piecewise-constant structure means that it is relatively inefficient at approximating continuous functions: the discretisation error ε = f_exact − f_Haar reduces proportionally to the step size δ, i.e. anti-proportionally to the required tree size. The adaptiveness of resolution can somewhat mitigate this (regions with small gradients have low ε to begin with, so it is sufficient to focus on those with stronger gradient or even discontinuity); however, this is limited unless the function really is constant on most of the domain.

By contrast, piecewise-linear functions can scale as ε ∝ δ², cubic ones as ε ∝ δ⁴, and so on. Thus it would be desirable to combine such a higher-order local model with the tree-based multiscale structure. Comparison with wavelet theory suggests that this may not be as straightforward as it is to employ polynomial interpolation for a PCM sampling or for a finite-elements model. Namely, the simplest piecewise-linear orthogonal wavelet is not a hat function but the rather complicated Strömberg wavelet.

However, our approach has the advantage that it does not actually rely on L²-orthogonality, but uses domain decomposition and direct value read-off for its sampling process. Therefore it is plausible that the "mother wavelet" can be kept much more basic. In particular, a very simple way to construct a continuous function from a Haar-based one is through integration. Because the complete integral can always be evaluated in O(1), this would also still allow efficient random-access pointwise evaluation of the continuous function, unlike the integral of a PCM-sampled function (which would be O(n) for single-point evaluation).
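The O(1) total integral used here follows directly from the representation: the fluctuation part integrates to zero by construction, so only the global offset contributes. A minimal sketch (ASCII names and Double specialisation are assumptions for illustration):

```haskell
data HaarUnbiased = HaarZero | HaarUnbiased !Double HaarUnbiased HaarUnbiased
data HaarD1 = HaarD1 !Double HaarUnbiased

-- Integral over all of D¹ = [-1,1] (length 2); never inspects the fluctuations.
totalIntegral :: HaarD1 -> Double
totalIntegral (HaarD1 y0 _) = 2 * y0
```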

Another important generalisation will be multi-dimensional domains. In fact, Haar_D¹ already supports those in a sense, because a function vector space on a product domain is isomorphic to the tensor product of the function spaces on the factor domains, i.e. Haar_D¹ (Haar_D¹ ℝ) represents functions on D¹×D¹. However, this would largely circumvent the locality properties of the Haar expansions (since nearby points in y-direction would lie in completely different trees of the decomposition in x-direction). An efficient implementation would probably need to intersperse the direction-splittings, to give a kind of k-d-tree structure in space and a Morton Z-order of the leaves.

In summary: we have presented a data structure that can express function types in a way that better represents the mathematical (functional-analysis) notion of such an infinite-dimensional space than the mainstream numerical expansions do. It combines features of techniques from established numerical schemes (wavelets from multiscale analysis, tree backbones from Barnes–Hut-style simulations, parametricity/tensor products from numerical linear algebra), and we expect that it can be extended to be of similar practical use while being more mathematically general and transparent.
