
arXiv:astro-ph/9906115v1 7 Jun 1999


Your thesaurus codes are:

02(12.07.1; 03.13.1; 03.13.4)

ASTRONOMY AND ASTROPHYSICS

A fast direct method of mass reconstruction for gravitational lenses

M. Lombardi¹,² and G. Bertin²

¹ European Southern Observatory, Karl-Schwarzschild-Straße 2, D-85748 Garching bei München, Germany

² Scuola Normale Superiore, Piazza dei Cavalieri 7, I-56126 Pisa, Italy

Received 25 March 1999; accepted 17 May 1999

Abstract. Statistical analyses of observed galaxy distortions are often used to reconstruct the mass distribution of an intervening cluster responsible for gravitational lensing. In current projects, distortions of thousands of source galaxies have to be handled efficiently; much larger data bases and more massive investigations are envisaged for new major observational initiatives. In this article we present an efficient mass reconstruction procedure, a direct method that solves a variational principle noted in an earlier paper, which, for rectangular fields, turns out to reduce the relevant execution time by a factor from 100 to 1000 with respect to the fastest methods currently used, so that for grid numbers $N = 400$ the required CPU time on a good workstation can be kept within the order of 1 second. The acquired speed also opens the way to some long-term projects based on simulated observations (addressing statistical or cosmological questions) that would be, at present, practically not viable for intrinsically slow reconstruction methods.

Key words: cosmology: gravitational lensing – methods: analytical – methods: numerical

1. Introduction

In the context of weak or statistical lensing, the problem of the determination of the dimensionless mass density distribution $\kappa(\theta)$ from a map of the reduced shear $g(\theta)$ has been considered in detail by various authors, using either simulations or analytical calculations (e.g., Bartelmann 1995, Schneider 1995, Seitz & Schneider 1996, Squires & Kaiser 1996, Lombardi & Bertin 1998a, 1998b).

Send offprint requests to: M. Lombardi
Correspondence to: lombardi@sns.it

The mass inversion is usually performed starting from the vector field $\tilde{u}(\theta)$ defined in terms of the measured reduced shear $g(\theta)$ (Kaiser 1995). In the ideal case where the measured shear $g(\theta)$ is just the true shear $g_0(\theta)$, the vector field $\tilde{u}_0(\theta)$ can be shown to satisfy the relation

$$\tilde{u}_0(\theta) = \nabla \ln\bigl[1 - \kappa_0(\theta)\bigr] = \nabla\tilde{\kappa}_0(\theta) , \tag{1}$$

where $\kappa_0$ is the true dimensionless mass map and $\tilde{\kappa}_0(\theta) = \ln\bigl[1 - \kappa_0(\theta)\bigr]$. However, because of statistical and measurement errors, $\tilde{u}$ is not necessarily curl-free, and thus $\kappa$ can be determined only approximately. In a separate paper (Lombardi & Bertin 1998b) we have shown that:

– The statistical errors on $\kappa(\theta)$ are minimized if this function is calculated as

$$\tilde{\kappa}(\theta) = \bar{\kappa} + \int_\Omega H^{\rm SS}(\theta, \theta') \cdot \tilde{u}(\theta') \, d^2\theta' , \tag{2}$$

where $\bar{\kappa}$ is a constant (introduced to take into account the sheet invariance), $H^{\rm SS}(\theta, \theta')$ is the noise-filtering kernel (Seitz & Schneider 1996), and $\Omega$ is the field of observation.

– The same mass map can be obtained by solving the equations

$$\nabla^2 \tilde{\kappa} = \nabla \cdot \tilde{u} , \tag{3}$$

$$\nabla\tilde{\kappa} \cdot n = \tilde{u} \cdot n \quad \text{on } \partial\Omega , \tag{4}$$

where $n$ is the unit vector perpendicular to the boundary $\partial\Omega$ of the field of observation $\Omega$. Hence, the kernel $H^{\rm SS}(\theta, \theta')$ can be identified as the Green function of this Neumann boundary problem.

– Equations (3) and (4) are precisely the Euler equations associated with the functional

$$S = \frac{1}{2} \int_\Omega \bigl\| \nabla\tilde{\kappa}(\theta) - \tilde{u}(\theta) \bigr\|^2 \, d^2\theta . \tag{5}$$

In other words, the functional $S$ is minimized when $\tilde{\kappa}(\theta)$ is calculated using Eq. (2) or, equivalently, by solving Eqs. (3) and (4).
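The step from the functional (5) to Eqs. (3) and (4) can be sketched with a first-variation computation. The sketch below assumes that $\tilde{\kappa}$ and $\tilde{u}$ are smooth enough to integrate by parts; the symbol $\delta\tilde{\kappa}$ (an arbitrary variation of the mass map) is introduced here only for illustration.

```latex
\begin{align*}
\delta S
  &= \int_\Omega \bigl( \nabla\tilde{\kappa} - \tilde{u} \bigr) \cdot \nabla\delta\tilde{\kappa} \; d^2\theta \\
  &= - \int_\Omega \delta\tilde{\kappa} \; \nabla \cdot \bigl( \nabla\tilde{\kappa} - \tilde{u} \bigr) \, d^2\theta
     + \oint_{\partial\Omega} \delta\tilde{\kappa} \, \bigl( \nabla\tilde{\kappa} - \tilde{u} \bigr) \cdot n \; d\theta .
\end{align*}
```

Requiring $\delta S = 0$ for every variation $\delta\tilde{\kappa}$ makes the volume term vanish only if Eq. (3) holds in $\Omega$, and the boundary term only if Eq. (4) holds on $\partial\Omega$.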

To these three formulations of the mass inversion problem there correspond three practical methods to calculate $\kappa(\theta)$ from a given set of data.

– The first method considered is based on a direct calculation of the kernel $H^{\rm SS}$ (Seitz & Schneider 1996). Once this kernel has been calculated for a given field $\Omega$, the mass inversion is straightforward (note that the kernel $H^{\rm SS}$ depends on the field of observation). A problem with this method is that a calculation of $H^{\rm SS}$ is expensive in terms of memory requirements and computation time. In fact, in order to compute $\kappa$ on a square grid of $N \times N$ points, $H^{\rm SS}$ must be calculated on a multidimensional grid of $N \times N \times N \times N$ points. Moreover, $2N^4$ multiplications are needed to evaluate Eq. (2), and thus the method is of order $O(N^4)$. Because of the large memory needed to allocate $H^{\rm SS}$, calculations can be performed only with a limited value of $N$ (typically $N \sim 50$).

– The introduction of a method that directly solves the Neumann problem allows one to go beyond many of the limitations of the $H^{\rm SS}$ method. Equations (3) and (4) can be solved using an over-relaxation method (Seitz & Schneider 1998). In this case $\tilde{\kappa}(\theta)$ is calculated directly, and thus we need to allocate only $N \times N$ real numbers. Moreover, the method can be applied without difficulties to “strange” geometries $\Omega$ (while the previous method is straightforward only when applied to rectangular or circular fields). The over-relaxation method is quicker than the kernel method, being approximately of order $O(N^3)$.

– A direct method to minimize the functional (5) will be presented in this paper. As we will see, this method has several advantages and turns out to be very efficient from a computational point of view, being of order $O(N^2 \log N)$. Moreover, it is extremely easy to implement (two implementations for rectangular fields $\Omega$ written in C and in IDL are freely available on request).
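The memory argument for the kernel method can be made concrete with a little arithmetic. The sketch below assumes 8-byte floating-point numbers (an assumption, not stated in the text) and counts two components for the vector kernel, in line with the $2N^4$ multiplications quoted for Eq. (2); the function names are hypothetical.

```python
def kernel_memory_bytes(n, bytes_per_number=8):
    """Storage for a two-component kernel sampled on an N x N x N x N grid."""
    return 2 * n**4 * bytes_per_number


def map_memory_bytes(n, bytes_per_number=8):
    """Storage for a single N x N map, as needed when the map is computed directly."""
    return n * n * bytes_per_number


if __name__ == "__main__":
    for n in (50, 100, 400):
        print(f"N = {n}: kernel {kernel_memory_bytes(n) / 1e6:.0f} MB, "
              f"map {map_memory_bytes(n) / 1e6:.2f} MB")
```

Under these assumptions the kernel already takes about 100 MB at $N = 50$, while a single $N \times N$ map at $N = 400$ takes little more than 1 MB.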

We should stress that, as proved in an earlier paper (Lombardi & Bertin 1998b), the three formulations are mathematically equivalent. Thus it would not be surprising to find that proper numerical implementations perform, for large values of the grid number $N$, in a similar manner as far as accuracy and reliability are concerned. In practice, for finite values of $N$, the third method turns out to be characterized by small errors, often smaller than those associated with the other two procedures.

2. A direct method to solve the variational principle

Direct methods in variational problems are well known, especially in applied mathematics (see, e.g., Gelfand & Fomin 1963). Suppose that one can find a complete set of functions $\{f_\alpha\}$ on the domain $\Omega$ (the full definition of “complete” will be given below), so that any function on $\Omega$ can be represented as a linear combination of the form

$$\tilde{\kappa}(\theta) = \sum_{\alpha=1}^{\infty} c_\alpha f_\alpha(\theta) . \tag{6}$$

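More precisely, we assume that for any function $\tilde{\kappa}(\theta)$, there is a choice for the coefficients $\{c_\alpha\}$ such that

$$\int_\Omega \Bigl\| \nabla\tilde{\kappa}(\theta) - \sum_{\alpha=1}^{\infty} c_\alpha \nabla f_\alpha(\theta) \Bigr\|^2 d^2\theta = 0 . \tag{7}$$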

Let us now introduce a sequence of trial mass maps

$$\tilde{\kappa}^{[n]}(\theta) = \sum_{\alpha=1}^{n} c^{[n]}_\alpha f_\alpha(\theta) . \tag{8}$$

We further require that the function $\tilde{\kappa}^{[n]}$ minimizes the functional $S$: in other words, $c^{[n]}_1, c^{[n]}_2, \dots, c^{[n]}_n$ are chosen so that the functional $S$ has minimum value. This obviously happens when

$$\frac{\partial S}{\partial c^{[n]}_\alpha} = 0 \quad \text{for } \alpha = 1, 2, \dots, n . \tag{9}$$

Solving this set of $n$ equations, we obtain the $n$ coefficients $c^{[n]}_\alpha$, and thus the function $\tilde{\kappa}^{[n]}$. By repeating this operation for a sequence of values of $n$, we find a sequence of functions $\tilde{\kappa}^{[n]}$. These functions, under suitable assumptions (verified in our problem), have the following properties (see Gelfand & Fomin 1963 for a detailed discussion): (i) Let us call $S^{[n]}$ the value of $S$ when $\tilde{\kappa}$ is replaced by the function $\tilde{\kappa}^{[n]}$. Then, obviously, the sequence $S^{[n]}$ is not increasing. (ii) If the set $\{f_\alpha\}$ is complete, then the functions $\tilde{\kappa}^{[n]}$ converge to the solution $\tilde{\kappa}$ of the problem. This method thus provides a way to obtain the function $\tilde{\kappa}(\theta)$ with desired accuracy.

The method described here can be easily applied to our problem. In fact, by expanding $\tilde{\kappa}^{[n]}(\theta)$ as in Eq. (8), we find

$$\frac{\partial S}{\partial c^{[n]}_\alpha} = \int_\Omega \nabla f_\alpha(\theta) \cdot \Bigl[ \sum_{\beta=1}^{n} c^{[n]}_\beta \nabla f_\beta(\theta) - \tilde{u}(\theta) \Bigr] d^2\theta = 0 . \tag{10}$$

The previous equation, for $\alpha = 1, 2, \dots, n$, represents a linear system of $n$ equations for the $n$ variables $\bigl\{ c^{[n]}_\alpha \bigr\}$. Its solution is thus the set of coefficients to be used in Eq. (8).

However, we note that care must be taken in the choice of a complete set of functions. Let us define, for the purpose, the product $\langle v, w \rangle$ between two generic vector fields $v(\theta)$ and $w(\theta)$ as

$$\langle v, w \rangle = \int_\Omega v(\theta) \cdot w(\theta) \, d^2\theta . \tag{11}$$

As our problem involves $\nabla\tilde{\kappa}$, the completeness has to be referred to the set of the gradients. In other words, the set $\{f_\alpha\}$ is complete if

$$\langle \nabla f_\alpha, \nabla\tilde{\kappa} \rangle = 0 \tag{12}$$

for every $\alpha$ implies $\tilde{\kappa}(\theta) = \text{constant}$. It is easy to show that this condition is equivalent to Eq. (7).

The direct method can be further simplified if a set of functions $\{f_\alpha\}$ can be taken to satisfy a suitable orthonormality condition, so that the gradients of the functions $\{f_\alpha\}$ satisfy

$$\langle \nabla f_\alpha, \nabla f_\beta \rangle = \delta_{\alpha\beta} , \tag{13}$$


where $\delta_{\alpha\beta} = 1$ for $\alpha = \beta$ and 0 otherwise. Then Eq. (10) can be rewritten simply as

$$c^{[n]}_\alpha = \langle \nabla f_\alpha, \tilde{u} \rangle . \tag{14}$$

Thus, with the use of an orthonormal set of functions we have secured two important advantages: (i) The linear system (10) has been diagonalized, so that its solution is trivial. (ii) The coefficients $c^{[n]}_\alpha$ no longer depend on $n$: that is, the coefficients of the exact solution are given by $c_\alpha = \langle \nabla f_\alpha, \tilde{u} \rangle$.

Because of these advantages, whenever possible an orthonormal set of functions should be used. We note, however, that the orthonormality condition (13) depends on the field of observation $\Omega$. Even if the existence of an orthonormal set of functions is always guaranteed by the spectral theory for the Laplace operator (see Brezis 1987), for “strange” geometries, it may be non-trivial to find a complete orthonormal set of functions. In such cases, we need to solve the linear system (10).

The direct method described above has several advantages with respect to the “kernel” method and to the over-relaxation method: (i) The method is fast in the case where an orthonormal set of functions can be found. In fact, we need only to evaluate one integral for each coefficient $c_\alpha$ that we want to calculate. (ii) The method does not require a large amount of memory: we need to retain only the $n$ values of the coefficients $c_\alpha$. (iii) The precision of the inversion is driven in a natural way by the value of $n$. Typically, the larger $\alpha$ is, the smaller the length scale of $f_\alpha$ (see below). (iv) In some cases, the decomposition of the mass density $\tilde{\kappa}(\theta)$ in terms of the functions $f_\alpha$ can be useful.
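When no orthonormal set is at hand, Eq. (10) is simply a small linear system. The following is a minimal sketch of that case for a one-dimensional analogue on $[0, 1]$, not the two-dimensional lensing problem: the monomial basis $f_\alpha(x) = x^\alpha$, the midpoint quadrature, and all function names are assumptions made here for illustration.

```python
def integrate(func, a=0.0, b=1.0, steps=4000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def direct_method_1d(u, n):
    """Coefficients c_1..c_n of the 1D analogue of Eq. (10) for f_a(x) = x**a."""
    # Derivatives of the basis functions play the role of the gradients.
    grads = [lambda x, a=a: a * x ** (a - 1) for a in range(1, n + 1)]
    # Matrix of products <f_a', f_b'> and right-hand side <f_a', u>.
    gram = [[integrate(lambda x: ga(x) * gb(x)) for gb in grads] for ga in grads]
    rhs = [integrate(lambda x: ga(x) * u(x)) for ga in grads]
    return solve_linear_system(gram, rhs)


if __name__ == "__main__":
    # u is the derivative of x + x**2, so the coefficients should be close to [1, 1, 0].
    print(direct_method_1d(lambda x: 1.0 + 2.0 * x, 3))
```

With an orthonormal set the matrix `gram` would reduce to the identity, which is the simplification expressed by Eq. (14).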

3. Rectangular fields

When the field $\Omega$ is rectangular, an orthonormal set of functions can be written easily. Here we consider the special case when $\Omega$ is a square of length $\pi$ (in some suitable units); any rectangular field can be handled in a similar manner. In the case considered, an orthonormal set of functions is given by

$$f_{\alpha\beta}(\theta) = n_{\alpha\beta} \cos\alpha\theta_1 \cos\beta\theta_2 , \tag{15}$$

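with $(\alpha, \beta) \in \mathbb{N}^2 \setminus \{(0,0)\}$. The normalization $n_{\alpha\beta}$ is defined as

$$n_{\alpha\beta} = \begin{cases} \dfrac{\sqrt{2}}{\pi\sqrt{\alpha^2 + \beta^2}} & \text{for } \alpha = 0 \text{ or } \beta = 0 , \\[2ex] \dfrac{2}{\pi\sqrt{\alpha^2 + \beta^2}} & \text{otherwise} . \end{cases} \tag{16}$$

The function $f_{00}$ is not defined. Note that here we use two indices for the set. Cosines must be used in order to have a complete set (see Eqs. (7), (12), and Appendix A). Our problem is solved in terms of the coefficients $c_{\alpha\beta}$:

$$c_{\alpha\beta} = -n_{\alpha\beta} \int_\Omega \bigl[ \alpha \tilde{u}_1(\theta) \sin\alpha\theta_1 \cos\beta\theta_2 + \beta \tilde{u}_2(\theta) \cos\alpha\theta_1 \sin\beta\theta_2 \bigr] \, d^2\theta , \tag{17}$$

$$\tilde{\kappa}(\theta) = \sum_{\alpha\beta} c_{\alpha\beta} n_{\alpha\beta} \cos\alpha\theta_1 \cos\beta\theta_2 . \tag{18}$$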

We now observe that the particular choice of the orthonormal set $\{f_{\alpha\beta}\}$ allows us to use fast Fourier transform (FFT) techniques to evaluate Eqs. (17) and (18). The use of FFT makes the direct method very efficient: in particular the method becomes of order $O(N^2 \log N)$. Moreover, several optimized FFT libraries are available.

The optimal truncation for the series (18) is determined by the adopted grid numbers: for a grid of $N \times M$ points, $\alpha$ should run from 0 to $N - 1$, and $\beta$ from 0 to $M - 1$ (this is standard practice for FFT libraries).
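Equations (16) to (18) can be checked on a small grid by evaluating the sums directly. The sketch below is not the FFT-based implementation described in the text: it uses plain $O(N^4)$ summation, samples the field at cell midpoints, and approximates the integral (17) with a midpoint rule; these choices and all function names are assumptions made for illustration.

```python
import math


def norm(a, b):
    """Normalization n_ab of Eq. (16); the pair (0, 0) is excluded by the caller."""
    factor = math.sqrt(2.0) if (a == 0 or b == 0) else 2.0
    return factor / (math.pi * math.sqrt(a * a + b * b))


def reconstruct(u1, u2):
    """Evaluate Eqs. (17) and (18) by direct summation on midpoint nodes.

    u1[j][k] and u2[j][k] sample the two components of the vector field at
    theta1 = (j + 1/2) pi / N and theta2 = (k + 1/2) pi / M.
    """
    n_pts, m_pts = len(u1), len(u1[0])
    t1 = [(j + 0.5) * math.pi / n_pts for j in range(n_pts)]
    t2 = [(k + 0.5) * math.pi / m_pts for k in range(m_pts)]
    area = (math.pi / n_pts) * (math.pi / m_pts)  # quadrature weight per cell
    kappa = [[0.0] * m_pts for _ in range(n_pts)]
    for a in range(n_pts):          # alpha = 0 .. N - 1
        for b in range(m_pts):      # beta  = 0 .. M - 1
            if a == 0 and b == 0:
                continue
            nab = norm(a, b)
            total = 0.0
            for j in range(n_pts):
                sa, ca = math.sin(a * t1[j]), math.cos(a * t1[j])
                for k in range(m_pts):
                    total += (a * u1[j][k] * sa * math.cos(b * t2[k])
                              + b * u2[j][k] * ca * math.sin(b * t2[k]))
            c_ab = -nab * area * total                      # Eq. (17)
            for j in range(n_pts):
                ca = math.cos(a * t1[j])
                for k in range(m_pts):
                    kappa[j][k] += c_ab * nab * ca * math.cos(b * t2[k])  # Eq. (18)
    return kappa


def demo(n=8):
    """Recover cos(theta1) cos(2 theta2) from its exact gradient; return the max error."""
    t = [(j + 0.5) * math.pi / n for j in range(n)]
    u1 = [[-math.sin(x) * math.cos(2 * y) for y in t] for x in t]
    u2 = [[-2 * math.cos(x) * math.sin(2 * y) for y in t] for x in t]
    kappa = reconstruct(u1, u2)
    return max(abs(kappa[j][k] - math.cos(t[j]) * math.cos(2 * t[k]))
               for j in range(n) for k in range(n))


if __name__ == "__main__":
    print("max reconstruction error:", demo())
```

The test map is itself one of the basis functions, so only the coefficient $c_{12}$ should be non-zero and the reconstruction should agree with the input map to rounding error. The result has zero mean because the $(0,0)$ term is excluded.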

4. Performance

Our method has been implemented in C and in IDL. The C version uses the library FFTW (“Fastest Fourier Transform in the West,” version 2.0.1) to perform discrete Fourier transforms (DFT). This library, written by Matteo Frigo and Steven G. Johnson, is considered the quickest DFT library publicly available. The performance of our direct method is compared with that of the over-relaxation method, also implemented in C. The procedure used in the tests is summarized in the following points:

1. A simple model for the dimensionless mass distribution has been chosen. Then the mass distribution $\tilde{\kappa}_0$ is calculated on a grid of $N \times N$ points.

2. The associated field $\tilde{u}_0$ is calculated on the same grid using a 3-point Lagrangian interpolation in order to numerically evaluate the derivatives that are needed.

3. Noise is added to the vector field $\tilde{u}_0$ using an analytical model for the noise derived earlier (Lombardi & Bertin 1998b). In practice, the various Fourier components of the noise are added using a suitable model for the power spectrum.

4. The resulting noisy $\tilde{u}$ map is inverted using the over-relaxation method and the present direct method. The two dimensionless mass maps obtained are then compared. Moreover, the inversion times are recorded.
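Step 2 relies on derivatives obtained from 3-point Lagrangian interpolation. A minimal sketch of the standard formulas for uniformly spaced samples is given below; the uniform spacing, the one-sided treatment of the two end points, and the function name are assumptions, since the text does not spell out these details.

```python
def three_point_derivative(values, spacing):
    """Derivative of uniformly sampled data from 3-point Lagrange interpolation."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three samples are required")
    deriv = [0.0] * n
    # One-sided formulas at the two ends of the grid.
    deriv[0] = (-3.0 * values[0] + 4.0 * values[1] - values[2]) / (2.0 * spacing)
    deriv[-1] = (3.0 * values[-1] - 4.0 * values[-2] + values[-3]) / (2.0 * spacing)
    # Centred formula in the interior.
    for i in range(1, n - 1):
        deriv[i] = (values[i + 1] - values[i - 1]) / (2.0 * spacing)
    return deriv


if __name__ == "__main__":
    xs = [0.1 * i for i in range(6)]
    # The formulas are exact for quadratics: the derivative of x**2 is 2x.
    print(three_point_derivative([x * x for x in xs], 0.1))
```

Applying such a routine along each axis of the $\tilde{\kappa}_0$ grid would give the two components of $\tilde{u}_0$.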

The results obtained in the tests are the following:

– The two mass densities obtained are consistent with each other.

– Because of the set of functions used, the errors produced by the direct method are larger on the boundary of the field. For this reason, we suggest that a one-pixel strip around the field should be discarded. The area discarded is very small.

– Some tests have been performed by providing $\tilde{u}_0$ to the inversion procedures. This allows us to compare the reconstructed mass density with the original map $\kappa_0$. From such tests we have noted that the discretization errors of the direct method are slightly smaller than the discretization errors of the over-relaxation method.

– The results of the two methods differ because of the sheet invariance: in particular, the direct method always gives a “reduced” mass map $\tilde{\kappa}$ with vanishing total mass.

Regarding the second item, we note that the error is related to the finite sampling scale of the method; the error affects only the outermost pixel because of the proper choice of the truncation (see comment at the end of Sect. 3).

[Figure 1: execution time in seconds (logarithmic axis, 0.01 to 10000) versus grid number $N$ (50 to 400).]

Fig. 1. Execution time per call vs. grid number $N$. The solid line refers to the direct method applied to “good numbers” (values $2N - 1$ that can be factorized with small primes), the long-dashed line refers to the direct method applied to “bad numbers” ($2N - 1$ prime), and the short-dashed line to the over-relaxation method.

The measured execution times are plotted in Fig. 1 for different values of $N$. These are the averaged CPU execution times for a single reconstruction on a SUN Ultra 1 workstation. From this figure it is clear that the direct method is much faster than the over-relaxation method.

Here we should recall that, because of some characteristics of the FFTW library, the execution time of the direct method can change significantly even for neighbouring values of $N$. In particular, the inversion is faster when $(2N - 1)$ can be factorized with small prime numbers, and is slower in other cases (see Fig. 1). For example, the execution time (on a SUN Ultra 1) changes from 2.942 to 0.232 seconds when $N$ changes from 121 to 122. Finally, we observe that our implementation of the direct method is not optimal: in fact, with a different (non-trivial) use of FFT one might gain an additional factor of 4 on the execution time.
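The dependence on the factorization of $2N - 1$ can be checked before choosing a grid size. The helper below is a hypothetical utility based on plain trial division; it is not part of the FFTW library or of the authors' code.

```python
def prime_factors(m):
    """Prime factorization of a positive integer by trial division."""
    factors = []
    d = 2
    while d * d <= m:
        while m % d == 0:
            factors.append(d)
            m //= d
        d += 1
    if m > 1:
        factors.append(m)
    return factors


if __name__ == "__main__":
    for n in (121, 122):
        print(f"N = {n}: 2N - 1 = {2 * n - 1} has factors {prime_factors(2 * n - 1)}")
```

For the two grid numbers quoted above, $2N - 1$ is 241 (a prime) for $N = 121$ and $243 = 3^5$ for $N = 122$, which is consistent with the slower and faster timings respectively.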

Besides the appealing aspects of simplicity inherent to the direct method described in this paper, we should note that gaining three orders of magnitude in CPU time will make it possible to undertake a few long-term projects of simulated observations (in particular, with the goal of a statistically sound investigation of the quality of mass reconstruction; but other objectives might be formulated, e.g. in the cosmological context) that would remain practically out of reach for other intrinsically slow reconstruction methods.

[Figure 2: three surface plots over the square field, with both horizontal axes running from 0′ to 10′; the vertical axis runs from 0 to 0.6 in the first two panels and from about −0.0010 to 0.0010 in the third.]

Fig. 2. A typical result of mass reconstruction; at the adopted distance for the lensing cluster, the side of the square field, 10 arcminutes, corresponds to approximately 2.88 Mpc. From top to bottom, true dimensionless mass distribution, reconstructed distribution (from the direct method), and difference between maps of the variable $\tilde{\kappa}$ derived from direct and over-relaxation methods. The very small residuals show that the two methods are practically equivalent in terms of accuracy.

5. Examples of simulated reconstructions

In addition to the reconstructions from “synthetic” data as described in the previous Section, we have performed several additional tests in order to demonstrate the reliability of our method. The tests, designed with the aim to reproduce the main features of a “real” reconstruction, have been made following a straightforward procedure (see, e.g., Lombardi & Bertin 1999 for similar simulations).


First we have generated a population of source galaxies using a pseudo-random number generator. Here a source galaxy is represented by its position and by its ellipticity (see Eq. (11) of Lombardi & Bertin 1999, following Seitz & Schneider 1997). Positions are drawn from a homogeneous distribution (with a density of 70 galaxies per square arcmin), while ellipticities are drawn from a truncated Gaussian distribution with variance $\sigma^2 = (0.3)^2$. All sources are assumed to have the same redshift $z_s = 1.5$. Source ellipticities are then transformed into observed ellipticities. For simplicity, the observed galaxy positions are assumed to be equal to the source positions: in other words, no depletion effects are included in the simulations.
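The source population described above can be sketched as follows. Several details are assumptions made here, not taken from the text: the number of galaxies is fixed at density times area rather than drawn at random, $\sigma = 0.3$ is applied to each ellipticity component, and the truncation is implemented by rejecting draws with modulus of 1 or more. The lensing transformation of the ellipticities is not included, and the function name is hypothetical.

```python
import math
import random


def generate_sources(side_arcmin=10.0, density=70.0, sigma=0.3, seed=0):
    """Return a list of (x, y, e1, e2) tuples for the unlensed source galaxies."""
    rng = random.Random(seed)
    count = round(density * side_arcmin ** 2)   # fixed count: an assumption
    sources = []
    while len(sources) < count:
        e1, e2 = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        if math.hypot(e1, e2) >= 1.0:           # truncation rule: an assumption
            continue
        # Homogeneous positions over the square field, in arcminutes.
        x, y = rng.uniform(0.0, side_arcmin), rng.uniform(0.0, side_arcmin)
        sources.append((x, y, e1, e2))
    return sources


if __name__ == "__main__":
    galaxies = generate_sources()
    print(len(galaxies), "sources; first entry:", galaxies[0])
```

For a $10' \times 10'$ field at 70 galaxies per square arcminute this gives 7000 sources.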

Then the calculation of the observed ellipticities has been done by referring to a cluster of galaxies placed at $z_d = 0.3$ with total mass inside the $10' \times 10'$ field $1.5 \times 10^{15} \, M_\odot$. For the purpose of introducing the lensing effects, we only need to specify the dimensionless projected mass map $\kappa(\theta)$. For simplicity, we have used a density distribution made of three symmetrical components; each component is described by the analytical model outlined by Schneider et al. (1992, p. 244), which, at large radii, is approximately isothermal.

By averaging the observed ellipticities, we have then obtained a map of the reduced shear $g(\theta)$ and, from that map, the vector field $\tilde{u}(\theta)$. The mass inversion has been performed using the direct method and the over-relaxation method. The two mass distributions have then been compared.

An example of typical results is shown in Fig. 2. Here, from top to bottom, we display the original cluster mass distribution, the reconstruction obtained using the direct method, and the residuals, i.e. the difference between the reconstructed maps from the direct method and from the over-relaxation method. As the figure clearly shows, differences are mainly confined to the boundary of the field where they are found to be of the order of 0.0002, well below the statistical errors of the reconstruction. In the inner field the differences are about one order of magnitude smaller. Note also that the wavy overall appearance of the reconstructed map is normal for weak lensing reconstructions, resulting from the relatively low number of source galaxies involved (see Lombardi & Bertin 1998b for a discussion of the statistical aspects of the problem).

Acknowledgements. We thank Luigi Ambrosio and Peter Schneider for interesting discussions and suggestions. The reconstruction code uses FFTW 2.0.1 by Matteo Frigo and Steven G. Johnson. This work has been partially supported by MURST and by ASI of Italy.

Appendix A: Completeness of $\{f_{\alpha\beta}\}$

In this appendix we will verify explicitly that the set of functions defined in Eq. (15) is complete, in the sense that Eqs. (3) and (4) can be recovered. For the purpose, we will apply the Fourier theorem (Brezis 1987). In the following, we will assume that $\tilde{u}(\theta)$ is a smooth vector field (we stress that this condition is needed only for the proof provided below; the method remains applicable to more general cases).

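Let us consider a solution of the form (6). Then, because of the orthonormality condition (13), we have $\langle \nabla f_\alpha, \nabla\tilde{\kappa} \rangle = c_\alpha = \langle \nabla f_\alpha, \tilde{u} \rangle$, so that

$$\langle \nabla f_\alpha, \nabla\tilde{\kappa} - \tilde{u} \rangle = \int_\Omega \nabla f_\alpha \cdot (\nabla\tilde{\kappa} - \tilde{u}) \, d^2\theta = 0 . \tag{A.1}$$

Now we observe that the previous equation holds for any $\alpha$: then it holds also for any linear combination $f = \sum_\alpha d_\alpha f_\alpha$ of $\{f_\alpha\}$. Thus

$$0 = \int_\Omega \nabla f \cdot (\nabla\tilde{\kappa} - \tilde{u}) \, d^2\theta = - \int_\Omega f \, \nabla \cdot (\nabla\tilde{\kappa} - \tilde{u}) \, d^2\theta + \oint_{\partial\Omega} f \, (\nabla\tilde{\kappa} - \tilde{u}) \cdot n \, d\theta . \tag{A.2}$$

In the last step we have integrated by parts ($n$ is the unit vector orthogonal to the boundary $\partial\Omega$ of $\Omega$).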

We now use this equation to show that the chosen set of functions, described by Eq. (15), is complete, while, e.g., a similar set made of sine functions would not be complete. By the nature of the chosen set of functions we already know that we can properly represent any smooth function $f$. Using this property, we want to show that the two terms $\nabla \cdot (\nabla\tilde{\kappa} - \tilde{u})$ and $(\nabla\tilde{\kappa} - \tilde{u}) \cdot n$ entering in the r.h.s. of Eq. (A.2) vanish on $\Omega$ and on $\partial\Omega$ respectively.

For the purpose, we observe that if cosines are used as the set of functions, we can “build” any function $f$ provided that the function has periodic derivatives on the boundary. In particular, if $A \subset \Omega$ is an arbitrary open subset of $\Omega$, there is a function $f$ that is positive on $A$ and vanishes on $\Omega \setminus A$. Now suppose per absurdum that the solution obtained from the direct method does not satisfy Eq. (3), so that, e.g., $\nabla \cdot (\nabla\tilde{\kappa} - \tilde{u}) > 0$ at a point $\theta^* \in \Omega$. Then, by the sign persistence theorem, this quantity must be strictly positive in a neighborhood $A$ of $\theta^*$. However, if we take a function $f$ which is positive on $A$ and vanishes elsewhere, the boundary term in Eq. (A.2) is zero and the remaining term on the r.h.s. is strictly negative, while the l.h.s. vanishes, which is contradictory. This proves that Eq. (3) is satisfied when cosines are used.

In a similar manner, now that we have “disposed of” the first term, we observe that using cosines we can build a function $f$ that vanishes everywhere on the boundary $\partial\Omega$ except in a neighborhood of a given point. In other words, given an open subset $B \subset \partial\Omega$ of the boundary $\partial\Omega$, there is a function $f$ that is positive on $B$ and vanishes on $\partial\Omega \setminus B$. Note that this property would not be satisfied if the set of functions $\{f_\alpha\}$ were based on sines. Using a proof similar to the one given above, we obtain that $(\nabla\tilde{\kappa} - \tilde{u}) \cdot n$ must vanish, thus leading to Eq. (4).

One might worry that, on the boundary, the chosen set of functions of Eq. (15) has zero derivative in the direction of $n$, i.e. $\nabla f_{\alpha\beta} \cdot n = 0$. This might suggest that the boundary condition (4) cannot be reproduced. In reality, although this is true pointwise, this does not affect the convergence in $L^2$ that is relevant for our problem (see Eqs. (7) and (12)). This important point is best illustrated by the following example.

Suppose that we measure a constant field $\tilde{u}(\theta) = (1, 0)$ on a square field $\Omega$ of side $\pi$. The obvious solution for $\tilde{\kappa}$ in this case is $\tilde{\kappa}(\theta) = \theta_1$. Suppose now that we try to use a set of functions made of sines. Then the corresponding coefficients $c_{\alpha\beta}$ would be proportional to the integrals

$$\alpha \int_0^\pi \cos\alpha\theta_1 \, d\theta_1 \int_0^\pi \sin\beta\theta_2 \, d\theta_2 = 0 .$$

Hence, every coefficient $c_{\alpha\beta}$ vanishes. This proves that a set based on sines is not complete (condition (12) is not satisfied or, equivalently, a curl-free vector field $\tilde{u}$ leads to a vanishing mass map). On the other hand, if we use the set (15), the coefficients $c_{\alpha\beta}$ do not all vanish and the corresponding mass distribution is given by

$$\tilde{\kappa}(\theta) = - \sum_{\alpha \ \mathrm{odd}} \frac{4}{\pi\alpha^2} \cos\alpha\theta_1 . \tag{A.3}$$

This function can be shown to reduce to $\tilde{\kappa}(\theta) = \theta_1 - \pi/2$.
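The last statement can be checked numerically by summing the series (A.3) up to a finite order and comparing with $\theta_1 - \pi/2$. The sketch below does this with a plain partial sum; the function name and the cut-off `alpha_max` are introduced here only for illustration.

```python
import math


def kappa_series(theta1, alpha_max):
    """Partial sum of Eq. (A.3) over odd alpha up to alpha_max."""
    return -sum(4.0 / (math.pi * a * a) * math.cos(a * theta1)
                for a in range(1, alpha_max + 1, 2))


if __name__ == "__main__":
    for theta1 in (0.0, 1.0, math.pi / 2, 3.0):
        approx = kappa_series(theta1, 2001)
        print(f"theta1 = {theta1:.3f}: series = {approx:.5f}, "
              f"theta1 - pi/2 = {theta1 - math.pi / 2:.5f}")
```

The terms decay as $1/\alpha^2$, so the partial sums converge uniformly on $[0, \pi]$, with the largest deviation at the end points. The limit has zero mean over the field, which accounts for the offset $-\pi/2$ with respect to the “obvious” solution $\theta_1$.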

References

Bartelmann M., 1995, A&A 303, 643

Brezis H., 1987, “Analyse fonctionnelle : théorie et application,” Masson, Paris

Gelfand I.M., Fomin S.V., 1963, “Calculus of Variations,” Prentice-Hall, Englewood Cliffs, NJ

Kaiser N., 1995, ApJ 439, L1

Lombardi M., Bertin G., 1998a, A&A 330, 791

Lombardi M., Bertin G., 1998b, A&A 335, 1

Lombardi M., Bertin G., 1999, A&A 342, 337

Schneider P., 1995, A&A 302, 639

Schneider P., Ehlers J., Falco E.E., 1992, “Gravitational Lenses,” Springer-Verlag, Berlin

Seitz C., Schneider P., 1997, A&A 318, 687

Seitz S., Schneider P., 1996, A&A 305, 383

Seitz S., Schneider P., 1998, preprint astro-ph/9802051

Squires G., Kaiser N., 1996, ApJ 473, 65